Recognizing Text in Images with AI on HarmonyOS

奶盖
Posted on 2021-4-8 19:03

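1. Introduction

AI general text recognition detects and recognizes text in images such as photographed documents and street scenes. It can be integrated into other apps to provide text detection and recognition, and to offer follow-up services such as translation and search based on the recognized text. To a certain extent, it can handle tilted text, tilted shooting angles, complex lighting, and complex text backgrounds. For details on general text recognition, see "AI - General Text Recognition"; for word segmentation, see "AI - Word Segmentation".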

 

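🕮 Note
 ● Text for word segmentation is limited to 500 characters and must be UTF-8 encoded.
 ● Word segmentation currently supports Chinese only.
 ● Supported image formats are JPEG, JPG, PNG, GIF, and BMP.
 ● Supported languages currently include Chinese, English, Japanese, Korean, Russian, Italian, Spanish, Portuguese, German, and French (more will be added in the future). Handwriting recognition is not supported.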
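
The limits in the note above can be checked before calling the AI capabilities. The following is a minimal sketch in plain Java; InputLimits and its methods are hypothetical helpers, not part of the HarmonyOS SDK, and it assumes that "500 characters" is inclusive and counted with String.length().

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper that encodes the limits listed in the note; not an SDK class. */
final class InputLimits {
    // Assumption: the 500-character limit is inclusive.
    static final int MAX_SEGMENT_CHARS = 500;
    static final Set<String> IMAGE_EXTENSIONS =
            new HashSet<>(Arrays.asList("jpeg", "jpg", "png", "gif", "bmp"));

    private InputLimits() {
    }

    /** Returns true if the file name ends with one of the supported image extensions. */
    static boolean isSupportedImage(String fileName) {
        if (fileName == null) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false;
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return IMAGE_EXTENSIONS.contains(extension);
    }

    /** Returns true if the text is non-empty and within the segmentation length limit. */
    static boolean isValidSegmentText(String text) {
        return text != null && !text.isEmpty() && text.length() <= MAX_SEGMENT_CHARS;
    }
}
```

A caller could run these checks on the keyword field and on image file names and skip the AI call when either check fails.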

This tutorial walks through the following steps to show how to implement AI-based general text recognition.

 

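2. Code structure


The AI-based general text recognition sample covers image list display, text input, word segmentation, general text recognition, and result display. See section 7 for the full project code. The DevEco Studio project is structured as follows:

     ● provider: PictureProvider is the image adapter class. It takes all the images and puts them in the image list.
     ● slice: MainAbilitySlice is the main page of this sample.
     ● util: utility classes
         ○ LogUtil is the logging class, a wrapper around HiLog.
         ○ WordRecognition is the general text recognition class. It recognizes the text in each image and stores the result.
         ○ WordSegment is the word segmentation class. It segments the input text.
     ● MainAbility: the main entry point, generated by DevEco Studio. No logic is added and no changes are needed.
     ● MyApplication: generated by DevEco Studio. No changes are needed.
     ● resources: resource files used by the project
         ○ resources\base\element holds string.json, the configuration file generated by DevEco Studio. No changes are needed.
         ○ resources\base\graphic holds the page style files:
             ◼ background_ability_page.xml sets the page background color.
             ◼ background_ability_main.xml sets the page layout style.
             ◼ button_element.xml sets the button style.
         ○ resources\base\layout holds the layout files:
             ◼ ability_main.xml displays the images and the text input.
             ◼ item_image_layout.xml defines one item in the scrollable image area.
         ○ resources\base\media holds the image resources (this tutorial uses eight .jpg images that you prepare yourself; icon.png is generated by DevEco Studio and needs no changes).
     ● config.json: the configuration file.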

 

3. Add and display images

 

1. Add eight .jpg images to the "resources\base\media" directory (named 1.jpg through 8.jpg) and load the array of image IDs:

private int[] pictureLists = new int[]{ResourceTable.Media_1, ResourceTable.Media_2, 
ResourceTable.Media_3, ResourceTable.Media_4, ResourceTable.Media_5,
ResourceTable.Media_6, ResourceTable.Media_7, ResourceTable.Media_8};

2. Pass the image ID array and the MainAbilitySlice object to the provider:

public PictureProvider(int[] pictureLists, Context context) { 
	this.pictureLists = pictureLists; 
	this.context = context; 
}

3. Display the images on the page:

@Override 
public Component getComponent(int var1, Component var2, ComponentContainer var3) { 
	ViewHolder viewHolder = null; // holds the Image component of one list item
	Component component = var2; 
	if (component == null) { 
		component = LayoutScatter.getInstance(context).parse(ResourceTable.Layout_item_image_layout, 
				null, false); 
		viewHolder = new ViewHolder(); 
		Component componentImage = component.findComponentById(ResourceTable.Id_select_picture_list); 
		if (componentImage instanceof Image) { 
			viewHolder.image = (Image) componentImage; 
		} 
		component.setTag(viewHolder); // cache the holder on the component so it can be reused
	} else { 
		if (component.getTag() instanceof ViewHolder) { 
			viewHolder = (ViewHolder) component.getTag(); 
		} 
	} 
	if (viewHolder != null) { 
		viewHolder.image.setPixelMap(pictureLists[var1]); 
	} 
	return component; 
}

4. Define the ViewHolder class, which is used to display an image in the list:

private static class ViewHolder { 
	Image image; 
}

 

4. Recognize text in images

 

1. Call the text recognition method to recognize the text in an image:

// Call site: index is the position of the image to recognize
wordRecognition(slice, pictureLists[index], handle);

public void wordRecognition(Context context, int resId, MainAbilitySlice.MyEventHandle myEventHandle) {
	mediaId = resId;
	// Get an ITextDetector instance
	textDetector = VisionManager.getTextDetector(context);

	// Create the VisionImage from the PixelMap of the image to detect
	pixelMap = getPixelMap(resId);
	VisionImage image = VisionImage.fromPixelMap(pixelMap);

	// Define the VisionCallback<Text> callback, used in asynchronous mode
	VisionCallback<Text> visionCallback = getVisionCallback();

	// Define the ConnectionCallback that runs once the capability engine connection succeeds or fails
	ConnectionCallback connectionCallback = getConnectionCallback(image, visionCallback);

	// Connect to the capability engine
	VisionManager.init(context, connectionCallback);
}

2. In asynchronous mode, the callback sends the text recognized in the image to the main thread through sendResult():

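private VisionCallback<Text> getVisionCallback() {
 return new VisionCallback<Text>() {
     @Override
     public void onResult(Text text) {
         sendResult(text.getValue());
     }

     @Override
     public void onError(int i) { } // not handled in this sample
     @Override
     public void onProcessing(float v) { } // not handled in this sample
 };
}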

3. After the engine connects, run text recognition and send the result to the main thread through sendResult():

private ConnectionCallback getConnectionCallback(VisionImage image, VisionCallback<Text> visionCallback) {
 return new ConnectionCallback() {
     @Override
     public void onServiceConnect() {
         // Create the Text object that receives the synchronous result
         Text text = new Text();
         // Configure how detect() runs through TextConfiguration
         TextConfiguration.Builder builder = new TextConfiguration.Builder();
         builder.setProcessMode(VisionConfiguration.MODE_IN);
         builder.setDetectType(TextDetectType.TYPE_TEXT_DETECT_FOCUS_SHOOT); // this constant name is expected to change
         builder.setLanguage(TextConfiguration.AUTO);
         TextConfiguration config = builder.build();
         textDetector.setVisionConfiguration(config);
         // Call ITextDetector.detect() and keep the result code in the result field,
         // which onServiceDisconnect() checks before releasing the detector
         if (!IS_ASYNC) {
             result = textDetector.detect(image, text, null); // synchronous
             sendResult(text.getValue());
         } else {
             result = textDetector.detect(image, null, visionCallback); // asynchronous
         }
     }

     @Override
     public void onServiceDisconnect() {
         // Release resources. Success codes: 0 in synchronous mode, 700 in asynchronous mode
         if ((!IS_ASYNC && (result == 0)) || (IS_ASYNC && (result == IS_ASYNC_CODE))) {
             textDetector.release();
         }
         if (pixelMap != null) {
             pixelMap.release();
             pixelMap = null;
         }
         VisionManager.destroy();
     }
 };
}

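🕮 Note
1. The engine uses TextDetectType.TYPE_TEXT_DETECT_FOCUS_SHOOT (focus-shoot OCR).
2. In synchronous mode, detect() returns result code 0 when the call succeeds. In asynchronous mode, it returns result code 700 when the request is sent successfully.
3. In synchronous mode, visionCallback is null. The result code is returned by the method and the recognition result is returned in text.
4. In asynchronous mode, visionCallback is not null. The value in text is invalid when the function returns (that is, the text argument is null), and the actual recognition result is delivered through the visionCallback callback.
5. IS_ASYNC is a boolean: false for synchronous mode and true for asynchronous mode.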
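
The result-code rule in items 2 to 5 can be written as a small predicate. This is only a sketch: DetectResultCodes is a hypothetical helper, and the values 0 and 700 are taken from the note above rather than from SDK constants.

```java
/** Hypothetical helper that mirrors the result-code rules in the note; not an SDK class. */
final class DetectResultCodes {
    static final int SYNC_SUCCESS = 0;     // synchronous detect() succeeded
    static final int ASYNC_ACCEPTED = 700; // asynchronous request was sent successfully

    private DetectResultCodes() {
    }

    /** Returns true if the code returned by detect() means success for the given mode. */
    static boolean isSuccess(boolean isAsync, int resultCode) {
        return isAsync ? resultCode == ASYNC_ACCEPTED : resultCode == SYNC_SUCCESS;
    }
}
```

The condition in onServiceDisconnect() expresses the same check inline.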

4. Send the text recognition result to the main thread (received in the MainAbilitySlice class):

public void sendResult(String value) { 
 if (textDetector != null) { 
     textDetector.release(); 
 } 
 if (pixelMap != null) { 
     pixelMap.release(); 
     pixelMap = null; 
     VisionManager.destroy(); 
 } 
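 if (value != null) {
     maps.put(mediaId, value);
 }
 if (index >= pictureLists.length) {
     // Every image has been processed, even if some produced no text: send the results
     InnerEvent event = InnerEvent.get(1, 0, maps);
     handle.sendEvent(event);
 } else {
     // Advance the index before starting the next recognition so an image is never processed twice
     int next = index++;
     wordRecognition(slice, pictureLists[next], handle);
 }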
}
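
sendResult() drives a chain: each result triggers recognition of the next image until the whole array has been visited. The sketch below models only that control flow in plain Java, with no SDK calls. RecognitionQueue and its method names are hypothetical, and a null value stands for an image in which no text was recognized.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalInt;

/** Hypothetical model of the one-image-at-a-time chain; not an SDK class. */
final class RecognitionQueue {
    private final int[] pictureIds;
    private final Map<Integer, String> results = new LinkedHashMap<>();
    private int index;       // position of the next picture to process
    private int pendingId;   // picture whose result is currently awaited
    private boolean started; // false until the first picture has been handed out

    RecognitionQueue(int[] pictureIds) {
        this.pictureIds = pictureIds.clone();
    }

    /**
     * Stores the result of the pending picture (null results are skipped) and
     * returns the id of the next picture, or an empty value when all are done.
     */
    OptionalInt next(String value) {
        if (started && value != null) {
            results.put(pendingId, value);
        }
        started = true;
        if (index >= pictureIds.length) {
            return OptionalInt.empty();
        }
        pendingId = pictureIds[index++];
        return OptionalInt.of(pendingId);
    }

    /** Returns the recognized text collected so far, keyed by picture id. */
    Map<Integer, String> results() {
        return results;
    }
}
```

Ending the chain on the index rather than on the number of stored results means an image with no recognizable text cannot push the index past the end of the array.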

 

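5. Extract keywords from user input

 

1. Take the context passed from MainAbilitySlice and run word segmentation. In synchronous mode, sendResult() is called directly to send the segmentation result to the main thread: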

public void wordSegment(Context context, String requestData, MainAbilitySlice.MyEventHandle myEventHandle) { 
	slice = context; // MainAbilitySlice.this
	handle = myEventHandle; // the MyEventHandle object

	// Initialize through the NluClient singleton; the service connection is obtained asynchronously.
	NluClient.getInstance().init(context, new OnResultListener<Integer>() { 
		@Override 
		public void onResult(Integer resultCode) { 
			if (!IS_ASYNC) { 
				// Synchronous word segmentation
				ResponseResult responseResult = NluClient.getInstance().getWordSegment(requestData, 
						NluRequestType.REQUEST_TYPE_LOCAL); 
				sendResult(responseResult.getResponseResult()); 
				release(); 
			} else { 
				// Asynchronous word segmentation
				wordSegmentAsync(requestData); 
			} 
		} 
	}, true); 
}

🕮 Note

1. IS_ASYNC is a boolean: false for synchronous mode and true for asynchronous mode.
2. A code value of 0 in the responseResult object means that word segmentation succeeded.
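
wordSegment() receives requestData as a JSON string with a text field and a type field. A minimal sketch of building that string safely is shown below. SegmentRequest is a hypothetical helper, and the {"text": ..., "type": 0} shape is taken from the complete sample in section 7. Quoting and escaping the text keeps the JSON valid even if the user types a double quote or a backslash.

```java
/** Hypothetical builder for the word segmentation request JSON; not an SDK class. */
final class SegmentRequest {
    private SegmentRequest() {
    }

    /** Escapes backslashes, double quotes, and control characters for use inside a JSON string. */
    static String escapeJson(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '"':
                    sb.append("\\\"");
                    break;
                case '\\':
                    sb.append("\\\\");
                    break;
                case '\n':
                    sb.append("\\n");
                    break;
                case '\r':
                    sb.append("\\r");
                    break;
                case '\t':
                    sb.append("\\t");
                    break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds a request of the form {"text":"...","type":0}. */
    static String build(String text) {
        return "{\"text\":\"" + escapeJson(text) + "\",\"type\":0}";
    }
}
```

In the sample, the result of SegmentRequest.build(textField.getText()) could be passed as requestData.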

2. The asynchronous request invokes this callback, which sends the segmentation result to the main thread through sendResult():

private void wordSegmentAsync(String requestData) { 
	ResponseResult responseResult = NluClient.getInstance().getWordSegment(requestData, 
			NluRequestType.REQUEST_TYPE_LOCAL, new OnResultListener<ResponseResult>() { 
				@Override 
				public void onResult(ResponseResult asyncResult) { 
					sendResult(asyncResult.getResponseResult()); 
					release(); 
				} 
			}); 
}

3. Send the segmentation result to the main thread (received in the MainAbilitySlice class):

private void sendResult(String result) {
	List<String> lists = null; // word segmentation result
	// Convert the segmentation result in result to a list
	if ((result != null) && result.contains("\"message\":\"success\"")) {
		String words = result.substring(result.indexOf(WORDS) + STEP,
				result.lastIndexOf("]")).replaceAll("\"", "");
		if (words.isEmpty()) {
			lists = new ArrayList<>(1); // no words found: return "no keywords"
			lists.add("no keywords");
		} else {
			lists = Arrays.asList(words.split(","));
		}
	}

	InnerEvent event = InnerEvent.get(TWO, ZERO, lists);
	handle.sendEvent(event);
}
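
sendResult() above extracts the words with a fixed offset (STEP) from the position of "words", which only works when the response has no whitespace around the colon and bracket. A slightly more tolerant sketch using regular expressions follows. SegmentResponse is a hypothetical helper, and the response shape (a "words" array of quoted strings) is an assumption inferred from the parsing code above. It does not handle a word that itself contains "]".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for the "words" array of a segmentation response; not an SDK class. */
final class SegmentResponse {
    // Captures the content between the brackets of "words":[ ... ].
    private static final Pattern WORDS_ARRAY =
            Pattern.compile("\"words\"\\s*:\\s*\\[(.*?)\\]", Pattern.DOTALL);
    // Captures each double-quoted element, allowing backslash escapes inside it.
    private static final Pattern QUOTED = Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"");

    private SegmentResponse() {
    }

    /** Returns the words in the response, or an empty list if there are none. */
    static List<String> parseWords(String response) {
        if (response == null) {
            return Collections.emptyList();
        }
        Matcher array = WORDS_ARRAY.matcher(response);
        if (!array.find()) {
            return Collections.emptyList();
        }
        List<String> words = new ArrayList<>();
        Matcher item = QUOTED.matcher(array.group(1));
        while (item.find()) {
            words.add(item.group(1));
        }
        return words;
    }
}
```

In production code a JSON library would be the safer choice; the regular expressions here just keep the sketch free of dependencies.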

 

6. Match images by keyword

 

1. Match the recognized images against the keywords:

private void matchImage(List<String> list) { 
	Set<Integer> matchSets = new HashSet<>(); 
	for (String str: list) { // iterate over the segmented words
		// imageInfos holds the text recognized in each image
		for (Integer key : imageInfos.keySet()) { 
			if (imageInfos.get(key).indexOf(str) != NEG_ONE) { 
				matchSets.add(key); 
			} 
		} 
	} 
	// Collect the matching images
	matchPictures = new int[matchSets.size()]; 
	int i = 0; 
	for (int match: matchSets) { 
		matchPictures[i] = match; 
		i++; 
	} 
	// Display the images
	setSelectPicture(matchPictures, LIST_CONTAINER_ID_MATCH); 
}
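
The matching rule is a plain substring test: an image matches if its recognized text contains at least one keyword. The sketch below isolates that rule in plain Java so it can be tried without a device. KeywordMatcher is a hypothetical helper, and unlike matchImage() it skips null or empty entries.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper that isolates the keyword matching rule; not an SDK class. */
final class KeywordMatcher {
    private KeywordMatcher() {
    }

    /** Returns the ids of the images whose recognized text contains at least one keyword. */
    static Set<Integer> match(List<String> keywords, Map<Integer, String> recognizedText) {
        Set<Integer> matches = new LinkedHashSet<>();
        if (keywords == null || recognizedText == null) {
            return matches;
        }
        for (Map.Entry<Integer, String> entry : recognizedText.entrySet()) {
            String text = entry.getValue();
            if (text == null) {
                continue; // no text was recognized in this image
            }
            for (String keyword : keywords) {
                if (keyword != null && !keyword.isEmpty() && text.contains(keyword)) {
                    matches.add(entry.getKey());
                    break; // one matching keyword is enough
                }
            }
        }
        return matches;
    }
}
```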

 

2. Display the matching images on the page:

 

private void setSelectPicture(int[] pictures, int id) { 
	// Create the image provider
	PictureProvider newsTypeAdapter = new PictureProvider(pictures, this); 
 
	Component componentById = findComponentById(id); 
	if (componentById instanceof ListContainer) { 
		ListContainer listContainer = (ListContainer) componentById; 
		listContainer.setItemProvider(newsTypeAdapter); 
	} 
}

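Final result

 

Type the keywords to segment into the input box below "请输入关键词:" (Enter keywords), then click the 【开始通用文字识别】 (Start general text recognition) button to search the images by keyword. The images whose text contains the keywords appear under "搜索结果" (Search results).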

 

 ● 垃圾分类人人做 做好分类为人人 (Everyone sorts their garbage; good sorting benefits everyone)
 ● 可回收物 其他垃圾 (Recyclables; other waste)


7. Complete sample code

 

Layout and styles

 

1.base/graphic/background_ability_main.xml

<?xml version="1.0" encoding="UTF-8" ?> 
<shape xmlns:ohos="http://schemas.huawei.com/res/ohos" 
       ohos:shape="rectangle"> 
    <solid 
        ohos:color="#FFFFFF"/> 
</shape>

2.base/graphic/background_ability_page.xml

<?xml version="1.0" encoding="UTF-8" ?> 
 <shape xmlns:ohos="http://schemas.huawei.com/res/ohos" 
        ohos:shape="rectangle"> 
     <solid 
         ohos:color="#FFFAF0"/> 
 </shape>

3.base/graphic/button_element.xml

<?xml version="1.0" encoding="utf-8"?> 
 <shape 
     xmlns:ohos="http://schemas.huawei.com/res/ohos" 
     ohos:shape="rectangle"> 
     <corners 
         ohos:radius="100"/> 
     <solid 
         ohos:color="#FF007DFE"/> 
 </shape>

4.base/layout/ability_main.xml

<?xml version="1.0" encoding="utf-8"?> 
<DirectionalLayout 
    xmlns:ohos="http://schemas.huawei.com/res/ohos" 
    ohos:height="match_parent" 
    ohos:width="match_parent" 
    ohos:orientation="vertical" 
    ohos:background_element="$graphic:background_ability_page" 
    > 
 
    <Text 
        ohos:id="$+id:text_helloworld" 
        ohos:height="match_content" 
        ohos:width="match_content" 
        ohos:background_element="$graphic:background_ability_main" 
        ohos:layout_alignment="horizontal_center" 
        ohos:text="关键词搜索图片" 
        ohos:text_size="30fp" 
        ohos:top_margin="5vp" 
        /> 
 
    <Text 
        ohos:id="$+id:picture_list" 
        ohos:height="match_content" 
        ohos:width="match_content" 
        ohos:background_element="$graphic:background_ability_main" 
        ohos:layout_alignment="horizontal_center" 
        ohos:text="图片列表" 
        ohos:text_size="20fp" 
        ohos:top_margin="15vp" 
        /> 
 
    <ListContainer 
        ohos:id="$+id:picture_list_show" 
        ohos:height="200vp" 
        ohos:width="match_parent" 
        ohos:orientation="horizontal" 
        ohos:left_margin="5vp" 
        ohos:right_margin="5vp" 
        /> 
 
    <Text 
        ohos:id="$+id:word_seg_title" 
        ohos:height="match_content" 
        ohos:width="match_content" 
        ohos:background_element="$graphic:background_ability_main" 
        ohos:left_margin="5vp" 
        ohos:text="请输入关键词:" 
        ohos:text_size="25fp" 
        ohos:top_margin="10vp" 
        /> 
 
    <TextField 
        ohos:id="$+id:word_seg_text" 
        ohos:height="match_content" 
        ohos:width="match_parent" 
        ohos:background_element="$graphic:background_ability_main" 
        ohos:hint="Enter a statement." 
        ohos:left_padding="5vp" 
        ohos:right_padding="5vp" 
        ohos:text_alignment="vertical_center" 
        ohos:text_size="20fp" 
        ohos:top_margin="5vp"/> 
 
    <Button 
        ohos:id="$+id:button_search" 
        ohos:width="match_content" 
        ohos:height="match_content" 
        ohos:text_size="20fp" 
        ohos:text="开始通用文字识别" 
        ohos:layout_alignment="horizontal_center" 
        ohos:top_margin="10vp" 
        ohos:top_padding="1vp" 
        ohos:bottom_padding="1vp" 
        ohos:right_padding="20vp" 
        ohos:left_padding="20vp" 
        ohos:text_color="white" 
        ohos:background_element="$graphic:button_element" 
        ohos:center_in_parent="true" 
        ohos:align_parent_bottom="true" 
        ohos:bottom_margin="5vp"/> 
 
    <Text 
        ohos:id="$+id:picture_list_result" 
        ohos:height="match_content" 
        ohos:width="match_content" 
        ohos:background_element="$graphic:background_ability_main" 
        ohos:layout_alignment="horizontal_center" 
        ohos:text="搜索结果" 
        ohos:text_size="20fp" 
        ohos:top_margin="5vp" 
        /> 
 
    <ListContainer 
        ohos:id="$+id:picture_list_match" 
        ohos:height="200vp" 
        ohos:width="match_parent" 
        ohos:orientation="horizontal" 
        ohos:left_margin="5vp" 
        ohos:right_margin="5vp" 
        /> 
</DirectionalLayout>

5.base/layout/item_image_layout.xml

<?xml version="1.0" encoding="utf-8"?> 
<DirectionalLayout xmlns:ohos="http://schemas.huawei.com/res/ohos" 
                   ohos:height="200vp" 
                   ohos:width="205vp"> 
 
    <Image 
        ohos:id="$+id:select_picture_list" 
        ohos:height="200vp" 
        ohos:width="200vp" 
        ohos:layout_alignment="horizontal_center" 
        ohos:top_margin="1vp" 
        ohos:scale_mode="stretch" 
        /> 
 
</DirectionalLayout>

 

Functional logic code


1.com/huawei/searchimagebykeywords/provider/PictureProvider

import com.huawei.searchimagebykeywords.ResourceTable; 
 
import ohos.agp.components.BaseItemProvider; 
import ohos.agp.components.Component; 
import ohos.agp.components.ComponentContainer; 
import ohos.agp.components.Image; 
import ohos.agp.components.LayoutScatter; 
import ohos.app.Context; 
 
import java.util.Optional; 
 
public class PictureProvider extends BaseItemProvider { 
    private int[] pictureLists; 
    private Context context; 
 
    /** 
     *  picture provider 
     * 
     * @param pictureLists pictureLists 
     * @param context context 
     */ 
    public PictureProvider(int[] pictureLists, Context context) { 
        this.pictureLists = pictureLists; 
        this.context = context; 
    } 
 
    @Override 
    public int getCount() { 
        return pictureLists == null ? 0 : pictureLists.length; 
    } 
 
    @Override 
    public Object getItem(int position) { 
        return Optional.of(this.pictureLists[position]); 
    } 
 
    @Override 
    public long getItemId(int position) { 
        return position; 
    } 
 
    @Override 
    public Component getComponent(int var1, Component var2, ComponentContainer var3) { 
        ViewHolder viewHolder = null; 
        Component component = var2; 
        if (component == null) { 
            component = LayoutScatter.getInstance(context).parse(ResourceTable.Layout_item_image_layout, 
                    null, false); 
            viewHolder = new ViewHolder(); 
            Component componentImage = component.findComponentById(ResourceTable.Id_select_picture_list); 
            if (componentImage instanceof Image) { 
                viewHolder.image = (Image) componentImage; 
            } 
            component.setTag(viewHolder); 
        } else { 
            if (component.getTag() instanceof ViewHolder) { 
                viewHolder = (ViewHolder) component.getTag(); 
            } 
        } 
        if (viewHolder != null) { 
            viewHolder.image.setPixelMap(pictureLists[var1]); 
        } 
        return component; 
    } 
 
    private static class ViewHolder { 
        Image image; 
    } 
}

2.com/huawei/searchimagebykeywords/slice/MainAbilitySlice 

import com.huawei.searchimagebykeywords.ResourceTable; 
import com.huawei.searchimagebykeywords.provider.PictureProvider; 
import com.huawei.searchimagebykeywords.util.WordRecognition; 
import com.huawei.searchimagebykeywords.util.WordSegment; 
 
import ohos.aafwk.ability.AbilitySlice; 
import ohos.aafwk.content.Intent; 
import ohos.agp.components.Button; 
import ohos.agp.components.Component; 
import ohos.agp.components.ListContainer; 
import ohos.agp.components.TextField; 
import ohos.app.Context; 
import ohos.eventhandler.EventHandler; 
import ohos.eventhandler.EventRunner; 
import ohos.eventhandler.InnerEvent; 
 
import java.util.HashSet; 
import java.util.List; 
import java.util.Map; 
import java.util.Set; 
 
public class MainAbilitySlice extends AbilitySlice { 
    private static final int LIST_CONTAINER_ID_SHOW = ResourceTable.Id_picture_list_show; 
    private static final int LIST_CONTAINER_ID_MATCH = ResourceTable.Id_picture_list_match; 
    private static final int NEG_ONE = -1; 
    private static final int ZERO = 0; 
    private static final int ONE = 1; 
    private static final int TWO = 2; 
    private Context slice; 
    private EventRunner runner; 
    private MyEventHandle myEventHandle; 
    private int[] pictureLists = new int[]{ResourceTable.Media_1, ResourceTable.Media_2, 
        ResourceTable.Media_3, ResourceTable.Media_4, ResourceTable.Media_5, 
        ResourceTable.Media_6, ResourceTable.Media_7, ResourceTable.Media_8}; 
    private Component selectComponent; 
    private int selectPosition; 
    private Button button; 
    private TextField textField; 
    private Map<Integer, String> imageInfos; 
    private int[] matchPictures; 
 
    @Override 
    public void onStart(Intent intent) { 
        super.onStart(intent); 
        super.setUIContent(ResourceTable.Layout_ability_main); 
 
        slice = MainAbilitySlice.this; 
 
        // Show the image list
        setSelectPicture(pictureLists, LIST_CONTAINER_ID_SHOW);

        // Run general text recognition on all images
        wordRecognition();

        // Get the text field that holds the sentence to segment
        Component componentText = findComponentById(ResourceTable.Id_word_seg_text);
        if (componentText instanceof TextField) {
            textField = (TextField) componentText;
        }

        // Clicking the button segments the keywords and matches the images
        Component componentSearch = findComponentById(ResourceTable.Id_button_search);
        if (componentSearch instanceof Button) {
            button = (Button) componentSearch;
            button.setClickedListener(listener -> wordSegment());
        }
        } 
    } 
 
    @Override 
    public void onActive() { 
        super.onActive(); 
    } 
 
    @Override 
    public void onForeground(Intent intent) { 
        super.onForeground(intent); 
    } 
 
    // Bind a list of images to the given ListContainer
    private void setSelectPicture(int[] pictures, int id) {
        // Create the image provider
        PictureProvider newsTypeAdapter = new PictureProvider(pictures, this); 
 
        Component componentById = findComponentById(id); 
        if (componentById instanceof ListContainer) { 
            ListContainer listContainer = (ListContainer) componentById; 
            listContainer.setItemProvider(newsTypeAdapter); 
        } 
    } 
 
    // General text recognition
    private void wordRecognition() { 
        initHandler(); 
        WordRecognition wordRecognition = new WordRecognition(); 
        wordRecognition.setParams(slice, pictureLists, myEventHandle); 
        wordRecognition.sendResult(null); 
    } 
 
    // Word segmentation
    private void wordSegment() {
        // Build the request JSON; the text value is wrapped in quotes so the JSON is valid
        String requestData = "{\"text\":\"" + textField.getText() + "\",\"type\":0}";
        initHandler(); 
        new WordSegment().wordSegment(slice, requestData, myEventHandle); 
    } 
 
    // Match images
    private void matchImage(List<String> list) { 
        Set<Integer> matchSets = new HashSet<>(); 
        for (String str: list) { 
            for (Integer key : imageInfos.keySet()) { 
                if (imageInfos.get(key).indexOf(str) != NEG_ONE) { 
                    matchSets.add(key); 
                } 
            } 
        } 
        // Collect the matching images
        matchPictures = new int[matchSets.size()]; 
        int i = 0; 
        for (int match: matchSets) { 
            matchPictures[i] = match; 
            i++; 
        } 
        // Display the images
        setSelectPicture(matchPictures, LIST_CONTAINER_ID_MATCH); 
    } 
 
    private void initHandler() { 
        runner = EventRunner.getMainEventRunner(); 
        if (runner == null) { 
            return; 
        } 
        myEventHandle = new MyEventHandle(runner); 
    } 
 
    public class MyEventHandle extends EventHandler { 
        MyEventHandle(EventRunner runner) throws IllegalArgumentException { 
            super(runner); 
        } 
 
        @Override 
        protected void processEvent(InnerEvent event) { 
            super.processEvent(event); 
            int eventId = event.eventId; 
            if (eventId == ONE) { 
                // General text recognition
                if (event.object instanceof Map) { 
                    imageInfos = (Map) event.object; 
                } 
            } 
            if (eventId == TWO) { 
                // Word segmentation
                if (event.object instanceof List) { 
                    List<String> lists = (List) event.object; 
                    if ((lists.size() > ZERO) && (!"no keywords".equals(lists.get(ZERO)))) { 
                        // Match images against the entered keywords
                        matchImage(lists); 
                    } 
                } 
            } 
        } 
    } 
}

3.com/huawei/searchimagebykeywords/util/LogUtil 

import ohos.hiviewdfx.HiLog; 
import ohos.hiviewdfx.HiLogLabel; 
 
public class LogUtil { 
    private static final String TAG_LOG = "LogUtil"; 
 
    private static final HiLogLabel LABEL_LOG = new HiLogLabel(0, 0, LogUtil.TAG_LOG); 
 
    private static final String LOG_FORMAT = "%{public}s: %{public}s"; 
 
    private LogUtil() { 
    } 
 
    public static void info(String tag, String msg) { 
        HiLog.info(LABEL_LOG, LOG_FORMAT, tag, msg); 
    } 
 
    public static void error(String tag, String msg) { 
        HiLog.error(LABEL_LOG, LOG_FORMAT, tag, msg);
    } 
}

4.com/huawei/searchimagebykeywords/util/WordRecognition 

import com.huawei.searchimagebykeywords.slice.MainAbilitySlice; 
 
import ohos.ai.cv.common.ConnectionCallback; 
import ohos.ai.cv.common.VisionCallback; 
import ohos.ai.cv.common.VisionConfiguration; 
import ohos.ai.cv.common.VisionImage; 
import ohos.ai.cv.common.VisionManager; 
import ohos.ai.cv.text.ITextDetector; 
import ohos.ai.cv.text.Text; 
import ohos.ai.cv.text.TextConfiguration; 
import ohos.ai.cv.text.TextDetectType; 
import ohos.app.Context; 
import ohos.eventhandler.InnerEvent; 
import ohos.global.resource.NotExistException; 
import ohos.global.resource.Resource; 
import ohos.global.resource.ResourceManager; 
import ohos.media.image.ImageSource; 
import ohos.media.image.PixelMap; 
import ohos.media.image.common.PixelFormat; 
import ohos.media.image.common.Rect; 
import ohos.media.image.common.Size; 
 
import java.io.ByteArrayOutputStream; 
import java.io.IOException; 
import java.util.HashMap; 
import java.util.Map; 
 
public class WordRecognition { 
    private static final boolean IS_ASYNC = false; 
    private static final int IS_ASYNC_CODE = 700; 
    private Context slice; 
    private ITextDetector textDetector; 
    private PixelMap pixelMap; 
    private MainAbilitySlice.MyEventHandle handle; 
    private int[] pictureLists; 
    private int mediaId; 
    private Map<Integer, String> maps = new HashMap<>();
    private int index; 
    private int result; 
 
    public void setParams(Context context, int[] pictureIds, MainAbilitySlice.MyEventHandle myEventHandle) { 
        slice = context; 
        pictureLists = pictureIds; 
        handle = myEventHandle; 
    } 
 
    public void wordRecognition(Context context, int resId, MainAbilitySlice.MyEventHandle myEventHandle) { 
        mediaId = resId; 
        // Get an ITextDetector instance
        textDetector = VisionManager.getTextDetector(context);

        // Create the VisionImage from the PixelMap of the image to detect
        pixelMap = getPixelMap(resId);
        VisionImage image = VisionImage.fromPixelMap(pixelMap);

        // Define the VisionCallback<Text> callback, used in asynchronous mode
        VisionCallback<Text> visionCallback = getVisionCallback();

        // Define the ConnectionCallback that runs once the capability engine connection succeeds or fails
        ConnectionCallback connectionCallback = getConnectionCallback(image, visionCallback);

        // Connect to the capability engine
        VisionManager.init(context, connectionCallback); 
    } 
 
    private VisionCallback<Text> getVisionCallback() {
        return new VisionCallback<Text>() { 
            @Override 
            public void onResult(Text text) { 
                sendResult(text.getValue()); 
            } 
 
            @Override 
            public void onError(int i) { 
            } 
 
            @Override 
            public void onProcessing(float v) { 
            } 
        }; 
    } 
 
    private ConnectionCallback getConnectionCallback(VisionImage image, VisionCallback<Text> visionCallback) { 
        return new ConnectionCallback() { 
            @Override 
            public void onServiceConnect() { 
                // Create the Text object that receives the synchronous result
                Text text = new Text();

                // Configure how detect() runs through TextConfiguration
                TextConfiguration.Builder builder = new TextConfiguration.Builder();
                builder.setProcessMode(VisionConfiguration.MODE_IN);
                builder.setDetectType(TextDetectType.TYPE_TEXT_DETECT_FOCUS_SHOOT);
                builder.setLanguage(TextConfiguration.AUTO);
                TextConfiguration config = builder.build();
                textDetector.setVisionConfiguration(config);
                // Call ITextDetector.detect() and keep the result code in the result field,
                // which onServiceDisconnect() checks before releasing the detector
                if (!IS_ASYNC) {
                    result = textDetector.detect(image, text, null); // synchronous
                    sendResult(text.getValue());
                } else {
                    result = textDetector.detect(image, null, visionCallback); // asynchronous
                }
            } 
 
            @Override 
            public void onServiceDisconnect() { 
                // Release resources
                if ((!IS_ASYNC && (result == 0)) || (IS_ASYNC && (result == IS_ASYNC_CODE))) { 
                  textDetector.release(); 
                } 
                if (pixelMap != null) { 
                    pixelMap.release(); 
                    pixelMap = null; 
                } 
                VisionManager.destroy(); 
            } 
        }; 
    } 
 
    public void sendResult(String value) { 
        if (textDetector != null) { 
            textDetector.release(); 
        } 
        if (pixelMap != null) { 
            pixelMap.release(); 
            pixelMap = null; 
            VisionManager.destroy(); 
        } 
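        if (value != null) {
            maps.put(mediaId, value);
        }
        if (index >= pictureLists.length) {
            // Every image has been processed, even if some produced no text: send the results
            InnerEvent event = InnerEvent.get(1, 0, maps);
            handle.sendEvent(event);
        } else {
            // Advance the index before starting the next recognition so an image is never processed twice
            int next = index++;
            wordRecognition(slice, pictureLists[next], handle);
        }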
    } 
 
    // Decode the image resource into a PixelMap
    private PixelMap getPixelMap(int resId) { 
        ResourceManager manager = slice.getResourceManager(); 
 
        byte[] datas = new byte[0]; 
        try { 
            Resource resource = manager.getResource(resId); 
            datas = readBytes(resource); 
            resource.close(); 
        } catch (IOException | NotExistException e) { 
            LogUtil.error("get pixelmap failed, read resource bytes failed, ", e.getLocalizedMessage()); 
        } 
 
        ImageSource.SourceOptions srcOpts = new ImageSource.SourceOptions(); 
        srcOpts.formatHint = "image/jpeg";
        ImageSource imageSource; 
        imageSource = ImageSource.create(datas, srcOpts); 
        ImageSource.DecodingOptions decodingOpts = new ImageSource.DecodingOptions(); 
        decodingOpts.desiredSize = new Size(0, 0); 
        decodingOpts.desiredRegion = new Rect(0, 0, 0, 0); 
        decodingOpts.desiredPixelFormat = PixelFormat.ARGB_8888; 
        pixelMap = imageSource.createPixelmap(decodingOpts); 
        return pixelMap; 
    } 
 
    private static byte[] readBytes(Resource resource) {
        final int bufferSize = 1024;
        final int ioEnd = -1;

        byte[] buffers = new byte[bufferSize];
        // try-with-resources closes the output stream once, after the loop has finished
        try (ByteArrayOutputStream output = new ByteArrayOutputStream()) {
            int readLen;
            while ((readLen = resource.read(buffers, 0, bufferSize)) != ioEnd) {
                output.write(buffers, 0, readLen);
            }
            return output.toByteArray();
        } catch (IOException e) {
            LogUtil.error("WordRecognition.readBytes", "read resource failed");
            return new byte[0];
        }
    }
}
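
readBytes() is the usual read-until-end-of-stream loop. The sketch below shows the same loop against a plain java.io.InputStream so it can be run off-device. StreamBytes is a hypothetical helper, and it assumes the resource can be read like an ordinary input stream, as the read(buffer, offset, length) call in the sample suggests.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper showing the read-until-EOF loop; not an SDK class. */
final class StreamBytes {
    private StreamBytes() {
    }

    /** Reads the stream to its end and returns all bytes. */
    static byte[] readAll(InputStream input) {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            int read;
            // read() returns -1 at end of stream
            while ((read = input.read(buffer, 0, buffer.length)) != -1) {
                output.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return output.toByteArray();
    }
}
```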

5.com/huawei/searchimagebykeywords/util/WordSegment 

import com.huawei.searchimagebykeywords.slice.MainAbilitySlice; 
 
import ohos.ai.nlu.NluClient; 
import ohos.ai.nlu.NluRequestType; 
import ohos.ai.nlu.OnResultListener; 
import ohos.ai.nlu.ResponseResult; 
import ohos.app.Context; 
import ohos.eventhandler.InnerEvent; 
 
import java.util.ArrayList; 
import java.util.Arrays; 
import java.util.List; 
 
public class WordSegment { 
    private static final boolean IS_ASYNC = true; 
    private static final String WORDS = "words"; 
    private static final int ZERO = 0; 
    private static final int TWO = 2; 
    private static final int STEP = 8; 
    private Context slice; 
    private MainAbilitySlice.MyEventHandle handle; 
 
    public void wordSegment(Context context, String requestData, MainAbilitySlice.MyEventHandle myEventHandle) { 
        slice = context; 
        handle = myEventHandle; 
 
        // Initialize through the NluClient singleton; the service connection is obtained asynchronously.
        NluClient.getInstance().init(context, new OnResultListener<Integer>() { 
            @Override 
            public void onResult(Integer resultCode) { 
                if (!IS_ASYNC) { 
                    // Synchronous
                    ResponseResult responseResult = NluClient.getInstance().getWordSegment(requestData, 
                            NluRequestType.REQUEST_TYPE_LOCAL); 
                    sendResult(responseResult.getResponseResult()); 
                    release(); 
                } else { 
                    // Asynchronous
                    wordSegmentAsync(requestData); 
                } 
            } 
        }, true); 
    } 
 
    private void wordSegmentAsync(String requestData) { 
        ResponseResult responseResult = NluClient.getInstance().getWordSegment(requestData, 
                NluRequestType.REQUEST_TYPE_LOCAL, new OnResultListener<ResponseResult>() { 
                    @Override 
                    public void onResult(ResponseResult asyncResult) { 
                        sendResult(asyncResult.getResponseResult()); 
                        release(); 
                    } 
                }); 
    } 
 
    private void sendResult(String result) { 
        List<String> lists = null; // word segmentation result
        // Convert the segmentation result in result to a list
        if ((result != null) && result.contains("\"message\":\"success\"")) {
            String words = result.substring(result.indexOf(WORDS) + STEP,
                    result.lastIndexOf("]")).replaceAll("\"", "");
            if (words.isEmpty()) {
                lists = new ArrayList<>(1);
                lists.add("no keywords"); // no words found: return "no keywords"
            } else { 
                lists = Arrays.asList(words.split(",")); 
            } 
        } 
 
        InnerEvent event = InnerEvent.get(TWO, ZERO, lists); 
        handle.sendEvent(event); 
    } 
 
    private void release() { 
        NluClient.getInstance().destroy(slice); 
    } 
}

6.com/huawei/searchimagebykeywords/MainAbility 

import com.huawei.searchimagebykeywords.slice.MainAbilitySlice; 
 
import ohos.aafwk.ability.Ability; 
import ohos.aafwk.content.Intent; 
public class MainAbility extends Ability { 
    @Override 
    public void onStart(Intent intent) { 
        super.onStart(intent); 
        super.setMainRoute(MainAbilitySlice.class.getName()); 
    } 
}

7.com/huawei/searchimagebykeywords/MyApplication 

import ohos.aafwk.ability.AbilityPackage; 
 
public class MyApplication extends AbilityPackage { 
    @Override 
    public void onInitialize() { 
        super.onInitialize(); 
    } 
}

🕮 Note

The code above is for demo purposes only. Production code needs to take data validation and internationalization into account.

 

8. Congratulations

 

In this tutorial, you have learned how to use general text recognition and word segmentation from the AI capabilities.

 

 

@Reposted from HUAWEI Codelabs

Last edited on 2022-5-5 14:22:01
17 replies
红叶亦知秋

Here to learn about this advanced AI recognition technology.

2021-4-8 19:07:33
鸿蒙张荣超

👍👍👍

2021-4-8 21:55:21
鸿蒙时代

There are a lot of AI features, and we were not sure where to start. This is well worth studying.

2021-4-9 11:20:15
麒麟Berlin

👍👍👍

2021-4-10 10:31:12
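wx60caea552dd58

How can recognition work through a single interface, with no model to load and no remote API? Could an expert explain the principle? This has puzzled me for a long time.

2021-6-17 14:28:26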
wx60e53a989804f

I ran this code unmodified and it does not recognize anything at all.

2021-7-18 15:18:34
wx60e53a989804f

Let alone modifying it to recognize the text in an image and output it...

2021-7-18 15:31:20
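金鱼蹦蹦跶 replied to wx60e53a989804f:
"I ran this code unmodified and it does not recognize anything at all."

It needs to run on a real device or a remote real device, because the emulator has no chip.

2021-7-18 16:32:04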
wx60e53a989804f

So it turns out the emulator cannot run the AI module at all.

2021-7-19 12:55:46
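chaoxiaoshu replied to wx60e53a989804f:
"So it turns out the emulator cannot run the AI module at all."

It can run. I followed the official AI general text recognition documentation, and it runs successfully on the emulator.

2021-7-19 13:44:47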
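wx60e53a989804f replied to chaoxiaoshu:
"It can run. I followed the official AI general text recognition documentation, and it runs successfully on the emulator."

Are you recognizing documents or images? Image recognition does run, but the AI module returns null as the recognized text; I checked in the debugger. If possible, could you share a screenshot of a successful recognition and match?

2021-7-19 15:34:38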
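wx60e53a989804f replied to 金鱼蹦蹦跶:
"It needs to run on a real device or a remote real device, because the emulator has no chip."

To test on a real device from DevEco, does the device have to be upgraded to HarmonyOS? Can I debug on it?

2021-7-19 15:36:13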
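chaoxiaoshu replied to wx60e53a989804f:
"Are you recognizing documents or images? Image recognition does run, but the AI module returns null as the recognized text; I checked in the debugger. If possible, could you share a screenshot of a successful recognition and match?"

This is the original image.

This is the result after recognition.

2021-7-19 16:26:41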
爱吃土豆丝的打工人

Code from HUAWEI Codelabs usually runs as is, although sometimes the configuration files need a few changes.

Overall, the AI features work quite well.

2021-7-19 17:26:10
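wx60e53a989804f replied to 金鱼蹦蹦跶:
"It needs to run on a real device or a remote real device, because the emulator has no chip."

I happened to take a look at the beta version and found the remote real device option there. Got it working.

2021-7-19 18:58:09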
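wx60e53a989804f replied to chaoxiaoshu:
"This is the original image. This is the result after recognition."

You were probably using a remote real device rather than the built-in emulator, right?

2021-7-19 18:59:46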
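ID君

Could the original poster share the project package? I copied all of the code, but the app only renders the UI. Clicking the button does nothing, and I cannot tell which step went wrong.

2022-3-29 22:33:22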