Recognizing Text in Images with AI on HarmonyOS
1. Introduction
The general text recognition capability of AI detects and recognizes text in images such as photographed documents and street scenes. It can be integrated into other applications to provide text detection and recognition, and to offer related services such as translation and search based on the recognition results. To a certain extent, it also handles tilted text, tilted shooting angles, complex lighting, and complex text backgrounds. For details about general text recognition, see AI - General Text Recognition; for details about word segmentation, see AI - Word Segmentation.
🕮 Note
● The text to be segmented is limited to 500 characters and must be UTF-8 encoded (see the validation sketch after this list).
● Word segmentation currently supports Chinese only.
● Supported image formats: JPEG, JPG, PNG, GIF, and BMP.
● Supported languages: Chinese, English, Japanese, Korean, Russian, Italian, Spanish, Portuguese, German, and French (more will be added later). Handwriting recognition is not supported.
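The constraints above are easy to check before calling the AI APIs. Below is a minimal, hypothetical pre-check in plain Java; the helper name and constants are illustrative and not part of the sample project:
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
// Hypothetical pre-checks derived from the limits listed above.
final class InputPrecheck {
private static final int MAX_SEGMENT_LENGTH = 500;
private static final List<String> SUPPORTED_EXTENSIONS = Arrays.asList("jpeg", "jpg", "png", "gif", "bmp");
// The text to segment must be at most 500 characters and UTF-8 encodable.
static boolean isSegmentTextValid(String text) {
if (text == null || text.length() > MAX_SEGMENT_LENGTH) {
return false;
}
byte[] utf8 = text.getBytes(StandardCharsets.UTF_8); // round-trip through UTF-8 as a simple encoding sanity check
return new String(utf8, StandardCharsets.UTF_8).equals(text);
}
// Only JPEG/JPG/PNG/GIF/BMP images are accepted by general text recognition.
static boolean isImageFormatSupported(String fileName) {
int dot = fileName == null ? -1 : fileName.lastIndexOf('.');
if (dot < 0) {
return false;
}
return SUPPORTED_EXTENSIONS.contains(fileName.substring(dot + 1).toLowerCase(Locale.ROOT));
}
}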
This tutorial walks you through implementing AI-based general text recognition, covering the content below.
2. Code Structure
The sample covers displaying an image list, entering text, word segmentation, general text recognition, and displaying the results. The full project code can be found in section 7, Complete Sample Code. The DevEco Studio project is structured as follows:
● provider: PictureProvider, the image adapter class; it obtains all images and puts them into the image list.
● slice: MainAbilitySlice, the main page of this sample.
● util: utility classes.
○ LogUtil is the logging class; it wraps HiLog.
○ WordRecognition is the general text recognition class; it recognizes and saves the text in the images.
○ WordSegment is the word segmentation class; it segments the input text.
● MainAbility: the main program entry, generated by DevEco Studio; no logic added and no changes required.
● MyApplication: generated by DevEco Studio; no changes required.
● resources: resource files used by the project.
○ resources\base\element holds the configuration file string.json generated by DevEco Studio; no changes required.
○ resources\base\graphic holds the page style files:
◼ background_ability_page.xml sets the background color of the page.
◼ background_ability_main.xml sets the page layout style.
◼ button_element.xml sets the button style.
○ resources\base\layout holds the layout files:
◼ ability_main.xml displays the images and the input text.
◼ item_image_layout.xml sets the images in the scrollable image area.
○ resources\base\media holds the image resources (this tutorial uses eight .jpg images that you prepare yourself; icon.png is generated by DevEco Studio and needs no change).
● config.json: the configuration file.
3. Adding and Displaying Images
1. Add eight .jpg images (named 1.jpg to 8.jpg) to the resources\base\media directory, and load the image ID array, as shown below:
private int[] pictureLists = new int[]{ResourceTable.Media_1, ResourceTable.Media_2,
ResourceTable.Media_3, ResourceTable.Media_4, ResourceTable.Media_5,
ResourceTable.Media_6, ResourceTable.Media_7, ResourceTable.Media_8};
2. Obtain the image ID array and the MainAbilitySlice object (passed in as the Context), as shown below:
public PictureProvider(int[] pictureLists, Context context) {
this.pictureLists = pictureLists;
this.context = context;
}
3. Display the images on the page, as shown below:
@Override
public Component getComponent(int var1, Component var2, ComponentContainer var3) {
ViewHolder viewHolder = null; // holder for the Image component that displays the picture
Component component = var2;
if (component == null) {
component = LayoutScatter.getInstance(context).parse(ResourceTable.Layout_item_image_layout,
null, false);
viewHolder = new ViewHolder();
Component componentImage = component.findComponentById(ResourceTable.Id_select_picture_list);
if (componentImage instanceof Image) {
viewHolder.image = (Image) componentImage;
}
component.setTag(viewHolder); // cache the holder for the image to be displayed
} else {
if (component.getTag() instanceof ViewHolder) {
viewHolder = (ViewHolder) component.getTag();
}
}
if (viewHolder != null) {
viewHolder.image.setPixelMap(pictureLists[var1]);
}
return component;
}
4. Define the ViewHolder class used to display images in the list, as shown below:
private static class ViewHolder {
Image image;
}
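To actually show the list, the provider is attached to a ListContainer declared in ability_main.xml; the sample does this in setSelectPicture of MainAbilitySlice (see section 7), roughly as follows:
// Attach the provider to the ListContainer that shows the image list (excerpt of setSelectPicture in section 7).
PictureProvider newsTypeAdapter = new PictureProvider(pictureLists, this);
Component componentById = findComponentById(ResourceTable.Id_picture_list_show);
if (componentById instanceof ListContainer) {
((ListContainer) componentById).setItemProvider(newsTypeAdapter);
}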
4. Recognizing Text in Images
1. Call the text recognition method to recognize the text in an image, as shown below:
wordRecognition(slice, pictureLists[index], handle); // index is the index of the image to recognize
public void wordRecognition(Context context, int resId, MainAbilitySlice.MyEventHandle myEventHandle) {
mediaId = resId;
// Instantiate the ITextDetector interface
textDetector = VisionManager.getTextDetector(context);
// Instantiate a VisionImage object and pass in the pixelMap of the image to detect
pixelMap = getPixelMap(resId);
VisionImage image = VisionImage.fromPixelMap(pixelMap);
// Define the VisionCallback<Text> callback, used in asynchronous mode
VisionCallback<Text> visionCallback = getVisionCallback();
// Define the ConnectionCallback callback to handle success or failure of the connection to the capability engine
ConnectionCallback connectionCallback = getConnectionCallback(image, visionCallback);
// Connect to the capability engine
VisionManager.init(context, connectionCallback);
}
2. In asynchronous mode, the callback sends the recognized text to the main thread via sendResult(), as shown below:
private VisionCallback getVisionCallback() {
return new VisionCallback<Text>() {
@Override
public void onResult(Text text) {
sendResult(text.getValue());
}
@Override
public void onError(int i) {
}
@Override
public void onProcessing(float v) {
}
};
}
3. After the connection to the engine succeeds, perform text recognition and send the result to the main thread via sendResult(), as shown below:
private ConnectionCallback getConnectionCallback(VisionImage image, VisionCallback<Text> visionCallback) {
return new ConnectionCallback() {
@Override
public void onServiceConnect() {
// Instantiate a Text object
Text text = new Text();
// Configure the running parameters of the text detector via TextConfiguration
TextConfiguration.Builder builder = new TextConfiguration.Builder();
builder.setProcessMode(VisionConfiguration.MODE_IN);
builder.setDetectType(TextDetectType.TYPE_TEXT_DETECT_FOCUS_SHOOT);
builder.setLanguage(TextConfiguration.AUTO);
TextConfiguration config = builder.build();
textDetector.setVisionConfiguration(config);
// Call the detect() method of ITextDetector and keep the result code
if (!IS_ASYNC) {
result = textDetector.detect(image, text, null); // synchronous
sendResult(text.getValue());
} else {
result = textDetector.detect(image, null, visionCallback); // asynchronous
}
}
@Override
public void onServiceDisconnect() {
// Release resources on success: result code 0 in synchronous mode, 700 in asynchronous mode
if ((!IS_ASYNC && (result == 0)) || (IS_ASYNC && (result == IS_ASYNC_CODE))) {
textDetector.release();
}
if (pixelMap != null) {
pixelMap.release();
pixelMap = null;
}
VisionManager.destroy();
}
};
}
🕮 Note
1. The engine uses TextDetectType.TYPE_TEXT_DETECT_FOCUS_SHOOT (focus-shooting OCR).
2. When a synchronous call succeeds, the method returns result code 0. When an asynchronous request is sent successfully, the method returns result code 700.
3. In synchronous mode, visionCallback is null, the result code is returned by the method, and the detection result is returned in text.
4. In asynchronous mode, visionCallback is not null, the value in text is invalid when the method returns (the text parameter is passed as null), and the actual recognition result is returned through the visionCallback callback.
5. IS_ASYNC is a boolean variable: false for synchronous mode, true for asynchronous mode.
Send the text recognition result to the main thread (received in MainAbilitySlice), as shown below:
public void sendResult(String value) {
if (textDetector != null) {
textDetector.release();
}
if (pixelMap != null) {
pixelMap.release();
pixelMap = null;
VisionManager.destroy();
}
if (value != null) {
maps.put(mediaId, value);
}
if ((maps != null) && (maps.size() == pictureLists.length)) {
InnerEvent event = InnerEvent.get(1, 0, maps);
handle.sendEvent(event);
} else {
wordRecognition(slice, pictureLists[index], handle);
index++;
}
}
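On the main thread, MainAbilitySlice receives this event in its MyEventHandle handler (the full handler is in section 7); the relevant branch looks roughly like this:
// Receiving side (excerpt of MyEventHandle.processEvent, see section 7):
// event ID 1 carries the Map of image resource ID -> recognized text.
if (event.eventId == ONE) {
if (event.object instanceof Map) {
imageInfos = (Map) event.object;
}
}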
5. Extracting Keywords from User Input
1. Obtain the context passed from MainAbilitySlice and perform word segmentation; in synchronous mode, call sendResult() to send the segmentation result to the main thread, as shown below:
public void wordSegment(Context context, String requestData, MainAbilitySlice.MyEventHandle myEventHandle) {
slice = context; // MainAbilitySlice.this
handle = myEventHandle; // the MyEventHandle object
// Initialize the NluClient singleton and connect to the service asynchronously.
NluClient.getInstance().init(context, new OnResultListener<Integer>() {
@Override
public void onResult(Integer resultCode) {
if (!IS_ASYNC) {
// Synchronous word segmentation
ResponseResult responseResult = NluClient.getInstance().getWordSegment(requestData,
NluRequestType.REQUEST_TYPE_LOCAL);
sendResult(responseResult.getResponseResult());
release();
} else {
// Asynchronous word segmentation
wordSegmentAsync(requestData);
}
}
}, true);
}
🕮 Note
1. IS_ASYNC is a boolean variable: false for synchronous mode, true for asynchronous mode.
2. A code value of 0 in the responseResult object indicates that word segmentation succeeded.
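The requestData argument is a small JSON payload. In this sample it is assembled in MainAbilitySlice from the content of the text field (see wordSegment() in section 7); a minimal sketch, with an illustrative keyword value:
// Sketch of building requestData for getWordSegment(); the keyword value is illustrative.
// The text value is quoted so that the payload is valid JSON.
String keyword = "垃圾分类"; // for example, what the user typed into the TextField
String requestData = "{\"text\":\"" + keyword + "\",\"type\":0}";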
2. The asynchronous request calls back this method, which sends the segmentation result to the main thread via sendResult(), as shown below:
private void wordSegmentAsync(String requestData) {
ResponseResult responseResult = NluClient.getInstance().getWordSegment(requestData,
NluRequestType.REQUEST_TYPE_LOCAL, new OnResultListener<ResponseResult>() {
@Override
public void onResult(ResponseResult asyncResult) {
sendResult(asyncResult.getResponseResult());
release();
}
});
}
3. Send the segmentation result to the main thread (received in MainAbilitySlice), as shown below:
private void sendResult(String result) {
List lists = null; // word segmentation result
// Convert the segmentation result contained in result into a list
if (result.contains("\"message\":\"success\"")) {
String words = result.substring(result.indexOf(WORDS) + STEP,
result.lastIndexOf("]")).replaceAll("\"", "");
if ((words == null) || ("".equals(words))) {
lists = new ArrayList(1); // no segmentation result; return "no keywords"
lists.add("no keywords");
} else {
lists = Arrays.asList(words.split(","));
}
}
InnerEvent event = InnerEvent.get(TWO, ZERO, lists);
handle.sendEvent(event);
}
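For reference, the string handling above assumes that getResponseResult() returns JSON of roughly the following shape (a hypothetical example; the exact fields may vary by engine version):
{"code":0,"message":"success","words":["垃圾","分类","人人","做"]}
sendResult() checks for "message":"success", skips the eight characters of the words":[ prefix (the STEP constant), strips the quotation marks, and splits the remainder on commas to build the keyword list.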
6. Matching Images by Keyword
1. Match the images against the keywords based on their recognized text, as shown below:
private void matchImage(List<String> list) {
Set<Integer> matchSets = new HashSet<>();
for (String str: list) { // iterate over the segmentation results
// imageInfos holds the general text recognition result of each image
for (Integer key : imageInfos.keySet()) {
if (imageInfos.get(key).indexOf(str) != NEG_ONE) {
matchSets.add(key);
}
}
}
// Collect the matched images
matchPictures = new int[matchSets.size()];
int i = 0;
for (int match: matchSets) {
matchPictures[i] = match;
i++;
}
// Display the images
setSelectPicture(matchPictures, LIST_CONTAINER_ID_MATCH);
}
2. Display the matching images on the page, as shown below:
private void setSelectPicture(int[] pictures, int id) {
// Get the images
PictureProvider newsTypeAdapter = new PictureProvider(pictures, this);
Component componentById = findComponentById(id);
if (componentById instanceof ListContainer) {
ListContainer listContainer = (ListContainer) componentById;
listContainer.setItemProvider(newsTypeAdapter);
}
}
Final result
Enter the keywords to be segmented in the text field below "请输入关键词" (Enter keywords), and tap the "开始通用文字识别" (Start general text recognition) button to search the images by keyword. The images that contain the keywords appear under "搜索结果" (Search results). For example:
● 垃圾分类人人做 做好分类为人人
● 可回收物 其他垃圾
7. Complete Sample Code
Layout and style files
1.base/graphic/background_ability_main.xml
<?xml version="1.0" encoding="UTF-8" ?>
<shape xmlns:ohos="http://schemas.huawei.com/res/ohos"
ohos:shape="rectangle">
<solid
ohos:color="#FFFFFF"/>
</shape>
2.base/graphic/background_ability_page.xml
<?xml version="1.0" encoding="UTF-8" ?>
<shape xmlns:ohos="http://schemas.huawei.com/res/ohos"
ohos:shape="rectangle">
<solid
ohos:color="#FFFAF0"/>
</shape>
3.base/graphic/button_element.xml
<?xml version="1.0" encoding="utf-8"?>
<shape
xmlns:ohos="http://schemas.huawei.com/res/ohos"
ohos:shape="rectangle">
<corners
ohos:radius="100"/>
<solid
ohos:color="#FF007DFE"/>
</shape>
4.base/layout/ability_main.xml
<?xml version="1.0" encoding="utf-8"?>
<DirectionalLayout
xmlns:ohos="http://schemas.huawei.com/res/ohos"
ohos:height="match_parent"
ohos:width="match_parent"
ohos:orientation="vertical"
ohos:background_element="$graphic:background_ability_page"
>
<Text
ohos:id="$+id:text_helloworld"
ohos:height="match_content"
ohos:width="match_content"
ohos:background_element="$graphic:background_ability_main"
ohos:layout_alignment="horizontal_center"
ohos:text="关键词搜索图片"
ohos:text_size="30fp"
ohos:top_margin="5vp"
/>
<Text
ohos:id="$+id:picture_list"
ohos:height="match_content"
ohos:width="match_content"
ohos:background_element="$graphic:background_ability_main"
ohos:layout_alignment="horizontal_center"
ohos:text="图片列表"
ohos:text_size="20fp"
ohos:top_margin="15vp"
/>
<ListContainer
ohos:id="$+id:picture_list_show"
ohos:height="200vp"
ohos:width="match_parent"
ohos:orientation="horizontal"
ohos:left_margin="5vp"
ohos:right_margin="5vp"
/>
<Text
ohos:id="$+id:word_seg_title"
ohos:height="match_content"
ohos:width="match_content"
ohos:background_element="$graphic:background_ability_main"
ohos:left_margin="5vp"
ohos:text="请输入关键词:"
ohos:text_size="25fp"
ohos:top_margin="10vp"
/>
<TextField
ohos:id="$+id:word_seg_text"
ohos:height="match_content"
ohos:width="match_parent"
ohos:background_element="$graphic:background_ability_main"
ohos:hint="Enter a statement."
ohos:left_padding="5vp"
ohos:right_padding="5vp"
ohos:text_alignment="vertical_center"
ohos:text_size="20fp"
ohos:top_margin="5vp"/>
<Button
ohos:id="$+id:button_search"
ohos:width="match_content"
ohos:height="match_content"
ohos:text_size="20fp"
ohos:text="开始通用文字识别"
ohos:layout_alignment="horizontal_center"
ohos:top_margin="10vp"
ohos:top_padding="1vp"
ohos:bottom_padding="1vp"
ohos:right_padding="20vp"
ohos:left_padding="20vp"
ohos:text_color="white"
ohos:background_element="$graphic:button_element"
ohos:center_in_parent="true"
ohos:align_parent_bottom="true"
ohos:bottom_margin="5vp"/>
<Text
ohos:id="$+id:picture_list_result"
ohos:height="match_content"
ohos:width="match_content"
ohos:background_element="$graphic:background_ability_main"
ohos:layout_alignment="horizontal_center"
ohos:text="搜索结果"
ohos:text_size="20fp"
ohos:top_margin="5vp"
/>
<ListContainer
ohos:id="$+id:picture_list_match"
ohos:height="200vp"
ohos:width="match_parent"
ohos:orientation="horizontal"
ohos:left_margin="5vp"
ohos:right_margin="5vp"
/>
</DirectionalLayout>
5.base/layout/item_image_layout.xml
<?xml version="1.0" encoding="utf-8"?>
<DirectionalLayout xmlns:ohos="http://schemas.huawei.com/res/ohos"
ohos:height="200vp"
ohos:width="205vp">
<Image
ohos:id="$+id:select_picture_list"
ohos:height="200vp"
ohos:width="200vp"
ohos:layout_alignment="horizontal_center"
ohos:top_margin="1vp"
ohos:scale_mode="stretch"
/>
</DirectionalLayout>
Logic code
1.com/huawei/searchimagebykeywords/provider/PictureProvider
import com.huawei.searchimagebykeywords.ResourceTable;
import ohos.agp.components.BaseItemProvider;
import ohos.agp.components.Component;
import ohos.agp.components.ComponentContainer;
import ohos.agp.components.Image;
import ohos.agp.components.LayoutScatter;
import ohos.app.Context;
import java.util.Optional;
public class PictureProvider extends BaseItemProvider {
private int[] pictureLists;
private Context context;
/**
* picture provider
*
* @param pictureLists pictureLists
* @param context context
*/
public PictureProvider(int[] pictureLists, Context context) {
this.pictureLists = pictureLists;
this.context = context;
}
@Override
public int getCount() {
return pictureLists == null ? 0 : pictureLists.length;
}
@Override
public Object getItem(int position) {
return Optional.of(this.pictureLists[position]);
}
@Override
public long getItemId(int position) {
return position;
}
@Override
public Component getComponent(int var1, Component var2, ComponentContainer var3) {
ViewHolder viewHolder = null;
Component component = var2;
if (component == null) {
component = LayoutScatter.getInstance(context).parse(ResourceTable.Layout_item_image_layout,
null, false);
viewHolder = new ViewHolder();
Component componentImage = component.findComponentById(ResourceTable.Id_select_picture_list);
if (componentImage instanceof Image) {
viewHolder.image = (Image) componentImage;
}
component.setTag(viewHolder);
} else {
if (component.getTag() instanceof ViewHolder) {
viewHolder = (ViewHolder) component.getTag();
}
}
if (viewHolder != null) {
viewHolder.image.setPixelMap(pictureLists[var1]);
}
return component;
}
private static class ViewHolder {
Image image;
}
}
2.com/huawei/searchimagebykeywords/slice/MainAbilitySlice
import com.huawei.searchimagebykeywords.ResourceTable;
import com.huawei.searchimagebykeywords.provider.PictureProvider;
import com.huawei.searchimagebykeywords.util.WordRecognition;
import com.huawei.searchimagebykeywords.util.WordSegment;
import ohos.aafwk.ability.AbilitySlice;
import ohos.aafwk.content.Intent;
import ohos.agp.components.Button;
import ohos.agp.components.Component;
import ohos.agp.components.ListContainer;
import ohos.agp.components.TextField;
import ohos.app.Context;
import ohos.eventhandler.EventHandler;
import ohos.eventhandler.EventRunner;
import ohos.eventhandler.InnerEvent;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
public class MainAbilitySlice extends AbilitySlice {
private static final int LIST_CONTAINER_ID_SHOW = ResourceTable.Id_picture_list_show;
private static final int LIST_CONTAINER_ID_MATCH = ResourceTable.Id_picture_list_match;
private static final int NEG_ONE = -1;
private static final int ZERO = 0;
private static final int ONE = 1;
private static final int TWO = 2;
private Context slice;
private EventRunner runner;
private MyEventHandle myEventHandle;
private int[] pictureLists = new int[]{ResourceTable.Media_1, ResourceTable.Media_2,
ResourceTable.Media_3, ResourceTable.Media_4, ResourceTable.Media_5,
ResourceTable.Media_6, ResourceTable.Media_7, ResourceTable.Media_8};
private Component selectComponent;
private int selectPosition;
private Button button;
private TextField textField;
private Map<Integer, String> imageInfos;
private int[] matchPictures;
@Override
public void onStart(Intent intent) {
super.onStart(intent);
super.setUIContent(ResourceTable.Layout_ability_main);
slice = MainAbilitySlice.this;
// Display the image list
setSelectPicture(pictureLists, LIST_CONTAINER_ID_SHOW);
// Run general text recognition on all images
wordRecognition();
// Set the sentence to be segmented
Component componentText = findComponentById(ResourceTable.Id_word_seg_text);
if (componentText instanceof TextField) {
textField = (TextField) componentText;
}
// Tap the button to start the keyword search (word segmentation)
Component componentSearch = findComponentById(ResourceTable.Id_button_search);
if (componentSearch instanceof Button) {
button = (Button) componentSearch;
button.setClickedListener(listener -> wordSegment());
}
}
@Override
public void onActive() {
super.onActive();
}
@Override
public void onForeground(Intent intent) {
super.onForeground(intent);
}
// Set up the image display area
private void setSelectPicture(int[] pictures, int id) {
// Get the images
PictureProvider newsTypeAdapter = new PictureProvider(pictures, this);
Component componentById = findComponentById(id);
if (componentById instanceof ListContainer) {
ListContainer listContainer = (ListContainer) componentById;
listContainer.setItemProvider(newsTypeAdapter);
}
}
// General text recognition
private void wordRecognition() {
initHandler();
WordRecognition wordRecognition = new WordRecognition();
wordRecognition.setParams(slice, pictureLists, myEventHandle);
wordRecognition.sendResult(null);
}
// Word segmentation
private void wordSegment() {
// Assemble the keyword text into the request payload for word segmentation
String requestData = "{\"text\":\"" + textField.getText() + "\",\"type\":0}"; // quote the text value so the payload is valid JSON
initHandler();
new WordSegment().wordSegment(slice, requestData, myEventHandle);
}
// Match images
private void matchImage(List<String> list) {
Set<Integer> matchSets = new HashSet<>();
for (String str: list) {
for (Integer key : imageInfos.keySet()) {
if (imageInfos.get(key).indexOf(str) != NEG_ONE) {
matchSets.add(key);
}
}
}
// Collect the matched images
matchPictures = new int[matchSets.size()];
int i = 0;
for (int match: matchSets) {
matchPictures[i] = match;
i++;
}
// Display the images
setSelectPicture(matchPictures, LIST_CONTAINER_ID_MATCH);
}
private void initHandler() {
runner = EventRunner.getMainEventRunner();
if (runner == null) {
return;
}
myEventHandle = new MyEventHandle(runner);
}
public class MyEventHandle extends EventHandler {
MyEventHandle(EventRunner runner) throws IllegalArgumentException {
super(runner);
}
@Override
protected void processEvent(InnerEvent event) {
super.processEvent(event);
int eventId = event.eventId;
if (eventId == ONE) {
// General text recognition result
if (event.object instanceof Map) {
imageInfos = (Map) event.object;
}
}
if (eventId == TWO) {
// Word segmentation result
if (event.object instanceof List) {
List<String> lists = (List) event.object;
if ((lists.size() > ZERO) && (!"no keywords".equals(lists.get(ZERO)))) {
// Match images against the input keywords
matchImage(lists);
}
}
}
}
}
}
3.com/huawei/searchimagebykeywords/util/LogUtil
import ohos.hiviewdfx.HiLog;
import ohos.hiviewdfx.HiLogLabel;
public class LogUtil {
private static final String TAG_LOG = "LogUtil";
private static final HiLogLabel LABEL_LOG = new HiLogLabel(0, 0, LogUtil.TAG_LOG);
private static final String LOG_FORMAT = "%{public}s: %{public}s";
private LogUtil() {
}
public static void info(String tag, String msg) {
HiLog.info(LABEL_LOG, LOG_FORMAT, tag, msg);
}
public static void error(String tag, String msg) {
HiLog.error(LABEL_LOG, LOG_FORMAT, tag, msg);
}
}
4.com/huawei/searchimagebykeywords/util/WordRecognition
import com.huawei.searchimagebykeywords.slice.MainAbilitySlice;
import ohos.ai.cv.common.ConnectionCallback;
import ohos.ai.cv.common.VisionCallback;
import ohos.ai.cv.common.VisionConfiguration;
import ohos.ai.cv.common.VisionImage;
import ohos.ai.cv.common.VisionManager;
import ohos.ai.cv.text.ITextDetector;
import ohos.ai.cv.text.Text;
import ohos.ai.cv.text.TextConfiguration;
import ohos.ai.cv.text.TextDetectType;
import ohos.app.Context;
import ohos.eventhandler.InnerEvent;
import ohos.global.resource.NotExistException;
import ohos.global.resource.Resource;
import ohos.global.resource.ResourceManager;
import ohos.media.image.ImageSource;
import ohos.media.image.PixelMap;
import ohos.media.image.common.PixelFormat;
import ohos.media.image.common.Rect;
import ohos.media.image.common.Size;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
public class WordRecognition {
private static final boolean IS_ASYNC = false;
private static final int IS_ASYNC_CODE = 700;
private Context slice;
private ITextDetector textDetector;
private PixelMap pixelMap;
private MainAbilitySlice.MyEventHandle handle;
private int[] pictureLists;
private int mediaId;
private Map maps = new HashMap<>();
private int index;
private int result;
public void setParams(Context context, int[] pictureIds, MainAbilitySlice.MyEventHandle myEventHandle) {
slice = context;
pictureLists = pictureIds;
handle = myEventHandle;
}
public void wordRecognition(Context context, int resId, MainAbilitySlice.MyEventHandle myEventHandle) {
mediaId = resId;
// Instantiate the ITextDetector interface
textDetector = VisionManager.getTextDetector(context);
// Instantiate a VisionImage object and pass in the pixelMap of the image to detect
pixelMap = getPixelMap(resId);
VisionImage image = VisionImage.fromPixelMap(pixelMap);
// Define the VisionCallback<Text> callback, used in asynchronous mode
VisionCallback<Text> visionCallback = getVisionCallback();
// Define the ConnectionCallback callback to handle success or failure of the connection to the capability engine
ConnectionCallback connectionCallback = getConnectionCallback(image, visionCallback);
// Connect to the capability engine
VisionManager.init(context, connectionCallback);
}
private VisionCallback getVisionCallback() {
return new VisionCallback<Text>() {
@Override
public void onResult(Text text) {
sendResult(text.getValue());
}
@Override
public void onError(int i) {
}
@Override
public void onProcessing(float v) {
}
};
}
private ConnectionCallback getConnectionCallback(VisionImage image, VisionCallback<Text> visionCallback) {
return new ConnectionCallback() {
@Override
public void onServiceConnect() {
// Instantiate a Text object
Text text = new Text();
// Configure the running parameters of the text detector via TextConfiguration
TextConfiguration.Builder builder = new TextConfiguration.Builder();
builder.setProcessMode(VisionConfiguration.MODE_IN);
builder.setDetectType(TextDetectType.TYPE_TEXT_DETECT_FOCUS_SHOOT);
builder.setLanguage(TextConfiguration.AUTO);
TextConfiguration config = builder.build();
textDetector.setVisionConfiguration(config);
// Call the detect() method of ITextDetector and keep the result code
if (!IS_ASYNC) {
result = textDetector.detect(image, text, null); // synchronous
sendResult(text.getValue());
} else {
result = textDetector.detect(image, null, visionCallback); // asynchronous
}
}
@Override
public void onServiceDisconnect() {
// Release resources on success: result code 0 in synchronous mode, 700 in asynchronous mode
if ((!IS_ASYNC && (result == 0)) || (IS_ASYNC && (result == IS_ASYNC_CODE))) {
textDetector.release();
}
if (pixelMap != null) {
pixelMap.release();
pixelMap = null;
}
VisionManager.destroy();
}
};
}
public void sendResult(String value) {
if (textDetector != null) {
textDetector.release();
}
if (pixelMap != null) {
pixelMap.release();
pixelMap = null;
VisionManager.destroy();
}
if (value != null) {
maps.put(mediaId, value);
}
if ((maps != null) && (maps.size() == pictureLists.length)) {
InnerEvent event = InnerEvent.get(1, 0, maps);
handle.sendEvent(event);
} else {
wordRecognition(slice, pictureLists[index], handle);
index++;
}
}
// Get the image as a PixelMap
private PixelMap getPixelMap(int resId) {
ResourceManager manager = slice.getResourceManager();
byte[] datas = new byte[0];
try {
Resource resource = manager.getResource(resId);
datas = readBytes(resource);
resource.close();
} catch (IOException | NotExistException e) {
LogUtil.error("get pixelmap failed, read resource bytes failed, ", e.getLocalizedMessage());
}
ImageSource.SourceOptions srcOpts = new ImageSource.SourceOptions();
srcOpts.formatHint = "image/jpg";
ImageSource imageSource;
imageSource = ImageSource.create(datas, srcOpts);
ImageSource.DecodingOptions decodingOpts = new ImageSource.DecodingOptions();
decodingOpts.desiredSize = new Size(0, 0);
decodingOpts.desiredRegion = new Rect(0, 0, 0, 0);
decodingOpts.desiredPixelFormat = PixelFormat.ARGB_8888;
pixelMap = imageSource.createPixelmap(decodingOpts);
return pixelMap;
}
private static byte[] readBytes(Resource resource) {
final int bufferSize = 1024;
final int ioEnd = -1;
ByteArrayOutputStream output = new ByteArrayOutputStream();
byte[] buffers = new byte[bufferSize];
byte[] results = new byte[0];
try {
while (true) {
int readLen = resource.read(buffers, 0, bufferSize);
if (readLen == ioEnd) {
results = output.toByteArray();
break;
}
output.write(buffers, 0, readLen);
}
} catch (IOException e) {
LogUtil.error("WordRecognition.getPixelMap", "read resource failed ");
} finally {
// Close the output stream once, after reading completes
try {
output.close();
} catch (IOException e) {
LogUtil.error("WordRecognition.getPixelMap", "close output failed");
}
}
return results;
}
}
5.com/huawei/searchimagebykeywords/util/WordSegment
import com.huawei.searchimagebykeywords.slice.MainAbilitySlice;
import ohos.ai.nlu.NluClient;
import ohos.ai.nlu.NluRequestType;
import ohos.ai.nlu.OnResultListener;
import ohos.ai.nlu.ResponseResult;
import ohos.app.Context;
import ohos.eventhandler.InnerEvent;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
public class WordSegment {
private static final boolean IS_ASYNC = true;
private static final String WORDS = "words";
private static final int ZERO = 0;
private static final int TWO = 2;
private static final int STEP = 8;
private Context slice;
private MainAbilitySlice.MyEventHandle handle;
public void wordSegment(Context context, String requestData, MainAbilitySlice.MyEventHandle myEventHandle) {
slice = context;
handle = myEventHandle;
// Initialize the NluClient singleton and connect to the service asynchronously.
NluClient.getInstance().init(context, new OnResultListener<Integer>() {
@Override
public void onResult(Integer resultCode) {
if (!IS_ASYNC) {
// Synchronous word segmentation
ResponseResult responseResult = NluClient.getInstance().getWordSegment(requestData,
NluRequestType.REQUEST_TYPE_LOCAL);
sendResult(responseResult.getResponseResult());
release();
} else {
// Asynchronous word segmentation
wordSegmentAsync(requestData);
}
}
}, true);
}
private void wordSegmentAsync(String requestData) {
ResponseResult responseResult = NluClient.getInstance().getWordSegment(requestData,
NluRequestType.REQUEST_TYPE_LOCAL, new OnResultListener<ResponseResult>() {
@Override
public void onResult(ResponseResult asyncResult) {
sendResult(asyncResult.getResponseResult());
release();
}
});
}
private void sendResult(String result) {
List lists = null; // word segmentation result
// Convert the segmentation result contained in result into a list
if (result.contains("\"message\":\"success\"")) {
String words = result.substring(result.indexOf(WORDS) + STEP,
result.lastIndexOf("]")).replaceAll("\"", "");
if ((words == null) || ("".equals(words))) {
lists = new ArrayList(1);
lists.add("no keywords"); // 未识别到分词结果,返回"no keywords"
} else {
lists = Arrays.asList(words.split(","));
}
}
InnerEvent event = InnerEvent.get(TWO, ZERO, lists);
handle.sendEvent(event);
}
private void release() {
NluClient.getInstance().destroy(slice);
}
}
6.com/huawei/searchimagebykeywords/MainAbility
import com.huawei.searchimagebykeywords.slice.MainAbilitySlice;
import ohos.aafwk.ability.Ability;
import ohos.aafwk.content.Intent;
public class MainAbility extends Ability {
@Override
public void onStart(Intent intent) {
super.onStart(intent);
super.setMainRoute(MainAbilitySlice.class.getName());
}
}
7.com/huawei/searchimagebykeywords/MyApplication
import ohos.aafwk.ability.AbilityPackage;
public class MyApplication extends AbilityPackage {
@Override
public void onInitialize() {
super.onInitialize();
}
}
🕮 Note
The code above is for demo reference only. Production code needs to consider data validation and internationalization.
8. Congratulations
By working through this tutorial, you have learned how to use the general text recognition and word segmentation AI capabilities.
@ Reposted from HUAWEI Codelabs
Learning some advanced AI recognition technology.
👍👍👍
There are a lot of AI features here and we didn't know where to start; good material to study.
👍👍👍
How does recognition work through a single API, without loading a model and without calling a remote API? Can anyone explain the principle? It has always puzzled me.
I ran this code without any changes and it couldn't recognize anything,
let alone modifying it to recognize the text in an image and output it...
It needs to run on a real device or a remote real device, because the emulator does not have the chip.
So the emulator can't run the AI module at all.
It can.
I followed the official AI general text recognition documentation and it runs successfully on the emulator.
Are you recognizing documents or images? Recognizing images does run, but the AI module returns null for the text in the image; I checked in the debugger. If possible, could you send me a screenshot of a successful recognition and match?
For real-device testing with DevEco Studio, does the device need to be upgraded to HarmonyOS? Can it be debugged?
Here is the original image.
Here is the recognition result.
The code on HUAWEI Codelabs can usually be run as-is; sometimes you just need to tweak the configuration files.
Overall, the AI capabilities offer a pretty good experience.
By chance I looked at the beta version, found the remote real device feature, and got it working.
You were probably using a remote real device, not the built-in emulator, right?
OP, could you share the project package? I copied all the code but it only runs superficially; tapping the buttons does nothing, and I don't know which step went wrong.