Integrating Youdao Zhiyun AI into Unity: Image Translation

2022-08-29 17:02:00

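I. API Overview

Built on text recognition (OCR) and text translation, this API translates the text found in an image. Send a POST request to the image translation API with the image's Base64 encoding plus the source and target languages, and it recognizes the text in the image and returns the translation.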

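Protocol notes: as the code in Section III shows, requests go over HTTPS using POST with an application/x-www-form-urlencoded body, and the response is JSON.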

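II. Applying for an AppID and Secret

1. Log in to the Youdao Zhiyun AI open platform and open the console.

2. Create an application under the application overview and fill in the required details.

3. Copy the application ID and secret.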

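III. Using the API in Unity

1. Define the request data structure

Define a data structure that matches the request parameters described in the official documentation: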

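using System;
using System.Security.Cryptography;
using System.Text;
using System.Web; // HttpUtility; the project must reference System.Web

/// <summary>
/// Request data for image translation
/// </summary>
[Serializable]
public class OcrTransRequest
{
    /// <summary>
    /// Upload type
    /// Only Base64 is currently supported; set this field to 1
    /// </summary>
    public string type;
    /// <summary>
    /// Source language
    /// </summary>
    public string from;
    /// <summary>
    /// Target language
    /// </summary>
    public string to;
    /// <summary>
    /// Application ID
    /// </summary>
    public string appKey;
    /// <summary>
    /// UUID (universally unique identifier)
    /// </summary>
    public string salt;
    /// <summary>
    /// Signature: MD5 of
    /// appKey + q + salt + appSecret
    /// </summary>
    public string sign;
    /// <summary>
    /// Audio format of the translation result; mp3 is supported
    /// </summary>
    public string ext;
    /// <summary>
    /// Image to recognize
    /// When type is 1, the Base64 encoding of the image
    /// </summary>
    public string q;
    /// <summary>
    /// Server response type; only json is currently supported
    /// </summary>
    public string docType;
    /// <summary>
    /// Whether the server should return a rendered image: 0 = no, 1 = yes
    /// </summary>
    public string render;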

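    public OcrTransRequest(string from, string to, string appKey, string appSecret, string q)
    {
        type = "1";
        this.from = from;
        this.to = to;
        this.appKey = appKey;
        // The salt is meant to be a UUID, so use a GUID instead of the current millisecond (0-999)
        salt = Guid.NewGuid().ToString();

        // Concatenate appKey, q, salt, and appSecret (q is the raw Base64, before URL encoding)
        string signStr = appKey + q + salt + appSecret;
        // Convert the string to bytes
        byte[] inputBytes = Encoding.UTF8.GetBytes(signStr);
        // Compute the MD5 hash and format it as a hex string without dashes
        byte[] hashBytes = new MD5CryptoServiceProvider().ComputeHash(inputBytes);
        sign = BitConverter.ToString(hashBytes).Replace("-", "");

        // URL-encode q, since Base64 can contain '+', '/', and '='
        this.q = HttpUtility.UrlEncode(q);
        ext = "mp3";
        docType = "json";
        render = "0";
    }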
    public override string ToString()
    {
        return string.Format("from={0}&to={1}&type={2}&q={3}&appKey={4}&salt={5}&sign={6}", from, to, type, q, appKey, salt, sign);
    }
}
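
The signing step in the constructor above can be pulled out into a small helper so it can be checked against a known MD5 value. This is a minimal sketch, assuming the uppercase hex format that the original code produces; `SignSketch` and `ComputeSign` are hypothetical names and not part of any Youdao SDK.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SignSketch
{
    // Hypothetical helper: returns MD5(appKey + q + salt + appSecret)
    // as an uppercase hex string without separators.
    public static string ComputeSign(string appKey, string q, string salt, string appSecret)
    {
        // q must be the raw Base64 string, not the URL-encoded form.
        byte[] input = Encoding.UTF8.GetBytes(appKey + q + salt + appSecret);
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(input);
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }
}
```

Calling `SignSketch.ComputeSign(appKey, base64, salt, appSecret)` with the same inputs as the constructor should give the same `sign` value, which makes mismatched-signature errors easier to track down.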

2. Define the response data structure

Define a data structure that matches the response fields described in the official documentation:

using System;

/// <summary>
/// Response data for image translation
/// </summary>
[Serializable]
public class OcrTransResponse
{
    /// <summary>
    /// Orientation of the image
    /// </summary>
    public string orientation;
    /// <summary>
    /// Language of the text in the image
    /// </summary>
    public string lanFrom;
    /// <summary>
    /// Skew angle of the image
    /// </summary>
    public string textAngle;
    /// <summary>
    /// Error code
    /// </summary>
    public string errorCode;
    /// <summary>
    /// Target language
    /// </summary>
    public string lanTo;
    /// <summary>
    /// Detailed translation results, one entry per region
    /// </summary>
    public ResRegion[] resRegions;
}
/// <summary>
/// Translation result for one region of the image
/// </summary>
[Serializable]
public class ResRegion
{
    /// <summary>
    /// Region bounds, four values:
    /// top-left x, top-left y, region width, region height
    /// </summary>
    public string boundingBox;
    /// <summary>
    /// Number of lines
    /// </summary>
    public int linesCount;
    /// <summary>
    /// Line height
    /// </summary>
    public int lineheight;
    /// <summary>
    /// Original text of the region
    /// </summary>
    public string context;
    /// <summary>
    /// Line spacing
    /// </summary>
    public int lineSpace;
    /// <summary>
    /// Translated text
    /// </summary>
    public string tranContent;
    /// <summary>
    /// The lines result, returned when render=1 (a rendered image is requested)
    /// </summary>
    public string lines;
    /// <summary>
    /// Image color
    /// </summary>
    public string color;
    /// <summary>
    /// Recognition result for a line
    /// </summary>
    public string text;
    /// <summary>
    /// Recognition result for individual characters
    /// </summary>
    public string word;
    /// <summary>
    /// Text height
    /// </summary>
    public int textHeight;
}
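
Because `boundingBox` arrives as a single string holding four values, it usually has to be split before it can be used for layout. The sketch below assumes the values are comma-separated integers in the order given in the comment above (x, y, width, height); the separator is an assumption, and `RegionRect` and `BoundingBoxSketch` are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Globalization;

// Hypothetical container for the four parsed values.
public struct RegionRect
{
    public int X;
    public int Y;
    public int Width;
    public int Height;
}

public static class BoundingBoxSketch
{
    // Parses "x,y,width,height" (comma separator assumed).
    // Returns false instead of throwing when the input is malformed.
    public static bool TryParse(string boundingBox, out RegionRect rect)
    {
        rect = new RegionRect();
        if (string.IsNullOrEmpty(boundingBox)) return false;

        string[] parts = boundingBox.Split(',');
        if (parts.Length != 4) return false;

        int[] values = new int[4];
        for (int i = 0; i < 4; i++)
        {
            // Invariant culture keeps parsing independent of device locale.
            if (!int.TryParse(parts[i], NumberStyles.Integer,
                              CultureInfo.InvariantCulture, out values[i]))
            {
                return false;
            }
        }

        rect.X = values[0];
        rect.Y = values[1];
        rect.Width = values[2];
        rect.Height = values[3];
        return true;
    }
}
```

Returning a bool rather than throwing lets the caller skip a malformed region and keep processing the rest of `resRegions`.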

3. Wrap the API call in a function

using System.IO;
using System.Net;
using System.Text;
using UnityEngine;

/// <summary>
/// Image translation
/// </summary>
public class OcrTrans
{
    // Application ID and secret, obtained by creating an application on the Youdao Zhiyun AI open platform
    private static readonly string appid = "";
    private static readonly string secret = "";

    public static OcrTransResponse SendRequest(string from, string to, string picbase64)
    {
        string result = "";
        string url = "https://openapi.youdao.com/ocrtransapi";
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        // Build the form-encoded request body
        OcrTransRequest otr = new OcrTransRequest(from, to, appid, secret, picbase64);
        byte[] data = Encoding.UTF8.GetBytes(otr.ToString());
        request.ContentLength = data.Length;
        using (Stream reqStream = request.GetRequestStream())
        {
            reqStream.Write(data, 0, data.Length);
        }
        // Read the JSON response; the using blocks dispose the response and its stream
        using (HttpWebResponse resp = (HttpWebResponse)request.GetResponse())
        using (Stream stream = resp.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
        {
            result = reader.ReadToEnd();
        }
        Debug.Log(result);
        // Deserialize the JSON into the response structure
        OcrTransResponse response = JsonUtility.FromJson<OcrTransResponse>(result);
        return response;
    }
}
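
The request body is sent as application/x-www-form-urlencoded, so the Base64 image has to be percent-encoded: its '+', '/', and '=' characters would otherwise be misread by the server. If System.Web (and with it `HttpUtility`) is not referenced in the project, `Uri.EscapeDataString` is one possible substitute. Some runtimes reject very long inputs to that method, so this sketch encodes in chunks. The names `FormEncodingSketch`, `Escape`, `Build`, and the chunk size are assumptions made for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FormEncodingSketch
{
    // Assumed chunk size, chosen to stay well below the input-length
    // limit that Uri.EscapeDataString enforces on some runtimes.
    private const int ChunkSize = 8192;

    // Percent-encodes a value piece by piece. Splitting at arbitrary
    // offsets is safe for ASCII input such as Base64; text containing
    // surrogate pairs would need pair-aware splitting.
    public static string Escape(string value)
    {
        StringBuilder sb = new StringBuilder(value.Length + value.Length / 4);
        for (int i = 0; i < value.Length; i += ChunkSize)
        {
            int len = Math.Min(ChunkSize, value.Length - i);
            sb.Append(Uri.EscapeDataString(value.Substring(i, len)));
        }
        return sb.ToString();
    }

    // Joins key/value pairs into "k1=v1&k2=v2", escaping both sides.
    public static string Build(IEnumerable<KeyValuePair<string, string>> fields)
    {
        StringBuilder sb = new StringBuilder();
        foreach (KeyValuePair<string, string> field in fields)
        {
            if (sb.Length > 0) sb.Append('&');
            sb.Append(Escape(field.Key)).Append('=').Append(Escape(field.Value));
        }
        return sb.ToString();
    }
}
```

With a helper like this, the fields would stay unencoded inside the request object and be escaped only once, when the body is built, which avoids accidentally signing or double-encoding the escaped form of `q`.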

4. Test

using System;
using System.IO;
using UnityEngine;

public class OcrTransExample : MonoBehaviour
{
    private void Start()
    {
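        // Read the test image from the project's Assets folder
        byte[] bytes = File.ReadAllBytes(Application.dataPath + "/test.png");
        // Encode the image as Base64 and translate from English to Simplified Chinese
        string base64 = Convert.ToBase64String(bytes);
        OcrTrans.SendRequest("en", "zh-CHS", base64);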
    }
}

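Language codes: see the official documentation for the full list. The test above uses "en" (English) and "zh-CHS" (Simplified Chinese).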
