Implementing a Client for Tencent Cloud Streaming TTS Speech Synthesis

2019-08-29 01:21:31

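Introduction to Tencent Cloud Streaming TTS

API documentation: https://cloud.tencent.com/document/api/441/19499

The interface takes its request parameters as JSON and does not yet support Cloud API 3.0 authentication. The response is delivered with HTTP chunked transfer encoding, and the audio is available either as Opus-compressed fragments or as a raw PCM stream. Starting from authentication, this article walks through a client implementation of streaming TTS in detail.

API Authentication

1. Build the JSON request parameters. A TreeMap is used to store them so that the parameters are easy to keep sorted.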

 mRequestMap.put("Action", "TextToStreamAudio");
 mRequestMap.put("Text", text);
 mRequestMap.put("SessionId", "session-1234");
 mRequestMap.put("AppId", "1255824371");
 // Timestamps are in seconds; the request expires 600 seconds after it is created
 mRequestMap.put("Timestamp", "" + System.currentTimeMillis() / 1000L);
 mRequestMap.put("Expired", "" + (System.currentTimeMillis() / 1000L + 600));
 mRequestMap.put("Speed", "0");
 mRequestMap.put("SecretId", SECRET_ID);
 mRequestMap.put("VoiceType", 0 + "");
 mRequestBody = (new JSONObject(mRequestMap)).toString();

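2. Generate the signature: concatenate the string in the required format, sign it with HMAC-SHA1, and Base64-encode the result. Read the authentication documentation carefully here, because this step is easy to get wrong.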

private static String generateSign(TreeMap<String, String> params) {
        String paramStr = "POST" + DOMAIN_NAME + "?";
        StringBuilder builder = new StringBuilder(paramStr);
        // TreeMap iterates in key order, so the parameters are appended already sorted
        for (Map.Entry<String, String> entry : params.entrySet()) {
            builder.append(String.format(Locale.CHINESE, "%s=%s", entry.getKey(), String.valueOf(entry.getValue())))
                    .append("&");
        }

        // Remove the trailing '&'
        builder.deleteCharAt(builder.lastIndexOf("&"));

        String sign = "";
        String source = builder.toString();
        System.out.println(source);
        Mac mac = null;
        try {
            mac = Mac.getInstance("HmacSHA1");
            SecretKeySpec keySpec = new SecretKeySpec(SECRET_KEY.getBytes(), "HmacSHA1");
            mac.init(keySpec);
            mac.update(source.getBytes());
            // android.util.Base64: NO_WRAP (value 2) keeps the output on a single line
            sign = Base64.encodeToString(mac.doFinal(), Base64.NO_WRAP);
        } catch (NoSuchAlgorithmException | InvalidKeyException e) {
            e.printStackTrace();
        }

        System.out.println("Generated signature: " + sign);
        return sign;
    }
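
For readers who want to try the signing step outside Android, the following is a minimal sketch using only the standard JDK (`java.util.Base64` instead of `android.util.Base64`). The class name `TtsSignSketch`, the host `example.invalid/stream`, and the key and parameter values are placeholders chosen for illustration; the real host, path, and parameter set must come from the authentication documentation.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import java.util.Map;
import java.util.TreeMap;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, not part of any Tencent Cloud SDK.
final class TtsSignSketch {

    /** Builds "POST" + hostAndPath + "?k1=v1&k2=v2" with keys in sorted order. */
    static String buildSource(String hostAndPath, TreeMap<String, String> params) {
        StringBuilder sb = new StringBuilder("POST").append(hostAndPath).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
            first = false;
        }
        return sb.toString();
    }

    /** HMAC-SHA1 over the source string, Base64-encoded without line breaks. */
    static String sign(String source, String secretKey) throws GeneralSecurityException {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
        byte[] digest = mac.doFinal(source.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(digest);
    }

    public static void main(String[] args) throws Exception {
        TreeMap<String, String> params = new TreeMap<>();
        params.put("Text", "hello");
        params.put("Action", "TextToStreamAudio");
        params.put("AppId", "1234567890"); // placeholder value
        String source = buildSource("example.invalid/stream", params); // placeholder host
        System.out.println(source);
        System.out.println(sign(source, "placeholder-secret-key"));
    }
}
```

Appending the separator before every entry except the first avoids having to delete a trailing `&` afterwards.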

At this point we have a complete signature. Next comes the main part of this article: sending the network request and parsing the response.

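Chunked Transfer Encoding

Tencent Cloud returns the data using HTTP chunked transfer encoding. Unlike an ordinary HTTP response such as a JSON body, the data comes back in multiple fragments: the message body consists of an unspecified number of chunks and ends with a final chunk of size 0.

Each non-empty chunk starts with the number of bytes of data it contains (written in hexadecimal), followed by a CRLF (carriage return and line feed), then the data itself, and finally a CRLF that closes the chunk. In some implementations, whitespace (0x20) pads the gap between the chunk size and the CRLF.

The last chunk is a single line consisting of the chunk size (0), optional padding whitespace, and a CRLF. It carries no data, but optional trailers containing message header fields may follow it.

The message ends with a final CRLF. A complete chunked response looks like this: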

HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

25
This is the data in the first chunk

1C
and this is the second one

3
con
8
sequence
0

For a complete description of the chunk protocol, see the wiki article "Chunked transfer encoding".
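
To make the framing rules concrete, here is a minimal sketch of a decoder for a chunked body. It is for illustration only: `HttpURLConnection` already decodes chunked responses transparently, so the client in this article never has to do this itself. The class name `ChunkedBodySketch` is hypothetical, and the sketch ignores optional trailers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of chunked transfer decoding.
final class ChunkedBodySketch {

    /** Reads one CRLF-terminated line and returns it without the CRLF. */
    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\r') {
                in.read(); // consume the '\n' that follows the '\r'
                break;
            }
            sb.append((char) b);
        }
        return sb.toString();
    }

    /** Concatenates the data of all chunks until the zero-size chunk. */
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semi = sizeLine.indexOf(';'); // drop chunk extensions, if any
            if (semi >= 0) {
                sizeLine = sizeLine.substring(0, semi);
            }
            int size = Integer.parseInt(sizeLine.trim(), 16); // size is hexadecimal
            if (size == 0) {
                break; // last chunk
            }
            byte[] data = new byte[size];
            int read = 0;
            while (read < size) {
                int n = in.read(data, read, size - read);
                if (n < 0) {
                    throw new IOException("stream ended inside a chunk");
                }
                read += n;
            }
            out.write(data, 0, size);
            readLine(in); // CRLF that closes the chunk
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String body = "3\r\ncon\r\n8\r\nsequence\r\n0\r\n\r\n";
        byte[] decoded = decode(new ByteArrayInputStream(body.getBytes(StandardCharsets.US_ASCII)));
        System.out.println(new String(decoded, StandardCharsets.US_ASCII));
    }
}
```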

Requesting TTS Data

The code is shown below. We obtain the response input stream directly and read the data from it.

private static InputStream obtainResponseStreamWithJava(String postJsonBody, TreeMap<String, String> requestMap) throws IOException {
        // Send the POST request
        URL url = new URL(SERVER_URL);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        String authorization = generateSign(requestMap);
        conn.setRequestMethod("POST");
        conn.setDoOutput(true); // required before getOutputStream() can be used to write a body
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setRequestProperty("Authorization", authorization);
        conn.connect();
        OutputStream out = conn.getOutputStream();
        out.write(postJsonBody.getBytes("UTF-8"));
        out.flush();
        out.close();
        if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {//todo
            Log.w(TAG, "HTTP Code: " + conn.getResponseCode());
        }
//        String result = new String(toByteArray(conn.getInputStream()), "UTF-8");
        // The chunked body is decoded by HttpURLConnection; the caller reads the payload directly
        InputStream inputStream = conn.getInputStream();
        return inputStream;
    }

OPUS

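According to the official documentation, the data comes in two forms: Opus-compressed audio and a raw PCM stream. The author found that Opus offers a good compression ratio (about 10:1), which saves a good deal of transfer time and network bandwidth.

Opus is an open-source library, but it is written in C. Because Android supports Opus playback only from 5.0 onward, supporting systems older than 5.0 requires compiling it into a .so library. See the Opus source code.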
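
As a rough sense of scale, the arithmetic can be sketched as follows. The PCM parameters (16 kHz, 16-bit, mono) are the ones the player code later in this article uses; the 10:1 ratio is the approximate figure quoted above, not a measured value, and the class name `BitrateSketch` is hypothetical.

```java
// Hypothetical helper for back-of-the-envelope bitrate estimates.
final class BitrateSketch {

    /** Raw PCM bitrate in bits per second. */
    static long pcmBitsPerSecond(int sampleRateHz, int bitsPerSample, int channels) {
        return (long) sampleRateHz * bitsPerSample * channels;
    }

    public static void main(String[] args) {
        long pcm = pcmBitsPerSecond(16000, 16, 1); // 256000 bit/s of raw PCM
        long opusEstimate = pcm / 10;              // assuming the quoted ~10:1 ratio
        System.out.println(pcm + " bit/s PCM, roughly " + opusEstimate + " bit/s as Opus");
    }
}
```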

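Parsing TTS Data

This part mainly follows the Java sample on the official site: read in a loop, repeatedly reading the head, sequence number, length, and audio data in the format described below, until the end of the data is reached.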

[Figure: TTS fragment format]
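
The layout described above (4-byte head, 4-byte sequence number, 4-byte length, then the audio bytes) can be sketched in isolation as follows. This is a simplified illustration, not the official sample: the class `TtsFrameSketch` is hypothetical, the integers are assumed to be little-endian, the head bytes used in `main` are made up, and the 5000-byte upper bound simply mirrors the sanity check in the article's own parsing code.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical, simplified reader for one TTS fragment.
final class TtsFrameSketch {

    static final class Frame {
        final byte[] head;
        final int seq;
        final byte[] audio;

        Frame(byte[] head, int seq, byte[] audio) {
            this.head = head;
            this.seq = seq;
            this.audio = audio;
        }
    }

    static Frame readFrame(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        byte[] prefix = new byte[12];
        din.readFully(prefix); // throws EOFException if the stream ends early

        ByteBuffer buf = ByteBuffer.wrap(prefix).order(ByteOrder.LITTLE_ENDIAN);
        byte[] head = new byte[4];
        buf.get(head);
        int seq = buf.getInt();    // -1 marks the last fragment
        int length = buf.getInt();
        if (length <= 0 || length > 5000) {
            throw new IOException("implausible fragment length: " + length);
        }

        byte[] audio = new byte[length];
        din.readFully(audio);
        return new Frame(head, seq, audio);
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = {
            'o', 'p', 'u', 's',                             // head (made-up content)
            (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, // seq = -1
            3, 0, 0, 0,                                     // length = 3
            10, 20, 30                                      // audio payload
        };
        Frame f = readFrame(new ByteArrayInputStream(wire));
        System.out.println("seq=" + f.seq + ", audio bytes=" + f.audio.length);
    }
}
```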

The parsing code is as follows:

private void processProtocolBufferStream(final InputStream inputStream) throws DeserializationException {
            final long start = System.currentTimeMillis();

            YoutuOpusDecoder decoder = null;

            List<PcmData> pcmCache = new ArrayList<>();
            boolean fillSuccess;
            int pbPkgCount = -1;

            while (!Thread.currentThread().isInterrupted()) {
                pbPkgCount++;
                try {
                    //read head
                    byte[] headBuffer = new byte[4];
                    fillSuccess = fill(inputStream, headBuffer);
                    if (!fillSuccess) {
                        throw new ReadBufferException(String.format("read PB pkg#%s head fail, break;", pbPkgCount));
                    }
                    //read seq
                    byte[] seqBuffer = new byte[4];
                    fillSuccess = fill(inputStream, seqBuffer);
                    if (!fillSuccess) {
                        throw new ReadBufferException(String.format("read PB pkg#%s seq fail, break;", pbPkgCount));
                    }
                    int seq = bytesToInt(seqBuffer);
                    //read pkg size
                    byte[] pbPkgSizeHeader = new byte[4];
                    fillSuccess = fill(inputStream, pbPkgSizeHeader);
                    if (!fillSuccess) {
                        throw new ReadBufferException(String.format("read PB pkg#%s size header fail, break;", pbPkgCount));
                    }
                    int pbPkgSize = bytesToInt(pbPkgSizeHeader);
                    Log.i(TAG, String.format("PB pkg#%s size = %s", pbPkgCount, pbPkgSize));
                    if (pbPkgCount == 0) {
                        sTimeEnd = System.currentTimeMillis();
                        sTimeCost = sTimeEnd - sTimeStart;
                    }
                    if (pbPkgSize <= 0) {
                        throw new ReadBufferException(String.format("PB pkg#%s size %s <= 0, break;", pbPkgCount, pbPkgSize));
                    } else if (pbPkgSize > 5000) {
                        throw new ReadBufferException(String.format("PB pkg#%s size %s > 5000 bytes, too large, break;", pbPkgCount, pbPkgSize));
                    }

                    //read pb pkg
                    byte[] pbPkg = new byte[pbPkgSize];
                    fillSuccess = fill(inputStream, pbPkg);
                    if (!fillSuccess) {
                        throw new ReadBufferException(String.format("read PB pkg#%s fail, break;", pbPkgCount));
                    }

                    //init decoder
                    if (decoder == null) {
                        decoder = new YoutuOpusDecoder();
                        decoder.config();
                    }
                    //decode
                    Log.i("DEBUG-1", "seq:" + seq);
                    Pair<Integer, short[]> pair = decoder.decodeTTSData(seq, pbPkg);
                    short[] pcm = pair.second;

                    Log.d(TAG, (pcm == null ? "fail decode #" : "decode #") + pbPkgCount);

                    //packaging pcm
                    if (pcm == null) {
                        pcm = new short[0];
                    }
                    PcmData pcmData = new PcmData(pcm, seq == -1);

                    //stop check
                    if (Thread.currentThread().isInterrupted()) {
                        Log.w(TAG, "pcm data ready, but thread is interrupted, break;");
                        break;
                    }

                    //init player
                    if (mOpusPlayer == null) {
                        mOpusPlayer = new OpusPlayer();
                        mOpusPlayer.setPcmSampleRate(16000);
                        mOpusPlayer.setUncaughtExceptionHandler(new UncaughtExceptionHandler() {
                            @Override
                            public void uncaughtException(Thread thread, Throwable ex) {
                                if (mTtsExceptionHandler != null) {
                                    mTtsExceptionHandler.onPlayException(thread, ex);
                                }
                            }
                        });
                    }

                    //enqueue
                    if (pbPkgCount < mCacheCount) {// buffer the first packets before playback starts
                        pcmCache.add(pcmData);
                    } else {//enqueue
                        for (PcmData d : pcmCache) {
                            mOpusPlayer.enqueue(d);
                        }
                        pcmCache.clear();
                        mOpusPlayer.enqueue(pcmData);
                    }

                    //end
                    if (seq == -1) {
                        long ms = System.currentTimeMillis() - start;
                        Log.d(TAG, "finish last pb pkg#" + pbPkgCount + ", total cost time " + ms + " ms");
                        break;
                    }
                } catch (Exception e) {
                    if (mOpusPlayer != null) {
                        mOpusPlayer.forceStop();
                    }
                    if (e instanceof InterruptedIOException) {
                        Log.i(TAG, "Interrupted while reading server response InputStream", e);// normal flow, no need to rethrow
                        break;// the player has been stopped, so there is nothing left to read for
                    } else {
                        throw new DeserializationException(e);
                    }
                }
            }
        }

The fields are read in little-endian byte order. The helper that fills a buffer from the stream is as follows:

    /**
     * Reads from the InputStream into buffer until the buffer is full.
     *
     * @return false if the InputStream ends before the buffer is filled.
     * @throws IOException if reading from the stream fails
     */
    private static boolean fill(InputStream in, byte[] buffer) throws IOException {
        int length = buffer.length;
        int hasRead = 0;
        while (true) {
            int offset = hasRead;
            int count = length - hasRead;
            // read() may return fewer bytes than requested, so keep reading
            int currentRead = in.read(buffer, offset, count);
            if (currentRead >= 0) {
                hasRead += currentRead;
                if (hasRead == length) {
                    return true;
                }
            }
            if (currentRead == -1) {
                return false;
            }
        }
    }
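
The parsing code calls a `bytesToInt` helper that this article does not show. Assuming it converts four little-endian bytes into an `int`, as the text states, a minimal sketch could look like this (the wrapper class name `LittleEndianSketch` is hypothetical):

```java
// Hypothetical stand-in for the bytesToInt helper that is not shown in the article.
final class LittleEndianSketch {

    /** Interprets b[0..3] as a little-endian 32-bit signed integer. */
    static int bytesToInt(byte[] b) {
        // Mask each byte with 0xFF so that negative byte values do not sign-extend
        return (b[0] & 0xFF)
                | ((b[1] & 0xFF) << 8)
                | ((b[2] & 0xFF) << 16)
                | ((b[3] & 0xFF) << 24);
    }

    public static void main(String[] args) {
        System.out.println(bytesToInt(new byte[] {(byte) 0x88, 0x13, 0, 0}));
        System.out.println(bytesToInt(new byte[] {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF}));
    }
}
```

Four `0xFF` bytes decode to -1, which is the sequence number the parsing loop treats as the end marker.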

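TTS Audio Playback

After YoutuOpusDecoder decodes the parsed data, playback is handled by the OpusPlayer class. It does two things: it wraps AudioTrack playback of raw PCM audio, and it continuously feeds the decoded audio into the player.

The complete code is as follows: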

public class OpusPlayer {
    private static final String TAG = "OpusPlayer";

    private BlockingQueue<PcmData> mPcmQueue = new LinkedBlockingQueue<>();
    private volatile Thread mPlayThread;
    private int mPcmSampleRate;
    private UncaughtExceptionHandler mUncaughtExceptionHandler;

    public void setUncaughtExceptionHandler(UncaughtExceptionHandler handler) {
        mUncaughtExceptionHandler = handler;
    }

    public void setPcmSampleRate(int pcmSampleRate) {
        mPcmSampleRate = pcmSampleRate;
    }
    

    public void enqueue(PcmData pcmData) {
        mPcmQueue.add(pcmData);

        if (mPlayThread == null) {
            mPlayThread = new Thread(new Runnable() {

                PcmPlayer mPlayer;

                @Override
                public void run() {
                    Log.d(TAG, getThreadLogPrefix() + "start");
                    int playerPrepareFailCount = 0;
                    int playCount = 0;
                    long start = System.currentTimeMillis();

                    while (!Thread.currentThread().isInterrupted()) {
                        
                        // Prepare the player
                        boolean isPlayerReady = preparePlayerIfNeeded();
                        if (!isPlayerReady) {
                            releasePlayer();
                            playerPrepareFailCount++;
                            if (playerPrepareFailCount > 5) {
                                releasePlayer();
                                throw new RuntimeException("prepare player fail too many times, abort.");// give up retrying
                            } else {
                                Log.w(TAG, getThreadLogPrefix() + "prepare player fail, retry.");
                                continue;// try again
                            }
                        }

                        // Dequeue (blocks until data is available)
                        PcmData pcmData;
                        try {
                            pcmData = mPcmQueue.take();
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                            Log.d(TAG, getThreadLogPrefix() + "force stop");
                            break;
                        }
                        
                        // Play
                        if (pcmData != null) {
                            try {
                                short[] pcm = pcmData.getPcm();
                                if (pcm != null) {
                                    mPlayer.play(pcm);
                                    Log.d(TAG, getThreadLogPrefix() + "play #" + playCount);
                                } else {
                                    Log.d(TAG, getThreadLogPrefix() + "play #" + playCount + " fail, pcm == null !!");
                                }
                                if (pcmData.isLastOne()) {
                                    Log.d(TAG, getThreadLogPrefix() + "finish all task, will stop");
                                    break;
                                }
                                playCount++;
                            } catch (AudioTrackException e) {
                                e.printStackTrace();
                                releasePlayer();// the next iteration will try to re-initialize the player
                            }
                        } else {
                            Log.w(TAG, getThreadLogPrefix() + "mPcmQueue.take() == null, nothing to play");
                        }
                    }

                    releasePlayer();
                    long time = System.currentTimeMillis() - start;
                    Log.d(TAG, getThreadLogPrefix() + "stop, ran " + time + " ms");
                }

                /**
                 * @return true: player is ready
                 */
                boolean preparePlayerIfNeeded() {
                    if (mPlayer == null) {
                        mPlayer = new PcmPlayer();
                        try {
                            mPlayer.prepare(AudioManager.STREAM_MUSIC, mPcmSampleRate, AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);
                        } catch (AudioTrackException e) {
                            e.printStackTrace();
                            releasePlayer();
                        }
                    }
                    return mPlayer != null;
                }

                void releasePlayer() {
                    if (mPlayer != null) {
                        mPlayer.release();
                        mPlayer = null;
                    }
                }

            });
            mPlayThread.setPriority(Thread.NORM_PRIORITY - 1);// playback takes the longest; a slightly lower priority leaves more time for the decoding thread
            mPlayThread.setName(TAG + ".mPlayThread");
            if (mUncaughtExceptionHandler != null) {
                mPlayThread.setUncaughtExceptionHandler(mUncaughtExceptionHandler);
            }
            mPlayThread.start();
        }
    }

    private static String getThreadLogPrefix() {
        Thread currentThread = Thread.currentThread();
        String s = currentThread.getName() + "#" + currentThread.getId() + ": ";
        return s;
    }
    
    public void forceStop() {
        if (mPlayThread != null && !mPlayThread.isInterrupted()) {
            mPlayThread.interrupt();
            mPlayThread = null;
        }
        mPcmQueue.clear();
    }

    public static class PcmData {
        private final short[] mPcm;
        private final boolean mIsLastOne;

        public PcmData(short[] pcm, boolean isLastOne) {
            mPcm = pcm;
            mIsLastOne = isLastOne;
        }

        short[] getPcm() {
            return mPcm;
        }

        boolean isLastOne() {
            return mIsLastOne;
        }
    }


}
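
Stripped of the Android audio classes, the hand-off between the parsing thread and the playback thread is a producer-consumer queue that ends on a fragment flagged as the last one. The following is a minimal sketch of that pattern in plain Java; `PcmQueueSketch` and `Chunk` are hypothetical names, and counting samples stands in for actual playback.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration of the queue hand-off used by the player.
final class PcmQueueSketch {

    static final class Chunk {
        final short[] pcm;
        final boolean last;

        Chunk(short[] pcm, boolean last) {
            this.pcm = pcm;
            this.last = last;
        }
    }

    /** Takes chunks until the one flagged as last; returns the number of samples seen. */
    static long drain(BlockingQueue<Chunk> queue) throws InterruptedException {
        long samples = 0;
        while (true) {
            Chunk c = queue.take(); // blocks while the queue is empty
            samples += c.pcm.length; // a real player would write the samples to the audio device here
            if (c.last) {
                return samples;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Chunk> queue = new LinkedBlockingQueue<>();
        long[] total = new long[1];
        Thread consumer = new Thread(() -> {
            try {
                total[0] = drain(queue);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // treat interruption as a stop request
            }
        });
        consumer.start();

        // Producer side: two 20 ms fragments at 16 kHz, then an empty end marker
        queue.add(new Chunk(new short[320], false));
        queue.add(new Chunk(new short[320], false));
        queue.add(new Chunk(new short[0], true));

        consumer.join();
        System.out.println("samples played: " + total[0]);
    }
}
```

Using an explicit end marker lets the consumer finish the audio that is already queued, whereas interrupting the thread, as `forceStop()` does, abandons it.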
