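Introduction to Tencent Cloud Streaming TTS
Integration documentation: https://cloud.tencent.com/document/api/441/19499
The API takes its parameters as JSON and does not yet support Cloud API 3.0 authentication. The response uses HTTP chunked transfer encoding, and the data comes either as Opus-compressed fragments or as a raw PCM audio stream. Starting from authentication, this article walks through a client implementation of streaming TTS in detail.
API authentication
1. Build the JSON request parameters. A TreeMap stores them so that the keys stay sorted.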
mRequestMap.put("Action", "TextToStreamAudio");
mRequestMap.put("Text", text);
mRequestMap.put("SessionId", "session-1234");
mRequestMap.put("AppId", "1255824371");
// Timestamps are in seconds; the request expires 600 seconds after it is created.
mRequestMap.put("Timestamp", "" + System.currentTimeMillis() / 1000L);
mRequestMap.put("Expired", "" + (System.currentTimeMillis() / 1000L + 600));
mRequestMap.put("Speed", "0");
mRequestMap.put("SecretId", SECRET_ID);
mRequestMap.put("VoiceType", 0 + "");
mRequestBody = (new JSONObject(mRequestMap)).toString();
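2. Generate the signature: concatenate the string as required and sign it. Read the authentication documentation carefully here, because it is easy to get a detail wrong.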
private static String generateSign(TreeMap<String, String> params) {
    // Source string: "POST" + domain + "?" + sorted key=value pairs joined by "&"
    String paramStr = "POST" + DOMAIN_NAME + "?";
    StringBuilder builder = new StringBuilder(paramStr);
    for (Map.Entry<String, String> entry : params.entrySet()) {
        builder.append(String.format(Locale.CHINESE, "%s=%s", entry.getKey(), String.valueOf(entry.getValue())))
                .append("&");
    }
    // Remove the trailing "&"
    builder.deleteCharAt(builder.lastIndexOf("&"));
    String sign = "";
    String source = builder.toString();
    System.out.println(source);
    Mac mac = null;
    try {
        // HMAC-SHA1 with the secret key, then Base64 without line breaks
        mac = Mac.getInstance("HmacSHA1");
        SecretKeySpec keySpec = new SecretKeySpec(SECRET_KEY.getBytes(), "HmacSHA1");
        mac.init(keySpec);
        mac.update(source.getBytes());
        sign = Base64.encodeToString(mac.doFinal(), Base64.NO_WRAP);
    } catch (NoSuchAlgorithmException | InvalidKeyException e) {
        e.printStackTrace();
    }
    System.out.println("Generated signature: " + sign);
    return sign;
}
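To check the signing step off-device, the same idea can be sketched with only the JDK. This is a minimal sketch, not the official sample: the class and method names are hypothetical, `example.com/stream` in the usage note is a placeholder for the real domain and path, and `java.util.Base64` stands in for Android's `Base64` (on Android it needs API 26 or later).

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import java.util.Map;
import java.util.TreeMap;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper for illustration; not part of any Tencent Cloud SDK.
final class SignSketch {

    /** Builds "POST" + hostAndPath + "?" + key=value pairs in sorted key order, joined by "&". */
    static String buildSignSource(String hostAndPath, TreeMap<String, String> params) {
        StringBuilder sb = new StringBuilder("POST").append(hostAndPath).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
            first = false;
        }
        return sb.toString();
    }

    /** Computes HMAC-SHA1 over the source string and Base64-encodes it without line breaks. */
    static String hmacSha1Base64(String secretKey, String source) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            byte[] digest = mac.doFinal(source.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            // HmacSHA1 is a standard JCA algorithm, so this is not expected on a normal runtime.
            throw new IllegalStateException(e);
        }
    }
}
```

A call such as `SignSketch.hmacSha1Base64(secretKey, SignSketch.buildSignSource("example.com/stream", params))` would then produce the value for the Authorization header. Printing the source string before signing makes it easy to compare against the format in the authentication documentation.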
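At this point we have a complete signature string. Next comes the core of this article: making the network request and parsing the response.
Chunked transfer encoding
Tencent Cloud returns the result using HTTP chunked transfer encoding. Unlike an ordinary HTTP response such as a JSON body, the data comes back in multiple pieces. The message body consists of an unspecified number of chunks and ends with a final chunk of size 0.
Each non-empty chunk starts with the number of bytes of data it contains (written in hexadecimal), followed by a CRLF (carriage return and line feed), then the data itself, and finally a CRLF that ends the chunk. In some implementations, whitespace (0x20) is padded between the chunk size and the CRLF.
The last chunk is a single line consisting of the chunk size (0), optional padding whitespace, and a CRLF. It carries no data, but optional trailers, including message header fields, may follow it.
The message ends with a CRLF. A complete chunked response looks like this: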
HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

25
This is the data in the first chunk

1C
and this is the second one

3
con
8
sequence
0

For a complete picture of the chunked protocol, see the Wikipedia article on chunked transfer encoding.
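To make the framing rules concrete, here is a minimal sketch of a decoder for a chunked body. It is only an illustration under simplifying assumptions (trailers are skipped, chunk extensions after ";" are ignored), and the class name is hypothetical. In practice `HttpURLConnection` already removes the chunk framing, so `getInputStream()` yields the payload bytes and client code normally does not need this.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only.
final class ChunkedDecoderSketch {

    /** Reads one CRLF-terminated line and returns it without the CRLF. */
    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\r') {
                if (in.read() != '\n') {
                    throw new IOException("CR not followed by LF");
                }
                return sb.toString();
            }
            sb.append((char) b);
        }
        throw new EOFException("stream ended inside a line");
    }

    /** Decodes a chunked body (everything after the response headers) into its payload. */
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semi = sizeLine.indexOf(';');           // ignore chunk extensions
            if (semi >= 0) {
                sizeLine = sizeLine.substring(0, semi);
            }
            int size = Integer.parseInt(sizeLine.trim(), 16); // hex size, padding trimmed
            if (size == 0) {
                // Last chunk: skip optional trailer lines up to the closing empty line.
                while (!readLine(in).isEmpty()) {
                    // trailer ignored in this sketch
                }
                return out.toByteArray();
            }
            byte[] chunk = new byte[size];
            int read = 0;
            while (read < size) {
                int n = in.read(chunk, read, size - read);
                if (n == -1) {
                    throw new EOFException("stream ended inside a chunk");
                }
                read += n;
            }
            out.write(chunk, 0, size);
            if (!readLine(in).isEmpty()) {              // CRLF that terminates the chunk
                throw new IOException("missing CRLF after chunk data");
            }
        }
    }
}
```

Feeding it the body of the example response returns the two text lines followed by "consequence", because the CRLF after each of the first two lines is counted in the chunk sizes 0x25 and 0x1C.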
Requesting TTS data
The code is as follows. We obtain the response input stream directly and read the data from it.
private static InputStream obtainResponseStreamWithJava(String postJsonBody, TreeMap<String, String> requestMap) throws IOException {
    // Send the POST request
    URL url = new URL(SERVER_URL);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    String authorization = generateSign(requestMap);
    conn.setRequestMethod("POST");
    conn.setDoOutput(true); // required before a request body can be written
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setRequestProperty("Authorization", authorization);
    conn.connect();
    OutputStream out = conn.getOutputStream();
    out.write(postJsonBody.getBytes("UTF-8"));
    out.flush();
    out.close();
    if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) { // TODO: handle non-200 responses
        Log.w(TAG, "HTTP Code: " + conn.getResponseCode());
    }
    // String result = new String(toByteArray(conn.getInputStream()), "UTF-8");
    // Return the stream itself so the audio can be consumed while it is still arriving.
    InputStream inputStream = conn.getInputStream();
    return inputStream;
}
OPUS
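According to the official documentation, the data comes in two forms: Opus-compressed audio and a raw PCM audio stream. As far as I can tell, Opus offers a good compression ratio (about 10:1), which saves a good deal of transfer time and network bandwidth.
Opus is an open-source library, but it is written in C. Android supports Opus playback only from 5.0 onward, so if you need to support systems below 5.0 you have to compile a native .so library yourself. The Opus source code is publicly available.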
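As a rough feel for what a 10:1 ratio means, the arithmetic can be sketched as below. The 16 kHz, 16-bit, mono figures come from the player configuration used later in this article; the 10:1 ratio is the figure quoted above, and the actual Opus bitrate depends on the encoder settings on the server side, so treat the result as an estimate. The class and method names are hypothetical.

```java
// Hypothetical helper for a back-of-the-envelope bandwidth estimate.
final class AudioBandwidthSketch {

    /** Bytes per second of uncompressed PCM audio. */
    static int pcmBytesPerSecond(int sampleRateHz, int bitsPerSample, int channels) {
        return sampleRateHz * (bitsPerSample / 8) * channels;
    }

    /** Estimated bytes per second after compression at the given ratio (e.g. 10 for 10:1). */
    static int compressedBytesPerSecond(int pcmBytesPerSecond, int ratio) {
        return pcmBytesPerSecond / ratio;
    }
}
```

For 16 kHz, 16-bit mono audio, `pcmBytesPerSecond(16000, 16, 1)` is 32000 bytes per second, so a 10:1 ratio would bring that down to roughly 3200 bytes per second.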
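Parsing TTS data
This mainly follows the Java sample on the official site: read in a loop, and for each packet read the head, sequence number, length, and audio data in turn, until the end of the data is reached.
Sample code: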
private void processProtocolBufferStream(final InputStream inputStream) throws DeserializationException {
    final long start = System.currentTimeMillis();
    YoutuOpusDecoder decoder = null;
    List<PcmData> pcmCache = new ArrayList<>();
    boolean fillSuccess;
    int pbPkgCount = -1;
    while (!Thread.currentThread().isInterrupted()) {
        pbPkgCount++;
        try {
            // read head
            byte[] headBuffer = new byte[4];
            fillSuccess = fill(inputStream, headBuffer);
            if (!fillSuccess) {
                throw new ReadBufferException(String.format("read PB pkg#%s head fail, break;", pbPkgCount));
            }
            // read seq
            byte[] seqBuffer = new byte[4];
            fillSuccess = fill(inputStream, seqBuffer);
            if (!fillSuccess) {
                throw new ReadBufferException(String.format("read PB pkg#%s seq fail, break;", pbPkgCount));
            }
            int seq = bytesToInt(seqBuffer);
            // read pkg size
            byte[] pbPkgSizeHeader = new byte[4];
            fillSuccess = fill(inputStream, pbPkgSizeHeader);
            if (!fillSuccess) {
                throw new ReadBufferException(String.format("read PB pkg#%s size header fail, break;", pbPkgCount));
            }
            int pbPkgSize = bytesToInt(pbPkgSizeHeader);
            Log.i(TAG, String.format("PB pkg#%s size = %s", pbPkgCount, pbPkgSize));
            if (pbPkgCount == 0) {
                // time until the first packet arrived
                sTimeEnd = System.currentTimeMillis();
                sTimeCost = sTimeEnd - sTimeStart;
            }
            if (pbPkgSize <= 0) {
                throw new ReadBufferException(String.format("PB pkg#%s size %s <= 0, break;", pbPkgCount, pbPkgSize));
            } else if (pbPkgSize > 5000) {
                throw new ReadBufferException(String.format("PB pkg#%s size %s > 5000 bytes, too large, break;", pbPkgCount, pbPkgSize));
            }
            // read pb pkg
            byte[] pbPkg = new byte[pbPkgSize];
            fillSuccess = fill(inputStream, pbPkg);
            if (!fillSuccess) {
                throw new ReadBufferException(String.format("read PB pkg#%s fail, break;", pbPkgCount));
            }
            // init decoder
            if (decoder == null) {
                decoder = new YoutuOpusDecoder();
                decoder.config();
            }
            // decode
            Log.i("DEBUG-1", "seq:" + seq);
            Pair<Integer, short[]> pair = decoder.decodeTTSData(seq, pbPkg);
            short[] pcm = pair.second;
            Log.d(TAG, (pcm == null ? "fail decode #" : "decode #") + pbPkgCount);
            // packaging pcm
            if (pcm == null) {
                pcm = new short[0];
            }
            PcmData pcmData = new PcmData(pcm, seq == -1);
            // stop check
            if (Thread.currentThread().isInterrupted()) {
                Log.w(TAG, "pcm data ready, but thread is interrupted, break;");
                break;
            }
            // init player
            if (mOpusPlayer == null) {
                mOpusPlayer = new OpusPlayer();
                mOpusPlayer.setPcmSampleRate(16000);
                mOpusPlayer.setUncaughtExceptionHandler(new UncaughtExceptionHandler() {
                    @Override
                    public void uncaughtException(Thread thread, Throwable ex) {
                        if (mTtsExceptionHandler != null) {
                            mTtsExceptionHandler.onPlayException(thread, ex);
                        }
                    }
                });
            }
            // Buffer the first mCacheCount packets, then flush them to the player.
            // The last packet (seq == -1) always flushes, so short responses are not left in the cache.
            if (pbPkgCount < mCacheCount && seq != -1) {
                pcmCache.add(pcmData);
            } else { // enqueue
                for (PcmData d : pcmCache) {
                    mOpusPlayer.enqueue(d);
                }
                pcmCache.clear();
                mOpusPlayer.enqueue(pcmData);
            }
            // end
            if (seq == -1) {
                long ms = System.currentTimeMillis() - start;
                Log.d(TAG, "finish last pb pkg#" + pbPkgCount + ", total time " + ms + " ms");
                break;
            }
        } catch (Exception e) {
            if (mOpusPlayer != null) {
                mOpusPlayer.forceStop();
            }
            if (e instanceof InterruptedIOException) {
                // Normal flow when the read is interrupted; no need to rethrow.
                Log.i(TAG, "Interrupted while reading server response InputStream", e);
                break;
            } else {
                throw new DeserializationException(e);
            }
        }
    }
}
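The helper that reads the fixed-size fields from the stream (the integer fields are in little-endian byte order) is as follows: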
/**
 * Reads from the InputStream into buffer until the buffer is full.
 *
 * @return false if the InputStream does not contain enough data to fill the buffer.
 * @throws IOException if reading from the stream fails
 */
private static boolean fill(InputStream in, byte[] buffer) throws IOException {
    int length = buffer.length;
    int hasRead = 0;
    while (true) {
        int offset = hasRead;
        int count = length - hasRead;
        int currentRead = in.read(buffer, offset, count);
        if (currentRead >= 0) {
            // accumulate: a single read() may return fewer bytes than requested
            hasRead += currentRead;
            if (hasRead == length) {
                return true;
            }
        }
        if (currentRead == -1) {
            return false;
        }
    }
}
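The `bytesToInt` helper used in the parsing loop is not shown in this article. The following is a minimal sketch of how it and the per-packet layout (4-byte head, 4-byte sequence number, 4-byte length, payload) might look using only the JDK. It assumes little-endian integers as stated above and reuses the 5000-byte limit from the loop; the `FrameSketch` and `Frame` names are hypothetical, and the head is treated as four opaque bytes.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration; the article's own bytesToInt may differ.
final class FrameSketch {
    static final int MAX_PAYLOAD = 5000; // same upper bound as the parsing loop

    static final class Frame {
        final byte[] head;
        final int seq;
        final byte[] payload;

        Frame(byte[] head, int seq, byte[] payload) {
            this.head = head;
            this.seq = seq;
            this.payload = payload;
        }

        /** A sequence number of -1 marks the last packet. */
        boolean isLast() {
            return seq == -1;
        }
    }

    /** Interprets 4 bytes as a little-endian signed 32-bit integer. */
    static int bytesToInt(byte[] fourBytes) {
        return ByteBuffer.wrap(fourBytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    /** Reads one packet; throws EOFException if the stream ends in the middle of it. */
    static Frame readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in); // unbuffered wrapper, safe to re-create per call
        byte[] head = new byte[4];
        data.readFully(head);
        byte[] word = new byte[4];
        data.readFully(word);
        int seq = bytesToInt(word);
        data.readFully(word);
        int size = bytesToInt(word);
        if (size <= 0 || size > MAX_PAYLOAD) {
            throw new IOException("unexpected payload size " + size);
        }
        byte[] payload = new byte[size];
        data.readFully(payload);
        return new Frame(head, seq, payload);
    }
}
```

`DataInputStream.readFully` plays the same role as `fill` here: it keeps reading until the array is full and fails if the stream ends early.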
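TTS audio playback
Decoded TTS data is played through the OpusPlayer class. It wraps two things: playing raw PCM audio with AudioTrack, and continuously feeding the decoded audio into the player.
The complete code is as follows: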
public class OpusPlayer {
    private static final String TAG = "OpusPlayer";
    private BlockingQueue<PcmData> mPcmQueue = new LinkedBlockingQueue<>();
    private volatile Thread mPlayThread;
    private int mPcmSampleRate;
    private UncaughtExceptionHandler mUncaughtExceptionHandler;

    public void setUncaughtExceptionHandler(UncaughtExceptionHandler handler) {
        mUncaughtExceptionHandler = handler;
    }

    public void setPcmSampleRate(int pcmSampleRate) {
        mPcmSampleRate = pcmSampleRate;
    }

    public void enqueue(PcmData pcmData) {
        mPcmQueue.add(pcmData);
        // The playback thread is created lazily on the first enqueue.
        if (mPlayThread == null) {
            mPlayThread = new Thread(new Runnable() {
                PcmPlayer mPlayer;

                @Override
                public void run() {
                    Log.d(TAG, getThreadLogPrefix() + "start");
                    int playerPrepareFailCount = 0;
                    int playCount = 0;
                    long start = System.currentTimeMillis();
                    while (!Thread.currentThread().isInterrupted()) {
                        // prepare the player
                        boolean isPlayerReady = preparePlayerIfNeeded();
                        if (!isPlayerReady) {
                            releasePlayer();
                            playerPrepareFailCount++;
                            if (playerPrepareFailCount > 5) {
                                releasePlayer();
                                throw new RuntimeException("prepare player fail too many times, abort."); // give up
                            } else {
                                Log.w(TAG, getThreadLogPrefix() + "prepare player fail, retry.");
                                continue; // try again
                            }
                        }
                        // dequeue (blocks until data is available)
                        PcmData pcmData;
                        try {
                            pcmData = mPcmQueue.take();
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                            Log.d(TAG, getThreadLogPrefix() + "force stop");
                            break;
                        }
                        // play
                        if (pcmData != null) {
                            try {
                                short[] pcm = pcmData.getPcm();
                                if (pcm != null) {
                                    mPlayer.play(pcm);
                                    Log.d(TAG, getThreadLogPrefix() + "play #" + playCount);
                                } else {
                                    Log.d(TAG, getThreadLogPrefix() + "play #" + playCount + " fail, pcm == null !!");
                                }
                                if (pcmData.isLastOne()) {
                                    Log.d(TAG, getThreadLogPrefix() + "finish all task, will stop");
                                    break;
                                }
                                playCount++;
                            } catch (AudioTrackException e) {
                                e.printStackTrace();
                                releasePlayer(); // the next iteration will try to re-initialize the player
                            }
                        } else {
                            Log.w(TAG, getThreadLogPrefix() + "mPcmQueue.take() == null, nothing to play");
                        }
                    }
                    releasePlayer();
                    long time = System.currentTimeMillis() - start;
                    Log.d(TAG, getThreadLogPrefix() + "stop, ran " + time + " ms");
                }

                /**
                 * @return true: player is ready
                 */
                boolean preparePlayerIfNeeded() {
                    if (mPlayer == null) {
                        mPlayer = new PcmPlayer();
                        try {
                            mPlayer.prepare(AudioManager.STREAM_MUSIC, mPcmSampleRate, AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);
                        } catch (AudioTrackException e) {
                            e.printStackTrace();
                            releasePlayer();
                        }
                    }
                    return mPlayer != null;
                }

                void releasePlayer() {
                    if (mPlayer != null) {
                        mPlayer.release();
                        mPlayer = null;
                    }
                }
            });
            // Playback takes the longest, so its priority is slightly lower than the decoding thread's to leave more time for decoding.
            mPlayThread.setPriority(Thread.NORM_PRIORITY - 1);
            mPlayThread.setName(TAG + ".mPlayThread");
            if (mUncaughtExceptionHandler != null) {
                mPlayThread.setUncaughtExceptionHandler(mUncaughtExceptionHandler);
            }
            mPlayThread.start();
        }
    }

    private static String getThreadLogPrefix() {
        Thread currentThread = Thread.currentThread();
        String s = currentThread.getName() + "#" + currentThread.getId() + ": ";
        return s;
    }

    public void forceStop() {
        if (mPlayThread != null && !mPlayThread.isInterrupted()) {
            mPlayThread.interrupt();
            mPlayThread = null;
        }
        mPcmQueue.clear();
    }

    public static class PcmData {
        private final short[] mPcm;
        private final boolean mIsLastOne;

        public PcmData(short[] pcm, boolean isLastOne) {
            mPcm = pcm;
            mIsLastOne = isLastOne;
        }

        short[] getPcm() {
            return mPcm;
        }

        boolean isLastOne() {
            return mIsLastOne;
        }
    }
}
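The hand-off between the parsing loop and the player (buffer the first few packets, then stream the rest through a blocking queue until the last-packet marker) can be tried out without Android. The sketch below is a simplified stand-in, not the article's classes: `PrebufferSketch` and `Chunk` are hypothetical names, and the consumer only counts samples where the real player would write to AudioTrack. It flushes the cache when the last packet arrives, so a response shorter than the cache size is still played.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical, Android-free model of the prebuffer + queue hand-off.
final class PrebufferSketch {

    static final class Chunk {
        final short[] pcm;
        final boolean last;

        Chunk(short[] pcm, boolean last) {
            this.pcm = pcm;
            this.last = last;
        }
    }

    private final int cacheCount;
    private final BlockingQueue<Chunk> playQueue;
    private final List<Chunk> cache = new ArrayList<>();
    private int seen = 0;

    PrebufferSketch(int cacheCount, BlockingQueue<Chunk> playQueue) {
        this.cacheCount = cacheCount;
        this.playQueue = playQueue;
    }

    /** Producer side: holds back the first cacheCount chunks, then forwards everything. */
    void offer(Chunk chunk) {
        boolean stillBuffering = seen < cacheCount && !chunk.last;
        seen++;
        if (stillBuffering) {
            cache.add(chunk);
            return;
        }
        playQueue.addAll(cache); // flush whatever was buffered, in order
        cache.clear();
        playQueue.add(chunk);
    }

    /** Consumer side: blocks on the queue until the last chunk and returns the sample count. */
    static int drainSamples(BlockingQueue<Chunk> queue) throws InterruptedException {
        int samples = 0;
        while (true) {
            Chunk c = queue.take();
            samples += c.pcm.length;
            if (c.last) {
                return samples;
            }
        }
    }
}
```

In a real client `drainSamples` would run on the playback thread while `offer` is called from the parsing thread. A larger cache size trades a little start-up latency for fewer underruns on a slow network.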