Using the Baidu speech recognition API: a worked example in Python
阿新 • Published: 2018-12-30
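Baidu's sample programs, in both the C and Java versions, come in two variants: method1 and method2.
The former is called implicit (the POST body is a JSON string with the audio data encoded inside it); the latter is called explicit (the POST body is the audio data itself).
At first, because Python's wave package hands back audio as "strings", I worried that this would not match the arrays used in C, so I chose the less efficient but seemingly safer method1:
base64-encode the audio data, put it into a dict together with the sample rate, channel count and other fields, and serialize the whole dict to a JSON string.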
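The method1 steps just described can be sketched roughly as follows. This is only an illustration written for Python 3, not the code that was actually run: the helper name `build_implicit_payload` is hypothetical, and the field names (`format`, `rate`, `channel`, `cuid`, `token`, `speech`, `len`) and the meaning of `len` are assumptions that should be checked against Baidu's documentation.

```python
import base64
import json


def build_implicit_payload(pcm_bytes, rate, channels, cuid, token):
    """Build a JSON body for the implicit (method1) upload.

    All field names below are assumptions, not taken from the article.
    """
    payload = {
        "format": "pcm",      # assumed field name and value
        "rate": rate,         # sample rate in Hz
        "channel": channels,  # number of audio channels
        "cuid": cuid,
        "token": token,
        # b64encode returns bytes in Python 3; decode so json.dumps accepts it
        "speech": base64.b64encode(pcm_bytes).decode("ascii"),
        # assumed: byte length of the raw audio, before base64 encoding
        "len": len(pcm_bytes),
    }
    return json.dumps(payload)


# Dummy 8-byte "audio" and placeholder credentials, for illustration only.
body = build_implicit_payload(b"\x00\x01" * 4, 8000, 1, "demo-cuid", "demo-token")
print(body)
```

If a request built this way is rejected, one plausible thing to check is whether the length field describes the raw audio or the base64 text, since the two differ.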
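But the server kept responding with:
3300 輸入引數不正確 (invalid input parameters)
I tried both the urllib2 and the pycurl packages, with the same result each time.
With no other option left, I switched to method2, and it worked (so the "string" that the wave package returns is evidently just the raw audio bytes, which can be posted as they are).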
#encoding=utf-8

import wave
import urllib, urllib2, pycurl
import base64
import json

## get access token by api key & secret key
def get_token():
    apiKey = "xxxxxxxx"
    secretKey = "xxxxxxxxx"
    auth_url = "https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id=" + apiKey + "&client_secret=" + secretKey
    res = urllib2.urlopen(auth_url)
    json_data = res.read()
    return json.loads(json_data)['access_token']

def dump_res(buf):
    print buf

## post audio to server
def use_cloud(token):
    fp = wave.open('vad_0.wav', 'rb')
    nf = fp.getnframes()
    f_len = nf * 2
    audio_data = fp.readframes(nf)
    cuid = "xxxxxxxxxx" #my xiaomi phone MAC
    srv_url = 'http://vop.baidu.com/server_api' + '?cuid=' + cuid + '&token=' + token
    http_header = [
        'Content-Type: audio/pcm; rate=8000',
        'Content-Length: %d' % f_len
    ]
    c = pycurl.Curl()
    c.setopt(pycurl.URL, str(srv_url)) #curl doesn't support unicode
    #c.setopt(c.RETURNTRANSFER, 1)
    c.setopt(c.HTTPHEADER, http_header) #must be list, not dict
    c.setopt(c.POST, 1)
    c.setopt(c.CONNECTTIMEOUT, 30)
    c.setopt(c.TIMEOUT, 30)
    c.setopt(c.WRITEFUNCTION, dump_res)
    c.setopt(c.POSTFIELDS, audio_data)
    c.setopt(c.POSTFIELDSIZE, f_len)
    c.perform() #pycurl.perform() has no return val

if __name__ == "__main__":
    token = get_token()
    use_cloud(token)
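The pycurl script hard-codes `f_len = nf * 2` and `rate=8000`, which is correct only for 16-bit mono audio sampled at 8 kHz. A minimal sketch of one way to derive both values from the WAV file itself is shown below. It is written for Python 3, the helper name `read_pcm` is hypothetical, and the demo builds a small WAV in memory rather than reading `vad_0.wav`.

```python
import io
import wave


def read_pcm(wav_file):
    """Return (pcm_bytes, headers) for a WAV file path or file object."""
    fp = wave.open(wav_file, "rb")
    try:
        rate = fp.getframerate()
        pcm = fp.readframes(fp.getnframes())
    finally:
        fp.close()
    headers = [
        # Sample rate comes from the WAV header instead of being hard-coded
        "Content-Type: audio/pcm; rate=%d" % rate,
        # Actual byte count, valid for any sample width or channel count
        "Content-Length: %d" % len(pcm),
    ]
    return pcm, headers


# Demo on an in-memory WAV: 100 frames of 16-bit mono silence at 8 kHz.
buf = io.BytesIO()
out = wave.open(buf, "wb")
out.setnchannels(1)
out.setsampwidth(2)
out.setframerate(8000)
out.writeframes(b"\x00\x00" * 100)
out.close()
buf.seek(0)

pcm, headers = read_pcm(buf)
print(len(pcm), headers)
```

The returned list has the same shape as `http_header` in the script, so it could be passed to `HTTPHEADER`, with `len(pcm)` used for `POSTFIELDSIZE`.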
Result of running the pycurl script:
{"corpus_no":"6150045491002357923","err_msg":"success.","err_no":0,"result":["播放小蘋果,"],"sn":"243903724071431919050"}
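The reply is a JSON string, so instead of only printing it, a caller could decode it and check `err_no` before using `result`. The sketch below (Python 3) shows one way to do that. The helper name `extract_result` is hypothetical, and it assumes that error replies carry the same `err_no` and `err_msg` fields as the successful reply shown above.

```python
import json


def extract_result(response_text):
    """Return the list of recognized strings, or raise if err_no is not 0."""
    reply = json.loads(response_text)
    if reply.get("err_no") != 0:
        # Assumption: failures use the same err_no / err_msg fields
        raise RuntimeError(
            "recognition failed: %s %s" % (reply.get("err_no"), reply.get("err_msg"))
        )
    return reply.get("result", [])


# The successful reply quoted in the article.
sample = (
    '{"corpus_no":"6150045491002357923","err_msg":"success.","err_no":0,'
    '"result":["播放小蘋果,"],"sn":"243903724071431919050"}'
)
print(extract_result(sample))
```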