2016-09-28 04:11
Views: 89

Google Speech API "Sample rate in request does not match FLAC header"

I'm trying to convert an mp4 video clip into a FLAC audio file and then have Google Speech transcribe the words from the video so that I can detect whether specific words were said.

Everything works except that the Speech API returns this error:

  {
    "error": {
      "code": 400,
      "message": "Sample rate in request does not match FLAC header.",
      "status": "INVALID_ARGUMENT"
    }
  }

I am using FFmpeg to convert the mp4 into a FLAC file. The command specifies 16-bit samples, but when I right-click the FLAC file, Windows reports 302kbps.

Here is my PHP code:

// convert mp4 video to 16 bit flac audio file
$cmd = 'C:/wamp/www/ffmpeg/bin/ffmpeg.exe -i C:/wamp/www/test.mp4 -c:a flac -sample_fmt s16 C:/wamp/www/test.flac';
exec($cmd, $output);

// convert flac to text so we can detect if certain words were said
$data = array(
    "config" => array(
        "encoding" => "FLAC",
        "sampleRate" => 16000,
        "languageCode" => "en-US"
    ),
    "audio" => array(
        "content" => base64_encode(file_get_contents("test.flac"))
    )
);

$json_data = json_encode($data);

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://speech.googleapis.com/v1beta1/speech:syncrecognize?key=MY_API_KEY');
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: application/json"));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $json_data);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);

$result = curl_exec($ch);


2 answers

  • dongshushi5579 2016-09-28 04:25

    Fixed it by being very specific in my FFMPEG command:

    $cmd = 'C:/wamp/www/ffmpeg/bin/ffmpeg.exe -i C:/wamp/www/test.mp4 -acodec flac -bits_per_raw_sample 16 -ar 44100 -ac 1 C:/wamp/www/test.flac';
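
The error message points at the likely cause: the request declares a `sampleRate`, and ffmpeg keeps the source's rate unless `-ar` says otherwise, so the two can drift apart. One way to keep them in step is to derive both the ffmpeg arguments and the request config from a single value. The following is a minimal Node.js sketch under that assumption; the helper names and the 16000 default are illustrative and not part of any library, and the request shape is copied from the question's v1beta1 call.

```javascript
// Hypothetical helpers: one sample rate drives both ffmpeg and the API request.
const SAMPLE_RATE_HZ = 16000; // assumed target rate; any rate the API accepts would do

// Arguments for ffmpeg: FLAC codec, 16-bit samples, explicit sample rate, mono.
function buildFfmpegArgs(inputPath, outputPath, sampleRateHz = SAMPLE_RATE_HZ) {
  return [
    '-i', inputPath,
    '-c:a', 'flac',
    '-sample_fmt', 's16',
    '-ar', String(sampleRateHz),
    '-ac', '1',
    outputPath,
  ];
}

// Request body in the shape used by the question's v1beta1 syncrecognize call.
function buildRecognizeRequest(flacBytes, sampleRateHz = SAMPLE_RATE_HZ) {
  return {
    config: { encoding: 'FLAC', sampleRate: sampleRateHz, languageCode: 'en-US' },
    audio: { content: Buffer.from(flacBytes).toString('base64') },
  };
}
```

The argument array can be handed to `child_process.execFile` together with the path to the ffmpeg binary, which avoids building a shell string by hand.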
  • dongtun3259 2017-02-13 21:59

    kjdion84's answer worked well, and I experimented a bit more to find the root cause.

    As per this answer, all encodings support only 1-channel (mono) audio.

    I was creating the FLAC file with this command:

    ffmpeg -i test.mp3 test.flac

    and the request failed with:

    Sample rate in request does not match FLAC header

    Adding -ac 1 (which sets the number of audio channels to 1) fixed the issue:

    ffmpeg -i test.mp3 -ac 1 test.flac
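
To see what ffmpeg actually wrote, the sample rate and channel count can be read straight from the file's STREAMINFO block, which the FLAC format places right after the `fLaC` marker. The sketch below is an illustration only: `readFlacStreamInfo` is a made-up name, and it assumes the buffer starts with a plain FLAC stream (no ID3 tag in front).

```javascript
// Reads sample rate, channel count and bit depth from a FLAC STREAMINFO block.
// `readFlacStreamInfo` is an illustrative name, not part of any library.
function readFlacStreamInfo(buf) {
  if (buf.length < 22 || buf.toString('latin1', 0, 4) !== 'fLaC') {
    throw new Error('Not a FLAC stream');
  }
  // Byte 4 is the metadata block header; its low 7 bits are the block type (0 = STREAMINFO).
  if ((buf[4] & 0x7f) !== 0) {
    throw new Error('First metadata block is not STREAMINFO');
  }
  // STREAMINFO data starts at byte 8. After 10 bytes of block and frame sizes come
  // 20 bits of sample rate, 3 bits of (channels - 1) and 5 bits of (bits per sample - 1).
  const sampleRate = (buf[18] << 12) | (buf[19] << 4) | (buf[20] >> 4);
  const channels = ((buf[20] >> 1) & 0x07) + 1;
  const bitsPerSample = (((buf[20] & 0x01) << 4) | (buf[21] >> 4)) + 1;
  return { sampleRate, channels, bitsPerSample };
}
```

Calling it on `fs.readFileSync('test.flac')` before sending the request shows whether the file is mono and which rate the request's `sampleRate` has to carry.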

    Here is my full Node.js code:

    const Speech = require('@google-cloud/speech');
    const projectId = 'EnterProjectIdGeneratedByGoogle';
    const speechClient = Speech({
        projectId: projectId
    });
    // The name of the audio file to transcribe
    const fileName = '/home/user/Documents/test/test.flac';
    // The audio file's encoding and sample rate
    const options = {
        encoding: 'FLAC',
        sampleRate: 44100
    };
    // Detects speech in the audio file
    speechClient.recognize(fileName, options)
        .then((results) => {
            const transcription = results[0];
            console.log(`Transcription: ${transcription}`);
        }, function(err) {
            console.error(err);
        });

    The sample rate can be 16000, 44100, or another valid value, as long as it matches the audio file, and the encoding can be FLAC or LINEAR16 (see the Cloud Speech docs).
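
The two conditions discussed in this thread (matching sample rate, mono audio) can be checked before the request is sent. This is a rough sketch rather than anything the API or client library provides: `findMismatches` is a hypothetical helper, and `fileInfo` is assumed to be an object of the form `{ sampleRate, channels }` obtained elsewhere, for instance from ffprobe output or the FLAC header.

```javascript
// Hypothetical pre-flight check: compares the request options with what the file contains.
// `fileInfo` is assumed to look like { sampleRate, channels }.
function findMismatches(options, fileInfo) {
  const problems = [];
  if (options.sampleRate !== fileInfo.sampleRate) {
    problems.push(`sampleRate ${options.sampleRate} != file rate ${fileInfo.sampleRate}`);
  }
  // Mono requirement as reported in the answer above.
  if (fileInfo.channels !== 1) {
    problems.push(`file has ${fileInfo.channels} channels, expected 1 (mono)`);
  }
  return problems;
}
```

An empty array means neither of the two known causes applies; anything else is worth fixing in the ffmpeg command or the request options first.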

