SpeechClient.php guesses based on file extensions #446

adziuk · 2017-04-11T21:36:27Z

The functions determineSampleRate and determineEncoding attempt to guess the sample rate and encoding from the file extension. This breaks when files are named incorrectly. Additionally, the Cloud Speech servers can examine the file header and extract this information themselves. Additional checking on the client side is unnecessary and likely to drift from the server.

dwsupplee · 2017-04-12T17:25:29Z

We actually do rely on the server to parse FLAC headers for us. Only AMR and AMR_WB are assumed because it is my understanding those must have sample rates of 8000/16000 (respectively).

/cc @pbogle I believe we spoke briefly about assuming the encoding type off of the file type and the sample rate based off of whether it is AMR/AMR_WB. Is this still okay for us to do, or would you prefer we make no assumptions?

adziuk · 2017-04-12T18:07:27Z

determineEncoding determineSampleRate

pbogle · 2017-04-12T18:20:39Z

In virtually all cases, please do not implement argument validation or configuration guessing on the client.

The server already validates all arguments, and where possible determines the file type (e.g. FLAC or WAV) and audio parameters by inspecting the file headers, which is more reliable and comprehensive than what the client is doing.

Attempting to do this on the client creates inconsistencies with the server logic and in some cases defeats it, resulting in unnecessary errors to the user.

Even for AMR and AMR_WB, it is fragile to set the sample rate based on current constraints in the server, because those could change in future versions. (If I told you otherwise, I must have been confused and apologize for leading you down the wrong path; on principle it's not something I would recommend.)
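For illustration only, here is a minimal sketch of the pass-through approach described above. It is written in Python and is not the PHP client's actual code. The function name `build_recognition_config` and the option keys are hypothetical. The point is only that the client forwards what the caller explicitly supplied and infers nothing from the file name.

```python
from typing import Any, Dict, Optional


def build_recognition_config(
    language_code: str,
    encoding: Optional[str] = None,
    sample_rate_hertz: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a request config from explicitly supplied options only.

    Hypothetical helper: nothing is inferred from the audio file's
    extension. Options the caller omits are left out of the request,
    so the server can determine them from the file headers or report
    its own validation error.
    """
    config: Dict[str, Any] = {"languageCode": language_code}
    if encoding is not None:
        config["encoding"] = encoding
    if sample_rate_hertz is not None:
        config["sampleRateHertz"] = sample_rate_hertz
    return config
```

With this shape, a call that passes only a language code produces a config with no encoding or sample rate at all, and any constraint on AMR or AMR_WB sample rates stays on the server side.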

dwsupplee · 2017-04-12T18:23:22Z

Okay, no problem! We'll get these removed.

dwsupplee · 2017-04-14T20:55:33Z

Closed by: #449

jdpedrie added the "api: speech" label Apr 12, 2017

dwsupplee mentioned this issue Apr 13, 2017

Return all results and do not detect encoding/sample rate in the client #449

Merged

dwsupplee closed this as completed Apr 14, 2017

yoshi-automation added the "triage me" and "🚨" labels Apr 6, 2020

JustinBeckwith assigned dwsupplee Feb 1, 2021
