Speech Recognizer - initiator + profile configuration #54
Comments
Hi Juliano, The indices are optional, and are only required to enable cloud-based wakeword verification. Here is the relevant text from the SpeechRecognizer Documentation:
If you are not able to provide accurate begin and end indices for your recognize call, then cloud-based wake word verification is not supported and will not be performed.
That said, other features in AVS may depend on correctly specifying the initiator in the future, so it is important to pick the correct initiator. While you may currently be able to specify TAP as the initiator for a wakeword device, it may cause incorrect behavior in the future. Hope that helps,
Hi Ken, Thank you for the response. The one point that is still unclear to me is that the API documentation you sent says that if the selected initiator is WAKEWORD, then the indices ARE REQUIRED (this is stated in the table in the API documentation). Given that, how can I use the WAKEWORD initiator without providing the indices? Is there a wildcard value that I need to send to the cloud for startIndexInSamples and endIndexInSamples? Or can I simply omit those values? Regards,
Hi Juliano, The wording in the documentation is perhaps a little misleading. The indexes are required "for products that use cloud-based wake word verification". In other words, not providing the indexes means your product will not use cloud-based wake word verification. The code in the AudioInputProcessor module is already set up to handle this. If you call recognize with an initiator and do not provide indexes, it will still send the initiator as WAKEWORD, but will omit the wakeWordIndices payload. Regarding the placeholder value if you do not have indexes, it is provided as a default in the AudioInputProcessor header:
Ken
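The behavior described in the reply above (the initiator is still sent as WAKEWORD, but the wakeWordIndices payload is omitted when no indexes are given) can be sketched roughly as follows. This is a standalone illustration, not the SDK's actual code: `kInvalidIndex` and `buildInitiatorJson` are hypothetical names, the sentinel value is an assumption standing in for `AudioInputProcessor::INVALID_INDEX`, and the JSON layout is inferred from the field names mentioned in this thread. It only shows what the client would send; whether AVS accepts a WAKEWORD initiator without indices is a separate question, taken up further down the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical stand-in for AudioInputProcessor::INVALID_INDEX.
// The real constant is defined in the SDK header; this value is an assumption.
constexpr std::uint64_t kInvalidIndex = std::numeric_limits<std::uint64_t>::max();

// Builds the "initiator" JSON object for a Recognize event (hypothetical helper).
// The type is always written; the wakeWordIndices payload is written only when
// both indices are valid.
std::string buildInitiatorJson(const std::string& type, std::uint64_t begin, std::uint64_t end) {
    std::ostringstream json;
    json << "{\"type\":\"" << type << "\"";
    if (begin != kInvalidIndex && end != kInvalidIndex) {
        assert(begin <= end);  // the wake word must start before it ends
        json << ",\"payload\":{\"wakeWordIndices\":{"
             << "\"startIndexInSamples\":" << begin << ","
             << "\"endIndexInSamples\":" << end << "}}";
    }
    json << "}";
    return json.str();
}
```

Calling `buildInitiatorJson("WAKEWORD", kInvalidIndex, kInvalidIndex)` produces an initiator with only the type field, while passing real sample indices adds the wakeWordIndices payload.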
OK, got it. Thank you for the clarification. Now let's move to the implementation.
This is how I am calling it:
The problem is that if I do not provide the keyword string, it is empty by default and the AIP fails on the recognize event, saying:
On the other hand, if I provide the indices as INVALID_INDEX and also pass the keyword string "Alexa" to the API, the cloud appears to treat all my Recognize events as false alarms, because I receive a StopCapture for my Recognize event and no other directive after that. Could you please check the implementation against what the cloud expects to receive? How does the cloud know whether it needs to run cloud verification on the buffer that was sent? Thank you in advance
Hi Juliano, I apologize for the long delay in responding to this. I have done some investigation and confirmed that AVS currently does not allow a WAKEWORD initiator without the indices. I'm working with the AVS team to identify the correct workaround or solution, and will update this ticket as soon as I have a clear answer. Ken
Hi @julianobRibeiro, sorry that this ticket has not been updated for a while. Clients have to follow the AVS specification quite strictly, so I don't believe a workaround for the indices is possible if you wish to use the WAKEWORD initiator type. The one thing I'm missing from your initial posts is why you need to do this. I believe switching to the TAP profile will allow the user to still say "Alexa" in their spoken query. Did you try this, and was it sufficient for your needs?
Closing for now. Please re-open if you wish to continue discussion.
Hi,
We are implementing Speech Recognizer with the FAR_FIELD profile and the WAKEWORD initiator.
According to the AVS API, if WAKEWORD is used as the initiator, then the keyword indices must be sent by the device to the cloud for the second stage of wake word verification.
I have a couple of questions about that:
Is it mandatory to send the keyword indices to the cloud together with the command buffer when a wake word engine is in use?
If it is not mandatory: how can I configure my AudioInputProcessor to use WAKEWORD and NOT provide keyword indices to the cloud? I have tried sending AudioInputProcessor::INVALID_INDEX as the BEGIN and END index values, but the cloud treats the voice buffer as a false detection almost all the time. As soon as I send the recognize event, I receive a STOP directive almost immediately.
m_aip->recognize(audioProvider, Initiator::WAKEWORD, aipBegin, aipEnd, keyword);
If it is mandatory: can we switch to the TAP profile even though we have a wake word engine running? It looks like the only difference between TAP and WAKEWORD is the second-stage wake word verification.
Thank you
Juliano Ribeiro