Speech Recognizer - initiator + profile configuration #54

julianobRibeiro · 2017-07-11T23:58:07Z

Hi,

We are implementing Speech Recognizer using FAR_FIELD as the profile and WAKEWORD as the initiator.
According to the AVS API, if WAKEWORD is used as the initiator, the device must send keyword indices to the cloud for the second stage of wake word verification.
I have a couple of questions about that:

Is it mandatory to send the keyword indices to the cloud together with the command buffer when a wake word engine is in use?
If not mandatory: how can I configure my AudioInputProcessor to use WAKEWORD and not provide keyword indices to the cloud? I have tried sending AudioInputProcessor::INVALID_INDEX as the begin and end index values, but the cloud treats the voice buffer as a false detection almost every time. I receive a STOP directive almost immediately after sending the Recognize event.
m_aip->recognize(audioProvider, Initiator::WAKEWORD, aipBegin, aipEnd, keyword);
If mandatory: can we switch to the TAP initiator even though we have a wake word engine running? It looks like the only difference between TAP and WAKEWORD is the second-stage wake word verification.

Thank you
Juliano Ribeiro


kencecka · 2017-07-13T23:00:41Z

Hi Juliano,

The indices are optional, and are only required to enable cloud-based wakeword verification. Here is the relevant text from the SpeechRecognizer Documentation:

This object is only required for wake word enabled products that use cloud-based wake word verification.

If you are not able to provide accurate begin and end indices for your recognize call, then cloud-based wake word verification is not supported and will not be performed.

That said, other features in AVS may depend on correctly specifying the initiator in the future, so it is important to pick the correct initiator. While you may currently be able to specify TAP as the initiator for a wakeword device, it may cause incorrect behavior in the future.

Hope that helps,
Ken

julianobRibeiro · 2017-07-13T23:18:51Z

Hi Ken,

Thank you for the response. The one point that is still not clear to me is that the API documentation you quoted says that if the selected initiator is WAKEWORD, then the indices ARE REQUIRED (this is stated in the table in the API documentation).

Based on that, how can I use the WAKEWORD initiator without providing the indices? Is there a wildcard value that I need to send to the cloud for startIndexInSamples and endIndexInSamples?

Can I simply omit those values?

Regards,
Juliano

kencecka · 2017-07-13T23:23:59Z

Hi Juliano,

The wording in the documentation is perhaps a little misleading. The indexes are required "for products that use cloud-based wake word verification". In other words, not providing the indexes means your product will not use cloud-based wake word verification.

The code in the AudioInputProcessor module is already set up to handle this. If you call recognize with the WAKEWORD initiator and do not provide indexes, it will still send the initiator as WAKEWORD, but will omit the wakeWordIndices payload.

Regarding the placeholder value if you do not have indexes, it is provided as a default in the AudioInputProcessor header:

        avsCommon::avs::AudioInputStream::Index begin = INVALID_INDEX,
        avsCommon::avs::AudioInputStream::Index keywordEnd = INVALID_INDEX,
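
To illustrate the client-side behavior described above, here is a minimal, self-contained sketch of an initiator fragment that includes wakeWordIndices only when both indices are valid. It is not the SDK's code: `Index`, the value chosen for `INVALID_INDEX`, and `buildInitiatorJson` are hypothetical stand-ins, and the exact JSON layout is an assumption based on the field names mentioned in this thread. It also says nothing about whether AVS accepts a WAKEWORD initiator without indices.

```cpp
#include <cstdint>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical stand-ins; these are not the SDK's actual definitions.
using Index = std::uint64_t;
constexpr Index INVALID_INDEX = std::numeric_limits<Index>::max();

// Builds an "initiator" JSON fragment for a Recognize event.
// The wakeWordIndices object is emitted only when both indices are valid;
// otherwise only the initiator type is sent.
std::string buildInitiatorJson(
        const std::string& type,
        Index begin = INVALID_INDEX,
        Index keywordEnd = INVALID_INDEX) {
    std::ostringstream json;
    json << "{\"type\":\"" << type << "\"";
    if (begin != INVALID_INDEX && keywordEnd != INVALID_INDEX) {
        json << ",\"payload\":{\"wakeWordIndices\":{"
             << "\"startIndexInSamples\":" << begin << ","
             << "\"endIndexInSamples\":" << keywordEnd << "}}";
    }
    json << "}";
    return json.str();
}
```

Calling `buildInitiatorJson("WAKEWORD")` with the defaults yields a fragment with only the type, while passing two valid sample offsets adds the wakeWordIndices object.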

Ken

julianobRibeiro · 2017-07-13T23:43:11Z

Ok got it. Thank you for the clarification.

Now let's move on to the implementation.
If I am not providing the indices, I cannot provide the keyword (string) when calling recognize(). Below is the signature of recognize() from AudioInputProcessor:

std::future<bool> recognize(
        AudioProvider audioProvider,
        Initiator initiator,
        avsCommon::avs::AudioInputStream::Index begin = INVALID_INDEX,
        avsCommon::avs::AudioInputStream::Index keywordEnd = INVALID_INDEX,
        std::string keyword = "");

This is how I am calling it:
m_aip->recognize(audioProvider, Initiator::WAKEWORD);

The problem is that if I do not provide the keyword string, it is empty by default and AIP fails the recognize call with:
AudioInputProcessor:executeRecognizeFailed:reason=emptyKeywordWithWakewordInitiator

On the other hand, if I provide the indices as INVALID_INDEX and also pass the keyword string "Alexa", it looks like the cloud treats all my Recognize events as false alarms: I receive a STOP capture for my Recognize event and no other directive after that.
m_aip->recognize(audioProvider, Initiator::WAKEWORD, AudioInputProcessor::INVALID_INDEX, AudioInputProcessor::INVALID_INDEX, keyword);
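
The two call shapes above can be compared with a small stand-alone sketch of the local argument check implied by the log line. This is an assumption about the shape of that check, not the SDK's implementation: `RecognizeCheck` and `checkRecognizeArgs` are hypothetical names, and the types are simplified stand-ins for the ones in the signature.

```cpp
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical stand-ins for the SDK types named in the signature above.
enum class Initiator { PRESS_AND_HOLD, TAP, WAKEWORD };
using Index = std::uint64_t;
constexpr Index INVALID_INDEX = std::numeric_limits<Index>::max();

enum class RecognizeCheck { OK, EMPTY_KEYWORD_WITH_WAKEWORD_INITIATOR };

// Hypothetical pre-flight check mirroring the failure reason in the log:
// a WAKEWORD initiator with an empty keyword is rejected before any event
// is sent. The indices are deliberately not validated here, which matches
// the observation that INVALID_INDEX plus a keyword passes locally.
RecognizeCheck checkRecognizeArgs(
        Initiator initiator,
        Index begin = INVALID_INDEX,
        Index keywordEnd = INVALID_INDEX,
        const std::string& keyword = "") {
    (void)begin;
    (void)keywordEnd;
    if (initiator == Initiator::WAKEWORD && keyword.empty()) {
        return RecognizeCheck::EMPTY_KEYWORD_WITH_WAKEWORD_INITIATOR;
    }
    return RecognizeCheck::OK;
}
```

Under these assumptions, the first call shape fails locally, while the second passes the local check and is then judged by the cloud.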

Can you please check the implementation against what the cloud expects to receive? How does the cloud know whether it needs to run cloud verification on the buffer that was sent?

Thank you in advance
Juliano Ribeiro

kencecka · 2017-08-16T18:00:32Z

Hi Juliano,

I apologize for the long delay in responding to this. I have done some investigation into this and confirmed that AVS is currently not allowing a WAKEWORD initiator without the indices. I'm working with the AVS team to identify the correct workaround or solution, and will update this ticket as soon as I have a clear answer.

Ken

dhpp · 2018-04-18T16:13:37Z

Hi @julianobRibeiro, sorry that this ticket has not been updated for a while.

When it comes to the AVS specification, a client has to follow it quite strictly, so I don't believe a workaround for the indices is possible if you wish to use the WAKEWORD initiator type.

I think the one thing I'm missing from your initial posts is why you need to do this. I believe switching to the TAP initiator will still allow the user to say "Alexa" in their spoken query. Did you try this, and was it sufficient for your needs?
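
As a rough sketch of that suggestion, a client could select the initiator based on whether the wake word engine reported usable indices. Everything here is hypothetical (`RecognizeArgs`, `chooseRecognizeArgs`, and the simplified types are illustrative and not part of the SDK), and the caution earlier in this thread about using TAP on a wake word device still applies.

```cpp
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical stand-ins for the SDK types named in this thread.
enum class Initiator { PRESS_AND_HOLD, TAP, WAKEWORD };
using Index = std::uint64_t;
constexpr Index INVALID_INDEX = std::numeric_limits<Index>::max();

// Arguments that would be forwarded to recognize(); illustrative only.
struct RecognizeArgs {
    Initiator initiator;
    Index begin;
    Index keywordEnd;
    std::string keyword;
};

// Hypothetical helper: use WAKEWORD only when valid, ordered indices and a
// non-empty keyword are available; otherwise fall back to TAP and send no
// keyword data at all.
RecognizeArgs chooseRecognizeArgs(
        Index begin,
        Index keywordEnd,
        const std::string& keyword) {
    const bool haveIndices = begin != INVALID_INDEX &&
                             keywordEnd != INVALID_INDEX &&
                             begin < keywordEnd;
    if (haveIndices && !keyword.empty()) {
        return {Initiator::WAKEWORD, begin, keywordEnd, keyword};
    }
    return {Initiator::TAP, INVALID_INDEX, INVALID_INDEX, ""};
}
```

The returned fields map one-to-one onto the parameters of the recognize() signature quoted earlier in the thread.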

dhpp · 2018-04-20T15:27:24Z

Closing for now. Please re-open if you wish to continue discussion.

kencecka self-assigned this Aug 8, 2017

yugoren added AIP labels Aug 11, 2017

kuodehai mentioned this issue Nov 1, 2017

make MediaPlayer test #285

Closed

dhpp closed this as completed Apr 20, 2018
