Streaming mic data to AVS and play response audio with Mediaplayer #23

boyce-xx · 2017-06-07T10:05:59Z

Hi, we can now run the default SpeechSynthesizerIntegrationTest app, get response audio, and play it with MediaPlayer in the TEST_F(SpeechSynthesizerTest, handleOneSpeech) test function. Our question is: based on this test function, how can we stream mic data to AVS with the stopCapture feature?

We have tried to modify this test case (see the sample code below). Our expected flow is:
record mic data and stream it to AVS --> AVS detects a stopCapture --> stop recording --> play the response audio with MediaPlayer.
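
One way to picture that flow is a capture loop that keeps writing microphone chunks until a stop flag is raised. The sketch below is only an illustration under stated assumptions: `ReadMic`, `WriteStream`, and `streamUntilStopped` are hypothetical names, not SDK APIs, and the flag is assumed to be set by whatever code reacts to the StopCapture directive.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for a microphone: fills `out` and returns the number
// of samples actually captured (0 means no more data).
using ReadMic = std::function<std::size_t(std::vector<int16_t>& out)>;

// Hypothetical stand-in for the audio stream writer: consumes `nSamples` samples.
using WriteStream = std::function<void(const int16_t* data, std::size_t nSamples)>;

// Streams fixed-size chunks until `stopRequested` becomes true or the
// microphone runs dry. Returns the total number of samples streamed.
std::size_t streamUntilStopped(
    const ReadMic& readMic,
    const WriteStream& write,
    const std::atomic<bool>& stopRequested,
    std::size_t chunkSamples = 160) {  // 160 samples = 10 ms at 16 kHz mono
    assert(chunkSamples > 0);
    std::vector<int16_t> chunk(chunkSamples);
    std::size_t total = 0;
    while (!stopRequested.load()) {
        const std::size_t n = readMic(chunk);
        if (n == 0) {
            break;  // Nothing captured; stop streaming.
        }
        write(chunk.data(), n);
        total += n;
    }
    return total;
}
```

In a real client the flag would typically be set from another thread while this loop runs on the capture thread; playback of the response would start only after the loop returns.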

The build is successful, but the test always blocks at the line ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::PREHANDLE);

TEST_F(SpeechTest, handleOneSpeech) {
// SpeechSynthesizerObserver defaults to a FINISHED state.
//ASSERT_EQ(m_speechSynthesizerObserver->waitForNext(WAIT_FOR_TIMEOUT_DURATION), SpeechSynthesizerState::FINISHED);

// Send audio of "Joke" that will prompt SetMute and Speak.
m_directiveSequencer->setDialogRequestId(FIRST_DIALOG_REQUEST_ID);


// Check that AIP is in an IDLE state before starting.
//ASSERT_TRUE(m_StateObserver->checkState(AudioInputProcessor::State::IDLE, AUDIO_FILE_TIMEOUT_DURATION));

// Request the alarm channel for the test channel client.
//ASSERT_TRUE(m_focusManager->acquireChannel(ALERTS_CHANNEL_NAME, m_testClient, ALARM_ACTIVITY_ID));
//ASSERT_EQ(m_testClient->waitForFocusChange(AUDIO_FILE_TIMEOUT_DURATION), FocusState::FOREGROUND);

// Signal to the AIP to start recognizing.
ASSERT_TRUE(m_tapToTalkButton->startRecognizing(m_AudioInputProcessor, m_TapToTalkAudioProvider));

// Check that AIP is now in RECOGNIZING state.
ASSERT_TRUE(m_StateObserver->checkState(AudioInputProcessor::State::RECOGNIZING, AUDIO_FILE_TIMEOUT_DURATION));

std::cout << "Start record wav file" << '\n';
//system("arecord -c 1 -r 16000 -f S16_LE -d 5 hello.wav");

// std::string file = inputPath + RECOGNIZE_JOKE_AUDIO_FILE_NAME;
std::string file = "hello.wav";
/*
setupMessageWithAttachmentAndSend(
    CT_FIRST_RECOGNIZE_EVENT_JSON,
    file,
    avsCommon::avs::MessageRequest::Status::SUCCESS,
    SEND_EVENT_TIMEOUT_DURATION);   */

std::cout << "file: " << file << std::endl;
  const int RIFF_HEADER_SIZE = 44;
  std::ifstream inputFile(file.c_str(), std::ifstream::binary);
  if (!inputFile.good()) {
    std::cout << "Couldn't open audio file!" << std::endl;
    return ;
  }
  std::cout << "open audio file success!" << std::endl;
  inputFile.seekg(0, std::ios::end);
  int fileLengthInBytes = inputFile.tellg();
  if (fileLengthInBytes <= RIFF_HEADER_SIZE) {
    std::cout << "File should be larger than 44 bytes, which is the size of the RIFF header" << std::endl;
    return ;
  }

  inputFile.seekg(RIFF_HEADER_SIZE, std::ios::beg);
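  std::cout << "\t\tfile length in bytes: " << fileLengthInBytes << std::endl;
  // Samples are 16-bit, so each one occupies 2 bytes.
  int numSamples = (fileLengthInBytes - RIFF_HEADER_SIZE) / 2;
  std::cout << "\t\tnumSamples: " << numSamples << std::endl;
  std::vector<int16_t> retVal(115200, 0);
  //inputFile.read((char *)&retVal[0], numSamples * 2);
  //m_AudioBufferWriter->write(retVal.data(), retVal.size());

  int index = 0;
  // Loop on the stream state instead of eof(), and keep the buffer allocated:
  // calling clear() here would make the next read through &retVal[0] undefined.
  while (inputFile) {
    inputFile.read((char *)&retVal[0], 115200);
    // Assumption: write() expects a count of 16-bit words, not bytes.
    size_t wordsRead = inputFile.gcount() / sizeof(int16_t);
    if (wordsRead == 0) {
      break;
    }
    index++;
    m_AudioBufferWriter->write(retVal.data(), wordsRead);
    std::cout << "index= " << index << std::endl;
  }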
  inputFile.close(); 


TestMessageSender::SendParams sendRecognizeParams = m_avsConnectionManager->waitForNext(DIRECTIVE_TIMEOUT_DURATION);
ASSERT_TRUE(checkSentEventName(sendRecognizeParams, NAME_RECOGNIZE));

// Wait for the directive to route through to our handler.
TestDirectiveHandler::DirectiveParams params = m_directiveHandler->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::PREHANDLE);
params = m_directiveHandler->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::HANDLE);

// Unblock the queue so SpeechSynthesizer can do its work.
params.result->setCompleted();

// SpeechSynthesizer is now playing.
ASSERT_EQ(m_speechSynthesizerObserver->waitForNext(WAIT_FOR_TIMEOUT_DURATION), SpeechSynthesizerState::PLAYING);

//Check that SS grabs the channel focus by seeing that the test client has been backgrounded.
ASSERT_EQ(m_testClient->waitForFocusChange(WAIT_FOR_TIMEOUT_DURATION), FocusState::BACKGROUND);

// SpeechStarted was sent.
TestMessageSender::SendParams sendStartedParams = m_avsConnectionManager->waitForNext(DIRECTIVE_TIMEOUT_DURATION);
ASSERT_TRUE(checkSentEventName(sendStartedParams, NAME_SPEECH_STARTED));

// Media Player has finished.
ASSERT_EQ(m_speechSynthesizerObserver->waitForNext(WAIT_FOR_TIMEOUT_DURATION), SpeechSynthesizerState::FINISHED);

// SpeechFinished is sent here.
TestMessageSender::SendParams sendFinishedParams = m_avsConnectionManager->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
ASSERT_TRUE(checkSentEventName(sendFinishedParams, NAME_SPEECH_FINISHED));

// Alerts channel regains the foreground.
ASSERT_EQ(m_testClient->waitForFocusChange(WAIT_FOR_TIMEOUT_DURATION), FocusState::FOREGROUND);

}
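
The file-reading part of a test like this can be isolated into a small helper that skips the header and forwards 16-bit samples in chunks. This is a minimal sketch, not SDK code: `streamWavSamples` and `SampleSink` are hypothetical names, the sink stands in for the stream writer and is assumed to take a sample count rather than a byte count, and the 44-byte header size only holds for canonical PCM WAV files.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Size of a canonical PCM WAV header; files with extra chunks have larger headers.
constexpr std::size_t RIFF_HEADER_SIZE = 44;

// Hypothetical sink standing in for the stream writer. It receives a count of
// 16-bit samples, not a count of bytes.
using SampleSink = std::function<void(const int16_t* data, std::size_t nSamples)>;

// Reads the samples that follow the header in chunks of `chunkSamples` and
// forwards each chunk to `sink`. Returns the number of samples forwarded,
// or -1 if the file cannot be opened.
long streamWavSamples(const std::string& path, std::size_t chunkSamples, const SampleSink& sink) {
    assert(chunkSamples > 0);
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return -1;
    }
    in.seekg(static_cast<std::streamoff>(RIFF_HEADER_SIZE), std::ios::beg);

    std::vector<int16_t> chunk(chunkSamples);
    long total = 0;
    // Test the stream state, not eof(): a short final read still delivers data.
    while (in) {
        in.read(reinterpret_cast<char*>(chunk.data()),
                static_cast<std::streamsize>(chunk.size() * sizeof(int16_t)));
        const std::size_t samplesRead = static_cast<std::size_t>(in.gcount()) / sizeof(int16_t);
        if (samplesRead == 0) {
            break;
        }
        sink(chunk.data(), samplesRead);
        total += static_cast<long>(samplesRead);
    }
    return total;
}
```

The buffer is sized once and never cleared, so every read lands in valid storage, and the byte count from gcount() is converted to samples before it reaches the sink.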

scotthea-amazon · 2017-06-07T14:59:02Z

Hello boyce-xx,
Since you are indicating that you are blocked at an ASSERT_EQ with no logic in the parameters, do you really mean that the test fails there because the ASSERT_EQ fails? Or, do you mean something else?

Assuming that ASSERT_EQ fails, can you tell us what params.type value was actually encountered?
It may be that another operation (e.g. PREHANDLE) is being queued before HANDLE is received. A snapshot of a log from running this code (with any configuration parameters redacted) might help us understand what is going on in this case.

Also assuming the problem is that the ASSERT_EQ is failing, it may make sense to remove the ASSERT_EQ and loop around waitForNext() until params.type is HANDLE.
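
The loop suggested above could look roughly like the following. This is a hedged, self-contained sketch: `Type`, `DirectiveParams`, and `FakeDirectiveHandler` are simplified stand-ins for the test classes, `waitForType` is a hypothetical helper, and the TIMEOUT enumerator is an assumption about how an expired wait is reported.

```cpp
#include <cassert>
#include <deque>

// Simplified stand-in for TestDirectiveHandler::DirectiveParams::Type.
// TIMEOUT is an assumption about how an expired wait is reported.
enum class Type { PREHANDLE, HANDLE, TIMEOUT };

struct DirectiveParams {
    Type type;
};

// Fake handler for illustration: returns queued notifications in order and
// reports TIMEOUT once nothing is pending.
struct FakeDirectiveHandler {
    std::deque<Type> pending;

    DirectiveParams waitForNext() {
        if (pending.empty()) {
            return DirectiveParams{Type::TIMEOUT};
        }
        DirectiveParams params{pending.front()};
        pending.pop_front();
        return params;
    }
};

// Hypothetical helper: keeps calling waitForNext() until the wanted type
// arrives, a TIMEOUT is reported, or maxAttempts calls have been made.
// The caller still checks the returned type.
template <typename Handler>
DirectiveParams waitForType(Handler& handler, Type wanted, int maxAttempts) {
    assert(maxAttempts > 0);
    DirectiveParams params{Type::TIMEOUT};
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        params = handler.waitForNext();
        if (params.type == wanted || params.type == Type::TIMEOUT) {
            break;
        }
    }
    return params;
}
```

In the test, the two fixed waitForNext() calls would be replaced by a single call to such a helper, followed by one ASSERT_EQ on the returned type.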

Please let us know if this helps, and what you find,
-SWH

boyce-xx · 2017-06-08T09:55:45Z

Hi @scotthea-amazon ,
Thanks for your explanation.
We can now stream an audio file to AVS and successfully play the response audio with MediaPlayer, but I hit another issue after playback: the client posts to AVS again, and then the test case fails. Please see the log below.

MediaPlayer:doStopSuccess:reason=alreadyStopped
MediaPlayer:handlePlayCalled
MediaPlayer:doStopSuccess:reason=alreadyStopped
MediaPlayer:callingOnPlaybackStarted
MediaPlayer:handleGetOffsetInMillisecondsCalled
Updated state for state provider of namespace:SpeechSynthesizer and name:SpeechState to {"token":"amzn1.as-ct.v1.Domain:Application:Weather#ACRI#WeatherPrompt.708a00cc-b34b-457e-9d0b-711170d5c721","offsetInMilliseconds":268,"playerActivity":"PLAYING"}

Found bundle for host avs-alexa-na.amazon.com: 0x7f7c10001720 [can multiplex]

Conn: 0 (0x7f7c10000b60) Receive pipe weight: (-1/0), penalized: FALSE

Multiplexed connection found!

Found connection 0, with requests in the pipe (2)

Re-using existing connection! (#0) with host avs-alexa-na.amazon.com

Using Stream ID: 5 (easy handle 0x7f7bf400ac90)

POST /v20160207/events HTTP/2
Host: avs-alexa-na.amazon.com
Accept: */*
Authorization: Bearer Atza|XXXXXXXXXx
Content-Length: 416
Content-Type: multipart/form-data; boundary=------------------------bca30f6496ef7a3f

InProcessAttachmentReader:readFailed:reason=SDS is closed
< HTTP/2 204
< access-control-allow-origin: *
< x-amzn-requestid: 0a0c63fffe9571f6-000034c3-0002c577-573a69c508f754f5-808f3433-5
<
MediaPlayer:callingOnPlaybackFinished
Updated state for state provider of namespace:SpeechSynthesizer and name:SpeechState to {"token":"amzn1.as-ct.v1.Domain:Application:Weather#ACRI#WeatherPrompt.708a00cc-b34b-457e-9d0b-711170d5c721","offsetInMilliseconds":0,"playerActivity":"FINISHED"}
DirectiveProcessor:onHandlingCompeted:messageId=a4e9f05e-fd4e-4b44-95c8-d5eb0b4b3e4f,directiveBeingPreHandled=(nullptr)
CapabilityAgent:removingMessageIdFromMap:messageId=a4e9f05e-fd4e-4b44-95c8-d5eb0b4b3e4f

Found bundle for host avs-alexa-na.amazon.com: 0x7f7c10001720 [can multiplex]

Conn: 0 (0x7f7c10000b60) Receive pipe weight: (-1/0), penalized: FALSE

Multiplexed connection found!

Found connection 0, with requests in the pipe (1)

Re-using existing connection! (#0) with host avs-alexa-na.amazon.com

Using Stream ID: 7 (easy handle 0x7f7c101216e0)

POST /v20160207/events HTTP/2
Host: avs-alexa-na.amazon.com
Accept: */*
Authorization: Bearer Atza|XXXXXXXXXXXX
Content-Length: 417
Content-Type: multipart/form-data; boundary=------------------------25187b641250e137

< HTTP/2 204
< access-control-allow-origin: *
< x-amzn-requestid: 0a0c63fffe9571f6-000034c3-0002c577-573a69c508f754f5-808f3433-7
<
/home/bob/Desktop/Document/TeddyBear/AVS/Linux+C/0.4/SDK/20170602_gst/alexa-client-sdk-master/Integration/test/SpeechTest.cpp:1108: Failure
Expected: params.type
Which is: 4-byte object <06-00 00-00>
To be equal to: TestDirectiveHandler::DirectiveParams::Type::HANDLE
Which is: 4-byte object <03-00 00-00>

Closing connection 0
DirectiveSequencer:shutdown
[ FAILED ] SpeechTest.handleOneSpeech (34222 ms)
[----------] 1 test from SpeechTest (34222 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (34222 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] SpeechTest.handleOneSpeech
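
The "4-byte object <06-00 00-00>" in the failure above is how Google Test prints a value that has no stream operator: the raw bytes in memory order, which on a little-endian machine decode to 6 (and <03-00 00-00> to 3, the value of HANDLE). The sketch below decodes such a dump and shows how an operator<< makes the output readable. The enumerator list and its order are assumptions; the log only confirms that HANDLE is 3. If the order holds, 6 would be a TIMEOUT, meaning waitForNext() expired without receiving anything.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical mirror of the test handler's Type enum. The names and order
// are assumptions; only HANDLE == 3 is confirmed by the log.
enum class Type { UNSET, HANDLE_IMMEDIATELY, PREHANDLE, HANDLE, CANCEL, EXCEPTION, TIMEOUT };

const char* toString(Type type) {
    switch (type) {
        case Type::UNSET: return "UNSET";
        case Type::HANDLE_IMMEDIATELY: return "HANDLE_IMMEDIATELY";
        case Type::PREHANDLE: return "PREHANDLE";
        case Type::HANDLE: return "HANDLE";
        case Type::CANCEL: return "CANCEL";
        case Type::EXCEPTION: return "EXCEPTION";
        case Type::TIMEOUT: return "TIMEOUT";
    }
    return "UNKNOWN";
}

// With an operator<< available for the enum, Google Test prints the name
// instead of the raw bytes when an assertion fails.
std::ostream& operator<<(std::ostream& os, Type type) {
    return os << toString(type);
}

// Decodes a byte dump such as "06-00 00-00" as a little-endian 32-bit value.
std::uint32_t decodeLittleEndian(const std::string& dump) {
    std::uint32_t value = 0;
    int byteIndex = 0;
    std::string hex;
    for (char c : dump) {
        if (!std::isxdigit(static_cast<unsigned char>(c))) {
            continue;  // Skip separators such as '-' and ' '.
        }
        hex.push_back(c);
        if (hex.size() == 2) {
            assert(byteIndex < 4);
            value |= static_cast<std::uint32_t>(std::stoul(hex, nullptr, 16)) << (8 * byteIndex);
            ++byteIndex;
            hex.clear();
        }
    }
    return value;
}
```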

My TEST_F is:

TEST_F(SpeechTest, handleOneSpeech) {
// SpeechSynthesizerObserver defaults to a FINISHED state.
//ASSERT_EQ(m_speechSynthesizerObserver->waitForNext(WAIT_FOR_TIMEOUT_DURATION), SpeechSynthesizerState::FINISHED);

// Send audio of "Joke" that will prompt SetMute and Speak.
//m_directiveSequencer->setDialogRequestId(FIRST_DIALOG_REQUEST_ID);

// Check that AIP is in an IDLE state before starting.
//ASSERT_TRUE(m_StateObserver->checkState(AudioInputProcessor::State::IDLE, AUDIO_FILE_TIMEOUT_DURATION));

// Request the alarm channel for the test channel client.
ASSERT_TRUE(m_focusManager->acquireChannel(ALERTS_CHANNEL_NAME, m_testClient, ALARM_ACTIVITY_ID));
//ASSERT_EQ(m_testClient->waitForFocusChange(AUDIO_FILE_TIMEOUT_DURATION), FocusState::FOREGROUND);

// Signal to the AIP to start recognizing.
ASSERT_TRUE(m_tapToTalkButton->startRecognizing(m_AudioInputProcessor, m_TapToTalkAudioProvider));

// Check that AIP is now in RECOGNIZING state.
//ASSERT_TRUE(m_StateObserver->checkState(AudioInputProcessor::State::RECOGNIZING, AUDIO_FILE_TIMEOUT_DURATION));

std::cout << "Start record wav file" << '\n';
std::string file = "weather.wav";

  std::cout << "file: " << file << std::endl;
  const int RIFF_HEADER_SIZE = 44;
  std::ifstream inputFile(file.c_str(), std::ifstream::binary);
  if (!inputFile.good()) {
    std::cout << "Couldn't open audio file!" << std::endl;
    return ;
  }
  std::cout << "open audio file success!" << std::endl;
  inputFile.seekg(0, std::ios::end);
  int fileLengthInBytes = inputFile.tellg();
  if (fileLengthInBytes <= RIFF_HEADER_SIZE) {
    std::cout << "File should be larger than 44 bytes, which is the size of the RIFF header" << std::endl;
    return ;
  }

  inputFile.seekg(RIFF_HEADER_SIZE, std::ios::beg);
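  std::cout << "\t\tfile length in bytes: " << fileLengthInBytes << std::endl;
  // Samples are 16-bit, so each one occupies 2 bytes.
  int numSamples = (fileLengthInBytes - RIFF_HEADER_SIZE) / 2;
  std::cout << "\t\tnumSamples: " << numSamples << std::endl;
  std::vector<int16_t> retVal(115200, 0);

  int index = 0;
  // Loop on the stream state instead of eof(), and keep the buffer allocated:
  // calling clear() here would make the next read through &retVal[0] undefined.
  while (inputFile) {
    inputFile.read((char *)&retVal[0], 115200);
    // Assumption: write() expects a count of 16-bit words, not bytes.
    size_t wordsRead = inputFile.gcount() / sizeof(int16_t);
    if (wordsRead == 0) {
      break;
    }
    index++;
    m_AudioBufferWriter->write(retVal.data(), wordsRead);
    std::cout << "index= " << index << std::endl;
  }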
  inputFile.close();


TestMessageSender::SendParams sendRecognizeParams = m_avsConnectionManager->waitForNext(DIRECTIVE_TIMEOUT_DURATION);
ASSERT_TRUE(checkSentEventName(sendRecognizeParams, NAME_RECOGNIZE));

// Wait for the directive to route through to our handler.
TestDirectiveHandler::DirectiveParams params = m_directiveHandler->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
//ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::PREHANDLE);
params = m_directiveHandler->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::HANDLE);

// Unblock the queue so SpeechSynthesizer can do its work.
params.result->setCompleted();

// SpeechSynthesizer is now playing.
ASSERT_EQ(m_speechSynthesizerObserver->waitForNext(WAIT_FOR_TIMEOUT_DURATION), SpeechSynthesizerState::PLAYING);

//Check that SS grabs the channel focus by seeing that the test client has been backgrounded.
ASSERT_EQ(m_testClient->waitForFocusChange(WAIT_FOR_TIMEOUT_DURATION), FocusState::BACKGROUND);

// SpeechStarted was sent.
TestMessageSender::SendParams sendStartedParams = m_avsConnectionManager->waitForNext(DIRECTIVE_TIMEOUT_DURATION);
ASSERT_TRUE(checkSentEventName(sendStartedParams, NAME_SPEECH_STARTED));

// Media Player has finished.
ASSERT_EQ(m_speechSynthesizerObserver->waitForNext(WAIT_FOR_TIMEOUT_DURATION), SpeechSynthesizerState::FINISHED);

// SpeechFinished is sent here.
TestMessageSender::SendParams sendFinishedParams = m_avsConnectionManager->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
ASSERT_TRUE(checkSentEventName(sendFinishedParams, NAME_SPEECH_FINISHED));

// Alerts channel regains the foreground.
ASSERT_EQ(m_testClient->waitForFocusChange(WAIT_FOR_TIMEOUT_DURATION), FocusState::FOREGROUND);

}

scotthea-amazon · 2017-06-08T21:04:23Z

Hello boyce-xx,

Looking through our previous exchange, I see that I misread your first post and thought you were failing in the ASSERT_EQ against HANDLE, not PREHANDLE. Sorry!

The logs you provided have been very helpful. From the server-side logs I can see that SpeechStarted and SpeechFinished events were received. The last 204 in the log you provided is in response to the SpeechFinished. So, by the time you reach those ASSERT_EQ tests, it appears your client has already processed the SpeechSynthesizer.Speak directive. In the original test those ASSERT_EQs are intended to wait on and test for the handling of a Speaker.SetMute directive. That SetMute directive is supposed to be handled before the SpeechSynthesizer.Speak directive. If for some reason you are not receiving the SetMute directive, you could just remove the following lines from your test:

// Wait for the directive to route through to our handler.
TestDirectiveHandler::DirectiveParams params = m_directiveHandler->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
//ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::PREHANDLE);
params = m_directiveHandler->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::HANDLE);

// Unblock the queue so SpeechSynthesizer can do its work.
params.result->setCompleted();

A couple of related questions:

Are you using the DirectiveSequencer?
If so, have you registered your own handler for Speaker.SetMute?

Please let us know if this helps,
-SWH

boyce-xx · 2017-06-15T01:13:02Z

@scotthea-amazon, thanks for your help. Following your suggestions, the error has been resolved.

scotthea-amazon added ADSL labels Jun 7, 2017

scotthea-amazon self-assigned this Jun 8, 2017

boyce-xx closed this as completed Jun 15, 2017
