This repository has been archived by the owner on Jan 16, 2024. It is now read-only.

Streaming mic data to AVS and play response audio with Mediaplayer #23

Closed
boyce-xx opened this issue Jun 7, 2017 · 4 comments

@boyce-xx

boyce-xx commented Jun 7, 2017

Hi, we can now run the default SpeechSynthesizerIntegrationTest app, get the response audio, and play it with MediaPlayer in the TEST_F(SpeechSynthesizerTest, handleOneSpeech) test function. Our question is: how can we stream mic data to AVS with the stopCapture feature, based on this test function?

We have tried modifying this test case (see the sample code below). Our expected flow is:
record mic data and stream it to AVS --> AVS detects a stopCapture --> stop recording --> play the response audio with MediaPlayer.

The build succeeds, but the test always blocks at the line ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::PREHANDLE);

TEST_F(SpeechTest, handleOneSpeech) {
// SpeechSynthesizerObserver defaults to a FINISHED state.
//ASSERT_EQ(m_speechSynthesizerObserver->waitForNext(WAIT_FOR_TIMEOUT_DURATION), SpeechSynthesizerState::FINISHED);

// Send audio of "Joke" that will prompt SetMute and Speak.
m_directiveSequencer->setDialogRequestId(FIRST_DIALOG_REQUEST_ID);


// Check that AIP is in an IDLE state before starting.
//ASSERT_TRUE(m_StateObserver->checkState(AudioInputProcessor::State::IDLE, AUDIO_FILE_TIMEOUT_DURATION));

// Request the alarm channel for the test channel client.
//ASSERT_TRUE(m_focusManager->acquireChannel(ALERTS_CHANNEL_NAME, m_testClient, ALARM_ACTIVITY_ID));
//ASSERT_EQ(m_testClient->waitForFocusChange(AUDIO_FILE_TIMEOUT_DURATION), FocusState::FOREGROUND);

// Signal to the AIP to start recognizing.
ASSERT_TRUE(m_tapToTalkButton->startRecognizing(m_AudioInputProcessor, m_TapToTalkAudioProvider));

// Check that AIP is now in RECOGNIZING state.
ASSERT_TRUE(m_StateObserver->checkState(AudioInputProcessor::State::RECOGNIZING, AUDIO_FILE_TIMEOUT_DURATION));

std::cout << "Start record wav file" << '\n';
//system("arecord -c 1 -r 16000 -f S16_LE -d 5 hello.wav");

// std::string file = inputPath + RECOGNIZE_JOKE_AUDIO_FILE_NAME;
std::string file = "hello.wav";
/*
setupMessageWithAttachmentAndSend(
    CT_FIRST_RECOGNIZE_EVENT_JSON,
    file,
    avsCommon::avs::MessageRequest::Status::SUCCESS,
    SEND_EVENT_TIMEOUT_DURATION);   */

std::cout << "file: " << file << std::endl;
  const int RIFF_HEADER_SIZE = 44;
  std::ifstream inputFile(file.c_str(), std::ifstream::binary);
  if (!inputFile.good()) {
    std::cout << "Couldn't open audio file!" << std::endl;
    return ;
  }
  std::cout << "open audio file success!" << std::endl;
  inputFile.seekg(0, std::ios::end);
  int fileLengthInBytes = inputFile.tellg();
  if (fileLengthInBytes <= RIFF_HEADER_SIZE) {
    std::cout << "File should be larger than 44 bytes, which is the size of the RIFF header" << std::endl;
    return ;
  }

  inputFile.seekg(RIFF_HEADER_SIZE, std::ios::beg);
  std::cout << "\t\tfile lenght In bytes: " << fileLengthInBytes << std::endl;
  int numSamples = (fileLengthInBytes - RIFF_HEADER_SIZE) / 5;
  std::cout << "\t\tnumSamples: " << numSamples << std::endl;
  std::vector<int16_t> retVal(115200, 0);
  //inputFile.read((char *)&retVal[0], numSamples * 2);
  //m_AudioBufferWriter->write(retVal.data(), retVal.size());

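  int index = 0;
  // Read the file in 115200-byte chunks until a read returns no data.
  while (inputFile.read((char *)retVal.data(), 115200) || inputFile.gcount() > 0) {
    index++;
    // gcount() is in bytes; write() is assumed to take a count of 16-bit words.
    m_AudioBufferWriter->write(retVal.data(), inputFile.gcount() / sizeof(int16_t));
    std::cout << "index= " << index << std::endl;
  }
  inputFile.close();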


TestMessageSender::SendParams sendRecognizeParams = m_avsConnectionManager->waitForNext(DIRECTIVE_TIMEOUT_DURATION);
ASSERT_TRUE(checkSentEventName(sendRecognizeParams, NAME_RECOGNIZE));

// Wait for the directive to route through to our handler.
TestDirectiveHandler::DirectiveParams params = m_directiveHandler->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::PREHANDLE);
params = m_directiveHandler->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::HANDLE);

// Unblock the queue so SpeechSynthesizer can do its work.
params.result->setCompleted();

// SpeechSynthesizer is now playing.
ASSERT_EQ(m_speechSynthesizerObserver->waitForNext(WAIT_FOR_TIMEOUT_DURATION), SpeechSynthesizerState::PLAYING);

//Check that SS grabs the channel focus by seeing that the test client has been backgrounded.
ASSERT_EQ(m_testClient->waitForFocusChange(WAIT_FOR_TIMEOUT_DURATION), FocusState::BACKGROUND);

// SpeechStarted was sent.
TestMessageSender::SendParams sendStartedParams = m_avsConnectionManager->waitForNext(DIRECTIVE_TIMEOUT_DURATION);
ASSERT_TRUE(checkSentEventName(sendStartedParams, NAME_SPEECH_STARTED));

// Media Player has finished.
ASSERT_EQ(m_speechSynthesizerObserver->waitForNext(WAIT_FOR_TIMEOUT_DURATION), SpeechSynthesizerState::FINISHED);

// SpeechFinished is sent here.
TestMessageSender::SendParams sendFinishedParams = m_avsConnectionManager->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
ASSERT_TRUE(checkSentEventName(sendFinishedParams, NAME_SPEECH_FINISHED));

// Alerts channel regains the foreground.
ASSERT_EQ(m_testClient->waitForFocusChange(WAIT_FOR_TIMEOUT_DURATION), FocusState::FOREGROUND);

}

@scotthea-amazon
Contributor

Hello boyce-xx,
Since you are indicating that you are blocked at an ASSERT_EQ with no logic in the parameters, do you really mean that the test fails there because the ASSERT_EQ fails? Or, do you mean something else?

Assuming that ASSERT_EQ fails, can you tell us what params.type value was actually encountered?
It may be that another operation (e.g. PREHANDLE) is being queued before HANDLE is received. A snapshot of a log from running this code (with any configuration parameters redacted) might help us understand what is going on in this case.

Also assuming the problem is that the ASSERT_EQ is failing, it may make sense to remove the ASSERT_EQ and loop around waitForNext() until params.type is HANDLE.
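
The loop suggested above can be sketched roughly as follows. This is an illustration under stated assumptions, not the SDK's test code: `DirectiveParams` and `FakeDirectiveHandler` are stand-ins for the real test classes, the enumerators are placeholders, and the sketch assumes waitForNext() reports a TIMEOUT type when nothing arrives in time.

```cpp
#include <chrono>
#include <deque>

// Stand-in for TestDirectiveHandler::DirectiveParams; enumerators are illustrative only.
struct DirectiveParams {
    enum class Type { UNSET, PREHANDLE, HANDLE, CANCEL, TIMEOUT };
    Type type;
};

// Stand-in handler: returns queued params, or TIMEOUT once the queue is empty.
class FakeDirectiveHandler {
public:
    void push(DirectiveParams::Type type) { m_queue.push_back(DirectiveParams{type}); }

    DirectiveParams waitForNext(std::chrono::seconds /*timeout*/) {
        if (m_queue.empty()) return DirectiveParams{DirectiveParams::Type::TIMEOUT};
        DirectiveParams next = m_queue.front();
        m_queue.pop_front();
        return next;
    }

private:
    std::deque<DirectiveParams> m_queue;
};

// Polls until a HANDLE arrives. Gives up on TIMEOUT or after maxAttempts polls,
// so an unexpected sequence cannot hang the test forever.
template <typename Handler>
bool waitForHandle(Handler& handler, std::chrono::seconds timeout, int maxAttempts, DirectiveParams* out) {
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        DirectiveParams params = handler.waitForNext(timeout);
        if (params.type == DirectiveParams::Type::HANDLE) {
            *out = params;
            return true;
        }
        if (params.type == DirectiveParams::Type::TIMEOUT) return false;
        // Any other type (e.g. PREHANDLE) is skipped and the loop polls again.
    }
    return false;
}
```

In the test itself this would replace the paired ASSERT_EQ calls with a single `ASSERT_TRUE(waitForHandle(...))`, which tolerates a PREHANDLE arriving first without requiring it.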

Please let us know if this helps, and what you find,
-SWH

@boyce-xx
Author

boyce-xx commented Jun 8, 2017

Hi @scotthea-amazon,
Thanks for your explanation.
We can now stream an audio file to AVS and successfully play the response audio with MediaPlayer, but I hit another issue after playback: the client posts to AVS again and then the test case fails. Please see the log below.

MediaPlayer:doStopSuccess:reason=alreadyStopped
MediaPlayer:handlePlayCalled
MediaPlayer:doStopSuccess:reason=alreadyStopped
MediaPlayer:callingOnPlaybackStarted
MediaPlayer:handleGetOffsetInMillisecondsCalled
Updated state for state provider of namespace:SpeechSynthesizer and name:SpeechState to {"token":"amzn1.as-ct.v1.Domain:Application:Weather#ACRI#WeatherPrompt.708a00cc-b34b-457e-9d0b-711170d5c721","offsetInMilliseconds":268,"playerActivity":"PLAYING"}

  • Found bundle for host avs-alexa-na.amazon.com: 0x7f7c10001720 [can multiplex]
  • Conn: 0 (0x7f7c10000b60) Receive pipe weight: (-1/0), penalized: FALSE
  • Multiplexed connection found!
  • Found connection 0, with requests in the pipe (2)
  • Re-using existing connection! (#0) with host avs-alexa-na.amazon.com
  • Using Stream ID: 5 (easy handle 0x7f7bf400ac90)

POST /v20160207/events HTTP/2
Host: avs-alexa-na.amazon.com
Accept: */*
Authorization: Bearer Atza|XXXXXXXXXx
Content-Length: 416
Content-Type: multipart/form-data; boundary=------------------------bca30f6496ef7a3f

InProcessAttachmentReader:readFailed:reason=SDS is closed
< HTTP/2 204
< access-control-allow-origin: *
< x-amzn-requestid: 0a0c63fffe9571f6-000034c3-0002c577-573a69c508f754f5-808f3433-5
<
MediaPlayer:callingOnPlaybackFinished
Updated state for state provider of namespace:SpeechSynthesizer and name:SpeechState to {"token":"amzn1.as-ct.v1.Domain:Application:Weather#ACRI#WeatherPrompt.708a00cc-b34b-457e-9d0b-711170d5c721","offsetInMilliseconds":0,"playerActivity":"FINISHED"}

DirectiveProcessor:onHandlingCompeted:messageId=a4e9f05e-fd4e-4b44-95c8-d5eb0b4b3e4f,directiveBeingPreHandled=(nullptr)
CapabilityAgent:removingMessageIdFromMap:messageId=a4e9f05e-fd4e-4b44-95c8-d5eb0b4b3e4f

  • Found bundle for host avs-alexa-na.amazon.com: 0x7f7c10001720 [can multiplex]
  • Conn: 0 (0x7f7c10000b60) Receive pipe weight: (-1/0), penalized: FALSE
  • Multiplexed connection found!
  • Found connection 0, with requests in the pipe (1)
  • Re-using existing connection! (#0) with host avs-alexa-na.amazon.com
  • Using Stream ID: 7 (easy handle 0x7f7c101216e0)

POST /v20160207/events HTTP/2
Host: avs-alexa-na.amazon.com
Accept: */*
Authorization: Bearer Atza|XXXXXXXXXXXX
Content-Length: 417
Content-Type: multipart/form-data; boundary=------------------------25187b641250e137

< HTTP/2 204
< access-control-allow-origin: *
< x-amzn-requestid: 0a0c63fffe9571f6-000034c3-0002c577-573a69c508f754f5-808f3433-7
<
/home/bob/Desktop/Document/TeddyBear/AVS/Linux+C/0.4/SDK/20170602_gst/alexa-client-sdk-master/Integration/test/SpeechTest.cpp:1108: Failure
Expected: params.type
Which is: 4-byte object <06-00 00-00>
To be equal to: TestDirectiveHandler::DirectiveParams::Type::HANDLE
Which is: 4-byte object <03-00 00-00>

  • Closing connection 0
    DirectiveSequencer:shutdown
    [ FAILED ] SpeechTest.handleOneSpeech (34222 ms)
    [----------] 1 test from SpeechTest (34222 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (34222 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] SpeechTest.handleOneSpeech


# My TEST_F is:

TEST_F(SpeechTest, handleOneSpeech) {
// SpeechSynthesizerObserver defaults to a FINISHED state.
//ASSERT_EQ(m_speechSynthesizerObserver->waitForNext(WAIT_FOR_TIMEOUT_DURATION), SpeechSynthesizerState::FINISHED);

// Send audio of "Joke" that will prompt SetMute and Speak.
//m_directiveSequencer->setDialogRequestId(FIRST_DIALOG_REQUEST_ID);

// Check that AIP is in an IDLE state before starting.
//ASSERT_TRUE(m_StateObserver->checkState(AudioInputProcessor::State::IDLE, AUDIO_FILE_TIMEOUT_DURATION));

// Request the alarm channel for the test channel client.
ASSERT_TRUE(m_focusManager->acquireChannel(ALERTS_CHANNEL_NAME, m_testClient, ALARM_ACTIVITY_ID));
//ASSERT_EQ(m_testClient->waitForFocusChange(AUDIO_FILE_TIMEOUT_DURATION), FocusState::FOREGROUND);

// Signal to the AIP to start recognizing.
ASSERT_TRUE(m_tapToTalkButton->startRecognizing(m_AudioInputProcessor, m_TapToTalkAudioProvider));

// Check that AIP is now in RECOGNIZING state.
//ASSERT_TRUE(m_StateObserver->checkState(AudioInputProcessor::State::RECOGNIZING, AUDIO_FILE_TIMEOUT_DURATION));

std::cout << "Start record wav file" << '\n';
std::string file = "weather.wav";

  std::cout << "file: " << file << std::endl;
  const int RIFF_HEADER_SIZE = 44;
  std::ifstream inputFile(file.c_str(), std::ifstream::binary);
  if (!inputFile.good()) {
    std::cout << "Couldn't open audio file!" << std::endl;
    return ;
  }
  std::cout << "open audio file success!" << std::endl;
  inputFile.seekg(0, std::ios::end);
  int fileLengthInBytes = inputFile.tellg();
  if (fileLengthInBytes <= RIFF_HEADER_SIZE) {
    std::cout << "File should be larger than 44 bytes, which is the size of the RIFF header" << std::endl;
    return ;
  }

  inputFile.seekg(RIFF_HEADER_SIZE, std::ios::beg);
  std::cout << "\t\tfile lenght In bytes: " << fileLengthInBytes << std::endl;
  int numSamples = (fileLengthInBytes - RIFF_HEADER_SIZE) / 5;
  std::cout << "\t\tnumSamples: " << numSamples << std::endl;
  std::vector<int16_t> retVal(115200, 0);

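  int index = 0;
  // Read the file in 115200-byte chunks until a read returns no data.
  while (inputFile.read((char *)retVal.data(), 115200) || inputFile.gcount() > 0) {
    index++;
    // gcount() is in bytes; write() is assumed to take a count of 16-bit words.
    m_AudioBufferWriter->write(retVal.data(), inputFile.gcount() / sizeof(int16_t));
    std::cout << "index= " << index << std::endl;
  }
  inputFile.close();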


TestMessageSender::SendParams sendRecognizeParams = m_avsConnectionManager->waitForNext(DIRECTIVE_TIMEOUT_DURATION);
ASSERT_TRUE(checkSentEventName(sendRecognizeParams, NAME_RECOGNIZE));

// Wait for the directive to route through to our handler.
TestDirectiveHandler::DirectiveParams params = m_directiveHandler->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
//ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::PREHANDLE);
params = m_directiveHandler->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::HANDLE);

// Unblock the queue so SpeechSynthesizer can do its work.
params.result->setCompleted();

// SpeechSynthesizer is now playing.
ASSERT_EQ(m_speechSynthesizerObserver->waitForNext(WAIT_FOR_TIMEOUT_DURATION), SpeechSynthesizerState::PLAYING);

//Check that SS grabs the channel focus by seeing that the test client has been backgrounded.
ASSERT_EQ(m_testClient->waitForFocusChange(WAIT_FOR_TIMEOUT_DURATION), FocusState::BACKGROUND);

// SpeechStarted was sent.
TestMessageSender::SendParams sendStartedParams = m_avsConnectionManager->waitForNext(DIRECTIVE_TIMEOUT_DURATION);
ASSERT_TRUE(checkSentEventName(sendStartedParams, NAME_SPEECH_STARTED));

// Media Player has finished.
ASSERT_EQ(m_speechSynthesizerObserver->waitForNext(WAIT_FOR_TIMEOUT_DURATION), SpeechSynthesizerState::FINISHED);

// SpeechFinished is sent here.
TestMessageSender::SendParams sendFinishedParams = m_avsConnectionManager->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
ASSERT_TRUE(checkSentEventName(sendFinishedParams, NAME_SPEECH_FINISHED));

// Alerts channel regains the foreground.
ASSERT_EQ(m_testClient->waitForFocusChange(WAIT_FOR_TIMEOUT_DURATION), FocusState::FOREGROUND);

}

@scotthea-amazon scotthea-amazon self-assigned this Jun 8, 2017
@scotthea-amazon
Contributor

Hello boyce-xx,

Looking through our previous exchange, I see that I misread your first post and thought the failing ASSERT_EQ was the one checking for HANDLE, not the one checking for PREHANDLE. Sorry!

The logs you provided have been very helpful. From the server side logs I can see that SpeechStarted and SpeechFinished events were received. The last 204 in the log you provided is in response to the SpeechFinished. So, by the time you are reaching those ASSERT_EQ tests it appears your client has already processed the SpeechSynthesizer.Speak directive. In the original test those ASSERT_EQs are intended to wait on and test for the handling of a Speaker.SetMute directive. That SetMute directive is supposed to be handled before the SpeechSynthesizer.Speak directive. If for some reason you are not receiving the SetMute() directive, you could just remove the following lines from your test:

// Wait for the directive to route through to our handler.
TestDirectiveHandler::DirectiveParams params = m_directiveHandler->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
//ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::PREHANDLE);
params = m_directiveHandler->waitForNext(WAIT_FOR_TIMEOUT_DURATION);
ASSERT_EQ(params.type, TestDirectiveHandler::DirectiveParams::Type::HANDLE);

// Unblock the queue so SpeechSynthesizer can do its work.
params.result->setCompleted();

A couple of related questions:

  • Are you using the DirectiveSequencer?
  • If so, have you registered your own handler for Speaker.SetMute?

Please let us know if this helps,
-SWH

@boyce-xx
Author

@scotthea-amazon, thanks for your help. After following your suggestions, the error has been resolved.
