Sample: Prime speech using a tap-to-dismiss splash screen on Safari #995
Comments
@rodmcleay, we have just tested it on an iPhone with iOS 11.4, running Safari and using Cognitive Services Speech, and it works. Can you check a few things?
I agree we need to make speech detection more robust and informative, but we also need to make sure detection doesn't pop up the "Access to Microphone" dialog too early. Unfortunately, in some cases, you can't have both. |
How can I get it working on Chrome for iOS? Any help would be greatly appreciated. |
@shubhamchawla It doesn't work in Chrome for iOS because Chrome does not support WebRTC on iOS. Right now, the only browser on iOS that supports WebRTC is Safari. |
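The two constraints discussed above (detect whether speech input can work, but don't trigger the microphone permission dialog early) can be approached with plain feature detection. This is a minimal sketch, assuming the presence of a `getUserMedia` entry point is a good enough proxy for capture support; `canCaptureMicrophone` is a hypothetical helper and not part of Web Chat.

```javascript
// Hypothetical helper (not part of Web Chat): checks whether the browser
// exposes any getUserMedia entry point. It only inspects properties and never
// calls getUserMedia, so it cannot trigger the microphone permission dialog.
function canCaptureMicrophone(nav) {
  if (!nav) {
    return false;
  }

  const modern =
    !!nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function';
  const legacy =
    typeof (nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia) === 'function';

  return modern || legacy;
}

// Browser-only usage; skipped when no window object is available.
if (typeof window !== 'undefined') {
  if (!canCaptureMicrophone(window.navigator)) {
    console.warn('Microphone capture is not available in this browser.');
  }
}
```

A check like this could be used to hide or disable the microphone button up front, rather than letting the user tap it and get no feedback.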
Hi Compulim,
Is that the config you would expect? Get token is on the client at the moment.
|
I got it working with Safari and Firefox with the following JavaScript. I just include this in a JavaScript file while still using the linked CognitiveServices.js file from the CDN. I use the Bing speech recognizer and the browser speech synthesizer. This works because their current version uses window.navigator.getUserMedia, which is deprecated, so I changed that to use window.navigator.mediaDevices.getUserMedia. Safari also has problems playing audio with the speech synthesizer programmatically, so I register an event on the microphone click that plays a sound from the speech synthesizer and then removes itself. Finally, Safari has problems recording audio programmatically as well, so I create the audio context before actually needing it and connect the processor. Safari doesn't allow recording audio or playing audio with the speech synthesizer unless it is a direct result of a touch or tap. This restriction also applies to the then part of the promise returned from window.navigator.mediaDevices.getUserMedia. I've tested this with the latest versions of Chrome, Firefox, and Edge on Windows 10, Chrome on Android, and Safari on an iPad Pro. The only browser I haven't gotten it to work on is Internet Explorer.
|
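The script referred to in the comment above did not survive in this copy of the thread. What follows is not that script. It is only a sketch of the first step described there: forwarding the deprecated callback-style `navigator.getUserMedia` to the promise-based `navigator.mediaDevices.getUserMedia`. The name `installLegacyGetUserMedia` is hypothetical, and whether CognitiveServices.js picks up the shim depends on how it looks up `getUserMedia`, which is an assumption here.

```javascript
// Hypothetical shim, not the commenter's original script: exposes the legacy
// callback-style navigator.getUserMedia on top of the promise-based
// navigator.mediaDevices.getUserMedia.
function installLegacyGetUserMedia(nav) {
  const hasModern =
    !!nav && !!nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function';

  if (!hasModern) {
    return false; // nothing to forward to
  }

  nav.getUserMedia = function (constraints, onSuccess, onError) {
    // Forward to the modern API and map the promise back onto callbacks.
    nav.mediaDevices.getUserMedia(constraints).then(onSuccess, onError);
  };

  return true;
}

// Browser-only usage; must run before the speech library requests the microphone.
if (typeof window !== 'undefined') {
  installLegacyGetUserMedia(window.navigator);
}
```

Note that, per the comment, Safari does not treat the `then` callback as a user gesture, so anything that needs a gesture (such as creating the audio context) would have to happen before the promise resolves.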
@rosskyl this is a good hack that doesn't require touching the Web Chat code. Can you explain a little more about the synthesis part? Do you mean Safari requires a touch/tap for both the synthesis and the recognition parts? |
The first time you use either the speech synthesizer or the recognizer, it needs to be triggered by a user touch or tap. Once speech synthesis had been triggered that way, I was able to get it to work without needing another touch or tap. Apple requires this to prevent web pages from automatically playing or recording audio, even though all of the other browsers allow it. The speech synthesis or recognizer will not work if they are triggered from a |
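The one-time priming described above can be sketched as a listener that speaks once on the first tap and then removes itself. This is a minimal sketch under assumptions: `primeSpeechOnFirstTap` is a hypothetical helper, and speaking an empty utterance is one guess at what "play a sound from the speech synthesizer" could look like (the thread does not say what was actually played).

```javascript
// Hypothetical helper: primes a speech synthesizer on the first click/tap of
// `target`, then unregisters itself so later taps are unaffected.
function primeSpeechOnFirstTap(target, synth, makeUtterance) {
  function handler() {
    target.removeEventListener('click', handler);
    // Assumption: an empty utterance is enough to unlock later programmatic speech.
    synth.speak(makeUtterance(''));
  }

  target.addEventListener('click', handler);
  return handler;
}

// Browser-only usage; document.body stands in for a splash screen or mic button.
if (typeof window !== 'undefined' && 'speechSynthesis' in window) {
  primeSpeechOnFirstTap(
    document.body,
    window.speechSynthesis,
    (text) => new SpeechSynthesisUtterance(text)
  );
}
```

Passing the target, synthesizer, and utterance factory as arguments keeps the helper independent of any particular page layout, so the same function could be attached to a tap-to-dismiss splash screen instead of the microphone button.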
@rosskyl Thanks for the explanation. I totally understand the tap/touch requirement for the recognizer, but it just feels weird to me for the synthesis part. I bet one doesn't need to tap/touch for WebAudio. Anyway, it's Apple's requirement, so we need to work with it. 😉 |
You could try it without adding the event listener, but I couldn't get it to work without it. You could also write your own custom speech synthesizer and try it with WebAudio. I originally wrote my own that used the speech synthesizer, but ended up with the same problem the BrowserSpeechSynthesizer had. I fixed it with the event listener and figured out it worked with the BrowserSpeechSynthesizer also. |
Thanks @rosskyl. I will make this a bug. BTW, we are planning to polyfill the HTML WebSpeech API using Cognitive Services, so we don't need to maintain two different APIs and can bring Cognitive Services to platforms that do not support WebSpeech (e.g. Edge, desktop Firefox). As always, we welcome contributions, and we will take quality projects as dependencies. Anyway, note to bug fixer:
|
Hi @rosskyl If I add your code to the project, it gives me an error when I press the microphone. Can you please help me? The error I get in Chrome is this:
And in Firefox is: I am using cognitiveServices. What can be failing? Thanks |
I believe that is because some of the internals of cognitiveServices have changed. The following is what I currently use:
I have this working for all of the major browsers on Windows, Android, macOS, and iOS. |
@rosskyl This works much better, at least for the rest of the browsers. I have already made it work on any mobile device. In the end I did it in the following way:
Apart from that, I do some other checks, such as whether the user is on a mobile device and whether the message comes from the microphone. |
Just a note that this will only work for the browser speech synthesizer. It does not work for the cognitive services speech synthesizer. I tried to prime it like above by creating an audio context and playing a tone but that does not work. I can get the tone to play on the mic tap, but can't get it to work programmatically. |
@rosskyl |
Closing due to lack of activity - see linked samples issues above. |
I have a Web Chat control for a bot that is up and running and working well in Chrome. The link How to enable speech in Web Chat shows how to set this up, and we have done it exactly like this.
It mentions multiple browsers, but does not mention Safari in any way.
We need this working on an iPhone, but it just doesn't seem to work. There is not a lot of feedback from the browser: the icon changes, and it appears to have turned on the microphone after access is approved.
Nothing spoken is recorded or recognized, and the text area of the bot stays empty: no 'listening....' or any other indication it's working, other than the red microphone-on icon in the browser header. Clicking the icon mutes and un-mutes as you'd expect; it just doesn't seem to be connected to the Web Chat control in the browser.
All of my investigation appears to go around in circles.
Thanks for taking the time to read. Any assistance would be much appreciated; I'm at the end of this investigation and pulling my hair out.