Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iOS Safari often doesn't trigger start or end events on SpeechSynthesisUtterance #66

Open
jessegreenberg opened this issue Mar 11, 2022 · 11 comments
Assignees
Labels
dev:voicing type:bug Something isn't working

Comments

@jessegreenberg
Copy link
Contributor

jessegreenberg commented Mar 11, 2022

While working on related issue phetsims/number-play#134 (comment) and #65 I am seeing that the start and end event listeners frequenly fail to fire in iOS Safari. Causing an assertion in UtteranceQueue because the this.announcingUtteranceWrapper isn't cleared on the end event. And reading blocks do not highlight because the start event doesn't fire.

I tried re-introducing the safari workaround removed in #52, but it doesn't seem to fix it always. Though maybe it is better? It is hard to say, it seems to happen randomly.

@jessegreenberg jessegreenberg added the type:bug Something isn't working label Mar 11, 2022
@jessegreenberg jessegreenberg self-assigned this Mar 11, 2022
@jessegreenberg jessegreenberg changed the title iOS Safari often doesn't trigger start or end event listeners iOS Safari often doesn't trigger start or end events on SpeechSynthesisUtterance Mar 11, 2022
@jessegreenberg
Copy link
Contributor Author

I am not too interested in finding a way to "trick" Safari into always firing these events. It seems too unpredictable. Instead I am wondering if we can find a way to handle this gracefully and fallback to some behavior when we don't get these events.

@jessegreenberg
Copy link
Contributor Author

jessegreenberg commented Mar 11, 2022

Stream of consciousness notes to help think it through...

Sometimes we don't get a start event or an end event and synth.speaking is false. It is as though we never requested speech in SpeechSynthesisAnnouncer. But listeners are added in UtteranceQueue assuming they will be removed by the end event that we never get.

We need to gracefully handle these failure cases somehow, detecting when we request speech but speech never starts. This way the SpeechSynthesisAnnouncer can notify the UtteranceQueue that listeners should be removed.

Proposal: Keep track of when speech is requested in SpeechSynthesisAnnouncer. If synth.speaking is not true in a reasonable amount of time then we can assume there has been a failure and we will get no events nor start speaking. We cancel the synth and remove listeners. Something to think about is adding this failed utterance to the back of the queue again to try until we have a success.

EDIT: Although, if we keep track of the time before the start event instead of the time before synth.speaking becomes true then this workaround may also work for the ChromeOS issue in phetsims/number-play#138 (comment)

@jessegreenberg
Copy link
Contributor Author

jessegreenberg commented Mar 12, 2022

The proposal is working really well so I'm going to commit it. The fundamental changes are

  1. There is now a pendingUtterance on SpeechSynthesisAnnouncer that is set immediatelety in requestSpeech. It's existence well prevent the Announcer from setting readyToSpeak = true so that the UtteranceQueue cannot announce anything if we are still waiting for something to speak. It could take longer than VOICING_UTTERANCE_INTERVAL between the call to getSynth().speak() and the time getSynth().speaking becomes true.
  2. If the amount of time between setting the pendingUtterance and the start event is greater than PENDING_UTTERANCE_DELAY we can assume there has been a failure of the engine and we cancel speech and remove listeners before the next attempt to speak.

@jessegreenberg
Copy link
Contributor Author

Next, I want to fix #65 and then Ill be able to verify if this is fixed in number-play where it was originally found. Finally, Ill check and see if this is working in ChromeOS.

@jessegreenberg
Copy link
Contributor Author

#65 is resolved and number-play is working well in iOS Safari. The failure case is now being detected but not handled in ChromeOS. Lets continue that part of the issue in #64.

@zepumph could you please review the two commits here and the summary of these changes in #66 (comment)?

@jessegreenberg
Copy link
Contributor Author

jessegreenberg commented Mar 13, 2022

This implementation could change a bit to be more complete and support the needs of phetsims/joist#782. More specifically, cancel and cancelUtterance need to also cancel the pendingUtterance.

@jessegreenberg
Copy link
Contributor Author

In the above commit I made it so that we remove listeners on the pending utterance when cancel functions are called. Also changed pendingUtterance to pendingSpeechSynthesisUtteranceWrapper so that we have access to the listeners to remove.

I think this is better but I was wrong about this causing phetsims/joist#782.

Ready for review again, sorry for the confusion let me know if you would like to talk about these changes.

@zepumph
Copy link
Member

zepumph commented Jun 9, 2022

const PENDING_UTTERANCE_DELAY = 5000;

Can you please write a note about how 5 seconds was chosen, Even if you say "it is kinda random and it worked so I went with it", I think it will help us maintain the code better.

const utteranceWrapperToEnd = utterance === this.currentlySpeakingUtterance ? this.speakingSpeechSynthesisUtteranceWrapper :
( this.pendingSpeechSynthesisUtteranceWrapper && utterance === this.pendingSpeechSynthesisUtteranceWrapper.utterance ) ? this.pendingSpeechSynthesisUtteranceWrapper :
null;

It would be nice to talk about this complexity, from the looks of it it appears we need to be graceful to this function being called multiple times? But I'm not sure why it (and cancel() has to look like this). More generally, I see pendingSpeechSynthesisUtteranceWrapper used 18 times in the file. That feels pretty extensive and I wonder if we can simplify it now that we understand the whole problem a bit more.

Maybe it would be best to discuss this further.

@zepumph zepumph assigned jessegreenberg and unassigned zepumph Jun 9, 2022
@zepumph
Copy link
Member

zepumph commented Jun 10, 2022

  • Delete currentlySpeakingUtterance, and use speakingSpeechSynthesisUtteranceWrapper.utterance in all of those places instead.
  • Delete pendingSpeechSynthesisUtteranceWrapper, and add a pending flag to SpeechSynthesisUtteranceWrapper that is set to false in the start listener.
Index: js/SpeechSynthesisAnnouncer.ts
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/js/SpeechSynthesisAnnouncer.ts b/js/SpeechSynthesisAnnouncer.ts
--- a/js/SpeechSynthesisAnnouncer.ts	(revision 366673277fbae0ca372bc55f6e6f56ea9e48002e)
+++ b/js/SpeechSynthesisAnnouncer.ts	(date 1654884892112)
@@ -332,7 +332,8 @@
       // start counting up until the synth has finished speaking its current utterance.
       this.timeSinceUtteranceEnd = synth.speaking ? 0 : this.timeSinceUtteranceEnd + dt;
 
-      this.timeSincePendingUtterance = this.pendingSpeechSynthesisUtteranceWrapper ? this.timeSincePendingUtterance + dt : 0;
+      this.timeSincePendingUtterance = this.speakingSpeechSynthesisUtteranceWrapper && this.speakingSpeechSynthesisUtteranceWrapper.pending ?
+                                       this.timeSincePendingUtterance + dt : 0;
 
       if ( this.timeSincePendingUtterance > PENDING_UTTERANCE_DELAY ) {
         assert && assert( this.pendingSpeechSynthesisUtteranceWrapper, 'should have this.pendingSpeechSynthesisUtteranceWrapper' );
@@ -350,7 +351,7 @@
       // see documentation for the constant for more information. By setting readyToAnnounce in the step function
       // we also don't have to rely at all on the SpeechSynthesisUtterance 'end' event, which is inconsistent on
       // certain platforms. Also, not ready to announce if we are waiting for the synth to start speaking something.
-      if ( this.timeSinceUtteranceEnd > VOICING_UTTERANCE_INTERVAL && !this.pendingSpeechSynthesisUtteranceWrapper ) {
+      if ( this.timeSinceUtteranceEnd > VOICING_UTTERANCE_INTERVAL && !( this.speakingSpeechSynthesisUtteranceWrapper && this.speakingSpeechSynthesisUtteranceWrapper.pending ) ) {
         this.readyToAnnounce = true;
       }
 
@@ -489,7 +490,7 @@
       // Important that the pendingSpeechSynthesisUtteranceWrapper is cleared in the start event instead of when `synth.speaking` is
       // set to true because `synth.speaking` is incorrectly set to true before there is successful speech in ChromeOS.
       // See https://github.com/phetsims/utterance-queue/issues/66 and https://github.com/phetsims/utterance-queue/issues/64
-      this.pendingSpeechSynthesisUtteranceWrapper = null;
+      this.speakingSpeechSynthesisUtteranceWrapper.pending = false;
       this.currentlySpeakingUtterance = utterance;
 
       // Interrupt if the Utterance can no longer be announced.
@@ -536,9 +537,6 @@
     // See https://github.com/phetsims/utterance-queue/issues/40
     this.timeSinceUtteranceEnd = 0;
 
-    // Utterance is pending until we get a successful 'start' event on the SpeechSynthesisUtterance
-    this.pendingSpeechSynthesisUtteranceWrapper = speechSynthesisUtteranceWrapper;
-
     this.getSynth()!.speak( speechSynthUtterance );
   }
 
@@ -579,7 +577,6 @@
   cancel(): void {
     if ( this.initialized ) {
       const utteranceToCancel = this.speakingSpeechSynthesisUtteranceWrapper ? this.speakingSpeechSynthesisUtteranceWrapper.utterance :
-                                this.pendingSpeechSynthesisUtteranceWrapper ? this.pendingSpeechSynthesisUtteranceWrapper.utterance :
                                 null;
 
       if ( utteranceToCancel ) {
@@ -594,9 +591,9 @@
    * (utterance-queue internal)
    */
   override cancelUtterance( utterance: Utterance ): void {
-    const utteranceWrapperToEnd = utterance === this.currentlySpeakingUtterance ? this.speakingSpeechSynthesisUtteranceWrapper :
-                                  ( this.pendingSpeechSynthesisUtteranceWrapper && utterance === this.pendingSpeechSynthesisUtteranceWrapper.utterance ) ? this.pendingSpeechSynthesisUtteranceWrapper :
-                                  null;
+    const utteranceWrapperToEnd = this.speakingSpeechSynthesisUtteranceWrapper ?
+                                  this.speakingSpeechSynthesisUtteranceWrapper.utterance === utterance ?
+                                  this.speakingSpeechSynthesisUtteranceWrapper : null : null;
 
     if ( utteranceWrapperToEnd ) {
       this.handleSpeechSynthesisEnd( utteranceWrapperToEnd.utterance.getAlertText(), utteranceWrapperToEnd );

@jessegreenberg
Copy link
Contributor Author

jessegreenberg commented Jul 15, 2022

Coming back to this, remaining work:

@jessegreenberg
Copy link
Contributor Author

I got blocked by #84 last time I worked on #66 (comment), Ill try finish this next.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dev:voicing type:bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants