A button to start speech recognition using the Web Speech API, with an easy-to-understand event lifecycle.
- Requires react@>=16.8.0 and core-js@3
- Modifying props while recognition has started will no longer abort recognition immediately; props will be updated in the next recognition
- SpeechGrammarList is only constructed when the grammar prop is present
- If the speechRecognition prop is not present, capability detection is now done through window.navigator.mediaDevices.getUserMedia
Try out this component at github.io/compulim/react-dictate-button.
Reasons why we need to build our own component, instead of using existing packages on NPM:
- Most browsers require speech recognition (or WebRTC) to be triggered by a user event (button click)
- Bring your own engine for Web Speech API
- Enable speech recognition on unsupported browsers by bridging it with a cloud-based service
- Support grammar lists through JSpeech Grammar Format
- Ability to interrupt recognition
- Ability to morph into other elements
First, install our production version with npm install react-dictate-button. Or install our development version with npm install react-dictate-button@master.
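import React from 'react';
import DictateButton from 'react-dictate-button';

// Plain functions are used as handlers because `this` is not bound in a function component.
const handleDictate = ({ result }) => console.log('Dictated:', result && result.transcript);
const handleProgress = ({ results }) => console.log('Interim results:', results);

export default () => (
  <DictateButton
    className="my-dictate-button"
    grammar="#JSGF V1.0; grammar districts; public <district> = Tuen Mun | Yuen Long;"
    lang="en-US"
    onDictate={handleDictate}
    onProgress={handleProgress}
  >
    Start/stop
  </DictateButton>
);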
Name | Type | Default | Description |
---|---|---|---|
className | string | undefined | Class name to apply to the button |
disabled | boolean | false | true to abort ongoing recognition and disable the button, otherwise, false |
extra | { [key: string]: any } | {} | Additional properties to set on SpeechRecognition before start, useful when bringing your own SpeechRecognition |
grammar | string | undefined | Grammar list in JSGF format |
lang | string | undefined | Language to recognize, for example, 'en-US' or navigator.language |
speechGrammarList | any | window.SpeechGrammarList (or vendor-prefixed) | Bring your own SpeechGrammarList |
speechRecognition | any | window.SpeechRecognition (or vendor-prefixed) | Bring your own SpeechRecognition |
Note: changes to extra, grammar, lang, speechGrammarList, and speechRecognition will not take effect until the next speech recognition is started.
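One way to picture the note above is that the props are read once, at the moment recognition starts. The following is a minimal sketch under that assumption, not the component's actual implementation: createRecognition is a hypothetical helper, and the caller is assumed to pass in the constructors (browser-provided or their own). Only addFromString, grammars, and lang come from the Web Speech API.

```javascript
// Hypothetical helper, not part of react-dictate-button: builds a configured
// recognition object from a snapshot of props taken when recognition starts.
function createRecognition(props) {
  const { extra = {}, grammar, lang, speechGrammarList, speechRecognition } = props;

  // Construct the engine that was passed in (browser-provided or your own).
  const recognition = new speechRecognition();

  // Only build a grammar list when a grammar string is actually provided.
  if (grammar && speechGrammarList) {
    const grammars = new speechGrammarList();

    grammars.addFromString(grammar, 1);
    recognition.grammars = grammars;
  }

  if (lang) {
    recognition.lang = lang;
  }

  // Copy any extra properties onto the engine before it is started.
  Object.assign(recognition, extra);

  return recognition;
}
```

Because the object is fully configured before start, later prop changes have nothing to act on until a new object is created for the next recognition.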
Name | Signature | Description |
---|---|---|
onClick | (event: MouseEvent) => void | Emitted when the user clicks the button; calling preventDefault will stop recognition from starting |
onDictate | ({ result: { confidence: number, transcript: string }, type: 'dictate' }) => void | Emitted when recognition is completed |
onError | (event: SpeechRecognitionErrorEvent) => void | Emitted when an error has occurred or recognition is interrupted, see below |
onProgress | ({ abortable: boolean, results: [{ confidence: number, transcript: string }], type: 'progress' }) => void | Emitted for interim results; the array contains every segment of recognized text |
onRawEvent | (event: SpeechRecognitionEvent) => void | Emitted for handling raw events from SpeechRecognition |
Although previous versions exported a React Context, it is recommended to use the hooks interface.
Name | Signature | Description |
---|---|---|
useAbortable | [boolean] | If ongoing speech recognition can be aborted, true, otherwise, false |
useReadyState | [number] | Returns the current state of recognition, refer to the ready state table below |
useSupported | [boolean] | If speech recognition is supported, true, otherwise, false |
To determine whether speech recognition is supported in the browser:
- If the speechRecognition prop is undefined
  - If both window.navigator.mediaDevices and window.navigator.mediaDevices.getUserMedia are falsy, it is not supported
    - Probably the browser is not on a secure HTTP connection
  - If both window.SpeechRecognition and vendor-prefixed are falsy, it is not supported
  - If recognition failed once with not-allowed error code, it is not supported
- Otherwise, it is supported
Even if the browser is on an insecure HTTP connection, window.SpeechRecognition (or vendor-prefixed) will continue to be truthy. Instead, mediaDevices.getUserMedia is used for capability detection.
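The detection rules above can be sketched as a single function. This is an illustration, not the package's code: isSupported, its win parameter (a window-like object), and the failedWithNotAllowed flag are hypothetical names. It also assumes the vendor-prefixed constructor is webkitSpeechRecognition, and it treats a missing getUserMedia as unsupported even when mediaDevices exists.

```javascript
// Sketch only: `win` is a window-like object and `failedWithNotAllowed` is a
// hypothetical flag the caller sets after seeing a 'not-allowed' error once.
function isSupported(win, { speechRecognition, failedWithNotAllowed = false } = {}) {
  // A caller-supplied engine skips browser capability detection entirely.
  if (typeof speechRecognition !== 'undefined') {
    return true;
  }

  const mediaDevices = win.navigator && win.navigator.mediaDevices;

  // mediaDevices is typically missing when the page is not on a secure connection.
  if (!mediaDevices || !mediaDevices.getUserMedia) {
    return false;
  }

  // Neither the standard nor the (assumed) webkit-prefixed constructor exists.
  if (!win.SpeechRecognition && !win.webkitSpeechRecognition) {
    return false;
  }

  // A previous 'not-allowed' failure marks recognition as unsupported.
  return !failedWithNotAllowed;
}
```

In a browser this would be called as isSupported(window); passing the window in as a parameter keeps the function easy to exercise with plain objects.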
One design goal is to make events easy to understand and deterministic. The first rule of thumb is that onProgress will always lead to either onDictate or onError. Here are some sample event sequences (tested on Chrome 67):
- Happy path: speech is recognized
  - onProgress({}) (just started, therefore, no results)
  - onProgress({ results: [] })
  - onDictate({ result: ... })
- Heard some sound, but nothing can be recognized
  - onProgress({})
  - onDictate({}) (nothing is recognized, therefore, no result)
- Nothing is heard (audio device available but muted)
  - onProgress({})
  - onError({ error: 'no-speech' })
- Recognition aborted
  - onProgress({})
  - onProgress({ results: [] })
  - While speech is getting recognized, set props.disabled to true to abort recognition
  - onError({ error: 'aborted' })
- Not authorized to use speech or no audio device is available
  - onError({ error: 'not-allowed' })
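Since every sequence above ends in either onDictate or onError, an application only needs two handlers to cover all outcomes. The sketch below is one possible shape for them; describeDictate and describeError are hypothetical helper names, and the messages are placeholders. The error codes are the ones listed in the sequences above.

```javascript
// Hypothetical helper for onDictate: `result` is absent when sound was heard
// but nothing could be recognized.
function describeDictate({ result }) {
  return result ? `Recognized: ${result.transcript}` : 'Nothing recognized';
}

// Hypothetical helper for onError: maps the error codes from the sequences
// above to messages, with a generic fallback for any other code.
function describeError({ error }) {
  switch (error) {
    case 'aborted':
      return 'Recognition aborted';
    case 'no-speech':
      return 'No speech heard';
    case 'not-allowed':
      return 'Not authorized or no audio device available';
    default:
      return `Error: ${error}`;
  }
}
```

These could be wired up as onDictate={event => show(describeDictate(event))} and onError={event => show(describeError(event))}, where show is whatever the application uses to display a message.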
Instead of passing child elements, you can pass a function to render different content based on ready state. This is called function as a child.
Ready state | Description |
---|---|
0 | Not started |
1 | Starting recognition engine, recognition is not ready until it turns to 2 |
2 | Recognizing |
3 | Stopping |
For example,
<DictateButton>
{({ readyState }) =>
readyState === 0 ? 'Start' : readyState === 1 ? 'Starting...' : readyState === 2 ? 'Listening...' : 'Stopping...'
}
</DictateButton>
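If the nested ternary becomes hard to read, the same mapping can be kept in a lookup table. This is only a sketch: READY_STATE_LABELS and labelForReadyState are hypothetical names, and the 'Unknown' fallback is an assumption for values outside the table above.

```javascript
// Labels indexed by ready state: 0 not started, 1 starting, 2 recognizing, 3 stopping.
const READY_STATE_LABELS = ['Start', 'Starting...', 'Listening...', 'Stopping...'];

// Hypothetical helper: returns the label for a ready state, or a fallback
// for any value outside the four documented states.
function labelForReadyState(readyState) {
  return READY_STATE_LABELS[readyState] || 'Unknown';
}
```

The function child then reduces to ({ readyState }) => labelForReadyState(readyState).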
You can build your own component by copying our layout code, without touching the logic behind the scenes. For details, please refer to DictateButton.js, DictateCheckbox.js, and DictationTextbox.js.
In addition to <button>, we also ship <input type="checkbox"> out of the box. The checkbox version is better suited for toggle button scenarios and web accessibility. You can use the following code for the checkbox version.
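import React from 'react';
import { DictateCheckbox } from 'react-dictate-button';

// Plain functions are used as handlers because `this` is not bound in a function component.
const handleDictate = ({ result }) => console.log('Dictated:', result && result.transcript);
const handleProgress = ({ results }) => console.log('Interim results:', results);

export default () => (
  <DictateCheckbox
    className="my-dictate-checkbox"
    grammar="#JSGF V1.0; grammar districts; public <district> = Redmond | Bellevue;"
    lang="en-US"
    onDictate={handleDictate}
    onProgress={handleProgress}
  >
    Start/stop
  </DictateCheckbox>
);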
We also provide a "textbox with dictate button" version. But instead of shipping a full-fledged control, we make it a minimally-styled control so you can start copying the code and customize it in your own project. The sample code can be found at DictationTextbox.js.
- Hide the complexity of Web Speech events because we only want to focus on the recognition experience
  - Complexity in lifecycle events: onstart, onaudiostart, onsoundstart, onspeechstart
  - onresult may not fire in some cases, onnomatch is not fired in Chrome
  - To reduce complexity, we want to make sure events fire in one of two ways:
    - Happy path: onProgress, then either onDictate or onError
    - Otherwise: onError
- "Web Speech" could mean speech synthesis, which is out of scope for this package
- "Speech Recognition" could mean we expose the Web Speech API as-is, whereas we want to hide the details and keep the recognition scenario straightforward
Please feel free to file suggestions.
- While readyState is 1 or 3 (transitioning), the underlying speech engine cannot be started/stopped until the state transition is complete
  - Need rework on the state management
- Instead of putting all logic inside Composer.js, how about
  - Write an adapter to convert SpeechRecognition into another object with simpler event model and readyState
  - Rewrite Composer.js to bridge the new SimpleSpeechRecognition model and React Context
  - Expose SimpleSpeechRecognition so people not on React can still benefit from the simpler event model
Like us? Star us.
Want to make it better? File us an issue.
Don't like something you see? Submit a pull request.