feat(wss): add and remove websocket params
apaparazzi0329 committed Aug 10, 2022
1 parent 069bba3 commit 1b5f171
Showing 2 changed files with 42 additions and 7 deletions.
32 changes: 25 additions & 7 deletions ibm_watson/speech_to_text_v1_adapter.py
@@ -45,7 +45,6 @@ def recognize_using_websocket(self,
speaker_labels=None,
http_proxy_host=None,
http_proxy_port=None,
-customization_id=None,
grammar_name=None,
redaction=None,
processing_metrics=None,
@@ -56,6 +55,7 @@ def recognize_using_websocket(self,
speech_detector_sensitivity=None,
background_audio_suppression=None,
low_latency=None,
+character_insertion_bias: float = None,
**kwargs):
"""
Sends audio for speech recognition using web sockets.
@@ -190,10 +190,6 @@ def recognize_using_websocket(self,
labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-speaker-labels).
:param str http_proxy_host: http proxy host name.
:param str http_proxy_port: http proxy port. If not set, set to 80.
-:param str customization_id: (optional) **Deprecated.** Use the
-`language_customization_id` parameter to specify the customization ID
-(GUID) of a custom language model that is to be used with the recognition
-request. Do not specify both parameters with a request.
:param str grammar_name: (optional) The name of a grammar that is to be
used with the recognition request. If you specify a grammar, you must also
use the `language_customization_id` parameter to specify the name of the
@@ -287,6 +283,28 @@ def recognize_using_websocket(self,
for next-generation models.
* For more information about the `low_latency` parameter, see [Low
latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency).
+:param float character_insertion_bias: (optional) For next-generation
+`Multimedia` and `Telephony` models, an indication of whether the service
+is biased to recognize shorter or longer strings of characters when
+developing transcription hypotheses. By default, the service is optimized
+for each individual model to balance its recognition of strings of
+different lengths. The model-specific bias is equivalent to 0.0.
+The value that you specify represents a change from a model's default bias.
+The allowable range of values is -1.0 to 1.0.
+* Negative values bias the service to favor hypotheses with shorter strings
+of characters.
+* Positive values bias the service to favor hypotheses with longer strings
+of characters.
+As the value approaches -1.0 or 1.0, the impact of the parameter becomes
+more pronounced. To determine the most effective value for your scenario,
+start by setting the value of the parameter to a small increment, such as
+-0.1, -0.05, 0.05, or 0.1, and assess how the value impacts the
+transcription results. Then experiment with different values as necessary,
+adjusting the value by small increments.
+The parameter is not available for previous-generation `Broadband` and
+`Narrowband` models.
+See [Character insertion
+bias](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#insertion-bias).
:param dict headers: A `dict` containing the request headers
:return: A `dict` containing the `SpeechRecognitionResults` response.
:rtype: dict
@@ -321,7 +339,6 @@ def recognize_using_websocket(self,

params = {
'model': model,
-'customization_id': customization_id,
'acoustic_customization_id': acoustic_customization_id,
'base_model_version': base_model_version,
'language_customization_id': language_customization_id
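Callers that still pass the removed `customization_id` keyword may want to translate it before calling `recognize_using_websocket`. The following is a minimal sketch of one way to do that; `migrate_recognize_kwargs` is a hypothetical helper that is not part of this commit or of the SDK, and it assumes the caller collects its arguments in a plain `dict`.

```python
def migrate_recognize_kwargs(kwargs):
    """Hypothetical helper: rename the removed `customization_id` keyword.

    Returns a copy of `kwargs` in which `customization_id` has been moved to
    `language_customization_id`. The input dict is left unchanged.
    """
    migrated = dict(kwargs)
    legacy = migrated.pop('customization_id', None)
    if legacy is not None:
        if migrated.get('language_customization_id') is not None:
            # The removed docstring said not to specify both parameters.
            raise ValueError(
                'Pass language_customization_id only, not customization_id.')
        migrated['language_customization_id'] = legacy
    return migrated
```

A caller could then write `adapter.recognize_using_websocket(**migrate_recognize_kwargs(old_kwargs))`, where `adapter` and `old_kwargs` are placeholders for the caller's own objects.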
@@ -353,7 +370,8 @@ def recognize_using_websocket(self,
'split_transcript_at_phrase_end': split_transcript_at_phrase_end,
'speech_detector_sensitivity': speech_detector_sensitivity,
'background_audio_suppression': background_audio_suppression,
-'low_latency': low_latency
+'low_latency': low_latency,
+'character_insertion_bias': character_insertion_bias
}
options = {k: v for k, v in options.items() if v is not None}
request['options'] = options
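The diff forwards `character_insertion_bias` without checking it, and the `None` filter means an explicit `0.0` is still sent. The sketch below illustrates both points with a hypothetical `build_options` helper; the client-side range check is an assumption added for illustration and is not something this commit does.

```python
def build_options(low_latency=None, character_insertion_bias=None):
    """Hypothetical helper mirroring the options filter in the diff."""
    if character_insertion_bias is not None:
        # Assumed client-side check; the docstring gives the range -1.0 to 1.0.
        if not -1.0 <= character_insertion_bias <= 1.0:
            raise ValueError(
                'character_insertion_bias must be between -1.0 and 1.0')
    options = {
        'low_latency': low_latency,
        'character_insertion_bias': character_insertion_bias,
    }
    # Same filter as the adapter: only None is dropped, so 0.0 and False survive.
    return {k: v for k, v in options.items() if v is not None}
```

Starting from a small value such as `build_options(character_insertion_bias=0.05)` and adjusting in small steps follows the tuning advice in the docstring.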
17 changes: 17 additions & 0 deletions ibm_watson/text_to_speech_adapter_v1.py
@@ -30,6 +30,7 @@ def synthesize_using_websocket(self,
voice=None,
timings=None,
customization_id=None,
+spell_out_mode: str = None,
http_proxy_host=None,
http_proxy_port=None,
**kwargs):
@@ -60,6 +61,21 @@ def synthesize_using_websocket(self,
If you include a customization ID, you must call the method with the service credentials of the custom model's owner. Omit the
parameter to use the specified voice with no customization. For more information, see [Understanding customization]
(https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).
+:param str spell_out_mode: (optional) *For German voices,* indicates how
+the service is to spell out strings of individual letters. To indicate the
+pace of the spelling, specify one of the following values:
+* `default` - The service reads the characters at the rate at which it
+synthesizes speech for the request. You can also omit the parameter
+entirely to achieve the default behavior.
+* `singles` - The service reads the characters one at a time, with a brief
+pause between each character.
+* `pairs` - The service reads the characters two at a time, with a brief
+pause between each pair.
+* `triples` - The service reads the characters three at a time, with a
+brief pause between each triplet.
+The parameter is available only for IBM Cloud.
+**See also:** [Specifying how strings are spelled
+out](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-synthesis-params#params-spell-out-mode).
:param str http_proxy_host: http proxy host name.
:param str http_proxy_port: http proxy port. If not set, set to 80.
:param dict headers: A `dict` containing the request headers
@@ -90,6 +106,7 @@ def synthesize_using_websocket(self,
params = {
'voice': voice,
'customization_id': customization_id,
+'spell_out_mode': spell_out_mode
}
params = {k: v for k, v in params.items() if v is not None}
url += '/v1/synthesize?{0}'.format(urlencode(params))
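To see how `spell_out_mode` ends up in the request, the query-string construction above can be sketched in isolation. `build_synthesize_path` and `SPELL_OUT_MODES` are hypothetical names, and the check against the four documented values is an assumption added for illustration; the adapter in this commit passes the value through unchecked.

```python
from urllib.parse import urlencode

# Values listed in the new docstring.
SPELL_OUT_MODES = ('default', 'singles', 'pairs', 'triples')


def build_synthesize_path(voice=None, customization_id=None,
                          spell_out_mode=None):
    """Hypothetical helper mirroring the params filter and urlencode call."""
    if spell_out_mode is not None and spell_out_mode not in SPELL_OUT_MODES:
        # Assumed client-side check, not present in the adapter.
        raise ValueError('spell_out_mode must be one of: '
                         + ', '.join(SPELL_OUT_MODES))
    params = {
        'voice': voice,
        'customization_id': customization_id,
        'spell_out_mode': spell_out_mode,
    }
    # Drop unset parameters so they never appear in the query string.
    params = {k: v for k, v in params.items() if v is not None}
    return '/v1/synthesize?{0}'.format(urlencode(params))
```

For example, with an illustrative German voice name, `build_synthesize_path(voice='de-DE_BirgitV3Voice', spell_out_mode='singles')` yields `/v1/synthesize?voice=de-DE_BirgitV3Voice&spell_out_mode=singles`.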