Merge uberi pr601 9c68e15 fix recognize google cloud by alinerguio #4

Merged

2 changes: 2 additions & 0 deletions README.rst
@@ -31,6 +31,8 @@ SpeechRecognition

Library for performing speech recognition, with support for several engines and APIs, online and offline.

+**UPDATE 2022-02-09**: Hey everyone! This project started as a tech demo, but these days it needs more time than I have to keep up with all the PRs and issues. Therefore, I'd like to put out an **open invite for collaborators** - just reach out at [email protected] if you're interested!

Speech recognition engine/API support:

* `CMU Sphinx <http://cmusphinx.sourceforge.net/wiki/>`__ (works offline)
6 changes: 3 additions & 3 deletions reference/pocketsphinx.rst
@@ -6,9 +6,9 @@ Installing other languages

By default, SpeechRecognition's Sphinx functionality supports only US English. Additional language packs are also available, but not included due to the files being too large:

-* `International French <https://drive.google.com/open?id=0Bw_EqP-hnaFNN2FlQ21RdnVZSVE>`__
-* `Mandarin Chinese <https://drive.google.com/open?id=0Bw_EqP-hnaFNSWdqdm5maWZtTGc>`__
-* `Italian <https://drive.google.com/open?id=0Bw_EqP-hnaFNSXUtMm8tRkdUejg>`__
+* `International French <https://drive.google.com/file/d/0Bw_EqP-hnaFNN2FlQ21RdnVZSVE/view?usp=sharing&resourcekey=0-CEkuW10BcLuDdDnKDbzO4w>`__
+* `Mandarin Chinese <https://drive.google.com/file/d/0Bw_EqP-hnaFNSWdqdm5maWZtTGc/view?usp=sharing&resourcekey=0-AYS4yrQJO-ieZqyo0g6h3g>`__
+* `Italian <https://drive.google.com/file/d/0Bw_EqP-hnaFNSXUtMm8tRkdUejg/view?usp=sharing&resourcekey=0-9IOo0qEMHOAR3z6rzIqgBg>`__

To install a language pack, download the ZIP archives and extract them directly into the module install directory (you can find the module install directory by running ``python -c "import speech_recognition as sr, os.path as p; print(p.dirname(sr.__file__))"``).
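
The install step described in the paragraph above could be scripted roughly as follows. This is a minimal sketch and not part of the PR: the helper names `module_install_dir` and `install_language_pack` are hypothetical, and it assumes the ZIP archive has already been downloaded by hand.

```python
import importlib.util
import os
import zipfile


def module_install_dir(module_name):
    """Return the directory that contains an importable package.

    Hypothetical helper; it locates the package without importing it.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        raise ImportError("cannot locate module: " + module_name)
    return os.path.dirname(spec.origin)


def install_language_pack(zip_path, install_dir):
    """Extract a downloaded language-pack ZIP directly into install_dir.

    Returns the list of archive member names that were extracted.
    """
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(install_dir)
        return archive.namelist()
```

With SpeechRecognition installed, a call might look like `install_language_pack("fr-FR.zip", module_install_dir("speech_recognition"))`, where the archive file name is only a placeholder for whatever the download is saved as.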

12 changes: 5 additions & 7 deletions speech_recognition/__init__.py
@@ -1038,8 +1038,6 @@ def recognize_google_cloud(self, audio_data, credentials_json=None, language="en
try:
import socket
from google.cloud import speech
-from google.cloud.speech import enums
-from google.cloud.speech import types
from google.api_core.exceptions import GoogleAPICallError
except ImportError:
raise RequestError('missing google-cloud-speech module: ensure that google-cloud-speech is set up correctly.')
@@ -1053,15 +1051,15 @@ def recognize_google_cloud(self, audio_data, credentials_json=None, language="en
convert_rate=None if 8000 <= audio_data.sample_rate <= 48000 else max(8000, min(audio_data.sample_rate, 48000)), # audio sample rate must be between 8 kHz and 48 kHz inclusive - clamp sample rate into this range
convert_width=2 # audio samples must be 16-bit
)
-audio = types.RecognitionAudio(content=flac_data)
+audio = speech.RecognitionAudio(content=flac_data)

config = {
-'encoding': enums.RecognitionConfig.AudioEncoding.FLAC,
+'encoding': speech.RecognitionConfig.AudioEncoding.FLAC,
'sample_rate_hertz': audio_data.sample_rate,
'language_code': language
}
if preferred_phrases is not None:
-config['speechContexts'] = [types.SpeechContext(
+config['speechContexts'] = [speech.SpeechContext(
phrases=preferred_phrases
)]
if show_all:
@@ -1071,10 +1069,10 @@ def recognize_google_cloud(self, audio_data, credentials_json=None, language="en
if self.operation_timeout and socket.getdefaulttimeout() is None:
opts['timeout'] = self.operation_timeout

-config = types.RecognitionConfig(**config)
+config = speech.RecognitionConfig(**config)

try:
-response = client.recognize(config, audio, **opts)
+response = client.recognize(config=config, audio=audio)
except GoogleAPICallError as e:
raise RequestError(e)
except URLError as e:
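
The `convert_rate` expression in the second `speech_recognition/__init__.py` hunk clamps the audio sample rate into the 8 kHz to 48 kHz range that its inline comment describes. A standalone sketch of that logic, using the hypothetical helper name `clamp_sample_rate` (it is not a function in the library), might look like this:

```python
MIN_RATE_HZ = 8000   # lower bound stated in the diff's inline comment
MAX_RATE_HZ = 48000  # upper bound stated in the diff's inline comment


def clamp_sample_rate(sample_rate):
    """Mirror the convert_rate expression from the diff.

    Returns None when the rate is already in range (no conversion needed),
    otherwise the nearest rate inside [MIN_RATE_HZ, MAX_RATE_HZ].
    """
    if MIN_RATE_HZ <= sample_rate <= MAX_RATE_HZ:
        return None
    return max(MIN_RATE_HZ, min(sample_rate, MAX_RATE_HZ))


print(clamp_sample_rate(44100))  # None: already within range
print(clamp_sample_rate(4000))   # 8000: raised to the minimum
print(clamp_sample_rate(96000))  # 48000: lowered to the maximum
```

Returning `None` for in-range input matches the original expression, where `None` signals that no rate conversion should be applied.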