Adds Parameter use_enhanced and model to GoogleCloudSpeech (Fix #734) #735

Merged
8 commits merged into Uberi:master
Dec 18, 2024


HideyoshiNakazone
Contributor

Adds the use_enhanced and model parameters to the recognize_google_cloud method, giving users more configuration options and better results in specific cases.

@HideyoshiNakazone
Contributor Author

HideyoshiNakazone commented Feb 22, 2024

Hello @Uberi and @ftnext, I was wondering whether someone could review my pull request.

Thank you very much,
Vitor Hideyoshi.

@HideyoshiNakazone
Contributor Author

Hello @ftnext, is there any interest in this feature? It doesn't break any of the existing GoogleCloudSpeech Python API; it only extends it. I'm already using this implementation at the company I work for, and would love to have this feature merged.
If anything is blocking the merge, please tell me :)

@Uberi
Owner

Uberi commented Apr 26, 2024

Hi @HideyoshiNakazone!

Looks good overall, but would it be possible to document these parameters in the docs for that function? If so, happy to merge this!

@HideyoshiNakazone HideyoshiNakazone force-pushed the add-parameters-google-cloud branch from 052dec3 to 8e0fa40 Compare April 26, 2024 19:13
@HideyoshiNakazone
Contributor Author

@Uberi, thanks a lot! I added the parameters to the docstring of the method Recognizer.recognize_google_cloud and to the library reference file.
If there are any other places where you'd like me to add documentation, I'll be happy to do so :)

@ftnext
Collaborator

ftnext commented Apr 29, 2024

@HideyoshiNakazone Thank you very much for this pull request! I'm very sorry for the late response.
@Uberi Thanks for your comment!

In my opinion, it would be better to introduce arbitrary keyword arguments (a.k.a. **kwargs):
https://docs.python.org/3/tutorial/controlflow.html#keyword-arguments

Certainly, adding use_enhanced and model as arguments would implement this feature.
However, if more arguments need to be added in the future, the method signature would have to change again each time, so this approach is not easy to extend.

I think it would be preferable for Cloud Speech API-specific arguments to be passed as variable keyword arguments:

def recognize_google_cloud(self, audio_data, credentials_json=None, language="en-US", preferred_phrases=None, show_all=False, **api_params):
    """
    If ``preferred_phrases`` is an iterable of phrase strings, ...

    api_params: Cloud Speech API-specific parameters as dict (optional)

        The ``use_enhanced`` is a boolean option ...

        Furthermore, you can use the option ``model`` to set your desired model,

    Returns the most likely transcription if ``show_all`` is False (the default).
    """

    config = {
        'encoding': speech.RecognitionConfig.AudioEncoding.FLAC,
        'sample_rate_hertz': audio_data.sample_rate,
        'language_code': language,
        **api_params,
    }

(It seems that preferred_phrases might be included in api_params too, but this is another issue)
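
The merging behaviour suggested above can be tried without the Cloud Speech client. This is a minimal sketch, not the merged implementation: `build_recognition_config` is a hypothetical helper, the string "FLAC" stands in for `speech.RecognitionConfig.AudioEncoding.FLAC`, and the sample rate and model name are arbitrary example values.

```python
def build_recognition_config(sample_rate, language="en-US", **api_params):
    """Hypothetical helper: merge API-specific options into the base config."""
    config = {
        # Stand-in for speech.RecognitionConfig.AudioEncoding.FLAC
        "encoding": "FLAC",
        "sample_rate_hertz": sample_rate,
        "language_code": language,
        # Unpacked last, so every extra option lands in the config unchanged.
        **api_params,
    }
    return config


# New options need no signature change; they are simply forwarded.
config = build_recognition_config(16000, use_enhanced=True, model="phone_call")
print(config)
```

Because `**api_params` is unpacked last, a caller-supplied key such as `language_code` would override the base value. Whether that should be allowed is a design choice the sketch leaves open.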

@HideyoshiNakazone HideyoshiNakazone force-pushed the add-parameters-google-cloud branch from fc26183 to e82fd4d Compare November 28, 2024 00:29
This implementation is needed for the configuration of Cloud Speech API-specific parameters. This implementation only validates and creates assertions for the two most used params: use_enhanced and model.
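
One way such validation could look is sketched below. It is an illustration under assumptions, not the code from the commit: `validate_api_params` is a hypothetical name, the specific checks (a boolean `use_enhanced`, a non-empty string `model`) are guesses at what "validates" means here, and exceptions are raised instead of using `assert` statements.

```python
def validate_api_params(api_params):
    """Hypothetical check: validate only use_enhanced and model.

    Any other key is passed through untouched, mirroring the idea that
    only the two most used parameters are checked.
    """
    if "use_enhanced" in api_params:
        if not isinstance(api_params["use_enhanced"], bool):
            raise TypeError("use_enhanced must be a bool")
    if "model" in api_params:
        model = api_params["model"]
        if not isinstance(model, str):
            raise TypeError("model must be a str")
        if not model:
            raise ValueError("model must not be empty")
    return api_params


# Valid parameters come back unchanged.
print(validate_api_params({"use_enhanced": True, "model": "phone_call"}))
```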
@HideyoshiNakazone HideyoshiNakazone force-pushed the add-parameters-google-cloud branch from e82fd4d to 4be8026 Compare November 28, 2024 00:31
@HideyoshiNakazone
Contributor Author

Hi @ftnext and @Uberi,

I know it has been a while, but could you please take another look at my pull request? I believe it now meets the requirements discussed in this thread, and it would be great if we could proceed with merging it.

Thanks for your time!

Collaborator

@ftnext ftnext left a comment

Thanks!
May I make a simple correction?

speech_recognition/__init__.py Outdated
Collaborator

@ftnext ftnext left a comment

Thanks!

@ftnext ftnext changed the title Adds Parameter use_enhanced and model to GoogleCloudSpeech Adds Parameter use_enhanced and model to GoogleCloudSpeech (Fix #734) Dec 18, 2024
@ftnext ftnext merged commit 597d09e into Uberi:master Dec 18, 2024
8 checks passed
3 participants