Documentation enhancement #55

Closed
shkup opened this issue Oct 15, 2022 · 3 comments · Fixed by #56

shkup commented Oct 15, 2022

Please add a paragraph for custom phrases.

Custom phrases don't require Ubuntu or any additional installations. They can be generated with curl or an HTTP client such as Postman.
After creating a text-to-speech resource in Azure, you can use it through REST calls (HTTP requests).
The request url is:
https://<YOUR_RESOURCE_REGION>.tts.speech.microsoft.com/cognitiveservices/v1
A note on the output format: EdgeTX is supposed to support .wav files up to 32 kHz, but within that range 8 kHz is the highest value supported by the conversion service. It's also possible to select a higher quality such as riff-48khz-16bit-mono-pcm and convert to 32 kHz afterwards with another tool such as ffmpeg.
Add the following headers to your request:

Ocp-Apim-Subscription-Key: <YOUR_RESOURCE_KEY>
Content-Type: application/ssml+xml
X-Microsoft-OutputFormat: riff-8khz-16bit-mono-pcm

In the request body (raw), place your SSML (change the voice name according to your preference; the full list is at tts.speech.microsoft.com/cognitiveservices/voices/list):

<speak version='1.0' xml:lang='en-US'>
    <voice xml:lang='en-US' xml:gender='Female' name='en-US-MichelleNeural'>YOUR_PHRASE_HERE</voice>
</speak>
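
The URL, headers, and SSML body above can be combined into a short script. The following is a minimal sketch using only the Python standard library, not the implementation used in voice-gen.py. The function names `build_tts_request` and `synthesize_to_file` are hypothetical, chosen for illustration.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_tts_request(region, key, phrase,
                      voice="en-US-MichelleNeural",
                      output_format="riff-8khz-16bit-mono-pcm"):
    """Return (url, headers, body) for one text-to-speech REST call.

    Hypothetical helper: it only assembles the pieces described above.
    """
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": output_format,
    }
    # Escape &, < and > so the phrase cannot break the SSML document.
    ssml = (
        "<speak version='1.0' xml:lang='en-US'>"
        f"<voice xml:lang='en-US' xml:gender='Female' name='{voice}'>"
        f"{escape(phrase)}</voice></speak>"
    )
    return url, headers, ssml.encode("utf-8")


def synthesize_to_file(region, key, phrase, out_path, **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    url, headers, body = build_tts_request(region, key, phrase, **kwargs)
    request = urllib.request.Request(url, data=body, headers=headers,
                                     method="POST")
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as wav_file:
        wav_file.write(audio)
```

Calling `synthesize_to_file("<YOUR_RESOURCE_REGION>", "<YOUR_RESOURCE_KEY>", "YOUR_PHRASE_HERE", "phrase.wav")` would then write the returned audio to `phrase.wav`; the output file name is only an example.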

Generally speaking, the implementation in voice-gen.py could be a series of HTTP requests without the proxy objects. I don't see any benefit to them if they require a ton of installations, but I might be wrong here. I'd be happy to hear if they contribute something.
It seems this could be an OS-independent implementation.
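
For the higher-quality route mentioned earlier in this comment (request riff-48khz-16bit-mono-pcm, then convert to 32 kHz), the conversion step could be scripted as well. This is a rough sketch that assumes ffmpeg is installed and on the PATH; the function names and file names are made up for illustration.

```python
import subprocess


def ffmpeg_resample_cmd(src, dst, rate=32000):
    """Build an ffmpeg command that resamples src to a mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                 # overwrite dst if it already exists
        "-i", str(src),       # input file, e.g. the 48 kHz download
        "-ar", str(rate),     # target sample rate in Hz
        "-ac", "1",           # mono
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        str(dst),
    ]


def resample(src, dst, rate=32000):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(ffmpeg_resample_cmd(src, dst, rate), check=True)
```

Passing the arguments as a list rather than a shell string keeps the call the same on Linux, macOS, and Windows.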

pfeerick changed the title from "Documentation fix" to "Documentation enhancement" on Oct 15, 2022

pfeerick commented Oct 15, 2022

This is an enhancement, not a fix. What was documented was a known reliable mechanism, until someone like yourself pointed out alternatives ;) And in point of fact, since the documentation is about how to generate a release, yes, you do need to be on Linux/WSL2, as that is the only supported OS for the scripts as written. However, they were changed to Python as part of an effort to make things platform agnostic (it was all shell scripts before).

re: voice-gen.py and also documentation generally, PRs are welcome. However, I have no intention of changing the current implementation as it mostly follows the suggested implementation from Microsoft, and is working fine as it is. I have no objections to a platform agnostic alternative being added also, but won't change the current implementation just to prove a point.


shkup commented Oct 15, 2022

I didn't say this implementation is not good or doesn't follow best practices. It is very well written.
I was more curious about the difference.
Anyway, I think a documentation section with a viable, relatively easy approach to generating custom phrases is valuable. I really like adding special functions for things like the model being loaded, so I took the time to investigate.
Thank you for your time and patience. Again, great work!

pfeerick commented

I honestly don't know the difference... I would expect it is mostly just a wrapper to make things more object/python like.

You might also like the stuff that doesn't need special functions, like the model names and switch audio that work simply from a file being in the right folder... It can make life much easier if you don't change the switch functions around in the field, since you then don't even need to add SF lines for the switch audio.

https://doc.open-tx.org/manual-for-opentx-2-2/advanced-features/audio#model-folder-sound-files
