Language #7

Open
SangenBR opened this issue Jan 11, 2021 · 7 comments
Labels
enhancement New feature or request

Comments

@SangenBR

I would like to ask for support for Portuguese and Brazilian Portuguese. Is that possible?

@arition
Member

arition commented Jan 12, 2021

I am happy to support Portuguese and Brazilian Portuguese, but I know little about these languages. If you can provide me with some sample Portuguese subtitle files (.ass files; the more, the better), I can train a model for you.

@arition added the enhancement and help wanted labels on Jan 12, 2021
@SangenBR
Author

I sent the subtitles to your email.
Do you train on hardsubbed video subtitles?

@arition
Member

arition commented Jan 12, 2021

Received. Training takes time, so please be patient.
For details about training, please check https://github.com/freyjaSubOCR/freyja-sub-ocr-training.

@SangenBR
Author

Take your time, there is no rush. Thanks.

@arition
Member

arition commented Feb 11, 2021

Progress report

Training for Portuguese (or any Latin-script language) is much harder than I thought. The main problem is that sentences in these languages are much longer than sentences in Chinese, so the model structure needs to be changed. However, I have been really busy recently and do not have time to try different network structures and parameters. The variable width of Latin characters also hurts the model's performance.
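For anyone who wants to check the length difference on their own files, here is a minimal sketch that measures the character length of each dialogue line in an .ass file. It is not part of the training code: the function names and the sample text are made up for illustration, and it assumes the default ten-field `Dialogue:` layout with `Text` as the last field.

```python
import re
from statistics import mean

# Matches ASS override blocks such as {\i1} or {\pos(10,20)}.
OVERRIDE_TAG = re.compile(r"\{[^}]*\}")


def dialogue_texts(ass_content: str) -> list[str]:
    """Return the visible text of every Dialogue line.

    Assumes the default field order, where Text is the tenth field.
    """
    texts = []
    for line in ass_content.splitlines():
        if not line.startswith("Dialogue:"):
            continue
        # Split at most 9 times so commas inside the text are preserved.
        fields = line[len("Dialogue:"):].split(",", 9)
        if len(fields) < 10:
            continue
        text = OVERRIDE_TAG.sub("", fields[9])
        # Treat ASS line breaks as spaces.
        text = text.replace("\\N", " ").replace("\\n", " ").strip()
        if text:
            texts.append(text)
    return texts


def length_stats(texts: list[str]) -> dict:
    """Summarize line lengths in characters (expects a non-empty list)."""
    lengths = [len(t) for t in texts]
    return {"count": len(lengths), "mean": mean(lengths), "max": max(lengths)}


# Made-up sample with one Portuguese and one Chinese line.
SAMPLE = """[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:01.00,0:00:03.00,Default,,0,0,0,,{\\i1}Eu não sei, mas vou descobrir.{\\i0}
Dialogue: 0,0:00:03.50,0:00:05.00,Default,,0,0,0,,我不知道
"""

if __name__ == "__main__":
    print(length_stats(dialogue_texts(SAMPLE)))
```

Running the same function over a folder of Portuguese files and a folder of Chinese files would give a rough idea of how much longer the sequences the model has to handle are.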

The good news is that I managed to train a baseline model. The accuracy is not very high, but it is usable. I will try to update the GUI and provide a test version for you if I have time this week.

@arition removed the help wanted label on Feb 11, 2021
@SangenBR
Author

SangenBR commented Feb 12, 2021

Oh, I would like to test it, but take your time. I'm not in a hurry.

@SangenBR
Author

I see that your problem is the opposite of mine: to me, Chinese subtitles seem to have few characters and go by very fast, which gets to me when I watch Chinese anime.
