
Training with disfluencies in speech #1701

Closed
duhtapioca opened this issue Jul 24, 2024 · 0 comments

Comments


duhtapioca commented Jul 24, 2024

Hi

We're looking to fine-tune a streaming zipformer model on a custom dataset of around 100 hours that we are about to have manually annotated. The speech in that dataset may contain disfluencies. In this case, is it better to include the disfluencies in the annotations, or should we omit them from the transcripts?

From the CSJ experiments in #892, we infer that the model trained and tested on fluent transcripts performs slightly better. Is this inference correct? For zipformer, should we expect similar results, or is training with disfluent transcriptions worth a shot? If so, what would be the ideal format for annotating disfluent speech for zipformer?
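To make the format question concrete, one option we are considering (the bracket convention below is just a hypothetical example, not an established icefall format) is to annotate disfluencies inline with tags, so that both the disfluent and the fluent variant of each transcript can be derived from a single annotation pass:

```python
import re

# Hypothetical convention: disfluencies are tagged inline in square
# brackets, e.g. "i want to [uh] book a table".
DISFLUENCY_TAG = re.compile(r"\[[^\]]*\]\s*")

def to_fluent(transcript: str) -> str:
    """Strip bracketed disfluency tags, yielding the fluent transcript."""
    stripped = DISFLUENCY_TAG.sub("", transcript)
    return re.sub(r"\s+", " ", stripped).strip()

print(to_fluent("i want to [uh] book a table"))   # -> "i want to book a table"
print(to_fluent("[um] hello [you know] world"))   # -> "hello world"
```

Would annotating this way and then training on one variant (or both) be a reasonable setup, or is there a preferred scheme?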

Any advice on this would be of great help.

Thanks!
