Thank you so much for the work you have done on your Tacotron implementation. I have a question, if I may.
I have a speech corpus with time alignments. For each audio sample, I have a file that looks like this:
0.471000 121 sil
0.618000 121 Z
0.666000 121 i
0.716750 121 n
0.852974 121 a:
0.910125 121 z
0.987444 121 a
1.070000 121 t
1.130000 121 u
1.182000 121 l
Which Tacotron implementation would be best suited to exploit this information?
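For reference, here is a minimal sketch of how a file like the one above could be read into per-phoneme durations. It rests on assumptions that the post does not confirm: that the first column is the end time of each segment in seconds, that the first segment starts at 0.0, and that the middle column (121) can be ignored. The function names `parse_alignment` and `to_durations` are hypothetical and do not belong to any Tacotron implementation.

```python
from typing import List, Tuple


def parse_alignment(text: str) -> List[Tuple[float, str]]:
    """Parse lines of the form '<time> <number> <label>' into (time, label) pairs."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        time_s, _unused, label = parts  # middle column ignored (meaning assumed irrelevant)
        entries.append((float(time_s), label))
    return entries


def to_durations(entries: List[Tuple[float, str]], start: float = 0.0) -> List[Tuple[str, float]]:
    """Convert end times to durations, assuming each time marks the END of its segment."""
    durations = []
    prev = start
    for end, label in entries:
        durations.append((label, round(end - prev, 6)))
        prev = end
    return durations


sample = "0.471000 121 sil\n0.618000 121 Z\n0.666000 121 i\n"
for label, seconds in to_durations(parse_alignment(sample)):
    print(label, seconds)
```

If the times turn out to be segment start times instead, the subtraction would need to run against the following entry rather than the previous one.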