Lexicon: don't add phoneme-inventory tag when lexicon doesn't have phonemes. #547
This change might break other setups, so it's fine if you don't want to merge this.
I mean, this is the relevant question, right? I don't know the answer, but everything depends on it. If there are jobs where this would change the behavior, then we cannot merge it as is and would need to introduce it as a new flag for the job. If there are no jobs where this would change something, then we can merge it. But I don't know.
The question I would rather ask is: does this give any benefit somewhere? Is there something that does not work when the tag exists but is empty?
Some context on this decision: for our ASR pipelines at AppTek, we're starting to provide extra lexica, and we assume that such extra lexica have no phoneme inventory tag at all. However, when we find a phoneme inventory tag, even an empty one, we crash with an assertion error. We could also fix this on our end, but I found it pretty weird that the
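To make the proposed behavior concrete, here is a minimal sketch of a lexicon serializer that emits the `phoneme-inventory` element only when there are phonemes to write. The function name `lexicon_to_xml`, its arguments, and the exact element layout are assumptions for illustration, not the code in this PR.

```python
import xml.etree.ElementTree as ET


def lexicon_to_xml(phonemes, lemmata):
    """Hypothetical helper: build a lexicon XML tree.

    phonemes: dict mapping phoneme symbol -> variation (may be empty)
    lemmata:  list of (orth, phon) string pairs
    """
    root = ET.Element("lexicon")

    # Only create the inventory element if there is something to put in it,
    # so an extra lexicon without phonemes carries no inventory tag at all.
    if phonemes:
        inventory = ET.SubElement(root, "phoneme-inventory")
        for symbol, variation in phonemes.items():
            phoneme = ET.SubElement(inventory, "phoneme")
            ET.SubElement(phoneme, "symbol").text = symbol
            ET.SubElement(phoneme, "variation").text = variation

    for orth, phon in lemmata:
        lemma = ET.SubElement(root, "lemma")
        ET.SubElement(lemma, "orth").text = orth
        ET.SubElement(lemma, "phon").text = phon

    return root


# An extra lexicon without phonemes gets no inventory element.
extra = lexicon_to_xml({}, [("hello", "h e l o")])
print(ET.tostring(extra, encoding="unicode"))
```

A lexicon that does define phonemes is serialized as before, so in this sketch only the empty-inventory case changes.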
Then it is fine for me. At i6 we usually do not have lexica without an inventory, and I do not see where something would crash if the tag is missing, or where behavior would change given that it was empty anyway. Still, we should get a lot of people to look at this to be on the safe side.
I am not sure I understand your problem. Are you saying that you have an external lexicon and you cannot extend your original lexicon with the additional external one? And you need to use multiple lexica at the same time? AFAIK, the main use of the phoneme inventory within RASR is to define the order of phonemes, which affects the state tying done within RASR if you are not using an external state tying file.
I also don't understand this statement. Are you also saying that your external lexicon does not have any phoneme inventory? Or are you saying that you have two lexica that do not share the same token inventory?
Hi Tina,
Yes, exactly. At AppTek we offer the option to add an additional lexicon at runtime, so we do not want to append to the main one. The lexicons are created with the same g2p as the main one via a separate API, so the phoneme set is the same, but RASR crashes if We have a method to remove the phoneme inventory element in apptek_asr, but Nahuel's implementation here is much more elegant for the feature of extra lexicons / custom words.
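For comparison, the downstream workaround mentioned above could look roughly like the following sketch, which strips an empty `phoneme-inventory` element from an already written lexicon. The function name `strip_empty_phoneme_inventory` is hypothetical, and this is not the actual method in apptek_asr.

```python
import xml.etree.ElementTree as ET


def strip_empty_phoneme_inventory(xml_text):
    """Hypothetical helper: drop <phoneme-inventory> elements without children."""
    root = ET.fromstring(xml_text)

    # findall returns a list, so removing elements while looping is safe.
    for inventory in root.findall("phoneme-inventory"):
        if len(inventory) == 0:  # no <phoneme> children
            root.remove(inventory)

    return ET.tostring(root, encoding="unicode")


extra_lexicon = (
    "<lexicon><phoneme-inventory/>"
    "<lemma><orth>hello</orth><phon>h e l o</phon></lemma></lexicon>"
)
print(strip_empty_phoneme_inventory(extra_lexicon))
```

Not writing the empty element in the first place, as this PR proposes, makes such a post-processing step unnecessary.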
Thanks @sarahberanek for the explanation, now it is clear.