Simplified ↔ Traditional Mandarin Chinese transliterations cannot be edited #2007

Open
Yorwba opened this issue Nov 10, 2019 · 0 comments
Labels
enhancement Issue that describes a problem that requires a change in the current functionalities of Tatoeba.

Comments

@Yorwba
Contributor

Yorwba commented Nov 10, 2019

In Traditional Chinese script, the character 著 has several Mandarin pronunciations, including zhe, zháo, zhāo, zhù and zhuó. In Simplified Chinese, the character 着 is used instead for every reading except zhù, which is still written with the same 著 as in Traditional Chinese.
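To make the rule concrete, here is a minimal Python sketch of the reading-dependent mapping described above. The function name `simplify_zhe` is made up for illustration and this is not Tatoeba's actual conversion code; it only encodes the one rule stated in this issue.

```python
import unicodedata


def simplify_zhe(pinyin: str) -> str:
    """Return the Simplified form of Traditional 著 for a given Mandarin reading.

    Hypothetical helper for illustration only, not part of Tatoeba.
    """
    # Normalize so that precomposed and decomposed tone marks compare equal.
    reading = unicodedata.normalize("NFC", pinyin.strip().lower())
    # Only the reading zhù keeps 著; every other reading is written 着.
    return "著" if reading == unicodedata.normalize("NFC", "zhù") else "着"


if __name__ == "__main__":
    for reading in ["zhe", "zháo", "zhāo", "zhù", "zhuó"]:
        print(reading, simplify_zhe(reading))
```

The point of the sketch is that the correct Simplified character cannot be chosen from the Traditional character alone: the converter has to know (or guess) the reading first.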

That means the automatic transcription process can produce errors if it guesses the wrong pronunciation, such as for the sentence below:

The pronunciation of 著 was misidentified as zhù, which affected both the Simplified Chinese and the Pinyin transcriptions. I was able to correct the Pinyin transcription to zhe, but the Simplified Chinese transcription cannot be edited.

The Wiki article on autogenerated transcriptions says:

  • The autogeneration software is mostly reliable but produces errors from times to times. If it's rather not reliable, edition may be prevented because it would require too much human work from contributors to fix all the transcriptions. If it's near-100% perfect, edition may be prevented as well unless it produces substantial errors.

I'm not sure about the rationale for disabling editing for near-100%-but-not-100% perfect transcriptions (preventing vandalism or accidentally adding translations as transcriptions, maybe?), but I think getting a common grammatical marker like 著/着 wrong does count as a substantial error.

Since there are relatively few characters with ambiguous transcriptions, it might be best to make transcriptions editable only if they involve one of those characters.
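A rough Python sketch of that idea, under the assumption that a list of ambiguous characters is available. The names `AMBIGUOUS_TRADITIONAL` and `is_transcription_editable` are hypothetical, and the set below contains only the one character discussed in this issue; a real list would have to be compiled from the converter's own data.

```python
# Hypothetical set of Traditional characters whose Simplified form depends
# on the reading. Only 著 comes from this issue; other entries would need
# to be added from the converter's data.
AMBIGUOUS_TRADITIONAL = frozenset({"著"})


def is_transcription_editable(traditional_sentence: str) -> bool:
    """Allow editing only when the sentence contains an ambiguous character."""
    return any(ch in AMBIGUOUS_TRADITIONAL for ch in traditional_sentence)


if __name__ == "__main__":
    # Contains 著, so the automatic result may be wrong and should be editable.
    print(is_transcription_editable("他看著我"))
    # No ambiguous character, so editing stays disabled.
    print(is_transcription_editable("我是學生"))
```

This keeps editing locked for the large majority of sentences while still letting contributors fix the cases where the automatic guess can go wrong.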

trang added the enhancement label Nov 16, 2019
jiru added a commit that referenced this issue Mar 9, 2020
jiru added a commit that referenced this issue Mar 9, 2020