Current state of Traditional Chinese support #11

JanVeb · 2022-05-29T02:16:02Z

Hi there, this seems to be just the thing I need: a much better solution than other similar plugins with way more stars. Great job.

Is it possible to output pinyin with tone numbers instead of tone marks? If not, I could write the code for that. I'm asking in case you haven't added this option but would like to, since pinyin resources used for programming often write tones as numbers rather than tone marks. It is a simple problem to solve, especially given the output from your plugin.
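For illustration, here is a minimal sketch of the tone-mark-to-number conversion asked about above. It is not part of the library: the function names are hypothetical, and it assumes the input is tone-marked pinyin with syllables separated by whitespace and no attached punctuation. Syllables without a tone mark get a configurable neutral-tone suffix (`5` by default).

```python
import unicodedata

# Combining diacritics used for the four pinyin tones.
TONE_MARKS = {
    "\u0304": "1",  # macron, first tone
    "\u0301": "2",  # acute accent, second tone
    "\u030c": "3",  # caron, third tone
    "\u0300": "4",  # grave accent, fourth tone
}


def syllable_to_numbered(syllable: str, neutral: str = "5") -> str:
    """Convert one tone-marked syllable to numbered form.

    Assumption: `syllable` is a single pinyin syllable with no punctuation.
    """
    # NFD splits each accented vowel into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", syllable)
    tone = neutral
    kept = []
    for ch in decomposed:
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)  # keeps the diaeresis of "ü"
    # NFC recombines "u" + diaeresis back into a single "ü".
    base = unicodedata.normalize("NFC", "".join(kept))
    return base + tone


def pinyin_to_numbered(text: str, neutral: str = "5") -> str:
    """Convert whitespace-separated tone-marked pinyin to numbered pinyin."""
    return " ".join(syllable_to_numbered(s, neutral) for s in text.split())
```

Passing `neutral=""` leaves unmarked syllables without a number, which some resources prefer for the neutral tone.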

Also, I see from your example that it's possible to convert traditional characters to simplified and vice versa. In the open issues, user ShawTim asked this question on Dec 26, 2020:

Does segment support splitting Traditional Chinese into words? #8

Your answer:

It probably won't work very well for traditional characters because the segmentation library used (jieba) is trained on simplified texts. For now you'll probably have to convert to simplified first.

So I wanted to check, since more than a year has passed since this question was asked and your project README has examples of converting traditional hanzi to simplified: did you add this later on, and how well does it convert traditional characters to simplified ones?

To be clear, I don't need your library to convert between traditional and simplified characters, but to convert both traditional and simplified characters to pinyin. Given that you have examples of converting traditional to simplified and vice versa, does your library now work well for converting traditional characters to pinyin?

peterolson · 2022-05-30T05:50:36Z

You can convert between simplified and traditional, but the segmentation will only work well with simplified. If you want to segment traditional text, you can convert to simplified, segment, and then re-use the same segment lengths on the original traditional text.

But it would be a good idea to bake this into the library to avoid this extra step.
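The workaround described above (convert to simplified, segment, then re-use the segment lengths on the traditional text) could be sketched roughly as follows. This is not the library's API: `to_simplified` and `segment_simplified` are hypothetical callables standing in for whatever converter and segmenter are used, and the sketch assumes the conversion is character-for-character so both strings have the same length.

```python
from typing import Callable, List


def segment_traditional(
    text: str,
    to_simplified: Callable[[str], str],
    segment_simplified: Callable[[str], List[str]],
) -> List[str]:
    """Segment traditional text by borrowing boundaries from its simplified form.

    Assumptions (hypothetical helpers, not a real library API):
    - to_simplified maps each character to exactly one character;
    - segment_simplified returns words that concatenate back to its input.
    """
    simplified = to_simplified(text)
    if len(simplified) != len(text):
        raise ValueError("conversion changed the text length; offsets would not line up")

    words = []
    pos = 0
    for word in segment_simplified(simplified):
        # Slice the original traditional text using the simplified word's length.
        words.append(text[pos:pos + len(word)])
        pos += len(word)

    if pos != len(text):
        raise ValueError("segments do not cover the whole text")
    return words
```

The length checks are there because the approach silently misaligns words if the converter ever merges or expands characters, or if the segmenter drops any.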
