Can user-defined dictionaries support phrases that contain punctuation? #67
Comments
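@cxa Hi, thanks for reporting this. It should be solvable, and I'll try to fix it. I have to go out for a while this evening, so I'll work on it when I get back tonight or tomorrow. Thanks for the feedback!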
I read through the source myself, and the problem appears to be here: https://github.com/yanyiwu/cppjieba/blob/master/include/cppjieba/SegmentBase.hpp#L12 . If we could make
These represent, in order, the separators, namely
A recent version of the code adds a Jieba::ResetSeparators API for resetting the separators manually,
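To illustrate why the separator set matters here, the following is a minimal, self-contained sketch of separator-based pre-splitting. It is not cppjieba's code: PreSplit and kAssumedDefaultSeparators are hypothetical names, and the separator set shown (space, tab, newline, fullwidth comma, ideographic full stop) is an assumption about what a default might look like. The idea is that if text is cut at every separator before dictionary lookup, no dictionary entry can span a separator, so removing the fullwidth comma from the set lets a phrase such as “同一个世界,同一个梦想” reach the dictionary in one piece.

```cpp
#include <string>
#include <vector>

// Assumed separator set for illustration only: space, tab, newline,
// fullwidth comma (U+FF0C), ideographic full stop (U+3002).
const std::u32string kAssumedDefaultSeparators = U" \t\n\uFF0C\u3002";

// Hypothetical helper, not part of cppjieba: cut `text` into chunks at every
// code point found in `separators`. Each separator becomes its own chunk,
// so a dictionary phrase can never span one.
std::vector<std::u32string> PreSplit(const std::u32string& text,
                                     const std::u32string& separators) {
  std::vector<std::u32string> chunks;
  std::u32string current;
  for (char32_t cp : text) {
    if (separators.find(cp) != std::u32string::npos) {
      // Flush the text collected so far, then emit the separator itself.
      if (!current.empty()) {
        chunks.push_back(current);
        current.clear();
      }
      chunks.push_back(std::u32string(1, cp));
    } else {
      current.push_back(cp);
    }
  }
  if (!current.empty()) {
    chunks.push_back(current);
  }
  return chunks;
}
```

With the assumed default set, PreSplit(U"同一个世界,同一个梦想", kAssumedDefaultSeparators) yields three chunks (the text before the comma, the comma, and the text after it). Passing a set without the fullwidth comma, for example U" \t\n\u3002", yields a single chunk containing the whole phrase, which is the effect that resetting the separators is meant to achieve.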
👍
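Is there a solution for the Python version of jieba? Thanks.
@liuleigit I suggest asking at https://github.com/fxsjy/jieba/issues ; this repository is mainly for the cpp version. The two projects also have different authors.
@t-k- OK, thank you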
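Hi yanyiwu,
First of all, thank you for your hard work.
I would like to use cc-cedict as the segmentation dictionary, but it contains phrases with punctuation, such as “同一个世界,同一个梦想” ("One World, One Dream"), and jieba does not recognize such a phrase as a single entry. Could a parameter or option be added to support this?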