java.lang.NullPointerException when segmenting with CRFSegment #747

Closed
1 task
andyxzq opened this issue Jan 15, 2018 · 3 comments

Comments

andyxzq commented Jan 15, 2018

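Checklist

Please confirm the following:

  • I have carefully read the following documents and did not find an answer:
  • I have searched for my problem with Google and the issue tracker's search, and did not find an answer.
  • I understand that the open-source community is a free community brought together by shared interest and carries no responsibility or obligation. I will be polite and thank everyone who helps me.
  • I have typed an x in these brackets to tick the box, confirming the items above.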

Version

The latest version is: 1.5.3
The version I am using is: 1.5.3

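My problem

  1. I downloaded data-for-1.5.3.zip, unzipped it, and uploaded the data directory to HDFS.
  2. I implemented IIOAdapter to read the files from HDFS.
  3. hanlp.properties sits under src/main/resources; it specifies CRFSegmentModelPath, the path of each dictionary, and the IOAdapter.
  4. At runtime the CRF model loads successfully (otherwise a model-file-not-found error would be reported), but when segment() is called inside a map operator to segment text, the log shows a NullPointerException.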
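For readers following step 2, here is a minimal sketch of what an HDFS-backed adapter might look like. It is not the reporter's actual code: the class name HdfsIOAdapter is hypothetical, and the sketch assumes HanLP 1.x and hadoop-client are on the classpath and that every path in hanlp.properties is a full hdfs:// URI.

    import java.io.{InputStream, OutputStream}
    import java.net.URI

    import com.hankcs.hanlp.corpus.io.IIOAdapter
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // Hypothetical adapter: routes HanLP's file access through the Hadoop FileSystem API.
    class HdfsIOAdapter extends IIOAdapter {
      // Assumption: the default Configuration picks up the cluster settings
      // (core-site.xml / hdfs-site.xml) from the classpath.
      private def fsFor(path: String): FileSystem =
        FileSystem.get(URI.create(path), new Configuration())

      // Called by HanLP to read models and dictionaries.
      override def open(path: String): InputStream =
        fsFor(path).open(new Path(path))

      // Called by HanLP to write files such as binary caches.
      override def create(path: String): OutputStream =
        fsFor(path).create(new Path(path))
    }

The IOAdapter entry in hanlp.properties would then name this class by its fully qualified name. If a single dictionary path is wrong or a file is missing from the upload, open fails for that file only, which is consistent with the model loading while a dictionary does not.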

Triggering code

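    import com.hankcs.hanlp.seg.CRF.CRFSegment

    // Enable every recognizer, part-of-speech tagging, and the custom dictionary.
    val segment = new CRFSegment()
      .enableNameRecognize(true)
      .enableTranslatedNameRecognize(true)
      .enableJapaneseNameRecognize(true)
      .enablePlaceRecognize(true)
      .enableOrganizationRecognize(true)
      .enablePartOfSpeechTagging(true)
      .enableCustomDictionary(true)
    // text is the String to segment; this call is where the NullPointerException surfaces.
    segment.seg(text)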

Other information

Caused by: java.lang.NullPointerException
at com.hankcs.hanlp.algorithm.Viterbi.compute(Viterbi.java:121)
at com.hankcs.hanlp.seg.CharacterBasedGenerativeModelSegment.segSentence(CharacterBasedGenerativeModelSegment.java:81)
at com.hankcs.hanlp.seg.Segment.seg(Segment.java:507)

Owner

hankcs commented Jan 15, 2018

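  1. Re-upload the data directory, or try an older version of the data.
  2. Check whether the very beginning of the log contains a warning that CoreNatureDictionary.tr.txt failed to load. That warning is emitted through the logger rather than thrown as an exception.
  3. I have now changed this so that the failure is reported with an exception.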
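Until a release with that change is in use, a caller can approximate the same fail-fast behavior with a preflight check. The following is only a sketch: DataPreflight and its methods are hypothetical names, not part of HanLP, and the open parameter stands in for whatever actually reads your files (for example, the open method of the configured IIOAdapter).

    import java.io.InputStream
    import scala.util.{Failure, Success, Try}

    // Hypothetical helper, not part of HanLP.
    object DataPreflight {
      // Returns the paths that cannot be opened, or that are empty, when read through `open`.
      def unreadable(open: String => InputStream, paths: Seq[String]): Seq[String] =
        paths.filter { p =>
          Try {
            val in = open(p)
            try in.read() finally in.close()
          } match {
            case Success(firstByte) => firstByte < 0 // end of stream at once: empty file
            case Failure(_)         => true          // could not open or read
          }
        }

      // Throws instead of logging, so a missing dictionary stops the job early.
      def requireReadable(open: String => InputStream, paths: Seq[String]): Unit = {
        val missing = unreadable(open, paths)
        if (missing.nonEmpty)
          throw new IllegalStateException(
            "Unreadable HanLP data files: " + missing.mkString(", "))
      }
    }

Calling requireReadable on the driver with the path of CoreNatureDictionary.tr.txt, before any map operator runs, would turn the silent warning into an immediate error rather than a NullPointerException deep inside Viterbi.compute.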

Author

andyxzq commented Jan 15, 2018

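@hankcs, thanks. CoreNatureDictionary.tr.txt was indeed not being loaded, and that is now fixed. However, words added dynamically through CustomDictionary are being split apart; for example, “戏精” gets split.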
