your java runtime environment (java -version): java version "1.8.0_221"
the log file provided by TermSuite (under the current ./logs/ directory) : NA
a short description of the problem you encounter.
I am trying to use TermSuite to extract terms from Chinese documents. I have installed TreeTagger for Chinese and found a Chinese segmenter that works. The settings I use work for 'en' on a set of English documents, but when I apply them to 'zh' on a set of Chinese documents, I get the following error:
Exception in thread "main" fr.univnantes.termsuite.tools.TermSuiteCliException: An unexpected error occurred: Resource initialization error
    at fr.univnantes.termsuite.tools.CommandLineClient.launch(CommandLineClient.java:295)
    at fr.univnantes.termsuite.tools.TerminologyExtractorCLI.main(TerminologyExtractorCLI.java:203)
Caused by: fr.univnantes.termsuite.api.TermSuiteException: Resource initialization error
    at fr.univnantes.termsuite.framework.service.TermSuiteResourceManager.loadResource(TermSuiteResourceManager.java:68)
    at fr.univnantes.termsuite.framework.service.TermSuiteResourceManager.get(TermSuiteResourceManager.java:77)
    at fr.univnantes.termsuite.framework.injector.ResourceInjector.injectResources(ResourceInjector.java:25)
    at fr.univnantes.termsuite.framework.injector.EngineInjector.injectResources(EngineInjector.java:31)
    at fr.univnantes.termsuite.framework.pipeline.SimpleEngineRunner.run(SimpleEngineRunner.java:28)
    at fr.univnantes.termsuite.framework.pipeline.AggregateEngineRunner.run(AggregateEngineRunner.java:49)
    at fr.univnantes.termsuite.framework.pipeline.AggregateEngineRunner.run(AggregateEngineRunner.java:49)
    at fr.univnantes.termsuite.framework.Pipeline.run(Pipeline.java:25)
    at fr.univnantes.termsuite.api.TerminoExtractor.execute(TerminoExtractor.java:95)
    at fr.univnantes.termsuite.tools.TerminologyExtractorCLI.run(TerminologyExtractorCLI.java:137)
    at fr.univnantes.termsuite.tools.CommandLineClient.launch(CommandLineClient.java:287)
    ... 1 more
Caused by: org.apache.uima.resource.ResourceInitializationException
    at fr.univnantes.julestar.uima.resources.MultilineResource.load(MultilineResource.java:44)
    at fr.univnantes.termsuite.framework.service.TermSuiteResourceManager.loadResource(TermSuiteResourceManager.java:55)
    ... 11 more
Caused by: fr.univnantes.julestar.uima.resources.ResourceFormatException: Expected two columns at line 36. Got: "职掌"
    at fr.univnantes.julestar.uima.resources.MapResource.doError(MapResource.java:52)
    at fr.univnantes.julestar.uima.resources.MapResource.doRow(MapResource.java:30)
    at fr.univnantes.julestar.uima.resources.TabResource.doLine(TabResource.java:15)
    at fr.univnantes.julestar.uima.resources.MultilineResource.load(MultilineResource.java:41)
    ... 12 more
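The innermost cause (`Expected two columns at line 36. Got: "职掌"`) suggests that a resource file loaded for `zh` contains a row with one column where two are expected. The trace does not name that file, so the sketch below is only a way to check a candidate file by hand. It assumes the resource is a UTF-8 text file with tab-separated columns (inferred from the `TabResource` and `MapResource` class names, not confirmed) and that blank lines are ignored. The function name `find_malformed_rows` is hypothetical and is not part of TermSuite.

```python
from typing import List, Tuple


def find_malformed_rows(path: str, expected_columns: int = 2) -> List[Tuple[int, str]]:
    """Return (line_number, line) for rows whose tab-separated column count is wrong.

    Assumption: the file is UTF-8 and tab-separated; this is a guess based on
    the class names in the stack trace, not on TermSuite documentation.
    """
    malformed = []
    with open(path, encoding="utf-8") as handle:
        for number, raw in enumerate(handle, start=1):
            line = raw.rstrip("\r\n")
            if not line.strip():
                continue  # assumption: blank lines are skipped by the loader
            if len(line.split("\t")) != expected_columns:
                malformed.append((number, line))
    return malformed


# Example use (the path is a placeholder, not a real TermSuite location):
# for number, line in find_malformed_rows("path/to/zh-resource.txt"):
#     print(f"line {number}: {line!r}")
```

If a candidate file reports line 36 with the content `职掌`, that is likely the file the loader rejected.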
TermSuite options used:
-c /input_files/pdf2txt -l zh --contextualize --context-scope 3 --context-assoc-rate MutualInformation --enable-semantic-gathering --post-filter-property documentFrequency --post-filter-th 2 --semantic-distance Jaccard --tsv /ctxt3_jac_pmi.tsv --tsv-properties "rank,pilot,isFixedExp,pattern,freq,spec,semScore,isDico,isDistrib"