# org.bibliome.alvisnlp.modules.treetagger.TreeTaggerReader
Reads files in TreeTagger output format and creates one document for each file read.
Each document contains a single section named sectionName; its contents are built by concatenating the first column of each token line, with a single space character between consecutive tokens.
org.bibliome.alvisnlp.modules.treetagger.TreeTaggerReader preserves the TreeTagger tokenization as annotations added to the layer wordLayerName. The POS tag and the lemma of each token are recorded in the annotation features named by posFeatureKey and lemmaFeatureKey, respectively.
The document identifier is the path of the corresponding file.
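The behaviour described above can be illustrated with a minimal Python sketch. This is not the module's implementation: the `Token` class, the `read_treetagger` function, and the default feature names `pos` and `lemma` are hypothetical names chosen for the example. The sketch also assumes tab-separated lines of the form word, POS tag, lemma, and skips any line with fewer than three columns.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """Hypothetical stand-in for a word annotation: offsets plus features."""
    start: int
    end: int
    features: dict = field(default_factory=dict)


def read_treetagger(text, pos_key="pos", lemma_key="lemma"):
    """Build section contents and word annotations from TreeTagger output.

    Assumes one token per line with tab-separated columns: form, POS, lemma.
    Lines with fewer than three columns are skipped (an assumption of this
    sketch, not documented behaviour of the module).
    """
    forms = []
    tokens = []
    offset = 0
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) < 3:
            continue
        form, pos, lemma = cols[0], cols[1], cols[2]
        if forms:
            offset += 1  # account for the space separating tokens
        tokens.append(
            Token(offset, offset + len(form), {pos_key: pos, lemma_key: lemma})
        )
        forms.append(form)
        offset += len(form)
    # Section contents: first column of each token, joined by single spaces.
    return " ".join(forms), tokens


sample = "The\tDT\tthe\ncat\tNN\tcat\nsleeps\tVBZ\tsleep\n"
contents, tokens = read_treetagger(sample)
```

Because the contents are rebuilt from the token forms, each annotation's offsets index directly into the section contents, so `contents[t.start:t.end]` gives back the original token form.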
Optional
Type: String
Name of the section of each document.
Optional
Type: SourceStream
Path to the source directory or source file.
Optional
Type: Mapping
Constant features to add to each annotation created by this module.
Optional
Type: Mapping
Constant features to add to each document created by this module.
Optional
Type: Mapping
Constant features to add to each section created by this module.
Optional
Type: String
Name of the feature in which to store word lemmas.
Optional
Type: String
Name of the feature in which to store word POS tags.
Default value: UTF-8
Type: String
Character set of input files.
Default value: sentences
Type: String
Name of the layer in which to store sentence annotations.
Default value: words
Type: String
Name of the layer in which to store word annotations.