Replies: 1 comment
-
I think it is indeed best to split this up as much as possible. PAULA is agnostic about how morphological features are modeled, but further processing depends, as you've seen, on the complexity of the annotation value. In general, regular expressions with prefixes work better than having a
-
In my corpus, POS and morphological features are encoded as a 9-character string, where the first character is the POS and the remaining eight are the morphological features. If I encode this string as it is in PAULA and then write an ANNIS query such as
postag=/v.*/
, this takes a very long time to process. I guess the best way to encode it is to separate the POS character from the others. Should one also consider splitting all the other values using multiFeat? How do you usually model morphological features in PAULA?
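To make the split concrete, here is a minimal Python sketch of the preprocessing step, run before the PAULA files are generated. It only assumes the layout described above (one POS character followed by eight feature characters). The function name `split_postag`, the layer names `pos` and `feat1` to `feat8`, and the sample tag `v3spia---` are hypothetical placeholders, not part of PAULA or ANNIS.

```python
def split_postag(tag: str, n_features: int = 8) -> dict[str, str]:
    """Split a positional tag into one POS value plus one value per feature slot.

    Assumes the layout from the question: the first character is the POS,
    the remaining characters are morphological features (hypothetical names).
    """
    expected_length = n_features + 1
    if len(tag) != expected_length:
        raise ValueError(
            f"expected a {expected_length}-character tag, got {len(tag)}: {tag!r}"
        )

    # The POS becomes its own annotation, so it can be matched exactly.
    annotations = {"pos": tag[0]}

    # Each remaining position becomes a separate annotation value.
    for position, value in enumerate(tag[1:], start=1):
        annotations[f"feat{position}"] = value

    return annotations


# Hypothetical 9-character tag, used for illustration only.
example = split_postag("v3spia---")
print(example["pos"], example["feat1"])
```

With the POS stored as a separate annotation, the query can become an exact match such as `pos="v"` rather than a regular expression over the whole string.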