A word tokenizer component for UIMA that takes advantage of Unicode general categories. The tokenizer currently handles only French, but it can be extended quite easily.
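
To illustrate the general idea of tokenizing by Unicode general category, here is a minimal sketch in plain Java. It is not this component's code and does not use the UIMA API: the class `UnicodeCategoryTokenizer` and its `tokenize` method are hypothetical names chosen for the example, and the grouping of categories into "word", "separator", and "everything else" is an assumption. Only `Character.getType` and `Character.isWhitespace` from the Java standard library are relied on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not the repository's implementation.
public class UnicodeCategoryTokenizer {

    // Letters, digits, and combining marks are treated as word characters (assumption).
    static boolean isWordChar(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:
            case Character.LOWERCASE_LETTER:
            case Character.TITLECASE_LETTER:
            case Character.MODIFIER_LETTER:
            case Character.OTHER_LETTER:
            case Character.NON_SPACING_MARK:
            case Character.COMBINING_SPACING_MARK:
            case Character.DECIMAL_DIGIT_NUMBER:
                return true;
            default:
                return false;
        }
    }

    // Whitespace plus the Zs category, so the no-break space (U+00A0)
    // that French typography puts before "!" or "?" also separates tokens.
    static boolean isSeparator(int codePoint) {
        return Character.isWhitespace(codePoint)
                || Character.getType(codePoint) == Character.SPACE_SEPARATOR;
    }

    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isWordChar(cp)) {
                word.appendCodePoint(cp);      // extend the current word
                continue;
            }
            if (word.length() > 0) {           // any other character ends the word
                tokens.add(word.toString());
                word.setLength(0);
            }
            if (!isSeparator(cp)) {            // punctuation and symbols become one-character tokens
                tokens.add(new String(Character.toChars(cp)));
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "L'été, c'est fini !" written with escapes, including a no-break space before "!"
        System.out.println(tokenize("L'\u00e9t\u00e9, c'est fini\u00a0!"));
    }
}
```

This sketch splits on every apostrophe and hyphen, so French elisions and compounds such as "aujourd'hui" come out as several tokens. Handling those cases is the kind of language-specific rule a real tokenizer would layer on top of the category test.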

grdscarabe/uima-word-tokenizer