- Remove "click here" links;
- handle more punctuation symbols as features, not as tokens - this way there will be more token-level chains. Only comma is handled this way now;
- introduce more features (they are quite basic now);
- switch to an another CRF implementation (Wapiti is hard to use as a library).