- Python 2.7.11
- torch
- CoNLL-2003 corpora
- Put the CoNLL-2003 corpora into the data/conll2003 folder. The files should be renamed and placed as follows:
- data/conll2003/eng.testa.dev;
- data/conll2003/eng.testb.test;
- data/conll2003/eng.train.
- Run bash utils/prepare-senna-data.sh. It downloads the SENNA embeddings and gazetteers and puts them into data/.
- Install torch and the Python libraries from requirements.txt (pip install -r requirements.txt).
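
Before starting a long training run, it can help to confirm that the corpus files are where the steps above expect them. The following is a minimal sketch, not part of the repository: the helper name missing_conll_files is hypothetical, and the file names are taken from the list above.

```python
import os

# File names as listed in the setup steps above.
EXPECTED_FILES = ["eng.train", "eng.testa.dev", "eng.testb.test"]


def missing_conll_files(data_dir="data/conll2003"):
    """Return the expected CoNLL-2003 file names that are absent from data_dir.

    Hypothetical helper for illustration; it only checks that the files exist,
    not that their contents are valid.
    """
    return [name for name in EXPECTED_FILES
            if not os.path.isfile(os.path.join(data_dir, name))]


if __name__ == "__main__":
    missing = missing_conll_files()
    if missing:
        print("Missing files: " + ", ".join(missing))
    else:
        print("All CoNLL-2003 files are in place.")
```

Run it from the repository root so that the relative path data/conll2003 resolves correctly.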
All experiments were run on an AWS g2.2xlarge instance with a GPU.
Run bash experiments/convolution-net.sh. After about 5 hours, a model with 87.5% F1 will be trained.
The model with the best F1 is saved to the snapshots directory (every 2 epochs).
Learning logs are also saved there.
For example, the learning log for epoch 74:
processed 46666 tokens with 5648 phrases; found: 5778 phrases; correct: 4990.
accuracy: 97.65%; precision: 86.36%; recall: 88.35%; FB1: 87.34
LOC: precision: 90.35%; recall: 90.89%; FB1: 90.62 1678
MISC: precision: 73.82%; recall: 75.50%; FB1: 74.65 718
ORG: precision: 80.69%; recall: 85.01%; FB1: 82.79 1750
PER: precision: 93.87%; recall: 94.74%; FB1: 94.31 1632
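
The overall scores in this log follow directly from the phrase counts on its first line. As a small worked example (the function name prf is hypothetical and not part of the repository), precision is correct/found, recall is correct/gold, and FB1 is their harmonic mean:

```python
def prf(num_correct, num_found, num_gold):
    """Return (precision, recall, F1) as percentages from phrase counts.

    Assumes num_found and num_gold are non-zero; no zero-division handling.
    """
    precision = 100.0 * num_correct / num_found
    recall = 100.0 * num_correct / num_gold
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the epoch 74 log: 5648 gold phrases, 5778 found, 4990 correct.
precision, recall, f1 = prf(num_correct=4990, num_found=5778, num_gold=5648)
print("precision: %.2f%%; recall: %.2f%%; FB1: %.2f" % (precision, recall, f1))
# prints "precision: 86.36%; recall: 88.35%; FB1: 87.34"
```

The last number on each entity line appears to be the count of phrases the model found for that type: 1678 + 718 + 1750 + 1632 = 5778, which matches the overall "found" count.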