A follow-up to https://github.com/asafamr/SymPatternWSI, adapted to BERT.
Paper: Towards better substitution-based word sense induction - https://arxiv.org/abs/1905.12598
Python 3.7
Install the requirements with: pip install -r requirements.txt
This installs Python packages, including PyTorch and Hugging Face's BERT port.
(For CUDA support, first install PyTorch according to its installation instructions.)
Run download_resources.sh to download datasets.
Run wsi_bert.py for sense induction on both SemEval 2010 and 2013 WSI task datasets.
Logs should be written to the "debug" directory.
SemEval 2013 WSI mean(STD) over 10 runs:
FNMI:21.4(0.5) FBC:64.0(0.5) Geom. mean:37.0(0.5)
(previous SOTA 11.3,57.5,25.4)
SemEval 2010 WSI mean(STD) over 10 runs:
F-S:71.3(0.1) V-M:40.4(1.8) Geom. mean:53.6(1.2)
(previous SOTA 61.7,9.8,24.59)
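
The "Geom. mean" column can be sanity-checked from the two task metrics. The sketch below assumes it is the geometric mean of the two scores (FNMI and FBC for SemEval 2013, F-S and V-M for SemEval 2010). The helper name geometric_mean is illustrative and is not taken from wsi_bert.py. Because the reported figures are averages over 10 runs, recomputing from the averaged scores may differ slightly in the last digit.

```python
import math


def geometric_mean(a: float, b: float) -> float:
    """Geometric mean of two non-negative scores."""
    return math.sqrt(a * b)


# Averaged scores copied from the results above.
semeval_2013 = geometric_mean(21.4, 64.0)  # FNMI, FBC
semeval_2010 = geometric_mean(71.3, 40.4)  # F-S, V-M

print(f"SemEval 2013 geom. mean: {semeval_2013:.2f}")
print(f"SemEval 2010 geom. mean: {semeval_2010:.2f}")
```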