This script splits the COCA vocabulary list into small groups that can be imported into a dictionary app (e.g. Eudic) for studying.
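For background, please refer to the articles COCA 词频表使用 (Using the COCA Word Frequency List) and 快速掌握 COCA 词汇表 (Quickly Mastering the COCA Vocabulary List).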
- Python 3
This script was originally written in Python 2 and has since been updated to Python 3, so please make sure Python 3 is installed in your environment. On my machine it is at /usr/bin/python3.
python split.py coca20000.txt 15
By default, the output file is coca20000_batch_import.txt.
The last argument, 15, is the group size: each group contains 15 words. Change it to suit your needs.
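To illustrate what a group size means in practice, here is a minimal sketch of the grouping step. It is not the code in split.py: the function name `split_into_groups` and the sample words are hypothetical, and the exact format written to the output file is not shown here.

```python
def split_into_groups(words, group_size):
    """Split a list of words into consecutive groups of at most group_size.

    Hypothetical helper for illustration; split.py may work differently.
    """
    if group_size < 1:
        raise ValueError("group_size must be a positive integer")
    # Step through the list in strides of group_size and slice out each group.
    return [words[i:i + group_size] for i in range(0, len(words), group_size)]


# Made-up sample data standing in for the contents of the vocabulary file.
sample = ["the", "be", "and", "of", "a", "in", "to"]
groups = split_into_groups(sample, 3)

for number, group in enumerate(groups, start=1):
    print(f"group {number}: {', '.join(group)}")
```

The last group is simply shorter when the word count is not a multiple of the group size. For example, if the list held exactly 20,000 words, a group size of 15 would give 1,334 groups, the last one holding 5 words.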
- coca20000.txt contains the original vocabulary list.
- coca_refinded.txt contains the final vocabulary list, refined according to the article 快速掌握 COCA 词汇表 (Quickly Mastering the COCA Vocabulary List).