This is the code used to run the experiments in Deep Subjecthood: Higher Order Grammatical Features in Multilingual BERT.
`run_one_experiment.py` runs one experiment: it trains a classifier to distinguish subjects from objects in mBERT embeddings for one training language, and tests that classifier on one test language. Classifiers and mBERT embeddings are cached so they can be reused by other experiments with the same training or test language.
`make_script_to_run_all_experiments.py` makes a script that runs `run_one_experiment.py` for every pair of languages in `source_langs.txt` and `sink_langs.txt`.
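For reference, the two language-list files point at Universal Dependencies treebank locations on disk. The entries below are purely illustrative (the one-entry-per-line layout and the paths are assumptions; check the files shipped with the repository for the exact format):

```text
/data/ud-treebanks-v2.7/UD_English-EWT
/data/ud-treebanks-v2.7/UD_Spanish-AnCora
/data/ud-treebanks-v2.7/UD_Russian-SynTagRus
```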
To reproduce the experiments in the paper (a combined sketch of these steps is shown below):

- Change the paths in `source_langs.txt` and `sink_langs.txt` so that they match the location of the Universal Dependencies treebanks on your machine.
- Run `make_script_to_run_all_experiments.py` to create the scripts that will run the experiments. This will create a script `run_batch_0.sh` (or more if you're running with multiple seeds).
- Run `run_batch_0.sh`.
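Putting the steps together, a reproduction run might look like the following sketch. The `python` and `bash` invocations reflect a typical setup and are assumptions; any command-line options the scripts accept are not shown here, so check each script for its actual arguments.

```bash
# Sketch of the reproduction workflow (assumes the repository root as the
# working directory and a Python environment with the dependencies installed).

# 1. Edit source_langs.txt and sink_langs.txt so the treebank paths match
#    your local Universal Dependencies installation, e.g.:
#    $EDITOR source_langs.txt sink_langs.txt

# 2. Generate the experiment scripts; this writes run_batch_0.sh
#    (and additional batch scripts if you are running with multiple seeds).
python make_script_to_run_all_experiments.py

# 3. Run every train/test language pair.
bash run_batch_0.sh
```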