FireBERT

Hardening BERT classifiers against adversarial attack

Gunnar Mein, UC Berkeley MIDS Program ([email protected])
Kevin Hartman, UC Berkeley MIDS Program ([email protected])
Andrew Morris, UC Berkeley MIDS Program ([email protected])

Many thanks to our advisors, Mike Tamir, Daniel Cer, and Mark Butler, for their guidance on this research, and to our significant others for their patience as the three of us hunkered down over the three-month project.

Note: This repository was kept anonymous while the paper was in blind review.

Paper

Please read our paper: FireBERT 1.0. When citing our work, please include a link to this repository.

Instructions

The best way to run our project is to download the .zip files attached to release v1.0. Expand "data.zip" into a "data" folder, and expand "resources-1.zip" and "resources-2.zip" into a "resources" folder.
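For example, on a Unix-like system with unzip available (assuming the archives do not already contain the top-level folders; adjust the destinations if your layout differs):

    # expand the release archives into the folders the notebooks expect
    unzip data.zip -d data
    unzip resources-1.zip -d resources
    unzip resources-2.zip -d resources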

  • To obtain the values from Tables 1 and 2 in the "Results" section of the paper, run the respective "eval_xxxx.ipynb" notebooks (interactively, or non-interactively as sketched after this list).

  • To obtain the values from Table 3, run "generate_adversarials.ipynb".

  • To tune a basic BERT model, run the "bert_xxxx_tuner.ipynb" notebooks.

  • To co-tune FACT on synthetic adversarials, run the "firebert_xxxx_and_adversarial_co-tuner.ipynb" notebooks.

  • To recreate the illustrations in the "Analysis" section, experiment with "analysis.ipynb". It can produce many different graphs; try changing the values at the top of the cells.
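The notebooks can be opened interactively in Jupyter. If you prefer to run one non-interactively, one option is jupyter nbconvert (the filename below is a placeholder; substitute the notebook you actually want to run):

    # execute a notebook in place without opening the browser UI
    jupyter nbconvert --to notebook --execute --inplace eval_xxxx.ipynb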

Major prerequisites

  • tensorflow 2.1 or higher (GPU preferred)

  • torch, torchvision (PyTorch) 1.3.1 or higher

  • pytorch-lightning 0.7.1 or higher

  • transformers (Hugging Face) 2.5.1 or higher
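
One possible way to install these with pip (the version pins below are simply the lower bounds listed above; exact builds, CUDA variants, and newer releases may differ):

    # install the major dependencies at or above the listed minimum versions
    pip install "tensorflow>=2.1" "torch>=1.3.1" torchvision "pytorch-lightning>=0.7.1" "transformers>=2.5.1"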

Hardware and run-time expectations

The authors used 9th-generation Intel i7 personal computers with 64 GB of main memory and NVIDIA 2080 (Max-Q and Ti) graphics cards, as well as various GCP instances. Full evaluation runs on pre-made adversarial samples complete in a few hours. Active-attack benchmarks with TextFooler finish within hours for MNLI but can take days for IMDB. Co-tuning for FACT is expected to run for several hours.