This code accompanies the paper "Universal Adversarial Attacks with Natural Triggers for Text Classification" https://arxiv.org/abs/2005.00174, accepted by NAACL 2021.
Dependencies: PyTorch, AllenNLP, and Hugging Face Transformers (see requirements.txt).
First, download the pretrained ARAE model here and unzip it into the "./ARAR/oneb_pretrained" folder.
Then, go to the sst or snli directory and run python sst_attack.py or python snli_attack.py, respectively.
The argument attack_class selects the class label to attack, and the argument len_lim specifies the length of the attack trigger.
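As a rough illustration of how two such arguments could be exposed on the command line, here is a minimal argparse sketch. It is not taken from sst_attack.py or snli_attack.py: the function name build_parser, the "--" flag spellings, the types, and the default values are all assumptions made for this example, so check the scripts themselves for the actual interface.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; flag spellings, types and defaults are assumptions."""
    parser = argparse.ArgumentParser(description="Illustrative attack options")
    # Class label to attack. Kept as a string here because the real scripts
    # may use either integer or textual labels (assumption).
    parser.add_argument("--attack_class", type=str, default="1",
                        help="class label to attack")
    # Length of the attack trigger; the default of 5 is a placeholder.
    parser.add_argument("--len_lim", type=int, default=5,
                        help="length of the attack trigger")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
example_args = build_parser().parse_args(["--attack_class", "0", "--len_lim", "8"])
print(example_args.attack_class, example_args.len_lim)
```

If the real scripts follow a similar pattern, an invocation would look something like python sst_attack.py --attack_class 0 --len_lim 8, though the exact flag syntax should be confirmed against the script source.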