This repository contains code to reproduce results from the paper:
Adversarial Training with Fast Gradient Projection Method against Synonym Substitution based Text Attacks (AAAI 2021)
Xiaosen Wang, Yichen Yang, Yihe Deng, Kun He
There are three datasets used in our experiments. Download the files `train.csv` and `test.csv` for each dataset and place them under the directories `/data/ag_news`, `/data/dbpedia`, and `/data/yahoo_answers`, respectively.
There are two data dependencies for this project. Download `glove.840B.300d.txt` and `counter-fitted-vectors.txt` and put them under the directory `/data/`.
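For reference, assuming the `--data_dir ./data/` default used in the commands below, the expected layout is:

```
data/
├── glove.840B.300d.txt
├── counter-fitted-vectors.txt
├── ag_news/
│   ├── train.csv
│   └── test.csv
├── dbpedia/
│   ├── train.csv
│   └── test.csv
└── yahoo_answers/
    ├── train.csv
    └── test.csv
```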
The required environment is:

- python 3.6.5
- numpy 1.15.2
- tensorflow-gpu 1.12.0
- keras 2.2.0
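Assuming a Python 3.6.5 environment, the pinned packages can be installed with pip:

```
pip install numpy==1.15.2 tensorflow-gpu==1.12.0 keras==2.2.0
```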
- `textcnn.py`, `textrnn.py`, `textbirnn.py`: The models for CNN, LSTM, and Bi-LSTM.
- `train.py`: Normal or adversarial training of the models.
- `utils.py`: Helper functions for building dictionaries, loading data, processing the embedding matrix, etc.
- `build_embeddings.py`: Generates the dictionary, embedding matrix, and distance matrix.
- `FGPM.py`: Fast Gradient Projection Method.
- `attack.py`: Attacks models with FGPM.
- `Config.py`: Settings for datasets, models, and attacks.
1. Generate the dictionary, embedding matrix, and distance matrix:

   ```
   python build_embeddings.py --data ag_news --data_dir ./data/
   ```

   You can also use our pre-generated data by downloading `aux_files` and placing it under the directory `/data/`.
2. Train the models normally:

   ```
   python train.py --data ag_news --nn_type textcnn --train_type org --num_epochs=2 --num_checkpoints=2 --data_dir ./data/ --model_dir ./model/
   ```

   (You will get a directory named like `1583313019_ag_news_org` under `/model/runs_textcnn`.) You can also use our trained models by downloading `runs_textcnn`, `runs_textrnn`, and `runs_textbirnn` and placing them under the directory `/model/`.
3. Attack the normally trained model with FGPM:

   ```
   python attack.py --nn_type textcnn --data ag_news --train_type org --time 1583313019 --step 2 --grad_upd_interval=1 --max_iter=30 --data_dir ./data/ --model_dir ./model/
   ```

   (Note that you may get a different timestamp; check the name of the model directory under `/model/runs_textcnn`.)
4. Train the models with ATFL to enhance robustness:

   ```
   python train.py --data ag_news --nn_type textcnn --train_type adv --num_epochs=10 --num_checkpoints=10 --grad_upd_interval=1 --max_iter=30 --data_dir ./data/ --model_dir ./model/
   ```

   (You will get a directory named like `1583313121_ag_news_adv` under `/model/runs_textcnn`.) As in step 2, you can instead use our trained models placed under `/model/`.
5. Attack the adversarially trained model with FGPM:

   ```
   python attack.py --nn_type textcnn --data ag_news --train_type adv --time 1583313121 --step 3 --grad_upd_interval=1 --save=True --max_iter=30 --data_dir ./data/ --model_dir ./model/
   ```

   (Note that you may get a different timestamp; check the name of the model directory under `/model/runs_textcnn`.) The roles of `max_iter` and `grad_upd_interval` are illustrated in the sketch after this list.
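To clarify what `--max_iter` and `--grad_upd_interval` control, here is a minimal, illustrative sketch of the FGPM loop. It is not the repository's implementation; `grad_fn`, `synonyms`, and `embed` are assumed helpers supplied by the caller.

```python
import numpy as np

def fgpm_sketch(words, x_embed, grad_fn, synonyms, embed,
                max_iter=30, grad_upd_interval=1):
    """Illustrative only. words: list of tokens; x_embed: (seq_len, dim)
    array of word embeddings; grad_fn(x_embed) -> loss gradient, same shape;
    synonyms[w] -> candidate substitutes of w; embed[w] -> vector of w."""
    grad = None
    for it in range(max_iter):                     # N in the FGPM algorithm
        if it % grad_upd_interval == 0:            # recompute the gradient
            grad = grad_fn(x_embed)                # every grad_upd_interval
        best_gain, best = 0.0, None                # substitutions
        for i, w in enumerate(words):
            for s in synonyms.get(w, []):
                delta = embed[s] - x_embed[i]          # substitution direction
                gain = float(np.dot(grad[i], delta))   # gradient-projected
                if gain > best_gain:                   # loss increase
                    best_gain, best = gain, (i, s)
        if best is None:               # no substitution increases the loss
            break
        i, s = best                    # apply the single optimal substitution
        words[i] = s
        x_embed[i] = embed[s]
    return words
```

With `grad_upd_interval = 1` the gradient is refreshed after every substitution; larger values (e.g. `5` on Yahoo! Answers, as noted below) reuse a stale gradient to trade precision for speed.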
Most of the experiment settings are given in our paper. Here we provide some more details to help reproduce our results.
- For normal training, we set `num_epochs` to `2` for CNN models and `3` for RNN models. For adversarial training, we train 10 epochs for all models, except the RNN models on the Yahoo! Answers dataset, which we train for `3` epochs.
- The parameter `max_iter` denotes the maximum number of iterations, namely `N` in the FGPM algorithm. Based on the average length of the samples, we empirically set `max_iter` to `30` on `ag_news`, `40` on `dbpedia`, and `50` on `yahoo_answers`. Moreover, to speed up training by ATFL on Yahoo! Answers, we recompute the gradient only after every `5` optimal synonym substitution operations (i.e., `grad_upd_interval = 5`).
- To keep the comparison fair, we restrict the candidate words to the 4 closest synonyms of each word (a minimal sketch of this construction follows this list). When performing adversarial training, to obtain more adversarial examples, we impose no such restriction.
- To improve the readability of adversarial examples, we enable stop words by default, which prohibits the attack algorithm from replacing words such as `the`/`a`/`an` with synonyms. The stop-word list can be seen in `Config.py`. You can also turn stop words off by setting `stop_words = False` for attacks or adversarial training.
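As referenced above, here is a minimal sketch of how the 4 closest synonyms per word could be drawn from the counter-fitted vectors. The word2vec-style loader, the Euclidean distance, and the brute-force search are illustrative assumptions, not the repository's exact code:

```python
import numpy as np

def load_vectors(path="./data/counter-fitted-vectors.txt"):
    """Parse a word2vec-style text file into {word: vector}."""
    vecs = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vecs[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vecs

def nearest_synonyms(word, vecs, k=4):
    """Return the k nearest words to `word` by Euclidean distance
    (brute force; fine for a one-off, slow over a large vocabulary)."""
    v = vecs[word]
    dists = {w: float(np.linalg.norm(u - v))
             for w, u in vecs.items() if w != word}
    return sorted(dists, key=dists.get)[:k]
```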
Questions and suggestions can be sent to [email protected].
If you find this code and data useful, please consider citing the original work:
```
@inproceedings{wang2021Adversarial,
  title={Adversarial Training with Fast Gradient Projection Method against Synonym Substitution based Text Attacks},
  author={Xiaosen Wang and Yichen Yang and Yihe Deng and Kun He},
  booktitle={AAAI Conference on Artificial Intelligence},
  year={2021}
}
```