- This is the official implementation of the NDSS 2023 paper "BEAGLE: Forensics of Deep Learning Backdoor Attack for Better Defense".
- [arXiv] | [video] | [slides]
- BEAGLE can act as an Adaptive Defense against various backdoor attacks, requiring only a few examples of the backdoor.
- BEAGLE serves as a Backdoor Scanning tool. With just a few poisoned models and samples of a specific backdoor, it synthesizes a scanner specifically tailored to detect that backdoor.
- BEAGLE also operates as a Backdoor Removal tool. Using a few poisoned samples from a single model, it synthesizes the trigger injection function and employs adversarial training to harden the model against backdoor effects.
.
├── checkpoints # Model path of the pre-trained StyleGAN
├── cifar10 # Forensics on CIFAR-10 dataset
│ ├── backdoors # Backdoor functions
│ ├── ckpt # Pre-trained attacked models
│ ├── data # Trigger patterns and dataset
│ ├── models # Model architectures
│ ├── backdoor_removal.py # Backdoor removal using BEAGLE
│ ├── decomposition.py # Attack decomposition
│ ├── invert_func.py # Decomposition functions
│ ├── stylegan.py # Load the StyleGAN model
│ ├── stylegan2ada_generator_with_styles_noises.py # StyleGAN model functions
│ └── utils.py # Utility functions
├── trojai_round3 # Forensics on TrojAI round-3 dataset
│ ├── abs_beagle_filter.py # Synthesized scanner against filter backdoors
│ ├── abs_beagle_polygon.py # Synthesized scanner against polygon backdoors
│ ├── gen_filter.py # Attack decomposition for filter backdoors
│ ├── gen_polygon.py # Attack decomposition for polygon backdoors
│ ├── invert_func.py # Decomposition functions
│ ├── stylegan.py # Load the StyleGAN model
│ ├── stylegan2ada_generator_with_styles_noises.py # StyleGAN model functions
│ └── synthesis_scanner.py # Scanner synthesis functions
# Create python environment (optional)
conda env create -f environment.yml
source activate beagle
Please download the pre-trained StyleGAN model from the following link: Download Pre-trained Model
After downloading, place it in the ./checkpoints directory.
This model is fine-tuned from StyleGAN2-ADA by NVlabs. Special thanks to NVlabs!
- BEAGLE serves as a tool for backdoor removal, given a few poisoned samples.
- With just a few poisoned samples (e.g., 10), BEAGLE extracts the injected trigger from these samples, and employs adversarial training to harden the model using the extracted trigger.
- Our codes are based on the CIFAR-10 dataset and the ResNet18 model architecture.
- We use BadNets, Refool and WaNet as three example backdoor attacks.
- Go to the `./cifar10` directory.
cd ./cifar10
- Decompose the attack and extract triggers from the poisoned samples.
# Decomposition of BadNets attack (binomial mask)
python decomposition.py --gpu 0 --dataset cifar10 --network resnet18 --attack badnet --target 0 --n_clean 100 --n_poison 10 --func mask --func_option binomial --save_folder forensics --verbose 1 --epochs 1000 --seed 1024
# Decomposition of Refool attack (uniform mask)
python decomposition.py --gpu 1 --attack refool --func mask --func_option uniform
# Decomposition of WaNet attack (complex transformation)
python decomposition.py --gpu 2 --attack wanet --func transform --func_option complex
Arguments | Default Value | Description |
---|---|---|
gpu | 0 | Available GPU ID. |
dataset | "cifar10" | Utilized dataset. |
network | "resnet18" | Utilized model architecture. |
attack | "dfst" | Backdoor attack type. |
target | 0 | Attack target label. |
n_clean | 100 | Number of available clean samples. |
n_poison | 10 | Number of available poisoned samples. |
func | "mask" | Decomposition function. |
func_option | "binomial" | Option of the decomposition function (e.g., binomial, uniform, complex). |
save_folder | "forensics" | Result folder. |
verbose | 1 | Print control. |
epochs | 200 | Total number of processing epochs. |
seed | 1024 | Random seed for reproducibility. |
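For intuition, the mask-based decomposition models trigger injection as a pixel-wise blend between the input and a pattern, x' = (1 - m) ⊙ x + m ⊙ p. The snippet below is only a minimal sketch of that parameterization; the names are placeholders, and the actual optimization of the mask and pattern lives in `decomposition.py` and `invert_func.py`.

```python
import torch

def apply_mask_trigger(x, mask, pattern):
    """Mask-based trigger injection: x' = (1 - m) * x + m * p.

    x:       clean images, shape (N, 3, 32, 32), values in [0, 1]
    mask:    blending mask m, shape (1, 1, 32, 32), values in [0, 1]
    pattern: trigger pattern p, shape (1, 3, 32, 32), values in [0, 1]
    """
    return (1 - mask) * x + mask * pattern

# Toy example: a 4x4 patch in the bottom-right corner (BadNets-style binary mask).
mask = torch.zeros(1, 1, 32, 32)
mask[..., -4:, -4:] = 1.0
pattern = torch.rand(1, 3, 32, 32)
x_clean = torch.rand(8, 3, 32, 32)          # stand-in for a clean CIFAR-10 batch
x_poisoned = apply_mask_trigger(x_clean, mask, pattern)
```

Roughly, the binomial option pushes the mask toward (near-)binary values, which suits patch triggers such as BadNets, while the uniform option allows continuous blending values, which suits pervasive triggers such as Refool's reflections; WaNet's warping is handled by the transform/complex option rather than a blend.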
- Outputs are saved by default in the `./forensics` folder.
- For each attack and its corresponding decomposition function, a folder is created (e.g., `./forensics/mask_binomial_badnet_cifar10_resnet18`). This folder contains a visualization figure and the trigger parameter `param`.
- Use adversarial training to counteract the backdoor, using the decomposed trigger.
- Specifically, in each batch, we apply the trigger to half of the samples as an adversarial augmentation to mitigate the backdoor effect (see the sketch after the argument table below).
# Backdoor removal using adversarial fine-tuning
python backdoor_removal.py --gpu 0 --attack badnet --ratio 0.01 --batch_size 128 --lr 0.01 --epochs 10
New Arguments | Default Value | Description |
---|---|---|
ratio | 0.01 | Proportion of samples used from the training set. |
batch_size | 128 | Size of each batch. |
lr | 0.01 | Learning rate for fine-tuning. |
epochs | 10 | Number of fine-tuning epochs. |
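The core of the removal step can be pictured as below. This is only a minimal sketch, assuming a `model`, a `train_loader` over the small clean subset, and an `apply_trigger` function built from the decomposed trigger; the actual implementation is in `backdoor_removal.py`.

```python
import torch
import torch.nn.functional as F

def adversarial_finetune_epoch(model, train_loader, optimizer, apply_trigger, device="cuda"):
    """One epoch of fine-tuning where half of each batch carries the decomposed trigger.

    apply_trigger(x) stamps the recovered trigger onto a batch of images.
    Triggered samples keep their ORIGINAL labels, so the model learns to ignore the trigger.
    """
    model.train()
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        half = x.size(0) // 2
        x_adv = x.clone()
        x_adv[:half] = apply_trigger(x[:half])   # adversarial augmentation on half the batch
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)  # labels are unchanged
        loss.backward()
        optimizer.step()
```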
- Outputs are printed. For example, standard fine-tuning maintains the Attack Success Rate (ASR) of BadNets at 100%, whereas BEAGLE reduces it to 2.86%.
- More backdoor attacks can be found at BackdoorVault and OrthogLinearBackdoor.
- BEAGLE serves as a backdoor scanning tool for backdoor detection across a large number of models.
- Using a few poisoned models and samples from a specific backdoor type, we extract the trigger properties to synthesize a scanner tailored to that backdoor.
- Our code is based on the TrojAI Round-3 dataset, which includes over 1,000 models. We use 40 of these models to design our scanners.
- The TrojAI dataset is available for download at TrojAI-Round-3.
- Go to the `./trojai_round3` directory.
cd ./trojai_round3
- Decompose the attack and extract triggers from the poisoned samples.
# Decomposition of Polygon backdoors
python gen_polygon.py --gpu 0 --dataset_dir "[trojai_dataset_dir]" --epochs 1000 --save_folder "forensics/trojai_polygon/" --verbose 1 --seed 1024
# Decomposition of Instagram filter backdoors
python gen_filter.py --gpu 1 --dataset_dir "[trojai_dataset_dir]" --epochs 1000 --save_folder "forensics/trojai_filter/" --verbose 1 --seed 1024
Arguments | Default Value | Description |
---|---|---|
dataset_dir | - | Directory for the TrojAI dataset. |
epochs | 1000 | Total number of decomposition epochs. |
save_folder | - | Directory to save outputs. |
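For intuition, filter-style backdoors are better described by a transformation of pixel colors than by a blended patch. The snippet below sketches one simple instance of such a transformation, a per-pixel linear color map; it is only an illustration, and the actual decomposition functions are in `gen_filter.py` and `invert_func.py`.

```python
import torch

def apply_linear_color_map(x, color_matrix, bias):
    """Apply x' = clamp(M @ rgb + b) independently at every pixel.

    x:            images, shape (N, 3, H, W), values in [0, 1]
    color_matrix: channel-mixing matrix M, shape (3, 3)
    bias:         per-channel offset b, shape (3,)
    """
    n, c, h, w = x.shape
    rgb = x.permute(0, 2, 3, 1).reshape(-1, 3)      # (N*H*W, 3) pixel color vectors
    out = rgb @ color_matrix.T + bias               # linear color transformation
    return out.clamp(0, 1).reshape(n, h, w, 3).permute(0, 3, 1, 2)

# Toy example: a warm, slightly desaturating filter on TrojAI-sized inputs.
M = torch.tensor([[1.10, 0.05, 0.00],
                  [0.05, 1.00, 0.00],
                  [0.00, 0.05, 0.85]])
b = torch.tensor([0.05, 0.02, 0.00])
x_filtered = apply_linear_color_map(torch.rand(4, 3, 224, 224), M, b)
```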
- Outputs are stored in the `save_folder`.
- To modify the models used for attack decomposition, edit Line 357 in `gen_polygon.py` and Line 312 in `gen_filter.py`.
- A folder is created for each model (e.g., `./forensics/polygon/id-00000003`). This folder contains a visualization figure and the trigger parameters (e.g., `mask_pattern`).
- Attack properties are summarized in `synthesis_scanner.py`.
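As a rough picture of what such a summary can look like, the sketch below aggregates a set of decomposed polygon masks into simple size statistics that a scanner could reuse to bound its own trigger inversion. The loading of the masks and the exact statistics are assumptions for illustration; the properties actually used are defined in `synthesis_scanner.py`.

```python
import torch

def summarize_mask_sizes(masks):
    """Summarize decomposed trigger masks into size statistics.

    masks: list of tensors in [0, 1] (e.g., the saved mask_pattern outputs,
           loaded by the caller), one per forensic sample.
    Returns the mean and standard deviation of the trigger area in pixels.
    """
    areas = torch.tensor([float(m.sum()) for m in masks])
    return areas.mean().item(), areas.std().item()

# A synthesized scanner can then, for example, penalize inverted triggers whose
# area falls far outside mean ± 2 * std of the forensic distribution.
```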
- We enhance the ABS scanner with the summarized attack properties.
# Scanning for Polygon backdoors
python abs_beagle_polygon.py --gpu 0 --dataset_dir "[trojai_dataset_dir]" --scatch_dirpath "scratch"
# Scanning for Instagram filter backdoors
python abs_beagle_filter.py --gpu 1 --dataset_dir "[trojai_dataset_dir]" --scatch_dirpath "scratch"
- Results will be output in `result.txt`, with each line recording the backdoor type, model ID, trojan probability, and additional trigger inversion results.
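A small helper like the one below can then rank scanned models by the reported trojan probability. The field layout of `result.txt` is an assumption here (a model ID such as `id-00000003` followed by a floating-point probability), and the 0.5 decision threshold is only a placeholder; adjust both to the exact output format your run produces.

```python
import re

def rank_by_trojan_probability(result_path="result.txt", threshold=0.5):
    """Rank scanned models by trojan probability and flag likely-trojaned ones.

    Assumes each line contains a model ID like 'id-00000003' and that the first
    floating-point number after it is the trojan probability.
    """
    rows = []
    with open(result_path) as f:
        for line in f:
            m = re.search(r"id-\d+", line)
            if not m:
                continue
            p = re.search(r"\d+\.\d+", line[m.end():])
            if p:
                rows.append((m.group(), float(p.group())))
    rows.sort(key=lambda r: r[1], reverse=True)
    for model, prob in rows:
        flag = "TROJANED" if prob >= threshold else "clean"
        print(f"{model}\t{prob:.4f}\t{flag}")
```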
Please cite our paper if you find it useful for your research.😀
@article{cheng2023beagle,
title={BEAGLE: Forensics of Deep Learning Backdoor Attack for Better Defense},
author={Cheng, Siyuan and Tao, Guanhong and Liu, Yingqi and An, Shengwei and Xu, Xiangzhe and Feng, Shiwei and Shen, Guangyu and Zhang, Kaiyuan and Xu, Qiuling and Ma, Shiqing and Zhang, Xiangyu},
journal={arXiv preprint arXiv:2301.06241},
year={2023}
}