Moderate-fitting

This is the implementation of the paper "Moderate-fitting as a Natural Backdoor Defender for Pre-trained Language Models".

Quick links

Overview
Set up the environment
Arguments
Run the experiments
Reference

Overview

A growing body of work shows that pre-trained language models (PLMs) are vulnerable to backdoor attacks. In this paper, we present several simple yet effective training strategies that defend against such attacks.

Set up the environment

(1) create a virtual environment (optional)

conda create -n moderate_env python=3.8
conda activate moderate_env

(2) install the package

python setup.py install

Arguments

(1) Some arguments in the LoRA config file:

poisoned_test_file: the path of the poisoned testing data

poisoned_train_file: the path of the poisoned training data 

clean_test_file: the path of the clean testing data

mid_dim: the bottleneck dimension of the reparameterization network

lora_r: the LoRA rank r
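To give a rough sense of what lora_r controls: in standard LoRA, a frozen weight matrix of shape d_out x d_in is updated by the product of two trainable factors of shapes d_out x r and r x d_in, so the number of trainable parameters grows linearly with r. The sketch below only counts those factors. It ignores the reparameterization network controlled by mid_dim, and the ranks shown are illustrative values, not defaults taken from this repository's config files.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in the two low-rank factors B (d_out x r) and A (r x d_in)."""
    return r * (d_in + d_out)


hidden = 768  # hidden size of roberta-base
for r in (1, 4, 8):  # illustrative ranks, not repository defaults
    print(f"lora_r={r}: {lora_trainable_params(hidden, hidden, r)} parameters per adapted matrix")
```

A smaller lora_r means fewer trainable parameters per adapted weight matrix.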

(2) Some arguments in the Adapter config file:

poisoned_test_file: the path of the poisoned testing data

poisoned_train_file: the path of the poisoned training data 

clean_test_file: the path of the clean testing data

mid_dim: the bottleneck dimension of the reparameterization network

bottleneck_dim: the projection dimension of the Adapter
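As a rough illustration of what bottleneck_dim controls, the sketch below counts the parameters of one standard bottleneck adapter: a down-projection from the hidden size to bottleneck_dim followed by an up-projection back to the hidden size. It assumes the common two-linear-layer adapter design with optional biases, which may differ in detail from the adapter used in this repository, and the value 24 is only an example.

```python
def adapter_trainable_params(hidden: int, bottleneck_dim: int, bias: bool = True) -> int:
    """Count parameters of one bottleneck adapter (down-projection, then up-projection)."""
    down = hidden * bottleneck_dim + (bottleneck_dim if bias else 0)
    up = bottleneck_dim * hidden + (hidden if bias else 0)
    return down + up


# 768 is the hidden size of roberta-base; 24 is an illustrative bottleneck_dim.
print(adapter_trainable_params(768, 24))
```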

(3) Some arguments in the Prefix-Tuning config file:

poisoned_test_file: the path of the poisoned testing data

poisoned_train_file: the path of the poisoned training data 

clean_test_file: the path of the clean testing data

mid_dim: the bottleneck dimension of the reparameterization network

prefix_token_num: the number of prefix tokens
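The arguments listed above can be checked before launching a run. The following is a minimal sketch, assuming the config file is JSON. The key names come from this README, while the file paths and numeric values are hypothetical placeholders, and missing_keys is a helper introduced here for illustration, not part of the repository.

```python
import json

# Keys documented in this README: four shared keys plus one method-specific key.
SHARED_KEYS = ("poisoned_test_file", "poisoned_train_file", "clean_test_file", "mid_dim")
METHOD_KEYS = {"lora": "lora_r", "adapter": "bottleneck_dim", "prefix": "prefix_token_num"}


def missing_keys(config: dict, method: str) -> list:
    """Return the README-documented keys that are absent from a config dict."""
    expected = SHARED_KEYS + (METHOD_KEYS[method],)
    return [key for key in expected if key not in config]


# Hypothetical config content; paths and numbers are placeholders only.
example = json.loads("""
{
  "poisoned_test_file": "path/to/poisoned_test_file",
  "poisoned_train_file": "path/to/poisoned_train_file",
  "clean_test_file": "path/to/clean_test_file",
  "mid_dim": 16,
  "prefix_token_num": 10
}
""")

print(missing_keys(example, "prefix"))
```

An empty list means every argument described above is present for the chosen method.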

Run the experiments

(1) To defend against a word-level attack on SST-2 with low-rank reparameterized LoRA, run the following commands:

cd ./examples/examples_text-classification

bash run_poison.sh 6 6 lora_roberta-base-sst2-badnet-5

(2) To defend against a word-level attack on SST-2 with low-rank reparameterized Adapter, run the following commands:

cd ./examples/examples_text-classification

bash run_poison.sh 6 6 adapter_roberta-base-sst2-badnet-5

(3) To defend against a word-level attack on SST-2 with low-rank reparameterized Prefix-Tuning, run the following commands:

cd ./examples/examples_text-classification

bash run_poison.sh 6 6 prefix_roberta-base-sst2-badnet-5
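The three commands above differ only in the method prefix of the config name, so they can be generated in a loop. This is a small sketch rather than a script shipped with the repository: build_command is a hypothetical helper, and the two "6" arguments are copied verbatim from the commands above because their meaning is not documented here.

```python
import shlex

METHODS = ("lora", "adapter", "prefix")


def build_command(method: str) -> list:
    """Build the run_poison.sh invocation shown in this README for one method."""
    # The two "6" arguments are copied as-is from the README commands.
    return ["bash", "run_poison.sh", "6", "6", f"{method}_roberta-base-sst2-badnet-5"]


for method in METHODS:
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_command(method)))
```

To actually launch a run, each list could be passed to subprocess.run with cwd set to ./examples/examples_text-classification.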

Reference

[1] https://github.com/thunlp/OpenDelta
