Code&Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024]

lancopku/agent-backdoor-attacks

BadAgents: Backdoor Attacks on LLM-based Agents

This repository contains the code and data for the NeurIPS 2024 paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [pdf].


Poisoned Data

We have released the poisoned training data used in the Web Shopping experiments (provided here) and the Tool Learning experiments (download from here).

Query-Attack and Observation-Attack

The code for Query-Attack and Observation-Attack is in AgentTuning.

Thought-Attack

The code for Thought-Attack is mainly based on ToolBench. ToolBench/README.md contains instructions on how to use the poisoned data we provide.

Citation

If you use our code or data, please cite our work as:

@article{yang2024watch,
  title={Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents},
  author={Yang, Wenkai and Bi, Xiaohan and Lin, Yankai and Chen, Sishuo and Zhou, Jie and Sun, Xu},
  journal={arXiv preprint arXiv:2402.11208},
  year={2024}
}
