Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use
The code is implemented with PyTorch, and the tool-use code is based on ToolBench. We appreciate these open-source projects.
git clone https://github.com/AlibabaResearch/DAMO-ConvAI.git
cd DAMO-ConvAI/attention-buckets
conda create -n your_envs python=3.9
conda activate your_envs
pip install -r requirements.txt
# replace "your_env_path/lib/site-packages/transformers/models/llama/modeling_llama.py" with our "modeling_llama.py"
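If you are unsure which file to replace, the snippet below is a minimal sketch (not an official script of this repo) that locates the installed `modeling_llama.py` via `transformers` itself, backs it up, and copies the patched file over it. It assumes the patched file is named `modeling_llama.py` and sits in the current (attention-buckets) directory.

```python
# Minimal sketch: locate the installed modeling_llama.py, back it up, and
# overwrite it with the patched version from this repo (assumed ./modeling_llama.py).
# Run inside the conda environment created above.
import shutil
import transformers.models.llama.modeling_llama as hf_llama

target = hf_llama.__file__                # .../site-packages/transformers/models/llama/modeling_llama.py
print("Replacing:", target)
shutil.copy(target, target + ".bak")      # keep a backup of the stock implementation
shutil.copy("modeling_llama.py", target)  # our patched file (assumed path)
```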
git clone [email protected]:OpenBMB/ToolBench.git
cd ToolBench
We only describe the parts relevant to our work; for more details about ToolBench, please refer to the ToolBench repository (https://github.com/OpenBMB/ToolBench).
Put all datasets in ToolBench/data.
- Original data of ToolBench: download the dataset using the following links: Google Drive or Tsinghua Cloud.
- Results of our method: download the dataset using the following link: Google Drive.
1. Replace "ToolBench/toolbench/inference/utils.py" with our "inference/utils.py".
2. Move our "inference/config.py" to "ToolBench/toolbench/inference".
3. Replace "ToolBench/toolbench/utils.py" with our "utils.py".
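For convenience, here is a minimal sketch (our own, not an official script) that performs the three replacements above with `shutil`. It assumes it is run from the attention-buckets directory, with ToolBench cloned inside it as in the steps above.

```python
# Minimal sketch of the three file replacements listed above.
# Assumption: run from the attention-buckets directory, with ToolBench/ cloned inside it.
import shutil

shutil.copy("inference/utils.py", "ToolBench/toolbench/inference/utils.py")    # 1. replace inference utils
shutil.copy("inference/config.py", "ToolBench/toolbench/inference/config.py")  # 2. add our config
shutil.copy("utils.py", "ToolBench/toolbench/utils.py")                        # 3. replace toolbench utils
```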
# run
bash scripts/inference_toolllama_pipeline.sh
# eval, same as in ToolBench
cd tooleval
bash run_convert_answer.sh
bash run_pass_rate.sh
# get the pass rate of chatgpt_cot and of your method, then run the following to get the preference
bash run_preference.sh
cd base_rag
Put the data in ../qa_dataset. Download the dataset using the following link: Google Drive.
# run
# bsz must be >= the total number of RoPE bases
CUDA_VISIBLE_DEVICES=$i python test_nq_kl.py --flag $i --bsz 8 --num_doc $num_doc --ngpu $n_gpu --data_name $data_name
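The command above is meant to be launched once per GPU, with i ranging over 0 .. n_gpu-1. Below is a minimal launcher sketch (our own convenience wrapper, not part of the repo) that spawns one test_nq_kl.py process per GPU; the values of n_gpu, num_doc, and data_name are placeholders.

```python
# Minimal launcher sketch: one test_nq_kl.py process per GPU (i = 0 .. n_gpu-1).
# n_gpu, num_doc, and data_name below are placeholder values.
import os
import subprocess

n_gpu, num_doc, data_name = 4, 10, "nq"

procs = []
for i in range(n_gpu):
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(i))  # pin this process to GPU i
    cmd = ["python", "test_nq_kl.py",
           "--flag", str(i),           # shard index for this GPU
           "--bsz", "8",               # must cover the total number of RoPE bases
           "--num_doc", str(num_doc),
           "--ngpu", str(n_gpu),
           "--data_name", data_name]
    procs.append(subprocess.Popen(cmd, env=env))

for p in procs:
    p.wait()
```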
# eval
python merge_result.py --ngpu $n_gpu --data_name $data_name --num_doc $num_doc
Feel free to cite us if you like our work.
@article{Chen2023FortifyTS,
title={Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use},
author={Yuhan Chen and Ang Lv and Ting-En Lin and Chang Heng Chen and Yuchuan Wu and Fei Huang and Yongbin Li and Rui Yan},
journal={ArXiv},
year={2023},
volume={abs/2312.04455},
url={https://api.semanticscholar.org/CorpusID:266053571}
}