Skip to content

wenhaoli-xmu/UIO-LLMs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

UIO-LLMs: Unbiased Incremental Optimization for Long-Context LLMs

Experiment setup

conda env create -f environment.yaml
conda activate uiollms

Our repository requires flash attention and PyTorch dependencies, which are related to the local environment, and need to be installed manually with suitable versions according to your environment.

Download Dataset

cd uiollms

wget https://huggingface.co/datasets/namespace-Pt/projects/resolve/main/activation-beacon.tar.gz?download=true -O ./raw_data/activation-beacon.tar.gz

cd raw_data
tar -xzvf activation-beacon.tar.gz
cp activation-beacon/pretrain/redpajama-sample.json .
cp activation-beacon/finetune/longalpaca.json .

Pretrain

cd uiollms
python train.py --env_conf 32x.json

Illustration on 32x.json Config

  • You can modify the device_map field in the 32x.json file to change the GPU used for model loading. By assigning different GPUs to different modules, you can achieve pipeline parallelism.
  • You can adjust the config/32x.json configuration file to change the parameters of LoRA fine-tuning, such as chunk size and compression ratio, etc.
  • You can modify the tbptt field to change the number of computation graphs retained in memory during UIO-TBPTT algorithm execution. A larger value will slightly improve convergence, but increase overhead.
  • You can modify the corpus field to configure your desired dataset. All supported datasets and their specific configuration methods can be found in the src/data.py file. Use the truncation field to configure the maximum token count for each dataset, and partition to set the instance ratio for the dataset.
  • You can modify the save_ckp field to set the save path for checkpoints.

Benchmark

Language Modeling

cd uiollms
python test.py --env_conf 32x.json

As long as a checkpoint is generated during the training process (controlled by the save_ckp and save fields), you can use the above command to perform evaluation.

The test.py script will automatically search for a test.json file in the working directory, which is used to configure the dataset to be evaluated. Each instance in the file has the following format:

{
    "task_type": "perplexity",
    "task_name": "pg19.test.128k", 
    "num_instance": 100,
    "truncation": 99382
}

Citation

@article{uiollms
    author = {Wenhao Li, Mingbao Lin, Yunshan Zhong, Shuicheng Yan, Rongrong Ji},
    year = {2024},
    title = {UIO-LLMs: Unbiased Incremental Optimization for Long-Context LLMs}
    journal = {arXiv}
}

About

Official implementation of UIO-LLMs

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages