
Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced Extrapolation in LLMs

Implementation of Mesa-Extrapolation, the method proposed in the paper "Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced Extrapolation in LLMs" (NeurIPS 2024).

1. Abstract

Large language models (LLMs), although they have revolutionized many fields, still suffer from the challenging extrapolation problem, where their inference ability declines sharply beyond the maximum training length. In this work, we conduct a theoretical analysis to better understand why No Position Encoding (NoPE) fails outside its effective range, and we examine the power of Position Encoding (PE) in this context. Our findings reveal that, with a meticulously woven position encoding, PE can indeed be extended beyond its effective range. Our theorems establish that LLMs equipped with weave PE can achieve improved extrapolation performance without additional cost. Furthermore, we introduce a novel weave PE method, Mesa-Extrapolation, which uses a chunk-based triangular attention matrix and applies Stair PE to handle the final chunk. This method not only retains competitive performance but also offers substantial benefits such as significantly reduced memory demand and faster inference speed. Extensive experiments validate the effectiveness of Mesa-Extrapolation, demonstrating its potential as a scalable solution for extending the reach of LLM applications.
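
For intuition only (this is a toy sketch, not the paper's exact Stair PE; see the paper for the precise weave): the core idea of weaving is to remap position ids so that every relative position a layer sees stays inside the range covered during training. The parameters max_trained and chunk_size below are illustrative.

import torch

def weave_position_ids(seq_len: int, max_trained: int, chunk_size: int) -> torch.Tensor:
    """Toy illustration of weaving: positions up to the trained limit are
    kept as-is; positions beyond it are folded back into the final chunk
    in a stair-like pattern, so no position id exceeds the trained range."""
    pos = torch.arange(seq_len)
    over = pos >= max_trained
    pos[over] = (max_trained - chunk_size) + (pos[over] - max_trained) % chunk_size
    return pos

# Example: with max_trained=8 and chunk_size=4, positions 8..15 reuse ids 4..7.
print(weave_position_ids(16, max_trained=8, chunk_size=4))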

2. Overview

The schematic diagram of our method is shown below:

[Figure: schematic diagram of the Mesa-Extrapolation method]

Compared with baseline extrapolation methods, our approach uses the least memory and achieves the lowest inference latency:

[Figure: memory usage and inference latency comparison]

It extends Phi-3-instruct, which natively supports a 128k sequence length, to at least 192k:

[Figure: Phi-3-instruct extrapolation results up to 192k]

3. Usage

Dependencies

Our current implementation is based on transformers==4.31.0; we will continue to update it to newer versions. For the attention computation, we currently support both flash-attention and a plain PyTorch implementation.
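
A minimal environment setup might look like the following (only the transformers pin comes from this README; the rest is our assumption, and flash-attn is needed only for the flash-attention path):

pip install transformers==4.31.0 torch
pip install flash-attn  # optional, for the flash-attention path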

Passkey Data Generation

python datas/make_passkey_data.py
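
For orientation, a passkey-retrieval sample hides a short numeric key inside long filler text and asks the model to recall it. A minimal sketch of such a sample (the exact format emitted by datas/make_passkey_data.py may differ):

import random

def make_passkey_sample(n_filler: int = 2000) -> tuple[str, str]:
    """Build one passkey-retrieval prompt: a hidden key inside filler text."""
    key = str(random.randint(10000, 99999))
    filler = "The grass is green. The sky is blue. The sun is yellow. " * n_filler
    cut = random.randint(0, len(filler))
    prompt = (
        filler[:cut]
        + f"The pass key is {key}. Remember it. {key} is the pass key. "
        + filler[cut:]
        + "\nWhat is the pass key?"
    )
    return prompt, key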

Run

python experiments/evaluate_passkey_retrieval.py

4. TODOs

[1] Release the core code of Mesa-Extrapolation, including support for LLaMA, Pythia, Baichuan, Phi, etc.

[2] Support newer versions of Transformers.

[3] Integrate with open-source inference frameworks such as vLLM.

5. Contributing

We welcome contributions from the research community to improve the efficiency of Mesa-Extrapolation. If you have any ideas or would like to join in, please contact us ([email protected]).

If you find our method useful, please cite our paper:

@misc{xin2024llm,
      title={Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced Extrapolation in LLMs}, 
      author={Xin Ma and Yang Liu and Jingjing Li and Xiaoxu Ma},
      year={2024},
      eprint={2410.15859},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
