Support YaRN models (RoFormer implementation in rotary_embedding kernel) #980

casper-hansen · 2023-09-07T11:48:00Z

YaRN models with 64k and 128k context sizes were recently pre-trained and released by people from Nous Research and EleutherAI. They use RoFormer-style rotary embeddings, which seem to differ from the GPT-NeoX and GPT-J variants. The models are based on Llama 2, so support is mostly there; only a few small adjustments should be needed.

The original YaRN module uses the flash attention rotary embedding implementation, which seems similar in functionality. The original RoFormer implementation from Hugging Face may also be a useful reference.
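For anyone comparing the variants, here is a minimal sketch of the two pairing conventions usually called GPT-NeoX style and GPT-J style. It is not taken from vLLM, flash attention, or the YaRN repository; the function names and the default `base=10000.0` are assumptions for illustration, and it does not settle which convention the YaRN checkpoints need.

```python
import math

def rotary_angles(position, head_dim, base=10000.0):
    """Return the head_dim // 2 rotation angles for one token position."""
    half = head_dim // 2
    return [position * base ** (-2.0 * i / head_dim) for i in range(half)]

def rotate_neox(x, position, base=10000.0):
    """GPT-NeoX style (illustrative): pair element i with element i + head_dim // 2."""
    half = len(x) // 2
    out = [0.0] * len(x)
    for i, angle in enumerate(rotary_angles(position, len(x), base)):
        c, s = math.cos(angle), math.sin(angle)
        a, b = x[i], x[i + half]
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out

def rotate_gptj(x, position, base=10000.0):
    """GPT-J style (illustrative): pair adjacent elements 2i and 2i + 1."""
    out = [0.0] * len(x)
    for i, angle in enumerate(rotary_angles(position, len(x), base)):
        c, s = math.cos(angle), math.sin(angle)
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * c - b * s
        out[2 * i + 1] = a * s + b * c
    return out
```

Both functions apply the same 2D rotations and differ only in which elements of the head vector form a pair, so a checkpoint trained with one layout gives wrong results if the kernel assumes the other.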

Models catalog:
https://huggingface.co/NousResearch/Yarn-Llama-2-7b-64k
https://huggingface.co/NousResearch/Yarn-Llama-2-7b-128k
https://huggingface.co/NousResearch/Yarn-Llama-2-13b-64k
https://huggingface.co/NousResearch/Yarn-Llama-2-13b-128k

viktor-ferenczi · 2023-09-23T09:03:06Z

Copied from #1027:

YaRN paper: YaRN: Efficient Context Window Extension of Large Language Models
YaRN code: YaRN GitHub

viktor-ferenczi mentioned this issue Sep 22, 2023

Support YaRN models (RoFormer implementation in rotary_embedding kernel) #1027

Closed

viktor-ferenczi mentioned this issue Sep 23, 2023

YaRN tests #1161

Closed

Yard1 mentioned this issue Oct 5, 2023

YaRN support implementation #1264

Merged

WoosukKwon closed this as completed in #1264 Nov 3, 2023
