YaRN tests #1161

Closed
wants to merge 2 commits into from

viktor-ferenczi
Contributor

@viktor-ferenczi viktor-ferenczi commented Sep 23, 2023

Issue: #980

Currently the branch has preliminary code that tests context window quality with passkey retrieval tasks. It does not plot a graph, since that is not the goal of the test. The same test can be run against both the reference implementation of YaRN (using Transformers) and vLLM, with parameters chosen to produce similar output, so our upcoming implementation can be compared with the reference one.
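For readers unfamiliar with passkey retrieval, here is a minimal sketch of how such a test prompt could be built. It is not the code from this branch: the filler text, the prompt wording, and the helper names `build_passkey_prompt` and `passkey_found` are assumptions made for illustration only.

```python
import random
from typing import Tuple

# Hypothetical filler sentence block; any repetitive distractor text would do.
FILLER = "The grass is green. The sky is blue. The sun is yellow. Here we go. There and back again. "


def build_passkey_prompt(n_filler: int, seed: int = 0) -> Tuple[str, str]:
    """Return (prompt, passkey) with the passkey hidden at a random depth.

    n_filler controls the prompt length: more filler blocks means a longer
    context that the model has to search through.
    """
    rng = random.Random(seed)
    passkey = str(rng.randint(10000, 99999))
    insert_at = rng.randint(0, n_filler)  # number of filler blocks before the key
    needle = f"The pass key is {passkey}. Remember it. {passkey} is the pass key. "
    body = FILLER * insert_at + needle + FILLER * (n_filler - insert_at)
    prompt = (
        "There is a pass key hidden in the text below. Find it and memorize it.\n\n"
        + body
        + "\n\nWhat is the pass key? The pass key is"
    )
    return prompt, passkey


def passkey_found(completion: str, passkey: str) -> bool:
    """A completion passes if it contains the hidden passkey verbatim."""
    return passkey in completion
```

The same prompt can then be sent to both backends with greedy decoding, and the pass rate compared across several context lengths.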

TODO

  • Implement YaRN in vLLM
  • Verify model output and context window quality
  • Turn the test code into a pytest test case

@viktor-ferenczi viktor-ferenczi force-pushed the yarn branch 2 times, most recently from f74f49d to 63a52f6 September 23, 2023 17:39
@viktor-ferenczi viktor-ferenczi marked this pull request as draft September 24, 2023 01:57
@casper-hansen
Contributor

Thanks for giving this a shot @viktor-ferenczi. YaRN models look impressive because of their low perplexity and long context windows, so I’m sure the community will love to test this out once it’s ready.

@trannhatquy

Please finish this pull request. It would really help, because this model is very good.

@viktor-ferenczi
Contributor Author

I don't have extensive LLM (or vLLM) development experience yet. I'm learning it as I go, so it won't be implemented fast (unless I get help on this). I'm committed to completing it at some point, but I also need to find the time to work on it. (I have a day job.)

@trannhatquy

@zhuohan123 @WoosukKwon please see this pull request

@viktor-ferenczi
Contributor Author

viktor-ferenczi commented Sep 26, 2023

@casper-hansen There are also #555 and #464. They seem to share code with YaRN and differ only in the RoPE scaling approach.

I suggest having a unified configuration and a partially shared implementation. #555 would be a good start if it can be reviewed and finalized first. There is no point in redoing work that has already been done there. That PR has different long-context test cases from the ones I wrote, so maybe they could be merged to use both.

The new LLM option would look something like:

rope_scaling=dict(
    type='linear',  # linear, dynamic or yarn
    factor=2.0,  # scaling factor
    ...  # Hyper-parameters if required, like YaRN's alpha and beta
)

Also, this configuration seems to be already defined in the YaRN models' config, except for the alpha and beta hyper-parameters. The paper mentions alpha=1 and beta=32 for Llama 2 models.
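To show where alpha and beta enter, here is a rough sketch of the per-dimension frequency blending described in the YaRN paper, written in plain Python. It is not the reference implementation and not vLLM code: the function names `yarn_inv_freq` and `yarn_attention_scale`, the RoPE base of 10000, and the default arguments are assumptions for illustration.

```python
import math


def yarn_inv_freq(dim, base=10000.0, factor=32.0, original_max_pos=4096,
                  alpha=1.0, beta=32.0):
    """Blend interpolated and original RoPE frequencies per dimension pair.

    For each frequency theta, r is the number of full rotations it completes
    within the original context length. Dimensions with r < alpha are fully
    interpolated (theta / factor), dimensions with r > beta are left
    untouched, and a linear ramp is used in between.
    """
    inv_freq = []
    for i in range(0, dim, 2):
        theta = base ** (-i / dim)
        wavelength = 2 * math.pi / theta
        r = original_max_pos / wavelength
        gamma = min(1.0, max(0.0, (r - alpha) / (beta - alpha)))
        inv_freq.append((1 - gamma) * theta / factor + gamma * theta)
    return inv_freq


def yarn_attention_scale(factor):
    """Attention scaling from the paper: sqrt(1/t) = 0.1 * ln(factor) + 1."""
    return 0.1 * math.log(factor) + 1.0
```

With these defaults, the highest-frequency dimensions keep their original frequencies, while the lowest-frequency ones are divided by the scaling factor.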

In the NousResearch/Yarn-Llama-2-13b-128k YaRN model's config.json there is:

  "rope_scaling": {
    "factor": 32.0,
    "original_max_position_embeddings": 4096,
    "type": "yarn",
    "finetuned": true
  }

The smaller NousResearch/Yarn-Llama-2-7b-64k YaRN model has:

  "rope_scaling": {
    "factor": 16.0,
    "original_max_position_embeddings": 4096,
    "type": "yarn",
    "finetuned": true
  }
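As a quick sanity check on these two configs, the extended context length should simply be the scaling factor times the original context length. The helper name `extended_context_length` below is made up for this sketch; the values are copied from the config fragments above.

```python
import json


def extended_context_length(rope_scaling: dict) -> int:
    """Scaled context window: factor * original_max_position_embeddings."""
    return int(rope_scaling["factor"] * rope_scaling["original_max_position_embeddings"])


# rope_scaling sections as they appear in the two config.json files
cfg_13b_128k = json.loads(
    '{"factor": 32.0, "original_max_position_embeddings": 4096, "type": "yarn", "finetuned": true}'
)
cfg_7b_64k = json.loads(
    '{"factor": 16.0, "original_max_position_embeddings": 4096, "type": "yarn", "finetuned": true}'
)

print(extended_context_length(cfg_13b_128k))  # 131072, i.e. 128k
print(extended_context_length(cfg_7b_64k))    # 65536, i.e. 64k
```

This matches the 128k and 64k suffixes in the model names.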

What do you think?

@WoosukKwon WoosukKwon added the "new model" label (Requests to new models) Sep 27, 2023
@Yard1
Collaborator

Yard1 commented Oct 4, 2023

Hi @viktor-ferenczi, I would be willing to contribute the implementation, unless you have already started work on this.

@viktor-ferenczi
Contributor Author

I just added tests and haven't written the actual YaRN code yet.

What may help you is that #464 was merged recently.

Please go ahead with the implementation, because I lack the time to work on it right now.

@Yard1
Collaborator

Yard1 commented Oct 5, 2023

Implementation PR: #1264

@viktor-ferenczi viktor-ferenczi changed the title from "Support YaRN" to "YaRN tests" Oct 5, 2023
@WoosukKwon WoosukKwon mentioned this pull request Oct 13, 2023
@WoosukKwon WoosukKwon mentioned this pull request Nov 2, 2023
@zhuohan123 zhuohan123 closed this Nov 23, 2023