4K context doesn't work? #3329
Comments
Try changing the compress_pos_emb value to 2 instead of 1.
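If you want that setting to persist rather than changing it in the UI each time, a per-model override along these lines may work. This is a minimal sketch, assuming the usual models/config.yaml structure where each top-level key is a pattern matched against the model name; the pattern below is illustrative and should be adjusted to your model folder.

```yaml
# models/user-config.yaml (or models/config.yaml) -- sketch, not a verified entry
# The pattern key is an assumption; it should match your model's folder name.
.*llama-2-13b.*:
  compress_pos_emb: 2   # positional embedding compression suggested above
```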
See #3153. Even after changing compress_pos_emb to 2, it still won't work.
I created
You can also update truncation_length on a per-model basis using the models/user-config.yaml file, which may better suit your needs. If you haven't already, you should also update your models/config.yaml and characters/instruction-following/*.yaml files. I also just realized that this is a Llama-2 model; you don't need compress_pos_emb to get 4K with those. Start by updating your models/config.yaml and you may find it just works.
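For reference, a sketch of what a per-model truncation_length override could look like, using the same assumed pattern-key structure as above; since this is a Llama-2 model, compress_pos_emb is left at its default.

```yaml
# models/user-config.yaml -- per-model override, sketch based on the settings named above
.*llama-2-13b-gptq.*:
  truncation_length: 4096   # allow the full 4K context when truncating prompts
```

The user file appears intended to hold your own overrides so they are not lost when the stock models/config.yaml is updated.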
Set it how you want it and click "Save settings" on the models tab. The settings will be loaded automatically when you load the model.
Did you download the model when it first dropped? It looks like the old config.json had 2048 set, and it was updated to 4096 a couple of days later.
This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.
Describe the bug
Not sure if I'm doing something wrong, but I downloaded a 4K context size model, and any time I try to make an API request I still get errors about the context size being too large.
Is there an existing issue for this?
Reproduction
Download TheBloke_Llama-2-13B-GPTQ, set the max length to 4096; it doesn't work.
Screenshot
Logs
System Info