[Bug]: CohereForAI/c4ai-command-r-v01 : ValueError: User-specified max_model_len (131072) is greater than the derived max_model_len (None=8192 in model's config.json). This may lead to incorrect model outputs or CUDA errors. Make sure the value is correct and within the model context size. #3676
Comments
cc @zeppombal
Lines 749 to 772 in 14ccd94 are where the max model length is derived from the config. If you want to change … But I can't think of a good way to solve this.
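For context, that derivation works roughly like this (a paraphrased sketch of the vLLM logic around that commit, not a verbatim copy; the exact key list may differ slightly):

```python
# Paraphrased sketch of how vLLM derives max_model_len from the HF
# config around commit 14ccd94 (not verbatim).

def derive_max_model_len(hf_config) -> float:
    derived = float("inf")
    # vLLM takes the minimum over a fixed list of config keys.
    # "model_max_length" was not in this list at the time, so
    # Command-R's 131072 value was ignored and 8192 won out.
    possible_keys = [
        "max_position_embeddings",
        "n_positions",
        "max_seq_len",
        "seq_length",
        "max_sequence_length",
    ]
    for key in possible_keys:
        value = getattr(hf_config, key, None)
        if value is not None:
            derived = min(derived, value)
    return derived
```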
But I thought they use rope scaling, so it should be accounted for in vLLM. It shouldn't change the embedding size, I'd think.
@pseudotensor That makes sense.
@pseudotensor If you look at the code right below (lines 786 to 796 in 14ccd94), it'll scale the derived max model length by the factor if it exists in the config.
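That scaling step looks roughly like this (again a paraphrase, not the verbatim source):

```python
def apply_rope_scaling(hf_config, derived_max_model_len: float) -> float:
    """Paraphrased sketch of vLLM's rope-scaling adjustment (not verbatim)."""
    rope_scaling = getattr(hf_config, "rope_scaling", None)
    if rope_scaling is not None:
        # config.json carries e.g. {"type": "linear", "factor": 4.0};
        # the derived limit is multiplied by that factor.
        derived_max_model_len *= rope_scaling["factor"]
    # Command-R's config.json apparently has no "rope_scaling" entry
    # (it raises rope_theta instead), so the derived limit stays at 8192.
    return derived_max_model_len
```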
But going below the maximum should be controlled by passing max_model_len to vLLM. I'm not aware of any other model that fails in this way with rope scaling.
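For instance, requesting less than the derived maximum works without this error (a minimal usage sketch; 4096 is an arbitrary example value):

```python
from vllm import LLM

# Passing a max_model_len at or below the derived 8192 is accepted;
# only a larger value triggers the ValueError from this issue.
llm = LLM(
    model="CohereForAI/c4ai-command-r-v01",
    max_model_len=4096,  # arbitrary value <= the derived 8192
)
```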
You're right - I did a bit more research and found https://huggingface.co/CohereForAI/c4ai-command-r-v01/discussions/12, and it seems that "model_max_length" was added after discussion in that thread. This is indeed a bug we should fix then - although going forward I'm not sure if we should just take …
At least from that discussion, a low default is OK, but it shouldn't require editing the model's config.json in order to go up to its maximum. Although normally I think vLLM always uses the maximum by default unless you make it smaller, which is different from what that person said HF does.
@pseudotensor Yep - I had the same thoughts in #3727. Please take a look, thanks!
The model_max_length was added to support llama.cpp (ggerganov/llama.cpp#6033 (comment)).
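For reference, the mismatch is easy to see by inspecting the two fields directly (a sketch using transformers; depending on your transformers version you may not need trust_remote_code=True):

```python
from transformers import AutoConfig, AutoTokenizer

repo = "CohereForAI/c4ai-command-r-v01"
config = AutoConfig.from_pretrained(repo, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)

# config.json caps positions at 8192, while model_max_length
# (added after the linked HF discussion) advertises 131072.
print(config.max_position_embeddings)  # 8192
print(tokenizer.model_max_length)      # 131072
```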
Your current environment
Head of main after various Cohere updates/fixes.
Issues:
Have to comment out trust-remote-code due to a bug in their model; there is a PR to register the model name, but it isn't merged yet.
🐛 Describe the bug