convert.py couldn't convert internlm2 #5031
It should be easy to extend - take a look at the existing ARCHes.
InternLM just released a tool (https://github.com/InternLM/InternLM/tree/main/tools) to convert models to the llama format. However, the community found that converting with the new llama format still fails with an error; see the issue in the InternLM repo for the community discussion so far.
The llamaified version https://huggingface.co/chargoddard/internlm2-base-20b-llama seemed to be converted with convert.py, but it gets this error:
Related: #4360
This was seen with a previous version of InternLM as well: converting to GGUF was fine, but hosting failed with the same error.
In the case of the InternLM2 model, the problem is with token 354. I created a simple script that edits the sentencepiece model.
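A minimal sketch of such a fix, assuming the offending piece at index 354 can simply be replaced in the protobuf (the replacement string and file names are assumptions, not the exact script from this thread):

```python
# Sketch only: load the sentencepiece model protobuf, swap out the broken
# piece at index 354, and write a fixed copy of the tokenizer.
import sentencepiece.sentencepiece_model_pb2 as sp_pb2

m = sp_pb2.ModelProto()
with open("tokenizer.model", "rb") as f:
    m.ParseFromString(f.read())

print(repr(m.pieces[354].piece))     # inspect the offending piece first
m.pieces[354].piece = "<fixed_354>"  # hypothetical replacement token
with open("tokenizer_fixed.model", "wb") as f:
    f.write(m.SerializeToString())
```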
Try the llamaified InternLM2 tokenizer https://huggingface.co/RangiLyu/InternLM2-tokenizer-llama
By using this (and nulling the "rope_scaling" field in config.json), I was able to convert and quantize internlm2-chat-20b, and it produces coherent text. However, the model never stops generating. Here's a snippet; it goes on longer than this:
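For reference, a small sketch of the config edit described above, assuming the usual HF config.json layout:

```python
# Sketch: null out the "rope_scaling" field in config.json before conversion.
import json

with open("config.json") as f:
    cfg = json.load(f)
cfg["rope_scaling"] = None          # serialized as JSON null
with open("config.json", "w") as f:
    json.dump(cfg, f, indent=2)
```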
Okay, the provided tokenizer works. Now I suppose the next step is for someone to integrate all these steps into one of the conversion scripts.
With the latest code, the exact same issue is still there. Maybe convert.py should be updated as well?
Does it work now after merging #5305?
Yep, conversion and inference are good. The chat model could still use some renamed tokens, though.
Hi, it seems there is still an open issue? https://huggingface.co/internlm/internlm-xcomposer2-vl-7b
I did the following steps:
Furthermore, if I change the EOS token with intervitens' script before step 3, replacing 'tokenizer.model' with 'tokenizer_fixed.model', then finish steps 3, 4, and 5, it outputs:
After browsing the above replies, I wonder if the string
Before step 3, you may attempt to modify the configuration in config.json. Good luck ~
This issue was closed because it has been inactive for 14 days since being marked as stale. |
The latest convert.py doesn't convert the newly released InternLM2 model as expected and exits with this error:
KeyError: 'model.tok_embeddings.weight'
The InternLM2 team's official response to the issue is:
"Unlike other GQA models, it packed q, k, v weights into one tensor."
It would be great to have this case properly handled in llama.cpp, so that we can better utilize these models and the computing power along the way. See the issue logged in the InternLM2 community below for more details.
internlm issue
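To illustrate what that packing implies for a converter, here is a hedged sketch of splitting a fused wqkv tensor back into separate q/k/v weights. It assumes an InternLM2-style layout where each kv group stores its query heads followed by one key head and one value head; the layout and names are assumptions, not the exact llama.cpp code.

```python
# Sketch: unpack a fused wqkv weight into wq, wk, wv for a GQA model.
# Assumed layout per kv group: q_per_kv query heads, then 1 key head,
# then 1 value head. Verify against the actual checkpoint before use.
import numpy as np

def split_qkv(wqkv: np.ndarray, n_head: int, n_kv_head: int, head_dim: int):
    hidden = wqkv.shape[-1]
    q_per_kv = n_head // n_kv_head
    # group rows as (kv groups, heads per group, head_dim, hidden)
    w = wqkv.reshape(n_kv_head, q_per_kv + 2, head_dim, hidden)
    wq = w[:, :q_per_kv].reshape(n_head * head_dim, hidden)
    wk = w[:, q_per_kv].reshape(n_kv_head * head_dim, hidden)
    wv = w[:, q_per_kv + 1].reshape(n_kv_head * head_dim, hidden)
    return wq, wk, wv
```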