require -l in run.cpp #594
Conversation
Does it actually matter what it is? Or is it just the tokenizer anyway? Maybe don't call it model if it's really the tokenizer?
@mikekgfb it's the model in the sense that it's used to select the tokenizer class, prompt structure, and default vocab size, all of which differ between llama2 and llama3.
@larryliu0820 and I were discussing whether the tokenizer should get the vocab size from the tokenizer.model file, but that's not how the tokenizers work today.
After #626 lands, the tokenizer will get the vocab size from the *.model file. So we should be able to automatically deduce llama 2 vs. 3 from a tokenizer load failure, and clean up some of the edge cases in the runner.
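A rough sketch of what that deduction could look like, assuming the loader functions below throw on a format mismatch (these are hypothetical stand-ins, not the actual torchchat tokenizer API):

```cpp
// Hypothetical sketch: deduce the llama version from the tokenizer file instead
// of relying on a -l default. The loader functions are stand-ins, not the real
// torchchat tokenizer API; assume each throws if the file is not in its format
// and otherwise returns the vocab size it read.
#include <stdexcept>
#include <string>

int load_llama3_tokenizer(const std::string& path);  // tiktoken-style *.model
int load_llama2_tokenizer(const std::string& path);  // sentencepiece-style *.model

struct TokenizerInfo {
  int llama_version;
  int vocab_size;
};

TokenizerInfo detect_tokenizer(const std::string& path) {
  try {
    // Try the llama3 tokenizer first; a parse failure suggests a llama2 file.
    return {3, load_llama3_tokenizer(path)};
  } catch (const std::runtime_error&) {
    return {2, load_llama2_tokenizer(path)};
  }
}
```

If something like this works out, the runner could drop the -l default entirely and only keep the flag as an override.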
Agreed. Let's make sure changing this to remove the default doesn't break documented workflows. |
Thank you!
@metascroy do we still need to land this, or did you already land a version of this?
Dr. CI reported 5 new failures as of commit 84d8e5c with merge base a80ae4d. Test results: hud.pytorch.org/pr/pytorch/torchchat/594
Closing stale PR.
We pass the llama version to run.cpp with the -l arg. Today it has a default value of 2 (for llama2), but I don't think it should have a default value at all.
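For illustration, here is a minimal sketch of what requiring -l could look like in a run.cpp-style argument loop. The variable names and the error_usage() helper are hypothetical, not the actual torchchat code:

```cpp
// Hypothetical sketch: require -l instead of silently defaulting to llama2.
#include <cstdio>
#include <cstdlib>
#include <cstring>

static void error_usage() {
  fprintf(stderr, "Usage: run <model.pte> -l <llama_version: 2|3> [options]\n");
  exit(EXIT_FAILURE);
}

int main(int argc, char* argv[]) {
  int llama_version = 0;  // 0 = unset; no implicit default of 2
  // Assume argv[1] is the model path and the remaining args come in flag/value pairs.
  for (int i = 2; i + 1 < argc; i += 2) {
    if (strcmp(argv[i], "-l") == 0) {
      llama_version = atoi(argv[i + 1]);
    }
  }
  if (llama_version != 2 && llama_version != 3) {
    // Fail loudly instead of assuming llama2.
    error_usage();
  }
  printf("llama version: %d\n", llama_version);
  return 0;
}
```

With no implicit default, running a llama3 tokenizer without passing -l 3 fails at the usage check instead of silently tokenizing with the wrong vocab.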