Add Qwen pipeline and example #12292
Conversation
e8dde16 to 489d351 (Compare)
Others LGTM.
parser.add_argument(
    "--repo-id-or-model-path",
    type=str,
    default="Qwen/Qwen2-7B-Instruct",
If "Qwen/Qwen2.5-7B-Instruct" can directly run with current code, maybe use "Qwen/Qwen2.5-7B-Instruct" as default ?
We can also update this in next PR 😊
Sure, changed the default to 2.5 and renamed the file to qwen.py.
Will test Qwen2.5 soon.
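For reference, a minimal sketch of what the updated argument could look like with the new default; the parser setup and help wording here are illustrative and may not match the merged qwen.py exactly:

import argparse

# Sketch only: reflects the suggestion to make Qwen2.5 the default model.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--repo-id-or-model-path",
    type=str,
    default="Qwen/Qwen2.5-7B-Instruct",  # was "Qwen/Qwen2-7B-Instruct"
    help="The huggingface repo id for the Qwen model to be downloaded"
         ", or the path to the huggingface checkpoint folder",
)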
default="Qwen/Qwen2-7B-Instruct", | ||
help="The huggingface repo id for the Baichuan2 model to be downloaded" | ||
", or the path to the huggingface checkpoint folder", | ||
) |
Maybe add parser.add_argument("--lowbit-path", type=str, ...) here.
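A rough sketch of what such an argument might look like, added to the same parser as in the hunks above; the default and help text below are assumptions for illustration, not the wording used in the repository:

# Assumed sketch of the suggested --lowbit-path argument; the real help text
# and semantics in the repository may differ.
parser.add_argument(
    "--lowbit-path",
    type=str,
    default="",
    help="Path for saving/loading the converted low-bit model; "
         "leave empty to convert from the original checkpoint on each run",
)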
parser.add_argument("--n-predict", type=int, default=32, help="Max tokens to predict") | ||
parser.add_argument("--max-context-len", type=int, default=1024) | ||
parser.add_argument("--max-prompt-len", type=int, default=960) | ||
parser.add_argument("--disable-transpose-value-cache", action="store_true", default=False) |
It looks like Qwen2 GW is already ready? If so, maybe add parser.add_argument("--quantization_group_size", type=int, default=0) here?
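If group-wise quantization does get enabled for this example, the flag could be added to the same parser roughly as below; treating default=0 as "no grouping" is an assumption, not something confirmed in this PR:

# Sketch of the suggested flag; interpreting 0 as "no group-wise quantization"
# is an assumption, and how the value is consumed downstream (e.g. passed to
# the model-loading call) would follow the other NPU examples.
parser.add_argument("--quantization_group_size", type=int, default=0)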
Not tested yet, will test and update in a following PR.
Please also update the README; others LGTM.
README updated.
Merge it first. Will fix any issues in a following PR.