Update state dict and model together #573
Conversation
approved
weight_scale=scales.float(),
weight_zero_point=0,
These two have to either both be tensors or both be scalars; I think it should work if you do scales.float().item().
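A minimal sketch of the constraint being described, using PyTorch's generic quantize helpers rather than the exact op from this PR: per-tensor ops take a scalar scale and zero_point, per-channel ops take tensors for both, and `scales.float().item()` only applies when `scales` holds a single element (the follow-up below points out that it does not here). The tensor shapes are made up for illustration.

```python
import torch

x = torch.randn(4, 8)

# Per-tensor quantization: scale and zero_point are both Python scalars.
q_tensor = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.qint8)

# Per-channel quantization: scales and zero_points are both tensors with one
# entry per channel along `axis`.
scales = torch.full((4,), 0.1)
zero_points = torch.zeros(4, dtype=torch.int64)
q_channel = torch.quantize_per_channel(x, scales, zero_points, axis=0, dtype=torch.qint8)

# Note: scales.float().item() only works when `scales` has exactly one
# element; it raises a RuntimeError for a multi-element tensor.
```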
Wait... This is a vector of 32000 elements
Oh, but _qdq_dynamic_quantized_linear only supports per-tensor quantization, so you may want to call a different op in that function.
To align the sizes you could do: weight_zero_point = torch.zeros(scales.shape)
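A rough sketch of the suggested shape alignment, assuming a per-channel path is used instead of the per-tensor one: the zero_point becomes a zeros tensor with the same shape as the scales, so both arguments are tensors. The weight shape (32000 output rows) is just the number mentioned in this thread, and the generic `torch.quantize_per_channel` stands in for whichever per-channel op the rewritten function would actually call.

```python
import torch

# Illustrative shapes from the thread (hypothetical, not taken from the PR's code):
# one scale per output channel of a 32000-row weight.
weight = torch.randn(32000, 4096)
scales = torch.rand(32000).clamp(min=1e-4)

# Suggested fix: give the zero_point the same shape as the scales so both
# are tensors, instead of passing the scalar 0.
weight_zero_point = torch.zeros(scales.shape, dtype=torch.int64)

# With matching shapes, a per-channel op can consume both arguments.
qweight = torch.quantize_per_channel(
    weight, scales.float(), weight_zero_point, axis=0, dtype=torch.qint8
)
print(qweight.shape)  # torch.Size([32000, 4096])
```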
* code beautification
* code beautification, move functions together
* rewrite model rewriter
* rewrite quantizers
* weights is none check
* typo
* not weight -> weight is not None
* fix dimensions for parallel prefill
* test
* typo
* bfloat16 on ARM with MacOS 14
* precision for a8w4
* sdpa_kv
* fixes
* inline qlq definition
* trial and error
* qdq not working
* ci
* not so fast with bf16=fast
* typo, and handle fast across maxcos version...
* typo
* type cast