Support OpenVINO int8 static quantization #3025
@AlexKoff88, please take a look.
@tomaarsen, following up on our conversation on LinkedIn. We prepared an integration of quantization with OpenVINO. Can you please review it?
Thanks a bunch for this! I think this is looking quite solid already - I already wrote a few comments and I'm doing local tests now. I will be updating the Benchmark figures & Recommendations based on whatever my findings are.
I'm also getting this warning, can we do something about that?
Another time I got 2:
Also update the performance ratio lower bound from 94% to 99%
Indenting was off; "all-MiniLM-L6-v2" had to be updated to "sentence-transformers/all-MiniLM-L6-v2" in a few places; and updated recommendation
BTW, it should also work on Intel GPUs (e.g. integrated graphics), and it will be even faster if you make the input shape static.
Alright, thanks for sharing, I'll do some experiments!
I'm having issues with iGPU - not something we have to worry about now, it's not a dealbreaker for this PR.
If I run:
Thanks @tomaarsen, we will take a look on our end, but it could be a driver problem (or a missing driver). Did you install one? Please find details here: https://docs.openvino.ai/2024/get-started/configurations/configurations-intel-gpu.html cc'ed @vladimir-paramuzov, @sshlyapn
I tried this code on the iGPU of my Intel Core Ultra 7 165U laptop and it works. But I used the most recent version of OpenVINO:
Thanks for taking care of the comments @l-bat - I built on top of it to address some final nitpicks, such as:
I think this is about ready to be merged, what do you think @l-bat @AlexKoff88?
Thank you, @tomaarsen, for implementing the changes and addressing the remaining details. We agree this is ready for merging. Please let us know if there's anything specific you're still waiting on from our side to proceed.
I think we're all set! Ideally, I'd like to include some other PRs in the upcoming release, but I do intend to release this soon (1-2 weeks presumably). Thanks a bunch for leading this work, it should be very very valuable! I'll merge this once the tests go green again after my final nits.
Add Post-Training Static Quantization support for OpenVINO models
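For readers unfamiliar with the term, here is a minimal pure-Python sketch of what post-training static quantization does in principle: a calibration dataset fixes the int8 scale and zero point ahead of time, and those fixed parameters are then reused for every input at inference. This is an illustration only; it is not how OpenVINO or this PR implements quantization, and the helper names (`calibrate`, `quantize`, `dequantize`) are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

# Signed int8 range used for the quantized values.
QMIN, QMAX = -128, 127


def calibrate(batches: Iterable[Sequence[float]]) -> Tuple[float, int]:
    """Derive a fixed (scale, zero_point) pair from calibration data."""
    values = [v for batch in batches for v in batch]
    # Always include 0.0 in the range so that zero is exactly representable.
    lo = min(0.0, min(values))
    hi = max(0.0, max(values))
    scale = (hi - lo) / (QMAX - QMIN) or 1.0  # fall back to 1.0 for all-zero data
    zero_point = round(QMIN - lo / scale)
    zero_point = max(QMIN, min(QMAX, zero_point))
    return scale, zero_point


def quantize(xs: Sequence[float], scale: float, zero_point: int) -> List[int]:
    """Map floats to int8, saturating values outside the calibrated range."""
    return [max(QMIN, min(QMAX, round(x / scale) + zero_point)) for x in xs]


def dequantize(qs: Sequence[int], scale: float, zero_point: int) -> List[float]:
    """Map int8 values back to approximate floats."""
    return [(q - zero_point) * scale for q in qs]


# "Static": the parameters are computed once from calibration data...
calibration_batches = [[-1.0, 0.5], [0.25, 1.0]]
scale, zero_point = calibrate(calibration_batches)

# ...and reused unchanged for new inputs.
inputs = [-1.0, -0.3, 0.0, 0.7, 1.0]
restored = dequantize(quantize(inputs, scale, zero_point), scale, zero_point)
```

Because the parameters are frozen after calibration, values outside the calibrated range saturate at the int8 limits, which is why the choice of calibration data affects accuracy.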
Usage examples:
To quantize a Hugging Face Hub model:
To quantize a local model: