Expose llvm.nearbyint intrinsic. This is a faster alternate to rounding. #4001
@tqchen, sorry have to tag you again on this PR as it seems that you may be the right person. Please feel free to suggest anyone else, but this is a quite small PR.
Please add a python wrapper, a reference to the docs in the https://github.com/dmlc/tvm/tree/master/docs/api/python, and a testcase
Thanks @tqchen, I have added a test case and a Python binding. Thanks for your feedback.
@kimishpatel Thanks for the PR. You can also look at qnn.op.quantize. This is a wrapper that internally lowers to the sequence of relay ops you mentioned. This PR will benefit that wrapper as well.
@anijain2305, thanks for the pointer. Wasn't quite aware of that.
If you are looking at running pre-quantized models, you might want to have a look at the QNN dialect. We have added a number of QNN ops in there that deal with scale and zero points.
@tqchen, sorry to bug again :). Seems like all checks have passed, so just nudging for the merge. Thanks a bunch.
Thanks @anijain2305 @kimishpatel
…ng. (apache#4001) * Expose llvm.nearbyint intrinsic. This is a faster alternate to rounding. Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: * Added python binding. Added test. Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
In the quantized gemm implementation we are working on, we need to quantize input data. During this step we first apply scale and zero point to the input data. Then we do rounding and casting to int8.
`tvm::round` gets lowered by LLVM into a `roundf` function call, which makes the op slower. I instead exposed `llvm.nearbyint` via TVM and was able to recover the lost performance. So this PR is just upstreaming that change.
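The quantization step described above (apply scale and zero point, round, cast to int8) can be sketched in plain Python to show where the two rounding functions differ. This is only an illustration, not the TVM code from this PR: the helper names `round_half_away`, `nearbyint_default`, and `quantize` are hypothetical, and the formula `x / scale + zero_point` is an assumed convention. `roundf` rounds halfway cases away from zero, while `nearbyint` follows the current floating-point rounding mode, which by default is round-to-nearest-even, so the two only disagree on exact `.5` ties.

```python
import math


def round_half_away(x: float) -> int:
    """Mimic C roundf: halfway cases are rounded away from zero."""
    a = abs(x)
    f = math.floor(a)
    r = f + 1 if a - f >= 0.5 else f
    return -r if x < 0 else r


def nearbyint_default(x: float) -> int:
    """Mimic nearbyint under the default rounding mode (ties to even).

    Python's built-in round() also rounds ties to even.
    """
    return round(x)


def quantize(values, scale, zero_point, rounder):
    """Hypothetical int8 quantization: scale, shift, round, then clamp.

    The formula x / scale + zero_point is an assumption made for this sketch.
    """
    out = []
    for v in values:
        q = rounder(v / scale + zero_point)
        out.append(max(-128, min(127, q)))  # saturate to the int8 range
    return out


ties = [0.5, 1.5, 2.5, -0.5, -2.5]
away = quantize(ties, 1.0, 0, round_half_away)
even = quantize(ties, 1.0, 0, nearbyint_default)
```

For inputs that are not exact ties the two rounders agree, so switching from `round` to `nearbyint` changes results only on halfway values. Whether that difference matters depends on the model and is worth checking against reference outputs.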