-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[QNN] Use Int16 upcast in Fallback Conv2D. Fix test names. #4329
Conversation
67a0fd1
to
6dffc52
Compare
@FrozenGene @jackwish Can you please review this? |
6dffc52
to
c478677
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
const Expr& zp_kernel, const QnnConv2DAttrs* param) { | ||
auto shifted_data = data; | ||
Expr Conv2DFallBack(const Expr& data, const Expr& weight, const QnnConv2DAttrs* param) { | ||
// Upcast the zero point to Int16. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty straightforward :)
@zhiics Can you please review and merge? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @anijain2305 @FrozenGene @jackwish |
* [TOPI][OP] Support Faster-RCNN Proposal OP on CPU (apache#4297) * Support Proposal operator on CPU. * PyLint space issue * PyLint space issue * Pylint singleton-comparison issue * [QNN][Legalize] Specialize for Platforms without any fast Int8 arithmetic units. (apache#4307) * fix error when memory_id is VTA_MEM_ID_OUT (apache#4330) * [CI][DOCKER] Add ONNX runtime dep (apache#4314) * [DOCKER] Add ONNX runtime dep * Improve ci script * [QNN] Quantize - Fixing the sequence of lowering. (apache#4316) * [QNN] Use Int16 upcast in Fallback Conv2D. Fix test names. (apache#4329) * [doc][fix] fix sphinx parsing for pass infra tutorial (apache#4337) * change ci image version (apache#4313) * [Codegen] remove fp16 function override for cuda (apache#4331) * add volatile override back * [codegen] remove fp16 function override for cuda * [CI] Set workspace to be per executor (apache#4336) * [Build][Windows] Fix Windows build by including cctype (apache#4319) * Fix build * dummy change to retrigger CI * dummy change to retrigger ci * dummy change to retrigger ci * Enable hipModuleGetGlobal() (apache#4321) * [Relay][Pass] Add pass to remove unused functions in relay module (apache#4334) * [Relay][Pass] Add pass to remove unused functions in relay module * Add tests * Fix lint * Fix visit order * Add pass argument * Fix * Add support for quant. mul operator in tflite frontend (apache#4283) A test for qnn_mul has to be added when the qnn elemwise tests (apache#4282) get merged. * Add topi.nn.fifo_buffer to TVM doc (apache#4343) * Solve custom model of prelu (apache#4326) * Deprecate NNVM warning msg (apache#4333) * [Contrib] Add MKL DNN option (apache#4323) * [Contrib] Add MKL DNN * update * update * [Relay][Frontend][TF] Fix transpose when axes is not a param (apache#4327) * [Relay][Frontend][TF] Use _infer_value_simulated when axes is not a const to Transpose * uncomment tests * dummy change to retrigger ci * [RUNTIME] Add device query for AMD GcnArch (apache#4341) * add gcnArch query * kGcnArch query for cuda is a no-op * [Test][Relay][Pass] Add test case for lambda lift (apache#4317) * [Relay][Frontend][ONNX] operator support: DepthToSpace, SpaceToDepth (apache#4271) * imp module is deprecated (apache#4275) * [VTA] Bug fix for padded load with large inputs (apache#4293) * bug fix for padded load with large inputs * Update TensorLoad.scala * Update test_vta_insn.py * fix inconsistent tag name (apache#4134) * [CodeGen] Add build config option disable_assert to control whether to generate assert (apache#4340) * Bump up CUDA log version in tophub.py (apache#4347) * Add check to ensure input file was successfully opened in NNVM deploy code demo (apache#4315) * [COMMUNITY] Add DISCLAIMER, KEYS for ASF release (apache#4345) * [COMMUNITY] Add DISCLAIMER, KEYS for ASF release * Add file name spec * [Relay][VM][Interpreter] Enable first-class constructors in VM and interpreter via eta expansion (apache#4218) * Fix constructor pretty printing * Make Module::HasDef name consistent with API * Add VM constructor compilation via eta expansion * Lint * Fix CI * Fix failing test * Address comment * Retrigger CI * Retrigger CI * Update dmlc_tvm_commit_id.txt
Before this PR, QNN conv2D Fallback lowering will upcast the data and weight to Int32 and then subtract zero points. However, zero points are always (u)int8. So, we can safely upcast to just int16 and subtract zero points.
Other fix is for the test names, so that they can be picked up by pytest. This was a silly mistake on my side. Adding the fix here.