[WIP][QNN] Quantized fully connected #3597

Closed
wants to merge 56 commits
Commits (56)
8d9e317
[Relay] [Quantization] WIP - Common files for the qauntization work.
Jul 8, 2019
5485b58
[Relay] [Quantization] WIP - Prototyping requantize op.
Jul 8, 2019
877d834
Requantize operator implementation.
anijain2305 Jul 10, 2019
705b796
Typo and lint fixes.
anijain2305 Jul 10, 2019
6cd1328
Lint fix.
anijain2305 Jul 10, 2019
ac4349b
Doc fix.
anijain2305 Jul 10, 2019
a9fef75
Uncommenting the lint script (fixing mistake).
anijain2305 Jul 10, 2019
d9eff68
Modifying the unit tests.
anijain2305 Jul 10, 2019
abc7c4e
Moving C++ files into src/relay/qnn
anijain2305 Jul 11, 2019
275ddd0
Moving python files to python/tvm/relay/qnn. Some minor fixes.
anijain2305 Jul 11, 2019
a0ad8ca
Moving the attrs.h inside the include directory.
anijain2305 Jul 11, 2019
ff8936c
Pushing files that I forgot earlier. Changing util location.
anijain2305 Jul 11, 2019
bdca4c6
[Relay] [Quantization] WIP - Common files for the qauntization work.
Jul 8, 2019
755f934
[Relay] [Quantization] WIP - Prototyping requantize op.
Jul 8, 2019
dba71f0
Requantize operator implementation.
anijain2305 Jul 10, 2019
6016b2a
Typo and lint fixes.
anijain2305 Jul 10, 2019
d54cea8
Lint fix.
anijain2305 Jul 10, 2019
ca954e0
Doc fix.
anijain2305 Jul 10, 2019
db24f1e
Uncommenting the lint script (fixing mistake).
anijain2305 Jul 10, 2019
523e16a
Modifying the unit tests.
anijain2305 Jul 10, 2019
18bff76
Moving C++ files into src/relay/qnn
anijain2305 Jul 11, 2019
32b69df
Moving python files to python/tvm/relay/qnn. Some minor fixes.
anijain2305 Jul 11, 2019
21168ae
Moving the attrs.h inside the include directory.
anijain2305 Jul 11, 2019
4a4beec
Pushing files that I forgot earlier. Changing util location.
anijain2305 Jul 11, 2019
120c050
Incorporating comments. API change. Lint fixes.
anijain2305 Jul 15, 2019
989bbea
Modifying the GetFixedPointMultiplierShift API as per comments.
anijain2305 Jul 15, 2019
8df0ddb
Forgot the dialect change.
anijain2305 Jul 15, 2019
8d0af86
Retriggering Jenkins.
anijain2305 Jul 15, 2019
ff1b9e3
Changing rewrite to qnn_lower.
anijain2305 Jul 15, 2019
362869f
Renaming Quantize to Qnn for clarity.
anijain2305 Jul 15, 2019
36f0ed9
Remove use_int_domain.
anijain2305 Jul 17, 2019
b45c629
Working quantized fully-connected with int8 and uint8
Jul 17, 2019
419dee0
Merge branch 'requantize' into qfullyconnected
Jul 17, 2019
4958495
Incorportaing review comments.
anijain2305 Jul 19, 2019
f858a83
Adding API doc for QNN dialect.
anijain2305 Jul 19, 2019
823cc94
Move the qnn_lower pass to transform namespace.
anijain2305 Jul 19, 2019
28a9587
Moving from expr to module. Adding namespace in C++.
anijain2305 Jul 19, 2019
76476dc
Working test case for int/uint with bias_add
Jul 19, 2019
732d6ce
Minor sentence rewrites. Added qnn namespace.
anijain2305 Jul 19, 2019
fadc573
Added the API doc.
anijain2305 Jul 19, 2019
956d3de
Chanding default out_dtype to int8. Adding a test with in/out_dtype a…
anijain2305 Jul 19, 2019
7a63597
merge from upstream/requantize
Jul 19, 2019
3ffdbf8
Merge branch 'requantize' into qfullyconnected
Jul 19, 2019
d700945
Style fixes. Better error messages.
anijain2305 Jul 19, 2019
21963dc
Removing extra code.
Jul 22, 2019
29c9e06
Merge branch 'requantize' into qfullyconnected
Jul 22, 2019
d0fdd1c
Adding documentation.
anijain2305 Jul 22, 2019
33cc075
More documentation fixes.
anijain2305 Jul 22, 2019
bb38855
Adding out dtype check for requantize.
anijain2305 Jul 22, 2019
7aac28d
Adding corner case for FP32 to fixed point conversion.
anijain2305 Jul 22, 2019
635b053
Adding extra line.
anijain2305 Jul 22, 2019
222e189
Documentation fix.
anijain2305 Jul 22, 2019
6c833d5
quantized fully connected working with requantize.
Jul 22, 2019
a115c96
Adding static inline.
anijain2305 Jul 23, 2019
572a8f3
Merge branch 'master' into requantize
Jul 24, 2019
dd213b6
Merge branch 'requantize' into qfullyconnected
Jul 24, 2019
15 changes: 15 additions & 0 deletions docs/langref/relay_op.rst
@@ -198,6 +198,16 @@ This level supports backpropagation of broadcast operators. It is temporary.
tvm.relay.contrib.adaptive_avg_pool2d


**Level 11: QNN Dialect Operators**

This level supports quantized operators present in the QNN dialect.

.. autosummary::
:nosignatures:

tvm.relay.qnn.op.requantize


Level 1 Definitions
-------------------
.. autofunction:: tvm.relay.log
@@ -332,3 +342,8 @@ Level 10 Definitions
.. autofunction:: tvm.relay.nn.batch_matmul
.. autofunction:: tvm.relay.contrib.adaptive_max_pool2d
.. autofunction:: tvm.relay.contrib.adaptive_avg_pool2d


Level 11 Definitions
--------------------
.. autofunction:: tvm.relay.qnn.op.requantize
97 changes: 97 additions & 0 deletions include/tvm/relay/qnn/attrs.h
@@ -0,0 +1,97 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

/*!
* \file tvm/relay/qnn/attrs.h
* \brief Auxiliary attributes for qnn operators.
*/

#ifndef TVM_RELAY_QNN_ATTRS_H_
#define TVM_RELAY_QNN_ATTRS_H_

#include <tvm/attrs.h>
#include <tvm/relay/base.h>
#include <string>

namespace tvm {
namespace relay {
namespace qnn {

/*! \brief Attribute for requantize operator */
struct RequantizeAttrs : public tvm::AttrsNode<RequantizeAttrs> {
double input_scale;
int32_t input_zero_point;
double output_scale;
int32_t output_zero_point;
std::string rounding;
DataType out_dtype;

TVM_DECLARE_ATTRS(RequantizeAttrs, "relay.attrs.RequantizeAttrs") {
TVM_ATTR_FIELD(input_scale)
.describe("The scale of the input tensor.");
TVM_ATTR_FIELD(input_zero_point)
.describe("The zero point of the input tensor.");
TVM_ATTR_FIELD(output_scale)
.describe("The scale of the output tensor.");
TVM_ATTR_FIELD(output_zero_point)
.describe("The zero point of the output tensor.");
TVM_ATTR_FIELD(rounding).set_default("AWAY_FROM_ZERO")
.describe("Defines the rounding direction when the value is midway between"
"two representable values. There are two supported modes - UPWARD"
"or AWAY_FROM_ZERO. Both modes behave exactly same except at the"
"midpoints between the two representable values. At the midpoint,"
"UPWARD rounds towards positive infinity (for example -1.5 will be"
"rounded to -1). AWAY_FROM_ZERO is the standard rounding where the"
"value is rounded away from zero at midpoints (for example, -1.5"
"rounds to -2). More context can be found at following gblic manual"
"https://www.gnu.org/software/libc/manual/html_node/Rounding.html."
"FE_UPWARD corresponds to UPWARD here and FE_TONEAREST corresponds"
"to AWAY_FROM_ZERO rounding mode.");
TVM_ATTR_FIELD(out_dtype)
.set_default(NullValue<DataType>())
.describe("Output data type, set to explicit type under mixed precision setting");
}
};

/*! \brief Attributes for quantized dense operator */
struct QDenseAttrs : public tvm::AttrsNode<QDenseAttrs> {
IndexExpr units;
DataType out_dtype;
// Quantization related attributes.
int32_t input_zero_point;
int32_t kernel_zero_point;

TVM_DECLARE_ATTRS(QDenseAttrs, "relay.attrs.QDenseAttrs") {
TVM_ATTR_FIELD(units)
.describe("Number of hidden units of the dense transformation.");

TVM_ATTR_FIELD(out_dtype)
.describe("Output data type, set to explicit type under mixed precision setting");

TVM_ATTR_FIELD(input_zero_point)
.describe("The zero point of the input tensor.");
TVM_ATTR_FIELD(kernel_zero_point)
.describe("The zero point of the kernel tensor.");
}
};

} // namespace qnn
} // namespace relay
} // namespace tvm
#endif  // TVM_RELAY_QNN_ATTRS_H_
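
An editorial aside on the `rounding` attribute above: the two modes differ only at exact midpoints between representable values. A minimal plain-Python sketch of the documented behaviour (this is not the TVM implementation, only an illustration):

```python
# Illustration of the two rounding modes described in RequantizeAttrs.
# UPWARD: midpoints go toward +infinity; AWAY_FROM_ZERO: midpoints move away from zero.
import math

def round_upward(x):
    # Round half up: -1.5 -> -1, 1.5 -> 2.
    return math.floor(x + 0.5)

def round_away_from_zero(x):
    # Round half away from zero: -1.5 -> -2, 1.5 -> 2.
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

for v in (-2.5, -1.5, -0.5, 0.5, 1.5, 2.5):
    print(v, round_upward(v), round_away_from_zero(v))
```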
3 changes: 3 additions & 0 deletions python/tvm/relay/__init__.py
@@ -51,6 +51,9 @@
from . import backend
from . import quantize

# Dialects
from . import qnn

from .scope_builder import ScopeBuilder

# Span
21 changes: 21 additions & 0 deletions python/tvm/relay/qnn/__init__.py
@@ -0,0 +1,21 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# pylint: disable=wildcard-import
"""QNN dialect operators and IR passes."""
from __future__ import absolute_import as _abs
from . import op
from . import transform
22 changes: 22 additions & 0 deletions python/tvm/relay/qnn/_transform.py
@@ -0,0 +1,22 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#pylint: disable=unused-argument
"""Internal module for quantization."""
from __future__ import absolute_import
from tvm._ffi.function import _init_api

_init_api("relay.qnn._transform", __name__)
20 changes: 20 additions & 0 deletions python/tvm/relay/qnn/op/__init__.py
@@ -0,0 +1,20 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# pylint: disable=wildcard-import
"""Neural network related operators."""
from __future__ import absolute_import as _abs
from .qnn import *
20 changes: 20 additions & 0 deletions python/tvm/relay/qnn/op/_make.py
@@ -0,0 +1,20 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Constructor APIs"""
from ...._ffi.function import _init_api

_init_api("relay.qnn.op._make", __name__)
104 changes: 104 additions & 0 deletions python/tvm/relay/qnn/op/qnn.py
@@ -0,0 +1,104 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#pylint: disable=invalid-name
"""QNN dialect operators."""

from __future__ import absolute_import as _abs
from . import _make

def requantize(data,
input_scale,
input_zero_point,
output_scale,
output_zero_point,
rounding="AWAY_FROM_ZERO",
out_dtype="int8"):
r"""Requantized operator.

The requantize operator converts one quantized tensor representation to
another quantized tensor representation. For the output tensor, we are
provided with output scale and zero point. The computation is as follows

Q_output = zp_output + (scale_input)/(scale_output) * (Q_input - zp_input)


Parameters
----------
data : tvm.relay.Expr
The input data to the operator.

input_scale: float
The quantization scale for the input tensor.

input_zero_point: int
The zero point of the input tensor.

output_scale: float
The quantization scale for the output tensor.

output_zero_point: int
The zero point of the output tensor.

rounding : string, optional
Defines the rounding direction when the value is midway between two
representable values.

out_dtype : str, optional
Specifies the output data type.

Returns
-------
result : tvm.relay.Expr
The computed result.
"""

return _make.requantize(data,
input_scale,
input_zero_point,
output_scale,
output_zero_point,
rounding,
out_dtype)
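
To make the formula above concrete, a small worked example in plain Python follows; the values are made up for illustration, and the real operator additionally applies the chosen rounding mode and clamps to the range of `out_dtype`:

```python
# Worked instance of Q_output = zp_output + (scale_input / scale_output) * (Q_input - zp_input).
q_input, zp_input, scale_input = 131, 127, 0.5
zp_output, scale_output = 0, 0.25

real_value = scale_input * (q_input - zp_input)                      # 0.5 * 4 = 2.0
q_output = zp_output + (scale_input / scale_output) * (q_input - zp_input)
assert q_output == real_value / scale_output + zp_output             # both give 8.0
print(int(q_output))                                                 # 8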

def quantized_dense(data, weight, input_zero_point, kernel_zero_point, units=None, out_dtype="int32"):
"""Dense operator.
Applies a linear transformation

.. math::

`Y = X * W`

Parameters
----------
data : tvm.relay.Expr
The quantized input data to the operator.

weight : tvm.relay.Expr
The quantized weight expressions.

input_zero_point : int
The zero point of the input tensor.

kernel_zero_point : int
The zero point of the kernel (weight) tensor.

units : int, optional
Number of hidden units of the dense transformation.

out_dtype : str, optional
Specifies the output data type for the mixed-precision dense; can be int32 or int16.

Returns
-------
result : tvm.relay.Expr
The computed result.
"""
return _make.dense(data, weight, units, input_zero_point, kernel_zero_point, out_dtype)
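
For reference, a minimal sketch of how the two wrappers in this file could be composed, using the API names as added in this PR; shapes, scales, and zero points are made up, and later TVM releases renamed some of these entry points, so treat this as illustrative only:

```python
# Illustrative-only sketch against the Python wrappers added in this PR.
import tvm
from tvm import relay

data = relay.var("data", shape=(2, 5), dtype="uint8")
weight = relay.var("weight", shape=(3, 5), dtype="uint8")

# Quantized dense accumulates in int32 ...
dense = relay.qnn.op.quantized_dense(data, weight,
                                     input_zero_point=127,
                                     kernel_zero_point=127,
                                     units=3,
                                     out_dtype="int32")

# ... and requantize brings the result back to an int8 representation.
out = relay.qnn.op.requantize(dense,
                              input_scale=0.25,
                              input_zero_point=0,
                              output_scale=0.5,
                              output_zero_point=-1,
                              out_dtype="int8")

func = relay.Function([data, weight], out)
```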
33 changes: 33 additions & 0 deletions python/tvm/relay/qnn/transform.py
@@ -0,0 +1,33 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# pylint: disable=invalid-name

"""QNN Dialect transformation passes."""
from __future__ import absolute_import

from . import _transform

def QnnLower():
"""
Rewrites the high-level quantized ops into low-level existing Relay ops.

Returns
-------
pass : tvm.relay.transform.Pass
The registered pass that lowers QNN ops to existing Relay ops.
"""
return _transform.QnnLower()
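
A hedged usage sketch for the lowering pass, assuming `func` is a relay.Function containing qnn ops (for example the one built in the earlier sketch); `relay.Module` and the pass-invocation style follow this PR's era of the API and may differ in other TVM versions:

```python
# Hypothetical lowering step: wrap the function in a module and run QnnLower
# so the qnn.* ops are rewritten into existing Relay ops.
mod = relay.Module.from_expr(func)
mod = relay.qnn.transform.QnnLower()(mod)
print(mod)
```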