This pull request introduces some additional rounding modes and provides a table that more accurately describes their behavior. Concretely, a table describing the rounding modes has been added to docs/qonnx-custom-ops/quant_op.md.
The newly introduced rounding modes are: UP, DOWN, HALF_UP, and HALF_DOWN. These rounding modes were inspired by the rounding modes in the Java math library (https://docs.oracle.com/javase/8/docs/api/java/math/RoundingMode.html) and by the implementation in the Chisel dsptools library (https://github.com/ucb-bar/dsptools/blob/master/src/main/scala/dsptools/numbers/chisel_types/FixedPointTypeClass.scala#L156).
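In numpy terms, and following the Java RoundingMode semantics linked above, the four modes behave roughly as sketched below (an illustrative sketch, not necessarily the exact code added to `resolve_rounding_mode`):

```python
import numpy as np

def round_up(x):
    # UP: round away from zero, e.g. 2.1 -> 3, -2.1 -> -3
    return np.sign(x) * np.ceil(np.abs(x))

def round_down(x):
    # DOWN: round towards zero (truncate), e.g. 2.9 -> 2, -2.9 -> -2
    return np.sign(x) * np.floor(np.abs(x))

def round_half_up(x):
    # HALF_UP: round to nearest, ties away from zero, e.g. 2.5 -> 3, -2.5 -> -3
    return np.sign(x) * np.floor(np.abs(x) + 0.5)

def round_half_down(x):
    # HALF_DOWN: round to nearest, ties towards zero, e.g. 2.5 -> 2, -2.5 -> -2
    return np.sign(x) * np.ceil(np.abs(x) - 0.5)
```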
This change partially resolves the incompatibility between a high-level Python implementation and a circuit implementation. For instance, consider a test function for QKeras (v0.9.0) that quantizes tensors with alpha="auto_po2" and asserts the expected quantized values. Such a function fails on its second assert, even though the scaling factors printed in its finally block are 1, [1,1,1] and [1,1,1]. The reason is that when using "auto_po2" the rounding mode is actually "round half up". This can be seen at:
https://github.com/google/qkeras/blob/67e7c6b8cbd6befd594f142187ac4b73b35512ac/qkeras/quantizers.py#L570C45-L570C46
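The discrepancy shows up exactly at the tie points, because numpy's default `np.round` rounds half to even rather than half up. A small check (assuming plain numpy, and positive values so that "half up" and "half away from zero" coincide):

```python
import numpy as np

x = np.array([0.5, 1.5, 2.5, 3.5])
print(np.round(x))        # [0. 2. 2. 4.]  round half to even (numpy default)
print(np.floor(x + 0.5))  # [1. 2. 3. 4.]  round half up, as auto_po2
                          # effectively does per the line linked above
```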
This pull request does the following:

- Adds the rounding modes UP, DOWN, HALF_UP and HALF_DOWN to the resolve_rounding_mode function in src/qonnx/custom_op/general/quant.py.
- Adds the table describing them to docs/qonnx-custom-ops/quant_op.md.

This pull request does NOT do the following:

- Update the converters to make use of the new rounding modes.
I refrained from updating the converters because, firstly, I don't know the code base very well, and secondly, the tests seem to be written with assert_allclose, i.e. they only check approximate compatibility. Issues with rounding modes can be quite subtle, so they would be hard to catch with approximate checks.
I have had success making a bit-accurate conversion between QKeras and circuits in chisel4ml after I introduced precise rounding modes. However, this only holds when all tensors have a known quantization and the scaling factors are powers of two. Looking at the qonnx code base, I have a hard time seeing how the input quantization is specified. In chisel4ml, for instance, this is done directly, as shown below.
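As a hypothetical stand-in for the chisel4ml code (expressed here in QKeras terms rather than chisel4ml's own API), fixing the model inputs to a known quantization could look like this:

```python
from tensorflow.keras.layers import Input
from qkeras import QActivation, quantized_bits

# Hypothetical illustration (not the actual chisel4ml snippet): the model
# inputs are explicitly pinned to a signed 4-bit integer quantization,
# i.e. 4 bits with 3 integer bits plus a sign bit and no fractional bits.
inputs = Input(shape=(3,))
x = QActivation(quantized_bits(bits=4, integer=3, keep_negative=True))(inputs)
```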
This means that the inputs must be quantized to a signed 4-bit integer. I realize that qonnx targets a larger set of neural network descriptions; however, I believe it would be useful to make a distinction for this kind of network (https://arxiv.org/abs/2011.10680 calls them dyadic neural networks), as:
I have only empirically shown bit-level accuracy; however, considering the way floating-point numbers are specified (with a power-of-two exponent), the equivalence should hold as long as the mantissa/fraction field is not too big. And if it does get too big, one can move to 64-bit floating-point numbers, for example.
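This intuition is easy to check: multiplying by a power of two only shifts the floating-point exponent and is therefore exact, while an arbitrary scale is not. A minimal self-contained check, assuming float32 tensors and a signed 4-bit integer grid:

```python
import numpy as np

ints = np.arange(-8, 8, dtype=np.float32)  # signed 4-bit integer grid

# Power-of-two scale: only the exponent changes, the round-trip is exact.
po2_scale = np.float32(2.0 ** -3)
assert np.all((ints * po2_scale) / po2_scale == ints)

# Arbitrary decimal scale: 0.1 is not exactly representable, so the
# round-trip may already differ in the last bit.
dec_scale = np.float32(0.1)
print(np.all((ints * dec_scale) / dec_scale == ints))  # not guaranteed True
```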