
Add additional rounding modes #110

Closed

Conversation

jurevreca12

This pull request introduces additional rounding modes and provides a table that more accurately describes their behavior. Concretely, the following table has been added to docs/qonnx-custom-ops/quant_op.md:

| Number \ ROUNDING_MODE | ROUND=HALF_EVEN | CEIL | FLOOR | UP | DOWN | HALF_UP | HALF_DOWN |
|------------------------|-----------------|------|-------|----|------|---------|-----------|
| 5.5                    | 6               | 6    | 5     | 6  | 5    | 6       | 5         |
| 2.5                    | 2               | 3    | 2     | 3  | 2    | 3       | 2         |
| 1.6                    | 2               | 2    | 1     | 2  | 1    | 2       | 2         |
| 1.1                    | 1               | 2    | 1     | 2  | 1    | 1       | 1         |
| 1.0                    | 1               | 1    | 1     | 1  | 1    | 1       | 1         |
| -1.0                   | -1              | -1   | -1    | -1 | -1   | -1      | -1        |
| -1.1                   | -1              | -1   | -2    | -2 | -1   | -1      | -1        |
| -1.6                   | -2              | -1   | -2    | -2 | -1   | -2      | -2        |
| -2.5                   | -2              | -2   | -3    | -3 | -2   | -3      | -2        |
| -5.5                   | -6              | -5   | -6    | -6 | -5   | -6      | -5        |

The newly introduced rounding modes are: UP, DOWN, HALF_UP, and HALF_DOWN. These modes were inspired by the rounding modes in the Java math library (https://docs.oracle.com/javase/8/docs/api/java/math/RoundingMode.html) and by the implementation in the Chisel dsptools library (https://github.com/ucb-bar/dsptools/blob/master/src/main/scala/dsptools/numbers/chisel_types/FixedPointTypeClass.scala#L156).
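To make the semantics of the new modes concrete, here is a minimal NumPy sketch of how they can be expressed. The actual implementation lives in the resolve_rounding_mode function in src/qonnx/custom_op/general/quant.py; the function names below are illustrative, not the PR's API:

import numpy as np

# UP rounds away from zero; DOWN rounds toward zero (truncation);
# HALF_UP breaks ties away from zero; HALF_DOWN breaks ties toward zero.
def round_up(x):
    return np.sign(x) * np.ceil(np.abs(x))

def round_down(x):
    return np.sign(x) * np.floor(np.abs(x))

def round_half_up(x):
    return np.sign(x) * np.floor(np.abs(x) + 0.5)

def round_half_down(x):
    return np.sign(x) * np.ceil(np.abs(x) - 0.5)

x = np.array([5.5, 2.5, 1.6, 1.1, 1.0, -1.0, -1.1, -1.6, -2.5, -5.5])
print(round_half_up(x))  # [ 6.  3.  2.  1.  1. -1. -1. -2. -3. -6.] (matches the HALF_UP column)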

This pull request partially resolves the incompatibility between high-level Python implementations and circuit implementations. For instance, consider the following test function for QKeras (v0.9.0):

import numpy as np
import qkeras

def test_quantized_bits_rounding_mode():
    alpha1 = qkeras.quantized_bits(bits=3, integer=2, keep_negative=True, alpha=1)
    alpha111 = qkeras.quantized_bits(bits=3, integer=2, keep_negative=True, alpha=[1, 1, 1])
    alpha_po2 = qkeras.quantized_bits(bits=3, integer=2, keep_negative=True, alpha='auto_po2')
    try:
        assert np.array_equal(alpha1(np.array([2.5, 2.5, 3.5])), alpha111(np.array([2.5, 2.5, 3.5])))
        assert np.array_equal(alpha1(np.array([2.5, 2.5, 3.5])), alpha_po2(np.array([2.5, 2.5, 3.5])))
    finally:
        print(alpha1.scale)
        print(alpha111.scale)
        print(alpha_po2.scale)

The function above fails on the second assert, even though the scaling factors printed in the finally block are 1, [1, 1, 1] and [1, 1, 1]. The reason is that when using "auto_po2", the rounding mode is effectively "round half up". This can be seen at:
https://github.com/google/qkeras/blob/67e7c6b8cbd6befd594f142187ac4b73b35512ac/qkeras/quantizers.py#L570C45-L570C46

v = tf.floor(tf.abs(x) / scale + 0.5)
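A quick NumPy check (with scale = 1 for simplicity, and the sign factored back in, both assumed here purely for illustration) confirms that this expression rounds ties away from zero, unlike NumPy's default half-even rounding:

import numpy as np

x = np.array([2.5, 3.5, -2.5])
scale = 1.0
# floor(|x|/scale + 0.5), with the sign reapplied, is round-half-up.
v = np.sign(x) * np.floor(np.abs(x) / scale + 0.5)
print(v)            # [ 3.  4. -3.]  (HALF_UP)
print(np.round(x))  # [ 2.  4. -2.]  (HALF_EVEN, for comparison)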

This pull request does the following:

  • Adds the rounding modes to the spec.
  • Adds an implementation of the rounding modes to the resolve_rounding_mode function in src/qonnx/custom_op/general/quant.py (sketched above).
  • Adds a simple test of the rounding modes in tests/custom_op/test_rounding_mode.py.

The request does NOT do the following:

  • It does not fix the QKeras/Brevitas converters.

I refrained from updating the converters because, first, I don't know that part of the code base very well, and second, the tests appear to be written with assert_allclose, i.e. they check only approximate compatibility. Issues with rounding modes can be quite subtle, so they would be hard to catch with approximate checks.

I have had success making a bit-accurate conversion between QKeras and circuits in chisel4ml after introducing precise rounding modes. However, this only holds when all tensors have a known quantization and the scaling factors are powers of two. Looking at the qonnx code base, I have a hard time seeing how the input quantization is specified. In chisel4ml, for instance, this is done directly, as shown below:

import numpy as np
import tensorflow as tf
import qkeras

x = x_in = tf.keras.layers.Input(shape=3)
x = qkeras.QActivation(
    qkeras.quantized_bits(bits=4, integer=3, keep_negative=True)
)(x)
x = qkeras.QDense(
    4,
    kernel_quantizer=qkeras.quantized_bits(
        bits=4, integer=3, keep_negative=True, alpha=np.array([0.5, 0.25, 1, 0.25])
    ),
)(x)
x = qkeras.QActivation(qkeras.quantized_relu(bits=3, integer=3))(x)
x = qkeras.QDense(
    1,
    kernel_quantizer=qkeras.quantized_bits(
        bits=4, integer=3, keep_negative=True, alpha=np.array([0.125])
    ),
)(x)
x = qkeras.QActivation(qkeras.quantized_relu(bits=3, integer=3))(x)
model = tf.keras.Model(inputs=[x_in], outputs=[x])

This means that the inputs must be quantized to a signed 4-bit integer. I realize that qonnx targets a broader class of neural network descriptions; however, I believe it would be useful to make a distinction for this kind of network (https://arxiv.org/abs/2011.10680 calls them dyadic neural networks), as:

  1. they are highly efficient to implement in hardware, and
  2. I believe they can be "simulated" with bit-level accuracy using floating-point operations.

I have only shown bit-level accuracy empirically; however, considering the way floating point is specified (with a power-of-two exponent), the equivalence should hold as long as the mantissa/fraction field is not too big. And if it does get too big, you can move to 64-bit floating-point numbers, for example.
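A small experiment (not part of this PR, just illustrating the claim) shows why power-of-two scales are benign: multiplying or dividing by 2^k only shifts a float's exponent, so the round trip is bit-exact as long as the integer values fit in the 52-bit mantissa of a float64:

import numpy as np

# Scaling by a power of two only changes the float exponent, so
# quantized integer values survive the round trip exactly.
rng = np.random.default_rng(0)
ints = rng.integers(-2**20, 2**20, size=1000).astype(np.float64)
scale = 2.0 ** -7  # power-of-two scaling factor
assert np.array_equal((ints * scale) / scale, ints)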

jurevreca12 marked this pull request as draft on March 25, 2024.

jurevreca12 (Author):

I am closing this pull request, as it has several features jumbled into it. I will open separate pull requests for each piece of functionality.
