Inconsistent results between TensorRT and ONNX inference for ScatterND operator #3462

Closed
hongliyu0716 opened this issue Nov 17, 2023 · 2 comments
Labels: internal-bug-tracked (tracked internally, will be fixed in a future release), triaged (issue has been triaged by maintainers)

Comments

@hongliyu0716

Description

The TensorRT inference output differs from the ONNX Runtime output for an ONNX model that contains a ScatterND operator.
The structure of the ONNX model is shown below:
[Image: ScatterND onnx]

Environment

TensorRT Version: 8.6.1.6

NVIDIA GPU: NVIDIA GeForce MX330

NVIDIA Driver Version: 470.182.03

CUDA Version: 11.4

CUDNN Version: 8.9.5

Operating System: Ubuntu 18.04

Python Version (if applicable): 3.8

Relevant Files

Model link: https://github.com/hongliyu0716/onnx_model/blob/main/ScatterND.onnx

Steps To Reproduce

  1. Download the model
  2. Commands or scripts:
polygraphy run ScatterND.onnx --onnxrt --trt --workspace 256M --save-engine test.plan --fp16 --verbose > test.txt

The resulting log, including the failed accuracy comparison, is shown below:

[I] trt-runner-N0-11/17/23-22:29:32    
    ---- Inference Input(s) ----
    {data [dtype=float32, shape=(4, 4, 4)],
     indices [dtype=int32, shape=(2, 1)],
     updates [dtype=float32, shape=(2, 4, 4)]}
[V] trt-runner-N0-11/17/23-22:29:32     | Input metadata is: {data [dtype=float32, shape=(4, 4, 4)],
     indices [dtype=int32, shape=(2, 1)],
     updates [dtype=float32, shape=(2, 4, 4)]}
[I] trt-runner-N0-11/17/23-22:29:32    
    ---- Inference Output(s) ----
    {y [dtype=float32, shape=(4, 4, 4)]}
[I] trt-runner-N0-11/17/23-22:29:32     | Completed 1 iteration(s) in 0.4179 ms | Average inference time: 0.4179 ms.
[V] Successfully ran: ['onnxrt-runner-N0-11/17/23-22:29:32', 'trt-runner-N0-11/17/23-22:29:32']
[I] Accuracy Comparison | onnxrt-runner-N0-11/17/23-22:29:32 vs. trt-runner-N0-11/17/23-22:29:32
[I]     Comparing Output: 'y' (dtype=float32, shape=(4, 4, 4)) with 'y' (dtype=float32, shape=(4, 4, 4))
[I]         Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I]         onnxrt-runner-N0-11/17/23-22:29:32: y | Stats: mean=0.73975, std-dev=0.44597, var=0.19888, median=0.69314, min=0.018288 at (2, 1, 2), max=1.6966 at (1, 1, 0), avg-magnitude=0.73975
[I]             ---- Values ----
                    [[[0.9529184  1.3841192  0.5150035  1.2469273 ]
                      [0.73331094 0.99574053 0.3237349  0.48483706]
                      [1.2041588  0.9364936  0.5845487  1.6127281 ]
                      [0.5522181  1.6289296  0.7533856  1.5537736 ]]
                    
                     [[1.040977   1.3096323  0.4892853  0.46802938]
                      [1.6966308  1.3963528  1.2782643  1.3557642 ]
                      [1.4980848  1.0093527  1.0345335  0.4889669 ]
                      [0.74822    1.2862793  0.33537382 1.3244871 ]]
                    
                     [[0.95788956 0.5331653  0.6918771  0.31551564]
                      [0.6865009  0.83462566 0.01828828 0.7501443 ]
                      [0.9888611  0.74816567 0.280444   0.78927934]
                      [0.10322601 0.44789353 0.9085955  0.29361415]]
                    
                     [[0.28777534 0.13002858 0.01936696 0.6788355 ]
                      [0.21162811 0.26554665 0.49157315 0.05336254]
                      [0.5741176  0.14672858 0.5893055  0.69975835]
                      [0.10233443 0.41405597 0.69440013 0.41417927]]]
[I]             ---- Histogram ----
                Bin Range       |  Num Elems | Visualization
                (0.0183, 0.186) |          7 | ############################
                (0.186 , 0.354) |          8 | ################################
                (0.354 , 0.522) |          9 | ####################################
                (0.522 , 0.69 ) |          7 | ############################
                (0.69  , 0.857) |         10 | ########################################
                (0.857 , 1.03 ) |          7 | ############################
                (1.03  , 1.19 ) |          2 | ########
                (1.19  , 1.36 ) |          7 | ############################
                (1.36  , 1.53 ) |          3 | ############
                (1.53  , 1.7  ) |          4 | ################
[I]         trt-runner-N0-11/17/23-22:29:32: y | Stats: mean=0.52752, std-dev=0.28402, var=0.080666, median=0.55501, min=0.018288 at (2, 1, 2), max=0.98886 at (2, 2, 0), avg-magnitude=0.52752
[I]             ---- Values ----
                    [[[0.5358964  0.66379464 0.5148891  0.94459474]
                      [0.58655506 0.9034019  0.1374747  0.13927634]
                      [0.8073913  0.39767683 0.16535419 0.9275086 ]
                      [0.34776586 0.7508121  0.725998   0.8833061 ]]
                    
                     [[0.6236722  0.7509424  0.34889835 0.2699279 ]
                      [0.89588624 0.4280912  0.96484005 0.6634415 ]
                      [0.6216957  0.11474597 0.94948924 0.44991213]
                      [0.5783896  0.4081368  0.23702697 0.9033795 ]]
                    
                     [[0.95788956 0.5331653  0.6918771  0.31551564]
                      [0.6865009  0.83462566 0.01828828 0.7501443 ]
                      [0.9888611  0.74816567 0.280444   0.78927934]
                      [0.10322601 0.44789353 0.9085955  0.29361415]]
                    
                     [[0.28777534 0.13002858 0.01936696 0.6788355 ]
                      [0.21162811 0.26554665 0.49157315 0.05336254]
                      [0.5741176  0.14672858 0.5893055  0.69975835]
                      [0.10233443 0.41405597 0.69440013 0.41417927]]]
[I]             ---- Histogram ----
                Bin Range       |  Num Elems | Visualization
                (0.0183, 0.186) |         11 | ####################################
                (0.186 , 0.354) |         10 | #################################
                (0.354 , 0.522) |          9 | ##############################
                (0.522 , 0.69 ) |         12 | ########################################
                (0.69  , 0.857) |         11 | ####################################
                (0.857 , 1.03 ) |         11 | ####################################
                (1.03  , 1.19 ) |          0 | 
                (1.19  , 1.36 ) |          0 | 
                (1.36  , 1.53 ) |          0 | 
                (1.53  , 1.7  ) |          0 | 
[I]         Error Metrics: y
[I]             Minimum Required Tolerance: elemwise error | [abs=0.96826] OR [rel=7.7964] (requirements may be lower if both abs/rel tolerances are set)
[I]             Absolute Difference | Stats: mean=0.21223, std-dev=0.29863, var=0.089178, median=5.7191e-05, min=0 at (2, 0, 0), max=0.96826 at (1, 1, 1), avg-magnitude=0.21223
[I]                 ---- Values ----
                        [[[4.1702199e-01 7.2032452e-01 1.1438131e-04 3.0233252e-01]
                          [1.4675587e-01 9.2338622e-02 1.8626021e-01 3.4556073e-01]
                          [3.9676750e-01 5.3881675e-01 4.1919452e-01 6.8521953e-01]
                          [2.0445222e-01 8.7811750e-01 2.7387619e-02 6.7046756e-01]]
                        
                         [[4.1730481e-01 5.5868989e-01 1.4038694e-01 1.9810149e-01]
                          [8.0074459e-01 9.6826160e-01 3.1342423e-01 6.9232267e-01]
                          [8.7638909e-01 8.9460671e-01 8.5044265e-02 3.9054781e-02]
                          [1.6983044e-01 8.7814248e-01 9.8346844e-02 4.2110759e-01]]
                        
                         [[0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]
                          [0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]
                          [0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]
                          [0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]]
                        
                         [[0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]
                          [0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]
                          [0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]
                          [0.0000000e+00 0.0000000e+00 0.0000000e+00 0.0000000e+00]]]
[I]                 ---- Histogram ----
                    Bin Range        |  Num Elems | Visualization
                    (0     , 0.0968) |         37 | ########################################
                    (0.0968, 0.194 ) |          5 | #####
                    (0.194 , 0.29  ) |          2 | ##
                    (0.29  , 0.387 ) |          3 | ###
                    (0.387 , 0.484 ) |          5 | #####
                    (0.484 , 0.581 ) |          2 | ##
                    (0.581 , 0.678 ) |          1 | #
                    (0.678 , 0.775 ) |          3 | ###
                    (0.775 , 0.871 ) |          1 | #
                    (0.871 , 0.968 ) |          5 | #####
[I]             Relative Difference | Stats: mean=0.52857, std-dev=1.1188, var=1.2518, median=0.00011107, min=0 at (2, 0, 0), max=7.7964 at (1, 2, 1), avg-magnitude=0.52857
[I]                 ---- Values ----
                        [[[7.78176486e-01 1.08516169e+00 2.22147472e-04 3.20065856e-01]
                          [2.50199646e-01 1.02212116e-01 1.35486901e+00 2.48111582e+00]
                          [4.91419107e-01 1.35491109e+00 2.53513098e+00 7.38774300e-01]
                          [5.87901890e-01 1.16955698e+00 3.77240963e-02 7.59043276e-01]]
                        
                         [[6.69109225e-01 7.43984997e-01 4.02372032e-01 7.33905256e-01]
                          [8.93801630e-01 2.26181149e+00 3.24845791e-01 1.04353237e+00]
                          [1.40967536e+00 7.79641056e+00 8.95684361e-02 8.68053511e-02]
                          [2.93626368e-01 2.15158844e+00 4.14918363e-01 4.66146946e-01]]
                        
                         [[0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
                          [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
                          [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
                          [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]]
                        
                         [[0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
                          [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
                          [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
                          [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]]]
[I]                 ---- Histogram ----
                    Bin Range    |  Num Elems | Visualization
                    (0   , 0.78) |         52 | ########################################
                    (0.78, 1.56) |          7 | #####
                    (1.56, 2.34) |          2 | #
                    (2.34, 3.12) |          2 | #
                    (3.12, 3.9 ) |          0 | 
                    (3.9 , 4.68) |          0 | 
                    (4.68, 5.46) |          0 | 
                    (5.46, 6.24) |          0 | 
                    (6.24, 7.02) |          0 | 
                    (7.02, 7.8 ) |          1 | 
[E]         FAILED | Output: 'y' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[E]     FAILED | Mismatched outputs: ['y']
[E] Accuracy Summary | onnxrt-runner-N0-11/17/23-22:29:32 vs. trt-runner-N0-11/17/23-22:29:32 | Passed: 0/1 iterations | Pass Rate: 0.0%
[E] FAILED | Runtime: 6.086s | Command: /home/hll/anaconda3/envs/trt/bin/polygraphy run ScatterND.onnx --onnxrt --trt --workspace 256M --save-engine test.plan --fp16 --verbose
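
In the log above, slices 2 and 3 of `y` match exactly, while slices 0 and 1 differ. That pattern is what one would expect if the two runners disagree on how `updates` are combined with `data`: the ONNX specification gives ScatterND an optional `reduction` attribute that accumulates updates into `data` instead of overwriting it. The NumPy sketch below illustrates the effect. The index values `[[0], [1]]`, the `"add"` mode, and the random inputs are assumptions chosen for illustration; they are not read from the attached model.

```python
import numpy as np


def scatter_nd(data, indices, updates, reduction="none"):
    """Reference-style ScatterND for the 'none' and 'add' modes (illustrative only)."""
    out = np.copy(data)
    # Each entry along the leading axes of `indices` holds one index tuple.
    for idx in np.ndindex(indices.shape[:-1]):
        target = tuple(indices[idx])
        if reduction == "none":
            out[target] = updates[idx]   # overwrite the addressed slice
        elif reduction == "add":
            out[target] += updates[idx]  # accumulate into the addressed slice
        else:
            raise ValueError(f"unsupported reduction in this sketch: {reduction}")
    return out


# Shapes mirror the log: data (4, 4, 4), indices (2, 1), updates (2, 4, 4).
rng = np.random.default_rng(0)
data = rng.random((4, 4, 4), dtype=np.float32)
indices = np.array([[0], [1]], dtype=np.int32)  # assumed values
updates = rng.random((2, 4, 4), dtype=np.float32)

overwritten = scatter_nd(data, indices, updates, reduction="none")
added = scatter_nd(data, indices, updates, reduction="add")

# Slices that were never addressed agree; addressed slices differ by `data`.
abs_diff = np.abs(added - overwritten)
print("max abs diff per slice:", abs_diff.reshape(4, -1).max(axis=1))
```

With these assumed inputs, the untouched slices agree bit for bit and the addressed slices differ by exactly the original `data` values, which has the same shape as the zero and non-zero blocks in the Absolute Difference table.
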
@zerollzeng
Collaborator

Filed internal bug 4383792 for this, thanks for reporting issues to us.

@zerollzeng zerollzeng self-assigned this Nov 18, 2023
@zerollzeng zerollzeng added triaged Issue has been triaged by maintainers internal-bug-tracked Tracked internally, will be fixed in a future release. labels Nov 18, 2023
@zerollzeng
Collaborator

The `reduction` attribute of ScatterND is not supported by TRT, and we didn't check for it in the ONNX parser. We now reject it in TRT Next. Closed.
