[ONNX] Support ScatterElements with reduction #13894
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.
Generated by tvm-bot
@tvm-bot rerun
Did a first pass, looks good for the most part. One comment.
Will take a closer look tonight.
ind_fused = bx1 * max_threads + tx1
with ib.if_scope(ind_fused < ind_full_range):
    index_check = tir.LT(indices_ptr[ind_fused], tir.const(0, indices.dtype))
Why can this block not be fused with the block below? Is it to prevent warp divergence?
If that is the reason, perhaps you can do something like:
index = index + (index < 0) * axis_range
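The branchless normalization suggested above can be checked with a small NumPy sketch (names here are illustrative, not the TIR code in this PR):

```python
import numpy as np

# Hypothetical sketch of the branchless trick: negative indices
# (counted from the end of the axis) are shifted into range
# without a conditional branch.
axis_range = 5
index = np.array([-1, 0, 3, -5])

# (index < 0) evaluates to 0 or 1, so the multiply replaces the branch.
normalized = index + (index < 0) * axis_range
print(normalized)  # [4 0 3 0]
```

Because the comparison result is just 0 or 1, every thread executes the same instructions, which is what avoids warp divergence on the GPU.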
Hello @AndrewZhaoLuo. It was initially fused into the common loop. I decided that conditionals are not good for GPU computation, especially when wrapped in additional loops. But in this case it looks the same (I can bring it back if you think that is the better way). Another thing is an assert for a shifted index that is still out of bounds (theoretically we should do this check), but I do not know how to implement such a check on the ir_builder side.
The expression you suggested is very convenient and clear. Does it compile to the same thing as what I wrote, or does it use other tir procedures? I am slightly concerned about the extra multiplication in it.
The arithmetic intensity per element loaded is very low (single digit), so I am not worried about the multiplication. I would expect memory access to always be the bottleneck, so we can do a lot of computation while the GPU is fetching data from global memory. Regardless, it's a pretty common trick to get "branch-less" programming.
Yes, I am not as familiar with IR Builder. If you want to enforce that indices are valid, you can do something like applying a modulo to the indices internally (for wrapping behavior). I am not familiar with assertions in tir in general.
Personally, I think it is fine not to check and to assume the caller will guarantee good inputs, as we don't necessarily want the check in the base computation.
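The "wrap via modulo" idea mentioned above can be illustrated in plain Python (this is a sketch of the semantics, not the ir_builder code):

```python
# Sketch of wrapping out-of-range indices with modulo. In Python,
# % already maps negative values into [0, axis_range), so a single
# modulo handles both negative and overflowing indices.
axis_range = 4
indices = [-1, 2, 5, -6]
wrapped = [i % axis_range for i in indices]
print(wrapped)  # [3, 2, 1, 2]
```

Note that wrapping silently "repairs" genuinely invalid indices rather than reporting them, which is why leaving validation to the caller, as suggested above, is a reasonable design choice for the base computation.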
Ok. I have fused the block related to index shifting with the general one and used your expression.
The code related to scatter_add was refactored in another branch, and a PR was prepared (see below); the latter is waiting for this PR to be merged.
Can you compare the performance of your GPU kernel against the current scatter_add implementation?
Looking at it, it may be easier to reuse the existing scatter_add topi implementation and extend it with the new reduction functions.
@AndrewZhaoLuo I did not compare performance on GPU, but I prepared a new PR that replaces scatter_add with scatter_elements and reuses the code for CUDA.
One note about scatter_add: like scatter, it is implemented only for 1d, 2d, 3d and 4d input tensors. I thought I could reuse the scatter_add approach, but in the end I implemented scatter_elements in a general way, without restrictions on the input data rank.
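The rank-general behavior described above can be captured in a small NumPy reference (an illustrative sketch of ONNX ScatterElements semantics, not the TOPI kernel from this PR; the function name is hypothetical):

```python
import numpy as np

def scatter_elements_ref(data, indices, updates, axis=0, reduction="update"):
    """Illustrative NumPy reference for ScatterElements semantics.

    Works for any input rank: iterate over every position in `indices`,
    replace the coordinate along `axis` with the stored index (modulo
    maps negative indices into range), then apply the reduction.
    """
    out = data.copy()
    for idx in np.ndindex(indices.shape):
        pos = list(idx)
        pos[axis] = indices[idx] % data.shape[axis]  # wrap negative indices
        pos = tuple(pos)
        if reduction == "add":
            out[pos] += updates[idx]
        elif reduction == "mul":
            out[pos] *= updates[idx]
        else:  # "update"
            out[pos] = updates[idx]
    return out

data = np.ones((2, 2))
indices = np.array([[0, 1], [0, 1]])
updates = np.array([[1.0, 2.0], [3.0, 4.0]])
print(scatter_elements_ref(data, indices, updates, axis=1, reduction="add"))
# [[2. 3.]
#  [4. 5.]]
```

Because the loop runs over `np.ndindex(indices.shape)`, nothing here depends on the rank of `data`, which is exactly the restriction the hard-coded 1d/2d/3d/4d scatter_add kernels have.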
@tvm-bot rerun |
Support ScatterElements on the ONNX front-end, as described in the ONNX docs. Currently the Scatter op implementation is used, where the reduction attribute is not supported at all. CI tests for the op are also added.
P.S. In follow-up PRs I plan to: 1. remove scatter_add and reconnect all its uses to ScatterElements(reduction="add"); 2. remove the scatter implementation and use ScatterElements(reduction="update") instead. This will remove the restriction on input tensor rank (currently rank <= 4).