[OpenCL] Fix OpenCL get_valid_counts errors due to intrinsic atomic_add #5857

trevor-m · 2020-06-19T23:24:40Z

Fixes made a few months ago to the CUDA implementation of get_valid_counts broke OpenCL, because they introduced an atomic_add intrinsic that the OpenCL codegen did not resolve.

This PR fixes get_valid_counts for OpenCL with the following changes:

Register intrinsic atomic add for OpenCL.
Override intrinsic::tvm_address_of to include storage scope (e.g. __global).
Enable cl_khr_global_int32_base_atomics. This isn't required for OpenCL 1.1+ because atomic_add became a core feature there. I'm happy to remove this if we don't care about OpenCL 1.0. Alternatively, we can override the handling of calls with op->call_type == CallNode::PureExtern and set a flag so the extension is enabled only when atomic_add is actually used.
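The three changes above can be pictured with a small toy emitter. This is only a sketch under stated assumptions: the class `ToyOpenCLEmitter` and all of its method names are hypothetical and are not TVM's `CodeGenOpenCL`; it merely shows the idea of keeping the storage scope in the address-of cast and emitting the extension pragma only when an atomic add was generated.

```python
# Hypothetical toy emitter for illustration only; not TVM's CodeGenOpenCL.
class ToyOpenCLEmitter:
    PRAGMA = "#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable"

    def __init__(self):
        self.enable_atomics = False  # flipped when an atomic_add is emitted
        self.body = []               # lines of generated kernel source

    def address_of(self, buffer, index, scope="__global", dtype="int"):
        # Keep the storage scope in the cast so the pointer's address
        # space does not change (the cause of the runtime error).
        return f"(({scope} {dtype} *){buffer} + {index})"

    def atomic_add(self, buffer, index, value):
        # Record that atomics are used so finish() can enable the extension.
        self.enable_atomics = True
        return f"atomic_add({self.address_of(buffer, index)}, {value})"

    def finish(self):
        # Prepend the pragma only if an atomic_add was actually emitted.
        header = [self.PRAGMA] if self.enable_atomics else []
        return "\n".join(header + self.body)


emitter = ToyOpenCLEmitter()
call = emitter.atomic_add("get_valid_counts_v0", 0, 1)
emitter.body.append(f"atomic_add_return[(0)] = {call};")
print(emitter.finish())
```

With the scope included, the cast reads `(__global int *)` instead of `(int *)`, which is the part the OpenCL compiler rejected in the error shown below.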

Original error messages before this fix:

During compilation:

Unresolved intrinsic atomic_add with return type int32

During runtime:

<source>:6922:43: error: casting '__global void *' to type 'int *' changes address space of pointer
      atomic_add_return[(0)] = atomic_add(((int *)get_valid_counts_v0 + 0), 1);

trevor-m · 2020-06-22T16:39:23Z

@Laurawly @kazum @wpan11nv Could you please review? Thanks!

src/target/source/codegen_opencl.cc

wpan11nv

Any unit test?

trevor-m · 2020-06-22T21:09:12Z

Any unit test?

RELAY_TEST_TARGETS=opencl python3 tests/python/relay/test_op_level5.py will test this.

To have this run by default, we would need to add opencl to ctx_list: https://github.com/apache/incubator-tvm/blob/master/python/tvm/relay/testing/config.py#L28

Currently the CI doesn't test anything for opencl, which is why we don't find out about these errors until much later. Do we know why we don't test opencl?
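As a rough illustration of how an environment variable like RELAY_TEST_TARGETS could select the targets a test runs on, here is a minimal sketch. The helper `parse_test_targets` and its default tuple are assumptions made for this example; they are not the actual contents of config.py.

```python
import os


def parse_test_targets(env=None, default=("llvm", "cuda")):
    """Hypothetical helper: return target names from RELAY_TEST_TARGETS.

    Falls back to `default` (an assumed default list) when the variable
    is unset or empty.
    """
    env = os.environ if env is None else env
    raw = env.get("RELAY_TEST_TARGETS", "")
    targets = [t.strip() for t in raw.split(",") if t.strip()]
    return targets or list(default)


# Explicit override, as in the command above.
print(parse_test_targets({"RELAY_TEST_TARGETS": "opencl"}))
# Running opencl by default would mean adding it to the default list.
print(parse_test_targets({}, default=("llvm", "cuda", "opencl")))
```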

trevor-m · 2020-06-23T22:01:06Z

@wpan11nv Any more comments?

wpan11nv

LGTM.

zhiics · 2020-06-24T16:34:09Z

@kazum can you take a look and manage the PR? Thanks.

kazum · 2020-06-25T08:59:58Z

tests/python/relay/test_op_level5.py

-            # get_valid_count for cuda doesn't do data rearrangement
-            if target == 'cuda':
+            # get_valid_count for cuda, opencl doesn't do data rearrangement
+            if target in ['cuda', 'opencl']:
                return


Returning here looks wrong to me. The test in the link below doesn't work for OpenCL either, because we don't do data rearrangement in the GPU NMS implementation.
https://discuss.tvm.ai/t/nms-compile-fails-for-cuda-target-but-works-fine-for-llvm-target/7045/2

Perhaps we should fix non_max_suppression for GPU first?

trevor-m · 2020-06-25

OpenCL uses the same implementation as CUDA. The CUDA implementation of get_valid_counts was changed so that it no longer rearranges its output, because NMS rearranges it later anyway. This still gives the correct output for NMS. See #5339

The NMS problem looks like a separate issue: the CUDA implementation wasn't fully updated to match the changes made to the CPU implementation in #4312
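To make the distinction concrete, the sketch below contrasts counting valid boxes with additionally moving them to the front. It is a simplified illustration in plain Python, not the TVM kernels: the function names, the box layout (score at index 1), and the -1 padding of invalid rows are assumptions for this example.

```python
def count_valid(boxes, score_threshold, score_index=1):
    # Counting alone is what both code paths agree on.
    return sum(1 for b in boxes if b[score_index] > score_threshold)


def rearrange_valid_first(boxes, score_threshold, score_index=1):
    # Optional extra step: move valid boxes to the front and pad the
    # remaining rows with -1 (assumed padding value for this sketch).
    valid = [b for b in boxes if b[score_index] > score_threshold]
    num_invalid = len(boxes) - len(valid)
    width = len(boxes[0]) if boxes else 0
    return valid + [[-1.0] * width for _ in range(num_invalid)]


boxes = [
    [0.0, 0.9, 1.0, 1.0, 2.0, 2.0],
    [1.0, 0.1, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.8, 3.0, 3.0, 4.0, 4.0],
]
print(count_valid(boxes, 0.5))
print(rearrange_valid_first(boxes, 0.5))
```

A test that compares the rearranged data against a reference would fail on a backend that only counts, which is why the test returns early for those targets after checking the counts.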

kazum · 2020-06-26

Thanks for your explanation. Indeed, I can build NMS successfully if I revert the change in #4312.

kazum

Looks good to me. I'll merge this after CI passes.

trevor-m · 2020-06-26T16:08:33Z

Looks good to me. I'll merge this after CI passes.

Thanks!

tqchen · 2020-06-28T23:11:16Z

@trevor-m please rebase against master.

trevor-m · 2020-06-29T21:10:41Z

@kazum @tqchen Rebased and CI passed. Thanks!

kazum · 2020-06-30T00:56:09Z

Thanks @trevor-m @wpan11nv !

trevor-m force-pushed the fix-getvalidcounts-opencl branch from 93c2d33 to dc6b48e Compare June 19, 2020 23:28

tqchen added the status: need review label Jun 20, 2020

wpan11nv reviewed Jun 22, 2020

src/target/source/codegen_opencl.cc (resolved)

wpan11nv reviewed Jun 22, 2020

src/target/source/codegen_opencl.cc (outdated, resolved)

wpan11nv reviewed Jun 22, 2020

trevor-m force-pushed the fix-getvalidcounts-opencl branch 2 times, most recently from 9a19371 to 8f657a0 Compare June 23, 2020 16:02

wpan11nv approved these changes Jun 23, 2020

kazum self-assigned this Jun 25, 2020

kazum reviewed Jun 25, 2020

trevor-m force-pushed the fix-getvalidcounts-opencl branch from 8f657a0 to fa183ce Compare June 25, 2020 22:40

kazum approved these changes Jun 26, 2020

tqchen added the status: need update label Jun 29, 2020

Trevor Morris added 3 commits June 29, 2020 15:54

[OpenCL] Fix atomic add used by get_valid_counts

6633c2d

Rename l -> load, add flag to enable atomics

0e8c38e

Opencl doesn't do data rearrangement

1db9f1a

trevor-m force-pushed the fix-getvalidcounts-opencl branch from fa183ce to 1db9f1a Compare June 29, 2020 15:55

kazum approved these changes Jun 30, 2020

kazum merged commit b3d3ff2 into apache:master Jun 30, 2020

ZihengJiang mentioned this pull request Sep 25, 2020

TVM v0.7 Release Note Candidate #6486

Closed
