[codegen] Add multiple operands and function support when using fp16 compilation #4056

Merged
merged 7 commits into apache:master Oct 11, 2019

Conversation

zxy844288792
Contributor

Thanks for contributing to TVM! Please refer to the contribution guideline https://docs.tvm.ai/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from reviewers.

As discussed in https://discuss.tvm.ai/t/error-cuda-compilation-error/3816 and https://discuss.tvm.ai/t/relay-automatic-fp16-downcasting/3952/3?u=xyzhou:
CUDA fp16 computation uses “cuda_fp16.h”, which does not support operations on operands carrying the volatile qualifier. For the max function I referred to this PR, but it has not been updated for a month, so I added min function support as well.

I also edited test_op_level1.py to enable an fp16 test case. I will edit more test_op files, but I would first like to gather some feedback.
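
A minimal sketch, not the actual test code, of the fp16 path this change exercises: build a small Relay function with dtype="float16" and run it on CUDA, so that codegen emits half-precision arithmetic and the max helper discussed above. The shapes, the choice of relay.maximum, and the graph executor setup below are illustrative assumptions.

import numpy as np
import tvm
from tvm import relay

dtype = "float16"
x = relay.var("x", relay.TensorType((1, 16), dtype))
y = relay.var("y", relay.TensorType((1, 16), dtype))
# relay.maximum exercises the fp16 max support discussed in this PR.
func = relay.Function([x, y], relay.maximum(x, y))

ctx = tvm.gpu(0)
intrp = relay.create_executor("graph", ctx=ctx, target="cuda")
x_np = np.random.uniform(size=(1, 16)).astype(dtype)
y_np = np.random.uniform(size=(1, 16)).astype(dtype)
out = intrp.evaluate(func)(x_np, y_np)
np.testing.assert_allclose(out.asnumpy(), np.maximum(x_np, y_np), rtol=1e-3)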

@vinx13
Member

vinx13 commented Oct 6, 2019

The error might be caused by an incorrect arch setting. You can use
https://github.com/dmlc/tvm/blob/cffb4fba03ea582417e2630bd163bca773756af6/python/tvm/contrib/nvcc.py#L218-L238
to conditionally skip the test on CI.

Instead of adding these functions, directly overriding the CUDA codegen rules for half might be preferred, since we also want to deal with half2 and avoid repetition.

@zxy844288792
Contributor Author

The error might be caused by an incorrect arch setting. You can use
https://github.com/dmlc/tvm/blob/cffb4fba03ea582417e2630bd163bca773756af6/python/tvm/contrib/nvcc.py#L218-L238
to conditionally skip the test on CI.
Instead of adding these functions, directly overriding the CUDA codegen rules for half might be preferred, since we also want to deal with half2 and avoid repetition.

Thanks for the information! I will try to use have_fp16 to skip the test on CI. I will also start to investigate how to override the codegen rules.
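
A minimal sketch of that kind of skip guard, assuming the have_fp16 helper from the tvm.contrib.nvcc module linked above; the wrapper name below is made up for illustration and the actual test body is omitted:

import tvm
from tvm.contrib import nvcc

def maybe_run_fp16_case(run_case):
    # Skip when the CI worker has no CUDA device at all.
    if not tvm.gpu(0).exist:
        print("Skipping fp16 case: no CUDA device available")
        return
    # have_fp16 checks whether the device's compute version supports fp16.
    if not nvcc.have_fp16(tvm.gpu(0).compute_version):
        print("Skipping fp16 case: GPU does not support float16")
        return
    run_case()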

@blacklong28

I did something similar to what you added, but there were some errors when I loaded the resnet-18-fp16.onnx model on an RTX 2080:
cuda-got-error-cuda-error-launch-out-of-resources (CUDA error: launch out of resources)

@tqchen
Member

tqchen commented Oct 10, 2019

@zxy844288792 @vinx13 please follow up on this.

@zxy844288792
Contributor Author

@vinx13 Can we get this PR merged with the current changes? We can take a look at half and half2 in a separate PR after we have more clarity.

gamma = relay.var("gamma", relay.TensorType((2,), dtype))
moving_mean = relay.var("moving_mean", relay.TensorType((2,), dtype))
moving_var = relay.var("moving_var", relay.TensorType((2,), dtype))
y = relay.nn.batch_norm(data, gamma, beta, moving_mean, moving_var,

fp16 for batch norm is not supported yet; #4088 needs to be merged first.

@vinx13
Member

vinx13 commented Oct 11, 2019

@tqchen fp16 tests on CI are skipped now; any chance of getting CI support for the fp16 type?

@tqchen
Member

tqchen commented Oct 11, 2019

We will need to look into it, because most of the GPU workers we have do not yet have fp16 support, so we have to rely on manual checks for now. I will see if we can get an fp16-enabled worker set up.

vinx13 changed the title from "[codegen] WIP - Add multiple operands and function support when using fp16 compilation" to "[codegen] Add multiple operands and function support when using fp16 compilation" Oct 11, 2019
vinx13 merged commit ce72e9b into apache:master Oct 11, 2019
@vinx13
Member

vinx13 commented Oct 11, 2019

Thanks @zxy844288792, this is now merged.

anijain2305 pushed a commit to anijain2305/tvm that referenced this pull request Oct 17, 2019
…compilation (apache#4056)

* overload half operators for cuda codegen

* add float16 te test_op_level1

* fix test_op_level1.py

* fix lint

* disable fp16 test if gpu does not support

* disable fp16 test if gpu does not support

* bypass float16 test if gpu does not support float16
wweic pushed a commit to neo-ai/tvm that referenced this pull request Oct 18, 2019
…compilation (apache#4056)

* overload half operators for cuda codegen

* add float16 te test_op_level1

* fix test_op_level1.py

* fix lint

* disable fp16 test if gpu does not support

* disable fp16 test if gpu does not support

* bypass float16 test if gpu does not support float16