[codegen] Add multiple operands and function support when using fp16 compilation #4056
Conversation
The error might be caused by an incorrect arch setting. You can use have_fp16. Instead of adding these functions, you could directly override the CUDA codegen rules.
Thanks for your information! I will try to use have_fp16 to skip the test on CI. I will also start to investigate how to override the codegen rules.
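For illustration, here is a minimal sketch (not code from this PR) of the kind of guard that skips fp16 tests on machines without support. It assumes tvm.contrib.nvcc.have_fp16 and the TVM 0.6-era context API (tvm.gpu(0).exist, tvm.gpu(0).compute_version); the helper name is made up.

# Minimal sketch of an fp16 skip guard; assumes the TVM 0.6-era Python API.
import tvm
from tvm.contrib import nvcc

def fp16_supported():
    """Return True only when a CUDA device with fp16 support is present."""
    ctx = tvm.gpu(0)
    if not ctx.exist:
        return False
    # have_fp16 checks the compute capability; fp16 needs sm_53 or newer.
    return nvcc.have_fp16(ctx.compute_version)

# Inside a test:
#     if not fp16_supported():
#         print("Skipping fp16 test: GPU does not support float16")
#         return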
I did something similar to what you added, but there were some errors when I loaded the resnet-18-fp16.onnx model on an RTX 2080.
@zxy844288792 @vinx13 please follow up on this.
@vinx13 Can we get this PR merged with the current changes? We can take a look at the codegen rule override in a follow-up.
gamma = relay.var("gamma", relay.TensorType((2,), dtype))
moving_mean = relay.var("moving_mean", relay.TensorType((2,), dtype))
moving_var = relay.var("moving_var", relay.TensorType((2,), dtype))
y = relay.nn.batch_norm(data, gamma, beta, moving_mean, moving_var,
Fp16 for batch norm is not supported yet; we need to merge #4088 first.
@tqchen fp16 tests on CI are skipped now, any chance to get CI support for the fp16 type?
We will need to look into it, because most of the GPU workers we have do not yet have fp16 support, so we have to rely on manual checks for now. I will see if we can get an fp16-enabled worker set up.
Thanks @zxy844288792, this is now merged.
…compilation (apache#4056)

* overload half operators for cuda codegen
* add float16 te test_op_level1
* fix test_op_level1.py
* fix lint
* disable fp16 test if gpu does not support
* disable fp16 test if gpu does not support
* bypass float16 test if gpu does not support float16
Thanks for contributing to TVM! Please refer to the guideline https://docs.tvm.ai/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from reviewers.
As discussed in https://discuss.tvm.ai/t/error-cuda-compilation-error/3816 and https://discuss.tvm.ai/t/relay-automatic-fp16-downcasting/3952/3?u=xyzhou:

CUDA fp16 computation uses "cuda_fp16.h", which does not support operations on volatile operands. For the max function I referred to this PR, but it has not been updated for a month, so I added min function support as well.
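As a rough illustration of what this enables (a hedged sketch, not code from this PR), a float16 maximum can be built for the cuda target and checked against NumPy; the shapes, tolerance, and API calls below assume the TVM 0.6-era relay and graph_runtime interfaces and a GPU with fp16 support.

# Hedged sketch: compile and run an fp16 elementwise maximum on CUDA.
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_runtime

dtype = "float16"
x = relay.var("x", shape=(4,), dtype=dtype)
y = relay.var("y", shape=(4,), dtype=dtype)
func = relay.Function([x, y], relay.maximum(x, y))

# Building for "cuda" exercises the half-precision max in the generated kernel.
graph, lib, params = relay.build(relay.Module.from_expr(func), target="cuda")

ctx = tvm.gpu(0)
m = graph_runtime.create(graph, lib, ctx)
a = np.random.uniform(size=(4,)).astype(dtype)
b = np.random.uniform(size=(4,)).astype(dtype)
m.run(x=a, y=b)
np.testing.assert_allclose(m.get_output(0).asnumpy(), np.maximum(a, b), rtol=1e-3)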
I also edited test_op_level1.py to enable the fp16 test cases. I will edit more test_op files, but I would like to gather some feedback first.
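For reference, here is a hedged sketch of the dtype-loop pattern with a bypass for GPUs that lack fp16 support (mirroring the intent of the test changes, not their exact code; the op, shapes, and tolerance are illustrative).

# Hedged sketch: run an op test for float32 and float16, bypassing float16
# when the GPU does not support it (TVM 0.6-era API assumed).
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import nvcc

def check_exp(dtype):
    x = relay.var("x", shape=(4,), dtype=dtype)
    func = relay.Function([x], relay.exp(x))
    data = np.random.uniform(1, 2, size=(4,)).astype(dtype)
    intrp = relay.create_executor("graph", ctx=tvm.gpu(0), target="cuda")
    out = intrp.evaluate(func)(data)
    np.testing.assert_allclose(out.asnumpy(), np.exp(data), rtol=1e-2)

for dtype in ["float32", "float16"]:
    if dtype == "float16" and not (
            tvm.gpu(0).exist and nvcc.have_fp16(tvm.gpu(0).compute_version)):
        continue  # bypass float16 test if the GPU does not support float16
    check_exp(dtype)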