[TOPI][CUDA] Enable vectorization on fp16 type #4867
Conversation
Please request reviews from reviewers.
Kindly ping. Thanks!
if not tvm.runtime.enabled(device):
    print("Skip because %s is not enabled" % device)
    return
with tvm.target.create(device):
Add a check for whether fp16 is supported here, and also in verify_relu:
https://github.com/apache/incubator-tvm/blob/aaf62e47e64d592be770e915a7aa59d41eddb729/topi/tests/python/test_topi_transform.py#L57-L59
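A minimal sketch of such a check, following the pattern in the linked test (the surrounding dtype and device variables are assumed to be in scope; have_fp16 comes from tvm.contrib.nvcc):

    from tvm.contrib.nvcc import have_fp16

    # Skip when the target is CUDA but the GPU lacks native fp16 support
    if dtype == "float16":
        if device == "cuda" and not have_fp16(tvm.gpu(0).compute_version):
            print("Skip because %s does not have fp16 support" % device)
            return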
topi/tests/python/test_topi_relu.py (outdated)
@@ -87,12 +87,12 @@ def _prelu_numpy(x, W):
     tvm.testing.assert_allclose(b.asnumpy(), out_np, rtol=1e-5)

 def test_relu():
-    verify_relu(10, 128)
+    verify_relu(128, 128, "float32")
Can you keep a test case as before where m and n have different values?
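For example, keeping one non-square case alongside the new fp16 one (values illustrative, assuming the updated verify_relu signature that takes a dtype):

    verify_relu(10, 128, "float32")   # m != n, preserves the original shape coverage
    verify_relu(128, 128, "float16")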
        check_device(device)

def test_vectorization():
    verify_vectorization(128, 128, "float16")
Same here.
Thanks all for the suggestions! Tests updated.
topi/tests/python/test_topi_relu.py (outdated)
from common import get_all_backend

def verify_relu(m, n):
    A = tvm.placeholder((m, n), name='A')

def skipTest(dtype, device):
nit: we prefer skip_test style naming
Fixed. Thanks!
This allows better utilization of the memory bandwidth.

Note that not all cases are vectorized for the fp16 datatype. For
instance, when the size is not a multiple of 1024, the inner loop
may be an expression that cannot be vectorized. In this case, a
small inner loop is still beneficial for latency hiding.
Signed-off-by: Wei Pan [email protected]
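For readers following along, a minimal sketch of what fp16 vectorization in an injective CUDA schedule can look like. This is illustrative only, not the PR's exact code: schedule_injective_sketch, the vector width of 2, and the thread count are assumptions.

    import tvm

    def schedule_injective_sketch(s, out, num_thread=1024):
        # Fuse all output axes, then peel off a two-wide lane for fp16 so
        # loads/stores can use wider memory transactions (e.g. half2).
        fused = s[out].fuse(*s[out].op.axis)
        vector_width = 2 if out.dtype == "float16" else 1
        if vector_width > 1:
            fused, vec = s[out].split(fused, factor=vector_width)
            # Vectorization only applies when the inner extent divides the
            # size evenly; otherwise the body stays a scalar expression.
            s[out].vectorize(vec)
        bx, tx = s[out].split(fused, factor=num_thread)
        s[out].bind(bx, tvm.thread_axis("blockIdx.x"))
        s[out].bind(tx, tvm.thread_axis("threadIdx.x"))
        return s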