-
Notifications
You must be signed in to change notification settings - Fork 747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] [NATIVECPU] Implement missing math builtins for scalar data types #11321
Conversation
@@ -0,0 +1,134 @@ | |||
// REQUIRES: native_cpu_be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is specific about native CPU in this test? It seems like it could run on other targets too.
Also aren't there tests in the E2E suite already testing the math built-ins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test could run on other targets, but I'd like to use it to quickly test builtins that are defined in the x86 target in libclc, which is used only by native cpu. There are tests for math builtins in E2E, but we don't run the full E2E test suite on native cpu yet, because several features are unsupported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this breaks the convention that runtime e2e tests are placed in sycl/test-e2e; whereas sycl/test checks device code generation etc. As @npmiller says ideally it is possible to use existing tests. This makes sure the test folders don't become cluttered with duplicate test cases. For example the sycl::
functions you test are already covered in tests like https://github.com/intel/llvm-test-suite/blob/2e687fcd57149112d7b29e9e85bf74ce85f1119c/SYCL/DeviceLib/built-ins/scalar_math.cpp#L125
For the sycl::native
function calls, I see it can make sense to test them on a per backend basis since the precision is generally backend dependent. Note FYI that they are already generally tested to some extent in this test: https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/DeviceLib/built-ins/fast-math-flag.cpp
However the above fast-math-flag.cpp test is effectively just checking that the fast-math flag work such that sycl::
math correctly call the sycl::native
variants when the flag is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah you are right, I think that what you say applies to a lot of tests in sycl/test/native_cpu
(e.g. we have a test for vector addition, I'm sure that equivalent functionalities are already tested in e2e
). We will remove most of the tests in sycl/test/native_cpu
once we enable e2e testing for it, but I've removed the test added in this PR, since we can reuse the tests that you pointed out.
// ModuleChanged |= true; | ||
// } | ||
//======= | ||
//>>>>>>> sycl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A merge gone wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes :) That's fixed now, thanks for spotting it
Hi @intel/llvm-reviewers-cuda, do you think there's anything else to do for this PR? |
@intel/llvm-reviewers-cuda ping |
#define GEN_TERNARY_VECTOR_BUILTIN(NAME, TYPE, NUM) \ | ||
_CLC_OVERLOAD TYPE##NUM GETNAME(NAME)(TYPE##NUM n1, TYPE##NUM n2, \ | ||
TYPE##NUM n3) { \ | ||
return __builtin_elementwise_##NAME(n1, n2, n3); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these __builtin* specific to nativecpu?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are part of clang
see here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I was thinking. It seems a bit strange to me to be calling generic clang builtins in a target specific libclc backend. Is there a good reason to do it like this?
Maybe these clang builtins better than the ones in the libclc/generic backend? Such generic impls that would be called if you don't add a target specific impl. Or could the clang builtins used here just be added to the libclc/generic impls?
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The clang builtins should lead to the same LLVM intrinsics used in libclc/generic
, but the way the SPIRV builtins are implemented in generic
doesn't play well with the x86 ABI on Linux (see #10970). I think you are right and it's worth considering changing the implementation in generic
, but I'm not 100% sure about the implications of doing so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I see, makes sense, thanks for the clarification.
#define IS_NATIVE | ||
#include "helpers.h" | ||
|
||
GEN_UNARY_BUILTIN(exp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think IS_NATIVE
could be staying defined when you don't want it to?
GEN_UNARY_BUILTIN(exp) | |
GEN_UNARY_BUILTIN(exp) | |
#undef IS_NATIVE |
etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, that's done, thank you
Hi @intel/llvm-gatekeepers I think this is ready to be merged, thank you |
Implements missing math builtins for the x86 target in libclc, only for some scalar data types, missing types will be added in a follow up PR. The builtins are defined as LLVM-IR in order to allow us to handle ABI and directly call LLVM intrinsics.