Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] [NATIVECPU] Implement missing math builtins for scalar data types #11321

Merged
merged 12 commits into from
Oct 24, 2023

Conversation

PietroGhg
Copy link
Contributor

Implements missing math builtins for the x86 target in libclc, only for some scalar data types, missing types will be added in a follow up PR. The builtins are defined as LLVM-IR in order to allow us to handle ABI and directly call LLVM intrinsics.

@PietroGhg PietroGhg requested review from a team as code owners September 27, 2023 10:16
@PietroGhg PietroGhg requested a review from ldrumm September 27, 2023 10:16
@PietroGhg PietroGhg temporarily deployed to WindowsCILock September 27, 2023 10:20 — with GitHub Actions Inactive
@PietroGhg PietroGhg temporarily deployed to WindowsCILock September 27, 2023 11:00 — with GitHub Actions Inactive
@@ -0,0 +1,134 @@
// REQUIRES: native_cpu_be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is specific about native CPU in this test? It seems like it could run on other targets too.

Also aren't there tests in the E2E suite already testing the math built-ins?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test could run on other targets, but I'd like to use it to quickly test builtins that are defined in the x86 target in libclc, which is used only by native cpu. There are tests for math builtins in E2E, but we don't run the full E2E test suite on native cpu yet, because several features are unsupported

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this breaks the convention that runtime e2e tests are placed in sycl/test-e2e; whereas sycl/test checks device code generation etc. As @npmiller says ideally it is possible to use existing tests. This makes sure the test folders don't become cluttered with duplicate test cases. For example the sycl:: functions you test are already covered in tests like https://github.com/intel/llvm-test-suite/blob/2e687fcd57149112d7b29e9e85bf74ce85f1119c/SYCL/DeviceLib/built-ins/scalar_math.cpp#L125

For the sycl::native function calls, I see it can make sense to test them on a per backend basis since the precision is generally backend dependent. Note FYI that they are already generally tested to some extent in this test: https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/DeviceLib/built-ins/fast-math-flag.cpp

However the above fast-math-flag.cpp test is effectively just checking that the fast-math flag work such that sycl:: math correctly call the sycl::native variants when the flag is used.

Copy link
Contributor Author

@PietroGhg PietroGhg Oct 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you are right, I think that what you say applies to a lot of tests in sycl/test/native_cpu (e.g. we have a test for vector addition, I'm sure that equivalent functionalities are already tested in e2e). We will remove most of the tests in sycl/test/native_cpu once we enable e2e testing for it, but I've removed the test added in this PR, since we can reuse the tests that you pointed out.

@PietroGhg PietroGhg temporarily deployed to WindowsCILock October 2, 2023 13:49 — with GitHub Actions Inactive
@PietroGhg PietroGhg temporarily deployed to WindowsCILock October 2, 2023 14:29 — with GitHub Actions Inactive
// ModuleChanged |= true;
// }
//=======
//>>>>>>> sycl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A merge gone wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes :) That's fixed now, thanks for spotting it

@PietroGhg PietroGhg temporarily deployed to WindowsCILock October 9, 2023 08:55 — with GitHub Actions Inactive
@PietroGhg PietroGhg temporarily deployed to WindowsCILock October 9, 2023 09:18 — with GitHub Actions Inactive
@PietroGhg PietroGhg requested a review from a team as a code owner October 9, 2023 16:09
@PietroGhg PietroGhg temporarily deployed to WindowsCILock October 9, 2023 16:11 — with GitHub Actions Inactive
@PietroGhg PietroGhg temporarily deployed to WindowsCILock October 9, 2023 16:46 — with GitHub Actions Inactive
@PietroGhg PietroGhg temporarily deployed to WindowsCILock October 11, 2023 15:31 — with GitHub Actions Inactive
@PietroGhg PietroGhg temporarily deployed to WindowsCILock October 11, 2023 16:06 — with GitHub Actions Inactive
@PietroGhg PietroGhg temporarily deployed to WindowsCILock October 11, 2023 16:43 — with GitHub Actions Inactive
@PietroGhg
Copy link
Contributor Author

Hi @intel/llvm-reviewers-cuda, do you think there's anything else to do for this PR?

@PietroGhg
Copy link
Contributor Author

@intel/llvm-reviewers-cuda ping

#define GEN_TERNARY_VECTOR_BUILTIN(NAME, TYPE, NUM) \
_CLC_OVERLOAD TYPE##NUM GETNAME(NAME)(TYPE##NUM n1, TYPE##NUM n2, \
TYPE##NUM n3) { \
return __builtin_elementwise_##NAME(n1, n2, n3); \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these __builtin* specific to nativecpu?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are part of clang see here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I was thinking. It seems a bit strange to me to be calling generic clang builtins in a target specific libclc backend. Is there a good reason to do it like this?

Maybe these clang builtins better than the ones in the libclc/generic backend? Such generic impls that would be called if you don't add a target specific impl. Or could the clang builtins used here just be added to the libclc/generic impls?

Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The clang builtins should lead to the same LLVM intrinsics used in libclc/generic, but the way the SPIRV builtins are implemented in generic doesn't play well with the x86 ABI on Linux (see #10970). I think you are right and it's worth considering changing the implementation in generic, but I'm not 100% sure about the implications of doing so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see, makes sense, thanks for the clarification.

#define IS_NATIVE
#include "helpers.h"

GEN_UNARY_BUILTIN(exp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think IS_NATIVE could be staying defined when you don't want it to?

Suggested change
GEN_UNARY_BUILTIN(exp)
GEN_UNARY_BUILTIN(exp)
#undef IS_NATIVE

etc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, that's done, thank you

@PietroGhg PietroGhg temporarily deployed to WindowsCILock October 24, 2023 11:18 — with GitHub Actions Inactive
@PietroGhg PietroGhg temporarily deployed to WindowsCILock October 24, 2023 11:48 — with GitHub Actions Inactive
@PietroGhg
Copy link
Contributor Author

Hi @intel/llvm-gatekeepers I think this is ready to be merged, thank you

@steffenlarsen steffenlarsen merged commit ff64511 into intel:sycl Oct 24, 2023
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants