-
Notifications
You must be signed in to change notification settings - Fork 751
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Fix for detection of free function calls. #3003
Conversation
Signed-off-by: rdeodhar <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add a test that shows the cases that didn't work before?
There is an existing test llvm_test_suite_sycl/basic_free_function_queries_free_function_queries. |
Can you please add a test to illustrate this. Every change is supposed to be accompanied by a test unless it is NFC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
A test has been added for the case: |
cgh.parallel_for<class FOX>(range<1>(1), [=](id<1> I) { this_id<1>(); }); | ||
|
||
// This kernel calls sycl::this_item | ||
cgh.parallel_for<class BEE>(range<1>(1), [=](auto I) { this_item<1>(); }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. So getOperatorParens(KernelObj)
doesn't return operator method when auto
is used? Do you understand why? It is not obvious to me why. @premanandrao could you also please take a look?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rdeodhar, is this the case that was failing before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this particular construct in an existing runtime test was failing. I have replicated the situation in the existing clang lit test. Owing to some templating of members, the operator() of the KernelObj was not being found. However, the code looking for the operator() is unnecessary because the kernel function is already available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... That means KernelObj is defined differently when parameter type is auto? Do you know why this happens? I'm approving the patch since the answer to this question is probably outside the scope of this patch. However, if you know the answer I am curious :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, there is likely value to instead fix the poorly named getOperatorParens instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@erichkeane would you know why its getting wrapped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@erichkeane well that's a fun case -- I think you can use decls_begin()
and decls_end()
to do this:
static const CXXMethodDecl *getOperatorParens(const CXXRecordDecl *Rec) {
auto MethIter = llvm::find_if(Rec->methods(), [](const CXXMethodDecl *MD) {
return MD->getOverloadedOperator() == OO_Call;
});
if (MethIter != Rec->methods_end())
return *MethIter;
using function_template_iterator = specific_decl_iterator<FunctionTemplateDecl>;
using function_template_range = llvm::iterator_range<specific_decl_iterator<FunctionTemplateDecl>>;
auto FTIter = llvm::find_if(function_template_range(function_template_iterator(Rec->decls_begin()),
function_template_iterator(Rec->decls_end())),
[](const FunctionTemplateDecl *FTD) {
if (const auto *MD = dyn_cast<CXXMethodDecl>(FTD->getTemplatedDecl()))
return MD->getOverloadedOperator() == OO_Call;
return false;
});
if (FTIter != function_template_iterator(Rec->decls_end()))
return *FTIter;
return nullptr;
}
(This is totally untested and was written in a web browser, so YMMV.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@erichkeane would you know why its getting wrapped?
Because we ALWAYS wrap templates. We need something in the AST to contain BOTH the uninstantiated and instantiated template.
We should probably update the getOperatorParens, but, IMO, it should probably return an array of valid (non-dependent) operators. I'm guessing this whole idea of using getOperatorParens is flawed though. Trying to figure out the call operator of the type rather than checking the callgraph is likely the wrong way about it. @kbobrovs : Please take a look at how you're using getOperatorParens and make sure it would work in a case where there is multiple operator()s, including instantiated template versions.
IN the case of what @AaronBallman just posted (thanks btw!), I don't know if that works for 2 reasons: First, I think the return *FTIter;
line is wrong, since that returns a FunctionTemplateDecl*
, instead of a CXXMethodDecl*
. Second, it has the same problem of only returning the FIRST operator () :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, that should be return cast<CXXMethodDecl>(*FTIter)
instead, and it definitely only finds the first. As for how to handle multiple operator()
definitions -- I agree that the function should either return a container of all the operators or take the container as a parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kbobrovs : Please take a look at how you're using getOperatorParens and make sure it would work in a case where there is multiple operator()s, including instantiated template versions.
AFAIR, this simplistic approach was based on assumption that the API code is under our control, and we know that there are no other operator '()' calls from the parallel_for_kernel.
cgh.parallel_for<class FOX>(range<1>(1), [=](id<1> I) { this_id<1>(); }); | ||
|
||
// This kernel calls sycl::this_item | ||
cgh.parallel_for<class BEE>(range<1>(1), [=](auto I) { this_item<1>(); }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... That means KernelObj is defined differently when parameter type is auto? Do you know why this happens? I'm approving the patch since the answer to this question is probably outside the scope of this patch. However, if you know the answer I am curious :)
The SYCL handler.hpp makes a special-case of an integral kernel parameter type, like so:
Now I don't know why that leads to the behavior we see, but it was not worthwhile to pursue it for this bug fix, because the code that doesn't work as expected is not needed at all. |
Ok. Thanks for explaining |
* sycl: (378 commits) [sycl-post-link][NFC] Extracted the code into a subroutine (intel#3042) [SYCL][NFC] Remove commented out code (intel#3029) [CODEOWNERS] Fix ownership of DPC++ tools tests (intel#3047) [SYCL][NFC] Make tests insensitive to dso_local (intel#3037) [SYCL] Fix acquiring a mutex in _pi_context::finalize (intel#3001) [SYCL] Fix various compilation warnings in plugins (intel#2979) [SYCL][ESIMD] Add simd class conversion ctor and operator (intel#3028) [sycl-post-link][NFC] Use range-based for loop. (intel#3033) [SYCL][NFC] Fix warning in self-build (intel#3023) [NFC] Fix sycl-post-link tests to avoid potential race (intel#3031) [SYCL][CUDA] Add missing barrier to collectives (intel#2990) [SYCL] Make Intel attributes consistent with clang attributes. (intel#3022) [SYCL] Bump SYCL minor version (intel#3026) [SYCL][Doc] Added requirement on reference to test PR in commit message (intel#3010) [SYCL] Put constant initializer list data in non-generic addr space. (intel#3005) [SYCL][L0] Fix memory leak in PiDeviceCache and ZeCommandList (intel#2974) [SYCL] Fix detection of free function calls (intel#3003) [SYCL][NFC] Clean up the builder_dir argument description (intel#3021) [SYCL][ESIMD] Fix LowerESIMD crash on a scalar fptoui LLVM instruction (intel#2699) [NFC] Remove redundant call to getMainExecutable() (intel#3018) ...
This change simplifies the code to detect calls to ""free functions", i.e., functions such as this_item().
The existing code does not work in all cases to detect the Kernel function, while the corrected code uses an already known Kernel function.
Signed-off-by: rdeodhar [email protected]