-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unconditional Lookup in Neighbor Table #1797
base: master
Are you sure you want to change the base?
Conversation
launch perftest |
Hi there, this is jenkins continuous integration... |
launch perftest |
launch jenkins |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we shift the origin this might be wrong
@@ -86,9 +86,9 @@ namespace gridtools::fn { | |||
template <class Tag, class Ptr, class Strides, class Domain, class Conn, class Offset> | |||
GT_FUNCTION constexpr auto horizontal_shift(iterator<Tag, Ptr, Strides, Domain> const &it, Conn, Offset) { | |||
auto const &table = host_device::at_key<Conn>(it.m_domain.m_tables); | |||
auto new_index = it.m_index == -1 ? -1 : get<Offset::value>(neighbor_table::neighbors(table, it.m_index)); | |||
auto new_index = get<Offset::value>(neighbor_table::neighbors(table, std::max(it.m_index, 0))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. For exotic cases the table might not have an entry at 0? E.g. after an origin shift. But maybe we can document in the neighbor_table concept that the 0 index need to exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea if there are realistic cases where this happens. And no idea if it makes sense to include this workaround. It’s currently required to have reasonable performance on Daint due to the old CUDA versions. But once we move the performance tests to Alps (we probably will, right?), the GT_PROMISE
thing should be good enough, I guess.
Allows better code generations on platforms which don‘t support
GT_PROMISE
, that is, CUDA < 12.