-
Notifications
You must be signed in to change notification settings - Fork 902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Struct hashing support for SerialMurmur3 and SparkMurmur3 #7714
Conversation
Codecov Report
@@ Coverage Diff @@
## branch-0.19 #7714 +/- ##
===============================================
+ Coverage 81.86% 82.70% +0.84%
===============================================
Files 101 101
Lines 16884 17419 +535
===============================================
+ Hits 13822 14407 +585
+ Misses 3062 3012 -50
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the java side things look good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple small comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but a couple of questions.
hash_value_type CUDA_DEVICE_CALLABLE | ||
MurmurHash3_32<cudf::list_view>::operator()(cudf::list_view const& key) const | ||
{ | ||
cudf_assert(false && "List column hashing is not supported"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this just be a static_assert? cudf_assert
will only fail in a debug build, and only at runtime. This path should result in a compile-time error, shouldn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, static_assert would be ideal. I originally tried a static assert, but I believe the type dispatcher causes it to be referenced in the switch statement and fails the build. I think that's why cudf_assert
is used in some other hash functions in this header file for unsupported types.
cudf_assert
will only fail in a debug build
That's unfortunate. Is there a separate assertion utility that should be used here that will fire in a release build as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CUDF_FAIL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is device code. CUDF_FAIL
won't work.
That's unfortunate. Is there a separate assertion utility that should be used here that will fire in a release build as well?
release_assert
used to do that, but then we found out it massively increased our binary size #7583
There's no good way to do error checking from device code like this that doesn't have a significant impact on performance. We just need to make sure to catch these kinds of things on the host beforehand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about a macro that simply asserts on the device in a release build without any lengthy string stuff such as function, file and line number? Not super useful for debugging the problem, but it would prevent the code from just running and returning some bogus result without an error in case the host-side checks aren't there or working properly. Does that still bloat the binary or harm performance in some way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that still bloat the binary or harm performance in some way?
Yeah, the mere presence of an assert
, even unexecuted, impacts performance.
__global__ void benchmark_kernel(value_type* a, value_type* b, int n, bool do_asserts)
{
int i = blockIdx.x * blockDim.x + threadIdx.x;
if(i >= n)
return;
if(do_asserts)
{
assert(b[i] <= static_cast<value_type>(n));
}
value_type x = tanh(a[i]*sinf(b[i]));
x = cosf(x);
if(do_asserts)
{
assert(x <= 1.f);
}
a[i] = x;
}
With do_asserts = false
:
without -DNDEBUG:
repetitions:100 median: 3.02846ms min: 2.95219ms p20: 2.99827ms p80: 3.05152ms max: 5.3207ms
with -DNDEBUG:
repetitions:100 median: 2.816ms min: 2.7904ms p20: 2.80781ms p80: 2.82726ms max: 2.84467ms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have debug builds getting tested under CI at least on a nightly/weekly basis in 0.20 which will help flag any code going down a path it shouldn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what's the action here @jlowe @jrhemstad ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand it, there's nothing more to do. The host code is already checking the types before attempting to dispatch, and these specializations using cudf_assert
are only in place just in case somehow they get invoked errantly in practice. The performance hit from using an actual device assert in release code is too great, so cudf_assert
is as close as we can get, IIUC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I thought Jake was saying just having a cudf_assert
was expensive. Misunderstood.
auto child_columns = to_leaf_columns(col.child_begin(), col.child_end()); | ||
leaf_columns.insert(leaf_columns.end(), child_columns.begin(), child_columns.end()); | ||
} else { | ||
leaf_columns.emplace_back(col); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I be worried that the else
uses emplacement while the if
uses insertion? I'm not sure how to understand this. Is col
being moved while child_columns
are being copied?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
col
is being used as a copy-constructor argument and child_columns
are being copied as they are inserted. These are all dealing with column_view
which owns no resources, so there shouldn't be any move semantics here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I was hoping that somehow the two branches could be made more symmetric to avoid confusion. But I looked at how to do it and I couldn't find a way.
@gpucibot merge |
Adding struct column support for serial Murmur3 and Spark-compatible Murmur3 hashing. This explodes the struct column into the leaf columns before passing it to the existing hash support. The validity of the parent struct columns can be ignored because hashing a null ends up as a no-op that returns the hash seed, so only the leaf columns within the struct column need to be considered for the hash computation.