Fix Presto Java UUID serialization #11197
please review @aditi-pandit @Yuhta @mbasmanova
@@ -63,10 +63,13 @@ TEST_F(UuidFunctionsTest, castAsVarchar) {
// Verify that CAST results as the same as boost::lexical_cast. We do not use
// boost::lexical_cast to implement CAST because it is too slow.
auto expected = makeFlatVector<std::string>(size, [&](auto row) {
const auto uuid = uuids->valueAt(row);
auto uuid = uuids->valueAt(row);
Just do this and the rest of the file can be left unchanged:
auto uuid = folly::Endian::big(uuids->valueAt(row));
@@ -97,8 +97,8 @@ class UuidCastOperator : public exec::CastOperator {

size_t offset = 0;
for (auto i = 0; i < 16; ++i) {
result.data()[offset] = kHexTable[uuidBytes[i] * 2];
result.data()[offset + 1] = kHexTable[uuidBytes[i] * 2 + 1];
result.data()[offset] = kHexTable[uuidBytes[15 - i] * 2];
const auto uuid = folly::Endian::big(uuids->valueAt(row));
@@ -125,7 +125,10 @@ class UuidCastOperator : public exec::CastOperator {
auto uuid = boost::lexical_cast<boost::uuids::uuid>(uuidString);

int128_t u;
memcpy(&u, &uuid, 16);
auto charPtr = reinterpret_cast<char*>(&u);
memcpy(&u, &uuid, 16);
u = folly::Endian::big(u);
Thanks @Yuhta , but |
@BryanCutler You can add a utility to |
af1deb8 to beb2740
Added DecimalUtil::big for reversing int128_t byte order
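For illustration, reversing the byte order of a 128-bit value could look roughly like the sketch below. This is not the code added in the PR: `reverseBytes128` is a hypothetical name, and the `int128_t` alias assumes a compiler that provides `__int128` (GCC or Clang).

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Assumption: __int128 is available (GCC/Clang extension).
using int128_t = __int128;

// Hypothetical helper, for illustration only: returns `value` with its
// 16 bytes in reverse order, converting between little- and big-endian
// layouts of the same 128-bit number.
inline int128_t reverseBytes128(int128_t value) {
  unsigned char bytes[sizeof(value)];
  std::memcpy(bytes, &value, sizeof(value));
  std::reverse(bytes, bytes + sizeof(value));
  std::memcpy(&value, bytes, sizeof(value));
  return value;
}
```

Applying the helper twice returns the original value, so the same function could serve both the string-to-UUID and UUID-to-string directions.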
beb2740 to d608cfe
This fixes the PrestoSerializer to put UUID values in the format expected by Presto Java, so that the values match those from a Java worker. First, when converting a UUID to or from a string, the values are no longer kept in big-endian format (as taken from boost::uuid) and are instead stored as little-endian in an int128_t. Second, Presto Java reads UUID values from an Int128ArrayBlock with the first value as the most significant bits. To correct this, the upper and lower parts of the int128_t are swapped during serialization and deserialization.

A unit test checking round-trip UUID serialization was added, and Presto was manually tested with a native worker to verify that the problem from the issue description is fixed.
From prestodb/presto#23311
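The swap of the upper and lower parts described above can be sketched as follows. This is a minimal illustration, not the serializer code from this PR: `swapHalves` is a hypothetical name, and the `int128_t` alias assumes `__int128` support (GCC or Clang).

```cpp
#include <cstdint>

// Assumption: __int128 is available (GCC/Clang extension).
using int128_t = __int128;

// Hypothetical helper, for illustration only: exchanges the upper and
// lower 64-bit halves of a 128-bit value. The shifts are done on the
// unsigned type to avoid sign extension.
inline int128_t swapHalves(int128_t value) {
  const auto bits = static_cast<unsigned __int128>(value);
  return static_cast<int128_t>((bits << 64) | (bits >> 64));
}
```

Because the swap is its own inverse, the same operation could be applied on both the serialization and the deserialization path.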