Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support CTAS for Hive transactional tables #4516

Closed
wants to merge 13 commits into from

Conversation

losipiuk
Copy link
Member

This is based on https://github.com/prestosql/presto/pull/2930/files, so please just review top commit.

@cla-bot cla-bot bot added the cla-signed label Jul 21, 2020
@losipiuk losipiuk requested review from wendigo and findepi July 21, 2020 12:29
@@ -883,7 +891,9 @@ public void createTable(ConnectorSession session, ConnectorTableMetadata tableMe
// When metastore is configured with metastore.create.as.acid=true, it will also change Presto-created tables
// behind the scenes. In particular, this won't work with CTAS.
// TODO (https://github.com/prestosql/presto/issues/1956) convert this into normal table property
tableProperties.put(TRANSACTIONAL, "false");

boolean transactional = HiveTableProperties.isTransactional(tableMetadata.getProperties()).orElseGet(() -> false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

orElse

}
else {
if (bucketNumber.isPresent()) {
return computeBucketedFileName(Optional.of(queryId), bucketNumber.getAsInt());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i know its preexisting, but how do we avoid collisions here?
@electrum

UnsignedLong.fromLongBits(uuid.getMostSignificantBits()).toString());
}
}
else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed, though it kinda looks worse with long code blocks imo.

if (bucketNumber.isPresent()) {
return computeBucketedFileName(Optional.empty(), bucketNumber.getAsInt());
}
else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (queryId.isPresent()) {
return format("0%s_0_%s", paddedBucket, queryId.get());
}
else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant else

Comment on lines 559 to 560
// For bucketed tables we drop query id from file names and just leave <bucketId>_0
// For non bucketed tables we use 000000_<uuid_as_number>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like bucketNumber.isPresent() is the more important condition than isTransactionalTable.
So I'd structure the code different, with bucket being the top if.

UUID uuid = randomUUID();
return format("0%s_%s%s",
paddedBucket,
UnsignedLong.fromLongBits(uuid.getLeastSignificantBits()).toString(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
UnsignedLong.fromLongBits(uuid.getLeastSignificantBits()).toString(),
Long.toUnsignedString(uuid.getLeastSignificantBits()),

else {
fileName = queryId + "_" + randomUUID();
}
fileName = computeFileName(bucketNumber);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put on prev line

@@ -549,6 +549,37 @@ else if (insertExistingPartitionsBehavior == InsertExistingPartitionsBehavior.ER
hiveWriterStats);
}

private String computeFileName(OptionalInt bucketNumber)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have this right above computeBucketedFileName?

public void testCtasAcidTable(boolean isPartitioned, BucketingType bucketingType)
{
if (getHiveVersionMajor() < 3) {
throw new SkipException("Presto Hive transactional tables are supported with Hive version 3 or above");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presto Hive transactional -> Hive transactional

@losipiuk
Copy link
Member Author

AC

@losipiuk
Copy link
Member Author

Rebased on new version of #2930 on master.

@findepi
Copy link
Member

findepi commented Jul 25, 2020

Superseded by #4574

@findepi findepi closed this Jul 25, 2020
@findepi findepi mentioned this pull request Jul 25, 2020
8 tasks
@findepi findepi added this to the 340 milestone Jul 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

2 participants