Fix bug when DELETE ACID block is a DictionaryBlock #9354

Closed


@djsstarburst (Member) commented Sep 23, 2021

This PR consists of two commits:

Before the first commit, Block.getChildren() was used to take apart the
DELETE ACID rowId block. That is always wrong, because the rowId block is
not guaranteed to be a plain row block: when it is a DictionaryBlock, as in
this bug, getChildren() does not return the three rowId fields.
Fixed by using ColumnarRow to access the elements of the ACID rowId block.

The first commit also adds a comment warning developers not to use
Block.getChildren(), so others don't make the same mistake.

The second commit replaces the use of Block.getChildren() in
HiveUpdateProcessor with ColumnarRow methods.
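
To make the failure mode concrete, here is a minimal, self-contained sketch. Every class in it is a simplified stand-in written for this example, not Trino's actual Block, DictionaryBlock, or ColumnarRow. It assumes that a dictionary wrapper reports its dictionary as its only child, which is why a "3 children" check fails on it. The hypothetical readField helper plays roughly the role that ColumnarRow plays in the fix: it resolves the dictionary ids before reading a field.

```java
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT Trino's SPI classes.
final class RowIdSketch
{
    private RowIdSketch() {}

    interface Block
    {
        int getPositionCount();

        List<Block> getChildren();
    }

    // Leaf block holding one long value per position.
    static final class LongBlock
            implements Block
    {
        final long[] values;

        LongBlock(long... values)
        {
            this.values = values;
        }

        @Override
        public int getPositionCount()
        {
            return values.length;
        }

        @Override
        public List<Block> getChildren()
        {
            return List.of();
        }
    }

    // Row block: one child per field, all aligned by position.
    static final class RowBlock
            implements Block
    {
        final LongBlock[] fields;

        RowBlock(LongBlock... fields)
        {
            this.fields = fields;
        }

        @Override
        public int getPositionCount()
        {
            return fields[0].getPositionCount();
        }

        @Override
        public List<Block> getChildren()
        {
            return List.<Block>of(fields);
        }
    }

    // Dictionary wrapper (assumption): its only child is the dictionary,
    // and ids[position] selects a row of that dictionary.
    static final class DictionaryBlock
            implements Block
    {
        final RowBlock dictionary;
        final int[] ids;

        DictionaryBlock(RowBlock dictionary, int[] ids)
        {
            this.dictionary = dictionary;
            this.ids = ids;
        }

        @Override
        public int getPositionCount()
        {
            return ids.length;
        }

        @Override
        public List<Block> getChildren()
        {
            return List.<Block>of(dictionary);
        }
    }

    // Position-aware read that resolves dictionary ids instead of trusting getChildren().
    static long readField(Block rowIds, int field, int position)
    {
        if (rowIds instanceof DictionaryBlock) {
            DictionaryBlock dict = (DictionaryBlock) rowIds;
            return dict.dictionary.fields[field].values[dict.ids[position]];
        }
        if (rowIds instanceof RowBlock) {
            return ((RowBlock) rowIds).fields[field].values[position];
        }
        throw new IllegalArgumentException("Not a row-shaped block");
    }
}
```

In this sketch a plain RowBlock with three fields has three children, but the same rows wrapped in a DictionaryBlock have one child, and the child's positions no longer line up with the positions of the wrapper.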

@@ -116,28 +117,31 @@ public HiveUpdatablePageSource(
@Override
public void deleteRows(Block rowIds)
{
List<Block> blocks = rowIds.getChildren();
checkArgument(blocks.size() == 3, "The rowId block for DELETE should have 3 children, but has %s", blocks.size());
Member

Should this check survive in the form of columnarRow.getFieldCount()==3 inside deleteRowsInternal?

Member Author

Good idea, added.

List<Block> blocks = rowIds.getChildren();
ColumnarRow columnarRow = ColumnarRow.toColumnarRow(rowIds);
for (int position = 0; position < positionCount; position++) {
// Throw an exception if there are any null rows
Member

redundant

Suggested change
// Throw an exception if there are any null rows

Member Author

Removed.

ColumnarRow columnarRow = ColumnarRow.toColumnarRow(rowIds);
for (int position = 0; position < positionCount; position++) {
// Throw an exception if there are any null rows
checkArgument(!columnarRow.isNull(position), "In the deleteRows block, found null position %s", position);
Member

deleteRows is not a variable or anything else known

Suggested change
checkArgument(!columnarRow.isNull(position), "In the deleteRows block, found null position %s", position);
checkArgument(!columnarRow.isNull(position), "In the delete rowIds, found null row at position %s", position);

Member Author

Replaced.
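
For reference, the two argument checks discussed in this thread (the field count and the null-row check with the reworded message) can be sketched in isolation as follows. This is only an illustration: DeleteRowIdChecks, validate, and the boolean[] input are hypothetical, and the local checkArgument is a stand-in for the one the real code calls (presumably Guava's).

```java
// Hypothetical helper for illustration; not the code in HiveUpdatablePageSource.
final class DeleteRowIdChecks
{
    private DeleteRowIdChecks() {}

    // Local stand-in for the checkArgument used in the diff.
    static void checkArgument(boolean condition, String template, Object... args)
    {
        if (!condition) {
            throw new IllegalArgumentException(String.format(template, args));
        }
    }

    // fieldCount: number of fields in the rowId row; rowIsNull[i]: whether row i is null.
    static void validate(int fieldCount, boolean[] rowIsNull)
    {
        checkArgument(fieldCount == 3, "The rowId block for DELETE should have 3 children, but has %s", fieldCount);
        for (int position = 0; position < rowIsNull.length; position++) {
            checkArgument(!rowIsNull[position], "In the delete rowIds, found null row at position %s", position);
        }
    }
}
```

The reworded message names the thing being checked (the delete rowIds) and the offending position, rather than a method name the reader of the error may not know.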

@@ -832,6 +832,28 @@ public void testDeleteAllRowsInPartition()
});
}

@Test(groups = HIVE_TRANSACTIONAL, timeOut = TEST_TIMEOUT)
public void testDeleteAllRowsUnpartitioned()
Member

i think the important aspect is "delete after delete".

assuming it's correct, let's reflect this in a test name: testDeleteAfterDelete

Member Author

Good idea; renamed.

log.info("About to delete all rows");
onTrino().executeQuery("DELETE FROM " + tableName);

verifySelectForTrinoAndHive("SELECT COUNT(*) FROM " + tableName, "TRUE", row(0));
Member

nit: lowercase count, true

Member Author

Changed.

onTrino().executeQuery(format("DELETE FROM %s WHERE id = 2", tableName));

log.info("About to verify");
verifySelectForTrinoAndHive("SELECT * FROM " + tableName, "TRUE", row(1), row(3));
Member

nit: lowercase true

Member

verification of this stage is a bit redundant (we already have some tests for delete), especially if we run this on Hive too. remove, or keep only on Trino.

Member Author

Changed to test only on Trino.

verifySelectForTrinoAndHive("SELECT * FROM " + tableName, "TRUE", row(1), row(3));

log.info("About to delete all rows");
onTrino().executeQuery("DELETE FROM " + tableName);
Member

Was the bug triggered for DELETE without any predicate, or for a delete with a predicate too?

we should test both cases

Member Author

Good idea. I added a test that differs in this way:

            // A predicate sufficient to fool statistics-based optimization
            onTrino().executeQuery(format("DELETE FROM %s WHERE id != 2", tableName));

withTemporaryTable("delete_all_rows", true, false, NONE, tableName -> {
onTrino().executeQuery(format("CREATE TABLE %s (id INT) WITH (transactional = true)", tableName));

log.info("About to insert");
Member

Member Author

All logging removed from this method.

@findepi (Member) commented Sep 24, 2021

cc @losipiuk

@djsstarburst (Member Author)

Thanks for the prompt, close review, @findepi. I force pushed an update making all your suggested changes.

BTW, splitting the WHERE clause out into a separate argument of the TestHiveTransactionalTable.verifySelect* methods, which I introduced, was a dumb idea. I will submit a no-semantic-change PR to remove that argument.

Before this commit, Block.getChildren() was used to take apart the
DELETE ACID rowId block. That is always the wrong thing to do.
Fixed by using ColumnarRow to access the elements of the ACID rowId block.

This commit also adds a comment warning developers not to use
Block.getChildren() so others don't make the same mistake of
calling the method.

Replace use of Block.getChildren in HiveUpdatablePageSource and
HiveUpdateProcessor with ColumnarRow methods.
@findepi (Member) commented Sep 28, 2021

CI #8432

@findepi (Member) commented Sep 28, 2021

Merged as 87785b4, thanks!


3 participants