Skip to content

Commit

Permalink
Fix bug for materialized join results with lazy input with local voca…
Browse files Browse the repository at this point in the history
…bs (#1684)

This fixes a subtle bug which led to a segmentation fault in the following scenario:

1. The result of a join is fully materialized, but at least one of its inputs is lazy
2. The blocks of the unique input contain local vocabs with unique entries (not shared between the blocks)
3. There is at least one block in the lazy input that has no matching row in the other input, and therefore its local vocab is not used
4. There is at least one block before the block from item 3 that is needed (including its local vocab)

Previously, the local vocabs from all blocks that fulfill item 3 were not part of the input and caused segfaults, when the corresponding local vocab pointer was dereferenced.

Fixes #1679
  • Loading branch information
joka921 authored Dec 16, 2024
1 parent 6429061 commit e422852
Show file tree
Hide file tree
Showing 2 changed files with 43 additions and 2 deletions.
9 changes: 7 additions & 2 deletions src/engine/AddCombinedRowToTable.h
Original file line number Diff line number Diff line change
Expand Up @@ -155,8 +155,13 @@ class AddCombinedRowToIdTable {
if (nextIndex_ != 0) {
AD_CORRECTNESS_CHECK(inputLeftAndRight_.has_value());
flush();
} else {
// Clear vocab when no rows were written.
} else if (resultTable_.empty()) {
// Clear local vocab when no rows were written.
//
// TODO<joka921, robinTF> This is a conservative approach. We could
// optimize this case (clear the local vocab more often, but still
// correctly) by considering the situation after all the relevant inputs
// have been processed.
mergedVocab_ = LocalVocab{};
}
}
Expand Down
36 changes: 36 additions & 0 deletions test/AddCombinedRowToTableTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -323,3 +323,39 @@ TEST(AddCombinedRowToTable, verifyLocalVocabIsRetainedWhenNotMoving) {
EXPECT_TRUE(vocabContainsString(localVocab, "b"));
EXPECT_THAT(localVocab.getAllWordsForTesting(), ::testing::SizeIs(2));
}

// _____________________________________________________________________________
TEST(AddCombinedRowToTable, localVocabIsOnlyClearedWhenLegal) {
auto outputTable = makeIdTableFromVector({});
outputTable.setNumColumns(3);
ad_utility::AddCombinedRowToIdTable adder{
1, std::move(outputTable),
std::make_shared<ad_utility::CancellationHandle<>>(), 1};

IdTableWithVocab input1{makeIdTableFromVector({{0, 1}}),
createVocabWithSingleString("a")};
IdTableWithVocab input2{makeIdTableFromVector({{0, 2}}),
createVocabWithSingleString("b")};

adder.setInput(input1, input2);
adder.addRow(0, 0);
IdTableWithVocab input3{makeIdTableFromVector({{3, 1}}),
createVocabWithSingleString("c")};
IdTableWithVocab input4{makeIdTableFromVector({{3, 2}}),
createVocabWithSingleString("d")};
// NOTE: This seemingly redundant call to `setInput` is important, as it tests
// a previous bug: Each call to `setInput` implicitly also calls `flush` and
// also possibly clears the local vocab if it is not used anymore. In this
// case however we may not clear the local vocab, as the result of the
// previous calls to `addRow` has not yet been extracted.
adder.setInput(input1, input2);
adder.setInput(input3, input4);
adder.addRow(0, 0);
auto localVocab = adder.localVocab().clone();

EXPECT_TRUE(vocabContainsString(localVocab, "a"));
EXPECT_TRUE(vocabContainsString(localVocab, "b"));
EXPECT_TRUE(vocabContainsString(localVocab, "c"));
EXPECT_TRUE(vocabContainsString(localVocab, "d"));
EXPECT_THAT(localVocab.getAllWordsForTesting(), ::testing::SizeIs(4));
}

0 comments on commit e422852

Please sign in to comment.