[GLUTEN-5682][VL] Fix incorrect result when isNull & isNotNull coexist in filter #5670

Merged (6 commits), May 13, 2024
@@ -106,6 +106,12 @@ class TestOperator extends VeloxWholeStageTransformerSuite {
checkLengthAndPlan(df, 6)
}

test("is_null and is_not_null coexist") {
val df = runQueryAndCompare(
"select l_orderkey from lineitem where l_comment is null and l_comment is not null") { _ => }
checkLengthAndPlan(df, 0)
}

test("and pushdown") {
val df = runQueryAndCompare(
"select l_orderkey from lineitem where l_orderkey > 2 " +
6 changes: 5 additions & 1 deletion cpp/velox/substrait/SubstraitToVeloxPlan.cc
@@ -2036,6 +2036,7 @@ void SubstraitToVeloxPlanConverter::constructSubfieldFilters(

bool nullAllowed = filterInfo.nullAllowed_;
bool isNull = filterInfo.isNull_;
bool existIsNullAndIsNotNull = filterInfo.forbidsNullSet_ && filterInfo.isNullSet_;
uint32_t rangeSize = std::max(filterInfo.lowerBounds_.size(), filterInfo.upperBounds_.size());

if constexpr (KIND == facebook::velox::TypeKind::HUGEINT) {
@@ -2122,7 +2123,10 @@ void SubstraitToVeloxPlanConverter::constructSubfieldFilters(

// Handle null filtering.
if (rangeSize == 0) {
- if (!nullAllowed) {
+ // handle is not null and is null exists at same time
+ if (existIsNullAndIsNotNull) {
+ filters[common::Subfield(inputName)] = std::move(std::make_unique<common::AlwaysFalse>());
+ } else if (!nullAllowed) {
filters[common::Subfield(inputName)] = std::make_unique<common::IsNotNull>();
} else if (isNull) {
filters[common::Subfield(inputName)] = std::make_unique<common::IsNull>();
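The branch order in the null-filtering hunk of SubstraitToVeloxPlan.cc can be restated as a small standalone function. This is only a sketch for readers of the diff: `NullFilterKind`, `NullFlags`, and `chooseNullFilter` are hypothetical names that exist in neither Gluten nor Velox, and `kOther` stands in for the remaining branches that the diff does not show.

```cpp
// Hypothetical illustration of the branch order when rangeSize == 0.
// None of these names are part of Gluten or Velox.
enum class NullFilterKind {
  kAlwaysFalse, // corresponds to common::AlwaysFalse in the diff
  kIsNotNull,   // corresponds to common::IsNotNull
  kIsNull,      // corresponds to common::IsNull
  kOther        // placeholder for branches not shown in the diff
};

// Plain copies of the four flags read from filterInfo in the diff.
struct NullFlags {
  bool nullAllowed;
  bool isNull;
  bool forbidsNullSet;
  bool isNullSet;
};

NullFilterKind chooseNullFilter(const NullFlags& f) {
  // The conflict check comes first: IS NULL and IS NOT NULL on the same
  // column can never both hold, so no row can pass.
  if (f.forbidsNullSet && f.isNullSet) {
    return NullFilterKind::kAlwaysFalse;
  }
  if (!f.nullAllowed) {
    return NullFilterKind::kIsNotNull;
  }
  if (f.isNull) {
    return NullFilterKind::kIsNull;
  }
  return NullFilterKind::kOther;
}
```

The conflict check has to come before the `!nullAllowed` test. Otherwise a filter with both predicates would fall into the IsNotNull or IsNull branch, depending on the flag values.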
4 changes: 4 additions & 0 deletions cpp/velox/substrait/SubstraitToVeloxPlan.h
@@ -316,6 +316,7 @@ class SubstraitToVeloxPlanConverter {
if (!initialized_) {
initialized_ = true;
}
forbidsNullSet_ = true;
}

// Only null is allowed.
@@ -325,6 +326,7 @@ class SubstraitToVeloxPlanConverter {
if (!initialized_) {
initialized_ = true;
}
isNullSet_ = true;
}

// Return the initialization status.
@@ -375,6 +377,8 @@ class SubstraitToVeloxPlanConverter {

bool nullAllowed_ = false;
bool isNull_ = false;
bool forbidsNullSet_ = false;
bool isNullSet_ = false;
PHILO-HE (Contributor), May 13, 2024:

Is forbidsNullSet_ logically equivalent to !nullAllowed, and isNullSet_ to isNull_? If so, we could just use the existing flags (perhaps also checking initialized_ == true). Thanks!

zjuwangg (Contributor Author), May 13, 2024:

Actually not. initialized_ can be set in other places, not only in these two null-related setter methods.

So we cannot rely on the initialized_ variable to determine whether setNull() or forbidsNull() has been called.
Also, setNull() currently sets nullAllowed_ to true, which could cause unexpected filter behavior.

IMO, forbidsNullSet_ and isNullSet_ make the code cleaner and more readable, with no additional performance cost.
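To make the point about the existing flags concrete, here is a minimal sketch of the two setters. `FilterInfoSketch` and both `conflictFrom...` helpers are hypothetical names used only for illustration. The sketch assumes that forbidsNull() clears nullAllowed_ and that setNull() sets isNull_. Only the effect of setNull() on nullAllowed_ is stated in the discussion above, and initialized_ is left out.

```cpp
// Hypothetical, simplified stand-in for the filter info class in the PR.
// Assumption: forbidsNull() clears nullAllowed_ and setNull() sets isNull_.
struct FilterInfoSketch {
  bool nullAllowed_ = false;
  bool isNull_ = false;
  bool forbidsNullSet_ = false;
  bool isNullSet_ = false;

  // Models an IS NOT NULL predicate.
  void forbidsNull() {
    nullAllowed_ = false;
    forbidsNullSet_ = true;
  }

  // Models an IS NULL predicate. Note that it overwrites nullAllowed_.
  void setNull() {
    isNull_ = true;
    nullAllowed_ = true;
    isNullSet_ = true;
  }

  // Attempt to detect the conflict from the pre-existing flags only.
  bool conflictFromExistingFlags() const {
    return !nullAllowed_ && isNull_;
  }

  // Detection with the two flags added in this PR.
  bool conflictFromNewFlags() const {
    return forbidsNullSet_ && isNullSet_;
  }
};
```

Under these assumptions, calling forbidsNull() and then setNull() leaves nullAllowed_ at true, so conflictFromExistingFlags() misses the conflict, while the reverse call order happens to detect it. conflictFromNewFlags() returns true in both orders because each setter only ever raises its own flag.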


// If true, left bound will be exclusive.
std::vector<bool> lowerExclusives_;