Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix NaN handling in GpuLessThanOrEqual and GpuGreaterThanOrEqual #9780

Merged
merged 2 commits into from
Nov 20, 2023

Conversation

jlowe
Copy link
Member

@jlowe jlowe commented Nov 17, 2023

Fixes #9751. Fixes #9711.

ColumnView.isNan does not return any nulls (null input rows return false output rows), so the logic for handling a scalar NaN in GpuLessThanOrEqual and GpuGreaterThanOrEqual was accidentally clobbering nulls in the output. Updated to simply apply the validity vector of the input to the column vector full of true rows to produce the proper result for this corner case.

@jlowe jlowe self-assigned this Nov 17, 2023
@jlowe
Copy link
Member Author

jlowe commented Nov 17, 2023

build

@jlowe
Copy link
Member Author

jlowe commented Nov 17, 2023

build

1 similar comment
@pxLi
Copy link
Collaborator

pxLi commented Nov 20, 2023

build

@jlowe
Copy link
Member Author

jlowe commented Nov 20, 2023

build

@@ -156,6 +156,7 @@ def test_lte(data_gen):
lambda spark : binary_op_df(spark, data_gen).select(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want some tests with hard coded NaN in them?

f.lit("NaN").cast(data_type)

Similar to what we do for None? Just so we know the error case is covered instead of waiting for it to possibly happen.

@@ -258,7 +258,6 @@ def test_lt(data_descr):
s2 < f.col('b'),
f.col('a') < f.col('b')))

@datagen_overrides(seed=0, reason='https://github.com/NVIDIA/spark-rapids/issues/9711')
@pytest.mark.parametrize('data_descr', ast_comparable_descrs, ids=idfn)
def test_lte(data_descr):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like in the other tests. Do we want to explicitly have a NaN Scalar in the tests to verify? Or a dedicated test for it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're currently generating NaN and null often enough as special cases that this should be covered pretty well across multiple runs. Want to get these tests fixed for CI, can file followup if desired.

@jlowe jlowe merged commit 0e6fda6 into NVIDIA:branch-23.12 Nov 20, 2023
37 checks passed
@jlowe jlowe deleted the fix-cmp-nan-tests branch November 20, 2023 21:06
@sameerz sameerz added the bug Something isn't working label Nov 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
4 participants