Use datum arithmetic scalar value #7375

Merged: 3 commits, Aug 24, 2023

Conversation


@tustvold tustvold commented Aug 22, 2023

Which issue does this PR close?

Part of #7353

Rationale for this change

This avoids duplicated arithmetic logic, and moves us closer to replacing the custom ScalarValue logic with upstream kernels (#7353)

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

The upstream kernels have a couple of differences compared to the existing impls:

  • Null + v = Null, whereas previously it was v
  • Decimal precision is correctly updated
  • Heterogeneous interval arithmetic is not supported; this is instead the responsibility of the type coercion machinery
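To make the first two differences concrete, here is a minimal stand-alone sketch that uses only the Rust standard library. It does not call the arrow kernels or `ScalarValue`. The function names (`add_legacy`, `add_null_propagating`, `decimal_add_type`) are hypothetical, `Option<i64>` stands in for a nullable scalar, and the decimal widening rule shown is the common SQL-style rule, assumed here for illustration rather than taken from this PR.

```rust
/// Hypothetical model of the old behavior described above: a null operand
/// was effectively ignored, so `Null + v` evaluated to `v`.
fn add_legacy(a: Option<i64>, b: Option<i64>) -> Option<i64> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x.wrapping_add(y)),
        (Some(x), None) | (None, Some(x)) => Some(x),
        (None, None) => None,
    }
}

/// Hypothetical model of the kernel-style behavior: any null operand
/// makes the result null, so `Null + v` evaluates to `Null`.
fn add_null_propagating(a: Option<i64>, b: Option<i64>) -> Option<i64> {
    Some(a?.wrapping_add(b?))
}

/// Assumed maximum precision of a 128-bit decimal.
const MAX_DECIMAL128_PRECISION: u8 = 38;

/// Assumed result type for `Decimal(p1, s1) + Decimal(p2, s2)`:
/// keep the larger scale, keep the larger number of integer digits,
/// and add one digit for a possible carry, capped at the maximum precision.
/// Scales are taken as non-negative here to keep the sketch short.
fn decimal_add_type(p1: u8, s1: u8, p2: u8, s2: u8) -> (u8, u8) {
    let scale = s1.max(s2);
    let int_digits = p1.saturating_sub(s1).max(p2.saturating_sub(s2));
    let precision = scale
        .saturating_add(int_digits)
        .saturating_add(1)
        .min(MAX_DECIMAL128_PRECISION);
    (precision, scale)
}
```

Under these assumptions, `add_legacy(None, Some(5))` yields `Some(5)` while `add_null_propagating(None, Some(5))` yields `None`, and adding a `Decimal(10, 2)` to a `Decimal(5, 3)` produces a `Decimal(12, 3)` rather than reusing either input's precision.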

@tustvold tustvold added the api change Changes the API exposed to users of the crate label Aug 22, 2023
@github-actions github-actions bot added physical-expr Physical Expressions optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Aug 22, 2023
/// Wrapping addition of `ScalarValue`
///
/// NB: operating on `ScalarValue` directly is not efficient, performance sensitive code
/// should instead make use of vectorized array kernels
pub fn add<T: Borrow<ScalarValue>>(&self, other: T) -> Result<ScalarValue> {

@tustvold tustvold Aug 22, 2023

The performance of this is likely worse. However, following #7358, the only uses of this logic are within the range analysis framework, the median accumulator (to average two values when the number of values is even), and Range window functions, none of which I would consider performance critical.
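For context on the median case, a rough sketch of what "average two values" involves is shown below. It works on plain `i64` values with the standard library only. `median_i64` is a hypothetical helper and not DataFusion's accumulator, and the use of `wrapping_add` simply mirrors the "wrapping addition" wording of the doc comment under review.

```rust
/// Hypothetical helper: median of a slice of `i64` values.
/// Returns `None` for an empty slice.
fn median_i64(values: &mut [i64]) -> Option<i64> {
    if values.is_empty() {
        return None;
    }
    values.sort_unstable();
    let mid = values.len() / 2;
    if values.len() % 2 == 1 {
        // Odd count: the middle element is the median.
        Some(values[mid])
    } else {
        // Even count: average the two middle elements. This is a single
        // scalar addition per query, not one per row, which is why its
        // cost is unlikely to matter. The sum wraps on overflow and the
        // division truncates toward zero.
        let sum = values[mid - 1].wrapping_add(values[mid]);
        Some(sum / 2)
    }
}
```

For example, `median_i64(&mut [4, 1, 3, 2])` sorts the input to `[1, 2, 3, 4]`, adds the two middle values to get 5, and returns `Some(2)` after integer division.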

@tustvold

#7376 should resolve the median issue

@tustvold tustvold force-pushed the use-datum-arithmetic-scalar-value branch from 5064a79 to 86b4453 Compare August 23, 2023 08:18
@github-actions github-actions bot removed optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Aug 23, 2023
@tustvold tustvold force-pushed the use-datum-arithmetic-scalar-value branch from 86b4453 to 4bc82e5 Compare August 23, 2023 08:35
@github-actions github-actions bot removed the physical-expr Physical Expressions label Aug 23, 2023
@tustvold tustvold marked this pull request as ready for review August 23, 2023 08:42

@alamb alamb left a comment

This is great -- thank you @tustvold. I took the liberty of merging up from main (and tweaking the performance comment, because I couldn't help myself), but I think we should merge this PR once the tests are passing.

cc @berkaysynnada

@@ -2072,53 +1188,48 @@ impl ScalarValue {
}
}

/// Wrapping addition of `ScalarValue`
///
/// NB: operating on `ScalarValue` directly is not efficient, performance sensitive code
I like the performance admonition 👍

@alamb

alamb commented Aug 23, 2023

> The upstream kernels have a couple of differences compared to the existing impls:

These are all bug fixes in my mind

@tustvold tustvold merged commit 79e751f into apache:main Aug 24, 2023
21 checks passed