Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support vectorized unary operators #5130

Merged
merged 8 commits into from
Aug 25, 2022
Merged

Conversation

onelson
Copy link
Contributor

@onelson onelson commented Aug 25, 2022

Fixes #5045

Adds support for vectorized unary operators: add, sub, exists, not.

Differences in the supported Value types between the regular unary and the vectorized fall along the lines of what can be represented as column data (which is the gateway to vectorized map) and what can be freely stored in arrow arrays.

The notable omission here is Duration, which is commonly used with unary sub, ex: -1d. Since this type cannot appear in tables, it cannot be mapped over using vectorization.

The new code path is guarded by a feature flag: vectorizedUnaryOps.

Checklist

Dear Author 👋, the following checks should be completed (or explicitly dismissed) before merging.

  • ✏️ Write a PR description, regardless of triviality, to include the value of this PR
  • 🔗 Reference related issues
  • 🏃 Test cases are included to exercise the new code
  • 🧪 If new packages are being introduced to stdlib, link to Working Group discussion notes and ensure it lands under experimental/
  • 📖 If language features are changing, ensure docs/Spec.md has been updated

Dear Reviewer(s) 👋, you are responsible (among others) for ensuring the completeness and quality of the above before approval.

@onelson onelson force-pushed the onelson/feat/vectorized-unary branch from b8ceb84 to c48e0ff Compare August 25, 2022 16:17
@onelson onelson marked this pull request as ready for review August 25, 2022 16:27
@onelson onelson requested a review from a team as a code owner August 25, 2022 16:27
@onelson onelson requested review from wolffcm and removed request for a team August 25, 2022 16:27
@@ -644,15 +644,91 @@ func (e *unaryEvaluator) Type() semantic.MonoType {
return e.t
}

func doUnary(mt semantic.MonoType, op ast.OperatorKind, v values.Value) (values.Value, error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lightly modified - this was formerly a closure defined inside the unary evaluator. Extracted so it could be shared with the vector version for the vec repeat case.

Copy link

@wolffcm wolffcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.


/// Enables calls to map to be vectorized when the function contains
/// unary operators like: add, sub exists, not.
VectorizedUnaryOps,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to start generating Rust code for features at some point. (just an observation)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building the enum from the yaml file? Yeah, that's an interesting idea. It could be something we do via build.rs

b.Resize(n)
for i := 0; i < n; i++ {
if v.IsValid(i) {
b.Append(-v.Value(i))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was curious what Go does if you try to negate math.MinInt64:
https://go.dev/play/p/h-QiLnQU8nz

I guess we inherit this behavior...

switch op {
case ast.AdditionOperator:
// Do nothing.
return v, nil
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always wonder why this operator exists... at least the implementation is efficient...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I mean... 🤷

I'm just trying to stay faithful to the source material 🤣

expect_test::expect![[r##"
(r) => {
return {r:{#C with _value: v[#B]} with add: +r:{#C with _value: v[#B]}._value:v[#B]:v[#B], sub: -r:{#C with _value: v[#B]}._value:v[#B]:v[#B]}:{#C with add: v[#B], sub: v[#B], _value: v[#B]}
}:(r: {#C with _value: v[#B]}) => {#C with add: v[#B], sub: v[#B], _value: v[#B]}"##]].assert_eq(&crate::semantic::formatter::format_node(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious how you like this way of verifying vectorized code, or if you think there's a better way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a very hard time reading the string here, so I'm glad it's generated for me. I doubt I could write this by hand correctly without much trial and error.

For me, the exercise mostly comes down to making sure I see a v[] around either a scalar type or a type var in the right field positions. Other than that, the test gives me pass/fail on whether or not the function expression can be vectorized.

Seems sufficient.

@onelson onelson merged commit 43e9276 into master Aug 25, 2022
@onelson onelson deleted the onelson/feat/vectorized-unary branch August 25, 2022 23:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Vectorize unary Add, Sub, Exists, Not
2 participants