Minor: Remove cloning ArrayData in with_precision_and_scale #3050

Merged
merged 2 commits into from
Nov 10, 2022

Conversation

viirya
Member

@viirya viirya commented Nov 8, 2022

Which issue does this PR close?

Closes #.

Rationale for this change

with_precision_and_scale does not actually need to take self by value; a reference is enough.

Moreover, we usually obtain a reference to a decimal array through APIs such as as_primitive_array, and the current signature makes it harder to call with_precision_and_scale on such a reference.

What changes are included in this PR?

Are there any user-facing changes?

@viirya viirya added the api-change Changes to the arrow API label Nov 8, 2022
@github-actions github-actions bot added the arrow Changes to the arrow crate label Nov 8, 2022
Contributor

@tustvold tustvold left a comment

Would it need self if we changed it to not clone ArrayData?

@viirya
Member Author

viirya commented Nov 9, 2022

Would it need self if we changed it to not clone ArrayData?

If we don't clone ArrayData, then yes, self is needed.

@viirya
Member Author

viirya commented Nov 9, 2022

The main pain point of the current API is that with_precision_and_scale cannot be called on the primitive array reference returned from as_primitive_array.

@viirya
Member Author

viirya commented Nov 10, 2022

@tustvold Do you have more thoughts on this?

@tustvold
Contributor

I think a with_ method taking self by reference is somewhat at odds with the rest of the codebase. It is primarily intended for use when building a new array, at which point cloning the ArrayData is just unnecessary overhead. In the downcast case you mention, one can currently do

DecimalArray::from(array.data().clone()).with_precision_and_scale(p, s).unwrap()

i.e. explicitly clone the array data, and this is actually less verbose than downcasting it. So I'm not sure about this...
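For illustration, the explicit-clone pattern above can be sketched with toy types. Everything in this sketch (the `ArrayData` and `DecimalArray` structs, their fields, the simplified precision check, and the `rescale_borrowed` helper) is a hypothetical stand-in and not the arrow crate's actual definitions; only the shape of the call chain follows the line above.

```rust
// Hypothetical stand-in for arrow's ArrayData; not the real type.
#[derive(Clone, Debug, PartialEq)]
struct ArrayData {
    precision: u8,
    scale: i8,
    values: Vec<i128>,
}

// Hypothetical stand-in for a decimal array backed by ArrayData.
#[derive(Debug)]
struct DecimalArray {
    data: ArrayData,
}

impl From<ArrayData> for DecimalArray {
    fn from(data: ArrayData) -> Self {
        DecimalArray { data }
    }
}

impl DecimalArray {
    // Accessor that only hands out a shared reference.
    fn data(&self) -> &ArrayData {
        &self.data
    }

    // Builder-style method: it consumes the array, so it needs no clone itself.
    fn with_precision_and_scale(self, precision: u8, scale: i8) -> Result<Self, String> {
        // Simplified validation, for illustration only.
        if precision == 0 || precision > 38 {
            return Err(format!("invalid precision {}", precision));
        }
        let mut data = self.data;
        data.precision = precision;
        data.scale = scale;
        Ok(DecimalArray { data })
    }
}

// Hypothetical helper: the caller only holds a reference (e.g. after a downcast),
// so the clone is written explicitly at the call site.
fn rescale_borrowed(array: &DecimalArray, precision: u8, scale: i8) -> DecimalArray {
    DecimalArray::from(array.data().clone())
        .with_precision_and_scale(precision, scale)
        .unwrap()
}
```

With this shape, the cost of the clone is visible where a borrowed array is rescaled, while code that builds a fresh array and owns it pays nothing extra.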

@viirya
Member Author

viirya commented Nov 10, 2022

Would it need self if we changed it to not clone ArrayData?

If we don't clone ArrayData, then yes, self is needed.

Oops, I'm wrong about this. Not sure how I messed it up previously.

clone() cannot be removed, regardless of whether the receiver is self or &self here. I got a compilation error:

    --> arrow-array/src/array/primitive_array.rs:847:20
     |
847  |         let data = self.data().into_builder().data_type(new_data_type);
     |                    ^^^^^^^^^^^^--------------
     |                    |           |
     |                    |           value moved due to this method call
     |                    move occurs because value has type `ArrayData`, which does not implement the `Copy` trait
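A minimal sketch of why this error appears, assuming toy stand-ins for `ArrayData`, `ArrayDataBuilder`, and `PrimitiveArray` (hypothetical simplified types, not the arrow crate's definitions, and `retyped_data` is an invented name): the `data()` accessor returns `&ArrayData`, while `into_builder` takes its receiver by value, so the value would have to be moved out from behind a shared reference. That holds whether the outer method takes `self` or `&self`, because the accessor itself only lends the data.

```rust
// Hypothetical simplified stand-ins; the real arrow types carry buffers, null bitmaps, etc.
#[derive(Clone, Debug, PartialEq)]
struct ArrayData {
    data_type: String,
    len: usize,
}

struct ArrayDataBuilder {
    data_type: String,
    len: usize,
}

impl ArrayData {
    // Consumes the ArrayData, which is why a `&ArrayData` is not enough.
    fn into_builder(self) -> ArrayDataBuilder {
        ArrayDataBuilder { data_type: self.data_type, len: self.len }
    }
}

impl ArrayDataBuilder {
    fn data_type(mut self, data_type: &str) -> Self {
        self.data_type = data_type.to_string();
        self
    }

    fn build(self) -> ArrayData {
        ArrayData { data_type: self.data_type, len: self.len }
    }
}

struct PrimitiveArray {
    data: ArrayData,
}

impl PrimitiveArray {
    // Accessor returns a shared reference, not an owned value.
    fn data(&self) -> &ArrayData {
        &self.data
    }

    fn retyped_data(&self, new_data_type: &str) -> ArrayData {
        // The next line is rejected by the compiler: it would move out of `&ArrayData`.
        // let builder = self.data().into_builder();

        // Going through the accessor therefore requires an explicit clone.
        self.data().clone().into_builder().data_type(new_data_type).build()
    }
}
```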

@viirya
Member Author

viirya commented Nov 10, 2022

DecimalArray::from(array.data().clone()).with_precision_and_scale(p, s).unwrap()

This works, though it does not make the API straightforward to use.

But since the clone apparently cannot be removed, maybe it is better to make the API easier to use with a reference?

@tustvold
Contributor

tustvold commented Nov 10, 2022

The cloning is not necessary in the very common case of building a new array

clone() cannot be removed, regardless of whether the receiver is self or &self here. I got a compilation error:

Try self.data instead of self.data()
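A sketch of this suggestion, again with hypothetical toy types rather than the real arrow definitions (`with_data_type` is an invented method name): when the method takes `self` by value, the `data` field can be moved out directly, so no clone is involved. The toy `ArrayData` deliberately does not derive `Clone`, so the code compiling at all shows that nothing is copied.

```rust
// Hypothetical simplified stand-in; intentionally NOT Clone.
#[derive(Debug, PartialEq)]
struct ArrayData {
    data_type: String,
    len: usize,
}

struct ArrayDataBuilder {
    data_type: String,
    len: usize,
}

impl ArrayData {
    fn into_builder(self) -> ArrayDataBuilder {
        ArrayDataBuilder { data_type: self.data_type, len: self.len }
    }
}

impl ArrayDataBuilder {
    fn data_type(mut self, data_type: &str) -> Self {
        self.data_type = data_type.to_string();
        self
    }

    fn build(self) -> ArrayData {
        ArrayData { data_type: self.data_type, len: self.len }
    }
}

struct PrimitiveArray {
    data: ArrayData,
}

impl PrimitiveArray {
    // Takes `self` by value, so the field can be moved out of it.
    fn with_data_type(self, new_data_type: &str) -> Self {
        // `self.data` (field access) moves the owned value;
        // `self.data()` (accessor) would only yield `&ArrayData`.
        let data = self.data.into_builder().data_type(new_data_type).build();
        PrimitiveArray { data }
    }
}
```

Moving a field out of `self` like this works as long as the struct has no `Drop` implementation, which is the case for the toy type here.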

@viirya
Member Author

viirya commented Nov 10, 2022

Ah, you're right. I didn't realize that we can use the data field directly here... 😅

@viirya viirya changed the title from "Minor: Use reference in with_precision_and_scale" to "Minor: Remove cloning ArrayData in with_precision_and_scale" Nov 10, 2022
@tustvold tustvold removed the api-change Changes to the arrow API label Nov 10, 2022
@tustvold tustvold merged commit 5fb3033 into apache:master Nov 10, 2022
@ursabot

ursabot commented Nov 10, 2022

Benchmark runs are scheduled for baseline = c027c70 and contender = 5fb3033. 5fb3033 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Labels
arrow Changes to the arrow crate

3 participants