Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(rust,python): refactor arange and add int_range/int_ranges #9666

Merged
merged 13 commits into from
Jul 2, 2023

Conversation

stinodego
Copy link
Member

@stinodego stinodego commented Jul 1, 2023

Partially addresses #9019

Changes:

  • Refactor arange to make use of the existing expression architecture
  • Add int_range and int_ranges functions as alias for arange functionality.
    • While arange has some merit naming-wise (exists in numpy, for example), it's not a good name. What exactly is an 'array range'? We have time_range, date_range, let's be explicit and call this what it is: an int_range. We could add a float_range later, for example.
    • Splits out the vertical vs horizontal behaviours
  • Add doc entry to Expressions / Functions

Future steps (not included here to keep diff somewhat small):

  • Further refactor the functions, they're still messy.
  • Support a dtype arg for int_range(s) on the Rust side to avoid materializing everything as Int64 first.
  • Rename the feature?
  • Deprecate arange?

Can we deprecate arange as a whole, or do we want to keep this function due to name familiarity? If we keep it, do we want to split out functionality into arange/aranges? I prefer getting rid of it entirely.

@github-actions github-actions bot added python Related to Python Polars refactor Code improvements that do not change functionality rust Related to Rust Polars labels Jul 1, 2023
@stinodego stinodego force-pushed the int-ranges branch 2 times, most recently from 65d5689 to c988140 Compare July 1, 2023 22:47
@stinodego stinodego marked this pull request as ready for review July 1, 2023 23:29
@stinodego stinodego changed the title refactor(rust,python): Refactor arange and add int_range/int_ranges refactor(rust,python): refactor arange and add int_range/int_ranges Jul 2, 2023
@MarcoGorelli
Copy link
Collaborator

Nice!

Is the idea to introduce int_range / int_ranges now, deprecate arange in favour of them, and then in 1.0 remove arange completely?


Just a suggestion testing-wise - perhaps it'd be good to test the schema in the lazy case, like how is done in the test_date_range_schema, / test_date_range_no_alias_schema_9037 tests?

@stinodego
Copy link
Member Author

stinodego commented Jul 2, 2023

Is the idea to introduce int_range / int_ranges now, deprecate arange in favour of them, and then in 1.0 remove arange completely?

That's what I would propose. But I can imagine Ritchie might want to keep arange for its name familiarity. Either way, this PR keeps existing arange functionality, so we can do the deprecation later if we choose to.

Just a suggestion testing-wise - perhaps it'd be good to test the schema in the lazy case, like how is done in the test_date_range_schema, / test_date_range_no_alias_schema_9037 tests?

Thanks, I'll look into it!
EDIT: That actually caught a mistake, thanks @MarcoGorelli !

Copy link
Member

@ritchie46 ritchie46 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I like that we moved the implementation properly to the new architecture.

On the naming of arange. I think we should keep it around as an alias to int_range. I believe the range in almost all programming language defaulted to integers. But given that we clarify the dtype for time and date, I understand the specific naming need.

py-polars/polars/functions/range.py Show resolved Hide resolved
@stinodego
Copy link
Member Author

I think we should keep it around as an alias to int_range.

Sounds fair enough. Then I'll deprecate the part of arange that dispatches to int_ranges.

@stinodego stinodego merged commit b81697e into main Jul 2, 2023
@stinodego stinodego deleted the int-ranges branch July 2, 2023 14:28
c-peters pushed a commit to c-peters/polars that referenced this pull request Jul 14, 2023
@Bidek56
Copy link
Contributor

Bidek56 commented Jul 17, 2023

Is int_range and int_ranges functions only available in py-polars?
I am trying to upgrade nodejs-polars to rs-0.31.1 but I am getting:
error[E0425]: cannot find function 'int_ranges' in module 'dsl' Thx

@stinodego
Copy link
Member Author

It should be available in Rust!

@Bidek56
Copy link
Contributor

Bidek56 commented Jul 17, 2023

I see: #[cfg(feature = "range")] on top of the function, could that be causing a problem?

@stinodego
Copy link
Member Author

Ah yes. The feature has been renamed from arange to range. So you must enable the correct feature for it to be available.

@Bidek56
Copy link
Contributor

Bidek56 commented Jul 17, 2023

Adding: #[cfg(feature = "range")] decorator solves it. Thanks!!

@ritchie46
Copy link
Member

Adding: #[cfg(feature = "range")] decorator solves it. Thanks!!

Are you sure? You should activate it on your cargo.toml. This decorator will only ensure your code is excluded from compilation until you activate that feature on your crate.

@Bidek56
Copy link
Contributor

Bidek56 commented Jul 18, 2023

You are correct, #[cfg(feature = "range")] is not needed.

Adding these lines to Cargo.toml solves it, is this the best approach? Thx

[features]
default = ["range"]
range = ["polars-lazy/range"]

@ritchie46
Copy link
Member

Yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python Related to Python Polars refactor Code improvements that do not change functionality rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants