feat: bump narwhals and adapt to support pyarrow #694

Merged
merged 13 commits into from
Aug 21, 2024

FBruzzesi
Collaborator

Description

Long story short: I knew that some functionality would not work for pyarrow even when going through narwhals, so we had to adjust accordingly.

For example, we cannot create a series using native_namespace.Series([...]), because pyarrow doesn't have a Series object, so a workaround was needed:

- series = native_namespace.Series([...])
+ series = nw.from_dict({"_tmp": [...]})["_tmp"]

Namely, create a narwhals dataframe for the given namespace and then select its only column.

A bunch of other changes were needed in the tests so that they also run against pyarrow tables.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the style guidelines (ruff)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (also to the readme.md)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added tests to check whether the new feature adheres to the sklearn convention
  • New and existing unit tests pass locally with my changes

Other comments

  • The extra commits are there because I branched off from another PR 🙈
  • Most likely there are other parts of the codebase in which a change may be required to make sure that pyarrow runs (to be continued)
  • Adding @MarcoGorelli to the mix, to make sure there is no better way of going about this

Owner

@koaning koaning left a comment

Just one comment. If the comment clears it looks good to me!

@koaning
Owner

koaning commented Aug 7, 2024

Oh, and one more thing, must we bump narwhals? I usually prefer not to force the user to use the latest and greatest dependency, but it seems like we need it to support more dataframe types?

@FBruzzesi
Collaborator Author

> Oh, and one more thing, must we bump narwhals? I usually prefer not to force the user to use the latest and greatest dependency, but it seems like we need it to support more dataframe types?

Sadly yes. If we want pyarrow support for grouped meta and shift, the functionality they need only made it into the latest narwhals release (narwhals itself remains dependency-free, though).

Owner

@koaning koaning left a comment

LGTM! Might be good to prep another release?

@FBruzzesi
Collaborator Author

> LGTM! Might be good to prep another release?

Sounds good! We went a bit silent, but now have some breaking changes from #693

@koaning koaning merged commit 236f491 into main Aug 21, 2024
18 checks passed
@koaning koaning deleted the feat/native-series branch August 21, 2024 11:39

2 participants