narwhals to support pandas as input #108
Comments
Ooooh that sounds like exactly what we need. I won't have much (if any) time to write code for this in the next couple months, but I'd be more than happy to review a PR, especially if the changes are pretty minimal. Thanks for raising this issue!
@artiom-matvei you could take a look at this if you have time. Seems simple enough, and may not require that much internal change.
@vincentarelbundock From what I understand, narwhals is used as an interface that translates operations to the underlying functions of polars or pandas. Currently, I think we convert pandas to polars (through ...). The way I see it, we could apply narwhals by using narwhals.DataFrame instead of polars.DataFrame everywhere, which seems to need quite some refactoring. Is this the way to go? Do you have a suggestion for a different way of using narwhals?
Yes, I think that's basically it. IIUC, there wouldn't be that much refactoring to do, because the Narwhals API is extremely similar to Polars, which we already use. So it would mostly be a matter of changing:
I think what I'd like to see is a proof of concept on a very small part of the code base. Converting a couple functions to accept any kind of data frame as input, and returning the original type. If that passes the tests, we will have a better idea of what it would take to refactor everything. A proof of concept before committing many hours. @s3alfisc do you have experience doing this? If so, how would you recommend we proceed?
This sounds like the right strategy to me! I also think that replacing the ... Unfortunately, I haven't yet started the narwhals migration of pyfixest, so I'm still somewhat inexperienced. But I think that @MarcoGorelli has by now accompanied many such migrations (for example for altair), so he might have some tips / insights?
I opened an issue with narwhals as I cannot make it apply the ...
Thanks for opening this issue. It looks detailed and helpful. Hopefully someone will respond with useful info.
thanks for the ping! happy to help out along the way (though I wouldn't have capacity right now to lead the effort) - i'd suggest to start small and see if there are any parts of the codebase which it's possible to narwhalify whilst keeping the test suite passing. then, gradually increase the scope. for any questions / help, i'd suggest:
happy to see what we can do here!
Thanks to everyone who contributed to the discussion, and especially to Artiom for the implementation!
Hi @vincentarelbundock - @juanitorduz pointed me towards @MarcoGorelli's narwhals project, which solves the "two DataFrame APIs" problem for developers by letting them define APIs that are agnostic to the input data frame type.
I.e. one can do things as
In other words - your polars code will work on pandas inputs, without requiring pandas as a dependency! In fact, narwhals even allows you to drop the polars dependency if you wanted to.
Happy to try myself at a PR for this once I find the time =)