feat: make plotly-express dataframe agnostic via narwhals #4790
…lotly.py into plotly-with-narwhals
@@ -454,7 +532,7 @@ def make_trace_kwargs(args, trace_spec, trace_data, mapping_labels, sizeref):
     mapping = args["color_discrete_map"].copy()
 else:
     mapping = {}
-for cat in trace_data[attr_value]:
+for cat in trace_data.get_column(attr_value).to_list():
For context: although trace_data.get_column(attr_value) is a Narwhals Series, which is iterable, explicitly calling to_list() makes sure that the elements we loop over are Python objects in all cases. Calling to_py_scalar on each element is significantly more time-consuming when the elements are not Python scalars (as with pyarrow).
I will make sure to add an in-line comment.
packages/python/plotly/plotly/tests/test_optional/test_px/test_trendline.py
This is an example of a graph that renders differently each time in polars. Looks like the order for the values of
But for pandas the order is consistent and seems to depend on the order of the data. Would it be possible to make it consistent for polars too? cc @emilykl
yup, addressed in the latest commit 👌 latest timings: https://www.kaggle.com/code/marcogorelli/visualise-timings?scriptVersionId=205986199
@FBruzzesi @MarcoGorelli All looks good from our side. You can go ahead and merge when ready.
This was a massive effort, thank you very much for the contribution! 🚀
Hey @emilykl, thanks a ton! We really appreciated the collaborative effort! As mentioned in a few meetings, it led us to investigate more deeply and improve the narwhals codebase significantly as well; it has been such a win-win! Regarding merging: the branch keeps going out of sync with master, and I will need another approval to merge it myself (and hopefully time it correctly 😇), or you can approve and merge whenever you want. There are no other changes on this PR from our side 🚀
@FBruzzesi @MarcoGorelli Merged! 🚀 🎉 💥
Description
This PR migrates the plotly-express module logic from pandas to narwhals. As a result, pandas is no longer a required dependency for plotly-express (at least not for all of it: trendlines, for example, still require pandas for now), and users working with polars, pyarrow, or other eager dataframes supported by narwhals do not need to depend on pandas in the first place.
Closes #4749
Code PR
plotly.graph_objects, my modifications concern the codegen files and not generated files.
modified existing tests.
new tutorial notebook (please see the doc checklist as well).
Out of scope for the PR
Adapt plotly data accordingly: included in this PR
cc: @MarcoGorelli @LiamConnors