perf(rust, python): Faster is_sorted when no flag set #9777
Nice TODO hunt. :)
Does this break
Not that I know of. It seems fine. Did you see something?
In
import polars as pl
df = pl.DataFrame({"x": [1, 2], "y": [3, 4]})
df.select(pl.struct("x", "y")).to_series().is_sorted()
You're right. I tried with 0.18.7
Inequality operations aren't implemented for structs. It makes sense to have them. After all, if you can sort structs, you can compare them. But some care is required.
I'm inclined toward option 2 but I'll defer to @ritchie46 since he knows the internals much better.
We should do this in two steps. First fix the regression by pattern matching on Next we can see if we can implement comparison on structs. This needs to be implemented on the
I thought structs were sorted in dictionary order,
I am talking about equality only in this case. The other one needs row encoding. That's probably the proper way to do all struct comparisons.
Equality is already implemented for Structs, the issue is inequality checks, which aren't implemented in Cases:
Honestly, I think I could do the proper change in not much longer than special-casing a fix, but if you'd like time to think about and discuss how to handle the operation properly we can do it in two phases.
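For readers following the thread, here is a minimal sketch of what "dictionary order" on structs could mean. It is plain Python, not the Polars implementation: `struct_less_than` and the dict-per-row representation are hypothetical names chosen for illustration, and null handling (part of the "care" mentioned above) is deliberately left out.

```python
from typing import Any, Mapping, Sequence


def struct_less_than(
    a: Mapping[str, Any], b: Mapping[str, Any], fields: Sequence[str]
) -> bool:
    """Return True if row `a` sorts strictly before row `b`.

    Hypothetical helper: fields are compared left to right, and the
    first field that differs decides the result (dictionary order).
    Assumes no nulls and that each field's values are comparable.
    """
    for name in fields:
        if a[name] < b[name]:
            return True
        if a[name] > b[name]:
            return False
    # Every field is equal, so `a` is not strictly less than `b`.
    return False


# The two rows from the reproduction in this thread.
first = {"x": 1, "y": 3}
second = {"x": 2, "y": 4}
print(struct_less_than(first, second, ["x", "y"]))
```

A row encoding, as suggested above, would aim for the same ordering by turning each row into a single comparable key instead of looping over fields.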
This will check sortedness without sorting the entire series even when no flag is set. Seems faster.
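The idea can be sketched in plain Python as a single pass over adjacent pairs that stops at the first out-of-order element. This is an illustration under stated assumptions, not the code in this PR: `is_sorted_single_pass` is a hypothetical name, and nulls are not handled.

```python
from typing import Any, Iterable


def is_sorted_single_pass(values: Iterable[Any], descending: bool = False) -> bool:
    """Check sortedness in O(n) without sorting or copying the input.

    Hypothetical helper: walks the values once, compares each element
    with its predecessor, and returns early on the first violation.
    """
    it = iter(values)
    try:
        prev = next(it)
    except StopIteration:
        return True  # An empty input is trivially sorted.
    for cur in it:
        out_of_order = cur > prev if descending else cur < prev
        if out_of_order:
            return False
        prev = cur
    return True


print(is_sorted_single_pass([1, 2, 2, 3]))
print(is_sorted_single_pass([1, 3, 2]))
```

Compared with sorting a copy and testing for equality, this avoids the O(n log n) sort and the extra allocation, and it can bail out early on unsorted data.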