fix!: remove fixed_seed and add pl.set_random_seed #10388
Save for some merge conflicts, I think this looks great.
I think we have to queue a few breaking changes, and then we can bump in a few weeks.
8d93315 to ea16d7f
Does this mean that hashes will always be deterministic now, and not depend on architecture or version? Reference: #7758 (comment). Does this also effectively allow for a global seed? If so, might it close #3076?
This does indeed close #3076. It doesn't change anything about the hashing at the moment.
@orlp can you rebase and merge this one?
ea16d7f to 5e34cbb
Fixes #10367.
This is a partial revert of #9694. We removed the `fixed_seed` parameter. Calling `shuffle(seed=n)` should always do the same thing. Instead, in the future we should support `shuffle(seed=expr)`, which would allow you to set the seed per groupby if you wish to have deterministic but different seeds per group.

This also adds a new function, `pl.set_random_seed`, that you can use to make the internal Polars RNG deterministic, instead of relying on Python's `random.seed`. Note that you might still get non-deterministic results in queries due to threading order.
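
To illustrate the intended semantics, here is a minimal pure-Python model. It is not Polars' implementation: the module-level `_global_rng` and the stand-in `set_random_seed` and `shuffle` functions are hypothetical and only mimic the behavior described above, on the assumption that an unseeded call draws its seed from the global RNG.

```python
import random

# Hypothetical stand-in for a library-wide RNG; not Polars' actual internals.
_global_rng = random.Random()


def set_random_seed(seed: int) -> None:
    """Reseed the global RNG so that later unseeded calls are reproducible."""
    _global_rng.seed(seed)


def shuffle(values, seed=None):
    """Return a shuffled copy of `values`.

    An explicit seed always yields the same permutation. Without one,
    the seed is drawn from the global RNG (an assumption of this sketch).
    """
    if seed is None:
        seed = _global_rng.getrandbits(64)
    out = list(values)
    random.Random(seed).shuffle(out)
    return out


data = list(range(10))

# Explicit seed: the same permutation on every call.
fixed_a = shuffle(data, seed=42)
fixed_b = shuffle(data, seed=42)

# Global seed: unseeded calls repeat once the global RNG is reseeded.
set_random_seed(0)
run_a = [shuffle(data), shuffle(data)]
set_random_seed(0)
run_b = [shuffle(data), shuffle(data)]
```

In this model, reproducibility of unseeded calls depends on the order in which they consume the global RNG, which is one way to picture why threading order can still make results non-deterministic.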