Array hashing breaks equality on 0.7 #26034
Comments
So you're defining a type that is equal to a number and hashes like a number... but it isn't itself a number and doesn't support subtraction? Yeah, I can see how that'd get you into trouble in many situations. I don't think this is really actionable, or that we'd ever guarantee such a type would work everywhere.
Indeed, since #16401 a type should not be equal to a number if it doesn't support |
Oh, thanks. I apparently missed the warning that if there exists That's the long-term plan for a simple rule, right? I mean addition/subtraction are intentionally not well-defined mod |
It's not clear yet what approach will be retained. See #26022.
Goal: Hash approximately log(N) entries, with a higher density of hashed elements weighted towards the end and special consideration for repeated values.

Colliding hashes will often subsequently be compared by equality, and equality between arrays works elementwise forwards and is short-circuiting. This means that a collision between arrays that differ by elements at the beginning is cheaper than one where the difference is towards the end. Furthermore, blindly choosing log(N) entries from a sparse array will likely only choose the same element repeatedly (zero in this case).

To achieve this, we work backwards, starting by hashing the last element of the array. After hashing each element, we skip the next `fibskip` elements, where `fibskip` is pulled from the Fibonacci sequence. Fibonacci was chosen as a simple ~O(log(N)) algorithm that ensures we don't hit a common divisor of a dimension and only end up hashing one slice of the array (as might happen with powers of two). Finally, we find the next distinct value from the one we just hashed.

Fixes #27865 and fixes #26011. Fixes #26034
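For readers who want to see the sampling pattern concretely, here is a rough Python model of the backwards Fibonacci-skip walk described above. It is only a sketch, not the code from the PR: the names `sampled_indices` and `sample_hash` are made up for illustration, the exact step offsets are an assumption, and the real implementation works on Julia array keys rather than a flat Python sequence.

```python
from typing import List, Sequence


def sampled_indices(values: Sequence[object]) -> List[int]:
    """Return the indices visited by a backwards Fibonacci-skip walk.

    Hypothetical helper: models the idea in the description, not Julia's code.
    """
    visited: List[int] = []
    i = len(values) - 1                 # start at the last element
    fibskip, prev_fibskip = 1, 1        # consecutive Fibonacci numbers
    while i >= 0:
        visited.append(i)
        elt = values[i]
        i -= fibskip                    # step back by the current Fibonacci number
        fibskip, prev_fibskip = fibskip + prev_fibskip, fibskip
        # Keep moving back until the value differs from the one just hashed,
        # so runs of repeated values (e.g. zeros in sparse data) are skipped.
        while i >= 0 and values[i] == elt:
            i -= 1
    return visited


def sample_hash(values: Sequence[object]) -> int:
    """Hash only the sampled (index, value) pairs plus the length."""
    pairs = tuple((i, values[i]) for i in sampled_indices(values))
    return hash((len(values), pairs))


if __name__ == "__main__":
    dense = list(range(100))
    sparse = [0] * 50 + [7] + [0] * 49
    print(sampled_indices(dense))       # indices cluster near the end
    print(sampled_indices(sparse))      # the lone nonzero entry is picked up
```

With 100 distinct values this model visits 9 indices, densest near the end; on the mostly-zero input the "next distinct value" search lands on the single nonzero entry instead of sampling zero over and over.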
I fear that this is fundamental to the current approach for O(1) range hashing (but I would be happy to be corrected!).