Woodland perfect hash #4
Comments
Hey @leijurv. Here's an explanation I came up with, why the mansion seed's lower 32 bits happen to have no collisions, and it turns out it involves some lattices again! The seed is calculated as
or introducing some notation:
We're looking at the lower 32 bits, i.e.
We want to check if there are any collisions:
where
or introducing
So, two points

Now all we have to do, is to check that no pair of our input points have differences of coordinates

Red arrows are the reduced basis vectors.

Now let's look into the question of whether this is a rare situation, that is caused by a specific selection of coefficients

If we change

This is of course a heuristic argument, that only gives approximate values, without any specific coefficients, but here's an interesting specific example. Let's assume that our input points lie inside a circle instead of a square, and then try to make this circle as big as possible by choosing an appropriate value of coefficient

The radius of the circle is the length of the smallest of basis vectors, which is approximately 70422. To go back from

Here's also an example of a bad value of k (specifically

So, to sum up, it turns out that with any linear hash function that generates outputs that are sufficiently random (lattice is isotropic enough) we'll have a pretty large region around any input point, that has no hash collisions. Also all hash collisions will occur in a pattern that forms a lattice, which is also pretty interesting.
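To make the lattice argument concrete, here is a minimal Python sketch (needs Python 3.8+ for `pow(b, -1, m)`). It is not the code used for the analysis above. The hash form `(a*x + b*z) mod M`, the toy values `A`, `B`, `M`, and the function names are all assumptions chosen for illustration, and it assumes `b` is coprime to the modulus. It builds a basis of the lattice of coordinate differences that collide, then reduces it with Lagrange-Gauss reduction; the shorter reduced vector is the smallest difference that can produce a collision.

```python
def collision_lattice_basis(a, b, modulus):
    """Basis of {(dx, dz) : a*dx + b*dz == 0 (mod modulus)}.

    Assumes gcd(b, modulus) == 1, so that dz == k*dx (mod modulus).
    """
    k = (-a * pow(b, -1, modulus)) % modulus
    return (1, k), (0, modulus)


def norm2(w):
    """Squared Euclidean length of a 2D integer vector."""
    return w[0] * w[0] + w[1] * w[1]


def gauss_reduce(u, v):
    """Lagrange-Gauss reduction of a 2D lattice basis.

    Returns (u, v) with u a shortest nonzero lattice vector.
    """
    if norm2(u) > norm2(v):
        u, v = v, u
    while True:
        dot = u[0] * v[0] + u[1] * v[1]
        n = norm2(u)
        m = (2 * dot + n) // (2 * n)  # nearest integer to dot / n
        v = (v[0] - m * u[0], v[1] - m * u[1])
        if norm2(v) >= norm2(u):
            return u, v
        u, v = v, u


# Hypothetical toy parameters, NOT the real mansion seed coefficients.
A, B, M = 12, 5, 1 << 12
u, v = gauss_reduce(*collision_lattice_basis(A, B, M))
print("reduced basis:", u, v)
print("collision-free radius (squared):", norm2(u))
```

Any two distinct inputs whose coordinate difference is shorter than `u` cannot collide, which is the "large collision-free region" described above.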
very cool!! thanks for the analysis
To shine some light on the topic: the woodland seed function is a perfect hash of the coordinates (x, z) in the limited range. It is from the family of linear hash functions. The usual construction would require that the parameters are co-prime to the modulus and to each other, but due to the limited range of x and z the function is still injective despite one of the parameters not being co-prime with the modulus.
In the case of only one input, if the parameter is co-prime to the modulus, it generates the whole additive group of size N, meaning it will create a perfect hash for all values in [0, N).
(I can write more on it later.)
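As a small illustration of the one-input case, the sketch below brute-forces `x -> a*x mod N` for a toy modulus. The function name and the values of `a` and `N` are hypothetical, picked only to keep the check fast. It also shows the limited-range effect: a multiplier that shares a factor with the modulus can still be injective on a short enough input range.

```python
from math import gcd


def is_injective(a, modulus, input_range):
    """True if x -> (a * x) % modulus has no collisions on input_range."""
    outputs = {(a * x) % modulus for x in input_range}
    return len(outputs) == len(input_range)


N = 16  # hypothetical toy modulus

# Co-prime multiplier: perfect hash on the full range [0, N).
print(gcd(5, N), is_injective(5, N, range(N)))

# Multiplier sharing a factor with N: collisions on the full range ...
print(gcd(6, N), is_injective(6, N, range(N)))

# ... but still injective once the inputs are limited to [0, N/2).
print(is_injective(6, N, range(N // 2)))
```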