Excessive errors when inserting ~1M values #11
Comments
Hrmm... this is really interesting! I think it has to do with the transition from sparse mode to normal mode. That said, I haven't played with this library in quite a while, so I've lost a sense of the cardinalities at which this switch happens for the various error rates. If you could dig a bit deeper, that would be amazing! If not, I'll try to get to it in the next couple of weeks.
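For readers unfamiliar with the two modes: many HyperLogLog implementations count exactly (or near-exactly) in a compact sparse representation at low cardinalities and only switch to the dense register array past some threshold. The following is a purely illustrative toy sketch of that switch, not this library's code; the cap value and class name are made up:

```python
import random

class ToySketch:
    """Toy sparse-to-dense switch (hypothetical, not this library's code):
    keep exact 64-bit hashes in a set while small, then convert to
    HyperLogLog registers once the set grows past a cap."""

    def __init__(self, p=14, sparse_cap=20_000):
        self.p, self.m = p, 1 << p
        self.sparse_cap = sparse_cap
        self.sparse = set()   # sparse mode: exact hashes, exact counting
        self.regs = None      # dense mode: HLL register array, built lazily

    def add(self, h):
        """h is assumed to be a uniform 64-bit hash of the item."""
        if self.regs is not None:
            self._dense_add(h)
            return
        self.sparse.add(h)
        if len(self.sparse) > self.sparse_cap:   # switch to dense mode
            hashes, self.sparse = self.sparse, None
            self.regs = [0] * self.m
            for x in hashes:
                self._dense_add(x)

    def _dense_add(self, h):
        j = h & (self.m - 1)                     # low p bits pick a register
        # rho = position of the leftmost 1-bit in the remaining 64-p bits
        rho = (64 - self.p) - (h >> self.p).bit_length() + 1
        self.regs[j] = max(self.regs[j], rho)

    def count(self):
        if self.regs is None:
            return len(self.sparse)              # sparse mode is exact here
        alpha = 0.7213 / (1 + 1.079 / self.m)    # bias constant for m >= 128
        return alpha * self.m ** 2 / sum(2.0 ** -r for r in self.regs)
```

Below the cap the answer is exact; past it the error jumps to the usual HLL regime (roughly 1.04/sqrt(m)), which is one way a sudden change in accuracy at a specific cardinality can show up.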
So, a couple of things I've found:
I suspect the problem is really the "normal" calculation, which only becomes accurate once the number of elements is beyond ~1M. The example above is correct below 640K because that range is calculated using the sparse method; if you force the normal method from the start, the numbers are way off from the get-go.
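The sparse/normal split being discussed is visible in the estimator itself: the raw "normal" formula is badly biased at low cardinalities, which is why implementations fall back to linear counting (the usual small-range estimate) until the count is large enough. A self-contained sketch of that effect, again illustrative and not this library's code:

```python
import math
import random

def hll_estimates(n, p=14, seed=1):
    """Insert n random 64-bit hashes into m = 2**p HLL registers and return
    (raw_normal_estimate, small_range_corrected_estimate)."""
    m = 1 << p
    regs = [0] * m
    rng = random.Random(seed)
    for _ in range(n):
        h = rng.getrandbits(64)
        j = h & (m - 1)                          # low p bits pick the register
        # rho = position of the leftmost 1-bit in the remaining 64-p bits
        rho = (64 - p) - (h >> p).bit_length() + 1
        regs[j] = max(regs[j], rho)
    alpha = 0.7213 / (1 + 1.079 / m)             # bias constant for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in regs)
    zeros = regs.count(0)
    if raw <= 2.5 * m and zeros:                 # linear counting correction
        return raw, m * math.log(m / zeros)
    return raw, raw

raw, corrected = hll_estimates(1000)
# With m = 16384, the raw formula lands near alpha*m almost regardless of n,
# while the corrected estimate tracks the true count closely.
```

Forcing the "normal" path at low cardinality reproduces exactly the "way off from the get-go" behavior: the raw estimate saturates near alpha*m until the true count approaches the register count.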
Maybe I'm doing something horribly wrong, but I'm getting cardinality errors way beyond the specified error bound. All I do is run this simple snippet, and the estimate starts going off track at about 700K.
This is what I got:
When I bisected this around the 700K mark, I found that it breaks at 640K.
I'm not familiar enough with the design of the library to comment any further at this point, but let me know if you need a bit more investigation into this.
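The bisection described above generalizes to a tiny helper: given a predicate that flips from passing to failing at some cardinality, binary-search the first failing count. The 640K in the usage line is just a stand-in threshold, not a measurement:

```python
def first_failing(lo, hi, breaks):
    """Smallest n in (lo, hi] with breaks(n) True, assuming breaks is
    monotone: False at lo, True at hi."""
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if breaks(mid):
            hi = mid
        else:
            lo = mid
    return hi

# usage with a toy predicate standing in for "error exceeds the bound at n"
print(first_failing(0, 1_000_000, lambda n: n >= 640_000))  # -> 640000
```

In practice `breaks(n)` would insert n distinct values into a fresh sketch and check whether the relative error exceeds the configured bound.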