Why does it take more time to search the entire xq than to search sequentially for a subset of all the shards of the xq? #836
Comments
With case 1 as the baseline:
If you are not in one of those cases, please comment.
The strange thing is that searching 10 datasets of 100M each is faster than searching the single dataset of 1B. That is my case.
Here is a demo. xb contains 1M vectors, and the index is 'IVF100_HNSW32,PQ64'.
The result is
Thanks for the demo. I can reproduce the issue.
The slowdown comes from the inverted list scanning; the quantization time is constant.
This will be fixed soon.
Thank you very much.
This was fixed upstream, and the fix will be available in the next release.
Bugfixes:
- slow scanning of inverted lists (#836).

Features:
- add basic support for 6 new metrics in CPU `IndexFlat` and `IndexHNSW` (#848);
- add support for `IndexIDMap`/`IndexIDMap2` with binary indexes (#780).

Misc:
- throw python exception for OOM (#758);
- make `DistanceComputer` available for all random access indexes;
- gradually moving from `long` to `int64_t` for portability.
Hi,
There are 1B samples in xq. I get the search results in two different ways:
The first way takes more time than the second. Could you please tell me the reason? Thanks.
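For anyone who wants to check this on their own setup, here is a minimal timing sketch of the comparison in the issue title: one search over all of xq against sequential searches over contiguous shards of it. It is not the demo from this thread. The helper name `time_full_vs_sharded` and its parameters are hypothetical, and it only assumes that `index` exposes a `search(xq, k)` method, as Faiss indexes do.

```python
import time


def time_full_vs_sharded(index, xq, k, n_shards):
    """Return (t_full, t_sharded) in seconds.

    t_full:    one call to index.search over all of xq.
    t_sharded: sequential calls over n_shards contiguous slices of xq.
    `index` is any object with a search(xq, k) method (an assumption
    of this sketch); xq only needs len() and slicing.
    """
    # Way 1: search the entire query set in a single call.
    t0 = time.perf_counter()
    index.search(xq, k)
    t_full = time.perf_counter() - t0

    # Way 2: search the shards one after another.
    n = len(xq)
    shard_size = max(1, (n + n_shards - 1) // n_shards)  # ceiling division
    t0 = time.perf_counter()
    for start in range(0, n, shard_size):
        index.search(xq[start:start + shard_size], k)
    t_sharded = time.perf_counter() - t0

    return t_full, t_sharded
```

With Faiss installed, `index` could be a trained index built with something like `faiss.index_factory(d, "IVF100_HNSW32,PQ64")` and `xq` a float32 array of shape `(nq, d)`. Run it a few times and discard the first run, since warm-up can skew a single measurement.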