AVX512 for PQFastScan #3276

alexanderguzhva
Contributor

AVX-512 implementation for PQFastScan for QBS.
In local benchmarks on a 4th-gen Xeon, QPS is up to 10% higher, mostly in the single-query case. As far as I remember, though, production cases would show larger performance improvements.

@mdouze should I modify `pq4_fast_scan_search_1.cpp` as well? It is somewhat cumbersome to dig through the various possible sub-implementations.
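
For readers unfamiliar with what a 4-bit PQ scan computes, here is a minimal scalar sketch in Python. It is not the Faiss code and does not reproduce its memory layout: the function name `scan_pq4` and the nibble packing (low nibble first) are assumptions made for illustration. The SIMD kernels perform these 16-entry table lookups for many vectors at once, and AVX-512 widens the registers they work on to 512 bits.

```python
# Scalar reference for a 4-bit PQ lookup-table scan (illustrative only;
# not the Faiss implementation and not its memory layout).

def scan_pq4(codes, luts):
    """Return one accumulated distance per database vector.

    codes: list of bytes objects; each byte packs two 4-bit sub-quantizer
           codes, low nibble first (an assumed packing for this sketch).
    luts:  luts[m][c] is the quantized distance contribution of
           sub-quantizer m when its code is c (16 entries per table).
    """
    n_sub = len(luts)
    out = []
    for packed in codes:
        total = 0
        for m in range(n_sub):
            byte = packed[m // 2]
            # Even sub-quantizers read the low nibble, odd ones the high nibble.
            code = (byte >> 4) if (m & 1) else (byte & 0x0F)
            total += luts[m][code]
        out.append(total)
    return out


# Two sub-quantizers, so each database vector fits in a single byte.
luts = [
    list(range(16)),              # table for sub-quantizer 0
    [10 * c for c in range(16)],  # table for sub-quantizer 1
]
codes = [bytes([0x21]), bytes([0xF0])]
distances = scan_pq4(codes, luts)
```

The inner loop over `m` is the part a vectorized kernel replaces with in-register table lookups. A wider register lets one instruction cover more codes, which is where a speedup like the one reported above would come from.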

@mdouze
Contributor

mdouze commented Mar 15, 2024

Sorry for the late answer.
No, you don't need to do all sub-implementations.
Let me try to import...

@facebook-github-bot
Contributor

@mdouze has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@mdouze
Contributor

mdouze commented Mar 18, 2024

re-imported

@alexanderguzhva
Contributor Author

@mdouze got some problems with raft, rebased on top of master

@algoriddle
Contributor

> @mdouze got some problems with raft, rebased on top of master

9c79e3d5 fixes the RAFT compilation problem

Signed-off-by: Alexandr Guzhva <[email protected]>
@facebook-github-bot
Contributor

@mdouze has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mdouze merged this pull request in d99f07e.

abhinavdangeti pushed a commit to blevesearch/faiss that referenced this pull request Jul 12, 2024
Summary:
AVX-512 implementation for PQFastScan for QBS.
In local benchmarks on a 4th-gen Xeon, QPS is up to 10% higher, mostly in the single-query case. As far as I remember, though, production cases would show larger performance improvements.

* Baseline `benchs/bench_ivf_fastscan_single_query.py` (sift1M): https://gist.github.com/alexanderguzhva/c9cde2cb5e9c7675f429623e6faa9fbf
* Candidate `benchs/bench_ivf_fastscan_single_query.py` (sift1M): https://gist.github.com/alexanderguzhva/4e8530073a108f73771d38e55bc45b17
* Baseline `benchs/bench_ivf_fastscan.py` (sift1M): https://gist.github.com/alexanderguzhva/9eb03ed60354d7e76cfa25e676f983ac
* Candidate `benchs/bench_ivf_fastscan.py` (sift1M): https://gist.github.com/alexanderguzhva/3cbfeba1364dd445a2bb52455966979e

mdouze should I modify `pq4_fast_scan_search_1.cpp` as well? It is somewhat cumbersome to dig through various possible sub-implementations

Pull Request resolved: facebookresearch#3276

Reviewed By: junjieqi

Differential Revision: D54943632

Pulled By: mdouze

fbshipit-source-id: 3d70066e9779039559b1734c2be99bf439058246