| filter | | | sparse | | | ood | | |
|---|---|---|---|---|---|---|---|---|
| rank | algorithm | qps | rank | algorithm | qps | rank | algorithm | qps |
| 1 | zilliz | 213.3K | 1 | zilliz | 34.8K | 1 | scann | 107.4K |
| 2 | pinecone | 146.7K | 2 | pyanns | 26.9K | 2 | pinecone-ood | 76.9K |
| 3 | puck | 62.3K | 3 | pinecone_smips | 12.0K | 3 | zilliz | 73.5K |
| 4 | parlayivf | 55.0K | 4 | shnsw | 8.2K | 4 | pyanns | 55.5K |
| 5 | wm_filter | 20.9K | 5 | nle | 2.9K | 5 | sustech-ood | 28.5K |
| 6 | pyanns | 9.0K | 6 | cufe | 0.1K | 6 | mysteryann-dif | 27.9K |
| 7 | faissplus | 8.5K | 7 | linscan | 0.1K | 7 | mysteryann | 26.6K |
| 8 | faiss | 7.3K | | spmat | | 8 | vamana | 20.0K |
| 9 | cufe | 6.3K | | sustech-whu | | 9 | puck | 19.0K |
| | dhq | | | | | 10 | ngt | 11.9K |
| | fdufilterdiskann | | | | | 11 | epsearch | 7.7K |
| | hwtl_sdu_anns_filter | | | | | 12 | diskann | 6.3K |
| | | | | | | 13 | cufe | 5.4K |
| | | | | | | | puck-fizz | |
The NeurIPS2023 Practical Vector Search Challenge evaluated participating algorithms on Azure and EC2 CPU-based hardware instances.
To broaden the evaluation, we are also running the algorithms on other generally available hardware configurations.
Shown here are results run on the following hardware:
- AMD EPYC 9124 16-Core 3GHz processor
- 125GB RAM
- 440GB NVMe SSD
- Bare-metal "m4-metal-medium" instance provided by Latitude
The calculated rankings are shown at the top.
Notes:
- Evaluations were run in late August 2024.
- In each track, qualifying algorithms are ranked by their highest qps among runs with recall/ap >= 0.9.
- All participating algorithms are shown for each track, but only qualifying algorithms are ranked.
- Each track algorithm links to the build and run command used (or disqualifying errors, if any).
- Pareto graphs for each track are shown below.
TODO
The full data export CSV file can be found here.
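The ranking rule from the notes above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the table: the `Run` record, its field names, and the `rank_track` and `format_qps` helpers are hypothetical, and the sample rows are made-up values rather than benchmark results.

```python
from typing import Iterable, NamedTuple


class Run(NamedTuple):
    """One benchmark run (hypothetical record; field names are assumptions)."""
    algorithm: str
    qps: float
    recall: float  # recall or average precision, depending on the track


def rank_track(runs: Iterable[Run], threshold: float = 0.9) -> list[tuple[int, str, float]]:
    """Rank algorithms by their highest qps among runs with recall/ap >= threshold."""
    best: dict[str, float] = {}
    for run in runs:
        if run.recall >= threshold:
            # Keep only the fastest qualifying run per algorithm.
            best[run.algorithm] = max(best.get(run.algorithm, 0.0), run.qps)
    ordered = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [(rank, name, qps) for rank, (name, qps) in enumerate(ordered, start=1)]


def format_qps(qps: float) -> str:
    """Format qps in thousands with one decimal, the style used in the table."""
    return f"{qps / 1000:.1f}K"


if __name__ == "__main__":
    # Made-up sample runs, for illustration only.
    sample = [
        Run("algo_a", 5000.0, 0.95),
        Run("algo_a", 9000.0, 0.85),   # faster, but below the threshold
        Run("algo_b", 7000.0, 0.91),
        Run("algo_c", 20000.0, 0.50),  # never qualifies, so it is not ranked
    ]
    for rank, name, qps in rank_track(sample):
        print(rank, name, format_qps(qps))
```

Algorithms with no run at or above the threshold do not appear in the output, which mirrors the unranked entries in the table.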
The following steps reproduce the results shown above from scratch.
- Sign up for or sign in to your Latitude account
- Provision an "m4-metal-medium" instance with at least 100GB NVMe SSD running Ubuntu Linux 20.04.6 LTS
- SSH into the instance
- Update the package index with the command
sudo apt-get update
- Install Anaconda for Linux
- Run the following commands:
git clone git@github.com:harsha-simhadri/big-ann-benchmarks.git
cd big-ann-benchmarks
conda create -n bigann-latitude-m4-metal-medium python=3.10
conda activate bigann-latitude-m4-metal-medium
python -m pip install -r requirements_py3.10.txt
Sparse track: prepare the dataset by running the following command in the top-level directory of the repository:
python create_dataset.py --dataset sparse-full
See the latitude/commands directory for individual algorithm scripts.
Filter track: prepare the dataset by running the following command in the top-level directory of the repository:
python create_dataset.py --dataset yfcc-10M
See the latitude/commands directory for individual algorithm scripts.
OOD track: prepare the dataset by running the following command in the top-level directory of the repository:
python create_dataset.py --dataset text2image-10M
See the latitude/commands directory for individual algorithm scripts.
Streaming track: prepare the dataset by running the following commands in the top-level directory of the repository:
python create_dataset.py --dataset msturing-30M-clustered
python -m benchmark.streaming.download_gt --runbook_file neurips23/streaming/final_runbook.yaml --dataset msturing-30M-clustered
See the latitude/commands directory for individual algorithm scripts.
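If you want to prepare all four track datasets in one pass, the per-track commands above can be driven from a small script. The sketch below is an assumption about how you might organize this, not part of the repository: the `TRACK_DATASETS` mapping and the `prepare_commands` and `prepare` helpers are hypothetical names, while the command lines themselves are the ones listed above. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Track -> dataset mapping, taken from the commands listed above.
TRACK_DATASETS = {
    "sparse": "sparse-full",
    "filter": "yfcc-10M",
    "ood": "text2image-10M",
    "streaming": "msturing-30M-clustered",
}
STREAMING_RUNBOOK = "neurips23/streaming/final_runbook.yaml"


def prepare_commands(track: str) -> list[list[str]]:
    """Return the dataset preparation commands for one track (hypothetical helper)."""
    dataset = TRACK_DATASETS[track]
    commands = [["python", "create_dataset.py", "--dataset", dataset]]
    if track == "streaming":
        # The streaming track also needs the ground truth for its runbook.
        commands.append([
            "python", "-m", "benchmark.streaming.download_gt",
            "--runbook_file", STREAMING_RUNBOOK,
            "--dataset", dataset,
        ])
    return commands


def prepare(track: str, dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, run it from the current directory."""
    for command in prepare_commands(track):
        print(" ".join(command))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    for track in TRACK_DATASETS:
        prepare(track, dry_run=True)
```

Run it from the top-level directory of the repository inside the activated conda environment, and pass `dry_run=False` once the printed commands look right.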
To extract the data as CSV:
sudo chmod ugo+rw -R ./results/ # recursively add read/write permissions to directories and files under the results directory.
python data_export.py --recompute --output neurips23/latitude/data_export_m4-metal-medium.csv
To plot individual tracks:
python plot.py --neurips23track sparse --output neurips23/latitude/sparse.png --raw --recompute --dataset sparse-full
python plot.py --neurips23track filter --output neurips23/latitude/filter.png --raw --recompute --dataset yfcc-10M
python plot.py --neurips23track ood --output neurips23/latitude/ood.png --raw --recompute --dataset text2image-10M
TODO: streaming track
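The three plot commands above differ only in the track name, the dataset, and the output file, so they can be generated from a single mapping. The following is a small sketch under that assumption; `PLOT_TRACKS` and `plot_command` are hypothetical names, and the streaming track is left out because its plot command is not given here.

```python
import shlex

# Tracks with a known plot command, mapped to their datasets.
PLOT_TRACKS = {
    "sparse": "sparse-full",
    "filter": "yfcc-10M",
    "ood": "text2image-10M",
}


def plot_command(track: str, dataset: str, out_dir: str = "neurips23/latitude") -> str:
    """Build the plot.py command line for one track (hypothetical helper)."""
    args = [
        "python", "plot.py",
        "--neurips23track", track,
        "--output", f"{out_dir}/{track}.png",
        "--raw", "--recompute",
        "--dataset", dataset,
    ]
    # shlex.join quotes arguments where needed so the line is safe to paste into a shell.
    return shlex.join(args)


if __name__ == "__main__":
    for track, dataset in PLOT_TRACKS.items():
        print(plot_command(track, dataset))
```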
To render the ranking table, see this notebook.
- The hardware systems were graciously donated by Latitude.
- None of the NeurIPS2021/23 organizers is employed by or affiliated with Latitude.
- George Williams, an organizer for both the NeurIPS2021 and NeurIPS2023 Competitions, ran the evaluations described above.
- Our main contact from Latitude is Victor Chiea, to whom we were introduced by Harald Carlens from MLContests.
- Latitude logo for sponsorship attribution below (note: it has a transparent background):