Eval On AMD 3GHz/16-Core + 125GB RAM + NVMe SSD (Bare Metal)

filter			sparse			ood
rank	algorithm	qps	rank	algorithm	qps	rank	algorithm	qps
1	zilliz	213.3K	1	zilliz	34.8K	1	scann	107.4K
2	pinecone	146.7K	2	pyanns	26.9K	2	pinecone-ood	76.9K
3	puck	62.3K	3	pinecone_smips	12.0K	3	zilliz	73.5K
4	parlayivf	55.0K	4	shnsw	8.2K	4	pyanns	55.5K
5	wm_filter	20.9K	5	nle	2.9K	5	sustech-ood	28.5K
6	pyanns	9.0K	6	cufe	0.1K	6	mysteryann-dif	27.9K
7	faissplus	8.5K	7	linscan	0.1K	7	mysteryann	26.6K
8	faiss	7.3K		spmat		8	vamana	20.0K
9	cufe	6.3K		sustech-whu		9	puck	19.0K
	dhq					10	ngt	11.9K
	fdufilterdiskann					11	epsearch	7.7K
	hwtl_sdu_anns_filter					12	diskann	6.3K
						13	cufe	5.4K
							puck-fizz

Introduction

The NeurIPS2023 Practical Vector Search Challenge evaluated participating algorithms on Azure and EC2 CPU-based hardware instances.

To broaden the evaluation, we are also running the benchmarks on other generally available hardware configurations.

The results shown here were obtained on the following hardware:

AMD EPYC 9124 16-Core 3GHz processor
125GB RAM
440GB NVMe SSD
Bare-metal "m4-metal-medium" instance provided by Latitude

Results

The calculated rankings are shown at the top.

Notes:

Evaluations were run in late August 2024.
In each track, qualifying algorithms are ranked by largest qps where recall/ap >= 0.9.
All participating algorithms are shown for each track, but only qualifying algorithms are ranked.
Each algorithm in a track links to the build and run command used (or to the disqualifying errors, if any).
Pareto graphs for each track are shown below.
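
The ranking rule in the notes above can be sketched in a few lines of Python. This is only an illustration of the stated rule (largest qps among runs with recall/ap >= 0.9), not the code used to produce the table; the `Run` record, the `rank_track` helper, and the algorithm names are hypothetical.

```python
from typing import Iterable, NamedTuple


class Run(NamedTuple):
    """One measured operating point (hypothetical record layout)."""
    algorithm: str
    recall: float  # recall or average precision, depending on the track
    qps: float


def rank_track(runs: Iterable[Run], threshold: float = 0.9):
    """Return (ranked, unranked) for a single track.

    ranked   : list of (algorithm, best qualifying qps), highest qps first
    unranked : sorted names of algorithms with no run at or above the threshold
    """
    best = {}
    seen = set()
    for run in runs:
        seen.add(run.algorithm)
        if run.recall >= threshold:
            # Keep the fastest run that still meets the accuracy threshold.
            best[run.algorithm] = max(best.get(run.algorithm, 0.0), run.qps)
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    unranked = sorted(seen - set(best))
    return ranked, unranked


# Made-up numbers, for illustration only.
runs = [
    Run("algo_a", 0.95, 1000.0),
    Run("algo_a", 0.91, 1500.0),
    Run("algo_a", 0.85, 9000.0),  # fast but below the threshold, ignored
    Run("algo_b", 0.92, 1200.0),
    Run("algo_c", 0.80, 5000.0),  # never qualifies, so it is listed unranked
]
ranked, unranked = rank_track(runs)
for position, (name, qps) in enumerate(ranked, start=1):
    print(position, name, qps)
print("unranked:", unranked)
```

Algorithms that never reach the threshold stay in the output but receive no rank, which mirrors the entries without a rank or qps value in the table at the top.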

Track: Filter

Track: Sparse

Track: OOD

Track: Streaming

TODO

Data Export

The full data export CSV file can be found here.

Hardware_Inventory

Via lshw
Via hwinfo
Via procinfo

How_To_Reproduce

This section lists the steps to reproduce the results above from scratch.

System Preparation

Sign up for, or sign in to, your Latitude account.
Provision an "m4-metal-medium" instance with at least 100GB of NVMe SSD storage, running Ubuntu Linux 20.04.6 LTS.
Connect to the instance via ssh.
Update the package lists with the command sudo apt-get update
Install Anaconda for Linux.
Run the following commands:

git clone git@github.com:harsha-simhadri/big-ann-benchmarks.git
cd big-ann-benchmarks
conda create -n bigann-latitude-m4-metal-medium python=3.10
conda activate bigann-latitude-m4-metal-medium
python -m pip install -r requirements_py3.10.txt

Sparse Track

Prepare the track dataset by running the following command in the top-level directory of the repository:

python create_dataset.py --dataset sparse-full

See the latitude/commands directory for individual algorithm scripts.

Filter Track

Prepare the track dataset by running the following command in the top-level directory of the repository:

python create_dataset.py --dataset yfcc-10M

See the latitude/commands directory for individual algorithm scripts.

OOD Track

Prepare the track dataset by running the following command in the top-level directory of the repository:

python create_dataset.py --dataset text2image-10M

See the latitude/commands directory for individual algorithm scripts.

Streaming Track

Prepare the track dataset by running the following command in the top-level directory of the repository:

python create_dataset.py --dataset msturing-30M-clustered
python -m benchmark.streaming.download_gt --runbook_file neurips23/streaming/final_runbook.yaml --dataset msturing-30M-clustered

See the latitude/commands directory for individual algorithm scripts.

Analysis

To extract the data as CSV:

sudo chmod ugo+rw -R ./results/ # recursively add read/write permissions to directories and files under the results directory.
python data_export.py --recompute --output neurips23/latitude/data_export_m4-metal-medium.csv

To plot individual tracks:

python plot.py --neurips23track sparse --output neurips23/latitude/sparse.png --raw --recompute --dataset sparse-full
python plot.py --neurips23track filter --output neurips23/latitude/filter.png --raw --recompute --dataset yfcc-10M
python plot.py --neurips23track ood --output neurips23/latitude/ood.png --raw --recompute --dataset text2image-10M
TODO: streaming track
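
The Results section describes the per-track plots as Pareto graphs. As a rough illustration of what a Pareto frontier over (recall, qps) points means, here is a minimal sketch; it is not the logic inside plot.py, and the `pareto_frontier` helper and the sample points are hypothetical.

```python
def pareto_frontier(points):
    """Return the (recall, qps) points that no other point dominates.

    A point is dominated when some other point is at least as good on both
    recall and qps and strictly better on one of them.
    The result is sorted by ascending recall.
    """
    frontier = []
    best_qps = float("-inf")
    # Visit points from highest recall to lowest; on equal recall, fastest first.
    for recall, qps in sorted(points, key=lambda p: (-p[0], -p[1])):
        # A point survives only if it is faster than every point with
        # equal or higher recall seen so far.
        if qps > best_qps:
            frontier.append((recall, qps))
            best_qps = qps
    return frontier[::-1]


# Made-up operating points, for illustration only.
points = [
    (0.80, 5000.0),
    (0.90, 3000.0),
    (0.85, 2500.0),  # dominated by (0.90, 3000.0)
    (0.95, 1000.0),
    (0.95, 800.0),   # dominated by (0.95, 1000.0)
]
print(pareto_frontier(points))
```

Each remaining point is a trade-off that cannot be improved on one axis without losing on the other.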

To render the ranking table, see this notebook.

Disclaimers_And_Credits

The hardware systems were graciously donated by Latitude.
None of the NeurIPS2021/23 organizers is an employee of, or otherwise affiliated with, Latitude.
George Williams, an organizer for both the NeurIPS2021 and NeurIPS2023 Competitions, ran the evaluations described above.
Our main contact at Latitude is Victor Chiea, to whom we were introduced by Harald Carlens from MLContests.
Latitude logo for sponsorship attribution below (note: it has a transparent background):
