Hi,
I hope you are doing well.
I am building a Qdrant database locally, with at least 30 million vectors embedded in 3072 dimensions using OpenAI's text-embedding-3-large model, after a few tries with local models. Each vector represents a phrase scraped from forums.
I found Qdrant truly amazing, as it has a large number of features and optimization settings. I am quite new to AI with Python.
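For context, the ingestion side looks roughly like this (a simplified sketch: the collection name, batching, and payload fields here are illustrative, not my exact code):

from openai import OpenAI
from qdrant_client import QdrantClient, models

openai_client = OpenAI()
qdrant_client = QdrantClient(url="http://localhost:6333")

def embed_batch(phrases: list[str]) -> list[list[float]]:
    # text-embedding-3-large returns 3072-dimensional vectors by default
    response = openai_client.embeddings.create(
        model="text-embedding-3-large",
        input=phrases,
    )
    return [item.embedding for item in response.data]

def upsert_phrases(phrases: list[str], start_id: int) -> None:
    vectors = embed_batch(phrases)
    qdrant_client.upsert(
        collection_name="reviews",  # placeholder name
        points=[
            models.PointStruct(
                id=start_id + i,
                vector=vector,
                payload={"text": phrase},
            )
            for i, (phrase, vector) in enumerate(zip(phrases, vectors))
        ],
    )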
My current configuration is the following:
{ "params":{ "vectors":{ "size":3072 "distance":"Cosine" "on_disk":true } "shard_number":1 "replication_factor":1 "write_consistency_factor":1 "on_disk_payload":true } "hnsw_config":{ "m":16 "ef_construct":100 "full_scan_threshold":10000 "max_indexing_threads":0 "on_disk":false } "optimizer_config":{ "deleted_threshold":0.2 "vacuum_min_vector_number":1000 "default_segment_number":0 "max_segment_size": NULL "memmap_threshold": NULL "indexing_threshold":20000 "flush_interval_sec":5 "max_optimization_threads": NULL } "wal_config":{ "wal_capacity_mb":32 "wal_segments_ahead":0 } "quantization_config":{ "binary":{ "always_ram":true } } "strict_mode_config":{ "enabled":false } }
When I retrieve the top 1000 comments similar to a vectorized search query, the search takes only about 10 seconds with this setup. But when I search for the top 5000, the search takes 86 seconds, and for the top 10000, more than 120 seconds. The search time does not scale linearly with the search limit.
My current search query is the following:
search_result = qdrant_client.query_points(
    collection_name=REVIEWS_COLLECTION_NAME,
    query=embedded_query,  # 3072-dim embedding of the search text
    with_payload=True,
    limit=top_k,
    search_params=models.SearchParams(
        quantization=models.QuantizationSearchParams(
            ignore=False,      # use the binary-quantized index
            rescore=True,      # re-score candidates with the original vectors
            oversampling=2.0,  # fetch 2x candidates before rescoring
        ),
    ),
    score_threshold=0.6,
    timeout=120,
)
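The timings above are plain wall-clock measurements around this call, something like:

import time

start = time.perf_counter()
search_result = qdrant_client.query_points(
    collection_name=REVIEWS_COLLECTION_NAME,
    query=embedded_query,
    limit=top_k,
    timeout=120,
)
elapsed = time.perf_counter() - start
print(f"top_k={top_k}: {elapsed:.1f}s")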
And I am running Qdrant on a VM with 64 GB of RAM, 16 vCPUs, and 1 TB of disk.
My question is: do you have any tips on how to improve my configuration to increase the search speed without losing too much accuracy? I have already tried to keep my HNSW index in RAM, but it takes too much space.
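(For that HNSW-in-RAM attempt, I used a collection update along these lines; again a simplified sketch:)

qdrant_client.update_collection(
    collection_name=REVIEWS_COLLECTION_NAME,
    hnsw_config=models.HnswConfigDiff(
        on_disk=False,  # move the HNSW graph into RAM
    ),
)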
Thanks a lot for your help.