Hi,
I hope you are doing well.
I am building a Qdrant database locally, with at least 30 million vectors embedded in 3072 dimensions using OpenAI's text-embedding-3-large model, after a few tries with local models. Each vector represents a phrase scraped from forums.
I found Qdrant truly amazing, as it has a large number of features and optimization settings. I am quite new to AI with Python.
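For context, the ingestion side looks roughly like this (a simplified sketch: the collection name, batching, and payload fields here are illustrative, not my exact code):

from openai import OpenAI
from qdrant_client import QdrantClient, models

openai_client = OpenAI()
qdrant_client = QdrantClient(url="http://localhost:6333")

def embed_batch(phrases: list[str]) -> list[list[float]]:
    # text-embedding-3-large returns 3072-dimensional vectors by default
    response = openai_client.embeddings.create(
        model="text-embedding-3-large",
        input=phrases,
    )
    return [item.embedding for item in response.data]

def upsert_phrases(phrases: list[str], start_id: int) -> None:
    vectors = embed_batch(phrases)
    qdrant_client.upsert(
        collection_name="reviews",  # placeholder name
        points=[
            models.PointStruct(
                id=start_id + i,
                vector=vector,
                payload={"text": phrase},
            )
            for i, (phrase, vector) in enumerate(zip(phrases, vectors))
        ],
    )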
My current configuration is the following:
{ "params":{ "vectors":{ "size":3072 "distance":"Cosine" "on_disk":true } "shard_number":1 "replication_factor":1 "write_consistency_factor":1 "on_disk_payload":true } "hnsw_config":{ "m":16 "ef_construct":100 "full_scan_threshold":10000 "max_indexing_threads":0 "on_disk":false } "optimizer_config":{ "deleted_threshold":0.2 "vacuum_min_vector_number":1000 "default_segment_number":0 "max_segment_size": NULL "memmap_threshold": NULL "indexing_threshold":20000 "flush_interval_sec":5 "max_optimization_threads": NULL } "wal_config":{ "wal_capacity_mb":32 "wal_segments_ahead":0 } "quantization_config":{ "binary":{ "always_ram":true } } "strict_mode_config":{ "enabled":false } }
When I retrieve the top 1000 comments similar to a vectorized search query, the search takes only about 10 seconds with this setup. But when I search for the top 5000, the search takes 86 seconds, and for the top 10000, more than 120 seconds. The search time does not scale linearly with the search limit.
My current search query is the following:
search_result = qdrant_client.query_points(
    collection_name=REVIEWS_COLLECTION_NAME,
    query=embedded_query,  # 3072-dim embedding of the search text
    with_payload=True,
    limit=top_k,
    search_params=models.SearchParams(
        quantization=models.QuantizationSearchParams(
            ignore=False,      # use the binary-quantized index
            rescore=True,      # re-score candidates with the original vectors
            oversampling=2.0,  # fetch 2x candidates before rescoring
        ),
    ),
    score_threshold=0.6,
    timeout=120,
)
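The timings above are plain wall-clock measurements around this call, something like:

import time

start = time.perf_counter()
search_result = qdrant_client.query_points(
    collection_name=REVIEWS_COLLECTION_NAME,
    query=embedded_query,
    limit=top_k,
    timeout=120,
)
elapsed = time.perf_counter() - start
print(f"top_k={top_k}: {elapsed:.1f}s")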
And I am running Qdrant on a VM with 64 GB of RAM, 16 vCPUs, and 1 TB of disk.
My question is: do you have any tips on how to improve my configuration to increase the search speed without losing too much accuracy? I have already tried to keep my HNSW index in RAM, but it takes too much space.
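(For that HNSW-in-RAM attempt, I used a collection update along these lines; again a simplified sketch:)

qdrant_client.update_collection(
    collection_name=REVIEWS_COLLECTION_NAME,
    hnsw_config=models.HnswConfigDiff(
        on_disk=False,  # move the HNSW graph into RAM
    ),
)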
Thanks a lot for your help.