dense vector/embeddings dimension size #92458
Comments
Pinging @elastic/es-search (Team:Search) |
I just ran into this limit. For people facing the same limitation: AWS OpenSearch supports 10,000 dimensions. I love Elasticsearch Cloud because of its UX, but I'm going to AWS for now because my use case, the same as @aykutfirat's, requires 1536 floats. |
The limit exists because Elasticsearch uses the Lucene implementation of vector values:
/** The maximum length of a vector */
public static final int MAX_DIMENSIONS = 1024;
There is an ongoing discussion in apache/lucene#874 and apache/lucene#11507, but the maintainers appear very reluctant to make that change. Elasticsearch could do what OpenSearch did: it implemented multiple engines and lets people decide (docs here). This way OpenSearch supports up to 10,000 dimensions with the Faiss or nmslib engines, while people can still use the Lucene engine, limited to 1024 dimensions, and make their own tradeoffs. |
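To make the limit concrete, here is a minimal Python sketch of the kind of check that rejects a 1536-float embedding. The constant mirrors the Lucene value quoted above; the `fits` helper and its parameters are hypothetical names used only for illustration, not part of Lucene, Elasticsearch, or OpenSearch.

```python
# Value of Lucene's MAX_DIMENSIONS as quoted in this thread.
LUCENE_MAX_DIMENSIONS = 1024


def fits(dims: int, max_dims: int = LUCENE_MAX_DIMENSIONS) -> bool:
    """Return True if a vector with `dims` dimensions is within `max_dims`.

    Hypothetical helper: it only illustrates the comparison, it is not
    how any engine actually validates mappings.
    """
    return 1 <= dims <= max_dims


# text-embedding-ada-002 vectors (1536 floats) exceed the 1024 limit...
print(fits(1536))
# ...but fit under the 10,000 limit mentioned for the Faiss and nmslib engines.
print(fits(1536, max_dims=10_000))
```

With the default limit the first call reports that the vector does not fit, which is the situation described in this issue.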
Addressed by #95257 |
How do I use 2048 dimensions? Do I need to update my Elasticsearch version? |
@nik13 when 8.8 is released, you can specify your dimensions in the mapping up to the limit of 2048. So, upgrade to 8.8 when it is released. |
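For anyone wondering what "specify your dimensions in the mapping" might look like, below is a rough Python sketch that builds an index-creation body with a `dense_vector` field. The field name `embedding` and the helper `dense_vector_mapping` are hypothetical; the 2048 limit is the one stated in this thread for 8.8, so adjust it to whatever your version actually supports.

```python
import json

# Limit stated in this thread for Elasticsearch 8.8 (assumption for this sketch).
MAX_DIMS = 2048


def dense_vector_mapping(field: str, dims: int) -> dict:
    """Build a create-index body with one dense_vector field.

    Hypothetical helper for illustration; it only assembles a dict.
    """
    if not 1 <= dims <= MAX_DIMS:
        raise ValueError(f"dims={dims} is outside the range 1..{MAX_DIMS}")
    return {
        "mappings": {
            "properties": {
                field: {"type": "dense_vector", "dims": dims},
            }
        }
    }


# 1536 matches the text-embedding-ada-002 vector size from the issue description.
body = dense_vector_mapping("embedding", 1536)
print(json.dumps(body, indent=2))
```

The printed JSON is what you would send as the request body when creating the index, for example through the REST API or a client library of your choice.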
@benwtrent sorry to bother you. Do you know the timeline for the 8.8 release? |
Hi, I understand that 2048 dimensions for dense_vector have been available since the 8.8 release. Can anyone tell me when this feature will reach GA, or how to track its progress toward GA? |
8.10 |
Description
The latest OpenAI embedding model (text-embedding-ada-002) produces vectors of size 1536. OpenAI embeddings are perceived as state of the art and are offered at a very good price.
Can you increase the dense vector size limit so that we can use these kinds of models?
Thank you!