Elasticsearch vector store #771

Closed
dkarlovi opened this issue Apr 13, 2023 · 6 comments

@dkarlovi
Contributor

dkarlovi commented Apr 13, 2023

Please consider porting the vector store support for ES / OS:
https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/elastic_vector_search.py

Sidenote

I was playing with using the current OpenSearch adapter with ES to prototype this, and it seems there's a bit more work than expected. Namely, ES v7 doesn't support the knn query type that OS supports; it only allows cosineSimilarity(), as implemented in the Python version linked above.
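
For illustration, here is a minimal sketch of what a cosineSimilarity()-based search body could look like on ES v7. The function name, the `vector` field name and the `k` default are assumptions made for this example, and the exact script syntax for referencing the field differs between 7.x minor versions, so treat this as a starting point rather than the adapter's actual implementation.

```python
from __future__ import annotations

from typing import Any


def build_script_score_query(
    query_vector: list[float],
    k: int = 4,
    vector_field: str = "vector",  # hypothetical field name
) -> dict[str, Any]:
    """Build a brute-force cosine similarity search body (ES 7.x style)."""
    return {
        "size": k,
        "query": {
            "script_score": {
                # Score every document; a filter query could narrow this down.
                "query": {"match_all": {}},
                "script": {
                    # cosineSimilarity is in [-1, 1]; adding 1.0 keeps the
                    # final score non-negative, which script_score requires.
                    "source": (
                        f"cosineSimilarity(params.query_vector, '{vector_field}') + 1.0"
                    ),
                    "params": {"query_vector": query_vector},
                },
            }
        },
    }
```

Because every matching document is scored, the cost grows linearly with the number of documents, which is the tradeoff discussed in this issue.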

ES v8 does support knn (which they call "Approximate kNN"), which supposedly performs better with a large number of documents, but it comes with a caveat: vectors can have at most 1024 dimensions (with the index: true setting, which the knn query type requires), while OpenAI embeddings from text-embedding-ada-002 have 1536 dimensions.
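
As a rough sketch, an approximate kNN search body on ES v8 might be built like this. The function name, the `vector` field name and the default values are assumptions for illustration; the `knn` section with `field`, `query_vector`, `k` and `num_candidates` follows the ES v8 search API, but the details depend on the exact 8.x version in use.

```python
from __future__ import annotations

from typing import Any


def build_knn_search_body(
    query_vector: list[float],
    k: int = 4,
    num_candidates: int = 50,
    vector_field: str = "vector",  # hypothetical field name
) -> dict[str, Any]:
    """Build an approximate kNN search body (ES 8.x style)."""
    return {
        "knn": {
            # The field must be a dense_vector mapped with index: true.
            "field": vector_field,
            "query_vector": query_vector,
            # Number of nearest neighbours to return.
            "k": k,
            # Candidates considered per shard; higher is slower but more accurate.
            "num_candidates": num_candidates,
        }
    }
```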

TLDR:

  1. the base approach taken in the Python version works
  2. with a large number of documents and/or advanced features, the new approach should be taken, which will cause some implementation issues and tradeoffs.

Update: as of ES 8.8, vectors can have up to 2048 dimensions, meaning ada-002 embeddings should fit into dense vectors on which ES can run kNN queries.
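
To make the dimension limits concrete, here is a small sketch that checks an embedding size against the limits reported in this thread (1024 before ES 8.8, 2048 from 8.8) before building an indexed dense_vector mapping. The function names and the `vector` field name are hypothetical, the version is assumed to be 8.x and passed in as a `(major, minor)` tuple, and the limits are hard-coded from the discussion above rather than read from ES.

```python
from __future__ import annotations

from typing import Any


def max_indexed_dims(es_version: tuple[int, int]) -> int:
    """Return the indexed-vector dimension limit as reported in this thread."""
    return 2048 if es_version >= (8, 8) else 1024


def build_dense_vector_mapping(
    dims: int,
    es_version: tuple[int, int],
    vector_field: str = "vector",  # hypothetical field name
) -> dict[str, Any]:
    """Build an index mapping for a kNN-searchable dense_vector field (ES 8.x)."""
    limit = max_indexed_dims(es_version)
    if dims > limit:
        raise ValueError(
            f"{dims} dimensions exceed the limit of {limit} for ES {es_version}"
        )
    return {
        "properties": {
            vector_field: {
                "type": "dense_vector",
                "dims": dims,
                # index: true is what enables the knn query type.
                "index": True,
                "similarity": "cosine",
            }
        }
    }
```

With these assumptions, a 1536-dimension ada-002 embedding is rejected for ES 8.7 and accepted for ES 8.8.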

@dkarlovi
Contributor Author

dkarlovi commented Apr 14, 2023

OpenSearch support was merged in #792.

@dkarlovi changed the title from "Elasticsearch / Opensearch vector store" to "Elasticsearch vector store" on Apr 14, 2023

@zhengyuan-ehps

Not sure if this is related, but MongoDB Atlas Search also hits the 1024-dimension restriction when using a knn vector field, which means OpenAI embeddings are not compatible with the Mongo vector store either.

@dkarlovi
Contributor Author

@zhengyuan-ehps IIRC you can set the max dimensions on the embeddings model; I'm not sure how much impact this would have on quality, though.

@peterkarman1

is this happening? some of us are stuck on es7 >_<

@dkarlovi
Contributor Author

dkarlovi commented Jun 5, 2023

@peterkarman1 you won't be able to use the knn query, which might not be an issue for you.

Overall, the adapter should be very similar to the existing OpenSearch one, with very minor tweaks. I'm sure PRs are welcome.

@dkarlovi
Contributor Author

Closed in #1810.
