feat: Vector Search #2526

Closed
wants to merge 11 commits into from

@hinthornw hinthornw commented Nov 25, 2024

Right now you enable it by:

  • Initializing the store with an 'embedding config'. This contains the 'dims' (used to create the table) and the encoder object (right now a LangChain embeddings object, though that is ......)
  • Calling setup(), which creates the vector table.
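
As a rough sketch of the two steps above: the config carries the vector width and the encoder, and setup uses the width to build the table. Everything named here (`EmbeddingConfig`, `vector_table_ddl`, the `store_vectors` table and its columns) is hypothetical and only for illustration, and the DDL assumes pgvector's `vector(n)` column type; it is not the code in this PR.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class EmbeddingConfig:
    """Hypothetical stand-in for the 'embedding config' (not the PR's class)."""

    dims: int  # vector width, needed when the table is created
    embed: Callable[[Sequence[str]], List[List[float]]]  # the encoder


def vector_table_ddl(config: EmbeddingConfig, table: str = "store_vectors") -> str:
    """Build the statement a setup() step might run (assumes pgvector)."""
    if config.dims <= 0:
        raise ValueError("dims must be a positive integer")
    # Column names are illustrative; one row per (document, embedded path).
    return (
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "prefix text NOT NULL, key text NOT NULL, field_name text NOT NULL, "
        f"embedding vector({config.dims}), "
        "PRIMARY KEY (prefix, key, field_name))"
    )


def fake_embed(texts: Sequence[str]) -> List[List[float]]:
    """Toy encoder that returns zero vectors of the configured width."""
    return [[0.0] * 384 for _ in texts]


config = EmbeddingConfig(dims=384, embed=fake_embed)
ddl = vector_table_ddl(config)
```

The point of the sketch is only that 'dims' has to be known before any document is written, since it is baked into the column type.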

Each document has one or more vectors associated with it for each JSON path in the embedding config.

Would welcome critique and requests!

Especially around handling migrations, whether we want to support multimedia (multimodal embeddings are old school already, so it's kind of bad that the LangChain embeddings classes don't support them well), querying (do we need MMR or other strategies...?), and anything else that feels off.
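
For reference on the MMR question, here is a minimal sketch of maximal marginal relevance re-ranking over candidates that a plain similarity search has already returned. It is pure Python for illustration; the function names and the `lambda_mult` parameter are assumptions, not anything implemented in this PR.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr(
    query: Sequence[float],
    candidates: List[Sequence[float]],
    k: int,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return indices of up to k candidates, trading relevance for diversity.

    lambda_mult=1.0 is plain relevance ranking; lower values penalize
    candidates that are similar to ones already selected.
    """
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def unit(deg: float) -> List[float]:
    """2-D unit vector at the given angle, used only for the toy data."""
    return [math.cos(math.radians(deg)), math.sin(math.radians(deg))]


query = unit(0)
candidates = [unit(10), unit(12), unit(-50)]  # two near-duplicates, one outlier
diverse = mmr(query, candidates, k=2, lambda_mult=0.5)
relevant_only = mmr(query, candidates, k=2, lambda_mult=1.0)
```

With the toy vectors, the diversity-weighted call skips the near-duplicate second candidate in favor of the outlier, while `lambda_mult=1.0` keeps the two closest matches. Whether that is worth doing in the store, rather than in the caller, is the open question.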

Also happy to shelve the use of JSON paths for saving embeddings if we think that's too complicated.

Also, the default __root__ embedding of the whole JSON object retains the keys. I'm guessing this has a bit of an impact on the embedding, but it's probably better to have the text remain self-descriptive than to drop the keys before embedding the object.
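
To make the per-path behavior concrete, here is a small sketch of how the text for each configured path might be picked out of a document before it is embedded. The dotted path syntax and the helper names (`text_for_path`, `texts_to_embed`) are assumptions for illustration; only the `__root__` name and the "keys are retained" behavior come from the description above. Wildcard paths, which could yield several vectors per path, are left out.

```python
import json
from typing import Any, Dict, List, Optional

ROOT = "__root__"


def text_for_path(doc: Dict[str, Any], path: str) -> Optional[str]:
    """Return the text to embed for one path, or None if the path is absent."""
    if path == ROOT:
        # Serialize the whole object, keys included, so the text stays
        # self-descriptive.
        return json.dumps(doc, sort_keys=True)
    node: Any = doc
    for part in path.split("."):  # assumed dotted-path syntax
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    # Strings are embedded as-is; anything else is serialized first.
    return node if isinstance(node, str) else json.dumps(node, sort_keys=True)


def texts_to_embed(doc: Dict[str, Any], paths: List[str]) -> Dict[str, str]:
    """Map each configured path that exists in the document to its text."""
    out: Dict[str, str] = {}
    for path in paths:
        text = text_for_path(doc, path)
        if text is not None:
            out[path] = text
    return out


doc = {"title": "Vector search", "meta": {"author": "wfh"}}
texts = texts_to_embed(doc, [ROOT, "meta.author", "missing.field"])
```

Here `texts` has one entry for `__root__` (the full JSON, with keys) and one for `meta.author`, and silently skips the path that does not exist in the document.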

@hinthornw hinthornw changed the base branch from main to wfh/_/parallel_tests November 25, 2024 06:06
@hinthornw hinthornw changed the title from Wfh/ /pgsearch to feat: Vector Search Nov 25, 2024

@hinthornw hinthornw left a comment

Several TODOs.

libs/checkpoint-postgres/langgraph/store/postgres/base.py (outdated)
- Cohere embed-multilingual-light-v3.0: 384
"""

embed: Embeddings

This should probably accept a raw function.

Also, I don't think we support multimodal well right now.
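
One way this could look, as a sketch only: normalize whatever is passed as `embed` into a plain callable, accepting either an object with an `embed_documents(texts)` method (the shape LangChain embeddings classes expose) or a raw function. `ensure_embed_fn` and the toy encoders below are hypothetical names, not part of this PR.

```python
from typing import Any, Callable, List, Sequence

EmbedFn = Callable[[Sequence[str]], List[List[float]]]


def ensure_embed_fn(embed: Any) -> EmbedFn:
    """Accept an embeddings-like object or a raw function; return a function."""
    method = getattr(embed, "embed_documents", None)
    if callable(method):
        # Object with an embed_documents(texts) method.
        return lambda texts: method(list(texts))
    if callable(embed):
        # Already a raw function: use it unchanged.
        return embed
    raise TypeError("embed must be callable or define embed_documents()")


class FakeEmbeddings:
    """Toy object-style encoder: one dimension, the text length."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [[float(len(t))] for t in texts]


def char_count_embed(texts: Sequence[str]) -> List[List[float]]:
    """Toy function-style encoder with the same behavior."""
    return [[float(len(t))] for t in texts]


from_object = ensure_embed_fn(FakeEmbeddings())
from_function = ensure_embed_fn(char_count_embed)
```

Both `from_object` and `from_function` can then be called the same way by the store, which keeps the rest of the code independent of where the encoder came from.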

libs/checkpoint-postgres/langgraph/store/postgres/base.py (outdated)
value: The stored value.
created_at: When the item was first created.
updated_at: When the item was last updated.
response_metadata: Optional metadata about the response/result.

Right now this contains the similarity score.

Base automatically changed from wfh/_/parallel_tests to main November 25, 2024 20:19
@hinthornw hinthornw marked this pull request as ready for review November 26, 2024 00:38
@hinthornw hinthornw marked this pull request as draft November 26, 2024 00:38
@hinthornw hinthornw closed this Nov 26, 2024
hinthornw commented Nov 26, 2024

Transferred to #2535
