chore(docs): add embedding integration contribution docs #3493

Merged
merged 2 commits into from
Dec 4, 2023
2 changes: 1 addition & 1 deletion .github/contributing/INTEGRATIONS.md
@@ -152,7 +152,7 @@ Below are links to guides with advice and tips for specific types of integration
- [Vector stores](https://github.com/langchain-ai/langchainjs/blob/main/.github/contributing/integrations/VECTOR_STORES.md) (e.g. Pinecone)
- [Persistent message stores](https://github.com/langchain-ai/langchainjs/blob/main/.github/contributing/integrations/MESSAGE_STORES.md) (used to persistently store and load raw chat histories, e.g. Redis)
- [Document loaders](https://github.com/langchain-ai/langchainjs/blob/main/.github/contributing/integrations/DOCUMENT_LOADERS.md) (used to load documents for later storage into vector stores, e.g. Apify)
-- Embeddings (TODO) (e.g. Cohere)
+- [Embeddings](https://github.com/langchain-ai/langchainjs/blob/main/.github/contributing/integrations/EMBEDDINGS.md) (used to create embeddings of text documents or strings e.g. Cohere)
- [Tools](https://github.com/langchain-ai/langchainjs/blob/main/.github/contributing/integrations/TOOLS.md) (used for agents, e.g. the SERP API tool)

This is a living document, so please make a pull request if we're missing anything useful!
21 changes: 21 additions & 0 deletions .github/contributing/integrations/EMBEDDINGS.md
@@ -0,0 +1,21 @@
# Contributing third-party Text Embeddings

This page contains some specific guidelines and examples for contributing integrations with third-party Text Embedding providers.

**Make sure you read the [general guidelines page](https://github.com/langchain-ai/langchainjs/blob/main/.github/contributing/INTEGRATIONS.md) first!**

## Example PR

We'll be referencing this PR adding Gradient Embeddings as an example: https://github.com/langchain-ai/langchainjs/pull/3475

## General ideas

The general idea for adding new third-party Text Embeddings is to subclass the `Embeddings` class and implement the `embedDocuments` and `embedQuery` methods.

The `embedDocuments` method should take a list of documents (strings) and return a list containing one embedding per document, in the same order. The `embedQuery` method should take a single query string and return one embedding for that query.

`embedQuery` can typically be implemented by calling `embedDocuments` with a list containing only the query.
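
As a rough illustration of the shape described above, here is a minimal sketch. It is not taken from the example PR: `EmbeddingsSketchBase` is a stand-in for the real `Embeddings` base class (reduced to the two abstract methods so the snippet runs on its own), and `fakeProviderEmbed` is a hypothetical placeholder for a real provider request.

```typescript
// Stand-in for LangChain's `Embeddings` base class (assumption: reduced to
// the two abstract methods so this sketch is self-contained).
abstract class EmbeddingsSketchBase {
  abstract embedDocuments(documents: string[]): Promise<number[][]>;
  abstract embedQuery(document: string): Promise<number[]>;
}

// Hypothetical provider call: returns one toy vector per input text,
// made of the character count and the space-separated word count.
async function fakeProviderEmbed(texts: string[]): Promise<number[][]> {
  return texts.map((t) => [t.length, t.split(" ").length]);
}

class ExampleEmbeddings extends EmbeddingsSketchBase {
  // One embedding per document, in input order.
  async embedDocuments(documents: string[]): Promise<number[][]> {
    return fakeProviderEmbed(documents);
  }

  // Reuse embedDocuments with a single-element list and unwrap the result.
  async embedQuery(document: string): Promise<number[]> {
    const [embedding] = await this.embedDocuments([document]);
    return embedding;
  }
}
```

In a real integration the class would extend `Embeddings` and call the provider's API instead of the placeholder function.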

## Wrap Text Embeddings requests in `this.caller`

The base `Embeddings` class contains an instance property called `caller` that automatically handles retries, errors, timeouts, and more. You should wrap calls to the embedding provider in `this.caller.call`, [as shown here](https://github.com/langchain-ai/langchainjs/blob/f469ec00d945a3f8421b32f4be78bce3f66a74bb/langchain/src/embeddings/gradient_ai.ts#L72).
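
To make the wrapping pattern concrete, here is a minimal sketch under stated assumptions. `SketchCaller` is a simplified stand-in written for this example, not the library's implementation: it only retries, whereas the real `caller` handles more. `flakyProviderEmbed` is a hypothetical provider request that fails once and then succeeds.

```typescript
// Simplified stand-in for the `caller` property (assumption: retries only).
class SketchCaller {
  maxRetries = 2;

  // Invoke `fn` with `args`, retrying on failure up to `maxRetries` times.
  async call<A extends unknown[], R>(
    fn: (...args: A) => Promise<R>,
    ...args: A
  ): Promise<R> {
    let lastError: unknown;
    for (let attempt = 0; attempt <= this.maxRetries; attempt += 1) {
      try {
        return await fn(...args);
      } catch (err) {
        lastError = err;
      }
    }
    throw lastError;
  }
}

// Hypothetical flaky provider: the first request fails, later ones succeed.
let requestCount = 0;
async function flakyProviderEmbed(texts: string[]): Promise<number[][]> {
  requestCount += 1;
  if (requestCount === 1) throw new Error("transient network error");
  return texts.map((t) => [t.length]);
}

class RetryingEmbeddings {
  caller = new SketchCaller();

  async embedDocuments(documents: string[]): Promise<number[][]> {
    // Wrap the provider request so the caller can retry it on failure.
    return this.caller.call(flakyProviderEmbed, documents);
  }

  async embedQuery(document: string): Promise<number[]> {
    const [embedding] = await this.embedDocuments([document]);
    return embedding;
  }
}
```

The point of routing every provider request through the caller is that retry behaviour lives in one place rather than being reimplemented in each integration.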