Nc/opensearch merge (langchain-ai#792)
* feat(vectorstore): implement opensearch vector store (langchain-ai#675)

* feat(vectorstore): implement opensearch vector store

* feat(opensearch): implement filtering by metadata attributes + integration test

* Update entrypoints, imports, docs

---------

Co-authored-by: Igor Shapiro <[email protected]>
2 people authored and RohitMidha23 committed Apr 18, 2023
1 parent 0e52a00 commit 0b8ffb4
Showing 13 changed files with 512 additions and 3 deletions.
112 changes: 112 additions & 0 deletions docs/docs/modules/indexes/vector_stores/integrations/opensearch.md
@@ -0,0 +1,112 @@
---
sidebar_class_name: node-only
---

# OpenSearch

:::tip Compatibility
Only available on Node.js.
:::

[OpenSearch](https://opensearch.org/) is an open-source fork of [Elasticsearch](https://www.elastic.co/elasticsearch/) that remains largely compatible with the Elasticsearch 7.10 API. It supports approximate nearest neighbor (ANN) search through its k-NN plugin; see the [approximate k-NN documentation](https://opensearch.org/docs/latest/search-plugins/knn/approximate-knn/).

LangChain.js uses [@opensearch-project/opensearch](https://opensearch.org/docs/latest/clients/javascript/index/) as the client for the OpenSearch vector store.

## Setup

```bash npm2yarn
npm install -S @opensearch-project/opensearch
```

You also need a running OpenSearch instance. The [official Docker image](https://opensearch.org/docs/latest/opensearch/install/docker/) is an easy way to get started, and an example docker-compose file is available [here](https://github.com/hwchase17/langchainjs/blob/main/examples/src/indexes/vector_stores/opensearch/docker-compose.yml).

## Index docs

```typescript
import { Client } from "@opensearch-project/opensearch";
import { Document } from "langchain/document";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { OpenSearchVectorStore } from "langchain/vectorstores/opensearch";

const client = new Client({
  nodes: [process.env.OPENSEARCH_URL ?? "http://127.0.0.1:9200"],
});

const docs = [
  new Document({
    metadata: { foo: "bar" },
    pageContent: "opensearch is also a vector db",
  }),
  new Document({
    metadata: { foo: "bar" },
    pageContent: "the quick brown fox jumped over the lazy dog",
  }),
  new Document({
    metadata: { baz: "qux" },
    pageContent: "lorem ipsum dolor sit amet",
  }),
  new Document({
    metadata: { baz: "qux" },
    pageContent:
      "OpenSearch is a scalable, flexible, and extensible open-source software suite for search, analytics, and observability applications",
  }),
];

await OpenSearchVectorStore.fromDocuments(docs, new OpenAIEmbeddings(), {
  client,
  indexName: process.env.OPENSEARCH_INDEX, // Will default to `documents`
});
```
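
For context, a vector store built on OpenSearch needs a k-NN enabled index behind it. The following is a minimal sketch of what an index-creation body for such an index could look like; it is not taken from the LangChain.js implementation. The helper name `buildKnnIndexBody`, the field names `embedding`, `text`, and `metadata`, and the HNSW / `nmslib` / `l2` method settings are assumptions for illustration, and the dimension of 1536 assumes OpenAI's `text-embedding-ada-002` embeddings.

```typescript
// Hypothetical shape of an index-creation body for a k-NN enabled index.
// Field names (embedding, text, metadata) are assumptions, not the library's.
interface KnnIndexBody {
  settings: { index: { knn: boolean } };
  mappings: {
    properties: {
      embedding: {
        type: "knn_vector";
        dimension: number;
        method: { name: string; engine: string; space_type: string };
      };
      text: { type: "text" };
      metadata: { type: "object" };
    };
  };
}

// Hypothetical helper: builds the body for a given embedding dimension.
function buildKnnIndexBody(dimension: number): KnnIndexBody {
  if (!Number.isInteger(dimension) || dimension <= 0) {
    throw new Error(`dimension must be a positive integer, got ${dimension}`);
  }
  return {
    // `index.knn: true` enables the k-NN plugin for this index.
    settings: { index: { knn: true } },
    mappings: {
      properties: {
        embedding: {
          type: "knn_vector",
          dimension,
          // Assumed method settings; other engines and space types exist.
          method: { name: "hnsw", engine: "nmslib", space_type: "l2" },
        },
        text: { type: "text" },
        metadata: { type: "object" },
      },
    },
  };
}

// 1536 is an assumption matching OpenAI's text-embedding-ada-002 vectors.
const body = buildKnnIndexBody(1536);
console.log(JSON.stringify(body, null, 2));
```

With the official client, a body like this could be passed to `client.indices.create({ index: "documents", body })`. The dimension has to match the embedding model in use, since a `knn_vector` field rejects vectors of any other length.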

## Query docs

```typescript
import { Client } from "@opensearch-project/opensearch";
import { VectorDBQAChain } from "langchain/chains";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { OpenAI } from "langchain/llms";
import { OpenSearchVectorStore } from "langchain/vectorstores/opensearch";

const client = new Client({
  nodes: [process.env.OPENSEARCH_URL ?? "http://127.0.0.1:9200"],
});

const vectorStore = new OpenSearchVectorStore(new OpenAIEmbeddings(), {
  client,
});

/* Search the vector store directly (this call passes no metadata filter) */
const results = await vectorStore.similaritySearch("hello world", 1);
console.log(JSON.stringify(results, null, 2));
/* [
    {
      "pageContent": "Hello world",
      "metadata": {
        "id": 2
      }
    }
  ] */

/* Use as part of a chain (currently no metadata filters) */
const model = new OpenAI();
const chain = VectorDBQAChain.fromLLM(model, vectorStore, {
  k: 1,
  returnSourceDocuments: true,
});
const response = await chain.call({ query: "What is opensearch?" });

console.log(JSON.stringify(response, null, 2));
/*
  {
    "text": " Opensearch is a collection of technologies that allow search engines to publish search results in a standard format, making it easier for users to search across multiple sites.",
    "sourceDocuments": [
      {
        "pageContent": "What's this?",
        "metadata": {
          "id": 3
        }
      }
    ]
  }
*/
```
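
The commit behind this page also adds filtering by metadata attributes. As a rough illustration of how a flat metadata filter could be translated into an OpenSearch query, here is a sketch that combines a `knn` clause with `term` filters inside a `bool` query. The function `buildFilteredKnnQuery`, the `embedding` vector field, and the `metadata.<key>` field paths are assumptions for illustration; the query the vector store actually sends may differ.

```typescript
// A flat filter such as { foo: "bar" }; nested filters are out of scope here.
type MetadataValue = string | number | boolean;
type MetadataFilter = Record<string, MetadataValue>;

interface FilteredKnnQuery {
  size: number;
  query: {
    bool: {
      filter: Array<{ term: Record<string, MetadataValue> }>;
      must: Array<{ knn: { embedding: { vector: number[]; k: number } } }>;
    };
  };
}

// Hypothetical helper: builds a search body for a k-NN query with exact-match
// metadata filters. The `embedding` and `metadata.*` field names are assumed.
function buildFilteredKnnQuery(
  vector: number[],
  k: number,
  filter: MetadataFilter = {}
): FilteredKnnQuery {
  // One `term` clause per metadata key, e.g. { term: { "metadata.foo": "bar" } }.
  const termClauses = Object.entries(filter).map(([key, value]) => ({
    term: { [`metadata.${key}`]: value },
  }));
  return {
    size: k,
    query: {
      bool: {
        filter: termClauses,
        must: [{ knn: { embedding: { vector, k } } }],
      },
    },
  };
}

// A toy 3-dimensional vector; real embeddings are much longer.
const query = buildFilteredKnnQuery([0.1, 0.2, 0.3], 1, { foo: "bar" });
console.log(JSON.stringify(query, null, 2));
```

Two caveats apply to a query shaped like this. Exact-match `term` clauses behave most predictably on `keyword` fields, so string metadata may need an explicit mapping. A `bool` filter around an approximate k-NN clause is also applied after the nearest neighbors are found, so fewer than `k` documents can come back.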
3 changes: 2 additions & 1 deletion examples/.env.example
@@ -2,11 +2,12 @@ ANTHROPIC_API_KEY=ADD_YOURS_HERE # https://www.anthropic.com/
COHERE_API_KEY=ADD_YOURS_HERE # https://dashboard.cohere.ai/api-keys
HUGGINGFACEHUB_API_KEY=ADD_YOURS_HERE # https://huggingface.co/settings/tokens
OPENAI_API_KEY=ADD_YOURS_HERE # https://platform.openai.com/account/api-keys
OPENSEARCH_URL=ADD_YOURS_HERE # http://127.0.0.1:9200
PINECONE_API_KEY=ADD_YOURS_HERE # https://app.pinecone.io/organizations
PINECONE_ENVIRONMENT=ADD_YOURS_HERE
PINECONE_INDEX=ADD_YOURS_HERE # E.g. "trec-question-classification" when using "Cohere Trec" example index
REPLICATE_API_KEY=ADD_YOURS_HERE # https://replicate.com/account
SERPAPI_API_KEY=ADD_YOURS_HERE # https://serpapi.com/manage-api-key
SERPER_API_KEY=ADD_YOURS_HERE # https://serper.dev/api-key
SUPABASE_PRIVATE_KEY=ADD_YOURS_HERE # https://app.supabase.com/project/YOUR_PROJECT_ID/settings/api
SUPABASE_URL=ADD_YOURS_HERE # # https://app.supabase.com/project/YOUR_PROJECT_ID/settings/api
1 change: 1 addition & 0 deletions examples/package.json
@@ -23,6 +23,7 @@
  "license": "MIT",
  "dependencies": {
    "@getmetal/metal-sdk": "^1.0.12",
    "@opensearch-project/opensearch": "^2.2.0",
    "@pinecone-database/pinecone": "^0.0.12",
    "@prisma/client": "^4.11.0",
    "@supabase/supabase-js": "^2.10.0",
42 changes: 42 additions & 0 deletions examples/src/indexes/vector_stores/opensearch/docker-compose.yml
@@ -0,0 +1,42 @@
# Reference:
# https://opensearch.org/docs/latest/install-and-configure/install-opensearch/docker/#sample-docker-composeyml
version: '3'
services:
  opensearch:
    image: opensearchproject/opensearch:2.6.0
    container_name: opensearch
    environment:
      - cluster.name=opensearch
      - node.name=opensearch
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
      - "DISABLE_INSTALL_DEMO_CONFIG=true"
      - "DISABLE_SECURITY_PLUGIN=true"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - opensearch_data:/usr/share/opensearch/data
    ports:
      - 9200:9200
      - 9600:9600
    networks:
      - opensearch
  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:latest # Make sure the version of opensearch-dashboards matches the version of opensearch installed on other nodes
    container_name: opensearch-dashboards
    ports:
      - 5601:5601 # Map host port 5601 to container port 5601
    expose:
      - "5601" # Expose port 5601 for web access to OpenSearch Dashboards
    environment:
      OPENSEARCH_HOSTS: '["http://opensearch:9200"]' # Define the OpenSearch nodes that OpenSearch Dashboards will query
      DISABLE_SECURITY_DASHBOARDS_PLUGIN: "true" # disables security dashboards plugin in OpenSearch Dashboards
    networks:
      - opensearch
networks:
  opensearch:
volumes:
  opensearch_data:
22 changes: 22 additions & 0 deletions examples/src/indexes/vector_stores/opensearch/opensearch.ts
@@ -0,0 +1,22 @@
import { Client } from "@opensearch-project/opensearch";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { OpenSearchVectorStore } from "langchain/vectorstores/opensearch";

export async function run() {
  const client = new Client({
    nodes: [process.env.OPENSEARCH_URL ?? "http://127.0.0.1:9200"],
  });

  const vectorStore = await OpenSearchVectorStore.fromTexts(
    ["Hello world", "Bye bye", "What's this?"],
    [{ id: 2 }, { id: 1 }, { id: 3 }],
    new OpenAIEmbeddings(),
    {
      client,
      indexName: "documents",
    }
  );

  const resultOne = await vectorStore.similaritySearch("Hello world", 1);
  console.log(resultOne);
}
1 change: 1 addition & 0 deletions langchain/.env.example
@@ -2,6 +2,7 @@ ANTHROPIC_API_KEY=ADD_YOURS_HERE
COHERE_API_KEY=ADD_YOURS_HERE
HUGGINGFACEHUB_API_KEY=ADD_YOURS_HERE
OPENAI_API_KEY=ADD_YOURS_HERE
OPENSEARCH_URL=http://127.0.0.1:9200
PINECONE_API_KEY=ADD_YOURS_HERE
PINECONE_ENVIRONMENT=ADD_YOURS_HERE
PINECONE_INDEX=ADD_YOURS_HERE
3 changes: 3 additions & 0 deletions langchain/.gitignore
@@ -91,6 +91,9 @@ vectorstores/pinecone.d.ts
vectorstores/supabase.cjs
vectorstores/supabase.js
vectorstores/supabase.d.ts
vectorstores/opensearch.cjs
vectorstores/opensearch.js
vectorstores/opensearch.d.ts
vectorstores/milvus.cjs
vectorstores/milvus.js
vectorstores/milvus.d.ts
13 changes: 13 additions & 0 deletions langchain/package.json
@@ -103,6 +103,9 @@
    "vectorstores/supabase.cjs",
    "vectorstores/supabase.js",
    "vectorstores/supabase.d.ts",
    "vectorstores/opensearch.cjs",
    "vectorstores/opensearch.js",
    "vectorstores/opensearch.d.ts",
    "vectorstores/milvus.cjs",
    "vectorstores/milvus.js",
    "vectorstores/milvus.d.ts",
@@ -264,6 +267,7 @@
    "@getmetal/metal-sdk": "^1.0.12",
    "@huggingface/inference": "^1.5.1",
    "@jest/globals": "^29.5.0",
    "@opensearch-project/opensearch": "^2.2.0",
    "@pinecone-database/pinecone": "^0.0.12",
    "@supabase/supabase-js": "^2.10.0",
    "@tsconfig/recommended": "^1.0.2",
@@ -313,6 +317,7 @@
    "@aws-sdk/client-s3": "^3.310.0",
    "@getmetal/metal-sdk": "*",
    "@huggingface/inference": "^1.5.1",
    "@opensearch-project/opensearch": "*",
    "@pinecone-database/pinecone": "*",
    "@supabase/supabase-js": "^2.10.0",
    "@zilliz/milvus2-sdk-node": "^2.2.0",
@@ -347,6 +352,9 @@
    "@huggingface/inference": {
      "optional": true
    },
    "@opensearch-project/opensearch": {
      "optional": true
    },
    "@pinecone-database/pinecone": {
      "optional": true
    },
@@ -609,6 +617,11 @@
      "import": "./vectorstores/supabase.js",
      "require": "./vectorstores/supabase.cjs"
    },
    "./vectorstores/opensearch": {
      "types": "./vectorstores/opensearch.d.ts",
      "import": "./vectorstores/opensearch.js",
      "require": "./vectorstores/opensearch.cjs"
    },
    "./vectorstores/milvus": {
      "types": "./vectorstores/milvus.d.ts",
      "import": "./vectorstores/milvus.js",
2 changes: 2 additions & 0 deletions langchain/scripts/create-entrypoints.js
@@ -46,6 +46,7 @@ const entrypoints = {
  "vectorstores/mongo": "vectorstores/mongo",
  "vectorstores/pinecone": "vectorstores/pinecone",
  "vectorstores/supabase": "vectorstores/supabase",
  "vectorstores/opensearch": "vectorstores/opensearch",
  "vectorstores/milvus": "vectorstores/milvus",
  "vectorstores/prisma": "vectorstores/prisma",
  // text_splitter
@@ -134,6 +135,7 @@ const requiresOptionalDependency = [
  "vectorstores/mongo",
  "vectorstores/pinecone",
  "vectorstores/supabase",
  "vectorstores/opensearch",
  "vectorstores/milvus",
  "document_loaders/web/cheerio",
  "document_loaders/web/puppeteer",