Implement ClickHouse Support #3342

Merged
1 change: 1 addition & 0 deletions docs/api_refs/typedoc.json
@@ -91,6 +91,7 @@
"./langchain/src/llms/fake.ts",
"./langchain/src/prompts/index.ts",
"./langchain/src/prompts/load.ts",
"./langchain/src/vectorstores/clickhouse.ts",
"./langchain/src/vectorstores/analyticdb.ts",
"./langchain/src/vectorstores/base.ts",
"./langchain/src/vectorstores/cassandra.ts",
35 changes: 35 additions & 0 deletions docs/core_docs/docs/integrations/vectorstores/clickhouse.mdx
@@ -0,0 +1,35 @@
---
sidebar_class_name: node-only
---

import CodeBlock from "@theme/CodeBlock";

# ClickHouse

:::tip Compatibility
Only available on Node.js.
:::

[ClickHouse](https://clickhouse.com/) is a robust, open-source columnar database built for analytical queries and efficient storage. It is designed to provide a powerful combination of vector search and analytics.

## Setup

1. Launch a ClickHouse cluster. Refer to the [ClickHouse Installation Guide](https://clickhouse.com/docs/en/getting-started/install/) for details.
2. After launching a ClickHouse cluster, retrieve the `Connection Details` from the cluster's `Actions` menu. You will need the host, port, username, and password.
3. Install the required Node.js peer dependency for ClickHouse in your workspace.

```bash npm2yarn
npm install -S @clickhouse/client
```
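
The connection details from step 2 can be collected into a single configuration object before a store is created. The following is a minimal sketch, not part of LangChain or `@clickhouse/client`: the `ClickHouseConnection` type and the `clickHouseConfigFromEnv` helper are hypothetical names, the variable names follow the `.env.example` entries and the example files in this PR, and the defaults are the fallback values used in those examples. Because environment variables are always strings, the sketch parses the port explicitly.

```typescript
// Hypothetical helper for illustration; these names are not a LangChain API.
interface ClickHouseConnection {
  host: string;
  port: number;
  username: string;
  password: string;
  database: string;
  table: string;
}

type Env = Record<string, string | undefined>;

function clickHouseConfigFromEnv(env: Env): ClickHouseConnection {
  // Environment variables are strings, so convert the port to a number.
  const rawPort = env["CLICKHOUSE_PORT"];
  const port = rawPort === undefined || rawPort === "" ? 8443 : Number(rawPort);
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid CLICKHOUSE_PORT: ${rawPort}`);
  }

  // Fall back to the placeholder defaults used in the example files.
  return {
    host: env["CLICKHOUSE_HOST"] || "localhost",
    port,
    username: env["CLICKHOUSE_USERNAME"] || "username",
    password: env["CLICKHOUSE_PASSWORD"] || "password",
    database: env["CLICKHOUSE_DATABASE"] || "default",
    table: env["CLICKHOUSE_TABLE"] || "vector_table",
  };
}
```

In Node.js, calling `clickHouseConfigFromEnv(process.env)` would produce an object with the same field names that the examples pass as the last argument to the store. Failing on a malformed port at startup is a design choice; it tends to be easier to diagnose than a connection error later on.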

## Index and Query Docs

import InsertExample from "@examples/indexes/vector_stores/clickhouse_fromTexts.ts";

<CodeBlock language="typescript">{InsertExample}</CodeBlock>

## Query Docs From an Existing Collection

import SearchExample from "@examples/indexes/vector_stores/clickhouse_search.ts";

<CodeBlock language="typescript">{SearchExample}</CodeBlock>
@@ -136,3 +136,4 @@ Here's a quick guide to help you pick the right vector store for your use case:
- If you are looking for an online MPP (Massively Parallel Processing) data warehousing service, you might want to consider the [AnalyticDB](/docs/integrations/vectorstores/analyticdb) vector store.
- If you're in search of a cost-effective vector database that lets you run vector search with SQL, look no further than [MyScale](/docs/integrations/vectorstores/myscale).
- If you're in search of a vector database that you can load from both the browser and server side, check out [CloseVector](/docs/integrations/vectorstores/closevector). It's a vector database that aims to be cross-platform.
- If you're looking for a scalable, open-source columnar database with excellent performance for analytical queries, then consider [ClickHouse](/docs/integrations/vectorstores/clickhouse).
4 changes: 4 additions & 0 deletions examples/.env.example
@@ -38,6 +38,10 @@ MYSCALE_HOST=ADD_YOURS_HERE
MYSCALE_PORT=ADD_YOURS_HERE
MYSCALE_USERNAME=ADD_YOURS_HERE
MYSCALE_PASSWORD=ADD_YOURS_HERE
CLICKHOUSE_HOST=ADD_YOURS_HERE
CLICKHOUSE_PORT=ADD_YOURS_HERE
CLICKHOUSE_USERNAME=ADD_YOURS_HERE
CLICKHOUSE_PASSWORD=ADD_YOURS_HERE
REDIS_URL=ADD_YOURS_HERE
SINGLESTORE_HOST=ADD_YOURS_HERE
SINGLESTORE_PORT=ADD_YOURS_HERE
2 changes: 1 addition & 1 deletion examples/package.json
@@ -22,7 +22,7 @@
"author": "LangChain",
"license": "MIT",
"dependencies": {
"@clickhouse/client": "^0.0.14",
"@clickhouse/client": "^0.2.5",
"@elastic/elasticsearch": "^8.4.0",
"@getmetal/metal-sdk": "^4.0.0",
"@getzep/zep-js": "^0.9.0",
34 changes: 34 additions & 0 deletions examples/src/indexes/vector_stores/clickhouse_fromTexts.ts
@@ -0,0 +1,34 @@
import { ClickHouseStore } from "langchain/vectorstores/clickhouse";
This comment is flagging the addition of code that explicitly accesses environment variables via process.env in the diff. Maintainers should review this change to ensure proper handling of environment variables.

import { OpenAIEmbeddings } from "langchain/embeddings/openai";

// Initialize ClickHouse store from texts
const vectorStore = await ClickHouseStore.fromTexts(
  ["Hello world", "Bye bye", "hello nice world"],
  [
    { id: 2, name: "2" },
    { id: 1, name: "1" },
    { id: 3, name: "3" },
  ],
  new OpenAIEmbeddings(),
  {
    host: process.env.CLICKHOUSE_HOST || "localhost",
    port: process.env.CLICKHOUSE_PORT || 8443,
    // .env.example defines CLICKHOUSE_USERNAME, so read that name here.
    username: process.env.CLICKHOUSE_USERNAME || "username",
    password: process.env.CLICKHOUSE_PASSWORD || "password",
    database: process.env.CLICKHOUSE_DATABASE || "default",
    table: process.env.CLICKHOUSE_TABLE || "vector_table",
  }
);

// Sleep 1 second to ensure that the search occurs after the successful insertion of data.
await new Promise((resolve) => setTimeout(resolve, 1000));

// Perform similarity search without filtering
const results = await vectorStore.similaritySearch("hello world", 1);
console.log(results);

// Perform similarity search with filtering
const filteredResults = await vectorStore.similaritySearch("hello world", 1, {
  whereStr: "metadata.name = '1'",
});
console.log(filteredResults);
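
The filtered search above passes a hand-written `whereStr`. When filter values come from user input, it may be safer to build that string programmatically. The sketch below is only an illustration under stated assumptions: `buildWhereStr` and `escapeStringLiteral` are hypothetical helpers, not part of LangChain or `@clickhouse/client`, and the escaping assumes ClickHouse's backslash-escaped single-quoted string literals. It is not a substitute for parameterized queries where those are available.

```typescript
// Hypothetical helpers for illustration; not a LangChain or ClickHouse client API.
type FilterValue = string | number;

function escapeStringLiteral(value: string): string {
  // Escape backslashes first, then single quotes (assumed ClickHouse literal rules).
  return value.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
}

function buildWhereStr(filters: Record<string, FilterValue>): string {
  const clauses = Object.entries(filters).map(([key, value]) => {
    // Accept only simple identifiers as metadata keys.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(key)) {
      throw new Error(`Unsupported metadata key: ${key}`);
    }
    if (typeof value === "number") {
      if (!Number.isFinite(value)) {
        throw new Error(`Unsupported numeric value for ${key}`);
      }
      return `metadata.${key} = ${value}`;
    }
    return `metadata.${key} = '${escapeStringLiteral(value)}'`;
  });
  // Combine all conditions; every clause must match.
  return clauses.join(" AND ");
}
```

With this sketch, `buildWhereStr({ name: "1" })` yields the same `metadata.name = '1'` string used in the example, and the result could be passed as `whereStr`.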
28 changes: 28 additions & 0 deletions examples/src/indexes/vector_stores/clickhouse_search.ts
@@ -0,0 +1,28 @@
import { ClickHouseStore } from "langchain/vectorstores/clickhouse";
This comment is flagging the addition of code that accesses environment variables via process.env in order to set configuration options for the ClickHouse store. Maintainers should review this change to ensure the handling of environment variables is appropriate.

import { OpenAIEmbeddings } from "langchain/embeddings/openai";

// Initialize ClickHouse store
const vectorStore = await ClickHouseStore.fromExistingIndex(
  new OpenAIEmbeddings(),
  {
    host: process.env.CLICKHOUSE_HOST || "localhost",
    port: process.env.CLICKHOUSE_PORT || 8443,
    // .env.example defines CLICKHOUSE_USERNAME, so read that name here.
    username: process.env.CLICKHOUSE_USERNAME || "username",
    password: process.env.CLICKHOUSE_PASSWORD || "password",
    database: process.env.CLICKHOUSE_DATABASE || "default",
    table: process.env.CLICKHOUSE_TABLE || "vector_table",
  }
);

// Sleep 1 second to ensure that the search occurs after the successful insertion of data.
await new Promise((resolve) => setTimeout(resolve, 1000));

// Perform similarity search without filtering
const results = await vectorStore.similaritySearch("hello world", 1);
console.log(results);

// Perform similarity search with filtering
const filteredResults = await vectorStore.similaritySearch("hello world", 1, {
  whereStr: "metadata.name = '1'",
});
console.log(filteredResults);
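
The second argument to `similaritySearch` in these examples is the number of nearest documents to return. As a rough mental model only, a top-k search can be sketched in memory as below. This is not how `ClickHouseStore` is implemented: in the real store the ranking happens inside ClickHouse, and the distance metric depends on the index configuration. Cosine distance is an assumption made for the sketch, and `StoredDoc`, `ScoredDoc`, `cosineDistance`, and `topK` are hypothetical names.

```typescript
// In-memory illustration of top-k nearest-neighbour search (assumed cosine distance).
interface StoredDoc {
  text: string;
  embedding: number[];
}

interface ScoredDoc {
  doc: StoredDoc;
  distance: number;
}

function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine distance is undefined for a zero vector");
  }
  // 0 means same direction, 1 means orthogonal, 2 means opposite.
  return 1 - dot / Math.sqrt(normA * normB);
}

function topK(query: number[], docs: StoredDoc[], k: number): ScoredDoc[] {
  // Score every document, sort by ascending distance, keep the k closest.
  return docs
    .map((doc) => ({ doc, distance: cosineDistance(query, doc.embedding) }))
    .sort((left, right) => left.distance - right.distance)
    .slice(0, k);
}
```

Calling `topK(queryEmbedding, docs, 1)` corresponds conceptually to the `k = 1` searches above: only the single closest document is returned.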
4 changes: 4 additions & 0 deletions langchain/.env.example
@@ -38,6 +38,10 @@ MYSCALE_HOST=ADD_YOURS_HERE
MYSCALE_PORT=ADD_YOURS_HERE
MYSCALE_USERNAME=ADD_YOURS_HERE
MYSCALE_PASSWORD=ADD_YOURS_HERE
CLICKHOUSE_HOST=ADD_YOURS_HERE
CLICKHOUSE_PORT=ADD_YOURS_HERE
CLICKHOUSE_USERNAME=ADD_YOURS_HERE
CLICKHOUSE_PASSWORD=ADD_YOURS_HERE
FIGMA_ACCESS_TOKEN=ADD_YOURS_HERE
REDIS_URL=ADD_YOURS_HERE
ROCKSET_API_KEY=ADD_YOURS_HERE
3 changes: 3 additions & 0 deletions langchain/.gitignore
@@ -217,6 +217,9 @@ prompts.d.ts
prompts/load.cjs
prompts/load.js
prompts/load.d.ts
vectorstores/clickhouse.cjs
vectorstores/clickhouse.js
vectorstores/clickhouse.d.ts
vectorstores/analyticdb.cjs
vectorstores/analyticdb.js
vectorstores/analyticdb.d.ts
14 changes: 9 additions & 5 deletions langchain/package.json
@@ -229,6 +229,9 @@
"prompts/load.cjs",
"prompts/load.js",
"prompts/load.d.ts",
"vectorstores/clickhouse.cjs",
"vectorstores/clickhouse.js",
"vectorstores/clickhouse.d.ts",
"vectorstores/analyticdb.cjs",
"vectorstores/analyticdb.js",
"vectorstores/analyticdb.d.ts",
@@ -843,7 +846,6 @@
"@aws-sdk/credential-provider-node": "^3.388.0",
"@aws-sdk/types": "^3.357.0",
"@azure/storage-blob": "^12.15.0",
"@clickhouse/client": "^0.0.14",
"@cloudflare/ai": "^1.0.12",
"@cloudflare/workers-types": "^4.20230922.0",
"@elastic/elasticsearch": "^8.4.0",
@@ -982,7 +984,6 @@
"@aws-sdk/client-sfn": "^3.310.0",
"@aws-sdk/credential-provider-node": "^3.388.0",
"@azure/storage-blob": "^12.15.0",
"@clickhouse/client": "^0.0.14",
"@cloudflare/ai": "^1.0.12",
"@elastic/elasticsearch": "^8.4.0",
"@getmetal/metal-sdk": "*",
@@ -1102,9 +1103,6 @@
"@azure/storage-blob": {
"optional": true
},
"@clickhouse/client": {
"optional": true
},
"@cloudflare/ai": {
"optional": true
},
@@ -1369,6 +1367,7 @@
},
"dependencies": {
"@anthropic-ai/sdk": "^0.9.1",
"@clickhouse/client": "^0.2.5",
"ansi-styles": "^5.0.0",
"binary-extensions": "^2.2.0",
"camelcase": "6",
@@ -1773,6 +1772,11 @@
"import": "./prompts/load.js",
"require": "./prompts/load.cjs"
},
"./vectorstores/clickhouse": {
"types": "./vectorstores/clickhouse.d.ts",
"import": "./vectorstores/clickhouse.js",
"require": "./vectorstores/clickhouse.cjs"
},
"./vectorstores/analyticdb": {
"types": "./vectorstores/analyticdb.d.ts",
"import": "./vectorstores/analyticdb.js",
2 changes: 2 additions & 0 deletions langchain/scripts/create-entrypoints.js
@@ -90,6 +90,7 @@ const entrypoints = {
prompts: "prompts/index",
"prompts/load": "prompts/load",
// vectorstores
"vectorstores/clickhouse": "vectorstores/clickhouse",
"vectorstores/analyticdb": "vectorstores/analyticdb",
"vectorstores/base": "vectorstores/base",
"vectorstores/cassandra": "vectorstores/cassandra",
@@ -371,6 +372,7 @@ const requiresOptionalDependency = [
"prompts/load",
"vectorstores/analyticdb",
"vectorstores/cassandra",
"vectorstores/clickhouse",
"vectorstores/chroma",
"vectorstores/cloudflare_vectorize",
"vectorstores/closevector/web",
1 change: 1 addition & 0 deletions langchain/src/load/import_constants.ts
@@ -39,6 +39,7 @@ export const optionalImportEntrypoints = [
"langchain/llms/writer",
"langchain/llms/portkey",
"langchain/prompts/load",
"langchain/vectorstores/clickhouse",
"langchain/vectorstores/analyticdb",
"langchain/vectorstores/cassandra",
"langchain/vectorstores/convex",
3 changes: 3 additions & 0 deletions langchain/src/load/import_type.d.ts
@@ -115,6 +115,9 @@ export interface OptionalImportMap {
"langchain/prompts/load"?:
| typeof import("../prompts/load.js")
| Promise<typeof import("../prompts/load.js")>;
"langchain/vectorstores/clickhouse"?:
| typeof import("../vectorstores/clickhouse.js")
| Promise<typeof import("../vectorstores/clickhouse.js")>;
"langchain/vectorstores/analyticdb"?:
| typeof import("../vectorstores/analyticdb.js")
| Promise<typeof import("../vectorstores/analyticdb.js")>;