implement vectorstores by tencent vectordb (langchain-ai#9989)
Hi there! I'm excited to open this PR to add support for using 'Tencent Cloud VectorDB' as a vector store.

Tencent Cloud VectorDB is a fully managed, self-developed, enterprise-level distributed database service designed for storing, retrieving, and analyzing multi-dimensional vector data. The database supports multiple index types and similarity calculation methods. A single index supports up to 1 billion vectors and can handle millions of QPS with millisecond-level query latency. Tencent Cloud VectorDB not only provides external knowledge bases for large models to improve their accuracy, but also has wide applications in AI fields such as recommendation systems, NLP services, computer vision, and intelligent customer service.

The PR includes:
- Implementation of Vectorstore

I have read your [contributing guidelines](https://github.com/hwchase17/langchain/blob/72b7d76d79b0e187426787616d96257b64292119/.github/CONTRIBUTING.md), and I have passed the checks below:
- make format
- make lint
- make coverage
- make test
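The description mentions "similarity calculation methods" without naming them. As a rough illustration only, the sketch below computes three measures that vector databases commonly offer (inner product, cosine similarity, Euclidean distance) in plain Python. Which of these Tencent Cloud VectorDB exposes, and under what names, is not stated in this PR description; the function names here are hypothetical.

```python
import math
from typing import Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of the two vectors after normalising their lengths."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return inner_product(a, b) / (norm_a * norm_b)


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
candidates = {"same_direction": [2.0, 0.0], "orthogonal": [0.0, 3.0]}
for name, vec in candidates.items():
    print(
        name,
        inner_product(query, vec),
        cosine_similarity(query, vec),
        l2_distance(query, vec),
    )
```

Note that the measures can rank candidates differently: inner product rewards vector length, cosine similarity ignores it, and Euclidean distance penalises it.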
Showing 5 changed files with 616 additions and 0 deletions.
@@ -0,0 +1,15 @@
# TencentVectorDB

This page covers how to use the TencentVectorDB ecosystem within LangChain.

### VectorStore

There exists a wrapper around TencentVectorDB, allowing you to use it as a vectorstore,
whether for semantic search or example selection.

To import this vectorstore:
```python
from langchain.vectorstores import TencentVectorDB
```

For a more detailed walkthrough of the TencentVectorDB wrapper, see [this notebook](/docs/integrations/vectorstores/tencentvectordb.html)
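
To make "use it as a vectorstore" concrete: a vector store keeps (text, embedding) pairs and answers a query by returning the texts whose embeddings lie closest to the query embedding. The toy class below sketches that contract in memory. It is not the TencentVectorDB wrapper and does not contact any service; `ToyVectorStore`, `toy_embed`, and the letter-count embedding are invented for illustration, and only the method names `add_texts` and `similarity_search` are borrowed from the notebook in this commit.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def toy_embed(text: str) -> Vector:
    """Hypothetical embedding: counts of each letter a-z (not a real model)."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in that mimics the shape of a vector store."""

    def __init__(self, embed: Callable[[str], Vector]) -> None:
        self._embed = embed
        self._rows: List[Tuple[str, Vector]] = []

    def add_texts(self, texts: List[str]) -> None:
        # Embed each text once at insert time and keep it next to the text.
        for text in texts:
            self._rows.append((text, self._embed(text)))

    def similarity_search(self, query: str, k: int = 4) -> List[str]:
        # Brute-force scan: score every stored vector against the query.
        q = self._embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore(toy_embed)
store.add_texts(["aaaa", "bbbb", "aabb"])
print(store.similarity_search("aaa", k=2))
```

A managed service replaces the brute-force scan with an index so that queries stay fast at large scale, but the add-then-search contract is the same.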
122 changes: 122 additions & 0 deletions
docs/extras/integrations/vectorstores/tencentvectordb.ipynb
@@ -0,0 +1,122 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "source": [
    "# Tencent Cloud VectorDB\n",
    "\n",
    ">[Tencent Cloud VectorDB](https://cloud.tencent.com/document/product/1709) is a fully managed, self-developed, enterprise-level distributed database service designed for storing, retrieving, and analyzing multi-dimensional vector data. The database supports multiple index types and similarity calculation methods. A single index can support a vector scale of up to 1 billion and can support millions of QPS and millisecond-level query latency. Tencent Cloud Vector Database can not only provide an external knowledge base for large models to improve the accuracy of large model responses but can also be widely used in AI fields such as recommendation systems, NLP services, computer vision, and intelligent customer service.\n",
    "\n",
    "This notebook shows how to use functionality related to the Tencent vector database.\n",
    "\n",
    "To run, you should have a [Database instance](https://cloud.tencent.com/document/product/1709/95101)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip3 install tcvectordb"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.embeddings.fake import FakeEmbeddings\n",
    "from langchain.text_splitter import CharacterTextSplitter\n",
    "from langchain.vectorstores import TencentVectorDB\n",
    "from langchain.vectorstores.tencentvectordb import ConnectionParams\n",
    "from langchain.document_loaders import TextLoader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
    "documents = loader.load()\n",
    "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
    "docs = text_splitter.split_documents(documents)\n",
    "embeddings = FakeEmbeddings(size=128)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "conn_params = ConnectionParams(url=\"http://10.0.X.X\",\n",
    "                               key=\"eC4bLRy2va******************************\",\n",
    "                               username=\"root\",\n",
    "                               timeout=20)\n",
    "\n",
    "vector_db = TencentVectorDB.from_documents(\n",
    "    docs,\n",
    "    embeddings,\n",
    "    connection_params=conn_params,\n",
    "    # drop_old=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
    "docs = vector_db.similarity_search(query)\n",
    "docs[0].page_content"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "vector_db = TencentVectorDB(embeddings, conn_params)\n",
    "\n",
    "vector_db.add_texts([\"Ankush went to Princeton\"])\n",
    "query = \"Where did Ankush go to college?\"\n",
    "docs = vector_db.max_marginal_relevance_search(query)\n",
    "docs[0].page_content"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
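
The last notebook cell calls `max_marginal_relevance_search`. For readers unfamiliar with the idea, maximal marginal relevance (MMR) selects results greedily, trading off similarity to the query against similarity to results already chosen. The sketch below is a generic plain-Python version of that selection rule, not the code this PR adds; `mmr_select`, `lambda_mult`, and the sample vectors are illustrative assumptions.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return indices of up to k documents chosen by greedy MMR.

    lambda_mult=1.0 ranks purely by relevance; lower values favour diversity.
    """
    selected: List[int] = []
    remaining = list(range(len(doc_vecs)))
    while remaining and len(selected) < k:
        best_idx = remaining[0]
        best_score = -math.inf
        for i in remaining:
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected document (0.0 if none yet).
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            score = lambda_mult * relevance - (1.0 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


query_vec = [1.0, 0.0]
doc_vecs = [
    [1.0, 0.1],   # most relevant
    [1.0, 0.2],   # nearly as relevant, but close to the first document
    [1.0, -1.0],  # less relevant, but points in a different direction
]
print(mmr_select(query_vec, doc_vecs, k=2, lambda_mult=0.5))
print(mmr_select(query_vec, doc_vecs, k=2, lambda_mult=1.0))
```

With `lambda_mult=0.5` the second pick skips the near-duplicate in favour of the more distinct vector, while `lambda_mult=1.0` reduces to a plain relevance ranking.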