-
Notifications
You must be signed in to change notification settings - Fork 15.8k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add HugeGraphQAChain to support gremlin generating chain (#7132)
[Apache HugeGraph](https://github.com/apache/incubator-hugegraph) is a convenient, efficient, and adaptable graph database, compatible with the Apache TinkerPop3 framework and the Gremlin query language. In this PR, the HugeGraph and HugeGraphQAChain provide the same functionality as the existing integration with Neo4j and enables query generation and question answering over HugeGraph database. The difference is that the graph query language supported by HugeGraph is not cypher but another very popular graph query language [Gremlin](https://tinkerpop.apache.org/gremlin.html). A notebook example and a simple test case have also been added. --------- Co-authored-by: Bagatur <[email protected]>
- Loading branch information
Showing
9 changed files
with
531 additions
and
1 deletion.
There are no files selected for viewing
302 changes: 302 additions & 0 deletions
302
docs/extras/modules/chains/additional/graph_hugegraph_qa.ipynb
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,302 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "d2777010", | ||
"metadata": {}, | ||
"source": [ | ||
"# HugeGraph QA Chain\n", | ||
"\n", | ||
"This notebook shows how to use LLMs to provide a natural language interface to [HugeGraph](https://hugegraph.apache.org/cn/) database." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "f26dcbe4", | ||
"metadata": {}, | ||
"source": [ | ||
"You will need to have a running HugeGraph instance.\n", | ||
"You can run a local docker container by running the executing the following script:\n", | ||
"\n", | ||
"```\n", | ||
"docker run \\\n", | ||
" --name=graph \\\n", | ||
" -itd \\\n", | ||
" -p 8080:8080 \\\n", | ||
" hugegraph/hugegraph\n", | ||
"```\n", | ||
"\n", | ||
"If we want to connect HugeGraph in the application, we need to install python sdk:\n", | ||
"\n", | ||
"```\n", | ||
"pip3 install hugegraph-python\n", | ||
"```" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "d64a29f1", | ||
"metadata": {}, | ||
"source": [ | ||
"If you are using the docker container, you need to wait a couple of second for the database to start, and then we need create schema and write graph data for the database." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 13, | ||
"id": "e53ab93e", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from hugegraph.connection import PyHugeGraph\n", | ||
"\n", | ||
"client = PyHugeGraph(\"localhost\", \"8080\", user=\"admin\", pwd=\"admin\", graph=\"hugegraph\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "b7c3a50e", | ||
"metadata": {}, | ||
"source": [ | ||
"First, we create the schema for a simple movie database:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 4, | ||
"id": "ef5372a8", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"'create EdgeLabel success, Detail: \"b\\'{\"id\":1,\"name\":\"ActedIn\",\"source_label\":\"Person\",\"target_label\":\"Movie\",\"frequency\":\"SINGLE\",\"sort_keys\":[],\"nullable_keys\":[],\"index_labels\":[],\"properties\":[],\"status\":\"CREATED\",\"ttl\":0,\"enable_label_index\":true,\"user_data\":{\"~create_time\":\"2023-07-04 10:48:47.908\"}}\\'\"'" | ||
] | ||
}, | ||
"execution_count": 4, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"\"\"\"schema\"\"\"\n", | ||
"schema = client.schema()\n", | ||
"schema.propertyKey(\"name\").asText().ifNotExist().create()\n", | ||
"schema.propertyKey(\"birthDate\").asText().ifNotExist().create()\n", | ||
"schema.vertexLabel(\"Person\").properties(\"name\", \"birthDate\").usePrimaryKeyId().primaryKeys(\"name\").ifNotExist().create()\n", | ||
"schema.vertexLabel(\"Movie\").properties(\"name\").usePrimaryKeyId().primaryKeys(\"name\").ifNotExist().create()\n", | ||
"schema.edgeLabel(\"ActedIn\").sourceLabel(\"Person\").targetLabel(\"Movie\").ifNotExist().create()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "016f7989", | ||
"metadata": {}, | ||
"source": [ | ||
"Then we can insert some data." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 26, | ||
"id": "b7f4c370", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"1:Robert De Niro--ActedIn-->2:The Godfather Part II" | ||
] | ||
}, | ||
"execution_count": 26, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"\"\"\"graph\"\"\"\n", | ||
"g = client.graph()\n", | ||
"g.addVertex(\"Person\", {\"name\": \"Al Pacino\", \"birthDate\": \"1940-04-25\"})\n", | ||
"g.addVertex(\"Person\", {\"name\": \"Robert De Niro\", \"birthDate\": \"1943-08-17\"})\n", | ||
"g.addVertex(\"Movie\", {\"name\": \"The Godfather\"})\n", | ||
"g.addVertex(\"Movie\", {\"name\": \"The Godfather Part II\"})\n", | ||
"g.addVertex(\"Movie\", {\"name\": \"The Godfather Coda The Death of Michael Corleone\"})\n", | ||
"\n", | ||
"g.addEdge(\"ActedIn\", \"1:Al Pacino\", \"2:The Godfather\", {})\n", | ||
"g.addEdge(\"ActedIn\", \"1:Al Pacino\", \"2:The Godfather Part II\", {})\n", | ||
"g.addEdge(\"ActedIn\", \"1:Al Pacino\", \"2:The Godfather Coda The Death of Michael Corleone\", {})\n", | ||
"g.addEdge(\"ActedIn\", \"1:Robert De Niro\", \"2:The Godfather Part II\", {})" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "5b8f7788", | ||
"metadata": {}, | ||
"source": [ | ||
"## Creating `HugeGraphQAChain`\n", | ||
"\n", | ||
"We can now create the `HugeGraph` and `HugeGraphQAChain`. To create the `HugeGraph` we simply need to pass the database object to the `HugeGraph` constructor." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 27, | ||
"id": "f1f68fcf", | ||
"metadata": { | ||
"is_executing": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain.chat_models import ChatOpenAI\n", | ||
"from langchain.chains import HugeGraphQAChain\n", | ||
"from langchain.graphs import HugeGraph" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 28, | ||
"id": "b86ebfa7", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"graph = HugeGraph(\n", | ||
" username=\"admin\",\n", | ||
" password=\"admin\",\n", | ||
" address=\"localhost\",\n", | ||
" port=8080,\n", | ||
" graph=\"hugegraph\"\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "e262540b", | ||
"metadata": {}, | ||
"source": [ | ||
"## Refresh graph schema information\n", | ||
"\n", | ||
"If the schema of database changes, you can refresh the schema information needed to generate Gremlin statements." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 29, | ||
"id": "134dd8d6", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# graph.refresh_schema()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 30, | ||
"id": "e78b8e72", | ||
"metadata": { | ||
"ExecuteTime": {} | ||
}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"Node properties: [name: Person, primary_keys: ['name'], properties: ['name', 'birthDate'], name: Movie, primary_keys: ['name'], properties: ['name']]\n", | ||
"Edge properties: [name: ActedIn, properties: []]\n", | ||
"Relationships: ['Person--ActedIn-->Movie']\n", | ||
"\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"print(graph.get_schema)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "5c27e813", | ||
"metadata": {}, | ||
"source": [ | ||
"## Querying the graph\n", | ||
"\n", | ||
"We can now use the graph Gremlin QA chain to ask question of the graph" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 31, | ||
"id": "3b23dead", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"chain = HugeGraphQAChain.from_llm(\n", | ||
" ChatOpenAI(temperature=0), graph=graph, verbose=True\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 32, | ||
"id": "76aecc93", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"\n", | ||
"\n", | ||
"\u001b[1m> Entering new chain...\u001b[0m\n", | ||
"Generated gremlin:\n", | ||
"\u001b[32;1m\u001b[1;3mg.V().has('Movie', 'name', 'The Godfather').in('ActedIn').valueMap(true)\u001b[0m\n", | ||
"Full Context:\n", | ||
"\u001b[32;1m\u001b[1;3m[{'id': '1:Al Pacino', 'label': 'Person', 'name': ['Al Pacino'], 'birthDate': ['1940-04-25']}]\u001b[0m\n", | ||
"\n", | ||
"\u001b[1m> Finished chain.\u001b[0m\n" | ||
] | ||
}, | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"'Al Pacino played in The Godfather.'" | ||
] | ||
}, | ||
"execution_count": 32, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"chain.run(\"Who played in The Godfather?\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "869f0258", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "venv", | ||
"language": "python", | ||
"name": "venv" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.3" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.