Add reference doc for Links
cbornet committed Sep 7, 2024
1 parent 262e19b commit ea6a3bd
Showing 3 changed files with 131 additions and 13 deletions.
15 changes: 8 additions & 7 deletions docs/api_reference/scripts/custom_formatter.py
@@ -17,15 +17,16 @@ def process_toc_h3_elements(html_content: str) -> str:

     # Process each element
     for element in toc_h3_elements:
-        element = element.a.code.span
-        # Get the text content of the element
-        content = element.get_text()
+        if element.a.code is not None:
+            element = element.a.code.span
+            # Get the text content of the element
+            content = element.get_text()

-        # Apply the regex substitution
-        modified_content = content.split(".")[-1]
+            # Apply the regex substitution
+            modified_content = content.split(".")[-1]

-        # Update the element's content
-        element.string = modified_content
+            # Update the element's content
+            element.string = modified_content

     # Return the modified HTML
     return str(soup)
@@ -25,6 +25,9 @@
 add_links(document, links)
 graph_vector_store.add_document(document)
+.. seealso::
+    :class:`How to create links between documents <langchain_core.graph_vectorstores.links.Link>`
 ***********
 Get started
 ***********
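
The context lines of this hunk call `add_links(document, links)`, and the `Link` docstring added by this commit shows the result ending up under the `links` key of the document's metadata. A minimal sketch of that behaviour follows, assuming the links are simply appended to a list stored in the metadata. `Doc`, `add_links_sketch`, and the local `Link` dataclass are hypothetical stand-ins so the example runs without LangChain installed; this is not the library's implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Union


@dataclass(frozen=True)
class Link:
    """Local stand-in mirroring the fields of the Link class in this commit."""

    kind: str
    direction: str
    tag: str


@dataclass
class Doc:
    """Hypothetical stand-in for a document: text plus a metadata dict."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def add_links_sketch(doc: Doc, *links: Union[Link, Iterable[Link]]) -> None:
    """Append links to doc.metadata["links"], creating the list if needed."""
    stored: List[Link] = doc.metadata.setdefault("links", [])
    for item in links:
        if isinstance(item, Link):
            # A single link is appended as-is.
            stored.append(item)
        else:
            # Any other iterable is treated as a collection of links.
            stored.extend(item)


doc = Doc("some text", {"source": "state_of_the_union.txt"})
add_links_sketch(doc, Link("genre", "bidir", "oratory"))
print(doc.metadata)
```

Storing the links in metadata keeps them attached to the document itself, so a graph vector store can read them when the document is added.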
126 changes: 120 additions & 6 deletions libs/core/langchain_core/graph_vectorstores/links.py
@@ -8,10 +8,109 @@
 @beta()
 @dataclass(frozen=True)
 class Link:
-    """A link to/from a tag of a given tag.
+    """A link to/from a tag of a given kind.

-    Edges exist from nodes with an outgoing link to nodes with a matching incoming link.
-    """
+    Documents in a :class:`graph vector store <langchain_core.graph_vectorstores.base.GraphVectorStore>`
+    are connected via "links".
+    Links form a bipartite graph between documents and tags: documents are connected
+    to tags, and tags are connected to other documents.
+    When documents are retrieved from a graph vector store, a pair of documents are
+    connected with a depth of one if both documents are connected to the same tag.
+
+    Links have a ``kind`` property, used to namespace different tag identifiers.
+    For example a link to a keyword might use kind ``kw``, while a link to a URL might
+    use kind ``url``.
+    This allows the same tag value to be used in different contexts without causing
+    name collisions.
+
+    Links are directed. The directionality of links controls how the graph is
+    traversed at retrieval time.
+    For example, given documents ``A`` and ``B``, connected by links to tag ``T``:
+
+    +----------+----------+---------------------------------+
+    | A to T   | B to T   | Result                          |
+    +==========+==========+=================================+
+    | outgoing | incoming | Retrieval traverses from A to B |
+    +----------+----------+---------------------------------+
+    | incoming | incoming | No traversal from A to B        |
+    +----------+----------+---------------------------------+
+    | outgoing | outgoing | No traversal from A to B        |
+    +----------+----------+---------------------------------+
+    | bidir    | incoming | Retrieval traverses from A to B |
+    +----------+----------+---------------------------------+
+    | bidir    | outgoing | No traversal from A to B        |
+    +----------+----------+---------------------------------+
+    | outgoing | bidir    | Retrieval traverses from A to B |
+    +----------+----------+---------------------------------+
+    | incoming | bidir    | No traversal from A to B        |
+    +----------+----------+---------------------------------+
+
+    Directed links make it possible to describe relationships such as term
+    references / definitions: term definitions are generally relevant to any documents
+    that use the term, but the full set of documents using a term generally aren't
+    relevant to the term's definition.
+
+    .. seealso::
+
+        :mod:`How to use a graph vector store <langchain_community.graph_vectorstores>`
+
+    How to add links to a Document
+    ==============================
+
+    How to create links
+    -------------------
+
+    You can create links using the Link class's constructors :meth:`incoming`,
+    :meth:`outgoing`, and :meth:`bidir`::
+
+        from langchain_core.graph_vectorstores.links import Link
+
+        print(Link.bidir(kind="location", tag="Paris"))
+
+    .. code-block:: output
+
+        Link(kind='location', direction='bidir', tag='Paris')
+
+    Extending documents with links
+    ------------------------------
+
+    Now that we know how to create links, let's associate them with some documents.
+    These edges will strengthen the connection between documents that share a keyword
+    when using a graph vector store to retrieve documents.
+
+    First, we'll load some text and chunk it into smaller pieces.
+    Then we'll add a link to each document to link them all together::
+
+        from langchain_community.document_loaders import TextLoader
+        from langchain_core.graph_vectorstores.links import add_links
+        from langchain_text_splitters import CharacterTextSplitter
+
+        loader = TextLoader("state_of_the_union.txt")
+
+        raw_documents = loader.load()
+        text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
+        documents = text_splitter.split_documents(raw_documents)
+
+        for doc in documents:
+            add_links(doc, Link.bidir(kind="genre", tag="oratory"))
+
+        print(documents[0].metadata)
+
+    .. code-block:: output
+
+        {'source': 'state_of_the_union.txt', 'links': [Link(kind='genre', direction='bidir', tag='oratory')]}
+
+    As we can see, each document's metadata now includes a bidirectional link to the
+    genre ``oratory``.
+
+    The documents can then be added to a graph vector store::
+
+        from langchain_community.graph_vectorstores import CassandraGraphVectorStore
+
+        graph_vectorstore = CassandraGraphVectorStore.from_documents(
+            documents=documents, embeddings=...
+        )
+
+    """ # noqa: E501

     kind: str
     """The kind of link. Allows different extractors to use the same tag name without
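
The direction table in the `Link` docstring reduces to one rule, which the removed docstring line stated directly: an edge runs from a document with an outgoing link to a document with a matching incoming link, and `bidir` counts as both. The sketch below is one way to express that rule, assuming a match means the same `kind` and the same `tag`. `can_traverse` is a hypothetical helper, and the local `Link` dataclass is a stand-in so the example runs without LangChain installed.

```python
from dataclasses import dataclass
from typing import Iterable, Literal


@dataclass(frozen=True)
class Link:
    """Local stand-in mirroring the fields of the Link class in this commit."""

    kind: str
    direction: Literal["in", "out", "bidir"]
    tag: str


def can_traverse(a_links: Iterable[Link], b_links: Iterable[Link]) -> bool:
    """Return True if retrieval can step from document A to document B.

    Hypothetical helper: A needs an outgoing (or bidirectional) link and B an
    incoming (or bidirectional) link with the same kind and tag.
    """
    # (kind, tag) pairs that A points to.
    a_out = {(l.kind, l.tag) for l in a_links if l.direction in ("out", "bidir")}
    # (kind, tag) pairs that point into B.
    b_in = {(l.kind, l.tag) for l in b_links if l.direction in ("in", "bidir")}
    return not a_out.isdisjoint(b_in)


a = [Link("kw", "out", "T")]
b = [Link("kw", "in", "T")]
print(can_traverse(a, b))  # True: outgoing on A, incoming on B
print(can_traverse(b, a))  # False: the edge only runs from A to B
print(can_traverse(a, [Link("url", "in", "T")]))  # False: same tag, different kind
```

Keying the match on the `(kind, tag)` pair is what lets a keyword and a URL share a tag value without creating an edge between unrelated documents.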
@@ -23,17 +122,32 @@ class Link:

     @staticmethod
     def incoming(kind: str, tag: str) -> "Link":
-        """Create an incoming link."""
+        """Create an incoming link.
+
+        Args:
+            kind: the link kind.
+            tag: the link tag.
+        """
         return Link(kind=kind, direction="in", tag=tag)

     @staticmethod
     def outgoing(kind: str, tag: str) -> "Link":
-        """Create an outgoing link."""
+        """Create an outgoing link.
+
+        Args:
+            kind: the link kind.
+            tag: the link tag.
+        """
         return Link(kind=kind, direction="out", tag=tag)

     @staticmethod
     def bidir(kind: str, tag: str) -> "Link":
-        """Create a bidirectional link."""
+        """Create a bidirectional link.
+
+        Args:
+            kind: the link kind.
+            tag: the link tag.
+        """
         return Link(kind=kind, direction="bidir", tag=tag)
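
The three constructors differ only in the `direction` string they pass, and the class is declared with `@dataclass(frozen=True)`. The sketch below shows what that combination gives in plain Python: links compare by value, are hashable, and reject mutation. It uses a local copy of the class without the `@beta()` decorator, an assumption made so the example runs without LangChain installed.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Link:
    """Local stand-in copying the fields and constructors shown in the diff."""

    kind: str
    direction: str
    tag: str

    @staticmethod
    def incoming(kind: str, tag: str) -> "Link":
        return Link(kind=kind, direction="in", tag=tag)

    @staticmethod
    def outgoing(kind: str, tag: str) -> "Link":
        return Link(kind=kind, direction="out", tag=tag)

    @staticmethod
    def bidir(kind: str, tag: str) -> "Link":
        return Link(kind=kind, direction="bidir", tag=tag)


# Equal field values give equal, hashable links, so a set removes duplicates.
links = {
    Link.bidir("kw", "paris"),
    Link.bidir(kind="kw", tag="paris"),
    Link.incoming("kw", "paris"),
}
print(len(links))  # 2

# frozen=True rejects attribute assignment after construction.
try:
    Link.outgoing("kw", "paris").tag = "london"
except FrozenInstanceError as exc:
    print(type(exc).__name__)  # FrozenInstanceError
```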

