Restructure RAG doc to separate concepts from how-to #65

Merged on May 8, 2024 (2 commits)
181 changes: 116 additions & 65 deletions docs/rag.md
@@ -3,6 +3,8 @@
Firebase Genkit provides abstractions that help you build retrieval-augmented generation
(RAG) flows, as well as plugins that provide integrations with related tools.

## What is RAG?

Retrieval-augmented generation is a technique used to incorporate external
sources of information into an LLM’s responses. It's important to be able to do
so because, while LLMs are typically trained on a broad body of
@@ -37,14 +39,15 @@ the best quality RAG. The core Genkit framework offers two main abstractions to
help you do RAG:

- Indexers: add documents to an "index".
- Embedders: transform documents into a vector representation.
- Retrievers: retrieve documents from an "index", given a query.

These definitions are broad on purpose because Genkit is un-opinionated about
what an "index" is or how exactly documents are retrieved from it. Genkit only
provides a `Document` format and everything else is defined by the retriever or
indexer implementation provider.

## Indexers
### Indexers

The index is responsible for keeping track of your documents in such a way that
you can quickly retrieve relevant documents given a specific query. This is most
@@ -78,21 +81,64 @@ with a stable source of data. On the other hand, if you are working with data
that frequently changes, you might continuously run the ingestion flow (for
example, in a Cloud Firestore trigger, whenever a document is updated).

The following example shows how you could ingest a collection of PDF documents
into a vector database. It uses the local file-based vector similarity retriever
### Embedders

An embedder is a function that takes content (text, images, audio, etc.) and creates a numeric vector that encodes the semantic meaning of the original content. As mentioned above, embedders are used as part of the indexing process; however, they can also be used independently to create embeddings without an index.
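
As a rough illustration of the idea (this is not Genkit's API), the sketch below uses a hypothetical `toyEmbed` function that hashes words into a small fixed-size vector, plus a `cosineSimilarity` helper to compare two vectors. Real embedders such as Gecko rely on learned models; the function names and the 8-dimensional vector size are assumptions made for this example.

```typescript
// Toy embedder for illustration only: NOT a Genkit API and not a learned model.
const DIMENSIONS = 8; // assumed vector size for this sketch

// Maps each word to one of DIMENSIONS buckets and counts occurrences.
function toyEmbed(text: string): number[] {
  const vector: number[] = [];
  for (let i = 0; i < DIMENSIONS; i++) vector.push(0);

  const words = text
    .toLowerCase()
    .split(/\W+/)
    .filter((w) => w.length > 0);
  for (const word of words) {
    let bucket = 0;
    for (let i = 0; i < word.length; i++) {
      bucket = (bucket * 31 + word.charCodeAt(i)) % DIMENSIONS;
    }
    vector[bucket] += 1;
  }
  return vector;
}

// Cosine similarity: 1 means the vectors point in the same direction,
// 0 means they are orthogonal (or one of them is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}
```

Because this toy version only counts words, two texts with the same words in a different order get identical vectors. A learned embedder would instead place texts with similar meaning close together even when they share no words.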

### Retrievers

A retriever is a concept that encapsulates logic related to any kind of document
retrieval. The most popular retrieval cases typically include retrieval from
vector stores; however, in Genkit a retriever can be any function that returns data.

To create a retriever, you can use one of the provided implementations or
create your own.
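
To make the "any function that returns data" point concrete, here is a minimal sketch of a retriever written as a plain function over an in-memory array. The `SimpleDocument` type, the `makeKeywordRetriever` factory, and the `k` parameter are hypothetical names used for illustration; they are not Genkit's retriever API, and a real retriever would typically be asynchronous.

```typescript
// Hypothetical document shape for this sketch (not Genkit's Document class).
interface SimpleDocument {
  text: string;
}

// In this sketch a retriever is simply: query in, up to k documents out.
type SimpleRetriever = (query: string, k: number) => SimpleDocument[];

// Builds a retriever that ranks documents by how many query terms they contain.
function makeKeywordRetriever(docs: SimpleDocument[]): SimpleRetriever {
  return (query, k) => {
    const terms = query
      .toLowerCase()
      .split(/\W+/)
      .filter((t) => t.length > 0);
    return docs
      .map((doc) => {
        const text = doc.text.toLowerCase();
        const score = terms.filter((t) => text.indexOf(t) !== -1).length;
        return { doc, score };
      })
      .filter((entry) => entry.score > 0) // drop documents with no matching term
      .sort((a, b) => b.score - a.score) // best match first
      .slice(0, k)
      .map((entry) => entry.doc);
  };
}

const retrieveFacts = makeKeywordRetriever([
  { text: 'Bob likes cats' },
  { text: 'Alice likes dogs' },
  { text: 'Bob and Alice like tea' },
]);
const topFacts = retrieveFacts('bob cats', 2);
```

No vector store is involved here. The same shape could wrap a SQL query, a search API, or a file lookup.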

## Supported indexers, retrievers, and embedders

Genkit provides indexer and retriever support through its plugin system. The
following plugins are officially supported:

- [Cloud Firestore vector store](plugins/firebase.md)
- [Chroma DB](plugins/chroma.md) vector database
- [Pinecone](plugins/pinecone.md) cloud vector database

In addition, Genkit supports the following vector stores through predefined
code templates, which you can customize for your database configuration and
schema:

- PostgreSQL with [`pgvector`](templates/pgvector.md)

Embedding model support is provided through the following plugins:

| Plugin | Models |
| ------------------------- | -------------------- |
| [Google Generative AI][1] | Gecko text embedding |
| [Google Vertex AI][2] | Gecko text embedding |

[1]: plugins/google-genai.md
[2]: plugins/vertex-ai.md

## Defining a RAG Flow

The following examples show how you could ingest a collection of PDF documents
into a vector database and retrieve them for use in a flow.

They use the local file-based vector similarity retriever
that Genkit provides out of the box for simple testing and prototyping (_do not
use in production_):
use in production_).

```ts
import { Document, index } from '@genkit-ai/ai/retriever';
import { defineFlow, run } from '@genkit-ai/flow';
import fs from 'fs';
import { chunk } from 'llm-chunk'; // npm install llm-chunk
import path from 'path';
import pdf from 'pdf-parse'; // npm i pdf-parse && npm i -D --save @types/pdf-parse
import z from 'zod';
### Install dependencies for processing PDFs

```posix-terminal
npm install llm-chunk pdf-parse

import { configureGenkit } from '@genkit-ai/core';
npm i -D --save @types/pdf-parse
```

### Add a local vector store to your configuration

```ts
import {
devLocalIndexerRef,
devLocalVectorstore,
@@ -103,6 +149,8 @@ configureGenkit({
plugins: [
// vertexAI provides the textEmbeddingGecko embedder
vertexAI(),

// the local vector store requires an embedder to translate from text to vector
devLocalVectorstore([
{
indexName: 'bob-facts',
@@ -111,16 +159,55 @@ configureGenkit({
]),
],
});
```

### Define an Indexer

The following example shows how to create an indexer to ingest a collection of PDF documents
and store them in a local vector database.

It uses the local file-based vector similarity retriever
that Genkit provides out of the box for simple testing and prototyping (_do not
use in production_).

#### Create the indexer

```ts
import { devLocalIndexerRef } from '@genkit-ai/dev-local-vectorstore';

export const pdfIndexer = devLocalIndexerRef('bob-facts');
```

#### Create chunking config

This example uses the `llm-chunk` library, which provides a simple text splitter to break up documents into segments that can be vectorized.

The following definition configures the chunking function to guarantee document segments of between 1000 and 2000 characters, broken at the end of a sentence, with an overlap of 100 characters between chunks.

```ts
const chunkingConfig = {
minLength: 1000, // minimum number of characters per chunk
maxLength: 2000, // maximum number of characters per chunk
splitter: 'sentence', // paragraph | sentence
overlap: 100, // number of overlapping characters between chunks
delimiters: '', // regex for base split method
minLength: 1000,
maxLength: 2000,
splitter: 'sentence',
overlap: 100,
delimiters: '',
} as any;
```

More chunking options for this library can be found in the [llm-chunk documentation](https://www.npmjs.com/package/llm-chunk).
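
To show what an overlap between chunks means in practice, the sketch below implements a deliberately simple fixed-window splitter. It is not `llm-chunk`'s algorithm: it ignores sentence boundaries and minimum lengths. The `chunkWithOverlap` name and its parameters are assumptions made for this example.

```typescript
// Hypothetical fixed-window chunker for illustration only.
// Unlike llm-chunk, it does not split on sentences or enforce a minimum length.
function chunkWithOverlap(
  text: string,
  maxLength: number,
  overlap: number
): string[] {
  if (maxLength <= 0 || overlap < 0 || overlap >= maxLength) {
    throw new Error('expected 0 <= overlap < maxLength');
  }
  const chunks: string[] = [];
  const step = maxLength - overlap; // how far each window advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxLength));
    // Stop once the current window has reached the end of the text.
    if (start + maxLength >= text.length) break;
  }
  return chunks;
}

// Each chunk repeats the last character of the previous one (overlap of 1).
const demoChunks = chunkWithOverlap('abcdefghij', 4, 1);
// → ['abcd', 'defg', 'ghij']
```

The overlap means text near a chunk boundary appears in both neighboring chunks, so a passage that straddles the boundary is less likely to be lost at retrieval time.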

#### Define your indexer flow

```ts
import { chunk } from 'llm-chunk';
import path from 'path';
import pdf from 'pdf-parse';
import { readFile } from 'fs/promises';
import z from 'zod';

import { Document, index } from '@genkit-ai/ai/retriever';
import { defineFlow, run } from '@genkit-ai/flow';
import { devLocalVectorstore } from '@genkit-ai/dev-local-vectorstore';

export const indexPdf = defineFlow(
{
@@ -130,18 +217,23 @@ export const indexPdf = defineFlow(
},
async (filePath) => {
filePath = path.resolve(filePath);

// Read the pdf.
const pdfTxt = await run('extract-text', () =>
extractTextFromPdf(filePath)
);

// Divide the pdf text into segments.
const chunks = await run('chunk-it', async () =>
chunk(pdfTxt, chunkingConfig)
);

// Convert chunks of text into documents to store in the index.
const documents = chunks.map((text) => {
return Document.fromText(text, { filePath });
});

// Add documents to the index.
await index({
indexer: pdfIndexer,
documents,
@@ -151,33 +243,27 @@ export const indexPdf = defineFlow(

async function extractTextFromPdf(filePath: string) {
const pdfFile = path.resolve(filePath);
const dataBuffer = fs.readFileSync(pdfFile);
const dataBuffer = await readFile(pdfFile);
const data = await pdf(dataBuffer);
return data.text;
}
```

To run the flow:
#### Run the indexer flow

```posix-terminal
genkit flow:run indexPdf "'../pdfs'"
```

## Retrievers
After running the `indexPdf` flow, the vector database will be seeded with documents and ready to be used in Genkit flows with retrieval steps.

A retriever is a concept that encapsulates logic related to any kind of document
retrieval. The most popular retrieval cases typically include retrieval from
vector stores.

To create a retriever, you can use one of the provided implementations or
create your own.
### Define a flow with retrieval

The following example shows how you might use a retriever in a RAG flow. Like
the indexer example, this example uses Genkit's file-based vector retriever,
which you should not use in production.

```ts
import { configureGenkit } from '@genkit-ai/core';
import { defineFlow } from '@genkit-ai/flow';
import { generate } from '@genkit-ai/ai/generate';
import { retrieve } from '@genkit-ai/ai/retriever';
@@ -188,18 +274,7 @@ import {
import { geminiPro, textEmbeddingGecko, vertexAI } from '@genkit-ai/vertexai';
import * as z from 'zod';

configureGenkit({
plugins: [
vertexAI(),
devLocalVectorstore([
{
indexName: 'bob-facts',
embedder: textEmbeddingGecko,
},
]),
],
});

// Define the retriever reference
export const bobFactRetriever = devLocalRetrieverRef('bob-facts');

export const ragFlow = defineFlow(
@@ -211,6 +286,7 @@ export const ragFlow = defineFlow(
options: { k: 3 },
});

// generate a response
const llmResponse = await generate({
model: geminiPro,
prompt: `Answer this question: ${input}`,
@@ -223,31 +299,6 @@ export const ragFlow = defineFlow(
);
```
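
Conceptually, the "augmented" part of RAG means the retrieved text ends up in front of the model alongside the question. The sketch below shows one way this could be done by hand with plain strings. The `buildAugmentedPrompt` helper and its prompt wording are hypothetical and are not part of Genkit, which handles retrieved documents through its own `generate` options.

```typescript
// Hypothetical helper for illustration only: NOT part of Genkit.
// Combines retrieved passages and the user's question into one prompt string.
function buildAugmentedPrompt(question: string, passages: string[]): string {
  const context = passages
    .map((passage, i) => `[${i + 1}] ${passage.trim()}`) // number each passage
    .join('\n');
  return [
    'Use only the context below to answer the question.',
    '',
    'Context:',
    context,
    '',
    `Question: ${question}`,
  ].join('\n');
}

const demoPrompt = buildAugmentedPrompt('Who is Bob?', [
  ' Bob is a cat. ',
  'Bob likes tea.',
]);
```

Numbering the passages is a design choice in this sketch: it lets the model (or a later step) refer back to a specific passage.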

## Supported indexers, retrievers, and embedders

Genkit provides indexer and retriever support through its plugin system. The
following plugins are officially supported:

- [Cloud Firestore vector store](plugins/firebase.md)
- [Chroma DB](plugins/chroma.md) vector database
- [Pinecone](plugins/pinecone.md) cloud vector database

In addition, Genkit supports the following vector stores through predefined
code templates, which you can customize for your database configuration and
schema:

- PostgreSQL with [`pgvector`](templates/pgvector.md)

Embedding model support is provided through the following plugins:

| Plugin | Models |
| ------------------------- | -------------------- |
| [Google Generative AI][1] | Gecko text embedding |
| [Google Vertex AI][2] | Gecko text embedding |

[1]: plugins/google-genai.md
[2]: plugins/vertex-ai.md

## Write your own indexers and retrievers

It's also possible to create your own retriever. This is useful if your