
Implement AI chat endpoint #539

Merged 45 commits into gramps-project:master on Sep 27, 2024

Conversation

DavidMStraub
Member

@DavidMStraub DavidMStraub commented Aug 27, 2024

This is a work in progress implementing an LLM-based AI chat assistant endpoint for Gramps Web, based on the RAG (retrieval augmented generation) technique.

Short summary of the architecture:

  • gramps_webapi.api.search.text_semantic converts Gramps objects to text for semantic indexing
  • v1.0 of Sifts provides the search index for vector embeddings, enabling semantic search
    • The new semantic search index is managed via a --semantic switch on the CLI commands and a semantic boolean query argument on the index endpoints
    • Vector embeddings are computed purely locally via Sentence Transformers
  • A new /api/chat/ endpoint
    • Conducts a semantic search on the query
    • Feeds the top 10 search results, along with the query, to an LLM
    • Returns the LLM's response
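The endpoint flow above (semantic search, then prompt assembly, then LLM call) can be sketched roughly as follows. All names here (`build_rag_prompt`, `chat`, `semantic_search`) are illustrative assumptions, not the actual `gramps_webapi` internals; the LLM call assumes an openai-python-style client.

```python
def build_rag_prompt(question, search_results, max_results=10):
    """Assemble the LLM prompt from the top semantic-search hits."""
    context = "\n\n".join(search_results[:max_results])
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def chat(question, semantic_search, llm_client, model="gpt-4o-mini"):
    """Hypothetical RAG chat flow: search, build prompt, ask the LLM."""
    # 1. Semantic search over the vector-embedding index.
    hits = semantic_search(question)
    # 2. Feed the top hits plus the query to the LLM.
    prompt = build_rag_prompt(question, hits)
    response = llm_client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    # 3. Return the model's answer.
    return response.choices[0].message.content
```

Since the prompt assembly is a pure function, it can be unit-tested without a running model; only the final step needs a live (or local, e.g. Ollama) OpenAI-compatible endpoint.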

Configuration options:

  • VECTOR_EMBEDDING_MODEL: the sentence transformers model to use for vector embeddings. E.g. intfloat/multilingual-e5-small
  • LLM_BASE_URL: the base URL for openai-python. If empty, the OpenAI API is used; any OpenAI-compatible API works as well, e.g. Ollama running locally
  • LLM_MODEL: the model to use for chat. E.g., gpt-4o-mini for OpenAI or tinyllama for Ollama.
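As a hypothetical example, the three options above could be set like this for a fully local setup with Ollama (the exact configuration mechanism and any variable prefix depend on your Gramps Web deployment):

```shell
# Local Sentence Transformers model for vector embeddings
export VECTOR_EMBEDDING_MODEL="intfloat/multilingual-e5-small"
# Point openai-python at a local OpenAI-compatible server (here: Ollama's default port)
export LLM_BASE_URL="http://localhost:11434/v1"
# Chat model served by that endpoint
export LLM_MODEL="tinyllama"
```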

Guiding principles

  • Fully local if desired, optionally leveraging APIs
  • Disabled by default (resource consumption, privacy concerns, ...)

What's missing?

  • documentation, obviously
  • more unit tests
  • optional quota system per tree
  • limiting chat endpoint to user groups?
  • I'm still working on the text_semantic module to cover all object types and to tweak it for better semantic search recall and precision, which are currently not as good as I would like
  • A config option to limit context length rather than retrieving a fixed number of objects
  • Possibly a better way to handle long notes
  • Adding a default embedding model to the default docker image

I also plan to

  • write a blog post with details about the architecture
  • submit a PR to the frontend repo with a chat UI very soon

Note that this is meant to be an initial implementation laying the foundations; this can be made more powerful later on by introducing function calling etc.

@DavidMStraub
Member Author

Docs: gramps-project/gramps-web-docs#18

@emyoulation

This is a work in progress implementing an LLM-based AI chat assistant endpoint for Gramps Web, based on the RAG (retrieval augmented generation) technique.

Please introduce acronyms.
"This is a work in progress implementing a Large Language Model (LLM) based artificial intelligence (AI) chat assistant endpoint for Gramps Web, utilizing the Retrieval-Augmented Generation (RAG) technique."

@DavidMStraub
Member Author

Almost done. I expect to finish within the next 7 days.

@DavidMStraub DavidMStraub marked this pull request as ready for review September 14, 2024 12:11
@DavidMStraub
Member Author

DavidMStraub commented Sep 14, 2024

  • Update API docs for /metadata/


@RahulVadisetty91 RahulVadisetty91 left a comment


After reviewing the gramps_webapi/main.py script:

The second definition overwrites the first. It seems the second function is the correct one since it includes the semantic option. Remove the first one to avoid overwriting.
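The shadowing behavior the review describes can be shown with a minimal sketch (the function name and bodies are made up for illustration, not taken from gramps_webapi):

```python
def create_search_index(tree):
    return "full-text index"


# A later definition with the same name silently replaces the one above;
# Python raises no error, and only this version is callable afterwards.
def create_search_index(tree, semantic=False):
    return "semantic index" if semantic else "full-text index"


print(create_search_index("my_tree", semantic=True))  # prints "semantic index"
```

This is why the duplicate should be removed: keeping both definitions leaves dead code and makes it easy to edit the wrong one.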

@DavidMStraub DavidMStraub merged commit 82cfa86 into gramps-project:master Sep 27, 2024
2 checks passed

3 participants