Implement AI chat endpoint #539
Please introduce acronyms.

Almost done. Expect to finish in the next 7 days.
This is a work in progress implementing a large language model (LLM) based AI chat assistant endpoint for Gramps Web, based on the retrieval augmented generation (RAG) technique.
Short summary of the architecture:
- A new module `gramps_webapi.api.search.text_semantic` that converts Gramps objects to text for semantic indexing
- A `--semantic` switch on the CLI commands and a `semantic` boolean query argument on the index endpoints
- A new `/api/chat/` endpoint
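For illustration, here is a minimal sketch of a request against the new endpoint; the payload field name (`query`), the port, and the response shape are assumptions, not a documented contract:

```python
import requests

BASE_URL = "http://localhost:5000"  # assumed local Gramps Web API instance
TOKEN = "..."  # JWT access token obtained via the /api/token/ login

# Hypothetical request: the "query" payload field is an assumption
response = requests.post(
    f"{BASE_URL}/api/chat/",
    json={"query": "When was my great-grandmother born?"},
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=60,
)
response.raise_for_status()
print(response.json())
```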
Configuration options:

- `VECTOR_EMBEDDING_MODEL`: the sentence transformers model to use for vector embeddings, e.g. `intfloat/multilingual-e5-small`
- `LLM_BASE_URL`: the base URL for `openai-python`. If empty, the OpenAI API is used, but any OpenAI-compatible API can be used, e.g. Ollama running locally
- `LLM_MODEL`: the model to use for chat, e.g. `gpt-4o-mini` for OpenAI or `tinyllama` for Ollama.
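To make the role of these three options concrete, here is a rough sketch of the RAG flow they feed into: embed the question, retrieve semantically similar indexed text, and pass it as context to an OpenAI-compatible chat model. This is a simplified illustration, not the actual implementation; `retrieve_similar` is a hypothetical stand-in for the semantic search index.

```python
from openai import OpenAI
from sentence_transformers import SentenceTransformer

# VECTOR_EMBEDDING_MODEL: embeds both the indexed text and the user query.
# e5 models expect a "query: " / "passage: " prefix on the input text.
embedder = SentenceTransformer("intfloat/multilingual-e5-small")

# LLM_BASE_URL / LLM_MODEL: any OpenAI-compatible API works, e.g. a local
# Ollama server; with an empty base URL, openai-python talks to OpenAI itself.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")


def retrieve_similar(query_vector) -> str:
    """Hypothetical stand-in for the semantic search index lookup."""
    return "…text snippets about the family tree, found by vector similarity…"


def answer(question: str) -> str:
    # 1. Embed the question with the configured embedding model
    query_vector = embedder.encode(f"query: {question}")
    # 2. Retrieve semantically similar indexed text (stubbed out here)
    context = retrieve_similar(query_vector)
    # 3. Let the chat model answer, grounded in the retrieved context
    completion = client.chat.completions.create(
        model="tinyllama",  # LLM_MODEL
        messages=[
            {"role": "system", "content": f"Answer based on this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return completion.choices[0].message.content
```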
Guiding principles
What's missing?

- Work on the `text_semantic` module to complete it for all object types and to tweak it to improve semantic search recall and precision, which is currently not as good as I would like
- Possibly a better way to handle long notes (one possible option is sketched below)
- I also plan to …
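On the long-notes point, one conceivable approach (purely a sketch of one option, not what this PR does) is to split a note into overlapping chunks before embedding, so that each chunk stays within the embedding model's effective input length:

```python
def chunk_text(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks so long notes can be embedded
    piecewise; the size/overlap values are arbitrary illustrative choices."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start : start + size])
        start += size - overlap
    return chunks
```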
Note that this is meant to be an initial implementation laying the foundations; it can be made more powerful later on by introducing function calling etc. A rough sketch of what that could look like follows.
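As a pointer only, here is a hedged sketch of declaring a tool with `openai-python`; the tool name and parameters are invented for illustration and are not part of this PR:

```python
from openai import OpenAI

client = OpenAI()

# Sketch only: a hypothetical tool the chat model could call in a future
# iteration, e.g. to run a structured search instead of relying on RAG alone.
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_people",  # hypothetical, not part of the PR
            "description": "Search people in the family tree by name.",
            "parameters": {
                "type": "object",
                "properties": {"name": {"type": "string"}},
                "required": ["name"],
            },
        },
    }
]

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Who is Anna Miller?"}],
    tools=tools,
)
```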