This is a simple chatbot that uses a retrieval-augmented generation (RAG) architecture to answer questions. Each user prompt is used to retrieve relevant context from a vector database, and the answer is then generated from the retrieved context together with the question.
Currently uses Llama 3 as the generation model.
- Python (FastAPI, PyTorch, Hugging Face Transformers)
- Chroma vector database
- Docker
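
For illustration, here is a minimal sketch of the retrieve-then-generate flow using the stack listed above (FastAPI, Chroma, Hugging Face Transformers). The collection name, Chroma path, model ID, prompt template, and endpoint path are assumptions for the example, not the project's actual configuration.

```python
# Minimal RAG sketch: retrieve context from Chroma, then generate an answer.
# Names, paths, and the model ID below are illustrative assumptions.
import chromadb
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()

# Assumed persistent Chroma store and collection name.
chroma_client = chromadb.PersistentClient(path="./chroma_db")
collection = chroma_client.get_or_create_collection(name="docs")

# Assumed Llama 3 instruct checkpoint; any local or Hub text-generation model works.
generator = pipeline("text-generation", model="meta-llama/Meta-Llama-3-8B-Instruct")


class Question(BaseModel):
    text: str


@app.post("/ask")
def ask(question: Question) -> dict:
    # 1. Retrieve the documents most similar to the prompt.
    results = collection.query(query_texts=[question.text], n_results=3)
    context = "\n".join(results["documents"][0])

    # 2. Generate an answer grounded in the retrieved context.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question.text}\nAnswer:"
    )
    output = generator(prompt, max_new_tokens=256, return_full_text=False)
    return {"answer": output[0]["generated_text"]}
```

With the app served (for example via `uvicorn`), a POST to `/ask` with a JSON body like `{"text": "..."}` would return the generated answer.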