GERD is developed as an experimental library to investigate how large language models (LLMs) can be used to generate and analyze (sets of) documents.
This project was initially forked from Llama-2-Open-Source-LLM-CPU-Inference by Kenneth Leung.
If you just want to try it out, you can clone the project and install its dependencies with pip:
```sh
git clone https://github.com/caretech-owl/gerd.git
cd gerd
pip install -e ".[full]"
python examples/hello.py
```
For more information on development, have a look at DEV.md. If you want to try GERD in your browser, head over to Binder. Note that running LLMs on the CPU (and especially on limited virtual machines like Binder) takes some time. If you are in a hurry, you might be better off cloning the repo and running the examples or notebooks locally.
Follow the quickstart but execute gradio with the qa_frontend instead of the example file. When the server is done loading, open http://127.0.0.1:7860 in your browser.
```sh
gradio gerd/frontends/qa_frontend.py
# Some Llama.cpp output
# ...
# * Running on local URL: http://127.0.0.1:7860
```
Click the 'Click to Upload' button and search for the GRASCCO document named Caja.txt, which is located in the tests/data/grascco folder, and upload it into the vector store. Next, you can query information from the document, for instance Wie heißt der Patient? (What is the patient called?).
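Under the hood, this kind of document QA embeds text chunks into vectors and stores them in a FAISS index so that the chunks most similar to a question can be retrieved as context for the LLM. The following is a minimal, library-agnostic sketch of that retrieval step using the tools listed below; it is not GERD's actual implementation, and the sample chunks are made up:

```python
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers
import faiss  # pip install faiss-cpu

# Embed a few (made-up) document chunks into 384-dimensional vectors.
model = SentenceTransformer("all-MiniLM-L6-v2")
chunks = [
    "Der Patient wurde am 12.03. stationär aufgenommen.",
    "Die Entlassung erfolgte in gutem Allgemeinzustand.",
]
embeddings = model.encode(chunks).astype("float32")

# Store the vectors in a flat L2 FAISS index (the 'vector store').
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)

# Embed the question and retrieve the most similar chunk.
query = model.encode(["Wie heißt der Patient?"]).astype("float32")
distances, ids = index.search(query, 1)
print(chunks[ids[0][0]])  # the most relevant chunk, used as context for the LLM
```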
Prompt chaining is a prompt engineering approach that makes a large language model 'reflect' on its own answer. Check examples/chaining.py for an illustration, and have a look at how chaining is configured and used with GERD. You can find the config at config/gen_chaining.yml.
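Conceptually, a chain feeds the model's first answer back into follow-up prompts, for example to critique and then revise it. Below is a minimal, library-agnostic sketch of this idea; the `generate` function is a hypothetical stand-in for any LLM call and not part of GERD's API:

```python
def generate(prompt: str) -> str:
    """Hypothetical stand-in for an arbitrary LLM call (not part of GERD)."""
    raise NotImplementedError("plug your model call in here")

question = "What type of mammal lays the biggest eggs?"

# Step 1: ask for an initial answer.
answer = generate(f"Answer briefly and truthfully: {question}")

# Step 2: let the model critique its own answer.
critique = generate(
    f"Question: {question}\nAnswer: {answer}\n"
    "List any factual errors in this answer."
)

# Step 3: ask for a revised answer that takes the critique into account.
print(generate(
    f"Question: {question}\nAnswer: {answer}\nCritique: {critique}\n"
    "Write a corrected, brief answer."
))
```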
```sh
python examples/chaining.py
# ...
====== Resolved prompt =====
system: You are a helpful assistant. Please answer the following question in a truthful and brief manner.
user: What type of mammal lays the biggest eggs?
# ...
Result: Based on the given information, the largest egg-laying mammal is the blue whale, which can lay up to 100 million eggs per year. However, the other assertions provided do not align with this information.
```
As you can see, the answer does not make much sense with the default model, which is rather small. Give it a try with meta-llama/Llama-3.2-3B. To use this model, you need to log in with the Hugging Face CLI and accept the Meta Community License Agreement.
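Logging in works with the standard Hugging Face CLI; this assumes you already have a Hugging Face account and an access token:

```sh
pip install -U "huggingface_hub[cli]"
huggingface-cli login  # paste your access token when prompted
```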
- LangChain: Framework for developing applications powered by language models
- C Transformers: Python bindings for Transformer models implemented in C/C++ with the GGML library
- FAISS: Open-source library for efficient similarity search and clustering of dense vectors
- Sentence-Transformers (all-MiniLM-L6-v2): Open-source pre-trained transformer model that embeds text into a 384-dimensional dense vector space for tasks like clustering or semantic search
- Poetry: Tool for dependency management and Python packaging
- /assets: Images relevant to the project
- /config: Configuration files for LLM applications
- /examples: Examples that demonstrate the different usage scenarios
- /gerd: Code related to GERD
- /images: Images for the documentation
- /models: Binary files of GGML quantized LLM models (e.g., Llama-2-7B-Chat)
- /prompts: Plain text prompt files
- /templates: Prompt files as jinja2 templates
- /tests: Unit tests for GERD
- /vectorstore: FAISS vector store for documents
- pyproject.toml: TOML file that specifies which versions of the dependencies are used (Poetry)