Tiny Semantic Caching

Description:

Semantic Caching is an In-Memory Database that support Semantic Search (Vector Search), it can be used in many different applications like RAG (Retrieval Augmented Generation), Database Assistant, and many more.. Designing a high performance applications that uses LLMs requires handling alot of issues like Time-Complexity, and avoidance of repeatable calls. Semantic Caching can help and save time and computational resources when designing applications like this. Tiny Semantic Caching is a project that uses Ollama and Vector Search in Duckdb to create complete semantic caching cycle.

Prerequisities:

Python (>=3.10)
Poetry
Ollama
Docker
Basic Understanding of Vector Indexing & Vector Search.

Project Setup:

Install all Prerequisities Softwares required for this project.
install requirements

poetry install

copy all containt of .env.example to .env file / rename .env.example to .env .
get an embedding model from Ollama like nomic-embed-text

ollama pull nomic-embed-text

make sure to update model name/embedding size in .env file if you used other embedding model.

to test the project locally

## use this directly
poetry run uvicorn main:app --reload

## or use this to activate the environment first
poetry shell
## then test the API
uvicorn main:app --reload

use the following URL to test the functionalities http:localhost:8000/docs

if no issues locally, use the docker-compose file to build the containers

### build the images
docker-compose build
## run the docker-compose file
docker-compose up -d

How it Works?

There are 4 different Functionalities:

vectorize (GET): convert Passed Text to Vector Using the Embedding Model

insertion (POST): insert data and its embeddings to caching database.
search (POST): search for similar/identical text based on passed text. here text is vectorized then search in caching database, last thing it to insert it.

refresh (DELETE): refreshing database to clear all records from it.

Usage:

feel free to update the scripts based on your needs and run the docker compose file.
use the direct image without any update by

## go to scripts directory
cd scripts
## run the docker compose file
docker-compose up -d

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
assets		assets
scripts		scripts
src		src
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
docker-compose.yaml		docker-compose.yaml
main.py		main.py
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Tiny Semantic Caching

Description:

Prerequisities:

Project Setup:

How it Works?

Usage:

About

Releases

Packages

Languages

License

Ezzaldin97/tiny-semantic-caching

Folders and files

Latest commit

History

Repository files navigation

Tiny Semantic Caching

Description:

Prerequisities:

Project Setup:

How it Works?

Usage:

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages