Using Ollama and the DuckDB Vector Search Extension to build an in-memory semantic caching database

Ezzaldin97/tiny-semantic-caching

Tiny Semantic Caching

Description:

Semantic caching uses an in-memory database that supports semantic search (vector search). It can be used in many applications, such as RAG (Retrieval-Augmented Generation), database assistants, and more. Designing high-performance applications that use LLMs means handling many issues, such as time complexity and avoiding repeated calls for the same or similar input. Semantic caching helps save time and computational resources in applications like these. Tiny Semantic Caching is a project that uses Ollama and vector search in DuckDB to implement a complete semantic caching cycle.

Prerequisites:

  • Python (>=3.10)
  • Poetry
  • Ollama
  • Docker
  • Basic Understanding of Vector Indexing & Vector Search.

Project Setup:

  • Install all the prerequisite software listed above.
  • Install the requirements:
poetry install
  • Copy the contents of .env.example into a .env file, or rename .env.example to .env.
  • Pull an embedding model from Ollama, such as nomic-embed-text:
ollama pull nomic-embed-text

Make sure to update the model name and embedding size in the .env file if you use a different embedding model.

  • Test the project locally:
## use this directly
poetry run uvicorn main:app --reload

## or use this to activate the environment first
poetry shell
## then test the API
uvicorn main:app --reload

Use the following URL to test the functionality: http://localhost:8000/docs

  • If there are no issues locally, use the docker-compose file to build the containers:
## build the images
docker-compose build
## run the docker-compose file
docker-compose up -d
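
When switching to a different embedding model, the embedding size configured in .env has to match the length of the vectors the model returns (768 for nomic-embed-text). A minimal sketch of such a check is shown below. The variable names EMBEDDING_MODEL and EMBEDDING_SIZE and the helper check_embedding_size are assumptions for illustration, so check .env.example for the names the project actually uses.

```python
import os

# Hypothetical variable names (assumption); the real keys live in .env.example.
EMBEDDING_MODEL = os.environ.get("EMBEDDING_MODEL", "nomic-embed-text")
EMBEDDING_SIZE = int(os.environ.get("EMBEDDING_SIZE", "768"))


def check_embedding_size(vector, expected_size=EMBEDDING_SIZE):
    """Raise early if a vector's length does not match the configured size."""
    if len(vector) != expected_size:
        raise ValueError(
            f"{EMBEDDING_MODEL} returned {len(vector)} dimensions, "
            f"but the configured embedding size is {expected_size}"
        )
    return vector
```

Running a check like this once at startup, on a single test embedding, can surface a mismatched configuration before any data is inserted.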

How it Works?

There are four functionalities:

  1. vectorize (GET): converts the passed text to a vector using the embedding model.

  2. insertion (POST): inserts data and its embeddings into the caching database.

  3. search (POST): searches for similar or identical text based on the passed text. The text is vectorized, the caching database is searched, and finally the text is inserted into the database.

  4. refresh (DELETE): clears all records from the database.
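
The four operations above can be sketched in plain Python to show how they fit together. This is a toy illustration under stated assumptions, not the project's implementation: the hashed bag-of-words vectorize function stands in for the Ollama embedding model, a linear scan over a list stands in for DuckDB vector search, and the class name TinyCache and the 0.8 threshold are hypothetical.

```python
import math
import zlib
from collections import Counter


def vectorize(text: str, size: int = 64) -> list[float]:
    """Stand-in embedding (assumption): hashed bag-of-words scaled to unit length."""
    vec = [0.0] * size
    for word, count in Counter(text.lower().split()).items():
        vec[zlib.crc32(word.encode("utf-8")) % size] += count
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


class TinyCache:
    """Toy in-memory cache; a list replaces the DuckDB table (assumption)."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, list[float]]] = []

    def insert(self, text: str) -> None:
        # Store the text together with its embedding.
        self.rows.append((text, vectorize(text)))

    def search(self, text: str, threshold: float = 0.8) -> list[tuple[str, float]]:
        query = vectorize(text)
        hits = []
        for cached_text, vec in self.rows:
            # The dot product equals cosine similarity because vectors are unit length.
            score = sum(a * b for a, b in zip(query, vec))
            if score >= threshold:
                hits.append((cached_text, score))
        hits.sort(key=lambda hit: hit[1], reverse=True)
        # Following the description above, the searched text is inserted last.
        self.insert(text)
        return hits

    def refresh(self) -> None:
        # Clear all records from the cache.
        self.rows.clear()


cache = TinyCache()
cache.insert("how do I reset my password")
matches = cache.search("how do I reset my password")
cache.refresh()
```

A real deployment would swap vectorize for a call to the embedding model and replace the linear scan with an indexed vector search, which is what keeps lookups fast as the cache grows.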

Usage:

  • Feel free to update the scripts to fit your needs, then run the docker compose file.
  • To use the image directly without any changes:
## go to scripts directory
cd scripts
## run the docker compose file
docker-compose up -d
