diff --git a/faq/index.html b/faq/index.html index 0fa79bf..4811f2a 100644 --- a/faq/index.html +++ b/faq/index.html @@ -2160,10 +2160,10 @@
sqlite3.OperationalError
To work around the first issue, you can increase or free up disk space. To work around the second
issue, you can increase the temporary disk space (this works fine for containers but might be a problem for VMs) or point
SQLite3 to a different temporary directory by using the SQLITE_TMPDIR
environment variable.
-
-SQLite Temp File
-More information on sqlite3 temp files can be found here.
-
+
+SQLite Temp File
+More information on how sqlite3 uses temp files can be found here.
+
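As a minimal sketch of the SQLITE_TMPDIR workaround (the directory path is illustrative; the variable must be set in the process environment before SQLite creates its temp files):

```python
import os
import tempfile

# Illustrative: pick a temp directory on a volume with enough free space.
# SQLITE_TMPDIR is read by the SQLite library when it needs scratch files,
# so set it before any work that might spill to disk.
roomy_tmp = tempfile.mkdtemp(prefix="sqlite-tmp-")
os.environ["SQLITE_TMPDIR"] = roomy_tmp

import sqlite3  # imported after the variable is set

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scratch (x INTEGER)")
conn.close()
print(os.environ["SQLITE_TMPDIR"])
```

The same effect can be achieved from the shell with `export SQLITE_TMPDIR=/path/with/space` before starting the process.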
RuntimeError: Chroma is running in http-only client mode, and can only be run with 'chromadb.api.fastapi.FastAPI'
¶
Symptoms and Context:
The following error is raised when trying to create a new PersistentClient
, EphemeralClient
, or Client
:
@@ -2177,7 +2177,7 @@ This is a collection of small guides and recipes to help you get started with ChromaDB.
Latest ChromaDB version: 0.5.0
"},{"location":"#new-and-noteworthy","title":"New and Noteworthy","text":" - \ud83e\udde0 Memory Management - Learn how to manage memory in ChromaDB - \ud83d\udcc5
30-May-2024
- \ud83d\udcd0 Resource Requirements - Recently updated with temporary storage requirements - \ud83d\udcc5
28-May-2024
- \u2049\ufe0fFAQs - Facing an issue, check out our FAQ section for answers. - \ud83d\udcc5
28-May-2024
- \ud83d\udcbe Chroma Storage Layout - Understand how Chroma stores persistent data - \ud83d\udcc5
21-May-2024
- \u2699\ufe0f Chroma Configuration - Learn about all the levers that Chroma offers for configuring the client, server and HNSW indices - \ud83d\udcc5
16-May-2024
- \ud83d\udcbb Systemd Service - Learn how to start Chroma upon system boot - \ud83d\udcc5
15-May-2024
"},{"location":"#getting-started","title":"Getting Started","text":"We suggest you first head to the Concepts section to get familiar with ChromaDB concepts, such as Documents, Metadata, Embeddings, etc.
Once you're comfortable with the concepts, you can jump to the Installation section to install ChromaDB.
Core Topics:
- Filters - Learn to filter data in ChromaDB using metadata and document filters
- Resource Requirements - Understand the resource requirements for running ChromaDB
- \u2728Multi-Tenancy - Learn how to implement multi-tenancy in ChromaDB
"},{"location":"#running-chromadb","title":"Running ChromaDB","text":" - CLI - Running ChromaDB via the CLI
- Docker - Running ChromaDB in Docker
- Docker Compose - Running ChromaDB in Docker Compose
- Kubernetes - Running ChromaDB in Kubernetes (Minikube)
"},{"location":"#integrations","title":"Integrations","text":" - \u2728LangChain - Integrating ChromaDB with LangChain
- \u2728LlamaIndex - Integrating ChromaDB with LlamaIndex
- \u2728Ollama - Integrating ChromaDB with Ollama
"},{"location":"#the-ecosystem","title":"The Ecosystem","text":""},{"location":"#clients","title":"Clients","text":"Below is a list of available clients for ChromaDB.
- Python Client (Official Chroma client)
- JavaScript Client (Official Chroma client)
- Ruby Client (Community maintained)
- Java Client (Community maintained)
- Go Client (Community maintained)
- C# Client (Microsoft maintained)
- Rust Client (Community maintained)
- Elixir Client (Community maintained)
- Dart Client (Community maintained)
- PHP Client (Community maintained)
- PHP (Laravel) Client (Community maintained)
"},{"location":"#user-interfaces","title":"User Interfaces","text":" - VectorAdmin (MintPlex Labs) - An open-source web-based admin interface for vector databases, including ChromaDB
- ChromaDB UI (Community maintained) - A web-based UI for ChromaDB
"},{"location":"#cli-tooling","title":"CLI Tooling","text":" - Chroma CLI (Community maintained) - Early Alpha
- Chroma Data Pipes (Community maintained) - A CLI tool for importing and exporting data from ChromaDB
- Chroma Ops (Community maintained) - A maintenance CLI tool for ChromaDB
"},{"location":"#strategies","title":"Strategies","text":" - Backup - Backing up ChromaDB data
- Batch Imports - Importing data in batches
- Multi-Tenancy - Running multiple ChromaDB instances
- Keyword Search - Searching for keywords in ChromaDB
- Memory Management - Managing memory in ChromaDB
- Time-based Queries - Querying data based on timestamps
- \u2728 Coming Soon: Testing with Chroma - learn how to test your GenAI apps that include Chroma.
- \u2728 Coming Soon: Monitoring Chroma - learn how to monitor your Chroma instance.
- \u2728 Coming Soon: Building Chroma clients - learn how to build clients for Chroma.
- \u2728 Coming Soon: Creating the perfect Embedding Function (wrapper) - learn the best practices for creating your own embedding function.
- \u2728 Multi-User Basic Auth Plugin - learn how to build a multi-user basic authentication plugin for Chroma.
- \u2728 CORS Configuration For JS Browser apps - learn how to configure CORS for Chroma.
- \u2728 Running Chroma with SystemD - learn how to start Chroma upon system boot.
"},{"location":"#get-help","title":"Get Help","text":"Missing something? Let us know by opening an issue, or reach out on Discord (look for @taz
).
"},{"location":"contributing/getting-started/","title":"Getting Started with Contributing to Chroma","text":""},{"location":"contributing/getting-started/#overview","title":"Overview","text":"Here are some steps to follow:
- Fork the repository (if you are part of an organization in which you cannot grant permissions, it may be advisable to fork under your own user account; this lets you grant other community members permissions to contribute, which is more difficult to do at the organization level)
- Clone your forked repo locally (git clone ...) into a directory with a name appropriate for the change you want to make, e.g.
my_awesome_feature
- Create a branch for your change (git checkout -b my_awesome_feature)
- Make your changes
- Test (see Testing)
- Lint (see Linting)
- Commit your changes (git commit -am 'Added some feature')
- Push to the branch (git push origin my_awesome_feature)
- Create a new Pull Request (PR) from your forked repository to the main Chroma repository
"},{"location":"contributing/getting-started/#testing","title":"Testing","text":"It is generally good to test your changes before submitting a PR.
To run the full test suite:
pip install -r requirements_dev.txt\npytest\n
To run a specific test:
pytest chromadb/tests/test_api.py::test_get_collection\n
If you want to see the output of print statements in the tests, you can run:
pytest -s\n
If you want pytest to stop on the first failure, you can run:
pytest -x\n
"},{"location":"contributing/getting-started/#integration-tests","title":"Integration Tests","text":"You can run the integration tests by running:
sh bin/integration-test\n
The above will create a Docker container and run the integration tests against it. This also covers the JS client.
"},{"location":"contributing/getting-started/#linting","title":"Linting","text":""},{"location":"contributing/useful-shortcuts/","title":"Useful Shortcuts for Contributors","text":""},{"location":"contributing/useful-shortcuts/#git","title":"Git","text":""},{"location":"contributing/useful-shortcuts/#aliases","title":"Aliases","text":""},{"location":"contributing/useful-shortcuts/#create-venv-and-install-dependencies","title":"Create venv and install dependencies","text":"Add the following to your .bashrc
, .zshrc
or .profile
:
alias chroma-init='python -m virtualenv venv && source venv/bin/activate && pip install -r requirements.txt && pip install -r requirements_dev.txt'\n
"},{"location":"core/api/","title":"Chroma API","text":"In this article we will cover the Chroma API in depth.
"},{"location":"core/api/#accessing-the-api","title":"Accessing the API","text":"If you are running a Chroma server you can access its API at - http://<chroma_server_host>:<chroma_server_port>/docs
( e.g. http://localhost:8000/docs
).
"},{"location":"core/api/#api-endpoints","title":"API Endpoints","text":"TBD
"},{"location":"core/api/#generating-clients","title":"Generating Clients","text":"While the Chroma ecosystem has client implementations for many languages, you may want to roll out your own. Below we explain some of the options available to you:
"},{"location":"core/api/#using-openapi-generator","title":"Using OpenAPI Generator","text":"The fastest way to build a client is to use the OpenAPI Generator with the API spec.
"},{"location":"core/api/#manually-creating-a-client","title":"Manually Creating a Client","text":"If you want more control over things, you can create your own client by using the API spec as a guideline.
For your convenience we provide some data structures in various languages to help you get started. The important structures are:
- Client
- Collection
- Embedding
- Document
- ID
- Metadata
- QueryRequest/QueryResponse
- Include
- Where Filter
- WhereDocument Filter
"},{"location":"core/api/#python","title":"Python","text":""},{"location":"core/api/#typescript","title":"Typescript","text":""},{"location":"core/api/#golang","title":"Golang","text":""},{"location":"core/api/#java","title":"Java","text":""},{"location":"core/api/#rust","title":"Rust","text":""},{"location":"core/api/#elixir","title":"Elixir","text":""},{"location":"core/clients/","title":"Chroma Clients","text":"Chroma Settings Object
The below is only a partial list of Chroma configuration options. For the full list, check the code chromadb.config.Settings
or the ChromaDB Configuration page.
"},{"location":"core/clients/#persistent-client","title":"Persistent Client","text":"To create a local persistent client, use the PersistentClient
class. This client will store all data locally in a directory on your machine at the path you specify.
import chromadb\nfrom chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings\n\nclient = chromadb.PersistentClient(\n path=\"test\",\n settings=Settings(),\n tenant=DEFAULT_TENANT,\n database=DEFAULT_DATABASE,\n)\n
Parameters:
path
- parameter must be a local path on the machine where Chroma is running. If the path does not exist, it will be created. The path can be relative or absolute. If the path is not specified, the default is ./chroma
in the current working directory. settings
- Chroma settings object. tenant
- the tenant to use. Default is default_tenant
. database
- the database to use. Default is default_database
.
Positional Parameters
Chroma PersistentClient
parameters are positional, unless keyword arguments are used.
"},{"location":"core/clients/#uses-of-persistent-client","title":"Uses of Persistent Client","text":"The persistent client is useful for:
- Local development: You can use the persistent client to develop locally and test out ChromaDB.
- Embedded applications: You can use the persistent client to embed ChromaDB in your application. For example, if you are building a web application, you can use the persistent client to store data locally on the server.
"},{"location":"core/clients/#http-client","title":"HTTP Client","text":"Chroma also provides an HTTP client, suitable for use in client-server mode. This client can be used to connect to a remote ChromaDB server.
PythonJavaScript import chromadb\nfrom chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings\n\nclient = chromadb.HttpClient(\n host=\"localhost\",\n port=8000,\n ssl=False,\n headers=None,\n settings=Settings(),\n tenant=DEFAULT_TENANT,\n database=DEFAULT_DATABASE,\n)\n
Parameters:
host
- The host of the remote server. If not specified, the default is localhost
. port
- The port of the remote server. If not specified, the default is 8000
. ssl
- If True
, the client will use HTTPS. If not specified, the default is False
. headers
- (optional): The headers to be sent to the server. The setting can be used to pass additional headers to the server. An example of this can be auth headers. settings
- Chroma settings object. tenant
- the tenant to use. Default is default_tenant
. database
- the database to use. Default is default_database
.
!!! tip \"Positional Parameters\"
    Chroma `HttpClient` parameters are positional, unless keyword arguments are used.\n
import {ChromaClient} from \"chromadb\";\nconst client = new ChromaClient({\n path: \"http://localhost:8000\",\n auth: {\n provider: \"token\",\n credentials: \"your_token_here\",\n tokenHeaderType: \"AUTHORIZATION\",\n },\n tenant: \"default_tenant\",\n database: \"default_database\",\n});\n
Parameters:
path
- The Chroma endpoint auth
- Chroma authentication object tenant
- the tenant to use. Default is default_tenant
. database
- the database to use. Default is default_database
.
"},{"location":"core/clients/#uses-of-http-client","title":"Uses of HTTP Client","text":"The HTTP client is ideal for when you want to scale your application or move off of local machine storage. It is important to note that there are trade-offs associated with using HTTP client:
- Network latency - The time it takes to send a request to the server and receive a response.
- Serialization and deserialization overhead - The time it takes to convert data to a format that can be sent over the network and then convert it back to its original format.
- Security - The data is sent over the network, so it is important to ensure that the connection is secure (we recommend using both HTTPS and authentication).
- Availability - The server must be available for the client to connect to it.
- Bandwidth usage - The amount of data sent over the network.
- Data privacy and compliance - Storing data on a remote server may require compliance with data protection laws and regulations.
- Difficulty in debugging - Debugging network issues can be more difficult than debugging local issues. The same applies to server-side issues.
"},{"location":"core/clients/#host-parameter-special-cases-python-only","title":"Host parameter special cases (Python-only)","text":"The host
parameter supports a more advanced syntax than just the hostname. You can specify the whole endpoint URL (without the API paths), e.g. https://chromadb.example.com:8000/my_server/path/
. This is useful when you want to use a reverse proxy or load balancer in front of your ChromaDB server.
"},{"location":"core/clients/#ephemeral-client","title":"Ephemeral Client","text":"Ephemeral client is a client that does not store any data on disk. It is useful for fast prototyping and testing. To get started with an ephemeral client, use the EphemeralClient
class.
import chromadb\nfrom chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings\n\nclient = chromadb.EphemeralClient(\n settings=Settings(),\n tenant=DEFAULT_TENANT,\n database=DEFAULT_DATABASE,\n)\n
Parameters:
settings
- Chroma settings object. tenant
- the tenant to use. Default is default_tenant
. database
- the database to use. Default is default_database
.
Positional Parameters
Chroma EphemeralClient
parameters are positional, unless keyword arguments are used.
"},{"location":"core/clients/#environmental-variable-configured-client","title":"Environmental Variable Configured Client","text":"You can also configure the client using environmental variables. This is useful when you want to configure any of the client configurations listed above via environmental variables.
import chromadb\nfrom chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings\n\nclient = chromadb.Client(\n settings=Settings(),\n tenant=DEFAULT_TENANT,\n database=DEFAULT_DATABASE,\n)\n
Parameters:
settings
- Chroma settings object. tenant
- the tenant to use. Default is default_tenant
. database
- the database to use. Default is default_database
.
Positional Parameters
Chroma Client
parameters are positional, unless keyword arguments are used.
"},{"location":"core/collections/","title":"Collections","text":"Collections are the grouping mechanism for embeddings, documents, and metadata.
"},{"location":"core/collections/#collection-basics","title":"Collection Basics","text":""},{"location":"core/collections/#collection-properties","title":"Collection Properties","text":"Each collection is characterized by the following properties:
name
: The name of the collection. The name can be changed as long as it is unique within the database (use collection.modify(new_name=\"new_name\")
to change the name of the collection). metadata
: A dictionary of metadata associated with the collection. The metadata is a dictionary of key-value pairs. Keys can be strings, values can be strings, integers, floats, or booleans. Metadata can be changed using collection.modify(new_metadata={\"key\": \"value\"})
(Note: Metadata is always overwritten when modified) embedding_function
: The embedding function used to embed documents in the collection.
Defaults:
- Embedding Function - by default if
embedding_function
parameter is not provided at get()
or create_collection()
or get_or_create_collection()
time, Chroma uses chromadb.utils.embedding_functions.DefaultEmbeddingFunction
to embed documents. The default embedding function uses ONNX Runtime with the all-MiniLM-L6-v2
model. - distance metric - by default Chroma uses the L2 (squared Euclidean distance) metric for newly created collections. You can change it at creation time using
hnsw:space
metadata key. Possible values are l2
, cosine
, and ip (inner product). - Batch size, defined by
hnsw:batch_size
metadata key. Default is 100. The batch size defines the size of the in-memory brute-force index. Once the threshold is reached, vectors are added to the HNSW index and the brute-force index is cleared. Larger values may improve ingest performance. When updating this value, also consider changing the sync threshold. - Sync threshold, defined by
hnsw:sync_threshold
metadata key. Default is 1000. The sync threshold defines the limit at which the HNSW index is synced to disk. This limit only applies to newly added vectors.
Keep in Mind
Collection distance metric cannot be changed after the collection is created. To change the distance metric see #cloning-a-collection
Name Restrictions
Collection names in Chroma must adhere to the following restrictions:
(1) contains 3-63 characters (2) starts and ends with an alphanumeric character (3) otherwise contains only alphanumeric characters, underscores or hyphens (-) (4) contains no two consecutive periods (..) (5) is not a valid IPv4 address
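The rules above can be sketched as a small validator. This is a hypothetical helper, not Chroma's actual implementation; periods are allowed in the character class since rule (4) refers to them:

```python
import ipaddress
import re


def is_valid_collection_name(name: str) -> bool:
    """Hypothetical sketch of the documented naming rules."""
    if not 3 <= len(name) <= 63:
        return False  # rule 1: 3-63 characters
    # rules 2 and 3: alphanumeric ends; alphanumerics, underscores,
    # hyphens (and periods, per rule 4) in between
    if not re.fullmatch(r"[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]", name):
        return False
    if ".." in name:
        return False  # rule 4: no consecutive periods
    try:
        ipaddress.IPv4Address(name)
        return False  # rule 5: must not be a valid IPv4 address
    except ValueError:
        return True
```

For example, `my_collection` passes, while `ab` (too short), `-leading-hyphen`, `a..b`, and `127.0.0.1` are rejected.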
"},{"location":"core/collections/#creating-a-collection","title":"Creating a collection","text":"Official Docs
For more information on the create_collection
or get_or_create_collection
methods, see the official ChromaDB documentation.
Parameters:
Name Description Default Value Type name
Name of the collection to create. Parameter is required N/A String metadata
Metadata associated with the collection. This is an optional parameter None
Dictionary embedding_function
Embedding function to use for the collection. This is an optional parameter chromadb.utils.embedding_functions.DefaultEmbeddingFunction
EmbeddingFunction import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.create_collection(\"test\")\n
Alternatively you can use the get_or_create_collection
method to create a collection if it doesn't exist already.
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.get_or_create_collection(\"test\", metadata={\"key\": \"value\"})\n
Metadata with get_or_create_collection()
If the collection exists and metadata is provided, the method will attempt to overwrite the existing metadata.
"},{"location":"core/collections/#deleting-a-collection","title":"Deleting a collection","text":"Official Docs
For more information on the delete_collection
method, see the official ChromaDB documentation.
Parameters:
Name Description Default Value Type name
Name of the collection to delete. Parameter is required N/A String import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\nclient.delete_collection(\"test\")\n
"},{"location":"core/collections/#listing-all-collections","title":"Listing all collections","text":"Official Docs
For more information on the list_collections
method, see the official ChromaDB documentation.
Parameters:
Name Description Default Value Type offset
The starting offset for listing collections. This is an optional parameter None
Positive Integer limit
The number of collections to return. If the remaining collections from offset
are fewer than this number, then fewer collections will be returned. This is an optional parameter None
Positive Integer import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncollections = client.list_collections()\n
"},{"location":"core/collections/#getting-a-collection","title":"Getting a collection","text":"Official Docs
For more information on the get_collection
method, see the official ChromaDB documentation.
Parameters:
Name Description Default Value Type name
Name of the collection to get. Parameter is required N/A String embedding_function
Embedding function to use for the collection. This is an optional parameter chromadb.utils.embedding_functions.DefaultEmbeddingFunction
EmbeddingFunction import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.get_collection(\"test\")\n
"},{"location":"core/collections/#modifying-a-collection","title":"Modifying a collection","text":"Official Docs
For more information on the modify
method, see the official ChromaDB documentation.
Modify method on collection
As the reader will observe, the modify
method is called on the collection, and not on the client like the rest of the collection lifecycle methods.
Metadata Overwrite
Metadata is always overwritten when modified. If you want to add a new key-value pair to the metadata, you must first get the existing metadata and then add the new key-value pair to it.
Parameters:
Name Description Default Value Type new_name
The new name of the collection. Parameter is required N/A String metadata
Metadata associated with the collection. This is an optional parameter None
Dictionary Both collection properties (name
and metadata
) can be modified, separately or together.
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.get_collection(\"test\")\ncol.modify(name=\"test2\", metadata={\"key\": \"value\"})\n
"},{"location":"core/collections/#counting-collections","title":"Counting Collections","text":"Official Docs
For more information on the count_collections
method, see the official ChromaDB documentation.
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.get_or_create_collection(\"test\") # create a new collection\n\nclient.count_collections()\n
"},{"location":"core/collections/#iterating-over-a-collection","title":"Iterating over a Collection","text":"import chromadb\n\nclient = chromadb.PersistentClient(path=\"my_local_data\") # or HttpClient()\n\ncollection = client.get_or_create_collection(\"local_collection\")\ncollection.add(\n ids=[f\"{i}\" for i in range(1000)],\n documents=[f\"document {i}\" for i in range(1000)],\n metadatas=[{\"doc_id\": i} for i in range(1000)])\nexisting_count = collection.count()\nbatch_size = 10\nfor i in range(0, existing_count, batch_size):\n batch = collection.get(\n include=[\"metadatas\", \"documents\", \"embeddings\"],\n limit=batch_size,\n offset=i)\n print(batch) # do something with the batch\n
"},{"location":"core/collections/#collection-utilities","title":"Collection Utilities","text":""},{"location":"core/collections/#copying-local-collection-to-remote","title":"Copying Local Collection to Remote","text":"The following example demonstrates how to copy a local collection to a remote ChromaDB server. (it also works in reverse)
import chromadb\n\nclient = chromadb.PersistentClient(path=\"my_local_data\")\nremote_client = chromadb.HttpClient()\n\ncollection = client.get_or_create_collection(\"local_collection\")\ncollection.add(\n ids=[\"1\", \"2\"],\n documents=[\"hello world\", \"hello ChromaDB\"],\n metadatas=[{\"a\": 1}, {\"b\": 2}])\nremote_collection = remote_client.get_or_create_collection(\"remote_collection\",\n metadata=collection.metadata)\nexisting_count = collection.count()\nbatch_size = 10\nfor i in range(0, existing_count, batch_size):\n batch = collection.get(\n include=[\"metadatas\", \"documents\", \"embeddings\"],\n limit=batch_size,\n offset=i)\n remote_collection.add(\n ids=batch[\"ids\"],\n documents=batch[\"documents\"],\n metadatas=batch[\"metadatas\"],\n embeddings=batch[\"embeddings\"])\n
Using ChromaDB Data Pipes
There is a more efficient way to copy data between local and remote collections using the ChromaDB Data Pipes package.
pip install chromadb-data-pipes\ncdp export \"file://path/to_local_data/local_collection\" | \\\ncdp import \"http://remote_chromadb:port/remote_collection\" --create\n
"},{"location":"core/collections/#cloning-a-collection","title":"Cloning a collection","text":"Here are some reasons why you might want to clone a collection:
- Change distance function (via metadata -
hnsw:space
) - Change HNSW hyper parameters (
hnsw:M
, hnsw:construction_ef
, hnsw:search_ef
)
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.get_or_create_collection(\"test\") # create a new collection with L2 (default)\n\ncol.add(ids=[f\"{i}\" for i in range(1000)], documents=[f\"document {i}\" for i in range(1000)])\nnewCol = client.get_or_create_collection(\"test1\", metadata={\n \"hnsw:space\": \"cosine\"}) # let's change the distance function to cosine\n\nexisting_count = col.count()\nbatch_size = 10\nfor i in range(0, existing_count, batch_size):\n batch = col.get(include=[\"metadatas\", \"documents\", \"embeddings\"], limit=batch_size, offset=i)\n newCol.add(ids=batch[\"ids\"], documents=batch[\"documents\"], metadatas=batch[\"metadatas\"],\n embeddings=batch[\"embeddings\"])\n\nprint(newCol.count())\nprint(newCol.get(offset=0, limit=10)) # get first 10 documents\n
"},{"location":"core/collections/#changing-the-embedding-function","title":"Changing the embedding function","text":"To change the embedding function of a collection, it must be cloned to a new collection with the desired embedding function.
import os\nimport chromadb\nfrom chromadb.utils.embedding_functions import OpenAIEmbeddingFunction, DefaultEmbeddingFunction\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ndefault_ef = DefaultEmbeddingFunction()\ncol = client.create_collection(\"default_ef_collection\",embedding_function=default_ef)\nopenai_ef = OpenAIEmbeddingFunction(api_key=os.getenv(\"OPENAI_API_KEY\"), model_name=\"text-embedding-3-small\")\ncol.add(ids=[f\"{i}\" for i in range(1000)], documents=[f\"document {i}\" for i in range(1000)])\nnewCol = client.get_or_create_collection(\"openai_ef_collection\", embedding_function=openai_ef)\n\nexisting_count = col.count()\nbatch_size = 10\nfor i in range(0, existing_count, batch_size):\n batch = col.get(include=[\"metadatas\", \"documents\"], limit=batch_size, offset=i)\n newCol.add(ids=batch[\"ids\"], documents=batch[\"documents\"], metadatas=batch[\"metadatas\"])\n# get first 10 documents with their OpenAI embeddings\nprint(newCol.get(offset=0, limit=10,include=[\"metadatas\", \"documents\", \"embeddings\"])) \n
"},{"location":"core/collections/#cloning-a-subset-of-a-collection-with-query","title":"Cloning a subset of a collection with query","text":"The below example demonstrates how to select a slice of an existing collection by using where
and where_document
queries and creating a new collection from the selected slice.
Race Condition
The below example is not atomic; if data is changed between the initial selection query (select_ids = col.get(...))
and the subsequent insertion query (batch = col.get(...)
), the new collection may not contain the expected data.
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.get_or_create_collection(\"test\") # create a new collection with L2 (default)\n\ncol.add(ids=[f\"{i}\" for i in range(1000)], documents=[f\"document {i}\" for i in range(1000)])\nnewCol = client.get_or_create_collection(\"test1\", metadata={\n \"hnsw:space\": \"cosine\", \"hnsw:M\": 32}) # let's change the distance function to cosine and M to 32\nquery_where = {\"metadata_key\": \"value\"}\nquery_where_document = {\"$contains\": \"document\"}\nselect_ids = col.get(where_document=query_where_document, where=query_where, include=[]) # get only IDs\nbatch_size = 10\nfor i in range(0, len(select_ids[\"ids\"]), batch_size):\n batch = col.get(include=[\"metadatas\", \"documents\", \"embeddings\"], limit=batch_size, offset=i, where=query_where,\n where_document=query_where_document)\n newCol.add(ids=batch[\"ids\"], documents=batch[\"documents\"], metadatas=batch[\"metadatas\"],\n embeddings=batch[\"embeddings\"])\n\nprint(newCol.count())\nprint(newCol.get(offset=0, limit=10)) # get first 10 documents\n
"},{"location":"core/collections/#updating-documentrecord-metadata","title":"Updating Document/Record Metadata","text":"In this example we loop through all documents of a collection and strip all metadata fields of leading and trailing whitespace. Change the update_metadata
function to suit your needs.
from chromadb import Settings\nimport chromadb\n\nclient = chromadb.PersistentClient(path=\"test\", settings=Settings(allow_reset=True))\nclient.reset() # reset the database so we can run this script multiple times\ncol = client.get_or_create_collection(\"test\")\ncount = col.count()\n\n\ndef update_metadata(metadata: dict):\n return {k: v.strip() for k, v in metadata.items()}\n\n\nfor i in range(0, count, 10):\n batch = col.get(include=[\"metadatas\"], limit=10, offset=i)\n col.update(ids=batch[\"ids\"], metadatas=[update_metadata(metadata) for metadata in batch[\"metadatas\"]])\n
"},{"location":"core/collections/#tips-and-tricks","title":"Tips and Tricks","text":""},{"location":"core/collections/#getting-ids-only","title":"Getting IDs Only","text":"The below example demonstrates how to get only the IDs of a collection. This is useful if you need to work with IDs without the need to fetch any additional data. Chroma will accept an empty include
array, indicating that no data other than the IDs is returned.
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\")\ncol = client.get_or_create_collection(\"my_collection\")\nids_only_result = col.get(include=[])\nprint(ids_only_result['ids'])\n
"},{"location":"core/concepts/","title":"Concepts","text":""},{"location":"core/concepts/#tenancy-and-db-hierarchies","title":"Tenancy and DB Hierarchies","text":"The following picture illustrates the tenancy and DB hierarchy in Chroma:
Storage
In Chroma single-node, all data about tenancy, databases, collections and documents is stored in a single SQLite database.
"},{"location":"core/concepts/#tenants","title":"Tenants","text":"A tenant is a logical grouping for a set of databases. A tenant is designed to model a single organization or user. A tenant can have multiple databases.
"},{"location":"core/concepts/#databases","title":"Databases","text":"A database is a logical grouping for a set of collections. A database is designed to model a single application or project. A database can have multiple collections.
"},{"location":"core/concepts/#collections","title":"Collections","text":"Collections are the grouping mechanism for embeddings, documents, and metadata.
"},{"location":"core/concepts/#documents","title":"Documents","text":"Chunks of text
Documents in ChromaDB lingo are chunks of text that fit within the embedding model's context window. Unlike other frameworks that use the term \"document\" to mean a file, ChromaDB uses the term \"document\" to mean a chunk of text.
Documents are raw chunks of text that are associated with an embedding. Documents are stored in the database and can be queried for.
"},{"location":"core/concepts/#metadata","title":"Metadata","text":"Metadata is a dictionary of key-value pairs that can be associated with an embedding. Metadata is stored in the database and can be queried for.
Metadata values can be of the following types:
- strings
- integers
- floats (float32)
- booleans
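The type constraint above can be sketched in plain Python; validate_metadata is an illustrative helper, not part of the Chroma API:

```python
# Chroma metadata is a flat dict; values may only be str, int, float, or bool.
ALLOWED_TYPES = (str, int, float, bool)

def validate_metadata(metadata: dict) -> dict:
    """Raise TypeError if any value is not a Chroma-supported metadata type."""
    for key, value in metadata.items():
        if not isinstance(value, ALLOWED_TYPES):
            raise TypeError(
                f"Unsupported metadata type for {key!r}: {type(value).__name__}"
            )
    return metadata

metadata = {
    "source": "wiki",    # string
    "page": 3,           # integer
    "score": 0.87,       # float (stored as float32)
    "published": True,   # boolean
}
validate_metadata(metadata)  # passes
# validate_metadata({"tags": ["a", "b"]})  # would raise TypeError: lists are not supported
# collection.add(ids=["doc1"], documents=["..."], metadatas=[metadata])
```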
"},{"location":"core/concepts/#embedding-function","title":"Embedding Function","text":"Also referred to as embedding model, embedding functions in ChromaDB are wrappers that expose a consistent interface for generating embedding vectors from documents or text queries.
For a list of supported embedding functions see Chroma's official documentation.
"},{"location":"core/concepts/#distance-function","title":"Distance Function","text":"Distance functions help in calculating the difference (distance) between two embedding vectors. ChromaDB supports the following distance functions:
- Cosine - Useful for text similarity
- Euclidean (L2) - Useful for text similarity, more sensitive to noise than
cosine
- Inner Product (IP) - Useful for recommender systems
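The three metrics can be sketched in plain Python, following the conventions of the underlying hnswlib (its l2 space returns the squared distance, and ip and cosine are returned as distances, i.e. 1 minus the similarity):

```python
import math

def l2_distance(a, b):
    """Squared Euclidean distance, as returned by the l2 space."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ip_distance(a, b):
    """Inner-product distance: 1 - dot(a, b)."""
    return 1.0 - sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """Cosine distance: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Identical vectors are at distance 0 for l2 and cosine:
print(l2_distance([1.0, 0.0], [1.0, 0.0]))      # 0.0
print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # 0.0
```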
"},{"location":"core/concepts/#embedding-vector","title":"Embedding Vector","text":"A representation of a document in the embedding space in the form of a vector: a list of 32-bit floats (or ints).
"},{"location":"core/concepts/#embedding-model","title":"Embedding Model","text":""},{"location":"core/concepts/#document-and-metadata-index","title":"Document and Metadata Index","text":"The document and metadata index is stored in the SQLite database.
"},{"location":"core/concepts/#vector-index-hnsw-index","title":"Vector Index (HNSW Index)","text":"Under the hood (ca. v0.4.22) Chroma uses its own fork of the HNSW lib for indexing and searching vectors.
In a single-node mode, Chroma will create a single HNSW index for each collection. The index is stored in a subdir of your persistent dir, named after the vector segment id (UUID-based).
The HNSW lib uses a fast approximate nearest neighbour (ANN) algorithm to search the vectors in the index.
"},{"location":"core/configuration/","title":"Configuration","text":"Work in Progress
This page is a work in progress and may not be complete.
"},{"location":"core/configuration/#common-configurations-options","title":"Common Configurations Options","text":""},{"location":"core/configuration/#server-configuration","title":"Server Configuration","text":""},{"location":"core/configuration/#core","title":"Core","text":""},{"location":"core/configuration/#is_persistent","title":"is_persistent
","text":""},{"location":"core/configuration/#persist_directory","title":"persist_directory
","text":""},{"location":"core/configuration/#allow_reset","title":"allow_reset
","text":""},{"location":"core/configuration/#chroma_memory_limit_bytes","title":"chroma_memory_limit_bytes
","text":""},{"location":"core/configuration/#chroma_segment_cache_policy","title":"chroma_segment_cache_policy
","text":""},{"location":"core/configuration/#telemetry-and-observability","title":"Telemetry and Observability","text":""},{"location":"core/configuration/#chroma_otel_collection_endpoint","title":"chroma_otel_collection_endpoint
","text":""},{"location":"core/configuration/#chroma_otel_service_name","title":"chroma_otel_service_name
","text":""},{"location":"core/configuration/#chroma_otel_collection_headers","title":"chroma_otel_collection_headers
","text":""},{"location":"core/configuration/#chroma_otel_granularity","title":"chroma_otel_granularity
","text":""},{"location":"core/configuration/#chroma_product_telemetry_impl","title":"chroma_product_telemetry_impl
","text":""},{"location":"core/configuration/#chroma_telemetry_impl","title":"chroma_telemetry_impl
","text":""},{"location":"core/configuration/#anonymized_telemetry","title":"anonymized_telemetry
","text":""},{"location":"core/configuration/#maintenance","title":"Maintenance","text":""},{"location":"core/configuration/#migrations","title":"migrations
","text":""},{"location":"core/configuration/#migrations_hash_algorithm","title":"migrations_hash_algorithm
","text":""},{"location":"core/configuration/#operations-and-distributed","title":"Operations and Distributed","text":""},{"location":"core/configuration/#chroma_sysdb_impl","title":"chroma_sysdb_impl
","text":""},{"location":"core/configuration/#chroma_producer_impl","title":"chroma_producer_impl
","text":""},{"location":"core/configuration/#chroma_consumer_impl","title":"chroma_consumer_impl
","text":""},{"location":"core/configuration/#chroma_segment_manager_impl","title":"chroma_segment_manager_impl
","text":""},{"location":"core/configuration/#chroma_segment_directory_impl","title":"chroma_segment_directory_impl
","text":""},{"location":"core/configuration/#chroma_memberlist_provider_impl","title":"chroma_memberlist_provider_impl
","text":""},{"location":"core/configuration/#worker_memberlist_name","title":"worker_memberlist_name
","text":""},{"location":"core/configuration/#chroma_coordinator_host","title":"chroma_coordinator_host
","text":""},{"location":"core/configuration/#chroma_server_grpc_port","title":"chroma_server_grpc_port
","text":""},{"location":"core/configuration/#chroma_logservice_host","title":"chroma_logservice_host
","text":""},{"location":"core/configuration/#chroma_logservice_port","title":"chroma_logservice_port
","text":""},{"location":"core/configuration/#chroma_quota_provider_impl","title":"chroma_quota_provider_impl
","text":""},{"location":"core/configuration/#chroma_rate_limiting_provider_impl","title":"chroma_rate_limiting_provider_impl
","text":""},{"location":"core/configuration/#authentication","title":"Authentication","text":""},{"location":"core/configuration/#chroma_auth_token_transport_header","title":"chroma_auth_token_transport_header
","text":""},{"location":"core/configuration/#chroma_client_auth_provider","title":"chroma_client_auth_provider
","text":""},{"location":"core/configuration/#chroma_client_auth_credentials","title":"chroma_client_auth_credentials
","text":""},{"location":"core/configuration/#chroma_server_auth_ignore_paths","title":"chroma_server_auth_ignore_paths
","text":""},{"location":"core/configuration/#chroma_overwrite_singleton_tenant_database_access_from_auth","title":"chroma_overwrite_singleton_tenant_database_access_from_auth
","text":""},{"location":"core/configuration/#chroma_server_authn_provider","title":"chroma_server_authn_provider
","text":""},{"location":"core/configuration/#chroma_server_authn_credentials","title":"chroma_server_authn_credentials
","text":""},{"location":"core/configuration/#chroma_server_authn_credentials_file","title":"chroma_server_authn_credentials_file
","text":""},{"location":"core/configuration/#authorization","title":"Authorization","text":""},{"location":"core/configuration/#chroma_server_authz_provider","title":"chroma_server_authz_provider
","text":""},{"location":"core/configuration/#chroma_server_authz_config","title":"chroma_server_authz_config
","text":""},{"location":"core/configuration/#chroma_server_authz_config_file","title":"chroma_server_authz_config_file
","text":""},{"location":"core/configuration/#client-configuration","title":"Client Configuration","text":""},{"location":"core/configuration/#authentication_1","title":"Authentication","text":""},{"location":"core/configuration/#hnsw-configuration","title":"HNSW Configuration","text":"HNSW is the underlying library for Chroma vector indexing and search. Chroma exposes a number of parameters to configure HNSW for your use case. All HNSW parameters are configured as metadata for a collection.
Changing HNSW parameters
Some HNSW parameters cannot be changed after index creation via the standard method shown below. If you wish to change these parameters, you will need to clone the collection; see an example here.
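A minimal sketch of such a clone (collection names and the new parameters are hypothetical; for large collections you would page through src.get() in batches rather than fetching everything at once):

```python
def clone_collection(client, src_name: str, dst_name: str, new_metadata: dict):
    """Copy all records from one collection into a new one with different HNSW params."""
    src = client.get_collection(src_name)
    dst = client.create_collection(dst_name, metadata=new_metadata)
    # Fetch everything, including embeddings, so vectors are not re-embedded
    batch = src.get(include=["documents", "metadatas", "embeddings"])
    dst.add(
        ids=batch["ids"],
        documents=batch["documents"],
        metadatas=batch["metadatas"],
        embeddings=batch["embeddings"],
    )
    return dst

# Usage, assuming an existing client:
# clone_collection(client, "my_collection", "my_collection_m32", {"hnsw:M": 32})
```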
"},{"location":"core/configuration/#hnswspace","title":"hnsw:space
","text":"Description: Controls the distance metric of the HNSW index. The space cannot be changed after index creation.
Default: l2
Constraints:
- Possible values:
l2
, cosine
, ip
- Parameter cannot be changed after index creation.
"},{"location":"core/configuration/#hnswconstruction_ef","title":"hnsw:construction_ef
","text":"Description: Controls the number of neighbours in the HNSW graph to explore when adding new vectors. The more neighbours HNSW explores, the better and more exhaustive the results will be. Increasing the value will also increase memory consumption.
Default: 100
Constraints:
- Values must be positive integers.
- Parameter cannot be changed after index creation.
"},{"location":"core/configuration/#hnswm","title":"hnsw:M
","text":"Description: Controls the maximum number of neighbour connections (M) a newly inserted vector gets. A higher value results in a more densely connected graph, which leads to slower but more accurate searches at the cost of increased memory consumption.
Default: 16
Constraints:
- Values must be positive integers.
- Parameter cannot be changed after index creation.
"},{"location":"core/configuration/#hnswsearch_ef","title":"hnsw:search_ef
","text":"Description: Controls the number of neighbours in the HNSW graph to explore when searching. Increasing this improves recall at the cost of more memory and time for the HNSW algorithm to explore the nodes during a kNN search.
Default: 10
Constraints:
- Values must be positive integers.
- Parameter can be changed after index creation.
"},{"location":"core/configuration/#hnswnum_threads","title":"hnsw:num_threads
","text":"Description: Controls how many threads the HNSW algorithm uses.
Default: <number of CPU cores>
Constraints:
- Values must be positive integers.
- Parameter can be changed after index creation.
"},{"location":"core/configuration/#hnswresize_factor","title":"hnsw:resize_factor
","text":"Description: Controls the rate of growth of the graph (i.e. how much node capacity will be added) whenever the current graph capacity is reached.
Default: 1.2
Constraints:
- Values must be positive floating point numbers.
- Parameter can be changed after index creation.
"},{"location":"core/configuration/#hnswbatch_size","title":"hnsw:batch_size
","text":"Description: Controls the size of the brute-force (in-memory) index. Once this threshold is crossed, vectors are transferred from the brute-force index to the HNSW index. This value can be changed after index creation. The value must be less than hnsw:sync_threshold
.
Default: 100
Constraints:
- Values must be positive integers.
- Parameter can be changed after index creation.
"},{"location":"core/configuration/#hnswsync_threshold","title":"hnsw:sync_threshold
","text":"Description: Controls the threshold (in number of vectors) at which the HNSW index is written to disk.
Default: 1000
Constraints:
- Values must be positive integers.
- Parameter can be changed after index creation.
"},{"location":"core/configuration/#examples","title":"Examples","text":"Configuring HNSW parameters at creation time
import chromadb\n\nclient = chromadb.HttpClient() # Adjust as per your client\nres = client.create_collection(\"my_collection\", metadata={\n \"hnsw:space\": \"cosine\",\n \"hnsw:construction_ef\": 100,\n \"hnsw:M\": 16,\n \"hnsw:search_ef\": 10,\n \"hnsw:num_threads\": 4,\n \"hnsw:resize_factor\": 1.2,\n \"hnsw:batch_size\": 100,\n \"hnsw:sync_threshold\": 1000,\n})\n
Updating HNSW parameters after creation
import chromadb\n\nclient = chromadb.HttpClient() # Adjust as per your client\nres = client.get_or_create_collection(\"my_collection\", metadata={\n \"hnsw:search_ef\": 200,\n \"hnsw:num_threads\": 8,\n \"hnsw:resize_factor\": 2,\n \"hnsw:batch_size\": 10000,\n \"hnsw:sync_threshold\": 1000000,\n})\n
get_or_create_collection overrides
When using get_or_create_collection()
with metadata
parameter, existing metadata will be overridden with the new values.
"},{"location":"core/document-ids/","title":"Document IDs","text":"Chroma is unopinionated about document IDs and delegates those decisions to the user. This frees users to build semantics around their IDs.
"},{"location":"core/document-ids/#note-on-compound-ids","title":"Note on Compound IDs","text":"While you can choose to use IDs that are composed of multiple sub-IDs (e.g. user_id
+ document_id
), it is important to highlight that Chroma does not support querying by partial ID.
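Since partial-ID lookups are unsupported, a common workaround is to also store each sub-ID as metadata so it stays filterable; the helper below is a hypothetical sketch:

```python
def make_compound_id(user_id: str, document_id: str) -> str:
    """Join sub-IDs into a single Chroma document ID."""
    return f"{user_id}:{document_id}"

doc_id = make_compound_id("user42", "doc7")  # "user42:doc7"

# Store the parts as metadata too, so they remain queryable with a where clause:
metadata = {"user_id": "user42", "document_id": "doc7"}
# collection.add(ids=[doc_id], documents=["..."], metadatas=[metadata])
# collection.get(where={"user_id": "user42"})  # all documents for user42
```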
"},{"location":"core/document-ids/#common-practices","title":"Common Practices","text":""},{"location":"core/document-ids/#uuids","title":"UUIDs","text":"UUIDs are a common choice for document IDs. They are unique, and can be generated in a distributed fashion. They are also opaque, which means that they do not contain any information about the document itself. This can be a good thing, as it allows you to change the document without changing the ID.
import uuid\nimport chromadb\n\nmy_documents = [\n \"Hello, world!\",\n \"Hello, Chroma!\"\n]\n\nclient = chromadb.Client()\ncollection = client.get_or_create_collection(\"collection\")\ncollection.add(ids=[f\"{uuid.uuid4()}\" for _ in range(len(my_documents))], documents=my_documents)\n
"},{"location":"core/document-ids/#caveats","title":"Caveats","text":"Predictable Ordering
UUIDs, especially v4, are not lexicographically sortable. In its current version (0.4.x-0.5.0), Chroma orders responses of get()
by the ID of the documents. Therefore, if you need predictable ordering, you may want to consider a different ID strategy.
Storage Overhead
UUIDs are 128 bits long, which can be a lot of overhead if you have a large number of documents. If you are concerned about storage overhead, you may want to consider a different ID strategy.
"},{"location":"core/document-ids/#ulids","title":"ULIDs","text":"ULIDs are a variant of UUIDs that are lexicographically sortable. They are also 128 bits long, like UUIDs, but they are encoded in a way that makes them sortable. This can be useful if you need predictable ordering of your documents.
The canonical text encoding of a ULID (26 characters) is also shorter than that of a UUID (36 characters), which can save you some storage space. Note that ULIDs are not fully opaque: their prefix encodes the creation timestamp.
Install the py-ulid
package to generate ULIDs.
pip install py-ulid\n
from ulid import ULID\nimport chromadb\n\nmy_documents = [\n \"Hello, world!\",\n \"Hello, Chroma!\"\n]\n_ulid = ULID()\n\nclient = chromadb.Client()\n\ncollection = client.get_or_create_collection(\"name\")\n\ncollection.add(ids=[f\"{_ulid.generate()}\" for _ in range(len(my_documents))], documents=my_documents)\n
"},{"location":"core/document-ids/#nanoids","title":"NanoIDs","text":"Coming soon.
"},{"location":"core/document-ids/#hashes","title":"Hashes","text":"Hashes are another common choice for document IDs. They are unique, and can be generated in a distributed fashion. They are also opaque, which means that they do not contain any information about the document itself. This can be a good thing, as it allows you to change the document without changing the ID.
import hashlib\nimport os\nimport chromadb\n\n\ndef generate_sha256_hash() -> str:\n # Generate a random number\n random_data = os.urandom(16)\n # Create a SHA256 hash object\n sha256_hash = hashlib.sha256()\n # Update the hash object with the random data\n sha256_hash.update(random_data)\n # Return the hexadecimal representation of the hash\n return sha256_hash.hexdigest()\n\n\nmy_documents = [\n \"Hello, world!\",\n \"Hello, Chroma!\"\n]\n\nclient = chromadb.Client()\ncollection = client.get_or_create_collection(\"collection\")\ncollection.add(ids=[generate_sha256_hash() for _ in range(len(my_documents))], documents=my_documents)\n
It is also possible to use the document itself as the basis for the hash. The downside is that when the document changes, its hash no longer matches the stored ID, so any semantics you have built around the hash may require updating the ID.
import hashlib\nimport chromadb\n\n\ndef generate_sha256_hash_from_text(text) -> str:\n # Create a SHA256 hash object\n sha256_hash = hashlib.sha256()\n # Update the hash object with the text encoded to bytes\n sha256_hash.update(text.encode('utf-8'))\n # Return the hexadecimal representation of the hash\n return sha256_hash.hexdigest()\n\n\nmy_documents = [\n \"Hello, world!\",\n \"Hello, Chroma!\"\n]\n\nclient = chromadb.Client()\ncollection = client.get_or_create_collection(\"collection\")\ncollection.add(ids=[generate_sha256_hash_from_text(my_documents[i]) for i in range(len(my_documents))],\n documents=my_documents)\n
"},{"location":"core/document-ids/#semantic-strategies","title":"Semantic Strategies","text":"In this section we'll explore a few different use cases for building semantics around document IDs.
- URL Slugs - if your docs are web pages with permalinks (e.g. blog posts), you can use the URL slug as the document ID.
- File Paths - if your docs are files on disk, you can use the file path as the document ID.
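For the file-path strategy, a small sketch (the root and paths are hypothetical); normalizing to a path relative to a content root keeps IDs stable even if the root directory moves:

```python
from pathlib import PurePosixPath

def path_to_id(path: str, root: str) -> str:
    """Derive a stable document ID from a file path, relative to a content root."""
    return str(PurePosixPath(path).relative_to(root))

doc_id = path_to_id("/data/docs/guides/install.md", "/data/docs")
print(doc_id)  # guides/install.md
# collection.add(ids=[doc_id], documents=["..."])
```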
"},{"location":"core/filters/","title":"Filters","text":"Chroma provides two types of filters:
- Metadata - filter documents based on metadata using
where
clause in either Collection.query()
or Collection.get()
- Document - filter documents based on document content using
where_document
in Collection.query()
or Collection.get().
Those familiar with MongoDB queries will find Chroma's filters very similar.
"},{"location":"core/filters/#metadata-filters","title":"Metadata Filters","text":""},{"location":"core/filters/#equality","title":"Equality","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": \"is_equal_to_this\"}\n)\n
Alternative syntax:
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$eq\": \"is_equal_to_this\"}}\n)\n
"},{"location":"core/filters/#inequality","title":"Inequality","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$ne\": \"is_not_equal_to_this\"}}\n)\n
"},{"location":"core/filters/#greater-than","title":"Greater Than","text":"Greater Than
The $gt
operator is only supported for numerical values - int or float values.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$gt\": 5}}\n)\n
"},{"location":"core/filters/#greater-than-or-equal","title":"Greater Than or Equal","text":"Greater Than or Equal
The $gte
operator is only supported for numerical values - int or float values.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$gte\": 5.1}}\n)\n
"},{"location":"core/filters/#less-than","title":"Less Than","text":"Less Than
The $lt
operator is only supported for numerical values - int or float values.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$lt\": 5}}\n)\n
"},{"location":"core/filters/#less-than-or-equal","title":"Less Than or Equal","text":"Less Than or Equal
The $lte
operator is only supported for numerical values - int or float values.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$lte\": 5.1}}\n)\n
"},{"location":"core/filters/#in","title":"In","text":"In works on all data types - string, int, float, and bool.
In
The $in
operator is only supported for a list of values of the same type.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$in\": [\"value1\", \"value2\"]}}\n)\n
"},{"location":"core/filters/#not-in","title":"Not In","text":"Not In works on all data types - string, int, float, and bool.
Not In
The $nin
operator is only supported for a list of values of the same type.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$nin\": [\"value1\", \"value2\"]}}\n)\n
"},{"location":"core/filters/#logical-operator-and","title":"Logical Operator: And","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"$and\": [{\"metadata_field1\": \"value1\"}, {\"metadata_field2\": \"value2\"}]}\n)\n
Logical Operators can be nested.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"$and\": [{\"metadata_field1\": \"value1\"}, {\"$or\": [{\"metadata_field2\": \"value2\"}, {\"metadata_field3\": \"value3\"}]}]}\n)\n
"},{"location":"core/filters/#logical-operator-or","title":"Logical Operator: Or","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"$or\": [{\"metadata_field1\": \"value1\"}, {\"metadata_field2\": \"value2\"}]}\n)\n
"},{"location":"core/filters/#document-filters","title":"Document Filters","text":""},{"location":"core/filters/#contains","title":"Contains","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where_document={\"$contains\": \"search_string\"}\n)\n
"},{"location":"core/filters/#not-contains","title":"Not Contains","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where_document={\"$not_contains\": \"search_string\"}\n)\n
"},{"location":"core/filters/#logical-operator-and_1","title":"Logical Operator: And","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where_document={\"$and\": [{\"$contains\": \"search_string1\"}, {\"$contains\": \"search_string2\"}]}\n)\n
Logical Operators can be nested.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where_document={\"$and\": [{\"$contains\": \"search_string1\"}, {\"$or\": [{\"$not_contains\": \"search_string2\"}, {\"$not_contains\": \"search_string3\"}]}]}\n)\n
"},{"location":"core/filters/#logical-operator-or_1","title":"Logical Operator: Or","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where_document={\"$or\": [{\"$not_contains\": \"search_string1\"}, {\"$not_contains\": \"search_string2\"}]}\n)\n
"},{"location":"core/install/","title":"Installation","text":""},{"location":"core/install/#core-chromadb","title":"Core ChromaDB","text":"To install the latest version of chromadb, run:
pip install chromadb\n
To install a specific version of chromadb, run:
pip install chromadb==<x.y.z>\n
Releases
You can find Chroma releases in PyPI here.
"},{"location":"core/install/#chromadb-python-client","title":"ChromaDB Python Client","text":"To install the latest version of the ChromaDB Python client, run:
pip install chromadb-client\n
Releases
You can find Chroma releases in PyPI here.
"},{"location":"core/resources/","title":"Resource Requirements","text":"Chroma makes use of the following compute resources:
- RAM - Chroma stores the vector HNSW index in-memory. This allows it to perform blazing fast semantic searches.
- Disk - Chroma persists all data to disk. This includes the vector HNSW index, metadata index, system DB, and the write-ahead log (WAL).
- CPU - Chroma uses CPU for indexing and searching vectors.
Here are some formulas and heuristics to help you estimate the resources you need to run Chroma.
"},{"location":"core/resources/#ram","title":"RAM","text":"Once you select your embedding model, use the following formula for calculating RAM storage requirements for the vector HNSW index:
number of vectors
* dimensionality of vectors
* 4 bytes
= RAM required
number of vectors
- This is the number of vectors you plan to index. These are the documents in your Chroma collection (or chunks if you use LlamaIndex or LangChain terminology). dimensionality of vectors
- This is the dimensionality of the vectors output by your embedding model. For example, if you use the sentence-transformers/paraphrase-MiniLM-L6-v2
model, the dimensionality of the vectors is 384. 4 bytes
- This is the size of each component of a vector. Chroma relies on an HNSW lib implementation that uses 32-bit floats.
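The formula above as a one-line helper, with a worked example (1M vectors at 384 dimensions); note the estimate covers only the raw vectors, not the HNSW graph links themselves:

```python
def hnsw_ram_bytes(num_vectors: int, dimensionality: int, bytes_per_component: int = 4) -> int:
    """Estimate the RAM needed to hold the raw vectors of the HNSW index."""
    return num_vectors * dimensionality * bytes_per_component

# 1M chunks embedded at 384 dimensions with 32-bit floats:
required = hnsw_ram_bytes(1_000_000, 384)
print(f"{required / 1024**3:.2f} GiB")  # 1.43 GiB
```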
"},{"location":"core/resources/#disk","title":"Disk","text":"Disk storage requirements mainly depend on what metadata you store and the number of vectors you index. A good heuristic is to provision at least 2-4x the RAM required for the vector HNSW index.
WAL Cleanup
Chroma does not currently clean the WAL, so your sqlite3 metadata file will grow over time. In the meantime, feel free to use the available tooling to periodically clean your WAL - see chromadb-ops for more information.
"},{"location":"core/resources/#temporary-disk-space","title":"Temporary Disk Space","text":"Chroma uses temporary storage for its SQLite3 related operations - sorting and buffering large queries. By default, SQLite3 uses /tmp
for temporary storage.
There are two guidelines to follow:
- Have enough space if your application intends to make large queries or has multiple concurrent queries.
- Ensure temporary storage is on a fast disk to avoid performance bottlenecks.
You can configure the location of sqlite temp files with the SQLITE_TMPDIR
environment variable.
SQLite3 Temporary Storage
You can read more about SQLite3 temporary storage in the SQLite3 documentation.
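Pointing SQLite3 at a faster or larger temporary directory can also be done from Python, as long as the variable is set before the first SQLite connection is opened (the path below is hypothetical):

```python
import os

# Must be set before the Chroma client/server opens its SQLite connection.
os.environ["SQLITE_TMPDIR"] = "/mnt/fast-disk/sqlite-tmp"

# import chromadb
# client = chromadb.PersistentClient(path="/path/to/persist_dir")
```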
"},{"location":"core/resources/#cpu","title":"CPU","text":"There are no hard requirements for the CPU, but it is recommended to use as much CPU as you can spare as it directly relates to index and search speeds.
"},{"location":"core/storage-layout/","title":"Storage Layout","text":"When configured as PersistentClient
or running as a server, Chroma persists its data under the provided persist_directory
.
For PersistentClient
the persistent directory is usually passed as path
parameter when creating the client; if not passed, the default is ./chroma/
(relative path to where the client is started from).
For the server, the persistent directory can be passed as environment variable PERSIST_DIRECTORY
or as a command line argument --path
. If not passed, the default is ./chroma/
(relative path to where the server is started).
Once the client or the server is started a basic directory structure is created under the persistent directory containing the chroma.sqlite3
file. Once collections are created and data is added, subdirectories are created for each collection. The subdirectories are UUID-named and refer to the vector segment.
"},{"location":"core/storage-layout/#directory-structure","title":"Directory Structure","text":"The following diagram represents a typical Chroma persistent directory structure:
"},{"location":"core/storage-layout/#chromasqlite3","title":"chroma.sqlite3
","text":"Note about the tables
While we try to make it as accurate as possible chroma data layout inside the slite3
database is subject to change. The following description is valid as of version 0.5.0
. The tables are also not representative of the distributed architecture of Chroma.
The chroma.sqlite3
is typical for Chroma single-node. The file contains the following four types of data:
- Sysdb - Chroma system database, responsible for storing tenant, database, collection and segment information.
- WAL - the write-ahead log, which is used to ensure durability of the data.
- Metadata Segment - all metadata and documents stored in Chroma.
- Migrations - the database schema migration scripts.
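You can inspect which tables exist in your own chroma.sqlite3 using the standard sqlite3 module (read-only inspection; the path in the usage comment is hypothetical):

```python
import sqlite3

def list_tables(sqlite_path: str) -> list:
    """Return the names of all tables in a SQLite database file."""
    conn = sqlite3.connect(sqlite_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]

# list_tables("/path/to/persist_dir/chroma.sqlite3")
```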
"},{"location":"core/storage-layout/#sysdb","title":"Sysdb","text":"The system database comprises the following tables:
- tenants - contains all the tenants in the system. Usually gets initialized with a single tenant -
default_tenant
. - databases - contains all the databases per tenant. Usually gets initialized with a single database -
default_database
related to the default_tenant
. - collections - contains all the collections per database.
- collection_metadata - contains all the metadata associated with each collection. The metadata for a collection consists of any user-specified key-value pairs and the
hnsw:*
keys that store the HNSW index parameters. - segments - contains all the segments per collection. Each collection gets two segments -
metadata
and vector
. - segment_metadata - contains all the metadata associated with each segment. This table contains
hnsw:*
keys that store the HNSW index parameters for the vector segment.
"},{"location":"core/storage-layout/#wal","title":"WAL","text":"The write-ahead log is a table that stores all the changes made to the database. It is used to ensure that the data is durable and can be recovered in case of a crash. The WAL is composed of the following tables:
- embeddings_queue - contains all data ingested into Chroma. Each row of the table represents an operation upon a collection (add, update, delete, upsert). The row contains all the necessary information (embedding, document, metadata and associated relationship to a collection) to replay the operation and ensure data consistency.
- max_seq_id - maintains the maximum sequence ID of the metadata segment that is used as a WAL replay starting point for the metadata segment.
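Since the WAL lives in the embeddings_queue table, you can gauge its backlog with a simple count (the path in the usage comment is hypothetical; stop Chroma before poking at the live file):

```python
import sqlite3

def wal_entry_count(sqlite_path: str) -> int:
    """Number of operations currently stored in Chroma's WAL table."""
    conn = sqlite3.connect(sqlite_path)
    try:
        (count,) = conn.execute("SELECT count(*) FROM embeddings_queue").fetchone()
    finally:
        conn.close()
    return count

# wal_entry_count("/path/to/persist_dir/chroma.sqlite3")
```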
"},{"location":"core/storage-layout/#metadata-segment","title":"Metadata Segment","text":"The metadata segment is a table that stores all the metadata and documents stored in Chroma. The metadata segment is composed of the following tables:
- embeddings -
- embedding_metadata - contains all the metadata associated with each document and its embedding.
- embedding_fulltext_search - document full-text search index. This is a virtual table and upon inspection of the sqlite will appear as a series of tables starting with
embedding_fulltext_search_
. This is an FTS5 table and is used for full-text search queries on documents stored in Chroma (via where_document
filter in query
and get
methods).
"},{"location":"core/storage-layout/#migrations","title":"Migrations","text":"The migrations table contains all schema migrations applied to the chroma.sqlite3
database. The table is used to track the schema version and ensure that the database schema is up-to-date.
"},{"location":"core/storage-layout/#collection-subdirectories","title":"Collection Subdirectories","text":"TBD
"},{"location":"core/system_constraints/","title":"Chroma System Constraints","text":"This section contains common constraints of Chroma.
- Chroma is thread-safe
- Chroma is not process-safe
- Multiple Chroma Clients (Ephemeral, Persistent, Http) can be created from one or more threads within the same process
- A collection's name is unique within a Tenant and DB
- A collection's dimensions cannot change after creation => you cannot change the embedding function after creation
- Chroma operates in two modes - standalone (PersistentClient, EphemeralClient) and client/server (HttpClient with ChromaServer)
- The distance function cannot be changed after collection creation.
"},{"location":"core/system_constraints/#operational-modes","title":"Operational Modes","text":"Chroma can be operated in two modes:
- Standalone - This allows embedding Chroma in your python application without the need to communicate with external processes.
- Client/Server - This allows embedding Chroma in your python application as a thin-client with minimal dependencies and communicating with it via REST API. This is useful when you want to use Chroma from multiple processes or even multiple machines.
Depending on the mode you choose, you will need to consider the following component responsibilities:
- Standalone:
- Clients (Persistent, Ephemeral) - Responsible for persistence, embedding, querying
- Client/Server:
- Clients (HttpClient) - Responsible for embedding, communication with Chroma server via REST API
- Server - Responsible for persistence and querying
"},{"location":"core/tenants-and-databases/","title":"Tenants and Databases","text":"Tenants and Databases are two grouping abstractions that provide a means to organize and manage data in Chroma.
"},{"location":"core/tenants-and-databases/#tenants","title":"Tenants","text":"A tenant is a logical grouping of databases.
"},{"location":"core/tenants-and-databases/#databases","title":"Databases","text":"A database is a logical grouping of collections.
"},{"location":"core/advanced/wal-pruning/","title":"Write-ahead Log (WAL) Pruning","text":"As of this writing (v0.4.22) Chroma stores its WAL forever. This means that the WAL will grow indefinitely, which is obviously not ideal. Here we provide a small script and a few steps to prune your WAL and keep it at a reasonable size. Pruning the WAL is particularly important if you have many writes to Chroma (e.g. documents are added, updated or deleted frequently).
"},{"location":"core/advanced/wal-pruning/#tooling","title":"Tooling","text":"We have built a tool that gives users a way to prune their WAL - chroma-ops.
To prune your WAL you can run the following command:
pip install chroma-ops\nchops cleanup-wal /path/to/persist_dir\n
\u26a0\ufe0f IMPORTANT: Always back up your data before you prune the WAL.
"},{"location":"core/advanced/wal-pruning/#manual","title":"Manual","text":"Steps:
Stop Chroma
It is vitally important that you stop Chroma before you prune the WAL. If you don't, you risk corrupting your data.
- \u26a0\ufe0f Stop Chroma
- \ud83d\udcbe Create a backup of your
chroma.sqlite3
file in your persistent dir - \ud83d\udc40 Check your current
chroma.sqlite3
size (e.g. ls -lh /path/to/persist/dir/chroma.sqlite3
) - \ud83d\udda5\ufe0f Run the script below
- \ud83d\udd2d Check your current
chroma.sqlite3
size again to verify that the WAL has been pruned - \ud83d\ude80 Start Chroma
Script (store it in a file like wal_clean.py
)
wal_clean.py#!/usr/bin/env python3\n# Call the script: python wal_clean.py ./chroma-test-compact\nimport os\nimport sqlite3\nfrom typing import cast, Optional, Dict\nimport argparse\nimport pickle\n\n\nclass PersistentData:\n    \"\"\"Stores the data and metadata needed for a PersistentLocalHnswSegment\"\"\"\n\n    dimensionality: Optional[int]\n    total_elements_added: int\n    max_seq_id: int\n\n    id_to_label: Dict[str, int]\n    label_to_id: Dict[int, str]\n    id_to_seq_id: Dict[str, int]\n\n\ndef load_from_file(filename: str) -> \"PersistentData\":\n    \"\"\"Load persistent data from a file\"\"\"\n    with open(filename, \"rb\") as f:\n        ret = cast(PersistentData, pickle.load(f))\n    return ret\n\n\ndef clean_wal(chroma_persist_dir: str):\n    if not os.path.exists(chroma_persist_dir):\n        raise Exception(f\"Persist dir {chroma_persist_dir} does not exist\")\n    if not os.path.exists(f'{chroma_persist_dir}/chroma.sqlite3'):\n        raise Exception(\n            f\"SQL file not found in persist dir {chroma_persist_dir}/chroma.sqlite3\")\n    # Connect to the SQLite database\n    conn = sqlite3.connect(f'{chroma_persist_dir}/chroma.sqlite3')\n\n    # Create a cursor object\n    cursor = conn.cursor()\n\n    # Find all vector segments and their WAL topics\n    query = \"SELECT id,topic FROM segments WHERE scope='VECTOR'\"\n\n    # Execute the query\n    cursor.execute(query)\n\n    # Fetch the results\n    results = cursor.fetchall()\n    wal_cleanup_queries = []\n    for row in results:\n        # Read the max persisted sequence id from the segment's HNSW metadata\n        metadata = load_from_file(\n            f'{chroma_persist_dir}/{row[0]}/index_metadata.pickle')\n        wal_cleanup_queries.append(\n            f\"DELETE FROM embeddings_queue WHERE seq_id < {metadata.max_seq_id} AND topic='{row[1]}';\")\n\n    cursor.executescript('\\n'.join(wal_cleanup_queries))\n    # Close the cursor and connection\n    cursor.close()\n    conn.close()\n\n\nif __name__ == \"__main__\":\n    parser = argparse.ArgumentParser()\n    parser.add_argument('persist_dir', type=str)\n    arg = parser.parse_args()\n    print(arg.persist_dir)\n    clean_wal(arg.persist_dir)\n
Run the script
# Let's create a backup\ntar -czvf /path/to/persist/dir/chroma.sqlite3.backup.tar.gz /path/to/persist/dir/chroma.sqlite3\nlsof /path/to/persist/dir/chroma.sqlite3 # make sure that no process is using the file\npython wal_clean.py /path/to/persist/dir/\n# start chroma\n
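The heavy lifting in the script above is a single DELETE per vector segment: WAL entries whose seq_id is already covered by the segment's persisted HNSW index are dropped. Here is a self-contained toy sketch of that statement against an in-memory SQLite database (table and column names follow the script; the topic value and sample data are made up):

```python
import sqlite3

# Toy stand-in for Chroma's embeddings_queue table (made-up data).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE embeddings_queue (seq_id INTEGER, topic TEXT)")
cur.executemany(
    "INSERT INTO embeddings_queue VALUES (?, ?)",
    [(i, "persistent://default/default/col1") for i in range(1, 11)],
)

# Suppose the segment's index_metadata.pickle reports max_seq_id = 8:
# entries below it are already persisted in the HNSW index and can be pruned.
max_seq_id = 8
cur.execute(
    "DELETE FROM embeddings_queue WHERE seq_id < ? AND topic = ?",
    (max_seq_id, "persistent://default/default/col1"),
)
conn.commit()

remaining = cur.execute("SELECT COUNT(*) FROM embeddings_queue").fetchone()[0]
print(remaining)  # 3 - only seq_ids 8, 9 and 10 survive
conn.close()
```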
"},{"location":"core/advanced/wal/","title":"Write-ahead Log (WAL)","text":"Chroma uses a write-ahead log (WAL), a technique well known in the database industry, to ensure data durability even if things go wrong (e.g. the server crashes). The purpose of the WAL is to ensure that each user request (aka transaction) is safely stored before acknowledging back to the user. Immediately after writing to the WAL, the data is also written to the index. This enables Chroma to serve as a real-time search engine, where the data is available for querying immediately after it is written to the WAL.
Below is a diagram that illustrates the WAL in ChromaDB (ca. v0.4.22):
"},{"location":"core/advanced/wal/#vector-indices-overview","title":"Vector Indices Overview","text":"The diagram below illustrates how data gets transferred from the WAL to the binary vector indices (Bruteforce and HNSW):
For each collection Chroma maintains two binary indices - Bruteforce (in-memory, fast) and HNSW lib (persisted to disk, slow when adding new vectors and persisting). The BF index serves as a buffer that holds the portion of the WAL not yet committed to the persisted HNSW index. The HNSW index itself has a max sequence id counter, stored in a metadata file, that indicates from which position in the WAL the buffering into the BF index should begin. This buffering usually happens when the collection is first accessed.
There are two transfer points (in the diagram, sync threshold) for BF to HNSW:
hnsw:batch_size
- forces the vectors buffered in the BF index to be added to the in-memory HNSW index (this is a slow operation)
hnsw:sync_threshold
- forces Chroma to dump the in-memory HNSW index to disk (this is a slow operation)
Both of the above sync points are controlled via Collection-level metadata with the respective named params. It is customary that hnsw:sync_threshold
> hnsw:batch_size
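The interplay between the two parameters can be sketched with a small simulation. This is an illustrative model only, not Chroma's actual implementation, and the default values below are hypothetical:

```python
def simulate_writes(n_vectors: int, batch_size: int = 100, sync_threshold: int = 1000):
    """Illustrative-only model of the BF -> HNSW -> disk transfer points."""
    bf_buffer = 0       # vectors waiting in the in-memory bruteforce index
    hnsw_in_memory = 0  # vectors added to the in-memory HNSW index
    disk_syncs = 0      # times the HNSW index was dumped to disk
    for _ in range(n_vectors):
        bf_buffer += 1
        if bf_buffer >= batch_size:          # hnsw:batch_size reached
            hnsw_in_memory += bf_buffer      # slow: add BF vectors to HNSW
            bf_buffer = 0
            if hnsw_in_memory % sync_threshold == 0:  # hnsw:sync_threshold reached
                disk_syncs += 1              # slow: persist the HNSW index to disk
    return bf_buffer, hnsw_in_memory, disk_syncs

print(simulate_writes(2500))  # (0, 2500, 2) - two disk syncs, at 1000 and 2000
```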
"},{"location":"core/advanced/wal/#metadata-indices-overview","title":"Metadata Indices Overview","text":"The following diagram illustrates how data gets transferred from the WAL to the metadata index:
"},{"location":"core/advanced/wal/#further-reading","title":"Further Reading","text":"For the DevOps minded folks we have a few more resources:
- WAL Pruning - Clean up your WAL
"},{"location":"ecosystem/clients/","title":"Chroma Ecosystem Clients","text":""},{"location":"ecosystem/clients/#python","title":"Python","text":"Maintainer Chroma Core team Repo https://github.com/chroma-core/chroma Status \u2705 Stable Version 0.4.25.dev0
(PyPi Link) Docs https://docs.trychroma.com/api Compatibility Python: 3.7+
, Chroma API Version: 0.4.15+
Feature Support:
Feature Supported Create Tenant \u2705 Get Tenant \u2705 Create DB \u2705 Get DB \u2705 Create Collection \u2705 Get Collection \u2705 List Collection \u2705 Count Collection \u2705 Delete Collection \u2705 Add Documents \u2705 Delete Documents \u2705 Update Documents \u2705 Query Documents \u2705 Get Document \u2705 Count Documents \u2705 Auth - Basic \u2705 Auth - Token \u2705 Reset \u2705 Embedding Function Support:
Embedding Function Supported OpenAI \u2705 Sentence Transformers \u2705 HuggingFace Inference API \u2705 Cohere \u2705 Google Vertex AI \u2705 Google Generative AI (Gemini) \u2705 OpenCLIP (Multi-modal) \u2705 Embedding Functions
The list above is not exhaustive. Check official docs for up-to-date information.
"},{"location":"ecosystem/clients/#javascript","title":"JavaScript","text":"Maintainer Chroma Core team Repo https://github.com/chroma-core/chroma Status \u2705 Stable Version 1.8.1
(NPM Link) Docs https://docs.trychroma.com/api Compatibility Python: 3.7+
, Chroma API Version: TBD
Feature Support:
Feature Supported Create Tenant \u2705 Get Tenant \u2705 Create DB \u2705 Get DB \u2705 Create Collection \u2705 Get Collection \u2705 List Collection \u2705 Count Collection \u2705 Delete Collection \u2705 Add Documents \u2705 Delete Documents \u2705 Update Documents \u2705 Query Documents \u2705 Get Document \u2705 Count Documents \u2705 Auth - Basic \u2705 Auth - Token \u2705 Reset \u2705 Embedding Function Support:
Embedding Function Supported OpenAI \u2705 Sentence Transformers \u2705 HuggingFace Inference API \u2705 Cohere \u2705 Google Vertex AI \u2705 Google Generative AI (Gemini) \u2705 OpenCLIP (Multi-modal) \u2705 Embedding Functions
The list above is not exhaustive. Check official docs for up-to-date information.
"},{"location":"ecosystem/clients/#ruby-client","title":"Ruby Client","text":"https://github.com/mariochavez/chroma
"},{"location":"ecosystem/clients/#java-client","title":"Java Client","text":"https://github.com/amikos-tech/chromadb-java-client
"},{"location":"ecosystem/clients/#go-client","title":"Go Client","text":"https://github.com/amikos-tech/chroma-go
"},{"location":"ecosystem/clients/#c-client","title":"C# Client","text":"https://github.com/microsoft/semantic-kernel/tree/main/dotnet/src/Connectors/Connectors.Memory.Chroma
"},{"location":"ecosystem/clients/#rust-client","title":"Rust Client","text":"https://crates.io/crates/chromadb
"},{"location":"ecosystem/clients/#elixir-client","title":"Elixir Client","text":"https://hex.pm/packages/chroma/
"},{"location":"ecosystem/clients/#dart-client","title":"Dart Client","text":"https://pub.dev/packages/chromadb
"},{"location":"ecosystem/clients/#php-client","title":"PHP Client","text":"https://github.com/CodeWithKyrian/chromadb-php
"},{"location":"ecosystem/clients/#php-laravel-client","title":"PHP (Laravel) Client","text":"https://github.com/helgeSverre/chromadb
"},{"location":"embeddings/bring-your-own-embeddings/","title":"Creating your own embedding function","text":"from chromadb.api.types import (\n Documents,\n EmbeddingFunction,\n Embeddings\n)\n\n\nclass MyCustomEmbeddingFunction(EmbeddingFunction[Documents]):\n def __init__(\n self,\n my_ef_param: str\n ):\n \"\"\"Initialize the embedding function.\"\"\"\n\n def __call__(self, input: Documents) -> Embeddings:\n \"\"\"Embed the input documents.\"\"\"\n return self._my_ef(input)\n
Now let's break the above down.
First you create a class that inherits from EmbeddingFunction[Documents]
. The Documents
type is a list of Document
objects. Each Document
object has a text
attribute that contains the text of the document. Chroma also supports multi-modal embedding functions.
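Before the full transformers example in the next section, here is a minimal toy that satisfies the same callable contract. It is purely illustrative: the character-bucket vectors it produces are useless for semantic search, and it deliberately skips the chromadb base class so it runs standalone - in real code you would subclass EmbeddingFunction[Documents] as shown above.

```python
class ToyHashEmbeddingFunction:
    """Toy embedding function: list of strings in, list of float vectors out."""

    def __init__(self, dim: int = 8):
        self.dim = dim

    def __call__(self, input: list[str]) -> list[list[float]]:
        vectors = []
        for doc in input:
            vec = [0.0] * self.dim
            for ch in doc:
                # Bucket each character by its code point - meaningless
                # semantically, but dimensionally consistent, which is the
                # one hard requirement Chroma places on embedding functions.
                vec[ord(ch) % self.dim] += 1.0
            vectors.append(vec)
        return vectors

ef = ToyHashEmbeddingFunction(dim=4)
embeddings = ef(["hello", "world"])
print(len(embeddings), len(embeddings[0]))  # 2 4
```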
"},{"location":"embeddings/bring-your-own-embeddings/#example-implementation","title":"Example Implementation","text":"Below is an implementation of an embedding function that works with transformers
models.
Note
This example requires the transformers
and torch
python packages. You can install them with pip install transformers torch
.
By default, all transformers
models on HF are also supported by the sentence-transformers
package, for which Chroma provides out-of-the-box support.
import importlib\nfrom typing import Optional, cast\n\nimport numpy as np\nimport numpy.typing as npt\nfrom chromadb.api.types import EmbeddingFunction, Documents, Embeddings\n\n\nclass TransformerEmbeddingFunction(EmbeddingFunction[Documents]):\n def __init__(\n self,\n model_name: str = \"dbmdz/bert-base-turkish-cased\",\n cache_dir: Optional[str] = None,\n ):\n try:\n from transformers import AutoModel, AutoTokenizer\n\n self._torch = importlib.import_module(\"torch\")\n self._tokenizer = AutoTokenizer.from_pretrained(model_name)\n self._model = AutoModel.from_pretrained(model_name, cache_dir=cache_dir)\n except ImportError:\n raise ValueError(\n \"The transformers and/or pytorch python package is not installed. Please install it with \"\n \"`pip install transformers` or `pip install torch`\"\n )\n\n @staticmethod\n def _normalize(vector: npt.NDArray) -> npt.NDArray:\n \"\"\"Normalizes a vector to unit length using L2 norm.\"\"\"\n norm = np.linalg.norm(vector)\n if norm == 0:\n return vector\n return vector / norm\n\n def __call__(self, input: Documents) -> Embeddings:\n inputs = self._tokenizer(\n input, padding=True, truncation=True, return_tensors=\"pt\"\n )\n with self._torch.no_grad():\n outputs = self._model(**inputs)\n embeddings = outputs.last_hidden_state.mean(dim=1) # mean pooling\n return [e.tolist() for e in self._normalize(embeddings)]\n
"},{"location":"embeddings/cross-encoders/","title":"Cross-Encoders Reranking","text":"Work in Progress
This page is a work in progress and may not be complete.
For now this is just a tiny snippet showing how to use a cross-encoder to rerank results returned from Chroma. Soon we will provide a more detailed guide on the usefulness of cross-encoders/rerankers.
"},{"location":"embeddings/cross-encoders/#hugging-face-cross-encoders","title":"Hugging Face Cross Encoders","text":"from sentence_transformers import CrossEncoder\nimport numpy as np\nimport chromadb\nclient = chromadb.Client()\ncollection = client.get_or_create_collection(\"my_collection\")\n# add some documents \ncollection.add(ids=[\"doc1\", \"doc2\", \"doc3\"], documents=[\"Hello, world!\", \"Hello, Chroma!\", \"Hello, Universe!\"])\n# query the collection\nquery = \"Hello, world!\"\nresults = collection.query(query_texts=[query], n_results=3)\n\n\n\nmodel = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2', max_length=512)\n# rerank the results with original query and documents returned from Chroma\nscores = model.predict([(query, doc) for doc in results[\"documents\"][0]])\n# get the highest scoring document\nprint(results[\"documents\"][0][np.argmax(scores)])\n
"},{"location":"embeddings/embedding-models/","title":"Embedding Models","text":"Work in Progress
This page is a work in progress.
Embedding Models are your best friends in the world of Chroma, and vector databases in general. They take something you understand - text, images, audio etc. - and turn it into a list of numbers (embeddings) that a machine learning model can understand.
The goal of this page is to arm you with enough knowledge to make an informed decision about which embedding model to choose for your use case.
The importance of a model
GenAI moves fast, so we recommend not over-relying on any single model. When creating your solution, build the necessary abstractions and tests so you can quickly experiment with and swap out models (don't overdo the abstraction, though).
"},{"location":"embeddings/embedding-models/#characteristics-of-an-embedding-model","title":"Characteristics of an Embedding Model","text":" - Modality - the type of data each model is designed to work with. For example, text, images, audio, video. Note: Some models can work with multiple modalities (e.g. OpenAI's CLIP).
- Context - The maximum number of tokens the model can process at once.
- Tokenization - The model's tokenizer or the way a model turns text into tokens to process.
- Dimensionality - The number of dimensions in the output embeddings/vectors.
- Training Data - The data the model was trained on.
- Execution Environment - How the model is run (e.g. local, cloud, API).
- Loss Function - The function used to train the model e.g. how well the model is doing in predicting the embeddings, compared to the actual embeddings.
"},{"location":"embeddings/embedding-models/#model-categories","title":"Model Categories","text":"There are several ways to categorize embedding models other than the above characteristics:
- Execution environment e.g. API vs local
- Licensing e.g. open-source vs proprietary
- Privacy e.g. on-premises vs cloud
"},{"location":"embeddings/embedding-models/#execution-environment","title":"Execution Environment","text":"The execution environment is probably the first choice to consider when creating your GenAI solution. Can you afford for your data to leave the confines of your computer, cluster, or organization? If the answer is yes and you are still in the experimentation phase of your GenAI journey, we recommend using API-based embedding models.
"},{"location":"embeddings/gpu-support/","title":"Embedding Functions GPU Support","text":"By default, Chroma does not require GPU support for embedding functions. However, some of the embedding functions, especially those that run locally, do offer GPU support.
"},{"location":"embeddings/gpu-support/#default-embedding-functions-onnxruntime","title":"Default Embedding Functions (Onnxruntime)","text":"To use the default embedding functions with GPU support, you need to install onnxruntime-gpu
package. You can install it with the following command:
pip install onnxruntime-gpu\n
Note: To ensure no conflicts, you can uninstall onnxruntime
(e.g. pip uninstall onnxruntime
) or use a separate environment.
List available providers:
import onnxruntime\n\nprint(onnxruntime.get_available_providers())\n
Select the desired provider and set it as preferred before using the embedding functions (in the below example, we use CUDAExecutionProvider
):
import time\nfrom chromadb.utils.embedding_functions import ONNXMiniLM_L6_V2\n\nef = ONNXMiniLM_L6_V2(preferred_providers=['CUDAExecutionProvider'])\n\ndocs = []\nfor i in range(1000):\n docs.append(f\"this is a document with id {i}\")\n\nstart_time = time.perf_counter()\nembeddings = ef(docs)\nend_time = time.perf_counter()\nprint(f\"Elapsed time: {end_time - start_time} seconds\")\n
IMPORTANT OBSERVATION: We observe that with GPU support, sentence transformers with the all-MiniLM-L6-v2
model outperform onnxruntime. In practical terms, on a Colab T4 GPU the onnxruntime example above runs for about 100s whereas the equivalent sentence transformers example runs for about 1.8s.
"},{"location":"embeddings/gpu-support/#sentence-transformers","title":"Sentence Transformers","text":"import time\nfrom chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction\n# This will download the model to your machine and set it up for GPU support\nef = SentenceTransformerEmbeddingFunction(model_name=\"thenlper/gte-small\", device=\"cuda\")\n\n# Test with 10k documents\ndocs = []\nfor i in range(10000):\n docs.append(f\"this is a document with id {i}\")\n\nstart_time = time.perf_counter()\nembeddings = ef(docs)\nend_time = time.perf_counter()\nprint(f\"Elapsed time: {end_time - start_time} seconds\")\n
Note: You can run the above example in google Colab - see the notebook
"},{"location":"embeddings/gpu-support/#openclip","title":"OpenCLIP","text":"Prior to PR #1806, we simply used the torch
package to load the model and run it on the GPU.
import chromadb\nfrom chromadb.utils.embedding_functions import OpenCLIPEmbeddingFunction\nfrom chromadb.utils.data_loaders import ImageLoader\nimport torch\nimport os\n\nIMAGE_FOLDER = \"images\"\ntorch.device(\"cuda\")\n\nembedding_function = OpenCLIPEmbeddingFunction()\nimage_loader = ImageLoader()\n\nclient = chromadb.PersistentClient(path=\"my_local_data\")\ncollection = client.create_collection(\n    name='multimodal_collection',\n    embedding_function=embedding_function,\n    data_loader=image_loader)\n\nimage_uris = sorted([os.path.join(IMAGE_FOLDER, image_name) for image_name in os.listdir(IMAGE_FOLDER)])\nids = [str(i) for i in range(len(image_uris))]\ncollection.add(ids=ids, uris=image_uris)\n
After PR #1806:
from chromadb.utils.embedding_functions import OpenCLIPEmbeddingFunction\nembedding_function = OpenCLIPEmbeddingFunction(device=\"cuda\")\n
"},{"location":"faq/","title":"Frequently Asked Questions and Commonly Encountered Issues","text":"This section provides answers to frequently asked questions and information on commonly encountered problems when working with Chroma. The information below is based on interactions with the Chroma community.
404 Answer Not Found
If you have a question that is not answered here, please reach out to us on our Discord @taz or GitHub Issues
"},{"location":"faq/#frequently-asked-questions","title":"Frequently Asked Questions","text":""},{"location":"faq/#what-does-chroma-use-to-index-embedding-vectors","title":"What does Chroma use to index embedding vectors?","text":"Chroma uses its own fork of HNSW lib for indexing and searching embeddings.
Alternative Questions:
- What library does Chroma use for vector index and search?
- What algorithm does Chroma use for vector search?
"},{"location":"faq/#how-to-set-dimensionality-of-my-collections","title":"How to set dimensionality of my collections?","text":"When creating a collection, its dimensionality is determined by the dimensionality of the first embedding added to it. Once the dimensionality is set, it cannot be changed. Therefore, it is important to consistently use embeddings of the same dimensionality when adding or querying a collection.
Example:
import chromadb\n\nclient = chromadb.Client()\n\ncollection = client.create_collection(\"name\") # dimensionality is not set yet\n\n# add an embedding to the collection\ncollection.add(ids=[\"id1\"], embeddings=[[1, 2, 3]]) # dimensionality is set to 3\n
Alternative Questions:
- Can I change the dimensionality of a collection?
"},{"location":"faq/#can-i-use-transformers-models-with-chroma","title":"Can I use transformers
models with Chroma?","text":"Generally, yes you can use transformers
models with Chroma. Although Chroma does not provide a wrapper for this, you can use SentenceTransformerEmbeddingFunction
to achieve the same result. The sentence-transformer library will implicitly do mean-pooling on the last hidden layer, and you'll get a warning about it - No sentence-transformers model found with name [model name]. Creating a new one with MEAN pooling.
Example:
from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction\n\nef = SentenceTransformerEmbeddingFunction(model_name=\"FacebookAI/xlm-roberta-large-finetuned-conll03-english\")\n\nprint(ef([\"test\"]))\n
Warning
Not all models will work with the above method. Also mean pooling may not be the best strategy for the model. Read the model card and try to understand what if any pooling the creators recommend. You may also want to normalize the embeddings before adding them to Chroma (pass normalize_embeddings=True
to the SentenceTransformerEmbeddingFunction
EF constructor).
"},{"location":"faq/#commonly-encountered-problems","title":"Commonly Encountered Problems","text":""},{"location":"faq/#collection-dimensionality-mismatch","title":"Collection Dimensionality Mismatch","text":"Symptoms:
This error usually exhibits in the following error message:
chromadb.errors.InvalidDimensionException: Embedding dimension XXX does not match collection dimensionality YYY
Context:
When adding/upserting to or querying a Chroma collection. This error is more visible/pronounced when using the Python APIs, but will also surface in other clients.
Cause:
You are trying to add or query a collection with vectors of a different dimensionality than the collection was created with.
Explanation/Solution:
When you first create a collection client.create_collection(\"name\")
, the collection will not have knowledge of its dimensionality, which allows you to add vectors of any dimensionality to it. However, once your first batch of embeddings is added to the collection, the collection will be locked to that dimensionality. Any subsequent query or add operation must use embeddings of the same dimensionality. The dimensionality of the embeddings is a characteristic of the embedding model (EmbeddingFunction) used to generate the embeddings, therefore it is important to consistently use the same EmbeddingFunction when adding or querying a collection.
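One way to catch the mismatch early, especially when multiple embedding functions are in play, is a small client-side check before every add or query. The helper below is hypothetical, not part of Chroma's API:

```python
def check_dimensions(embeddings: list[list[float]], expected_dim: int) -> None:
    """Fail fast with a clear message before sending mismatched vectors to Chroma."""
    for i, emb in enumerate(embeddings):
        if len(emb) != expected_dim:
            raise ValueError(
                f"Embedding at index {i} has dimension {len(emb)}, "
                f"but the collection expects {expected_dim}"
            )

check_dimensions([[1.0, 2.0, 3.0]], expected_dim=3)  # passes silently
try:
    check_dimensions([[1.0, 2.0]], expected_dim=3)
except ValueError as e:
    print(e)
```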
Tip
If you do not specify an embedding_function
when creating (client.create_collection
) or getting (client.get_or_create_collection
) a collection, Chroma will use its default embedding function.
"},{"location":"faq/#large-distances-in-search-results","title":"Large Distances in Search Results","text":"Symptoms:
When querying a collection, you get distances that are in the 10s or 100s.
Context:
Frequently occurs when using your own embedding function.
Cause:
The embeddings are not normalized.
Explanation/Solution:
L2
(Euclidean distance) and IP
(inner product) distance metrics are sensitive to the magnitude of the vectors. Chroma uses L2
by default. Therefore, it is recommended to normalize the embeddings before adding them to Chroma.
Here is an example of how to normalize embeddings using the L2 norm:
import numpy as np\n\n\ndef normalize_L2(vector):\n \"\"\"Normalizes a vector to unit length using L2 norm.\"\"\"\n norm = np.linalg.norm(vector)\n if norm == 0:\n return vector\n return vector / norm\n
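To see why this matters, compare the raw and normalized L2 distance of two vectors that point in the same direction but differ in magnitude (a pure-Python sketch of the same normalization):

```python
import math

def normalize_l2(vector):
    # Pure-Python equivalent of the NumPy normalization shown above.
    norm = math.sqrt(sum(x * x for x in vector))
    return vector if norm == 0 else [x / norm for x in vector]

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [3.0, 4.0]      # same direction as b, much smaller magnitude
b = [300.0, 400.0]
print(l2_distance(a, b))                              # 495.0 - dominated by magnitude
print(l2_distance(normalize_l2(a), normalize_l2(b)))  # 0.0 - direction is identical
```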
"},{"location":"faq/#operationalerror-no-such-column-collectionstopic","title":"OperationalError: no such column: collections.topic
","text":"Symptoms:
The error OperationalError: no such column: collections.topic
is raised when trying to access Chroma locally or remotely.
Context:
After upgrading to Chroma 0.5.0
or accessing your Chroma persistent data with Chroma client version 0.5.0
.
Cause:
In version 0.5.x
Chroma has made some SQLite3 schema changes that are not backwards compatible with the previous versions. Once you access your persistent data on the server or locally with the new Chroma version it will automatically migrate to the new schema. This operation is not reversible.
Explanation/Solution:
To resolve this issue you will need to upgrade all your clients accessing the Chroma data to version 0.5.x
.
Here's a link to the migration performed by Chroma - https://github.com/chroma-core/chroma/blob/main/chromadb/migrations/sysdb/00005-remove-topic.sqlite.sql
"},{"location":"faq/#sqlite3operationalerror-database-or-disk-is-full","title":"sqlite3.OperationalError: database or disk is full
","text":"Symptoms:
The error sqlite3.OperationalError: database or disk is full
is raised when trying to access Chroma locally or remotely. The error can occur in any of the Chroma API calls.
Context:
There are two contexts in which this error can occur:
- When the persistent disk space is full or the disk quota is reached - This is where your
PERSIST_DIRECTORY
points to. - When there is not enough space in the temporary directory - frequently
/tmp
on your system or container.
Cause:
When inserting new data while your Chroma persistent disk space is full or the disk quota is reached, the database will not be able to write metadata to the SQLite3 database, thus raising the error.
When performing large queries or multiple concurrent queries, the temporary disk space may be exhausted.
Explanation/Solution:
To work around the first issue, you can increase the disk space or clean up the disk space. To work around the second issue, you can increase the temporary disk space (works fine for containers but might be a problem for VMs) or point SQLite3 to a different temporary directory by using SQLITE_TMPDIR
environment variable.
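For example, to redirect SQLite's temporary files from Python (the directory below is a placeholder - pick one with enough free space; the variable must be set before SQLite opens its first temp file, i.e. before importing chromadb):

```python
import os

# Hypothetical path - point this at a directory with enough free space.
os.environ["SQLITE_TMPDIR"] = "/var/chroma-tmp"

# Only import chromadb (which opens the SQLite database) after the
# environment is prepared; setting the variable later has no effect.
```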
SQLite Temp File
More information on how sqlite3 uses temp files can be found here.
"},{"location":"faq/#runtimeerror-chroma-is-running-in-http-only-client-mode-and-can-only-be-run-with-chromadbapifastapifastapi","title":"RuntimeError: Chroma is running in http-only client mode, and can only be run with 'chromadb.api.fastapi.FastAPI'
","text":"Symptoms and Context:
The following error is raised when trying to create a new PersistentClient
, EphemeralClient
, or Client
:
RuntimeError: Chroma is running in http-only client mode, and can only be run with 'chromadb.api.fastapi.FastAPI' \nas the chroma_api_impl. see https://docs.trychroma.com/usage-guide?lang=py#using-the-python-http-only-client for more information.\n
Cause:
There are two possible causes for this error:
chromadb-client
is installed and you are trying to work with a local client. - Dependency conflict with
chromadb-client
and chromadb
packages.
Explanation/Solution:
Chroma comes in two packages - chromadb
and chromadb-client
. The chromadb-client
package is used to interact with a remote Chroma server. If you are trying to work with a local client, you should use the chromadb
package. If you are planning to interact with a remote server only, it is recommended to use the chromadb-client
package.
If you intend to work locally with Chroma (e.g. embed in your app) then we suggest that you uninstall the chromadb-client
package and install the chromadb
package.
To check which package you have installed:
pip list | grep chromadb\n
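The same check can be done from within Python using only the standard library, which is handy inside an already-running environment:

```python
from importlib.metadata import distributions

# Collect installed distributions whose name starts with "chromadb";
# seeing both 'chromadb' and 'chromadb-client' here indicates a conflict.
chroma_pkgs = sorted(
    dist.metadata["Name"]
    for dist in distributions()
    if dist.metadata["Name"] and dist.metadata["Name"].lower().startswith("chromadb")
)
print(chroma_pkgs)  # e.g. ['chromadb'] or ['chromadb-client']
```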
To uninstall the chromadb-client
package:
pip uninstall chromadb-client\n
Working with virtual environments It is recommended to work with virtual environments to avoid dependency conflicts. To create a virtual environment you can use the following snippet:
pip install virtualenv\npython -m venv myenv\nsource myenv/bin/activate\npip install chromadb # and other packages you need\n
Alternatively you can use conda
or poetry
to manage your environments. Default Embedding Function Default embedding function - chromadb.utils.embedding_functions.DefaultEmbeddingFunction
- can only be used with chromadb
package.
"},{"location":"integrations/langchain/","title":"Chroma Integrations With LangChain","text":" - Embeddings - learn how to use Chroma Embedding functions with LC and vice versa
- Retrievers - learn how to use LangChain retrievers with Chroma
"},{"location":"integrations/langchain/embeddings/","title":"Langchain Embeddings","text":""},{"location":"integrations/langchain/embeddings/#embedding-functions","title":"Embedding Functions","text":"Chroma and Langchain both offer embedding functions which are wrappers on top of popular embedding models.
Unfortunately Chroma and LC's embedding functions are not compatible with each other. Below we offer two adapters to convert Chroma's embedding functions to LC's and vice versa.
Links: - Chroma Embedding Functions Definition - Langchain Embedding Functions Definition
Here is the adapter to convert Chroma's embedding functions to LC's:
from langchain_core.embeddings import Embeddings\nfrom chromadb.api.types import EmbeddingFunction\n\n\nclass ChromaEmbeddingsAdapter(Embeddings):\n def __init__(self, ef: EmbeddingFunction):\n self.ef = ef\n\n def embed_documents(self, texts):\n return self.ef(texts)\n\n def embed_query(self, query):\n return self.ef([query])[0]\n
Here is the adapter to convert LC's embedding function s to Chroma's:
from langchain_core.embeddings import Embeddings\nfrom chromadb.api.types import EmbeddingFunction, Documents\n\n\nclass LangChainEmbeddingAdapter(EmbeddingFunction[Documents]):\n def __init__(self, ef: Embeddings):\n self.ef = ef\n\n def __call__(self, input: Documents) -> Embeddings:\n # LC EFs also have embed_query but Chroma doesn't support that so we just use embed_documents\n # TODO: better type checking\n return self.ef.embed_documents(input)\n
"},{"location":"integrations/langchain/embeddings/#example-usage","title":"Example Usage","text":"Using Chroma Embedding Functions with Langchain:
from langchain.vectorstores.chroma import Chroma\nfrom chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction\n\ntexts = [\"foo\", \"bar\", \"baz\"]\n\ndocs_vectorstore = Chroma.from_texts(\n texts=texts,\n collection_name=\"docs_store\",\n embedding=ChromaEmbeddingsAdapter(SentenceTransformerEmbeddingFunction(model_name=\"all-MiniLM-L6-v2\")),\n)\n
Using Langchain Embedding Functions with Chroma:
from langchain_community.embeddings import SentenceTransformerEmbeddings\nimport chromadb\n\nclient = chromadb.Client()\n\ncollection = client.get_or_create_collection(\"test\", embedding_function=LangChainEmbeddingAdapter(\n SentenceTransformerEmbeddings(model_name=\"all-MiniLM-L6-v2\")))\ncollection.add(ids=[\"1\", \"2\", \"3\"], documents=[\"foo\", \"bar\", \"baz\"])\n
"},{"location":"integrations/langchain/retrievers/","title":"\ud83e\udd9c\u26d3\ufe0f Langchain Retriever","text":"TBD: describe what retrievers are in LC and how they work.
"},{"location":"integrations/langchain/retrievers/#vector-store-retriever","title":"Vector Store Retriever","text":"In the below example we demonstrate how to use Chroma as a vector store retriever with a filter query.
Note that the filter is supplied whenever we create the retriever object so the filter applies to all queries (get_relevant_documents
).
from langchain.document_loaders import OnlinePDFLoader\nfrom langchain.chains import RetrievalQA\nfrom langchain.llms import OpenAI\nfrom langchain.vectorstores import Chroma\nfrom typing import Dict, Any\nimport chromadb\nfrom langchain_core.embeddings import Embeddings\n\nclient = chromadb.PersistentClient(path=\"./chroma\")\n\ncol = client.get_or_create_collection(\"test\")\n\ncol.upsert([f\"{i}\" for i in range(10)],documents=[f\"This is document #{i}\" for i in range(10)],metadatas=[{\"id\":f\"{i}\"} for i in range(10)])\n\nef = chromadb.utils.embedding_functions.DefaultEmbeddingFunction()\n\nclass DefChromaEF(Embeddings):\n def __init__(self,ef):\n self.ef = ef\n\n def embed_documents(self,texts):\n return self.ef(texts)\n\n def embed_query(self, query):\n return self.ef([query])[0]\n\n\ndb = Chroma(client=client, collection_name=\"test\",embedding_function=DefChromaEF(ef))\n\nretriever = db.as_retriever(search_kwargs={\"filter\":{\"id\":\"1\"}})\n\ndocs = retriever.get_relevant_documents(\"document\")\n\nassert len(docs)==1\n
Ref: https://colab.research.google.com/drive/1L0RwQVVBtvTTd6Le523P4uzz3m3fm0pH#scrollTo=xROOfxLohE5j
"},{"location":"integrations/llamaindex/","title":"Chroma Integrations With LlamaIndex","text":" - Embeddings - learn how to use LlamaIndex embeddings functions with Chroma and vice versa
"},{"location":"integrations/llamaindex/embeddings/","title":"LlamaIndex Embeddings","text":""},{"location":"integrations/llamaindex/embeddings/#embedding-functions","title":"Embedding Functions","text":"Chroma and LlamaIndex both offer embedding functions which are wrappers on top of popular embedding models.
Unfortunately, Chroma's and LlamaIndex's embedding functions are not compatible with each other. Below we offer an adapter to convert a LlamaIndex embedding function to a Chroma one.
from llama_index.core.schema import TextNode\nfrom llama_index.core.base.embeddings.base import BaseEmbedding\nfrom chromadb import EmbeddingFunction, Documents, Embeddings\n\n\nclass LlamaIndexEmbeddingAdapter(EmbeddingFunction):\n def __init__(self, ef: BaseEmbedding):\n self.ef = ef\n\n def __call__(self, input: Documents) -> Embeddings:\n return [node.embedding for node in self.ef([TextNode(text=doc) for doc in input])]\n
Text modality
The above adapter assumes that the input documents are text. If you are using a different modality, you will need to modify the adapter accordingly.
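For instance, here is a minimal sketch of what a non-text adapter could look like. Note that `ImageModel` and its `get_image_embedding` method are illustrative stand-ins, not a real LlamaIndex API — substitute your actual multimodal embedding model:

```python
# Sketch: adapting a hypothetical image embedding model to Chroma's
# EmbeddingFunction-style callable protocol. ImageModel is a stand-in
# used for illustration only.
class ImageModel:
    def get_image_embedding(self, path):
        # a real model would load the image and return its embedding vector
        return [0.0] * 4


class ImageEmbeddingAdapter:
    def __init__(self, model):
        self.model = model

    def __call__(self, input):
        # here `input` is a list of image paths/URIs instead of text documents
        return [self.model.get_image_embedding(p) for p in input]


adapter = ImageEmbeddingAdapter(ImageModel())
print(len(adapter(["cat.png", "dog.png"])))  # 2 embeddings
```

The shape of the adapter stays the same as in the text case; only the input type and the model call change.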
An example of how to use the above with LlamaIndex:
Prerequisites for example
Run pip install llama-index chromadb llama-index-embeddings-fastembed fastembed
import chromadb\nfrom llama_index.embeddings.fastembed import FastEmbedEmbedding\n\n# make sure to include the above adapter and imports\nembed_model = FastEmbedEmbedding(model_name=\"BAAI/bge-small-en-v1.5\")\n\nclient = chromadb.Client()\n\ncol = client.get_or_create_collection(\"test_collection\", embedding_function=LlamaIndexEmbeddingAdapter(embed_model))\n\ncol.add(ids=[\"1\"], documents=[\"this is a test document\"])\n
"},{"location":"integrations/ollama/","title":"Chroma Integrations With Ollama","text":" - Embeddings - learn how to use Ollama as embedder for Chroma documents
- \u2728
Coming soon
RAG with Ollama - a primer on how to build a simple RAG app with Ollama and Chroma
"},{"location":"integrations/ollama/embeddings/","title":"Ollama","text":"Ollama offers out-of-the-box embedding API which allows you to generate embeddings for your documents. Chroma provides a convenient wrapper around Ollama's embedding API.
"},{"location":"integrations/ollama/embeddings/#ollama-embedding-models","title":"Ollama Embedding Models","text":"While you can use any of the ollama models including LLMs to generate embeddings. We generally recommend using specialized models like nomic-embed-text
for text embeddings. The latter models are specifically trained for embeddings and are more efficient for this purpose (e.g. the dimensions of the output embeddings are much smaller than those from LLMs e.g. 1024 - nomic-embed-text vs 4096 - llama3)
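To put the dimensionality difference in perspective, here is a rough back-of-the-envelope calculation of the raw vector storage required for 100k documents at each dimensionality (4-byte floats, ignoring index overhead):

```python
# Rough raw storage for N embedding vectors stored as 4-byte floats.
def vectors_size_mb(num_docs: int, dims: int, bytes_per_float: int = 4) -> float:
    return num_docs * dims * bytes_per_float / (1024 ** 2)


print(vectors_size_mb(100_000, 1024))  # 390.625 MB for 1024-dim vectors
print(vectors_size_mb(100_000, 4096))  # 1562.5 MB for 4096-dim vectors
```

The 4x difference in dimensions translates directly into 4x the vector storage and memory footprint.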
Models:
Model Pull Ollama Registry Link nomic-embed-text
ollama pull nomic-embed-text
nomic-embed-text mxbai-embed-large
ollama pull mxbai-embed-large
mxbai-embed-large snowflake-arctic-embed
ollama pull snowflake-arctic-embed
snowflake-arctic-embed all-minilm-l6-v2
ollama pull chroma/all-minilm-l6-v2-f32
all-minilm-l6-v2-f32"},{"location":"integrations/ollama/embeddings/#basic-usage","title":"Basic Usage","text":"First let's run a local docker container with Ollama. We'll pull nomic-embed-text
model:
docker run -d --rm -v ./ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama\ndocker exec -it ollama ollama run nomic-embed-text # press Ctrl+D to exit after model downloads successfully\n# test it\ncurl http://localhost:11434/api/embeddings -d '{\"model\": \"nomic-embed-text\",\"prompt\": \"Here is an article about llamas...\"}'\n
Ollama Docs
For more information on Ollama, visit the Ollama GitHub repository.
Using the CLI
If you have the Ollama CLI installed, or prefer to use it, you can pull a model with the following command:
ollama pull nomic-embed-text\n
Now let's configure the OllamaEmbeddingFunction embedding function (Python) with the default Ollama endpoint:
"},{"location":"integrations/ollama/embeddings/#python","title":"Python","text":"import chromadb\nfrom chromadb.utils.embedding_functions import OllamaEmbeddingFunction\n\nclient = chromadb.PersistentClient(path=\"ollama\")\n\n# create EF with custom endpoint\nef = OllamaEmbeddingFunction(\n model_name=\"nomic-embed-text\",\n url=\"http://localhost:11434/api/embeddings\",\n)\n\nprint(ef([\"Here is an article about llamas...\"]))\n
"},{"location":"integrations/ollama/embeddings/#javascript","title":"JavaScript","text":"For JS users, you can use the OllamaEmbeddingFunction
class to create embeddings:
const {OllamaEmbeddingFunction} = require('chromadb');\nconst embedder = new OllamaEmbeddingFunction({\n url: \"http://localhost:11434/api/embeddings\",\n model: \"nomic-embed-text\"\n})\n\n// use directly\nconst embeddings = embedder.generate([\"Here is an article about llamas...\"])\n
"},{"location":"integrations/ollama/embeddings/#golang","title":"Golang","text":"For Golang you can use the chroma-go
client's OllamaEmbeddingFunction
embedding function to generate embeddings for your documents:
package main\n\nimport (\n \"context\"\n \"fmt\"\n ollama \"github.com/amikos-tech/chroma-go/ollama\"\n)\n\nfunc main() {\n documents := []string{\n \"Document 1 content here\",\n \"Document 2 content here\",\n }\n // the `/api/embeddings` endpoint is automatically appended to the base URL\n ef, err := ollama.NewOllamaEmbeddingFunction(ollama.WithBaseURL(\"http://127.0.0.1:11434\"), ollama.WithModel(\"nomic-embed-text\"))\n if err != nil {\n fmt.Printf(\"Error creating Ollama embedding function: %s \\n\", err)\n }\n resp, err := ef.EmbedDocuments(context.Background(), documents)\n if err != nil {\n fmt.Printf(\"Error embedding documents: %s \\n\", err)\n }\n fmt.Printf(\"Embedding response: %v \\n\", resp)\n}\n
Golang Client
You can install the Golang client by running the following command:
go get github.com/amikos-tech/chroma-go\n
For more information visit https://go-client.chromadb.dev/
"},{"location":"running/deployment-patterns/","title":"Deployment Patterns","text":"In this section we'll cover a patterns of how to deploy GenAI applications using Chroma as a vector store.
"},{"location":"running/health-checks/","title":"Health Checks","text":""},{"location":"running/health-checks/#docker-compose","title":"Docker Compose","text":"The simples form of health check is to use the healthcheck
directive in the docker-compose.yml
file. This is useful if you are deploying Chroma alongside other services that may depend on it.
version: '3.9'\n\nnetworks:\n  net:\n    driver: bridge\n\nservices:\n  server:\n    image: server\n    build:\n      context: .\n      dockerfile: Dockerfile\n    volumes:\n      # Be aware that indexed data are located in \"/chroma/chroma/\"\n      # Default configuration for persist_directory in chromadb/config.py\n      # Read more about deployments: https://docs.trychroma.com/deployment\n      - chroma-data:/chroma/chroma\n    command: \"--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30\"\n    environment:\n      - IS_PERSISTENT=TRUE\n      - CHROMA_SERVER_AUTH_PROVIDER=${CHROMA_SERVER_AUTH_PROVIDER}\n      - CHROMA_SERVER_AUTH_CREDENTIALS_FILE=${CHROMA_SERVER_AUTH_CREDENTIALS_FILE}\n      - CHROMA_SERVER_AUTH_CREDENTIALS=${CHROMA_SERVER_AUTH_CREDENTIALS}\n      - CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=${CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER}\n      - CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER=${CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER}\n      - PERSIST_DIRECTORY=${PERSIST_DIRECTORY:-/chroma/chroma}\n      - CHROMA_OTEL_EXPORTER_ENDPOINT=${CHROMA_OTEL_EXPORTER_ENDPOINT}\n      - CHROMA_OTEL_EXPORTER_HEADERS=${CHROMA_OTEL_EXPORTER_HEADERS}\n      - CHROMA_OTEL_SERVICE_NAME=${CHROMA_OTEL_SERVICE_NAME}\n      - CHROMA_OTEL_GRANULARITY=${CHROMA_OTEL_GRANULARITY}\n      - CHROMA_SERVER_NOFILE=${CHROMA_SERVER_NOFILE}\n    ports:\n      - 8000:8000\n    healthcheck:\n      # the server listens on port 8000 (see command and ports above)\n      test: [ \"CMD\", \"/bin/bash\", \"-c\", \"cat < /dev/null > /dev/tcp/localhost/8000\" ]\n      interval: 30s\n      timeout: 10s\n      retries: 3\n    networks:\n      - net\nvolumes:\n  chroma-data:\n    driver: local\n
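The /dev/tcp check above simply verifies that the port accepts TCP connections. The same check can be sketched in Python, e.g. for an external monitoring script (assumes Chroma listens on localhost:8000; adjust host and port as needed):

```python
import socket


def tcp_health(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers connection refused, timeouts and DNS resolution failures
        return False


if __name__ == "__main__":
    print("healthy" if tcp_health("localhost", 8000) else "unhealthy")
```

Like the compose healthcheck, this only confirms the socket is open; it does not verify that the API itself responds correctly.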
"},{"location":"running/health-checks/#kubernetes","title":"Kubernetes","text":"In kubernetes you can use the livenessProbe
and readinessProbe
to check the health of the server. This is useful if you are deploying Chroma in a kubernetes cluster.
apiVersion: apps/v1\nkind: Deployment\nmetadata:\n name: chroma\n labels:\n app: chroma\nspec:\n replicas: 1\n selector:\n matchLabels:\n app: chroma\n template:\n metadata:\n labels:\n app: chroma\n spec:\n containers:\n - name: chroma\n image: <chroma-image>\n ports:\n - containerPort: 8000\n livenessProbe:\n httpGet:\n path: /api/v1\n port: 8000\n initialDelaySeconds: 5\n periodSeconds: 5\n readinessProbe:\n httpGet:\n path: /api/v1\n port: 8000\n initialDelaySeconds: 5\n periodSeconds: 5\n startupProbe:\n httpGet:\n path: /api/v1\n port: 8000\n failureThreshold: 3\n periodSeconds: 60\n initialDelaySeconds: 60\n
Alternative to the httpGet
you can also use tcpSocket
:
readinessProbe:\n tcpSocket:\n port: 8000\n failureThreshold: 3\n timeoutSeconds: 30\n periodSeconds: 60\n livenessProbe:\n tcpSocket:\n port: 8000\n failureThreshold: 3\n timeoutSeconds: 30\n periodSeconds: 60\n startupProbe:\n tcpSocket:\n port: 8000\n failureThreshold: 3\n periodSeconds: 60\n initialDelaySeconds: 60\n
"},{"location":"running/road-to-prod/","title":"Road To Production","text":"In this section we will cover considerations for operating Chroma ina production environment.
To operate Chroma in production your deployment must follow your organization's best practices and guidelines around business continuity, security, and compliance. Here we will list the core concepts and offer some guidance on how to achieve them.
Core system abilities:
- High Availability - The deployment should be able to handle failures while continuing to serve requests.
- Scalability - The deployment should be able to handle increased load by adding more resources (aka scale horizontally).
- Privacy and Security - The deployment should protect data from unauthorized access and ensure data integrity.
- Observability - The deployment should provide metrics and logs to help operators understand the system's health.
- Backup and Restore - The deployment should have a backup and restore strategy to protect against data loss.
- Disaster Recovery - The deployment should have a disaster recovery plan to recover from catastrophic failures.
- Maintenance - The deployment should be easy to maintain and upgrade.
While our guidance is likely incomplete, it can be taken as a complement to your own organizational processes. For those deploying Chroma in a smaller enterprise without such processes, we advise common sense and caution.
"},{"location":"running/road-to-prod/#high-availability","title":"High Availability","text":""},{"location":"running/road-to-prod/#scalability","title":"Scalability","text":""},{"location":"running/road-to-prod/#privacy-and-security","title":"Privacy and Security","text":""},{"location":"running/road-to-prod/#data-security","title":"Data Security","text":""},{"location":"running/road-to-prod/#in-transit","title":"In Transit","text":"The bare minimum for securing data in transit is to use HTTPS when performing Chroma API calls. This ensures that data is encrypted when it is sent over the network.
There are several ways to achieve this:
- Use a reverse proxy like Envoy or Nginx to terminate SSL/TLS connections.
- Use a load balancer like AWS ELB or Google Cloud Load Balancer to terminate SSL/TLS connections (technically Envoy and Nginx are also load balancers).
- Use a service mesh like Istio or Linkerd to manage SSL/TLS connections between services.
- Enable SSL/TLS in your Chroma server.
Depending on your requirements you may choose one or more of these options.
Reverse Proxy:
Load Balancer:
Service Mesh:
Chroma Server:
"},{"location":"running/road-to-prod/#at-rest","title":"At Rest","text":""},{"location":"running/road-to-prod/#access-control","title":"Access Control","text":""},{"location":"running/road-to-prod/#authentication","title":"Authentication","text":""},{"location":"running/road-to-prod/#authorization","title":"Authorization","text":""},{"location":"running/road-to-prod/#observability","title":"Observability","text":""},{"location":"running/road-to-prod/#backup-and-restore","title":"Backup and Restore","text":""},{"location":"running/road-to-prod/#disaster-recovery","title":"Disaster Recovery","text":""},{"location":"running/road-to-prod/#maintenance","title":"Maintenance","text":""},{"location":"running/running-chroma/","title":"Running Chroma","text":""},{"location":"running/running-chroma/#local-server","title":"Local Server","text":"Article Link
This article is also available on Medium Running ChromaDB \u2014 Part 1: Local Server.
"},{"location":"running/running-chroma/#chroma-cli","title":"Chroma CLI","text":"The simplest way to run Chroma locally is via the Chroma cli
which is part of the core Chroma package.
Prerequisites:
- Python 3.8 to 3.11 - Download Python | Python.org
pip install chromadb\nchroma run --host localhost --port 8000 --path ./my_chroma_data\n
--host
The host to listen on; by default this is localhost
, but if you want to expose Chroma to your entire network you can specify 0.0.0.0
--port
The port to listen on; by default this is 8000
.
--path
The path where your Chroma data will be persisted locally.
Target Path Install
It is possible to install Chroma in a specific directory by running pip install chromadb -t /path/to/dir
. To run the Chroma CLI from the installation dir, export the Python path: export PYTHONPATH=$PYTHONPATH:/path/to/dir
.
"},{"location":"running/running-chroma/#docker","title":"Docker","text":"Running Chroma server locally can be achieved via a simple docker command as shown below.
Prerequisites:
- Docker - Overview of Docker Desktop | Docker Docs
docker run -d --rm --name chromadb -v ./chroma:/chroma/chroma -e IS_PERSISTENT=TRUE -e ANONYMIZED_TELEMETRY=TRUE chromadb/chroma:latest\n
Options:
-v
specifies a local directory where Chroma will store its data, so the data remains after the container is destroyed. Note: If you are using -e PERSIST_DIRECTORY
then you need to point the volume to that directory. -e
IS_PERSISTENT=TRUE
lets Chroma know to persist data -e
PERSIST_DIRECTORY=/path/in/container
specifies the path in the container where the data will be stored, by default it is /chroma/chroma
-e ANONYMIZED_TELEMETRY=TRUE
allows you to turn on (TRUE
) or off (FALSE
) anonymous product telemetry which helps the Chroma team in making informed decisions about Chroma OSS and commercial direction. chromadb/chroma:latest
indicates the latest Chroma version but can be replaced with any valid tag if a prior version is needed (e.g. chroma:0.4.24
)
"},{"location":"running/running-chroma/#docker-compose-cloned-repo","title":"Docker Compose (Cloned Repo)","text":"If you are feeling adventurous you can also use the Chroma main
branch to run a local Chroma server with the latest changes:
Prerequisites:
- Docker - Overview of Docker Desktop | Docker Docs
- Git - Git - Downloads (git-scm.com)
git clone https://github.com/chroma-core/chroma && cd chroma\ndocker compose up -d --build\n
If you want to run a specific version of Chroma you can checkout the version tag you need:
git checkout release/0.4.24\n
"},{"location":"running/running-chroma/#docker-compose-without-cloning-the-repo","title":"Docker Compose (Without Cloning the Repo)","text":"If you do not wish or are able to clone the repo locally, Chroma server can also be run with docker compose by creating (or using a gist) a docker-compose.yaml
Prerequisites:
- Docker - Overview of Docker Desktop | Docker Docs
- cURL (if you want to use the gist approach)
version: '3.9'\n\nnetworks:\n net:\n driver: bridge\nservices:\n chromadb:\n image: chromadb/chroma:latest\n volumes:\n - ./chromadb:/chroma/chroma\n environment:\n - IS_PERSISTENT=TRUE\n - PERSIST_DIRECTORY=/chroma/chroma # this is the default path, change it as needed\n - ANONYMIZED_TELEMETRY=${ANONYMIZED_TELEMETRY:-TRUE}\n ports:\n - 8000:8000\n networks:\n - net\n
The above will create a container with the latest Chroma (chromadb/chroma:latest
), will expose it to port 8000
on the local machine, and will persist data in the ./chromadb
path, relative to the directory from which the docker-compose.yaml
was run.
We have also created a small gist with the above file for convenience:
curl -s https://gist.githubusercontent.com/tazarov/4fd933274bbacb3b9f286b15c01e904b/raw/87268142d64d8ee0f7f98c27a62a5d089923a1df/docker-compose.yaml | docker-compose -f - up\n
"},{"location":"running/running-chroma/#minikube-with-helm-chart","title":"Minikube With Helm Chart","text":"Note: This deployment can just as well be done with KinD
depending on your preference.
A more advanced approach to running Chroma locally (but also on a remote cluster) is to deploy it using a Helm chart.
Disclaimer: The chart used here is not a 1st party chart, but is contributed by a core contributor to Chroma.
Prerequisites:
- Docker - Overview of Docker Desktop | Docker Docs
- Install minikube - minikube start | minikube (k8s.io)
- kubectl - Install Tools | Kubernetes
- Helm - Helm | Installing Helm
Once you have all of the above, running Chroma in a local minikube
cluster is quite simple.
Create a minikube
cluster:
minikube start --addons=ingress -p chroma\nminikube profile chroma\n
Get and install the chart:
helm repo add chroma https://amikos-tech.github.io/chromadb-chart/\nhelm repo update\nhelm install chroma chroma/chromadb --set chromadb.apiVersion=\"0.4.24\"\n
By default the chart will enable authentication in Chroma. To get the token run the following:
kubectl --namespace default get secret chromadb-auth -o jsonpath=\"{.data.token}\" | base64 --decode\n# or use this to directly export variable\nexport CHROMA_TOKEN=$(kubectl --namespace default get secret chromadb-auth -o jsonpath=\"{.data.token}\" | base64 --decode)\n
The first step to connect and start using Chroma is to forward your port:
minikube service chroma-chromadb --url\n
The above should print something like this:
http://127.0.0.1:61892\n\u2757 Because you are using a Docker driver on darwin, the terminal needs to be open to run it.\n
Note: Depending on your OS the message might be slightly different.
Test it out (pip install chromadb
):
import chromadb\nfrom chromadb.config import Settings\n\nclient = chromadb.HttpClient(host=\"http://127.0.0.1:61892\",\n settings=Settings(\n chroma_client_auth_provider=\"chromadb.auth.token.TokenAuthClientProvider\",\n chroma_client_auth_credentials=\"<your_chroma_token>\"))\nclient.heartbeat() # this should work with or without authentication - it is a public endpoint\n\nclient.get_version() # this should work with or without authentication - it is a public endpoint\n\nclient.list_collections() # this is a protected endpoint and requires authentication\n
For more information about the helm chart consult - https://github.com/amikos-tech/chromadb-chart
"},{"location":"running/systemd-service/","title":"Systemd service","text":"You can run Chroma as a systemd service which wil allow you to automatically start Chroma on boot and restart it if it crashes.
"},{"location":"running/systemd-service/#docker-compose","title":"Docker Compose","text":"The following is an examples systemd service for running Chroma using Docker Compose.
Create a file /etc/systemd/system/chroma.service
with the following content:
Example assumptions
The below example assumes a Debian-based system with docker-ce installed.
[Unit]\nDescription = Chroma Service\nAfter = network.target docker.service\nRequires = docker.service\n\n[Service]\nType = forking\nUser = root\nGroup = root\nWorkingDirectory = /home/admin/chroma\nExecStart = /usr/bin/docker compose up -d\nExecStop = /usr/bin/docker compose down\nRemainAfterExit = true\n\n[Install]\nWantedBy = multi-user.target\n
Replace WorkingDirectory
with the path to the directory where your docker-compose.yaml is. You may also need to replace /usr/bin/docker
with the path to your docker binary.
Alternatively you can install directly from a gist:
wget https://gist.githubusercontent.com/tazarov/9c46966de0b32a4962dcc79dce8b2646/raw/7cf8c471f33fba8a51d6f808f9b1af6ca1b0923c/chroma-docker.service \\\n -O /etc/systemd/system/chroma.service\n
Loading, enabling and starting the service:
sudo systemctl daemon-reload\nsudo systemctl enable chroma\nsudo systemctl start chroma\n
Type=forking
In the above example, we use Type=forking
because Docker Compose runs in the background (-d
). If you are using a different command that runs in the foreground, you may need to use Type=simple
instead.
"},{"location":"running/systemd-service/#chroma-cli","title":"Chroma CLI","text":"The following is an examples systemd service for running Chroma using the Chroma CLI.
Create a file /etc/systemd/system/chroma.service
with the following content:
Example assumptions
The below example assumes that Chroma is installed in your Python site-packages
directory.
[Unit]\nDescription = Chroma Service\nAfter = network.target\n\n[Service]\nType = simple\nUser = root\nGroup = root\nWorkingDirectory = /chroma\nExecStart=/usr/local/bin/chroma run --host 127.0.0.1 --port 8000 --path /chroma/data --log-path /var/log/chroma.log\n\n[Install]\nWantedBy = multi-user.target\n
Replace the WorkingDirectory
, /chroma/data
and /var/log/chroma.log
with the appropriate paths.
Safe Config
The above example service listens on localhost
, which may not work if you are looking to expose Chroma to the outside world. Adjust the --host
and --port
flags as needed.
Alternatively you can install from a gist:
wget https://gist.githubusercontent.com/tazarov/5e10ce892c06757d8188a8a34cd6d26d/raw/327a9d0b07afeb0b0cb77453aa9171fdd190984f/chroma-cli.service \\\n -O /etc/systemd/system/chroma.service\n
Loading, enabling and starting the service:
sudo systemctl daemon-reload\nsudo systemctl enable chroma\nsudo systemctl start chroma\n
Type=simple
In the above example, we use Type=simple
because the Chroma CLI runs in the foreground. If you are using a different command that runs in the background, you may need to use Type=forking
instead.
"},{"location":"strategies/backup/","title":"ChromaDB Backups","text":"Depending on your use case there are a few different ways to back up your ChromaDB data.
- API export - this approach is relatively simple, slow for large datasets and may result in a backup that is missing some updates, should your data change frequently.
- Disk snapshot - this approach is fast, but is highly dependent on the underlying storage. Should your cloud provider and underlying volume support snapshots, this is a good option.
- Filesystem backup - this approach is also fast, but requires stopping your Chroma container to avoid data corruption. This is a good option if you can afford to stop your Chroma container for a few minutes.
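As a sketch of what the API-export approach looks like, the snippet below pages through a collection with get(limit=..., offset=...) and writes each record to JSONL. The StubCollection class is a stand-in for a real chromadb collection, used here only so the example is self-contained:

```python
import json


def export_collection(col, path: str, batch_size: int = 500) -> int:
    """Page through a Chroma-style collection and write its records to JSONL."""
    total = col.count()
    written = 0
    with open(path, "w") as f:
        for offset in range(0, total, batch_size):
            batch = col.get(limit=batch_size, offset=offset)
            for i, _id in enumerate(batch["ids"]):
                f.write(json.dumps({
                    "id": _id,
                    "document": batch["documents"][i],
                    "metadata": batch["metadatas"][i],
                }) + "\n")
                written += 1
    return written


# Stand-in for a real chromadb collection (illustration only).
class StubCollection:
    def __init__(self, docs):
        self.docs = docs

    def count(self):
        return len(self.docs)

    def get(self, limit, offset):
        chunk = self.docs[offset:offset + limit]
        return {"ids": [i for i, _ in chunk],
                "documents": [d for _, d in chunk],
                "metadatas": [{} for _ in chunk]}


col = StubCollection([("1", "foo"), ("2", "bar"), ("3", "baz")])
print(export_collection(col, "my_chroma_data.jsonl", batch_size=2))  # 3
```

With a real client you would pass the result of client.get_collection(...) instead of the stub; note that such an export only captures a point-in-time view, which is why it can miss updates on frequently changing data.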
Other Options
Have another option in mind? Feel free to add it to the above list.
"},{"location":"strategies/backup/#api-export","title":"API Export","text":""},{"location":"strategies/backup/#with-chroma-datapipes","title":"With Chroma Datapipes","text":"One way to export via the API is to use Tooling like Chroma Data Pipes. Chroma Data Pipes is a command-line tool that provides a simple way import/export/transform ChromaDB data.
Exporting from local filesystem:
cdp export \"file:///absolute/path/to/chroma-data/my-collection-name\" > my_chroma_data.jsonl\n
Exporting from remote server:
cdp export \"http://remote-chroma-server:8000/my-collection-name\" > my_chroma_data.jsonl\n
Get Help
Read more about Chroma Data Pipes here
"},{"location":"strategies/backup/#disk-snapshot","title":"Disk Snapshot","text":"TBD
"},{"location":"strategies/backup/#filesystem-backup","title":"Filesystem Backup","text":""},{"location":"strategies/backup/#from-docker-container","title":"From Docker Container","text":"Sometimes you have been running Chroma in a Docker container without a host mount, intentionally or unintentionally. So all your data is now stored in the container's filesystem. Here's how you can back up your data:
- Stop the container:
docker stop <chroma-container-id/name>\n
- Create a backup of the container's filesystem:
docker cp <chroma-container-id/name>:/chroma/chroma /path/to/backup\n
/path/to/backup
is the directory where you want to store the backup on your host machine.
"},{"location":"strategies/batching/","title":"Batching","text":"It is often that you may need to ingest a large number of documents into Chroma. The problem you may face is related to the underlying SQLite version of the machine running Chroma which imposes a maximum number of statements and parameters which Chroma translates into a batchable record size, exposed via the max_batch_size
parameter of the ChromaClient
class.
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\")\nprint(\"Number of documents that can be inserted at once: \",client.max_batch_size)\n
"},{"location":"strategies/batching/#creating-batches","title":"Creating Batches","text":"Due to consistency and data integrity reasons, Chroma does not offer, yet, out-of-the-box batching support. The below code snippet shows how to create batches of documents and ingest them into Chroma.
import chromadb\nfrom chromadb.utils.batch_utils import create_batches\nimport uuid\n\nclient = chromadb.PersistentClient(path=\"test-large-batch\")\nlarge_batch = [(f\"{uuid.uuid4()}\", f\"document {i}\", [0.1] * 1536) for i in range(100000)]\nids, documents, embeddings = zip(*large_batch)\nbatches = create_batches(api=client,ids=list(ids), documents=list(documents), embeddings=list(embeddings))\ncollection = client.get_or_create_collection(\"test\")\nfor batch in batches:\n print(f\"Adding batch of size {len(batch[0])}\")\n collection.add(ids=batch[0],\n documents=batch[3],\n embeddings=batch[1],\n metadatas=batch[2])\n
"},{"location":"strategies/cors/","title":"CORS Configuration for Browser-Based Access","text":"Chroma JS package allows you to use Chroma in your browser-based SPA application. This is great, but that means that you'll need to configure Chroma to work with your browser to avoid CORS issues.
"},{"location":"strategies/cors/#setting-up-chroma-for-browser-based-access","title":"Setting up Chroma for Browser-Based Access","text":"To allow browsers to directly access your Chroma instance you'll need to configure the CHROMA_SERVER_CORS_ALLOW_ORIGINS
. The CHROMA_SERVER_CORS_ALLOW_ORIGINS
environment variable controls the hosts which are allowed to access your Chroma instance.
Note
The CHROMA_SERVER_CORS_ALLOW_ORIGINS
environment variable is a list of strings. Each string is a URL that is allowed to access your Chroma instance. If you want to allow all hosts to access your Chroma instance, you can set CHROMA_SERVER_CORS_ALLOW_ORIGINS
to [\"*\"]
. This is not recommended for production environments.
The below examples assume that your web app is running on http://localhost:3000
. You can find an example of NextJS and Langchain here.
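Since the variable's value must itself be a valid JSON array of origin strings, building it programmatically with json.dumps avoids quoting mistakes:

```python
import json

# Build the env var value as a proper JSON array of allowed origins.
origins = ["http://localhost:3000"]
value = json.dumps(origins)
print(f"CHROMA_SERVER_CORS_ALLOW_ORIGINS='{value}'")
# CHROMA_SERVER_CORS_ALLOW_ORIGINS='["http://localhost:3000"]'
```

Add further origins to the list as needed; the single quotes around the value keep the shell from mangling the embedded double quotes.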
Using Chroma run:
export CHROMA_SERVER_CORS_ALLOW_ORIGINS='[\"http://localhost:3000\"]'\nchroma run --path /path/to/chroma-data\n
Or with docker:
docker run -e CHROMA_SERVER_CORS_ALLOW_ORIGINS='[\"http://localhost:3000\"]' -v /path/to/chroma-data:/chroma/chroma -p 8000:8000 chromadb/chroma\n
Or in your docker-compose.yml
:
version: '3.9'\n\nnetworks:\n net:\n driver: bridge\n\nservices:\n server:\n image: chromadb/chroma:0.5.0\n volumes:\n # Be aware that indexed data are located in \"/chroma/chroma/\"\n # Default configuration for persist_directory in chromadb/config.py\n # Read more about deployments: https://docs.trychroma.com/deployment\n - chroma-data:/chroma/chroma\n command: \"--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30\"\n environment:\n - IS_PERSISTENT=TRUE\n - CHROMA_SERVER_AUTH_PROVIDER=${CHROMA_SERVER_AUTH_PROVIDER}\n - CHROMA_SERVER_AUTHN_CREDENTIALS_FILE=${CHROMA_SERVER_AUTHN_CREDENTIALS_FILE}\n - CHROMA_SERVER_AUTHN_CREDENTIALS=${CHROMA_SERVER_AUTHN_CREDENTIALS}\n - CHROMA_AUTH_TOKEN_TRANSPORT_HEADER=${CHROMA_AUTH_TOKEN_TRANSPORT_HEADER}\n - PERSIST_DIRECTORY=${PERSIST_DIRECTORY:-/chroma/chroma}\n - CHROMA_OTEL_EXPORTER_ENDPOINT=${CHROMA_OTEL_EXPORTER_ENDPOINT}\n - CHROMA_OTEL_EXPORTER_HEADERS=${CHROMA_OTEL_EXPORTER_HEADERS}\n - CHROMA_OTEL_SERVICE_NAME=${CHROMA_OTEL_SERVICE_NAME}\n - CHROMA_OTEL_GRANULARITY=${CHROMA_OTEL_GRANULARITY}\n - CHROMA_SERVER_NOFILE=${CHROMA_SERVER_NOFILE}\n - CHROMA_SERVER_CORS_ALLOW_ORIGINS=[\"http://localhost:3000\"]\n restart: unless-stopped # possible values are: \"no\", always\", \"on-failure\", \"unless-stopped\"\n ports:\n - \"8000:8000\"\n healthcheck:\n # Adjust below to match your container port\n test: [ \"CMD\", \"curl\", \"-f\", \"http://localhost:8000/api/v1/heartbeat\" ]\n interval: 30s\n timeout: 10s\n retries: 3\n networks:\n - net\n\nvolumes:\n chroma-data:\n driver: local\n
Run docker compose up
to start your Chroma instance.
"},{"location":"strategies/keyword-search/","title":"Keyword Search","text":"Chroma uses SQLite for storing metadata and documents. Additionally documents are indexed using SQLite FTS5 for fast text search.
import chromadb\nfrom chromadb.config import Settings\n\nclient = chromadb.PersistentClient(path=\"test\", settings=Settings(allow_reset=True))\n\nclient.reset()\ncol = client.get_or_create_collection(\"test\")\n\ncol.upsert(ids=[\"1\", \"2\", \"3\"], documents=[\"He is a technology freak and he loves AI topics\", \"AI technology are advancing at a fast pace\", \"Innovation in LLMs is a hot topic\"],metadatas=[{\"author\": \"John Doe\"}, {\"author\": \"Jane Doe\"}, {\"author\": \"John Doe\"}])\ncol.query(query_texts=[\"technology\"], where_document={\"$or\":[{\"$contains\":\"technology\"}, {\"$contains\":\"freak\"}]})\n
The above should return:
{'ids': [['2', '1']],\n 'distances': [[1.052205477809135, 1.3074231535113972]],\n 'metadatas': [[{'author': 'Jane Doe'}, {'author': 'John Doe'}]],\n 'embeddings': None,\n 'documents': [['AI technology are advancing at a fast pace',\n 'He is a technology freak and he loves AI topics']],\n 'uris': None,\n 'data': None}\n
"},{"location":"strategies/memory-management/","title":"Memory Management","text":"This section provided additional info and strategies how to manage memory in Chroma.
"},{"location":"strategies/memory-management/#lru-cache-strategy","title":"LRU Cache Strategy","text":"Out of the box Chroma offers an LRU cache strategy which unloads segments (collections) that are not used while trying to abide to the configured memory usage limits.
To enable the LRU cache the following two settings parameters or environment variables need to be set:
PythonEnvironment Variables from chromadb.config import Settings\n\nsettings = Settings(\n chroma_segment_cache_policy=\"LRU\",\n chroma_memory_limit_bytes=10000000000 # ~10GB\n)\n
export CHROMA_SEGMENT_CACHE_POLICY=LRU\nexport CHROMA_MEMORY_LIMIT_BYTES=10000000000 # ~10GB\n
"},{"location":"strategies/memory-management/#manualcustom-collection-unloading","title":"Manual/Custom Collection Unloading","text":"Local Clients
The below code snippets assume you are working with a PersistentClient
or an EphemeralClient
instance.
At the time of writing (Chroma v0.4.22), Chroma does not allow you to manually unload collections from memory.
Here we provide a simple utility function to help users unload collections from memory.
Internal APIs
The below code relies on Chroma internal APIs which may change in future versions. The snippet below has been tested with Chroma 0.4.24+
.
import gc\nimport os\n\nimport chromadb\nimport psutil\nfrom chromadb.types import SegmentScope\n\n\ndef bytes_to_gb(bytes_value):\n return bytes_value / (1024 ** 3)\n\n\ndef get_process_info():\n pid = os.getpid()\n p = psutil.Process(pid)\n with p.oneshot():\n mem_info = p.memory_info()\n # disk_io = p.io_counters()\n return {\n \"memory_usage\": bytes_to_gb(mem_info.rss),\n }\n\n\ndef unload_index(collection_name: str, chroma_client: chromadb.PersistentClient):\n \"\"\"\n Unloads binary hnsw index from memory and removes both segments (binary and metadata) from the segment cache.\n \"\"\"\n collection = chroma_client.get_collection(collection_name)\n collection_id = collection.id\n segment_manager = chroma_client._server._manager\n for scope in [SegmentScope.VECTOR, SegmentScope.METADATA]:\n if scope in segment_manager.segment_cache:\n cache = segment_manager.segment_cache[scope].cache\n if collection_id in cache:\n segment_manager.callback_cache_evict(cache[collection_id])\n gc.collect()\n
Example Contributed
The above example was enhanced and contributed by Amir
(amdeilami) from our Discord community. We appreciate and encourage his work and contributions to the Chroma community.
Usage Example import chromadb\n\n\nclient = chromadb.PersistentClient(path=\"testds-1M/chroma-data\")\ncol=client.get_collection(\"test\")\nprint(col.count())\ncol.get(limit=1,include=[\"embeddings\"]) # force load the collection into memory\n\nunload_index(\"test\", client)\n
"},{"location":"strategies/privacy/","title":"Privacy Strategies","text":""},{"location":"strategies/privacy/#overview","title":"Overview","text":"TBD
"},{"location":"strategies/privacy/#encryption","title":"Encryption","text":""},{"location":"strategies/privacy/#document-encryption","title":"Document Encryption","text":""},{"location":"strategies/privacy/#client-side-document-encryption","title":"Client-side Document Encryption","text":"See the notebook on client-side document encryption.
"},{"location":"strategies/rebuilding/","title":"Rebuilding Chroma DB","text":""},{"location":"strategies/rebuilding/#rebuilding-a-collection","title":"Rebuilding a Collection","text":"Here are several reasons you might want to rebuild a collection:
- Your metadata or binary index is corrupted or even deleted
- Optimize performance of HNSW index after a large number of updates
WAL Consistency and Backups
Before you proceed, make sure to back up your data. Second, make sure that your WAL contains all the data needed to properly rebuild the collection. For instance, if you are on v0.4.22 or later, you should not have run optimizations or WAL cleanup.
IMPORTANT
Only do this on a stopped Chroma instance.
Find the UUID of the target binary index directory to remove. Typically, the binary index directory is located in the persistent directory and is named after the collection vector segment (in segments
table). You can find the UUID by running the following SQL query:
sqlite3 /path/to/db/chroma.sqlite3 \"select s.id, c.name from segments s join collections c on s.collection=c.id where s.scope='VECTOR';\"\n
The above should print the UUID directories and collection names.
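If you prefer Python over the sqlite3 CLI, the same lookup can be sketched with the standard library (a minimal sketch; the database path shown is an assumption, adjust it to your persistent directory):

```python
import sqlite3


def list_vector_segments(db_path: str) -> list[tuple[str, str]]:
    """Return (segment_uuid, collection_name) pairs for all vector segments."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "select s.id, c.name from segments s "
            "join collections c on s.collection = c.id "
            "where s.scope = 'VECTOR'"
        ).fetchall()
    return rows


# Example (adjust the path to your chroma.sqlite3):
# for seg_id, name in list_vector_segments("/path/to/db/chroma.sqlite3"):
#     print(seg_id, name)
```

Each returned UUID corresponds to a directory in the persistent dir that you can then remove or rename.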
Once you remove/rename the UUID dir, restart Chroma and query your collection like so:
import chromadb\nclient = chromadb.HttpClient() # Adjust as per your client\nres = client.get_collection(\"my_collection\").get(limit=1,include=['embeddings'])\n
Chroma will recreate your collection from the WAL.
Rebuilding the collection
Depending on how large your collection is, this process can take a while.
"},{"location":"strategies/time-based-queries/","title":"Time-based Queries","text":""},{"location":"strategies/time-based-queries/#filtering-documents-by-timestamps","title":"Filtering Documents By Timestamps","text":"In the example below, we create a collection with 100 documents, each with a random timestamp in the last two weeks. We then query the collection for documents that were created in the last week.
The example demonstrates how Chroma metadata can be leveraged to filter documents based on how recently they were added or updated.
import uuid\nimport chromadb\n\nimport datetime\nimport random\n\nnow = datetime.datetime.now()\ntwo_weeks_ago = now - datetime.timedelta(days=14)\n\ndates = [\n two_weeks_ago + datetime.timedelta(days=random.randint(0, 14))\n for _ in range(100)\n]\ndates = [int(date.timestamp()) for date in dates]\n\n# convert epoch seconds to iso format\n\ndef iso_date(epoch_seconds): return datetime.datetime.fromtimestamp(\n epoch_seconds).isoformat()\n\nclient = chromadb.EphemeralClient()\n\ncol = client.get_or_create_collection(\"test\")\n\ncol.add(ids=[f\"{uuid.uuid4()}\" for _ in range(100)], documents=[\n f\"document {i}\" for i in range(100)], metadatas=[{\"date\": date} for date in dates])\n\nres = col.get(where={\"date\": {\"$gt\": (now - datetime.timedelta(days=7)).timestamp()}})\n\nfor i in res['metadatas']:\n print(iso_date(i['date']))\n
Ref: https://gist.github.com/tazarov/3c9301d22ab863dca0b6fb1e5e3511b1
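The core of the $gt filter above is plain epoch arithmetic, which can be checked without Chroma. The sketch below uses a hypothetical filter_newer_than helper (not part of the example above) with a fixed "now" so the result is deterministic:

```python
import datetime


def filter_newer_than(epochs, days, now=None):
    """Keep epoch-second timestamps that fall within the last `days` days."""
    now = now or datetime.datetime.now()
    cutoff = (now - datetime.timedelta(days=days)).timestamp()
    return [e for e in epochs if e > cutoff]


now = datetime.datetime(2024, 5, 28, 12, 0, 0)
# Timestamps 1, 5, 10, and 13 days old.
epochs = [int((now - datetime.timedelta(days=d)).timestamp()) for d in (1, 5, 10, 13)]
recent = filter_newer_than(epochs, days=7, now=now)
print(len(recent))  # 2 — only the 1- and 5-day-old timestamps pass the 7-day cutoff
```

This is exactly the comparison Chroma performs server-side for the `{"date": {"$gt": ...}}` metadata filter.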
"},{"location":"strategies/multi-tenancy/","title":"Multi-Tenancy Strategies","text":""},{"location":"strategies/multi-tenancy/#introduction","title":"Introduction","text":"Some deployment settings of Chroma may require multi-tenancy support. This document outlines the strategies for multi-tenancy approaches in Chroma.
"},{"location":"strategies/multi-tenancy/#approaches","title":"Approaches","text":" - Naive approach - This is a simple approach puts the onus of enforcing multi-tenancy on the application. It is the simplest approach to implement, but is not very well suited for production environments.
- Multi-User Basic Auth - This article provides a stepping stone to more advanced multi-tenancy where the Chroma authentication allows for multiple users to access the same Chroma instance with their own credentials.
- Authorization Model with OpenFGA - Implement an advanced authorization model with OpenFGA.
- Implementing OpenFGA Authorization Model In Chroma - Learn how to implement OpenFGA authorization model in Chroma with full code example.
"},{"location":"strategies/multi-tenancy/authorization-model-impl-with-openfga/","title":"Implementing OpenFGA Authorization Model In Chroma","text":"Source Code
The source code for this article can be found here.
"},{"location":"strategies/multi-tenancy/authorization-model-impl-with-openfga/#preparation","title":"Preparation","text":"To make things useful we also introduce an initial tuple set with permissions which will allows us to test the authorization model.
We define three users:
- admin
, part of the chroma
team, as owner
- user1
, part of the chroma
team, as reader
- admin-ext
, part of the external
team, as owner
We will give these three users and their respective teams enough permissions to create and delete collections and to add, remove, get, and query records, each in line with their role within the team: owner
has access to all API actions, while reader
can only read (list, get, and query).
Abbreviated Example
We have removed some of the data from the example for brevity. The full tuple set can be found under data/data/initial-data.json
[\n {\n \"object\": \"team:chroma\",\n \"relation\": \"owner\",\n \"user\": \"user:admin\"\n },\n {\n \"object\": \"team:chroma\",\n \"relation\": \"reader\",\n \"user\": \"user:user1\"\n },\n {\n \"object\": \"team:external\",\n \"relation\": \"owner\",\n \"user\": \"user:admin-ext\"\n },\n {\n \"object\": \"server:localhost\",\n \"relation\": \"can_get_tenant\",\n \"user\": \"team:chroma#owner\"\n },\n {\n \"object\": \"tenant:default_tenant-default_database\",\n \"relation\": \"can_get_database\",\n \"user\": \"team:chroma#owner\"\n },\n {\n \"object\": \"database:default_tenant-default_database\",\n \"relation\": \"can_create_collection\",\n \"user\": \"team:chroma#owner\"\n },\n {\n \"object\": \"database:default_tenant-default_database\",\n \"relation\": \"can_list_collections\",\n \"user\": \"team:chroma#owner\"\n },\n {\n \"object\": \"database:default_tenant-default_database\",\n \"relation\": \"can_get_or_create_collection\",\n \"user\": \"team:chroma#owner\"\n },\n {\n \"object\": \"database:default_tenant-default_database\",\n \"relation\": \"can_count_collections\",\n \"user\": \"team:chroma#owner\"\n }\n]\n
"},{"location":"strategies/multi-tenancy/authorization-model-impl-with-openfga/#testing-the-model","title":"Testing the model","text":"Let\u2019s spin up a quick docker compose to test our setup. In the repo we have provided openfga/docker-compose.openfga-standalone.yaml
docker compose -f openfga/docker-compose.openfga-standalone.yaml up\n
For this next part, ensure you have the FGA CLI installed.
Once the containers are up and running let\u2019s create a store and import the model:
export FGA_API_URL=http://localhost:8082 # our OpenFGA binds to 8082 on localhost\nfga store create --model data/models/model-article-p4.fga --name chromadb-auth\n
You should see a response like this:
{\n \"store\": {\n \"created_at\": \"2024-04-09T18:37:26.367747Z\",\n \"id\": \"01HV3VB347NPY3NMX6VQ5N2E23\",\n \"name\": \"chromadb-auth\",\n \"updated_at\": \"2024-04-09T18:37:26.367747Z\"\n },\n \"model\": {\n \"authorization_model_id\": \"01HV3VB34JAXWF0F3C00DFBZV4\"\n }\n}\n
Let\u2019s import our initial tuple set. Before that make sure to export FGA_STORE_ID
and FGA_MODEL_ID
as per the output of the previous command:
export FGA_STORE_ID=01HV3VB347NPY3NMX6VQ5N2E23\nexport FGA_MODEL_ID=01HV3VB34JAXWF0F3C00DFBZV4\nfga tuple write --file data/data/initial-data.json\n
Let\u2019s test our imported model and tuples:
fga query check user:admin can_get_preflight server:localhost\n
If everything is working you should see this:
{\n \"allowed\": true,\n \"resolution\": \"\"\n}\n
"},{"location":"strategies/multi-tenancy/authorization-model-impl-with-openfga/#implementing-authorization-plumbing-in-chroma","title":"Implementing Authorization Plumbing in Chroma","text":"First we will start with making a few small changes to the authorization plugin we\u2019ve made. Why you ask? We need to introduce teams (aka groups). For that we\u2019ll resort to standard Apache groupfile
as follows:
chroma: admin, user1\nexternal: admin-ext\n
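The format can be illustrated with a small standalone parser that mirrors what the plugin below does. Note that parse_groupfile is a hypothetical helper for illustration, not part of Chroma:

```python
def parse_groupfile(text: str) -> dict[str, str]:
    """Map each user to its group, given <group>: <user1>, <user2>, ... lines."""
    user_group = {}
    for line in text.strip().splitlines():
        raw = line.strip().split(":")
        if len(raw) < 2:
            raise ValueError("Invalid group line. Must be <groupname>: <user1>, <user2>, ...")
        group, users = raw[0], [u.strip() for u in raw[1].split(",")]
        for user in users:
            user_group.setdefault(user, group)  # first group wins, as in the plugin
    return user_group


print(parse_groupfile("chroma: admin, user1\nexternal: admin-ext"))
# {'admin': 'chroma', 'user1': 'chroma', 'admin-ext': 'external'}
```

The resulting user-to-team map is what the credentials provider attaches to each authenticated identity.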
The groupfile
will be mounted to our Chroma container and read by the multi-user basic auth plugin. The changes to the authentication plugin are as follows:
# imports as before\n\n@register_provider(\"multi_user_htpasswd_file\")\nclass MultiUserHtpasswdFileServerAuthCredentialsProvider(ServerAuthCredentialsProvider):\n _creds: Dict[str, SecretStr] # contains user:password-hash\n\n def __init__(self, system: System) -> None:\n super().__init__(system)\n try:\n self.bc = importlib.import_module(\"bcrypt\")\n except ImportError:\n raise ValueError(\n \"The bcrypt python package is not installed. \"\n \"Please install it with `pip install bcrypt`\"\n )\n system.settings.require(\"chroma_server_auth_credentials_file\")\n _file = str(system.settings.chroma_server_auth_credentials_file)\n ... # as before\n _basepath = path.dirname(_file)\n self._user_group_map = dict()\n if path.exists(path.join(_basepath, \"groupfile\")):\n _groups = dict()\n with open(path.join(_basepath, \"groupfile\"), \"r\") as f:\n for line in f:\n _raw_group = [v for v in line.strip().split(\":\")]\n if len(_raw_group) < 2:\n raise ValueError(\n \"Invalid Htpasswd group file found in \"\n f\"[{path.join(_basepath, 'groupfile')}]. \"\n \"Must be <groupname>:<username1>,<username2>,...,<usernameN>.\"\n )\n _groups[_raw_group[0]] = [u.strip() for u in _raw_group[1].split(\",\")]\n for _group, _users in _groups.items():\n for _user in _users:\n if _user not in self._user_group_map:\n self._user_group_map[_user] = _group\n\n @trace_method( # type: ignore\n \"MultiUserHtpasswdFileServerAuthCredentialsProvider.validate_credentials\",\n OpenTelemetryGranularity.ALL,\n )\n @override\n def validate_credentials(self, credentials: AbstractCredentials[T]) -> bool:\n ...
# as before\n\n @override\n def get_user_identity(\n self, credentials: AbstractCredentials[T]\n ) -> Optional[SimpleUserIdentity]:\n _creds = cast(Dict[str, SecretStr], credentials.get_credentials())\n if _creds[\"username\"].get_secret_value() in self._user_group_map.keys():\n return SimpleUserIdentity(\n _creds[\"username\"].get_secret_value(),\n attributes={\n \"team\": self._user_group_map[_creds[\"username\"].get_secret_value()]\n },\n )\n return SimpleUserIdentity(_creds[\"username\"].get_secret_value(), attributes={\"team\": \"public\"})\n
Full code
The code can be found under chroma_auth/authn/basic/__init__.py
We read the group file and for each user create a key in self._user_group_map
to specify the group or team of that user. The information is returned as user identity attributes, which are then used by the authz plugin.
Now let\u2019s turn our attention to the authorization plugin. First, let\u2019s start with what we\u2019re trying to achieve with it:
- Handle OpenFGA configuration from the import of the model as per the snippet above. This will help us wire all necessary parts of the code with the correct authorization model configuration.
- Map all existing Chroma authorization actions to our authorization model
- Adapt any shortcomings or quirks in Chroma authorization to the way OpenFGA works
- Implement the Enforcement Point (EP) logic
- Implement an OpenFGA Permissions API wrapper - a utility class that will help us create and maintain the OpenFGA tuples throughout collections\u2019 lifecycle.
We\u2019ve split the implementation across three files:
chroma_auth/authz/openfga/__init__.py
- Storing our OpenFGA authorization configuration reader and our authorization plugin that adapts to Chroma authz model and enforces authorization decisions chroma_auth/authz/openfga/openfga_permissions.py
- Holds our OpenFGA permissions update logic. chroma_auth/instr/__init__.py
- holds our adapted FastAPI server from Chroma 0.4.24
. While the authz plugin system in Chroma makes it easy to enforce authorization decisions, updating permissions does require us to go down this rabbit hole. Don\u2019t worry, the actual changes are minimal.
Let\u2019s cover things in a little more detail.
Reading the configuration.
@register_provider(\"openfga_config_provider\")\nclass OpenFGAAuthorizationConfigurationProvider(\n ServerAuthorizationConfigurationProvider[ClientConfiguration]\n):\n _config_file: str\n _config: ClientConfiguration\n\n def __init__(self, system: System) -> None:\n super().__init__(system)\n self._settings = system.settings\n if \"FGA_API_URL\" not in os.environ:\n raise ValueError(\"FGA_API_URL not set\")\n self._config = self._try_load_from_file()\n\n # TODO in the future we can also add credentials (preshared) or OIDC\n\n def _try_load_from_file(self) -> ClientConfiguration:\n store_id = None\n model_id = None\n if \"FGA_STORE_ID\" in os.environ and \"FGA_MODEL_ID\" in os.environ:\n return ClientConfiguration(\n api_url=os.environ.get(\"FGA_API_URL\"),\n store_id=os.environ[\"FGA_STORE_ID\"],\n authorization_model_id=os.environ[\"FGA_MODEL_ID\"],\n )\n if \"FGA_CONFIG_FILE\" not in os.environ and not store_id and not model_id:\n raise ValueError(\"FGA_CONFIG_FILE or FGA_STORE_ID/FGA_MODEL_ID env vars not set\")\n with open(os.environ[\"FGA_CONFIG_FILE\"], \"r\") as f:\n config = json.load(f)\n return ClientConfiguration(\n api_url=os.environ.get(\"FGA_API_URL\"),\n store_id=config[\"store\"][\"id\"],\n authorization_model_id=config[\"model\"][\"authorization_model_id\"],\n )\n\n @override\n def get_configuration(self) -> ClientConfiguration:\n return self._config\n
This is a pretty simple and straightforward implementation that will either take env variables for the FGA server URL, store, and model, or take only the server URL plus a JSON configuration file (the same as above).
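That precedence logic can be sketched as a pure function over an environment mapping. resolve_fga_config below is a hypothetical helper for illustration (not part of the repo), but it mirrors the provider's behavior: explicit store/model env vars win, otherwise fall back to the JSON file produced by the store import:

```python
import json


def resolve_fga_config(env: dict) -> dict:
    """Resolve OpenFGA store/model config: env vars first, JSON file second."""
    if "FGA_API_URL" not in env:
        raise ValueError("FGA_API_URL not set")
    if "FGA_STORE_ID" in env and "FGA_MODEL_ID" in env:
        return {
            "api_url": env["FGA_API_URL"],
            "store_id": env["FGA_STORE_ID"],
            "authorization_model_id": env["FGA_MODEL_ID"],
        }
    if "FGA_CONFIG_FILE" not in env:
        raise ValueError("FGA_CONFIG_FILE or FGA_STORE_ID/FGA_MODEL_ID env vars not set")
    with open(env["FGA_CONFIG_FILE"]) as f:
        config = json.load(f)
    return {
        "api_url": env["FGA_API_URL"],
        "store_id": config["store"]["id"],
        "authorization_model_id": config["model"]["authorization_model_id"],
    }


print(resolve_fga_config({
    "FGA_API_URL": "http://localhost:8082",
    "FGA_STORE_ID": "01HV3VB347NPY3NMX6VQ5N2E23",
    "FGA_MODEL_ID": "01HV3VB34JAXWF0F3C00DFBZV4",
}))
```

The JSON-file branch is what makes the docker compose setup below work: the import job writes the store description to a file the server then reads.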
Next let\u2019s have a look at our OpenFGAAuthorizationProvider
implementation. We\u2019ll start with the constructor where we adapt existing Chroma authorization actions to our model:
def __init__(self, system: System) -> None:\n # more code here, but we're skipping for brevity\n self._authz_to_model_action_map = {\n AuthzResourceActions.CREATE_DATABASE.value: \"can_create_database\",\n AuthzResourceActions.GET_DATABASE.value: \"can_get_database\",\n AuthzResourceActions.CREATE_TENANT.value: \"can_create_tenant\",\n AuthzResourceActions.GET_TENANT.value: \"can_get_tenant\",\n AuthzResourceActions.LIST_COLLECTIONS.value: \"can_list_collections\",\n AuthzResourceActions.COUNT_COLLECTIONS.value: \"can_count_collections\",\n AuthzResourceActions.GET_COLLECTION.value: \"can_get_collection\",\n AuthzResourceActions.CREATE_COLLECTION.value: \"can_create_collection\",\n AuthzResourceActions.GET_OR_CREATE_COLLECTION.value: \"can_get_or_create_collection\",\n AuthzResourceActions.DELETE_COLLECTION.value: \"can_delete_collection\",\n AuthzResourceActions.UPDATE_COLLECTION.value: \"can_update_collection\",\n AuthzResourceActions.ADD.value: \"can_add_records\",\n AuthzResourceActions.DELETE.value: \"can_delete_records\",\n AuthzResourceActions.GET.value: \"can_get_records\",\n AuthzResourceActions.QUERY.value: \"can_query_records\",\n AuthzResourceActions.COUNT.value: \"can_count_records\",\n AuthzResourceActions.UPDATE.value: \"can_update_records\",\n AuthzResourceActions.UPSERT.value: \"can_upsert_records\",\n AuthzResourceActions.RESET.value: \"can_reset\",\n }\n\n self._authz_to_model_object_map = {\n AuthzResourceTypes.DB.value: \"database\",\n AuthzResourceTypes.TENANT.value: \"tenant\",\n AuthzResourceTypes.COLLECTION.value: \"collection\",\n }\n
The above is located in chroma_auth/authz/openfga/__init__.py
The above is a fairly straightforward mapping between AuthzResourceActions
from Chroma\u2019s auth framework and the relations (aka actions) we\u2019ve defined in our model above. Next, we also map the AuthzResourceTypes
to OpenFGA objects. This seems pretty simple, right? Wrong. Things are not so tidy, and nothing exhibits this more than our next portion, which takes the action and resource and returns the object and relation to be checked:
def resolve_resource_action(self, resource: AuthzResource, action: AuthzAction) -> tuple:\n attrs = \"\"\n tenant = None\n database = None\n if \"tenant\" in resource.attributes:\n attrs += f\"{resource.attributes['tenant']}\"\n tenant = resource.attributes['tenant']\n if \"database\" in resource.attributes:\n attrs += f\"-{resource.attributes['database']}\"\n database = resource.attributes['database']\n if action.id == AuthzResourceActions.GET_TENANT.value or action.id == AuthzResourceActions.CREATE_TENANT.value:\n return \"server:localhost\", self._authz_to_model_action_map[action.id]\n if action.id == AuthzResourceActions.GET_DATABASE.value or action.id == AuthzResourceActions.CREATE_DATABASE.value:\n return f\"tenant:{attrs}\", self._authz_to_model_action_map[action.id]\n if action.id == AuthzResourceActions.CREATE_COLLECTION.value:\n try:\n cole_exists = self._api.get_collection(\n resource.id, tenant=tenant, database=database\n )\n return f\"collection:{attrs}-{cole_exists.name}\", self._authz_to_model_action_map[\n AuthzResourceActions.GET_COLLECTION.value]\n except Exception as e:\n return f\"{self._authz_to_model_object_map[resource.type]}:{attrs}\", self._authz_to_model_action_map[\n action.id]\n if resource.id == \"*\":\n return f\"{self._authz_to_model_object_map[resource.type]}:{attrs}\", self._authz_to_model_action_map[action.id]\n else:\n return (f\"{self._authz_to_model_object_map[resource.type]}:{attrs}-{resource.id}\",\n self._authz_to_model_action_map[action.id])\n
Full code
The above is located in chroma_auth/authz/openfga/__init__.py
The resolve_resource_action
function demonstrates the idiosyncrasies of Chroma\u2019s auth. I have only myself to blame. The key takeaway is that there is room for improvement.
The actual authorization enforcement is then dead simple:
def authorize(self, context: AuthorizationContext) -> bool:\n with OpenFgaClient(self._authz_config_provider.get_configuration()) as fga_client:\n try:\n obj, act = self.resolve_resource_action(resource=context.resource, action=context.action)\n resp = fga_client.check(body=ClientCheckRequest(\n user=f\"user:{context.user.id}\",\n relation=act,\n object=obj,\n ))\n # openfga_sdk.models.check_response.CheckResponse\n return resp.allowed\n except Exception as e:\n logger.error(f\"Error while authorizing: {str(e)}\")\n return False\n
Finally, we\u2019ll look at our permissions API wrapper. While a full-blown solution would implement all possible object lifecycle hooks, we\u2019re content with collections. Therefore we\u2019ll add lifecycle callbacks for creating and deleting collections (we\u2019re not considering sharing of the collection with other users or change of ownership). So what might our create collection hook look like, you ask?
def create_collection_permissions(self, collection: Collection, request: Request) -> None:\n if not hasattr(request.state, \"user_identity\"):\n return\n identity = request.state.user_identity # AuthzUser\n tenant = request.query_params.get(\"tenant\")\n database = request.query_params.get(\"database\")\n _object = f\"collection:{tenant}-{database}-{collection.id}\"\n _object_for_get_collection = f\"collection:{tenant}-{database}-{collection.name}\" # this is a bug in the Chroma Authz that feeds in the name of the collection instead of ID\n _user = f\"team:{identity.get_user_attributes()['team']}#owner\" if identity.get_user_attributes() and \"team\" in identity.get_user_attributes() else f\"user:{identity.get_user_id()}\"\n _user_writer = f\"team:{identity.get_user_attributes()['team']}#writer\" if identity.get_user_attributes() and \"team\" in identity.get_user_attributes() else None\n _user_reader = f\"team:{identity.get_user_attributes()['team']}#reader\" if identity.get_user_attributes() and \"team\" in identity.get_user_attributes() else None\n with OpenFgaClient(self._fga_configuration) as fga_client:\n fga_client.write_tuples(\n body=[\n ClientTuple(_user, \"can_add_records\", _object),\n ClientTuple(_user, \"can_delete_records\", _object),\n ClientTuple(_user, \"can_update_records\", _object),\n ClientTuple(_user, \"can_get_records\", _object),\n ClientTuple(_user, \"can_upsert_records\", _object),\n ClientTuple(_user, \"can_count_records\", _object),\n ClientTuple(_user, \"can_query_records\", _object),\n ClientTuple(_user, \"can_get_collection\", _object_for_get_collection),\n ClientTuple(_user, \"can_delete_collection\", _object_for_get_collection),\n ClientTuple(_user, \"can_update_collection\", _object),\n ]\n )\n if _user_writer:\n fga_client.write_tuples(\n body=[\n ClientTuple(_user_writer, \"can_add_records\", _object),\n ClientTuple(_user_writer, \"can_delete_records\", _object),\n ClientTuple(_user_writer, \"can_update_records\", _object),\n 
ClientTuple(_user_writer, \"can_get_records\", _object),\n ClientTuple(_user_writer, \"can_upsert_records\", _object),\n ClientTuple(_user_writer, \"can_count_records\", _object),\n ClientTuple(_user_writer, \"can_query_records\", _object),\n ClientTuple(_user_writer, \"can_get_collection\", _object_for_get_collection),\n ClientTuple(_user_writer, \"can_delete_collection\", _object_for_get_collection),\n ClientTuple(_user_writer, \"can_update_collection\", _object),\n ]\n )\n if _user_reader:\n fga_client.write_tuples(\n body=[\n ClientTuple(_user_reader, \"can_get_records\", _object),\n ClientTuple(_user_reader, \"can_query_records\", _object),\n ClientTuple(_user_reader, \"can_count_records\", _object),\n ClientTuple(_user_reader, \"can_get_collection\", _object_for_get_collection),\n ]\n )\n
Full code
You can find the full code in chroma_auth/authz/openfga/openfga_permissions.py
Looks pretty straightforward, but hold on, I hear a thought creeping into your mind: \u201cWhy are you adding roles manually?\u201d
You are right, it lacks that DRY-je-ne-sais-quoi, and I\u2019m happy to keep it simple and explicit. A more mature implementation could read the model, figure out what type we\u2019re adding permissions for, and then add the requisite users for each relation, but premature optimization is difficult to put in an article that won\u2019t turn into a book.
With the above code we assume that the collection doesn\u2019t exist, ergo its permission tuples don\u2019t exist. (OpenFGA will fail to add tuples that already exist, and there is no way around it other than deleting them first.) Remember, permission tuple lifecycle is your responsibility when adding authz to your application.
The delete is oddly similar (that\u2019s why we\u2019ve skipped the bulk of it):
def delete_collection_permissions(self, collection: Collection, request: Request) -> None:\n if not hasattr(request.state, \"user_identity\"):\n return\n identity = request.state.user_identity\n\n _object = f\"collection:{collection.tenant}-{collection.database}-{collection.id}\"\n _object_for_get_collection = f\"collection:{collection.tenant}-{collection.database}-{collection.name}\" # this is a bug in the Chroma Authz that feeds in the name of the collection instead of ID\n _user = f\"team:{identity.get_user_attributes()['team']}#owner\" if identity.get_user_attributes() and \"team\" in identity.get_user_attributes() else f\"user:{identity.get_user_id()}\"\n _user_writer = f\"team:{identity.get_user_attributes()['team']}#writer\" if identity.get_user_attributes() and \"team\" in identity.get_user_attributes() else None\n _user_reader = f\"team:{identity.get_user_attributes()['team']}#reader\" if identity.get_user_attributes() and \"team\" in identity.get_user_attributes() else None\n with OpenFgaClient(self._fga_configuration) as fga_client:\n fga_client.delete_tuples(\n body=[\n ClientTuple(_user, \"can_add_records\", _object),\n ClientTuple(_user, \"can_delete_records\", _object),\n ClientTuple(_user, \"can_update_records\", _object),\n ClientTuple(_user, \"can_get_records\", _object),\n ClientTuple(_user, \"can_upsert_records\", _object),\n ClientTuple(_user, \"can_count_records\", _object),\n ClientTuple(_user, \"can_query_records\", _object),\n ClientTuple(_user, \"can_get_collection\", _object_for_get_collection),\n ClientTuple(_user, \"can_delete_collection\", _object_for_get_collection),\n ClientTuple(_user, \"can_update_collection\", _object),\n ]\n )\n # more code in the repo\n
Full code
You can find the full code in chroma_auth/authz/openfga/openfga_permissions.py
Let\u2019s turn our attention to the last piece of code: the necessary evil of updating the FastAPI server in Chroma to add our Permissions API hooks. We start simple by injecting our component using Chroma\u2019s DI (dependency injection).
from chroma_auth.authz.openfga.openfga_permissions import OpenFGAPermissionsAPI\n\nself._permissionsApi: OpenFGAPermissionsAPI = self._system.instance(OpenFGAPermissionsAPI)\n
Then we add a hook for collection creation:
def create_collection(\n self,\n request: Request,\n collection: CreateCollection,\n tenant: str = DEFAULT_TENANT,\n database: str = DEFAULT_DATABASE,\n) -> Collection:\n existing = None\n try:\n existing = self._api.get_collection(collection.name, tenant=tenant, database=database)\n except ValueError as e:\n if \"does not exist\" not in str(e):\n raise e\n collection = self._api.create_collection(\n name=collection.name,\n metadata=collection.metadata,\n get_or_create=collection.get_or_create,\n tenant=tenant,\n database=database,\n )\n if not existing:\n self._permissionsApi.create_collection_permissions(collection=collection, request=request)\n return collection\n
Full code
You can find the full code in chroma_auth/instr/__init__.py
And one for collection removal:
def delete_collection(\n self,\n request: Request,\n collection_name: str,\n tenant: str = DEFAULT_TENANT,\n database: str = DEFAULT_DATABASE,\n) -> None:\n collection = self._api.get_collection(collection_name, tenant=tenant, database=database)\n resp = self._api.delete_collection(\n collection_name, tenant=tenant, database=database\n )\n\n self._permissionsApi.delete_collection_permissions(collection=collection, request=request)\n return resp\n
Full code
You can find the full code in chroma_auth/instr/__init__.py
The key thing to observe about the above snippets is that we invoke the permissions API only once we\u2019re sure things have been persisted in the DB. I know, I know, atomicity here is also important, but that is for another article. Just keep in mind that it is easier to fix broken permissions than broken data.
I promise this was the last bit of python code you\u2019ll see in this article.
"},{"location":"strategies/multi-tenancy/authorization-model-impl-with-openfga/#the-infra","title":"The Infra","text":"Infrastructure!!! Finally, a sigh of relieve.
Let\u2019s draw a diagrams:
Link
We have our Chroma server, which relies on OpenFGA, which persists data in PostgreSQL. \u201cOk, but \u2026\u201d, I can see you scratching your head, \u201c\u2026 how do I bring this magnificent architecture to life?\u201d. I thought you\u2019d never ask. We\u2019ll rely on our trusty docker compose skills with the following sequence in mind:
\u201cWhere is the docker-compose.yaml
!\u201d. Voil\u00e0, my impatient friends:
version: '3.9'\n\nnetworks:\n net:\n driver: bridge\n\nservices:\n server:\n depends_on:\n openfga:\n condition: service_healthy\n import:\n condition: service_completed_successfully\n image: chroma-server\n build:\n dockerfile: Dockerfile\n volumes:\n - ./chroma-data:/chroma/chroma\n - ./server.htpasswd:/chroma/server.htpasswd\n - ./groupfile:/chroma/groupfile\n - ./data/:/data\n command: \"--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30\"\n environment:\n - IS_PERSISTENT=TRUE\n - CHROMA_SERVER_AUTH_PROVIDER=${CHROMA_SERVER_AUTH_PROVIDER}\n - CHROMA_SERVER_AUTH_CREDENTIALS_FILE=${CHROMA_SERVER_AUTH_CREDENTIALS_FILE}\n - CHROMA_SERVER_AUTH_CREDENTIALS=${CHROMA_SERVER_AUTH_CREDENTIALS}\n - CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=${CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER}\n - CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER=${CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER}\n - PERSIST_DIRECTORY=${PERSIST_DIRECTORY:-/chroma/chroma}\n - CHROMA_OTEL_EXPORTER_ENDPOINT=${CHROMA_OTEL_EXPORTER_ENDPOINT}\n - CHROMA_OTEL_EXPORTER_HEADERS=${CHROMA_OTEL_EXPORTER_HEADERS}\n - CHROMA_OTEL_SERVICE_NAME=${CHROMA_OTEL_SERVICE_NAME}\n - CHROMA_OTEL_GRANULARITY=${CHROMA_OTEL_GRANULARITY}\n - CHROMA_SERVER_NOFILE=${CHROMA_SERVER_NOFILE}\n - CHROMA_SERVER_AUTHZ_PROVIDER=${CHROMA_SERVER_AUTHZ_PROVIDER}\n - CHROMA_SERVER_AUTHZ_CONFIG_PROVIDER=${CHROMA_SERVER_AUTHZ_CONFIG_PROVIDER}\n - FGA_API_URL=http://openfga:8080\n - FGA_CONFIG_FILE=/data/store.json # we expect that the import job will create this file\n restart: unless-stopped # possible values are: \"no\", always\", \"on-failure\", \"unless-stopped\"\n ports:\n - \"8000:8000\"\n healthcheck:\n # Adjust below to match your container port\n test: [ \"CMD\", \"curl\", \"-f\", \"http://localhost:8000/api/v1/heartbeat\" ]\n interval: 30s\n timeout: 10s\n retries: 3\n networks:\n - net\n postgres:\n image: postgres:14\n container_name: postgres\n networks:\n - net\n ports:\n - 
\"5432:5432\"\n environment:\n - POSTGRES_USER=postgres\n - POSTGRES_PASSWORD=password\n healthcheck:\n test: [ \"CMD-SHELL\", \"pg_isready -U postgres\" ]\n interval: 5s\n timeout: 5s\n retries: 5\n volumes:\n - postgres_data_openfga:/var/lib/postgresql/data\n\n migrate:\n depends_on:\n postgres:\n condition: service_healthy\n image: openfga/openfga:latest\n container_name: migrate\n command: migrate\n environment:\n - OPENFGA_DATASTORE_ENGINE=postgres\n - OPENFGA_DATASTORE_URI=postgres://postgres:password@postgres:5432/postgres?sslmode=disable\n networks:\n - net\n openfga:\n depends_on:\n migrate:\n condition: service_completed_successfully\n image: openfga/openfga:latest\n container_name: openfga\n environment:\n - OPENFGA_DATASTORE_ENGINE=postgres\n - OPENFGA_DATASTORE_URI=postgres://postgres:password@postgres:5432/postgres?sslmode=disable\n - OPENFGA_LOG_FORMAT=json\n command: run\n networks:\n - net\n ports:\n # Needed for the http server\n - \"8082:8080\"\n # Needed for the grpc server (if used)\n - \"8083:8081\"\n # Needed for the playground (Do not enable in prod!)\n - \"3003:3000\"\n healthcheck:\n test: [ \"CMD\", \"/usr/local/bin/grpc_health_probe\", \"-addr=openfga:8081\" ]\n interval: 5s\n timeout: 30s\n retries: 3\n import:\n depends_on:\n openfga:\n condition: service_healthy\n image: fga-cli\n build:\n context: .\n dockerfile: Dockerfile-fgacli\n container_name: import\n volumes:\n - ./data/:/data\n command: |\n /bin/sh -c \"/data/create_store_and_import.sh\"\n environment:\n - FGA_SERVER_URL=http://openfga:8080\n networks:\n - net\nvolumes:\n postgres_data_openfga:\n driver: local\n
Don\u2019t forget to create an .env
file:
CHROMA_SERVER_AUTH_PROVIDER=\"chromadb.auth.basic.BasicAuthServerProvider\"\nCHROMA_SERVER_AUTH_CREDENTIALS_FILE=\"server.htpasswd\"\nCHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=\"chroma_auth.authn.basic.MultiUserHtpasswdFileServerAuthCredentialsProvider\"\nCHROMA_SERVER_AUTHZ_PROVIDER=\"chroma_auth.authz.openfga.OpenFGAAuthorizationProvider\"\nCHROMA_SERVER_AUTHZ_CONFIG_PROVIDER=\"chroma_auth.authz.openfga.OpenFGAAuthorizationConfigurationProvider\"\n
Update your server.htpasswd
to include the new user:
admin:$2y$05$vkBK4b1Vk5O98jNHgr.uduTJsTOfM395sKEKe48EkJCVPH/MBIeHK\nuser1:$2y$05$UQ0kC2x3T2XgeN4WU12BdekUwCJmLjJNhMaMtFNolYdj83OqiEpVu\nadmin-ext:$2y$05$9.L13wKQTHeXz9IH2UO2RurWEK./Z24qapzyi6ywQGJds2DaC36C2\n
And the groupfile
from before. And don\u2019t forget to take a look at the import script under data/create_store_and_import.sh
Run the following command at the root of the repo and let things fail and burn down (or, in the event this works, wow you; disclaimer: it worked on my machine):
docker compose up --build\n
"},{"location":"strategies/multi-tenancy/authorization-model-impl-with-openfga/#tests-who-needs-test-when-you-have-stable-infra","title":"Tests, who needs tests when you have stable infra!","text":"Authorization is serious stuff, which is why we\u2019ve created a bare minimum set of tests to prove we\u2019re not totally wrong about it!
Real Serious Note
Serious Note: Take these things seriously and write copious amounts of tests before rolling out things to prod. Don\u2019t become an OWASP Top 10 \u201cHero\u201d. Broken access control is a thing that WILL keep you up at night.
We\u2019ll focus on three areas:
- Testing admin (owner) access
- Testing team access for owner and reader roles
- Testing cross-team permissions
Admin Access
Simple check to ensure that whoever created the collection (aka the owner) is allowed all actions.
import uuid\nimport chromadb\nfrom chromadb.config import Settings\n\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"admin:password123\"))\nclient.heartbeat() # this should work with or without authentication - it is a public endpoint\nclient.list_collections() # this is a protected endpoint and requires authentication\n\ncol = client.get_or_create_collection(f\"test_collection-{str(uuid.uuid4())}\")\ncol.add(ids=[\"1\"], documents=[\"test doc\"])\n\ncol.get()\ncol.update(ids=[\"1\"], documents=[\"test doc 2\"])\ncol.count()\ncol.upsert(ids=[\"1\"], documents=[\"test doc 3\"])\ncol.delete(ids=[\"1\"])\n\nclient.delete_collection(col.name)\n
Full code
You can find the full code in test_auth.ipynb
Team Access
Team access tests whether roles and permissions associated with those roles are correctly enforced.
import uuid\nimport chromadb\nfrom chromadb.config import Settings\n\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"admin:password123\"))\nclient.heartbeat() # this should work with or without authentication - it is a public endpoint\nclient.list_collections() # this is a protected endpoint and requires authentication\n\ncol_name = f\"test_collection-{str(uuid.uuid4())}\"\ncol = client.get_or_create_collection(col_name)\nprint(f\"Creating collection {col.id}\")\ncol.add(ids=[\"1\"], documents=[\"test doc\"])\n\nclient.get_collection(col_name)\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"user1:password123\"))\n\nclient.heartbeat() # this should work with or without authentication - it is a public endpoint\nclient.list_collections() # this is a protected endpoint and requires authentication\nclient.count_collections()\nprint(\"Getting collection \" + col_name)\ncol = client.get_collection(col_name)\ncol.get()\ncol.count()\n\ntry:\n client.delete_collection(col_name)\nexcept Exception as e:\n print(e) #expect unauthorized error\n\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"admin:password123\"))\n\nclient.delete_collection(col_name)\n
Full code
You can find the full code in test_auth.ipynb
Cross-team access
In the cross-team access scenario we\u2019ll create a collection with one team\u2019s owner (admin
) and will try to access it (aka delete it) as another team\u2019s owner, in a very mano-a-mano (owner-to-owner) way. It is important to observe that all these collections are created within the same database (default_database
).
import uuid\nimport chromadb\nfrom chromadb.config import Settings\n\ncol_name = f\"test_collection-{str(uuid.uuid4())}\"\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"admin:password123\"))\n\nclient.get_or_create_collection(col_name)\n\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"admin-ext:password123\"))\n\nclient.get_or_create_collection(\"external-collection\")\n\ntry:\n client.delete_collection(col_name)\nexcept Exception as e:\n print(\"Expected error for admin-ext: \", str(e)) #expect unauthorized error\n\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"admin:password123\"))\nclient.delete_collection(col_name)\ntry:\n client.delete_collection(\"external-collection\")\nexcept Exception as e:\n print(\"Expected error for admin: \", str(e)) #expect unauthorized error\n
Full code
You can find the full code in test_auth.ipynb
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/","title":"Chroma Authorization Model with OpenFGA","text":"Source Code
The source code for this article can be found here.
This article will not provide any code that you can use immediately but will set the stage for our next article, which will introduce the actual Chroma-OpenFGA integration.
With that in mind, let\u2019s get started.
Who is this article for? The intended audience is DevSecOps, but engineers and architects could also use this to learn about Chroma and the authorization models.
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#authorization-model","title":"Authorization Model","text":"Authorization models are an excellent way to abstract how you wish your users to access your application away from the actual implementation.
There are many ways to do authz, ranging from commercial Auth0 FGA to OSS options like Ory Keto/Kratos, CASBIN, Permify, and Kubescape, but for this article, we\u2019ve decided to use OpenFGA (which technically is Auth0\u2019s open-source framework for FGA).
Why OpenFGA, I hear you ask? Here are a few reasons:
- Apache-2 licensed
- CNCF Incubating project
- Zanzibar alignment in that it is a ReBAC (Relation-based access control) system
- DSL for modeling and testing permissions (as well as a JSON-based version for those with masochistic tendencies)
OpenFGA has done a great job explaining the steps to building an Authorization model, which you can read here. We will go over those while keeping our goal of creating an authorization model for Chroma.
It is worth noting that the resulting authorization model that we will create here will be suitable for many GenAI applications, such as general-purpose RAG systems. Still, it is not a one-size-fits-all solution to all problems. For instance, if you want to implement authz in Chroma within your organization, OpenFGA might not be the right tool for the job, and you should consult with your IT/Security department for guidance on integrating with existing systems.
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#the-goal","title":"The Goal","text":"Our goal is to achieve the following:
- Allow fine-grained access to the following resources - collection, database, tenant, and Chroma server.
- Allow grouping of users for improved permission management.
- Individual user access to resources
- Roles - owner, writer, reader
Document-Level Access
Although granting access to individual documents in a collection can be beneficial in some contexts, we have left that part out of our goals to keep things as simple and short as possible. If you are interested in this topic, reach out, and we will help you.
This article will not cover user management, commonly called Identity Access Management (IAM). We\u2019ll cover that in a subsequent article.
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#modeling-fundamentals","title":"Modeling Fundamentals","text":"Let\u2019s start with the fundamentals:
Why can user U perform action A on object O?
We will attempt to answer this question in the context of Chroma by following OpenFGA\u2019s approach to refining the model. The steps are:
- Pick the most important features.
- List of object types
- List of relations for the types
- Test the model
- Iterate
Given that OpenFGA is Zanzibar-inspired, its basic primitive is a tuple of the following format:
(User,Relation,Object)\n
With the above we can express any relation between a user (or a team, or even another object), the action the user performs (captured by object relations), and the object (aka API resource).
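To make the primitive concrete, here is a toy, in-memory sketch of tuple storage and a direct check. This is purely illustrative and is not the OpenFGA SDK or API; all names here are our own:

```python
# Toy in-memory illustration of the (User, Relation, Object) tuple
# primitive. NOT the OpenFGA SDK -- just a sketch of the idea.
from typing import NamedTuple, Set

class RelationTuple(NamedTuple):
    user: str      # "user:jane", or a userset like "team:chroma#owner"
    relation: str  # "owner", "can_create_tenant", ...
    obj: str       # "server:server1", "collection:docs", ...

store: Set[RelationTuple] = set()

def write_tuple(user: str, relation: str, obj: str) -> None:
    """Record that `user` has `relation` on `obj`."""
    store.add(RelationTuple(user, relation, obj))

def check(user: str, relation: str, obj: str) -> bool:
    """Direct-tuple check only; a real engine also expands usersets."""
    return RelationTuple(user, relation, obj) in store

write_tuple("user:jane", "owner", "team:chroma")
write_tuple("user:john", "writer", "team:chroma")
```

A real ReBAC engine answers checks by walking the graph of such tuples; the point here is only that a single uniform record type captures who, what, and on which resource.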
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#pick-the-features","title":"Pick the features","text":"In the context of Chroma, the features are the actions the user can perform on the Chroma API (as of this writing v0.4.24).
Let\u2019s explore the actions that users can perform:
- Create a tenant
- Get a tenant
- Create a database for a tenant
- Get a database for a tenant
- Create a collection in a database
- Delete a collection from a database
- Update collection name and metadata
- List collections in a database
- Count collections in a database
- Add records to a collection
- Delete records from a collection
- Update records in a collection
- Upsert records in a collection
- Count records in a collection
- Get records from a collection
- Query records in a collection
- Get pre-flight-checks
Open Endpoints
Note we will omit the get heartbeat
and get version
actions, as it is generally a good idea to leave these open so that orchestrators (Docker/Kubernetes) can check the health status of Chroma.
To make it easy to reason about relations in our authorization model we will rephrase the above to the following format:
A user {user} can perform action {action} to/on/in {object types} ... IF {conditions}\n
- A user can perform action create tenant on Chroma server if they are owner of the server
- A user can perform action get tenant on Chroma server if they are a reader or writer or owner of the server
- A user can perform action create database on a tenant if they are an owner of the tenant
- A user can perform action get database on a tenant if they are reader, writer or owner of the tenant
- A user can perform action create collection on a database if they are a writer or an owner of the database
- A user can perform action delete collection on a database if they are a writer or an owner of the database
- A user can perform action update collection name or metadata on a database if they are a writer or an owner of the database
- A user can perform action list collections in a database if they are a writer or an owner of the database
- A user can perform action count collections in a database if they are a writer or an owner of the database
- A user can perform action add records on a collection if they are writer or owner of the collection
- A user can perform action delete records on a collection if they are writer or owner of the collection
- A user can perform action update records on a collection if they are writer or owner of the collection
- A user can perform action upsert records on a collection if they are writer or owner of the collection
- A user can perform action get records on a collection if they are writer or owner or reader of the collection
- A user can perform action count records on a collection if they are writer or owner or reader of the collection
- A user can perform action query records on a collection if they are writer or owner or reader of the collection
- A user can perform action get pre-flight-checks on a Chroma server if they are writer or owner or reader of the server
We don\u2019t have to get it all right in the first iteration, but the above is a good starting point that can be adapted further.
The above statements alone are already a great introspection as to what we can do within Chroma and who is supposed to be able to do what. Please note that your mileage may vary, as per your authz requirements, but in our experience the variations are generally around the who.
As an astute reader, you have already noted that we\u2019ve effectively outlined some RBAC concepts in the form of owner, writer, and reader.
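Those statements can be condensed into a simple role-to-action table. As an illustration only (the real model will live in OpenFGA), here is how the collection-level statements map out in Python:

```python
# Toy role-to-action mapping condensed from the collection-level
# statements above. Illustration only; the real model lives in OpenFGA.
COLLECTION_PERMISSIONS = {
    "add_records":    {"owner", "writer"},
    "delete_records": {"owner", "writer"},
    "update_records": {"owner", "writer"},
    "upsert_records": {"owner", "writer"},
    "get_records":    {"owner", "writer", "reader"},
    "count_records":  {"owner", "writer", "reader"},
    "query_records":  {"owner", "writer", "reader"},
}

def allowed(role: str, action: str) -> bool:
    """Is a holder of `role` on a collection allowed to perform `action`?"""
    return role in COLLECTION_PERMISSIONS.get(action, set())
```

Laying the rules out like this makes the pattern obvious: mutating actions require owner or writer, while read-only actions additionally admit reader.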
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#list-the-objects","title":"List the objects!!!","text":"Now that we know what our users can do, let\u2019s solidify our understanding of what our users will be performing these actions on, aka the object types.
Let\u2019s call them out:
- User - this is a basic and pretty obvious object type that we want to model our users after
- Chroma server - this is our top level object in the access relations
- Tenant - for most Chroma developers this will equate to a team or a group
- Database
- Collection
We can also examine all of the <object>
in the above statements to ensure we haven\u2019t missed any objects. So far it seems we\u2019re all good.
Now that we have our objects let\u2019s create a first iteration of our authorization model using OpenFGA DSL:
model\n schema 1.1\n\ntype server\ntype user\ntype tenant\ntype database\ntype collection\n
OpenFGA CLI
You will need to install openfga CLI - https://openfga.dev/docs/getting-started/install-sdk. Also check the VSCode extension for OpenFGA.
Let\u2019s validate our work:
fga model validate --file model-article-p1.fga\n
You should see the following output:
{\n \"is_valid\":true\n}\n
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#relations","title":"Relations","text":"Now that we have the actions and the objects, let us figure out the relationships we want to build into our model.
To come up with our relations we can follow these two rules:
- Any noun of the type
{noun} of a/an/the {type}
expression (e.g. of the collection
) - Any verb or action described with
can {action} on/in {type}
So now let\u2019s work on our model to expand it with relationships:
model\n schema 1.1\n\ntype user\n\ntype server\n relations\n define owner: [user]\n define reader: [user]\n define writer: [user]\n define can_get_preflight: reader or owner or writer\n define can_create_tenant: owner or writer\n\ntype tenant\n relations\n define owner: [user]\n define reader: [user]\n define writer: [user]\n define belongsTo: [server]\n define can_create_database: owner from belongsTo or writer from belongsTo or owner or writer\n define can_get_database: reader or owner or writer or owner from belongsTo or reader from belongsTo or writer from belongsTo\n\ntype database\n relations\n define owner: [user]\n define reader: [user]\n define writer: [user]\n define belongsTo: [tenant]\n define can_create_collection: owner from belongsTo or writer from belongsTo or owner or writer\n define can_delete_collection: owner from belongsTo or writer from belongsTo or owner or writer\n define can_list_collections: owner or writer or owner from belongsTo or writer from belongsTo\n define can_get_collection: owner or writer or owner from belongsTo or writer from belongsTo\n define can_get_or_create_collection: owner or writer or owner from belongsTo or writer from belongsTo\n define can_count_collections: owner or writer or owner from belongsTo or writer from belongsTo\n\ntype collection\n relations\n define owner: [user]\n define reader: [user]\n define writer: [user]\n define belongsTo: [database]\n define can_add_records: writer or reader or owner from belongsTo or writer from belongsTo\n define can_delete_records: writer or owner from belongsTo or writer from belongsTo\n define can_update_records: writer or owner from belongsTo or writer from belongsTo\n define can_get_records: reader or owner or writer or owner from belongsTo or reader from belongsTo or writer from belongsTo\n define can_upsert_records: writer or owner from belongsTo or writer from belongsTo\n define can_count_records: reader or owner or writer or owner from belongsTo or reader from belongsTo 
or writer from belongsTo\n define can_query_records: reader or owner or writer or owner from belongsTo or reader from belongsTo or writer from belongsTo\n
Let\u2019s validate:
fga model validate --file model-article-p2.fga\n
This seems mostly accurate and should do OK as an authorization model. But let us see if we can make it better. If we implement the above, we will end up with lots of permissions in OpenFGA; not that it can\u2019t handle them, but as we get into the implementation details it will become cumbersome to update and maintain all these permissions. So let\u2019s look for opportunities to simplify things a little.
Can we make the model a little simpler? The first question to ask is whether we really need owner, reader, and writer on every object, or whether we can make a decision about our model and simplify this. As it turns out, we can. Most multi-user systems tend to gravitate towards grouping things as a way to reduce the number of permissions to maintain. In our case we can group our users into a team
and in each team we\u2019ll have owner, writer, reader
Let\u2019s see the results:
model\n schema 1.1\n\ntype user\n\ntype team\n relations\n define owner: [user]\n define writer: [user]\n define reader: [user]\n\ntype server\n relations\n define can_get_preflight: [user, team#owner, team#writer, team#reader]\n define can_create_tenant: [user, team#owner, team#writer]\n define can_get_tenant: [user, team#owner, team#writer, team#reader]\n\ntype tenant\n relations\n define can_create_database: [user, team#owner, team#writer]\n define can_get_database: [user, team#owner, team#writer, team#reader]\n\ntype database\n relations\n define can_create_collection: [user, team#owner, team#writer]\n define can_list_collections: [user, team#owner, team#writer, team#reader]\n define can_get_or_create_collection: [user, team#owner, team#writer]\n define can_count_collections: [user, team#owner, team#writer, team#reader]\n\ntype collection\n relations\n define can_delete_collection: [user, team#owner, team#writer]\n define can_get_collection: [user, team#owner, team#writer, team#reader]\n define can_update_collection: [user, team#owner, team#writer]\n define can_add_records: [user, team#owner, team#writer]\n define can_delete_records: [user, team#owner, team#writer]\n define can_update_records: [user, team#owner, team#writer]\n define can_get_records: [user, team#owner, team#writer, team#reader]\n define can_upsert_records: [user, team#owner, team#writer]\n define can_count_records: [user, team#owner, team#writer, team#reader]\n define can_query_records: [user, team#owner, team#writer, team#reader]\n
That is arguably more readable.
As you will observe we have also added [user]
in the permissions of each object. Why is that, you may ask? The reason is that we want to build fine-grained authorization, which means that while a collection can belong to a team, we can also grant individual permissions to users. This gives us a great way to play around with permissions, at the cost of a more complex implementation of how permissions are managed, but we will get to that in the next post.
We have also removed the belongsTo
relationship, as we no longer need it. The reason: OpenFGA does not allow access to relations more than a single layer up the hierarchy, so a collection cannot use the owner of its team for permissions (there are other ways to implement that, outside the scope of this article).
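To see how userset references such as team#owner still give us team-based access without belongsTo, here is a toy sketch (again ours, not OpenFGA) of one-level userset resolution:

```python
# Toy one-level userset resolution: a tuple whose subject is a userset
# like "team:chroma#owner" grants the relation to every user holding
# that role on the team. Illustration only; not OpenFGA code.
tuples = {
    ("user:jane", "owner", "team:chroma"),
    ("team:chroma#owner", "can_create_tenant", "server:server1"),
}

def check(user: str, relation: str, obj: str) -> bool:
    # direct grant?
    if (user, relation, obj) in tuples:
        return True
    # userset grant: is the relation granted to a team role the user holds?
    for subj, rel, o in tuples:
        if rel == relation and o == obj and "#" in subj:
            team, team_rel = subj.split("#", 1)
            if (user, team_rel, team) in tuples:
                return True
    return False
```

One level of indirection is exactly what the simplified model relies on: jane never appears on the server object directly, yet can_create_tenant resolves through her owner role on team:chroma.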
Let\u2019s recap what is our model capable of doing:
- Fine-grained access control to objects is possible via relations
- Users can be grouped into teams (a single user per team is also acceptable for cases where you need a user to be the sole owner of a collection or a database)
- Access to resources can be granted to individual users via object relations
- Define roles within a team (this can be extended to allow roles per resource, but is outside of the scope of this article)
In short, we have achieved the goals we initially set, with a relatively simple and understandable model. However, does our model work? Let\u2019s find out in the next section.
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#testing-the-model","title":"Testing the model","text":"Luckily, the OpenFGA folks have provided a great developer experience by making it easy to write and run tests. This is a massive W and time-saver. Our tests will verify that:
- An individual user can be given access to specific resources via relations
- Users can be part of any of the team roles
- An object can be accessed by a team
name: Chroma Authorization Model Tests # optional\n\nmodel_file: ./model-article-p4.fga # you can specify an external .fga file, or include it inline\n\n# tuple_file: ./tuples.yaml # you can specify an external file, or include it inline\ntuples:\n - user: user:jane\n relation: owner\n object: team:chroma\n - user: user:john\n relation: writer\n object: team:chroma\n - user: user:jill\n relation: reader\n object: team:chroma\n - user: user:sam\n relation: can_create_tenant\n object: server:server1\n - user: user:sam\n relation: can_get_tenant\n object: server:server1\n - user: user:sam\n relation: can_get_preflight\n object: server:server1\n - user: user:michelle\n relation: can_create_tenant\n object: server:server1\n - user: team:chroma#owner\n relation: can_get_preflight\n object: server:server1\n - user: team:chroma#owner\n relation: can_create_tenant\n object: server:server1\n - user: team:chroma#owner\n relation: can_get_tenant\n object: server:server1\n - user: team:chroma#writer\n relation: can_get_preflight\n object: server:server1\n - user: team:chroma#writer\n relation: can_create_tenant\n object: server:server1\n - user: team:chroma#writer\n relation: can_get_tenant\n object: server:server1\n - user: team:chroma#reader\n relation: can_get_preflight\n object: server:server1\n - user: team:chroma#reader\n relation: can_get_tenant\n object: server:server1\n\ntests:\n - name: Users should have team roles\n check:\n - user: user:jane\n object: team:chroma\n assertions:\n owner: true\n writer: false\n reader: false\n - user: user:john\n object: team:chroma\n assertions:\n writer: true\n owner: false\n reader: false\n - user: user:jill\n object: team:chroma\n assertions:\n writer: false\n owner: false\n reader: true\n - user: user:unknown\n object: team:chroma\n assertions:\n writer: false\n owner: false\n reader: false\n - user: user:jane\n object: team:unknown\n assertions:\n writer: false\n owner: false\n reader: false\n - user: user:unknown\n object: 
team:unknown\n assertions:\n writer: false\n owner: false\n reader: false\n - name: Users should have direct access to server\n check:\n - user: user:sam\n object: server:server1\n assertions:\n can_get_preflight: true\n can_create_tenant: true\n can_get_tenant: true\n - user: user:michelle\n object: server:server1\n assertions:\n can_get_preflight: false\n can_create_tenant: true\n can_get_tenant: false\n - user: user:unknown\n object: server:server1\n assertions:\n can_get_preflight: false\n can_create_tenant: false\n can_get_tenant: false\n - user: user:jill\n object: server:serverX\n assertions:\n can_get_preflight: false\n can_create_tenant: false\n can_get_tenant: false\n - name: Users of a team should have access to server\n check:\n - user: user:jane\n object: server:server1\n assertions:\n can_create_tenant: true\n can_get_tenant: true\n can_get_preflight: true\n - user: user:john\n object: server:server1\n assertions:\n can_create_tenant: true\n can_get_tenant: true\n can_get_preflight: true\n - user: user:jill\n object: server:server1\n assertions:\n can_create_tenant: false\n can_get_tenant: true\n can_get_preflight: true\n - user: user:unknown\n object: server:server1\n assertions:\n can_create_tenant: false\n can_get_tenant: false\n can_get_preflight: false\n
Let\u2019s run the tests:
fga model test --tests test.model-article-p4.fga.yaml\n
This will result in the following output:
# Test Summary #\nTests 3/3 passing\nChecks 42/42 passing\n
That is all, folks. We try to keep things as concise as possible, and this article has already stretched our levels of comfort in that area. The bottom line is that authorization is no joke, and it should take as much time as needed.
Writing out all tests will not be concise (maybe we\u2019ll add that to the repo).
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#conclusion","title":"Conclusion","text":"In this article we\u2019ve built an authorization model for Chroma from scratch using OpenFGA. Admittedly it is a simple model, but it still gives us a lot of flexibility to control access to Chroma resources.
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#resources","title":"Resources","text":" - https://github.com/amikos-tech/chromadb-auth - the companion repo for this article (files are stored under
openfga/basic/
) - https://openfga.dev/docs - Read it, understand it, code it!
- https://marketplace.visualstudio.com/items?itemName=openfga.openfga-vscode - It makes your life easier
"},{"location":"strategies/multi-tenancy/multi-user-basic-auth/","title":"Multi-User Basic Auth","text":""},{"location":"strategies/multi-tenancy/multi-user-basic-auth/#why-multi-user-auth","title":"Why Multi-user Auth?","text":"Multi-user authentication can be crucial for several reasons. Let's delve into this topic.
Security\u2014The primary concern is the security of your deployments. You need to control who can access your data and ensure they are authorized to do so. You may wonder, since Chroma offers basic and token-based authentication, why is multi-user authentication necessary?
The answer to that is a categorical NO: you should never share your Chroma access credentials with your users or with any app that depends on Chroma.
Another reason to consider multi-user authentication is to differentiate access to your data. However, the solution presented here doesn't provide this. It's a stepping stone towards our upcoming article on multi-tenancy and securing Chroma data.
Last but not least is auditing. While we acknowledge this is not for everybody, there is increasing pressure to provide visibility into your app via auditable events.
Multi-user experiences - Not all GenAI apps are intended to be private or individual. This is another reason to consider and implement multi-user authentication and authorization.
"},{"location":"strategies/multi-tenancy/multi-user-basic-auth/#dive-right-in","title":"Dive right in.","text":"Let's get straight to the point and build multi-user authentication with basic auth. Here's our goal:
- Develop a server-side authentication credentials provider that can read multiple users from a
.htpasswd
file - Generate a multi-user
.htpasswd
file with several test users - Package our plugin with the Chroma base image and execute it using Docker Compose
Auth CIP
Chroma has detailed info about how its authentication and authorization are implemented. Should you want to learn more, go read the CIP (Chroma Improvement Proposal doc).
"},{"location":"strategies/multi-tenancy/multi-user-basic-auth/#the-plugin","title":"The Plugin","text":"import importlib\nimport logging\nfrom typing import Dict, cast, TypeVar, Optional\n\nfrom chromadb.auth import (\n ServerAuthCredentialsProvider,\n AbstractCredentials,\n SimpleUserIdentity,\n)\nfrom chromadb.auth.registry import register_provider\nfrom chromadb.config import System\nfrom chromadb.telemetry.opentelemetry import (\n OpenTelemetryGranularity,\n trace_method,\n add_attributes_to_current_span,\n)\nfrom pydantic import SecretStr\nfrom overrides import override\n\nT = TypeVar(\"T\")\n\nlogger = logging.getLogger(__name__)\n\n\n@register_provider(\"multi_user_htpasswd_file\")\nclass MultiUserHtpasswdFileServerAuthCredentialsProvider(ServerAuthCredentialsProvider):\n _creds: Dict[str, SecretStr] # contains user:password-hash\n\n def __init__(self, system: System) -> None:\n super().__init__(system)\n try:\n self.bc = importlib.import_module(\"bcrypt\")\n except ImportError:\n raise ValueError(\n \"The bcrypt python package is not installed. \"\n \"Please install it with `pip install bcrypt`\"\n )\n system.settings.require(\"chroma_server_auth_credentials_file\")\n _file = str(system.settings.chroma_server_auth_credentials_file)\n self._creds = dict()\n with open(_file, \"r\") as f:\n for line in f:\n _raw_creds = [v for v in line.strip().split(\":\")]\n if len(_raw_creds) != 2:\n raise ValueError(\n \"Invalid Htpasswd credentials found in \"\n f\"[{str(system.settings.chroma_server_auth_credentials_file)}]. 
\"\n \"Must be <username>:<bcrypt passwd>.\"\n )\n self._creds[_raw_creds[0]] = SecretStr(_raw_creds[1])\n\n @trace_method( # type: ignore\n \"MultiUserHtpasswdFileServerAuthCredentialsProvider.validate_credentials\",\n OpenTelemetryGranularity.ALL,\n )\n @override\n def validate_credentials(self, credentials: AbstractCredentials[T]) -> bool:\n _creds = cast(Dict[str, SecretStr], credentials.get_credentials())\n\n if len(_creds) != 2 or \"username\" not in _creds or \"password\" not in _creds:\n logger.error(\n \"Returned credentials did match expected format: \"\n \"dict[username:SecretStr, password: SecretStr]\"\n )\n add_attributes_to_current_span(\n {\n \"auth_succeeded\": False,\n \"auth_error\": \"Returned credentials did match expected format: \"\n \"dict[username:SecretStr, password: SecretStr]\",\n }\n )\n return False # early exit on wrong format\n _user_pwd_hash = (\n self._creds[_creds[\"username\"].get_secret_value()]\n if _creds[\"username\"].get_secret_value() in self._creds\n else None\n )\n validation_response = _user_pwd_hash is not None and self.bc.checkpw(\n _creds[\"password\"].get_secret_value().encode(\"utf-8\"),\n _user_pwd_hash.get_secret_value().encode(\"utf-8\"),\n )\n add_attributes_to_current_span(\n {\n \"auth_succeeded\": validation_response,\n \"auth_error\": f\"Failed to validate credentials for user {_creds['username'].get_secret_value()}\"\n if not validation_response\n else \"\",\n }\n )\n return validation_response\n\n @override\n def get_user_identity(\n self, credentials: AbstractCredentials[T]\n ) -> Optional[SimpleUserIdentity]:\n _creds = cast(Dict[str, SecretStr], credentials.get_credentials())\n return SimpleUserIdentity(_creds[\"username\"].get_secret_value())\n
In less than 80 lines of code, we have our plugin. Let's delve into and explain some of the key points of the code above:
__init__
- Here, we dynamically import bcrypt, which we'll use to check user credentials. We also read the configured credentials file - server.htpasswd
line by line, to retrieve each user (we assume each line contains a new user with its bcrypt hash). validate_credentials
- This is where the magic happens. We initially perform some lightweight validations on the credentials parsed by Chroma and passed to the plugin. Then, we attempt to retrieve the user and its hash from the _creds
dictionary. The final step is to verify the hash. We've also added some attributes to monitor our authentication process in our observability layer (we have an upcoming article about this). get_user_identity
- Constructs a simple user identity, which the authorization plugin uses to verify permissions. Although not needed for now, each authentication plugin must implement this, as user identities are crucial for authorization.
We'll store our plugin in __init__.py
within the following directory structure - chroma_auth/authn/basic/__init__.py
(refer to the repository for details).
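The per-line parsing done in __init__ can be sketched in isolation. Below is a minimal, dependency-free sketch of the same idea (parse_htpasswd is a hypothetical helper and the hashes are placeholders; the real provider additionally wraps each hash in SecretStr so it is not accidentally logged):

```python
# Minimal sketch of the htpasswd parsing done in __init__ (hypothetical helper).
# Each line must look like "<username>:<bcrypt hash>"; anything else is rejected.
def parse_htpasswd(lines):
    creds = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(":", 1)
        if len(parts) != 2 or not parts[0]:
            raise ValueError("Invalid line format. Must be <username>:<bcrypt passwd>.")
        creds[parts[0]] = parts[1]
    return creds

# Placeholder hashes, for illustration only
creds = parse_htpasswd([
    "admin:$2y$05$PLACEHOLDERHASH",
    "user1:$2y$05$OTHERPLACEHOLDER",
])
```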
"},{"location":"strategies/multi-tenancy/multi-user-basic-auth/#password-file","title":"Password file","text":"Now that we have our plugin let\u2019s create a password file with a few users:
Initial user:
echo \"password123\" | htpasswd -iBc server.htpasswd admin\n
The above will create (-c
flag) a new server.htpasswd file with an initial user admin
; the password is read from stdin (-i
flag) and stored as a bcrypt hash (-B
flag).
Let\u2019s add another user:
echo \"password123\" | htpasswd -iB server.htpasswd user1\n
Now our server.htpasswd
file will look like this:
admin:$2y$05$vkBK4b1Vk5O98jNHgr.uduTJsTOfM395sKEKe48EkJCVPH/MBIeHK\nuser1:$2y$05$UQ0kC2x3T2XgeN4WU12BdekUwCJmLjJNhMaMtFNolYdj83OqiEpVu\n
Moving on to the Docker setup.
"},{"location":"strategies/multi-tenancy/multi-user-basic-auth/#docker-compose-setup","title":"Docker compose setup","text":"Let\u2019s create a Dockerfile
to bundle our plugin with the official Chroma image:
ARG CHROMA_VERSION=0.4.24\nFROM ghcr.io/chroma-core/chroma:${CHROMA_VERSION} as base\n\nCOPY chroma_auth/ /chroma/chroma_auth\n
This picks up the official Chroma Docker image and adds our plugin directory structure so that the server can load it.
Now let\u2019s create an .env
file to load our plugin:
CHROMA_SERVER_AUTH_PROVIDER=\"chromadb.auth.basic.BasicAuthServerProvider\"\nCHROMA_SERVER_AUTH_CREDENTIALS_FILE=\"server.htpasswd\"\nCHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=\"chroma_auth.authn.basic.MultiUserHtpasswdFileServerAuthCredentialsProvider\"\n
And finally our docker-compose.yaml
:
version: '3.9'\n\nnetworks:\n net:\n driver: bridge\n\nservices:\n server:\n image: chroma-server\n build:\n dockerfile: Dockerfile\n volumes:\n - ./chroma-data:/chroma/chroma\n - ./server.htpasswd:/chroma/server.htpasswd\n command: \"--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30\"\n environment:\n - IS_PERSISTENT=TRUE\n - CHROMA_SERVER_AUTH_PROVIDER=${CHROMA_SERVER_AUTH_PROVIDER}\n - CHROMA_SERVER_AUTH_CREDENTIALS_FILE=${CHROMA_SERVER_AUTH_CREDENTIALS_FILE}\n - CHROMA_SERVER_AUTH_CREDENTIALS=${CHROMA_SERVER_AUTH_CREDENTIALS}\n - CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=${CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER}\n - CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER=${CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER}\n - PERSIST_DIRECTORY=${PERSIST_DIRECTORY:-/chroma/chroma}\n - CHROMA_OTEL_EXPORTER_ENDPOINT=${CHROMA_OTEL_EXPORTER_ENDPOINT}\n - CHROMA_OTEL_EXPORTER_HEADERS=${CHROMA_OTEL_EXPORTER_HEADERS}\n - CHROMA_OTEL_SERVICE_NAME=${CHROMA_OTEL_SERVICE_NAME}\n - CHROMA_OTEL_GRANULARITY=${CHROMA_OTEL_GRANULARITY}\n - CHROMA_SERVER_NOFILE=${CHROMA_SERVER_NOFILE}\n restart: unless-stopped # possible values are: \"no\", always\", \"on-failure\", \"unless-stopped\"\n ports:\n - \"8000:8000\"\n healthcheck:\n # Adjust below to match your container port\n test: [ \"CMD\", \"curl\", \"-f\", \"http://localhost:8000/api/v1/heartbeat\" ]\n interval: 30s\n timeout: 10s\n retries: 3\n networks:\n - net\n
"},{"location":"strategies/multi-tenancy/multi-user-basic-auth/#the-test","title":"The test","text":"Let\u2019s run our docker compose setup:
docker compose --env-file ./.env up --build\n
You should see the following log message if the plugin was successfully loaded:
server-1 | DEBUG: [01-04-2024 14:10:13] Starting component MultiUserHtpasswdFileServerAuthCredentialsProvider\nserver-1 | DEBUG: [01-04-2024 14:10:13] Starting component BasicAuthServerProvider\nserver-1 | DEBUG: [01-04-2024 14:10:13] Starting component FastAPIChromaAuthMiddleware\n
Once our container is up and running, let\u2019s see if our multi-user auth works:
import chromadb\nfrom chromadb.config import Settings\n\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",chroma_client_auth_credentials=\"admin:password123\"))\nclient.heartbeat() # this should work with or without authentication - it is a public endpoint\nclient.get_or_create_collection(\"test_collection\") # this is a protected endpoint and requires authentication\nclient.list_collections() # this is a protected endpoint and requires authentication\n
The above code should return the list of collections, a single collection test_collection
that we created.
(chromadb-multi-user-basic-auth-py3.11) [chromadb-multi-user-basic-auth]python 19:51:38 \u2601 main \u2602 \u26a1 \u271a\nPython 3.11.7 (main, Dec 30 2023, 14:03:09) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin\nType \"help\", \"copyright\", \"credits\" or \"license\" for more information.\n>>> import chromadb\n>>> from chromadb.config import Settings\n>>> \n>>> client = chromadb.HttpClient(\n... settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",chroma_client_auth_credentials=\"admin:password123\"))\n>>> client.heartbeat() # this should work with or without authentication - it is a public endpoint\n1711990302270211007\n>>> \n>>> client.list_collections() # this is a protected endpoint and requires authentication\n[]\n
Great, now let\u2019s test for our other user:
client = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",chroma_client_auth_credentials=\"user1:password123\"))\n
Works just as well (logs omitted for brevity).
To ensure that our plugin works as expected, let's also test with a user that is not in our server.htpasswd
file:
client = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",chroma_client_auth_credentials=\"invalid_user:password123\"))\n
Traceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/__init__.py\", line 197, in HttpClient\n return ClientCreator(tenant=tenant, database=database, settings=settings)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/client.py\", line 144, in __init__\n self._validate_tenant_database(tenant=tenant, database=database)\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/client.py\", line 445, in _validate_tenant_database\n raise e\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/client.py\", line 438, in _validate_tenant_database\n self._admin_client.get_tenant(name=tenant)\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/client.py\", line 486, in get_tenant\n return self._server.get_tenant(name=name)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/telemetry/opentelemetry/__init__.py\", line 127, in wrapper\n return f(*args, **kwargs)\n ^^^^^^^^^^^^^^^^^^\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/fastapi.py\", line 200, in get_tenant\n raise_chroma_error(resp)\n File 
\"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/fastapi.py\", line 649, in raise_chroma_error\n raise chroma_error\nchromadb.errors.AuthorizationError: Unauthorized\n
As expected, we get an authorization error when trying to connect to Chroma (client initialization validates the tenant and database, both of which are protected endpoints, which raises the exception above).
"},{"location":"strategies/multi-tenancy/naive-multi-tenancy/","title":"Naive Multi-tenancy Strategies","text":"Single-note Chroma
The below strategies are applicable to single-node Chroma only. They require your app to act as both PEP (Policy Enforcement Point) and PDP (Policy Decision Point) for authorization. This is a naive approach to multi-tenancy and is probably not suited for production environments; however, it is a good and simple way to get started with multi-tenancy in Chroma.
Authorization
We are in the process of creating a list of articles on how to implement proper authorization in Chroma, leveraging an external service and Chroma's auth plugins. The first article of the series is available on Medium and will also be made available here soon.
"},{"location":"strategies/multi-tenancy/naive-multi-tenancy/#introduction","title":"Introduction","text":"There are several multi-tenancy strategies available to users of Chroma. The actual strategy will depend on the needs of the user and the application. The strategies below apply to multi-user environments, but do no factor in partly-shared resources like groups or teams.
- User-Per-Doc: In this scenario, the app maintains one or more collections, and each document is associated with a single user.
- User-Per-Collection: In this scenario, the app maintains multiple collections and each collection is associated with a single user.
- User-Per-Database: In this scenario, the app maintains multiple databases with a single tenant and each database is associated with a single user.
- User-Per-Tenant: In this scenario, the app maintains multiple tenants and each tenant is associated with a single user.
"},{"location":"strategies/multi-tenancy/naive-multi-tenancy/#user-per-doc","title":"User-Per-Doc","text":"The goal of this strategy is to grant user permissions to access individual documents.
To implement this strategy you need to add some sort of user identification to each document that belongs to a user. For this example we will assume it is user_id
.
import chromadb\n\nclient = chromadb.PersistentClient()\ncollection = client.get_or_create_collection(\"my-collection\")\ncollection.add(\n documents=[\"This is document1\", \"This is document2\"],\n metadatas=[{\"user_id\": \"user1\"}, {\"user_id\": \"user2\"}],\n ids=[\"doc1\", \"doc2\"],\n)\n
At query time you will have to provide the user_id
as a filter to your query like so:
results = collection.query(\n    query_texts=[\"This is a query document\"],\n    where={\"user_id\": \"user1\"},\n)\n
To successfully implement this strategy your code needs to consistently add and filter on the user_id
metadata to ensure separation of data.
Drawbacks:
- Error-prone: Messing up the filtering can lead to data being leaked across users.
- Scalability: As the number of users and documents grows, filtering on metadata can become slow.
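One way to make the filtering less error-prone is to build the where clause in a single place. A minimal sketch (scoped_where is a hypothetical helper; $and is Chroma's logical-and filter operator):

```python
def scoped_where(user_id, extra_where=None):
    # Always inject the mandatory per-user filter; combine with any
    # caller-supplied filter via Chroma's $and operator.
    base = {"user_id": user_id}
    if not extra_where:
        return base
    return {"$and": [base, extra_where]}

# Hypothetical usage against a real collection:
# results = collection.query(query_texts=["..."], where=scoped_where("user1"))
```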
"},{"location":"strategies/multi-tenancy/naive-multi-tenancy/#user-per-collection","title":"User-Per-Collection","text":"The goal of this strategy is to grant a user access to all documents in a collection.
To implement this strategy you need to create a collection for each user, named after the user's identifier. For this example we will assume it is user_id
.
import chromadb\n\nclient = chromadb.PersistentClient()\nuser_id = \"user1\"\ncollection = client.get_or_create_collection(f\"user-collection:{user_id}\")\ncollection.add(\n documents=[\"This is document1\", \"This is document2\"],\n ids=[\"doc1\", \"doc2\"],\n)\n
At query time you will have to retrieve the collection that belongs to the user_id
and query it directly, like so:
user_id = \"user1\"\nuser_collection = client.get_collection(f\"user-collection:{user_id}\")\nresults = user_collection.query(\n query_texts=[\"This is a query document\"],\n)\n
To successfully implement this strategy your code needs to consistently create and query the correct collection for the user.
Drawbacks:
- Error-prone: Messing up the collection name can lead to data being leaked across users.
- Shared document search: If you want to maintain some documents shared then you will have to create a separate collection for those documents and allow users to query the shared collection as well.
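To reduce the risk of typos in collection names, derive the name in one place. A tiny sketch (user_collection_name is a hypothetical helper following the naming scheme used above):

```python
def user_collection_name(user_id: str) -> str:
    # Single source of truth for the per-user collection naming scheme
    if not user_id:
        raise ValueError("user_id must be non-empty")
    return f"user-collection:{user_id}"

# Hypothetical usage against a real client:
# user_collection = client.get_collection(user_collection_name("user1"))
```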
"},{"location":"strategies/multi-tenancy/naive-multi-tenancy/#user-per-database","title":"User-Per-Database","text":"The goal of this strategy is to associate a user with a single database thus granting them access to all collections and documents within the database.
import chromadb\nfrom chromadb import DEFAULT_TENANT\nfrom chromadb import Settings\n\nadminClient = chromadb.AdminClient(Settings(\n is_persistent=True,\n persist_directory=\"multitenant\",\n))\n\n\n# For Remote Chroma server:\n# \n# adminClient= chromadb.AdminClient(Settings(\n# chroma_api_impl=\"chromadb.api.fastapi.FastAPI\",\n# chroma_server_host=\"localhost\",\n# chroma_server_http_port=\"8000\",\n# ))\n\ndef get_or_create_db_for_user(user_id):\n database = f\"db:{user_id}\"\n try:\n adminClient.get_database(database)\n except Exception as e:\n adminClient.create_database(database, DEFAULT_TENANT)\n return DEFAULT_TENANT, database\n\n\nuser_id = \"user_John\"\n\ntenant, database = get_or_create_db_for_user(user_id)\n# replace with chromadb.HttpClient for remote Chroma server\nclient = chromadb.PersistentClient(path=\"multitenant\", tenant=tenant, database=database)\ncollection = client.get_or_create_collection(\"user_collection\")\ncollection.add(\n documents=[\"This is document1\", \"This is document2\"],\n ids=[\"doc1\", \"doc2\"],\n)\n
In the above code we do the following:
- We create or get a database for each user in the
DEFAULT_TENANT
using the chromadb.AdminClient
. - We then create a
PersistentClient
for each user with the tenant
and database
we got from the AdminClient
. - We then get or create a collection and add data to it.
Drawbacks:
- This strategy requires consistent management of tenants and databases and their use in the client application.
"},{"location":"strategies/multi-tenancy/naive-multi-tenancy/#user-per-tenant","title":"User-Per-Tenant","text":"The goal of this strategy is to associate a user with a single tenant thus granting them access to all databases, collections, and documents within the tenant.
import chromadb\nfrom chromadb import DEFAULT_DATABASE\nfrom chromadb import Settings\n\nadminClient = chromadb.AdminClient(Settings(\n chroma_api_impl=\"chromadb.api.segment.SegmentAPI\",\n is_persistent=True,\n persist_directory=\"multitenant\",\n))\n\n\n# For Remote Chroma server:\n# \n# adminClient= chromadb.AdminClient(Settings(\n# chroma_api_impl=\"chromadb.api.fastapi.FastAPI\",\n# chroma_server_host=\"localhost\",\n# chroma_server_http_port=\"8000\",\n# ))\n\ndef get_or_create_tenant_for_user(user_id):\n tenant_id = f\"tenant_user:{user_id}\"\n try:\n adminClient.get_tenant(tenant_id)\n except Exception as e:\n adminClient.create_tenant(tenant_id)\n adminClient.create_database(DEFAULT_DATABASE, tenant_id)\n return tenant_id, DEFAULT_DATABASE\n\n\nuser_id = \"user1\"\n\ntenant, database = get_or_create_tenant_for_user(user_id)\n# replace with chromadb.HttpClient for remote Chroma server\nclient = chromadb.PersistentClient(path=\"multitenant\", tenant=tenant, database=database)\ncollection = client.get_or_create_collection(\"user_collection\")\ncollection.add(\n documents=[\"This is document1\", \"This is document2\"],\n ids=[\"doc1\", \"doc2\"],\n)\n
In the above code we do the following:
- We create or get a tenant for each user with
DEFAULT_DATABASE
using the chromadb.AdminClient
. - We then create a
PersistentClient
for each user with the tenant
and database
we got from the AdminClient
. - We then get or create a collection and add data to it.
Drawbacks:
- This strategy requires consistent management of tenants and databases and their use in the client application.
"}]}
\ No newline at end of file
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Welcome to ChromaDB Cookbook","text":"This is a collection of small guides and recipes to help you get started with ChromaDB.
Latest ChromaDB version: 0.5.0
"},{"location":"#new-and-noteworthy","title":"New and Noteworthy","text":" - \ud83e\udde0 Memory Management - Learn how to manage memory in ChromaDB - \ud83d\udcc5
30-May-2024
- \ud83d\udcd0 Resource Requirements - Recently updated with temporary storage requirements - \ud83d\udcc5
28-May-2024
- \u2049\ufe0fFAQs - Facing an issue, check out our FAQ section for answers. - \ud83d\udcc5
28-May-2024
- \ud83d\udcbe Chroma Storage Layout - Understand how Chroma stores persistent data - \ud83d\udcc5
21-May-2024
- \u2699\ufe0f Chroma Configuration - Learn about all the levers that Chroma offers for configuring the client, server and HNSW indices - \ud83d\udcc5
16-May-2024
- \ud83d\udcbb Systemd Service - Learn how to start Chroma upon system boot - \ud83d\udcc5
15-May-2024
"},{"location":"#getting-started","title":"Getting Started","text":"We suggest you first head to the Concepts section to get familiar with ChromaDB concepts, such as Documents, Metadata, Embeddings, etc.
Once you're comfortable with the concepts, you can jump to the Installation section to install ChromaDB.
Core Topics:
- Filters - Learn to filter data in ChromaDB using metadata and document filters
- Resource Requirements - Understand the resource requirements for running ChromaDB
- \u2728Multi-Tenancy - Learn how to implement multi-tenancy in ChromaDB
"},{"location":"#running-chromadb","title":"Running ChromaDB","text":" - CLI - Running ChromaDB via the CLI
- Docker - Running ChromaDB in Docker
- Docker Compose - Running ChromaDB in Docker Compose
- Kubernetes - Running ChromaDB in Kubernetes (Minikube)
"},{"location":"#integrations","title":"Integrations","text":" - \u2728LangChain - Integrating ChromaDB with LangChain
- \u2728LlamaIndex - Integrating ChromaDB with LlamaIndex
- \u2728Ollama - Integrating ChromaDB with Ollama
"},{"location":"#the-ecosystem","title":"The Ecosystem","text":""},{"location":"#clients","title":"Clients","text":"Below is a list of available clients for ChromaDB.
- Python Client (Official Chroma client)
- JavaScript Client (Official Chroma client)
- Ruby Client (Community maintained)
- Java Client (Community maintained)
- Go Client (Community maintained)
- C# Client (Microsoft maintained)
- Rust Client (Community maintained)
- Elixir Client (Community maintained)
- Dart Client (Community maintained)
- PHP Client (Community maintained)
- PHP (Laravel) Client (Community maintained)
"},{"location":"#user-interfaces","title":"User Interfaces","text":" - VectorAdmin (MintPlex Labs) - An open-source web-based admin interface for vector databases, including ChromaDB
- ChromaDB UI (Community maintained) - A web-based UI for ChromaDB
"},{"location":"#cli-tooling","title":"CLI Tooling","text":" - Chroma CLI (Community maintained) - Early Alpha
- Chroma Data Pipes (Community maintained) - A CLI tool for importing and exporting data from ChromaDB
- Chroma Ops (Community maintained) - A maintenance CLI tool for ChromaDB
"},{"location":"#strategies","title":"Strategies","text":" - Backup - Backing up ChromaDB data
- Batch Imports - Importing data in batches
- Multi-Tenancy - Running multiple ChromaDB instances
- Keyword Search - Searching for keywords in ChromaDB
- Memory Management - Managing memory in ChromaDB
- Time-based Queries - Querying data based on timestamps
- \u2728'
Coming Soon
Testing with Chroma - learn how to test your GenAI apps that include Chroma. - \u2728'
Coming Soon
Monitoring Chroma - learn how to monitor your Chroma instance. - \u2728'
Coming Soon
Building Chroma clients - learn how to build clients for Chroma. - \u2728'
Coming Soon
Creating the perfect Embedding Function (wrapper) - learn the best practices for creating your own embedding function. - \u2728 Multi-User Basic Auth Plugin - learn how to build a multi-user basic authentication plugin for Chroma.
- \u2728 CORS Configuration For JS Browser apps - learn how to configure CORS for Chroma.
- \u2728 Running Chroma with SystemD - learn how to start Chroma upon system boot.
"},{"location":"#get-help","title":"Get Help","text":"Missing something? Let us know by opening an issue, reach out on Discord (look for @taz
).
"},{"location":"contributing/getting-started/","title":"Getting Started with Contributing to Chroma","text":""},{"location":"contributing/getting-started/#overview","title":"Overview","text":"Here are some steps to follow:
- Fork the repository (if you are part of an organization to which you cannot grant permissions, it may be advisable to fork under your own user account; this lets you grant other community members permission to contribute, which is a bit more difficult at the organization level)
- Clone your forked repo locally (git clone ...) under a directory with a descriptive name for the change you want to make e.g.
my_awesome_feature
- Create a branch for your change (git checkout -b my_awesome_feature)
- Make your changes
- Test (see Testing)
- Lint (see Linting)
- Commit your changes (git commit -am 'Added some feature')
- Push to the branch (git push origin my_awesome_feature)
- Create a new Pull Request (PR) from your forked repository to the main Chroma repository
"},{"location":"contributing/getting-started/#testing","title":"Testing","text":"It is generally good to test your changes before submitting a PR.
To run the full test suite:
pip install -r requirements_dev.txt\npytest\n
To run a specific test:
pytest chromadb/tests/test_api.py::test_get_collection\n
If you want to see the output of print statements in the tests, you can run:
pytest -s\n
If you want your pytest to stop on first failure, you can run:
pytest -x\n
"},{"location":"contributing/getting-started/#integration-tests","title":"Integration Tests","text":"You can only run the integration tests by running:
sh bin/integration-test\n
The above will create a Docker container and run the integration tests against it, including the JS client tests.
"},{"location":"contributing/getting-started/#linting","title":"Linting","text":""},{"location":"contributing/useful-shortcuts/","title":"Useful Shortcuts for Contributors","text":""},{"location":"contributing/useful-shortcuts/#git","title":"Git","text":""},{"location":"contributing/useful-shortcuts/#aliases","title":"Aliases","text":""},{"location":"contributing/useful-shortcuts/#create-venv-and-install-dependencies","title":"Create venv and install dependencies","text":"Add the following to your .bashrc
, .zshrc
or .profile
:
alias chroma-init='python -m virtualenv venv && source venv/bin/activate && pip install -r requirements.txt && pip install -r requirements_dev.txt'\n
"},{"location":"core/api/","title":"Chroma API","text":"In this article we will cover the Chroma API in an indepth details.
"},{"location":"core/api/#accessing-the-api","title":"Accessing the API","text":"If you are running a Chroma server you can access its API at - http://<chroma_server_host>:<chroma_server_port>/docs
( e.g. http://localhost:8000/docs
).
"},{"location":"core/api/#api-endpoints","title":"API Endpoints","text":"TBD
"},{"location":"core/api/#generating-clients","title":"Generating Clients","text":"While Chroma ecosystem has client implementations for many languages, it may be the case you want to roll out your own. Below we explain some of the options available to you:
"},{"location":"core/api/#using-openapi-generator","title":"Using OpenAPI Generator","text":"The fastest way to build a client is to use the OpenAPI Generator the API spec.
"},{"location":"core/api/#manually-creating-a-client","title":"Manually Creating a Client","text":"If you more control over things, you can create your own client by using the API spec as guideline.
For your convenience we provide some data structures in various languages to help you get started. The important structures are:
- Client
- Collection
- Embedding
- Document
- ID
- Metadata
- QueryRequest/QueryResponse
- Include
- Where Filter
- WhereDocument Filter
"},{"location":"core/api/#python","title":"Python","text":""},{"location":"core/api/#typescript","title":"Typescript","text":""},{"location":"core/api/#golang","title":"Golang","text":""},{"location":"core/api/#java","title":"Java","text":""},{"location":"core/api/#rust","title":"Rust","text":""},{"location":"core/api/#elixir","title":"Elixir","text":""},{"location":"core/clients/","title":"Chroma Clients","text":"Chroma Settings Object
The below is only a partial list of Chroma configuration options. For the full list, check the code chromadb.config.Settings
or the ChromaDB Configuration page.
"},{"location":"core/clients/#persistent-client","title":"Persistent Client","text":"To create your a local persistent client use the PersistentClient
class. This client will store all data locally in a directory on your machine at the path you specify.
import chromadb\nfrom chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings\n\nclient = chromadb.PersistentClient(\n path=\"test\",\n settings=Settings(),\n tenant=DEFAULT_TENANT,\n database=DEFAULT_DATABASE,\n)\n
Parameters:
path
- parameter must be a local path on the machine where Chroma is running. If the path does not exist, it will be created. The path can be relative or absolute. If the path is not specified, the default is ./chroma
in the current working directory. settings
- Chroma settings object. tenant
- the tenant to use. Default is default_tenant
. database
- the database to use. Default is default_database
.
Positional Parameters
Chroma PersistentClient
parameters are positional, unless keyword arguments are used.
"},{"location":"core/clients/#uses-of-persistent-client","title":"Uses of Persistent Client","text":"The persistent client is useful for:
- Local development: You can use the persistent client to develop locally and test out ChromaDB.
- Embedded applications: You can use the persistent client to embed ChromaDB in your application. For example, if you are building a web application, you can use the persistent client to store data locally on the server.
"},{"location":"core/clients/#http-client","title":"HTTP Client","text":"Chroma also provides HTTP Client, suitable for use in a client-server mode. This client can be used to connect to a remote ChromaDB server.
PythonJavaScript import chromadb\nfrom chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings\n\nclient = chromadb.HttpClient(\n host=\"localhost\",\n port=8000,\n ssl=False,\n headers=None,\n settings=Settings(),\n tenant=DEFAULT_TENANT,\n database=DEFAULT_DATABASE,\n)\n
Parameters:
host
- The host of the remote server. If not specified, the default is localhost
. port
- The port of the remote server. If not specified, the default is 8000
. ssl
- If True
, the client will use HTTPS. If not specified, the default is False
. headers
- (optional): The headers to be sent to the server. This setting can be used to pass additional headers, such as auth headers. settings
- Chroma settings object. tenant
- the tenant to use. Default is default_tenant
. database
- the database to use. Default is default_database
.
Positional Parameters
Chroma PersistentClient parameters are positional, unless keyword arguments are used.
import {ChromaClient} from \"chromadb\";\nconst client = new ChromaClient({\n path: \"http://localhost:8000\",\n auth: {\n provider: \"token\",\n credentials: \"your_token_here\",\n tokenHeaderType: \"AUTHORIZATION\",\n },\n tenant: \"default_tenant\",\n database: \"default_database\",\n});\n
Parameters:
path
- The Chroma endpoint auth
- Chroma authentication object tenant
- the tenant to use. Default is default_tenant
. database
- the database to use. Default is default_database
.
"},{"location":"core/clients/#uses-of-http-client","title":"Uses of HTTP Client","text":"The HTTP client is ideal for when you want to scale your application or move off of local machine storage. It is important to note that there are trade-offs associated with using HTTP client:
- Network latency - The time it takes to send a request to the server and receive a response.
- Serialization and deserialization overhead - The time it takes to convert data to a format that can be sent over the network and then convert it back to its original format.
- Security - The data is sent over the network, so it is important to ensure that the connection is secure (we recommend using both HTTPS and authentication).
- Availability - The server must be available for the client to connect to it.
- Bandwidth usage - The amount of data sent over the network.
- Data privacy and compliance - Storing data on a remote server may require compliance with data protection laws and regulations.
- Difficulty in debugging - Debugging network issues can be more difficult than debugging local issues. The same applies to server-side issues.
"},{"location":"core/clients/#host-parameter-special-cases-python-only","title":"Host parameter special cases (Python-only)","text":"The host
parameter supports a more advanced syntax than just the hostname. You can specify the whole endpoint URL (without the API paths), e.g. https://chromadb.example.com:8000/my_server/path/
. This is useful when you want to use a reverse proxy or load balancer in front of your ChromaDB server.
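To see what such a full-endpoint value encodes, you can break it down with Python's standard urllib (this is just an illustration of the URL's components, not Chroma's internal parsing):

```python
from urllib.parse import urlparse

# The example endpoint from above, decomposed into its parts
url = urlparse("https://chromadb.example.com:8000/my_server/path/")
print(url.scheme, url.hostname, url.port, url.path)
```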
"},{"location":"core/clients/#ephemeral-client","title":"Ephemeral Client","text":"Ephemeral client is a client that does not store any data on disk. It is useful for fast prototyping and testing. To get started with an ephemeral client, use the EphemeralClient
class.
import chromadb\nfrom chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings\n\nclient = chromadb.EphemeralClient(\n settings=Settings(),\n tenant=DEFAULT_TENANT,\n database=DEFAULT_DATABASE,\n)\n
Parameters:
settings
- Chroma settings object. tenant
- the tenant to use. Default is default_tenant
. database
- the database to use. Default is default_database
.
Positional Parameters
Chroma PersistentClient
parameters are positional, unless keyword arguments are used.
"},{"location":"core/clients/#environmental-variable-configured-client","title":"Environmental Variable Configured Client","text":"You can also configure the client using environmental variables. This is useful when you want to configure any of the client configurations listed above via environmental variables.
import chromadb\nfrom chromadb.config import DEFAULT_TENANT, DEFAULT_DATABASE, Settings\n\nclient = chromadb.Client(\n settings=Settings(),\n tenant=DEFAULT_TENANT,\n database=DEFAULT_DATABASE,\n)\n
Parameters:
settings
- Chroma settings object. tenant
- the tenant to use. Default is default_tenant
. database
- the database to use. Default is default_database
.
Positional Parameters
Chroma PersistentClient
parameters are positional, unless keyword arguments are used.
"},{"location":"core/collections/","title":"Collections","text":"Collections are the grouping mechanism for embeddings, documents, and metadata.
"},{"location":"core/collections/#collection-basics","title":"Collection Basics","text":""},{"location":"core/collections/#collection-properties","title":"Collection Properties","text":"Each collection is characterized by the following properties:
name
: The name of the collection. The name can be changed as long as it is unique within the database ( use collection.modify(new_name=\"new_name\")
to change the name of the collection) metadata
: A dictionary of metadata associated with the collection. The metadata is a dictionary of key-value pairs. Keys can be strings, values can be strings, integers, floats, or booleans. Metadata can be changed using collection.modify(new_metadata={\"key\": \"value\"})
(Note: Metadata is always overwritten when modified) embedding_function
: The embedding function used to embed documents in the collection.
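Because collection.modify(new_metadata=...) overwrites the metadata wholesale, preserve existing keys by merging before modifying. A small sketch (the metadata values are illustrative):

```python
# Metadata is overwritten on modify, so merge old and new before calling it
existing = {"owner": "team-a", "hnsw:space": "cosine"}  # e.g. collection.metadata
updated = {**existing, "version": "2"}  # keep old keys, add/override new ones
# Hypothetical usage against a real collection:
# collection.modify(new_metadata=updated)
```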
Defaults:
- Embedding Function - by default if
embedding_function
parameter is not provided at get()
or create_collection()
or get_or_create_collection()
time, Chroma uses chromadb.utils.embedding_functions.DefaultEmbeddingFunction
to embed documents. The default embedding function uses ONNX Runtime with the all-MiniLM-L6-v2
model. - Distance metric - by default Chroma uses the L2 (squared Euclidean) distance metric for newly created collections. You can change it at creation time using the
hnsw:space
metadata key. Possible values are l2
, cosine
, and ip (inner product). - Batch size, defined by the
hnsw:batch_size
metadata key. Default is 100. The batch size defines the size of the in-memory bruteforce index. Once the threshold is reached, vectors are added to the HNSW index and the bruteforce index is cleared. Greater values may improve ingest performance. When updating this value, also consider changing the sync threshold. - Sync threshold, defined by the
hnsw:sync_threshold
metadata key. Default is 1000. The sync threshold defines the limit at which the HNSW index is synced to disk. This limit only applies to newly added vectors.
Keep in Mind
Collection distance metric cannot be changed after the collection is created. To change the distance metric see #cloning-a-collection
Name Restrictions
Collection names in Chroma must adhere to the following restrictions:
(1) contains 3-63 characters (2) starts and ends with an alphanumeric character (3) otherwise contains only alphanumeric characters, underscores or hyphens (-) (4) contains no two consecutive periods (..) (5) is not a valid IPv4 address
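The rules above can be checked client-side before calling Chroma. Below is an illustrative sketch of the documented restrictions (not Chroma's actual validation code; the function name is our own):

```python
import re

def is_valid_collection_name(name: str) -> bool:
    """Pre-check a collection name against the documented restrictions."""
    if not 3 <= len(name) <= 63:
        return False  # rule 1: 3-63 characters
    if not re.fullmatch(r"[a-zA-Z0-9][a-zA-Z0-9_-]*[a-zA-Z0-9]", name):
        return False  # rules 2-3: alphanumeric start/end, only alnum/_/- inside
    if ".." in name:
        return False  # rule 4: no consecutive periods
    if re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", name):
        return False  # rule 5: must not be a valid IPv4 address
    return True
```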
"},{"location":"core/collections/#creating-a-collection","title":"Creating a collection","text":"Official Docs
For more information on the create_collection
or get_or_create_collection
methods, see the official ChromaDB documentation.
Parameters:
Name Description Default Value Type name
Name of the collection to create. Parameter is required N/A String metadata
Metadata associated with the collection. This is an optional parameter None
Dictionary embedding_function
Embedding function to use for the collection. This is an optional parameter chromadb.utils.embedding_functions.DefaultEmbeddingFunction
EmbeddingFunction import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.create_collection(\"test\")\n
Alternatively you can use the get_or_create_collection
method to create a collection if it doesn't exist already.
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.get_or_create_collection(\"test\", metadata={\"key\": \"value\"})\n
Metadata with get_or_create_collection()
If the collection already exists and metadata is passed to the method, the existing metadata will be overwritten.
"},{"location":"core/collections/#deleting-a-collection","title":"Deleting a collection","text":"Official Docs
For more information on the delete_collection
method, see the official ChromaDB documentation.
Parameters:
Name Description Default Value Type name
Name of the collection to delete. Parameter is required N/A String import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\nclient.delete_collection(\"test\")\n
"},{"location":"core/collections/#listing-all-collections","title":"Listing all collections","text":"Official Docs
For more information on the list_collections
method, see the official ChromaDB documentation.
Parameters:
Name Description Default Value Type offset
The starting offset for listing collections. This is an optional parameter None
Positive Integer limit
The number of collections to return. If the remaining collections from offset
are fewer than this number, then fewer collections will be returned. This is an optional parameter None
Positive Integer import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncollections = client.list_collections()\n
"},{"location":"core/collections/#getting-a-collection","title":"Getting a collection","text":"Official Docs
For more information on the get_collection
method, see the official ChromaDB documentation.
Parameters:
Name Description Default Value Type name
Name of the collection to get. Parameter is required N/A String embedding_function
Embedding function to use for the collection. This is an optional parameter chromadb.utils.embedding_functions.DefaultEmbeddingFunction
EmbeddingFunction import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.get_collection(\"test\")\n
"},{"location":"core/collections/#modifying-a-collection","title":"Modifying a collection","text":"Official Docs
For more information on the modify
method, see the official ChromaDB documentation.
Modify method on collection
Note that the modify
method is called on the collection itself, not on the client like the rest of the collection lifecycle methods.
Metadata Overwrite
Metadata is always overwritten when modified. If you want to add a new key-value pair to the metadata, you must first get the existing metadata and then add the new key-value pair to it.
Parameters:
Name Description Default Value Type new_name
The new name of the collection. Parameter is required N/A String metadata
Metadata associated with the collection. This is an optional parameter None
Dictionary Both collection properties (name
and metadata
) can be modified, separately or together.
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.get_collection(\"test\")\ncol.modify(name=\"test2\", metadata={\"key\": \"value\"})\n
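Because modify overwrites the metadata dictionary wholesale, adding a single key means merging it into the existing metadata first. A minimal sketch of the merge step using plain dicts (the keys here are made up for illustration):

```python
existing = {"topic": "news", "lang": "en"}  # e.g. what collection.metadata returns
updated = {**existing, "reviewed": True}    # merge the new key instead of replacing

# The merged dict would then be passed on, e.g. col.modify(metadata=updated)
```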
"},{"location":"core/collections/#counting-collections","title":"Counting Collections","text":"Official Docs
For more information on the count_collections
method, see the official ChromaDB documentation.
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.get_or_create_collection(\"test\") # create a new collection\n\nclient.count_collections()\n
"},{"location":"core/collections/#iterating-over-a-collection","title":"Iterating over a Collection","text":"import chromadb\n\nclient = chromadb.PersistentClient(path=\"my_local_data\") # or HttpClient()\n\ncollection = client.get_or_create_collection(\"local_collection\")\ncollection.add(\n ids=[f\"i\" for i in range(1000)],\n documents=[f\"document {i}\" for i in range(1000)],\n metadatas=[{\"doc_id\": i} for i in range(1000)])\nexisting_count = collection.count()\nbatch_size = 10\nfor i in range(0, existing_count, batch_size):\n batch = collection.get(\n include=[\"metadatas\", \"documents\", \"embeddings\"],\n limit=batch_size,\n offset=i)\n print(batch) # do something with the batch\n
"},{"location":"core/collections/#collection-utilities","title":"Collection Utilities","text":""},{"location":"core/collections/#copying-local-collection-to-remote","title":"Copying Local Collection to Remote","text":"The following example demonstrates how to copy a local collection to a remote ChromaDB server. (it also works in reverse)
import chromadb\n\nclient = chromadb.PersistentClient(path=\"my_local_data\")\nremote_client = chromadb.HttpClient()\n\ncollection = client.get_or_create_collection(\"local_collection\")\ncollection.add(\n ids=[\"1\", \"2\"],\n documents=[\"hello world\", \"hello ChromaDB\"],\n metadatas=[{\"a\": 1}, {\"b\": 2}])\nremote_collection = remote_client.get_or_create_collection(\"remote_collection\",\n metadata=collection.metadata)\nexisting_count = collection.count()\nbatch_size = 10\nfor i in range(0, existing_count, batch_size):\n batch = collection.get(\n include=[\"metadatas\", \"documents\", \"embeddings\"],\n limit=batch_size,\n offset=i)\n remote_collection.add(\n ids=batch[\"ids\"],\n documents=batch[\"documents\"],\n metadatas=batch[\"metadatas\"],\n embeddings=batch[\"embeddings\"])\n
Using ChromaDB Data Pipes
There is a more efficient way to copy data between local and remote collections using ChromaDB Data Pipes package.
pip install chromadb-data-pipes\ncdp export \"file://path/to_local_data/local_collection\" | \\\ncdp import \"http://remote_chromadb:port/remote_collection\" --create\n
"},{"location":"core/collections/#cloning-a-collection","title":"Cloning a collection","text":"Here are some reasons why you might want to clone a collection:
- Change distance function (via metadata -
hnsw:space
) - Change HNSW hyperparameters (
hnsw:M
, hnsw:construction_ef
, hnsw:search_ef
)
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.get_or_create_collection(\"test\") # create a new collection with L2 (default)\n\ncol.add(ids=[f\"{i}\" for i in range(1000)], documents=[f\"document {i}\" for i in range(1000)])\nnewCol = client.get_or_create_collection(\"test1\", metadata={\n \"hnsw:space\": \"cosine\"}) # let's change the distance function to cosine\n\nexisting_count = col.count()\nbatch_size = 10\nfor i in range(0, existing_count, batch_size):\n batch = col.get(include=[\"metadatas\", \"documents\", \"embeddings\"], limit=batch_size, offset=i)\n newCol.add(ids=batch[\"ids\"], documents=batch[\"documents\"], metadatas=batch[\"metadatas\"],\n embeddings=batch[\"embeddings\"])\n\nprint(newCol.count())\nprint(newCol.get(offset=0, limit=10)) # get first 10 documents\n
"},{"location":"core/collections/#changing-the-embedding-function","title":"Changing the embedding function","text":"To change the embedding function of a collection, it must be cloned to a new collection with the desired embedding function.
import os\nimport chromadb\nfrom chromadb.utils.embedding_functions import OpenAIEmbeddingFunction, DefaultEmbeddingFunction\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ndefault_ef = DefaultEmbeddingFunction()\ncol = client.create_collection(\"default_ef_collection\",embedding_function=default_ef)\nopenai_ef = OpenAIEmbeddingFunction(api_key=os.getenv(\"OPENAI_API_KEY\"), model_name=\"text-embedding-3-small\")\ncol.add(ids=[f\"{i}\" for i in range(1000)], documents=[f\"document {i}\" for i in range(1000)])\nnewCol = client.get_or_create_collection(\"openai_ef_collection\", embedding_function=openai_ef)\n\nexisting_count = col.count()\nbatch_size = 10\nfor i in range(0, existing_count, batch_size):\n batch = col.get(include=[\"metadatas\", \"documents\"], limit=batch_size, offset=i)\n newCol.add(ids=batch[\"ids\"], documents=batch[\"documents\"], metadatas=batch[\"metadatas\"])\n# get first 10 documents with their OpenAI embeddings\nprint(newCol.get(offset=0, limit=10,include=[\"metadatas\", \"documents\", \"embeddings\"])) \n
"},{"location":"core/collections/#cloning-a-subset-of-a-collection-with-query","title":"Cloning a subset of a collection with query","text":"The below example demonstrates how to select a slice of an existing collection by using where
and where_document
filters and creating a new collection with the selected slice.
Race Condition
The below example is not atomic: if data changes between the initial selection query (select_ids = col.get(...))
and the subsequent insertion query (batch = col.get(...)
), the new collection may not contain the expected data.
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\") # or HttpClient()\ncol = client.get_or_create_collection(\"test\") # create a new collection with L2 (default)\n\ncol.add(ids=[f\"{i}\" for i in range(1000)], documents=[f\"document {i}\" for i in range(1000)])\nnewCol = client.get_or_create_collection(\"test1\", metadata={\n \"hnsw:space\": \"cosine\", \"hnsw:M\": 32}) # let's change the distance function to cosine and M to 32\nquery_where = {\"metadata_key\": \"value\"}\nquery_where_document = {\"$contains\": \"document\"}\nselect_ids = col.get(where_document=query_where_document, where=query_where, include=[]) # get only IDs\nbatch_size = 10\nfor i in range(0, len(select_ids[\"ids\"]), batch_size):\n batch = col.get(include=[\"metadatas\", \"documents\", \"embeddings\"], limit=batch_size, offset=i, where=query_where,\n where_document=query_where_document)\n newCol.add(ids=batch[\"ids\"], documents=batch[\"documents\"], metadatas=batch[\"metadatas\"],\n embeddings=batch[\"embeddings\"])\n\nprint(newCol.count())\nprint(newCol.get(offset=0, limit=10)) # get first 10 documents\n
"},{"location":"core/collections/#updating-documentrecord-metadata","title":"Updating Document/Record Metadata","text":"In this example we loop through all documents of a collection and strip all metadata fields of leading and trailing whitespace. Change the update_metadata
function to suit your needs.
from chromadb import Settings\nimport chromadb\n\nclient = chromadb.PersistentClient(path=\"test\", settings=Settings(allow_reset=True))\nclient.reset() # reset the database so we can run this script multiple times\ncol = client.get_or_create_collection(\"test\")\ncount = col.count()\n\n\ndef update_metadata(metadata: dict):\n # strip whitespace from string values only; leave ints, floats and booleans as-is\n return {k: v.strip() if isinstance(v, str) else v for k, v in metadata.items()}\n\n\nfor i in range(0, count, 10):\n batch = col.get(include=[\"metadatas\"], limit=10, offset=i)\n col.update(ids=batch[\"ids\"], metadatas=[update_metadata(metadata) for metadata in batch[\"metadatas\"]])\n
"},{"location":"core/collections/#tips-and-tricks","title":"Tips and Tricks","text":""},{"location":"core/collections/#getting-ids-only","title":"Getting IDs Only","text":"The below example demonstrates how to get only the IDs of a collection. This is useful if you need to work with IDs without the need to fetch any additional data. Chroma will accept and empty include
array indicating that no other data than the IDs is returned.
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\")\ncol = client.get_or_create_collection(\"my_collection\")\nids_only_result = col.get(include=[])\nprint(ids_only_result['ids'])\n
"},{"location":"core/concepts/","title":"Concepts","text":""},{"location":"core/concepts/#tenancy-and-db-hierarchies","title":"Tenancy and DB Hierarchies","text":"The following picture illustrates the tenancy and DB hierarchy in Chroma:
Storage
In Chroma single-node, all data about tenancy, databases, collections and documents is stored in a single SQLite database.
"},{"location":"core/concepts/#tenants","title":"Tenants","text":"A tenant is a logical grouping for a set of databases. A tenant is designed to model a single organization or user. A tenant can have multiple databases.
"},{"location":"core/concepts/#databases","title":"Databases","text":"A database is a logical grouping for a set of collections. A database is designed to model a single application or project. A database can have multiple collections.
"},{"location":"core/concepts/#collections","title":"Collections","text":"Collections are the grouping mechanism for embeddings, documents, and metadata.
"},{"location":"core/concepts/#documents","title":"Documents","text":"Chunks of text
Documents in ChromaDB lingo are chunks of text that fit within the embedding model's context window. Unlike other frameworks that use the term \"document\" to mean a file, ChromaDB uses the term \"document\" to mean a chunk of text.
Documents are raw chunks of text that are associated with an embedding. Documents are stored in the database and can be queried for.
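Since a document must fit within the embedding model's context window, larger texts are usually split before being added. A naive word-based splitter for illustration (real pipelines typically use token-aware splitters matched to the embedding model):

```python
def chunk_text(text: str, max_words: int = 200) -> list:
    """Split text into chunks bounded by a maximum word count."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]
```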
"},{"location":"core/concepts/#metadata","title":"Metadata","text":"Metadata is a dictionary of key-value pairs that can be associated with an embedding. Metadata is stored in the database and can be queried for.
Metadata values can be of the following types:
- strings
- integers
- floats (float32)
- booleans
"},{"location":"core/concepts/#embedding-function","title":"Embedding Function","text":"Also referred to as embedding model, embedding functions in ChromaDB are wrappers that expose a consistent interface for generating embedding vectors from documents or text queries.
For a list of supported embedding functions see Chroma's official documentation.
"},{"location":"core/concepts/#distance-function","title":"Distance Function","text":"Distance functions help in calculating the difference (distance) between two embedding vectors. ChromaDB supports the following distance functions:
- Cosine - Useful for text similarity
- Euclidean (L2) - Useful for text similarity, more sensitive to noise than
cosine
- Inner Product (IP) - Recommender systems
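For intuition, the three distances can be written out in a few lines of plain Python. These follow hnswlib's conventions (l2 is squared Euclidean; ip and cosine are reported as 1 minus the similarity) and are an illustrative sketch, not Chroma's internal code:

```python
import math

def l2(a, b):
    # "l2" space: squared Euclidean distance (no square root)
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ip(a, b):
    # "ip" space: 1 - inner (dot) product
    return 1.0 - sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # "cosine" space: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```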
"},{"location":"core/concepts/#embedding-vector","title":"Embedding Vector","text":"A representation of a document in the embedding space in te form of a vector, list of 32-bit floats (or ints).
"},{"location":"core/concepts/#embedding-model","title":"Embedding Model","text":""},{"location":"core/concepts/#document-and-metadata-index","title":"Document and Metadata Index","text":"The document and metadata index is stored in SQLite database.
"},{"location":"core/concepts/#vector-index-hnsw-index","title":"Vector Index (HNSW Index)","text":"Under the hood (ca. v0.4.22) Chroma uses its own fork HNSW lib for indexing and searching vectors.
In a single-node mode, Chroma will create a single HNSW index for each collection. The index is stored in a subdir of your persistent dir, named after the collection id (UUID-based).
The HNSW lib uses fast ANN algo to search the vectors in the index.
"},{"location":"core/configuration/","title":"Configuration","text":"Work in Progress
This page is a work in progress and may not be complete.
"},{"location":"core/configuration/#common-configurations-options","title":"Common Configurations Options","text":""},{"location":"core/configuration/#server-configuration","title":"Server Configuration","text":""},{"location":"core/configuration/#core","title":"Core","text":""},{"location":"core/configuration/#is_persistent","title":"is_persistent
","text":""},{"location":"core/configuration/#persist_directory","title":"persist_directory
","text":""},{"location":"core/configuration/#allow_reset","title":"allow_reset
","text":""},{"location":"core/configuration/#chroma_memory_limit_bytes","title":"chroma_memory_limit_bytes
","text":""},{"location":"core/configuration/#chroma_segment_cache_policy","title":"chroma_segment_cache_policy
","text":""},{"location":"core/configuration/#telemetry-and-observability","title":"Telemetry and Observability","text":""},{"location":"core/configuration/#chroma_otel_collection_endpoint","title":"chroma_otel_collection_endpoint
","text":""},{"location":"core/configuration/#chroma_otel_service_name","title":"chroma_otel_service_name
","text":""},{"location":"core/configuration/#chroma_otel_collection_headers","title":"chroma_otel_collection_headers
","text":""},{"location":"core/configuration/#chroma_otel_granularity","title":"chroma_otel_granularity
","text":""},{"location":"core/configuration/#chroma_product_telemetry_impl","title":"chroma_product_telemetry_impl
","text":""},{"location":"core/configuration/#chroma_telemetry_impl","title":"chroma_telemetry_impl
","text":""},{"location":"core/configuration/#anonymized_telemetry","title":"anonymized_telemetry
","text":""},{"location":"core/configuration/#maintenance","title":"Maintenance","text":""},{"location":"core/configuration/#migrations","title":"migrations
","text":""},{"location":"core/configuration/#migrations_hash_algorithm","title":"migrations_hash_algorithm
","text":""},{"location":"core/configuration/#operations-and-distributed","title":"Operations and Distributed","text":""},{"location":"core/configuration/#chroma_sysdb_impl","title":"chroma_sysdb_impl
","text":""},{"location":"core/configuration/#chroma_producer_impl","title":"chroma_producer_impl
","text":""},{"location":"core/configuration/#chroma_consumer_impl","title":"chroma_consumer_impl
","text":""},{"location":"core/configuration/#chroma_segment_manager_impl","title":"chroma_segment_manager_impl
","text":""},{"location":"core/configuration/#chroma_segment_directory_impl","title":"chroma_segment_directory_impl
","text":""},{"location":"core/configuration/#chroma_memberlist_provider_impl","title":"chroma_memberlist_provider_impl
","text":""},{"location":"core/configuration/#worker_memberlist_name","title":"worker_memberlist_name
","text":""},{"location":"core/configuration/#chroma_coordinator_host","title":"chroma_coordinator_host
","text":""},{"location":"core/configuration/#chroma_server_grpc_port","title":"chroma_server_grpc_port
","text":""},{"location":"core/configuration/#chroma_logservice_host","title":"chroma_logservice_host
","text":""},{"location":"core/configuration/#chroma_logservice_port","title":"chroma_logservice_port
","text":""},{"location":"core/configuration/#chroma_quota_provider_impl","title":"chroma_quota_provider_impl
","text":""},{"location":"core/configuration/#chroma_rate_limiting_provider_impl","title":"chroma_rate_limiting_provider_impl
","text":""},{"location":"core/configuration/#authentication","title":"Authentication","text":""},{"location":"core/configuration/#chroma_auth_token_transport_header","title":"chroma_auth_token_transport_header
","text":""},{"location":"core/configuration/#chroma_client_auth_provider","title":"chroma_client_auth_provider
","text":""},{"location":"core/configuration/#chroma_client_auth_credentials","title":"chroma_client_auth_credentials
","text":""},{"location":"core/configuration/#chroma_server_auth_ignore_paths","title":"chroma_server_auth_ignore_paths
","text":""},{"location":"core/configuration/#chroma_overwrite_singleton_tenant_database_access_from_auth","title":"chroma_overwrite_singleton_tenant_database_access_from_auth
","text":""},{"location":"core/configuration/#chroma_server_authn_provider","title":"chroma_server_authn_provider
","text":""},{"location":"core/configuration/#chroma_server_authn_credentials","title":"chroma_server_authn_credentials
","text":""},{"location":"core/configuration/#chroma_server_authn_credentials_file","title":"chroma_server_authn_credentials_file
","text":""},{"location":"core/configuration/#authorization","title":"Authorization","text":""},{"location":"core/configuration/#chroma_server_authz_provider","title":"chroma_server_authz_provider
","text":""},{"location":"core/configuration/#chroma_server_authz_config","title":"chroma_server_authz_config
","text":""},{"location":"core/configuration/#chroma_server_authz_config_file","title":"chroma_server_authz_config_file
","text":""},{"location":"core/configuration/#client-configuration","title":"Client Configuration","text":""},{"location":"core/configuration/#authentication_1","title":"Authentication","text":""},{"location":"core/configuration/#hnsw-configuration","title":"HNSW Configuration","text":"HNSW is the underlying library for Chroma vector indexing and search. Chroma exposes a number of parameters to configure HNSW for your use case. All HNSW parameters are configured as metadata for a collection.
Changing HNSW parameters
Some HNSW parameters cannot be changed after index creation via the standard method shown below. If you wish to change these parameters, you will need to clone the collection; see an example here.
"},{"location":"core/configuration/#hnswspace","title":"hnsw:space
","text":"Description: Controls the distance metric of the HNSW index. The space cannot be changed after index creation.
Default: l2
Constraints:
- Possible values:
l2
, cosine
, ip
- Parameter cannot be changed after index creation.
"},{"location":"core/configuration/#hnswconstruction_ef","title":"hnsw:construction_ef
","text":"Description: Controls the number of neighbours in the HNSW graph to explore when adding new vectors. The more neighbours HNSW explores the better and more exhaustive the results will be. Increasing the value will also increase memory consumption.
Default: 100
Constraints:
- Values must be positive integers.
- Parameter cannot be changed after index creation.
"},{"location":"core/configuration/#hnswm","title":"hnsw:M
","text":"Description: Controls the maximum number of neighbour connections (M), a newly inserted vector. A higher value results in a mode densely connected graph. The impact on this is slower but more accurate searches with increased memory consumption.
Default: 16
Constraints:
- Values must be positive integers.
- Parameter cannot be changed after index creation.
"},{"location":"core/configuration/#hnswsearch_ef","title":"hnsw:search_ef
","text":"Description: Controls the number of neighbours in the HNSW graph to explore when searching. Increasing this requires more memory for the HNSW algo to explore the nodes during knn search.
Default: 10
Constraints:
- Values must be positive integers.
- Parameter can be changed after index creation.
"},{"location":"core/configuration/#hnswnum_threads","title":"hnsw:num_threads
","text":"Description: Controls how many threads HNSW algo use.
Default: <number of CPU cores>
Constraints:
- Values must be positive integers.
- Parameter can be changed after index creation.
"},{"location":"core/configuration/#hnswresize_factor","title":"hnsw:resize_factor
","text":"Description: Controls the rate of growth of the graph (e.g. how many node capacity will be added) whenever the current graph capacity is reached.
Default: 1.2
Constraints:
- Values must be positive floating point numbers.
- Parameter can be changed after index creation.
"},{"location":"core/configuration/#hnswbatch_size","title":"hnsw:batch_size
","text":"Description: Controls the size of the Bruteforce (in-memory) index. Once this threshold is crossed vectors from BF gets transferred to HNSW index. This value can be changed after index creation. The value must be less than hnsw:sync_threshold
.
Default: 100
Constraints:
- Values must be positive integers.
- Parameter can be changed after index creation.
"},{"location":"core/configuration/#hnswsync_threshold","title":"hnsw:sync_threshold
","text":"Description: Controls the threshold when using HNSW index is written to disk.
Default: 1000
Constraints:
- Values must be positive integers.
- Parameter can be changed after index creation.
"},{"location":"core/configuration/#examples","title":"Examples","text":"Configuring HNSW parameters at creation time
import chromadb\n\nclient = chromadb.HttpClient() # Adjust as per your client\nres = client.create_collection(\"my_collection\", metadata={\n \"hnsw:space\": \"cosine\",\n \"hnsw:construction_ef\": 100,\n \"hnsw:M\": 16,\n \"hnsw:search_ef\": 10,\n \"hnsw:num_threads\": 4,\n \"hnsw:resize_factor\": 1.2,\n \"hnsw:batch_size\": 100,\n \"hnsw:sync_threshold\": 1000,\n})\n
Updating HNSW parameters after creation
import chromadb\n\nclient = chromadb.HttpClient() # Adjust as per your client\nres = client.get_or_create_collection(\"my_collection\", metadata={\n \"hnsw:search_ef\": 200,\n \"hnsw:num_threads\": 8,\n \"hnsw:resize_factor\": 2,\n \"hnsw:batch_size\": 10000,\n \"hnsw:sync_threshold\": 1000000,\n})\n
get_or_create_collection overrides
When using get_or_create_collection()
with metadata
parameter, existing metadata will be overridden with the new values.
"},{"location":"core/document-ids/","title":"Document IDs","text":"Chroma is unopinionated about document IDs and delegates those decisions to the user. This frees users to build semantics around their IDs.
"},{"location":"core/document-ids/#note-on-compound-ids","title":"Note on Compound IDs","text":"While you can choose to use IDs that are composed of multiple sub-IDs (e.g. user_id
+ document_id
), it is important to highlight that Chroma does not support querying by partial ID.
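A common workaround (an application-level pattern, not a Chroma feature) is to duplicate the sub-IDs into metadata so they stay filterable with a where clause, even though the compound ID itself cannot be partially matched. A sketch of the record layout:

```python
def make_record(user_id: str, document_id: str, text: str) -> dict:
    """Build a record with a compound ID plus filterable sub-ID metadata."""
    return {
        "id": f"{user_id}:{document_id}",  # unique compound ID
        "document": text,
        # sub-IDs kept as metadata so where={"user_id": ...} still works
        "metadata": {"user_id": user_id, "document_id": document_id},
    }
```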
"},{"location":"core/document-ids/#common-practices","title":"Common Practices","text":""},{"location":"core/document-ids/#uuids","title":"UUIDs","text":"UUIDs are a common choice for document IDs. They are unique, and can be generated in a distributed fashion. They are also opaque, which means that they do not contain any information about the document itself. This can be a good thing, as it allows you to change the document without changing the ID.
import uuid\nimport chromadb\n\nmy_documents = [\n \"Hello, world!\",\n \"Hello, Chroma!\"\n]\n\nclient = chromadb.Client()\ncollection = client.get_or_create_collection(\"collection\")\ncollection.add(ids=[f\"{uuid.uuid4()}\" for _ in range(len(my_documents))], documents=my_documents)\n
"},{"location":"core/document-ids/#caveats","title":"Caveats","text":"Predictable Ordering
UUIDs, especially v4, are not lexicographically sortable. In its current versions (0.4.x-0.5.0) Chroma orders responses of get()
by the ID of the documents. Therefore, if you need predictable ordering, you may want to consider a different ID strategy.
Storage Overhead
UUIDs are 128 bits long, which can be a lot of overhead if you have a large number of documents. If you are concerned about storage overhead, you may want to consider a different ID strategy.
"},{"location":"core/document-ids/#ulids","title":"ULIDs","text":"ULIDs are a variant of UUIDs that are lexicographically sortable. They are also 128 bits long, like UUIDs, but they are encoded in a way that makes them sortable. This can be useful if you need predictable ordering of your documents.
ULIDs also have a shorter canonical text representation than UUIDs (26 vs 36 characters), which can save you some storage space. They are also opaque, like UUIDs, which means that they do not contain any information about the document itself.
Install the py-ulid
package to generate ULIDs.
pip install py-ulid\n
from ulid import ULID\nimport chromadb\n\nmy_documents = [\n \"Hello, world!\",\n \"Hello, Chroma!\"\n]\n_ulid = ULID()\n\nclient = chromadb.Client()\n\ncollection = client.get_or_create_collection(\"name\")\n\ncollection.add(ids=[f\"{_ulid.generate()}\" for _ in range(len(my_documents))], documents=my_documents)\n
"},{"location":"core/document-ids/#nanoids","title":"NanoIDs","text":"Coming soon.
"},{"location":"core/document-ids/#hashes","title":"Hashes","text":"Hashes are another common choice for document IDs. They are unique, and can be generated in a distributed fashion. They are also opaque, which means that they do not contain any information about the document itself. This can be a good thing, as it allows you to change the document without changing the ID.
import hashlib\nimport os\nimport chromadb\n\n\ndef generate_sha256_hash() -> str:\n # Generate a random number\n random_data = os.urandom(16)\n # Create a SHA256 hash object\n sha256_hash = hashlib.sha256()\n # Update the hash object with the random data\n sha256_hash.update(random_data)\n # Return the hexadecimal representation of the hash\n return sha256_hash.hexdigest()\n\n\nmy_documents = [\n \"Hello, world!\",\n \"Hello, Chroma!\"\n]\n\nclient = chromadb.Client()\ncollection = client.get_or_create_collection(\"collection\")\ncollection.add(ids=[generate_sha256_hash() for _ in range(len(my_documents))], documents=my_documents)\n
It is also possible to use the document itself as the basis for the hash. The downside is that when the document changes, the hash no longer corresponds to the text, so you may need to update the ID.
import hashlib\nimport chromadb\n\n\ndef generate_sha256_hash_from_text(text) -> str:\n # Create a SHA256 hash object\n sha256_hash = hashlib.sha256()\n # Update the hash object with the text encoded to bytes\n sha256_hash.update(text.encode('utf-8'))\n # Return the hexadecimal representation of the hash\n return sha256_hash.hexdigest()\n\n\nmy_documents = [\n \"Hello, world!\",\n \"Hello, Chroma!\"\n]\n\nclient = chromadb.Client()\ncollection = client.get_or_create_collection(\"collection\")\ncollection.add(ids=[generate_sha256_hash_from_text(my_documents[i]) for i in range(len(my_documents))],\n documents=my_documents)\n
"},{"location":"core/document-ids/#semantic-strategies","title":"Semantic Strategies","text":"In this section we'll explore a few different use cases for building semantics around document IDs.
- URL Slugs - if your docs are web pages with permalinks (e.g. blog posts), you can use the URL slug as the document ID.
- File Paths - if your docs are files on disk, you can use the file path as the document ID.
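For the URL-slug strategy, a minimal slug generator might look like this (illustrative only; production code usually also handles unicode and collisions):

```python
import re

def slugify(title: str) -> str:
    """Lowercase a title and collapse runs of non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
```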
"},{"location":"core/filters/","title":"Filters","text":"Chroma provides two types of filters:
- Metadata - filter documents based on metadata using
where
clause in either Collection.query()
or Collection.get()
- Document - filter documents based on document content using
where_document
in Collection.query()
or Collection.get().
Those familiar with MongoDB queries will find Chroma's filters very similar.
"},{"location":"core/filters/#metadata-filters","title":"Metadata Filters","text":""},{"location":"core/filters/#equality","title":"Equality","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": \"is_equal_to_this\"}\n)\n
Alternative syntax:
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$eq\": \"is_equal_to_this\"}}\n)\n
"},{"location":"core/filters/#inequality","title":"Inequality","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$ne\": \"is_not_equal_to_this\"}}\n)\n
"},{"location":"core/filters/#greater-than","title":"Greater Than","text":"Greater Than
The $gt
operator is only supported for numerical values - int or float values.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$gt\": 5}}\n)\n
"},{"location":"core/filters/#greater-than-or-equal","title":"Greater Than or Equal","text":"Greater Than or Equal
The $gte
operator is only supported for numerical values - int or float values.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$gte\": 5.1}}\n)\n
"},{"location":"core/filters/#less-than","title":"Less Than","text":"Less Than
The $lt
operator is only supported for numerical values - int or float values.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$lt\": 5}}\n)\n
"},{"location":"core/filters/#less-than-or-equal","title":"Less Than or Equal","text":"Less Than or Equal
The $lte
operator is only supported for numerical values - int or float values.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$lte\": 5.1}}\n)\n
"},{"location":"core/filters/#in","title":"In","text":"In works on all data types - string, int, float, and bool.
In
The $in
operator is only supported for lists of values of the same type.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$in\": [\"value1\", \"value2\"]}}\n)\n
"},{"location":"core/filters/#not-in","title":"Not In","text":"Not In works on all data types - string, int, float, and bool.
Not In
The $nin
operator is only supported for lists of values of the same type.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"metadata_field\": {\"$nin\": [\"value1\", \"value2\"]}}\n)\n
"},{"location":"core/filters/#logical-operator-and","title":"Logical Operator: And","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"$and\": [{\"metadata_field1\": \"value1\"}, {\"metadata_field2\": \"value2\"}]}\n)\n
Logical Operators can be nested.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"$and\": [{\"metadata_field1\": \"value1\"}, {\"$or\": [{\"metadata_field2\": \"value2\"}, {\"metadata_field3\": \"value3\"}]}]}\n)\n
"},{"location":"core/filters/#logical-operator-or","title":"Logical Operator: Or","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where={\"$or\": [{\"metadata_field1\": \"value1\"}, {\"metadata_field2\": \"value2\"}]}\n)\n
"},{"location":"core/filters/#document-filters","title":"Document Filters","text":""},{"location":"core/filters/#contains","title":"Contains","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where_document={\"$contains\": \"search_string\"}\n)\n
"},{"location":"core/filters/#not-contains","title":"Not Contains","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where_document={\"$not_contains\": \"search_string\"}\n)\n
"},{"location":"core/filters/#logical-operator-and_1","title":"Logical Operator: And","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where_document={\"$and\": [{\"$contains\": \"search_string1\"}, {\"$contains\": \"search_string2\"}]}\n)\n
Logical Operators can be nested.
results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where_document={\"$and\": [{\"$contains\": \"search_string1\"}, {\"$or\": [{\"$not_contains\": \"search_string2\"}, {\"$not_contains\": \"search_string3\"}]}]}\n)\n
"},{"location":"core/filters/#logical-operator-or_1","title":"Logical Operator: Or","text":"results = collection.query(\n query_texts=[\"This is a query document\"],\n n_results=2,\n where_document={\"$or\": [{\"$not_contains\": \"search_string1\"}, {\"$not_contains\": \"search_string2\"}]}\n)\n
"},{"location":"core/install/","title":"Installation","text":""},{"location":"core/install/#core-chromadb","title":"Core ChromaDB","text":"To install the latest version of chromadb, run:
pip install chromadb\n
To install a specific version of chromadb, run:
pip install chromadb==<x.y.z>\n
Releases
You can find Chroma releases on PyPI here.
"},{"location":"core/install/#chromadb-python-client","title":"ChromaDB Python Client","text":"To install the latest version of the ChromaDB Python client, run:
pip install chromadb-client\n
Releases
You can find Chroma releases on PyPI here.
"},{"location":"core/resources/","title":"Resource Requirements","text":"Chroma makes use of the following compute resources:
- RAM - Chroma stores the vector HNSW index in-memory. This allows it to perform blazing fast semantic searches.
- Disk - Chroma persists all data to disk. This includes the vector HNSW index, metadata index, system DB, and the write-ahead log (WAL).
- CPU - Chroma uses CPU for indexing and searching vectors.
Here are some formulas and heuristics to help you estimate the resources you need to run Chroma.
"},{"location":"core/resources/#ram","title":"RAM","text":"Once you select your embedding model, use the following formula for calculating RAM storage requirements for the vector HNSW index:
number of vectors
* dimensionality of vectors
* 4 bytes
= RAM required
number of vectors
- This is the number of vectors you plan to index. These are the documents in your Chroma collection (or chunks if you use LlamaIndex or LangChain terminology). dimensionality of vectors
- This is the dimensionality of the vectors output by your embedding model. For example, if you use the sentence-transformers/paraphrase-MiniLM-L6-v2
model, the dimensionality of the vectors is 384. 4 bytes
- This is the size of each component of a vector. Chroma relies on the HNSW lib implementation, which uses 32-bit floats.
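The formula above can be sketched as a quick back-of-the-envelope calculation (the vector count and dimensionality below are illustrative):

```python
def hnsw_ram_bytes(num_vectors: int, dims: int) -> int:
    # number of vectors * dimensionality of vectors * 4 bytes per float32 component
    return num_vectors * dims * 4


# e.g. 1M documents embedded with a 384-dimensional model
required = hnsw_ram_bytes(1_000_000, 384)
print(f"{required / 2**30:.2f} GiB")  # 1.43 GiB
```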
"},{"location":"core/resources/#disk","title":"Disk","text":"Disk storage requirements mainly depend on the metadata you store and the number of vectors you index. As a heuristic, allocate at least 2-4x the RAM required for the vector HNSW index.
WAL Cleanup
Chroma does not currently clean up the WAL, so your sqlite3 metadata file will grow over time. In the meantime, feel free to use the available tooling to periodically clean your WAL - see chromadb-ops for more information.
"},{"location":"core/resources/#temporary-disk-space","title":"Temporary Disk Space","text":"Chroma uses temporary storage for its SQLite3 related operations - sorting and buffering large queries. By default, SQLite3 uses /tmp
for temporary storage.
There are two guidelines to follow:
- Have enough space if your application intends to make large queries or has multiple concurrent queries.
- Ensure temporary storage is on a fast disk to avoid performance bottlenecks.
You can configure the location of sqlite temp files with the SQLITE_TMPDIR
environment variable.
SQLite3 Temporary Storage
You can read more about SQLite3 temporary storage in the SQLite3 documentation.
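A minimal sketch of redirecting SQLite3 temp files from Python (the path is illustrative; the variable must be set before SQLite creates any temp files in the process):

```python
import os
import sqlite3

# Point SQLite temp files at a fast disk with enough free space (illustrative path).
os.environ["SQLITE_TMPDIR"] = "/mnt/fast-disk/sqlite-tmp"

# Large sorts/buffers that spill to disk on subsequent connections
# will use SQLITE_TMPDIR instead of /tmp.
conn = sqlite3.connect(":memory:")
conn.close()
```

When running the Chroma server in a container, the same effect can be achieved by passing the environment variable to the container (e.g. `docker run -e SQLITE_TMPDIR=... ...`).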
"},{"location":"core/resources/#cpu","title":"CPU","text":"There are no hard requirements for the CPU, but it is recommended to use as much CPU as you can spare as it directly relates to index and search speeds.
"},{"location":"core/storage-layout/","title":"Storage Layout","text":"When configured as PersistentClient
or running as a server, Chroma persists its data under the provided persist_directory
.
For PersistentClient
the persistent directory is usually passed as path
parameter when creating the client; if not passed, the default is ./chroma/
(relative path to where the client is started from).
For the server, the persistent directory can be passed as environment variable PERSIST_DIRECTORY
or as a command line argument --path
. If not passed, the default is ./chroma/
(relative path to where the server is started).
Once the client or the server is started a basic directory structure is created under the persistent directory containing the chroma.sqlite3
file. Once collections are created and data is added, subdirectories are created for each collection. The subdirectories are UUID-named and refer to the vector segment.
"},{"location":"core/storage-layout/#directory-structure","title":"Directory Structure","text":"The following diagram represents a typical Chroma persistent directory structure:
"},{"location":"core/storage-layout/#chromasqlite3","title":"chroma.sqlite3
","text":"Note about the tables
While we try to keep this description as accurate as possible, the Chroma data layout inside the sqlite3
database is subject to change. The following description is valid as of version 0.5.0
. The tables are also not representative of the distributed architecture of Chroma.
The chroma.sqlite3
is typical for Chroma single-node. The file contains the following four types of data:
- Sysdb - Chroma system database, responsible for storing tenant, database, collection and segment information.
- WAL - the write-ahead log, which is used to ensure durability of the data.
- Metadata Segment - all metadata and documents stored in Chroma.
- Migrations - the database schema migration scripts.
"},{"location":"core/storage-layout/#sysdb","title":"Sysdb","text":"The system database comprises the following tables:
- tenants - contains all the tenants in the system. Usually gets initialized with a single tenant -
default_tenant
. - databases - contains all the databases per tenant. Usually gets initialized with a single database -
default_database
related to the default_tenant
. - collections - contains all the collections per database.
- collection_metadata - contains all the metadata associated with each collection. The metadata for a collection consists of any user-specified key-value pairs and the
hnsw:*
keys that store the HNSW index parameters. - segments - contains all the segments per collection. Each collection gets two segments -
metadata
and vector
. - segment_metadata - contains all the metadata associated with each segment. This table contains
hnsw:*
keys that store the HNSW index parameters for the vector segment.
"},{"location":"core/storage-layout/#wal","title":"WAL","text":"The write-ahead log is a table that stores all the changes made to the database. It is used to ensure that the data is durable and can be recovered in case of a crash. The WAL is composed of the following tables:
- embeddings_queue - contains all data ingested into Chroma. Each row of the table represents an operation upon a collection (add, update, delete, upsert). The row contains all the necessary information (embedding, document, metadata and associated relationship to a collection) to replay the operation and ensure data consistency.
- max_seq_id - maintains the maximum sequence ID of the metadata segment that is used as a WAL replay starting point for the metadata segment.
"},{"location":"core/storage-layout/#metadata-segment","title":"Metadata Segment","text":"The metadata segment is a table that stores all the metadata and documents stored in Chroma. The metadata segment is composed of the following tables:
- embeddings - maps each embedding's user-supplied ID to its segment and its WAL sequence ID.
- embedding_metadata - contains all the metadata associated with each document and its embedding.
- embedding_fulltext_search - document full-text search index. This is a virtual table; upon inspection of the sqlite3 database it will appear as a series of tables whose names start with
embedding_fulltext_search_
. This is an FTS5 table and is used for full-text search queries on documents stored in Chroma (via where_document
filter in query
and get
methods).
"},{"location":"core/storage-layout/#migrations","title":"Migrations","text":"The migrations table contains all schema migrations applied to the chroma.sqlite3
database. The table is used to track the schema version and ensure that the database schema is up-to-date.
"},{"location":"core/storage-layout/#collection-subdirectories","title":"Collection Subdirectories","text":"TBD
"},{"location":"core/system_constraints/","title":"Chroma System Constraints","text":"This section contains common constraints of Chroma.
- Chroma is thread-safe
- Chroma is not process-safe
- Multiple Chroma Clients (Ephemeral, Persistent, Http) can be created from one or more threads within the same process
- A collection's name is unique within a Tenant and DB
- A collection's dimensions cannot change after creation => you cannot change the embedding function after creation
- Chroma operates in two modes - standalone (PersistentClient, EphemeralClient) and client/server (HttpClient with ChromaServer)
- The distance function cannot be changed after collection creation.
"},{"location":"core/system_constraints/#operational-modes","title":"Operational Modes","text":"Chroma can be operated in two modes:
- Standalone - This allows embedding Chroma in your python application without the need to communicate with external processes.
- Client/Server - This allows embedding Chroma in your python application as a thin-client with minimal dependencies and communicating with it via REST API. This is useful when you want to use Chroma from multiple processes or even multiple machines.
Depending on the mode you choose, you will need to consider the following component responsibilities:
- Standalone:
- Clients (Persistent, Ephemeral) - Responsible for persistence, embedding, querying
- Client/Server:
- Clients (HttpClient) - Responsible for embedding, communication with Chroma server via REST API
- Server - Responsible for persistence and querying
"},{"location":"core/tenants-and-databases/","title":"Tenants and Databases","text":"Tenants and Databases are two grouping abstractions that provide a means to organize and manage data in Chroma.
"},{"location":"core/tenants-and-databases/#tenants","title":"Tenants","text":"A tenant is a logical grouping of databases.
"},{"location":"core/tenants-and-databases/#databases","title":"Databases","text":"A database is a logical grouping of collections.
"},{"location":"core/advanced/wal-pruning/","title":"Write-ahead Log (WAL) Pruning","text":"As of this writing (v0.4.22), Chroma stores its WAL forever, which means the WAL grows indefinitely. This is obviously not ideal. Here we provide a small script and a few steps describing how to prune your WAL and keep it at a reasonable size. Pruning the WAL is particularly important if you write to Chroma frequently (e.g. documents are added, updated, or deleted often).
"},{"location":"core/advanced/wal-pruning/#tooling","title":"Tooling","text":"We have built tooling to provide users with a way to prune their WAL - chroma-ops.
To prune your WAL you can run the following command:
pip install chroma-ops\nchops cleanup-wal /path/to/persist_dir\n
\u26a0\ufe0f IMPORTANT: It is always a good idea to back up your data before you prune the WAL.
"},{"location":"core/advanced/wal-pruning/#manual","title":"Manual","text":"Steps:
Stop Chroma
It is vitally important that you stop Chroma before you prune the WAL. If you don't stop Chroma, you risk corrupting your data.
- \u26a0\ufe0f Stop Chroma
- \ud83d\udcbe Create a backup of your
chroma.sqlite3
file in your persistent dir - \ud83d\udc40 Check your current
chroma.sqlite3
size (e.g. ls -lh /path/to/persist/dir/chroma.sqlite3
) - \ud83d\udda5\ufe0f Run the script below
- \ud83d\udd2d Check your current
chroma.sqlite3
size again to verify that the WAL has been pruned - \ud83d\ude80 Start Chroma
Script (store it in a file like wal_clean.py
)
wal_clean.py#!/usr/bin/env python3\n# Call the script: python wal_clean.py ./chroma-test-compact\nimport os\nimport sqlite3\nfrom typing import cast, Optional, Dict\nimport argparse\nimport pickle\n\n\nclass PersistentData:\n \"\"\"Stores the data and metadata needed for a PersistentLocalHnswSegment\"\"\"\n\n dimensionality: Optional[int]\n total_elements_added: int\n max_seq_id: int\n\n id_to_label: Dict[str, int]\n label_to_id: Dict[int, str]\n id_to_seq_id: Dict[str, int]\n\n\ndef load_from_file(filename: str) -> \"PersistentData\":\n \"\"\"Load persistent data from a file\"\"\"\n with open(filename, \"rb\") as f:\n ret = cast(PersistentData, pickle.load(f))\n return ret\n\n\ndef clean_wal(chroma_persist_dir: str):\n if not os.path.exists(chroma_persist_dir):\n raise Exception(f\"Persist {chroma_persist_dir} dir does not exist\")\n if not os.path.exists(f'{chroma_persist_dir}/chroma.sqlite3'):\n raise Exception(\n f\"SQL file not found in persist dir {chroma_persist_dir}/chroma.sqlite3\")\n # Connect to SQLite database\n conn = sqlite3.connect(f'{chroma_persist_dir}/chroma.sqlite3')\n\n # Create a cursor object\n cursor = conn.cursor()\n\n # Select the vector segments whose WAL entries can be pruned\n query = \"SELECT id,topic FROM segments where scope='VECTOR'\"\n\n # Execute the query\n cursor.execute(query)\n\n # Fetch the results\n results = cursor.fetchall()\n wal_cleanup_queries = []\n for row in results:\n metadata = load_from_file(\n f'{chroma_persist_dir}/{row[0]}/index_metadata.pickle')\n wal_cleanup_queries.append(\n f\"DELETE FROM embeddings_queue WHERE seq_id < {metadata.max_seq_id} AND topic='{row[1]}';\")\n\n cursor.executescript('\\n'.join(wal_cleanup_queries))\n # Close the cursor and connection\n cursor.close()\n conn.close()\n\n\nif __name__ == \"__main__\":\n parser = argparse.ArgumentParser()\n parser.add_argument('persist_dir', type=str)\n arg = parser.parse_args()\n print(arg.persist_dir)\n clean_wal(arg.persist_dir)\n
Run the script
# Let's create a backup\ntar -czvf /path/to/persist/dir/chroma.sqlite3.backup.tar.gz /path/to/persist/dir/chroma.sqlite3\nlsof /path/to/persist/dir/chroma.sqlite3 # make sure that no process is using the file\npython wal_clean.py /path/to/persist/dir/\n# start chroma\n
"},{"location":"core/advanced/wal/","title":"Write-ahead Log (WAL)","text":"Chroma uses a write-ahead log (WAL) to ensure data durability even when things go wrong (e.g. a server crash). The purpose of the WAL is to ensure that each user request (aka transaction) is safely stored before acknowledging back to the user. Immediately after being written to the WAL, the data is also written to the index. This enables Chroma to serve as a real-time search engine, where data is available for querying immediately after it is written to the WAL.
Below is a diagram that illustrates the WAL in ChromaDB (ca. v0.4.22):
"},{"location":"core/advanced/wal/#vector-indices-overview","title":"Vector Indices Overview","text":"The diagram below illustrates how data gets transferred from the WAL to the binary vector indices (Bruteforce and HNSW):
For each collection Chroma maintains two binary indices - Bruteforce (in-memory, fast) and HNSW lib (persisted to disk; slow when adding new vectors and persisting). The BF index serves as a buffer, holding the portion of the WAL that has not yet been committed to the persisted HNSW index. The HNSW index itself has a max sequence ID counter, stored in a metadata file, which indicates from which position in the WAL the buffering into the BF index should begin. This buffering usually happens when the collection is first accessed.
There are two transfer points (in the diagram, sync threshold) for BF to HNSW:
hnsw:batch_size
- forces the BF vectors to be added to HNSW in-memory (this is a slow operation) -
hnsw:sync_threshold
- forces Chroma to dump the HNSW in-memory index to disk (this is a slow operation)
Both of the above sync points are controlled via collection-level metadata with the respective named params. Customarily, hnsw:sync_threshold
> hnsw:batch_size
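As a sketch, both parameters are set via collection metadata; the values below are illustrative and simply keep hnsw:sync_threshold above hnsw:batch_size:

```python
# Illustrative HNSW sync parameters (collection-level metadata keys).
hnsw_params = {
    "hnsw:batch_size": 100,       # WAL/BF -> in-memory HNSW transfer point
    "hnsw:sync_threshold": 1000,  # in-memory HNSW -> on-disk persistence point
}

# Customary invariant: persist to disk less frequently than you merge in memory.
assert hnsw_params["hnsw:sync_threshold"] > hnsw_params["hnsw:batch_size"]

# Passed at collection creation, e.g.:
# collection = client.create_collection("my_collection", metadata=hnsw_params)
```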
"},{"location":"core/advanced/wal/#metadata-indices-overview","title":"Metadata Indices Overview","text":"The following diagram illustrates how data gets transferred from the WAL to the metadata index:
"},{"location":"core/advanced/wal/#further-reading","title":"Further Reading","text":"For the DevOps minded folks we have a few more resources:
- WAL Pruning - Clean up your WAL
"},{"location":"ecosystem/clients/","title":"Chroma Ecosystem Clients","text":""},{"location":"ecosystem/clients/#python","title":"Python","text":"Maintainer Chroma Core team Repo https://github.com/chroma-core/chroma Status \u2705 Stable Version 0.4.25.dev0
(PyPi Link) Docs https://docs.trychroma.com/api Compatibility Python: 3.7+
, Chroma API Version: 0.4.15+
Feature Support:
Feature Supported Create Tenant \u2705 Get Tenant \u2705 Create DB \u2705 Get DB \u2705 Create Collection \u2705 Get Collection \u2705 List Collection \u2705 Count Collection \u2705 Delete Collection \u2705 Add Documents \u2705 Delete Documents \u2705 Update Documents \u2705 Query Documents \u2705 Get Document \u2705 Count Documents \u2705 Auth - Basic \u2705 Auth - Token \u2705 Reset \u2705 Embedding Function Support:
Embedding Function Supported OpenAI \u2705 Sentence Transformers \u2705 HuggingFace Inference API \u2705 Cohere \u2705 Google Vertex AI \u2705 Google Generative AI (Gemini) \u2705 OpenCLIP (Multi-modal) \u2705 Embedding Functions
The list above is not exhaustive. Check official docs for up-to-date information.
"},{"location":"ecosystem/clients/#javascript","title":"JavaScript","text":"Maintainer Chroma Core team Repo https://github.com/chroma-core/chroma Status \u2705 Stable Version 1.8.1
(NPM Link) Docs https://docs.trychroma.com/api Compatibility Python: 3.7+
, Chroma API Version: TBD
Feature Support:
Feature Supported Create Tenant \u2705 Get Tenant \u2705 Create DB \u2705 Get DB \u2705 Create Collection \u2705 Get Collection \u2705 List Collection \u2705 Count Collection \u2705 Delete Collection \u2705 Add Documents \u2705 Delete Documents \u2705 Update Documents \u2705 Query Documents \u2705 Get Document \u2705 Count Documents \u2705 Auth - Basic \u2705 Auth - Token \u2705 Reset \u2705 Embedding Function Support:
Embedding Function Supported OpenAI \u2705 Sentence Transformers \u2705 HuggingFace Inference API \u2705 Cohere \u2705 Google Vertex AI \u2705 Google Generative AI (Gemini) \u2705 OpenCLIP (Multi-modal) \u2705 Embedding Functions
The list above is not exhaustive. Check official docs for up-to-date information.
"},{"location":"ecosystem/clients/#ruby-client","title":"Ruby Client","text":"https://github.com/mariochavez/chroma
"},{"location":"ecosystem/clients/#java-client","title":"Java Client","text":"https://github.com/amikos-tech/chromadb-java-client
"},{"location":"ecosystem/clients/#go-client","title":"Go Client","text":"https://github.com/amikos-tech/chroma-go
"},{"location":"ecosystem/clients/#c-client","title":"C# Client","text":"https://github.com/microsoft/semantic-kernel/tree/main/dotnet/src/Connectors/Connectors.Memory.Chroma
"},{"location":"ecosystem/clients/#rust-client","title":"Rust Client","text":"https://crates.io/crates/chromadb
"},{"location":"ecosystem/clients/#elixir-client","title":"Elixir Client","text":"https://hex.pm/packages/chroma/
"},{"location":"ecosystem/clients/#dart-client","title":"Dart Client","text":"https://pub.dev/packages/chromadb
"},{"location":"ecosystem/clients/#php-client","title":"PHP Client","text":"https://github.com/CodeWithKyrian/chromadb-php
"},{"location":"ecosystem/clients/#php-laravel-client","title":"PHP (Laravel) Client","text":"https://github.com/helgeSverre/chromadb
"},{"location":"embeddings/bring-your-own-embeddings/","title":"Creating your own embedding function","text":"from chromadb.api.types import (\n Documents,\n EmbeddingFunction,\n Embeddings\n)\n\n\nclass MyCustomEmbeddingFunction(EmbeddingFunction[Documents]):\n def __init__(\n self,\n my_ef_param: str\n ):\n \"\"\"Initialize the embedding function.\"\"\"\n\n def __call__(self, input: Documents) -> Embeddings:\n \"\"\"Embed the input documents.\"\"\"\n return self._my_ef(input)\n
Now let's break the above down.
First you create a class that inherits from EmbeddingFunction[Documents]
. The Documents
type is a list of Document
values; in Chroma, each Document
is simply a string containing the text of the document. Chroma also supports multi-modal embedding functions.
"},{"location":"embeddings/bring-your-own-embeddings/#example-implementation","title":"Example Implementation","text":"Below is an implementation of an embedding function that works with transformers
models.
Note
This example requires the transformers
and torch
python packages. You can install them with pip install transformers torch
.
By default, all transformers
models on HF are also supported by the sentence-transformers
package, for which Chroma provides out-of-the-box support.
import importlib\nfrom typing import Optional, cast\n\nimport numpy as np\nimport numpy.typing as npt\nfrom chromadb.api.types import EmbeddingFunction, Documents, Embeddings\n\n\nclass TransformerEmbeddingFunction(EmbeddingFunction[Documents]):\n def __init__(\n self,\n model_name: str = \"dbmdz/bert-base-turkish-cased\",\n cache_dir: Optional[str] = None,\n ):\n try:\n from transformers import AutoModel, AutoTokenizer\n\n self._torch = importlib.import_module(\"torch\")\n self._tokenizer = AutoTokenizer.from_pretrained(model_name)\n self._model = AutoModel.from_pretrained(model_name, cache_dir=cache_dir)\n except ImportError:\n raise ValueError(\n \"The transformers and/or pytorch python package is not installed. Please install it with \"\n \"`pip install transformers` or `pip install torch`\"\n )\n\n @staticmethod\n def _normalize(vector: npt.NDArray) -> npt.NDArray:\n \"\"\"Normalizes a vector to unit length using L2 norm.\"\"\"\n norm = np.linalg.norm(vector)\n if norm == 0:\n return vector\n return vector / norm\n\n def __call__(self, input: Documents) -> Embeddings:\n inputs = self._tokenizer(\n input, padding=True, truncation=True, return_tensors=\"pt\"\n )\n with self._torch.no_grad():\n outputs = self._model(**inputs)\n embeddings = outputs.last_hidden_state.mean(dim=1) # mean pooling\n return [e.tolist() for e in self._normalize(embeddings)]\n
"},{"location":"embeddings/cross-encoders/","title":"Cross-Encoders Reranking","text":"Work in Progress
This page is a work in progress and may not be complete.
For now, this is just a tiny snippet showing how to use a cross-encoder to rerank results returned from Chroma. Soon we will provide a more detailed guide on the usefulness of cross-encoders/rerankers.
"},{"location":"embeddings/cross-encoders/#hugging-face-cross-encoders","title":"Hugging Face Cross Encoders","text":"from sentence_transformers import CrossEncoder\nimport numpy as np\nimport chromadb\nclient = chromadb.Client()\ncollection = client.get_or_create_collection(\"my_collection\")\n# add some documents \ncollection.add(ids=[\"doc1\", \"doc2\", \"doc3\"], documents=[\"Hello, world!\", \"Hello, Chroma!\", \"Hello, Universe!\"])\n# query the collection\nquery = \"Hello, world!\"\nresults = collection.query(query_texts=[query], n_results=3)\n\n\n\nmodel = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2', max_length=512)\n# rerank the results with original query and documents returned from Chroma\nscores = model.predict([(query, doc) for doc in results[\"documents\"][0]])\n# get the highest scoring document\nprint(results[\"documents\"][0][np.argmax(scores)])\n
"},{"location":"embeddings/embedding-models/","title":"Embedding Models","text":"Work in Progress
This page is a work in progress.
Embedding Models are your best friends in the world of Chroma, and vector databases in general. They take something you understand, in the form of text, images, audio, etc., and turn it into a list of numbers (embeddings) that a machine learning model can work with, making your documents interpretable by the model.
The goal of this page is to arm you with enough knowledge to make an informed decision about which embedding model to choose for your use case.
The importance of a model
GenAI moves fast, so we recommend not over-relying on any particular model. When creating your solution, build the necessary abstractions and tests so you can quickly experiment and swap things out (don't overdo the abstraction, though).
"},{"location":"embeddings/embedding-models/#characteristics-of-an-embedding-model","title":"Characteristics of an Embedding Model","text":" - Modality - the type of data each model is designed to work with. For example, text, images, audio, video. Note: Some models can work with multiple modalities (e.g. OpenAI's CLIP).
- Context - The maximum number of tokens the model can process at once.
- Tokenization - The model's tokenizer or the way a model turns text into tokens to process.
- Dimensionality - The number of dimensions in the output embeddings/vectors.
- Training Data - The data the model was trained on.
- Execution Environment - How the model is run (e.g. local, cloud, API).
- Loss Function - The function used during training to measure how well the model's predicted embeddings match the expected ones.
"},{"location":"embeddings/embedding-models/#model-categories","title":"Model Categories","text":"There are several ways to categorize embedding models other than the above characteristics:
- Execution environment e.g. API vs local
- Licensing e.g. open-source vs proprietary
- Privacy e.g. on-premises vs cloud
"},{"location":"embeddings/embedding-models/#execution-environment","title":"Execution Environment","text":"The execution environment is probably the first choice you should consider when creating your GenAI solution. Can you afford to let your data leave the confines of your computer, cluster, or organization? If the answer is yes, and you are still in the experimentation phase of your GenAI journey, we recommend using API-based embedding models.
"},{"location":"embeddings/gpu-support/","title":"Embedding Functions GPU Support","text":"By default, Chroma does not require GPU support for embedding functions. However, some embedding functions, especially those that run locally, do provide GPU support.
"},{"location":"embeddings/gpu-support/#default-embedding-functions-onnxruntime","title":"Default Embedding Functions (Onnxruntime)","text":"To use the default embedding functions with GPU support, you need to install onnxruntime-gpu
package. You can install it with the following command:
pip install onnxruntime-gpu\n
Note: To ensure no conflicts, you can uninstall onnxruntime
(e.g. pip uninstall onnxruntime
) or use a separate virtual environment.
List available providers:
import onnxruntime\n\nprint(onnxruntime.get_available_providers())\n
Select the desired provider and set it as preferred before using the embedding functions (in the below example, we use CUDAExecutionProvider
):
import time\nfrom chromadb.utils.embedding_functions import ONNXMiniLM_L6_V2\n\nef = ONNXMiniLM_L6_V2(preferred_providers=['CUDAExecutionProvider'])\n\ndocs = []\nfor i in range(1000):\n docs.append(f\"this is a document with id {i}\")\n\nstart_time = time.perf_counter()\nembeddings = ef(docs)\nend_time = time.perf_counter()\nprint(f\"Elapsed time: {end_time - start_time} seconds\")\n
IMPORTANT OBSERVATION: We have observed that for GPU workloads, sentence transformers with the model all-MiniLM-L6-v2
outperforms onnxruntime with GPU support. In practical terms, on a Colab T4 GPU the onnxruntime example above runs in about 100s, whereas the equivalent sentence transformers example runs in about 1.8s.
"},{"location":"embeddings/gpu-support/#sentence-transformers","title":"Sentence Transformers","text":"import time\nfrom chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction\n# This will download the model to your machine and set it up for GPU support\nef = SentenceTransformerEmbeddingFunction(model_name=\"thenlper/gte-small\", device=\"cuda\")\n\n# Test with 10k documents\ndocs = []\nfor i in range(10000):\n docs.append(f\"this is a document with id {i}\")\n\nstart_time = time.perf_counter()\nembeddings = ef(docs)\nend_time = time.perf_counter()\nprint(f\"Elapsed time: {end_time - start_time} seconds\")\n
Note: You can run the above example in Google Colab - see the notebook
"},{"location":"embeddings/gpu-support/#openclip","title":"OpenCLIP","text":"Prior to PR #1806, we simply used the torch
package to load the model and run it on the GPU.
import chromadb\nfrom chromadb.utils.embedding_functions import OpenCLIPEmbeddingFunction\nfrom chromadb.utils.data_loaders import ImageLoader\nimport torch\nimport os\n\nIMAGE_FOLDER = \"images\"\ntorch.device(\"cuda\")\n\nembedding_function = OpenCLIPEmbeddingFunction()\nimage_loader = ImageLoader()\n\nclient = chromadb.PersistentClient(path=\"my_local_data\")\ncollection = client.create_collection(\n name='multimodal_collection',\n embedding_function=embedding_function,\n data_loader=image_loader)\n\nimage_uris = sorted([os.path.join(IMAGE_FOLDER, image_name) for image_name in os.listdir(IMAGE_FOLDER)])\nids = [str(i) for i in range(len(image_uris))]\ncollection.add(ids=ids, uris=image_uris)\n
After PR #1806:
from chromadb.utils.embedding_functions import OpenCLIPEmbeddingFunction\nembedding_function = OpenCLIPEmbeddingFunction(device=\"cuda\")\n
"},{"location":"faq/","title":"Frequently Asked Questions and Commonly Encountered Issues","text":"This section provides answers to frequently asked questions and information on commonly encountered problem when working with Chroma. These information below is based on interactions with the Chroma community.
404 Answer Not Found
If you have a question that is not answered here, please reach out to us on our Discord @taz or GitHub Issues
"},{"location":"faq/#frequently-asked-questions","title":"Frequently Asked Questions","text":""},{"location":"faq/#what-does-chroma-use-to-index-embedding-vectors","title":"What does Chroma use to index embedding vectors?","text":"Chroma uses its own fork of HNSW lib for indexing and searching embeddings.
Alternative Questions:
- What library does Chroma use for vector index and search?
- What algorithm does Chroma use for vector search?
"},{"location":"faq/#how-to-set-dimensionality-of-my-collections","title":"How to set dimensionality of my collections?","text":"When creating a collection, its dimensionality is determined by the dimensionality of the first embedding added to it. Once the dimensionality is set, it cannot be changed. Therefore, it is important to consistently use embeddings of the same dimensionality when adding or querying a collection.
Example:
import chromadb\n\nclient = chromadb.Client()\n\ncollection = client.create_collection(\"name\") # dimensionality is not set yet\n\n# add an embedding to the collection\ncollection.add(ids=[\"id1\"], embeddings=[[1, 2, 3]]) # dimensionality is set to 3\n
Alternative Questions:
- Can I change the dimensionality of a collection?
"},{"location":"faq/#can-i-use-transformers-models-with-chroma","title":"Can I use transformers
models with Chroma?","text":"Generally, yes you can use transformers
models with Chroma. Although Chroma does not provide a wrapper for this, you can use SentenceTransformerEmbeddingFunction
to achieve the same result. The sentence-transformer library will implicitly do mean-pooling on the last hidden layer, and you'll get a warning about it - No sentence-transformers model found with name [model name]. Creating a new one with MEAN pooling.
Example:
from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction\n\nef = SentenceTransformerEmbeddingFunction(model_name=\"FacebookAI/xlm-roberta-large-finetuned-conll03-english\")\n\nprint(ef([\"test\"]))\n
Warning
Not all models will work with the above method. Also, mean pooling may not be the best strategy for a given model. Read the model card and try to understand what pooling, if any, the creators recommend. You may also want to normalize the embeddings before adding them to Chroma (pass normalize_embeddings=True
to the SentenceTransformerEmbeddingFunction
EF constructor).
"},{"location":"faq/#commonly-encountered-problems","title":"Commonly Encountered Problems","text":""},{"location":"faq/#collection-dimensionality-mismatch","title":"Collection Dimensionality Mismatch","text":"Symptoms:
This error usually manifests as the following error message:
chromadb.errors.InvalidDimensionException: Embedding dimension XXX does not match collection dimensionality YYY
Context:
When adding/upserting or querying a Chroma collection. This error is more visible/pronounced when using the Python APIs, but will also surface in other clients.
Cause:
You are trying to add or query a collection with vectors of a different dimensionality than the collection was created with.
Explanation/Solution:
When you first create a collection client.create_collection(\"name\")
, the collection does not yet know its dimensionality, which allows you to add vectors of any dimensionality to it. However, once the first batch of embeddings is added, the collection is locked to that dimensionality. Any subsequent query or add operation must use embeddings of the same dimensionality. The dimensionality of the embeddings is a characteristic of the embedding model (EmbeddingFunction) used to generate them, therefore it is important to consistently use the same EmbeddingFunction when adding to or querying a collection.
Tip
If you do not specify an embedding_function
when creating (client.create_collection
) or getting (client.get_or_create_collection
) a collection, Chroma will use its default embedding function.
"},{"location":"faq/#large-distances-in-search-results","title":"Large Distances in Search Results","text":"Symptoms:
When querying a collection, you get back distances in the 10s or 100s.
Context:
Frequently occurs when using your own embedding function.
Cause:
The embeddings are not normalized.
Explanation/Solution:
L2
(Euclidean distance) and IP
(inner product) distance metrics are sensitive to the magnitude of the vectors. Chroma uses L2
by default. Therefore, it is recommended to normalize the embeddings before adding them to Chroma.
Here is an example of how to normalize embeddings using the L2 norm:
import numpy as np\n\n\ndef normalize_L2(vector):\n \"\"\"Normalizes a vector to unit length using L2 norm.\"\"\"\n norm = np.linalg.norm(vector)\n if norm == 0:\n return vector\n return vector / norm\n
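To see the effect, here is a small sketch (assuming numpy and the normalize_L2 helper above; the example vectors are illustrative) showing that after normalization the L2 distance between any two vectors is at most 2, regardless of their original magnitudes:

```python
import numpy as np


def normalize_L2(vector):
    """Normalizes a vector to unit length using L2 norm."""
    norm = np.linalg.norm(vector)
    if norm == 0:
        return vector
    return vector / norm


# two embeddings with very different magnitudes
a = np.array([100.0, 0.0, 0.0])
b = np.array([0.1, 0.2, 0.2])

# the raw L2 distance is dominated by the magnitude of `a`
raw_dist = np.linalg.norm(a - b)

# after normalization both vectors lie on the unit sphere,
# so their L2 distance is bounded by 2
norm_dist = np.linalg.norm(normalize_L2(a) - normalize_L2(b))

print(raw_dist, norm_dist)
```

This is why unnormalized embeddings can produce distances in the 10s or 100s, while normalized ones stay in a small, comparable range.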
"},{"location":"faq/#operationalerror-no-such-column-collectionstopic","title":"OperationalError: no such column: collections.topic
","text":"Symptoms:
The error OperationalError: no such column: collections.topic
is raised when trying to access Chroma locally or remotely.
Context:
After upgrading to Chroma 0.5.0
or accessing your Chroma persistent data with Chroma client version 0.5.0
.
Cause:
In version 0.5.x
Chroma made some SQLite3 schema changes that are not backwards compatible with previous versions. Once you access your persistent data on the server or locally with the new Chroma version, it is automatically migrated to the new schema. This operation is not reversible.
Explanation/Solution:
To resolve this issue you will need to upgrade all your clients accessing the Chroma data to version 0.5.x
.
Here's a link to the migration performed by Chroma - https://github.com/chroma-core/chroma/blob/main/chromadb/migrations/sysdb/00005-remove-topic.sqlite.sql
"},{"location":"faq/#sqlite3operationalerror-database-or-disk-is-full","title":"sqlite3.OperationalError: database or disk is full
","text":"Symptoms:
The error sqlite3.OperationalError: database or disk is full
is raised when trying to access Chroma locally or remotely. The error can occur in any of the Chroma API calls.
Context:
There are two contexts in which this error can occur:
- When the persistent disk space is full or the disk quota is reached - This is where your
PERSIST_DIRECTORY
points to. - When there is not enough space in the temporary directory - frequently
/tmp
on your system or container.
Cause:
When inserting new data while your Chroma persistent disk space is full or the disk quota is reached, the database will not be able to write metadata to the SQLite3 database, thus raising the error.
When performing large queries or multiple concurrent queries, the temporary disk space may be exhausted.
Explanation/Solution:
To work around the first issue, you can increase the disk space or clean up the disk space. To work around the second issue, you can increase the temporary disk space (works fine for containers but might be a problem for VMs) or point SQLite3 to a different temporary directory by using SQLITE_TMPDIR
environment variable.
SQLite Temp File More information on how sqlite3 uses temp files can be found here.
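For example, to point SQLite3 at a roomier temporary directory before starting the Chroma server (the path below is illustrative - use any directory on a volume with sufficient free space):

```shell
# use a directory on a volume with enough free space
export SQLITE_TMPDIR=/mnt/bigdisk/sqlite-tmp
mkdir -p "$SQLITE_TMPDIR"
chroma run --host 0.0.0.0 --port 8000 --path ./my_chroma_data
```

In Docker, the same can be achieved by passing `-e SQLITE_TMPDIR=/path/in/container` (and mounting a suitably sized volume at that path).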
"},{"location":"faq/#runtimeerror-chroma-is-running-in-http-only-client-mode-and-can-only-be-run-with-chromadbapifastapifastapi","title":"RuntimeError: Chroma is running in http-only client mode, and can only be run with 'chromadb.api.fastapi.FastAPI'
","text":"Symptoms and Context:
The following error is raised when trying to create a new PersistentClient
, EphemeralClient
, or Client
:
RuntimeError: Chroma is running in http-only client mode, and can only be run with 'chromadb.api.fastapi.FastAPI' \nas the chroma_api_impl. see https://docs.trychroma.com/usage-guide?lang=py#using-the-python-http-only-client for more information.\n
Cause:
There are two possible causes for this error:
chromadb-client
is installed and you are trying to work with a local client. - Dependency conflict with
chromadb-client
and chromadb
packages.
Explanation/Solution:
Chroma (python) comes in two packages - chromadb
and chromadb-client
. The chromadb-client
package is used to interact with a remote Chroma server. If you are trying to work with a local client, you should use the chromadb
package. If you are planning to interact with a remote server only, it is recommended to use the chromadb-client
package.
If you intend to work locally with Chroma (e.g. embed in your app) then we suggest that you uninstall the chromadb-client
package and install the chromadb
package.
To check which package you have installed:
pip list | grep chromadb\n
To uninstall the chromadb-client
package:
pip uninstall chromadb-client\n
Working with virtual environments It is recommended to work with virtual environments to avoid dependency conflicts. To create a virtual environment you can use the following snippet:
pip install virtualenv\npython -m venv myenv\nsource myenv/bin/activate\npip install chromadb # and other packages you need\n
Alternatively you can use conda
or poetry
to manage your environments. Default Embedding Function Default embedding function - chromadb.utils.embedding_functions.DefaultEmbeddingFunction
- can only be used with chromadb
package.
"},{"location":"integrations/langchain/","title":"Chroma Integrations With LangChain","text":" - Embeddings - learn how to use Chroma Embedding functions with LC and vice versa
- Retrievers - learn how to use LangChain retrievers with Chroma
"},{"location":"integrations/langchain/embeddings/","title":"Langchain Embeddings","text":""},{"location":"integrations/langchain/embeddings/#embedding-functions","title":"Embedding Functions","text":"Chroma and Langchain both offer embedding functions which are wrappers on top of popular embedding models.
Unfortunately Chroma and LC's embedding functions are not compatible with each other. Below we offer two adapters to convert Chroma's embedding functions to LC's and vice versa.
Links: - Chroma Embedding Functions Definition - Langchain Embedding Functions Definition
Here is the adapter to convert Chroma's embedding functions to LC's:
from langchain_core.embeddings import Embeddings\nfrom chromadb.api.types import EmbeddingFunction\n\n\nclass ChromaEmbeddingsAdapter(Embeddings):\n def __init__(self, ef: EmbeddingFunction):\n self.ef = ef\n\n def embed_documents(self, texts):\n return self.ef(texts)\n\n def embed_query(self, query):\n return self.ef([query])[0]\n
Here is the adapter to convert LC's embedding functions to Chroma's:
from langchain_core.embeddings import Embeddings\nfrom chromadb.api.types import EmbeddingFunction, Documents\n\n\nclass LangChainEmbeddingAdapter(EmbeddingFunction[Documents]):\n def __init__(self, ef: Embeddings):\n self.ef = ef\n\n def __call__(self, input: Documents) -> Embeddings:\n # LC EFs also have embed_query but Chroma doesn't support that so we just use embed_documents\n # TODO: better type checking\n return self.ef.embed_documents(input)\n
"},{"location":"integrations/langchain/embeddings/#example-usage","title":"Example Usage","text":"Using Chroma Embedding Functions with Langchain:
from langchain.vectorstores.chroma import Chroma\nfrom chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction\n\ntexts = [\"foo\", \"bar\", \"baz\"]\n\ndocs_vectorstore = Chroma.from_texts(\n texts=texts,\n collection_name=\"docs_store\",\n embedding=ChromaEmbeddingsAdapter(SentenceTransformerEmbeddingFunction(model_name=\"all-MiniLM-L6-v2\")),\n)\n
Using Langchain Embedding Functions with Chroma:
from langchain_community.embeddings import SentenceTransformerEmbeddings\nimport chromadb\n\nclient = chromadb.Client()\n\ncollection = client.get_or_create_collection(\"test\", embedding_function=LangChainEmbeddingAdapter(\n SentenceTransformerEmbeddings(model_name=\"all-MiniLM-L6-v2\")))\ncollection.add(ids=[\"1\", \"2\", \"3\"], documents=[\"foo\", \"bar\", \"baz\"])\n
"},{"location":"integrations/langchain/retrievers/","title":"\ud83e\udd9c\u26d3\ufe0f Langchain Retriever","text":"TBD: describe what retrievers are in LC and how they work.
"},{"location":"integrations/langchain/retrievers/#vector-store-retriever","title":"Vector Store Retriever","text":"In the below example we demonstrate how to use Chroma as a vector store retriever with a filter query.
Note that the filter is supplied whenever we create the retriever object so the filter applies to all queries (get_relevant_documents
).
from langchain.document_loaders import OnlinePDFLoader\nfrom langchain.chains import RetrievalQA\nfrom langchain.llms import OpenAI\nfrom langchain.vectorstores import Chroma\nfrom typing import Dict, Any\nimport chromadb\nfrom langchain_core.embeddings import Embeddings\n\nclient = chromadb.PersistentClient(path=\"./chroma\")\n\ncol = client.get_or_create_collection(\"test\")\n\ncol.upsert([f\"{i}\" for i in range(10)],documents=[f\"This is document #{i}\" for i in range(10)],metadatas=[{\"id\":f\"{i}\"} for i in range(10)])\n\nef = chromadb.utils.embedding_functions.DefaultEmbeddingFunction()\n\nclass DefChromaEF(Embeddings):\n def __init__(self,ef):\n self.ef = ef\n\n def embed_documents(self,texts):\n return self.ef(texts)\n\n def embed_query(self, query):\n return self.ef([query])[0]\n\n\ndb = Chroma(client=client, collection_name=\"test\",embedding_function=DefChromaEF(ef))\n\nretriever = db.as_retriever(search_kwargs={\"filter\":{\"id\":\"1\"}})\n\ndocs = retriever.get_relevant_documents(\"document\")\n\nassert len(docs)==1\n
Ref: https://colab.research.google.com/drive/1L0RwQVVBtvTTd6Le523P4uzz3m3fm0pH#scrollTo=xROOfxLohE5j
"},{"location":"integrations/llamaindex/","title":"Chroma Integrations With LlamaIndex","text":" - Embeddings - learn how to use LlamaIndex embeddings functions with Chroma and vice versa
"},{"location":"integrations/llamaindex/embeddings/","title":"LlamaIndex Embeddings","text":""},{"location":"integrations/llamaindex/embeddings/#embedding-functions","title":"Embedding Functions","text":"Chroma and LlamaIndex both offer embedding functions which are wrappers on top of popular embedding models.
Unfortunately Chroma and LI's embedding functions are not compatible with each other. Below we offer an adapter to convert an LI embedding function to a Chroma one.
from llama_index.core.schema import TextNode\nfrom llama_index.core.base.embeddings.base import BaseEmbedding\nfrom chromadb import EmbeddingFunction, Documents, Embeddings\n\n\nclass LlamaIndexEmbeddingAdapter(EmbeddingFunction):\n def __init__(self, ef: BaseEmbedding):\n self.ef = ef\n\n def __call__(self, input: Documents) -> Embeddings:\n return [node.embedding for node in self.ef([TextNode(text=doc) for doc in input])]\n
Text modality
The above adapter assumes that the input documents are text. If you are using a different modality, you will need to modify the adapter accordingly.
An example of how to use the above with LlamaIndex:
Prerequisites for example
Run pip install llama-index chromadb llama-index-embeddings-fastembed fastembed
import chromadb\nfrom llama_index.embeddings.fastembed import FastEmbedEmbedding\n\n# make sure to include the above adapter and imports\nembed_model = FastEmbedEmbedding(model_name=\"BAAI/bge-small-en-v1.5\")\n\nclient = chromadb.Client()\n\ncol = client.get_or_create_collection(\"test_collection\", embedding_function=LlamaIndexEmbeddingAdapter(embed_model))\n\ncol.add(ids=[\"1\"], documents=[\"this is a test document\"])\n
"},{"location":"integrations/ollama/","title":"Chroma Integrations With Ollama","text":" - Embeddings - learn how to use Ollama as embedder for Chroma documents
- \u2728
Coming soon
RAG with Ollama - a primer on how to build a simple RAG app with Ollama and Chroma
"},{"location":"integrations/ollama/embeddings/","title":"Ollama","text":"Ollama offers out-of-the-box embedding API which allows you to generate embeddings for your documents. Chroma provides a convenient wrapper around Ollama's embedding API.
"},{"location":"integrations/ollama/embeddings/#ollama-embedding-models","title":"Ollama Embedding Models","text":"While you can use any of the ollama models including LLMs to generate embeddings. We generally recommend using specialized models like nomic-embed-text
for text embeddings. The latter models are specifically trained for embeddings and are more efficient for this purpose (e.g. the dimensions of the output embeddings are much smaller than those from LLMs e.g. 1024 - nomic-embed-text vs 4096 - llama3)
Models:
Model Pull Ollama Registry Link nomic-embed-text
ollama pull nomic-embed-text
nomic-embed-text mxbai-embed-large
ollama pull mxbai-embed-large
mxbai-embed-large snowflake-arctic-embed
ollama pull snowflake-arctic-embed
snowflake-arctic-embed all-minilm-l6-v2
ollama pull chroma/all-minilm-l6-v2-f32
all-minilm-l6-v2-f32"},{"location":"integrations/ollama/embeddings/#basic-usage","title":"Basic Usage","text":"First let's run a local docker container with Ollama. We'll pull nomic-embed-text
model:
docker run -d --rm -v ./ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama\ndocker exec -it ollama ollama run nomic-embed-text # press Ctrl+D to exit after model downloads successfully\n# test it\ncurl http://localhost:11434/api/embeddings -d '{\"model\": \"nomic-embed-text\",\"prompt\": \"Here is an article about llamas...\"}'\n
Ollama Docs
For more information on Ollama, visit the Ollama GitHub repository.
Using the CLI
If you have or prefer to use the Ollama CLI, you can use the following command to get a model:
ollama pull nomic-embed-text\n
Now let's configure the OllamaEmbeddingFunction embedding function (Python) with the default Ollama endpoint:
"},{"location":"integrations/ollama/embeddings/#python","title":"Python","text":"import chromadb\nfrom chromadb.utils.embedding_functions import OllamaEmbeddingFunction\n\nclient = chromadb.PersistentClient(path=\"ollama\")\n\n# create EF with custom endpoint\nef = OllamaEmbeddingFunction(\n model_name=\"nomic-embed-text\",\n url=\"http://localhost:11434/api/embeddings\",\n)\n\nprint(ef([\"Here is an article about llamas...\"]))\n
"},{"location":"integrations/ollama/embeddings/#javascript","title":"JavaScript","text":"For JS users, you can use the OllamaEmbeddingFunction
class to create embeddings:
const {OllamaEmbeddingFunction} = require('chromadb');\nconst embedder = new OllamaEmbeddingFunction({\n url: \"http://localhost:11434/api/embeddings\",\n model: \"nomic-embed-text\"\n})\n\n// use directly\nconst embeddings = embedder.generate([\"Here is an article about llamas...\"])\n
"},{"location":"integrations/ollama/embeddings/#golang","title":"Golang","text":"For Golang you can use the chroma-go
client's OllamaEmbeddingFunction
embedding function to generate embeddings for your documents:
package main\n\nimport (\n \"context\"\n \"fmt\"\n ollama \"github.com/amikos-tech/chroma-go/ollama\"\n)\n\nfunc main() {\n documents := []string{\n \"Document 1 content here\",\n \"Document 2 content here\",\n }\n // the `/api/embeddings` endpoint is automatically appended to the base URL\n ef, err := ollama.NewOllamaEmbeddingFunction(ollama.WithBaseURL(\"http://127.0.0.1:11434\"), ollama.WithModel(\"nomic-embed-text\"))\n if err != nil {\n fmt.Printf(\"Error creating Ollama embedding function: %s \\n\", err)\n }\n resp, err := ef.EmbedDocuments(context.Background(), documents)\n if err != nil {\n fmt.Printf(\"Error embedding documents: %s \\n\", err)\n }\n fmt.Printf(\"Embedding response: %v \\n\", resp)\n}\n
Golang Client
You can install the Golang client by running the following command:
go get github.com/amikos-tech/chroma-go\n
For more information visit https://go-client.chromadb.dev/
"},{"location":"running/deployment-patterns/","title":"Deployment Patterns","text":"In this section we'll cover a patterns of how to deploy GenAI applications using Chroma as a vector store.
"},{"location":"running/health-checks/","title":"Health Checks","text":""},{"location":"running/health-checks/#docker-compose","title":"Docker Compose","text":"The simples form of health check is to use the healthcheck
directive in the docker-compose.yml
file. This is useful if you are deploying Chroma alongside other services that may depend on it.
version: '3.9'\n\nnetworks:\n net:\n driver: bridge\n\nservices:\n server:\n image: server\n build:\n context: .\n dockerfile: Dockerfile\n volumes:\n # Be aware that indexed data are located in \"/chroma/chroma/\"\n # Default configuration for persist_directory in chromadb/config.py\n # Read more about deployments: https://docs.trychroma.com/deployment\n - chroma-data:/chroma/chroma\n command: \"--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30\"\n environment:\n - IS_PERSISTENT=TRUE\n - CHROMA_SERVER_AUTH_PROVIDER=${CHROMA_SERVER_AUTH_PROVIDER}\n - CHROMA_SERVER_AUTH_CREDENTIALS_FILE=${CHROMA_SERVER_AUTH_CREDENTIALS_FILE}\n - CHROMA_SERVER_AUTH_CREDENTIALS=${CHROMA_SERVER_AUTH_CREDENTIALS}\n - CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=${CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER}\n - CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER=${CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER}\n - PERSIST_DIRECTORY=${PERSIST_DIRECTORY:-/chroma/chroma}\n - CHROMA_OTEL_EXPORTER_ENDPOINT=${CHROMA_OTEL_EXPORTER_ENDPOINT}\n - CHROMA_OTEL_EXPORTER_HEADERS=${CHROMA_OTEL_EXPORTER_HEADERS}\n - CHROMA_OTEL_SERVICE_NAME=${CHROMA_OTEL_SERVICE_NAME}\n - CHROMA_OTEL_GRANULARITY=${CHROMA_OTEL_GRANULARITY}\n - CHROMA_SERVER_NOFILE=${CHROMA_SERVER_NOFILE}\n ports:\n - 8000:8000\n healthcheck:\n test: [ \"CMD\", \"/bin/bash\", \"-c\", \"cat < /dev/null > /dev/tcp/localhost/8000\" ]\n interval: 30s\n timeout: 10s\n retries: 3\n networks:\n - net\nvolumes:\n chroma-data:\n driver: local\n
"},{"location":"running/health-checks/#kubernetes","title":"Kubernetes","text":"In kubernetes you can use the livenessProbe
and readinessProbe
to check the health of the server. This is useful if you are deploying Chroma in a Kubernetes cluster.
apiVersion: apps/v1\nkind: Deployment\nmetadata:\n name: chroma\n labels:\n app: chroma\nspec:\n replicas: 1\n selector:\n matchLabels:\n app: chroma\n template:\n metadata:\n labels:\n app: chroma\n spec:\n containers:\n - name: chroma\n image: <chroma-image>\n ports:\n - containerPort: 8000\n livenessProbe:\n httpGet:\n path: /api/v1\n port: 8000\n initialDelaySeconds: 5\n periodSeconds: 5\n readinessProbe:\n httpGet:\n path: /api/v1\n port: 8000\n initialDelaySeconds: 5\n periodSeconds: 5\n startupProbe:\n httpGet:\n path: /api/v1\n port: 8000\n failureThreshold: 3\n periodSeconds: 60\n initialDelaySeconds: 60\n
Alternative to the httpGet
you can also use tcpSocket
:
readinessProbe:\n tcpSocket:\n port: 8000\n failureThreshold: 3\n timeoutSeconds: 30\n periodSeconds: 60\n livenessProbe:\n tcpSocket:\n port: 8000\n failureThreshold: 3\n timeoutSeconds: 30\n periodSeconds: 60\n startupProbe:\n tcpSocket:\n port: 8000\n failureThreshold: 3\n periodSeconds: 60\n initialDelaySeconds: 60\n
"},{"location":"running/road-to-prod/","title":"Road To Production","text":"In this section we will cover considerations for operating Chroma ina production environment.
To operate Chroma in production your deployment must follow your organization's best practices and guidelines around business continuity, security, and compliance. Here we will list the core concepts and offer some guidance on how to achieve them.
Core system abilities:
- High Availability - The deployment should be able to handle failures while continuing to serve requests.
- Scalability - The deployment should be able to handle increased load by adding more resources (aka scale horizontally).
- Privacy and Security - The deployment should protect data from unauthorized access and ensure data integrity.
- Observability - The deployment should provide metrics and logs to help operators understand the system's health.
- Backup and Restore - The deployment should have a backup and restore strategy to protect against data loss.
- Disaster Recovery - The deployment should have a disaster recovery plan to recover from catastrophic failures.
- Maintenance - The deployment should be easy to maintain and upgrade.
While our guidance is most likely incomplete, it can be taken as a complement to your own organizational processes. For those deploying Chroma in a smaller enterprise without such processes, we advise common sense and caution.
"},{"location":"running/road-to-prod/#high-availability","title":"High Availability","text":""},{"location":"running/road-to-prod/#scalability","title":"Scalability","text":""},{"location":"running/road-to-prod/#privacy-and-security","title":"Privacy and Security","text":""},{"location":"running/road-to-prod/#data-security","title":"Data Security","text":""},{"location":"running/road-to-prod/#in-transit","title":"In Transit","text":"The bare minimum for securing data in transit is to use HTTPS when performing Chroma API calls. This ensures that data is encrypted when it is sent over the network.
There are several ways to achieve this:
- Use a reverse proxy like Envoy or Nginx to terminate SSL/TLS connections.
- Use a load balancer like AWS ELB or Google Cloud Load Balancer to terminate SSL/TLS connections (technically, Envoy and Nginx are also load balancers).
- Use a service mesh like Istio or Linkerd to manage SSL/TLS connections between services.
- Enable SSL/TLS in your Chroma server.
Depending on your requirements you may choose one or more of these options.
Reverse Proxy:
Load Balancer:
Service Mesh:
Chroma Server:
"},{"location":"running/road-to-prod/#at-rest","title":"At Rest","text":""},{"location":"running/road-to-prod/#access-control","title":"Access Control","text":""},{"location":"running/road-to-prod/#authentication","title":"Authentication","text":""},{"location":"running/road-to-prod/#authorization","title":"Authorization","text":""},{"location":"running/road-to-prod/#observability","title":"Observability","text":""},{"location":"running/road-to-prod/#backup-and-restore","title":"Backup and Restore","text":""},{"location":"running/road-to-prod/#disaster-recovery","title":"Disaster Recovery","text":""},{"location":"running/road-to-prod/#maintenance","title":"Maintenance","text":""},{"location":"running/running-chroma/","title":"Running Chroma","text":""},{"location":"running/running-chroma/#local-server","title":"Local Server","text":"Article Link
This article is also available on Medium Running ChromaDB \u2014 Part 1: Local Server.
"},{"location":"running/running-chroma/#chroma-cli","title":"Chroma CLI","text":"The simplest way to run Chroma locally is via the Chroma cli
which is part of the core Chroma package.
Prerequisites:
- Python 3.8 to 3.11 - Download Python | Python.org
pip install chromadb\nchroma run --host localhost --port 8000 --path ./my_chroma_data\n
--host
The host to which to listen to, by default it is [localhost](http://localhost:8000/docs)
, but if you want to expose it to your entire network then you can specify `0.0.0.0``
--port
The port on which to listen to, by default this is 8000
.
--path
The path where to persist your Chroma data locally.
Target Path Install
It is possible to install Chroma in a specific directory by running pip install chromadb -t /path/to/dir
. To run Chroma CLI from the installation dir expor the Python Path export PYTHONPATH=$PYTHONPATH:/path/to/dir
.
"},{"location":"running/running-chroma/#docker","title":"Docker","text":"Running Chroma server locally can be achieved via a simple docker command as shown below.
Prerequisites:
- Docker - Overview of Docker Desktop | Docker Docs
docker run -d --rm --name chromadb -v ./chroma:/chroma/chroma -e IS_PERSISTENT=TRUE -e ANONYMIZED_TELEMETRY=TRUE chromadb/chroma:latest\n
Options:
-v
specifies a local dir which is where Chroma will store its data so when the container is destroyed the data remains. Note: If you are using -e PERSIST_DIRECTORY
then you need to point the volume to that directory. -e
IS_PERSISTENT=TRUE
let\u2019s Chroma know to persist data -e
PERSIST_DIRECTORY=/path/in/container
specifies the path in the container where the data will be stored, by default it is /chroma/chroma
-e ANONYMIZED_TELEMETRY=TRUE
allows you to turn on (TRUE
) or off (FALSE
) anonymous product telemetry which helps the Chroma team in making informed decisions about Chroma OSS and commercial direction. chromadb/chroma:latest
indicates the latest Chroma version but can be replaced with any valid tag if a prior version is needed (e.g. chroma:0.4.24
)
"},{"location":"running/running-chroma/#docker-compose-cloned-repo","title":"Docker Compose (Cloned Repo)","text":"If you are feeling adventurous you can also use the Chroma main
branch to run a local Chroma server with the latest changes:
Prerequisites:
- Docker - Overview of Docker Desktop | Docker Docs
- Git - Git - Downloads (git-scm.com)
git clone https://github.com/chroma-core/chroma && cd chroma\ndocker compose up -d --build\n
If you want to run a specific version of Chroma you can checkout the version tag you need:
git checkout release/0.4.24\n
"},{"location":"running/running-chroma/#docker-compose-without-cloning-the-repo","title":"Docker Compose (Without Cloning the Repo)","text":"If you do not wish or are able to clone the repo locally, Chroma server can also be run with docker compose by creating (or using a gist) a docker-compose.yaml
Prerequisites:
- Docker - Overview of Docker Desktop | Docker Docs
- cURL (if you want to use the gist approach)
version: '3.9'\n\nnetworks:\n net:\n driver: bridge\nservices:\n chromadb:\n image: chromadb/chroma:latest\n volumes:\n - ./chromadb:/chroma/chroma\n environment:\n - IS_PERSISTENT=TRUE\n - PERSIST_DIRECTORY=/chroma/chroma # this is the default path, change it as needed\n - ANONYMIZED_TELEMETRY=${ANONYMIZED_TELEMETRY:-TRUE}\n ports:\n - 8000:8000\n networks:\n - net\n
The above will create a container with the latest Chroma (chromadb/chroma:latest
), will expose it to port 8000
on the local machine and will persist data in the ./chromadb
path, relative to the directory from which the docker-compose.yaml
was run.
We have also created a small gist with the above file for convenience:
curl -s https://gist.githubusercontent.com/tazarov/4fd933274bbacb3b9f286b15c01e904b/raw/87268142d64d8ee0f7f98c27a62a5d089923a1df/docker-compose.yaml | docker-compose -f - up\n
"},{"location":"running/running-chroma/#minikube-with-helm-chart","title":"Minikube With Helm Chart","text":"Note: This deployment can just as well be done with KinD
depending on your preference.
A more advanced approach to running Chroma locally (but also on a remote cluster) is to deploy it using a Helm chart.
Disclaimer: The chart used here is not a first-party chart; it is contributed by a core Chroma contributor.
Prerequisites:
- Docker - Overview of Docker Desktop | Docker Docs
- Install minikube - minikube start | minikube (k8s.io)
- kubectl - Install Tools | Kubernetes
- Helm - Helm | Installing Helm
Once you have all of the above, running Chroma in a local minikube
cluster is quite simple.
Create a minikube
cluster:
minikube start --addons=ingress -p chroma\nminikube profile chroma\n
Get and install the chart:
helm repo add chroma https://amikos-tech.github.io/chromadb-chart/\nhelm repo update\nhelm install chroma chroma/chromadb --set chromadb.apiVersion=\"0.4.24\"\n
By default, the chart enables authentication in Chroma. To get the token, run the following:
kubectl --namespace default get secret chromadb-auth -o jsonpath=\"{.data.token}\" | base64 --decode\n# or use this to directly export variable\nexport CHROMA_TOKEN=$(kubectl --namespace default get secret chromadb-auth -o jsonpath=\"{.data.token}\" | base64 --decode)\n
The first step to connect and start using Chroma is to forward your port:
minikube service chroma-chromadb --url\n
The above should print something like this:
http://127.0.0.1:61892\n\u2757 Because you are using a Docker driver on darwin, the terminal needs to be open to run it.\n
Note: Depending on your OS the message might be slightly different.
Test it out (pip install chromadb
):
import chromadb\nfrom chromadb.config import Settings\n\nclient = chromadb.HttpClient(host=\"http://127.0.0.1:61892\",\n settings=Settings(\n chroma_client_auth_provider=\"chromadb.auth.token.TokenAuthClientProvider\",\n chroma_client_auth_credentials=\"<your_chroma_token>\"))\nclient.heartbeat() # this should work with or without authentication - it is a public endpoint\n\nclient.get_version() # this should work with or without authentication - it is a public endpoint\n\nclient.list_collections() # this is a protected endpoint and requires authentication\n
For more information about the helm chart consult - https://github.com/amikos-tech/chromadb-chart
"},{"location":"running/systemd-service/","title":"Systemd service","text":"You can run Chroma as a systemd service, which will allow you to automatically start Chroma on boot and restart it if it crashes.
"},{"location":"running/systemd-service/#docker-compose","title":"Docker Compose","text":"The following is an example systemd service for running Chroma using Docker Compose.
Create a file /etc/systemd/system/chroma.service
with the following content:
Example assumptions
The below example assumes a Debian-based system with docker-ce installed.
[Unit]\nDescription = Chroma Service\nAfter = network.target docker.service\nRequires = docker.service\n\n[Service]\nType = forking\nUser = root\nGroup = root\nWorkingDirectory = /home/admin/chroma\nExecStart = /usr/bin/docker compose up -d\nExecStop = /usr/bin/docker compose down\nRemainAfterExit = true\n\n[Install]\nWantedBy = multi-user.target\n
Replace WorkingDirectory
with the directory containing your docker-compose.yaml file. You may also need to replace /usr/bin/docker
with the path to your docker binary.
Alternatively you can install directly from a gist:
wget https://gist.githubusercontent.com/tazarov/9c46966de0b32a4962dcc79dce8b2646/raw/7cf8c471f33fba8a51d6f808f9b1af6ca1b0923c/chroma-docker.service \\\n -O /etc/systemd/system/chroma.service\n
Loading, enabling and starting the service:
sudo systemctl daemon-reload\nsudo systemctl enable chroma\nsudo systemctl start chroma\n
Type=forking
In the above example, we use Type=forking
because Docker Compose runs in the background (-d
). If you are using a different command that runs in the foreground, you may need to use Type=simple
instead.
"},{"location":"running/systemd-service/#chroma-cli","title":"Chroma CLI","text":"The following is an example systemd service for running Chroma using the Chroma CLI.
Create a file /etc/systemd/system/chroma.service
with the following content:
Example assumptions
The below example assumes that Chroma is installed in the Python site-packages
directory.
[Unit]\nDescription = Chroma Service\nAfter = network.target\n\n[Service]\nType = simple\nUser = root\nGroup = root\nWorkingDirectory = /chroma\nExecStart=/usr/local/bin/chroma run --host 127.0.0.1 --port 8000 --path /chroma/data --log-path /var/log/chroma.log\n\n[Install]\nWantedBy = multi-user.target\n
Replace the WorkingDirectory
, /chroma/data
and /var/log/chroma.log
with the appropriate paths.
Safe Config
The above example service listens on localhost
, which will not work if you want to expose Chroma to the outside world. Adjust the --host
and --port
flags as needed.
Alternatively you can install from a gist:
wget https://gist.githubusercontent.com/tazarov/5e10ce892c06757d8188a8a34cd6d26d/raw/327a9d0b07afeb0b0cb77453aa9171fdd190984f/chroma-cli.service \\\n -O /etc/systemd/system/chroma.service\n
Loading, enabling and starting the service:
sudo systemctl daemon-reload\nsudo systemctl enable chroma\nsudo systemctl start chroma\n
Type=simple
In the above example, we use Type=simple
because the Chroma CLI runs in the foreground. If you are using a different command that runs in the background, you may need to use Type=forking
instead.
"},{"location":"strategies/backup/","title":"ChromaDB Backups","text":"Depending on your use case there are a few different ways to back up your ChromaDB data.
- API export - this approach is relatively simple, slow for large datasets and may result in a backup that is missing some updates, should your data change frequently.
- Disk snapshot - this approach is fast, but is highly dependent on the underlying storage. Should your cloud provider and underlying volume support snapshots, this is a good option.
- Filesystem backup - this approach is also fast, but requires stopping your Chroma container to avoid data corruption. This is a good option if you can afford to stop your Chroma container for a few minutes.
Other Options
Have another option in mind? Feel free to add it to the above list.
"},{"location":"strategies/backup/#api-export","title":"API Export","text":""},{"location":"strategies/backup/#with-chroma-datapipes","title":"With Chroma Datapipes","text":"One way to export via the API is to use tooling like Chroma Data Pipes, a command-line tool that provides a simple way to import/export/transform ChromaDB data.
Exporting from local filesystem:
cdp export \"file:///absolute/path/to/chroma-data/my-collection-name\" > my_chroma_data.jsonl\n
Exporting from remote server:
cdp export \"http://remote-chroma-server:8000/my-collection-name\" > my_chroma_data.jsonl\n
Get Help
Read more about Chroma Data Pipes here
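The same export idea can be sketched in plain Python: page through a collection with `get(limit=..., offset=...)` and write each record as a JSON line. The `FakeCollection` class below is a hypothetical stand-in for a real Chroma collection object, used only to keep the sketch self-contained and runnable; with a real client you would pass the collection returned by `get_collection` instead.

```python
import json

class FakeCollection:
    """Hypothetical stand-in for a Chroma collection (illustration only)."""
    def __init__(self, docs):
        self._docs = docs

    def get(self, limit, offset):
        page = self._docs[offset:offset + limit]
        return {"ids": [d["id"] for d in page], "documents": [d["doc"] for d in page]}

def export_jsonl(collection, batch_size=2):
    """Page through a collection and yield one JSON line per record."""
    offset = 0
    while True:
        res = collection.get(limit=batch_size, offset=offset)
        if not res["ids"]:
            break  # no more pages
        for _id, doc in zip(res["ids"], res["documents"]):
            yield json.dumps({"id": _id, "document": doc})
        offset += batch_size

col = FakeCollection([{"id": str(i), "doc": f"document {i}"} for i in range(5)])
lines = list(export_jsonl(col))
print(len(lines))  # 5 records exported
```

Note that, as mentioned above, a paginated export like this can miss updates made while the export is running; for frequently changing data prefer a snapshot or filesystem backup.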
"},{"location":"strategies/backup/#disk-snapshot","title":"Disk Snapshot","text":"TBD
"},{"location":"strategies/backup/#filesystem-backup","title":"Filesystem Backup","text":""},{"location":"strategies/backup/#from-docker-container","title":"From Docker Container","text":"Sometimes you have been running Chroma in a Docker container without a host mount, intentionally or unintentionally. So all your data is now stored in the container's filesystem. Here's how you can back up your data:
- Stop the container:
docker stop <chroma-container-id/name>\n
- Create a backup of the container's filesystem:
docker cp <chroma-container-id/name>:/chroma/chroma /path/to/backup\n
/path/to/backup
is the directory where you want to store the backup on your host machine.
"},{"location":"strategies/batching/","title":"Batching","text":"You may often need to ingest a large number of documents into Chroma. One problem you may face relates to the SQLite version on the machine running Chroma, which imposes a maximum number of statements and parameters per query; Chroma translates this into a maximum batchable record size, exposed via the max_batch_size
parameter of the ChromaClient
class.
import chromadb\n\nclient = chromadb.PersistentClient(path=\"test\")\nprint(\"Number of documents that can be inserted at once: \",client.max_batch_size)\n
"},{"location":"strategies/batching/#creating-batches","title":"Creating Batches","text":"For consistency and data integrity reasons, Chroma does not yet offer out-of-the-box batching support. The below code snippet shows how to create batches of documents and ingest them into Chroma.
import chromadb\nfrom chromadb.utils.batch_utils import create_batches\nimport uuid\n\nclient = chromadb.PersistentClient(path=\"test-large-batch\")\nlarge_batch = [(f\"{uuid.uuid4()}\", f\"document {i}\", [0.1] * 1536) for i in range(100000)]\nids, documents, embeddings = zip(*large_batch)\nbatches = create_batches(api=client,ids=list(ids), documents=list(documents), embeddings=list(embeddings))\ncollection = client.get_or_create_collection(\"test\")\nfor batch in batches:\n print(f\"Adding batch of size {len(batch[0])}\")\n collection.add(ids=batch[0],\n documents=batch[3],\n embeddings=batch[1],\n metadatas=batch[2])\n
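The splitting that `create_batches` performs can also be sketched by hand in plain Python, which is useful for understanding what is going on: parallel lists are sliced into chunks no larger than the limit. The `max_batch_size=3` value here is an artificial example standing in for the real `client.max_batch_size`:

```python
def chunk(ids, documents, embeddings, max_batch_size):
    """Split parallel lists into batches no larger than max_batch_size."""
    for start in range(0, len(ids), max_batch_size):
        end = start + max_batch_size
        yield ids[start:end], documents[start:end], embeddings[start:end]

ids = [str(i) for i in range(10)]
docs = [f"document {i}" for i in range(10)]
embs = [[0.1] * 4 for _ in range(10)]

# ten records with a (hypothetical) limit of three per batch
batches = list(chunk(ids, docs, embs, max_batch_size=3))
print([len(b[0]) for b in batches])  # [3, 3, 3, 1]
```

Unlike this sketch, `create_batches` also accounts for the parameter limit per statement, so prefer it when available.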
"},{"location":"strategies/cors/","title":"CORS Configuration for Browser-Based Access","text":"The Chroma JS package allows you to use Chroma in your browser-based SPA application. This is great, but it means that you'll need to configure Chroma to work with your browser and avoid CORS issues.
"},{"location":"strategies/cors/#setting-up-chroma-for-browser-based-access","title":"Setting up Chroma for Browser-Based Access","text":"To allow browsers to directly access your Chroma instance you'll need to configure the CHROMA_SERVER_CORS_ALLOW_ORIGINS
. The CHROMA_SERVER_CORS_ALLOW_ORIGINS
environment variable controls the hosts which are allowed to access your Chroma instance.
Note
The CHROMA_SERVER_CORS_ALLOW_ORIGINS
environment variable is a list of strings. Each string is a URL that is allowed to access your Chroma instance. If you want to allow all hosts to access your Chroma instance, you can set CHROMA_SERVER_CORS_ALLOW_ORIGINS
to [\"*\"]
. This is not recommended for production environments.
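Since the variable's value is a JSON-encoded list of strings, a quick way to sanity-check what you are about to export is to parse it the same way a JSON parser would (a sketch; Chroma's exact parsing internals may differ):

```python
import json

# the value you would put in CHROMA_SERVER_CORS_ALLOW_ORIGINS
value = '["http://localhost:3000"]'
origins = json.loads(value)

# a valid value parses to a list of URL strings
assert isinstance(origins, list) and all(isinstance(o, str) for o in origins)
print(origins)  # ['http://localhost:3000']
```

A common mistake is forgetting the quotes or brackets (e.g. `http://localhost:3000` instead of `["http://localhost:3000"]`), which is not valid JSON.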
The below examples assume that your web app is running on http://localhost:3000
. You can find an example of NextJS and Langchain here.
Using Chroma run:
export CHROMA_SERVER_CORS_ALLOW_ORIGINS='[\"http://localhost:3000\"]'\nchroma run --path /path/to/chroma-data\n
Or with docker:
docker run -e CHROMA_SERVER_CORS_ALLOW_ORIGINS='[\"http://localhost:3000\"]' -v /path/to/chroma-data:/chroma/chroma -p 8000:8000 chromadb/chroma\n
Or in your docker-compose.yml
:
version: '3.9'\n\nnetworks:\n net:\n driver: bridge\n\nservices:\n server:\n image: chromadb/chroma:0.5.0\n volumes:\n # Be aware that indexed data are located in \"/chroma/chroma/\"\n # Default configuration for persist_directory in chromadb/config.py\n # Read more about deployments: https://docs.trychroma.com/deployment\n - chroma-data:/chroma/chroma\n command: \"--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30\"\n environment:\n - IS_PERSISTENT=TRUE\n - CHROMA_SERVER_AUTH_PROVIDER=${CHROMA_SERVER_AUTH_PROVIDER}\n - CHROMA_SERVER_AUTHN_CREDENTIALS_FILE=${CHROMA_SERVER_AUTHN_CREDENTIALS_FILE}\n - CHROMA_SERVER_AUTHN_CREDENTIALS=${CHROMA_SERVER_AUTHN_CREDENTIALS}\n - CHROMA_AUTH_TOKEN_TRANSPORT_HEADER=${CHROMA_AUTH_TOKEN_TRANSPORT_HEADER}\n - PERSIST_DIRECTORY=${PERSIST_DIRECTORY:-/chroma/chroma}\n - CHROMA_OTEL_EXPORTER_ENDPOINT=${CHROMA_OTEL_EXPORTER_ENDPOINT}\n - CHROMA_OTEL_EXPORTER_HEADERS=${CHROMA_OTEL_EXPORTER_HEADERS}\n - CHROMA_OTEL_SERVICE_NAME=${CHROMA_OTEL_SERVICE_NAME}\n - CHROMA_OTEL_GRANULARITY=${CHROMA_OTEL_GRANULARITY}\n - CHROMA_SERVER_NOFILE=${CHROMA_SERVER_NOFILE}\n - CHROMA_SERVER_CORS_ALLOW_ORIGINS=[\"http://localhost:3000\"]\n restart: unless-stopped # possible values are: \"no\", always\", \"on-failure\", \"unless-stopped\"\n ports:\n - \"8000:8000\"\n healthcheck:\n # Adjust below to match your container port\n test: [ \"CMD\", \"curl\", \"-f\", \"http://localhost:8000/api/v1/heartbeat\" ]\n interval: 30s\n timeout: 10s\n retries: 3\n networks:\n - net\n\nvolumes:\n chroma-data:\n driver: local\n
Run docker compose up
to start your Chroma instance.
"},{"location":"strategies/keyword-search/","title":"Keyword Search","text":"Chroma uses SQLite for storing metadata and documents. Additionally, documents are indexed using SQLite FTS5 for fast text search.
import chromadb\nfrom chromadb.config import Settings\n\nclient = chromadb.PersistentClient(path=\"test\", settings=Settings(allow_reset=True))\n\nclient.reset()\ncol = client.get_or_create_collection(\"test\")\n\ncol.upsert(ids=[\"1\", \"2\", \"3\"], documents=[\"He is a technology freak and he loves AI topics\", \"AI technology are advancing at a fast pace\", \"Innovation in LLMs is a hot topic\"],metadatas=[{\"author\": \"John Doe\"}, {\"author\": \"Jane Doe\"}, {\"author\": \"John Doe\"}])\ncol.query(query_texts=[\"technology\"], where_document={\"$or\":[{\"$contains\":\"technology\"}, {\"$contains\":\"freak\"}]})\n
The above should return:
{'ids': [['2', '1']],\n 'distances': [[1.052205477809135, 1.3074231535113972]],\n 'metadatas': [[{'author': 'Jane Doe'}, {'author': 'John Doe'}]],\n 'embeddings': None,\n 'documents': [['AI technology are advancing at a fast pace',\n 'He is a technology freak and he loves AI topics']],\n 'uris': None,\n 'data': None}\n
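To build intuition for what the `$contains` / `$or` filter does, here is a rough illustration using Python's built-in `sqlite3` module with plain `LIKE` substring matching. This is only an approximation for teaching purposes: Chroma's actual implementation uses SQLite FTS5 and may behave differently (for example around tokenization and ranking):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id TEXT, document TEXT)")
conn.executemany("INSERT INTO docs VALUES (?, ?)", [
    ("1", "He is a technology freak and he loves AI topics"),
    ("2", "AI technology are advancing at a fast pace"),
    ("3", "Innovation in LLMs is a hot topic"),
])

# roughly: where_document={"$or": [{"$contains": "technology"}, {"$contains": "freak"}]}
rows = conn.execute(
    "SELECT id FROM docs "
    "WHERE document LIKE '%' || ? || '%' OR document LIKE '%' || ? || '%'",
    ("technology", "freak"),
).fetchall()
print([r[0] for r in rows])  # ['1', '2'] - only documents 1 and 2 match
```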
"},{"location":"strategies/memory-management/","title":"Memory Management","text":"This section provides additional information and strategies for managing memory in Chroma.
"},{"location":"strategies/memory-management/#lru-cache-strategy","title":"LRU Cache Strategy","text":"Out of the box, Chroma offers an LRU cache strategy which unloads segments (collections) that are not in use, while trying to abide by the configured memory usage limits.
To enable the LRU cache the following two settings parameters or environment variables need to be set:
PythonEnvironment Variables from chromadb.config import Settings\n\nsettings = Settings(\n chroma_segment_cache_policy=\"LRU\",\n chroma_memory_limit_bytes=10000000000 # ~10GB\n)\n
export CHROMA_SEGMENT_CACHE_POLICY=LRU\nexport CHROMA_MEMORY_LIMIT_BYTES=10000000000 # ~10GB\n
"},{"location":"strategies/memory-management/#manualcustom-collection-unloading","title":"Manual/Custom Collection Unloading","text":"Local Clients
The below code snippets assume you are working with a PersistentClient
or an EphemeralClient
instance.
At the time of writing (Chroma v0.4.22), Chroma does not allow you to manually unload collections from memory.
Here we provide a simple utility function to help users unload collections from memory.
Internal APIs
The below code relies on Chroma internal APIs, which may change in future versions. The snippet has been tested with Chroma 0.4.24
.
import gc\nimport os\n\nimport chromadb\nimport psutil\nfrom chromadb.types import SegmentScope\n\n\ndef bytes_to_gb(bytes_value):\n return bytes_value / (1024 ** 3)\n\n\ndef get_process_info():\n pid = os.getpid()\n p = psutil.Process(pid)\n with p.oneshot():\n mem_info = p.memory_info()\n # disk_io = p.io_counters()\n return {\n \"memory_usage\": bytes_to_gb(mem_info.rss),\n }\n\n\ndef unload_index(collection_name: str, chroma_client: chromadb.PersistentClient):\n \"\"\"\n Unloads binary hnsw index from memory and removes both segments (binary and metadata) from the segment cache.\n \"\"\"\n collection = chroma_client.get_collection(collection_name)\n collection_id = collection.id\n segment_manager = chroma_client._server._manager\n for scope in [SegmentScope.VECTOR, SegmentScope.METADATA]:\n if scope in segment_manager.segment_cache:\n cache = segment_manager.segment_cache[scope].cache\n if collection_id in cache:\n segment_manager.callback_cache_evict(cache[collection_id])\n gc.collect()\n
Example Contributed
The above example was enhanced and contributed by Amir
(amdeilami) from our Discord community. We appreciate and encourage his work and contributions to the Chroma community.
Usage Example import chromadb\n\n\nclient = chromadb.PersistentClient(path=\"testds-1M/chroma-data\")\ncol=client.get_collection(\"test\")\nprint(col.count())\ncol.get(limit=1,include=[\"embeddings\"]) # force load the collection into memory\n\nunload_index(\"test\", client)\n
"},{"location":"strategies/privacy/","title":"Privacy Strategies","text":""},{"location":"strategies/privacy/#overview","title":"Overview","text":"TBD
"},{"location":"strategies/privacy/#encryption","title":"Encryption","text":""},{"location":"strategies/privacy/#document-encryption","title":"Document Encryption","text":""},{"location":"strategies/privacy/#client-side-document-encryption","title":"Client-side Document Encryption","text":"See the notebook on client-side document encryption.
"},{"location":"strategies/rebuilding/","title":"Rebuilding Chroma DB","text":""},{"location":"strategies/rebuilding/#rebuilding-a-collection","title":"Rebuilding a Collection","text":"Here are several reasons you might want to rebuild a collection:
- Your metadata or binary index is corrupted or even deleted
- Optimize performance of HNSW index after a large number of updates
WAL Consistency and Backups
Before you proceed, make sure to back up your data. Secondly, make sure that your WAL contains all the data needed to properly rebuild the collection. For instance, after v0.4.22 you should not have run optimizations or WAL cleanup.
IMPORTANT
Only do this on a stopped Chroma instance.
Find the UUID of the target binary index directory to remove. Typically, the binary index directory is located in the persistent directory and is named after the collection vector segment (in segments
table). You can find the UUID by running the following SQL query:
sqlite3 /path/to/db/chroma.sqlite3 \"select s.id, c.name from segments s join collections c on s.collection=c.id where s.scope='VECTOR';\"\n
The above should print UUID dir and collection names.
Once you remove/rename the UUID dir, restart Chroma and query your collection like so:
import chromadb\nclient = chromadb.HttpClient() # Adjust as per your client\nres = client.get_collection(\"my_collection\").get(limit=1,include=['embeddings'])\n
Chroma will recreate your collection from the WAL.
Rebuilding the collection
Depending on how large your collection is, this process can take a while.
"},{"location":"strategies/time-based-queries/","title":"Time-based Queries","text":""},{"location":"strategies/time-based-queries/#filtering-documents-by-timestamps","title":"Filtering Documents By Timestamps","text":"In the example below, we create a collection with 100 documents, each with a random timestamp in the last two weeks. We then query the collection for documents that were created in the last week.
The example demonstrates how Chroma metadata can be leveraged to filter documents based on how recently they were added or updated.
import uuid\nimport chromadb\n\nimport datetime\nimport random\n\nnow = datetime.datetime.now()\ntwo_weeks_ago = now - datetime.timedelta(days=14)\n\ndates = [\n two_weeks_ago + datetime.timedelta(days=random.randint(0, 14))\n for _ in range(100)\n]\ndates = [int(date.timestamp()) for date in dates]\n\n# convert epoch seconds to iso format\n\ndef iso_date(epoch_seconds): return datetime.datetime.fromtimestamp(\n epoch_seconds).isoformat()\n\nclient = chromadb.EphemeralClient()\n\ncol = client.get_or_create_collection(\"test\")\n\ncol.add(ids=[f\"{uuid.uuid4()}\" for _ in range(100)], documents=[\n f\"document {i}\" for i in range(100)], metadatas=[{\"date\": date} for date in dates])\n\nres = col.get(where={\"date\": {\"$gt\": (now - datetime.timedelta(days=7)).timestamp()}})\n\nfor i in res['metadatas']:\n print(iso_date(i['date']))\n
Ref: https://gist.github.com/tazarov/3c9301d22ab863dca0b6fb1e5e3511b1
"},{"location":"strategies/multi-tenancy/","title":"Multi-Tenancy Strategies","text":""},{"location":"strategies/multi-tenancy/#introduction","title":"Introduction","text":"Some deployment settings of Chroma may require multi-tenancy support. This document outlines the strategies for multi-tenancy approaches in Chroma.
"},{"location":"strategies/multi-tenancy/#approaches","title":"Approaches","text":" - Naive approach - This is a simple approach puts the onus of enforcing multi-tenancy on the application. It is the simplest approach to implement, but is not very well suited for production environments.
- Multi-User Basic Auth - This article provides a stepping stone to more advanced multi-tenancy where the Chroma authentication allows for multiple users to access the same Chroma instance with their own credentials.
- Authorization Model with OpenFGA - Implement an advanced authorization model with OpenFGA.
- Implementing OpenFGA Authorization Model In Chroma - Learn how to implement OpenFGA authorization model in Chroma with full code example.
"},{"location":"strategies/multi-tenancy/authorization-model-impl-with-openfga/","title":"Implementing OpenFGA Authorization Model In Chroma","text":"Source Code
The source code for this article can be found here.
"},{"location":"strategies/multi-tenancy/authorization-model-impl-with-openfga/#preparation","title":"Preparation","text":"To make things useful we also introduce an initial tuple set with permissions which will allows us to test the authorization model.
We define three users:
admin
part of chroma
team as owner
user1
part of chroma
team as reader
admin-ext
part of external
team as owner
We will give enough permissions to these three users and their respective teams so that they can perform collection creation, deletion, add records, remove records, get records and query records in the context of their role within the team - owner
has access to all API actions, while reader
can only read: list, get, and query.
Abbreviated Example
We have removed some of the data from the above example for brevity. The full tuple set can be found under data/data/initial-data.json
[\n {\n \"object\": \"team:chroma\",\n \"relation\": \"owner\",\n \"user\": \"user:admin\"\n },\n {\n \"object\": \"team:chroma\",\n \"relation\": \"reader\",\n \"user\": \"user:user1\"\n },\n {\n \"object\": \"team:external\",\n \"relation\": \"owner\",\n \"user\": \"user:admin-ext\"\n },\n {\n \"object\": \"server:localhost\",\n \"relation\": \"can_get_tenant\",\n \"user\": \"team:chroma#owner\"\n },\n {\n \"object\": \"tenant:default_tenant-default_database\",\n \"relation\": \"can_get_database\",\n \"user\": \"team:chroma#owner\"\n },\n {\n \"object\": \"database:default_tenant-default_database\",\n \"relation\": \"can_create_collection\",\n \"user\": \"team:chroma#owner\"\n },\n {\n \"object\": \"database:default_tenant-default_database\",\n \"relation\": \"can_list_collections\",\n \"user\": \"team:chroma#owner\"\n },\n {\n \"object\": \"database:default_tenant-default_database\",\n \"relation\": \"can_get_or_create_collection\",\n \"user\": \"team:chroma#owner\"\n },\n {\n \"object\": \"database:default_tenant-default_database\",\n \"relation\": \"can_count_collections\",\n \"user\": \"team:chroma#owner\"\n }\n]\n
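To make the tuple structure concrete, here is a small sketch that answers "which objects does `team:chroma#owner` have direct grants on, and which relations" by filtering the tuple list in plain Python. This is a direct lookup over an abbreviated copy of the data, not OpenFGA's actual graph resolution, which also expands team membership and relation rewrites:

```python
# abbreviated copy of the initial tuple set above
tuples = [
    {"object": "team:chroma", "relation": "owner", "user": "user:admin"},
    {"object": "server:localhost", "relation": "can_get_tenant",
     "user": "team:chroma#owner"},
    {"object": "database:default_tenant-default_database",
     "relation": "can_create_collection", "user": "team:chroma#owner"},
]

def relations_for(user, tuples):
    """Collect (object, relation) pairs granted directly to a user or userset."""
    return [(t["object"], t["relation"]) for t in tuples if t["user"] == user]

grants = relations_for("team:chroma#owner", tuples)
print(grants)
# [('server:localhost', 'can_get_tenant'),
#  ('database:default_tenant-default_database', 'can_create_collection')]
```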
"},{"location":"strategies/multi-tenancy/authorization-model-impl-with-openfga/#testing-the-model","title":"Testing the model","text":"Let\u2019s spin up a quick docker compose to test our setup. In the repo we have provided openfga/docker-compose.openfga-standalone.yaml
docker compose -f openfga/docker-compose.openfga-standalone.yaml up\n
For this next part ensure you have FGA CLI installed.
Once the containers are up and running let\u2019s create a store and import the model:
export FGA_API_URL=http://localhost:8082 # our OpenFGA binds to 8082 on localhost\nfga store create --model data/models/model-article-p4.fga --name chromadb-auth\n
You should see a response like this:
{\n \"store\": {\n \"created_at\": \"2024-04-09T18:37:26.367747Z\",\n \"id\": \"01HV3VB347NPY3NMX6VQ5N2E23\",\n \"name\": \"chromadb-auth\",\n \"updated_at\": \"2024-04-09T18:37:26.367747Z\"\n },\n \"model\": {\n \"authorization_model_id\": \"01HV3VB34JAXWF0F3C00DFBZV4\"\n }\n}\n
Let\u2019s import our initial tuple set. Before that make sure to export FGA_STORE_ID
and FGA_MODEL_ID
as per the output of the previous command:
export FGA_STORE_ID=01HV3VB347NPY3NMX6VQ5N2E23\nexport FGA_MODEL_ID=01HV3VB34JAXWF0F3C00DFBZV4\nfga tuple write --file data/data/initial-data.json\n
Let\u2019s test our imported model and tuples:
fga query check user:admin can_get_preflight server:localhost\n
If everything is working you should see this:
{\n \"allowed\": true,\n \"resolution\": \"\"\n}\n
"},{"location":"strategies/multi-tenancy/authorization-model-impl-with-openfga/#implementing-authorization-plumbing-in-chroma","title":"Implementing Authorization Plumbing in Chroma","text":"First we will start with making a few small changes to the authorization plugin we\u2019ve made. Why you ask? We need to introduce teams (aka groups). For that we\u2019ll resort to standard Apache groupfile
as follows:
chroma: admin, user1\nexternal: admin-ext\n
The groupfile
will be mounted to our Chroma container and read by the multi-user basic auth plugin. The changes to the authentication plugin are as follows:
# imports as before\n\n@register_provider(\"multi_user_htpasswd_file\")\nclass MultiUserHtpasswdFileServerAuthCredentialsProvider(ServerAuthCredentialsProvider):\n _creds: Dict[str, SecretStr] # contains user:password-hash\n\n def __init__(self, system: System) -> None:\n super().__init__(system)\n try:\n self.bc = importlib.import_module(\"bcrypt\")\n except ImportError:\n raise ValueError(\n \"The bcrypt python package is not installed. \"\n \"Please install it with `pip install bcrypt`\"\n )\n system.settings.require(\"chroma_server_auth_credentials_file\")\n _file = str(system.settings.chroma_server_auth_credentials_file)\n ... # as before\n _basepath = path.dirname(_file)\n self._user_group_map = dict()\n if path.exists(path.join(_basepath, \"groupfile\")):\n _groups = dict()\n with open(path.join(_basepath, \"groupfile\"), \"r\") as f:\n for line in f:\n _raw_group = [v for v in line.strip().split(\":\")]\n if len(_raw_group) < 2:\n raise ValueError(\n \"Invalid Htpasswd group file found in \"\n f\"[{path.join(_basepath, 'groupfile')}]. \"\n \"Must be <groupname>:<username1>,<username2>,...,<usernameN>.\"\n )\n _groups[_raw_group[0]] = [u.strip() for u in _raw_group[1].split(\",\")]\n for _group, _users in _groups.items():\n for _user in _users:\n if _user not in self._user_group_map:\n self._user_group_map[_user] = _group\n\n @trace_method( # type: ignore\n \"MultiUserHtpasswdFileServerAuthCredentialsProvider.validate_credentials\",\n OpenTelemetryGranularity.ALL,\n )\n @override\n def validate_credentials(self, credentials: AbstractCredentials[T]) -> bool:\n ... 
# as before\n\n @override\n def get_user_identity(\n self, credentials: AbstractCredentials[T]\n ) -> Optional[SimpleUserIdentity]:\n _creds = cast(Dict[str, SecretStr], credentials.get_credentials())\n if _creds[\"username\"].get_secret_value() in self._user_group_map.keys():\n return SimpleUserIdentity(\n _creds[\"username\"].get_secret_value(),\n attributes={\n \"team\": self._user_group_map[_creds[\"username\"].get_secret_value()]\n },\n )\n return SimpleUserIdentity(_creds[\"username\"].get_secret_value(), attributes={\"team\": \"public\"})\n
Full code
The code can be found under chroma_auth/authn/basic/__**init__**.py
We read the group file and for each user create a key in self._user_group_map
to specify the group or team of that user. The information is returned as user identity attributes that is further used by the authz plugin.
Now let\u2019s turn our attention to the authorization plugin. First let\u2019s start with that we\u2019re trying to achieve with it:
- Handle OpenFGA configuration from the import of the model as per the snippet above. This will help us to wire all necessary parts of the code with correct authorization model configuration.
- Map all existing Chroma authorization actions to our authorization model
- Adapt any shortcomings or quirks in Chroma authorization to the way OpenFGA works
- Implement the Enforcement Point (EP) logic
- Implement OpenFGA Permissions API wrapper - this is a utility class that will help us update and keep updating the OpenFGA tuples throughout collections\u2019 lifecycle.
We\u2019ve split the implementation in two files:
chroma_auth/authz/openfga/__init__.py
- Storing our OpenFGA authorization configuration reader and our authorization plugin that adapts to Chroma authz model and enforces authorization decisions chroma_auth/authz/openfga/openfga_permissions.py
- Holds our OpenFGA permissions update logic. chroma_auth/instr/**__init__**.py
- holds our adapted FastAPI server from Chroma 0.4.24
. While the authz plugin system in Chroma makes it easy to write the enforcement of authorization decisions, the update of permissions does require us to into this rabbit hole. Don\u2019t worry the actual changes are minimal
Let\u2019s cover things in a little more detail.
Reading the configuration.
@register_provider(\"openfga_config_provider\")\nclass OpenFGAAuthorizationConfigurationProvider(\n ServerAuthorizationConfigurationProvider[ClientConfiguration]\n):\n _config_file: str\n _config: ClientConfiguration\n\n def __init__(self, system: System) -> None:\n super().__init__(system)\n self._settings = system.settings\n if \"FGA_API_URL\" not in os.environ:\n raise ValueError(\"FGA_API_URL not set\")\n self._config = self._try_load_from_file()\n\n # TODO in the future we can also add credentials (preshared) or OIDC\n\n def _try_load_from_file(self) -> ClientConfiguration:\n store_id = None\n model_id = None\n if \"FGA_STORE_ID\" in os.environ and \"FGA_MODEL_ID\" in os.environ:\n return ClientConfiguration(\n api_url=os.environ.get(\"FGA_API_URL\"),\n store_id=os.environ[\"FGA_STORE_ID\"],\n authorization_model_id=os.environ[\"FGA_MODEL_ID\"],\n )\n if \"FGA_CONFIG_FILE\" not in os.environ and not store_id and not model_id:\n raise ValueError(\"FGA_CONFIG_FILE or FGA_STORE_ID/FGA_MODEL_ID env vars not set\")\n with open(os.environ[\"FGA_CONFIG_FILE\"], \"r\") as f:\n config = json.load(f)\n return ClientConfiguration(\n api_url=os.environ.get(\"FGA_API_URL\"),\n store_id=config[\"store\"][\"id\"],\n authorization_model_id=config[\"model\"][\"authorization_model_id\"],\n )\n\n @override\n def get_configuration(self) -> ClientConfiguration:\n return self._config\n
This is a pretty simple and straightforward implementation that will either take env variables for the FGA Server URL, Store and Model or it will only take the server ULR + json configuration (the same as above).
Next let\u2019s have a look at our OpenFGAAuthorizationProvider
implementation. We\u2019ll start with the constructor where we adapt existing Chroma authorization actions to our model:
def __init__(self, system: System) -> None:\n # more code here, but we're skipping for brevity\n self._authz_to_model_action_map = {\n AuthzResourceActions.CREATE_DATABASE.value: \"can_create_database\",\n AuthzResourceActions.GET_DATABASE.value: \"can_get_database\",\n AuthzResourceActions.CREATE_TENANT.value: \"can_create_tenant\",\n AuthzResourceActions.GET_TENANT.value: \"can_get_tenant\",\n AuthzResourceActions.LIST_COLLECTIONS.value: \"can_list_collections\",\n AuthzResourceActions.COUNT_COLLECTIONS.value: \"can_count_collections\",\n AuthzResourceActions.GET_COLLECTION.value: \"can_get_collection\",\n AuthzResourceActions.CREATE_COLLECTION.value: \"can_create_collection\",\n AuthzResourceActions.GET_OR_CREATE_COLLECTION.value: \"can_get_or_create_collection\",\n AuthzResourceActions.DELETE_COLLECTION.value: \"can_delete_collection\",\n AuthzResourceActions.UPDATE_COLLECTION.value: \"can_update_collection\",\n AuthzResourceActions.ADD.value: \"can_add_records\",\n AuthzResourceActions.DELETE.value: \"can_delete_records\",\n AuthzResourceActions.GET.value: \"can_get_records\",\n AuthzResourceActions.QUERY.value: \"can_query_records\",\n AuthzResourceActions.COUNT.value: \"can_count_records\",\n AuthzResourceActions.UPDATE.value: \"can_update_records\",\n AuthzResourceActions.UPSERT.value: \"can_upsert_records\",\n AuthzResourceActions.RESET.value: \"can_reset\",\n }\n\n self._authz_to_model_object_map = {\n AuthzResourceTypes.DB.value: \"database\",\n AuthzResourceTypes.TENANT.value: \"tenant\",\n AuthzResourceTypes.COLLECTION.value: \"collection\",\n }\n
The above is located in chroma_auth/authz/openfga/__init__.py
The above is a fairly straightforward mapping between AuthzResourceActions
, part of Chroma\u2019s auth framework, and the relations (aka actions) we\u2019ve defined in our model above. Next we also map the AuthzResourceTypes
to OpenFGA objects. This seems pretty simple, right? Wrong: things are not so perfect, and nothing exhibits this more than our next portion, which takes the action and resource and returns the object and relation to be checked:
def resolve_resource_action(self, resource: AuthzResource, action: AuthzAction) -> tuple:\n attrs = \"\"\n tenant = None\n database = None\n if \"tenant\" in resource.attributes:\n attrs += f\"{resource.attributes['tenant']}\"\n tenant = resource.attributes['tenant']\n if \"database\" in resource.attributes:\n attrs += f\"-{resource.attributes['database']}\"\n database = resource.attributes['database']\n if action.id == AuthzResourceActions.GET_TENANT.value or action.id == AuthzResourceActions.CREATE_TENANT.value:\n return \"server:localhost\", self._authz_to_model_action_map[action.id]\n if action.id == AuthzResourceActions.GET_DATABASE.value or action.id == AuthzResourceActions.CREATE_DATABASE.value:\n return f\"tenant:{attrs}\", self._authz_to_model_action_map[action.id]\n if action.id == AuthzResourceActions.CREATE_COLLECTION.value:\n try:\n col_exists = self._api.get_collection(\n resource.id, tenant=tenant, database=database\n )\n return f\"collection:{attrs}-{col_exists.name}\", self._authz_to_model_action_map[\n AuthzResourceActions.GET_COLLECTION.value]\n except Exception as e:\n return f\"{self._authz_to_model_object_map[resource.type]}:{attrs}\", self._authz_to_model_action_map[\n action.id]\n if resource.id == \"*\":\n return f\"{self._authz_to_model_object_map[resource.type]}:{attrs}\", self._authz_to_model_action_map[action.id]\n else:\n return (f\"{self._authz_to_model_object_map[resource.type]}:{attrs}-{resource.id}\",\n self._authz_to_model_action_map[action.id])\n
Full code
The above is located in chroma_auth/authz/openfga/__init__.py
The resolve_resource_action
function demonstrates the idiosyncrasies of Chroma\u2019s auth. I have only myself to blame. The key takeaway is that there is room for improvement.
The actual authorization enforcement is then dead simple:
def authorize(self, context: AuthorizationContext) -> bool:\n with OpenFgaClient(self._authz_config_provider.get_configuration()) as fga_client:\n try:\n obj, act = self.resolve_resource_action(resource=context.resource, action=context.action)\n resp = fga_client.check(body=ClientCheckRequest(\n user=f\"user:{context.user.id}\",\n relation=act,\n object=obj,\n ))\n # openfga_sdk.models.check_response.CheckResponse\n return resp.allowed\n except Exception as e:\n logger.error(f\"Error while authorizing: {str(e)}\")\n return False\n
At the end we\u2019ll look at our permissions API wrapper. While a full-blown solution would implement all possible object lifecycle hooks, we\u2019re content with collections. Therefore we\u2019ll add lifecycle callbacks for creating and deleting collections (we\u2019re not considering sharing of the collection with other users, or change of ownership). So what might our create collection hook look like, you ask?
def create_collection_permissions(self, collection: Collection, request: Request) -> None:\n if not hasattr(request.state, \"user_identity\"):\n return\n identity = request.state.user_identity # AuthzUser\n tenant = request.query_params.get(\"tenant\")\n database = request.query_params.get(\"database\")\n _object = f\"collection:{tenant}-{database}-{collection.id}\"\n _object_for_get_collection = f\"collection:{tenant}-{database}-{collection.name}\" # this is a bug in the Chroma Authz that feeds in the name of the collection instead of ID\n _user = f\"team:{identity.get_user_attributes()['team']}#owner\" if identity.get_user_attributes() and \"team\" in identity.get_user_attributes() else f\"user:{identity.get_user_id()}\"\n _user_writer = f\"team:{identity.get_user_attributes()['team']}#writer\" if identity.get_user_attributes() and \"team\" in identity.get_user_attributes() else None\n _user_reader = f\"team:{identity.get_user_attributes()['team']}#reader\" if identity.get_user_attributes() and \"team\" in identity.get_user_attributes() else None\n with OpenFgaClient(self._fga_configuration) as fga_client:\n fga_client.write_tuples(\n body=[\n ClientTuple(_user, \"can_add_records\", _object),\n ClientTuple(_user, \"can_delete_records\", _object),\n ClientTuple(_user, \"can_update_records\", _object),\n ClientTuple(_user, \"can_get_records\", _object),\n ClientTuple(_user, \"can_upsert_records\", _object),\n ClientTuple(_user, \"can_count_records\", _object),\n ClientTuple(_user, \"can_query_records\", _object),\n ClientTuple(_user, \"can_get_collection\", _object_for_get_collection),\n ClientTuple(_user, \"can_delete_collection\", _object_for_get_collection),\n ClientTuple(_user, \"can_update_collection\", _object),\n ]\n )\n if _user_writer:\n fga_client.write_tuples(\n body=[\n ClientTuple(_user_writer, \"can_add_records\", _object),\n ClientTuple(_user_writer, \"can_delete_records\", _object),\n ClientTuple(_user_writer, \"can_update_records\", _object),\n 
ClientTuple(_user_writer, \"can_get_records\", _object),\n ClientTuple(_user_writer, \"can_upsert_records\", _object),\n ClientTuple(_user_writer, \"can_count_records\", _object),\n ClientTuple(_user_writer, \"can_query_records\", _object),\n ClientTuple(_user_writer, \"can_get_collection\", _object_for_get_collection),\n ClientTuple(_user_writer, \"can_delete_collection\", _object_for_get_collection),\n ClientTuple(_user_writer, \"can_update_collection\", _object),\n ]\n )\n if _user_reader:\n fga_client.write_tuples(\n body=[\n ClientTuple(_user_reader, \"can_get_records\", _object),\n ClientTuple(_user_reader, \"can_query_records\", _object),\n ClientTuple(_user_reader, \"can_count_records\", _object),\n ClientTuple(_user_reader, \"can_get_collection\", _object_for_get_collection),\n ]\n )\n
Full code
You can find the full code in chroma_auth/authz/openfga/openfga_permissions.py
Looks pretty straightforward, but hold on, I hear a thought creeping into your mind: \u201cWhy are you adding roles manually?\u201d
You are right, it lacks that DRY je-ne-sais-quoi, and I\u2019m happy to keep it simple and explicit. A more mature implementation could read the model, figure out what type we\u2019re adding permissions for, and then add the requisite users for each relation, but premature optimization is difficult to put in an article that won\u2019t turn into a book.
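For illustration, here is a hedged sketch of that more mature approach: a hypothetical `MODEL_RELATIONS` table (not part of the actual chroma_auth code) records which team roles get which relations per object type, and the tuples are generated from it instead of being listed by hand:

```python
# Hypothetical sketch: derive permission tuples from a model description
# instead of hard-coding them per role. MODEL_RELATIONS is an assumption
# made for this example, not part of the real chroma_auth code.
MODEL_RELATIONS = {
    "collection": {
        "owner": ["can_add_records", "can_delete_records", "can_update_records",
                  "can_get_records", "can_upsert_records", "can_count_records",
                  "can_query_records", "can_get_collection",
                  "can_delete_collection", "can_update_collection"],
        "writer": ["can_add_records", "can_delete_records", "can_update_records",
                   "can_get_records", "can_upsert_records", "can_count_records",
                   "can_query_records", "can_get_collection",
                   "can_delete_collection", "can_update_collection"],
        "reader": ["can_get_records", "can_query_records",
                   "can_count_records", "can_get_collection"],
    }
}

def tuples_for(obj_type: str, obj_id: str, team: str) -> list:
    """Expand (user, relation, object) tuples for every role on a team."""
    out = []
    for role, relations in MODEL_RELATIONS[obj_type].items():
        user = f"team:{team}#{role}"
        for rel in relations:
            out.append((user, rel, f"{obj_type}:{obj_id}"))
    return out

tuples = tuples_for("collection", "default-default-mycol", "blue")
print(len(tuples))  # 10 owner + 10 writer + 4 reader = 24
```

Each generated triple maps directly onto a `ClientTuple(user, relation, object)` write, so adding a new relation to the model becomes a one-line change to the table.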
With the above code we make the assumption that the collection doesn\u2019t exist, ergo its permission tuples don\u2019t exist. (OpenFGA will fail to add tuples that already exist, and there is no way around it other than deleting them first.) Remember: the permission tuple lifecycle is your responsibility when adding authz to your application.
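One defensive pattern for this (a sketch under our own assumptions, not the article\u2019s implementation) is to compute the difference between the tuples you want and the tuples that already exist, and only write the former. In real code the `existing` set would come from reading the FGA store; here it is simulated with a plain set:

```python
# Sketch of an idempotent tuple write: only send tuples that don't already
# exist, since OpenFGA rejects duplicate writes. The read step is simulated
# with a local set; a real implementation would query the FGA store first.
def plan_writes(desired, existing):
    """Return (to_write, already_present) as sorted lists of tuples."""
    desired, existing = set(desired), set(existing)
    return sorted(desired - existing), sorted(desired & existing)

existing = {("user:anne", "can_get_records", "collection:t-d-col1")}
desired = {
    ("user:anne", "can_get_records", "collection:t-d-col1"),
    ("user:anne", "can_add_records", "collection:t-d-col1"),
}
to_write, skipped = plan_writes(desired, existing)
print(to_write)   # only the can_add_records tuple needs writing
print(skipped)    # the can_get_records tuple already exists
```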
The delete is very similar (which is why we\u2019ve skipped the bulk of it):
def delete_collection_permissions(self, collection: Collection, request: Request) -> None:\n if not hasattr(request.state, \"user_identity\"):\n return\n identity = request.state.user_identity\n\n _object = f\"collection:{collection.tenant}-{collection.database}-{collection.id}\"\n _object_for_get_collection = f\"collection:{collection.tenant}-{collection.database}-{collection.name}\" # this is a bug in the Chroma Authz that feeds in the name of the collection instead of ID\n _user = f\"team:{identity.get_user_attributes()['team']}#owner\" if identity.get_user_attributes() and \"team\" in identity.get_user_attributes() else f\"user:{identity.get_user_id()}\"\n _user_writer = f\"team:{identity.get_user_attributes()['team']}#writer\" if identity.get_user_attributes() and \"team\" in identity.get_user_attributes() else None\n _user_reader = f\"team:{identity.get_user_attributes()['team']}#reader\" if identity.get_user_attributes() and \"team\" in identity.get_user_attributes() else None\n with OpenFgaClient(self._fga_configuration) as fga_client:\n fga_client.delete_tuples(\n body=[\n ClientTuple(_user, \"can_add_records\", _object),\n ClientTuple(_user, \"can_delete_records\", _object),\n ClientTuple(_user, \"can_update_records\", _object),\n ClientTuple(_user, \"can_get_records\", _object),\n ClientTuple(_user, \"can_upsert_records\", _object),\n ClientTuple(_user, \"can_count_records\", _object),\n ClientTuple(_user, \"can_query_records\", _object),\n ClientTuple(_user, \"can_get_collection\", _object_for_get_collection),\n ClientTuple(_user, \"can_delete_collection\", _object_for_get_collection),\n ClientTuple(_user, \"can_update_collection\", _object),\n ]\n )\n # more code in the repo\n
Full code
You can find the full code in chroma_auth/authz/openfga/openfga_permissions.py
Let\u2019s turn our attention to the last piece of code - the necessary evil of updating the FastAPI app in Chroma to add our Permissions API hooks. We start simple by injecting our component using Chroma\u2019s DI (dependency injection).
from chroma_auth.authz.openfga.openfga_permissions import OpenFGAPermissionsAPI\n\nself._permissionsApi: OpenFGAPermissionsAPI = self._system.instance(OpenFGAPermissionsAPI)\n
Then we add a hook for collection creation:
def create_collection(\n self,\n request: Request,\n collection: CreateCollection,\n tenant: str = DEFAULT_TENANT,\n database: str = DEFAULT_DATABASE,\n) -> Collection:\n existing = None\n try:\n existing = self._api.get_collection(collection.name, tenant=tenant, database=database)\n except ValueError as e:\n if \"does not exist\" not in str(e):\n raise e\n collection = self._api.create_collection(\n name=collection.name,\n metadata=collection.metadata,\n get_or_create=collection.get_or_create,\n tenant=tenant,\n database=database,\n )\n if not existing:\n self._permissionsApi.create_collection_permissions(collection=collection, request=request)\n return collection\n
Full code
You can find the full code in chroma_auth/instr/__init__.py
And one for collection removal:
def delete_collection(\n self,\n request: Request,\n collection_name: str,\n tenant: str = DEFAULT_TENANT,\n database: str = DEFAULT_DATABASE,\n) -> None:\n collection = self._api.get_collection(collection_name, tenant=tenant, database=database)\n resp = self._api.delete_collection(\n collection_name, tenant=tenant, database=database\n )\n\n self._permissionsApi.delete_collection_permissions(collection=collection, request=request)\n return resp\n
Full code
You can find the full code in chroma_auth/instr/__init__.py
The key thing to observe about the above snippets is that we invoke the permissions API only once we\u2019re sure things have been persisted in the DB. I know, I know, atomicity here is also important, but that is for another article. Just keep in mind that it is easier to fix broken permissions than broken data.
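The ordering principle can be illustrated with a toy sketch (all names here are hypothetical, not from the repo): persist the data first, and treat a failed permissions write as something to queue for reconciliation rather than a reason to lose the collection:

```python
# Toy illustration of "persist data first, then permissions" with a
# reconciliation queue for failed permission writes. All names hypothetical.
collections, permission_tuples, reconcile_queue = {}, set(), []

def write_permissions(name, fail=False):
    if fail:
        raise RuntimeError("authz store unavailable")
    permission_tuples.add(("user:admin", "owner", f"collection:{name}"))

def create_collection(name, authz_fails=False):
    collections[name] = {"name": name}             # 1. persist the data
    try:
        write_permissions(name, fail=authz_fails)  # 2. then grant permissions
    except RuntimeError:
        # broken permissions are recoverable: queue for later reconciliation
        reconcile_queue.append(name)

create_collection("ok-col")
create_collection("flaky-col", authz_fails=True)
print(sorted(collections))   # both collections were persisted
print(reconcile_queue)       # ['flaky-col'] awaits a permissions retry
```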
I promise this was the last bit of python code you\u2019ll see in this article.
"},{"location":"strategies/multi-tenancy/authorization-model-impl-with-openfga/#the-infra","title":"The Infra","text":"Infrastructure!!! Finally, a sigh of relief.
Let\u2019s draw a diagram:
Link
We have our Chroma server, which relies on OpenFGA, which in turn persists data in PostgreSQL. \u201cOk, but \u2026\u201d, I can see you scratching your head, \u201c\u2026 how do I bring this magnificent architecture to life?\u201d. I thought you\u2019d never ask. We\u2019ll rely on our trusty docker compose skills with the following sequence in mind:
\u201cWhere is the docker-compose.yaml
!\u201d. Voil\u00e0, my impatient friends:
version: '3.9'\n\nnetworks:\n net:\n driver: bridge\n\nservices:\n server:\n depends_on:\n openfga:\n condition: service_healthy\n import:\n condition: service_completed_successfully\n image: chroma-server\n build:\n dockerfile: Dockerfile\n volumes:\n - ./chroma-data:/chroma/chroma\n - ./server.htpasswd:/chroma/server.htpasswd\n - ./groupfile:/chroma/groupfile\n - ./data/:/data\n command: \"--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30\"\n environment:\n - IS_PERSISTENT=TRUE\n - CHROMA_SERVER_AUTH_PROVIDER=${CHROMA_SERVER_AUTH_PROVIDER}\n - CHROMA_SERVER_AUTH_CREDENTIALS_FILE=${CHROMA_SERVER_AUTH_CREDENTIALS_FILE}\n - CHROMA_SERVER_AUTH_CREDENTIALS=${CHROMA_SERVER_AUTH_CREDENTIALS}\n - CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=${CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER}\n - CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER=${CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER}\n - PERSIST_DIRECTORY=${PERSIST_DIRECTORY:-/chroma/chroma}\n - CHROMA_OTEL_EXPORTER_ENDPOINT=${CHROMA_OTEL_EXPORTER_ENDPOINT}\n - CHROMA_OTEL_EXPORTER_HEADERS=${CHROMA_OTEL_EXPORTER_HEADERS}\n - CHROMA_OTEL_SERVICE_NAME=${CHROMA_OTEL_SERVICE_NAME}\n - CHROMA_OTEL_GRANULARITY=${CHROMA_OTEL_GRANULARITY}\n - CHROMA_SERVER_NOFILE=${CHROMA_SERVER_NOFILE}\n - CHROMA_SERVER_AUTHZ_PROVIDER=${CHROMA_SERVER_AUTHZ_PROVIDER}\n - CHROMA_SERVER_AUTHZ_CONFIG_PROVIDER=${CHROMA_SERVER_AUTHZ_CONFIG_PROVIDER}\n - FGA_API_URL=http://openfga:8080\n - FGA_CONFIG_FILE=/data/store.json # we expect that the import job will create this file\n restart: unless-stopped # possible values are: \"no\", always\", \"on-failure\", \"unless-stopped\"\n ports:\n - \"8000:8000\"\n healthcheck:\n # Adjust below to match your container port\n test: [ \"CMD\", \"curl\", \"-f\", \"http://localhost:8000/api/v1/heartbeat\" ]\n interval: 30s\n timeout: 10s\n retries: 3\n networks:\n - net\n postgres:\n image: postgres:14\n container_name: postgres\n networks:\n - net\n ports:\n - 
\"5432:5432\"\n environment:\n - POSTGRES_USER=postgres\n - POSTGRES_PASSWORD=password\n healthcheck:\n test: [ \"CMD-SHELL\", \"pg_isready -U postgres\" ]\n interval: 5s\n timeout: 5s\n retries: 5\n volumes:\n - postgres_data_openfga:/var/lib/postgresql/data\n\n migrate:\n depends_on:\n postgres:\n condition: service_healthy\n image: openfga/openfga:latest\n container_name: migrate\n command: migrate\n environment:\n - OPENFGA_DATASTORE_ENGINE=postgres\n - OPENFGA_DATASTORE_URI=postgres://postgres:password@postgres:5432/postgres?sslmode=disable\n networks:\n - net\n openfga:\n depends_on:\n migrate:\n condition: service_completed_successfully\n image: openfga/openfga:latest\n container_name: openfga\n environment:\n - OPENFGA_DATASTORE_ENGINE=postgres\n - OPENFGA_DATASTORE_URI=postgres://postgres:password@postgres:5432/postgres?sslmode=disable\n - OPENFGA_LOG_FORMAT=json\n command: run\n networks:\n - net\n ports:\n # Needed for the http server\n - \"8082:8080\"\n # Needed for the grpc server (if used)\n - \"8083:8081\"\n # Needed for the playground (Do not enable in prod!)\n - \"3003:3000\"\n healthcheck:\n test: [ \"CMD\", \"/usr/local/bin/grpc_health_probe\", \"-addr=openfga:8081\" ]\n interval: 5s\n timeout: 30s\n retries: 3\n import:\n depends_on:\n openfga:\n condition: service_healthy\n image: fga-cli\n build:\n context: .\n dockerfile: Dockerfile-fgacli\n container_name: import\n volumes:\n - ./data/:/data\n command: |\n /bin/sh -c \"/data/create_store_and_import.sh\"\n environment:\n - FGA_SERVER_URL=http://openfga:8080\n networks:\n - net\nvolumes:\n postgres_data_openfga:\n driver: local\n
Don\u2019t forget to create an .env
file:
CHROMA_SERVER_AUTH_PROVIDER=\"chromadb.auth.basic.BasicAuthServerProvider\"\nCHROMA_SERVER_AUTH_CREDENTIALS_FILE=\"server.htpasswd\"\nCHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=\"chroma_auth.authn.basic.MultiUserHtpasswdFileServerAuthCredentialsProvider\"\nCHROMA_SERVER_AUTHZ_PROVIDER=\"chroma_auth.authz.openfga.OpenFGAAuthorizationProvider\"\nCHROMA_SERVER_AUTHZ_CONFIG_PROVIDER=\"chroma_auth.authz.openfga.OpenFGAAuthorizationConfigurationProvider\"\n
Update your server.htpasswd
to include the new user:
admin:$2y$05$vkBK4b1Vk5O98jNHgr.uduTJsTOfM395sKEKe48EkJCVPH/MBIeHK\nuser1:$2y$05$UQ0kC2x3T2XgeN4WU12BdekUwCJmLjJNhMaMtFNolYdj83OqiEpVu\nadmin-ext:$2y$05$9.L13wKQTHeXz9IH2UO2RurWEK./Z24qapzyi6ywQGJds2DaC36C2\n
And the groupfile
from before. And don\u2019t forget to take a look at the import script under - data/create_store_and_import.sh
Run the following command at the root of the repo and let things fail and burn (or, in the event this works, be awed; disclaimer: it worked on my machine):
docker compose up --build\n
"},{"location":"strategies/multi-tenancy/authorization-model-impl-with-openfga/#tests-who-needs-test-when-you-have-stable-infra","title":"Tests, who needs test when you have stable infra!","text":"Authorization is serious stuff, which is why we\u2019ve created a bare minimum set of tests to prove we\u2019re not totally wrong about it!
Real Serious Note
Serious Note: Take these things seriously and write a copious amount of tests before rolling things out to prod. Don\u2019t become an OWASP Top 10 \u201cHero\u201d. Broken access control is a thing that WILL keep you up at night.
We\u2019ll focus on three areas:
- Testing admin (owner) access
- Testing team access for owner and reader roles
- Testing cross team permissions
Admin Access
Simple check to ensure that whoever created the collection (aka the owner) is allowed all actions.
import uuid\nimport chromadb\nfrom chromadb.config import Settings\n\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"admin:password123\"))\nclient.heartbeat() # this should work with or without authentication - it is a public endpoint\nclient.list_collections() # this is a protected endpoint and requires authentication\n\ncol = client.get_or_create_collection(f\"test_collection-{str(uuid.uuid4())}\")\ncol.add(ids=[\"1\"], documents=[\"test doc\"])\n\ncol.get()\ncol.update(ids=[\"1\"], documents=[\"test doc 2\"])\ncol.count()\ncol.upsert(ids=[\"1\"], documents=[\"test doc 3\"])\ncol.delete(ids=[\"1\"])\n\nclient.delete_collection(col.name)\n
Full code
You can find the full code in test_auth.ipynb
Team Access
Team access tests whether roles and permissions associated with those roles are correctly enforced.
import uuid\nimport chromadb\nfrom chromadb.config import Settings\n\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"admin:password123\"))\nclient.heartbeat() # this should work with or without authentication - it is a public endpoint\nclient.list_collections() # this is a protected endpoint and requires authentication\n\ncol_name = f\"test_collection-{str(uuid.uuid4())}\"\ncol = client.get_or_create_collection(col_name)\nprint(f\"Creating collection {col.id}\")\ncol.add(ids=[\"1\"], documents=[\"test doc\"])\n\nclient.get_collection(col_name)\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"user1:password123\"))\n\nclient.heartbeat() # this should work with or without authentication - it is a public endpoint\nclient.list_collections() # this is a protected endpoint and requires authentication\nclient.count_collections()\nprint(\"Getting collection \" + col_name)\ncol = client.get_collection(col_name)\ncol.get()\ncol.count()\n\ntry:\n client.delete_collection(col_name)\nexcept Exception as e:\n print(e) #expect unauthorized error\n\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"admin:password123\"))\n\nclient.delete_collection(col_name)\n
Full code
You can find the full code in test_auth.ipynb
Cross-team access
In the cross team access scenario we\u2019ll create a collection with one team owner (admin
) and will try to access it (aka delete it) with another team\u2019s owner, in a very mano-a-mano (owner-to-owner) way. It is important to observe that all these collections are created within the same database (default_database
)
import uuid\nimport chromadb\nfrom chromadb.config import Settings\n\ncol_name = f\"test_collection-{str(uuid.uuid4())}\"\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"admin:password123\"))\n\nclient.get_or_create_collection(col_name)\n\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"admin-ext:password123\"))\n\nclient.get_or_create_collection(\"external-collection\")\n\ntry:\n client.delete_collection(col_name)\nexcept Exception as e:\n print(\"Expected error for admin-ext: \", str(e)) #expect unauthorized error\n\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",\n chroma_client_auth_credentials=\"admin:password123\"))\nclient.delete_collection(col_name)\ntry:\n client.delete_collection(\"external-collection\")\nexcept Exception as e:\n print(\"Expected error for admin: \", str(e)) #expect unauthorized error\n
Full code
You can find the full code in test_auth.ipynb
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/","title":"Chroma Authorization Model with OpenFGA","text":"Source Code
The source code for this article can be found here.
This article will not provide any code that you can use immediately but will set the stage for our next article, which will introduce the actual Chroma-OpenFGA integration.
With that in mind, let\u2019s get started.
Who is this article for? The intended audience is DevSecOps, but engineers and architects could also use this to learn about Chroma and the authorization models.
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#authorization-model","title":"Authorization Model","text":"Authorization models are an excellent way to abstract the way you wish your users to access your application from the actual implementation.
There are many ways to do authz, ranging from commercial Auth0 FGA to OSS options like Ory Keto/Kratos, CASBIN, Permify, and Kubescape, but for this article, we\u2019ve decided to use OpenFGA (which technically is Auth0\u2019s open-source framework for FGA).
Why OpenFGA, I hear you ask? Here are a few reasons:
- Apache-2 licensed
- CNCF Incubating project
- Zanzibar alignment in that it is a ReBAC (Relation-based access control) system
- DSL for modeling and testing permissions (as well as a JSON-based version for those with masochistic tendencies)
OpenFGA has done a great job explaining the steps to building an Authorization model, which you can read here. We will go over those while keeping our goal of creating an authorization model for Chroma.
It is worth noting that the resulting authorization model that we will create here will be suitable for many GenAI applications, such as general-purpose RAG systems. Still, it is not a one-size-fits-all solution to all problems. For instance, if you want to implement authz in Chroma within your organization, OpenFGA might not be the right tool for the job, and you should consult with your IT/Security department for guidance on integrating with existing systems.
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#the-goal","title":"The Goal","text":"Our goal is to achieve the following:
- Allow fine-grained access to the following resources - collection, database, tenant, and Chroma server.
- Allow grouping of users for improved permission management.
- Individual user access to resources
- Roles - owner, writer, reader
Document-Level Access
Although granting access to individual documents in a collection can be beneficial in some contexts, we have left that part out of our goals to keep things as simple and short as possible. If you are interested in this topic, reach out, and we will help you.
This article will not cover user management, commonly called Identity Access Management (IAM). We\u2019ll cover that in a subsequent article.
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#modeling-fundamentals","title":"Modeling Fundamentals","text":"Let\u2019s start with the fundamentals:
Why can user U perform action A on object O?
We will attempt to answer this question in the context of Chroma by following OpenFGA\u2019s approach to refining the model. The steps are:
- Pick the most important features.
- List of object types
- List of relations for the types
- Test the model
- Iterate
Given that OpenFGA is Zanzibar inspired, the basic primitive for it is a tuple of the following format:
(User,Relation,Object)\n
With the above we can express any relation between a user (or a team, or even another object), the action the user performs (captured by object relations), and the object (aka API resource).
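To make the primitive concrete, here is a toy in-memory illustration (not the OpenFGA engine, which also resolves indirect userset relations and relation definitions): an authorization check is simply membership of a (user, relation, object) triple in the tuple store:

```python
# Minimal illustration of Zanzibar-style tuples: authorization reduces to
# membership of a (user, relation, object) triple in the tuple store.
# This toy ignores indirect relations (usersets), which OpenFGA resolves.
tuples = {
    ("user:anne", "owner", "collection:default-default-articles"),
    ("user:bob", "reader", "collection:default-default-articles"),
}

def check(user: str, relation: str, obj: str) -> bool:
    """Direct-relation check, the core primitive a ReBAC engine builds on."""
    return (user, relation, obj) in tuples

print(check("user:anne", "owner", "collection:default-default-articles"))  # True
print(check("user:bob", "owner", "collection:default-default-articles"))   # False
```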
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#pick-the-features","title":"Pick the features","text":"In the context of Chroma, the features are the actions the user can perform on Chroma API (as of this writing v0.4.24).
Let\u2019s explore what are the actions that users can perform:
- Create a tenant
- Get a tenant
- Create a database for a tenant
- Get a database for a tenant
- Create a collection in a database
- Delete a collection from a database
- Update collection name and metadata
- List collections in a database
- Count collections in a database
- Add records to a collection
- Delete records from a collection
- Update records in a collection
- Upsert records in a collection
- Count records in a collection
- Get records from a collection
- Query records in a collection
- Get pre-flight-checks
Open Endpoints
Note we will omit the get heartbeat
and get version
actions, as it is generally a good idea to leave these open so that orchestrators (docker/k8s) can get the health status of Chroma.
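A minimal sketch of that idea, assuming illustrative path names matching Chroma\u2019s v1 API: public health endpoints bypass auth entirely, while everything else requires it:

```python
# Sketch: public endpoints typically bypass authn/authz so orchestrators
# can probe liveness. The paths are illustrative assumptions based on
# Chroma's v1 API, not an exhaustive list.
PUBLIC_PATHS = {"/api/v1/heartbeat", "/api/v1/version"}

def requires_auth(path: str) -> bool:
    """Return True for every endpoint the auth middleware should guard."""
    return path not in PUBLIC_PATHS

print(requires_auth("/api/v1/heartbeat"))    # False - health probes stay open
print(requires_auth("/api/v1/collections"))  # True - protected resource
```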
To make it easy to reason about relations in our authorization model we will rephrase the above to the following format:
A user {user} can perform action {action} to/on/in {object types} ... IF {conditions}\n
- A user can perform action create tenant on Chroma server if they are owner of the server
- A user can perform action get tenant on Chroma server if they are a reader or writer or owner of the server
- A user can perform action create database on a tenant if they are an owner of the tenant
- A user can perform action get database on a tenant if they are reader, writer or owner of the tenant
- A user can perform action create collection on a database if they are a writer or an owner of the database
- A user can perform action delete collection on a database if they are a writer or an owner of the database
- A user can perform action update collection name or metadata on a database if they are a writer or an owner of the database
- A user can perform action list collections in a database if they are a writer or an owner of the database
- A user can perform action count collections in a database if they are a writer or an owner of the database
- A user can perform action add records on a collection if they are writer or owner of the collection
- A user can perform action delete records on a collection if they are writer or owner of the collection
- A user can perform action update records on a collection if they are writer or owner of the collection
- A user can perform action upsert records on a collection if they are writer or owner of the collection
- A user can perform action get records on a collection if they are writer or owner or reader of the collection
- A user can perform action count records on a collection if they are writer or owner or reader of the collection
- A user can perform action query records on a collection if they are writer or owner or reader of the collection
- A user can perform action get pre-flight-checks on a Chroma server if they are writer or owner or reader of the server
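To keep the wording of such statements consistent as the model evolves, they can be rendered from a small table. The following is a hypothetical sketch (the `RULES` table covers only the first few statements and is not part of the article\u2019s code):

```python
# Sketch: render "A user can perform action {action} on {object} if they
# are {roles}" statements from a small table, so the statements stay
# consistent as the model evolves. RULES is a hypothetical excerpt.
RULES = [
    ("create tenant", "Chroma server", ["owner"]),
    ("get tenant", "Chroma server", ["reader", "writer", "owner"]),
    ("create database", "tenant", ["owner"]),
]

def render(action, obj, roles):
    who = " or ".join(roles)
    return f"A user can perform action {action} on {obj} if they are {who} of the {obj}"

statements = [render(*rule) for rule in RULES]
print(statements[0])
```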
We don\u2019t have to get it all right in the first iteration, but the above is a good starting point that can be adapted further.
The above statements alone are already a great introspection as to what we can do within Chroma and who is supposed to be able to do what. Please note that your mileage may vary, as per your authz requirements, but in our experience the variations are generally around the who.
As an astute reader, you will have already noted that we\u2019ve generally outlined some RBAC concepts in the form of owner, writer, and reader.
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#list-the-objects","title":"List the objects!!!","text":"Now that we know what our users can do, let\u2019s solidify our understanding of what our users will be performing these actions on, aka the object types.
Let\u2019s call them out:
- User - this is basic and pretty obvious object type that we want to model our users after
- Chroma server - this is our top level object in the access relations
- Tenant - for most Chroma developers this will equate to a team or a group
- Database
- Collection
We can also examine all of the <object>
references in the above statements to ensure we haven\u2019t missed any objects. So far it seems we\u2019re all good.
Now that we have our objects let\u2019s create a first iteration of our authorization model using OpenFGA DSL:
model\n schema 1.1\n\ntype server\ntype user\ntype tenant\ntype database\ntype collection\n
OpenFGA CLI
You will need to install openfga CLI - https://openfga.dev/docs/getting-started/install-sdk. Also check the VSCode extension for OpenFGA.
Let\u2019s validate our work:
fga model validate --file model-article-p1.fga\n
You should see the following output:
{\n \"is_valid\":true\n}\n
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#relations","title":"Relations","text":"Now that we have the actions and the objects, let us figure out the relationships we want to build into our model.
To come up with our relations we can follow these two rules:
- Any noun of the type
{noun} of a/an/the {type}
expression (e.g. of the collection
) - Any verb or action described with
can {action} on/in {type}
So now let\u2019s work on our model to expand it with relationships:
model\n schema 1.1\n\ntype user\n\ntype server\n relations\n define owner: [user]\n define reader: [user]\n define writer: [user]\n define can_get_preflight: reader or owner or writer\n define can_create_tenant: owner or writer\n\ntype tenant\n relations\n define owner: [user]\n define reader: [user]\n define writer: [user]\n define belongsTo: [server]\n define can_create_database: owner from belongsTo or writer from belongsTo or owner or writer\n define can_get_database: reader or owner or writer or owner from belongsTo or reader from belongsTo or writer from belongsTo\n\ntype database\n relations\n define owner: [user]\n define reader: [user]\n define writer: [user]\n define belongsTo: [tenant]\n define can_create_collection: owner from belongsTo or writer from belongsTo or owner or writer\n define can_delete_collection: owner from belongsTo or writer from belongsTo or owner or writer\n define can_list_collections: owner or writer or owner from belongsTo or writer from belongsTo\n define can_get_collection: owner or writer or owner from belongsTo or writer from belongsTo\n define can_get_or_create_collection: owner or writer or owner from belongsTo or writer from belongsTo\n define can_count_collections: owner or writer or owner from belongsTo or writer from belongsTo\n\ntype collection\n relations\n define owner: [user]\n define reader: [user]\n define writer: [user]\n define belongsTo: [database]\n define can_add_records: writer or reader or owner from belongsTo or writer from belongsTo\n define can_delete_records: writer or owner from belongsTo or writer from belongsTo\n define can_update_records: writer or owner from belongsTo or writer from belongsTo\n define can_get_records: reader or owner or writer or owner from belongsTo or reader from belongsTo or writer from belongsTo\n define can_upsert_records: writer or owner from belongsTo or writer from belongsTo\n define can_count_records: reader or owner or writer or owner from belongsTo or reader from belongsTo 
or writer from belongsTo\n define can_query_records: reader or owner or writer or owner from belongsTo or reader from belongsTo or writer from belongsTo\n
Let\u2019s validate it:
fga model validate --file model-article-p2.fga\n
This seems mostly accurate and should do OK as an authorization model, but let\u2019s see if we can make it better. If we implement the above, we will end up with lots of permissions in OpenFGA; not that it can\u2019t handle them, but as we get into the implementation details it will become cumbersome to update and maintain them all. So let\u2019s look for opportunities to simplify things a little.
Can we make the model a little simpler? The first question to ask is whether we really need owner, reader, and writer on every object, or whether we can make a decision about our model and simplify it. As it turns out, we can. Most multi-user systems tend to gravitate towards grouping things as a way to reduce the number of permissions that must be maintained. In our case we can group our users into a team
and in each team we\u2019ll have owner, writer, and reader roles.
Let\u2019s see the results:
model\n schema 1.1\n\ntype user\n\ntype team\n relations\n define owner: [user]\n define writer: [user]\n define reader: [user]\n\ntype server\n relations\n define can_get_preflight: [user, team#owner, team#writer, team#reader]\n define can_create_tenant: [user, team#owner, team#writer]\n define can_get_tenant: [user, team#owner, team#writer, team#reader]\n\ntype tenant\n relations\n define can_create_database: [user, team#owner, team#writer]\n define can_get_database: [user, team#owner, team#writer, team#reader]\n\ntype database\n relations\n define can_create_collection: [user, team#owner, team#writer]\n define can_list_collections: [user, team#owner, team#writer, team#reader]\n define can_get_or_create_collection: [user, team#owner, team#writer]\n define can_count_collections: [user, team#owner, team#writer, team#reader]\n\ntype collection\n relations\n define can_delete_collection: [user, team#owner, team#writer]\n define can_get_collection: [user, team#owner, team#writer, team#reader]\n define can_update_collection: [user, team#owner, team#writer]\n define can_add_records: [user, team#owner, team#writer]\n define can_delete_records: [user, team#owner, team#writer]\n define can_update_records: [user, team#owner, team#writer]\n define can_get_records: [user, team#owner, team#writer, team#reader]\n define can_upsert_records: [user, team#owner, team#writer]\n define can_count_records: [user, team#owner, team#writer, team#reader]\n define can_query_records: [user, team#owner, team#writer, team#reader]\n
That is arguably more readable.
As you will observe, we have also added [user]
to the permissions of each object. Why is that, you may ask? The reason is that we want fine-grained authorization: while a collection can belong to a team, we can also grant permissions to individual users. This gives us great flexibility with permissions, at the cost of a more complex implementation of how permissions are managed, but we will get to that in the next post.
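To build intuition for how a relation such as can_create_tenant: [user, team#owner, team#writer] resolves, here is a toy Python sketch of tuple-based checking. This is purely illustrative and not how OpenFGA works internally; all the names and the resolver itself are hypothetical:

```python
# Toy resolver for direct grants ("user:jane") and userset grants
# ("team:chroma#owner") over relationship tuples. Illustration only.

def check(tuples, user, relation, obj):
    """Return True if `user` has `relation` on `obj` given the tuple set."""
    for t_user, t_rel, t_obj in tuples:
        if t_rel != relation or t_obj != obj:
            continue
        if t_user == user:  # direct grant to an individual user
            return True
        if "#" in t_user:   # userset grant, e.g. team:chroma#owner
            team, team_rel = t_user.split("#", 1)
            if (user, team_rel, team) in tuples:  # user holds that team role
                return True
    return False

tuples = [
    ("user:jane", "owner", "team:chroma"),
    ("team:chroma#owner", "can_create_tenant", "server:server1"),
    ("user:sam", "can_create_tenant", "server:server1"),
]

print(check(tuples, "user:jane", "can_create_tenant", "server:server1"))  # True, via team role
print(check(tuples, "user:sam", "can_create_tenant", "server:server1"))   # True, direct grant
print(check(tuples, "user:jill", "can_create_tenant", "server:server1"))  # False
```

In OpenFGA itself this evaluation is done server-side via the Check API; the sketch only mirrors the two grant shapes our model uses.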
We have also removed the belongsTo
relationship, as we no longer need it. The reason: OpenFGA does not allow referencing relations more than a single layer deep in the hierarchy, so a collection cannot use the owner of its team for permissions (there are other ways to implement that, outside the scope of this article).
Let\u2019s recap what our model is capable of:
- Fine-grained access control to objects is possible via relations
- Users can be grouped into teams (a single user per team is also acceptable for cases where you need a user to be the sole owner of a collection or a database)
- Access to resources can be granted to individual users via object relations
- Roles can be defined within a team (this can be extended to allow roles per resource, but that is outside the scope of this article)
In short, we have achieved the goals we initially set, with a relatively simple and understandable model. But does the model actually work? Let\u2019s find out in the next section.
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#testing-the-model","title":"Testing the model","text":"Luckily OpenFGA folks have provided a great developer experience by making it easy to write and run tests. This is a massive W and time-saver.
- An individual user can be given access to specific resources via relations
- Users can hold any of the team roles
- An object can be accessed by a team
name: Chroma Authorization Model Tests # optional\n\nmodel_file: ./model-article-p4.fga # you can specify an external .fga file, or include it inline\n\n# tuple_file: ./tuples.yaml # you can specify an external file, or include it inline\ntuples:\n - user: user:jane\n relation: owner\n object: team:chroma\n - user: user:john\n relation: writer\n object: team:chroma\n - user: user:jill\n relation: reader\n object: team:chroma\n - user: user:sam\n relation: can_create_tenant\n object: server:server1\n - user: user:sam\n relation: can_get_tenant\n object: server:server1\n - user: user:sam\n relation: can_get_preflight\n object: server:server1\n - user: user:michelle\n relation: can_create_tenant\n object: server:server1\n - user: team:chroma#owner\n relation: can_get_preflight\n object: server:server1\n - user: team:chroma#owner\n relation: can_create_tenant\n object: server:server1\n - user: team:chroma#owner\n relation: can_get_tenant\n object: server:server1\n - user: team:chroma#writer\n relation: can_get_preflight\n object: server:server1\n - user: team:chroma#writer\n relation: can_create_tenant\n object: server:server1\n - user: team:chroma#writer\n relation: can_get_tenant\n object: server:server1\n - user: team:chroma#reader\n relation: can_get_preflight\n object: server:server1\n - user: team:chroma#reader\n relation: can_get_tenant\n object: server:server1\n\ntests:\n - name: Users should have team roles\n check:\n - user: user:jane\n object: team:chroma\n assertions:\n owner: true\n writer: false\n reader: false\n - user: user:john\n object: team:chroma\n assertions:\n writer: true\n owner: false\n reader: false\n - user: user:jill\n object: team:chroma\n assertions:\n writer: false\n owner: false\n reader: true\n - user: user:unknown\n object: team:chroma\n assertions:\n writer: false\n owner: false\n reader: false\n - user: user:jane\n object: team:unknown\n assertions:\n writer: false\n owner: false\n reader: false\n - user: user:unknown\n object: 
team:unknown\n assertions:\n writer: false\n owner: false\n reader: false\n - name: Users should have direct access to server\n check:\n - user: user:sam\n object: server:server1\n assertions:\n can_get_preflight: true\n can_create_tenant: true\n can_get_tenant: true\n - user: user:michelle\n object: server:server1\n assertions:\n can_get_preflight: false\n can_create_tenant: true\n can_get_tenant: false\n - user: user:unknown\n object: server:server1\n assertions:\n can_get_preflight: false\n can_create_tenant: false\n can_get_tenant: false\n - user: user:jill\n object: server:serverX\n assertions:\n can_get_preflight: false\n can_create_tenant: false\n can_get_tenant: false\n - name: Users of a team should have access to server\n check:\n - user: user:jane\n object: server:server1\n assertions:\n can_create_tenant: true\n can_get_tenant: true\n can_get_preflight: true\n - user: user:john\n object: server:server1\n assertions:\n can_create_tenant: true\n can_get_tenant: true\n can_get_preflight: true\n - user: user:jill\n object: server:server1\n assertions:\n can_create_tenant: false\n can_get_tenant: true\n can_get_preflight: true\n - user: user:unknown\n object: server:server1\n assertions:\n can_create_tenant: false\n can_get_tenant: false\n can_get_preflight: false\n
Let\u2019s run the tests:
fga model test --tests test.model-article-p4.fga.yaml\n
This will result in the following output:
# Test Summary #\nTests 3/3 passing\nChecks 42/42 passing\n
That is all, folks. We try to keep things as concise as possible, and this article has already stretched our comfort level in that area. The bottom line is that authorization is no joke, and it should take as much time as it needs.
Writing out all the tests here would not be concise (maybe we\u2019ll add them to the repo).
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#conclusion","title":"Conclusion","text":"In this article we\u2019ve have built an authorization model for Chroma from scratch using OpenFGA. Admittedly it is a simple model, it still gives is a lot of flexibility to control access to Chroma resources.
"},{"location":"strategies/multi-tenancy/authorization-model-with-openfga/#resources","title":"Resources","text":" - https://github.com/amikos-tech/chromadb-auth - the companion repo for this article (files are stored under
openfga/basic/
) - https://openfga.dev/docs - Read it, understand it, code it!
- https://marketplace.visualstudio.com/items?itemName=openfga.openfga-vscode - It makes your life easier
"},{"location":"strategies/multi-tenancy/multi-user-basic-auth/","title":"Multi-User Basic Auth","text":""},{"location":"strategies/multi-tenancy/multi-user-basic-auth/#why-multi-user-auth","title":"Why Multi-user Auth?","text":"Multi-user authentication can be crucial for several reasons. Let's delve into this topic.
Security\u2014The primary concern is the security of your deployments. You need to control who can access your data and ensure they are authorized to do so. You may wonder, since Chroma offers basic and token-based authentication, why is multi-user authentication necessary?
Should you share your Chroma access credentials with your users, or with any app that depends on Chroma? The answer is a categorical NO.
Another reason to consider multi-user authentication is to differentiate access to your data. However, the solution presented here doesn't provide this. It's a stepping stone towards our upcoming article on multi-tenancy and securing Chroma data.
Last but not least is auditing. While we acknowledge this is not for everybody, there is increasing pressure to provide visibility into your app via auditable events.
Multi-user experiences - Not all GenAI apps are intended to be private or individual. This is another reason to consider and implement multi-user authentication and authorization.
"},{"location":"strategies/multi-tenancy/multi-user-basic-auth/#dive-right-in","title":"Dive right in.","text":"Let's get straight to the point and build a multi-user authorization with basic authentication. Here's our goal:
- Develop a server-side authentication credentials provider that can read multiple users from a
.htpasswd
file - Generate a multi-user
.htpasswd
file with several test users - Package our plugin with the Chroma base image and execute it using Docker Compose
Auth CIP
Chroma has detailed info about how its authentication and authorization are implemented. Should you want to learn more, go read the CIP (Chroma Improvement Proposal) doc.
"},{"location":"strategies/multi-tenancy/multi-user-basic-auth/#the-plugin","title":"The Plugin","text":"import importlib\nimport logging\nfrom typing import Dict, cast, TypeVar, Optional\n\nfrom chromadb.auth import (\n ServerAuthCredentialsProvider,\n AbstractCredentials,\n SimpleUserIdentity,\n)\nfrom chromadb.auth.registry import register_provider\nfrom chromadb.config import System\nfrom chromadb.telemetry.opentelemetry import (\n OpenTelemetryGranularity,\n trace_method,\n add_attributes_to_current_span,\n)\nfrom pydantic import SecretStr\nfrom overrides import override\n\nT = TypeVar(\"T\")\n\nlogger = logging.getLogger(__name__)\n\n\n@register_provider(\"multi_user_htpasswd_file\")\nclass MultiUserHtpasswdFileServerAuthCredentialsProvider(ServerAuthCredentialsProvider):\n _creds: Dict[str, SecretStr] # contains user:password-hash\n\n def __init__(self, system: System) -> None:\n super().__init__(system)\n try:\n self.bc = importlib.import_module(\"bcrypt\")\n except ImportError:\n raise ValueError(\n \"The bcrypt python package is not installed. \"\n \"Please install it with `pip install bcrypt`\"\n )\n system.settings.require(\"chroma_server_auth_credentials_file\")\n _file = str(system.settings.chroma_server_auth_credentials_file)\n self._creds = dict()\n with open(_file, \"r\") as f:\n for line in f:\n _raw_creds = [v for v in line.strip().split(\":\")]\n if len(_raw_creds) != 2:\n raise ValueError(\n \"Invalid Htpasswd credentials found in \"\n f\"[{str(system.settings.chroma_server_auth_credentials_file)}]. 
\"\n \"Must be <username>:<bcrypt passwd>.\"\n )\n self._creds[_raw_creds[0]] = SecretStr(_raw_creds[1])\n\n @trace_method( # type: ignore\n \"MultiUserHtpasswdFileServerAuthCredentialsProvider.validate_credentials\",\n OpenTelemetryGranularity.ALL,\n )\n @override\n def validate_credentials(self, credentials: AbstractCredentials[T]) -> bool:\n _creds = cast(Dict[str, SecretStr], credentials.get_credentials())\n\n if len(_creds) != 2 or \"username\" not in _creds or \"password\" not in _creds:\n logger.error(\n \"Returned credentials did match expected format: \"\n \"dict[username:SecretStr, password: SecretStr]\"\n )\n add_attributes_to_current_span(\n {\n \"auth_succeeded\": False,\n \"auth_error\": \"Returned credentials did match expected format: \"\n \"dict[username:SecretStr, password: SecretStr]\",\n }\n )\n return False # early exit on wrong format\n _user_pwd_hash = (\n self._creds[_creds[\"username\"].get_secret_value()]\n if _creds[\"username\"].get_secret_value() in self._creds\n else None\n )\n validation_response = _user_pwd_hash is not None and self.bc.checkpw(\n _creds[\"password\"].get_secret_value().encode(\"utf-8\"),\n _user_pwd_hash.get_secret_value().encode(\"utf-8\"),\n )\n add_attributes_to_current_span(\n {\n \"auth_succeeded\": validation_response,\n \"auth_error\": f\"Failed to validate credentials for user {_creds['username'].get_secret_value()}\"\n if not validation_response\n else \"\",\n }\n )\n return validation_response\n\n @override\n def get_user_identity(\n self, credentials: AbstractCredentials[T]\n ) -> Optional[SimpleUserIdentity]:\n _creds = cast(Dict[str, SecretStr], credentials.get_credentials())\n return SimpleUserIdentity(_creds[\"username\"].get_secret_value())\n
In less than 80 lines of code, we have our plugin. Let's delve into some of the key points of the code above:
__init__
- Here, we dynamically import bcrypt, which we'll use to check user credentials. We also read the configured credentials file - server.htpasswd
line by line, to retrieve each user (we assume each line contains a new user with its bcrypt hash). validate_credentials
- This is where the magic happens. We initially perform some lightweight validations on the credentials parsed by Chroma and passed to the plugin. Then, we attempt to retrieve the user and its hash from the _creds
dictionary. The final step is to verify the hash. We've also added some attributes to monitor our authentication process in our observability layer (we have an upcoming article about this). get_user_identity
- Constructs a simple user identity, which the authorization plugin uses to verify permissions. Although not needed for now, each authentication plugin must implement this, as user identities are crucial for authorization.
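As a standalone illustration of the early-exit check in validate_credentials, the expected shape is a two-key dict carrying a username and a password (the real plugin wraps values in pydantic SecretStr; the helper name here is hypothetical):

```python
def has_expected_shape(creds: dict) -> bool:
    # Mirrors the plugin's early exit: exactly two keys, username and password.
    return len(creds) == 2 and "username" in creds and "password" in creds

print(has_expected_shape({"username": "admin", "password": "password123"}))  # True
print(has_expected_shape({"token": "abc"}))                                  # False
```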
We'll store our plugin in __init__.py
within the following directory structure - chroma_auth/authn/basic/__init__.py
(refer to the repository for details).
"},{"location":"strategies/multi-tenancy/multi-user-basic-auth/#password-file","title":"Password file","text":"Now that we have our plugin let\u2019s create a password file with a few users:
Initial user:
echo \"password123\" | htpasswd -iBc server.htpasswd admin\n
The above will create (-c
flag) a new server.htpasswd file with an initial user admin
and the password will be read from stdin (-i
flag) and saved as a bcrypt hash (-B
flag).
Let\u2019s add another user:
echo \"password123\" | htpasswd -iB server.htpasswd user1\n
Now our server.htpasswd
file will look like this:
admin:$2y$05$vkBK4b1Vk5O98jNHgr.uduTJsTOfM395sKEKe48EkJCVPH/MBIeHK\nuser1:$2y$05$UQ0kC2x3T2XgeN4WU12BdekUwCJmLjJNhMaMtFNolYdj83OqiEpVu\n
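The plugin's __init__ parses exactly this format: one user:bcrypt-hash pair per line. Here is a minimal stdlib-only sketch of that parsing step (the function name is ours; the real plugin additionally wraps each hash in a pydantic SecretStr):

```python
# Parse htpasswd-style "username:bcrypt-hash" lines into a dict,
# mirroring what the credentials provider does on startup.
HTPASSWD_CONTENT = """\
admin:$2y$05$vkBK4b1Vk5O98jNHgr.uduTJsTOfM395sKEKe48EkJCVPH/MBIeHK
user1:$2y$05$UQ0kC2x3T2XgeN4WU12BdekUwCJmLjJNhMaMtFNolYdj83OqiEpVu
"""

def parse_htpasswd(content: str) -> dict:
    creds = {}
    for line in content.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        parts = line.strip().split(":")
        if len(parts) != 2:
            raise ValueError("Must be <username>:<bcrypt passwd>.")
        creds[parts[0]] = parts[1]
    return creds

print(sorted(parse_htpasswd(HTPASSWD_CONTENT)))  # ['admin', 'user1']
```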
Moving on to the Docker setup.
"},{"location":"strategies/multi-tenancy/multi-user-basic-auth/#docker-compose-setup","title":"Docker compose setup","text":"Let\u2019s create a Dockerfile
to bundle our plugin with the official Chroma image:
ARG CHROMA_VERSION=0.4.24\nFROM ghcr.io/chroma-core/chroma:${CHROMA_VERSION} as base\n\nCOPY chroma_auth/ /chroma/chroma_auth\n
This will pick up the official Chroma Docker image and add our plugin directory structure so that we can use it.
Now let\u2019s create an .env
file to load our plugin:
CHROMA_SERVER_AUTH_PROVIDER=\"chromadb.auth.basic.BasicAuthServerProvider\"\nCHROMA_SERVER_AUTH_CREDENTIALS_FILE=\"server.htpasswd\"\nCHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=\"chroma_auth.authn.basic.MultiUserHtpasswdFileServerAuthCredentialsProvider\"\n
And finally our docker-compose.yaml
:
version: '3.9'\n\nnetworks:\n net:\n driver: bridge\n\nservices:\n server:\n image: chroma-server\n build:\n dockerfile: Dockerfile\n volumes:\n - ./chroma-data:/chroma/chroma\n - ./server.htpasswd:/chroma/server.htpasswd\n command: \"--workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30\"\n environment:\n - IS_PERSISTENT=TRUE\n - CHROMA_SERVER_AUTH_PROVIDER=${CHROMA_SERVER_AUTH_PROVIDER}\n - CHROMA_SERVER_AUTH_CREDENTIALS_FILE=${CHROMA_SERVER_AUTH_CREDENTIALS_FILE}\n - CHROMA_SERVER_AUTH_CREDENTIALS=${CHROMA_SERVER_AUTH_CREDENTIALS}\n - CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=${CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER}\n - CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER=${CHROMA_SERVER_AUTH_TOKEN_TRANSPORT_HEADER}\n - PERSIST_DIRECTORY=${PERSIST_DIRECTORY:-/chroma/chroma}\n - CHROMA_OTEL_EXPORTER_ENDPOINT=${CHROMA_OTEL_EXPORTER_ENDPOINT}\n - CHROMA_OTEL_EXPORTER_HEADERS=${CHROMA_OTEL_EXPORTER_HEADERS}\n - CHROMA_OTEL_SERVICE_NAME=${CHROMA_OTEL_SERVICE_NAME}\n - CHROMA_OTEL_GRANULARITY=${CHROMA_OTEL_GRANULARITY}\n - CHROMA_SERVER_NOFILE=${CHROMA_SERVER_NOFILE}\n restart: unless-stopped # possible values are: \"no\", always\", \"on-failure\", \"unless-stopped\"\n ports:\n - \"8000:8000\"\n healthcheck:\n # Adjust below to match your container port\n test: [ \"CMD\", \"curl\", \"-f\", \"http://localhost:8000/api/v1/heartbeat\" ]\n interval: 30s\n timeout: 10s\n retries: 3\n networks:\n - net\n
"},{"location":"strategies/multi-tenancy/multi-user-basic-auth/#the-test","title":"The test","text":"Let\u2019s run our docker compose setup:
docker compose --env-file ./.env up --build\n
You should see the following log message if the plugin was successfully loaded:
server-1 | DEBUG: [01-04-2024 14:10:13] Starting component MultiUserHtpasswdFileServerAuthCredentialsProvider\nserver-1 | DEBUG: [01-04-2024 14:10:13] Starting component BasicAuthServerProvider\nserver-1 | DEBUG: [01-04-2024 14:10:13] Starting component FastAPIChromaAuthMiddleware\n
Once our container is up and running, let\u2019s see if our multi-user auth works:
import chromadb\nfrom chromadb.config import Settings\n\nclient = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",chroma_client_auth_credentials=\"admin:password123\"))\nclient.heartbeat() # this should work with or without authentication - it is a public endpoint\nclient.get_or_create_collection(\"test_collection\") # this is a protected endpoint and requires authentication\nclient.list_collections() # this is a protected endpoint and requires authentication\n
The above code should return the list of collections, containing the single collection test_collection
that we created.
(chromadb-multi-user-basic-auth-py3.11) [chromadb-multi-user-basic-auth]python 19:51:38 \u2601 main \u2602 \u26a1 \u271a\nPython 3.11.7 (main, Dec 30 2023, 14:03:09) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin\nType \"help\", \"copyright\", \"credits\" or \"license\" for more information.\n>>> import chromadb\n>>> from chromadb.config import Settings\n>>> \n>>> client = chromadb.HttpClient(\n... settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",chroma_client_auth_credentials=\"admin:password123\"))\n>>> client.heartbeat() # this should work with or without authentication - it is a public endpoint\n1711990302270211007\n>>> \n>>> client.list_collections() # this is a protected endpoint and requires authentication\n[]\n
Great, now let\u2019s test for our other user:
client = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",chroma_client_auth_credentials=\"user1:password123\"))\n
Works just as well (logs omitted for brevity).
To ensure that our plugin works as expected, let\u2019s also test with a user that is not in our server.htpasswd
file:
client = chromadb.HttpClient(\n settings=Settings(chroma_client_auth_provider=\"chromadb.auth.basic.BasicAuthClientProvider\",chroma_client_auth_credentials=\"invalid_user:password123\"))\n
Traceback (most recent call last):\n File \"<stdin>\", line 1, in <module>\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/__init__.py\", line 197, in HttpClient\n return ClientCreator(tenant=tenant, database=database, settings=settings)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/client.py\", line 144, in __init__\n self._validate_tenant_database(tenant=tenant, database=database)\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/client.py\", line 445, in _validate_tenant_database\n raise e\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/client.py\", line 438, in _validate_tenant_database\n self._admin_client.get_tenant(name=tenant)\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/client.py\", line 486, in get_tenant\n return self._server.get_tenant(name=name)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/telemetry/opentelemetry/__init__.py\", line 127, in wrapper\n return f(*args, **kwargs)\n ^^^^^^^^^^^^^^^^^^\n File \"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/fastapi.py\", line 200, in get_tenant\n raise_chroma_error(resp)\n File 
\"/Users/tazarov/Library/Caches/pypoetry/virtualenvs/chromadb-multi-user-basic-auth-vIZuPNTE-py3.11/lib/python3.11/site-packages/chromadb/api/fastapi.py\", line 649, in raise_chroma_error\n raise chroma_error\nchromadb.errors.AuthorizationError: Unauthorized\n
As expected, we get an authorization error when trying to connect to Chroma (the client initialization validates the tenant and DB, both protected endpoints, which raises the exception above).
"},{"location":"strategies/multi-tenancy/naive-multi-tenancy/","title":"Naive Multi-tenancy Strategies","text":"Single-note Chroma
The strategies below are applicable to single-node Chroma only. They require your app to act as both PEP (Policy Enforcement Point) and PDP (Policy Decision Point) for authorization. This is a naive approach to multi-tenancy and is probably not suited for production environments; however, it is a good and simple way to get started with multi-tenancy in Chroma.
Authorization
We are in the process of creating a series of articles on how to implement proper authorization in Chroma, leveraging an external service and Chroma's auth plugins. The first article of the series is available on Medium and will also be made available here soon.
"},{"location":"strategies/multi-tenancy/naive-multi-tenancy/#introduction","title":"Introduction","text":"There are several multi-tenancy strategies available to users of Chroma. The actual strategy will depend on the needs of the user and the application. The strategies below apply to multi-user environments, but do no factor in partly-shared resources like groups or teams.
- User-Per-Doc: In this scenario, the app maintains one or more collections and each document is associated with a single user.
- User-Per-Collection: In this scenario, the app maintains multiple collections and each collection is associated with a single user.
- User-Per-Database: In this scenario, the app maintains multiple databases within a single tenant and each database is associated with a single user.
- User-Per-Tenant: In this scenario, the app maintains multiple tenants and each tenant is associated with a single user.
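The four strategies differ mainly in which Chroma construct is scoped to a user. As a compact reference, here is a sketch of the naming conventions the example code in the following sections happens to use (the formats are this article's arbitrary choices, not Chroma requirements):

```python
def scope_name(strategy: str, user_id: str) -> str:
    """Map a multi-tenancy strategy to the per-user scope name used below."""
    formats = {
        "user-per-doc": user_id,  # stored as a metadata value, not a name
        "user-per-collection": f"user-collection:{user_id}",
        "user-per-database": f"db:{user_id}",
        "user-per-tenant": f"tenant_user:{user_id}",
    }
    return formats[strategy]

print(scope_name("user-per-collection", "user1"))  # user-collection:user1
print(scope_name("user-per-tenant", "user1"))      # tenant_user:user1
```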
"},{"location":"strategies/multi-tenancy/naive-multi-tenancy/#user-per-doc","title":"User-Per-Doc","text":"The goal of this strategy is to grant user permissions to access individual documents.
To implement this strategy you need to add some sort of user identification to each document that belongs to a user. For this example we will assume it is user_id
.
import chromadb\n\nclient = chromadb.PersistentClient()\ncollection = client.get_or_create_collection(\"my-collection\")\ncollection.add(\n documents=[\"This is document1\", \"This is document2\"],\n metadatas=[{\"user_id\": \"user1\"}, {\"user_id\": \"user2\"}],\n ids=[\"doc1\", \"doc2\"],\n)\n
At query time you will have to provide the user_id
as a filter to your query like so:
results = collection.query(\n query_texts=[\"This is a query document\"],\n where={\"user_id\": \"user1\"},\n)\n
To successfully implement this strategy your code needs to consistently add and filter on the user_id
metadata to ensure separation of data.
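One way to make that consistency less error-prone is to funnel every query through a small helper that always injects the user scope. A sketch (the helper is hypothetical; note that Chroma's where clause is a dict, and combining it with additional conditions requires the $and operator):

```python
from typing import Optional

def scoped_where(user_id: str, extra: Optional[dict] = None) -> dict:
    """Build a `where` filter that always pins results to a single user."""
    user_filter = {"user_id": user_id}
    if not extra:
        return user_filter
    # Chroma expects multiple top-level conditions to be wrapped in $and.
    return {"$and": [user_filter, extra]}

print(scoped_where("user1"))
# {'user_id': 'user1'}
print(scoped_where("user1", {"topic": "physics"}))
# {'$and': [{'user_id': 'user1'}, {'topic': 'physics'}]}
```

Call sites then use collection.query(query_texts=[...], where=scoped_where(current_user_id)) instead of hand-building the filter each time.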
Drawbacks:
- Error-prone: Messing up the filtering can lead to data being leaked across users.
- Scalability: As the number of users and documents grows, filtering on metadata can become slow.
"},{"location":"strategies/multi-tenancy/naive-multi-tenancy/#user-per-collection","title":"User-Per-Collection","text":"The goal of this strategy is to grant a user access to all documents in a collection.
To implement this strategy you need to create a collection for each user. For this example we will assume it is user_id
.
import chromadb\n\nclient = chromadb.PersistentClient()\nuser_id = \"user1\"\ncollection = client.get_or_create_collection(f\"user-collection:{user_id}\")\ncollection.add(\n documents=[\"This is document1\", \"This is document2\"],\n ids=[\"doc1\", \"doc2\"],\n)\n
At query time you will have to use the user_id
to fetch the user\u2019s collection like so:
user_id = \"user1\"\nuser_collection = client.get_collection(f\"user-collection:{user_id}\")\nresults = user_collection.query(\n query_texts=[\"This is a query document\"],\n)\n
To successfully implement this strategy your code needs to consistently create and query the correct collection for the user.
Drawbacks:
- Error-prone: Messing up the collection name can lead to data being leaked across users.
- Shared document search: If you want to maintain some documents shared then you will have to create a separate collection for those documents and allow users to query the shared collection as well.
"},{"location":"strategies/multi-tenancy/naive-multi-tenancy/#user-per-database","title":"User-Per-Database","text":"The goal of this strategy is to associate a user with a single database thus granting them access to all collections and documents within the database.
import chromadb\nfrom chromadb import DEFAULT_TENANT\nfrom chromadb import Settings\n\nadminClient = chromadb.AdminClient(Settings(\n is_persistent=True,\n persist_directory=\"multitenant\",\n))\n\n\n# For Remote Chroma server:\n# \n# adminClient= chromadb.AdminClient(Settings(\n# chroma_api_impl=\"chromadb.api.fastapi.FastAPI\",\n# chroma_server_host=\"localhost\",\n# chroma_server_http_port=\"8000\",\n# ))\n\ndef get_or_create_db_for_user(user_id):\n database = f\"db:{user_id}\"\n try:\n adminClient.get_database(database)\n except Exception as e:\n adminClient.create_database(database, DEFAULT_TENANT)\n return DEFAULT_TENANT, database\n\n\nuser_id = \"user_John\"\n\ntenant, database = get_or_create_db_for_user(user_id)\n# replace with chromadb.HttpClient for remote Chroma server\nclient = chromadb.PersistentClient(path=\"multitenant\", tenant=tenant, database=database)\ncollection = client.get_or_create_collection(\"user_collection\")\ncollection.add(\n documents=[\"This is document1\", \"This is document2\"],\n ids=[\"doc1\", \"doc2\"],\n)\n
In the above code we do the following:
- We create or get a database for each user in the
DEFAULT_TENANT
using the chromadb.AdminClient
. - We then create a
PersistentClient
for each user with the tenant
and database
we got from the AdminClient
- We then create or get a collection and add data to it.
Drawbacks:
- This strategy requires consistent management of tenants and databases and their use in the client application.
"},{"location":"strategies/multi-tenancy/naive-multi-tenancy/#user-per-tenant","title":"User-Per-Tenant","text":"The goal of this strategy is to associate a user with a single tenant thus granting them access to all databases, collections, and documents within the tenant.
import chromadb\nfrom chromadb import DEFAULT_DATABASE\nfrom chromadb import Settings\n\nadminClient = chromadb.AdminClient(Settings(\n chroma_api_impl=\"chromadb.api.segment.SegmentAPI\",\n is_persistent=True,\n persist_directory=\"multitenant\",\n))\n\n\n# For Remote Chroma server:\n# \n# adminClient= chromadb.AdminClient(Settings(\n# chroma_api_impl=\"chromadb.api.fastapi.FastAPI\",\n# chroma_server_host=\"localhost\",\n# chroma_server_http_port=\"8000\",\n# ))\n\ndef get_or_create_tenant_for_user(user_id):\n tenant_id = f\"tenant_user:{user_id}\"\n try:\n adminClient.get_tenant(tenant_id)\n except Exception as e:\n adminClient.create_tenant(tenant_id)\n adminClient.create_database(DEFAULT_DATABASE, tenant_id)\n return tenant_id, DEFAULT_DATABASE\n\n\nuser_id = \"user1\"\n\ntenant, database = get_or_create_tenant_for_user(user_id)\n# replace with chromadb.HttpClient for remote Chroma server\nclient = chromadb.PersistentClient(path=\"multitenant\", tenant=tenant, database=database)\ncollection = client.get_or_create_collection(\"user_collection\")\ncollection.add(\n documents=[\"This is document1\", \"This is document2\"],\n ids=[\"doc1\", \"doc2\"],\n)\n
In the above code we do the following:
- We create or get a tenant for each user with
DEFAULT_DATABASE
using the chromadb.AdminClient
. - We then create a
PersistentClient
for each user with the tenant
and database
we got from the AdminClient
- We then create or get a collection and add data to it.
Drawbacks:
- This strategy requires consistent management of tenants and databases and their use in the client application.
"}]}