Commit

feat: Version bump + HNSW examples (#49)
tazarov authored Nov 21, 2024
1 parent 461e63c commit 36157ae
Showing 7 changed files with 56 additions and 27 deletions.
2 changes: 1 addition & 1 deletion docs/core/advanced/wal.md
@@ -6,7 +6,7 @@ request (aka transaction) is safely stored before acknowledging back to the user
after writing to the WAL, the data is also written to the index. This enables Chroma to serve as real-time search
engine, where the data is available for querying immediately after it is written to the WAL.

Below is a diagram that illustrates the WAL in ChromaDB (ca. v0.5.1322):
Below is a diagram that illustrates the WAL in ChromaDB (ca. v0.5.20):

![WAL](../../assets/images/WAL.png){: style="width:100%"}

59 changes: 44 additions & 15 deletions docs/core/configuration.md
@@ -165,6 +165,12 @@ HNSW for your use case. All HNSW parameters are configured as metadata for a col
- Possible values: `l2`, `cosine`, `ip`
- Parameter **_cannot_** be changed after index creation.

**Example**:

```python
res = client.create_collection("my_collection", metadata={ "hnsw:space": "cosine"})
```

### `hnsw:construction_ef`

**Description**: Controls the number of neighbours in the HNSW graph to explore when adding new vectors. The more
@@ -178,6 +184,12 @@ explores the better and more exhaustive the results will be. Increasing the valu
- Values must be positive integers.
- Parameter **_cannot_** be changed after index creation.

**Example**:

```python
res = client.create_collection("my_collection", metadata={ "hnsw:construction_ef": 100})
```

### `hnsw:M`

**Description**: Controls the maximum number of neighbour connections (M) a newly inserted vector can have. A higher value
@@ -191,6 +203,12 @@ consumption.
- Values must be positive integers.
- Parameter **_cannot_** be changed after index creation.

**Example**:

```python
res = client.create_collection("my_collection", metadata={ "hnsw:M": 16})
```

### `hnsw:search_ef`

**Description**: Controls the number of neighbours in the HNSW graph to explore when searching. Increasing this requires
@@ -203,6 +221,12 @@ more memory for the HNSW algo to explore the nodes during knn search.
- Values must be positive integers.
- Parameter **_can_** be changed after index creation.

**Example**:

```python
res = client.create_collection("my_collection", metadata={ "hnsw:search_ef": 10})
```

### `hnsw:num_threads`

**Description**: Controls how many threads the HNSW algorithm uses.
@@ -214,6 +238,12 @@ more memory for the HNSW algo to explore the nodes during knn search.
- Values must be positive integers.
- Parameter **_can_** be changed after index creation.

**Example**:

```python
res = client.create_collection("my_collection", metadata={ "hnsw:num_threads": 4})
```

### `hnsw:resize_factor`

**Description**: Controls the rate of growth of the graph (e.g. how many node capacity will be added) whenever the
@@ -226,6 +256,12 @@ current graph capacity is reached.
- Values must be positive floating point numbers.
- Parameter **_can_** be changed after index creation.

**Example**:

```python
res = client.create_collection("my_collection", metadata={ "hnsw:resize_factor": 1.2})
```

### `hnsw:batch_size`

**Description**: Controls the size of the Bruteforce (in-memory) index. Once this threshold is crossed vectors from BF
@@ -239,6 +275,12 @@ gets transferred to HNSW index. This value can be changed after index creation.
- Values must be positive integers.
- Parameter **_can_** be changed after index creation.

**Example**:

```python
res = client.create_collection("my_collection", metadata={ "hnsw:batch_size": 100})
```

### `hnsw:sync_threshold`

**Description**: Controls the threshold at which the HNSW index is written to disk.
@@ -272,19 +314,6 @@ res = client.create_collection("my_collection", metadata={

Updating HNSW parameters after creation

```python
import chromadb

client = chromadb.HttpClient() # Adjust as per your client
res = client.get_or_create_collection("my_collection", metadata={
"hnsw:search_ef": 200,
"hnsw:num_threads": 8,
"hnsw:resize_factor": 2,
"hnsw:batch_size": 10000,
"hnsw:sync_threshold": 1000000,
})
```

!!! tip "get_or_create_collection overrides"
!!! warning "Updating HNSW parameters"

When using `get_or_create_collection()` with `metadata` parameter, existing metadata will be overridden with the new values.
Updating HNSW parameters after index creation is not supported as of version `0.5.5`.
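
The following sketch is not part of the commit. It illustrates the value constraints listed for the HNSW parameters above (allowed `hnsw:space` values, positive integers, positive floating point numbers) as a pre-flight check on a metadata dictionary. The helper name `validate_hnsw_metadata` and the constant names are hypothetical and do not exist in Chroma; keys whose constraints are not shown in this diff (for example `hnsw:sync_threshold`) are passed through unchecked.

```python
# Hypothetical helper, not a Chroma API: checks HNSW metadata against the
# constraints quoted in docs/core/configuration.md before it is passed to
# client.create_collection(..., metadata=...).

ALLOWED_SPACES = {"l2", "cosine", "ip"}
POSITIVE_INT_KEYS = {
    "hnsw:construction_ef",
    "hnsw:M",
    "hnsw:search_ef",
    "hnsw:num_threads",
    "hnsw:batch_size",
}
POSITIVE_FLOAT_KEYS = {"hnsw:resize_factor"}


def validate_hnsw_metadata(metadata):
    """Return a list of problems; an empty list means every checked key passed."""
    problems = []
    for key, value in metadata.items():
        if key == "hnsw:space":
            if value not in ALLOWED_SPACES:
                problems.append(f"{key}: expected one of {sorted(ALLOWED_SPACES)}, got {value!r}")
        elif key in POSITIVE_INT_KEYS:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                problems.append(f"{key}: expected a positive integer, got {value!r}")
        elif key in POSITIVE_FLOAT_KEYS:
            # Assumption: whole numbers such as 2 are accepted alongside floats.
            if isinstance(value, bool) or not isinstance(value, (int, float)) or value <= 0:
                problems.append(f"{key}: expected a positive number, got {value!r}")
        # Any other key is left unchecked.
    return problems


metadata = {
    "hnsw:space": "cosine",
    "hnsw:M": 16,
    "hnsw:search_ef": 10,
    "hnsw:resize_factor": 1.2,
}
problems = validate_hnsw_metadata(metadata)
print(problems)  # []
```

Because several of these parameters cannot be changed after index creation, a check like this is most useful before the first `create_collection` call rather than afterwards.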
4 changes: 2 additions & 2 deletions docs/index.md
@@ -4,9 +4,9 @@ This is a collection of small guides and recipes to help you get started with Ch

!!! warning "Critical Fix in 0.5.13"

If you are using Chroma `>=0.5.7` and `<=0.5.13` please upgrade to `0.5.13` or later as there is a critical bug that can cause data loss. Read more on the [GH Issue #2922](https://github.com/chroma-core/chroma/issues/2922).
If you are using Chroma `>=0.5.7` and `<0.5.13`, please upgrade to `0.5.13` or later, as there is a critical bug that can cause data loss. Read more on the [GH Issue #2922](https://github.com/chroma-core/chroma/issues/2922).

Latest ChromaDB version: [0.5.18](https://github.com/chroma-core/chroma/releases/tag/0.5.18)
Latest ChromaDB version: [0.5.20](https://github.com/chroma-core/chroma/releases/tag/0.5.20)
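
The following sketch is not part of the commit. It shows one way to test whether an installed version falls in the range named by the warning, assuming `0.5.13` (the release carrying the fix) is the first unaffected version. The function names `parse_version` and `needs_upgrade` are hypothetical, and the parser only handles plain `MAJOR.MINOR.PATCH` strings.

```python
def parse_version(version):
    """Turn '0.5.13' into (0, 5, 13) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Range taken from the warning; treating 0.5.13 as unaffected is an assumption.
FIRST_AFFECTED = parse_version("0.5.7")
FIRST_FIXED = parse_version("0.5.13")


def needs_upgrade(installed):
    """True if `installed` is at least 0.5.7 and older than 0.5.13."""
    return FIRST_AFFECTED <= parse_version(installed) < FIRST_FIXED


for candidate in ("0.5.6", "0.5.7", "0.5.12", "0.5.13", "0.5.20"):
    print(candidate, needs_upgrade(candidate))
```

Comparing tuples of integers rather than raw strings matters here: as strings, `"0.5.7"` sorts after `"0.5.13"`, which would invert the check.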


## New and Noteworthy
10 changes: 5 additions & 5 deletions docs/running/running-chroma.md
@@ -40,7 +40,7 @@ Prerequisites:
- Docker - [Overview of Docker Desktop | Docker Docs](https://docs.docker.com/desktop/)

```shell
docker run -d --rm --name chromadb -v ./chroma:/chroma/chroma -e IS_PERSISTENT=TRUE -e ANONYMIZED_TELEMETRY=TRUE chromadb/chroma:0.5.13
docker run -d --rm --name chromadb -v ./chroma:/chroma/chroma -e IS_PERSISTENT=TRUE -e ANONYMIZED_TELEMETRY=TRUE chromadb/chroma:0.5.20
```

Options:
@@ -72,7 +72,7 @@ docker compose up -d --build
If you want to run a specific version of Chroma you can checkout the version tag you need:

```shell
git checkout release/0.5.13
git checkout release/0.5.20
```

### Docker Compose (Without Cloning the Repo)
@@ -93,7 +93,7 @@ networks:
driver: bridge
services:
chromadb:
image: chromadb/chroma:0.5.13
image: chromadb/chroma:0.5.20
volumes:
- ./chromadb:/chroma/chroma
environment:
@@ -106,7 +106,7 @@ services:
- net
```
The above will create a container with the latest Chroma (`chromadb/chroma:0.5.13`), will expose it to port `8000` on
The above will create a container with the latest Chroma (`chromadb/chroma:0.5.20`), will expose it to port `8000` on
the local machine and will persist data in the `./chromadb` path, relative to where `docker-compose.yaml` was run.
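
The following sketch is not part of the commit. It shows one way to confirm that the container is reachable on port `8000` using only the Python standard library. The `/api/v1/heartbeat` path is an assumption about the HTTP API of the 0.5.x server, and the function names `heartbeat_url` and `is_chroma_up` are hypothetical.

```python
import json
import urllib.error
import urllib.request


def heartbeat_url(host="localhost", port=8000):
    """Build the heartbeat URL (the path is assumed, see the note above)."""
    return f"http://{host}:{port}/api/v1/heartbeat"


def is_chroma_up(url, timeout=2.0):
    """Return True if the server answers with HTTP 200 and a JSON body."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            json.load(response)  # raises ValueError if the body is not JSON
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error or malformed response.
        return False


# Usage (needs the compose stack to be running):
#   print(is_chroma_up(heartbeat_url()))
```

The check is deliberately side-effect free: it only reads the heartbeat and treats any failure as "not up", so it can be polled in a loop while the container starts.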

!!! tip "Versioning"
@@ -148,7 +148,7 @@ Get and install the chart:
```bash
helm repo add chroma https://amikos-tech.github.io/chromadb-chart/
helm repo update
helm install chroma chroma/chromadb --set chromadb.apiVersion="0.5.13"
helm install chroma chroma/chromadb --set chromadb.apiVersion="0.5.20"
```

By default the chart will enable authentication in Chroma. To get the token run the following:
2 changes: 1 addition & 1 deletion docs/security/chroma-ssl-cert.md
@@ -111,7 +111,7 @@ You can run Chroma with the SSL/TLS certificate generate above or any other cert

services:
server:
image: chromadb/chroma:0.5.13
image: chromadb/chroma:0.5.20
volumes:
# Be aware that indexed data are located in "/chroma/chroma/"
# Default configuration for persist_directory in chromadb/config.py
4 changes: 2 additions & 2 deletions docs/security/ssl-proxies.md
@@ -104,7 +104,7 @@ services:
sh -c "/usr/local/bin/wait-for-certs.sh && \
/opt/bitnami/envoy/bin/envoy -c /opt/bitnami/envoy/conf/envoy.yaml"
chromadb:
image: chromadb/chroma:0.5.13
image: chromadb/chroma:0.5.20
volumes:
- ./chromadb:/chroma/chroma
environment:
@@ -166,7 +166,7 @@ services:
environment:
- CHROMA_DOMAIN=${CHROMA_DOMAIN:-localhost}
chromadb:
image: chromadb/chroma:0.5.13
image: chromadb/chroma:0.5.20
volumes:
- ./chromadb:/chroma/chroma
environment:
2 changes: 1 addition & 1 deletion docs/strategies/memory-management.md
@@ -34,7 +34,7 @@ To enable the LRU cache the following two settings parameters or environment var

The below code snippets assume you are working with a `PersistentClient` or an `EphemeralClient` instance.

At the time of writing (Chroma v0.5.1322), Chroma does not allow you to manually unload collections from memory.
At the time of writing (Chroma v0.5.20), Chroma does not allow you to manually unload collections from memory.

Here we provide a simple utility function to help users unload collections from memory.
