Commit

chore: Updated configuration
tazarov committed May 16, 2024
1 parent 8758625 commit 4eac403
Showing 1 changed file with 105 additions and 13 deletions.
118 changes: 105 additions & 13 deletions docs/core/configuration.md
@@ -1,4 +1,4 @@
# Chroma Configuration
# Configuration

!!! warn "Work in Progress"

@@ -8,29 +8,121 @@

## Server Configuration

### Authentication

### Authorization

## Client Configuration

### Authentication

## HNSW Configuration

HNSW is the underlying library for Chroma vector indexing and search. Chroma exposes a number of parameters to configure
HNSW for your use case. All HNSW parameters are configured as metadata on the collection level.

| Parameter Name | Description and Use | Values |
|------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------|
| `hnsw:space` | Controls the distance metric of the HNSW index. The space cannot be changed after index creation. | **Possible values**:<br/> - `l2`<br/> - `cosine`<br/> - `ip` <br/>**Default**: `l2` |
| `hnsw:construction_ef` | Controls the number of neighbours in the HNSW graph to explore when adding new vectors. The more neighbours HNSW explores the better and more exhaustive the results will be. Increasing the value will also increase memory consumption. This parameter cannot be changed after index creation. | **Possible values**: _Positive Integers_<br/>**Default**: `100` |
| `hnsw:M` | Controls the maximum number of neighbour nodes (M) that a newly inserted vector is connected to. A higher value results in a more densely connected graph. The effect is slower but more accurate searches with increased memory consumption. This parameter cannot be changed after index creation. | **Possible values**: _Positive Integers_<br/>**Default**: `16` |
| `hnsw:search_ef` | Controls the number of neighbours in the HNSW graph to explore when searching. Increasing this value requires more memory for the HNSW algorithm to explore the nodes during kNN search. This parameter can be changed after index creation. | **Possible values**: _Positive Integers_<br/>**Default**: `10` |
| `hnsw:num_threads` | Controls how many threads the HNSW algorithm uses. This parameter can be changed after index creation. | **Possible values**: _Positive Integers_<br/>**Default**: `<number of CPU cores>` |
| `hnsw:resize_factor` | Controls the rate of growth of the graph (i.e. how much node capacity is added) whenever the current graph capacity is reached. This parameter can be changed after index creation. | **Possible values**: _Positive Floating Point_<br/>**Default**: `1.2` |
| `hnsw:batch_size` | Controls the size of the brute-force (in-memory) index. Once this threshold is crossed, vectors from the brute-force index are transferred to the HNSW index. This value can be changed after index creation. The value must be less than `hnsw:sync_threshold`. | **Possible values**: _Positive Integers_<br/>**Default**: `100` |
| `hnsw:sync_threshold` | Controls the threshold at which the HNSW index is written to disk. | **Possible values**: _Positive Integers_<br/>**Default**: `1000` |

!!! tip "Changing HNSW parameters"

Some HNSW parameters cannot be changed after index creation via the standard method shown below.
If you wish to change these parameters, you will need to clone the collection; see an example [here](collections.md#cloning-a-collection).
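
As a rough illustration of the split described in this tip, the sketch below sorts a set of requested changes into those that can be applied in place and those that need a cloned collection. The grouping follows the per-parameter constraints listed in this document; `split_hnsw_changes` is a hypothetical helper, not part of Chroma.

```python
from typing import Any, Dict, Tuple

# Grouping taken from the per-parameter constraints in this document.
IMMUTABLE_HNSW_PARAMS = {"hnsw:space", "hnsw:construction_ef", "hnsw:M"}
MUTABLE_HNSW_PARAMS = {
    "hnsw:search_ef",
    "hnsw:num_threads",
    "hnsw:resize_factor",
    "hnsw:batch_size",
    "hnsw:sync_threshold",
}


def split_hnsw_changes(requested: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: separate in-place changes from changes that need a clone."""
    unknown = set(requested) - MUTABLE_HNSW_PARAMS - IMMUTABLE_HNSW_PARAMS
    if unknown:
        raise KeyError(f"Unknown HNSW parameters: {sorted(unknown)}")
    in_place = {k: v for k, v in requested.items() if k in MUTABLE_HNSW_PARAMS}
    needs_clone = {k: v for k, v in requested.items() if k in IMMUTABLE_HNSW_PARAMS}
    return in_place, needs_clone


in_place, needs_clone = split_hnsw_changes({"hnsw:search_ef": 50, "hnsw:M": 32})
print("can change in place:", in_place)
print("requires cloning:", needs_clone)
```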
### Example

### `hnsw:space`

**Description**: Controls the distance metric of the HNSW index. The space cannot be changed after index creation.

**Default**: `l2`

**Constraints**:

- Possible values: `l2`, `cosine`, `ip`
- Parameter **_cannot_** be changed after index creation.
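
To make the three spaces concrete, here is a small pure-Python sketch of the distances they correspond to. It assumes the definitions used by the underlying hnswlib library (squared Euclidean distance for `l2`, one minus the inner product for `ip`, one minus the cosine similarity for `cosine`); the function names are illustrative only.

```python
import math
from typing import Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # Squared Euclidean distance (assumed hnswlib convention: no square root).
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ip_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # One minus the inner product; only a true similarity for normalised vectors.
    return 1.0 - sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # One minus the cosine similarity; independent of vector length.
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


u, v = [1.0, 0.0], [0.0, 1.0]
print(l2_distance(u, v), ip_distance(u, v), cosine_distance(u, v))
```

Because the metric determines how the graph is built, picking the space that matches how your embeddings were trained matters more than any later tuning.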

### `hnsw:construction_ef`

**Description**: Controls the number of neighbours in the HNSW graph to explore when adding new vectors. The more
neighbours HNSW explores, the better and more exhaustive the results will be. Increasing the value will also increase
memory consumption.

**Default**: `100`

**Constraints**:

- Values must be positive integers.
- Parameter **_cannot_** be changed after index creation.

### `hnsw:M`

**Description**: Controls the maximum number of neighbour connections (M) that a newly inserted vector can have. A
higher value results in a more densely connected graph. The effect is slower but more accurate searches with increased
memory consumption.

**Default**: `16`

**Constraints**:

- Values must be positive integers.
- Parameter **_cannot_** be changed after index creation.

### `hnsw:search_ef`

**Description**: Controls the number of neighbours in the HNSW graph to explore when searching. Increasing this value
requires more memory for the HNSW algorithm to explore the nodes during kNN search.

**Default**: `10`

**Constraints**:

- Values must be positive integers.
- Parameter **_can_** be changed after index creation.

### `hnsw:num_threads`

**Description**: Controls how many threads the HNSW algorithm uses.

**Default**: `<number of CPU cores>`

**Constraints**:

- Values must be positive integers.
- Parameter **_can_** be changed after index creation.

### `hnsw:resize_factor`

**Description**: Controls the rate of growth of the graph (i.e. how much node capacity is added) whenever the
current graph capacity is reached.

**Default**: `1.2`

**Constraints**:

- Values must be positive floating point numbers.
- Parameter **_can_** be changed after index creation.
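
The arithmetic behind the resize factor can be sketched as follows. This assumes a simple model in which each resize multiplies the current capacity by `hnsw:resize_factor`; the exact rounding and minimum capacity Chroma applies are not specified here, and `capacity_steps` is a hypothetical name.

```python
import math
from typing import List


def capacity_steps(initial_capacity: int, resize_factor: float, target: int) -> List[int]:
    """Assumed model: capacity grows by resize_factor each time it is exhausted."""
    capacities = [initial_capacity]
    while capacities[-1] < target:
        grown = math.ceil(capacities[-1] * resize_factor)
        # Guarantee progress even for factors very close to 1.
        capacities.append(max(grown, capacities[-1] + 1))
    return capacities


steps = capacity_steps(initial_capacity=1000, resize_factor=1.2, target=2000)
print(f"{len(steps) - 1} resizes needed: {steps}")
```

Under this model a larger factor means fewer resizes while loading data, at the cost of more unused capacity after the last one.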

### `hnsw:batch_size`

**Description**: Controls the size of the brute-force (in-memory) index. Once this threshold is crossed, vectors from
the brute-force index are transferred to the HNSW index. This value can be changed after index creation. The value
must be less than `hnsw:sync_threshold`.

**Default**: `100`

**Constraints**:

- Values must be positive integers.
- Parameter **_can_** be changed after index creation.

### `hnsw:sync_threshold`

**Description**: Controls the threshold at which the HNSW index is written to disk.

**Default**: `1000`

**Constraints**:

- Values must be positive integers.
- Parameter **_can_** be changed after index creation.
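
The constraints listed for the parameters above can be checked before a collection is created. The following is a minimal sketch of such a check, using the defaults documented here for `hnsw:batch_size` and `hnsw:sync_threshold`; `validate_hnsw_metadata` is a hypothetical helper and does not reflect how Chroma itself validates metadata.

```python
from typing import Any, Dict, List

POSITIVE_INT_PARAMS = (
    "hnsw:construction_ef",
    "hnsw:M",
    "hnsw:search_ef",
    "hnsw:num_threads",
    "hnsw:batch_size",
    "hnsw:sync_threshold",
)


def _is_positive_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def validate_hnsw_metadata(metadata: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of constraint violations (empty if none)."""
    errors = []
    if metadata.get("hnsw:space", "l2") not in ("l2", "cosine", "ip"):
        errors.append("hnsw:space must be one of l2, cosine, ip")
    for key in POSITIVE_INT_PARAMS:
        if key in metadata and not _is_positive_int(metadata[key]):
            errors.append(f"{key} must be a positive integer")
    factor = metadata.get("hnsw:resize_factor", 1.2)
    if isinstance(factor, bool) or not isinstance(factor, (int, float)) or factor <= 0:
        errors.append("hnsw:resize_factor must be a positive number")
    batch = metadata.get("hnsw:batch_size", 100)
    sync = metadata.get("hnsw:sync_threshold", 1000)
    if _is_positive_int(batch) and _is_positive_int(sync) and batch >= sync:
        errors.append("hnsw:batch_size must be less than hnsw:sync_threshold")
    return errors


print(validate_hnsw_metadata({"hnsw:space": "cosine", "hnsw:batch_size": 2000}))
```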

### Examples

Configuring HNSW parameters at creation time
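
A minimal sketch of passing HNSW parameters as collection metadata at creation time is shown below. The values and the `create_tuned_collection` helper are illustrative assumptions; the only Chroma call relied on is the client's `create_collection(name=..., metadata=...)`.

```python
# Illustrative values only; every key is one of the parameters documented above.
HNSW_METADATA = {
    "hnsw:space": "cosine",
    "hnsw:construction_ef": 200,
    "hnsw:M": 32,
    "hnsw:search_ef": 50,
    "hnsw:batch_size": 100,
    "hnsw:sync_threshold": 1000,
}


def create_tuned_collection(client, name):
    """Hypothetical helper: create a collection with the HNSW metadata above.

    `client` is expected to be a Chroma client object exposing
    `create_collection(name=..., metadata=...)`.
    """
    return client.create_collection(name=name, metadata=dict(HNSW_METADATA))


# Usage, assuming the chromadb package is installed:
#   import chromadb
#   client = chromadb.PersistentClient(path="./chroma-data")
#   collection = create_tuned_collection(client, "my_collection")
```

Since `hnsw:space`, `hnsw:construction_ef` and `hnsw:M` cannot be changed later, creation time is the only point at which they can be set without cloning the collection.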

