Add index correction document (#2217)
* Add index correction document

* style: format code with Gofumpt and Prettier

This commit fixes the style issues introduced in ffe071a according to the output
from Gofumpt and Prettier.

Details: #2217

* Update docs/user-guides/index-correction.md

Co-authored-by: Hiroto Funakoshi <[email protected]>

* Update notes on processing time

---------

Co-authored-by: deepsource-autofix[bot] <62050782+deepsource-autofix[bot]@users.noreply.github.com>
Co-authored-by: Hiroto Funakoshi <[email protected]>
3 people authored and kmrmt committed Dec 12, 2023
1 parent 030dda9 commit 7d8b793
Showing 1 changed file with 39 additions and 0 deletions.
39 changes: 39 additions & 0 deletions docs/user-guides/index-correction.md
@@ -0,0 +1,39 @@
# Index Correction

In a Vald cluster, each index is replicated to multiple agents according to the `index_replica` setting. However, replicas can become inconsistent when a pod is evicted or killed by the OOM killer during vector insertion. For example:

1. The timestamp of the index differs between agents (some agents have an old index saved and it has not been updated).
2. The number of replicas does not meet the value set in `index_replica`.

To resolve these inconsistencies, you can use the `Index Correction` feature.

`Index Correction` is implemented as a [`CronJob`](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/), checking the consistency between replicas regularly and resolving any inconsistencies.
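
The two kinds of inconsistency listed above can be illustrated with a small Python sketch. This is not Vald's implementation; the function name `find_inconsistencies` and the in-memory data layout (agent name mapped to vector ID and timestamp) are hypothetical and chosen only to make the checks concrete. It also assumes that "does not meet" means fewer replicas than `index_replica`.

```python
def find_inconsistencies(agents, index_replica):
    """Report stale and under-replicated vectors.

    agents: dict mapping a (hypothetical) agent name to a dict of
            vector ID -> last-update timestamp held by that agent.
    index_replica: expected number of replicas per vector.
    """
    # Group timestamps by vector ID so each vector's replicas can be compared.
    holders = {}
    for agent, index in agents.items():
        for vector_id, timestamp in index.items():
            holders.setdefault(vector_id, {})[agent] = timestamp

    stale = {}             # vector ID -> agents holding an older timestamp
    under_replicated = {}  # vector ID -> number of replicas actually found
    for vector_id, by_agent in holders.items():
        newest = max(by_agent.values())
        old_agents = sorted(a for a, ts in by_agent.items() if ts < newest)
        if old_agents:
            stale[vector_id] = old_agents
        if len(by_agent) < index_replica:
            under_replicated[vector_id] = len(by_agent)
    return stale, under_replicated
```

For example, if `agent-0` holds vector `a` with timestamp 10 while `agent-1` holds it with timestamp 12, the sketch reports `agent-0` as stale for `a`.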

## Settings

- enabled
  Turns the index correction feature on or off.
- schedule
  Sets when the job starts, in cron notation (the default value is `3 6 * * *`, which means 6:03 AM every day).
- suspend
  [Temporary suspension setting](https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#schedule-suspension) for the CronJob.

```yaml
manager:
  index:
    corrector:
      enabled: true
      schedule: "3 6 * * *"
      suspend: false
```
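
Cron notation lists the minute first and the hour second, so it is easy to misread a schedule. The following Python sketch splits a five-field cron expression into named fields and, for a simple daily schedule, prints the start time. The helper names `describe_schedule` and `daily_start_time` are hypothetical and are not part of Vald or Kubernetes; the sketch handles only plain numeric minute and hour fields.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_schedule(expr):
    """Map each of the five cron fields to its name."""
    parts = expr.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expr!r}")
    return dict(zip(CRON_FIELDS, parts))


def daily_start_time(expr):
    """Return HH:MM when minute and hour are plain numbers, else None."""
    fields = describe_schedule(expr)
    minute, hour = fields["minute"], fields["hour"]
    if minute.isdigit() and hour.isdigit():
        return f"{int(hour):02d}:{int(minute):02d}"
    return None


print(daily_start_time("3 6 * * *"))  # prints 06:03
```

With the default value, the minute field is `3` and the hour field is `6`, so the job starts at 06:03.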

## Important Notes

- Processing time
  With 10 million identical vectors (not including `index_replica`) and 10 agent replicas, processing takes about 30 to 40 minutes. This is only a reference; the actual execution time may vary depending on the infrastructure. The time complexity of the process is `O(MN)`, where M is the number of identical vector items and N is the number of agent replicas. `index_replica` does not affect the processing time.

- concurrencyPolicy
  `Forbid` is set internally, so a new job is not created while an existing job is running. In other words, if a job does not finish within the interval specified by the schedule, the next scheduled run is skipped.

- Index operations during correction
  Vector operations performed after the index correction job starts are not taken into account by that job.
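
Because the stated time complexity is `O(MN)` and `Forbid` prevents overlapping runs, it can be useful to check roughly whether a run is likely to finish before the next scheduled start. The Python sketch below extrapolates linearly from the single reference point given above (10 million vectors, 10 agents, 30 to 40 minutes). This is an assumption made for illustration: real run times depend on the infrastructure, and the function names are hypothetical.

```python
# Reference measurement quoted in the notes above.
REFERENCE_VECTORS = 10_000_000
REFERENCE_AGENTS = 10
REFERENCE_MINUTES = (30, 40)  # lower and upper bound


def estimate_minutes(vectors, agents):
    """Scale the reference run time linearly in vectors * agents (O(MN))."""
    scale = (vectors * agents) / (REFERENCE_VECTORS * REFERENCE_AGENTS)
    return tuple(minutes * scale for minutes in REFERENCE_MINUTES)


def fits_in_interval(vectors, agents, interval_minutes):
    """True if the upper estimate fits within the schedule interval."""
    _, upper = estimate_minutes(vectors, agents)
    return upper <= interval_minutes


low, high = estimate_minutes(20_000_000, 10)
print(f"estimated run time: {low:.0f} to {high:.0f} minutes")
print("fits a daily schedule:", fits_in_interval(20_000_000, 10, 24 * 60))
```

If the estimate exceeds the interval, a longer interval in `schedule` avoids skipped runs. Treat the result as a rough guide only.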
