From e8383eff036d0d788181b53e5ca20bc4fe99f711 Mon Sep 17 00:00:00 2001
From: vankichi
Date: Thu, 18 Jan 2024 11:09:44 +0900
Subject: [PATCH 1/3] :pencil: update capacity planning doc

Signed-off-by: vankichi
---
 docs/user-guides/capacity-planning.md | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/docs/user-guides/capacity-planning.md b/docs/user-guides/capacity-planning.md
index ae68d12a1b..ffc7ff3b77 100644
--- a/docs/user-guides/capacity-planning.md
+++ b/docs/user-guides/capacity-planning.md
@@ -6,21 +6,32 @@ Capacity planning is essential before deploying the Vald cluster to the cloud se
 There are three viewpoints: Vald cluster view, Kubernetes view, and Component view. Let's see each view.
+
+Before going to production, we recommend measuring the actual resource requirements in a verification environment.
+
 ## Vald cluster view
 The essential point at the Vald cluster view is the hardware specification, especially RAM.
 The Vald cluster, especially Vald Agent components, requires much RAM capacity because the vector index is stored in memory.
-It is easy to figure out the minimum required RAM capacity by the following formula.
+The minimum required memory for each vector (bit) is:
+
+```bash
+// minimum required bit of vector
+{ oid (64bit) + timestamp (64bit) + uuid (user defined) } * 2 + { dimension * 64 } + { the creation edge size + the search edge size } * 8
+```
+
+Considering the `index size` and `index_replica`, it is easy to figure out the minimum required RAM capacity by the following formula.
 ```bash
-( { the dimension vector } × { bit number of vector } + { the bit of vectors ID string } ) × { the maximum number of the vector } × { the index replica }
+{ minimum required bit of vector } * { the index size } * { index_replica }
 ```
-For example, if you want to insert 1 million vectors with 900 dimensions and the object type is 32-bit with 32 byte (256 bit) ID, and the index replica is 3, the minimum required RAM capacity is:
+For example, you want to insert 1 million vectors with 900 dimensions with 32 byte (256 bit) UUID, the index replica is 3, `creation edge size` is 20, and `search edge size` is 10, the minimum required RAM capacity is:
 ```bash
-(900 × 32 + 256 ) × 1,000,000 × 3 = 8,7168,000,000 (bit) = 10.896 (GB)
+{(64 + 64 + 256) × 2 + (900 × 64) + (20 + 10) × 8 } × 1,000,000 × 3 = 175,824,000,000 (bit) = 21.978 (GB)
 ```
 It is just the minimum required RAM for indexing.
@@ -28,12 +39,12 @@ Considering the margin of RAM capacity, the minimum RAM capacity should be less
 Therefore, the actual minimum RAM capacity will be:
 ```bash
-8,7168,000,000 (bit) / 0.6 = 145,280,000,000 (bit) = 18.16 (GB)
+8,7168,000,000 (bit) / 0.6 = 145,280,000,000 (bit) = 36.63 (GB)
 ```
In the production usage, memory usage may be not enough in the minimum required RAM.
-E.g., there are a noisy problem, high memory usage for createIndex (indexing on memory), high traffic needs more memory, etc.
+For example, a noisy-neighbor problem, high memory usage during createIndex (in-memory indexing), or high traffic can each require more memory.
 ## Kubernetes cluster view

From 9a7dac8ed13ed9cfe7ad6c8fefa16c95269193f7 Mon Sep 17 00:00:00 2001
From: vankichi
Date: Thu, 18 Jan 2024 11:29:27 +0900
Subject: [PATCH 2/3] :pencil: Fix spelling

Signed-off-by: vankichi
---
 docs/user-guides/capacity-planning.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/user-guides/capacity-planning.md b/docs/user-guides/capacity-planning.md
index ffc7ff3b77..92d4e6b8f1 100644
--- a/docs/user-guides/capacity-planning.md
+++ b/docs/user-guides/capacity-planning.md
@@ -18,14 +18,14 @@ The Vald cluster, especially Vald Agent components, requires much RAM capacity b
 The minimum required memory for each vector (bit) is:
 ```bash
-// minimum required bit of vector
+// minimum required bits of vector
 { oid (64bit) + timestamp (64bit) + uuid (user defined) } * 2 + { dimension * 64 } + { the creation edge size + the search edge size } * 8
 ```
 Considering the `index size` and `index_replica`, it is easy to figure out the minimum required RAM capacity by the following formula.
 ```bash
-{ minimum required bit of vector } * { the index size } * { index_replica }
+{ minimum required bits of vector } * { the index size } * { index_replica }
 ```
 For example, you want to insert 1 million vectors with 900 dimensions with 32 byte (256 bit) UUID, the index replica is 3, `creation edge size` is 20, and `search edge size` is 10, the minimum required RAM capacity is:

From c100b2a7901cb900b3e632d82b9f2690fc38d50c Mon Sep 17 00:00:00 2001
From: vankichi
Date: Thu, 18 Jan 2024 15:27:50 +0900
Subject: [PATCH 3/3] :pencil: Fix typo

Signed-off-by: vankichi
---
 docs/user-guides/capacity-planning.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/user-guides/capacity-planning.md b/docs/user-guides/capacity-planning.md
index 92d4e6b8f1..4fdbd286b5 100644
--- a/docs/user-guides/capacity-planning.md
+++ b/docs/user-guides/capacity-planning.md
@@ -39,7 +39,7 @@ Considering the margin of RAM capacity, the minimum RAM capacity should be less
 Therefore, the actual minimum RAM capacity will be:
 ```bash
-8,7168,000,000 (bit) / 0.6 = 145,280,000,000 (bit) = 36.63 (GB)
+175,824,000,000 (bit) / 0.6 = 293,040,000,000 (bit) = 36.63 (GB)
 ```
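
Review note: the figures in the final state of this series can be cross-checked with a short script. The following is a minimal sketch, not part of Vald; the function names are hypothetical, and it assumes 1 GB = 10^9 bytes (which is what the 21.978 GB figure implies) and the 0.6 divisor used in the doc for the RAM margin.

```python
# Hypothetical helpers for checking the doc's arithmetic; not a Vald API.

def min_bits_per_vector(dimension, uuid_bits, creation_edge_size, search_edge_size):
    """Minimum bits per vector, following the formula introduced in PATCH 1/3."""
    metadata = (64 + 64 + uuid_bits) * 2          # (oid + timestamp + uuid) * 2
    vector = dimension * 64                        # dimension * 64
    edges = (creation_edge_size + search_edge_size) * 8
    return metadata + vector + edges


def min_ram_bits(index_size, index_replica, **per_vector):
    """Per-vector bits * index size * index_replica."""
    return min_bits_per_vector(**per_vector) * index_size * index_replica


def bits_to_gb(bits):
    # Assumption: 1 GB = 10**9 bytes, as implied by the doc's own numbers.
    return bits / 8 / 10**9


# The doc's example: 1 million 900-dimensional vectors, 256-bit UUID,
# index replica 3, creation edge size 20, search edge size 10.
bits = min_ram_bits(
    1_000_000,
    3,
    dimension=900,
    uuid_bits=256,
    creation_edge_size=20,
    search_edge_size=10,
)
print(bits)                               # 175824000000
print(bits_to_gb(bits))                   # 21.978
print(round(bits_to_gb(bits / 0.6), 2))   # 36.63
```

These values agree with the example line added in PATCH 1/3 and the corrected margin line in PATCH 3/3.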