⬆️ v1.7.10 Mon Jan 22 04:29:55 UTC 2024
Signed-off-by: vdaas-ci <[email protected]>
vdaas-ci committed Jan 22, 2024
1 parent 2940a52 commit 745af67
Showing 2 changed files with 8 additions and 6 deletions.
12 changes: 7 additions & 5 deletions docs/user-guides/capacity-planning/index.html
@@ -10,13 +10,15 @@
<il class=version__item><a href=javascript:void(0) class=version__link>v1.1</a></il>
<il class=version__item><a href=javascript:void(0) class=version__link>v1.0</a></il></ul></details></li><li class=header__item><a class=header__git href=https://github.com/vdaas/vald target=_blank><picture class=git__logo><source srcset=https://vald.netlify.app/images/logo_github_white.svg media="(prefers-color-scheme: dark)" width=18 height=18><img src=https://vald.netlify.app/images/logo_github_black.svg alt=github width=18 height=18></picture><div class=git__star><p class=git__starnum id=git-star-num></p></div></a></li></ul></nav></div></header><main role=main><div class=single><aside class=page><nav><ul id=list-body><li class=withchild id=cat_Overview>Overview<ul><li class=index><a href=/docs/overview/about-vald/>About Vald</a></li><li class=index><a href=/docs/overview/architecture/>Architecture</a></li><li class=index><a href=/docs/overview/data-flow/>Data Flow</a></li></ul></li><li class=withchild id=cat_Component>Component<ul><li class=index><a href=/docs/overview/component/agent/>Agent</a></li><li class=index><a href=/docs/overview/component/lb-gateway/>Lb Gateway</a></li><li class=index><a href=/docs/overview/component/filter-gateway/>Filter Gateway</a></li><li class=index><a href=/docs/overview/component/discoverer/>Discoverer</a></li><li class=index><a href=/docs/overview/component/index-manager/>Index Manager</a></li></ul></li><li class=withchild id=cat_Tutorial>Tutorial<ul><li class=index><a href=/docs/tutorial/get-started/>Get Started</a></li><li class=index><a href=/docs/tutorial/vald-agent-standalone-on-k8s/>Vald Agent Standalone on K8s</a></li><li class=index><a href=/docs/tutorial/vald-agent-standalone-on-docker/>Vald Agent Standalone on Docker</a></li></ul></li><li class=withchild id=cat_Usecase>Usecase<ul><li class=index><a href=/docs/usecase/usage-example/>Usage Example</a></li></ul></li><li class=withchild id="cat_User Guides">User Guides<ul><li class=index><a href=/docs/user-guides/configuration/>Configuration</a></li><li class=index><a 
href=/docs/user-guides/backup-configuration/>Backup Configuration</a></li><li class=index><a href=/docs/user-guides/filtering-configuration/>Filtering Configuration</a></li><li class=index><a href=/docs/user-guides/cluster-role-binding/>Cluster Role Binding</a></li><li class=index><a href=/docs/user-guides/deployment/>Deployment</a></li><li class=index><a href=/docs/user-guides/operations/>Operations</a></li><li class=index><a href=/docs/user-guides/upgrade-cluster/>Upgrade Cluster</a></li><li class=view><a href=/docs/user-guides/capacity-planning/>Capacity Planning</a></li><li class=index><a href=/docs/user-guides/client-api-config/>Client Api Config</a></li><li class=index><a href=/docs/user-guides/observability-configuration/>Observability Configuration</a></li><li class=index><a href=/docs/user-guides/network-policy/>Network Policy</a></li><li class=index><a href=/docs/user-guides/index-correction/>Index Correction</a></li><li class=index><a href=/docs/user-guides/sdks/>Sdks</a></li></ul></li><li class=withchild id=cat_Performance>Performance<ul><li class=index><a href=/docs/performance/benchmark/>Benchmark</a></li><li class=index><a href=/docs/performance/loadtest/>Loadtest</a></li></ul></li><li class=withchild id=cat_Api>API<ul><li class=index><a href=/docs/api/insert/>Insert</a></li><li class=index><a href=/docs/api/update/>Update</a></li><li class=index><a href=/docs/api/upsert/>Upsert</a></li><li class=index><a href=/docs/api/search/>Search</a></li><li class=index><a href=/docs/api/remove/>Remove</a></li><li class=index><a href=/docs/api/object/>Object</a></li><li class=index><a href=/docs/api/filter-gateway/>Filter Gateway</a></li><li class=index><a href=/docs/api/status/>Status</a></li><li class=index><a href=/docs/api/build_proto/>Build</a></li></ul></li><li class=withchild id=cat_Troubleshooting>Troubleshooting<ul><li class=index><a href=/docs/troubleshooting/client-side/>Client Side</a></li><li class=index><a 
href=/docs/troubleshooting/provisioning/>Provisioning</a></li></ul></li><li class=withchild id=cat_Contributing>Contributing<ul><li class=index><a href=/docs/contributing/contributing-guide/>Contributing Guide</a></li><li class=index><a href=/docs/contributing/development/>Development</a></li><li class=index><a href=/docs/contributing/coding-style/>Coding Style</a></li><li class=index><a href=/docs/contributing/unit-test-guideline/>Unit Test Guideline</a></li></ul></li><li class=withchild id=cat_Support>Support<ul><li class=index><a href=/docs/support/contacts/>Contacts</a></li><li class=index><a href=/docs/support/faq/>Faq</a></li></ul></li><li class=withchild id=cat_Release>Release<ul><li class=index><a href=/docs/release/changelog/>Changelog</a></li></ul></li></ul></nav></aside><div class=content><div class=markdown id=markdown><h1 id=capacity-planning>Capacity Planning</h1><h2 id=what-is-capacity-planning-for-the-vald-cluster>What is capacity planning for the Vald cluster?</h2><p>Capacity planning is essential before deploying the Vald cluster to the cloud service.
There are three viewpoints: Vald cluster view, Kubernetes view, and Component view.
Let&rsquo;s see each view.</p><h2 id=vald-cluster-view>Vald cluster view</h2><p>The essential point at the Vald cluster view is the hardware specification, especially RAM.
The Vald cluster, especially Vald Agent components, requires much RAM capacity because the vector index is stored in memory.</p><p>It is easy to figure out the minimum required RAM capacity by the following formula.</p><div class=highlight><pre tabindex=0 style=color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-bash data-lang=bash><span style=display:flex><span><span style=color:#f92672>(</span> <span style=color:#f92672>{</span> the dimension vector <span style=color:#f92672>}</span> × <span style=color:#f92672>{</span> bit number of vector <span style=color:#f92672>}</span> + <span style=color:#f92672>{</span> the bit of vectors ID string <span style=color:#f92672>}</span> <span style=color:#f92672>)</span> × <span style=color:#f92672>{</span> the maximum number of the vector <span style=color:#f92672>}</span> × <span style=color:#f92672>{</span> the index replica <span style=color:#f92672>}</span>
</span></span></code></pre></div><p>For example, if you want to insert 1 million vectors with 900 dimensions and the object type is 32-bit with 32 byte (256 bit) ID, and the index replica is 3, the minimum required RAM capacity is:</p><div class=highlight><pre tabindex=0 style=color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-bash data-lang=bash><span style=display:flex><span><span style=color:#f92672>(</span><span style=color:#ae81ff>900</span> × <span style=color:#ae81ff>32</span> + <span style=color:#ae81ff>256</span> <span style=color:#f92672>)</span> × 1,000,000 × 3 <span style=color:#f92672>=</span> 8,7168,000,000 <span style=color:#f92672>(</span>bit<span style=color:#f92672>)</span> <span style=color:#f92672>=</span> 10.896 <span style=color:#f92672>(</span>GB<span style=color:#f92672>)</span>
Let&rsquo;s see each view.</p><div class=notice>When introducing production, we recommend that you actually measure how many resources are required for verification.</div><h2 id=vald-cluster-view>Vald cluster view</h2><p>The essential point at the Vald cluster view is the hardware specification, especially RAM.
The Vald cluster, especially Vald Agent components, requires much RAM capacity because the vector index is stored in memory.</p><p>The minimum required memory for each vector (bit) is:</p><div class=highlight><pre tabindex=0 style=color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-bash data-lang=bash><span style=display:flex><span>// minimum required bits of vector
</span></span><span style=display:flex><span><span style=color:#f92672>{</span> oid <span style=color:#f92672>(</span>64bit<span style=color:#f92672>)</span> + timestamp <span style=color:#f92672>(</span>64bit<span style=color:#f92672>)</span> + uuid <span style=color:#f92672>(</span>user defined<span style=color:#f92672>)</span> <span style=color:#f92672>}</span> * <span style=color:#ae81ff>2</span> + <span style=color:#f92672>{</span> dimension * <span style=color:#ae81ff>64</span> <span style=color:#f92672>}</span> + <span style=color:#f92672>{</span> the creation edge size + the search edge size <span style=color:#f92672>}</span> * <span style=color:#ae81ff>8</span>
</span></span></code></pre></div><p>Considering the <code>index size</code> and <code>index_replica</code>, it is easy to figure out the minimum required RAM capacity by the following formula.</p><div class=highlight><pre tabindex=0 style=color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-bash data-lang=bash><span style=display:flex><span><span style=color:#f92672>{</span> minimum required bits of vector <span style=color:#f92672>}</span> * <span style=color:#f92672>{</span> the index size <span style=color:#f92672>}</span> * <span style=color:#f92672>{</span> index_replica <span style=color:#f92672>}</span>
</span></span></code></pre></div><p>For example, you want to insert 1 million vectors with 900 dimensions with 32 byte (256 bit) UUID, the index replica is 3, <code>creation edge size</code> is 20, and <code>search edge size</code> is 10, the minimum required RAM capacity is:</p><div class=highlight><pre tabindex=0 style=color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-bash data-lang=bash><span style=display:flex><span><span style=color:#f92672>{(</span><span style=color:#ae81ff>64</span> + <span style=color:#ae81ff>64</span> + 256<span style=color:#f92672>)</span> × <span style=color:#ae81ff>2</span> + <span style=color:#f92672>(</span><span style=color:#ae81ff>900</span> × 64<span style=color:#f92672>)</span> + <span style=color:#f92672>(</span><span style=color:#ae81ff>20</span> + 10<span style=color:#f92672>)</span> × <span style=color:#ae81ff>8</span> <span style=color:#f92672>}</span> × 1,000,000 × 3 <span style=color:#f92672>=</span> 175,824,000,000 <span style=color:#f92672>(</span>bit<span style=color:#f92672>)</span> <span style=color:#f92672>=</span> 21.978 <span style=color:#f92672>(</span>GB<span style=color:#f92672>)</span>
</span></span></code></pre></div><p>It is just the minimum required RAM for indexing.
Considering the margin of RAM capacity, the minimum RAM capacity should be less than 60% of the actual RAM capacity.
Therefore, the actual minimum RAM capacity will be:</p><div class=highlight><pre tabindex=0 style=color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-bash data-lang=bash><span style=display:flex><span>8,7168,000,000 <span style=color:#f92672>(</span>bit<span style=color:#f92672>)</span> / 0.6 <span style=color:#f92672>=</span> 145,280,000,000 <span style=color:#f92672>(</span>bit<span style=color:#f92672>)</span> <span style=color:#f92672>=</span> 18.16 <span style=color:#f92672>(</span>GB<span style=color:#f92672>)</span>
</span></span></code></pre></div><div class=warn>In the production usage, memory usage may be not enough in the minimum required RAM.<br>E.g., there are a noisy problem, high memory usage for createIndex (indexing on memory), high traffic needs more memory, etc.</div><h2 id=kubernetes-cluster-view>Kubernetes cluster view</h2><h3 id=pod-priority--qos>Pod priority & QoS</h3><p>When the Node capacity (e.g., RAM, CPU) reaches the limit, Kubernetes will decide to kill some Pods according to QoS and Pod priority.
Therefore, the actual minimum RAM capacity will be:</p><div class=highlight><pre tabindex=0 style=color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-bash data-lang=bash><span style=display:flex><span>175,824,000,000 <span style=color:#f92672>(</span>bit<span style=color:#f92672>)</span> / 0.6 <span style=color:#f92672>=</span> 293,040,000,000 <span style=color:#f92672>(</span>bit<span style=color:#f92672>)</span> <span style=color:#f92672>=</span> 36.63 <span style=color:#f92672>(</span>GB<span style=color:#f92672>)</span>
</span></span></code></pre></div><div class=warn>In the production usage, memory usage may be not enough in the minimum required RAM.<br>Because for example, there are a noisy problem, high memory usage for createIndex (indexing on memory), high traffic needs more memory, etc.</div><h2 id=kubernetes-cluster-view>Kubernetes cluster view</h2><h3 id=pod-priority--qos>Pod priority & QoS</h3><p>When the Node capacity (e.g., RAM, CPU) reaches the limit, Kubernetes will decide to kill some Pods according to QoS and Pod priority.
Kubernetes performs pod scheduling with pods Priority Class as the priority and QoS as the second priority.</p><h4 id=pod-priority>Pod priority</h4><p>Pod priority has the integer value, and the higher value, the higher priority.</p><p>Each Vald component has the default priority value:</p><ul><li>Agent: 1000000000</li><li>Discoverer: 1000000</li><li>Filter Gateway: 1000000</li><li>LB Gateway: 1000000</li><li>Index Manager: 1000000</li></ul><p>Therefore, the order of priority is as follows:</p><div class=highlight><pre tabindex=0 style=color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-bash data-lang=bash><span style=display:flex><span>Agent &gt; Discoverer <span style=color:#f92672>=</span> Filter Gateway <span style=color:#f92672>=</span> LB Gateway <span style=color:#f92672>=</span> Index Manger
</span></span></code></pre></div><p>Those values will be helpful when the Pods other than the Vald component are in the same Node.</p><p>It is easy to change by editing your <code>values.yaml</code>.</p><div class=highlight><pre tabindex=0 style=color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4><code class=language-yaml data-lang=yaml><span style=display:flex><span><span style=color:#75715e># e.g. LB Gateway podPriority settings.</span>
</span></span><span style=display:flex><span>...
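
The revised RAM estimate introduced in this diff can be sanity-checked with a short script. This is only a sketch of the arithmetic shown in the updated page: the function and parameter names (`bits_per_vector`, `required_ram_gb`, `headroom`) are hypothetical and are not part of Vald, and GB is taken as 10^9 bytes, which is what the page's own figures imply.

```python
# Sketch of the capacity formula from the updated docs page.
# All names here are illustrative assumptions, not Vald APIs.

def bits_per_vector(dimension, uuid_bits, creation_edge_size, search_edge_size):
    """Minimum bits per vector, following the formula in the diff."""
    # { oid (64bit) + timestamp (64bit) + uuid (user defined) } * 2
    metadata = (64 + 64 + uuid_bits) * 2
    # { dimension * 64 }
    vector = dimension * 64
    # { the creation edge size + the search edge size } * 8
    edges = (creation_edge_size + search_edge_size) * 8
    return metadata + vector + edges


def required_ram_gb(dimension, uuid_bits, creation_edge_size, search_edge_size,
                    index_size, index_replica, headroom=1.0):
    """Estimated RAM in GB (10^9 bytes).

    headroom is the fraction of physical RAM the index may occupy;
    the page suggests 0.6. With headroom=1.0 this is the bare minimum.
    """
    per_vector = bits_per_vector(dimension, uuid_bits,
                                 creation_edge_size, search_edge_size)
    total_bits = per_vector * index_size * index_replica
    return total_bits / headroom / 8 / 1e9


if __name__ == "__main__":
    # Values from the page's worked example.
    args = dict(dimension=900, uuid_bits=256, creation_edge_size=20,
                search_edge_size=10, index_size=1_000_000, index_replica=3)
    print(f"minimum:       {required_ram_gb(**args):.3f} GB")
    print(f"with headroom: {required_ram_gb(**args, headroom=0.6):.2f} GB")
```

With the example values from the page (900 dimensions, 256-bit UUID, edge sizes 20 and 10, 1 million vectors, 3 replicas) this reproduces the 21.978 GB minimum and the 36.63 GB figure after applying the 60% margin. As the page itself notes, production sizing should still be measured rather than derived from the formula alone.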
