A CockroachDB cluster is elastic: nodes can be added or removed at any time. This elasticity de-risks initial cluster sizing, which is typically performed to produce pricing and budgetary estimates and to right-size a POC or initial starting environment.
Once the initial environment is in use, real-world measurements can be used to calibrate the sizing. Expect to adjust the initial sizing after observing each node's resource usage (CPU, RAM, disk, and network), as well as user response time, application throughput, and availability. If the cluster is undersized, add nodes. If it is oversized, remove the unneeded nodes. Tuning of applications and database objects can also have a significant impact, but is not addressed here.
With this elasticity in mind, the iterative steps for sizing are described here. Each successive step improves accuracy. The first method is capacity-based and is the easiest to use. The second method is model-based and requires matching the application to a known model. The third method is simulation-based and requires running the actual application.
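The adjustment rule above (add nodes when undersized, remove them when oversized) can be sketched as a simple threshold check. This is only an illustration: the `NodeUsage` type, the `sizing_action` function, and the 70% / 30% thresholds are assumptions made for this example, not CockroachDB recommendations, and a real decision would also weigh response time, throughput, and availability.

```python
from dataclasses import dataclass


@dataclass
class NodeUsage:
    """Observed peak utilization of one node, as fractions between 0 and 1."""
    cpu: float
    ram: float
    disk: float


def sizing_action(nodes, high=0.70, low=0.30):
    """Return 'add', 'remove', or 'keep' for the cluster.

    The thresholds are hypothetical placeholders. The busiest resource on
    the busiest node drives the decision.
    """
    peak = max(max(n.cpu, n.ram, n.disk) for n in nodes)
    if peak > high:
        return "add"      # at least one resource is saturated: undersized
    if peak < low:
        return "remove"   # every resource on every node is idle: oversized
    return "keep"


cluster = [NodeUsage(0.85, 0.50, 0.40), NodeUsage(0.60, 0.45, 0.40)]
print(sizing_action(cluster))
```

Keying the decision on the single busiest resource is deliberately conservative: one hot node is enough to argue against shrinking the cluster.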
Outline
The capacity-based method estimates the platform resource requirements (CPU, RAM, disk, network) of a planned application. It relies on technical descriptions of the application's requirements, so the result is an "educated guess" rather than a precise calculation. The model-based method simplifies the application by keeping only its most germane and most critical portions. This allows controlled experiments and what-if scenarios that are impossible with the actual application. This method can be highly effective once ratios have been established to calibrate it to the actual workload.
The capacity-based method uses an online tool called CockroachDB Survival. The tool simulates survivable failures in different homogeneous network topologies in a CockroachDB cluster. Based on this, CockroachDB Survival produces a minimal starting hardware sizing that follows the CockroachDB Platform Provisioning and Sizing Best Practices.
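To make the capacity-based "educated guess" concrete, here is a minimal back-of-the-envelope sketch for the storage dimension only. It is not the CockroachDB Survival tool and does not reproduce its logic. The function name, the per-node storage, and the 60% disk utilization ceiling are assumptions chosen for illustration. The replication factor of 3 is CockroachDB's default.

```python
import math


def estimate_nodes(logical_data_gib,
                   replication_factor=3,
                   storage_per_node_gib=1024,
                   max_disk_utilization=0.6,
                   min_nodes=3):
    """Estimate a starting node count from storage needs alone.

    All defaults except replication_factor are hypothetical. CPU, RAM, and
    network would need their own estimates, and the largest result wins.
    """
    raw_gib = logical_data_gib * replication_factor        # every range is stored N times
    usable_per_node_gib = storage_per_node_gib * max_disk_utilization
    return max(min_nodes, math.ceil(raw_gib / usable_per_node_gib))


# 2000 GiB of logical data, replicated 3x, on nodes with 1 TiB disks.
print(estimate_nodes(2000))  # → 10
```

The `min_nodes` floor reflects that a cluster needs at least as many nodes as the replication factor to place one replica per node.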
The model-based method uses a downloadable tool called CockroachDB Workload. The tool includes about a dozen industry workloads, such as YCSB and TPC-C. Select the workload that most closely resembles the target workload in database capacity and concurrency. Run the selected workload against the configuration determined by the capacity-based method. After reviewing the performance profile, increase or decrease the number of nodes until the desired goal is reached.
The simulation-based method runs the actual workload in a testing environment. Again, start with the configuration determined by the capacity-based and/or model-based method. After reviewing the performance profile, increase or decrease the number of nodes until the desired goal is reached. Running the actual workload also allows a set of calibration ratios to be built, so that the capacity-based and model-based methods can be used with higher accuracy in the future.
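One way to read "calibration ratios" is as the ratio between what the actual application achieves and what the model workload achieves on the same cluster. The sketch below assumes that reading, plus roughly linear scaling with node count. Both are simplifying assumptions, and the function names and figures are hypothetical.

```python
import math


def calibration_ratio(actual_tps, model_tps):
    """Ratio of actual to model throughput, measured on the same cluster."""
    if model_tps <= 0:
        raise ValueError("model_tps must be positive")
    return actual_tps / model_tps


def calibrated_nodes(model_nodes, ratio):
    """Scale a model-based node estimate by a previously measured ratio.

    Assumes throughput scales roughly linearly with node count.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return math.ceil(model_nodes / ratio)


# The model workload reached 10000 tps; the real application reached 7500 tps
# on the same cluster, so the model is optimistic by a factor of 0.75.
ratio = calibration_ratio(7500, 10000)
# A later model-based estimate of 9 nodes is adjusted upward.
print(calibrated_nodes(9, ratio))  # → 12
```

A ratio below 1 means the real application is heavier than the model, so model-based estimates are scaled up. A ratio above 1 scales them down.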
Below are examples of the capacity and CockroachDB workload based methods:
Jesse Seldess (jseldess) commented:
Piyush Singh, this is an old issue from R Lee with some good guidance on ways to size a cluster's resources. Should we keep it around, or should we close it and open a distinct issue if this is a topic we want to document (which I imagine it is), also for Dedicated?
Exalate commented:
Title
Initial Sizing Guidance
Expected Audience
Architect, Developer, Operator
Jira Issue: DOC-1649