From a40cad6812b2a48612868b62649e5d3ef9ecee64 Mon Sep 17 00:00:00 2001
From: Gibbs Cullen
Date: Thu, 23 Jan 2020 15:35:56 -0500
Subject: [PATCH 1/8] added new remote write params

---
 docs/coordinator/api/remote.md | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/docs/coordinator/api/remote.md b/docs/coordinator/api/remote.md
index 9621eb5240..5bbb242b4a 100644
--- a/docs/coordinator/api/remote.md
+++ b/docs/coordinator/api/remote.md
@@ -30,6 +30,33 @@ docs/common/headers_optional_read_write.md
 
 Binary [snappy compressed](http://google.github.io/snappy/) Prometheus [WriteRequest protobuf message](https://github.com/prometheus/prometheus/blob/10444e8b1dc69ffcddab93f09ba8dfa6a4a2fddb/prompb/remote.proto#L22-L24).
 
+### Available Tuning Params
+
+**Note:** All relevant parameters can be found under the `queue_config` section of the Prometheus `remote_write` configuration.
+
+`capacity`
+Capacity controls how many samples are queued in memory per shard before reading from the WAL is blocked. Once the WAL is blocked, samples cannot be appended to any shards and all throughput will cease.
+
+Capacity should be high enough to avoid blocking other shards in most cases, but too much capacity can cause excess memory consumption and longer times to clear queues during resharding. It is recommended to set capacity to 3-10 times `max_samples_per_send`.
+
+`max_shards`
+Max shards configures the maximum number of shards, or parallelism, Prometheus will use for each remote write queue. Prometheus will try not to use too many shards, but if the queue falls behind the remote write component will increase the number of shards up to max shards to increase throughput. Unless remote writing to a very slow endpoint, it is unlikely that `max_shards` should be increased beyond the default. However, it may be necessary to reduce max shards if there is potential to overwhelm the remote endpoint, or to reduce memory usage when data is backed up.
+
+`min_shards`
+Min shards configures the minimum number of shards used by Prometheus, and is the number of shards used when remote write starts. If remote write falls behind, Prometheus will automatically scale up the number of shards, so most users do not have to adjust this parameter. However, increasing min shards will allow Prometheus to avoid falling behind at the beginning while it calculates the required number of shards.
+
+`max_samples_per_send`
+Max samples per send can be adjusted depending on the backend in use. Many systems work very well by sending more samples per batch without a significant increase in latency. Other backends will have issues if trying to send a large number of samples in each request. The default value is small enough to work for most systems.
+
+`batch_send_deadline`
+Batch send deadline sets the maximum amount of time between sends for a single shard. Even if the queued shard has not reached `max_samples_per_send`, a request will be sent. Batch send deadline can be increased for low-volume systems that are not latency sensitive in order to increase request efficiency.
+
+`min_backoff`
+Min backoff controls the minimum amount of time to wait before retrying a failed request. Increasing the backoff spreads out requests when a remote endpoint comes back online. The backoff interval is doubled for each failed request, up to `max_backoff`.
+
+`max_backoff`
+Max backoff controls the maximum amount of time to wait before retrying a failed request.
+
 ### Sample Call
 
 There isn't a straightforward way to Snappy compress and marshal a Prometheus WriteRequest protobuf message using just shell, so this example uses a specific command line utility instead.
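Read together, the seven parameters above describe a single `queue_config` block. The sketch below shows one way they could be combined; every value is an illustrative assumption rather than a recommendation, the URL assumes a local m3coordinator on its default port, and `capacity` is set to 5x `max_samples_per_send`, inside the suggested 3-10x range.

```yaml
# Illustrative only: a hypothetical Prometheus remote_write block for an
# assumed local m3coordinator endpoint; tune the numbers for your workload.
remote_write:
  - url: "http://localhost:7201/api/v1/prom/remote/write"
    queue_config:
      min_shards: 1               # shards in use when remote write starts
      max_shards: 200             # ceiling for resharding when the queue falls behind
      max_samples_per_send: 500   # batch size; raise only if the backend handles large batches well
      capacity: 2500              # 5 x max_samples_per_send (suggested range: 3-10x)
      batch_send_deadline: 5s     # flush a partial batch after this long
      min_backoff: 30ms           # first retry delay; doubled per failed request
      max_backoff: 100ms          # cap on that doubling
```

Per the guidance above, lowering `max_shards` trades peak throughput for protection of a fragile remote endpoint, while raising `batch_send_deadline` favors request efficiency over latency on low-volume systems.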
From f94f1dd4c8a77353a6e1c643b7121261a6a0fa6c Mon Sep 17 00:00:00 2001
From: Gibbs Cullen
Date: Mon, 27 Jan 2020 13:16:16 -0500
Subject: [PATCH 2/8] added link to prom params

---
 docs/coordinator/api/remote.md | 25 +------------------------
 1 file changed, 1 insertion(+), 24 deletions(-)

diff --git a/docs/coordinator/api/remote.md b/docs/coordinator/api/remote.md
index 5bbb242b4a..23fa4b37ad 100644
--- a/docs/coordinator/api/remote.md
+++ b/docs/coordinator/api/remote.md
@@ -32,30 +32,7 @@ Binary [snappy compressed](http://google.github.io/snappy/) Prometheus [WriteReq
 
 ### Available Tuning Params
 
-**Note:** All relevant parameters can be found under the `queue_config` section of the Prometheus `remote_write` configuration.
-
-`capacity`
-Capacity controls how many samples are queued in memory per shard before reading from the WAL is blocked. Once the WAL is blocked, samples cannot be appended to any shards and all throughput will cease.
-
-Capacity should be high enough to avoid blocking other shards in most cases, but too much capacity can cause excess memory consumption and longer times to clear queues during resharding. It is recommended to set capacity to 3-10 times `max_samples_per_send`.
-
-`max_shards`
-Max shards configures the maximum number of shards, or parallelism, Prometheus will use for each remote write queue. Prometheus will try not to use too many shards, but if the queue falls behind the remote write component will increase the number of shards up to max shards to increase throughput. Unless remote writing to a very slow endpoint, it is unlikely that `max_shards` should be increased beyond the default. However, it may be necessary to reduce max shards if there is potential to overwhelm the remote endpoint, or to reduce memory usage when data is backed up.
-
-`min_shards`
-Min shards configures the minimum number of shards used by Prometheus, and is the number of shards used when remote write starts. If remote write falls behind, Prometheus will automatically scale up the number of shards, so most users do not have to adjust this parameter. However, increasing min shards will allow Prometheus to avoid falling behind at the beginning while it calculates the required number of shards.
-
-`max_samples_per_send`
-Max samples per send can be adjusted depending on the backend in use. Many systems work very well by sending more samples per batch without a significant increase in latency. Other backends will have issues if trying to send a large number of samples in each request. The default value is small enough to work for most systems.
-
-`batch_send_deadline`
-Batch send deadline sets the maximum amount of time between sends for a single shard. Even if the queued shard has not reached `max_samples_per_send`, a request will be sent. Batch send deadline can be increased for low-volume systems that are not latency sensitive in order to increase request efficiency.
-
-`min_backoff`
-Min backoff controls the minimum amount of time to wait before retrying a failed request. Increasing the backoff spreads out requests when a remote endpoint comes back online. The backoff interval is doubled for each failed request, up to `max_backoff`.
-
-`max_backoff`
-Max backoff controls the maximum amount of time to wait before retrying a failed request.
+Refer [here](https://prometheus.io/docs/practices/remote_write/) for an up-to-date list of remote write tuning parameters.
 
 ### Sample Call
 
From 68bf63a4b1742ba3d85a91e6aecc8e98ce45f2bc Mon Sep 17 00:00:00 2001
From: Gibbs Cullen
Date: Mon, 2 Mar 2020 14:02:00 -0500
Subject: [PATCH 3/8] adding section for multiple m3db clusters

---
 .../multiple_m3db_clusters.md | 125 ++++++++++++++++++
 1 file changed, 125 insertions(+)
 create mode 100644 docs/operational_guide/multiple_m3db_clusters.md

diff --git a/docs/operational_guide/multiple_m3db_clusters.md b/docs/operational_guide/multiple_m3db_clusters.md
new file mode 100644
index 0000000000..c60224bbde
--- /dev/null
+++ b/docs/operational_guide/multiple_m3db_clusters.md
@@ -0,0 +1,125 @@
+## Write to multiple M3DB clusters via M3Coordinator
+
+### Overview:
+
+Default M3 architecture has the M3Coordinator writing to and aggregating meterics from a single M3DB cluster. If wanting to add more than one, follow the below instructions.
+
+Use case(s):
+- Sending metrics to different namespaces for different retention periods, etc.
+
+### Instructions:
+
+**Note:** Adding mutliple M3DB clusters to m3coordinator using the m3aggregator requires clusterManagement
+
+**Note:** When making API requests, an environment header needs to be set.
+
+1. Add clusterManagement to congfig file:
+
+Example of clusterManagement config:
+
+```bash
+clusterManagement:
+  etcd:
+    env: default_env
+    zone: embedded
+    service: m3db
+    cacheDir: /data/m3kv_default
+    etcdClusters:
+      - zone: embedded
+        endpoints:
+          - 10.33.131.173:2379
+          - 10.33.131.174:2379
+          - 10.33.131.175:2379
+ ```
+
+Example config file with clusterManagement:
+
+```bash
+  - namespaces:
+      - namespace: 21d
+        retention: 504h
+        type: unaggregated
+    client:
+      config:
+        service:
+          env: default_env
+          zone: embedded
+          service: m3db
+          cacheDir: /data/m3kv_default
+          etcdClusters:
+            - zone: embedded
+              endpoints:
+                - 10.33.131.173:2379
+                - 10.33.131.174:2379
+                - 10.33.131.175:2379
+      writeConsistencyLevel: majority
+      readConsistencyLevel: unstrict_majority
+      writeTimeout: 10s
+      fetchTimeout: 15s
+      connectTimeout: 20s
+      writeRetry:
+        initialBackoff: 500ms
+        backoffFactor: 3
+        maxRetries: 2
+        jitter: true
+      fetchRetry:
+        initialBackoff: 500ms
+        backoffFactor: 2
+        maxRetries: 3
+        jitter: true
+      backgroundHealthCheckFailLimit: 4
+      backgroundHealthCheckFailThrottleFactor: 0.5
+  - namespaces:
+      - namespace: 90d
+        retention: 2160h
+        type: aggregated
+        resolution: 10m
+      - namespace: 500d
+        retention: 12000h
+        type: aggregated
+        resolution: 1h
+    client:
+      config:
+        service:
+          env: lts_env
+          zone: embedded
+          service: m3db
+          cacheDir: /data/m3kv_lts
+          etcdClusters:
+            - zone: embedded
+              endpoints:
+                - 10.33.131.173:2379
+                - 10.33.131.174:2379
+                - 10.33.131.175:2379
+      writeConsistencyLevel: majority
+      readConsistencyLevel: unstrict_majority
+      writeTimeout: 10s
+      fetchTimeout: 15s
+      connectTimeout: 20s
+      writeRetry:
+        initialBackoff: 500ms
+        backoffFactor: 3
+        maxRetries: 2
+        jitter: true
+      fetchRetry:
+        initialBackoff: 500ms
+        backoffFactor: 2
+        maxRetries: 3
+        jitter: true
+      backgroundHealthCheckFailLimit: 4
+      backgroundHealthCheckFailThrottleFactor: 0.5
+tagOptions:
+  idScheme: quoted
+clusterManagement:
+  etcd:
+    env: default_env
+    zone: embedded
+    service: m3db
+    cacheDir: /data/m3kv_default
+    etcdClusters:
+      - zone: embedded
+        endpoints:
+          - 10.33.131.173:2379
+          - 10.33.131.174:2379
+          - 10.33.131.175:2379
+```
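Stripped of timeouts, retries, and etcd endpoints, the 125-line example above has the following shape: one entry per M3DB cluster, each distinguished by its `service.env`, plus a single shared `clusterManagement` block. This is an orientation sketch reusing the names from the example (the top-level `clusters:` key is the one PATCH 5/8 below adds), not a complete, runnable config.

```yaml
# Sketch of the overall structure only; see the full example above for the
# client, etcd, and retry settings each cluster entry also needs.
clusters:
  - namespaces:                # short-retention cluster
      - namespace: 21d
        retention: 504h
        type: unaggregated
    client:
      config:
        service:
          env: default_env     # environment name identifying this cluster
  - namespaces:                # long-term retention cluster
      - namespace: 500d
        retention: 12000h
        type: aggregated
        resolution: 1h
    client:
      config:
        service:
          env: lts_env         # a second, distinct environment name
clusterManagement:             # one shared etcd config for the coordinator APIs
  etcd:
    env: default_env
```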
From c76d553b787bf95fa57db1e551bd35cb4f62da67 Mon Sep 17 00:00:00 2001
From: Gibbs Cullen
Date: Mon, 9 Mar 2020 14:11:45 -0400
Subject: [PATCH 4/8] first round of edits made

---
 .../multiple_m3db_clusters.md | 41 ++++++------------------
 1 file changed, 12 insertions(+), 29 deletions(-)

diff --git a/docs/operational_guide/multiple_m3db_clusters.md b/docs/operational_guide/multiple_m3db_clusters.md
index c60224bbde..437672077e 100644
--- a/docs/operational_guide/multiple_m3db_clusters.md
+++ b/docs/operational_guide/multiple_m3db_clusters.md
@@ -2,37 +2,20 @@
 
 ### Overview:
 
-Default M3 architecture has the M3Coordinator writing to and aggregating meterics from a single M3DB cluster. If wanting to add more than one, follow the below instructions.
+Default M3 architecture has the M3Coordinator writing to and aggregating meterics from a single M3DB cluster. To map a single coordinator to more than one M3DB cluster, follow the instructions below.
 
 Use case(s):
 - Sending metrics to different namespaces for different retention periods, etc.
 
 ### Instructions:
 
-**Note:** Adding mutliple M3DB clusters to m3coordinator using the m3aggregator requires clusterManagement
+**Note:** Adding multiple M3DB clusters to m3coordinator using the m3aggregator requires clusterManagement.
 
 **Note:** When making API requests, an environment header needs to be set.
 
 1. Add clusterManagement to congfig file:
 
-Example of clusterManagement config:
-
-```bash
-clusterManagement:
-  etcd:
-    env: default_env
-    zone: embedded
-    service: m3db
-    cacheDir: /data/m3kv_default
-    etcdClusters:
-      - zone: embedded
-        endpoints:
-          - 10.33.131.173:2379
-          - 10.33.131.174:2379
-          - 10.33.131.175:2379
- ```
-
-Example config file with clusterManagement:
+Example config file with clusterManagement (see end of the config):
 
 ```bash
   - namespaces:
@@ -49,9 +32,9 @@ Example config file with clusterManagement:
           etcdClusters:
             - zone: embedded
               endpoints:
-                - 10.33.131.173:2379
-                - 10.33.131.174:2379
-                - 10.33.131.175:2379
+                -
+                -
+                -
       writeConsistencyLevel: majority
       readConsistencyLevel: unstrict_majority
       writeTimeout: 10s
@@ -88,9 +71,9 @@ Example config file with clusterManagement:
           etcdClusters:
             - zone: embedded
              endpoints:
-                - 10.33.131.173:2379
-                - 10.33.131.174:2379
-                - 10.33.131.175:2379
+                -
+                -
+                -
       writeConsistencyLevel: majority
       readConsistencyLevel: unstrict_majority
       writeTimeout: 10s
@@ -119,7 +102,7 @@ clusterManagement:
     etcdClusters:
      - zone: embedded
        endpoints:
-          - 10.33.131.173:2379
-          - 10.33.131.174:2379
-          - 10.33.131.175:2379
+          -
+          -
+          -
 ```
From 2b1388b06ee3618dd4f4b6ee9a6596f41cdb2950 Mon Sep 17 00:00:00 2001
From: Gibbs Cullen
Date: Thu, 19 Mar 2020 17:35:55 -0400
Subject: [PATCH 5/8] small edits and additions

---
 .../multiple_m3db_clusters.md | 23 +++++----------------
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/docs/operational_guide/multiple_m3db_clusters.md b/docs/operational_guide/multiple_m3db_clusters.md
index 437672077e..09fefd9790 100644
--- a/docs/operational_guide/multiple_m3db_clusters.md
+++ b/docs/operational_guide/multiple_m3db_clusters.md
@@ -11,13 +11,13 @@
 
 **Note:** Adding multiple M3DB clusters to m3coordinator using the m3aggregator requires clusterManagement.
 
-**Note:** When making API requests, an environment header needs to be set.
-
-1. Add clusterManagement to congfig file:
+1. Add clusterManagement to config file:
 
 Example config file with clusterManagement (see end of the config):
 
 ```bash
+clusters:
+  # Should match the namespace(s) set up in the DB nodes
   - namespaces:
       - namespace: 21d
         retention: 504h
@@ -35,11 +35,6 @@ Example config file with clusterManagement (see end of the config):
                 -
                 -
                 -
-      writeConsistencyLevel: majority
-      readConsistencyLevel: unstrict_majority
-      writeTimeout: 10s
-      fetchTimeout: 15s
-      connectTimeout: 20s
       writeRetry:
         initialBackoff: 500ms
         backoffFactor: 3
@@ -50,8 +45,6 @@ Example config file with clusterManagement (see end of the config):
         backoffFactor: 2
         maxRetries: 3
         jitter: true
-      backgroundHealthCheckFailLimit: 4
-      backgroundHealthCheckFailThrottleFactor: 0.5
   - namespaces:
       - namespace: 90d
         retention: 2160h
@@ -74,11 +67,6 @@ Example config file with clusterManagement (see end of the config):
                 -
                 -
                 -
-      writeConsistencyLevel: majority
-      readConsistencyLevel: unstrict_majority
-      writeTimeout: 10s
-      fetchTimeout: 15s
-      connectTimeout: 20s
       writeRetry:
         initialBackoff: 500ms
         backoffFactor: 3
@@ -89,8 +77,6 @@ Example config file with clusterManagement (see end of the config):
         backoffFactor: 2
         maxRetries: 3
         jitter: true
-      backgroundHealthCheckFailLimit: 4
-      backgroundHealthCheckFailThrottleFactor: 0.5
 tagOptions:
   idScheme: quoted
 clusterManagement:
@@ -106,3 +92,6 @@ clusterManagement:
           -
           -
 ```
+2. Use environment header when calling existing APIs
+
+**Note:** Any API requests to the coordinator require a `Cluster-Environment-Name` header that corresponds to the cluster the request is targeting.
\ No newline at end of file

From 5b2d9a6d3761f9ab32446f47c7008d5fd8789095 Mon Sep 17 00:00:00 2001
From: Gibbs Cullen
Date: Fri, 20 Mar 2020 12:12:32 -0400
Subject: [PATCH 6/8] formatting changes

---
 docs/operational_guide/multiple_m3db_clusters.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/docs/operational_guide/multiple_m3db_clusters.md b/docs/operational_guide/multiple_m3db_clusters.md
index 09fefd9790..fc83a2f0e6 100644
--- a/docs/operational_guide/multiple_m3db_clusters.md
+++ b/docs/operational_guide/multiple_m3db_clusters.md
@@ -1,19 +1,19 @@
-## Write to multiple M3DB clusters via M3Coordinator
+## Write to multiple M3DB clusters via m3coordinator
 
 ### Overview:
 
-Default M3 architecture has the M3Coordinator writing to and aggregating meterics from a single M3DB cluster. To map a single coordinator to more than one M3DB cluster, follow the instructions below.
+Default M3 architecture has the m3coordinator writing to and aggregating meterics from a single M3DB cluster. To map a single coordinator to more than one M3DB cluster, follow the instructions below.
 
 Use case(s):
 - Sending metrics to different namespaces for different retention periods, etc.
 
 ### Instructions:
 
-**Note:** Adding multiple M3DB clusters to m3coordinator using the m3aggregator requires clusterManagement.
+**Note:** Adding multiple M3DB clusters to m3coordinator using the m3aggregator requires `clusterManagement`.
 
-1. Add clusterManagement to config file:
+1. Add `clusterManagement` to config file:
 
-Example config file with clusterManagement (see end of the config):
+Example config file with `clusterManagement` (see end of the config):
 
 ```bash
 clusters:
@@ -92,6 +92,6 @@ clusterManagement:
           -
           -
 ```
-2. Use environment header when calling existing APIs
+2. Use environment header when calling existing APIs.
 
-**Note:** Any API requests to the coordinator require a `Cluster-Environment-Name` header that corresponds to the cluster the request is targeting.
\ No newline at end of file
+**Note:** Any API requests to the m3coordinator require a `cluster-environment-name` header that corresponds to the cluster the request is targeting.
\ No newline at end of file
From ee666de15ce1e6b8624a262c0683023fa39d3b79 Mon Sep 17 00:00:00 2001
From: Gibbs Cullen
Date: Mon, 23 Mar 2020 09:24:43 -0400
Subject: [PATCH 7/8] final changes

---
 docs/operational_guide/multiple_m3db_clusters.md | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/docs/operational_guide/multiple_m3db_clusters.md b/docs/operational_guide/multiple_m3db_clusters.md
index fc83a2f0e6..1cf2e64508 100644
--- a/docs/operational_guide/multiple_m3db_clusters.md
+++ b/docs/operational_guide/multiple_m3db_clusters.md
@@ -2,16 +2,14 @@
 
 ### Overview:
 
-Default M3 architecture has the m3coordinator writing to and aggregating meterics from a single M3DB cluster. To map a single coordinator to more than one M3DB cluster, follow the instructions below.
+Default M3 architecture has the m3coordinator writing to and aggregating metrics from a single M3DB cluster. To map a single coordinator to more than one M3DB cluster, follow the instructions below.
 
 Use case(s):
 - Sending metrics to different namespaces for different retention periods, etc.
 
 ### Instructions:
 
-**Note:** Adding multiple M3DB clusters to m3coordinator using the m3aggregator requires `clusterManagement`.
-
-1. Add `clusterManagement` to config file:
+1. To add multiple M3DB clusters to m3coordinator, add `clusterManagement` to the config file:
 
 Example config file with `clusterManagement` (see end of the config):
 
@@ -92,6 +90,4 @@ clusterManagement:
           -
           -
 ```
-2. Use environment header when calling existing APIs.
-
-**Note:** Any API requests to the m3coordinator require a `cluster-environment-name` header that corresponds to the cluster the request is targeting.
\ No newline at end of file
+2. Use a `Cluster-Environment-Name` header for any API requests to the m3coordinator.
\ No newline at end of file

From 50ab050960d662df6a92fc90b192217ee9aa996a Mon Sep 17 00:00:00 2001
From: Gibbs Cullen
Date: Mon, 23 Mar 2020 10:37:05 -0400
Subject: [PATCH 8/8] nit

---
 docs/operational_guide/multiple_m3db_clusters.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/operational_guide/multiple_m3db_clusters.md b/docs/operational_guide/multiple_m3db_clusters.md
index 1cf2e64508..1d9d1e0eca 100644
--- a/docs/operational_guide/multiple_m3db_clusters.md
+++ b/docs/operational_guide/multiple_m3db_clusters.md
@@ -90,4 +90,4 @@ clusterManagement:
           -
           -
 ```
-2. Use a `Cluster-Environment-Name` header for any API requests to the m3coordinator.
\ No newline at end of file
+2. Use the `Cluster-Environment-Name` header for any API requests to the m3coordinator.
\ No newline at end of file
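As a concrete illustration of the final step 2, a read against the coordinator's Prometheus-compatible query endpoint, aimed at the long-term cluster from the example config, might look like the call below. Only the `Cluster-Environment-Name` header itself comes from the text above; the port, the path, and the assumption that the header value is the target cluster's `env` (here `lts_env`) are illustrative guesses, not confirmed by these patches.

```bash
# Hypothetical call: query the lts_env cluster through m3coordinator.
# Endpoint and port are assumptions; adjust to your deployment.
curl -sS \
  -H "Cluster-Environment-Name: lts_env" \
  "http://localhost:7201/api/v1/query?query=up"
```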