diff --git a/docs/how_to/query.md b/docs/how_to/query.md index 8ec164a6b6..f19ca84f49 100644 --- a/docs/how_to/query.md +++ b/docs/how_to/query.md @@ -40,7 +40,7 @@ You will notice that in the setup linked above, M3DB has just one unaggregated n resolution: 10s ``` - If you run Statsite, m3agg, or some other aggregation tier, you will want to set the `all` flag under `downsample` to `false`. Otherwise, you will be aggregating metrics that have already been aggregated. +If you run Statsite, m3agg, or some other aggregation tier, you will want to set the `all` flag under `downsample` to `false`. Otherwise, you will be aggregating metrics that have already been aggregated. ```json - namespace: metrics_10s_48h @@ -51,6 +51,40 @@ You will notice that in the setup linked above, M3DB has just one unaggregated n all: false ``` +## ID generation + +The default generation scheme for IDs is unfortunately prone to collisions, but remains the default for backwards compatibility reasons. It is suggested to set the ID generation scheme to one of either `quoted` or `prepend_meta`. The `quoted` generation scheme yields the most human-readable IDs, whereas `prepend_meta` is better for more compact IDs, or if tags are expected to contain non-ASCII characters. To set the ID generation scheme, add the following to your coordinator configuration yaml file: + +```yaml +tagOptions: + idScheme: +``` + +As an example of how these schemes generate IDs, consider a series with the following 4 tags, +`[{"t1":v1}, {t2:"v2"}, {t3:v3}, {t4:v4}]`. The following is an example of how different schemes will generate IDs. 
+ +``` +legacy: "t1"=v1,t2="v2",t3=v3,t4=v4, +prepend_meta: 4,2,2,4,2,2,2,2!"t1"v1t2"v2"t3v3t4v4 +quoted: {\"t1\"="v1",t2="\"v2\"",t3="v3",t4="v4"} +``` + +If there is a chance that your metric tags will contain "control" characters, specifically `,` and `=`, it is highly recommended that one of either the `quoted` or `prepend_meta` schemes is specified, as the `legacy` scheme may cause ID collisions. As a general guideline, we suggest `quoted`, as it mirrors the more familiar Prometheus-style IDs. + +We technically have a fourth ID generation scheme that is used for Graphite IDs, but it is exclusive to the Graphite ingestion path and is not selectable as a general scheme. + +**WARNING:** Once a scheme is selected, be very careful about changing it. If changed, all incoming metrics will resolve to a new ID, effectively doubling the metric cardinality until all of the older-style metric IDs fall out of retention. + +### Migration + +We recently updated our ID generation scheme in m3coordinator to avoid the collision issues discussed above. To ease migration, we're temporarily enforcing that an ID generation scheme be explicitly provided in the m3coordinator configuration files. + +If you have been running m3query or m3coordinator already, you may want to counterintuitively select the collision-prone `legacy` scheme, as all the IDs for all of your current metrics would have already been generated with this scheme, and choosing another will effectively double your index size. If the twofold increase in cardinality is an acceptable increase (and unfortunately, this is likely to mean doubled cardinality until your longest retention cluster rotates out), it's suggested to choose a collision-resistant scheme instead. 
+ +An example of a configuration file with the ID generation scheme can be found [here](https://github.com/m3db/m3/blob/master/scripts/docker-integration-tests/prometheus/m3coordinator.yml). + +If none of these options work for you, or you would like further clarification, please stop by our [gitter channel](https://gitter.im/m3db/Lobby) and we'll be happy to help you. + ## Grafana You can also set up m3query as a [datasource in Grafana](http://docs.grafana.org/features/datasources/prometheus/). To do this, add a new datasource with a type of `Prometheus`. The URL should point to the host/port running m3query. By default, m3query runs on port `7201`. diff --git a/kube/bundle.yaml b/kube/bundle.yaml index c5a8393bfb..0198696642 100644 --- a/kube/bundle.yaml +++ b/kube/bundle.yaml @@ -131,6 +131,8 @@ data: sanitization: prometheus samplingRate: 1.0 extended: none + tagOptions: + idScheme: quoted db: logging: diff --git a/kube/m3dbnode-configmap.yaml b/kube/m3dbnode-configmap.yaml index d7a6bae810..69c433df10 100644 --- a/kube/m3dbnode-configmap.yaml +++ b/kube/m3dbnode-configmap.yaml @@ -23,6 +23,8 @@ data: sanitization: prometheus samplingRate: 1.0 extended: none + tagOptions: + idScheme: quoted db: logging: diff --git a/scripts/development/m3_stack/m3aggregator.yml b/scripts/development/m3_stack/m3aggregator.yml index 6cd924e604..b8db209b25 100644 --- a/scripts/development/m3_stack/m3aggregator.yml +++ b/scripts/development/m3_stack/m3aggregator.yml @@ -216,7 +216,7 @@ aggregator: jitterEnabled: true maxJitters: - flushInterval: 5s - maxJitterPercent: 1.0 + maxJitterPercent: 1.0 - flushInterval: 10s maxJitterPercent: 0.5 - flushInterval: 1m diff --git a/scripts/development/m3_stack/m3coordinator.yml b/scripts/development/m3_stack/m3coordinator.yml index cf47d98a1d..aa963b2982 100644 --- a/scripts/development/m3_stack/m3coordinator.yml +++ b/scripts/development/m3_stack/m3coordinator.yml @@ -52,3 +52,6 @@ ingest: carbon: ingester: listenAddress: "0.0.0.0:7204" + 
+tagOptions: + idScheme: quoted diff --git a/scripts/docker-integration-tests/carbon/m3coordinator.yml b/scripts/docker-integration-tests/carbon/m3coordinator.yml index f3d109ac2b..fd5db0ebd6 100644 --- a/scripts/docker-integration-tests/carbon/m3coordinator.yml +++ b/scripts/docker-integration-tests/carbon/m3coordinator.yml @@ -55,3 +55,6 @@ carbon: policies: - resolution: 5s retention: 10h + +tagOptions: + idScheme: quoted diff --git a/scripts/docker-integration-tests/prometheus/m3coordinator.yml b/scripts/docker-integration-tests/prometheus/m3coordinator.yml index 435bd48380..6d513c5fb2 100644 --- a/scripts/docker-integration-tests/prometheus/m3coordinator.yml +++ b/scripts/docker-integration-tests/prometheus/m3coordinator.yml @@ -34,3 +34,6 @@ clusters: - dbnode01:2379 writeConsistencyLevel: majority readConsistencyLevel: unstrict_majority + +tagOptions: + idScheme: quoted diff --git a/src/cmd/services/m3query/config/config.go b/src/cmd/services/m3query/config/config.go index 70af672c18..5514fbbcbf 100644 --- a/src/cmd/services/m3query/config/config.go +++ b/src/cmd/services/m3query/config/config.go @@ -22,6 +22,7 @@ package config import ( "errors" + "fmt" "time" etcdclient "github.com/m3db/m3/src/cluster/client/etcd" @@ -33,6 +34,7 @@ import ( "github.com/m3db/m3/src/query/models" "github.com/m3db/m3/src/query/storage" "github.com/m3db/m3/src/query/storage/m3" + xdocs "github.com/m3db/m3/src/x/docs" xconfig "github.com/m3db/m3x/config" "github.com/m3db/m3x/config/listenaddress" "github.com/m3db/m3x/instrument" @@ -48,6 +50,9 @@ const ( M3DBStorageType BackendStorageType = "m3db" defaultCarbonIngesterListenAddress = "0.0.0.0:7204" + errNoIDGenerationScheme = "error: a recent breaking change means that an ID " + + "generation scheme is required in coordinator configuration settings. 
" + + "More information is available here: %s" ) var ( @@ -340,7 +345,9 @@ func TagOptionsFromConfig(cfg TagOptionsConfiguration) (models.TagOptions, error } if cfg.Scheme == models.TypeDefault { - cfg.Scheme = models.TypeLegacy + // If no config has been set, error. + docLink := xdocs.Path("how_to/query#migration") + return nil, fmt.Errorf(errNoIDGenerationScheme, docLink) } opts = opts.SetIDSchemeType(cfg.Scheme) diff --git a/src/cmd/services/m3query/config/config_test.go b/src/cmd/services/m3query/config/config_test.go index 24cdcba78d..8d7c2f8d94 100644 --- a/src/cmd/services/m3query/config/config_test.go +++ b/src/cmd/services/m3query/config/config_test.go @@ -21,9 +21,11 @@ package config import ( + "fmt" "testing" "github.com/m3db/m3/src/query/models" + xdocs "github.com/m3db/m3/src/x/docs" xconfig "github.com/m3db/m3x/config" "github.com/stretchr/testify/assert" @@ -32,18 +34,34 @@ import ( yaml "gopkg.in/yaml.v2" ) -func TestTagOptionsFromEmptyConfig(t *testing.T) { +func TestTagOptionsFromEmptyConfigErrors(t *testing.T) { cfg := TagOptionsConfiguration{} opts, err := TagOptionsFromConfig(cfg) - require.NoError(t, err) - require.NotNil(t, opts) - assert.Equal(t, []byte("__name__"), opts.MetricName()) + require.Error(t, err) + require.Nil(t, opts) +} + +func TestTagOptionsFromConfigWithIDGenerationScheme(t *testing.T) { + schemes := []models.IDSchemeType{models.TypeLegacy, + models.TypePrependMeta, models.TypeQuoted} + for _, scheme := range schemes { + cfg := TagOptionsConfiguration{ + Scheme: scheme, + } + + opts, err := TagOptionsFromConfig(cfg) + require.NoError(t, err) + require.NotNil(t, opts) + assert.Equal(t, []byte("__name__"), opts.MetricName()) + assert.Equal(t, scheme, opts.IDSchemeType()) + } } func TestTagOptionsFromConfig(t *testing.T) { name := "foobar" cfg := TagOptionsConfiguration{ MetricName: name, + Scheme: models.TypeLegacy, } opts, err := TagOptionsFromConfig(cfg) require.NoError(t, err) @@ -97,14 +115,41 @@ func 
TestConfigValidation(t *testing.T) { } } -func TestDefaultTagOptionsConfig(t *testing.T) { +func TestDefaultTagOptionsConfigErrors(t *testing.T) { var cfg TagOptionsConfiguration require.NoError(t, yaml.Unmarshal([]byte(""), &cfg)) opts, err := TagOptionsFromConfig(cfg) - require.NoError(t, err) - assert.Equal(t, []byte("__name__"), opts.MetricName()) - assert.Equal(t, []byte("le"), opts.BucketName()) - assert.Equal(t, models.TypeLegacy, opts.IDSchemeType()) + + docLink := xdocs.Path("how_to/query#migration") + expectedError := fmt.Sprintf(errNoIDGenerationScheme, docLink) + require.EqualError(t, err, expectedError) + require.Nil(t, opts) +} + +func TestGraphiteIDGenerationSchemeIsInvalid(t *testing.T) { + var cfg TagOptionsConfiguration + require.Error(t, yaml.Unmarshal([]byte("idScheme: graphite"), &cfg)) +} + +func TestTagOptionsConfigWithTagGenerationScheme(t *testing.T) { + var tests = []struct { + schemeStr string + scheme models.IDSchemeType + }{ + {"legacy", models.TypeLegacy}, + {"prepend_meta", models.TypePrependMeta}, + {"quoted", models.TypeQuoted}, + } + + for _, tt := range tests { + var cfg TagOptionsConfiguration + schemeConfig := fmt.Sprintf("idScheme: %s", tt.schemeStr) + require.NoError(t, yaml.Unmarshal([]byte(schemeConfig), &cfg)) + opts, err := TagOptionsFromConfig(cfg) + require.NoError(t, err) + assert.Equal(t, []byte("__name__"), opts.MetricName()) + assert.Equal(t, tt.scheme, opts.IDSchemeType()) + } } func TestTagOptionsConfig(t *testing.T) { diff --git a/src/dbnode/config/m3dbnode-all-config.yml b/src/dbnode/config/m3dbnode-all-config.yml index 8155709b8a..811f0cf9a3 100644 --- a/src/dbnode/config/m3dbnode-all-config.yml +++ b/src/dbnode/config/m3dbnode-all-config.yml @@ -30,6 +30,10 @@ coordinator: limits: maxComputedDatapoints: 10000 + tagOptions: + # Configuration setting for generating metric IDs from tags. + idScheme: quoted + db: # Minimum log level which will be emitted. 
logging: diff --git a/src/dbnode/config/m3dbnode-cluster-template.yml b/src/dbnode/config/m3dbnode-cluster-template.yml index cec60eb72a..01f723de50 100644 --- a/src/dbnode/config/m3dbnode-cluster-template.yml +++ b/src/dbnode/config/m3dbnode-cluster-template.yml @@ -19,6 +19,10 @@ coordinator: samplingRate: 1.0 extended: none + tagOptions: + # Configuration setting for generating metric IDs from tags. + idScheme: quoted + db: logging: level: info diff --git a/src/dbnode/config/m3dbnode-local-etcd.yml b/src/dbnode/config/m3dbnode-local-etcd.yml index 38c1b47b4a..b1464c4b89 100644 --- a/src/dbnode/config/m3dbnode-local-etcd.yml +++ b/src/dbnode/config/m3dbnode-local-etcd.yml @@ -22,6 +22,10 @@ coordinator: limits: maxComputedDatapoints: 10000 + tagOptions: + # Configuration setting for generating metric IDs from tags. + idScheme: quoted + db: logging: level: info diff --git a/src/dbnode/config/m3dbnode-local.yml b/src/dbnode/config/m3dbnode-local.yml index 59bf0261af..65f8639352 100644 --- a/src/dbnode/config/m3dbnode-local.yml +++ b/src/dbnode/config/m3dbnode-local.yml @@ -19,6 +19,10 @@ coordinator: samplingRate: 1.0 extended: none + tagOptions: + # Configuration setting for generating metric IDs from tags. + idScheme: quoted + db: logging: level: info diff --git a/src/query/config/m3coordinator-cluster-template.yml b/src/query/config/m3coordinator-cluster-template.yml index 59f462b366..9ca4374f79 100644 --- a/src/query/config/m3coordinator-cluster-template.yml +++ b/src/query/config/m3coordinator-cluster-template.yml @@ -12,6 +12,9 @@ metrics: samplingRate: 1.0 extended: none +tagOptions: + idScheme: quoted + clusters: ## Fill-out the following and un-comment before using, and ## make sure indent by two spaces is applied. 
diff --git a/src/query/config/m3coordinator-local-etcd.yml b/src/query/config/m3coordinator-local-etcd.yml index 5e1d04cc49..1d6b880b23 100644 --- a/src/query/config/m3coordinator-local-etcd.yml +++ b/src/query/config/m3coordinator-local-etcd.yml @@ -34,3 +34,6 @@ clusters: endpoint: http://127.0.0.1:2380 writeConsistencyLevel: majority readConsistencyLevel: unstrict_majority + +tagOptions: + idScheme: quoted diff --git a/src/query/config/m3query-dev-etcd.yml b/src/query/config/m3query-dev-etcd.yml index ef0e16dfb6..1750a95041 100644 --- a/src/query/config/m3query-dev-etcd.yml +++ b/src/query/config/m3query-dev-etcd.yml @@ -59,4 +59,7 @@ readWorkerPoolPolicy: writeWorkerPoolPolicy: grow: false - size: 10 \ No newline at end of file + size: 10 + +tagOptions: + idScheme: quoted diff --git a/src/query/config/m3query-local-etcd.yml b/src/query/config/m3query-local-etcd.yml index 4b65c4a423..a2964ceab2 100644 --- a/src/query/config/m3query-local-etcd.yml +++ b/src/query/config/m3query-local-etcd.yml @@ -12,6 +12,9 @@ metrics: samplingRate: 1.0 extended: none +tagOptions: + idScheme: quoted + clusters: - namespaces: - namespace: default @@ -49,4 +52,3 @@ clusters: jitter: true backgroundHealthCheckFailLimit: 4 backgroundHealthCheckFailThrottleFactor: 0.5 - diff --git a/src/query/models/config.go b/src/query/models/config.go index 1ba755e0ea..618afe0359 100644 --- a/src/query/models/config.go +++ b/src/query/models/config.go @@ -72,6 +72,13 @@ func (t *IDSchemeType) UnmarshalYAML(unmarshal func(interface{}) error) error { } for _, valid := range validIDSchemes { + if valid == TypeGraphite { + // NB: while the graphite scheme is valid, it is not available to choose + // as a general ID scheme; instead, it is set on any metric coming through + // the graphite ingestion path. 
+ continue + } + if str == valid.String() { *t = valid return nil diff --git a/src/query/models/config_test.go b/src/query/models/config_test.go index 18941eb4f1..949680db33 100644 --- a/src/query/models/config_test.go +++ b/src/query/models/config_test.go @@ -50,7 +50,13 @@ func TestMetricsTypeUnmarshalYAML(t *testing.T) { Type IDSchemeType `yaml:"type"` } - for _, value := range validIDSchemes { + validParseSchemes := []IDSchemeType{ + TypeLegacy, + TypeQuoted, + TypePrependMeta, + } + + for _, value := range validParseSchemes { str := fmt.Sprintf("type: %s\n", value.String()) var cfg config @@ -60,6 +66,9 @@ func TestMetricsTypeUnmarshalYAML(t *testing.T) { } var cfg config + // Graphite fails. + require.Error(t, yaml.Unmarshal([]byte("type: graphite\n"), &cfg)) + // Bad type fails. require.Error(t, yaml.Unmarshal([]byte("type: not_a_known_type\n"), &cfg)) require.NoError(t, yaml.Unmarshal([]byte(""), &cfg)) diff --git a/src/query/models/types.go b/src/query/models/types.go index d85361713c..dfff733a5f 100644 --- a/src/query/models/types.go +++ b/src/query/models/types.go @@ -54,14 +54,17 @@ const ( TypeLegacy // TypeQuoted describes a scheme where IDs are generated by appending // tag names with explicitly quoted and escaped tag values. Tag names are - // also escaped if they contain invalid characters. - // {t1:v1},{t2:v2} -> t1"v1"t2"v2" - // {t1:v1,t2:v2} -> t1"v1,t2:v2" + // also escaped if they contain invalid characters. This is equivalent to + // the Prometheus ID style. 
+ // {t1:v1},{t2:v2} -> {t1="v1",t2="v2"} + // {t1:v1,t2:v2} -> {t1="v1,t2:v2"} + // {"t1":"v1"} -> {\"t1\"="\"v1\""} TypeQuoted // TypePrependMeta describes a scheme where IDs are generated by prepending // the length of each tag at the start of the ID - // {t1:v1},{t2:v2} -> 44t1v1t2v2 - // {t1:v1,t2:v2} -> 10t1v1,t2:v2 + // {t1:v1},{t2:v2} -> 2,2,2,2!t1v1t2v2 + // {t1:v1,t2:v2} -> 2,8!t1v1,t2:v2 + // {"t1":"v1"} -> 4,4!"t1""v1" TypePrependMeta // TypeGraphite describes a scheme where IDs are generated to match graphite // representation of the tags. This scheme should only be used on the graphite @@ -71,6 +74,10 @@ const ( // // NB: when TypeGraphite is specified, tags are ordered numerically rather // than lexically. + // + // NB 2: while the graphite scheme is valid, it is not available to choose as + // a general ID scheme; instead, it is set on any metric coming through the + // graphite ingestion path. TypeGraphite ) diff --git a/src/query/server/server_test.go b/src/query/server/server_test.go index af0434ea97..cdf8b72a59 100644 --- a/src/query/server/server_test.go +++ b/src/query/server/server_test.go @@ -71,6 +71,7 @@ clusters: tagOptions: metricName: "_new" + idScheme: quoted readWorkerPoolPolicy: grow: true @@ -83,6 +84,7 @@ writeWorkerPoolPolicy: size: 100 shards: 1000 killProbability: 0.3 + ` //TODO: Use randomly assigned port here @@ -103,7 +105,7 @@ func TestRun(t *testing.T) { session := client.NewMockSession(ctrl) for _, value := range []float64{1, 2} { session.EXPECT().WriteTagged(ident.NewIDMatcher("prometheus_metrics"), - ident.NewIDMatcher("_new=first,biz=baz,foo=bar,"), + ident.NewIDMatcher(`{_new="first",biz="baz",foo="bar"}`), gomock.Any(), gomock.Any(), value, @@ -112,7 +114,7 @@ func TestRun(t *testing.T) { } for _, value := range []float64{3, 4} { session.EXPECT().WriteTagged(ident.NewIDMatcher("prometheus_metrics"), - ident.NewIDMatcher("_new=second,bar=baz,foo=qux,"), 
gomock.Any(), gomock.Any(), value, @@ -252,6 +254,7 @@ backend: grpc tagOptions: metricName: "bar" + idScheme: prepend_meta readWorkerPoolPolicy: grow: true
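
Illustration (not part of the patch): the docs added in `docs/how_to/query.md` say the `legacy` scheme can collide when tag values contain `,` or `=`. The Go sketch below shows why, and shows how a length-prefixed ID in the style of `prepend_meta` keeps the two series apart. It is a minimal sketch, not the m3 implementation: the `Tag` type and the `legacyID` and `prependMetaID` helpers are hypothetical names, and their output format is inferred only from the examples in this diff.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Tag is a hypothetical name/value pair used only for this illustration.
type Tag struct {
	Name, Value string
}

// legacyID mimics the legacy-style output shown in the docs: name=value
// pairs, each followed by a comma, with no escaping at all.
func legacyID(tags []Tag) string {
	var sb strings.Builder
	for _, t := range tags {
		sb.WriteString(t.Name)
		sb.WriteByte('=')
		sb.WriteString(t.Value)
		sb.WriteByte(',')
	}
	return sb.String()
}

// prependMetaID mimics the prepend_meta-style output shown in the docs: the
// byte length of every name and value, then "!", then the raw names and values.
func prependMetaID(tags []Tag) string {
	lengths := make([]string, 0, 2*len(tags))
	var body strings.Builder
	for _, t := range tags {
		lengths = append(lengths, strconv.Itoa(len(t.Name)), strconv.Itoa(len(t.Value)))
		body.WriteString(t.Name)
		body.WriteString(t.Value)
	}
	return strings.Join(lengths, ",") + "!" + body.String()
}

func main() {
	// Two different series: two plain tags versus one tag whose value
	// happens to contain the "," and "=" control characters.
	a := []Tag{{"t1", "v1"}, {"t2", "v2"}}
	b := []Tag{{"t1", "v1,t2=v2"}}

	fmt.Println(legacyID(a) == legacyID(b))           // the unescaped IDs collide
	fmt.Println(prependMetaID(a) == prependMetaID(b)) // the length prefix differs
	fmt.Println(prependMetaID(a))
	fmt.Println(prependMetaID(b))
}
```

Because the lengths are written ahead of the payload, a value containing control characters cannot be mistaken for a tag boundary, which is the property the docs rely on when recommending `quoted` or `prepend_meta` over `legacy`.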