diff --git a/.gitignore b/.gitignore
index 4e1783a9b8d..e5b6dfce009 100644
--- a/.gitignore
+++ b/.gitignore
@@ -12,7 +12,7 @@ kube/.minikube
 
 # Ignore e2e working dirs.
 data/
-test/e2e/e2e_integration_test*
+test/e2e/e2e_*
 
 # Ignore promu artifacts.
 /.build
diff --git a/CHANGELOG.md b/CHANGELOG.md
index c4fd5035fde..d8de57028a1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,44 +10,57 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
 
 ## Unreleased
 
+### Added
+
 - [#4667](https://github.com/thanos-io/thanos/pull/4667) Add a pure aws-sdk auth for s3 storage.
+- [#4680](https://github.com/thanos-io/thanos/pull/4680) Query: add `exemplar.partial-response` flag to control partial response.
+- [#4679](https://github.com/thanos-io/thanos/pull/4679) Add `enable-feature` flag to enable negative offsets and the @ modifier, similar to Prometheus.
+- [#4696](https://github.com/thanos-io/thanos/pull/4696) Query: add cache name to tracing spans.
+- [#4736](https://github.com/thanos-io/thanos/pull/4736) S3: Add capability to use a custom AWS STS endpoint.
+- [#4764](https://github.com/thanos-io/thanos/pull/4764) Compactor: add `block-viewer.global.sync-block-timeout` flag to set the timeout for synchronizing block metas.
 
 ### Fixed
 
-- [#4663](https://github.com/thanos-io/thanos/pull/4663) Fetcher: Fix discovered data races
+- [#4508](https://github.com/thanos-io/thanos/pull/4508) Adjust and rename `ThanosSidecarUnhealthy` to `ThanosSidecarNoConnectionToStartedPrometheus`; Remove `ThanosSidecarPrometheusDown` alert; Remove unused `thanos_sidecar_last_heartbeat_success_time_seconds` metrics.
+- [#4663](https://github.com/thanos-io/thanos/pull/4663) Fetcher: Fix discovered data races.
+- [#4754](https://github.com/thanos-io/thanos/pull/4754) Query: Fix possible panic on stores endpoint.
+- [#4753](https://github.com/thanos-io/thanos/pull/4753) Store: Validate the block sync concurrency parameter.
+- [#4792](https://github.com/thanos-io/thanos/pull/4792) Store: Fix data race in BucketedBytes pool.
 
-### Added
+## [v0.23.1](https://github.com/thanos-io/thanos/tree/release-0.23) - 2021.10.1
 
-- [#4680](https://github.com/thanos-io/thanos/pull/4680) Query: add `exemplar.partial-response` flag to control partial response.
+- [#4714](https://github.com/thanos-io/thanos/pull/4714) EndpointSet: Do not use the new InfoAPI, which is not implemented yet, to obtain metadata (avoids an unnecessary HTTP roundtrip, instrumentation/alert spam and log noise).
 
-## v0.23.0 - In Progress
+## [v0.23.0](https://github.com/thanos-io/thanos/tree/release-0.23) - 2021.09.23
 
 ### Added
 
-- [#4453](https://github.com/thanos-io/thanos/pull/4453) Tools: Add flag `--selector.relabel-config-file` / `--selector.relabel-config` / `--max-time` / `--min-time` to filter served blocks.
-- [#4482](https://github.com/thanos-io/thanos/pull/4482) COS: Add http_config for cos object store client.
-- [#4487](https://github.com/thanos-io/thanos/pull/4487) Query: Add memcached auto discovery support.
-- [#4444](https://github.com/thanos-io/thanos/pull/4444) UI: Add search block UI.
-- [#4509](https://github.com/thanos-io/thanos/pull/4509) Logging: Adds duration_ms in int64 to the logs.
-- [#4462](https://github.com/thanos-io/thanos/pull/4462) UI: Add find overlap block UI.
-- [#4469](https://github.com/thanos-io/thanos/pull/4469) Compact: Add flag `compact.skip-block-with-out-of-order-chunks` to skip blocks with out-of-order chunks during compaction instead of halting
-- [#4506](https://github.com/thanos-io/thanos/pull/4506) `Baidu BOS` object storage, see [documents](docs/storage.md#baidu-bos) for further information.
-- [#4552](https://github.com/thanos-io/thanos/pull/4552) Compact: Adds `thanos_compact_downsample_duration_seconds` histogram.
-- [#4594](https://github.com/thanos-io/thanos/pull/4594) reloader: Expose metrics in config reloader to give info on the last operation.
-- [#4623](https://github.com/thanos-io/thanos/pull/4623) query-frontend: made HTTP downstream tripper (client) configurable via parameters `--query-range.downstream-tripper-config` and `--query-range.downstream-tripper-config-file`. If your downstream URL is localhost or 127.0.0.1 then it is strongly recommended to bump `max_idle_conns_per_host` to at least 100 so that `query-frontend` could properly use HTTP keep-alive connections and thus reduce the latency of `query-frontend` by about 20%.
-- [#4636](https://github.com/thanos-io/thanos/pull/4636) Azure: Support authentication using user-assigned managed identity
+- [#4453](https://github.com/thanos-io/thanos/pull/4453) Tools `thanos bucket web`: Add flag `--selector.relabel-config-file` / `--selector.relabel-config` / `--max-time` / `--min-time` to filter served blocks.
+- [#4482](https://github.com/thanos-io/thanos/pull/4482) Store: Add `http_config` option for COS object store client.
+- [#4487](https://github.com/thanos-io/thanos/pull/4487) Query/Store: Add memcached auto discovery support for all caching clients.
+- [#4444](https://github.com/thanos-io/thanos/pull/4444) UI: Add search to the Block UI.
+- [#4509](https://github.com/thanos-io/thanos/pull/4509) Logging: Add `duration_ms` in int64 to the logs for easier log filtering.
+- [#4462](https://github.com/thanos-io/thanos/pull/4462) UI: Highlight overlapping blocks in the Block UI.
+- [#4469](https://github.com/thanos-io/thanos/pull/4469) Compact: Add flag `compact.skip-block-with-out-of-order-chunks` to skip blocks with out-of-order chunks during compaction instead of halting.
+- [#4506](https://github.com/thanos-io/thanos/pull/4506) Store: Add `Baidu BOS` object storage, see [documents](docs/storage.md#baidu-bos) for further information.
+- [#4552](https://github.com/thanos-io/thanos/pull/4552) Compact: Add `thanos_compact_downsample_duration_seconds` histogram metric.
+- [#4594](https://github.com/thanos-io/thanos/pull/4594) Reloader: Expose metrics in config reloader to give info on the last operation.
+- [#4619](https://github.com/thanos-io/thanos/pull/4619) Tracing: Add consistent tags to the Series call from Querier with important series statistics: `processed.series`, `processed.samples` and `processed.bytes`. This gives admins an idea of how much data each component processes per query.
+- [#4623](https://github.com/thanos-io/thanos/pull/4623) Query-frontend: Make HTTP downstream tripper (client) configurable via parameters `--query-range.downstream-tripper-config` and `--query-range.downstream-tripper-config-file`. If your downstream URL is localhost or 127.0.0.1 then it is strongly recommended to bump `max_idle_conns_per_host` to at least 100 so that `query-frontend` can properly use HTTP keep-alive connections and thus reduce its latency by about 20%.
 ### Fixed
 
 - [#4468](https://github.com/thanos-io/thanos/pull/4468) Rule: Fix temporary rule filename composition issue.
-- [#4476](https://github.com/thanos-io/thanos/pull/4476) UI: fix incorrect html escape sequence used for '>' symbol.
-- [#4532](https://github.com/thanos-io/thanos/pull/4532) Mixin: Fixed "all jobs" selector in thanos mixin dashboards.
-- [#4607](https://github.com/thanos-io/thanos/pull/4607) Azure: Fix Azure MSI Rate Limit
+- [#4476](https://github.com/thanos-io/thanos/pull/4476) UI: Fix incorrect html escape sequence used for '>' symbol.
+- [#4532](https://github.com/thanos-io/thanos/pull/4532) Mixin: Fix "all jobs" selector in thanos mixin dashboards.
+- [#4607](https://github.com/thanos-io/thanos/pull/4607) Azure: Fix Azure MSI Rate Limit.
 
 ### Changed
 
-- [#4519](https://github.com/thanos-io/thanos/pull/4519) Query: switch to miekgdns DNS resolver as the default one.
+- [#4519](https://github.com/thanos-io/thanos/pull/4519) Query: Switch to miekgdns DNS resolver as the default one.
 - [#4586](https://github.com/thanos-io/thanos/pull/4586) Update Prometheus/Cortex dependencies and implement LabelNames() pushdown as a result; provides massive speed-up for the labels API in Thanos Query.
+- [#4421](https://github.com/thanos-io/thanos/pull/4421) *breaking :warning:*: `--store` (in the future, to be renamed to `--endpoints`) now supports passing any of the Thanos gRPC APIs: StoreAPI, MetadataAPI, RulesAPI, TargetsAPI and ExemplarsAPI (as opposed to the past, when you had to put them in the hidden `--targets`, `--rules` etc. flags). `--store` will now automatically detect which APIs the server exposes.
+- [#4669](https://github.com/thanos-io/thanos/pull/4669) Moved Prometheus dependency to v2.30.
 
 ## [v0.22.0](https://github.com/thanos-io/thanos/tree/release-0.22) - 2021.07.22
diff --git a/cmd/thanos/compact.go b/cmd/thanos/compact.go
index 2ae488de6db..6e01c21110e 100644
--- a/cmd/thanos/compact.go
+++ b/cmd/thanos/compact.go
@@ -448,7 +448,7 @@ func runCompact(
 
 	// TODO(bwplotka): Find a way to avoid syncing if no op was done.
 	if err := sy.SyncMetas(ctx); err != nil {
-		return errors.Wrap(err, "sync before first pass of downsampling")
+		return errors.Wrap(err, "sync before retention")
 	}
 
 	if err := compact.ApplyRetentionPolicyByResolution(ctx, logger, bkt, sy.Metas(), retentionByResolution, compactMetrics.blocksMarked.WithLabelValues(metadata.DeletionMarkFilename, "")); err != nil {
@@ -537,14 +537,14 @@ func runCompact(
 		}
 
 		g.Add(func() error {
-			iterCtx, iterCancel := context.WithTimeout(ctx, conf.waitInterval)
+			iterCtx, iterCancel := context.WithTimeout(ctx, conf.blockViewerSyncBlockTimeout)
 			_, _, _ = f.Fetch(iterCtx)
 			iterCancel()
 
 			// For /global state make sure to fetch periodically.
 		return runutil.Repeat(conf.blockViewerSyncBlockInterval, ctx.Done(), func() error {
 			return runutil.RetryWithLog(logger, time.Minute, ctx.Done(), func() error {
-				iterCtx, iterCancel := context.WithTimeout(ctx, conf.waitInterval)
+				iterCtx, iterCancel := context.WithTimeout(ctx, conf.blockViewerSyncBlockTimeout)
 				defer iterCancel()
 
 				_, _, err := f.Fetch(iterCtx)
@@ -576,6 +576,7 @@ type compactConfig struct {
 	blockSyncConcurrency          int
 	blockMetaFetchConcurrency     int
 	blockViewerSyncBlockInterval  time.Duration
+	blockViewerSyncBlockTimeout   time.Duration
 	cleanupBlocksInterval         time.Duration
 	compactionConcurrency         int
 	downsampleConcurrency         int
@@ -634,6 +635,8 @@ func (cc *compactConfig) registerFlag(cmd extkingpin.FlagClause) {
 		Default("32").IntVar(&cc.blockMetaFetchConcurrency)
 	cmd.Flag("block-viewer.global.sync-block-interval", "Repeat interval for syncing the blocks between local and remote view for /global Block Viewer UI.").
 		Default("1m").DurationVar(&cc.blockViewerSyncBlockInterval)
+	cmd.Flag("block-viewer.global.sync-block-timeout", "Maximum time for syncing the blocks between local and remote view for /global Block Viewer UI.").
+		Default("5m").DurationVar(&cc.blockViewerSyncBlockTimeout)
 	cmd.Flag("compact.cleanup-interval", "How often we should clean up partially uploaded blocks and blocks with deletion mark in the background when --wait has been enabled. Setting it to \"0s\" disables it - the cleaning will only happen at the end of an iteration.").
 		Default("5m").DurationVar(&cc.cleanupBlocksInterval)
diff --git a/cmd/thanos/query.go b/cmd/thanos/query.go
index 4d893f212ed..373cd579134 100644
--- a/cmd/thanos/query.go
+++ b/cmd/thanos/query.go
@@ -52,6 +52,11 @@ import (
 	"github.com/thanos-io/thanos/pkg/ui"
 )
 
+const (
+	promqlNegativeOffset = "promql-negative-offset"
+	promqlAtModifier     = "promql-at-modifier"
+)
+
 // registerQuery registers a query command.
 func registerQuery(app *extkingpin.App) {
 	comp := component.Query
@@ -146,6 +151,8 @@ func registerQuery(app *extkingpin.App) {
 	enableMetricMetadataPartialResponse := cmd.Flag("metric-metadata.partial-response", "Enable partial response for metric metadata endpoint. --no-metric-metadata.partial-response for disabling.").
 		Hidden().Default("true").Bool()
 
+	featureList := cmd.Flag("enable-feature", "Comma separated experimental feature names to enable. The current list of features is "+promqlNegativeOffset+" and "+promqlAtModifier+".").Default("").Strings()
+
 	enableExemplarPartialResponse := cmd.Flag("exemplar.partial-response", "Enable partial response for exemplar endpoint. --no-exemplar.partial-response for disabling.").
Hidden().Default("true").Bool() @@ -163,6 +170,16 @@ func registerQuery(app *extkingpin.App) { return errors.Wrap(err, "parse federation labels") } + var enableNegativeOffset, enableAtModifier bool + for _, feature := range *featureList { + if feature == promqlNegativeOffset { + enableNegativeOffset = true + } + if feature == promqlAtModifier { + enableAtModifier = true + } + } + if dup := firstDuplicate(*stores); dup != "" { return errors.Errorf("Address %s is duplicated for --store flag.", dup) } @@ -266,6 +283,8 @@ func registerQuery(app *extkingpin.App) { *defaultMetadataTimeRange, *strictStores, *webDisableCORS, + enableAtModifier, + enableNegativeOffset, component.Query, ) }) @@ -329,6 +348,8 @@ func runQuery( defaultMetadataTimeRange time.Duration, strictStores []string, disableCORS bool, + enableAtModifier bool, + enableNegativeOffset bool, comp component.Component, ) error { // TODO(bplotka in PR #513 review): Move arguments into struct. @@ -456,6 +477,9 @@ func runQuery( cancelRun() }) + engineOpts.EnableAtModifier = enableAtModifier + engineOpts.EnableNegativeOffset = enableNegativeOffset + ctxUpdate, cancelUpdate := context.WithCancel(context.Background()) g.Add(func() error { for { @@ -479,7 +503,6 @@ func runQuery( } }, func(error) { cancelUpdate() - close(fileSDUpdates) }) } // Periodically update the addresses from static flags and file SD by resolving them using DNS SD if necessary. @@ -546,7 +569,7 @@ func runQuery( api := v1.NewQueryAPI( logger, - endpoints, + endpoints.GetEndpointStatus, engineFactory(promql.NewEngine, engineOpts, dynamicLookbackDelta), queryableCreator, // NOTE: Will share the same replica label as the query for now. diff --git a/cmd/thanos/rule.go b/cmd/thanos/rule.go index 6a6cb9cd672..d5893edd2a0 100644 --- a/cmd/thanos/rule.go +++ b/cmd/thanos/rule.go @@ -34,6 +34,7 @@ import ( "github.com/prometheus/prometheus/util/strutil" "github.com/thanos-io/thanos/pkg/errutil" "github.com/thanos-io/thanos/pkg/extkingpin" + "github.com/thanos-io/thanos/pkg/httpconfig" extflag "github.com/efficientgo/tools/extkingpin" "github.com/thanos-io/thanos/pkg/alert" @@ -43,12 +44,10 @@ import ( "github.com/thanos-io/thanos/pkg/discovery/dns" "github.com/thanos-io/thanos/pkg/extprom" extpromhttp "github.com/thanos-io/thanos/pkg/extprom/http" - http_util "github.com/thanos-io/thanos/pkg/http" "github.com/thanos-io/thanos/pkg/logging" "github.com/thanos-io/thanos/pkg/objstore/client" "github.com/thanos-io/thanos/pkg/prober" "github.com/thanos-io/thanos/pkg/promclient" - "github.com/thanos-io/thanos/pkg/query" thanosrules "github.com/thanos-io/thanos/pkg/rules" "github.com/thanos-io/thanos/pkg/runutil" grpcserver "github.com/thanos-io/thanos/pkg/server/grpc" @@ -266,29 +265,29 @@ func runRule( ) error { metrics := newRuleMetrics(reg) - var queryCfg []query.Config + var queryCfg []httpconfig.Config var err error if len(conf.queryConfigYAML) > 0 { - queryCfg, err = query.LoadConfigs(conf.queryConfigYAML) + queryCfg, err = httpconfig.LoadConfigs(conf.queryConfigYAML) if err != nil { return err } } else { - queryCfg, err = query.BuildQueryConfig(conf.query.addrs) + queryCfg, err = httpconfig.BuildConfig(conf.query.addrs) if err != nil { - return err + return errors.Wrap(err, "query configuration") } // Build the query configuration from the legacy query flags. 
- var fileSDConfigs []http_util.FileSDConfig + var fileSDConfigs []httpconfig.FileSDConfig if len(conf.query.sdFiles) > 0 { - fileSDConfigs = append(fileSDConfigs, http_util.FileSDConfig{ + fileSDConfigs = append(fileSDConfigs, httpconfig.FileSDConfig{ Files: conf.query.sdFiles, RefreshInterval: model.Duration(conf.query.sdInterval), }) queryCfg = append(queryCfg, - query.Config{ - EndpointsConfig: http_util.EndpointsConfig{ + httpconfig.Config{ + EndpointsConfig: httpconfig.EndpointsConfig{ Scheme: "http", FileSDConfigs: fileSDConfigs, }, @@ -302,16 +301,16 @@ func runRule( extprom.WrapRegistererWithPrefix("thanos_rule_query_apis_", reg), dns.ResolverType(conf.query.dnsSDResolver), ) - var queryClients []*http_util.Client + var queryClients []*httpconfig.Client queryClientMetrics := extpromhttp.NewClientMetrics(extprom.WrapRegistererWith(prometheus.Labels{"client": "query"}, reg)) for _, cfg := range queryCfg { cfg.HTTPClientConfig.ClientMetrics = queryClientMetrics - c, err := http_util.NewHTTPClient(cfg.HTTPClientConfig, "query") + c, err := httpconfig.NewHTTPClient(cfg.HTTPClientConfig, "query") if err != nil { return err } c.Transport = tracing.HTTPTripperware(logger, c.Transport) - queryClient, err := http_util.NewClient(logger, cfg.EndpointsConfig, c, queryProvider.Clone()) + queryClient, err := httpconfig.NewClient(logger, cfg.EndpointsConfig, c, queryProvider.Clone()) if err != nil { return err } @@ -381,13 +380,13 @@ func runRule( ) for _, cfg := range alertingCfg.Alertmanagers { cfg.HTTPClientConfig.ClientMetrics = amClientMetrics - c, err := http_util.NewHTTPClient(cfg.HTTPClientConfig, "alertmanager") + c, err := httpconfig.NewHTTPClient(cfg.HTTPClientConfig, "alertmanager") if err != nil { return err } c.Transport = tracing.HTTPTripperware(logger, c.Transport) // Each Alertmanager client has a different list of targets thus each needs its own DNS provider. 
- amClient, err := http_util.NewClient(logger, cfg.EndpointsConfig, c, amProvider.Clone()) + amClient, err := httpconfig.NewClient(logger, cfg.EndpointsConfig, c, amProvider.Clone()) if err != nil { return err } @@ -706,7 +705,7 @@ func removeDuplicateQueryEndpoints(logger log.Logger, duplicatedQueriers prometh func queryFuncCreator( logger log.Logger, - queriers []*http_util.Client, + queriers []*httpconfig.Client, duplicatedQuery prometheus.Counter, ruleEvalWarnings *prometheus.CounterVec, httpMethod string, @@ -762,7 +761,7 @@ func queryFuncCreator( } } -func addDiscoveryGroups(g *run.Group, c *http_util.Client, interval time.Duration) { +func addDiscoveryGroups(g *run.Group, c *httpconfig.Client, interval time.Duration) { ctx, cancel := context.WithCancel(context.Background()) g.Add(func() error { c.Discover(ctx) diff --git a/cmd/thanos/sidecar.go b/cmd/thanos/sidecar.go index fee36d65374..8584492b4fe 100644 --- a/cmd/thanos/sidecar.go +++ b/cmd/thanos/sidecar.go @@ -29,7 +29,7 @@ import ( "github.com/thanos-io/thanos/pkg/exthttp" "github.com/thanos-io/thanos/pkg/extkingpin" "github.com/thanos-io/thanos/pkg/extprom" - thanoshttp "github.com/thanos-io/thanos/pkg/http" + "github.com/thanos-io/thanos/pkg/httpconfig" "github.com/thanos-io/thanos/pkg/logging" meta "github.com/thanos-io/thanos/pkg/metadata" thanosmodel "github.com/thanos-io/thanos/pkg/model" @@ -138,10 +138,6 @@ func runSidecar( Name: "thanos_sidecar_prometheus_up", Help: "Boolean indicator whether the sidecar can reach its Prometheus peer.", }) - lastHeartbeat := promauto.With(reg).NewGauge(prometheus.GaugeOpts{ - Name: "thanos_sidecar_last_heartbeat_success_time_seconds", - Help: "Timestamp of the last successful heartbeat in seconds.", - }) ctx, cancel := context.WithCancel(context.Background()) g.Add(func() error { @@ -191,7 +187,6 @@ func runSidecar( ) promUp.Set(1) statusProber.Ready() - lastHeartbeat.SetToCurrentTime() return nil }) if err != nil { @@ -213,7 +208,6 @@ func runSidecar( promUp.Set(0) } else { promUp.Set(1) - lastHeartbeat.SetToCurrentTime() } return nil @@ -234,7 +228,7 @@ func runSidecar( t := exthttp.NewTransport() t.MaxIdleConnsPerHost = conf.connection.maxIdleConnsPerHost t.MaxIdleConns = conf.connection.maxIdleConns - c := promclient.NewClient(&http.Client{Transport: tracing.HTTPTripperware(logger, t)}, logger, thanoshttp.ThanosUserAgent) + c := promclient.NewClient(&http.Client{Transport: tracing.HTTPTripperware(logger, t)}, logger, httpconfig.ThanosUserAgent) promStore, err := store.NewPrometheusStore(logger, reg, c, conf.prometheus.url, component.Sidecar, m.Labels, m.Timestamps, m.Version) if err != nil { diff --git a/cmd/thanos/store.go b/cmd/thanos/store.go index 19b6dbf8370..25310f9ef3b 100644 --- a/cmd/thanos/store.go +++ b/cmd/thanos/store.go @@ -113,7 +113,7 @@ func (sc *storeConfig) registerFlag(cmd extkingpin.FlagClause) { cmd.Flag("sync-block-duration", "Repeat interval for syncing the blocks between local and remote view."). Default("3m").DurationVar(&sc.syncInterval) - cmd.Flag("block-sync-concurrency", "Number of goroutines to use when constructing index-cache.json blocks from object storage."). + cmd.Flag("block-sync-concurrency", "Number of goroutines to use when constructing index-cache.json blocks from object storage. Must be equal or greater than 1."). Default("20").IntVar(&sc.blockSyncConcurrency) cmd.Flag("block-meta-fetch-concurrency", "Number of goroutines to use when fetching block metadata from object storage."). 
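The `Must be equal or greater than 1.` addition to the `--block-sync-concurrency` help above documents the new validation from [#4753](https://github.com/thanos-io/thanos/pull/4753) ("Store: Validate the block sync concurrency parameter." in the changelog). A hypothetical sketch of such a guard, not the actual Thanos code:

```go
package main

import "github.com/pkg/errors"

// validateBlockSyncConcurrency is a hypothetical sketch of the guard implied
// by the flag help above: reject non-positive concurrency values early,
// before any sync goroutines are spawned.
func validateBlockSyncConcurrency(c int) error {
	if c < 1 {
		return errors.Errorf("block-sync-concurrency must be equal or greater than 1, got %d", c)
	}
	return nil
}
```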
diff --git a/cmd/thanos/tools_bucket.go b/cmd/thanos/tools_bucket.go
index cacd26eb88d..9c0c837a69f 100644
--- a/cmd/thanos/tools_bucket.go
+++ b/cmd/thanos/tools_bucket.go
@@ -1133,8 +1133,8 @@ func registerBucketRewrite(app extkingpin.AppClause, objStoreConfig *extflag.Pat
 		}
 
 		if tbc.dryRun {
-			level.Info(logger).Log("msg", "dry run finished. Changes should be printed to stderr")
-			return nil
+			level.Info(logger).Log("msg", "dry run finished. Changes should be printed to stderr", "block_id", id)
+			continue
 		}
 
 		level.Info(logger).Log("msg", "wrote new block after modifications; flushing", "source", id, "new", newID)
diff --git a/docs/components/compact.md b/docs/components/compact.md
index ce1a796b36e..8b662a88e75 100644
--- a/docs/components/compact.md
+++ b/docs/components/compact.md
@@ -284,6 +284,10 @@ Flags:
                                  Repeat interval for syncing the blocks between
                                  local and remote view for /global Block Viewer
                                  UI.
+      --block-viewer.global.sync-block-timeout=5m
+                                 Maximum time for syncing the blocks between
+                                 local and remote view for /global Block Viewer
+                                 UI.
       --bucket-web-label=BUCKET-WEB-LABEL
                                  Prometheus label to use as timeline title in
                                  the bucket web UI
diff --git a/docs/components/query.md b/docs/components/query.md
index 63f7dd0a2cf..41e2cccf896 100644
--- a/docs/components/query.md
+++ b/docs/components/query.md
@@ -252,6 +252,9 @@ Query node exposing PromQL enabled Query API with data retrieved from multiple
 store nodes.
 
 Flags:
+      --enable-feature= ...     Comma separated experimental feature names to
+                                 enable. The current list of features is
+                                 promql-negative-offset and promql-at-modifier.
       --grpc-address="0.0.0.0:10901"
                                  Listen ip:port address for gRPC endpoints
                                  (StoreAPI). Make sure this address is routable
diff --git a/docs/components/store.md b/docs/components/store.md
index 3ae037beb90..cdc86050524 100644
--- a/docs/components/store.md
+++ b/docs/components/store.md
@@ -34,6 +34,7 @@ Flags:
       --block-sync-concurrency=20
                                  Number of goroutines to use when constructing
                                  index-cache.json blocks from object storage.
+                                 Must be equal or greater than 1.
       --chunk-pool-size=2GB      Maximum size of concurrently allocatable bytes
                                  reserved strictly to reuse for chunks in
                                  memory.
diff --git a/docs/operating/modify-objstore-data.md b/docs/operating/modify-objstore-data.md
new file mode 100644
index 00000000000..014168efc51
--- /dev/null
+++ b/docs/operating/modify-objstore-data.md
@@ -0,0 +1,175 @@
+# Modify series in the object storage via bucket rewrite tool
+
+For operational purposes, there are use cases where you need to manipulate data in the object storage, for example deleting some high-cardinality metrics, or relabeling metrics. This is possible via the bucket rewrite tool.
+
+## Delete series
+
+```shell
+thanos tools bucket rewrite --rewrite.to-delete-config-file config.yaml --objstore.config-file objstore.yaml --id <block-id>
+```
+
+This is an example command to delete some data in the specified TSDB block from your object store bucket. For example, if `k8s_app_metric37` is the metric you want to delete, then the config file `config.yaml` would be:
+
+```yaml
+- matchers: '{__name__="k8s_app_metric37"}'
+```
+
+Example output (captured on macOS) is shown below. Dry run mode is enabled by default to prevent unexpected series from being deleted.
+
+A changelog file is generated so that you can review the modifications that the provided deletion request would make.
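+
+The example below reads the block from a local filesystem bucket via `--objstore.config-file ~/local-bucket-config.yaml`. As a minimal sketch (the directory path is illustrative; see [the filesystem storage docs](https://thanos.io/tip/thanos/storage.md/#filesystem)), such a config file could look like:
+
+```yaml
+type: FILESYSTEM
+config:
+  directory: "/home/user/local-bucket"
+```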
+ +```shell +thanos tools bucket rewrite --rewrite.to-delete-config-file config.yaml --objstore.config-file ~/local-bucket-config.yaml --id 01FET1EK9BC3E0QD4886RQCM8K + +level=info ts=2021-09-25T05:47:14.87316Z caller=factory.go:49 msg="loading bucket configuration" +level=info ts=2021-09-25T05:47:14.875365Z caller=tools_bucket.go:1078 msg="downloading block" source=01FET1EK9BC3E0QD4886RQCM8K +level=info ts=2021-09-25T05:47:14.887816Z caller=tools_bucket.go:1115 msg="changelog will be available" file=/var/folders/ny/yy113mqs6szcpjy2qrnhq9rh0000gq/T/thanos-rewrite/01FGDQWKJ7H29B3V4HCQ691WN9/change.log +level=info ts=2021-09-25T05:47:14.912544Z caller=tools_bucket.go:1130 msg="starting rewrite for block" source=01FET1EK9BC3E0QD4886RQCM8K new=01FGDQWKJ7H29B3V4HCQ691WN9 toDelete="- matchers: '{__name__=\"k8s_app_metric37\"}'\n" toRelabel= +level=info ts=2021-09-25T05:47:14.919438Z caller=compactor.go:41 msg="processed 10.00% of 15000 series" +level=info ts=2021-09-25T05:47:14.925442Z caller=compactor.go:41 msg="processed 20.00% of 15000 series" +level=info ts=2021-09-25T05:47:14.930263Z caller=compactor.go:41 msg="processed 30.00% of 15000 series" +level=info ts=2021-09-25T05:47:14.934325Z caller=compactor.go:41 msg="processed 40.00% of 15000 series" +level=info ts=2021-09-25T05:47:14.939466Z caller=compactor.go:41 msg="processed 50.00% of 15000 series" +level=info ts=2021-09-25T05:47:14.944513Z caller=compactor.go:41 msg="processed 60.00% of 15000 series" +level=info ts=2021-09-25T05:47:14.950254Z caller=compactor.go:41 msg="processed 70.00% of 15000 series" +level=info ts=2021-09-25T05:47:14.955336Z caller=compactor.go:41 msg="processed 80.00% of 15000 series" +level=info ts=2021-09-25T05:47:14.960193Z caller=compactor.go:41 msg="processed 90.00% of 15000 series" +level=info ts=2021-09-25T05:47:14.964705Z caller=compactor.go:41 msg="processed 100.00% of 15000 series" +level=info ts=2021-09-25T05:47:14.964768Z caller=tools_bucket.go:1136 msg="dry run finished. Changes should be printed to stderr" +level=info ts=2021-09-25T05:47:14.965101Z caller=main.go:160 msg=exiting +``` + +Below is an example output of the changelog. All the series that match the given deletion config will be deleted. The last column `[{1630713615001 1630715400001}]` represents the start and end time of the series. 
+
+```shell
+cat /var/folders/ny/yy113mqs6szcpjy2qrnhq9rh0000gq/T/thanos-rewrite/01FGDQWKJ7H29B3V4HCQ691WN9/change.log
+
+Deleted {__blockgen_target__="1", __name__="k8s_app_metric37", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"} [{1630713615001 1630715400001}]
+Deleted {__blockgen_target__="1", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"} [{1630715415000 1630719015000}]
+Deleted {__blockgen_target__="1", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"} [{1630719015000 1630720815000}]
+Deleted {__blockgen_target__="10", __name__="k8s_app_metric37", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"} [{1630713615001 1630715400001}]
+Deleted {__blockgen_target__="10", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"} [{1630715415000 1630719015000}]
+Deleted {__blockgen_target__="10", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"} [{1630719015000 1630720815000}]
+Deleted {__blockgen_target__="100", __name__="k8s_app_metric37", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"} [{1630713615001 1630715400001}]
+Deleted {__blockgen_target__="100", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"} [{1630715415000 1630719015000}]
+Deleted {__blockgen_target__="100", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"} [{1630719015000 1630720815000}]
+Deleted {__blockgen_target__="11", __name__="k8s_app_metric37", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"} [{1630713615001 1630715400001}]
+Deleted {__blockgen_target__="11", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"} [{1630715415000 1630719015000}]
+Deleted {__blockgen_target__="11", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"} [{1630719015000 1630720815000}]
+Deleted {__blockgen_target__="12", __name__="k8s_app_metric37", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"} [{1630713615001 1630715400001}]
+Deleted {__blockgen_target__="12", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"} [{1630715415000 1630719015000}]
+Deleted {__blockgen_target__="12", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"} [{1630719015000 1630720815000}]
+Deleted {__blockgen_target__="13", __name__="k8s_app_metric37", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"} [{1630713615001 1630715400001}]
+Deleted {__blockgen_target__="13", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"} [{1630715415000 1630719015000}]
+Deleted {__blockgen_target__="13", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"} [{1630719015000 1630720815000}]
+...
+```
+
+If the changelog output matches what you expect, we can use the same command as in the first step, but with the `--no-dry-run` flag, to actually delete the data.
+
+```shell
+thanos tools bucket rewrite --no-dry-run --rewrite.to-delete-config-file config.yaml --objstore.config-file objstore.yaml --id <block-id>
+```
+
+The output is listed below.
+
+```shell
+thanos tools bucket rewrite --no-dry-run --rewrite.to-delete-config-file config.yaml --objstore.config-file ~/local-bucket-config.yaml --id 01FET1EK9BC3E0QD4886RQCM8K
+
+level=info ts=2021-09-25T05:59:18.05232Z caller=factory.go:49 msg="loading bucket configuration"
+level=info ts=2021-09-25T05:59:18.059056Z caller=tools_bucket.go:1078 msg="downloading block" source=01FET1EK9BC3E0QD4886RQCM8K
+level=info ts=2021-09-25T05:59:18.074761Z caller=tools_bucket.go:1115 msg="changelog will be available" file=/var/folders/ny/yy113mqs6szcpjy2qrnhq9rh0000gq/T/thanos-rewrite/01FGDRJNST2EYDY2RKWFZJPGWJ/change.log
+level=info ts=2021-09-25T05:59:18.108293Z caller=tools_bucket.go:1130 msg="starting rewrite for block" source=01FET1EK9BC3E0QD4886RQCM8K new=01FGDRJNST2EYDY2RKWFZJPGWJ toDelete="- matchers: '{__name__=\"k8s_app_metric37\"}'\n" toRelabel=
+level=info ts=2021-09-25T05:59:18.395253Z caller=compactor.go:41 msg="processed 10.00% of 15000 series"
+level=info ts=2021-09-25T05:59:18.406416Z caller=compactor.go:41 msg="processed 20.00% of 15000 series"
+level=info ts=2021-09-25T05:59:18.419826Z caller=compactor.go:41 msg="processed 30.00% of 15000 series"
+level=info ts=2021-09-25T05:59:18.428238Z caller=compactor.go:41 msg="processed 40.00% of 15000 series"
+level=info ts=2021-09-25T05:59:18.436017Z caller=compactor.go:41 msg="processed 50.00% of 15000 series"
+level=info ts=2021-09-25T05:59:18.444738Z caller=compactor.go:41 msg="processed 60.00% of 15000 series"
+level=info ts=2021-09-25T05:59:18.452328Z caller=compactor.go:41 msg="processed 70.00% of 15000 series"
+level=info ts=2021-09-25T05:59:18.465218Z caller=compactor.go:41 msg="processed 80.00% of 15000 series"
+level=info ts=2021-09-25T05:59:18.477385Z caller=compactor.go:41 msg="processed 90.00% of 15000 series"
+level=info ts=2021-09-25T05:59:18.485254Z caller=compactor.go:41 msg="processed 100.00% of 15000 series"
+level=info ts=2021-09-25T05:59:18.485296Z caller=tools_bucket.go:1140 msg="wrote new block after modifications; flushing" source=01FET1EK9BC3E0QD4886RQCM8K new=01FGDRJNST2EYDY2RKWFZJPGWJ
+level=info ts=2021-09-25T05:59:18.662059Z caller=tools_bucket.go:1149 msg="uploading new block" source=01FET1EK9BC3E0QD4886RQCM8K new=01FGDRJNST2EYDY2RKWFZJPGWJ
+level=info ts=2021-09-25T05:59:18.667883Z caller=tools_bucket.go:1159 msg=uploaded source=01FET1EK9BC3E0QD4886RQCM8K new=01FGDRJNST2EYDY2RKWFZJPGWJ
+level=info ts=2021-09-25T05:59:18.667921Z caller=tools_bucket.go:1167 msg="rewrite done" IDs=01FET1EK9BC3E0QD4886RQCM8K
+level=info ts=2021-09-25T05:59:18.668136Z caller=main.go:160 msg=exiting
+```
+
+After rewriting, a new block `01FGDRJNST2EYDY2RKWFZJPGWJ` will be uploaded to your object store bucket.
+
+However, the old block is not deleted by default, for safety reasons. You can add the `--delete-blocks` flag so that the source block is marked for deletion once the rewrite is done; it will then be deleted automatically if you have a compactor running against that bucket.
+
+### Advanced deletion config
+
+Multiple matchers can be added in the deletion config.
+
+For example, the config file below specifies deletion for all series that match:
+
+1. metric name `k8s_app_metric1`
+2. metric name `k8s_app_metric37` with label `__blockgen_target__` matching the regexp `7.*`
+
+```yaml
+- matchers: '{__name__="k8s_app_metric37", __blockgen_target__=~"7.*"}'
+- matchers: '{__name__="k8s_app_metric1"}'
+```
+
+## Relabel series
+
+```shell
+thanos tools bucket rewrite --rewrite.to-relabel-config-file config.yaml --objstore.config-file objstore.yaml --id <block-id>
+```
+
+Series relabeling is needed when you want to rename your metrics or drop some high-cardinality labels. The command is similar to the deletion one, but uses the `--rewrite.to-relabel-config-file` flag. The configuration format is the same as the [Prometheus relabel_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config). For example, the relabel config file below does two things:
+
+1. deletes all series that match `{__name__="k8s_app_metric37"}`
+
+2. renames `k8s_app_metric38` to `old_metric`
+
+```yaml
+- action: drop
+  regex: k8s_app_metric37
+  source_labels: [__name__]
+- action: replace
+  source_labels: [__name__]
+  regex: k8s_app_metric38
+  target_label: __name__
+  replacement: old_metric
+```
+
+Example output of the changelog:
+
+```shell
+Deleted {__blockgen_target__="1", __name__="k8s_app_metric37", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"} [{1630713615001 1630715400001}]
+Deleted {__blockgen_target__="1", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"} [{1630715415000 1630719015000}]
+Deleted {__blockgen_target__="1", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"} [{1630719015000 1630720815000}]
+Relabelled {__blockgen_target__="1", __name__="k8s_app_metric38", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"} {__blockgen_target__="1", __name__="old_metric", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"}
+Relabelled {__blockgen_target__="1", __name__="k8s_app_metric38", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"} {__blockgen_target__="1", __name__="old_metric", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"}
+Relabelled {__blockgen_target__="1", __name__="k8s_app_metric38", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"} {__blockgen_target__="1", __name__="old_metric", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"}
+Deleted {__blockgen_target__="10", __name__="k8s_app_metric37", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"} [{1630713615001 1630715400001}]
+Deleted {__blockgen_target__="10", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"} [{1630715415000 1630719015000}]
+Deleted {__blockgen_target__="10", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"} [{1630719015000 1630720815000}]
+Relabelled {__blockgen_target__="10", __name__="k8s_app_metric38", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"} {__blockgen_target__="10", __name__="old_metric", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"}
+Relabelled {__blockgen_target__="10", __name__="k8s_app_metric38", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"} {__blockgen_target__="10", __name__="old_metric", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"}
+Relabelled {__blockgen_target__="10", __name__="k8s_app_metric38", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"} {__blockgen_target__="10", __name__="old_metric", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"}
+Deleted {__blockgen_target__="100", __name__="k8s_app_metric37", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"} [{1630713615001 1630715400001}]
+Deleted {__blockgen_target__="100", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"} [{1630715415000 1630719015000}]
+Deleted {__blockgen_target__="100", __name__="k8s_app_metric37", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"} [{1630719015000 1630720815000}]
+Relabelled {__blockgen_target__="100", __name__="k8s_app_metric38", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"} {__blockgen_target__="100", __name__="old_metric", next_rollout_time="2021-09-03 23:30:00 +0000 UTC"}
+Relabelled {__blockgen_target__="100", __name__="k8s_app_metric38", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"} {__blockgen_target__="100", __name__="old_metric", next_rollout_time="2021-09-04 00:30:00 +0000 UTC"}
+Relabelled {__blockgen_target__="100", __name__="k8s_app_metric38", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"} {__blockgen_target__="100", __name__="old_metric", next_rollout_time="2021-09-04 01:30:00 +0000 UTC"}
+...
+```
+
+If the output is as expected, you can then add the `--no-dry-run` flag to actually rewrite the blocks.
+
+## Rewrite Prometheus TSDB blocks
+
+Thanos supports `local filesystem` object storage, which uses a local directory as the bucket. If you want to delete or rewrite series in local Prometheus TSDB blocks, you can use the command below:
+
+```shell
+thanos tools bucket rewrite --prom-blocks --rewrite.to-relabel-config-file config.yaml --objstore.config-file local-bucket.yaml --id <block-id>
+```
+
+`--prom-blocks` disables the external labels check when adding new blocks. For the local bucket config file, please refer to [this](https://thanos.io/tip/thanos/storage.md/#filesystem).
diff --git a/docs/release-process.md b/docs/release-process.md
index e85556ac319..97d82f125d1 100644
--- a/docs/release-process.md
+++ b/docs/release-process.md
@@ -23,8 +23,9 @@ Release shepherd responsibilities:
 
 | Release | Time of first RC     | Shepherd (GitHub handle)    |
 |---------|----------------------|-----------------------------|
-| v0.24.0 | (planned) 2021.09.28 | No one ATM                  |
-| v0.23.0 | 2021.09.01           | `@bwplotka`                 |
+| v0.25.0 | (planned) 2021.12.09 | No one ATM                  |
+| v0.24.0 | (planned) 2021.10.28 | `@squat`                    |
+| v0.23.0 | 2021.09.02           | `@bwplotka`                 |
 | v0.22.0 | 2021.07.06           | `@GiedriusS`                |
 | v0.21.0 | 2021.05.28           | `@metalmatze` and `@onprem` |
 | v0.20.0 | 2021.04.23           | `@kakkoyun`                 |
@@ -120,7 +121,7 @@ The whole release from release candidate `rc.0` to actual release should have ex
 
 10. Announce `#thanos` slack channel.
 
-11. Pull commits from release branch to main branch for non `rc` releases.
+11. Pull commits from release branch to main branch for non `rc` releases. Make sure not to modify `VERSION`; it should still point to `version+1-dev` ([TODO to automate this](https://github.com/thanos-io/thanos/issues/4741))
 
 12. After releasing a major version, please cut a release for `kube-thanos` as well. https://github.com/thanos-io/kube-thanos/releases Make sure all the flag changes are reflected in the manifests. Otherwise, the process is the same, except we don't have `rc` for the `kube-thanos`. We do this to make sure we have compatible manifests for each major versions.
diff --git a/docs/storage.md b/docs/storage.md
index 89e543c6c54..d097a945fd7 100644
--- a/docs/storage.md
+++ b/docs/storage.md
@@ -89,6 +89,7 @@ config:
   kms_key_id: ""
   kms_encryption_context: {}
   encryption_key: ""
+  sts_endpoint: ""
 ```
 
 At a minimum, you will need to provide a value for the `bucket`, `endpoint`, `access_key`, and `secret_key` keys. The rest of the keys are optional.
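For illustration, a minimal S3 configuration using only the required keys might look like the following (the bucket name, endpoint and credentials are placeholders; the new `sts_endpoint` key can stay empty unless you need a custom STS endpoint, described below):

```yaml
type: S3
config:
  bucket: "my-bucket"
  endpoint: "s3.us-east-1.amazonaws.com"
  access_key: "ACCESS_KEY"
  secret_key: "SECRET_KEY"
```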
@@ -229,6 +230,12 @@ With this policy you should be able to run set `THANOS_TEST_OBJSTORE_SKIP=GCS,AZ
 
 Details about AWS policies: https://docs.aws.amazon.com/AmazonS3/latest/dev/using-with-s3-actions.html
 
+##### STS Endpoint
+
+If you want to use IAM credentials retrieved from an instance profile, Thanos needs to authenticate through AWS STS. For this purpose, you can specify your own STS endpoint.
+
+By default, Thanos will use the https://sts.amazonaws.com endpoint and the endpoints corresponding to your AWS region.
+
 #### GCS
 
 To configure Google Cloud Storage bucket as an object store you need to set `bucket` with GCS bucket name and configure Google Application credentials.
diff --git a/examples/alerts/alerts.md b/examples/alerts/alerts.md
index 0d54adb4215..b274f4579f7 100644
--- a/examples/alerts/alerts.md
+++ b/examples/alerts/alerts.md
@@ -296,16 +296,6 @@ rules:
 ```yaml mdox-exec="cat examples/tmp/thanos-sidecar.yaml"
 name: thanos-sidecar
 rules:
-- alert: ThanosSidecarPrometheusDown
-  annotations:
-    description: Thanos Sidecar {{$labels.instance}} cannot connect to Prometheus.
- runbook_url: https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarprometheusdown - summary: Thanos Sidecar cannot connect to Prometheus - expr: | - thanos_sidecar_prometheus_up{job=~".*thanos-sidecar.*"} == 0 - for: 5m - labels: - severity: critical - alert: ThanosSidecarBucketOperationsFailed annotations: description: Thanos Sidecar {{$labels.instance}} bucket operations are failing @@ -321,14 +311,16 @@ groups: for: 5m labels: severity: critical - - alert: ThanosSidecarUnhealthy + - alert: ThanosSidecarNoConnectionToStartedPrometheus annotations: - description: Thanos Sidecar {{$labels.instance}} is unhealthy for more than - {{$value}} seconds. - runbook_url: https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarunhealthy - summary: Thanos Sidecar is unhealthy. + description: Thanos Sidecar {{$labels.instance}} is unhealthy. + runbook_url: https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarnoconnectiontostartedprometheus + summary: Thanos Sidecar cannot access Prometheus, even though Prometheus seems + healthy and has reloaded WAL. expr: | - time() - max by (job, instance) (thanos_sidecar_last_heartbeat_success_time_seconds{job=~".*thanos-sidecar.*"}) >= 240 + thanos_sidecar_prometheus_up{job=~".*thanos-sidecar.*"} == 0 + AND on (namespace, pod) + prometheus_tsdb_data_replay_duration_seconds != 0 for: 5m labels: severity: critical diff --git a/examples/alerts/tests.yaml b/examples/alerts/tests.yaml index 64207c46f79..d04662243bc 100644 --- a/examples/alerts/tests.yaml +++ b/examples/alerts/tests.yaml @@ -7,127 +7,74 @@ evaluation_interval: 1m tests: - interval: 1m input_series: - - series: 'thanos_sidecar_last_heartbeat_success_time_seconds{namespace="production", job="thanos-sidecar", instance="thanos-sidecar-0"}' - values: '5 10 43 17 11 0 0 0' - - series: 'thanos_sidecar_last_heartbeat_success_time_seconds{namespace="production", job="thanos-sidecar", instance="thanos-sidecar-1"}' - values: '4 9 42 15 10 0 0 0' - promql_expr_test: - - expr: time() - eval_time: 1m - exp_samples: - - labels: '{}' - value: 60 - - expr: time() - eval_time: 2m - exp_samples: - - labels: '{}' - value: 120 - - expr: max(thanos_sidecar_last_heartbeat_success_time_seconds{job="thanos-sidecar"}) by (job, instance) - eval_time: 2m - exp_samples: - - labels: '{job="thanos-sidecar", instance="thanos-sidecar-0"}' - value: 43 - - labels: '{job="thanos-sidecar", instance="thanos-sidecar-1"}' - value: 42 - - expr: max(thanos_sidecar_last_heartbeat_success_time_seconds{job="thanos-sidecar"}) by (job, instance) - eval_time: 10m - exp_samples: - - labels: '{job="thanos-sidecar", instance="thanos-sidecar-0"}' - value: 0 - - labels: '{job="thanos-sidecar", instance="thanos-sidecar-1"}' - value: 0 - - expr: max(thanos_sidecar_last_heartbeat_success_time_seconds{job="thanos-sidecar"}) by (job, instance) - eval_time: 11m - exp_samples: - - labels: '{job="thanos-sidecar", instance="thanos-sidecar-0"}' - value: 0 - - labels: '{job="thanos-sidecar", instance="thanos-sidecar-1"}' - value: 0 - - expr: time() - max(thanos_sidecar_last_heartbeat_success_time_seconds{job="thanos-sidecar"}) by (job, instance) - eval_time: 10m - exp_samples: - - labels: '{job="thanos-sidecar", instance="thanos-sidecar-0"}' - value: 600 - - labels: '{job="thanos-sidecar", instance="thanos-sidecar-1"}' - value: 600 - - expr: time() - max(thanos_sidecar_last_heartbeat_success_time_seconds{job="thanos-sidecar"}) by (job, instance) - eval_time: 11m - 
exp_samples: - - labels: '{job="thanos-sidecar", instance="thanos-sidecar-0"}' - value: 660 - - labels: '{job="thanos-sidecar", instance="thanos-sidecar-1"}' - value: 660 - - expr: time() - max(thanos_sidecar_last_heartbeat_success_time_seconds{job="thanos-sidecar"}) by (job, instance) >= 600 - eval_time: 12m - exp_samples: - - labels: '{job="thanos-sidecar", instance="thanos-sidecar-0"}' - value: 720 - - labels: '{job="thanos-sidecar", instance="thanos-sidecar-1"}' - value: 720 + - series: 'thanos_sidecar_prometheus_up{namespace="production", job="thanos-sidecar", instance="thanos-sidecar-0", pod="prometheus-0"}' + values: '1x5 0x15' + - series: 'thanos_sidecar_prometheus_up{namespace="production", job="thanos-sidecar", instance="thanos-sidecar-1", pod="prometheus-1"}' + values: '1x4 0x15' + - series: 'prometheus_tsdb_data_replay_duration_seconds{namespace="production", job="prometheus-k8s", instance="prometheus-k8s-0", pod="prometheus-0"}' + values: '4x5 0x5 5x15' + - series: 'prometheus_tsdb_data_replay_duration_seconds{namespace="production", job="prometheus-k8s", instance="prometheus-k8s-1", pod="prometheus-1"}' + values: '10x14 0x6' alert_rule_test: - eval_time: 1m - alertname: ThanosSidecarUnhealthy + alertname: ThanosSidecarNoConnectionToStartedPrometheus - eval_time: 2m - alertname: ThanosSidecarUnhealthy + alertname: ThanosSidecarNoConnectionToStartedPrometheus - eval_time: 3m - alertname: ThanosSidecarUnhealthy + alertname: ThanosSidecarNoConnectionToStartedPrometheus - eval_time: 10m - alertname: ThanosSidecarUnhealthy + alertname: ThanosSidecarNoConnectionToStartedPrometheus exp_alerts: - - exp_labels: - severity: critical - job: thanos-sidecar - instance: thanos-sidecar-0 - exp_annotations: - description: 'Thanos Sidecar thanos-sidecar-0 is unhealthy for more than 600 seconds.' - runbook_url: 'https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarunhealthy' - summary: 'Thanos Sidecar is unhealthy.' - exp_labels: severity: critical job: thanos-sidecar instance: thanos-sidecar-1 + namespace: production + pod: prometheus-1 exp_annotations: - description: 'Thanos Sidecar thanos-sidecar-1 is unhealthy for more than 600 seconds.' - runbook_url: 'https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarunhealthy' - summary: 'Thanos Sidecar is unhealthy.' + description: 'Thanos Sidecar thanos-sidecar-1 is unhealthy.' + runbook_url: 'https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarnoconnectiontostartedprometheus' + summary: 'Thanos Sidecar cannot access Prometheus, even though Prometheus seems healthy and has reloaded WAL.' - eval_time: 11m - alertname: ThanosSidecarUnhealthy + alertname: ThanosSidecarNoConnectionToStartedPrometheus exp_alerts: - - exp_labels: - severity: critical - job: thanos-sidecar - instance: thanos-sidecar-0 - exp_annotations: - description: 'Thanos Sidecar thanos-sidecar-0 is unhealthy for more than 660 seconds.' - runbook_url: 'https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarunhealthy' - summary: 'Thanos Sidecar is unhealthy.' - exp_labels: severity: critical job: thanos-sidecar instance: thanos-sidecar-1 + namespace: production + pod: prometheus-1 exp_annotations: - description: 'Thanos Sidecar thanos-sidecar-1 is unhealthy for more than 660 seconds.' - runbook_url: 'https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarunhealthy' - summary: 'Thanos Sidecar is unhealthy.' 
+ description: 'Thanos Sidecar thanos-sidecar-1 is unhealthy.' + runbook_url: 'https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarnoconnectiontostartedprometheus' + summary: 'Thanos Sidecar cannot access Prometheus, even though Prometheus seems healthy and has reloaded WAL.' - eval_time: 12m - alertname: ThanosSidecarUnhealthy + alertname: ThanosSidecarNoConnectionToStartedPrometheus exp_alerts: - exp_labels: severity: critical job: thanos-sidecar - instance: thanos-sidecar-0 + instance: thanos-sidecar-1 + namespace: production + pod: prometheus-1 exp_annotations: - description: 'Thanos Sidecar thanos-sidecar-0 is unhealthy for more than 720 seconds.' - runbook_url: 'https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarunhealthy' - summary: 'Thanos Sidecar is unhealthy.' + description: 'Thanos Sidecar thanos-sidecar-1 is unhealthy.' + runbook_url: 'https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarnoconnectiontostartedprometheus' + summary: 'Thanos Sidecar cannot access Prometheus, even though Prometheus seems healthy and has reloaded WAL.' + - eval_time: 20m + alertname: ThanosSidecarNoConnectionToStartedPrometheus + exp_alerts: - exp_labels: severity: critical job: thanos-sidecar - instance: thanos-sidecar-1 + instance: thanos-sidecar-0 + namespace: production + pod: prometheus-0 exp_annotations: - description: 'Thanos Sidecar thanos-sidecar-1 is unhealthy for more than 720 seconds.' - runbook_url: 'https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarunhealthy' - summary: 'Thanos Sidecar is unhealthy.' + description: 'Thanos Sidecar thanos-sidecar-0 is unhealthy.' + runbook_url: 'https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarnoconnectiontostartedprometheus' + summary: 'Thanos Sidecar cannot access Prometheus, even though Prometheus seems healthy and has reloaded WAL.' 
+ - interval: 1m input_series: - series: 'prometheus_rule_evaluations_total{namespace="production", job="thanos-ruler", instance="thanos-ruler-0"}' diff --git a/go.mod b/go.mod index 1863c17a30e..47f3b368166 100644 --- a/go.mod +++ b/go.mod @@ -19,9 +19,10 @@ require ( github.com/cespare/xxhash/v2 v2.1.2 github.com/chromedp/cdproto v0.0.0-20200424080200-0de008e41fa0 github.com/chromedp/chromedp v0.5.3 - github.com/cortexproject/cortex v1.10.1-0.20210820081236-70dddb6b70b8 + github.com/cortexproject/cortex v1.10.1-0.20211006150606-fb15b432e267 github.com/davecgh/go-spew v1.1.1 github.com/efficientgo/e2e v0.11.1-0.20210829161758-f4cc6dbdc6ea + github.com/efficientgo/tools/core v0.0.0-20210129205121-421d0828c9a6 github.com/efficientgo/tools/extkingpin v0.0.0-20210609125236-d73259166f20 github.com/facette/natsort v0.0.0-20181210072756-2cd4dd1e2dcb github.com/fatih/structtag v1.1.0 @@ -35,7 +36,6 @@ require ( github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da github.com/golang/snappy v0.0.4 github.com/googleapis/gax-go v2.0.2+incompatible - github.com/grafana/dskit v0.0.0-20210819132858-471020752967 github.com/grpc-ecosystem/go-grpc-middleware/providers/kit/v2 v2.0.0-20201002093600-73cf2ae9d891 github.com/grpc-ecosystem/go-grpc-middleware/v2 v2.0.0-rc.2.0.20201207153454-9f6bf00c00a7 github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0 @@ -56,7 +56,7 @@ require ( github.com/opentracing/opentracing-go v1.2.0 github.com/pkg/errors v0.9.1 github.com/pmezard/go-difflib v1.0.0 - github.com/prometheus/alertmanager v0.23.0 + github.com/prometheus/alertmanager v0.23.1-0.20210914172521-e35efbddb66a github.com/prometheus/client_golang v1.11.0 github.com/prometheus/client_model v0.2.0 github.com/prometheus/common v0.30.0 @@ -65,7 +65,7 @@ require ( github.com/tencentyun/cos-go-sdk-v5 v0.7.31 github.com/uber/jaeger-client-go v2.29.1+incompatible github.com/uber/jaeger-lib v2.4.1+incompatible - github.com/weaveworks/common v0.0.0-20210722103813-e649eff5ab4a + github.com/weaveworks/common v0.0.0-20210901124008-1fa3f9fa874c go.elastic.co/apm v1.11.0 go.elastic.co/apm/module/apmot v1.11.0 go.uber.org/atomic v1.9.0 diff --git a/go.sum b/go.sum index b6c5c119932..e76aa4c62f3 100644 --- a/go.sum +++ b/go.sum @@ -463,8 +463,8 @@ github.com/cortexproject/cortex v1.6.1-0.20210215155036-dfededd9f331/go.mod h1:8 github.com/cortexproject/cortex v1.7.1-0.20210224085859-66d6fb5b0d42/go.mod h1:u2dxcHInYbe45wxhLoWVdlFJyDhXewsMcxtnbq/QbH4= github.com/cortexproject/cortex v1.7.1-0.20210316085356-3fedc1108a49/go.mod h1:/DBOW8TzYBTE/U+O7Whs7i7E2eeeZl1iRVDtIqxn5kg= github.com/cortexproject/cortex v1.8.1-0.20210422151339-cf1c444e0905/go.mod h1:xxm4/CLvTmDxwE7yXwtClR4dIvkG4S09o5DygPOgc1U= -github.com/cortexproject/cortex v1.10.1-0.20210820081236-70dddb6b70b8 h1:3wtJ9PaFNIpBeSTjjhF7l4qTbvZf0BEX47TEAqqn6G0= -github.com/cortexproject/cortex v1.10.1-0.20210820081236-70dddb6b70b8/go.mod h1:F8PX2IHaeFvqCci46Y+fhskJkCtLvh0OqCKFtWyjP7w= +github.com/cortexproject/cortex v1.10.1-0.20211006150606-fb15b432e267 h1:IwLIfwD1AxH1hlO09m3vdj4cSnlqhgGQV5yVgxnBPjU= +github.com/cortexproject/cortex v1.10.1-0.20211006150606-fb15b432e267/go.mod h1:viwUqGbsFAHfsAGye0tUuyhKrbrlJc6LkvOXQ3j8xM4= github.com/cpuguy83/go-md2man/v2 v2.0.0-20190314233015-f79a8a8ca69d/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU= github.com/cpuguy83/go-md2man/v2 v2.0.0/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU= github.com/creack/pty v1.1.7/go.mod h1:lj5s0c3V2DBrqTV7llrYr5NG6My20zk30Fl46Y7DoTY= @@ -684,7 +684,6 @@ github.com/go-openapi/runtime 
v0.19.15/go.mod h1:dhGWCTKRXlAfGnQG0ONViOZpjfg0m2g github.com/go-openapi/runtime v0.19.16/go.mod h1:5P9104EJgYcizotuXhEuUrzVc+j1RiSjahULvYmlv98= github.com/go-openapi/runtime v0.19.24/go.mod h1:Lm9YGCeecBnUUkFTxPC4s1+lwrkJ0pthx8YvyjCfkgk= github.com/go-openapi/runtime v0.19.26/go.mod h1:BvrQtn6iVb2QmiVXRsFAm6ZCAZBpbVKFfN6QWCp582M= -github.com/go-openapi/runtime v0.19.28/go.mod h1:BvrQtn6iVb2QmiVXRsFAm6ZCAZBpbVKFfN6QWCp582M= github.com/go-openapi/runtime v0.19.29 h1:5IIvCaIDbxetN674vX9eOxvoZ9mYGQ16fV1Q0VSG+NA= github.com/go-openapi/runtime v0.19.29/go.mod h1:BvrQtn6iVb2QmiVXRsFAm6ZCAZBpbVKFfN6QWCp582M= github.com/go-openapi/spec v0.0.0-20160808142527-6aced65f8501/go.mod h1:J8+jY1nAiCcj+friV/PDoE1/3eeccG9LYBs0tYvLOWc= @@ -738,8 +737,8 @@ github.com/go-playground/locales v0.12.1/go.mod h1:IUMDtCfWo/w/mtMfIE/IG2K+Ey3yg github.com/go-playground/universal-translator v0.16.0/go.mod h1:1AnU7NaIRDWWzGEKwgtJRd2xk99HeFyHw3yid4rvQIY= github.com/go-redis/redis/v8 v8.0.0-beta.10.0.20200905143926-df7fe4e2ce72/go.mod h1:CJP1ZIHwhosNYwIdaHPZK9vHsM3+roNBaZ7U9Of1DXc= github.com/go-redis/redis/v8 v8.2.3/go.mod h1:ysgGY09J/QeDYbu3HikWEIPCwaeOkuNoTgKayTEaEOw= -github.com/go-redis/redis/v8 v8.9.0 h1:FTTbB7WqlXfVNdVv0SsxA+oVi0bAwit6bMe3IUucq2o= -github.com/go-redis/redis/v8 v8.9.0/go.mod h1:ik7vb7+gm8Izylxu6kf6wG26/t2VljgCfSQ1DM4O1uU= +github.com/go-redis/redis/v8 v8.11.4 h1:kHoYkfZP6+pe04aFTnhDH6GDROa5yJdHJVNxV3F46Tg= +github.com/go-redis/redis/v8 v8.11.4/go.mod h1:2Z2wHZXdQpCDXEGzqMockDpNyYvi2l4Pxt6RJr792+w= github.com/go-resty/resty/v2 v2.1.1-0.20191201195748-d7b97669fe48 h1:JVrqSeQfdhYRFk24TvhTZWU0q8lfCojxZQFi3Ou7+uY= github.com/go-resty/resty/v2 v2.1.1-0.20191201195748-d7b97669fe48/go.mod h1:dZGr0i9PLlaaTD4H/hoZIDjQ+r6xq8mgbRzHZf7f2J8= github.com/go-sql-driver/mysql v1.4.0/go.mod h1:zAC/RDZ24gD3HViQzih4MyKcchzm+sOG5ZlKdlhCg5w= @@ -747,6 +746,7 @@ github.com/go-sql-driver/mysql v1.4.1/go.mod h1:zAC/RDZ24gD3HViQzih4MyKcchzm+sOG github.com/go-sql-driver/mysql v1.5.0/go.mod h1:DCzpHaOWr8IXmIStZouvnhqoel9Qv2LBy8hT2VhHyBg= github.com/go-stack/stack v1.8.0 h1:5SgMzNM5HxrEjV0ww2lTmX6E2Izsfxas4+YHWRs3Lsk= github.com/go-stack/stack v1.8.0/go.mod h1:v0f6uXyyMGvRgIKkXu+yp6POWl0qKG85gN/melR3HDY= +github.com/go-task/slim-sprig v0.0.0-20210107165309-348f09dbbbc0/go.mod h1:fyg7847qk6SyHyPtNmDHnmrv/HOrqktSC+C9fM+CJOE= github.com/go-zookeeper/zk v1.0.2 h1:4mx0EYENAdX/B/rbunjlt5+4RTA/a9SMHBRuSKdGxPM= github.com/go-zookeeper/zk v1.0.2/go.mod h1:nOB03cncLtlp4t+UAkGSV+9beXP/akpekBwL+UX1Qcw= github.com/gobuffalo/attrs v0.0.0-20190224210810-a9411de4debd/go.mod h1:4duuawTqi2wkkpB4ePgWMaai6/Kc6WEz83bhFwpHzj0= @@ -953,9 +953,8 @@ github.com/gorilla/websocket v0.0.0-20170926233335-4201258b820c/go.mod h1:E7qHFY github.com/gorilla/websocket v1.4.0/go.mod h1:E7qHFY5m1UJ88s3WnNqhKjPHQ0heANvMoAMk2YaljkQ= github.com/gorilla/websocket v1.4.2 h1:+/TMaTYc4QFitKJxsQ7Yye35DkWvkdLcvGKqM+x0Ufc= github.com/gorilla/websocket v1.4.2/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE= -github.com/grafana/dskit v0.0.0-20210818123532-6645f87e9e12/go.mod h1:QaNAQaCSFOtG/NHf6Jd/zh67H25kkrVCq36U61Y2Mhw= -github.com/grafana/dskit v0.0.0-20210819132858-471020752967 h1:1Z8LpFZzzpqEK1pq1PU8UGbeUQubO1Idh+jt1XXwB8M= -github.com/grafana/dskit v0.0.0-20210819132858-471020752967/go.mod h1:uF46UNN1/feB1egpq8UGbBBKvJjGgZauW7pcVbeFLLM= +github.com/grafana/dskit v0.0.0-20210908150159-fcf48cb19aa4 h1:OwWd9nQZYfb01HTJjleuO8eOP5t6Hl2EqVjng6W1juc= +github.com/grafana/dskit v0.0.0-20210908150159-fcf48cb19aa4/go.mod h1:m3eHzwe5IT5eE2MI3Ena2ooU8+Hek8IiVXb9yJ1+0rs= 
github.com/gregjones/httpcache v0.0.0-20180305231024-9cad4c3443a7/go.mod h1:FecbI9+v66THATjSRHfNgh1IVFe/9kFxbXtjV0ctIMA= github.com/grpc-ecosystem/go-grpc-middleware v1.0.0/go.mod h1:FiyG127CGDf3tlThmgyCl78X/SZQqEOJBCDaAfeWzPs= github.com/grpc-ecosystem/go-grpc-middleware v1.0.1-0.20190118093823-f849b5445de4/go.mod h1:FiyG127CGDf3tlThmgyCl78X/SZQqEOJBCDaAfeWzPs= @@ -1247,7 +1246,6 @@ github.com/miekg/dns v1.1.31/go.mod h1:KNUDUusw/aVsxyTYZM1oqvCicbwhgbNgztCETuNZ7 github.com/miekg/dns v1.1.35/go.mod h1:KNUDUusw/aVsxyTYZM1oqvCicbwhgbNgztCETuNZ7xM= github.com/miekg/dns v1.1.38/go.mod h1:KNUDUusw/aVsxyTYZM1oqvCicbwhgbNgztCETuNZ7xM= github.com/miekg/dns v1.1.41/go.mod h1:p6aan82bvRIyn+zDIv9xYNUpwa73JcSh9BKwknJysuI= -github.com/miekg/dns v1.1.42/go.mod h1:+evo5L0630/F6ca/Z9+GAqzhjGyn8/c+TBaOyfEl0V4= github.com/miekg/dns v1.1.43 h1:JKfpVSCB84vrAmHzyrsxB5NAr5kLoMXZArPSw7Qlgyg= github.com/miekg/dns v1.1.43/go.mod h1:+evo5L0630/F6ca/Z9+GAqzhjGyn8/c+TBaOyfEl0V4= github.com/miekg/pkcs11 v1.0.3/go.mod h1:XsNlhZGX73bx86s2hdc/FuaLm2CPZJemRLMA+WTFxgs= @@ -1327,8 +1325,9 @@ github.com/ncw/swift v1.0.52 h1:ACF3JufDGgeKp/9mrDgQlEgS8kRYC4XKcuzj/8EJjQU= github.com/ncw/swift v1.0.52/go.mod h1:23YIA4yWVnGwv2dQlN4bB7egfYX6YLn0Yo/S6zZO/ZM= github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e h1:fD57ERR4JtEqsWbfPhv4DMiApHyliiK5xCTNVSPiaAs= github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e/go.mod h1:zD1mROLANZcx1PVRCS0qkT7pwLkGfwJo4zjcN/Tysno= -github.com/nxadm/tail v1.4.4 h1:DQuhQpB1tVlglWS2hLQ5OV6B5r8aGxSrPc5Qo6uTN78= github.com/nxadm/tail v1.4.4/go.mod h1:kenIhsEOeOJmVchQTgglprH7qJGnHDVpk1VPCcaMI8A= +github.com/nxadm/tail v1.4.8 h1:nPr65rt6Y5JFSKQO7qToXr7pePgD6Gwiw05lkbyAQTE= +github.com/nxadm/tail v1.4.8/go.mod h1:+ncqLTQzXmGhMZNUePPaPqPvBxHAIsmXswZKocGu+AU= github.com/oklog/oklog v0.3.2/go.mod h1:FCV+B7mhrz4o+ueLpx+KqkyXRGMWOYEvfiXtdGtbWGs= github.com/oklog/run v1.0.0/go.mod h1:dlhp/R75TPv97u0XWUtDeV/lRKWPKSdTuV0TZvrmrQA= github.com/oklog/run v1.1.0 h1:GEenZ1cK0+q0+wsJew9qUg/DyD8k3JzYsZAi5gYi2mA= @@ -1349,8 +1348,8 @@ github.com/onsi/ginkgo v1.11.0/go.mod h1:lLunBs/Ym6LB5Z9jYTR76FiuTmxDTDusOGeTQH+ github.com/onsi/ginkgo v1.12.1/go.mod h1:zj2OWP4+oCPe1qIXoGWkgMRwljMUYCdkwsT2108oapk= github.com/onsi/ginkgo v1.14.0/go.mod h1:iSB4RoI2tjJc9BBv4NKIKWKya62Rps+oPG/Lv9klQyY= github.com/onsi/ginkgo v1.14.1/go.mod h1:iSB4RoI2tjJc9BBv4NKIKWKya62Rps+oPG/Lv9klQyY= -github.com/onsi/ginkgo v1.15.0 h1:1V1NfVQR87RtWAgp1lv9JZJ5Jap+XFGKPi00andXGi4= -github.com/onsi/ginkgo v1.15.0/go.mod h1:hF8qUzuuC8DJGygJH3726JnCZX4MYbRB8yFfISqnKUg= +github.com/onsi/ginkgo v1.16.4 h1:29JGrr5oVBm5ulCWet69zQkzWipVXIol6ygQUe/EzNc= +github.com/onsi/ginkgo v1.16.4/go.mod h1:dX+/inL/fNMqNlz0e9LfyB9TswhZpCVdJM/Z6Vvnwo0= github.com/onsi/gomega v0.0.0-20151007035656-2152b45fa28a/go.mod h1:C1qb7wdrVGGVU+Z6iS04AVkA3Q65CEZX59MT0QO5uiA= github.com/onsi/gomega v0.0.0-20170829124025-dcabb60a477c/go.mod h1:C1qb7wdrVGGVU+Z6iS04AVkA3Q65CEZX59MT0QO5uiA= github.com/onsi/gomega v1.4.2/go.mod h1:ex+gbHU/CVuBBDIJjb2X0qEXbFg53c61hWP/1CpauHY= @@ -1360,8 +1359,8 @@ github.com/onsi/gomega v1.7.1/go.mod h1:XdKZgCCFLUoM/7CFJVPcG8C1xQ1AJ0vpAezJrB7J github.com/onsi/gomega v1.10.1/go.mod h1:iN09h71vgCQne3DLsj+A5owkum+a2tYe+TOCB1ybHNo= github.com/onsi/gomega v1.10.2/go.mod h1:iN09h71vgCQne3DLsj+A5owkum+a2tYe+TOCB1ybHNo= github.com/onsi/gomega v1.10.3/go.mod h1:V9xEwhxec5O8UDM77eCW8vLymOMltsqPVYWrpDsH8xc= -github.com/onsi/gomega v1.10.5 h1:7n6FEkpFmfCoo2t+YYqXH0evK+a9ICQz0xcAy9dYcaQ= -github.com/onsi/gomega v1.10.5/go.mod 
h1:gza4q3jKQJijlu05nKWRCW/GavJumGt8aNRxWg7mt48= +github.com/onsi/gomega v1.16.0 h1:6gjqkI8iiRHMvdccRJM8rVKjCWk6ZIm6FTm3ddIe4/c= +github.com/onsi/gomega v1.16.0/go.mod h1:HnhC7FXeEQY45zxNK3PPoIUhzk/80Xly9PcubAlGdZY= github.com/op/go-logging v0.0.0-20160315200505-970db520ece7/go.mod h1:HzydrMdWErDVzsI23lYNej1Htcns9BCg93Dk0bBINWk= github.com/opencontainers/go-digest v0.0.0-20170106003457-a6d0ee40d420/go.mod h1:cMLVZDEM3+U2I4VmLI6N8jQYUd2OVphdqWwCJHrFt2s= github.com/opencontainers/go-digest v0.0.0-20180430190053-c9281466c8b2/go.mod h1:cMLVZDEM3+U2I4VmLI6N8jQYUd2OVphdqWwCJHrFt2s= @@ -1440,9 +1439,9 @@ github.com/prometheus/alertmanager v0.21.1-0.20200911160112-1fdff6b3f939/go.mod github.com/prometheus/alertmanager v0.21.1-0.20201106142418-c39b78780054/go.mod h1:imXRHOP6QTsE0fFsIsAV/cXimS32m7gVZOiUj11m6Ig= github.com/prometheus/alertmanager v0.21.1-0.20210310093010-0f9cab6991e6/go.mod h1:MTqVn+vIupE0dzdgo+sMcNCp37SCAi8vPrvKTTnTz9g= github.com/prometheus/alertmanager v0.21.1-0.20210422101724-8176f78a70e1/go.mod h1:gsEqwD5BHHW9RNKvCuPOrrTMiP5I+faJUyLXvnivHik= -github.com/prometheus/alertmanager v0.22.3-0.20210726110322-3d86bd709df8/go.mod h1:BBhEP06PwDGsIKsQzOeTNe2jU6tU19SzhJ41C2ib4XE= -github.com/prometheus/alertmanager v0.23.0 h1:KIb9IChC3kg+1CC388qfr7bsT+tARpQqdsCMoatdObA= github.com/prometheus/alertmanager v0.23.0/go.mod h1:0MLTrjQI8EuVmvykEhcfr/7X0xmaDAZrqMgxIq3OXHk= +github.com/prometheus/alertmanager v0.23.1-0.20210914172521-e35efbddb66a h1:qroc/F4ygaQ0uc2S+Pyk/exMwnSpokGyN1QjfZ1DiWU= +github.com/prometheus/alertmanager v0.23.1-0.20210914172521-e35efbddb66a/go.mod h1:U7pGu+z7A9ZKhK8lq1MvIOp5GdVlZjwOYk+S0h3LSbA= github.com/prometheus/client_golang v0.0.0-20180209125602-c332b6f63c06/go.mod h1:7SWBe2y4D6OKWSNQJUaRYU/AaXPKyh/dDVn+NZz0KFw= github.com/prometheus/client_golang v0.8.0/go.mod h1:7SWBe2y4D6OKWSNQJUaRYU/AaXPKyh/dDVn+NZz0KFw= github.com/prometheus/client_golang v0.9.1/go.mod h1:7SWBe2y4D6OKWSNQJUaRYU/AaXPKyh/dDVn+NZz0KFw= @@ -1657,7 +1656,6 @@ github.com/thanos-io/thanos v0.13.1-0.20210204123931-82545cdd16fe/go.mod h1:ZLDG github.com/thanos-io/thanos v0.13.1-0.20210224074000-659446cab117/go.mod h1:kdqFpzdkveIKpNNECVJd75RPvgsAifQgJymwCdfev1w= github.com/thanos-io/thanos v0.13.1-0.20210226164558-03dace0a1aa1/go.mod h1:gMCy4oCteKTT7VuXVvXLTPGzzjovX1VPE5p+HgL1hyU= github.com/thanos-io/thanos v0.13.1-0.20210401085038-d7dff0c84d17/go.mod h1:zU8KqE+6A+HksK4wiep8e/3UvCZLm+Wrw9AqZGaAm9k= -github.com/thanos-io/thanos v0.19.1-0.20210729154440-aa148f8fdb28/go.mod h1:Xskx78e0CYL6w0yDNOZHGdvwQMlsuzPsePmPtbp9Xuk= github.com/thanos-io/thanos v0.22.0/go.mod h1:SZDWz3phcUcBr4MYFoPFRvl+Z9Nbi45HlwQlwSZSt+Q= github.com/themihai/gomemcache v0.0.0-20180902122335-24332e2d58ab h1:7ZR3hmisBWw77ZpO1/o86g+JV3VKlk3d48jopJxzTjU= github.com/themihai/gomemcache v0.0.0-20180902122335-24332e2d58ab/go.mod h1:eheTFp954zcWZXCU8d0AT76ftsQOTo4DTqkN/h3k1MY= @@ -1709,8 +1707,9 @@ github.com/weaveworks/common v0.0.0-20200914083218-61ffdd448099/go.mod h1:hz10LO github.com/weaveworks/common v0.0.0-20201119133501-0619918236ec/go.mod h1:ykzWac1LtVfOxdCK+jD754at1Ws9dKCwFeUzkFBffPs= github.com/weaveworks/common v0.0.0-20210112142934-23c8d7fa6120/go.mod h1:ykzWac1LtVfOxdCK+jD754at1Ws9dKCwFeUzkFBffPs= github.com/weaveworks/common v0.0.0-20210419092856-009d1eebd624/go.mod h1:ykzWac1LtVfOxdCK+jD754at1Ws9dKCwFeUzkFBffPs= -github.com/weaveworks/common v0.0.0-20210722103813-e649eff5ab4a h1:ALomSnvy/NPeVoc4a1o7keaHHgLS76r9ZYIlwWWF+KA= github.com/weaveworks/common v0.0.0-20210722103813-e649eff5ab4a/go.mod 
h1:YU9FvnS7kUnRt6HY10G+2qHkwzP3n3Vb1XsXDsJTSp8= +github.com/weaveworks/common v0.0.0-20210901124008-1fa3f9fa874c h1:+yzwVr4/12cUgsdjbEHq6MsKB7jWBZpZccAP6xvqTzQ= +github.com/weaveworks/common v0.0.0-20210901124008-1fa3f9fa874c/go.mod h1:YU9FvnS7kUnRt6HY10G+2qHkwzP3n3Vb1XsXDsJTSp8= github.com/weaveworks/promrus v1.2.0 h1:jOLf6pe6/vss4qGHjXmGz4oDJQA+AOCqEL3FvvZGz7M= github.com/weaveworks/promrus v1.2.0/go.mod h1:SaE82+OJ91yqjrE1rsvBWVzNZKcHYFtMUyS1+Ogs/KA= github.com/willf/bitset v1.1.9/go.mod h1:RjeCKbqT1RxIR/KWY6phxZiaY1IyutSBfGjNPySAYV4= @@ -2009,10 +2008,10 @@ golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v golang.org/x/net v0.0.0-20210316092652-d523dce5a7f4/go.mod h1:RBQZq4jEuRlivfhVLdyRGr576XBO4/greRjx4P4O3yc= golang.org/x/net v0.0.0-20210324051636-2c4c8ecb7826/go.mod h1:RBQZq4jEuRlivfhVLdyRGr576XBO4/greRjx4P4O3yc= golang.org/x/net v0.0.0-20210405180319-a5a99cb37ef4/go.mod h1:p54w0d4576C0XHj96bSt6lcn1PtDYWL6XObtHCRCNQM= +golang.org/x/net v0.0.0-20210428140749-89ef3d95e781/go.mod h1:OJAsFXCWl8Ukc7SiCT/9KSuxbyM7479/AVlXFRxuMCk= golang.org/x/net v0.0.0-20210503060351-7fd8e65b6420/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y= golang.org/x/net v0.0.0-20210520170846-37e1c6afe023/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y= golang.org/x/net v0.0.0-20210525063256-abc453219eb5/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y= -golang.org/x/net v0.0.0-20210610132358-84b48f89b13b/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y= golang.org/x/net v0.0.0-20210614182718-04defd469f4e/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y= golang.org/x/net v0.0.0-20210726213435-c6fcb2dbf985/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y= golang.org/x/net v0.0.0-20210903162142-ad29c8ab022f h1:w6wWR0H+nyVpbSAQbzVEIACVyr/h8l/BEkY6Sokc7Eg= diff --git a/mixin/README.md b/mixin/README.md index 4bb7ce797c8..baef01946cd 100644 --- a/mixin/README.md +++ b/mixin/README.md @@ -106,6 +106,7 @@ This project is intended to be used as a library. You can extend and customize d }, sidecar+:: { selector: 'job=~".*thanos-sidecar.*"', + thanosPrometheusCommonDimensions: 'namespace, pod', title: '%(prefix)sSidecar' % $.dashboard.prefix, }, // TODO(kakkoyun): Fix naming convention: bucketReplicate diff --git a/mixin/alerts/sidecar.libsonnet b/mixin/alerts/sidecar.libsonnet index b4682106192..5bdea985ab1 100644 --- a/mixin/alerts/sidecar.libsonnet +++ b/mixin/alerts/sidecar.libsonnet @@ -2,6 +2,7 @@ local thanos = self, sidecar+:: { selector: error 'must provide selector for Thanos Sidecar alerts', + thanosPrometheusCommonDimensions: error 'must provide commonDimensions between Thanos and Prometheus metrics for Sidecar alerts', dimensions: std.join(', ', std.objectFields(thanos.targetGroups) + ['job', 'instance']), }, prometheusAlerts+:: { @@ -10,20 +11,6 @@ { name: 'thanos-sidecar', rules: [ - { - alert: 'ThanosSidecarPrometheusDown', - annotations: { - description: 'Thanos Sidecar {{$labels.instance}}%s cannot connect to Prometheus.' % location, - summary: 'Thanos Sidecar cannot connect to Prometheus', - }, - expr: ||| - thanos_sidecar_prometheus_up{%(selector)s} == 0 - ||| % thanos.sidecar, - 'for': '5m', - labels: { - severity: 'critical', - }, - }, { alert: 'ThanosSidecarBucketOperationsFailed', annotations: { @@ -39,13 +26,15 @@ }, }, { - alert: 'ThanosSidecarUnhealthy', + alert: 'ThanosSidecarNoConnectionToStartedPrometheus', annotations: { - description: 'Thanos Sidecar {{$labels.instance}}%s is unhealthy for more than {{$value}} seconds.' 
% location, - summary: 'Thanos Sidecar is unhealthy.', + description: 'Thanos Sidecar {{$labels.instance}}%s is unhealthy.' % location, + summary: 'Thanos Sidecar cannot access Prometheus, even though Prometheus seems healthy and has reloaded WAL.', }, expr: ||| - time() - max by (%(dimensions)s) (thanos_sidecar_last_heartbeat_success_time_seconds{%(selector)s}) >= 240 + thanos_sidecar_prometheus_up{%(selector)s} == 0 + AND on (%(thanosPrometheusCommonDimensions)s) + prometheus_tsdb_data_replay_duration_seconds != 0 ||| % thanos.sidecar, 'for': '5m', labels: { diff --git a/mixin/config.libsonnet b/mixin/config.libsonnet index 634f2c7d6cd..e4d415d5ef9 100644 --- a/mixin/config.libsonnet +++ b/mixin/config.libsonnet @@ -46,6 +46,7 @@ }, sidecar+:: { selector: 'job=~".*thanos-sidecar.*"', + thanosPrometheusCommonDimensions: 'namespace, pod', title: '%(prefix)sSidecar' % $.dashboard.prefix, }, // TODO(kakkoyun): Fix naming convention: bucketReplicate diff --git a/mixin/runbook.md b/mixin/runbook.md index 03f92aed716..98e76b97820 100755 --- a/mixin/runbook.md +++ b/mixin/runbook.md @@ -85,9 +85,8 @@ |Name|Summary|Description|Severity|Runbook| |---|---|---|---|---| -|ThanosSidecarPrometheusDown|Thanos Sidecar cannot connect to Prometheus|Thanos Sidecar {{$labels.instance}} cannot connect to Prometheus.|critical|[https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarprometheusdown](https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarprometheusdown)| |ThanosSidecarBucketOperationsFailed|Thanos Sidecar bucket operations are failing|Thanos Sidecar {{$labels.instance}} bucket operations are failing|critical|[https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarbucketoperationsfailed](https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarbucketoperationsfailed)| -|ThanosSidecarUnhealthy|Thanos Sidecar is unhealthy.|Thanos Sidecar {{$labels.instance}} is unhealthy for more than {{$value}} seconds.|critical|[https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarunhealthy](https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarunhealthy)| +|ThanosSidecarNoConnectionToStartedPrometheus|Thanos Sidecar cannot access Prometheus, even though Prometheus seems healthy and has reloaded WAL.|Thanos Sidecar {{$labels.instance}} is unhealthy.|critical|[https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarnoconnectiontostartedprometheus](https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanossidecarnoconnectiontostartedprometheus)| ## thanos-store diff --git a/pkg/alert/config.go b/pkg/alert/config.go index b6d5c0b33ee..1572821cf28 100644 --- a/pkg/alert/config.go +++ b/pkg/alert/config.go @@ -13,10 +13,10 @@ import ( "github.com/pkg/errors" "github.com/prometheus/common/model" "github.com/prometheus/prometheus/pkg/relabel" + "github.com/thanos-io/thanos/pkg/httpconfig" "gopkg.in/yaml.v2" "github.com/thanos-io/thanos/pkg/discovery/dns" - http_util "github.com/thanos-io/thanos/pkg/http" ) type AlertingConfig struct { @@ -25,10 +25,10 @@ type AlertingConfig struct { // AlertmanagerConfig represents a client to a cluster of Alertmanager endpoints. 
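+// A minimal sketch of the YAML this struct accepts (values are illustrative; the
+// static_configs and scheme field names are assumed from the yaml tags of the inlined EndpointsConfig):
+//
+//   http_config:
+//     basic_auth:
+//       username: user
+//       password: pass
+//   static_configs: ["alertmanager.example.com:9093"]
+//   scheme: http
+//   timeout: 10s
+//   api_version: v1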
type AlertmanagerConfig struct { - HTTPClientConfig http_util.ClientConfig `yaml:"http_config"` - EndpointsConfig http_util.EndpointsConfig `yaml:",inline"` - Timeout model.Duration `yaml:"timeout"` - APIVersion APIVersion `yaml:"api_version"` + HTTPClientConfig httpconfig.ClientConfig `yaml:"http_config"` + EndpointsConfig httpconfig.EndpointsConfig `yaml:",inline"` + Timeout model.Duration `yaml:"timeout"` + APIVersion APIVersion `yaml:"api_version"` } // APIVersion represents the API version of the Alertmanager endpoint. @@ -61,10 +61,10 @@ func (v *APIVersion) UnmarshalYAML(unmarshal func(interface{}) error) error { func DefaultAlertmanagerConfig() AlertmanagerConfig { return AlertmanagerConfig{ - EndpointsConfig: http_util.EndpointsConfig{ + EndpointsConfig: httpconfig.EndpointsConfig{ Scheme: "http", StaticAddresses: []string{}, - FileSDConfigs: []http_util.FileSDConfig{}, + FileSDConfigs: []httpconfig.FileSDConfig{}, }, Timeout: model.Duration(time.Second * 10), APIVersion: APIv1, @@ -111,7 +111,7 @@ func BuildAlertmanagerConfig(address string, timeout time.Duration) (Alertmanage break } } - var basicAuth http_util.BasicAuth + var basicAuth httpconfig.BasicAuth if parsed.User != nil && parsed.User.String() != "" { basicAuth.Username = parsed.User.Username() pw, _ := parsed.User.Password() @@ -119,10 +119,10 @@ func BuildAlertmanagerConfig(address string, timeout time.Duration) (Alertmanage } return AlertmanagerConfig{ - HTTPClientConfig: http_util.ClientConfig{ + HTTPClientConfig: httpconfig.ClientConfig{ BasicAuth: basicAuth, }, - EndpointsConfig: http_util.EndpointsConfig{ + EndpointsConfig: httpconfig.EndpointsConfig{ PathPrefix: parsed.Path, Scheme: scheme, StaticAddresses: []string{host}, diff --git a/pkg/alert/config_test.go b/pkg/alert/config_test.go index 71aaee399cf..11920a342de 100644 --- a/pkg/alert/config_test.go +++ b/pkg/alert/config_test.go @@ -9,7 +9,7 @@ import ( "gopkg.in/yaml.v2" - "github.com/thanos-io/thanos/pkg/http" + "github.com/thanos-io/thanos/pkg/httpconfig" "github.com/thanos-io/thanos/pkg/testutil" ) @@ -54,7 +54,7 @@ func TestBuildAlertmanagerConfiguration(t *testing.T) { { address: "http://localhost:9093", expected: AlertmanagerConfig{ - EndpointsConfig: http.EndpointsConfig{ + EndpointsConfig: httpconfig.EndpointsConfig{ StaticAddresses: []string{"localhost:9093"}, Scheme: "http", }, @@ -64,7 +64,7 @@ func TestBuildAlertmanagerConfiguration(t *testing.T) { { address: "https://am.example.com", expected: AlertmanagerConfig{ - EndpointsConfig: http.EndpointsConfig{ + EndpointsConfig: httpconfig.EndpointsConfig{ StaticAddresses: []string{"am.example.com"}, Scheme: "https", }, @@ -74,7 +74,7 @@ func TestBuildAlertmanagerConfiguration(t *testing.T) { { address: "dns+http://localhost:9093", expected: AlertmanagerConfig{ - EndpointsConfig: http.EndpointsConfig{ + EndpointsConfig: httpconfig.EndpointsConfig{ StaticAddresses: []string{"dns+localhost:9093"}, Scheme: "http", }, @@ -84,7 +84,7 @@ func TestBuildAlertmanagerConfiguration(t *testing.T) { { address: "dnssrv+http://localhost", expected: AlertmanagerConfig{ - EndpointsConfig: http.EndpointsConfig{ + EndpointsConfig: httpconfig.EndpointsConfig{ StaticAddresses: []string{"dnssrv+localhost"}, Scheme: "http", }, @@ -94,7 +94,7 @@ func TestBuildAlertmanagerConfiguration(t *testing.T) { { address: "ssh+http://localhost", expected: AlertmanagerConfig{ - EndpointsConfig: http.EndpointsConfig{ + EndpointsConfig: httpconfig.EndpointsConfig{ StaticAddresses: []string{"localhost"}, Scheme: "ssh+http", }, @@ -104,7 
+104,7 @@ func TestBuildAlertmanagerConfiguration(t *testing.T) { { address: "dns+https://localhost/path/prefix/", expected: AlertmanagerConfig{ - EndpointsConfig: http.EndpointsConfig{ + EndpointsConfig: httpconfig.EndpointsConfig{ StaticAddresses: []string{"dns+localhost:9093"}, Scheme: "https", PathPrefix: "/path/prefix/", @@ -115,13 +115,13 @@ func TestBuildAlertmanagerConfiguration(t *testing.T) { { address: "http://user:pass@localhost:9093", expected: AlertmanagerConfig{ - HTTPClientConfig: http.ClientConfig{ - BasicAuth: http.BasicAuth{ + HTTPClientConfig: httpconfig.ClientConfig{ + BasicAuth: httpconfig.BasicAuth{ Username: "user", Password: "pass", }, }, - EndpointsConfig: http.EndpointsConfig{ + EndpointsConfig: httpconfig.EndpointsConfig{ StaticAddresses: []string{"localhost:9093"}, Scheme: "http", }, diff --git a/pkg/api/query/v1.go b/pkg/api/query/v1.go index 4f3866b62b7..fd9ee99b683 100644 --- a/pkg/api/query/v1.go +++ b/pkg/api/query/v1.go @@ -93,8 +93,8 @@ type QueryAPI struct { enableExemplarPartialResponse bool disableCORS bool - replicaLabels []string - endpointSet *query.EndpointSet + replicaLabels []string + endpointStatus func() []query.EndpointStatus defaultRangeQueryStep time.Duration defaultInstantQueryMaxSourceResolution time.Duration @@ -106,7 +106,7 @@ type QueryAPI struct { // NewQueryAPI returns an initialized QueryAPI type. func NewQueryAPI( logger log.Logger, - endpointSet *query.EndpointSet, + endpointStatus func() []query.EndpointStatus, qe func(int64) *promql.Engine, c query.QueryableCreator, ruleGroups rules.UnaryClient, @@ -146,7 +146,7 @@ func NewQueryAPI( enableMetricMetadataPartialResponse: enableMetricMetadataPartialResponse, enableExemplarPartialResponse: enableExemplarPartialResponse, replicaLabels: replicaLabels, - endpointSet: endpointSet, + endpointStatus: endpointStatus, defaultRangeQueryStep: defaultRangeQueryStep, defaultInstantQueryMaxSourceResolution: defaultInstantQueryMaxSourceResolution, defaultMetadataTimeRange: defaultMetadataTimeRange, @@ -715,7 +715,11 @@ func (qapi *QueryAPI) labelNames(r *http.Request) (interface{}, []error, *api.Ap func (qapi *QueryAPI) stores(_ *http.Request) (interface{}, []error, *api.ApiError) { statuses := make(map[string][]query.EndpointStatus) - for _, status := range qapi.endpointSet.GetEndpointStatus() { + for _, status := range qapi.endpointStatus() { + // Don't consider an endpoint if we cannot retrieve component type. 
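+		// A nil component means we never obtained metadata for this endpoint, so
+		// skipping it avoids a nil pointer dereference on ComponentType.String() below.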
+ if status.ComponentType == nil { + continue + } statuses[status.ComponentType.String()] = append(statuses[status.ComponentType.String()], status) } return statuses, nil, nil diff --git a/pkg/api/query/v1_test.go b/pkg/api/query/v1_test.go index a9f0648f425..218498fe819 100644 --- a/pkg/api/query/v1_test.go +++ b/pkg/api/query/v1_test.go @@ -1201,6 +1201,93 @@ func TestMetadataEndpoints(t *testing.T) { } } +func TestStoresEndpoint(t *testing.T) { + apiWithNotEndpoints := &QueryAPI{ + endpointStatus: func() []query.EndpointStatus { + return []query.EndpointStatus{} + }, + } + apiWithValidEndpoints := &QueryAPI{ + endpointStatus: func() []query.EndpointStatus { + return []query.EndpointStatus{ + { + Name: "endpoint-1", + ComponentType: component.Store, + }, + { + Name: "endpoint-2", + ComponentType: component.Store, + }, + { + Name: "endpoint-3", + ComponentType: component.Sidecar, + }, + } + }, + } + apiWithInvalidEndpoint := &QueryAPI{ + endpointStatus: func() []query.EndpointStatus { + return []query.EndpointStatus{ + { + Name: "endpoint-1", + ComponentType: component.Store, + }, + { + Name: "endpoint-2", + }, + } + }, + } + + testCases := []endpointTestCase{ + { + endpoint: apiWithNotEndpoints.stores, + method: http.MethodGet, + response: map[string][]query.EndpointStatus{}, + }, + { + endpoint: apiWithValidEndpoints.stores, + method: http.MethodGet, + response: map[string][]query.EndpointStatus{ + "store": { + { + Name: "endpoint-1", + ComponentType: component.Store, + }, + { + Name: "endpoint-2", + ComponentType: component.Store, + }, + }, + "sidecar": { + { + Name: "endpoint-3", + ComponentType: component.Sidecar, + }, + }, + }, + }, + { + endpoint: apiWithInvalidEndpoint.stores, + method: http.MethodGet, + response: map[string][]query.EndpointStatus{ + "store": { + { + Name: "endpoint-1", + ComponentType: component.Store, + }, + }, + }, + }, + } + + for i, test := range testCases { + if ok := testEndpoint(t, test, strings.TrimSpace(fmt.Sprintf("#%d %s", i, test.query.Encode())), reflect.DeepEqual); !ok { + return + } + } +} + func TestParseTime(t *testing.T) { ts, err := time.Parse(time.RFC3339Nano, "2015-06-03T13:21:58.555Z") if err != nil { diff --git a/pkg/block/fetcher.go b/pkg/block/fetcher.go index f9f45202d99..77091ecbffc 100644 --- a/pkg/block/fetcher.go +++ b/pkg/block/fetcher.go @@ -660,9 +660,9 @@ func (f *DeduplicateFilter) DuplicateIDs() []ulid.ULID { func addNodeBySources(root, add *Node) bool { var rootNode *Node + childSources := add.Compaction.Sources for _, node := range root.Children { parentSources := node.Compaction.Sources - childSources := add.Compaction.Sources // Block exists with same sources, add as child. if contains(parentSources, childSources) && contains(childSources, parentSources) { diff --git a/pkg/block/index.go b/pkg/block/index.go index 851dfa9d98a..4e62cdcfca6 100644 --- a/pkg/block/index.go +++ b/pkg/block/index.go @@ -328,6 +328,7 @@ func GatherIndexHealthStats(logger log.Logger, fn string, minTime, maxTime int64 if ooo > 0 { stats.OutOfOrderSeries++ stats.OutOfOrderChunks += ooo + level.Debug(logger).Log("msg", "found out of order series", "labels", lset) } seriesChunks.Add(int64(len(chks))) diff --git a/pkg/cache/cache.go b/pkg/cache/cache.go index acaa0e159d3..24e52aac00e 100644 --- a/pkg/cache/cache.go +++ b/pkg/cache/cache.go @@ -18,4 +18,6 @@ type Cache interface { // Fetch multiple keys from cache. Returns map of input keys to data. // If key isn't in the map, data for given key was not found. 
Fetch(ctx context.Context, keys []string) map[string][]byte + + Name() string } diff --git a/pkg/cache/inmemory.go b/pkg/cache/inmemory.go index 6a6036a03a8..8ddfe9f0b93 100644 --- a/pkg/cache/inmemory.go +++ b/pkg/cache/inmemory.go @@ -41,6 +41,7 @@ type InMemoryCache struct { logger log.Logger maxSizeBytes uint64 maxItemSizeBytes uint64 + name string mtx sync.Mutex curSize uint64 @@ -100,6 +101,7 @@ func NewInMemoryCacheWithConfig(name string, logger log.Logger, reg prometheus.R logger: logger, maxSizeBytes: uint64(config.MaxSize), maxItemSizeBytes: uint64(config.MaxItemSize), + name: name, } c.evicted = promauto.With(reg).NewCounter(prometheus.CounterOpts{ @@ -303,3 +305,7 @@ func (c *InMemoryCache) Fetch(ctx context.Context, keys []string) map[string][]b } return results } + +func (c *InMemoryCache) Name() string { + return c.name +} diff --git a/pkg/cache/memcached.go b/pkg/cache/memcached.go index 04c249576e2..695b78e6cd8 100644 --- a/pkg/cache/memcached.go +++ b/pkg/cache/memcached.go @@ -19,6 +19,7 @@ import ( type MemcachedCache struct { logger log.Logger memcached cacheutil.MemcachedClient + name string // Metrics. requests prometheus.Counter @@ -30,6 +31,7 @@ func NewMemcachedCache(name string, logger log.Logger, memcached cacheutil.Memca c := &MemcachedCache{ logger: logger, memcached: memcached, + name: name, } c.requests = promauto.With(reg).NewCounter(prometheus.CounterOpts{ @@ -81,3 +83,7 @@ func (c *MemcachedCache) Fetch(ctx context.Context, keys []string) map[string][] c.hits.Add(float64(len(results))) return results } + +func (c *MemcachedCache) Name() string { + return c.name +} diff --git a/pkg/cache/tracing_cache.go b/pkg/cache/tracing_cache.go index ed440c43ad5..81fb5bef8e2 100644 --- a/pkg/cache/tracing_cache.go +++ b/pkg/cache/tracing_cache.go @@ -27,6 +27,7 @@ func (t TracingCache) Store(ctx context.Context, data map[string][]byte, ttl tim func (t TracingCache) Fetch(ctx context.Context, keys []string) (result map[string][]byte) { tracing.DoWithSpan(ctx, "cache_fetch", func(spanCtx context.Context, span opentracing.Span) { + span.SetTag("name", t.Name()) span.LogKV("requested keys", len(keys)) result = t.c.Fetch(spanCtx, keys) @@ -39,3 +40,7 @@ func (t TracingCache) Fetch(ctx context.Context, keys []string) (result map[stri }) return } + +func (t TracingCache) Name() string { + return t.c.Name() +} diff --git a/pkg/compact/compact.go b/pkg/compact/compact.go index 547eae57f2c..2493a4a471c 100644 --- a/pkg/compact/compact.go +++ b/pkg/compact/compact.go @@ -1030,6 +1030,10 @@ func (c *BucketCompactor) Compact(ctx context.Context) (rerr error) { var groupErrs errutil.MultiError groupLoop: for _, g := range groups { + // Ignore groups with only one block because there is nothing to compact. + if len(g.IDs()) == 1 { + continue + } select { case groupErr := <-errChan: groupErrs.Add(groupErr) diff --git a/pkg/compact/compact_e2e_test.go b/pkg/compact/compact_e2e_test.go index 997b635a740..8b7843842ba 100644 --- a/pkg/compact/compact_e2e_test.go +++ b/pkg/compact/compact_e2e_test.go @@ -332,15 +332,13 @@ func testGroupCompactE2e(t *testing.T, mergeFunc storage.VerticalChunkSeriesMerg testutil.Equals(t, 4, MetricCount(grouper.compactionRunsStarted)) testutil.Equals(t, 3.0, promtest.ToFloat64(grouper.compactionRunsStarted.WithLabelValues(DefaultGroupKey(metas[0].Thanos)))) testutil.Equals(t, 3.0, promtest.ToFloat64(grouper.compactionRunsStarted.WithLabelValues(DefaultGroupKey(metas[7].Thanos)))) - // TODO(bwplotka): Looks like we do some unnecessary loops. 
Not a major problem but investigate. - testutil.Equals(t, 3.0, promtest.ToFloat64(grouper.compactionRunsStarted.WithLabelValues(DefaultGroupKey(metas[4].Thanos)))) - testutil.Equals(t, 3.0, promtest.ToFloat64(grouper.compactionRunsStarted.WithLabelValues(DefaultGroupKey(metas[5].Thanos)))) + testutil.Equals(t, 0.0, promtest.ToFloat64(grouper.compactionRunsStarted.WithLabelValues(DefaultGroupKey(metas[4].Thanos)))) + testutil.Equals(t, 0.0, promtest.ToFloat64(grouper.compactionRunsStarted.WithLabelValues(DefaultGroupKey(metas[5].Thanos)))) testutil.Equals(t, 4, MetricCount(grouper.compactionRunsCompleted)) testutil.Equals(t, 2.0, promtest.ToFloat64(grouper.compactionRunsCompleted.WithLabelValues(DefaultGroupKey(metas[0].Thanos)))) testutil.Equals(t, 3.0, promtest.ToFloat64(grouper.compactionRunsCompleted.WithLabelValues(DefaultGroupKey(metas[7].Thanos)))) - // TODO(bwplotka): Looks like we do some unnecessary loops. Not a major problem but investigate. - testutil.Equals(t, 3.0, promtest.ToFloat64(grouper.compactionRunsCompleted.WithLabelValues(DefaultGroupKey(metas[4].Thanos)))) - testutil.Equals(t, 3.0, promtest.ToFloat64(grouper.compactionRunsCompleted.WithLabelValues(DefaultGroupKey(metas[5].Thanos)))) + testutil.Equals(t, 0.0, promtest.ToFloat64(grouper.compactionRunsCompleted.WithLabelValues(DefaultGroupKey(metas[4].Thanos)))) + testutil.Equals(t, 0.0, promtest.ToFloat64(grouper.compactionRunsCompleted.WithLabelValues(DefaultGroupKey(metas[5].Thanos)))) testutil.Equals(t, 4, MetricCount(grouper.compactionFailures)) testutil.Equals(t, 1.0, promtest.ToFloat64(grouper.compactionFailures.WithLabelValues(DefaultGroupKey(metas[0].Thanos)))) testutil.Equals(t, 0.0, promtest.ToFloat64(grouper.compactionFailures.WithLabelValues(DefaultGroupKey(metas[7].Thanos)))) diff --git a/pkg/query/config.go b/pkg/httpconfig/config.go similarity index 63% rename from pkg/query/config.go rename to pkg/httpconfig/config.go index 12918e614f1..3280e333782 100644 --- a/pkg/query/config.go +++ b/pkg/httpconfig/config.go @@ -1,7 +1,7 @@ // Copyright (c) The Thanos Authors. // Licensed under the Apache License 2.0. -package query +package httpconfig import ( "fmt" @@ -11,20 +11,20 @@ import ( "gopkg.in/yaml.v2" "github.com/pkg/errors" - http_util "github.com/thanos-io/thanos/pkg/http" ) +// Config is a structure that allows pointing to various HTTP endpoints, e.g. ruler connecting to queriers. type Config struct { - HTTPClientConfig http_util.ClientConfig `yaml:"http_config"` - EndpointsConfig http_util.EndpointsConfig `yaml:",inline"` + HTTPClientConfig ClientConfig `yaml:"http_config"` + EndpointsConfig EndpointsConfig `yaml:",inline"` } func DefaultConfig() Config { return Config{ - EndpointsConfig: http_util.EndpointsConfig{ + EndpointsConfig: EndpointsConfig{ Scheme: "http", StaticAddresses: []string{}, - FileSDConfigs: []http_util.FileSDConfig{}, + FileSDConfigs: []FileSDConfig{}, }, } } @@ -45,12 +45,12 @@ func LoadConfigs(confYAML []byte) ([]Config, error) { return queryCfg, nil } -// BuildQueryConfig returns a query client configuration from a static address. -func BuildQueryConfig(queryAddrs []string) ([]Config, error) { - configs := make([]Config, 0, len(queryAddrs)) - for i, addr := range queryAddrs { +// BuildConfig returns a configuration from static addresses.
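+// For example (illustrative addresses), BuildConfig([]string{"localhost:9090", "https://thanos.example.com/prefix"})
+// yields two Configs: one with Scheme "http" and StaticAddresses ["localhost:9090"], and one with
+// Scheme "https", StaticAddresses ["thanos.example.com"] and PathPrefix "/prefix".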
+func BuildConfig(addrs []string) ([]Config, error) { + configs := make([]Config, 0, len(addrs)) + for i, addr := range addrs { if addr == "" { - return nil, errors.Errorf("static querier address cannot be empty at index %d", i) + return nil, errors.Errorf("static address cannot be empty at index %d", i) } // If addr is missing schema, add http. if !strings.Contains(addr, "://") { @@ -61,10 +61,10 @@ func BuildQueryConfig(queryAddrs []string) ([]Config, error) { return nil, errors.Wrapf(err, "failed to parse addr %q", addr) } if u.Scheme != "http" && u.Scheme != "https" { - return nil, errors.Errorf("%q is not supported scheme for querier address", u.Scheme) + return nil, errors.Errorf("%q is not supported scheme for address", u.Scheme) } configs = append(configs, Config{ - EndpointsConfig: http_util.EndpointsConfig{ + EndpointsConfig: EndpointsConfig{ Scheme: u.Scheme, StaticAddresses: []string{u.Host}, PathPrefix: u.Path, diff --git a/pkg/query/config_test.go b/pkg/httpconfig/config_test.go similarity index 83% rename from pkg/query/config_test.go rename to pkg/httpconfig/config_test.go index 1169df04989..fe876e859bf 100644 --- a/pkg/query/config_test.go +++ b/pkg/httpconfig/config_test.go @@ -1,16 +1,15 @@ // Copyright (c) The Thanos Authors. // Licensed under the Apache License 2.0. -package query +package httpconfig import ( "testing" - "github.com/thanos-io/thanos/pkg/http" "github.com/thanos-io/thanos/pkg/testutil" ) -func TestBuildQueryConfig(t *testing.T) { +func TestBuildConfig(t *testing.T) { for _, tc := range []struct { desc string addresses []string @@ -21,7 +20,7 @@ func TestBuildQueryConfig(t *testing.T) { desc: "single addr without path", addresses: []string{"localhost:9093"}, expected: []Config{{ - EndpointsConfig: http.EndpointsConfig{ + EndpointsConfig: EndpointsConfig{ StaticAddresses: []string{"localhost:9093"}, Scheme: "http", }, @@ -32,13 +31,13 @@ func TestBuildQueryConfig(t *testing.T) { addresses: []string{"localhost:9093", "localhost:9094/prefix"}, expected: []Config{ { - EndpointsConfig: http.EndpointsConfig{ + EndpointsConfig: EndpointsConfig{ StaticAddresses: []string{"localhost:9093"}, Scheme: "http", }, }, { - EndpointsConfig: http.EndpointsConfig{ + EndpointsConfig: EndpointsConfig{ StaticAddresses: []string{"localhost:9094"}, Scheme: "http", PathPrefix: "/prefix", @@ -50,7 +49,7 @@ func TestBuildQueryConfig(t *testing.T) { desc: "single addr with path and http scheme", addresses: []string{"http://localhost:9093"}, expected: []Config{{ - EndpointsConfig: http.EndpointsConfig{ + EndpointsConfig: EndpointsConfig{ StaticAddresses: []string{"localhost:9093"}, Scheme: "http", }, @@ -60,7 +59,7 @@ func TestBuildQueryConfig(t *testing.T) { desc: "single addr with path and https scheme", addresses: []string{"https://localhost:9093"}, expected: []Config{{ - EndpointsConfig: http.EndpointsConfig{ + EndpointsConfig: EndpointsConfig{ StaticAddresses: []string{"localhost:9093"}, Scheme: "https", }, @@ -83,7 +82,7 @@ func TestBuildQueryConfig(t *testing.T) { }, } { t.Run(tc.desc, func(t *testing.T) { - cfg, err := BuildQueryConfig(tc.addresses) + cfg, err := BuildConfig(tc.addresses) if tc.err { testutil.NotOk(t, err) return diff --git a/pkg/http/http.go b/pkg/httpconfig/http.go similarity index 97% rename from pkg/http/http.go rename to pkg/httpconfig/http.go index 5b82a1dc4e3..b00204e4259 100644 --- a/pkg/http/http.go +++ b/pkg/httpconfig/http.go @@ -1,8 +1,8 @@ // Copyright (c) The Thanos Authors. // Licensed under the Apache License 2.0. 
-// Package http is a wrapper around github.com/prometheus/common/config. -package http +// Package httpconfig is a wrapper around github.com/prometheus/common/config. +package httpconfig import ( "context" @@ -50,7 +50,7 @@ type TLSConfig struct { CertFile string `yaml:"cert_file"` // The client key file for the targets. KeyFile string `yaml:"key_file"` - // Used to verify the hostname for the targets. + // Used to verify the hostname for the targets. See https://tools.ietf.org/html/rfc4366#section-3.1 ServerName string `yaml:"server_name"` // Disable target certificate validation. InsecureSkipVerify bool `yaml:"insecure_skip_verify"` diff --git a/pkg/model/timeduration.go b/pkg/model/timeduration.go index 1d525136f09..02da2aa7698 100644 --- a/pkg/model/timeduration.go +++ b/pkg/model/timeduration.go @@ -51,6 +51,9 @@ func (tdv *TimeOrDurationValue) String() string { case tdv.Time != nil: return tdv.Time.String() case tdv.Dur != nil: + if v := *tdv.Dur; v < 0 { + return "-" + (-v).String() + } return tdv.Dur.String() } diff --git a/pkg/model/timeduration_test.go b/pkg/model/timeduration_test.go index c1b0a72a737..2ea62b67ef7 100644 --- a/pkg/model/timeduration_test.go +++ b/pkg/model/timeduration_test.go @@ -14,26 +14,44 @@ import ( ) func TestTimeOrDurationValue(t *testing.T) { - cmd := kingpin.New("test", "test") - - minTime := model.TimeOrDuration(cmd.Flag("min-time", "Start of time range limit to serve")) - - maxTime := model.TimeOrDuration(cmd.Flag("max-time", "End of time range limit to serve"). - Default("9999-12-31T23:59:59Z")) - - _, err := cmd.Parse([]string{"--min-time", "10s"}) - if err != nil { - t.Fatal(err) - } - - testutil.Equals(t, "10s", minTime.String()) - testutil.Equals(t, "9999-12-31 23:59:59 +0000 UTC", maxTime.String()) - - prevTime := timestamp.FromTime(time.Now()) - afterTime := timestamp.FromTime(time.Now().Add(15 * time.Second)) - - testutil.Assert(t, minTime.PrometheusTimestamp() > prevTime, "minTime prometheus timestamp is less than time now.") - testutil.Assert(t, minTime.PrometheusTimestamp() < afterTime, "minTime prometheus timestamp is more than time now + 15s") - - testutil.Assert(t, maxTime.PrometheusTimestamp() == 253402300799000, "maxTime is not equal to 253402300799000") + t.Run("positive", func(t *testing.T) { + cmd := kingpin.New("test", "test") + + minTime := model.TimeOrDuration(cmd.Flag("min-time", "Start of time range limit to serve")) + + maxTime := model.TimeOrDuration(cmd.Flag("max-time", "End of time range limit to serve"). 
+ Default("9999-12-31T23:59:59Z")) + + _, err := cmd.Parse([]string{"--min-time", "10s"}) + if err != nil { + t.Fatal(err) + } + + testutil.Equals(t, "10s", minTime.String()) + testutil.Equals(t, "9999-12-31 23:59:59 +0000 UTC", maxTime.String()) + + prevTime := timestamp.FromTime(time.Now()) + afterTime := timestamp.FromTime(time.Now().Add(15 * time.Second)) + + testutil.Assert(t, minTime.PrometheusTimestamp() > prevTime, "minTime prometheus timestamp is less than time now.") + testutil.Assert(t, minTime.PrometheusTimestamp() < afterTime, "minTime prometheus timestamp is more than time now + 15s") + + testutil.Assert(t, maxTime.PrometheusTimestamp() == 253402300799000, "maxTime is not equal to 253402300799000") + }) + + t.Run("negative", func(t *testing.T) { + cmd := kingpin.New("test-negative", "test-negative") + var minTime model.TimeOrDurationValue + cmd.Flag("min-time", "Start of time range limit to serve").SetValue(&minTime) + _, err := cmd.Parse([]string{"--min-time=-10s"}) + if err != nil { + t.Fatal(err) + } + testutil.Equals(t, "-10s", minTime.String()) + + prevTime := timestamp.FromTime(time.Now().Add(-15 * time.Second)) + afterTime := timestamp.FromTime(time.Now()) + testutil.Assert(t, minTime.PrometheusTimestamp() > prevTime, "minTime prometheus timestamp is less than time now - 15s.") + testutil.Assert(t, minTime.PrometheusTimestamp() < afterTime, "minTime prometheus timestamp is more than time now.") + }) } diff --git a/pkg/objstore/s3/s3.go b/pkg/objstore/s3/s3.go index d27c0221505..149cc53da71 100644 --- a/pkg/objstore/s3/s3.go +++ b/pkg/objstore/s3/s3.go @@ -85,8 +85,9 @@ type Config struct { ListObjectsVersion string `yaml:"list_objects_version"` // PartSize used for multipart upload. Only used if uploaded object size is known and larger than configured PartSize. // NOTE we need to make sure this number does not produce more parts than 10 000. - PartSize uint64 `yaml:"part_size"` - SSEConfig SSEConfig `yaml:"sse_config"` + PartSize uint64 `yaml:"part_size"` + SSEConfig SSEConfig `yaml:"sse_config"` + STSEndpoint string `yaml:"sts_endpoint"` } // SSEConfig deals with the configuration of SSE for Minio. 
The following options are valid: @@ -234,6 +235,7 @@ func NewBucketWithConfig(logger log.Logger, config Config, component string) (*B Client: &http.Client{ Transport: http.DefaultTransport, }, + Endpoint: config.STSEndpoint, }), } } diff --git a/pkg/objstore/s3/s3_e2e_test.go b/pkg/objstore/s3/s3_e2e_test.go index e837b9baa58..8991acdbfe8 100644 --- a/pkg/objstore/s3/s3_e2e_test.go +++ b/pkg/objstore/s3/s3_e2e_test.go @@ -9,8 +9,8 @@ import ( "strings" "testing" - "github.com/cortexproject/cortex/integration/e2e" - e2edb "github.com/cortexproject/cortex/integration/e2e/db" + "github.com/efficientgo/e2e" + e2edb "github.com/efficientgo/e2e/db" "github.com/go-kit/kit/log" "github.com/thanos-io/thanos/pkg/objstore/s3" "github.com/thanos-io/thanos/test/e2e/e2ethanos" @@ -23,19 +23,19 @@ func BenchmarkUpload(b *testing.B) { b.ReportAllocs() ctx := context.Background() - s, err := e2e.NewScenario("e2e_bench_mino_client") + e, err := e2e.NewDockerEnvironment("e2e_bench_mino_client") testutil.Ok(b, err) - b.Cleanup(e2ethanos.CleanScenario(b, s)) + b.Cleanup(e2ethanos.CleanScenario(b, e)) - const bucket = "test" - m := e2edb.NewMinio(8080, bucket) - testutil.Ok(b, s.StartAndWaitReady(m)) + const bucket = "benchmark" + m := e2ethanos.NewMinio(e, "benchmark", bucket) + testutil.Ok(b, e2e.StartAndWaitReady(m)) bkt, err := s3.NewBucketWithConfig(log.NewNopLogger(), s3.Config{ Bucket: bucket, AccessKey: e2edb.MinioAccessKey, SecretKey: e2edb.MinioSecretKey, - Endpoint: m.HTTPEndpoint(), + Endpoint: m.Endpoint("http"), Insecure: true, }, "test-feed") testutil.Ok(b, err) diff --git a/pkg/pool/pool.go b/pkg/pool/pool.go index cbd034e9e7c..a7eb98c8540 100644 --- a/pkg/pool/pool.go +++ b/pkg/pool/pool.go @@ -107,8 +107,9 @@ func (p *BucketedBytes) Put(b *[]byte) { return } + sz := cap(*b) for i, bktSize := range p.sizes { - if cap(*b) > bktSize { + if sz > bktSize { continue } *b = (*b)[:0] @@ -118,13 +119,11 @@ func (p *BucketedBytes) Put(b *[]byte) { p.mtx.Lock() defer p.mtx.Unlock() - // We could assume here that our users will not make the slices larger // but lets be on the safe side to avoid an underflow of p.usedTotal. - sz := uint64(cap(*b)) - if sz >= p.usedTotal { + if uint64(sz) >= p.usedTotal { p.usedTotal = 0 } else { - p.usedTotal -= sz + p.usedTotal -= uint64(sz) } } diff --git a/pkg/pool/pool_test.go b/pkg/pool/pool_test.go index a4140361d2a..14c8350acb4 100644 --- a/pkg/pool/pool_test.go +++ b/pkg/pool/pool_test.go @@ -4,8 +4,7 @@ package pool import ( - "bytes" - "fmt" + "strings" "sync" "testing" "time" @@ -71,52 +70,57 @@ func TestRacePutGet(t *testing.T) { s := sync.WaitGroup{} - // Start two goroutines: they always Get and Put two byte slices - // to which they write 'foo' / 'barbazbaz' and check if the data is still + const goroutines = 100 + + // Start multiple goroutines: they always Get and Put two byte slices + // to which they write their contents and check if the data is still // there after writing it, before putting it back. 
- errs := make(chan error, 2) - stop := make(chan bool, 2) + errs := make(chan error, goroutines) + stop := make(chan struct{}) - f := func(txt string) { + f := func(txt string, grow bool) { defer s.Done() for { select { case <-stop: return default: - c, err := chunkPool.Get(3) - if err != nil { - errs <- errors.Wrapf(err, "goroutine %s", txt) - return - } - - buf := bytes.NewBuffer(*c) - - _, err = fmt.Fprintf(buf, "%s", txt) + c, err := chunkPool.Get(len(txt)) if err != nil { errs <- errors.Wrapf(err, "goroutine %s", txt) return } - if buf.String() != txt { + *c = append(*c, txt...) + if string(*c) != txt { errs <- errors.New("expected to get the data just written") return } + if grow { + *c = append(*c, txt...) + *c = append(*c, txt...) + if string(*c) != txt+txt+txt { + errs <- errors.New("expected to get the data just written") + return + } + } - b := buf.Bytes() - chunkPool.Put(&b) + chunkPool.Put(c) } } } - s.Add(2) - go f("foo") - go f("barbazbaz") - - time.Sleep(5 * time.Second) - stop <- true - stop <- true + for i := 0; i < goroutines; i++ { + s.Add(1) + // Make sure we start multiple goroutines with the same buffer length requirements, to hit the same pools. + s := strings.Repeat(string(byte(i)), i%10) + // some of the goroutines will append more elements to the provided slice + grow := i%2 == 0 + go f(s, grow) + } + time.Sleep(1 * time.Second) + close(stop) s.Wait() select { case err := <-errs: diff --git a/pkg/query/endpointset.go b/pkg/query/endpointset.go index 13625aa8283..a768b68e921 100644 --- a/pkg/query/endpointset.go +++ b/pkg/query/endpointset.go @@ -31,7 +31,8 @@ import ( ) const ( - unhealthyEndpointMessage = "removing endpoint because it's unhealthy or does not exist" + unhealthyEndpointMessage = "removing endpoint because it's unhealthy or does not exist" + noMetadataEndpointMessage = "cannot obtain metadata: neither info nor store client found" // Default minimum and maximum time values used by Prometheus when they are not passed as query parameter. MinTime = -9223309901257974 @@ -76,17 +77,27 @@ func (es *grpcEndpointSpec) Addr() string { // Metadata method for gRPC endpoint tries to call InfoAPI exposed by Thanos components until context timeout. If we are unable to get metadata after // that time, we assume that the host is unhealthy and return error. func (es *grpcEndpointSpec) Metadata(ctx context.Context, client *endpointClients) (*endpointMetadata, error) { - resp, err := client.info.Info(ctx, &infopb.InfoRequest{}, grpc.WaitForReady(true)) - if err != nil { - // Call Info method of StoreAPI, this way querier will be able to discovery old components not exposing InfoAPI. - metadata, merr := es.getMetadataUsingStoreAPI(ctx, client.store) - if merr != nil { - return nil, errors.Wrapf(merr, "fallback fetching info from %s after err: %v", es.addr, err) + // TODO(@matej-g): Info client should not be used due to https://github.com/thanos-io/thanos/issues/4699 + // Uncomment this after it is implemented in https://github.com/thanos-io/thanos/pull/4282. + // if client.info != nil { + // resp, err := client.info.Info(ctx, &infopb.InfoRequest{}, grpc.WaitForReady(true)) + // if err != nil { + // return nil, errors.Wrapf(err, "fetching info from %s", es.addr) + // } + + // return &endpointMetadata{resp}, nil + // } + + // Call Info method of StoreAPI, this way querier will be able to discover old components not exposing InfoAPI.
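+	// The Store API Info call returns the component type, external label sets and min/max time,
+	// which is enough to build the endpoint metadata for such components.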
+ if client.store != nil { + metadata, err := es.getMetadataUsingStoreAPI(ctx, client.store) + if err != nil { + return nil, errors.Wrapf(err, "fallback fetching info from %s", es.addr) } return metadata, nil } - return &endpointMetadata{resp}, nil + return nil, errors.New(noMetadataEndpointMessage) } func (es *grpcEndpointSpec) getMetadataUsingStoreAPI(ctx context.Context, client storepb.StoreClient) (*endpointMetadata, error) { @@ -493,7 +504,9 @@ func (e *EndpointSet) getActiveEndpoints(ctx context.Context, endpoints map[stri logger: e.logger, StoreClient: storepb.NewStoreClient(conn), clients: &endpointClients{ - info: infopb.NewInfoClient(conn), + // TODO(@matej-g): Info client should not be used due to https://github.com/thanos-io/thanos/issues/4699 + // Uncomment this after it is implemented in https://github.com/thanos-io/thanos/pull/4282. + // info: infopb.NewInfoClient(conn), store: storepb.NewStoreClient(conn), }, } @@ -667,49 +680,46 @@ func (er *endpointRef) ComponentType() component.Component { er.mtx.RLock() defer er.mtx.RUnlock() - return component.FromString(er.metadata.ComponentType) -} - -func (er *endpointRef) HasClients() bool { - er.mtx.RLock() - defer er.mtx.RUnlock() + if er.metadata == nil { + return component.UnknownStoreAPI + } - return er.clients != nil + return component.FromString(er.metadata.ComponentType) } func (er *endpointRef) HasStoreAPI() bool { er.mtx.RLock() defer er.mtx.RUnlock() - return er.HasClients() && er.clients.store != nil + return er.clients != nil && er.clients.store != nil } func (er *endpointRef) HasRulesAPI() bool { er.mtx.RLock() defer er.mtx.RUnlock() - return er.HasClients() && er.clients.rule != nil + return er.clients != nil && er.clients.rule != nil } func (er *endpointRef) HasTargetsAPI() bool { er.mtx.RLock() defer er.mtx.RUnlock() - return er.HasClients() && er.clients.target != nil + return er.clients != nil && er.clients.target != nil } func (er *endpointRef) HasMetricMetadataAPI() bool { er.mtx.RLock() defer er.mtx.RUnlock() - return er.HasClients() && er.clients.metricMetadata != nil + return er.clients != nil && er.clients.metricMetadata != nil } func (er *endpointRef) HasExemplarsAPI() bool { er.mtx.RLock() defer er.mtx.RUnlock() - return er.HasClients() && er.clients.exemplar != nil + return er.clients != nil && er.clients.exemplar != nil } func (er *endpointRef) LabelSets() []labels.Labels { @@ -785,13 +795,15 @@ func (er *endpointRef) apisPresent() []string { return apisPresent } +// TODO(@matej-g): Info client should not be used due to https://github.com/thanos-io/thanos/issues/4699 +// Uncomment the nolint directive after https://github.com/thanos-io/thanos/pull/4282. 
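+// Until then the info field below stays declared but unset, so all endpoint
+// metadata is obtained through the Store API fallback in Metadata above.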
type endpointClients struct { store storepb.StoreClient rule rulespb.RulesClient metricMetadata metadatapb.MetadataClient exemplar exemplarspb.ExemplarsClient target targetspb.TargetsClient - info infopb.InfoClient + info infopb.InfoClient //nolint:structcheck,unused } type endpointMetadata struct { diff --git a/pkg/query/endpointset_test.go b/pkg/query/endpointset_test.go index 3e6b89f38a9..f6904e8223d 100644 --- a/pkg/query/endpointset_test.go +++ b/pkg/query/endpointset_test.go @@ -12,6 +12,7 @@ import ( "testing" "time" + "golang.org/x/sync/errgroup" "google.golang.org/grpc" "github.com/pkg/errors" @@ -19,6 +20,7 @@ import ( "github.com/thanos-io/thanos/pkg/info/infopb" "github.com/thanos-io/thanos/pkg/store" "github.com/thanos-io/thanos/pkg/store/labelpb" + "github.com/thanos-io/thanos/pkg/store/storepb" "github.com/thanos-io/thanos/pkg/testutil" ) @@ -58,7 +60,11 @@ var ( } ruleInfo = &infopb.InfoResponse{ ComponentType: component.Rule.String(), - Rules: &infopb.RulesInfo{}, + Store: &infopb.StoreInfo{ + MinTime: math.MinInt64, + MaxTime: math.MaxInt64, + }, + Rules: &infopb.RulesInfo{}, } storeGWInfo = &infopb.InfoResponse{ ComponentType: component.Store.String(), @@ -93,6 +99,28 @@ func (c *mockedEndpoint) Info(ctx context.Context, r *infopb.InfoRequest) (*info return &c.info, nil } +type mockedStoreSrv struct { + infoDelay time.Duration + info storepb.InfoResponse +} + +func (s *mockedStoreSrv) Info(context.Context, *storepb.InfoRequest) (*storepb.InfoResponse, error) { + if s.infoDelay > 0 { + time.Sleep(s.infoDelay) + } + + return &s.info, nil +} +func (s *mockedStoreSrv) Series(*storepb.SeriesRequest, storepb.Store_SeriesServer) error { + return nil +} +func (s *mockedStoreSrv) LabelNames(context.Context, *storepb.LabelNamesRequest) (*storepb.LabelNamesResponse, error) { + return nil, nil +} +func (s *mockedStoreSrv) LabelValues(context.Context, *storepb.LabelValuesRequest) (*storepb.LabelValuesResponse, error) { + return nil, nil +} + type APIs struct { store bool metricMetadata bool @@ -113,6 +141,25 @@ type testEndpoints struct { exposedAPIs map[string]*APIs } +func componentTypeToStoreType(componentType string) storepb.StoreType { + switch componentType { + case component.Query.String(): + return storepb.StoreType_QUERY + case component.Rule.String(): + return storepb.StoreType_RULE + case component.Sidecar.String(): + return storepb.StoreType_SIDECAR + case component.Store.String(): + return storepb.StoreType_STORE + case component.Receive.String(): + return storepb.StoreType_RECEIVE + case component.Debug.String(): + return storepb.StoreType_DEBUG + default: + return storepb.StoreType_STORE + } +} + func startTestEndpoints(testEndpointMeta []testEndpointMeta) (*testEndpoints, error) { e := &testEndpoints{ srvs: map[string]*grpc.Server{}, @@ -130,6 +177,19 @@ func startTestEndpoints(testEndpointMeta []testEndpointMeta) (*testEndpoints, er srv := grpc.NewServer() addr := listener.Addr().String() + storeSrv := &mockedStoreSrv{ + info: storepb.InfoResponse{ + LabelSets: meta.extlsetFn(listener.Addr().String()), + StoreType: componentTypeToStoreType(meta.ComponentType), + }, + infoDelay: meta.infoDelay, + } + + if meta.Store != nil { + storeSrv.info.MinTime = meta.Store.MinTime + storeSrv.info.MaxTime = meta.Store.MaxTime + } + endpointSrv := &mockedEndpoint{ info: infopb.InfoResponse{ LabelSets: meta.extlsetFn(listener.Addr().String()), @@ -143,6 +203,7 @@ func startTestEndpoints(testEndpointMeta []testEndpointMeta) (*testEndpoints, er infoDelay: meta.infoDelay, } 
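+	// With the InfoAPI client disabled in EndpointSet for now, discovery in these
+	// tests goes through the mocked Store server registered below.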
infopb.RegisterInfoServer(srv, endpointSrv) + storepb.RegisterStoreServer(srv, storeSrv) go func() { _ = srv.Serve(listener) }() @@ -859,7 +920,7 @@ func TestEndpointSet_APIs_Discovery(t *testing.T) { } return endpointSpec }, - expectedStores: 4, // sidecar + querier + receiver + storeGW + expectedStores: 5, // sidecar + querier + receiver + storeGW + ruler expectedRules: 3, // sidecar + querier + ruler expectedTarget: 2, // sidecar + querier expectedMetricMetadata: 2, // sidecar + querier @@ -895,7 +956,7 @@ func TestEndpointSet_APIs_Discovery(t *testing.T) { NewGRPCEndpointSpec(endpoints.orderAddrs[1], false), } }, - expectedStores: 1, // sidecar + expectedStores: 2, // sidecar + ruler expectedRules: 2, // sidecar + ruler expectedTarget: 1, // sidecar expectedMetricMetadata: 1, // sidecar @@ -908,7 +969,8 @@ func TestEndpointSet_APIs_Discovery(t *testing.T) { NewGRPCEndpointSpec(endpoints.orderAddrs[1], false), } }, - expectedRules: 1, // ruler + expectedStores: 1, // ruler + expectedRules: 1, // ruler }, }, }, @@ -1106,6 +1168,7 @@ func exposedAPIs(c string) *APIs { } case component.Rule.String(): return &APIs{ + store: true, rules: true, } case component.Store.String(): @@ -1123,3 +1186,47 @@ func assertRegisteredAPIs(t *testing.T, expectedAPIs *APIs, er *endpointRef) { testutil.Equals(t, expectedAPIs.metricMetadata, er.HasMetricMetadataAPI()) testutil.Equals(t, expectedAPIs.exemplars, er.HasExemplarsAPI()) } + +// Regression test for: https://github.com/thanos-io/thanos/issues/4766. +func TestDeadlockLocking(t *testing.T) { + t.Parallel() + + mockEndpointRef := &endpointRef{ + addr: "mockedStore", + metadata: &endpointMetadata{ + &infopb.InfoResponse{}, + }, + clients: &endpointClients{}, + } + + g := &errgroup.Group{} + deadline := time.Now().Add(3 * time.Second) + + g.Go(func() error { + for { + if time.Now().After(deadline) { + break + } + mockEndpointRef.Update(&endpointMetadata{ + InfoResponse: &infopb.InfoResponse{}, + }) + } + return nil + }) + + g.Go(func() error { + for { + if time.Now().After(deadline) { + break + } + mockEndpointRef.HasStoreAPI() + mockEndpointRef.HasExemplarsAPI() + mockEndpointRef.HasMetricMetadataAPI() + mockEndpointRef.HasRulesAPI() + mockEndpointRef.HasTargetsAPI() + } + return nil + }) + + testutil.Ok(t, g.Wait()) +} diff --git a/pkg/queryfrontend/labels_codec_test.go b/pkg/queryfrontend/labels_codec_test.go index 441853d5990..061852ea307 100644 --- a/pkg/queryfrontend/labels_codec_test.go +++ b/pkg/queryfrontend/labels_codec_test.go @@ -7,6 +7,7 @@ import ( "bytes" "context" "encoding/json" + "fmt" "io/ioutil" "net/http" "testing" @@ -521,3 +522,212 @@ func TestLabelsCodec_MergeResponse(t *testing.T) { }) } } + +func BenchmarkLabelsCodecEncodeAndDecodeRequest(b *testing.B) { + codec := NewThanosLabelsCodec(false, time.Hour*2) + ctx := context.TODO() + + b.Run("SeriesRequest", func(b *testing.B) { + req := &ThanosSeriesRequest{ + Start: 123000, + End: 456000, + Path: "/api/v1/series", + Dedup: true, + } + + b.ReportAllocs() + b.ResetTimer() + + for n := 0; n < b.N; n++ { + reqEnc, err := codec.EncodeRequest(ctx, req) + testutil.Ok(b, err) + _, err = codec.DecodeRequest(ctx, reqEnc) + testutil.Ok(b, err) + } + }) + + b.Run("LabelsRequest", func(b *testing.B) { + req := &ThanosLabelsRequest{ + Path: "/api/v1/labels", + Start: 123000, + End: 456000, + PartialResponse: true, + Matchers: [][]*labels.Matcher{{labels.MustNewMatcher(labels.MatchEqual, "foo", "bar")}}, + StoreMatchers: [][]*labels.Matcher{}, + } + + b.ReportAllocs() + b.ResetTimer() + + for n 
:= 0; n < b.N; n++ { + reqEnc, err := codec.EncodeRequest(ctx, req) + testutil.Ok(b, err) + _, err = codec.DecodeRequest(ctx, reqEnc) + testutil.Ok(b, err) + } + }) +} + +func BenchmarkLabelsCodecDecodeResponse(b *testing.B) { + codec := NewThanosLabelsCodec(false, time.Hour*2) + ctx := context.TODO() + + b.Run("SeriesResponse", func(b *testing.B) { + seriesData, err := json.Marshal(&ThanosSeriesResponse{ + Status: "success", + Data: []labelpb.ZLabelSet{{Labels: []labelpb.ZLabel{{Name: "foo", Value: "bar"}}}}, + }) + testutil.Ok(b, err) + + b.ReportAllocs() + b.ResetTimer() + + for n := 0; n < b.N; n++ { + _, err := codec.DecodeResponse( + ctx, + makeResponse(seriesData, false), + &ThanosSeriesRequest{}) + testutil.Ok(b, err) + } + }) + + b.Run("SeriesResponseWithHeaders", func(b *testing.B) { + seriesDataWithHeaders, err := json.Marshal(&ThanosSeriesResponse{ + Status: "success", + Data: []labelpb.ZLabelSet{{Labels: []labelpb.ZLabel{{Name: "foo", Value: "bar"}}}}, + Headers: []*ResponseHeader{{Name: cacheControlHeader, Values: []string{noStoreValue}}}, + }) + testutil.Ok(b, err) + + b.ReportAllocs() + b.ResetTimer() + + for n := 0; n < b.N; n++ { + _, err := codec.DecodeResponse( + ctx, + makeResponse(seriesDataWithHeaders, true), + &ThanosSeriesRequest{}) + testutil.Ok(b, err) + } + }) + + b.Run("LabelsResponse", func(b *testing.B) { + labelsData, err := json.Marshal(&ThanosLabelsResponse{ + Status: "success", + Data: []string{"__name__"}, + }) + testutil.Ok(b, err) + + b.ReportAllocs() + b.ResetTimer() + + for n := 0; n < b.N; n++ { + _, err := codec.DecodeResponse( + ctx, + makeResponse(labelsData, false), + &ThanosLabelsRequest{}) + testutil.Ok(b, err) + } + }) + + b.Run("LabelsResponseWithHeaders", func(b *testing.B) { + labelsDataWithHeaders, err := json.Marshal(&ThanosLabelsResponse{ + Status: "success", + Data: []string{"__name__"}, + Headers: []*ResponseHeader{{Name: cacheControlHeader, Values: []string{noStoreValue}}}, + }) + testutil.Ok(b, err) + + b.ReportAllocs() + b.ResetTimer() + + for n := 0; n < b.N; n++ { + _, err := codec.DecodeResponse( + ctx, + makeResponse(labelsDataWithHeaders, true), + &ThanosLabelsRequest{}) + testutil.Ok(b, err) + } + }) +} + +func BenchmarkLabelsCodecMergeResponses_1(b *testing.B) { + benchmarkMergeResponses(b, 1) +} + +func BenchmarkLabelsCodecMergeResponses_10(b *testing.B) { + benchmarkMergeResponses(b, 10) +} + +func BenchmarkLabelsCodecMergeResponses_100(b *testing.B) { + benchmarkMergeResponses(b, 100) +} + +func BenchmarkLabelsCodecMergeResponses_1000(b *testing.B) { + benchmarkMergeResponses(b, 1000) +} + +func benchmarkMergeResponses(b *testing.B, size int) { + codec := NewThanosLabelsCodec(false, time.Hour*2) + queryResLabel, queryResSeries := makeQueryRangeResponses(size) + + b.Run("SeriesResponses", func(b *testing.B) { + b.ReportAllocs() + b.ResetTimer() + + for i := 0; i < b.N; i++ { + _, _ = codec.MergeResponse(queryResSeries...) + } + }) + + b.Run("LabelsResponses", func(b *testing.B) { + b.ReportAllocs() + b.ResetTimer() + + for i := 0; i < b.N; i++ { + _, _ = codec.MergeResponse(queryResLabel...) + } + }) + +} + +func makeQueryRangeResponses(size int) ([]queryrange.Response, []queryrange.Response) { + labelResp := make([]queryrange.Response, 0, size) + seriesResp := make([]queryrange.Response, 0, size*2) + + // Generate with some duplicated values. 
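+	// Response i and response i+1 deliberately share one element (the data-(i+1) string and the
+	// foo-(i+1) label set), so MergeResponse has genuine duplicates to collapse.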
+ for i := 0; i < size; i++ { + labelResp = append(labelResp, &ThanosLabelsResponse{ + Status: "success", + Data: []string{fmt.Sprintf("data-%d", i), fmt.Sprintf("data-%d", i+1)}, + }) + + seriesResp = append( + seriesResp, + &ThanosSeriesResponse{ + Status: "success", + Data: []labelpb.ZLabelSet{{Labels: []labelpb.ZLabel{{Name: fmt.Sprintf("foo-%d", i), Value: fmt.Sprintf("bar-%d", i)}}}}, + }, + &ThanosSeriesResponse{ + Status: "success", + Data: []labelpb.ZLabelSet{{Labels: []labelpb.ZLabel{{Name: fmt.Sprintf("foo-%d", i+1), Value: fmt.Sprintf("bar-%d", i+1)}}}}, + }, + ) + } + + return labelResp, seriesResp +} + +func makeResponse(data []byte, withHeader bool) *http.Response { + r := &http.Response{ + StatusCode: 200, Body: ioutil.NopCloser(bytes.NewBuffer(data)), + } + + if withHeader { + r.Header = map[string][]string{ + cacheControlHeader: {noStoreValue}, + } + } + + return r +} diff --git a/pkg/queryfrontend/queryrange_codec_test.go b/pkg/queryfrontend/queryrange_codec_test.go index 0f033d402f3..7dc2f46b97b 100644 --- a/pkg/queryfrontend/queryrange_codec_test.go +++ b/pkg/queryfrontend/queryrange_codec_test.go @@ -282,3 +282,26 @@ func TestQueryRangeCodec_EncodeRequest(t *testing.T) { }) } } + +func BenchmarkQueryRangeCodecEncodeAndDecodeRequest(b *testing.B) { + codec := NewThanosQueryRangeCodec(true) + ctx := context.TODO() + + req := &ThanosQueryRangeRequest{ + Start: 123000, + End: 456000, + Step: 1000, + MaxSourceResolution: int64(compact.ResolutionLevel1h), + Dedup: true, + } + + b.ReportAllocs() + b.ResetTimer() + + for n := 0; n < b.N; n++ { + reqEnc, err := codec.EncodeRequest(ctx, req) + testutil.Ok(b, err) + _, err = codec.DecodeRequest(ctx, reqEnc) + testutil.Ok(b, err) + } +} diff --git a/pkg/reloader/reloader_test.go b/pkg/reloader/reloader_test.go index 6659d20cc7f..25a0af5ae9b 100644 --- a/pkg/reloader/reloader_test.go +++ b/pkg/reloader/reloader_test.go @@ -247,6 +247,84 @@ func TestReloader_DirectoriesApply(t *testing.T) { testutil.Ok(t, os.Symlink(path.Join(dir2, "rule3-source.yaml"), path.Join(dir2, "rule3-001.yaml"))) testutil.Ok(t, ioutil.WriteFile(path.Join(dir2, "rule-dir", "rule4.yaml"), []byte("rule4"), os.ModePerm)) + stepFunc := func(rel int) { + t.Log("Performing step number", rel) + switch rel { + case 0: + // Create rule2.yaml. + // + // dir + // ├─ rule-dir -> dir2/rule-dir + // ├─ rule1.yaml + // └─ rule2.yaml (*) + // dir2 + // ├─ rule-dir + // │ └─ rule4.yaml + // ├─ rule3-001.yaml -> rule3-source.yaml + // └─ rule3-source.yaml + testutil.Ok(t, ioutil.WriteFile(path.Join(dir, "rule2.yaml"), []byte("rule2"), os.ModePerm)) + case 1: + // Update rule1.yaml. + // + // dir + // ├─ rule-dir -> dir2/rule-dir + // ├─ rule1.yaml (*) + // └─ rule2.yaml + // dir2 + // ├─ rule-dir + // │ └─ rule4.yaml + // ├─ rule3-001.yaml -> rule3-source.yaml + // └─ rule3-source.yaml + testutil.Ok(t, os.Rename(tempRule1File, path.Join(dir, "rule1.yaml"))) + case 2: + // Create dir/rule3.yaml (symlink to rule3-001.yaml). + // + // dir + // ├─ rule-dir -> dir2/rule-dir + // ├─ rule1.yaml + // ├─ rule2.yaml + // └─ rule3.yaml -> dir2/rule3-001.yaml (*) + // dir2 + // ├─ rule-dir + // │ └─ rule4.yaml + // ├─ rule3-001.yaml -> rule3-source.yaml + // └─ rule3-source.yaml + testutil.Ok(t, os.Symlink(path.Join(dir2, "rule3-001.yaml"), path.Join(dir2, "rule3.yaml"))) + testutil.Ok(t, os.Rename(path.Join(dir2, "rule3.yaml"), path.Join(dir, "rule3.yaml"))) + case 3: + // Update the symlinked file and replace the symlink file to trigger fsnotify. 
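Replacing the symlink itself, rather than writing through it, is what makes this step observable: fsnotify watches do not follow symlink targets, and an atomic rename over the old link guarantees a single watch event on the directory. A minimal standalone sketch of the swap pattern used in this step (file names hypothetical, not the Thanos code):

```go
package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"os"
	"path/filepath"
)

// swapSymlink atomically points linkName at target by creating a temporary
// symlink and renaming it over the old one; rename is atomic on POSIX, so
// watchers of dir see exactly one event.
func swapSymlink(dir, target, linkName string) error {
	tmp := filepath.Join(dir, ".tmp-"+linkName)
	if err := os.Symlink(target, tmp); err != nil {
		return err
	}
	return os.Rename(tmp, filepath.Join(dir, linkName))
}

func main() {
	dir, err := ioutil.TempDir("", "symlink-swap")
	if err != nil {
		log.Fatal(err)
	}
	defer os.RemoveAll(dir)

	for _, src := range []string{"rule3-001.yaml", "rule3-002.yaml"} {
		if err := ioutil.WriteFile(filepath.Join(dir, src), []byte(src), 0o644); err != nil {
			log.Fatal(err)
		}
		// Each iteration replaces rule3.yaml in a single atomic step.
		if err := swapSymlink(dir, filepath.Join(dir, src), "rule3.yaml"); err != nil {
			log.Fatal(err)
		}
	}

	resolved, err := os.Readlink(filepath.Join(dir, "rule3.yaml"))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("rule3.yaml ->", resolved) // rule3.yaml -> .../rule3-002.yaml
}
```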
+ // + // dir + // ├─ rule-dir -> dir2/rule-dir + // ├─ rule1.yaml + // ├─ rule2.yaml + // └─ rule3.yaml -> dir2/rule3-002.yaml (*) + // dir2 + // ├─ rule-dir + // │ └─ rule4.yaml + // ├─ rule3-002.yaml -> rule3-source.yaml (*) + // └─ rule3-source.yaml (*) + testutil.Ok(t, os.Rename(tempRule3File, path.Join(dir2, "rule3-source.yaml"))) + testutil.Ok(t, os.Symlink(path.Join(dir2, "rule3-source.yaml"), path.Join(dir2, "rule3-002.yaml"))) + testutil.Ok(t, os.Symlink(path.Join(dir2, "rule3-002.yaml"), path.Join(dir2, "rule3.yaml"))) + testutil.Ok(t, os.Rename(path.Join(dir2, "rule3.yaml"), path.Join(dir, "rule3.yaml"))) + testutil.Ok(t, os.Remove(path.Join(dir2, "rule3-001.yaml"))) + case 4: + // Update rule4.yaml in the symlinked directory. + // + // dir + // ├─ rule-dir -> dir2/rule-dir + // ├─ rule1.yaml + // ├─ rule2.yaml + // └─ rule3.yaml -> rule3-source.yaml + // dir2 + // ├─ rule-dir + // │ └─ rule4.yaml (*) + // └─ rule3-source.yaml + testutil.Ok(t, os.Rename(tempRule4File, path.Join(dir2, "rule-dir", "rule4.yaml"))) + } + } + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) g := sync.WaitGroup{} g.Add(1) @@ -267,90 +345,21 @@ func TestReloader_DirectoriesApply(t *testing.T) { reloadsMtx.Lock() rel := reloads + reloadsMtx.Unlock() if init && rel <= reloadsSeen { - reloadsMtx.Unlock() continue } - reloadsMtx.Unlock() - init = true - reloadsSeen = rel - t.Log("Performing step number", rel) - switch rel { - case 0: - // Create rule2.yaml. - // - // dir - // ├─ rule-dir -> dir2/rule-dir - // ├─ rule1.yaml - // └─ rule2.yaml (*) - // dir2 - // ├─ rule-dir - // │ └─ rule4.yaml - // ├─ rule3-001.yaml -> rule3-source.yaml - // └─ rule3-source.yaml - testutil.Ok(t, ioutil.WriteFile(path.Join(dir, "rule2.yaml"), []byte("rule2"), os.ModePerm)) - case 1: - // Update rule1.yaml. - // - // dir - // ├─ rule-dir -> dir2/rule-dir - // ├─ rule1.yaml (*) - // └─ rule2.yaml - // dir2 - // ├─ rule-dir - // │ └─ rule4.yaml - // ├─ rule3-001.yaml -> rule3-source.yaml - // └─ rule3-source.yaml - testutil.Ok(t, os.Rename(tempRule1File, path.Join(dir, "rule1.yaml"))) - case 2: - // Create dir/rule3.yaml (symlink to rule3-001.yaml). - // - // dir - // ├─ rule-dir -> dir2/rule-dir - // ├─ rule1.yaml - // ├─ rule2.yaml - // └─ rule3.yaml -> dir2/rule3-001.yaml (*) - // dir2 - // ├─ rule-dir - // │ └─ rule4.yaml - // ├─ rule3-001.yaml -> rule3-source.yaml - // └─ rule3-source.yaml - testutil.Ok(t, os.Symlink(path.Join(dir2, "rule3-001.yaml"), path.Join(dir2, "rule3.yaml"))) - testutil.Ok(t, os.Rename(path.Join(dir2, "rule3.yaml"), path.Join(dir, "rule3.yaml"))) - case 3: - // Update the symlinked file and replace the symlink file to trigger fsnotify. - // - // dir - // ├─ rule-dir -> dir2/rule-dir - // ├─ rule1.yaml - // ├─ rule2.yaml - // └─ rule3.yaml -> dir2/rule3-002.yaml (*) - // dir2 - // ├─ rule-dir - // │ └─ rule4.yaml - // ├─ rule3-002.yaml -> rule3-source.yaml (*) - // └─ rule3-source.yaml (*) - testutil.Ok(t, os.Rename(tempRule3File, path.Join(dir2, "rule3-source.yaml"))) - testutil.Ok(t, os.Symlink(path.Join(dir2, "rule3-source.yaml"), path.Join(dir2, "rule3-002.yaml"))) - testutil.Ok(t, os.Symlink(path.Join(dir2, "rule3-002.yaml"), path.Join(dir2, "rule3.yaml"))) - testutil.Ok(t, os.Rename(path.Join(dir2, "rule3.yaml"), path.Join(dir, "rule3.yaml"))) - testutil.Ok(t, os.Remove(path.Join(dir2, "rule3-001.yaml"))) - case 4: - // Update rule4.yaml in the symlinked directory. 
- // - // dir - // ├─ rule-dir -> dir2/rule-dir - // ├─ rule1.yaml - // ├─ rule2.yaml - // └─ rule3.yaml -> rule3-source.yaml - // dir2 - // ├─ rule-dir - // │ └─ rule4.yaml (*) - // └─ rule3-source.yaml - testutil.Ok(t, os.Rename(tempRule4File, path.Join(dir2, "rule-dir", "rule4.yaml"))) + // Catch up if reloader is step(s) ahead. + for skipped := rel - reloadsSeen - 1; skipped > 0; skipped-- { + stepFunc(rel - skipped) } + stepFunc(rel) + + init = true + reloadsSeen = rel + if rel > 4 { // All good. return diff --git a/pkg/rules/rules_test.go b/pkg/rules/rules_test.go index 01965a9b31b..8cb73ada6b3 100644 --- a/pkg/rules/rules_test.go +++ b/pkg/rules/rules_test.go @@ -82,7 +82,7 @@ func testRulesAgainstExamples(t *testing.T, dir string, server rulespb.RulesServ { Name: "thanos-sidecar", File: filepath.Join(dir, "alerts.yaml"), - Rules: []*rulespb.Rule{someAlert, someAlert, someAlert}, + Rules: []*rulespb.Rule{someAlert, someAlert}, Interval: 60, PartialResponseStrategy: storepb.PartialResponseStrategy_ABORT, }, diff --git a/pkg/server/grpc/grpc.go b/pkg/server/grpc/grpc.go index 10c6f08e4bd..6a39ba5afbd 100644 --- a/pkg/server/grpc/grpc.go +++ b/pkg/server/grpc/grpc.go @@ -72,7 +72,11 @@ func New(logger log.Logger, reg prometheus.Registerer, tracer opentracing.Tracer } options.grpcOpts = append(options.grpcOpts, []grpc.ServerOption{ + // NOTE: gRPC messages are recommended to stay below 1MB, yet remote write requests and Store API responses commonly exceed 4MB. + // Remove the limits and let users rely on message size histograms to detect such cases. + // TODO(bwplotka): https://github.com/grpc-ecosystem/go-grpc-middleware/issues/462 grpc.MaxSendMsgSize(math.MaxInt32), + grpc.MaxRecvMsgSize(math.MaxInt32), grpc_middleware.WithUnaryServerChain( grpc_recovery.UnaryServerInterceptor(grpc_recovery.WithRecoveryHandler(grpcPanicRecoveryHandler)), met.UnaryServerInterceptor(), diff --git a/pkg/store/bucket.go b/pkg/store/bucket.go index d181f047315..be0e4bec97b 100644 --- a/pkg/store/bucket.go +++ b/pkg/store/bucket.go @@ -92,6 +92,12 @@ const ( // Labels for metrics. labelEncode = "encode" labelDecode = "decode" + + minBlockSyncConcurrency = 1 +) + +var ( + errBlockSyncConcurrencyNotValid = errors.New("the block sync concurrency must be equal to or greater than 1") ) type bucketStoreMetrics struct { @@ -298,6 +304,13 @@ type BucketStore struct { enableSeriesResponseHints bool } +func (b *BucketStore) validate() error { + if b.blockSyncConcurrency < minBlockSyncConcurrency { + return errBlockSyncConcurrencyNotValid + } + return nil +} + type noopCache struct{} func (noopCache) StorePostings(context.Context, ulid.ULID, labels.Label, []byte) {} @@ -407,6 +420,10 @@ func NewBucketStore( s.indexReaderPool = indexheader.NewReaderPool(s.logger, lazyIndexReaderEnabled, lazyIndexReaderIdleTimeout, indexReaderPoolMetrics) s.metrics = newBucketStoreMetrics(s.reg) // TODO(metalmatze): Might be possible via Option too + if err := s.validate(); err != nil { + return nil, errors.Wrap(err, "validate config") + } + if err := os.MkdirAll(dir, 0750); err != nil { return nil, errors.Wrap(err, "create dir") } @@ -2491,24 +2508,32 @@ func (r *bucketChunkReader) loadChunks(ctx context.Context, res []seriesEntry, a r.stats.chunksFetchedSizeSum += int(part.End - part.Start) var ( - buf = make([]byte, EstimatedMaxChunkSize) + buf []byte readOffset = int(pIdxs[0].offset) // Save a few allocations.
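// (Aside on the hunk below, a description rather than part of the change.) The
// per-call make([]byte, EstimatedMaxChunkSize) is replaced by a buffer borrowed
// from the block's chunk pool: chunkPool.Get can return an error (for instance
// when the pool's size limit is reached), so a plain allocation is kept as the
// fallback, and the deferred chunkPool.Put returns the buffer in both cases.
// Likewise, bufReader.Discard advances past sparse gaps between chunks in
// place, where the old io.CopyN(ioutil.Discard, ...) copied the skipped bytes.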
- written int64 + written int diff uint32 chunkLen int n int ) + bufPooled, err := r.block.chunkPool.Get(EstimatedMaxChunkSize) + if err == nil { + buf = *bufPooled + } else { + buf = make([]byte, EstimatedMaxChunkSize) + } + defer r.block.chunkPool.Put(&buf) + for i, pIdx := range pIdxs { // Fast forward range reader to the next chunk start in case of sparse (for our purposes) byte range. for readOffset < int(pIdx.offset) { - written, err = io.CopyN(ioutil.Discard, bufReader, int64(pIdx.offset)-int64(readOffset)) + written, err = bufReader.Discard(int(pIdx.offset) - int(readOffset)) if err != nil { return errors.Wrap(err, "fast forward range reader") } - readOffset += int(written) + readOffset += written } // Presume chunk length to be reasonably large for common use cases. // However, declaration for EstimatedMaxChunkSize warns us some chunks could be larger in some rare cases. diff --git a/pkg/store/bucket_test.go b/pkg/store/bucket_test.go index e635ab22c1e..adc57017409 100644 --- a/pkg/store/bucket_test.go +++ b/pkg/store/bucket_test.go @@ -556,6 +556,32 @@ func TestGapBasedPartitioner_Partition(t *testing.T) { } } +func TestBucketStoreConfig_validate(t *testing.T) { + tests := map[string]struct { + config *BucketStore + expected error + }{ + "should pass on valid config": { + config: &BucketStore{ + blockSyncConcurrency: 1, + }, + expected: nil, + }, + "should fail on blockSyncConcurrency < 1": { + config: &BucketStore{ + blockSyncConcurrency: 0, + }, + expected: errBlockSyncConcurrencyNotValid, + }, + } + + for testName, testData := range tests { + t.Run(testName, func(t *testing.T) { + testutil.Equals(t, testData.expected, testData.config.validate()) + }) + } +} + func TestBucketStore_Info(t *testing.T) { defer testutil.TolerantVerifyLeak(t) diff --git a/pkg/store/cache/caching_bucket.go b/pkg/store/cache/caching_bucket.go index 2f55f2e0cd0..0a53af4e736 100644 --- a/pkg/store/cache/caching_bucket.go +++ b/pkg/store/cache/caching_bucket.go @@ -11,6 +11,7 @@ import ( "io" "io/ioutil" "strconv" + "strings" "sync" "time" @@ -31,7 +32,12 @@ const ( originBucket = "bucket" ) -var errObjNotFound = errors.Errorf("object not found") +var ( + errObjNotFound = errors.Errorf("object not found") + ErrInvalidBucketCacheKeyFormat = errors.New("key has invalid format") + ErrInvalidBucketCacheKeyVerb = errors.New("key has invalid verb") + ErrParseKeyInt = errors.New("failed to parse integer in key") +) // CachingBucket implementation that provides some caching features, based on passed configuration. 
type CachingBucket struct { @@ -130,8 +136,8 @@ func (cb *CachingBucket) Iter(ctx context.Context, dir string, f func(string) er } cb.operationRequests.WithLabelValues(objstore.OpIter, cfgName).Inc() - - key := cachingKeyIter(dir) + iterVerb := BucketCacheKey{Verb: IterVerb, Name: dir} + key := iterVerb.String() data := cfg.cache.Fetch(ctx, []string{key}) if data[key] != nil { list, err := cfg.codec.Decode(data[key]) @@ -176,7 +182,8 @@ func (cb *CachingBucket) Exists(ctx context.Context, name string) (bool, error) cb.operationRequests.WithLabelValues(objstore.OpExists, cfgName).Inc() - key := cachingKeyExists(name) + existsVerb := BucketCacheKey{Verb: ExistsVerb, Name: name} + key := existsVerb.String() hits := cfg.cache.Fetch(ctx, []string{key}) if ex := hits[key]; ex != nil { @@ -218,8 +225,10 @@ func (cb *CachingBucket) Get(ctx context.Context, name string) (io.ReadCloser, e cb.operationRequests.WithLabelValues(objstore.OpGet, cfgName).Inc() - contentKey := cachingKeyContent(name) - existsKey := cachingKeyExists(name) + contentVerb := BucketCacheKey{Verb: ContentVerb, Name: name} + contentKey := contentVerb.String() + existsVerb := BucketCacheKey{Verb: ExistsVerb, Name: name} + existsKey := existsVerb.String() hits := cfg.cache.Fetch(ctx, []string{contentKey, existsKey}) if hits[contentKey] != nil { @@ -286,7 +295,8 @@ func (cb *CachingBucket) Attributes(ctx context.Context, name string) (objstore. } func (cb *CachingBucket) cachedAttributes(ctx context.Context, name, cfgName string, cache cache.Cache, ttl time.Duration) (objstore.ObjectAttributes, error) { - key := cachingKeyAttributes(name) + attrVerb := BucketCacheKey{Verb: AttributesVerb, Name: name} + key := attrVerb.String() cb.operationRequests.WithLabelValues(objstore.OpAttributes, cfgName).Inc() @@ -357,8 +367,8 @@ func (cb *CachingBucket) cachedGetRange(ctx context.Context, name string, offset end = attrs.Size } totalRequestedBytes += (end - off) - - k := cachingKeyObjectSubrange(name, off, end) + objectSubrange := BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: off, End: end} + k := objectSubrange.String() keys = append(keys, k) offsetKeys[off] = k } @@ -482,24 +492,86 @@ func mergeRanges(input []rng, limit int64) []rng { return input[:last+1] } -func cachingKeyAttributes(name string) string { - return fmt.Sprintf("attrs:%s", name) -} +// VerbType is the type of operation whose result has been stored in the caching bucket's cache. +type VerbType string + +const ( + ExistsVerb VerbType = "exists" + ContentVerb VerbType = "content" + IterVerb VerbType = "iter" + AttributesVerb VerbType = "attrs" + SubrangeVerb VerbType = "subrange" +) -func cachingKeyObjectSubrange(name string, start, end int64) string { - return fmt.Sprintf("subrange:%s:%d:%d", name, start, end) +type BucketCacheKey struct { + Verb VerbType + Name string + Start int64 + End int64 } -func cachingKeyIter(name string) string { - return fmt.Sprintf("iter:%s", name) +// String returns the string representation of BucketCacheKey. +func (ck BucketCacheKey) String() string { + if ck.Start == 0 && ck.End == 0 { + return fmt.Sprintf("%s:%s", ck.Verb, ck.Name) + } + + return fmt.Sprintf("%s:%s:%d:%d", ck.Verb, ck.Name, ck.Start, ck.End) } -func cachingKeyExists(name string) string { - return fmt.Sprintf("exists:%s", name) +// IsValidVerb checks if the VerbType matches the predefined verbs. 
+func IsValidVerb(v VerbType) bool { + switch v { + case + ExistsVerb, + ContentVerb, + IterVerb, + AttributesVerb, + SubrangeVerb: + return true + } + return false } -func cachingKeyContent(name string) string { - return fmt.Sprintf("content:%s", name) +// ParseBucketCacheKey parses a string and returns BucketCacheKey. +func ParseBucketCacheKey(key string) (BucketCacheKey, error) { + ck := BucketCacheKey{} + slice := strings.Split(key, ":") + if len(slice) < 2 { + return ck, ErrInvalidBucketCacheKeyFormat + } + + verb := VerbType(slice[0]) + if !IsValidVerb(verb) { + return BucketCacheKey{}, ErrInvalidBucketCacheKeyVerb + } + + if verb == SubrangeVerb { + if len(slice) != 4 { + return BucketCacheKey{}, ErrInvalidBucketCacheKeyFormat + } + + start, err := strconv.ParseInt(slice[2], 10, 64) + if err != nil { + return BucketCacheKey{}, ErrParseKeyInt + } + + end, err := strconv.ParseInt(slice[3], 10, 64) + if err != nil { + return BucketCacheKey{}, ErrParseKeyInt + } + + ck.Start = start + ck.End = end + } else { + if len(slice) != 2 { + return BucketCacheKey{}, ErrInvalidBucketCacheKeyFormat + } + } + + ck.Verb = verb + ck.Name = slice[1] + return ck, nil } // Reader implementation that uses in-memory subranges. diff --git a/pkg/store/cache/caching_bucket_test.go b/pkg/store/cache/caching_bucket_test.go index 9bf0bddcbd9..35875716925 100644 --- a/pkg/store/cache/caching_bucket_test.go +++ b/pkg/store/cache/caching_bucket_test.go @@ -125,9 +125,12 @@ func TestChunksCaching(t *testing.T) { expectedCachedBytes: 7 * subrangeSize, init: func() { // Delete first 3 subranges. - delete(cache.cache, cachingKeyObjectSubrange(name, 0*subrangeSize, 1*subrangeSize)) - delete(cache.cache, cachingKeyObjectSubrange(name, 1*subrangeSize, 2*subrangeSize)) - delete(cache.cache, cachingKeyObjectSubrange(name, 2*subrangeSize, 3*subrangeSize)) + objectSubrange := BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: 0 * subrangeSize, End: 1 * subrangeSize} + delete(cache.cache, objectSubrange.String()) + objectSubrange = BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: 1 * subrangeSize, End: 2 * subrangeSize} + delete(cache.cache, objectSubrange.String()) + objectSubrange = BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: 2 * subrangeSize, End: 3 * subrangeSize} + delete(cache.cache, objectSubrange.String()) }, }, @@ -140,9 +143,12 @@ func TestChunksCaching(t *testing.T) { expectedCachedBytes: 7 * subrangeSize, init: func() { // Delete last 3 subranges. - delete(cache.cache, cachingKeyObjectSubrange(name, 7*subrangeSize, 8*subrangeSize)) - delete(cache.cache, cachingKeyObjectSubrange(name, 8*subrangeSize, 9*subrangeSize)) - delete(cache.cache, cachingKeyObjectSubrange(name, 9*subrangeSize, 10*subrangeSize)) + objectSubrange := BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: 7 * subrangeSize, End: 8 * subrangeSize} + delete(cache.cache, objectSubrange.String()) + objectSubrange = BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: 8 * subrangeSize, End: 9 * subrangeSize} + delete(cache.cache, objectSubrange.String()) + objectSubrange = BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: 9 * subrangeSize, End: 10 * subrangeSize} + delete(cache.cache, objectSubrange.String()) }, }, @@ -155,9 +161,12 @@ func TestChunksCaching(t *testing.T) { expectedCachedBytes: 7 * subrangeSize, init: func() { // Delete 3 subranges in the middle. 
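// (Aside.) The deletions below now build keys through the structured
// BucketCacheKey instead of the old fmt.Sprintf helpers. For example (values
// illustrative):
//
//	k := BucketCacheKey{Verb: SubrangeVerb, Name: "chunks/000001", Start: 16000, End: 32000}
//	k.String() // "subrange:chunks/000001:16000:32000"
//
// ParseBucketCacheKey reverses String for every verb. Two limits of the scheme
// worth noting: the parser splits on ':', so object names containing a colon
// cannot be round-tripped, and String folds a zero Start/End into the short
// "verb:name" form, which subrange keys in practice never hit.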
- delete(cache.cache, cachingKeyObjectSubrange(name, 3*subrangeSize, 4*subrangeSize)) - delete(cache.cache, cachingKeyObjectSubrange(name, 4*subrangeSize, 5*subrangeSize)) - delete(cache.cache, cachingKeyObjectSubrange(name, 5*subrangeSize, 6*subrangeSize)) + objectSubrange := BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: 3 * subrangeSize, End: 4 * subrangeSize} + delete(cache.cache, objectSubrange.String()) + objectSubrange = BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: 4 * subrangeSize, End: 5 * subrangeSize} + delete(cache.cache, objectSubrange.String()) + objectSubrange = BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: 5 * subrangeSize, End: 6 * subrangeSize} + delete(cache.cache, objectSubrange.String()) }, }, @@ -174,7 +183,8 @@ func TestChunksCaching(t *testing.T) { if i > 0 && i%3 == 0 { continue } - delete(cache.cache, cachingKeyObjectSubrange(name, i*subrangeSize, (i+1)*subrangeSize)) + objectSubrange := BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: i * subrangeSize, End: (i + 1) * subrangeSize} + delete(cache.cache, objectSubrange.String()) } }, }, @@ -194,7 +204,8 @@ func TestChunksCaching(t *testing.T) { if i == 3 || i == 5 || i == 7 { continue } - delete(cache.cache, cachingKeyObjectSubrange(name, i*subrangeSize, (i+1)*subrangeSize)) + objectSubrange := BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: i * subrangeSize, End: (i + 1) * subrangeSize} + delete(cache.cache, objectSubrange.String()) } }, }, @@ -213,7 +224,8 @@ func TestChunksCaching(t *testing.T) { if i == 5 || i == 6 || i == 7 { continue } - delete(cache.cache, cachingKeyObjectSubrange(name, i*subrangeSize, (i+1)*subrangeSize)) + objectSubrange := BucketCacheKey{Verb: SubrangeVerb, Name: name, Start: i * subrangeSize, End: (i + 1) * subrangeSize} + delete(cache.cache, objectSubrange.String()) } }, }, @@ -295,6 +307,10 @@ func (m *mockCache) Fetch(_ context.Context, keys []string) map[string][]byte { return found } +func (m *mockCache) Name() string { + return "mockCache" +} + func (m *mockCache) flush() { m.cache = map[string]cacheItem{} } @@ -657,3 +673,116 @@ func verifyObjectAttrs(t *testing.T, cb *CachingBucket, file string, expectedLen } func matchAll(string) bool { return true } + +func TestParseBucketCacheKey(t *testing.T) { + testcases := []struct { + key string + expected BucketCacheKey + expectedErr error + }{ + { + key: "exists:name", + expected: BucketCacheKey{ + Verb: ExistsVerb, + Name: "name", + Start: 0, + End: 0, + }, + expectedErr: nil, + }, + { + key: "content:name", + expected: BucketCacheKey{ + Verb: ContentVerb, + Name: "name", + Start: 0, + End: 0, + }, + expectedErr: nil, + }, + { + key: "iter:name", + expected: BucketCacheKey{ + Verb: IterVerb, + Name: "name", + Start: 0, + End: 0, + }, + expectedErr: nil, + }, + { + key: "attrs:name", + expected: BucketCacheKey{ + Verb: AttributesVerb, + Name: "name", + Start: 0, + End: 0, + }, + expectedErr: nil, + }, + { + key: "subrange:name:10:20", + expected: BucketCacheKey{ + Verb: SubrangeVerb, + Name: "name", + Start: 10, + End: 20, + }, + expectedErr: nil, + }, + // Any VerbType other than SubrangeVerb should not have a "start" and "end". + { + key: "iter:name:10:20", + expected: BucketCacheKey{}, + expectedErr: ErrInvalidBucketCacheKeyFormat, + }, + // Key must always have a name. + { + key: "iter", + expected: BucketCacheKey{}, + expectedErr: ErrInvalidBucketCacheKeyFormat, + }, + // Invalid VerbType should return an error. 
+ { + key: "random:name", + expected: BucketCacheKey{}, + expectedErr: ErrInvalidBucketCacheKeyVerb, + }, + // Start must be an integer. + { + key: "subrange:name:random:10", + expected: BucketCacheKey{}, + expectedErr: ErrParseKeyInt, + }, + // End must be an integer. + { + key: "subrange:name:10:random", + expected: BucketCacheKey{}, + expectedErr: ErrParseKeyInt, + }, + // SubrangeVerb must have start and end. + { + key: "subrange:name", + expected: BucketCacheKey{}, + expectedErr: ErrInvalidBucketCacheKeyFormat, + }, + // SubrangeVerb must have start and end both. + { + key: "subrange:name:10", + expected: BucketCacheKey{}, + expectedErr: ErrInvalidBucketCacheKeyFormat, + }, + // Key must not be an empty string. + { + key: "", + expected: BucketCacheKey{}, + expectedErr: ErrInvalidBucketCacheKeyFormat, + }, + } + + for _, tc := range testcases { + res, err := ParseBucketCacheKey(tc.key) + testutil.Equals(t, tc.expectedErr, err) + testutil.Equals(t, tc.expected, res) + } +} diff --git a/pkg/store/prometheus.go b/pkg/store/prometheus.go index 2952e4467b7..ec3525ad654 100644 --- a/pkg/store/prometheus.go +++ b/pkg/store/prometheus.go @@ -27,12 +27,12 @@ import ( "github.com/prometheus/prometheus/pkg/labels" "github.com/prometheus/prometheus/storage/remote" "github.com/prometheus/prometheus/tsdb/chunkenc" + "github.com/thanos-io/thanos/pkg/httpconfig" "github.com/thanos-io/thanos/pkg/store/labelpb" "google.golang.org/grpc/codes" "google.golang.org/grpc/status" "github.com/thanos-io/thanos/pkg/component" - thanoshttp "github.com/thanos-io/thanos/pkg/http" "github.com/thanos-io/thanos/pkg/promclient" "github.com/thanos-io/thanos/pkg/runutil" "github.com/thanos-io/thanos/pkg/store/storepb" @@ -441,7 +441,7 @@ func (p *PrometheusStore) startPromRemoteRead(ctx context.Context, q *prompb.Que preq.Header.Set("Content-Type", "application/x-stream-protobuf") preq.Header.Set("X-Prometheus-Remote-Read-Version", "0.1.0") - preq.Header.Set("User-Agent", thanoshttp.ThanosUserAgent) + preq.Header.Set("User-Agent", httpconfig.ThanosUserAgent) presp, err = p.client.Do(preq.WithContext(ctx)) if err != nil { return nil, errors.Wrap(err, "send request") diff --git a/pkg/ui/react-app/package-lock.json b/pkg/ui/react-app/package-lock.json index bea0b169cd7..9b41e52944b 100644 --- a/pkg/ui/react-app/package-lock.json +++ b/pkg/ui/react-app/package-lock.json @@ -82,7 +82,7 @@ "prettier": "^2.3.2", "react-scripts": "^4.0.3", "sinon": "^9.2.4", - "typescript": "3.9.9" + "typescript": "^4.4.3" }, "optionalDependencies": { "fsevents": "^2.3.2" @@ -23153,9 +23153,9 @@ } }, "node_modules/typescript": { - "version": "3.9.9", - "resolved": "https://registry.npmjs.org/typescript/-/typescript-3.9.9.tgz", - "integrity": "sha512-kdMjTiekY+z/ubJCATUPlRDl39vXYiMV9iyeMuEuXZh2we6zz80uovNN2WlAxmmdE/Z/YQe+EbOEXB5RHEED3w==", + "version": "4.4.3", + "resolved": "https://registry.npmjs.org/typescript/-/typescript-4.4.3.tgz", + "integrity": "sha512-4xfscpisVgqqDfPaJo5vkd+Qd/ItkoagnHpufr+i2QCHBsNYp+G7UAoyFl8aPtx879u38wPV65rZ8qbGZijalA==", "dev": true, "bin": { "tsc": "bin/tsc", @@ -43794,9 +43794,9 @@ } }, "typescript": { - "version": "3.9.9", - "resolved": "https://registry.npmjs.org/typescript/-/typescript-3.9.9.tgz", - "integrity": "sha512-kdMjTiekY+z/ubJCATUPlRDl39vXYiMV9iyeMuEuXZh2we6zz80uovNN2WlAxmmdE/Z/YQe+EbOEXB5RHEED3w==", + "version": "4.4.3", + "resolved": "https://registry.npmjs.org/typescript/-/typescript-4.4.3.tgz", + "integrity": 
"sha512-4xfscpisVgqqDfPaJo5vkd+Qd/ItkoagnHpufr+i2QCHBsNYp+G7UAoyFl8aPtx879u38wPV65rZ8qbGZijalA==", "dev": true }, "unbox-primitive": { diff --git a/pkg/ui/react-app/package.json b/pkg/ui/react-app/package.json index e418507b374..50c40136e36 100644 --- a/pkg/ui/react-app/package.json +++ b/pkg/ui/react-app/package.json @@ -97,7 +97,7 @@ "prettier": "^2.3.2", "react-scripts": "^4.0.3", "sinon": "^9.2.4", - "typescript": "3.9.9" + "typescript": "^4.4.3" }, "proxy": "http://localhost:10902", "jest": { diff --git a/pkg/ui/react-app/src/pages/graph/ExpressionInput.tsx b/pkg/ui/react-app/src/pages/graph/ExpressionInput.tsx index ade2a9e5f5a..e1b5a20a9b5 100644 --- a/pkg/ui/react-app/src/pages/graph/ExpressionInput.tsx +++ b/pkg/ui/react-app/src/pages/graph/ExpressionInput.tsx @@ -33,34 +33,34 @@ class ExpressionInput extends Component { + setHeight = (): void => { const { offsetHeight, clientHeight, scrollHeight } = this.exprInputRef.current!; const offset = offsetHeight - clientHeight; // Needed in order for the height to be more accurate. this.setState({ height: scrollHeight + offset }); }; - handleInput = () => { + handleInput = (): void => { this.setValue(this.exprInputRef.current!.value); }; - setValue = (value: string | null) => { + setValue = (value: string | null): void => { const { onExpressionChange } = this.props; onExpressionChange(value as string); this.setState({ height: 'auto' }, this.setHeight); }; - componentDidUpdate(prevProps: ExpressionInputProps) { + componentDidUpdate(prevProps: ExpressionInputProps): void { const { value } = this.props; if (value !== prevProps.value) { this.setValue(value); } } - handleKeyPress = (event: React.KeyboardEvent) => { + handleKeyPress = (event: React.KeyboardEvent): void => { const { executeQuery } = this.props; if (event.key === 'Enter' && !event.shiftKey) { executeQuery(); @@ -68,18 +68,18 @@ class ExpressionInput extends Component { + getSearchMatches = (input: string, expressions: string[]): FuzzyResult[] => { return fuz.filter(input.replace(/ /g, ''), expressions); }; - createAutocompleteSection = (downshift: ControllerStateAndHelpers) => { + createAutocompleteSection = (downshift: ControllerStateAndHelpers): JSX.Element | null => { const { inputValue = '', closeMenu, highlightedIndex } = downshift; const { autocompleteSections } = this.props; let index = 0; const sections = - inputValue!.length && this.props.enableAutocomplete + inputValue?.length && this.props.enableAutocomplete ? Object.entries(autocompleteSections).reduce((acc, [title, items]) => { - const matches = this.getSearchMatches(inputValue!, items); + const matches = this.getSearchMatches(inputValue, items); return !matches.length ? 
acc : [ diff --git a/pkg/ui/react-app/src/pages/graph/Graph.tsx b/pkg/ui/react-app/src/pages/graph/Graph.tsx index 09b15c8252d..1620bee1cd0 100644 --- a/pkg/ui/react-app/src/pages/graph/Graph.tsx +++ b/pkg/ui/react-app/src/pages/graph/Graph.tsx @@ -44,7 +44,7 @@ class Graph extends PureComponent { chartData: normalizeData(this.props), }; - componentDidUpdate(prevProps: GraphProps) { + componentDidUpdate(prevProps: GraphProps): void { const { data, stacked, useLocalTime } = this.props; if (prevProps.data !== data) { this.selectedSeriesIndexes = []; @@ -64,11 +64,11 @@ class Graph extends PureComponent { } } - componentDidMount() { + componentDidMount(): void { this.plot(); } - componentWillUnmount() { + componentWillUnmount(): void { this.destroyPlot(); } @@ -81,20 +81,20 @@ class Graph extends PureComponent { this.$chart = $.plot($(this.chartRef.current), data, getOptions(this.props.stacked, this.props.useLocalTime)); }; - destroyPlot = () => { + destroyPlot = (): void => { if (isPresent(this.$chart)) { this.$chart.destroy(); } }; - plotSetAndDraw(data: GraphSeries[] = this.state.chartData) { + plotSetAndDraw(data: GraphSeries[] = this.state.chartData): void { if (isPresent(this.$chart)) { this.$chart.setData(data); this.$chart.draw(); } } - handleSeriesSelect = (selected: number[], selectedIndex: number) => { + handleSeriesSelect = (selected: number[], selectedIndex: number): void => { const { chartData } = this.state; this.plot( this.selectedSeriesIndexes.length === 1 && this.selectedSeriesIndexes.includes(selectedIndex) @@ -113,18 +113,18 @@ class Graph extends PureComponent { }); }; - handleLegendMouseOut = () => { + handleLegendMouseOut = (): void => { cancelAnimationFrame(this.rafID); this.plotSetAndDraw(); }; - handleResize = () => { + handleResize = (): void => { if (isPresent(this.$chart)) { this.plot(this.$chart.getData() as GraphSeries[]); } }; - render() { + render(): JSX.Element { const { chartData } = this.state; return (
diff --git a/pkg/ui/react-app/src/pages/graph/GraphHelpers.ts b/pkg/ui/react-app/src/pages/graph/GraphHelpers.ts index b8357622cf0..228dcb1cbcb 100644 --- a/pkg/ui/react-app/src/pages/graph/GraphHelpers.ts +++ b/pkg/ui/react-app/src/pages/graph/GraphHelpers.ts @@ -53,7 +53,7 @@ export const formatValue = (y: number | null): string => { throw Error("couldn't format a value, this is a bug"); }; -export const getHoverColor = (color: string, opacity: number, stacked: boolean) => { +export const getHoverColor = (color: string, opacity: number, stacked: boolean): string => { const { r, g, b } = $.color.parse(color); if (!stacked) { return `rgba(${r}, ${g}, ${b}, ${opacity})`; @@ -137,7 +137,10 @@ export const getOptions = (stacked: boolean, useLocalTime: boolean): jquery.flot }; // This was adapted from Flot's color generation code. -export const getColors = (data: { resultType: string; result: Array<{ metric: Metric; values: [number, string][] }> }) => { +export const getColors = (data: { + resultType: string; + result: Array<{ metric: Metric; values: [number, string][] }>; +}): Color[] => { const colorPool = ['#edc240', '#afd8f8', '#cb4b4b', '#4da74d', '#9440ed']; const colorPoolSize = colorPool.length; let variation = 0; @@ -189,7 +192,7 @@ export const normalizeData = ({ queryParams, data }: GraphProps): GraphSeries[] }); }; -export const parseValue = (value: string) => { +export const parseValue = (value: string): null | number => { const val = parseFloat(value); // "+Inf", "-Inf", "+Inf" will be parsed into NaN by parseFloat(). They // can't be graphed, so show them as gaps (null). diff --git a/pkg/ui/react-app/src/pages/graph/Legend.tsx b/pkg/ui/react-app/src/pages/graph/Legend.tsx index 153f82f98a4..5e98904eea4 100644 --- a/pkg/ui/react-app/src/pages/graph/Legend.tsx +++ b/pkg/ui/react-app/src/pages/graph/Legend.tsx @@ -18,36 +18,38 @@ export class Legend extends PureComponent { state = { selectedIndexes: [] as number[], }; - componentDidUpdate(prevProps: LegendProps) { + componentDidUpdate(prevProps: LegendProps): void { if (this.props.shouldReset && prevProps.shouldReset !== this.props.shouldReset) { this.setState({ selectedIndexes: [] }); } } - handleSeriesSelect = (index: number) => (ev: React.MouseEvent) => { - // TODO: add proper event type - const { selectedIndexes } = this.state; + handleSeriesSelect = + (index: number) => + (ev: React.MouseEvent): void => { + // TODO: add proper event type + const { selectedIndexes } = this.state; - let selected = [index]; - if (ev.ctrlKey || ev.metaKey) { - const { chartData } = this.props; - if (selectedIndexes.includes(index)) { - selected = selectedIndexes.filter((idx) => idx !== index); - } else { - selected = - // Flip the logic - In case none is selected ctrl + click should deselect clicked series. - selectedIndexes.length === 0 - ? chartData.reduce((acc, _, i) => (i === index ? acc : [...acc, i]), []) - : [...selectedIndexes, index]; // Select multiple. + let selected = [index]; + if (ev.ctrlKey || ev.metaKey) { + const { chartData } = this.props; + if (selectedIndexes.includes(index)) { + selected = selectedIndexes.filter((idx) => idx !== index); + } else { + selected = + // Flip the logic - In case none is selected ctrl + click should deselect clicked series. + selectedIndexes.length === 0 + ? chartData.reduce((acc, _, i) => (i === index ? acc : [...acc, i]), []) + : [...selectedIndexes, index]; // Select multiple. 
+ } + } else if (selectedIndexes.length === 1 && selectedIndexes.includes(index)) { + selected = []; } - } else if (selectedIndexes.length === 1 && selectedIndexes.includes(index)) { - selected = []; - } - this.setState({ selectedIndexes: selected }); - this.props.onSeriesToggle(selected, index); - }; + this.setState({ selectedIndexes: selected }); + this.props.onSeriesToggle(selected, index); + }; - render() { + render(): JSX.Element { const { chartData, onLegendMouseOut, onHover } = this.props; const { selectedIndexes } = this.state; const canUseHover = chartData.length > 1 && selectedIndexes.length === 0; diff --git a/pkg/ui/react-app/src/pages/graph/Panel.tsx b/pkg/ui/react-app/src/pages/graph/Panel.tsx index 35a23963065..0434f57a5b6 100644 --- a/pkg/ui/react-app/src/pages/graph/Panel.tsx +++ b/pkg/ui/react-app/src/pages/graph/Panel.tsx @@ -95,7 +95,7 @@ class Panel extends Component { this.handleStoreMatchChange = this.handleStoreMatchChange.bind(this); } - componentDidUpdate({ options: prevOpts }: PanelProps) { + componentDidUpdate({ options: prevOpts }: PanelProps): void { const { endTime, range, @@ -120,7 +120,7 @@ class Panel extends Component { } } - componentDidMount() { + componentDidMount(): void { this.executeQuery(); } @@ -249,24 +249,24 @@ class Panel extends Component { return this.props.options.endTime; }; - handleChangeEndTime = (endTime: number | null) => { + handleChangeEndTime = (endTime: number | null): void => { this.setOptions({ endTime: endTime }); }; - handleChangeResolution = (resolution: number | null) => { + handleChangeResolution = (resolution: number | null): void => { this.setOptions({ resolution: resolution }); }; - handleChangeMaxSourceResolution = (maxSourceResolution: string) => { + handleChangeMaxSourceResolution = (maxSourceResolution: string): void => { this.setOptions({ maxSourceResolution }); }; - handleChangeType = (type: PanelType) => { + handleChangeType = (type: PanelType): void => { this.setState({ data: null }); this.setOptions({ type: type }); }; - handleChangeStacking = (stacked: boolean) => { + handleChangeStacking = (stacked: boolean): void => { this.setOptions({ stacked: stacked }); }; @@ -286,7 +286,7 @@ class Panel extends Component { this.setState({ error: null }); }; - render() { + render(): JSX.Element { const { pastQueries, metricNames, options, id, stores } = this.props; return (
diff --git a/pkg/ui/react-app/src/pages/graph/TimeInput.tsx b/pkg/ui/react-app/src/pages/graph/TimeInput.tsx index e7b111c609e..d9382805890 100644 --- a/pkg/ui/react-app/src/pages/graph/TimeInput.tsx +++ b/pkg/ui/react-app/src/pages/graph/TimeInput.tsx @@ -39,7 +39,7 @@ class TimeInput extends Component { return this.props.time || moment().valueOf(); }; - calcShiftRange = () => this.props.range / 2; + calcShiftRange = (): number => this.props.range / 2; increaseTime = (): void => { const time = this.getBaseTime() + this.calcShiftRange(); @@ -59,7 +59,7 @@ class TimeInput extends Component { return this.props.useLocalTime ? moment.tz.guess() : 'UTC'; }; - componentDidMount() { + componentDidMount(): void { this.$time = $(this.timeInputRef.current!); this.$time.datetimepicker({ @@ -85,11 +85,11 @@ class TimeInput extends Component { }); } - componentWillUnmount() { + componentWillUnmount(): void { this.$time.datetimepicker('destroy'); } - componentDidUpdate(prevProps: TimeInputProps) { + componentDidUpdate(prevProps: TimeInputProps): void { const { time, useLocalTime } = this.props; if (prevProps.time !== time) { this.$time.datetimepicker('date', time ? moment(time) : null); @@ -99,7 +99,7 @@ class TimeInput extends Component { } } - render() { + render(): JSX.Element { return ( diff --git a/pkg/ui/react-app/src/utils/index.ts b/pkg/ui/react-app/src/utils/index.ts index fce9feabaa9..fe02722d5d7 100644 --- a/pkg/ui/react-app/src/utils/index.ts +++ b/pkg/ui/react-app/src/utils/index.ts @@ -4,11 +4,11 @@ import { PanelOptions, PanelType, PanelDefaultOptions } from '../pages/graph/Pan import { PanelMeta } from '../pages/graph/PanelList'; import { queryURL } from '../thanos/config'; -export const generateID = () => { +export const generateID = (): string => { return `_${Math.random().toString(36).substr(2, 9)}`; }; -export const byEmptyString = (p: string) => p.length > 0; +export const byEmptyString = (p: string): boolean => p.length > 0; export const isPresent = (obj: T): obj is NonNullable => obj !== null && obj !== undefined; @@ -27,7 +27,7 @@ export const escapeHTML = (str: string): string => { }); }; -export const metricToSeriesName = (labels: { [key: string]: string }) => { +export const metricToSeriesName = (labels: { [key: string]: string }): string => { if (labels === null) { return 'scalar'; } @@ -226,11 +226,13 @@ export const parseOption = (param: string): Partial => { return {}; }; -export const formatParam = (key: string) => (paramName: string, value: number | string | boolean) => { - return `g${key}.${paramName}=${encodeURIComponent(value)}`; -}; +export const formatParam = + (key: string) => + (paramName: string, value: number | string | boolean): string => { + return `g${key}.${paramName}=${encodeURIComponent(value)}`; + }; -export const toQueryString = ({ key, options }: PanelMeta) => { +export const toQueryString = ({ key, options }: PanelMeta): string => { const formatWithKey = formatParam(key); const { expr, @@ -260,15 +262,15 @@ export const toQueryString = ({ key, options }: PanelMeta) => { return urlParams.filter(byEmptyString).join('&'); }; -export const encodePanelOptionsToQueryString = (panels: PanelMeta[]) => { +export const encodePanelOptionsToQueryString = (panels: PanelMeta[]): string => { return `?${panels.map(toQueryString).join('&')}`; }; -export const createExpressionLink = (expr: string) => { +export const createExpressionLink = (expr: string): string => { return `../graph?g0.expr=${encodeURIComponent(expr)}&g0.tab=1&g0.stacked=0&g0.range_input=1h`; }; -export 
const createExternalExpressionLink = (expr: string) => { +export const createExternalExpressionLink = (expr: string): string => { const expLink = createExpressionLink(expr); return `${queryURL}${expLink.replace(/^\.\./, '')}`; }; diff --git a/pkg/ui/react-app/tsconfig.json b/pkg/ui/react-app/tsconfig.json index b8a3b865632..c062f25ccd2 100644 --- a/pkg/ui/react-app/tsconfig.json +++ b/pkg/ui/react-app/tsconfig.json @@ -17,7 +17,7 @@ "resolveJsonModule": true, "isolatedModules": true, "noEmit": true, - "jsx": "react", + "jsx": "react-jsx", "noFallthroughCasesInSwitch": true }, "include": [ diff --git a/scripts/cfggen/main.go b/scripts/cfggen/main.go index e58b6296707..56478558dbf 100644 --- a/scripts/cfggen/main.go +++ b/scripts/cfggen/main.go @@ -15,17 +15,15 @@ import ( "github.com/go-kit/kit/log" "github.com/go-kit/kit/log/level" "github.com/pkg/errors" + "github.com/thanos-io/thanos/pkg/httpconfig" "gopkg.in/alecthomas/kingpin.v2" "gopkg.in/yaml.v2" - "github.com/thanos-io/thanos/pkg/objstore/bos" - "github.com/thanos-io/thanos/pkg/query" - "github.com/thanos-io/thanos/pkg/alert" "github.com/thanos-io/thanos/pkg/cacheutil" - http_util "github.com/thanos-io/thanos/pkg/http" "github.com/thanos-io/thanos/pkg/logging" "github.com/thanos-io/thanos/pkg/objstore/azure" + "github.com/thanos-io/thanos/pkg/objstore/bos" "github.com/thanos-io/thanos/pkg/objstore/client" "github.com/thanos-io/thanos/pkg/objstore/cos" "github.com/thanos-io/thanos/pkg/objstore/filesystem" @@ -79,12 +77,12 @@ func init() { configs[name(logging.RequestConfig{})] = logging.RequestConfig{} alertmgrCfg := alert.DefaultAlertmanagerConfig() - alertmgrCfg.EndpointsConfig.FileSDConfigs = []http_util.FileSDConfig{{}} + alertmgrCfg.EndpointsConfig.FileSDConfigs = []httpconfig.FileSDConfig{{}} configs[name(alert.AlertingConfig{})] = alert.AlertingConfig{Alertmanagers: []alert.AlertmanagerConfig{alertmgrCfg}} - queryCfg := query.DefaultConfig() - queryCfg.EndpointsConfig.FileSDConfigs = []http_util.FileSDConfig{{}} - configs[name(query.Config{})] = []query.Config{queryCfg} + queryCfg := httpconfig.DefaultConfig() + queryCfg.EndpointsConfig.FileSDConfigs = []httpconfig.FileSDConfig{{}} + configs[name(httpconfig.Config{})] = []httpconfig.Config{queryCfg} for typ, config := range bucketConfigs { configs[name(config)] = client.BucketConfig{Type: typ, Config: config} diff --git a/test/e2e/compact_test.go b/test/e2e/compact_test.go index 2359816ed8e..1daede5ea90 100644 --- a/test/e2e/compact_test.go +++ b/test/e2e/compact_test.go @@ -16,8 +16,9 @@ import ( "testing" "time" - "github.com/cortexproject/cortex/integration/e2e" - e2edb "github.com/cortexproject/cortex/integration/e2e/db" + "github.com/efficientgo/e2e" + e2edb "github.com/efficientgo/e2e/db" + "github.com/efficientgo/e2e/matchers" "github.com/go-kit/kit/log" "github.com/oklog/ulid" "github.com/prometheus/client_golang/prometheus" @@ -33,6 +34,7 @@ import ( "github.com/thanos-io/thanos/pkg/objstore/client" "github.com/thanos-io/thanos/pkg/objstore/s3" "github.com/thanos-io/thanos/pkg/promclient" + "github.com/thanos-io/thanos/pkg/runutil" "github.com/thanos-io/thanos/pkg/testutil" "github.com/thanos-io/thanos/pkg/testutil/e2eutil" "github.com/thanos-io/thanos/test/e2e/e2ethanos" @@ -336,22 +338,22 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { if penaltyDedup { name = "e2e_test_compact_penalty_dedup" } - s, err := e2e.NewScenario(name) + e, err := e2e.NewDockerEnvironment(name) testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + 
t.Cleanup(e2ethanos.CleanScenario(t, e)) - dir := filepath.Join(s.SharedDir(), "tmp") + dir := filepath.Join(e.SharedDir(), "tmp") testutil.Ok(t, os.MkdirAll(dir, os.ModePerm)) const bucket = "compact_test" - m := e2edb.NewMinio(8080, bucket) - testutil.Ok(t, s.StartAndWaitReady(m)) + m := e2ethanos.NewMinio(e, "minio", bucket) + testutil.Ok(t, e2e.StartAndWaitReady(m)) bkt, err := s3.NewBucketWithConfig(logger, s3.Config{ Bucket: bucket, AccessKey: e2edb.MinioAccessKey, SecretKey: e2edb.MinioSecretKey, - Endpoint: m.HTTPEndpoint(), // We need separate client config, when connecting to minio from outside. + Endpoint: m.Endpoint("http"), // We need separate client config, when connecting to minio from outside. Insecure: true, }, "test-feed") testutil.Ok(t, err) @@ -363,7 +365,10 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { for _, b := range blocks { id, err := b.Create(ctx, dir, justAfterConsistencyDelay, b.hashFunc) testutil.Ok(t, err) - testutil.Ok(t, objstore.UploadDir(ctx, logger, bkt, path.Join(dir, id.String()), id.String())) + testutil.Ok(t, runutil.Retry(time.Second, ctx.Done(), func() error { + return objstore.UploadDir(ctx, logger, bkt, path.Join(dir, id.String()), id.String()) + })) + rawBlockIDs[id] = struct{}{} if b.markedForNoCompact { testutil.Ok(t, block.MarkForNoCompact(ctx, logger, bkt, id, metadata.ManualNoCompactReason, "why not", promauto.With(nil).NewCounter(prometheus.CounterOpts{}))) @@ -442,26 +447,26 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { Bucket: bucket, AccessKey: e2edb.MinioAccessKey, SecretKey: e2edb.MinioSecretKey, - Endpoint: m.NetworkHTTPEndpoint(), + Endpoint: m.InternalEndpoint("http"), Insecure: true, }, } - str, err := e2ethanos.NewStoreGW(s.SharedDir(), "1", svcConfig) + str, err := e2ethanos.NewStoreGW(e, "1", svcConfig, "") testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(str)) + testutil.Ok(t, e2e.StartAndWaitReady(str)) testutil.Ok(t, str.WaitSumMetrics(e2e.Equals(float64(len(rawBlockIDs)+7)), "thanos_blocks_meta_synced")) testutil.Ok(t, str.WaitSumMetrics(e2e.Equals(0), "thanos_blocks_meta_sync_failures_total")) testutil.Ok(t, str.WaitSumMetrics(e2e.Equals(0), "thanos_blocks_meta_modified")) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", str.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", str.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel = context.WithTimeout(context.Background(), 3*time.Minute) t.Cleanup(cancel) // Check if query detects current series, even if overlapped. - queryAndAssert(t, ctx, q.HTTPEndpoint(), + queryAndAssert(t, ctx, q.Endpoint("http"), fmt.Sprintf(`count_over_time({a="1"}[13h] offset %ds)`, int64(time.Since(now.Add(12*time.Hour)).Seconds())), promclient.QueryOptions{ Deduplicate: false, // This should be false, so that we can be sure deduplication was offline. @@ -599,7 +604,7 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { // Precreate a directory. It should be deleted. // In a hypothetical scenario, the directory could be a left-over from // a compaction that had crashed. 
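// (Aside.) Planting the directory up front lets the test assert, further down,
// that the compactor clears such crash leftovers rather than tripping over them.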
- p := filepath.Join(s.SharedDir(), "data", "compact", "expect-to-halt", "compact") + p := filepath.Join(e.SharedDir(), "data", "compact", "expect-to-halt", "compact") testutil.Assert(t, len(blocksWithHashes) > 0) @@ -613,9 +618,9 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { testutil.Ok(t, err) testutil.Ok(t, f.Close()) - c, err := e2ethanos.NewCompactor(s.SharedDir(), "expect-to-halt", svcConfig, nil) + c, err := e2ethanos.NewCompactor(e, "expect-to-halt", svcConfig, nil) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(c)) + testutil.Ok(t, e2e.StartAndWaitReady(c)) // Expect compactor halted and for one cleanup iteration to happen. testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(1), "thanos_compact_halted")) @@ -626,10 +631,10 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(0), "thanos_blocks_meta_modified")) // The compact directory is still there. - dataDir := filepath.Join(s.SharedDir(), "data", "compact", "expect-to-halt") + dataDir := filepath.Join(e.SharedDir(), "data", "compact", "expect-to-halt") empty, err := isEmptyDir(dataDir) testutil.Ok(t, err) testutil.Equals(t, false, empty, "directory %s should not be empty", dataDir) // We expect no ops. testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(0), "thanos_compact_iterations_total")) @@ -638,18 +643,18 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(0), "thanos_compact_group_compactions_total")) testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(0), "thanos_compact_group_vertical_compactions_total")) testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(1), "thanos_compact_group_compactions_failures_total")) - testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(3), "thanos_compact_group_compaction_runs_started_total")) - testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(2), "thanos_compact_group_compaction_runs_completed_total")) + testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(2), "thanos_compact_group_compaction_runs_started_total")) + testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(1), "thanos_compact_group_compaction_runs_completed_total")) // However, the blocks have been cleaned because that happens concurrently. testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(2), "thanos_compact_aborted_partial_uploads_deletion_attempts_total")) testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(2), "thanos_compact_blocks_cleaned_total")) // Ensure bucket UI. - ensureGETStatusCode(t, http.StatusOK, "http://"+path.Join(c.HTTPEndpoint(), "global")) - ensureGETStatusCode(t, http.StatusOK, "http://"+path.Join(c.HTTPEndpoint(), "loaded")) + ensureGETStatusCode(t, http.StatusOK, "http://"+path.Join(c.Endpoint("http"), "global")) + ensureGETStatusCode(t, http.StatusOK, "http://"+path.Join(c.Endpoint("http"), "loaded")) - testutil.Ok(t, s.Stop(c)) + testutil.Ok(t, c.Stop()) _, err = os.Stat(randBlockDir) testutil.NotOk(t, err) @@ -661,7 +666,7 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { // Dedup enabled; compactor should work as expected. { // Predownload block dirs with hashes. We should not try downloading them again.
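// (Aside; names hypothetical.) Blocks uploaded with a hash function record
// per-file hashes in their metadata, so a sync can compare the local copy
// against the recorded hash and skip the download when they match, roughly:
// if sha256(localFile) == recordedHash { skip }. The pre-population below is
// what lets the test observe that skip.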
- p := filepath.Join(s.SharedDir(), "data", "compact", "working") + p := filepath.Join(e.SharedDir(), "data", "compact", "working") for _, id := range blocksWithHashes { m, err := block.DownloadMeta(ctx, logger, bkt, id) @@ -677,9 +682,9 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { } // We expect 2x 4-block compaction, 2-block vertical compaction, 2x 3-block compaction. - c, err := e2ethanos.NewCompactor(s.SharedDir(), "working", svcConfig, nil, extArgs...) + c, err := e2ethanos.NewCompactor(e, "working", svcConfig, nil, extArgs...) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(c)) + testutil.Ok(t, e2e.StartAndWaitReady(c)) // NOTE: We cannot assert on intermediate `thanos_blocks_meta_` metrics as those are gauge and change dynamically due to many // compaction groups. Wait for at least first compaction iteration (next is in 5m). @@ -706,9 +711,9 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(0), "thanos_compact_halted")) - bucketMatcher, err := labels.NewMatcher(labels.MatchEqual, "bucket", bucket) + bucketMatcher, err := matchers.NewMatcher(matchers.MatchEqual, "bucket", bucket) testutil.Ok(t, err) - operationMatcher, err := labels.NewMatcher(labels.MatchEqual, "operation", "get") + operationMatcher, err := matchers.NewMatcher(matchers.MatchEqual, "operation", "get") testutil.Ok(t, err) testutil.Ok(t, c.WaitSumMetricsWithOptions(e2e.Equals(478), []string{"thanos_objstore_bucket_operations_total"}, e2e.WithLabelMatchers( @@ -718,13 +723,13 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { ) // Make sure compactor does not modify anything else over time. - testutil.Ok(t, s.Stop(c)) + testutil.Ok(t, c.Stop()) ctx, cancel = context.WithTimeout(context.Background(), 3*time.Minute) t.Cleanup(cancel) // Check if query detects new blocks. - queryAndAssert(t, ctx, q.HTTPEndpoint(), + queryAndAssert(t, ctx, q.Endpoint("http"), fmt.Sprintf(`count_over_time({a="1"}[13h] offset %ds)`, int64(time.Since(now.Add(12*time.Hour)).Seconds())), promclient.QueryOptions{ Deduplicate: false, // This should be false, so that we can be sure deduplication was offline. @@ -742,9 +747,9 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { if penaltyDedup { extArgs = append(extArgs, "--deduplication.func=penalty") } - c, err := e2ethanos.NewCompactor(s.SharedDir(), "working", svcConfig, nil, extArgs...) + c, err := e2ethanos.NewCompactor(e, "working-dedup", svcConfig, nil, extArgs...) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(c)) + testutil.Ok(t, e2e.StartAndWaitReady(c)) // NOTE: We cannot assert on intermediate `thanos_blocks_meta_` metrics as those are gauge and change dynamically due to many // compaction groups. Wait for at least first compaction iteration (next is in 5m). @@ -767,13 +772,13 @@ func testCompactWithStoreGateway(t *testing.T, penaltyDedup bool) { testutil.Ok(t, c.WaitSumMetrics(e2e.Equals(0), "thanos_compact_halted")) // Make sure compactor does not modify anything else over time. - testutil.Ok(t, s.Stop(c)) + testutil.Ok(t, c.Stop()) ctx, cancel = context.WithTimeout(context.Background(), 3*time.Minute) t.Cleanup(cancel) // Check if query detects new blocks. 
- queryAndAssert(t, ctx, q.HTTPEndpoint(), + queryAndAssert(t, ctx, q.Endpoint("http"), fmt.Sprintf(`count_over_time({a="1"}[13h] offset %ds)`, int64(time.Since(now.Add(12*time.Hour)).Seconds())), promclient.QueryOptions{ Deduplicate: false, // This should be false, so that we can be sure deduplication was offline. diff --git a/test/e2e/compatibility_test.go b/test/e2e/compatibility_test.go new file mode 100644 index 00000000000..0ccdc685971 --- /dev/null +++ b/test/e2e/compatibility_test.go @@ -0,0 +1,107 @@ +// Copyright (c) The Thanos Authors. +// Licensed under the Apache License 2.0. + +package e2e_test + +import ( + "fmt" + "io/ioutil" + "os" + "path/filepath" + "testing" + "time" + + "github.com/efficientgo/e2e" + e2edb "github.com/efficientgo/e2e/db" + "github.com/thanos-io/thanos/pkg/testutil" + "github.com/thanos-io/thanos/test/e2e/e2ethanos" +) + +// Test requires at least ~11m, so run this with `-test.timeout 9999m`. +func TestPromQLCompliance(t *testing.T) { + t.Skip("This is an interactive test; it requires time to build up (scrape) the data. The data is also obtained from the remote promlabs.com demo servers.") + + e, err := e2e.NewDockerEnvironment("compatibility") + testutil.Ok(t, err) + t.Cleanup(e.Close) + + // Start separate receive + Querier. + receiverRunnable, err := e2ethanos.NewIngestingReceiver(e, "receive") + testutil.Ok(t, err) + queryReceive := e2edb.NewThanosQuerier(e, "query_receive", []string{receiverRunnable.InternalEndpoint("grpc")}) + testutil.Ok(t, e2e.StartAndWaitReady(receiverRunnable, queryReceive)) + + // Start reference Prometheus. + prom := e2edb.NewPrometheus(e, "prom") + testutil.Ok(t, prom.SetConfig(` +global: + scrape_interval: 5s + evaluation_interval: 5s + external_labels: + prometheus: 1 + +remote_write: + - url: "`+e2ethanos.RemoteWriteEndpoint(receiverRunnable.InternalEndpoint("remote-write"))+`" + +scrape_configs: +- job_name: 'demo' + static_configs: + - targets: + - 'demo.promlabs.com:10000' + - 'demo.promlabs.com:10001' + - 'demo.promlabs.com:10002' +`, )) + testutil.Ok(t, e2e.StartAndWaitReady(prom)) + + // Start separate sidecar + Querier. + sidecar := e2edb.NewThanosSidecar(e, "sidecar", prom) + querySidecar := e2edb.NewThanosQuerier(e, "query_sidecar", []string{sidecar.InternalEndpoint("grpc")}) + testutil.Ok(t, e2e.StartAndWaitReady(sidecar, querySidecar)) + + // Start noop promql-compliance-tester. See https://github.com/prometheus/compliance/tree/main/promql for how to build the local docker image. + compliance := e.Runnable("promql-compliance-tester").Init(e2e.StartOptions{ + Image: "promql-compliance-tester:latest", + Command: e2e.NewCommandWithoutEntrypoint("tail", "-f", "/dev/null"), + }) + testutil.Ok(t, e2e.StartAndWaitReady(compliance)) + + // Wait 10 minutes for Prometheus to scrape relevant data.
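// (Aside.) A fixed 10-minute sleep keeps the test simple; a sketch of an
// alternative would be to poll the reference Prometheus until all three demo
// targets report up (for example, until a query for count(up == 1) returns 3),
// so the test waits no longer than needed.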
+ time.Sleep(10 * time.Minute) + + t.Run("receive", func(t *testing.T) { + testutil.Ok(t, ioutil.WriteFile(filepath.Join(compliance.Dir(), "receive.yaml"), + []byte(promLabelsPromQLConfig(prom, queryReceive, []string{"prometheus", "receive", "tenant_id"})), os.ModePerm)) + + stdout, stderr, err := compliance.Exec(e2e.NewCommand("-config-file", filepath.Join(compliance.InternalDir(), "receive.yaml"))) + testutil.Ok(t, err) + fmt.Println(stdout, stderr) + }) + t.Run("sidecar", func(t *testing.T) { + testutil.Ok(t, ioutil.WriteFile(filepath.Join(compliance.Dir(), "sidecar.yaml"), + []byte(promLabelsPromQLConfig(prom, querySidecar, []string{"prometheus"})), os.ModePerm)) + + stdout, stderr, err := compliance.Exec(e2e.NewCommand("-config-file", filepath.Join(compliance.InternalDir(), "sidecar.yaml"))) + testutil.Ok(t, err) + fmt.Println(stdout, stderr) + }) +} + +func promLabelsPromQLConfig(reference *e2edb.Prometheus, target e2e.Runnable, dropLabels []string) string { + return `reference_target_config: + query_url: '` + reference.InternalEndpoint("http") + `' + +test_target_config: + query_url: '` + target.InternalEndpoint("http") + `' + +query_tweaks: + - note: 'Thanos requires adding "external_labels" to distinguish Prometheus servers, leading to extra labels in query results that need to be stripped before comparing results.' + no_bug: true + drop_result_labels: +` + func() (ret string) { + for _, l := range dropLabels { + ret += ` - ` + l + } + return ret + }() +} diff --git a/test/e2e/e2ethanos/helpers.go b/test/e2e/e2ethanos/helpers.go index 0bcd21c9128..22ef3d81415 100644 --- a/test/e2e/e2ethanos/helpers.go +++ b/test/e2e/e2ethanos/helpers.go @@ -11,16 +11,16 @@ import ( "strings" "testing" - "github.com/cortexproject/cortex/integration/e2e" + "github.com/efficientgo/e2e" "github.com/thanos-io/thanos/pkg/testutil" ) -func CleanScenario(t testing.TB, s *e2e.Scenario) func() { +func CleanScenario(t testing.TB, e *e2e.DockerEnvironment) func() { return func() { // Make sure Clean can properly delete everything. 
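// (Aside.) Containers may have written files into the shared directory as a
// different user (often root), in which case the host-side removal in Close
// would fail with permission errors; hence the recursive chmod first.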
- testutil.Ok(t, exec.Command("chmod", "-R", "777", s.SharedDir()).Run()) - s.Close() + testutil.Ok(t, exec.Command("chmod", "-R", "777", e.SharedDir()).Run()) + e.Close() } } diff --git a/test/e2e/e2ethanos/service.go b/test/e2e/e2ethanos/service.go index 57f5dea453d..2a8d0bd31f8 100644 --- a/test/e2e/e2ethanos/service.go +++ b/test/e2e/e2ethanos/service.go @@ -4,35 +4,74 @@ package e2ethanos import ( - "github.com/cortexproject/cortex/integration/e2e" -) + "os" + "strconv" -type Service struct { - *e2e.HTTPService + "github.com/efficientgo/e2e" +) - grpc int +type Port struct { + Name string + PortNum int + IsMetrics bool } func NewService( + e e2e.Environment, name string, image string, - command *e2e.Command, + command e2e.Command, readiness *e2e.HTTPReadinessProbe, http, grpc int, - otherPorts ...int, -) *Service { - return &Service{ - HTTPService: e2e.NewHTTPService(name, image, command, readiness, http, append(otherPorts, grpc)...), - grpc: grpc, - } + otherPorts ...Port, +) *e2e.InstrumentedRunnable { + return newUninitiatedService(e, name, http, grpc, otherPorts...).Init( + e2e.StartOptions{ + Image: image, + Command: command, + Readiness: readiness, + User: strconv.Itoa(os.Getuid()), + WaitReadyBackoff: &defaultBackoffConfig, + }, + ) } -func (s *Service) GRPCEndpoint() string { return s.Endpoint(s.grpc) } +func newUninitiatedService( + e e2e.Environment, + name string, + http, grpc int, + otherPorts ...Port, +) *e2e.FutureInstrumentedRunnable { + metricsPorts := "http" + ports := map[string]int{ + "http": http, + "grpc": grpc, + } -func (s *Service) GRPCNetworkEndpoint() string { - return s.NetworkEndpoint(s.grpc) + for _, op := range otherPorts { + ports[op.Name] = op.PortNum + + if op.IsMetrics { + metricsPorts = op.Name + } + } + + return e2e.NewInstrumentedRunnable(e, name, ports, metricsPorts) } -func (s *Service) GRPCNetworkEndpointFor(networkName string) string { - return s.NetworkEndpointFor(networkName, s.grpc) +func initiateService( + service *e2e.FutureInstrumentedRunnable, + image string, + command e2e.Command, + readiness *e2e.HTTPReadinessProbe, +) *e2e.InstrumentedRunnable { + return service.Init( + e2e.StartOptions{ + Image: image, + Command: command, + Readiness: readiness, + User: strconv.Itoa(os.Getuid()), + WaitReadyBackoff: &defaultBackoffConfig, + }, + ) } diff --git a/test/e2e/e2ethanos/services.go b/test/e2e/e2ethanos/services.go index c15c150cbb6..b0d9b1b3ad0 100644 --- a/test/e2e/e2ethanos/services.go +++ b/test/e2e/e2ethanos/services.go @@ -13,27 +13,31 @@ import ( "strings" "time" - "github.com/cortexproject/cortex/integration/e2e" - "github.com/grafana/dskit/backoff" + "github.com/efficientgo/e2e" + e2edb "github.com/efficientgo/e2e/db" + "github.com/efficientgo/tools/core/pkg/backoff" "github.com/pkg/errors" "github.com/prometheus/common/model" "github.com/prometheus/prometheus/discovery/targetgroup" "github.com/prometheus/prometheus/pkg/relabel" + "github.com/thanos-io/thanos/pkg/httpconfig" "gopkg.in/yaml.v2" "github.com/thanos-io/thanos/pkg/alert" "github.com/thanos-io/thanos/pkg/objstore/client" - "github.com/thanos-io/thanos/pkg/query" "github.com/thanos-io/thanos/pkg/queryfrontend" "github.com/thanos-io/thanos/pkg/receive" ) -const infoLogLevel = "info" +const ( + infoLogLevel = "info" + ContainerSharedDir = "/shared" +) // Same as default for now. 
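// (efficientgo/e2e names the backoff fields Min/Max instead of cortex's MinBackoff/MaxBackoff; the values themselves are unchanged.)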
var defaultBackoffConfig = backoff.Config{ - MinBackoff: 300 * time.Millisecond, - MaxBackoff: 600 * time.Millisecond, + Min: 300 * time.Millisecond, + Max: 600 * time.Millisecond, MaxRetries: 50, } @@ -60,9 +64,9 @@ func DefaultImage() string { return "thanos" } -func NewPrometheus(sharedDir, name, config, promImage string, enableFeatures ...string) (*e2e.HTTPService, string, error) { - dir := filepath.Join(sharedDir, "data", "prometheus", name) - container := filepath.Join(e2e.ContainerSharedDir, "data", "prometheus", name) +func NewPrometheus(e e2e.Environment, name, config, promImage string, enableFeatures ...string) (*e2e.InstrumentedRunnable, string, error) { + dir := filepath.Join(e.SharedDir(), "data", "prometheus", name) + container := filepath.Join(ContainerSharedDir, "data", "prometheus", name) if err := os.MkdirAll(dir, 0750); err != nil { return nil, "", errors.Wrap(err, "create prometheus dir") } @@ -82,31 +86,35 @@ func NewPrometheus(sharedDir, name, config, promImage string, enableFeatures ... if len(enableFeatures) > 0 { args = append(args, fmt.Sprintf("--enable-feature=%s", strings.Join(enableFeatures, ","))) } - prom := e2e.NewHTTPService( + prom := e2e.NewInstrumentedRunnable( + e, fmt.Sprintf("prometheus-%s", name), - promImage, - e2e.NewCommandWithoutEntrypoint("prometheus", args...), - e2e.NewHTTPReadinessProbe(9090, "/-/ready", 200, 200), - 9090, + map[string]int{"http": 9090}, + "http").Init( + e2e.StartOptions{ + Image: promImage, + Command: e2e.NewCommandWithoutEntrypoint("prometheus", args...), + Readiness: e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), + User: strconv.Itoa(os.Getuid()), + WaitReadyBackoff: &defaultBackoffConfig, + }, ) - prom.SetUser(strconv.Itoa(os.Getuid())) - prom.SetBackoff(defaultBackoffConfig) return prom, container, nil } -func NewPrometheusWithSidecar(sharedDir, netName, name, config, promImage string, enableFeatures ...string) (*e2e.HTTPService, *Service, error) { - return NewPrometheusWithSidecarCustomImage(sharedDir, netName, name, config, promImage, DefaultImage(), enableFeatures...) +func NewPrometheusWithSidecar(e e2e.Environment, name, config, promImage string, enableFeatures ...string) (*e2e.InstrumentedRunnable, *e2e.InstrumentedRunnable, error) { + return NewPrometheusWithSidecarCustomImage(e, name, config, promImage, DefaultImage(), enableFeatures...) } -func NewPrometheusWithSidecarCustomImage(sharedDir, netName, name, config, promImage string, sidecarImage string, enableFeatures ...string) (*e2e.HTTPService, *Service, error) { - prom, dataDir, err := NewPrometheus(sharedDir, name, config, promImage, enableFeatures...) +func NewPrometheusWithSidecarCustomImage(e e2e.Environment, name, config, promImage string, sidecarImage string, enableFeatures ...string) (*e2e.InstrumentedRunnable, *e2e.InstrumentedRunnable, error) { + prom, dataDir, err := NewPrometheus(e, name, config, promImage, enableFeatures...) 
if err != nil { return nil, nil, err } - prom.SetBackoff(defaultBackoffConfig) sidecar := NewService( + e, fmt.Sprintf("sidecar-%s", name), sidecarImage, e2e.NewCommand("sidecar", e2e.BuildArgs(map[string]string{ @@ -114,21 +122,20 @@ func NewPrometheusWithSidecarCustomImage(sharedDir, netName, name, config, promI "--grpc-address": ":9091", "--grpc-grace-period": "0s", "--http-address": ":8080", - "--prometheus.url": "http://" + prom.NetworkEndpointFor(netName, 9090), + "--prometheus.url": "http://" + prom.InternalEndpoint("http"), "--tsdb.path": dataDir, "--log.level": infoLogLevel, })...), - e2e.NewHTTPReadinessProbe(8080, "/-/ready", 200, 200), + e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), 8080, 9091, ) - sidecar.SetUser(strconv.Itoa(os.Getuid())) - sidecar.SetBackoff(defaultBackoffConfig) return prom, sidecar, nil } type QuerierBuilder struct { + environment e2e.Environment sharedDir string name string routePrefix string @@ -145,9 +152,10 @@ type QuerierBuilder struct { tracingConfig string } -func NewQuerierBuilder(sharedDir, name string, storeAddresses ...string) *QuerierBuilder { +func NewQuerierBuilder(e e2e.Environment, name string, storeAddresses ...string) *QuerierBuilder { return &QuerierBuilder{ - sharedDir: sharedDir, + environment: e, + sharedDir: e.SharedDir(), name: name, storeAddresses: storeAddresses, image: DefaultImage(), @@ -199,7 +207,52 @@ func (q *QuerierBuilder) WithTracingConfig(tracingConfig string) *QuerierBuilder return q } -func (q *QuerierBuilder) Build() (*Service, error) { +func (q *QuerierBuilder) BuildUninitiated() *e2e.FutureInstrumentedRunnable { + return newUninitiatedService( + q.environment, + fmt.Sprintf("querier-%v", q.name), + 8080, + 9091, + ) +} + +func (q *QuerierBuilder) Initiate(service *e2e.FutureInstrumentedRunnable, storeAddresses ...string) (*e2e.InstrumentedRunnable, error) { + q.storeAddresses = storeAddresses + args, err := q.collectArgs() + if err != nil { + return nil, err + } + + querier := initiateService( + service, + q.image, + e2e.NewCommand("query", args...), + e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), + ) + + return querier, nil +} + +func (q *QuerierBuilder) Build() (*e2e.InstrumentedRunnable, error) { + args, err := q.collectArgs() + if err != nil { + return nil, err + } + + querier := NewService( + q.environment, + fmt.Sprintf("querier-%v", q.name), + q.image, + e2e.NewCommand("query", args...), + e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), + 8080, + 9091, + ) + + return querier, nil +} + +func (q *QuerierBuilder) collectArgs() ([]string, error) { const replicaLabel = "replica" args := e2e.BuildArgs(map[string]string{ @@ -235,7 +288,7 @@ func (q *QuerierBuilder) Build() (*Service, error) { if len(q.fileSDStoreAddresses) > 0 { queryFileSDDir := filepath.Join(q.sharedDir, "data", "querier", q.name) - container := filepath.Join(e2e.ContainerSharedDir, "data", "querier", q.name) + container := filepath.Join(ContainerSharedDir, "data", "querier", q.name) if err := os.MkdirAll(queryFileSDDir, 0750); err != nil { return nil, errors.Wrap(err, "create query dir failed") } @@ -269,33 +322,31 @@ func (q *QuerierBuilder) Build() (*Service, error) { args = append(args, "--tracing.config="+q.tracingConfig) } - querier := NewService( - fmt.Sprintf("querier-%v", q.name), - q.image, - e2e.NewCommand("query", args...), - e2e.NewHTTPReadinessProbe(8080, "/-/ready", 200, 200), - 8080, - 9091, - ) - querier.SetUser(strconv.Itoa(os.Getuid())) - querier.SetBackoff(defaultBackoffConfig) - - return querier, 
nil + return args, nil } func RemoteWriteEndpoint(addr string) string { return fmt.Sprintf("http://%s/api/v1/receive", addr) } -// NewRoutingAndIngestingReceiver creates a Thanos Receive instances that is configured both for ingesting samples and routing samples to other receivers. -func NewRoutingAndIngestingReceiver(sharedDir, networkName, name string, replicationFactor int, hashring ...receive.HashringConfig) (*Service, error) { +// NewUninitiatedReceiver returns a future receiver that can be initiated. It is useful +// for obtaining a receiver address for the hashring before the receiver is started. +func NewUninitiatedReceiver(e e2e.Environment, name string) *e2e.FutureInstrumentedRunnable { + return newUninitiatedService(e, fmt.Sprintf("receive-%v", name), 8080, 9091, Port{Name: "remote-write", PortNum: 8081}) +} - localEndpoint := NewService(fmt.Sprintf("receive-%v", name), "", e2e.NewCommand("", ""), nil, 8080, 9091, 8081).GRPCNetworkEndpointFor(networkName) +// NewRoutingAndIngestingReceiverFromService creates a Thanos Receive instance from an uninitiated service. +// It is configured both for ingesting samples and routing samples to other receivers. +func NewRoutingAndIngestingReceiverFromService(service *e2e.FutureInstrumentedRunnable, sharedDir string, replicationFactor int, hashring ...receive.HashringConfig) (*e2e.InstrumentedRunnable, error) { + var localEndpoint string if len(hashring) == 0 { + localEndpoint = "0.0.0.0:9091" hashring = []receive.HashringConfig{{Endpoints: []string{localEndpoint}}} + } else { + localEndpoint = service.InternalEndpoint("grpc") } - dir := filepath.Join(sharedDir, "data", "receive", name) + dir := filepath.Join(sharedDir, "data", "receive", service.Name()) dataDir := filepath.Join(dir, "data") - container := filepath.Join(e2e.ContainerSharedDir, "data", "receive", name) + container := filepath.Join(ContainerSharedDir, "data", "receive", service.Name()) if err := os.MkdirAll(dataDir, 0750); err != nil { return nil, errors.Wrap(err, "create receive dir") } @@ -304,44 +355,87 @@ return nil, errors.Wrapf(err, "generate hashring file: %v", hashring) } - receiver := NewService( - fmt.Sprintf("receive-%v", name), + receiver := initiateService( + service, DefaultImage(), // TODO(bwplotka): BuildArgs should be interface.
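// The flag set below must stay in line with the ports registered in NewUninitiatedReceiver: http :8080, grpc :9091, remote-write :8081.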
e2e.NewCommand("receive", e2e.BuildArgs(map[string]string{ - "--debug.name": fmt.Sprintf("receive-%v", name), + "--debug.name": service.Name(), "--grpc-address": ":9091", "--grpc-grace-period": "0s", "--http-address": ":8080", "--remote-write.address": ":8081", - "--label": fmt.Sprintf(`receive="%s"`, name), + "--label": fmt.Sprintf(`receive="%s"`, service.Name()), "--tsdb.path": filepath.Join(container, "data"), "--log.level": infoLogLevel, "--receive.replication-factor": strconv.Itoa(replicationFactor), "--receive.local-endpoint": localEndpoint, "--receive.hashrings": string(b), })...), - e2e.NewHTTPReadinessProbe(8080, "/-/ready", 200, 200), - 8080, - 9091, - 8081, + e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), + ) + + return receiver, nil +} + +func NewRoutingAndIngestingReceiverWithConfigWatcher(service *e2e.FutureInstrumentedRunnable, sharedDir string, replicationFactor int, hashring ...receive.HashringConfig) (*e2e.InstrumentedRunnable, error) { + var localEndpoint string + if len(hashring) == 0 { + localEndpoint = "0.0.0.0:9091" + hashring = []receive.HashringConfig{{Endpoints: []string{localEndpoint}}} + } else { + localEndpoint = service.InternalEndpoint("grpc") + } + + dir := filepath.Join(sharedDir, "data", "receive", service.Name()) + dataDir := filepath.Join(dir, "data") + container := filepath.Join(ContainerSharedDir, "data", "receive", service.Name()) + if err := os.MkdirAll(dataDir, 0750); err != nil { + return nil, errors.Wrap(err, "create receive dir") + } + b, err := json.Marshal(hashring) + if err != nil { + return nil, errors.Wrapf(err, "generate hashring file: %v", hashring) + } + + if err := ioutil.WriteFile(filepath.Join(dir, "hashrings.json"), b, 0600); err != nil { + return nil, errors.Wrap(err, "creating receive config") + } + + receiver := initiateService( + service, + DefaultImage(), + // TODO(bwplotka): BuildArgs should be interface. + e2e.NewCommand("receive", e2e.BuildArgs(map[string]string{ + "--debug.name": service.Name(), + "--grpc-address": ":9091", + "--grpc-grace-period": "0s", + "--http-address": ":8080", + "--remote-write.address": ":8081", + "--label": fmt.Sprintf(`receive="%s"`, service.Name()), + "--tsdb.path": filepath.Join(container, "data"), + "--log.level": infoLogLevel, + "--receive.replication-factor": strconv.Itoa(replicationFactor), + "--receive.local-endpoint": localEndpoint, + "--receive.hashrings-file": filepath.Join(container, "hashrings.json"), + "--receive.hashrings-file-refresh-interval": "5s", + })...), + e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), ) - receiver.SetUser(strconv.Itoa(os.Getuid())) - receiver.SetBackoff(defaultBackoffConfig) return receiver, nil } // NewRoutingReceiver creates a Thanos Receive instance that is only configured to route to other receive instances. It has no local storage. 
-func NewRoutingReceiver(sharedDir, name string, replicationFactor int, hashring ...receive.HashringConfig) (*Service, error) { +func NewRoutingReceiver(e e2e.Environment, name string, replicationFactor int, hashring ...receive.HashringConfig) (*e2e.InstrumentedRunnable, error) { if len(hashring) == 0 { return nil, errors.New("hashring should not be empty for receive-distributor mode") } - dir := filepath.Join(sharedDir, "data", "receive", name) + dir := filepath.Join(e.SharedDir(), "data", "receive", name) dataDir := filepath.Join(dir, "data") - container := filepath.Join(e2e.ContainerSharedDir, "data", "receive", name) + container := filepath.Join(ContainerSharedDir, "data", "receive", name) if err := os.MkdirAll(dataDir, 0750); err != nil { return nil, errors.Wrap(err, "create receive dir") } @@ -351,6 +445,7 @@ func NewRoutingReceiver(sharedDir, name string, replicationFactor int, hashring } receiver := NewService( + e, fmt.Sprintf("receive-%v", name), DefaultImage(), // TODO(bwplotka): BuildArgs should be interface. @@ -366,26 +461,24 @@ func NewRoutingReceiver(sharedDir, name string, replicationFactor int, hashring "--receive.replication-factor": strconv.Itoa(replicationFactor), "--receive.hashrings": string(b), })...), - e2e.NewHTTPReadinessProbe(8080, "/-/ready", 200, 200), + e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), 8080, 9091, - 8081, + Port{Name: "remote-write", PortNum: 8081}, ) - receiver.SetUser(strconv.Itoa(os.Getuid())) - receiver.SetBackoff(defaultBackoffConfig) return receiver, nil } // NewIngestingReceiver creates a Thanos Receive instance that is only configured to ingest, not route to other receivers. -func NewIngestingReceiver(sharedDir, name string) (*Service, error) { - dir := filepath.Join(sharedDir, "data", "receive", name) +func NewIngestingReceiver(e e2e.Environment, name string) (*e2e.InstrumentedRunnable, error) { + dir := filepath.Join(e.SharedDir(), "data", "receive", name) dataDir := filepath.Join(dir, "data") - container := filepath.Join(e2e.ContainerSharedDir, "data", "receive", name) + container := filepath.Join(ContainerSharedDir, "data", "receive", name) if err := os.MkdirAll(dataDir, 0750); err != nil { return nil, errors.Wrap(err, "create receive dir") } - receiver := NewService( + receiver := NewService(e, fmt.Sprintf("receive-%v", name), DefaultImage(), // TODO(bwplotka): BuildArgs should be interface. 
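// Note that the ingesting receiver still exposes the remote-write port (8081, declared below) so tests can push samples to it directly.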
@@ -399,70 +492,18 @@ func NewIngestingReceiver(sharedDir, name string) (*Service, error) { "--tsdb.path": filepath.Join(container, "data"), "--log.level": infoLogLevel, })...), - e2e.NewHTTPReadinessProbe(8080, "/-/ready", 200, 200), - 8080, - 9091, - 8081, - ) - receiver.SetUser(strconv.Itoa(os.Getuid())) - receiver.SetBackoff(defaultBackoffConfig) - - return receiver, nil -} - -func NewRoutingAndIngestingReceiverWithConfigWatcher(sharedDir, networkName, name string, replicationFactor int, hashring ...receive.HashringConfig) (*Service, error) { - localEndpoint := NewService(fmt.Sprintf("receive-%v", name), "", e2e.NewCommand("", ""), nil, 8080, 9091, 8081).GRPCNetworkEndpointFor(networkName) - if len(hashring) == 0 { - hashring = []receive.HashringConfig{{Endpoints: []string{localEndpoint}}} - } - - dir := filepath.Join(sharedDir, "data", "receive", name) - dataDir := filepath.Join(dir, "data") - container := filepath.Join(e2e.ContainerSharedDir, "data", "receive", name) - if err := os.MkdirAll(dataDir, 0750); err != nil { - return nil, errors.Wrap(err, "create receive dir") - } - b, err := json.Marshal(hashring) - if err != nil { - return nil, errors.Wrapf(err, "generate hashring file: %v", hashring) - } - - if err := ioutil.WriteFile(filepath.Join(dir, "hashrings.json"), b, 0600); err != nil { - return nil, errors.Wrap(err, "creating receive config") - } - - receiver := NewService( - fmt.Sprintf("receive-%v", name), - DefaultImage(), - // TODO(bwplotka): BuildArgs should be interface. - e2e.NewCommand("receive", e2e.BuildArgs(map[string]string{ - "--debug.name": fmt.Sprintf("receive-%v", name), - "--grpc-address": ":9091", - "--grpc-grace-period": "0s", - "--http-address": ":8080", - "--remote-write.address": ":8081", - "--label": fmt.Sprintf(`receive="%s"`, name), - "--tsdb.path": filepath.Join(container, "data"), - "--log.level": infoLogLevel, - "--receive.replication-factor": strconv.Itoa(replicationFactor), - "--receive.local-endpoint": localEndpoint, - "--receive.hashrings-file": filepath.Join(container, "hashrings.json"), - "--receive.hashrings-file-refresh-interval": "5s", - })...), - e2e.NewHTTPReadinessProbe(8080, "/-/ready", 200, 200), + e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), 8080, 9091, - 8081, + Port{Name: "remote-write", PortNum: 8081}, ) - receiver.SetUser(strconv.Itoa(os.Getuid())) - receiver.SetBackoff(defaultBackoffConfig) return receiver, nil } -func NewRuler(sharedDir, name, ruleSubDir string, amCfg []alert.AlertmanagerConfig, queryCfg []query.Config) (*Service, error) { - dir := filepath.Join(sharedDir, "data", "rule", name) - container := filepath.Join(e2e.ContainerSharedDir, "data", "rule", name) +func NewRuler(e e2e.Environment, name, ruleSubDir string, amCfg []alert.AlertmanagerConfig, queryCfg []httpconfig.Config) (*e2e.InstrumentedRunnable, error) { + dir := filepath.Join(e.SharedDir(), "data", "rule", name) + container := filepath.Join(ContainerSharedDir, "data", "rule", name) if err := os.MkdirAll(dir, 0750); err != nil { return nil, errors.Wrap(err, "create rule dir") } @@ -479,7 +520,7 @@ func NewRuler(sharedDir, name, ruleSubDir string, amCfg []alert.AlertmanagerConf return nil, errors.Wrapf(err, "generate query file: %v", queryCfg) } - ruler := NewService( + ruler := NewService(e, fmt.Sprintf("rule-%v", name), DefaultImage(), e2e.NewCommand("rule", e2e.BuildArgs(map[string]string{ @@ -489,8 +530,8 @@ func NewRuler(sharedDir, name, ruleSubDir string, amCfg []alert.AlertmanagerConf "--http-address": ":8080", "--label": 
fmt.Sprintf(`replica="%s"`, name), "--data-dir": container, - "--rule-file": filepath.Join(e2e.ContainerSharedDir, ruleSubDir, "*.yaml"), - "--eval-interval": "3s", + "--rule-file": filepath.Join(ContainerSharedDir, ruleSubDir, "*.yaml"), + "--eval-interval": "1s", "--alertmanagers.config": string(amCfgBytes), "--alertmanagers.sd-dns-interval": "1s", "--log.level": infoLogLevel, @@ -498,19 +539,17 @@ func NewRuler(sharedDir, name, ruleSubDir string, amCfg []alert.AlertmanagerConf "--query.sd-dns-interval": "1s", "--resend-delay": "5s", })...), - e2e.NewHTTPReadinessProbe(8080, "/-/ready", 200, 200), + e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), 8080, 9091, ) - ruler.SetUser(strconv.Itoa(os.Getuid())) - ruler.SetBackoff(defaultBackoffConfig) return ruler, nil } -func NewAlertmanager(sharedDir, name string) (*e2e.HTTPService, error) { - dir := filepath.Join(sharedDir, "data", "am", name) - container := filepath.Join(e2e.ContainerSharedDir, "data", "am", name) +func NewAlertmanager(e e2e.Environment, name string) (*e2e.InstrumentedRunnable, error) { + dir := filepath.Join(e.SharedDir(), "data", "am", name) + container := filepath.Join(ContainerSharedDir, "data", "am", name) if err := os.MkdirAll(dir, 0750); err != nil { return nil, errors.Wrap(err, "create am dir") } @@ -527,29 +566,30 @@ receivers: return nil, errors.Wrap(err, "creating alertmanager config file failed") } - s := e2e.NewHTTPService( - fmt.Sprintf("alertmanager-%v", name), - DefaultAlertmanagerImage(), - e2e.NewCommandWithoutEntrypoint("/bin/alertmanager", e2e.BuildArgs(map[string]string{ - "--config.file": filepath.Join(container, "config.yaml"), - "--web.listen-address": "0.0.0.0:8080", - "--log.level": infoLogLevel, - "--storage.path": container, - "--web.get-concurrency": "1", - "--web.timeout": "2m", - })...), - e2e.NewHTTPReadinessProbe(8080, "/-/ready", 200, 200), - 8080, + s := e2e.NewInstrumentedRunnable( + e, fmt.Sprintf("alertmanager-%v", name), map[string]int{"http": 8080}, "http").Init( + e2e.StartOptions{ + Image: DefaultAlertmanagerImage(), + Command: e2e.NewCommandWithoutEntrypoint("/bin/alertmanager", e2e.BuildArgs(map[string]string{ + "--config.file": filepath.Join(container, "config.yaml"), + "--web.listen-address": "0.0.0.0:8080", + "--log.level": infoLogLevel, + "--storage.path": container, + "--web.get-concurrency": "1", + "--web.timeout": "2m", + })...), + Readiness: e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), + User: strconv.Itoa(os.Geteuid()), + WaitReadyBackoff: &defaultBackoffConfig, + }, ) - s.SetUser(strconv.Itoa(os.Getuid())) - s.SetBackoff(defaultBackoffConfig) return s, nil } -func NewStoreGW(sharedDir, name string, bucketConfig client.BucketConfig, relabelConfig ...relabel.Config) (*Service, error) { - dir := filepath.Join(sharedDir, "data", "store", name) - container := filepath.Join(e2e.ContainerSharedDir, "data", "store", name) +func NewStoreGW(e e2e.Environment, name string, bucketConfig client.BucketConfig, cacheConfig string, relabelConfig ...relabel.Config) (*e2e.InstrumentedRunnable, error) { + dir := filepath.Join(e.SharedDir(), "data", "store", name) + container := filepath.Join(ContainerSharedDir, "data", "store", name) if err := os.MkdirAll(dir, 0750); err != nil { return nil, errors.Wrap(err, "create store dir") } @@ -564,37 +604,42 @@ func NewStoreGW(sharedDir, name string, bucketConfig client.BucketConfig, relabe return nil, errors.Wrapf(err, "generate store relabel file: %v", relabelConfig) } + args := e2e.BuildArgs(map[string]string{ + "--debug.name": 
fmt.Sprintf("store-gw-%v", name), + "--grpc-address": ":9091", + "--grpc-grace-period": "0s", + "--http-address": ":8080", + "--log.level": infoLogLevel, + "--data-dir": container, + "--objstore.config": string(bktConfigBytes), + // Accelerated sync time for quicker test (3m by default). + "--sync-block-duration": "3s", + "--block-sync-concurrency": "1", + "--store.grpc.series-max-concurrency": "1", + "--selector.relabel-config": string(relabelConfigBytes), + "--consistency-delay": "30m", + }) + + if cacheConfig != "" { + args = append(args, "--store.caching-bucket.config", cacheConfig) + } + store := NewService( + e, fmt.Sprintf("store-gw-%v", name), DefaultImage(), - e2e.NewCommand("store", e2e.BuildArgs(map[string]string{ - "--debug.name": fmt.Sprintf("store-gw-%v", name), - "--grpc-address": ":9091", - "--grpc-grace-period": "0s", - "--http-address": ":8080", - "--log.level": infoLogLevel, - "--data-dir": container, - "--objstore.config": string(bktConfigBytes), - // Accelerated sync time for quicker test (3m by default). - "--sync-block-duration": "3s", - "--block-sync-concurrency": "1", - "--store.grpc.series-max-concurrency": "1", - "--selector.relabel-config": string(relabelConfigBytes), - "--consistency-delay": "30m", - })...), - e2e.NewHTTPReadinessProbe(8080, "/-/ready", 200, 200), + e2e.NewCommand("store", args...), + e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), 8080, 9091, ) - store.SetUser(strconv.Itoa(os.Getuid())) - store.SetBackoff(defaultBackoffConfig) return store, nil } -func NewCompactor(sharedDir, name string, bucketConfig client.BucketConfig, relabelConfig []relabel.Config, extArgs ...string) (*e2e.HTTPService, error) { - dir := filepath.Join(sharedDir, "data", "compact", name) - container := filepath.Join(e2e.ContainerSharedDir, "data", "compact", name) +func NewCompactor(e e2e.Environment, name string, bucketConfig client.BucketConfig, relabelConfig []relabel.Config, extArgs ...string) (*e2e.InstrumentedRunnable, error) { + dir := filepath.Join(e.SharedDir(), "data", "compact", name) + container := filepath.Join(ContainerSharedDir, "data", "compact", name) if err := os.MkdirAll(dir, 0750); err != nil { return nil, errors.Wrap(err, "create compact dir") @@ -610,29 +655,30 @@ func NewCompactor(sharedDir, name string, bucketConfig client.BucketConfig, rela return nil, errors.Wrapf(err, "generate compact relabel file: %v", relabelConfig) } - compactor := e2e.NewHTTPService( - fmt.Sprintf("compact-%s", name), - DefaultImage(), - e2e.NewCommand("compact", append(e2e.BuildArgs(map[string]string{ - "--debug.name": fmt.Sprintf("compact-%s", name), - "--log.level": infoLogLevel, - "--data-dir": container, - "--objstore.config": string(bktConfigBytes), - "--http-address": ":8080", - "--block-sync-concurrency": "20", - "--selector.relabel-config": string(relabelConfigBytes), - "--wait": "", - }), extArgs...)...), - e2e.NewHTTPReadinessProbe(8080, "/-/ready", 200, 200), - 8080, + compactor := e2e.NewInstrumentedRunnable( + e, fmt.Sprintf("compact-%s", name), map[string]int{"http": 8080}, "http").Init( + e2e.StartOptions{ + Image: DefaultImage(), + Command: e2e.NewCommand("compact", append(e2e.BuildArgs(map[string]string{ + "--debug.name": fmt.Sprintf("compact-%s", name), + "--log.level": infoLogLevel, + "--data-dir": container, + "--objstore.config": string(bktConfigBytes), + "--http-address": ":8080", + "--block-sync-concurrency": "20", + "--selector.relabel-config": string(relabelConfigBytes), + "--wait": "", + }), extArgs...)...), + Readiness: 
e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), + User: strconv.Itoa(os.Getuid()), + WaitReadyBackoff: &defaultBackoffConfig, + }, ) - compactor.SetUser(strconv.Itoa(os.Getuid())) - compactor.SetBackoff(defaultBackoffConfig) return compactor, nil } -func NewQueryFrontend(name, downstreamURL string, cacheConfig queryfrontend.CacheProviderConfig) (*e2e.HTTPService, error) { +func NewQueryFrontend(e e2e.Environment, name, downstreamURL string, cacheConfig queryfrontend.CacheProviderConfig) (*e2e.InstrumentedRunnable, error) { cacheConfigBytes, err := yaml.Marshal(cacheConfig) if err != nil { return nil, errors.Wrapf(err, "marshal response cache config file: %v", cacheConfig) @@ -646,41 +692,118 @@ func NewQueryFrontend(name, downstreamURL string, cacheConfig queryfrontend.Cach "--query-range.response-cache-config": string(cacheConfigBytes), }) - queryFrontend := e2e.NewHTTPService( - fmt.Sprintf("query-frontend-%s", name), - DefaultImage(), - e2e.NewCommand("query-frontend", args...), - e2e.NewHTTPReadinessProbe(8080, "/-/ready", 200, 200), - 8080, + queryFrontend := e2e.NewInstrumentedRunnable( + e, fmt.Sprintf("query-frontend-%s", name), map[string]int{"http": 8080}, "http").Init( + e2e.StartOptions{ + Image: DefaultImage(), + Command: e2e.NewCommand("query-frontend", args...), + Readiness: e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), + User: strconv.Itoa(os.Getuid()), + WaitReadyBackoff: &defaultBackoffConfig, + }, ) - queryFrontend.SetUser(strconv.Itoa(os.Getuid())) - queryFrontend.SetBackoff(defaultBackoffConfig) return queryFrontend, nil } -func NewMemcached(name string) *e2e.ConcreteService { - memcached := e2e.NewConcreteService( - fmt.Sprintf("memcached-%s", name), - "docker.io/memcached:1.6.3-alpine", - e2e.NewCommand("memcached", []string{"-m 1024", "-I 1m", "-c 1024", "-v"}...), - nil, - 11211, +func NewReverseProxy(e e2e.Environment, name, tenantID, target string) (*e2e.InstrumentedRunnable, error) { + conf := fmt.Sprintf(` +events { + worker_connections 1024; +} + +http { + server { + listen 80; + server_name _; + + location / { + proxy_set_header THANOS-TENANT %s; + proxy_pass %s; + } + } +} +`, tenantID, target) + + dir := filepath.Join(e.SharedDir(), "data", "nginx", name) + if err := os.MkdirAll(dir, 0750); err != nil { + return nil, errors.Wrap(err, "create store dir") + } + + if err := ioutil.WriteFile(filepath.Join(dir, "nginx.conf"), []byte(conf), 0600); err != nil { + return nil, errors.Wrap(err, "creating nginx config file failed") + } + + nginx := e2e.NewInstrumentedRunnable(e, fmt.Sprintf("nginx-%s", name), map[string]int{"http": 80}, "http").Init( + e2e.StartOptions{ + Image: "docker.io/nginx:1.21.1-alpine", + Volumes: []string{filepath.Join(dir, "/nginx.conf") + ":/etc/nginx/nginx.conf:ro"}, + WaitReadyBackoff: &defaultBackoffConfig, + }, + ) + + return nginx, nil +} + +// NewMinio returns minio server, used as a local replacement for S3. +// TODO(@matej-g): This is a temporary workaround for https://github.com/efficientgo/e2e/issues/11; +// after this is addresses fixed all calls should be replaced with e2edb.NewMinio. 
+func NewMinio(env e2e.Environment, name, bktName string) *e2e.InstrumentedRunnable { + image := "minio/minio:RELEASE.2019-12-30T05-45-39Z" + minioKESGithubContent := "https://raw.githubusercontent.com/minio/kes/master" + commands := []string{ + "curl -sSL --tlsv1.2 -O '%s/root.key' -O '%s/root.cert'", + "mkdir -p /data/%s && minio server --address :%v --quiet /data", + } + + return e2e.NewInstrumentedRunnable( + env, + name, + map[string]int{"http": 8090}, + "http").Init( + e2e.StartOptions{ + Image: image, + // Create the required bucket before starting minio. + Command: e2e.NewCommandWithoutEntrypoint("sh", "-c", fmt.Sprintf(strings.Join(commands, " && "), minioKESGithubContent, minioKESGithubContent, bktName, 8090)), + Readiness: e2e.NewHTTPReadinessProbe("http", "/minio/health/ready", 200, 200), + EnvVars: map[string]string{ + "MINIO_ACCESS_KEY": e2edb.MinioAccessKey, + "MINIO_SECRET_KEY": e2edb.MinioSecretKey, + "MINIO_BROWSER": "off", + "ENABLE_HTTPS": "0", + // https://docs.min.io/docs/minio-kms-quickstart-guide.html + "MINIO_KMS_KES_ENDPOINT": "https://play.min.io:7373", + "MINIO_KMS_KES_KEY_FILE": "root.key", + "MINIO_KMS_KES_CERT_FILE": "root.cert", + "MINIO_KMS_KES_KEY_NAME": "my-minio-key", + }, + }, + ) +} + +func NewMemcached(e e2e.Environment, name string) *e2e.InstrumentedRunnable { + memcached := e2e.NewInstrumentedRunnable(e, fmt.Sprintf("memcached-%s", name), map[string]int{"memcached": 11211}, "memcached").Init( + e2e.StartOptions{ + Image: "docker.io/memcached:1.6.3-alpine", + Command: e2e.NewCommand("memcached", []string{"-m 1024", "-I 1m", "-c 1024", "-v"}...), + User: strconv.Itoa(os.Getuid()), + WaitReadyBackoff: &defaultBackoffConfig, + }, ) - memcached.SetUser(strconv.Itoa(os.Getuid())) - memcached.SetBackoff(defaultBackoffConfig) return memcached } func NewToolsBucketWeb( + e e2e.Environment, name string, bucketConfig client.BucketConfig, routePrefix, externalPrefix string, minTime string, maxTime string, - relabelConfig string) (*Service, error) { + relabelConfig string, +) (*e2e.InstrumentedRunnable, error) { bktConfigBytes, err := yaml.Marshal(bucketConfig) if err != nil { return nil, errors.Wrapf(err, "generate tools bucket web config file: %v", bucketConfig) @@ -714,16 +837,14 @@ func NewToolsBucketWeb( args = append([]string{"bucket", "web"}, args...) 
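// The final invocation is `tools bucket web <flags>`, exposed on the standard http (8080) and grpc (9091) ports.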
- toolsBucketWeb := NewService( + toolsBucketWeb := NewService(e, fmt.Sprintf("toolsBucketWeb-%s", name), DefaultImage(), e2e.NewCommand("tools", args...), - e2e.NewHTTPReadinessProbe(8080, "/-/ready", 200, 200), + e2e.NewHTTPReadinessProbe("http", "/-/ready", 200, 200), 8080, 9091, ) - toolsBucketWeb.SetUser(strconv.Itoa(os.Getuid())) - toolsBucketWeb.SetBackoff(defaultBackoffConfig) return toolsBucketWeb, nil } diff --git a/test/e2e/exemplars_api_test.go b/test/e2e/exemplars_api_test.go index 15aa749ecd1..13aaffca851 100644 --- a/test/e2e/exemplars_api_test.go +++ b/test/e2e/exemplars_api_test.go @@ -9,9 +9,10 @@ import ( "testing" "time" - "github.com/cortexproject/cortex/integration/e2e" + "github.com/efficientgo/e2e" "github.com/pkg/errors" "github.com/prometheus/prometheus/pkg/timestamp" + "github.com/thanos-io/thanos/pkg/exemplars/exemplarspb" "github.com/thanos-io/thanos/pkg/store/labelpb" "github.com/thanos-io/thanos/pkg/testutil" @@ -24,64 +25,70 @@ const ( func TestExemplarsAPI_Fanout(t *testing.T) { t.Parallel() + var ( + prom1, prom2 *e2e.InstrumentedRunnable + sidecar1, sidecar2 *e2e.InstrumentedRunnable + err error + e *e2e.DockerEnvironment + ) - netName := "e2e_test_exemplars_fanout" - - s, err := e2e.NewScenario(netName) + e, err = e2e.NewDockerEnvironment("e2e_test_exemplars_fanout") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) - stores := []string{ - e2e.NetworkContainerHostPort(netName, "sidecar-prom1", 9091), // TODO(bwplotka): Use newer e2e lib to handle this in type safe manner. - e2e.NetworkContainerHostPort(netName, "sidecar-prom2", 9091), // TODO(bwplotka): Use newer e2e lib to handle this in type safe manner. - } - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "query", stores...). - WithExemplarAddresses(stores...). - WithTracingConfig(fmt.Sprintf(`type: JAEGER -config: - sampler_type: const - sampler_param: 1 - service_name: %s`, s.NetworkName()+"-query")). - Build() - testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + qBuilder := e2ethanos.NewQuerierBuilder(e, "query") + qUninitiated := qBuilder.BuildUninitiated() - // Recreate Prometheus and sidecar with Thanos query scrape target. - prom1, sidecar1, err := e2ethanos.NewPrometheusWithSidecar( - s.SharedDir(), - netName, + prom1, sidecar1, err = e2ethanos.NewPrometheusWithSidecar( + e, "prom1", - defaultPromConfig("ha", 0, "", "", "localhost:9090", q.NetworkHTTPEndpoint()), + defaultPromConfig("ha", 0, "", "", "localhost:9090", qUninitiated.InternalEndpoint("http")), e2ethanos.DefaultPrometheusImage(), e2ethanos.FeatureExemplarStorage, ) testutil.Ok(t, err) - prom2, sidecar2, err := e2ethanos.NewPrometheusWithSidecar( - s.SharedDir(), - netName, + prom2, sidecar2, err = e2ethanos.NewPrometheusWithSidecar( + e, "prom2", - defaultPromConfig("ha", 1, "", "", "localhost:9090", q.NetworkHTTPEndpoint()), + defaultPromConfig("ha", 1, "", "", "localhost:9090", qUninitiated.InternalEndpoint("http")), e2ethanos.DefaultPrometheusImage(), e2ethanos.FeatureExemplarStorage, ) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2)) + + tracingCfg := fmt.Sprintf(`type: JAEGER +config: + sampler_type: const + sampler_param: 1 + service_name: %s`, qUninitiated.Name()) + + stores := []string{sidecar1.InternalEndpoint("grpc"), sidecar2.InternalEndpoint("grpc")} + + qBuilder = qBuilder.WithExemplarAddresses(stores...). + WithTracingConfig(tracingCfg) + + q, err := qBuilder.Initiate(qUninitiated, stores...)
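+ // The querier is initiated only now: Prometheus needed the querier's HTTP address as a scrape target above, while the querier itself needs the sidecars' gRPC addresses, which exist only once both sidecars are built.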
+ testutil.Ok(t, err) + testutil.Ok(t, e2e.StartAndWaitReady(q)) + + testutil.Ok(t, e2e.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2)) ctx, cancel := context.WithTimeout(context.Background(), 1*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(2), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(2), []string{"thanos_query_exemplar_apis_dns_provider_results"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(2), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(2), []string{"thanos_query_exemplar_apis_dns_provider_results"}, e2e.WaitMissingMetrics())) now := time.Now() start := timestamp.FromTime(now.Add(-time.Hour)) end := timestamp.FromTime(now.Add(time.Hour)) // Send HTTP requests to thanos query to trigger exemplars. - labelNames(t, ctx, q.HTTPEndpoint(), nil, start, end, func(res []string) bool { return true }) + labelNames(t, ctx, q.Endpoint("http"), nil, start, end, func(res []string) bool { return true }) t.Run("Basic exemplars query", func(t *testing.T) { - queryExemplars(t, ctx, q.HTTPEndpoint(), `http_request_duration_seconds_bucket{handler="label_names"}`, start, end, exemplarsOnExpectedSeries(map[string]string{ + queryExemplars(t, ctx, q.Endpoint("http"), `http_request_duration_seconds_bucket{handler="label_names"}`, start, end, exemplarsOnExpectedSeries(map[string]string{ "__name__": "http_request_duration_seconds_bucket", "handler": "label_names", "job": "myself", @@ -92,7 +99,7 @@ config: t.Run("Exemplars query with matched external label", func(t *testing.T) { // Here replica is an external label. - queryExemplars(t, ctx, q.HTTPEndpoint(), `http_request_duration_seconds_bucket{handler="label_names", replica="0"}`, start, end, exemplarsOnExpectedSeries(map[string]string{ + queryExemplars(t, ctx, q.Endpoint("http"), `http_request_duration_seconds_bucket{handler="label_names", replica="0"}`, start, end, exemplarsOnExpectedSeries(map[string]string{ "__name__": "http_request_duration_seconds_bucket", "handler": "label_names", "job": "myself", @@ -103,7 +110,7 @@ config: t.Run("Exemplars query doesn't match external label", func(t *testing.T) { // Here replica is an external label, but it doesn't match.
- queryExemplars(t, ctx, q.HTTPEndpoint(), `http_request_duration_seconds_bucket{handler="label_names", replica="foo"}`, + queryExemplars(t, ctx, q.Endpoint("http"), `http_request_duration_seconds_bucket{handler="label_names", replica="foo"}`, start, end, func(data []*exemplarspb.ExemplarData) error { if len(data) > 0 { return errors.Errorf("expected no exemplars, got %v", data) diff --git a/test/e2e/metadata_api_test.go b/test/e2e/metadata_api_test.go index eda8b325946..096560e64e2 100644 --- a/test/e2e/metadata_api_test.go +++ b/test/e2e/metadata_api_test.go @@ -10,7 +10,7 @@ import ( "testing" "time" - "github.com/cortexproject/cortex/integration/e2e" + "github.com/efficientgo/e2e" "github.com/thanos-io/thanos/pkg/metadata/metadatapb" "github.com/thanos-io/thanos/pkg/promclient" "github.com/thanos-io/thanos/pkg/runutil" @@ -21,17 +21,14 @@ import ( func TestMetadataAPI_Fanout(t *testing.T) { t.Parallel() - netName := "e2e_test_metadata_fanout" - - s, err := e2e.NewScenario(netName) + e, err := e2e.NewDockerEnvironment("e2e_test_metadata_fanout") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) // 2x Prometheus. // Each Prometheus scrapes its own metrics and Sidecar's metrics. prom1, sidecar1, err := e2ethanos.NewPrometheusWithSidecar( - s.SharedDir(), - netName, + e, "prom1", defaultPromConfig("ha", 0, "", "", "localhost:9090", "sidecar-prom1:8080"), e2ethanos.DefaultPrometheusImage(), @@ -39,32 +36,32 @@ func TestMetadataAPI_Fanout(t *testing.T) { testutil.Ok(t, err) prom2, sidecar2, err := e2ethanos.NewPrometheusWithSidecar( - s.SharedDir(), - netName, + e, "prom2", defaultPromConfig("ha", 1, "", "", "localhost:9090", "sidecar-prom2:8080"), e2ethanos.DefaultPrometheusImage(), ) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2)) - stores := []string{sidecar1.GRPCNetworkEndpoint(), sidecar2.GRPCNetworkEndpoint()} - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "query", stores...). + stores := []string{sidecar1.InternalEndpoint("grpc"), sidecar2.InternalEndpoint("grpc")} + q, err := e2ethanos.NewQuerierBuilder( + e, "query", stores...). WithMetadataAddresses(stores...). Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 1*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(2), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(2), []string{"thanos_query_metadata_apis_dns_provider_results"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(2), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(2), []string{"thanos_query_metadata_apis_dns_provider_results"}, e2e.WaitMissingMetrics())) var promMeta map[string][]metadatapb.Meta // Wait for the metadata response to be ready, as Prometheus gets metadata after scrape.
- testutil.Ok(t, runutil.Retry(3*time.Second, ctx.Done(), func() error { - promMeta, err = promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+prom1.HTTPEndpoint()), "", -1) + testutil.Ok(t, runutil.Retry(5*time.Second, ctx.Done(), func() error { + promMeta, err = promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+prom1.Endpoint("http")), "", -1) testutil.Ok(t, err) if len(promMeta) > 0 { return nil @@ -72,7 +69,7 @@ func TestMetadataAPI_Fanout(t *testing.T) { return fmt.Errorf("empty metadata response from Prometheus") })) - thanosMeta, err := promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+q.HTTPEndpoint()), "", -1) + thanosMeta, err := promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+q.Endpoint("http")), "", -1) testutil.Ok(t, err) testutil.Assert(t, len(thanosMeta) > 0, "got empty metadata response from Thanos") @@ -80,22 +77,22 @@ func TestMetadataAPI_Fanout(t *testing.T) { metadataEqual(t, thanosMeta, promMeta) // We only expect to see one metadata returned. - thanosMeta, err = promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+q.HTTPEndpoint()), "", 1) + thanosMeta, err = promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+q.Endpoint("http")), "", 1) testutil.Ok(t, err) testutil.Equals(t, len(thanosMeta), 1) // We only expect to see ten metadata returned. - thanosMeta, err = promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+q.HTTPEndpoint()), "", 10) + thanosMeta, err = promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+q.Endpoint("http")), "", 10) testutil.Ok(t, err) testutil.Equals(t, len(thanosMeta), 10) // No metadata returned. - thanosMeta, err = promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+q.HTTPEndpoint()), "", 0) + thanosMeta, err = promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+q.Endpoint("http")), "", 0) testutil.Ok(t, err) testutil.Equals(t, len(thanosMeta), 0) // Only prometheus_build_info metric will be returned. 
- thanosMeta, err = promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+q.HTTPEndpoint()), "prometheus_build_info", -1) + thanosMeta, err = promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+q.Endpoint("http")), "prometheus_build_info", -1) testutil.Ok(t, err) testutil.Assert(t, len(thanosMeta) == 1 && len(thanosMeta["prometheus_build_info"]) > 0, "expected one prometheus_build_info metadata from Thanos, got %v", thanosMeta) } diff --git a/test/e2e/query_frontend_test.go b/test/e2e/query_frontend_test.go index 37f4ea7d6b3..6635555ed0c 100644 --- a/test/e2e/query_frontend_test.go +++ b/test/e2e/query_frontend_test.go @@ -9,12 +9,13 @@ import ( "testing" "time" - "github.com/cortexproject/cortex/integration/e2e" + "github.com/efficientgo/e2e" + "github.com/efficientgo/e2e/matchers" "github.com/pkg/errors" - "github.com/prometheus/common/model" "github.com/prometheus/prometheus/pkg/labels" "github.com/prometheus/prometheus/pkg/timestamp" + "github.com/thanos-io/thanos/pkg/cacheutil" "github.com/thanos-io/thanos/pkg/promclient" "github.com/thanos-io/thanos/pkg/queryfrontend" @@ -25,19 +26,19 @@ import ( func TestQueryFrontend(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_query_frontend") + e, err := e2e.NewDockerEnvironment("e2e_test_query_frontend") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) now := time.Now() - prom, sidecar, err := e2ethanos.NewPrometheusWithSidecar(s.SharedDir(), s.NetworkName(), "1", defaultPromConfig("test", 0, "", ""), e2ethanos.DefaultPrometheusImage()) + prom, sidecar, err := e2ethanos.NewPrometheusWithSidecar(e, "1", defaultPromConfig("test", 0, "", ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom, sidecar)) + testutil.Ok(t, e2e.StartAndWaitReady(prom, sidecar)) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", sidecar.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", sidecar.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) inMemoryCacheConfig := queryfrontend.CacheProviderConfig{ Type: queryfrontend.INMEMORY, @@ -47,18 +48,18 @@ func TestQueryFrontend(t *testing.T) { }, } - queryFrontend, err := e2ethanos.NewQueryFrontend("1", "http://"+q.NetworkHTTPEndpoint(), inMemoryCacheConfig) + queryFrontend, err := e2ethanos.NewQueryFrontend(e, "1", "http://"+q.InternalEndpoint("http"), inMemoryCacheConfig) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(queryFrontend)) + testutil.Ok(t, e2e.StartAndWaitReady(queryFrontend)) ctx, cancel := context.WithTimeout(context.Background(), time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(1), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(1), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) // Ensure we can get the result from Querier first so that it // doesn't need to retry when we send queries to the frontend later. 
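// (Frontend retries would otherwise inflate the http_requests_total counts asserted below.)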
- queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), queryUpWithoutInstance, promclient.QueryOptions{ + queryAndAssertSeries(t, ctx, q.Endpoint("http"), queryUpWithoutInstance, promclient.QueryOptions{ Deduplicate: false, }, []model.Metric{ { @@ -68,14 +69,15 @@ }, }) - vals, err := q.SumMetrics([]string{"http_requests_total"}, e2e.WithLabelMatchers( - labels.MustNewMatcher(labels.MatchEqual, "handler", "query"))) + vals, err := q.SumMetrics([]string{"http_requests_total"}, + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "handler", "query"))) + + testutil.Ok(t, err) testutil.Equals(t, 1, len(vals)) queryTimes := vals[0] t.Run("query frontend works for instant query", func(t *testing.T) { - queryAndAssertSeries(t, ctx, queryFrontend.HTTPEndpoint(), queryUpWithoutInstance, promclient.QueryOptions{ + queryAndAssertSeries(t, ctx, queryFrontend.Endpoint("http"), queryUpWithoutInstance, promclient.QueryOptions{ Deduplicate: false, }, []model.Metric{ { @@ -88,21 +90,21 @@ testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(1), []string{"thanos_query_frontend_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "op", "query"))), - ) + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "op", "query")), + )) testutil.Ok(t, q.WaitSumMetricsWithOptions( e2e.Equals(queryTimes+1), []string{"http_requests_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "handler", "query"))), - ) + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "handler", "query")), + )) }) t.Run("query frontend works for range query and it can cache results", func(t *testing.T) { rangeQuery( t, ctx, - queryFrontend.HTTPEndpoint(), + queryFrontend.Endpoint("http"), queryUpWithoutInstance, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), @@ -119,8 +121,8 @@ testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(1), []string{"thanos_query_frontend_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "op", "query_range"))), - ) + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "op", "query_range")), + )) testutil.Ok(t, queryFrontend.WaitSumMetrics(e2e.Equals(1), "cortex_cache_fetched_keys")) testutil.Ok(t, queryFrontend.WaitSumMetrics(e2e.Equals(0), "cortex_cache_hits")) testutil.Ok(t, queryFrontend.WaitSumMetrics(e2e.Equals(1), "querier_cache_added_new_total")) @@ -135,8 +137,8 @@ testutil.Ok(t, q.WaitSumMetricsWithOptions( e2e.Equals(1), []string{"http_requests_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "handler", "query_range"))), - ) + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "handler", "query_range")), + )) }) t.Run("same range query, cache hit.", func(t *testing.T) { @@ -144,7 +146,7 @@ rangeQuery( t, ctx, - queryFrontend.HTTPEndpoint(), + queryFrontend.Endpoint("http"), queryUpWithoutInstance, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), @@ -161,7 +163,7 @@ testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(2), []string{"thanos_query_frontend_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "op", "query_range"))), +
e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "op", "query_range"))), ) testutil.Ok(t, queryFrontend.WaitSumMetrics(e2e.Equals(2), "cortex_cache_fetched_keys")) testutil.Ok(t, queryFrontend.WaitSumMetrics(e2e.Equals(1), "cortex_cache_hits")) @@ -174,14 +176,14 @@ func TestQueryFrontend(t *testing.T) { // Query is only 2h so it won't be split. testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(2), []string{"thanos_frontend_split_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "tripperware", "query_range"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "tripperware", "query_range"))), ) // One more request is needed in order to satisfy the req range. testutil.Ok(t, q.WaitSumMetricsWithOptions( e2e.Equals(2), []string{"http_requests_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "handler", "query_range"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "handler", "query_range"))), ) }) @@ -189,7 +191,7 @@ func TestQueryFrontend(t *testing.T) { rangeQuery( t, ctx, - queryFrontend.HTTPEndpoint(), + queryFrontend.Endpoint("http"), queryUpWithoutInstance, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(24*time.Hour)), @@ -206,7 +208,7 @@ func TestQueryFrontend(t *testing.T) { testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(3), []string{"thanos_query_frontend_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "op", "query_range"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "op", "query_range"))), ) testutil.Ok(t, queryFrontend.WaitSumMetrics(e2e.Equals(3), "cortex_cache_fetched_keys")) testutil.Ok(t, queryFrontend.WaitSumMetrics(e2e.Equals(2), "cortex_cache_hits")) @@ -219,94 +221,94 @@ func TestQueryFrontend(t *testing.T) { // Query is 25h so it will be split to 2 requests. testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(4), []string{"thanos_frontend_split_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "tripperware", "query_range"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "tripperware", "query_range"))), ) testutil.Ok(t, q.WaitSumMetricsWithOptions( e2e.Equals(4), []string{"http_requests_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "handler", "query_range"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "handler", "query_range"))), ) }) t.Run("query frontend splitting works for labels names API", func(t *testing.T) { // LabelNames and LabelValues API should still work via query frontend. 
- labelNames(t, ctx, queryFrontend.HTTPEndpoint(), nil, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { + labelNames(t, ctx, queryFrontend.Endpoint("http"), nil, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { return len(res) > 0 }) testutil.Ok(t, q.WaitSumMetricsWithOptions( e2e.Equals(1), []string{"http_requests_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "handler", "label_names"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "handler", "label_names"))), ) testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(1), []string{"thanos_query_frontend_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "op", "label_names"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "op", "label_names"))), ) // Query is only 2h so it won't be split. testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(1), []string{"thanos_frontend_split_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "tripperware", "labels"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "tripperware", "labels"))), ) - labelNames(t, ctx, queryFrontend.HTTPEndpoint(), nil, timestamp.FromTime(now.Add(-24*time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { + labelNames(t, ctx, queryFrontend.Endpoint("http"), nil, timestamp.FromTime(now.Add(-24*time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { return len(res) > 0 }) testutil.Ok(t, q.WaitSumMetricsWithOptions( e2e.Equals(3), []string{"http_requests_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "handler", "label_names"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "handler", "label_names"))), ) testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(2), []string{"thanos_query_frontend_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "op", "label_names"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "op", "label_names"))), ) // Query is 25h so split to 2 requests. 
testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(3), []string{"thanos_frontend_split_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "tripperware", "labels"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "tripperware", "labels"))), ) }) t.Run("query frontend splitting works for labels values API", func(t *testing.T) { - labelValues(t, ctx, queryFrontend.HTTPEndpoint(), "instance", nil, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { + labelValues(t, ctx, queryFrontend.Endpoint("http"), "instance", nil, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { return len(res) == 1 && res[0] == "localhost:9090" }) testutil.Ok(t, q.WaitSumMetricsWithOptions( e2e.Equals(1), []string{"http_requests_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "handler", "label_values"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "handler", "label_values"))), ) testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(1), []string{"thanos_query_frontend_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "op", "label_values"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "op", "label_values"))), ) // Query is only 2h so it won't be split. testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(4), []string{"thanos_frontend_split_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "tripperware", "labels"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "tripperware", "labels"))), ) - labelValues(t, ctx, queryFrontend.HTTPEndpoint(), "instance", nil, timestamp.FromTime(now.Add(-24*time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { + labelValues(t, ctx, queryFrontend.Endpoint("http"), "instance", nil, timestamp.FromTime(now.Add(-24*time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { return len(res) == 1 && res[0] == "localhost:9090" }) testutil.Ok(t, q.WaitSumMetricsWithOptions( e2e.Equals(3), []string{"http_requests_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "handler", "label_values"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "handler", "label_values"))), ) testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(2), []string{"thanos_query_frontend_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "op", "label_values"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "op", "label_values"))), ) // Query is 25h so split to 2 requests. 
testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(6), []string{"thanos_frontend_split_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "tripperware", "labels"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "tripperware", "labels"))), ) }) @@ -314,7 +316,7 @@ func TestQueryFrontend(t *testing.T) { series( t, ctx, - queryFrontend.HTTPEndpoint(), + queryFrontend.Endpoint("http"), []*labels.Matcher{labels.MustNewMatcher(labels.MatchEqual, "__name__", "up")}, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), @@ -334,23 +336,23 @@ func TestQueryFrontend(t *testing.T) { testutil.Ok(t, q.WaitSumMetricsWithOptions( e2e.Equals(1), []string{"http_requests_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "handler", "series"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "handler", "series"))), ) testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(1), []string{"thanos_query_frontend_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "op", "series"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "op", "series"))), ) // Query is only 2h so it won't be split. testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(7), []string{"thanos_frontend_split_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "tripperware", "labels"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "tripperware", "labels"))), ) series( t, ctx, - queryFrontend.HTTPEndpoint(), + queryFrontend.Endpoint("http"), []*labels.Matcher{labels.MustNewMatcher(labels.MatchEqual, "__name__", "up")}, timestamp.FromTime(now.Add(-24*time.Hour)), timestamp.FromTime(now.Add(time.Hour)), @@ -370,17 +372,17 @@ func TestQueryFrontend(t *testing.T) { testutil.Ok(t, q.WaitSumMetricsWithOptions( e2e.Equals(3), []string{"http_requests_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "handler", "series"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "handler", "series"))), ) testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(2), []string{"thanos_query_frontend_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "op", "series"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "op", "series"))), ) // Query is 25h so split to 2 requests.
testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(9), []string{"thanos_frontend_split_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "tripperware", "labels"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "tripperware", "labels"))), ) }) } @@ -388,28 +390,28 @@ func TestQueryFrontend(t *testing.T) { func TestQueryFrontendMemcachedCache(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_query_frontend_memcached") + e, err := e2e.NewDockerEnvironment("e2e_test_query_frontend_memcached") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) now := time.Now() - prom, sidecar, err := e2ethanos.NewPrometheusWithSidecar(s.SharedDir(), s.NetworkName(), "1", defaultPromConfig("test", 0, "", ""), e2ethanos.DefaultPrometheusImage()) + prom, sidecar, err := e2ethanos.NewPrometheusWithSidecar(e, "1", defaultPromConfig("test", 0, "", ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom, sidecar)) + testutil.Ok(t, e2e.StartAndWaitReady(prom, sidecar)) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", sidecar.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", sidecar.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) - memcached := e2ethanos.NewMemcached("1") - testutil.Ok(t, s.StartAndWaitReady(memcached)) + memcached := e2ethanos.NewMemcached(e, "1") + testutil.Ok(t, e2e.StartAndWaitReady(memcached)) memCachedConfig := queryfrontend.CacheProviderConfig{ Type: queryfrontend.MEMCACHED, Config: queryfrontend.MemcachedResponseCacheConfig{ Memcached: cacheutil.MemcachedClientConfig{ - Addresses: []string{memcached.NetworkEndpoint(11211)}, + Addresses: []string{memcached.InternalEndpoint("memcached")}, MaxIdleConnections: 100, MaxAsyncConcurrency: 20, MaxGetMultiConcurrency: 100, @@ -421,20 +423,20 @@ func TestQueryFrontendMemcachedCache(t *testing.T) { }, } - queryFrontend, err := e2ethanos.NewQueryFrontend("1", "http://"+q.NetworkHTTPEndpoint(), memCachedConfig) + queryFrontend, err := e2ethanos.NewQueryFrontend(e, "1", "http://"+q.InternalEndpoint("http"), memCachedConfig) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(queryFrontend)) + testutil.Ok(t, e2e.StartAndWaitReady(queryFrontend)) ctx, cancel := context.WithTimeout(context.Background(), time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(1), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(1), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) testutil.Ok(t, queryFrontend.WaitSumMetrics(e2e.Equals(1), "cortex_memcache_client_servers")) // Ensure we can get the result from Querier first so that it // doesn't need to retry when we send queries to the frontend later. 
- queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), queryUpWithoutInstance, promclient.QueryOptions{ + queryAndAssertSeries(t, ctx, q.Endpoint("http"), queryUpWithoutInstance, promclient.QueryOptions{ Deduplicate: false, }, []model.Metric{ { @@ -445,14 +447,14 @@ func TestQueryFrontendMemcachedCache(t *testing.T) { }) vals, err := q.SumMetrics([]string{"http_requests_total"}, e2e.WithLabelMatchers( - labels.MustNewMatcher(labels.MatchEqual, "handler", "query"))) + matchers.MustNewMatcher(matchers.MatchEqual, "handler", "query"))) testutil.Ok(t, err) testutil.Equals(t, 1, len(vals)) rangeQuery( t, ctx, - queryFrontend.HTTPEndpoint(), + queryFrontend.Endpoint("http"), queryUpWithoutInstance, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), @@ -469,7 +471,7 @@ func TestQueryFrontendMemcachedCache(t *testing.T) { testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(1), []string{"thanos_query_frontend_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "op", "query_range"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "op", "query_range"))), ) testutil.Ok(t, queryFrontend.WaitSumMetrics(e2e.Equals(1), "cortex_cache_fetched_keys")) @@ -482,7 +484,7 @@ func TestQueryFrontendMemcachedCache(t *testing.T) { rangeQuery( t, ctx, - queryFrontend.HTTPEndpoint(), + queryFrontend.Endpoint("http"), queryUpWithoutInstance, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), @@ -499,7 +501,7 @@ func TestQueryFrontendMemcachedCache(t *testing.T) { testutil.Ok(t, queryFrontend.WaitSumMetricsWithOptions( e2e.Equals(2), []string{"thanos_query_frontend_queries_total"}, - e2e.WithLabelMatchers(labels.MustNewMatcher(labels.MatchEqual, "op", "query_range"))), + e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "op", "query_range"))), ) // Query is only 2h so it won't be split. 
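For readers following the migration above: the query-frontend cache now reaches Memcached through the runnable's named port rather than the old numeric `NetworkEndpoint(11211)` lookup. Below is a minimal, self-contained sketch of that wiring, assuming only the `e2ethanos` helpers and the `queryfrontend`/`cacheutil` fields that appear in this diff; the test name, environment name, and any omitted cache fields are illustrative, not taken from the real test.

```go
package e2e_test

import (
	"testing"

	"github.com/efficientgo/e2e"

	"github.com/thanos-io/thanos/pkg/cacheutil"
	"github.com/thanos-io/thanos/pkg/queryfrontend"
	"github.com/thanos-io/thanos/pkg/testutil"
	"github.com/thanos-io/thanos/test/e2e/e2ethanos"
)

func TestQueryFrontendCacheWiring(t *testing.T) { // illustrative test name
	e, err := e2e.NewDockerEnvironment("e2e_cache_wiring_sketch") // illustrative env name
	testutil.Ok(t, err)
	t.Cleanup(e2ethanos.CleanScenario(t, e))

	// The Memcached runnable registers a port named "memcached";
	// InternalEndpoint resolves it to host:port inside the Docker network.
	memcached := e2ethanos.NewMemcached(e, "1")
	testutil.Ok(t, e2e.StartAndWaitReady(memcached))

	cacheCfg := queryfrontend.CacheProviderConfig{
		Type: queryfrontend.MEMCACHED,
		Config: queryfrontend.MemcachedResponseCacheConfig{
			Memcached: cacheutil.MemcachedClientConfig{
				Addresses:              []string{memcached.InternalEndpoint("memcached")},
				MaxIdleConnections:     100,
				MaxAsyncConcurrency:    20,
				MaxGetMultiConcurrency: 100,
			},
		},
	}
	_ = cacheCfg // Passed to e2ethanos.NewQueryFrontend in the real test above.
}
```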
diff --git a/test/e2e/query_test.go b/test/e2e/query_test.go index 8c34a282836..06c62087635 100644 --- a/test/e2e/query_test.go +++ b/test/e2e/query_test.go @@ -17,7 +17,7 @@ import ( "github.com/chromedp/cdproto/network" "github.com/chromedp/chromedp" - "github.com/cortexproject/cortex/integration/e2e" + "github.com/efficientgo/e2e" "github.com/go-kit/kit/log" "github.com/pkg/errors" "github.com/prometheus/common/model" @@ -97,36 +97,37 @@ func sortResults(res model.Vector) { func TestQuery(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_query") + e, err := e2e.NewDockerEnvironment("e2e_test_query") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) - receiver, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "1", 1) + receiver := e2ethanos.NewUninitiatedReceiver(e, "1") + receiverRunnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(receiver, e.SharedDir(), 1) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(receiver)) + testutil.Ok(t, e2e.StartAndWaitReady(receiverRunnable)) - prom1, sidecar1, err := e2ethanos.NewPrometheusWithSidecar(s.SharedDir(), "e2e_test_query", "alone", defaultPromConfig("prom-alone", 0, "", ""), e2ethanos.DefaultPrometheusImage()) + prom1, sidecar1, err := e2ethanos.NewPrometheusWithSidecar(e, "alone", defaultPromConfig("prom-alone", 0, "", ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom2, sidecar2, err := e2ethanos.NewPrometheusWithSidecar(s.SharedDir(), "e2e_test_query", "remote-and-sidecar", defaultPromConfig("prom-both-remote-write-and-sidecar", 1234, e2ethanos.RemoteWriteEndpoint(receiver.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom2, sidecar2, err := e2ethanos.NewPrometheusWithSidecar(e, "remote-and-sidecar", defaultPromConfig("prom-both-remote-write-and-sidecar", 1234, e2ethanos.RemoteWriteEndpoint(receiver.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom3, sidecar3, err := e2ethanos.NewPrometheusWithSidecar(s.SharedDir(), "e2e_test_query", "ha1", defaultPromConfig("prom-ha", 0, "", filepath.Join(e2e.ContainerSharedDir, "", "*.yaml")), e2ethanos.DefaultPrometheusImage()) + prom3, sidecar3, err := e2ethanos.NewPrometheusWithSidecar(e, "ha1", defaultPromConfig("prom-ha", 0, "", filepath.Join(e2ethanos.ContainerSharedDir, "", "*.yaml")), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom4, sidecar4, err := e2ethanos.NewPrometheusWithSidecar(s.SharedDir(), "e2e_test_query", "ha2", defaultPromConfig("prom-ha", 1, "", filepath.Join(e2e.ContainerSharedDir, "", "*.yaml")), e2ethanos.DefaultPrometheusImage()) + prom4, sidecar4, err := e2ethanos.NewPrometheusWithSidecar(e, "ha2", defaultPromConfig("prom-ha", 1, "", filepath.Join(e2ethanos.ContainerSharedDir, "", "*.yaml")), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2, prom3, sidecar3, prom4, sidecar4)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2, prom3, sidecar3, prom4, sidecar4)) // Querier. Both fileSD and directly by flags. - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", sidecar1.GRPCNetworkEndpoint(), sidecar2.GRPCNetworkEndpoint(), receiver.GRPCNetworkEndpoint()). 
- WithFileSDStoreAddresses(sidecar3.GRPCNetworkEndpoint(), sidecar4.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", sidecar1.InternalEndpoint("grpc"), sidecar2.InternalEndpoint("grpc"), receiver.InternalEndpoint("grpc")). + WithFileSDStoreAddresses(sidecar3.InternalEndpoint("grpc"), sidecar4.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 1*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(5), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(5), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), queryUpWithoutInstance, promclient.QueryOptions{ + queryAndAssertSeries(t, ctx, q.Endpoint("http"), queryUpWithoutInstance, promclient.QueryOptions{ Deduplicate: false, }, []model.Metric{ { @@ -137,7 +138,7 @@ func TestQuery(t *testing.T) { { "job": "myself", "prometheus": "prom-both-remote-write-and-sidecar", - "receive": "1", + "receive": "receive-1", "replica": "1234", "tenant_id": "default-tenant", }, @@ -159,7 +160,7 @@ func TestQuery(t *testing.T) { }) // With deduplication. - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), queryUpWithoutInstance, promclient.QueryOptions{ + queryAndAssertSeries(t, ctx, q.Endpoint("http"), queryUpWithoutInstance, promclient.QueryOptions{ Deduplicate: true, }, []model.Metric{ { @@ -169,7 +170,7 @@ func TestQuery(t *testing.T) { { "job": "myself", "prometheus": "prom-both-remote-write-and-sidecar", - "receive": "1", + "receive": "receive-1", "tenant_id": "default-tenant", }, { @@ -186,35 +187,35 @@ func TestQuery(t *testing.T) { func TestQueryExternalPrefixWithoutReverseProxy(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_query_route_prefix") + e, err := e2e.NewDockerEnvironment("e2e_test_query_route_prefix") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) externalPrefix := "test" - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1"). + q, err := e2ethanos.NewQuerierBuilder(e, "1"). WithExternalPrefix(externalPrefix).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) - checkNetworkRequests(t, "http://"+q.HTTPEndpoint()+"/"+externalPrefix+"/graph") + checkNetworkRequests(t, "http://"+q.Endpoint("http")+"/"+externalPrefix+"/graph") } func TestQueryExternalPrefix(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_query_external_prefix") + e, err := e2e.NewDockerEnvironment("e2e_test_query_external_prefix") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) externalPrefix := "thanos" - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1"). + q, err := e2ethanos.NewQuerierBuilder(e, "1"). 
WithExternalPrefix(externalPrefix).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) - querierURL := mustURLParse(t, "http://"+q.HTTPEndpoint()+"/"+externalPrefix) + querierURL := mustURLParse(t, "http://"+q.Endpoint("http")+"/"+externalPrefix) querierProxy := httptest.NewServer(e2ethanos.NewSingleHostReverseProxy(querierURL, externalPrefix)) t.Cleanup(querierProxy.Close) @@ -225,21 +226,21 @@ func TestQueryExternalPrefix(t *testing.T) { func TestQueryExternalPrefixAndRoutePrefix(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_query_external_prefix_and_route_prefix") + e, err := e2e.NewDockerEnvironment("e2e_test_query_external_prefix_and_route_prefix") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) externalPrefix := "thanos" routePrefix := "test" - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1"). + q, err := e2ethanos.NewQuerierBuilder(e, "1"). WithRoutePrefix(routePrefix). WithExternalPrefix(externalPrefix). Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) - querierURL := mustURLParse(t, "http://"+q.HTTPEndpoint()+"/"+routePrefix) + querierURL := mustURLParse(t, "http://"+q.Endpoint("http")+"/"+routePrefix) querierProxy := httptest.NewServer(e2ethanos.NewSingleHostReverseProxy(querierURL, externalPrefix)) t.Cleanup(querierProxy.Close) @@ -250,38 +251,39 @@ func TestQueryExternalPrefixAndRoutePrefix(t *testing.T) { func TestQueryLabelNames(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_query_label_names") + e, err := e2e.NewDockerEnvironment("e2e_test_query_label_names") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) - receiver, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "1", 1) + receiver := e2ethanos.NewUninitiatedReceiver(e, "1") + receiverRunnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(receiver, e.SharedDir(), 1) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(receiver)) + testutil.Ok(t, e2e.StartAndWaitReady(receiverRunnable)) - prom1, sidecar1, err := e2ethanos.NewPrometheusWithSidecar(s.SharedDir(), s.NetworkName(), "alone", defaultPromConfig("prom-alone", 0, "", ""), e2ethanos.DefaultPrometheusImage()) + prom1, sidecar1, err := e2ethanos.NewPrometheusWithSidecar(e, "alone", defaultPromConfig("prom-alone", 0, "", ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom2, sidecar2, err := e2ethanos.NewPrometheusWithSidecar(s.SharedDir(), s.NetworkName(), "remote-and-sidecar", defaultPromConfig("prom-both-remote-write-and-sidecar", 1234, e2ethanos.RemoteWriteEndpoint(receiver.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom2, sidecar2, err := e2ethanos.NewPrometheusWithSidecar(e, "remote-and-sidecar", defaultPromConfig("prom-both-remote-write-and-sidecar", 1234, e2ethanos.RemoteWriteEndpoint(receiver.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2)) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", sidecar1.GRPCNetworkEndpoint(), sidecar2.GRPCNetworkEndpoint(), receiver.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", sidecar1.InternalEndpoint("grpc"), 
sidecar2.InternalEndpoint("grpc"), receiver.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute) t.Cleanup(cancel) now := time.Now() - labelNames(t, ctx, q.HTTPEndpoint(), nil, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { + labelNames(t, ctx, q.Endpoint("http"), nil, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { return len(res) > 0 }) // Outside time range. - labelNames(t, ctx, q.HTTPEndpoint(), nil, timestamp.FromTime(now.Add(-24*time.Hour)), timestamp.FromTime(now.Add(-23*time.Hour)), func(res []string) bool { + labelNames(t, ctx, q.Endpoint("http"), nil, timestamp.FromTime(now.Add(-24*time.Hour)), timestamp.FromTime(now.Add(-23*time.Hour)), func(res []string) bool { return len(res) == 0 }) - labelNames(t, ctx, q.HTTPEndpoint(), []*labels.Matcher{{Type: labels.MatchEqual, Name: "__name__", Value: "up"}}, + labelNames(t, ctx, q.Endpoint("http"), []*labels.Matcher{{Type: labels.MatchEqual, Name: "__name__", Value: "up"}}, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { // Expected result: [__name__, instance, job, prometheus, replica, receive, tenant_id] // Pre-labelnames pushdown we've done Select() over all series and picked out the label names hence they all had external labels. @@ -291,7 +293,7 @@ func TestQueryLabelNames(t *testing.T) { ) // There is no matched series. - labelNames(t, ctx, q.HTTPEndpoint(), []*labels.Matcher{{Type: labels.MatchEqual, Name: "__name__", Value: "foobar"}}, + labelNames(t, ctx, q.Endpoint("http"), []*labels.Matcher{{Type: labels.MatchEqual, Name: "__name__", Value: "foobar"}}, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { return len(res) == 0 }, @@ -301,44 +303,45 @@ func TestQueryLabelNames(t *testing.T) { func TestQueryLabelValues(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_query_label_values") + e, err := e2e.NewDockerEnvironment("e2e_test_query_label_values") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) - receiver, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "1", 1) + receiver := e2ethanos.NewUninitiatedReceiver(e, "1") + receiverRunnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(receiver, e.SharedDir(), 1) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(receiver)) + testutil.Ok(t, e2e.StartAndWaitReady(receiverRunnable)) - prom1, sidecar1, err := e2ethanos.NewPrometheusWithSidecar(s.SharedDir(), s.NetworkName(), "alone", defaultPromConfig("prom-alone", 0, "", ""), e2ethanos.DefaultPrometheusImage()) + prom1, sidecar1, err := e2ethanos.NewPrometheusWithSidecar(e, "alone", defaultPromConfig("prom-alone", 0, "", ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom2, sidecar2, err := e2ethanos.NewPrometheusWithSidecar(s.SharedDir(), s.NetworkName(), "remote-and-sidecar", defaultPromConfig("prom-both-remote-write-and-sidecar", 1234, e2ethanos.RemoteWriteEndpoint(receiver.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom2, sidecar2, err := e2ethanos.NewPrometheusWithSidecar(e, "remote-and-sidecar", defaultPromConfig("prom-both-remote-write-and-sidecar", 1234, 
e2ethanos.RemoteWriteEndpoint(receiver.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2)) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", sidecar1.GRPCNetworkEndpoint(), sidecar2.GRPCNetworkEndpoint(), receiver.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", sidecar1.InternalEndpoint("grpc"), sidecar2.InternalEndpoint("grpc"), receiver.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute) t.Cleanup(cancel) now := time.Now() - labelValues(t, ctx, q.HTTPEndpoint(), "instance", nil, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { + labelValues(t, ctx, q.Endpoint("http"), "instance", nil, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { return len(res) == 1 && res[0] == "localhost:9090" }) // Outside time range. - labelValues(t, ctx, q.HTTPEndpoint(), "instance", nil, timestamp.FromTime(now.Add(-24*time.Hour)), timestamp.FromTime(now.Add(-23*time.Hour)), func(res []string) bool { + labelValues(t, ctx, q.Endpoint("http"), "instance", nil, timestamp.FromTime(now.Add(-24*time.Hour)), timestamp.FromTime(now.Add(-23*time.Hour)), func(res []string) bool { return len(res) == 0 }) - labelValues(t, ctx, q.HTTPEndpoint(), "__name__", []*labels.Matcher{{Type: labels.MatchEqual, Name: "__name__", Value: "up"}}, + labelValues(t, ctx, q.Endpoint("http"), "__name__", []*labels.Matcher{{Type: labels.MatchEqual, Name: "__name__", Value: "up"}}, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { return len(res) == 1 && res[0] == "up" }, ) - labelValues(t, ctx, q.HTTPEndpoint(), "__name__", []*labels.Matcher{{Type: labels.MatchEqual, Name: "__name__", Value: "foobar"}}, + labelValues(t, ctx, q.Endpoint("http"), "__name__", []*labels.Matcher{{Type: labels.MatchEqual, Name: "__name__", Value: "foobar"}}, timestamp.FromTime(now.Add(-time.Hour)), timestamp.FromTime(now.Add(time.Hour)), func(res []string) bool { return len(res) == 0 }, @@ -363,52 +366,53 @@ func TestQueryCompatibilityWithPreInfoAPI(t *testing.T) { } { i := i t.Run(fmt.Sprintf("%+v", tcase), func(t *testing.T) { - net := fmt.Sprintf("e2e_test_query_comp_query_%d", i) - s, err := e2e.NewScenario(net) + e, err := e2e.NewDockerEnvironment(fmt.Sprintf("e2e_test_query_comp_query_%d", i)) testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) promRulesSubDir := filepath.Join("rules") - testutil.Ok(t, os.MkdirAll(filepath.Join(s.SharedDir(), promRulesSubDir), os.ModePerm)) + testutil.Ok(t, os.MkdirAll(filepath.Join(e.SharedDir(), promRulesSubDir), os.ModePerm)) // Create the abort_on_partial_response alert for Prometheus. // We don't create the warn_on_partial_response alert as Prometheus has strict yaml unmarshalling. 
- createRuleFile(t, filepath.Join(s.SharedDir(), promRulesSubDir, "rules.yaml"), testAlertRuleAbortOnPartialResponse) + createRuleFile(t, filepath.Join(e.SharedDir(), promRulesSubDir, "rules.yaml"), testAlertRuleAbortOnPartialResponse) + + qBuilder := e2ethanos.NewQuerierBuilder(e, "1") + qUninit := qBuilder.BuildUninitiated() p1, s1, err := e2ethanos.NewPrometheusWithSidecarCustomImage( - s.SharedDir(), - net, + e, "p1", - defaultPromConfig("p1", 0, "", filepath.Join(e2e.ContainerSharedDir, promRulesSubDir, "*.yaml"), "localhost:9090", e2e.NetworkContainerHostPort(net, "querier-1", 8080)), // TODO(bwplotka): Use newer e2e lib to handle this in type safe manner. + defaultPromConfig("p1", 0, "", filepath.Join(e2ethanos.ContainerSharedDir, promRulesSubDir, "*.yaml"), "localhost:9090", qUninit.InternalEndpoint("http")), e2ethanos.DefaultPrometheusImage(), tcase.sidecarImage, e2ethanos.FeatureExemplarStorage, ) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(p1, s1)) + testutil.Ok(t, e2e.StartAndWaitReady(p1, s1)) // Newest querier with old --rules --meta etc flags. - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", s1.GRPCNetworkEndpoint()). - WithMetadataAddresses(s1.GRPCNetworkEndpoint()). - WithExemplarAddresses(s1.GRPCNetworkEndpoint()). - WithTargetAddresses(s1.GRPCNetworkEndpoint()). - WithRuleAddresses(s1.GRPCNetworkEndpoint()). + q, err := qBuilder. + WithMetadataAddresses(s1.InternalEndpoint("grpc")). + WithExemplarAddresses(s1.InternalEndpoint("grpc")). + WithTargetAddresses(s1.InternalEndpoint("grpc")). + WithRuleAddresses(s1.InternalEndpoint("grpc")). WithTracingConfig(fmt.Sprintf(`type: JAEGER config: sampler_type: const sampler_param: 1 - service_name: %s`, s.NetworkName()+"-query")). // Use fake tracing config to trigger exemplar. + service_name: %s`, qUninit.Name())). // Use fake tracing config to trigger exemplar. WithImage(tcase.queryImage). - Build() + Initiate(qUninit, s1.InternalEndpoint("grpc")) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 1*time.Minute) t.Cleanup(cancel) // We should have single TCP connection, since all APIs are against the same server. - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(1), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(1), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), queryUpWithoutInstance, promclient.QueryOptions{ + queryAndAssertSeries(t, ctx, q.Endpoint("http"), queryUpWithoutInstance, promclient.QueryOptions{ Deduplicate: false, }, []model.Metric{ { @@ -425,7 +429,7 @@ config: var promMeta map[string][]metadatapb.Meta // Wait metadata response to be ready as Prometheus gets metadata after scrape. 
testutil.Ok(t, runutil.Retry(3*time.Second, ctx.Done(), func() error { - promMeta, err = promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+p1.HTTPEndpoint()), "", -1) + promMeta, err = promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+p1.Endpoint("http")), "", -1) testutil.Ok(t, err) if len(promMeta) > 0 { return nil @@ -433,7 +437,7 @@ config: return fmt.Errorf("empty metadata response from Prometheus") })) - thanosMeta, err := promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+q.HTTPEndpoint()), "", -1) + thanosMeta, err := promclient.NewDefaultClient().MetricMetadataInGRPC(ctx, mustURLParse(t, "http://"+q.Endpoint("http")), "", -1) testutil.Ok(t, err) testutil.Assert(t, len(thanosMeta) > 0, "got empty metadata response from Thanos") @@ -448,11 +452,11 @@ config: end := timestamp.FromTime(now.Add(time.Hour)) // Send HTTP requests to thanos query to trigger exemplars. - labelNames(t, ctx, q.HTTPEndpoint(), nil, start, end, func(res []string) bool { + labelNames(t, ctx, q.Endpoint("http"), nil, start, end, func(res []string) bool { return true }) - queryExemplars(t, ctx, q.HTTPEndpoint(), `http_request_duration_seconds_bucket{handler="label_names"}`, start, end, exemplarsOnExpectedSeries(map[string]string{ + queryExemplars(t, ctx, q.Endpoint("http"), `http_request_duration_seconds_bucket{handler="label_names"}`, start, end, exemplarsOnExpectedSeries(map[string]string{ "__name__": "http_request_duration_seconds_bucket", "handler": "label_names", "job": "myself", @@ -463,7 +467,7 @@ config: // Targets. { - targetAndAssert(t, ctx, q.HTTPEndpoint(), "", &targetspb.TargetDiscovery{ + targetAndAssert(t, ctx, q.Endpoint("http"), "", &targetspb.TargetDiscovery{ ActiveTargets: []*targetspb.ActiveTarget{ { DiscoveredLabels: labelpb.ZLabelSet{Labels: []labelpb.ZLabel{ @@ -506,7 +510,7 @@ config: // Rules. 
{ - ruleAndAssert(t, ctx, q.HTTPEndpoint(), "", []*rulespb.RuleGroup{ + ruleAndAssert(t, ctx, q.Endpoint("http"), "", []*rulespb.RuleGroup{ { Name: "example_abort", File: "/shared/rules/rules.yaml", @@ -575,7 +579,7 @@ func instantQuery(t *testing.T, ctx context.Context, addr, q string, opts promcl logger := log.NewLogfmtLogger(os.Stdout) logger = log.With(logger, "ts", log.DefaultTimestampUTC) - testutil.Ok(t, runutil.RetryWithLog(logger, time.Second, ctx.Done(), func() error { + testutil.Ok(t, runutil.RetryWithLog(logger, 5*time.Second, ctx.Done(), func() error { res, warnings, err := promclient.NewDefaultClient().QueryInstant(ctx, mustURLParse(t, "http://"+addr), q, time.Now(), opts) if err != nil { return err diff --git a/test/e2e/receive_test.go b/test/e2e/receive_test.go index bb335d5c66f..c41c9b7ddb1 100644 --- a/test/e2e/receive_test.go +++ b/test/e2e/receive_test.go @@ -8,11 +8,10 @@ import ( "log" "net/http" "net/http/httputil" - "net/url" "testing" "time" - "github.com/cortexproject/cortex/integration/e2e" + "github.com/efficientgo/e2e" "github.com/prometheus/common/model" "github.com/thanos-io/thanos/pkg/promclient" "github.com/thanos-io/thanos/pkg/receive" @@ -20,12 +19,6 @@ import ( "github.com/thanos-io/thanos/test/e2e/e2ethanos" ) -type ReverseProxyConfig struct { - tenantId string - port string - target string -} - type DebugTransport struct{} func (DebugTransport) RoundTrip(r *http.Request) (*http.Response, error) { @@ -36,19 +29,6 @@ func (DebugTransport) RoundTrip(r *http.Request) (*http.Response, error) { return http.DefaultTransport.RoundTrip(r) } -func generateProxy(conf ReverseProxyConfig) { - targetURL, _ := url.Parse(conf.target) - proxy := httputil.NewSingleHostReverseProxy(targetURL) - d := proxy.Director - proxy.Director = func(r *http.Request) { - d(r) // call default director - r.Header.Add("THANOS-TENANT", conf.tenantId) - } - proxy.ErrorHandler = ErrorHandler - proxy.Transport = DebugTransport{} - log.Fatal(http.ListenAndServe(conf.port, proxy)) -} - func ErrorHandler(_ http.ResponseWriter, _ *http.Request, err error) { log.Print("Response from receiver") log.Print(err) @@ -75,31 +55,31 @@ func TestReceive(t *testing.T) { */ t.Parallel() - s, err := e2e.NewScenario("e2e_receive_single_ingestor") + e, err := e2e.NewDockerEnvironment("e2e_receive_single_ingestor") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) // Setup Router Ingestor. 
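The single-ingestor setup that follows shows the endpoint change this whole diff revolves around: each runnable registers its ports under names, and callers look them up by name instead of by number. A short fragment of the pattern, assuming the enclosing test's `e` and `t`; every identifier used here appears elsewhere in this diff.

```go
// Fragment, not a full test: `e` and `t` come from the surrounding TestReceive.
i, err := e2ethanos.NewIngestingReceiver(e, "ingestor")
testutil.Ok(t, err)
testutil.Ok(t, e2e.StartAndWaitReady(i))

// Old API: e2ethanos.RemoteWriteEndpoint(i.NetworkEndpoint(8081)).
// New API: the receiver registers "remote-write" and "grpc" port names.
remoteWriteURL := e2ethanos.RemoteWriteEndpoint(i.InternalEndpoint("remote-write"))
grpcAddr := i.InternalEndpoint("grpc") // Fed to the querier builder below.
_, _ = remoteWriteURL, grpcAddr
```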
- i, err := e2ethanos.NewIngestingReceiver(s.SharedDir(), "ingestor") + i, err := e2ethanos.NewIngestingReceiver(e, "ingestor") testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(i)) + testutil.Ok(t, e2e.StartAndWaitReady(i)) // Setup Prometheus - prom, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(i.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom, _, err := e2ethanos.NewPrometheus(e, "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(i.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom)) + testutil.Ok(t, e2e.StartAndWaitReady(prom)) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", i.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", i.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(1), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(1), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) // We expect the data from the Prometheus instance to be ingested by our single receiver and queryable through the querier. - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), queryUpWithoutInstance, promclient.QueryOptions{ + queryAndAssertSeries(t, ctx, q.Endpoint("http"), queryUpWithoutInstance, promclient.QueryOptions{ Deduplicate: false, }, []model.Metric{ { @@ -143,51 +123,51 @@ func TestReceive(t *testing.T) { */ t.Parallel() - s, err := e2e.NewScenario("e2e_receive_router_replication") + e, err := e2e.NewDockerEnvironment("e2e_receive_router_replication") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) // Setup 3 ingestors.
- i1, err := e2ethanos.NewIngestingReceiver(s.SharedDir(), "i1") + i1, err := e2ethanos.NewIngestingReceiver(e, "i1") testutil.Ok(t, err) - i2, err := e2ethanos.NewIngestingReceiver(s.SharedDir(), "i2") + i2, err := e2ethanos.NewIngestingReceiver(e, "i2") testutil.Ok(t, err) - i3, err := e2ethanos.NewIngestingReceiver(s.SharedDir(), "i3") + i3, err := e2ethanos.NewIngestingReceiver(e, "i3") testutil.Ok(t, err) h := receive.HashringConfig{ Endpoints: []string{ - i1.GRPCNetworkEndpointFor(s.NetworkName()), - i2.GRPCNetworkEndpointFor(s.NetworkName()), - i3.GRPCNetworkEndpointFor(s.NetworkName()), + i1.InternalEndpoint("grpc"), + i2.InternalEndpoint("grpc"), + i3.InternalEndpoint("grpc"), }, } // Setup 1 distributor - r1, err := e2ethanos.NewRoutingReceiver(s.SharedDir(), "r1", 2, h) + r1, err := e2ethanos.NewRoutingReceiver(e, "r1", 2, h) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(i1, i2, i3, r1)) + testutil.Ok(t, e2e.StartAndWaitReady(i1, i2, i3, r1)) - prom1, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(r1.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom1, _, err := e2ethanos.NewPrometheus(e, "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(r1.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom2, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "2", defaultPromConfig("prom2", 0, e2ethanos.RemoteWriteEndpoint(r1.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom2, _, err := e2ethanos.NewPrometheus(e, "2", defaultPromConfig("prom2", 0, e2ethanos.RemoteWriteEndpoint(r1.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom3, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "3", defaultPromConfig("prom3", 0, e2ethanos.RemoteWriteEndpoint(r1.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom3, _, err := e2ethanos.NewPrometheus(e, "3", defaultPromConfig("prom3", 0, e2ethanos.RemoteWriteEndpoint(r1.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1, prom2, prom3)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1, prom2, prom3)) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", i1.GRPCNetworkEndpoint(), i2.GRPCNetworkEndpoint(), i3.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", i1.InternalEndpoint("grpc"), i2.InternalEndpoint("grpc"), i3.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(3), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(3), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) expectedReplicationFactor := 2.0 - queryAndAssert(t, ctx, q.HTTPEndpoint(), "count(up) by (prometheus)", promclient.QueryOptions{ + queryAndAssert(t, ctx, q.Endpoint("http"), "count(up) by (prometheus)", promclient.QueryOptions{ Deduplicate: false, }, model.Vector{ &model.Sample{ @@ -250,57 +230,57 @@ func TestReceive(t *testing.T) { */ t.Parallel() - s, err := e2e.NewScenario("e2e_receive_routing_tree") + e, err := e2e.NewDockerEnvironment("e2e_receive_routing_tree") testutil.Ok(t, err) - 
t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) // Setup ingestors. - i1, err := e2ethanos.NewIngestingReceiver(s.SharedDir(), "i1") + i1, err := e2ethanos.NewIngestingReceiver(e, "i1") testutil.Ok(t, err) - i2, err := e2ethanos.NewIngestingReceiver(s.SharedDir(), "i2") + i2, err := e2ethanos.NewIngestingReceiver(e, "i2") testutil.Ok(t, err) - i3, err := e2ethanos.NewIngestingReceiver(s.SharedDir(), "i3") + i3, err := e2ethanos.NewIngestingReceiver(e, "i3") testutil.Ok(t, err) // Setup distributors - r2, err := e2ethanos.NewRoutingReceiver(s.SharedDir(), "r2", 2, receive.HashringConfig{ + r2, err := e2ethanos.NewRoutingReceiver(e, "r2", 2, receive.HashringConfig{ Endpoints: []string{ - i2.GRPCNetworkEndpointFor(s.NetworkName()), - i3.GRPCNetworkEndpointFor(s.NetworkName()), + i2.InternalEndpoint("grpc"), + i3.InternalEndpoint("grpc"), }, }) testutil.Ok(t, err) - r1, err := e2ethanos.NewRoutingReceiver(s.SharedDir(), "r1", 2, receive.HashringConfig{ + r1, err := e2ethanos.NewRoutingReceiver(e, "r1", 2, receive.HashringConfig{ Endpoints: []string{ - r2.GRPCNetworkEndpointFor(s.NetworkName()), - i1.GRPCNetworkEndpointFor(s.NetworkName()), + i1.InternalEndpoint("grpc"), + r2.InternalEndpoint("grpc"), }, }) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(i1, i2, i3, r1, r2)) + testutil.Ok(t, e2e.StartAndWaitReady(i1, i2, i3, r1, r2)) //Setup Prometheuses - prom1, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(r1.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom1, _, err := e2ethanos.NewPrometheus(e, "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(r1.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom2, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "2", defaultPromConfig("prom2", 0, e2ethanos.RemoteWriteEndpoint(r1.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom2, _, err := e2ethanos.NewPrometheus(e, "2", defaultPromConfig("prom2", 0, e2ethanos.RemoteWriteEndpoint(r1.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1, prom2)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1, prom2)) //Setup Querier - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", i1.GRPCNetworkEndpoint(), i2.GRPCNetworkEndpoint(), i3.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", i1.InternalEndpoint("grpc"), i2.InternalEndpoint("grpc"), i3.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(3), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(3), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) expectedReplicationFactor := 3.0 - queryAndAssert(t, ctx, q.HTTPEndpoint(), "count(up) by (prometheus)", promclient.QueryOptions{ + queryAndAssert(t, ctx, q.Endpoint("http"), "count(up) by (prometheus)", promclient.QueryOptions{ Deduplicate: false, }, model.Vector{ &model.Sample{ @@ -355,72 +335,70 @@ func TestReceive(t *testing.T) { └───────┘ */ t.Parallel() - s, err := e2e.NewScenario("e2e_test_receive_hashring") - testutil.Ok(t, err) - 
t.Cleanup(e2ethanos.CleanScenario(t, s)) - r1, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "1", 1) - testutil.Ok(t, err) - r2, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "2", 1) - testutil.Ok(t, err) - r3, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "3", 1) + e, err := e2e.NewDockerEnvironment("e2e_test_receive_hashring") testutil.Ok(t, err) + t.Cleanup(e2ethanos.CleanScenario(t, e)) + + r1 := e2ethanos.NewUninitiatedReceiver(e, "1") + r2 := e2ethanos.NewUninitiatedReceiver(e, "2") + r3 := e2ethanos.NewUninitiatedReceiver(e, "3") h := receive.HashringConfig{ Endpoints: []string{ - r1.GRPCNetworkEndpointFor(s.NetworkName()), - r2.GRPCNetworkEndpointFor(s.NetworkName()), - r3.GRPCNetworkEndpointFor(s.NetworkName()), + r1.InternalEndpoint("grpc"), + r2.InternalEndpoint("grpc"), + r3.InternalEndpoint("grpc"), }, } - // Recreate again, but with hashring config. - r1, err = e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "1", 1, h) + // Create with hashring config. + r1Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(r1, e.SharedDir(), 1, h) testutil.Ok(t, err) - r2, err = e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "2", 1, h) + r2Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(r2, e.SharedDir(), 1, h) testutil.Ok(t, err) - r3, err = e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "3", 1, h) + r3Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(r3, e.SharedDir(), 1, h) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(r1, r2, r3)) + testutil.Ok(t, e2e.StartAndWaitReady(r1Runnable, r2Runnable, r3Runnable)) - prom1, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(r1.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom1, _, err := e2ethanos.NewPrometheus(e, "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(r1.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom2, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "2", defaultPromConfig("prom2", 0, e2ethanos.RemoteWriteEndpoint(r2.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom2, _, err := e2ethanos.NewPrometheus(e, "2", defaultPromConfig("prom2", 0, e2ethanos.RemoteWriteEndpoint(r2.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom3, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "3", defaultPromConfig("prom3", 0, e2ethanos.RemoteWriteEndpoint(r3.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom3, _, err := e2ethanos.NewPrometheus(e, "3", defaultPromConfig("prom3", 0, e2ethanos.RemoteWriteEndpoint(r3.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1, prom2, prom3)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1, prom2, prom3)) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", r1.GRPCNetworkEndpoint(), r2.GRPCNetworkEndpoint(), r3.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", r1.InternalEndpoint("grpc"), r2.InternalEndpoint("grpc"), r3.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 
3*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(3), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(3), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), queryUpWithoutInstance, promclient.QueryOptions{ + queryAndAssertSeries(t, ctx, q.Endpoint("http"), queryUpWithoutInstance, promclient.QueryOptions{ Deduplicate: false, }, []model.Metric{ { "job": "myself", "prometheus": "prom1", - "receive": "2", + "receive": "receive-2", "replica": "0", "tenant_id": "default-tenant", }, { "job": "myself", "prometheus": "prom2", - "receive": "1", + "receive": "receive-1", "replica": "0", "tenant_id": "default-tenant", }, { "job": "myself", "prometheus": "prom3", - "receive": "2", + "receive": "receive-2", "replica": "0", "tenant_id": "default-tenant", }, @@ -430,73 +408,70 @@ func TestReceive(t *testing.T) { t.Run("hashring with config watcher", func(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_receive_hashring_config_watcher") + e, err := e2e.NewDockerEnvironment("e2e_test_receive_hashring_config_watcher") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) - r1, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "1", 1) - testutil.Ok(t, err) - r2, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "2", 1) - testutil.Ok(t, err) - r3, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "3", 1) - testutil.Ok(t, err) + r1 := e2ethanos.NewUninitiatedReceiver(e, "1") + r2 := e2ethanos.NewUninitiatedReceiver(e, "2") + r3 := e2ethanos.NewUninitiatedReceiver(e, "3") h := receive.HashringConfig{ Endpoints: []string{ - r1.GRPCNetworkEndpointFor(s.NetworkName()), - r2.GRPCNetworkEndpointFor(s.NetworkName()), - r3.GRPCNetworkEndpointFor(s.NetworkName()), + r1.InternalEndpoint("grpc"), + r2.InternalEndpoint("grpc"), + r3.InternalEndpoint("grpc"), }, } - // Recreate again, but with hashring config. + // Create with hashring config. // TODO(kakkoyun): Update config file and wait config watcher to reconcile hashring. 
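All the hashring subtests below follow the same two-phase bootstrap that replaces the old create-then-recreate dance: an uninitiated receiver hands out stable in-network endpoints before anything starts, so the hashring config can reference every peer up front. A condensed fragment, assuming the enclosing test's `e` and `t`, and showing a single receiver for brevity.

```go
// Fragment: two-phase receiver bootstrap using only constructors from this diff.
r1 := e2ethanos.NewUninitiatedReceiver(e, "1") // Endpoints resolvable before start.

h := receive.HashringConfig{
	Endpoints: []string{r1.InternalEndpoint("grpc")},
}

// Only now is the runnable materialized, already carrying the hashring.
r1Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(r1, e.SharedDir(), 1, h)
testutil.Ok(t, err)
testutil.Ok(t, e2e.StartAndWaitReady(r1Runnable))
```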
- r1, err = e2ethanos.NewRoutingAndIngestingReceiverWithConfigWatcher(s.SharedDir(), s.NetworkName(), "1", 1, h) + r1Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverWithConfigWatcher(r1, e.SharedDir(), 1, h) testutil.Ok(t, err) - r2, err = e2ethanos.NewRoutingAndIngestingReceiverWithConfigWatcher(s.SharedDir(), s.NetworkName(), "2", 1, h) + r2Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverWithConfigWatcher(r2, e.SharedDir(), 1, h) testutil.Ok(t, err) - r3, err = e2ethanos.NewRoutingAndIngestingReceiverWithConfigWatcher(s.SharedDir(), s.NetworkName(), "3", 1, h) + r3Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverWithConfigWatcher(r3, e.SharedDir(), 1, h) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(r1, r2, r3)) + testutil.Ok(t, e2e.StartAndWaitReady(r1Runnable, r2Runnable, r3Runnable)) - prom1, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(r1.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom1, _, err := e2ethanos.NewPrometheus(e, "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(r1.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom2, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "2", defaultPromConfig("prom2", 0, e2ethanos.RemoteWriteEndpoint(r2.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom2, _, err := e2ethanos.NewPrometheus(e, "2", defaultPromConfig("prom2", 0, e2ethanos.RemoteWriteEndpoint(r2.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom3, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "3", defaultPromConfig("prom3", 0, e2ethanos.RemoteWriteEndpoint(r3.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom3, _, err := e2ethanos.NewPrometheus(e, "3", defaultPromConfig("prom3", 0, e2ethanos.RemoteWriteEndpoint(r3.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1, prom2, prom3)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1, prom2, prom3)) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", r1.GRPCNetworkEndpoint(), r2.GRPCNetworkEndpoint(), r3.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", r1.InternalEndpoint("grpc"), r2.InternalEndpoint("grpc"), r3.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(3), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(3), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), queryUpWithoutInstance, promclient.QueryOptions{ + queryAndAssertSeries(t, ctx, q.Endpoint("http"), queryUpWithoutInstance, promclient.QueryOptions{ Deduplicate: false, }, []model.Metric{ { "job": "myself", "prometheus": "prom1", - "receive": "2", + "receive": "receive-2", "replica": "0", "tenant_id": "default-tenant", }, { "job": "myself", "prometheus": "prom2", - "receive": "1", + "receive": "receive-1", "replica": "0", "tenant_id": "default-tenant", }, { "job": "myself", "prometheus": "prom3", - "receive": "2", + "receive": "receive-2", "replica": "0", "tenant_id": "default-tenant", 
}, @@ -506,72 +481,70 @@ func TestReceive(t *testing.T) { t.Run("replication", func(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_receive_replication") + e, err := e2e.NewDockerEnvironment("e2e_test_receive_replication") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) // The replication suite creates three receivers but only one // receives Prometheus remote-written data. The querier queries all // receivers and the test verifies that the time series are // replicated to all of the nodes. - r1, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "1", 3) - testutil.Ok(t, err) - r2, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "2", 3) - testutil.Ok(t, err) - r3, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "3", 3) - testutil.Ok(t, err) + + r1 := e2ethanos.NewUninitiatedReceiver(e, "1") + r2 := e2ethanos.NewUninitiatedReceiver(e, "2") + r3 := e2ethanos.NewUninitiatedReceiver(e, "3") h := receive.HashringConfig{ Endpoints: []string{ - r1.GRPCNetworkEndpointFor(s.NetworkName()), - r2.GRPCNetworkEndpointFor(s.NetworkName()), - r3.GRPCNetworkEndpointFor(s.NetworkName()), + r1.InternalEndpoint("grpc"), + r2.InternalEndpoint("grpc"), + r3.InternalEndpoint("grpc"), }, } - // Recreate again, but with hashring config. - r1, err = e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "1", 3, h) + // Create with hashring config. + r1Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(r1, e.SharedDir(), 3, h) testutil.Ok(t, err) - r2, err = e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "2", 3, h) + r2Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(r2, e.SharedDir(), 3, h) testutil.Ok(t, err) - r3, err = e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "3", 3, h) + r3Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(r3, e.SharedDir(), 3, h) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(r1, r2, r3)) + testutil.Ok(t, e2e.StartAndWaitReady(r1Runnable, r2Runnable, r3Runnable)) - prom1, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(r1.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom1, _, err := e2ethanos.NewPrometheus(e, "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(r1.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1)) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", r1.GRPCNetworkEndpoint(), r2.GRPCNetworkEndpoint(), r3.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", r1.InternalEndpoint("grpc"), r2.InternalEndpoint("grpc"), r3.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(3), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(3), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), queryUpWithoutInstance, promclient.QueryOptions{ + 
queryAndAssertSeries(t, ctx, q.Endpoint("http"), queryUpWithoutInstance, promclient.QueryOptions{ Deduplicate: false, }, []model.Metric{ { "job": "myself", "prometheus": "prom1", - "receive": "1", + "receive": "receive-1", "replica": "0", "tenant_id": "default-tenant", }, { "job": "myself", "prometheus": "prom1", - "receive": "2", + "receive": "receive-2", "replica": "0", "tenant_id": "default-tenant", }, { "job": "myself", "prometheus": "prom1", - "receive": "3", + "receive": "receive-3", "replica": "0", "tenant_id": "default-tenant", }, @@ -581,62 +554,60 @@ func TestReceive(t *testing.T) { t.Run("replication_with_outage", func(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_receive_replication_with_outage") + e, err := e2e.NewDockerEnvironment("e2e_test_receive_replication_with_outage") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) // The replication suite creates a three-node hashring but one of the // receivers is dead. In this case, replication should still // succeed and the time series should be replicated to the other nodes. - r1, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "1", 3) - testutil.Ok(t, err) - r2, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "2", 3) - testutil.Ok(t, err) - notRunningR3, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "3", 3) - testutil.Ok(t, err) + + r1 := e2ethanos.NewUninitiatedReceiver(e, "1") + r2 := e2ethanos.NewUninitiatedReceiver(e, "2") + r3 := e2ethanos.NewUninitiatedReceiver(e, "3") h := receive.HashringConfig{ Endpoints: []string{ - r1.GRPCNetworkEndpointFor(s.NetworkName()), - r2.GRPCNetworkEndpointFor(s.NetworkName()), - notRunningR3.GRPCNetworkEndpointFor(s.NetworkName()), + r1.InternalEndpoint("grpc"), + r2.InternalEndpoint("grpc"), + r3.InternalEndpoint("grpc"), }, } - // Recreate again, but with hashring config. - r1, err = e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "1", 3, h) + // Create with hashring config. 
+ r1Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(r1, e.SharedDir(), 3, h) testutil.Ok(t, err) - r2, err = e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "2", 3, h) + r2Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(r2, e.SharedDir(), 3, h) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(r1, r2)) + testutil.Ok(t, e2e.StartAndWaitReady(r1Runnable, r2Runnable)) - prom1, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(r1.NetworkEndpoint(8081)), ""), e2ethanos.DefaultPrometheusImage()) + prom1, _, err := e2ethanos.NewPrometheus(e, "1", defaultPromConfig("prom1", 0, e2ethanos.RemoteWriteEndpoint(r1.InternalEndpoint("remote-write")), ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1)) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", r1.GRPCNetworkEndpoint(), r2.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", r1.InternalEndpoint("grpc"), r2.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(2), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(2), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), queryUpWithoutInstance, promclient.QueryOptions{ + queryAndAssertSeries(t, ctx, q.Endpoint("http"), queryUpWithoutInstance, promclient.QueryOptions{ Deduplicate: false, }, []model.Metric{ { "job": "myself", "prometheus": "prom1", - "receive": "1", + "receive": "receive-1", "replica": "0", "tenant_id": "default-tenant", }, { "job": "myself", "prometheus": "prom1", - "receive": "2", + "receive": "receive-2", "replica": "0", "tenant_id": "default-tenant", }, @@ -646,70 +617,56 @@ func TestReceive(t *testing.T) { t.Run("multitenancy", func(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_for_multitenancy") + e, err := e2e.NewDockerEnvironment("e2e_test_for_multitenancy") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) - // The replication suite creates a three-node hashring but one of the - // receivers is dead. In this case, replication should still - // succeed and the time series should be replicated to the other nodes. - r1, err := e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "1", 1) - testutil.Ok(t, err) + r1 := e2ethanos.NewUninitiatedReceiver(e, "1") h := receive.HashringConfig{ Endpoints: []string{ - r1.GRPCNetworkEndpointFor(s.NetworkName()), + r1.InternalEndpoint("grpc"), }, } - // Recreate again, but with hashring config. - r1, err = e2ethanos.NewRoutingAndIngestingReceiver(s.SharedDir(), s.NetworkName(), "1", 1, h) + // Create with hashring config. 
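Just below, this multitenancy test also swaps the deleted generateProxy goroutines, which were hard-coded to the 172.17.0.1 Docker bridge address, for reverse-proxy runnables living inside the test network. A fragment of the new wiring, assuming the enclosing test's `e`, `t`, and `r1`:

```go
// Fragment: an in-network proxy that injects the THANOS-TENANT header.
rp1, err := e2ethanos.NewReverseProxy(e, "1", "tenant-1", "http://"+r1.InternalEndpoint("remote-write"))
testutil.Ok(t, err)
testutil.Ok(t, e2e.StartAndWaitReady(rp1))

// Prometheus remote-writes through the proxy instead of a host-bound port.
promCfg := defaultPromConfig("prom1", 0, "http://"+rp1.InternalEndpoint("http")+"/api/v1/receive", "")
_ = promCfg // Handed to e2ethanos.NewPrometheus as in the hunk below.
```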
+ r1Runnable, err := e2ethanos.NewRoutingAndIngestingReceiverFromService(r1, e.SharedDir(), 1, h) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(r1)) - testutil.Ok(t, err) - - conf1 := ReverseProxyConfig{ - tenantId: "tenant-1", - port: ":9097", - target: "http://" + r1.Endpoint(8081), - } - conf2 := ReverseProxyConfig{ - tenantId: "tenant-2", - port: ":9098", - target: "http://" + r1.Endpoint(8081), - } + testutil.Ok(t, e2e.StartAndWaitReady(r1Runnable)) - go generateProxy(conf1) - go generateProxy(conf2) + rp1, err := e2ethanos.NewReverseProxy(e, "1", "tenant-1", "http://"+r1.InternalEndpoint("remote-write")) + testutil.Ok(t, err) + rp2, err := e2ethanos.NewReverseProxy(e, "2", "tenant-2", "http://"+r1.InternalEndpoint("remote-write")) + testutil.Ok(t, err) + testutil.Ok(t, e2e.StartAndWaitReady(rp1, rp2)) - prom1, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "1", defaultPromConfig("prom1", 0, "http://172.17.0.1:9097/api/v1/receive", ""), e2ethanos.DefaultPrometheusImage()) + prom1, _, err := e2ethanos.NewPrometheus(e, "1", defaultPromConfig("prom1", 0, "http://"+rp1.InternalEndpoint("http")+"/api/v1/receive", ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - prom2, _, err := e2ethanos.NewPrometheus(s.SharedDir(), "2", defaultPromConfig("prom1", 0, "http://172.17.0.1:9098/api/v1/receive", ""), e2ethanos.DefaultPrometheusImage()) + prom2, _, err := e2ethanos.NewPrometheus(e, "2", defaultPromConfig("prom2", 0, "http://"+rp2.InternalEndpoint("http")+"/api/v1/receive", ""), e2ethanos.DefaultPrometheusImage()) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1)) - testutil.Ok(t, s.StartAndWaitReady(prom2)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1, prom2)) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", r1.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", r1.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(1), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), queryUpWithoutInstance, promclient.QueryOptions{ + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(1), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) + queryAndAssertSeries(t, ctx, q.Endpoint("http"), queryUpWithoutInstance, promclient.QueryOptions{ Deduplicate: false, }, []model.Metric{ { "job": "myself", "prometheus": "prom1", - "receive": "1", + "receive": "receive-1", "replica": "0", "tenant_id": "tenant-1", }, { "job": "myself", - "prometheus": "prom1", - "receive": "1", + "prometheus": "prom2", + "receive": "receive-1", "replica": "0", "tenant_id": "tenant-2", }, diff --git a/test/e2e/rule_test.go b/test/e2e/rule_test.go index fb7289ca906..eeb2c20b9c1 100644 --- a/test/e2e/rule_test.go +++ b/test/e2e/rule_test.go @@ -15,15 +15,14 @@ import ( "testing" "time" - "github.com/cortexproject/cortex/integration/e2e" + "github.com/efficientgo/e2e" "github.com/prometheus/common/model" "github.com/prometheus/prometheus/discovery/targetgroup" + "github.com/thanos-io/thanos/pkg/httpconfig" "gopkg.in/yaml.v2" "github.com/thanos-io/thanos/pkg/alert" - http_util "github.com/thanos-io/thanos/pkg/http" "github.com/thanos-io/thanos/pkg/promclient" - "github.com/thanos-io/thanos/pkg/query" 
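In the multitenancy hunk above, the hand-rolled `ReverseProxyConfig`/`generateProxy` pair is replaced by a containerized `e2ethanos.NewReverseProxy(e, name, tenant, target)`. The helper's internals are not part of this diff, but a minimal sketch of what such a tenant-injecting proxy has to do could look like this, assuming Receive's default tenant header name (`THANOS-TENANT`):

```go
package e2ethanos // Illustrative placement only.

import (
	"net/http"
	"net/http/httputil"
	"net/url"
)

// newTenantProxy forwards remote-write traffic to a receiver while stamping
// every request with a tenant header, so one receiver can serve many tenants.
func newTenantProxy(tenant string, target *url.URL) *httputil.ReverseProxy {
	p := httputil.NewSingleHostReverseProxy(target)
	director := p.Director
	p.Director = func(r *http.Request) {
		director(r)                           // Rewrite scheme and host to the target first.
		r.Header.Set("THANOS-TENANT", tenant) // Assumed default header; configurable in Receive.
	}
	return p
}
```

Running the proxy inside the Docker network, rather than via `go generateProxy(...)` on the host as before, is what lets the hard-coded `172.17.0.1` bridge addresses disappear from the Prometheus remote-write configs above.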
"github.com/thanos-io/thanos/pkg/rules/rulespb" "github.com/thanos-io/thanos/pkg/testutil" "github.com/thanos-io/thanos/test/e2e/e2ethanos" @@ -127,7 +126,7 @@ func reloadRulesHTTP(t *testing.T, ctx context.Context, endpoint string) { testutil.Equals(t, 200, resp.StatusCode) } -func reloadRulesSignal(t *testing.T, r *e2ethanos.Service) { +func reloadRulesSignal(t *testing.T, r *e2e.InstrumentedRunnable) { c := e2e.NewCommand("kill", "-1", "1") _, _, err := r.Exec(c) testutil.Ok(t, err) @@ -197,55 +196,55 @@ func writeTargets(t *testing.T, path string, addrs ...string) { func TestRule(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_rule") + e, err := e2e.NewDockerEnvironment("e2e_test_rule") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) - ctx, cancel := context.WithTimeout(context.Background(), 3*time.Minute) + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute) t.Cleanup(cancel) // Prepare work dirs. rulesSubDir := filepath.Join("rules") - rulesPath := filepath.Join(s.SharedDir(), rulesSubDir) + rulesPath := filepath.Join(e.SharedDir(), rulesSubDir) testutil.Ok(t, os.MkdirAll(rulesPath, os.ModePerm)) createRuleFiles(t, rulesPath) amTargetsSubDir := filepath.Join("rules_am_targets") - testutil.Ok(t, os.MkdirAll(filepath.Join(s.SharedDir(), amTargetsSubDir), os.ModePerm)) + testutil.Ok(t, os.MkdirAll(filepath.Join(e.SharedDir(), amTargetsSubDir), os.ModePerm)) queryTargetsSubDir := filepath.Join("rules_query_targets") - testutil.Ok(t, os.MkdirAll(filepath.Join(s.SharedDir(), queryTargetsSubDir), os.ModePerm)) + testutil.Ok(t, os.MkdirAll(filepath.Join(e.SharedDir(), queryTargetsSubDir), os.ModePerm)) - am1, err := e2ethanos.NewAlertmanager(s.SharedDir(), "1") + am1, err := e2ethanos.NewAlertmanager(e, "1") testutil.Ok(t, err) - am2, err := e2ethanos.NewAlertmanager(s.SharedDir(), "2") + am2, err := e2ethanos.NewAlertmanager(e, "2") testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(am1, am2)) + testutil.Ok(t, e2e.StartAndWaitReady(am1, am2)) - r, err := e2ethanos.NewRuler(s.SharedDir(), "1", rulesSubDir, []alert.AlertmanagerConfig{ + r, err := e2ethanos.NewRuler(e, "1", rulesSubDir, []alert.AlertmanagerConfig{ { - EndpointsConfig: http_util.EndpointsConfig{ - FileSDConfigs: []http_util.FileSDConfig{ + EndpointsConfig: httpconfig.EndpointsConfig{ + FileSDConfigs: []httpconfig.FileSDConfig{ { // FileSD which will be used to register discover dynamically am1. - Files: []string{filepath.Join(e2e.ContainerSharedDir, amTargetsSubDir, "*.yaml")}, + Files: []string{filepath.Join(e2ethanos.ContainerSharedDir, amTargetsSubDir, "*.yaml")}, RefreshInterval: model.Duration(time.Second), }, }, StaticAddresses: []string{ - am2.NetworkHTTPEndpoint(), + am2.InternalEndpoint("http"), }, Scheme: "http", }, Timeout: model.Duration(10 * time.Second), APIVersion: alert.APIv1, }, - }, []query.Config{ + }, []httpconfig.Config{ { - EndpointsConfig: http_util.EndpointsConfig{ + EndpointsConfig: httpconfig.EndpointsConfig{ // We test Statically Addressed queries in other tests. Focus on FileSD here. - FileSDConfigs: []http_util.FileSDConfig{ + FileSDConfigs: []httpconfig.FileSDConfig{ { // FileSD which will be used to register discover dynamically q. 
- Files: []string{filepath.Join(e2e.ContainerSharedDir, queryTargetsSubDir, "*.yaml")}, + Files: []string{filepath.Join(e2ethanos.ContainerSharedDir, queryTargetsSubDir, "*.yaml")}, RefreshInterval: model.Duration(time.Second), }, }, @@ -254,11 +253,11 @@ func TestRule(t *testing.T) { }, }) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(r)) + testutil.Ok(t, e2e.StartAndWaitReady(r)) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", r.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", r.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) t.Run("no query configured", func(t *testing.T) { // Check for a few evaluations, check all of them failed. @@ -272,9 +271,9 @@ func TestRule(t *testing.T) { var currentFailures float64 t.Run("attach query", func(t *testing.T) { // Attach querier to target files. - writeTargets(t, filepath.Join(s.SharedDir(), queryTargetsSubDir, "targets.yaml"), q.NetworkHTTPEndpoint()) + writeTargets(t, filepath.Join(e.SharedDir(), queryTargetsSubDir, "targets.yaml"), q.InternalEndpoint("http")) - testutil.Ok(t, r.WaitSumMetricsWithOptions(e2e.Equals(1), []string{"thanos_rule_query_apis_dns_provider_results"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, r.WaitSumMetricsWithOptions(e2e.Equals(1), []string{"thanos_rule_query_apis_dns_provider_results"}, e2e.WaitMissingMetrics())) testutil.Ok(t, r.WaitSumMetrics(e2e.Equals(1), "thanos_rule_alertmanagers_dns_provider_results")) var currentVal float64 @@ -305,7 +304,7 @@ func TestRule(t *testing.T) { }) t.Run("attach am1", func(t *testing.T) { // Attach am1 to target files. - writeTargets(t, filepath.Join(s.SharedDir(), amTargetsSubDir, "targets.yaml"), am1.NetworkHTTPEndpoint()) + writeTargets(t, filepath.Join(e.SharedDir(), amTargetsSubDir, "targets.yaml"), am1.InternalEndpoint("http")) testutil.Ok(t, r.WaitSumMetrics(e2e.Equals(1), "thanos_rule_query_apis_dns_provider_results")) testutil.Ok(t, r.WaitSumMetrics(e2e.Equals(2), "thanos_rule_alertmanagers_dns_provider_results")) @@ -329,7 +328,7 @@ func TestRule(t *testing.T) { }) t.Run("am1 drops again", func(t *testing.T) { - testutil.Ok(t, os.RemoveAll(filepath.Join(s.SharedDir(), amTargetsSubDir, "targets.yaml"))) + testutil.Ok(t, os.RemoveAll(filepath.Join(e.SharedDir(), amTargetsSubDir, "targets.yaml"))) testutil.Ok(t, r.WaitSumMetrics(e2e.Equals(1), "thanos_rule_query_apis_dns_provider_results")) testutil.Ok(t, r.WaitSumMetrics(e2e.Equals(1), "thanos_rule_alertmanagers_dns_provider_results")) @@ -356,83 +355,85 @@ func TestRule(t *testing.T) { testutil.Ok(t, am1.WaitSumMetrics(e2e.Equals(currentValAm1), "alertmanager_alerts_received_total")) }) - t.Run("duplicate am ", func(t *testing.T) { + t.Run("duplicate am", func(t *testing.T) { // am2 is already registered in static addresses. 
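The `writeTargets` calls above drive that FileSD: dropping a Prometheus file-SD target file into the watched directory attaches a querier or an Alertmanager at runtime, and deleting it detaches them again. The helper's body is outside this diff's context lines; the following is only a plausible reconstruction, and the atomic tmp-file-then-rename step is an assumption made to avoid FileSD reading a half-written file:

```go
func writeTargets(t *testing.T, path string, addrs ...string) {
	var targets []model.LabelSet
	for _, addr := range addrs {
		targets = append(targets, model.LabelSet{model.AddressLabel: model.LabelValue(addr)})
	}

	// Marshal as a standard Prometheus file-SD target group.
	b, err := yaml.Marshal([]*targetgroup.Group{{Targets: targets}})
	testutil.Ok(t, err)

	// Write to a temp file and rename, so the watcher never sees a partial file.
	testutil.Ok(t, os.WriteFile(path+".tmp", b, 0660))
	testutil.Ok(t, os.Rename(path+".tmp", path))
}
```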
- writeTargets(t, filepath.Join(s.SharedDir(), amTargetsSubDir, "targets.yaml"), am2.NetworkHTTPEndpoint()) + writeTargets(t, filepath.Join(e.SharedDir(), amTargetsSubDir, "targets.yaml"), am2.InternalEndpoint("http")) testutil.Ok(t, r.WaitSumMetrics(e2e.Equals(1), "thanos_rule_query_apis_dns_provider_results")) testutil.Ok(t, r.WaitSumMetrics(e2e.Equals(1), "thanos_rule_alertmanagers_dns_provider_results")) }) t.Run("rule groups have last evaluation and evaluation duration set", func(t *testing.T) { - rulegroupCorrectData(t, ctx, r.HTTPEndpoint()) + rulegroupCorrectData(t, ctx, r.Endpoint("http")) }) t.Run("signal reload works", func(t *testing.T) { // Add a new rule via sending sighup createRuleFile(t, fmt.Sprintf("%s/newrule.yaml", rulesPath), testAlertRuleAddedLaterSignal) reloadRulesSignal(t, r) - checkReloadSuccessful(t, ctx, r.HTTPEndpoint(), 4) + checkReloadSuccessful(t, ctx, r.Endpoint("http"), 4) }) t.Run("http reload works", func(t *testing.T) { // Add a new rule via /-/reload. createRuleFile(t, fmt.Sprintf("%s/newrule.yaml", rulesPath), testAlertRuleAddedLaterWebHandler) - reloadRulesHTTP(t, ctx, r.HTTPEndpoint()) - checkReloadSuccessful(t, ctx, r.HTTPEndpoint(), 3) + reloadRulesHTTP(t, ctx, r.Endpoint("http")) + checkReloadSuccessful(t, ctx, r.Endpoint("http"), 3) }) - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), "ALERTS", promclient.QueryOptions{ - Deduplicate: false, - }, []model.Metric{ - { - "__name__": "ALERTS", - "severity": "page", - "alertname": "TestAlert_AbortOnPartialResponse", - "alertstate": "firing", - "replica": "1", - }, - { - "__name__": "ALERTS", - "severity": "page", - "alertname": "TestAlert_HasBeenLoadedViaWebHandler", - "alertstate": "firing", - "replica": "1", - }, - { - "__name__": "ALERTS", - "severity": "page", - "alertname": "TestAlert_WarnOnPartialResponse", - "alertstate": "firing", - "replica": "1", - }, - }) + t.Run("query alerts", func(t *testing.T) { + queryAndAssertSeries(t, ctx, q.Endpoint("http"), "ALERTS", promclient.QueryOptions{ + Deduplicate: false, + }, []model.Metric{ + { + "__name__": "ALERTS", + "severity": "page", + "alertname": "TestAlert_AbortOnPartialResponse", + "alertstate": "firing", + "replica": "1", + }, + { + "__name__": "ALERTS", + "severity": "page", + "alertname": "TestAlert_HasBeenLoadedViaWebHandler", + "alertstate": "firing", + "replica": "1", + }, + { + "__name__": "ALERTS", + "severity": "page", + "alertname": "TestAlert_WarnOnPartialResponse", + "alertstate": "firing", + "replica": "1", + }, + }) - expAlertLabels := []model.LabelSet{ - { - "severity": "page", - "alertname": "TestAlert_AbortOnPartialResponse", - "replica": "1", - }, - { - "severity": "page", - "alertname": "TestAlert_HasBeenLoadedViaWebHandler", - "replica": "1", - }, - { - "severity": "page", - "alertname": "TestAlert_WarnOnPartialResponse", - "replica": "1", - }, - } + expAlertLabels := []model.LabelSet{ + { + "severity": "page", + "alertname": "TestAlert_AbortOnPartialResponse", + "replica": "1", + }, + { + "severity": "page", + "alertname": "TestAlert_HasBeenLoadedViaWebHandler", + "replica": "1", + }, + { + "severity": "page", + "alertname": "TestAlert_WarnOnPartialResponse", + "replica": "1", + }, + } - alrts, err := promclient.NewDefaultClient().AlertmanagerAlerts(ctx, mustURLParse(t, "http://"+am2.HTTPEndpoint())) - testutil.Ok(t, err) + alrts, err := promclient.NewDefaultClient().AlertmanagerAlerts(ctx, mustURLParse(t, "http://"+am2.Endpoint("http"))) + testutil.Ok(t, err) - testutil.Equals(t, len(expAlertLabels), len(alrts)) - for i, a 
:= range alrts { - testutil.Assert(t, a.Labels.Equal(expAlertLabels[i]), "unexpected labels %s", a.Labels) - } + testutil.Equals(t, len(expAlertLabels), len(alrts)) + for i, a := range alrts { + testutil.Assert(t, a.Labels.Equal(expAlertLabels[i]), "unexpected labels %s", a.Labels) + } + }) } // Test Ruler behavior on different storepb.PartialResponseStrategy when having partial response from single `failingStoreAPI`. diff --git a/test/e2e/rules_api_test.go b/test/e2e/rules_api_test.go index 86f97cdf181..0d94317c8be 100644 --- a/test/e2e/rules_api_test.go +++ b/test/e2e/rules_api_test.go @@ -13,13 +13,12 @@ import ( "testing" "time" - "github.com/cortexproject/cortex/integration/e2e" + "github.com/efficientgo/e2e" "github.com/go-kit/kit/log" "github.com/pkg/errors" + "github.com/thanos-io/thanos/pkg/httpconfig" - http_util "github.com/thanos-io/thanos/pkg/http" "github.com/thanos-io/thanos/pkg/promclient" - "github.com/thanos-io/thanos/pkg/query" "github.com/thanos-io/thanos/pkg/rules/rulespb" "github.com/thanos-io/thanos/pkg/runutil" "github.com/thanos-io/thanos/pkg/store/labelpb" @@ -30,76 +29,67 @@ import ( func TestRulesAPI_Fanout(t *testing.T) { t.Parallel() - netName := "e2e_test_rules_fanout" - - s, err := e2e.NewScenario(netName) + e, err := e2e.NewDockerEnvironment("e2e_test_rules_fanout") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) promRulesSubDir := filepath.Join("rules") - testutil.Ok(t, os.MkdirAll(filepath.Join(s.SharedDir(), promRulesSubDir), os.ModePerm)) + testutil.Ok(t, os.MkdirAll(filepath.Join(e.SharedDir(), promRulesSubDir), os.ModePerm)) // Create the abort_on_partial_response alert for Prometheus. // We don't create the warn_on_partial_response alert as Prometheus has strict yaml unmarshalling. - createRuleFile(t, filepath.Join(s.SharedDir(), promRulesSubDir, "rules.yaml"), testAlertRuleAbortOnPartialResponse) + createRuleFile(t, filepath.Join(e.SharedDir(), promRulesSubDir, "rules.yaml"), testAlertRuleAbortOnPartialResponse) thanosRulesSubDir := filepath.Join("thanos-rules") - testutil.Ok(t, os.MkdirAll(filepath.Join(s.SharedDir(), thanosRulesSubDir), os.ModePerm)) - createRuleFiles(t, filepath.Join(s.SharedDir(), thanosRulesSubDir)) + testutil.Ok(t, os.MkdirAll(filepath.Join(e.SharedDir(), thanosRulesSubDir), os.ModePerm)) + createRuleFiles(t, filepath.Join(e.SharedDir(), thanosRulesSubDir)) // 2x Prometheus. prom1, sidecar1, err := e2ethanos.NewPrometheusWithSidecar( - s.SharedDir(), - netName, + e, "prom1", - defaultPromConfig("ha", 0, "", filepath.Join(e2e.ContainerSharedDir, promRulesSubDir, "*.yaml")), + defaultPromConfig("ha", 0, "", filepath.Join(e2ethanos.ContainerSharedDir, promRulesSubDir, "*.yaml")), e2ethanos.DefaultPrometheusImage(), ) testutil.Ok(t, err) prom2, sidecar2, err := e2ethanos.NewPrometheusWithSidecar( - s.SharedDir(), - netName, + e, "prom2", - defaultPromConfig("ha", 1, "", filepath.Join(e2e.ContainerSharedDir, promRulesSubDir, "*.yaml")), + defaultPromConfig("ha", 1, "", filepath.Join(e2ethanos.ContainerSharedDir, promRulesSubDir, "*.yaml")), e2ethanos.DefaultPrometheusImage(), ) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2)) - // 2x Rulers. 
- r1, err := e2ethanos.NewRuler(s.SharedDir(), "rule1", thanosRulesSubDir, nil, nil) - testutil.Ok(t, err) - r2, err := e2ethanos.NewRuler(s.SharedDir(), "rule2", thanosRulesSubDir, nil, nil) - testutil.Ok(t, err) + qBuilder := e2ethanos.NewQuerierBuilder(e, "query") + qUninit := qBuilder.BuildUninitiated() - stores := []string{sidecar1.GRPCNetworkEndpoint(), sidecar2.GRPCNetworkEndpoint(), r1.NetworkEndpointFor(s.NetworkName(), 9091), r2.NetworkEndpointFor(s.NetworkName(), 9091)} - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "query", stores...). - WithRuleAddresses(stores...). - Build() - testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) - - queryCfg := []query.Config{ + queryCfg := []httpconfig.Config{ { - EndpointsConfig: http_util.EndpointsConfig{ - StaticAddresses: []string{q.NetworkHTTPEndpoint()}, + EndpointsConfig: httpconfig.EndpointsConfig{ + StaticAddresses: []string{qUninit.InternalEndpoint("http")}, Scheme: "http", }, }, } // Recreate rulers with the corresponding query config. - r1, err = e2ethanos.NewRuler(s.SharedDir(), "rule1", thanosRulesSubDir, nil, queryCfg) + r1, err := e2ethanos.NewRuler(e, "rule1", thanosRulesSubDir, nil, queryCfg) + testutil.Ok(t, err) + r2, err := e2ethanos.NewRuler(e, "rule2", thanosRulesSubDir, nil, queryCfg) testutil.Ok(t, err) - r2, err = e2ethanos.NewRuler(s.SharedDir(), "rule2", thanosRulesSubDir, nil, queryCfg) + testutil.Ok(t, e2e.StartAndWaitReady(r1, r2)) + + stores := []string{sidecar1.InternalEndpoint("grpc"), sidecar2.InternalEndpoint("grpc"), r1.InternalEndpoint("grpc"), r2.InternalEndpoint("grpc")} + q, err := qBuilder.WithRuleAddresses(stores...).Initiate(qUninit, stores...) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(r1, r2)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 1*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(4), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(4), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) - ruleAndAssert(t, ctx, q.HTTPEndpoint(), "", []*rulespb.RuleGroup{ + ruleAndAssert(t, ctx, q.Endpoint("http"), "", []*rulespb.RuleGroup{ { Name: "example_abort", File: "/shared/rules/rules.yaml", diff --git a/test/e2e/store_gateway_test.go b/test/e2e/store_gateway_test.go index 92f51ac93cc..e7a2079058f 100644 --- a/test/e2e/store_gateway_test.go +++ b/test/e2e/store_gateway_test.go @@ -5,6 +5,7 @@ package e2e_test import ( "context" + "fmt" "net/http" "os" "path" @@ -12,8 +13,9 @@ import ( "testing" "time" - "github.com/cortexproject/cortex/integration/e2e" - e2edb "github.com/cortexproject/cortex/integration/e2e/db" + "github.com/efficientgo/e2e" + e2edb "github.com/efficientgo/e2e/db" + "github.com/efficientgo/e2e/matchers" "github.com/go-kit/kit/log" "github.com/prometheus/common/model" "github.com/prometheus/prometheus/pkg/labels" @@ -31,43 +33,61 @@ import ( "github.com/thanos-io/thanos/test/e2e/e2ethanos" ) -// TODO(bwplotka): Extend this test to have multiple stores and memcached. +// TODO(bwplotka): Extend this test to have multiple stores. // TODO(bwplotka): Extend this test for downsampling. 
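The rules_api fanout rewrite above resolves a genuine cycle: rulers need the querier's address in their query config, while the querier needs the rulers' gRPC addresses for fanout. The old code built the querier first and then recreated both rulers; the new builder splits reservation from startup instead. A condensed sketch of the flow, with the API names exactly as used above:

```go
qBuilder := e2ethanos.NewQuerierBuilder(e, "query")
qUninit := qBuilder.BuildUninitiated() // Reserves a stable in-network endpoint only.

queryCfg := []httpconfig.Config{
	{
		EndpointsConfig: httpconfig.EndpointsConfig{
			// Rulers can point at the querier before it exists.
			StaticAddresses: []string{qUninit.InternalEndpoint("http")},
			Scheme:          "http",
		},
	},
}

r1, err := e2ethanos.NewRuler(e, "rule1", thanosRulesSubDir, nil, queryCfg)
testutil.Ok(t, err)
testutil.Ok(t, e2e.StartAndWaitReady(r1))

// Only now is the querier materialized, pointing back at the started ruler.
stores := []string{r1.InternalEndpoint("grpc")}
q, err := qBuilder.WithRuleAddresses(stores...).Initiate(qUninit, stores...)
testutil.Ok(t, err)
testutil.Ok(t, e2e.StartAndWaitReady(q))
```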
func TestStoreGateway(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_store_gateway") + e, err := e2e.NewDockerEnvironment("e2e_test_store_gateway") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) - - m := e2edb.NewMinio(8080, "thanos") - testutil.Ok(t, s.StartAndWaitReady(m)) - - s1, err := e2ethanos.NewStoreGW(s.SharedDir(), "1", client.BucketConfig{ - Type: client.S3, - Config: s3.Config{ - Bucket: "thanos", - AccessKey: e2edb.MinioAccessKey, - SecretKey: e2edb.MinioSecretKey, - Endpoint: m.NetworkHTTPEndpoint(), - Insecure: true, + t.Cleanup(e2ethanos.CleanScenario(t, e)) + + const bucket = "store_gateway_test" + m := e2ethanos.NewMinio(e, "thanos-minio", bucket) + testutil.Ok(t, e2e.StartAndWaitReady(m)) + + memcached := e2ethanos.NewMemcached(e, "1") + testutil.Ok(t, e2e.StartAndWaitReady(memcached)) + + memcachedConfig := fmt.Sprintf(`type: MEMCACHED +config: + addresses: [%s] +blocks_iter_ttl: 0s +metafile_exists_ttl: 0s +metafile_doesnt_exist_ttl: 0s +metafile_content_ttl: 0s`, memcached.InternalEndpoint("memcached")) + + s1, err := e2ethanos.NewStoreGW( + e, + "1", + client.BucketConfig{ + Type: client.S3, + Config: s3.Config{ + Bucket: bucket, + AccessKey: e2edb.MinioAccessKey, + SecretKey: e2edb.MinioSecretKey, + Endpoint: m.InternalEndpoint("http"), + Insecure: true, + }, }, - }, relabel.Config{ - Action: relabel.Drop, - Regex: relabel.MustNewRegexp("value2"), - SourceLabels: model.LabelNames{"ext1"}, - }) + memcachedConfig, + relabel.Config{ + Action: relabel.Drop, + Regex: relabel.MustNewRegexp("value2"), + SourceLabels: model.LabelNames{"ext1"}, + }, + ) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(s1)) + testutil.Ok(t, e2e.StartAndWaitReady(s1)) // Ensure bucket UI. - ensureGETStatusCode(t, http.StatusOK, "http://"+path.Join(s1.HTTPEndpoint(), "loaded")) + ensureGETStatusCode(t, http.StatusOK, "http://"+path.Join(s1.Endpoint("http"), "loaded")) - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "1", s1.GRPCNetworkEndpoint()).Build() + q, err := e2ethanos.NewQuerierBuilder(e, "1", s1.InternalEndpoint("grpc")).Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) - dir := filepath.Join(s.SharedDir(), "tmp") - testutil.Ok(t, os.MkdirAll(filepath.Join(s.SharedDir(), dir), os.ModePerm)) + dir := filepath.Join(e.SharedDir(), "tmp") + testutil.Ok(t, os.MkdirAll(filepath.Join(e.SharedDir(), dir), os.ModePerm)) series := []labels.Labels{labels.FromStrings("a", "1", "b", "2")} extLset := labels.FromStrings("ext1", "value1", "replica", "1") @@ -75,7 +95,7 @@ func TestStoreGateway(t *testing.T) { extLset3 := labels.FromStrings("ext1", "value2", "replica", "3") extLset4 := labels.FromStrings("ext1", "value1", "replica", "3") - ctx, cancel := context.WithTimeout(context.Background(), 1*time.Minute) + ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute) t.Cleanup(cancel) now := time.Now() @@ -89,10 +109,10 @@ func TestStoreGateway(t *testing.T) { testutil.Ok(t, err) l := log.NewLogfmtLogger(os.Stdout) bkt, err := s3.NewBucketWithConfig(l, s3.Config{ - Bucket: "thanos", + Bucket: bucket, AccessKey: e2edb.MinioAccessKey, SecretKey: e2edb.MinioSecretKey, - Endpoint: m.HTTPEndpoint(), // We need separate client config, when connecting to minio from outside. + Endpoint: m.Endpoint("http"), // We need separate client config, when connecting to minio from outside. 
Insecure: true, }, "test-feed") testutil.Ok(t, err) @@ -112,7 +132,7 @@ func TestStoreGateway(t *testing.T) { testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(0), "thanos_bucket_store_block_load_failures_total")) t.Run("query works", func(t *testing.T) { - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), "{a=\"1\"}", + queryAndAssertSeries(t, ctx, q.Endpoint("http"), "{a=\"1\"}", promclient.QueryOptions{ Deduplicate: false, }, @@ -137,7 +157,7 @@ func TestStoreGateway(t *testing.T) { testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(6), "thanos_bucket_store_series_data_fetched")) testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(2), "thanos_bucket_store_series_blocks_queried")) - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), "{a=\"1\"}", + queryAndAssertSeries(t, ctx, q.Endpoint("http"), "{a=\"1\"}", promclient.QueryOptions{ Deduplicate: true, }, @@ -167,7 +187,7 @@ func TestStoreGateway(t *testing.T) { testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(0), "thanos_bucket_store_block_load_failures_total")) // TODO(bwplotka): Entries are still in LRU cache. - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), "{a=\"1\"}", + queryAndAssertSeries(t, ctx, q.Endpoint("http"), "{a=\"1\"}", promclient.QueryOptions{ Deduplicate: false, }, @@ -196,7 +216,7 @@ func TestStoreGateway(t *testing.T) { testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(1), "thanos_bucket_store_block_drops_total")) testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(0), "thanos_bucket_store_block_load_failures_total")) - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), "{a=\"1\"}", + queryAndAssertSeries(t, ctx, q.Endpoint("http"), "{a=\"1\"}", promclient.QueryOptions{ Deduplicate: false, }, @@ -229,7 +249,7 @@ func TestStoreGateway(t *testing.T) { testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(1+1), "thanos_bucket_store_block_drops_total")) testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(0), "thanos_bucket_store_block_load_failures_total")) - queryAndAssertSeries(t, ctx, q.HTTPEndpoint(), "{a=\"1\"}", + queryAndAssertSeries(t, ctx, q.Endpoint("http"), "{a=\"1\"}", promclient.QueryOptions{ Deduplicate: false, }, @@ -247,3 +267,117 @@ func TestStoreGateway(t *testing.T) { // TODO(khyati) Let's add some case for compaction-meta.json once the PR will be merged: https://github.com/thanos-io/thanos/pull/2136. 
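Both store-gateway tests now exercise the memcached-backed caching bucket; the new `TestStoreGatewayMemcachedCache` below isolates it. Two pieces are worth reading together: the YAML config handed to `NewStoreGW` (the `*_ttl: 0s` fields presumably disable those metadata caches so the assertions focus on chunk caching), and the label-matcher-scoped metric checks that tell a cold cache from a warm one. Condensed from the test below:

```go
memcached := e2ethanos.NewMemcached(e, "1")
testutil.Ok(t, e2e.StartAndWaitReady(memcached))

memcachedConfig := fmt.Sprintf(`type: MEMCACHED
config:
  addresses: [%s]
blocks_iter_ttl: 0s`, memcached.InternalEndpoint("memcached"))

// ... start store gateway s1 with memcachedConfig and run the same query twice.

// First query: everything is a miss for the chunks cache.
testutil.Ok(t, s1.WaitSumMetricsWithOptions(e2e.Equals(0),
	[]string{"thanos_store_bucket_cache_operation_hits_total"},
	e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "config", "chunks"))))

// Second query: the chunks fetched before are served from memcached.
testutil.Ok(t, s1.WaitSumMetricsWithOptions(e2e.Greater(0),
	[]string{"thanos_store_bucket_cache_operation_hits_total"},
	e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "config", "chunks"))))
```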
} + +func TestStoreGatewayMemcachedCache(t *testing.T) { + t.Parallel() + + e, err := e2e.NewDockerEnvironment("e2e_test_store_gateway_memcached_cache") + testutil.Ok(t, err) + t.Cleanup(e2ethanos.CleanScenario(t, e)) + + const bucket = "store_gateway_memcached_cache_test" + m := e2ethanos.NewMinio(e, "thanos-minio", bucket) + testutil.Ok(t, e2e.StartAndWaitReady(m)) + + memcached := e2ethanos.NewMemcached(e, "1") + testutil.Ok(t, e2e.StartAndWaitReady(memcached)) + + memcachedConfig := fmt.Sprintf(`type: MEMCACHED +config: + addresses: [%s] +blocks_iter_ttl: 0s`, memcached.InternalEndpoint("memcached")) + + s1, err := e2ethanos.NewStoreGW( + e, + "1", + client.BucketConfig{ + Type: client.S3, + Config: s3.Config{ + Bucket: bucket, + AccessKey: e2edb.MinioAccessKey, + SecretKey: e2edb.MinioSecretKey, + Endpoint: m.InternalEndpoint("http"), + Insecure: true, + }, + }, + memcachedConfig, + ) + testutil.Ok(t, err) + testutil.Ok(t, e2e.StartAndWaitReady(s1)) + + q, err := e2ethanos.NewQuerierBuilder(e, "1", s1.InternalEndpoint("grpc")).Build() + testutil.Ok(t, err) + testutil.Ok(t, e2e.StartAndWaitReady(q)) + + dir := filepath.Join(e.SharedDir(), "tmp") + testutil.Ok(t, os.MkdirAll(dir, os.ModePerm)) + + series := []labels.Labels{labels.FromStrings("a", "1", "b", "2")} + extLset := labels.FromStrings("ext1", "value1", "replica", "1") + + ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute) + t.Cleanup(cancel) + + now := time.Now() + id, err := e2eutil.CreateBlockWithBlockDelay(ctx, dir, series, 10, timestamp.FromTime(now), timestamp.FromTime(now.Add(2*time.Hour)), 30*time.Minute, extLset, 0, metadata.NoneFunc) + testutil.Ok(t, err) + + l := log.NewLogfmtLogger(os.Stdout) + bkt, err := s3.NewBucketWithConfig(l, s3.Config{ + Bucket: bucket, + AccessKey: e2edb.MinioAccessKey, + SecretKey: e2edb.MinioSecretKey, + Endpoint: m.Endpoint("http"), // We need separate client config, when connecting to minio from outside. + Insecure: true, + }, "test-feed") + testutil.Ok(t, err) + + testutil.Ok(t, objstore.UploadDir(ctx, l, bkt, path.Join(dir, id.String()), id.String())) + + // Wait for store to sync blocks. + // thanos_blocks_meta_synced: 1x loadedMeta 0x labelExcludedMeta 0x TooFreshMeta. 
+ testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(1), "thanos_blocks_meta_synced")) + testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(0), "thanos_blocks_meta_sync_failures_total")) + + testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(1), "thanos_bucket_store_blocks_loaded")) + testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(0), "thanos_bucket_store_block_drops_total")) + testutil.Ok(t, s1.WaitSumMetrics(e2e.Equals(0), "thanos_bucket_store_block_load_failures_total")) + + t.Run("query with cache miss", func(t *testing.T) { + queryAndAssertSeries(t, ctx, q.Endpoint("http"), "{a=\"1\"}", + promclient.QueryOptions{ + Deduplicate: false, + }, + []model.Metric{ + { + "a": "1", + "b": "2", + "ext1": "value1", + "replica": "1", + }, + }, + ) + + testutil.Ok(t, s1.WaitSumMetricsWithOptions(e2e.Equals(0), []string{`thanos_store_bucket_cache_operation_hits_total`}, e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "config", "chunks")))) + }) + + t.Run("query with cache hit", func(t *testing.T) { + queryAndAssertSeries(t, ctx, q.Endpoint("http"), "{a=\"1\"}", + promclient.QueryOptions{ + Deduplicate: false, + }, + []model.Metric{ + { + "a": "1", + "b": "2", + "ext1": "value1", + "replica": "1", + }, + }, + ) + + testutil.Ok(t, s1.WaitSumMetricsWithOptions(e2e.Greater(0), []string{`thanos_store_bucket_cache_operation_hits_total`}, e2e.WithLabelMatchers(matchers.MustNewMatcher(matchers.MatchEqual, "config", "chunks")))) + testutil.Ok(t, s1.WaitSumMetrics(e2e.Greater(0), "thanos_cache_memcached_hits_total")) + }) + +} diff --git a/test/e2e/targets_api_test.go b/test/e2e/targets_api_test.go index 7b8bb33fb1f..a3b2d4a6156 100644 --- a/test/e2e/targets_api_test.go +++ b/test/e2e/targets_api_test.go @@ -12,7 +12,7 @@ import ( "testing" "time" - "github.com/cortexproject/cortex/integration/e2e" + "github.com/efficientgo/e2e" "github.com/go-kit/kit/log" "github.com/pkg/errors" @@ -29,44 +29,40 @@ func TestTargetsAPI_Fanout(t *testing.T) { t.Parallel() - netName := "e2e_test_targets_fanout" - - s, err := e2e.NewScenario(netName) + e, err := e2e.NewDockerEnvironment("e2e_test_targets_fanout") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) // 2x Prometheus. prom1, sidecar1, err := e2ethanos.NewPrometheusWithSidecar( - s.SharedDir(), - netName, + e, "prom1", defaultPromConfig("ha", 0, "", "", "localhost:9090", "localhost:80"), e2ethanos.DefaultPrometheusImage(), ) testutil.Ok(t, err) prom2, sidecar2, err := e2ethanos.NewPrometheusWithSidecar( - s.SharedDir(), - netName, + e, "prom2", defaultPromConfig("ha", 1, "", "", "localhost:9090", "localhost:80"), e2ethanos.DefaultPrometheusImage(), ) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2)) + testutil.Ok(t, e2e.StartAndWaitReady(prom1, sidecar1, prom2, sidecar2)) - stores := []string{sidecar1.GRPCNetworkEndpoint(), sidecar2.GRPCNetworkEndpoint()} - q, err := e2ethanos.NewQuerierBuilder(s.SharedDir(), "query", stores...). + stores := []string{sidecar1.InternalEndpoint("grpc"), sidecar2.InternalEndpoint("grpc")} + q, err := e2ethanos.NewQuerierBuilder(e, "query", stores...). WithTargetAddresses(stores...). 
Build() testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(q)) + testutil.Ok(t, e2e.StartAndWaitReady(q)) ctx, cancel := context.WithTimeout(context.Background(), 1*time.Minute) t.Cleanup(cancel) - testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(2), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics)) + testutil.Ok(t, q.WaitSumMetricsWithOptions(e2e.Equals(2), []string{"thanos_store_nodes_grpc_connections"}, e2e.WaitMissingMetrics())) - targetAndAssert(t, ctx, q.HTTPEndpoint(), "", &targetspb.TargetDiscovery{ + targetAndAssert(t, ctx, q.Endpoint("http"), "", &targetspb.TargetDiscovery{ ActiveTargets: []*targetspb.ActiveTarget{ { DiscoveredLabels: labelpb.ZLabelSet{Labels: []labelpb.ZLabel{ diff --git a/test/e2e/tools_bucket_web_test.go b/test/e2e/tools_bucket_web_test.go index 97d6c0db009..0c9f03bac60 100644 --- a/test/e2e/tools_bucket_web_test.go +++ b/test/e2e/tools_bucket_web_test.go @@ -15,8 +15,8 @@ import ( "testing" "time" - "github.com/cortexproject/cortex/integration/e2e" - e2edb "github.com/cortexproject/cortex/integration/e2e/db" + "github.com/efficientgo/e2e" + e2edb "github.com/efficientgo/e2e/db" "github.com/go-kit/kit/log" "github.com/prometheus/prometheus/pkg/labels" "github.com/prometheus/prometheus/pkg/timestamp" @@ -25,6 +25,7 @@ import ( "github.com/thanos-io/thanos/pkg/objstore" "github.com/thanos-io/thanos/pkg/objstore/client" "github.com/thanos-io/thanos/pkg/objstore/s3" + "github.com/thanos-io/thanos/pkg/runutil" "github.com/thanos-io/thanos/pkg/testutil" "github.com/thanos-io/thanos/test/e2e/e2ethanos" ) @@ -32,26 +33,29 @@ import ( func TestToolsBucketWebExternalPrefixWithoutReverseProxy(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_tools_bucket_web_route_prefix") + e, err := e2e.NewDockerEnvironment("e2e_test_tools_bucket_web_route_prefix") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) externalPrefix := "testThanos" - m := e2edb.NewMinio(8080, "thanos") - testutil.Ok(t, s.StartAndWaitReady(m)) + + const bucket = "compact_test" + m := e2ethanos.NewMinio(e, "thanos", bucket) + testutil.Ok(t, e2e.StartAndWaitReady(m)) svcConfig := client.BucketConfig{ Type: client.S3, Config: s3.Config{ - Bucket: "thanos", + Bucket: bucket, AccessKey: e2edb.MinioAccessKey, SecretKey: e2edb.MinioSecretKey, - Endpoint: m.NetworkHTTPEndpoint(), + Endpoint: m.Endpoint("http"), Insecure: true, }, } b, err := e2ethanos.NewToolsBucketWeb( + e, "1", svcConfig, "", @@ -61,22 +65,22 @@ func TestToolsBucketWebExternalPrefixWithoutReverseProxy(t *testing.T) { "", ) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(b)) + testutil.Ok(t, e2e.StartAndWaitReady(b)) - checkNetworkRequests(t, "http://"+b.HTTPEndpoint()+"/"+externalPrefix+"/blocks") + checkNetworkRequests(t, "http://"+b.Endpoint("http")+"/"+externalPrefix+"/blocks") } func TestToolsBucketWebExternalPrefix(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_tools_bucket_web_external_prefix") + e, err := e2e.NewDockerEnvironment("e2e_test_tools_bucket_web_external_prefix") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) externalPrefix := "testThanos" const bucket = "toolsBucketWeb_test" - m := e2edb.NewMinio(8080, bucket) - testutil.Ok(t, s.StartAndWaitReady(m)) + m := e2ethanos.NewMinio(e, "thanos", bucket) + testutil.Ok(t, e2e.StartAndWaitReady(m)) svcConfig := client.BucketConfig{ Type: client.S3, @@ -84,12 +88,13 @@ func 
TestToolsBucketWebExternalPrefix(t *testing.T) { Bucket: bucket, AccessKey: e2edb.MinioAccessKey, SecretKey: e2edb.MinioSecretKey, - Endpoint: m.NetworkHTTPEndpoint(), + Endpoint: m.Endpoint("http"), Insecure: true, }, } b, err := e2ethanos.NewToolsBucketWeb( + e, "1", svcConfig, "", @@ -99,9 +104,9 @@ func TestToolsBucketWebExternalPrefix(t *testing.T) { "", ) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(b)) + testutil.Ok(t, e2e.StartAndWaitReady(b)) - toolsBucketWebURL := mustURLParse(t, "http://"+b.HTTPEndpoint()+"/"+externalPrefix) + toolsBucketWebURL := mustURLParse(t, "http://"+b.Endpoint("http")+"/"+externalPrefix) toolsBucketWebProxy := httptest.NewServer(e2ethanos.NewSingleHostReverseProxy(toolsBucketWebURL, externalPrefix)) t.Cleanup(toolsBucketWebProxy.Close) @@ -112,15 +117,15 @@ func TestToolsBucketWebExternalPrefix(t *testing.T) { func TestToolsBucketWebExternalPrefixAndRoutePrefix(t *testing.T) { t.Parallel() - s, err := e2e.NewScenario("e2e_test_tools_bucket_web_and_route_prefix") + e, err := e2e.NewDockerEnvironment("e2e_test_tools_bucket_web_and_route_prefix") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) externalPrefix := "testThanos" routePrefix := "test" const bucket = "toolsBucketWeb_test" - m := e2edb.NewMinio(8080, bucket) - testutil.Ok(t, s.StartAndWaitReady(m)) + m := e2ethanos.NewMinio(e, "thanos", bucket) + testutil.Ok(t, e2e.StartAndWaitReady(m)) svcConfig := client.BucketConfig{ Type: client.S3, @@ -128,12 +133,13 @@ func TestToolsBucketWebExternalPrefixAndRoutePrefix(t *testing.T) { Bucket: bucket, AccessKey: e2edb.MinioAccessKey, SecretKey: e2edb.MinioSecretKey, - Endpoint: m.NetworkHTTPEndpoint(), + Endpoint: m.Endpoint("http"), Insecure: true, }, } b, err := e2ethanos.NewToolsBucketWeb( + e, "1", svcConfig, routePrefix, @@ -143,9 +149,9 @@ func TestToolsBucketWebExternalPrefixAndRoutePrefix(t *testing.T) { "", ) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(b)) + testutil.Ok(t, e2e.StartAndWaitReady(b)) - toolsBucketWebURL := mustURLParse(t, "http://"+b.HTTPEndpoint()+"/"+routePrefix) + toolsBucketWebURL := mustURLParse(t, "http://"+b.Endpoint("http")+"/"+routePrefix) toolsBucketWebProxy := httptest.NewServer(e2ethanos.NewSingleHostReverseProxy(toolsBucketWebURL, externalPrefix)) t.Cleanup(toolsBucketWebProxy.Close) @@ -156,25 +162,25 @@ func TestToolsBucketWebExternalPrefixAndRoutePrefix(t *testing.T) { func TestToolsBucketWebWithTimeAndRelabelFilter(t *testing.T) { t.Parallel() // Create network. - s, err := e2e.NewScenario("e2e_test_tools_bucket_web_time_and_relabel_filter") + e, err := e2e.NewDockerEnvironment("e2e_test_tools_bucket_web_time_and_relabel_filter") testutil.Ok(t, err) - t.Cleanup(e2ethanos.CleanScenario(t, s)) + t.Cleanup(e2ethanos.CleanScenario(t, e)) // Create Minio. const bucket = "toolsBucketWeb_test" - m := e2edb.NewMinio(8080, bucket) - testutil.Ok(t, s.StartAndWaitReady(m)) + m := e2ethanos.NewMinio(e, "thanos", bucket) + testutil.Ok(t, e2e.StartAndWaitReady(m)) // Create bucket. logger := log.NewLogfmtLogger(os.Stdout) bkt, err := s3.NewBucketWithConfig(logger, s3.Config{ Bucket: bucket, AccessKey: e2edb.MinioAccessKey, SecretKey: e2edb.MinioSecretKey, - Endpoint: m.HTTPEndpoint(), + Endpoint: m.Endpoint("http"), Insecure: true, }, "tools") testutil.Ok(t, err) // Create share dir for upload. - dir := filepath.Join(s.SharedDir(), "tmp") + dir := filepath.Join(e.SharedDir(), "tmp") testutil.Ok(t, os.MkdirAll(dir, os.ModePerm)) // Upload blocks. 
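One behavioral fix hides in the relabel-filter test below: block uploads to MinIO are now wrapped in `runutil.Retry`, bounded by a 10-second context, instead of a single `objstore.UploadDir` call that could flake while MinIO warms up. The helper re-invokes the closure at the given interval until it returns nil or the stop channel closes, returning the last error otherwise (fragment; `logger`, `bkt`, `dir` and `id` as in the test):

```go
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()

// Retried every second; gives up (with the last error) when ctx times out.
testutil.Ok(t, runutil.Retry(time.Second, ctx.Done(), func() error {
	return objstore.UploadDir(ctx, logger, bkt, path.Join(dir, id.String()), id.String())
}))
```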
now, err := time.Parse(time.RFC3339, "2021-07-24T08:00:00Z") @@ -200,9 +206,14 @@ func TestToolsBucketWebWithTimeAndRelabelFilter(t *testing.T) { }, } for _, b := range blocks { - id, err := b.Create(context.Background(), dir, 0, b.hashFunc) + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) + t.Cleanup(cancel) + + id, err := b.Create(ctx, dir, 0, b.hashFunc) testutil.Ok(t, err) - testutil.Ok(t, objstore.UploadDir(context.Background(), logger, bkt, path.Join(dir, id.String()), id.String())) + testutil.Ok(t, runutil.Retry(time.Second, ctx.Done(), func() error { + return objstore.UploadDir(ctx, logger, bkt, path.Join(dir, id.String()), id.String()) + })) } // Start thanos tool bucket web. svcConfig := client.BucketConfig{ @@ -211,11 +222,12 @@ func TestToolsBucketWebWithTimeAndRelabelFilter(t *testing.T) { Bucket: bucket, AccessKey: e2edb.MinioAccessKey, SecretKey: e2edb.MinioSecretKey, - Endpoint: m.NetworkHTTPEndpoint(), + Endpoint: m.InternalEndpoint("http"), Insecure: true, }, } b, err := e2ethanos.NewToolsBucketWeb( + e, "1", svcConfig, "", @@ -228,9 +240,9 @@ func TestToolsBucketWebWithTimeAndRelabelFilter(t *testing.T) { source_labels: ["tenant_id"]`, ) testutil.Ok(t, err) - testutil.Ok(t, s.StartAndWaitReady(b)) + testutil.Ok(t, e2e.StartAndWaitReady(b)) // Request blocks api. - resp, err := http.DefaultClient.Get("http://" + b.HTTPEndpoint() + "/api/v1/blocks") + resp, err := http.DefaultClient.Get("http://" + b.Endpoint("http") + "/api/v1/blocks") testutil.Ok(t, err) testutil.Equals(t, http.StatusOK, resp.StatusCode) defer resp.Body.Close() diff --git a/tutorials/katacoda/thanos/1-globalview/courseBase.sh b/tutorials/katacoda/thanos/1-globalview/courseBase.sh index 7e631bbd93c..3432b3c8ec5 100644 --- a/tutorials/katacoda/thanos/1-globalview/courseBase.sh +++ b/tutorials/katacoda/thanos/1-globalview/courseBase.sh @@ -1,4 +1,4 @@ #!/usr/bin/env bash docker pull quay.io/prometheus/prometheus:v2.16.0 -docker pull quay.io/thanos/thanos:v0.22.0 +docker pull quay.io/thanos/thanos:v0.23.1 diff --git a/tutorials/katacoda/thanos/1-globalview/step2.md b/tutorials/katacoda/thanos/1-globalview/step2.md index d1e97ab2c50..ae346081711 100644 --- a/tutorials/katacoda/thanos/1-globalview/step2.md +++ b/tutorials/katacoda/thanos/1-globalview/step2.md @@ -10,7 +10,7 @@ component and can be invoked in a single command. Let's take a look at all the Thanos commands: ``` -docker run --rm quay.io/thanos/thanos:v0.22.0 --help +docker run --rm quay.io/thanos/thanos:v0.23.1 --help ```{{execute}} You should see multiple commands that solve different purposes.
@@ -53,7 +53,7 @@ docker run -d --net=host --rm \ -v $(pwd)/prometheus0_eu1.yml:/etc/prometheus/prometheus.yml \ --name prometheus-0-sidecar-eu1 \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --http-address 0.0.0.0:19090 \ --grpc-address 0.0.0.0:19190 \ @@ -68,7 +68,7 @@ docker run -d --net=host --rm \ -v $(pwd)/prometheus0_us1.yml:/etc/prometheus/prometheus.yml \ --name prometheus-0-sidecar-us1 \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --http-address 0.0.0.0:19091 \ --grpc-address 0.0.0.0:19191 \ @@ -81,7 +81,7 @@ docker run -d --net=host --rm \ -v $(pwd)/prometheus1_us1.yml:/etc/prometheus/prometheus.yml \ --name prometheus-1-sidecar-us1 \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --http-address 0.0.0.0:19092 \ --grpc-address 0.0.0.0:19192 \ diff --git a/tutorials/katacoda/thanos/1-globalview/step3.md b/tutorials/katacoda/thanos/1-globalview/step3.md index 905a87b8bfc..76f404e90a7 100644 --- a/tutorials/katacoda/thanos/1-globalview/step3.md +++ b/tutorials/katacoda/thanos/1-globalview/step3.md @@ -28,7 +28,7 @@ Click below snippet to start the Querier. ``` docker run -d --net=host --rm \ --name querier \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ query \ --http-address 0.0.0.0:29090 \ --query.replica-label replica \ diff --git a/tutorials/katacoda/thanos/2-lts/courseBase.sh b/tutorials/katacoda/thanos/2-lts/courseBase.sh index 286de850263..af98ac3d530 100644 --- a/tutorials/katacoda/thanos/2-lts/courseBase.sh +++ b/tutorials/katacoda/thanos/2-lts/courseBase.sh @@ -2,7 +2,7 @@ docker pull minio/minio:RELEASE.2019-01-31T00-31-19Z docker pull quay.io/prometheus/prometheus:v2.20.0 -docker pull quay.io/thanos/thanos:v0.22.0 +docker pull quay.io/thanos/thanos:v0.23.1 docker pull quay.io/thanos/thanosbench:v0.2.0-rc.1 mkdir /root/editor diff --git a/tutorials/katacoda/thanos/2-lts/step1.md b/tutorials/katacoda/thanos/2-lts/step1.md index cdb069b6327..5cca5de9c05 100644 --- a/tutorials/katacoda/thanos/2-lts/step1.md +++ b/tutorials/katacoda/thanos/2-lts/step1.md @@ -117,7 +117,7 @@ Similar to previous course, let's setup global view querying with sidecar: docker run -d --net=host --rm \ --name prometheus-0-eu1-sidecar \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --http-address 0.0.0.0:19090 \ --grpc-address 0.0.0.0:19190 \ @@ -130,7 +130,7 @@ so we will make sure we point the Querier to the gRPC endpoints of the sidecar: ``` docker run -d --net=host --rm \ --name querier \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ query \ --http-address 0.0.0.0:9091 \ --query.replica-label replica \ diff --git a/tutorials/katacoda/thanos/2-lts/step2.md b/tutorials/katacoda/thanos/2-lts/step2.md index 8a7436dc8de..717a3ea9d80 100644 --- a/tutorials/katacoda/thanos/2-lts/step2.md +++ b/tutorials/katacoda/thanos/2-lts/step2.md @@ -79,7 +79,7 @@ docker run -d --net=host --rm \ -v /root/prom-eu1:/prometheus \ --name prometheus-0-eu1-sidecar \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --tsdb.path /prometheus \ --objstore.config-file /etc/thanos/minio-bucket.yaml \ diff --git a/tutorials/katacoda/thanos/2-lts/step3.md b/tutorials/katacoda/thanos/2-lts/step3.md index b922333b5db..43e7a145afa 100644 --- a/tutorials/katacoda/thanos/2-lts/step3.md +++ b/tutorials/katacoda/thanos/2-lts/step3.md @@ -6,7 +6,7 @@ In this step, we will learn about Thanos 
Store Gateway and how to deploy it. Let's take a look at all the Thanos commands: -```docker run --rm quay.io/thanos/thanos:v0.22.0 --help```{{execute}} +```docker run --rm quay.io/thanos/thanos:v0.23.1 --help```{{execute}} You should see multiple commands that solve different purposes, including block-storage-based long-term storage for Prometheus. @@ -32,7 +32,7 @@ You can read more about [Store](https://thanos.io/tip/components/store.md/) here docker run -d --net=host --rm \ -v /root/editor/bucket_storage.yaml:/etc/thanos/minio-bucket.yaml \ --name store-gateway \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ store \ --objstore.config-file /etc/thanos/minio-bucket.yaml \ --http-address 0.0.0.0:19091 \ @@ -49,7 +49,7 @@ Currently querier does not know about store yet. Let's change it by adding Store docker stop querier && \ docker run -d --net=host --rm \ --name querier \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ query \ --http-address 0.0.0.0:9091 \ --query.replica-label replica \ diff --git a/tutorials/katacoda/thanos/2-lts/step4.md b/tutorials/katacoda/thanos/2-lts/step4.md index 1a16c561cf3..c703f87e920 100644 --- a/tutorials/katacoda/thanos/2-lts/step4.md +++ b/tutorials/katacoda/thanos/2-lts/step4.md @@ -25,7 +25,7 @@ Click below snippet to start the Compactor. docker run -d --net=host --rm \ -v /root/editor/bucket_storage.yaml:/etc/thanos/minio-bucket.yaml \ --name thanos-compact \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ compact \ --wait --wait-interval 30s \ --consistency-delay 0s \ diff --git a/tutorials/katacoda/thanos/6-query-caching/courseBase.sh b/tutorials/katacoda/thanos/6-query-caching/courseBase.sh index ceb21419e0f..b4c2e508371 100644 --- a/tutorials/katacoda/thanos/6-query-caching/courseBase.sh +++ b/tutorials/katacoda/thanos/6-query-caching/courseBase.sh @@ -1,5 +1,5 @@ #!/usr/bin/env bash docker pull quay.io/prometheus/prometheus:v2.22.2 -docker pull quay.io/thanos/thanos:v0.22.0 +docker pull quay.io/thanos/thanos:v0.23.1 docker pull yannrobert/docker-nginx diff --git a/tutorials/katacoda/thanos/6-query-caching/step1.md b/tutorials/katacoda/thanos/6-query-caching/step1.md index b2188e969bf..610ced6b79e 100644 --- a/tutorials/katacoda/thanos/6-query-caching/step1.md +++ b/tutorials/katacoda/thanos/6-query-caching/step1.md @@ -103,7 +103,7 @@ docker run -d --net=host --rm \ -v $(pwd)/prometheus"${i}".yml:/etc/prometheus/prometheus.yml \ --name prometheus-sidecar"${i}" \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --http-address=0.0.0.0:1909"${i}" \ --grpc-address=0.0.0.0:1919"${i}" \ @@ -129,7 +129,7 @@ And now, let's deploy Thanos Querier to have a global overview on our services.
``` docker run -d --net=host --rm \ --name querier \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ query \ --http-address 0.0.0.0:10912 \ --grpc-address 0.0.0.0:10901 \ diff --git a/tutorials/katacoda/thanos/6-query-caching/step2.md b/tutorials/katacoda/thanos/6-query-caching/step2.md index 8fce765ccb9..5832b08abdd 100644 --- a/tutorials/katacoda/thanos/6-query-caching/step2.md +++ b/tutorials/katacoda/thanos/6-query-caching/step2.md @@ -62,7 +62,7 @@ And deploy Query Frontend: docker run -d --net=host --rm \ -v $(pwd)/frontend.yml:/etc/thanos/frontend.yml \ --name query-frontend \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ query-frontend \ --http-address 0.0.0.0:20902 \ --query-frontend.compress-responses \ diff --git a/tutorials/katacoda/thanos/7-multi-tenancy/courseBase.sh b/tutorials/katacoda/thanos/7-multi-tenancy/courseBase.sh index 951657ec0d0..91ecf9559c3 100644 --- a/tutorials/katacoda/thanos/7-multi-tenancy/courseBase.sh +++ b/tutorials/katacoda/thanos/7-multi-tenancy/courseBase.sh @@ -1,7 +1,7 @@ #!/usr/bin/env bash docker pull quay.io/prometheus/prometheus:v2.20.0 -docker pull quay.io/thanos/thanos:v0.22.0 +docker pull quay.io/thanos/thanos:v0.23.1 docker pull quay.io/thanos/prom-label-proxy:v0.3.0-rc.0-ext1 docker pull caddy:2.2.1 diff --git a/tutorials/katacoda/thanos/7-multi-tenancy/step1.md b/tutorials/katacoda/thanos/7-multi-tenancy/step1.md index 51b3b7b39a0..1574c99af79 100644 --- a/tutorials/katacoda/thanos/7-multi-tenancy/step1.md +++ b/tutorials/katacoda/thanos/7-multi-tenancy/step1.md @@ -88,7 +88,7 @@ docker run -d --net=host --rm \ -v $(pwd)/editor/prometheus0_fruit.yml:/etc/prometheus/prometheus.yml \ --name prometheus-0-sidecar-fruit \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --http-address 0.0.0.0:19090 \ --grpc-address 0.0.0.0:19190 \ @@ -120,7 +120,7 @@ docker run -d --net=host --rm \ -v $(pwd)/editor/prometheus0_veggie.yml:/etc/prometheus/prometheus.yml \ --name prometheus-0-sidecar-veggie \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --http-address 0.0.0.0:19091 \ --grpc-address 0.0.0.0:19191 \ @@ -152,7 +152,7 @@ docker run -d --net=host --rm \ -v $(pwd)/editor/prometheus1_veggie.yml:/etc/prometheus/prometheus.yml \ --name prometheus-01-sidecar-veggie \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --http-address 0.0.0.0:19092 \ --grpc-address 0.0.0.0:19192 \ @@ -170,7 +170,7 @@ Fruit: ``` docker run -d --net=host --rm \ --name querier-fruit \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ query \ --http-address 0.0.0.0:29091 \ --grpc-address 0.0.0.0:29191 \ @@ -183,7 +183,7 @@ Veggie: ``` docker run -d --net=host --rm \ --name querier-veggie \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ query \ --http-address 0.0.0.0:29092 \ --grpc-address 0.0.0.0:29192 \ diff --git a/tutorials/katacoda/thanos/7-multi-tenancy/step2.md b/tutorials/katacoda/thanos/7-multi-tenancy/step2.md index 88182026b45..6380a0df706 100644 --- a/tutorials/katacoda/thanos/7-multi-tenancy/step2.md +++ b/tutorials/katacoda/thanos/7-multi-tenancy/step2.md @@ -11,7 +11,7 @@ docker stop querier-fruit && docker stop querier-veggie ``` docker run -d --net=host --rm \ --name querier-multi \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ query \ --http-address 0.0.0.0:29090 \ --grpc-address 0.0.0.0:29190 \ diff --git 
a/tutorials/katacoda/thanos/x-playground/courseBase.sh b/tutorials/katacoda/thanos/x-playground/courseBase.sh index a6b9cad2a9e..7c15d1711ae 100644 --- a/tutorials/katacoda/thanos/x-playground/courseBase.sh +++ b/tutorials/katacoda/thanos/x-playground/courseBase.sh @@ -1,7 +1,7 @@ #!/usr/bin/env bash docker pull quay.io/prometheus/prometheus:v2.20.0 -docker pull quay.io/thanos/thanos:v0.22.0 +docker pull quay.io/thanos/thanos:v0.23.1 docker pull quay.io/thanos/thanosbench:v0.2.0-rc.1 docker pull minio/minio:RELEASE.2019-01-31T00-31-19Z diff --git a/tutorials/katacoda/thanos/x-playground/step1.md b/tutorials/katacoda/thanos/x-playground/step1.md index 6b0a54e2e35..8336605df80 100644 --- a/tutorials/katacoda/thanos/x-playground/step1.md +++ b/tutorials/katacoda/thanos/x-playground/step1.md @@ -169,7 +169,7 @@ docker run -d --net=host --rm \ ### Step1: Sidecar ``` -docker run -it --rm quay.io/thanos/thanos:v0.22.0 --help +docker run -it --rm quay.io/thanos/thanos:v0.23.1 --help ```{{execute}} @@ -180,7 +180,7 @@ docker run -d --net=host --rm \ -v ${CURR_DIR}/prom-eu1-replica0-config.yaml:/etc/prometheus/prometheus.yml \ --name prom-eu1-0-sidecar \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --http-address 0.0.0.0:19091 \ --grpc-address 0.0.0.0:19191 \ @@ -195,7 +195,7 @@ docker run -d --net=host --rm \ -v ${CURR_DIR}/prom-eu1-replica1-config.yaml:/etc/prometheus/prometheus.yml \ --name prom-eu1-1-sidecar \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --http-address 0.0.0.0:19092 \ --grpc-address 0.0.0.0:19192 \ @@ -210,7 +210,7 @@ docker run -d --net=host --rm \ -v ${CURR_DIR}/prom-us1-replica0-config.yaml:/etc/prometheus/prometheus.yml \ --name prom-us1-0-sidecar \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --http-address 0.0.0.0:19093 \ --grpc-address 0.0.0.0:19193 \ @@ -223,7 +223,7 @@ docker run -d --net=host --rm \ ``` docker run -d --net=host --rm \ --name querier \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ query \ --http-address 0.0.0.0:9090 \ --grpc-address 0.0.0.0:19190 \ diff --git a/tutorials/katacoda/thanos/x-playground/step2.md b/tutorials/katacoda/thanos/x-playground/step2.md index bfbbeb91f64..7bf658c9709 100644 --- a/tutorials/katacoda/thanos/x-playground/step2.md +++ b/tutorials/katacoda/thanos/x-playground/step2.md @@ -65,7 +65,7 @@ docker run -d --net=host --rm \ -v ${CURR_DIR}/prom-eu1-replica0:/prometheus \ --name prom-eu1-0-sidecar \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --tsdb.path /prometheus \ --objstore.config-file /etc/thanos/minio-bucket.yaml \ @@ -85,7 +85,7 @@ docker run -d --net=host --rm \ -v ${CURR_DIR}/prom-eu1-replica1:/prometheus \ --name prom-eu1-1-sidecar \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --tsdb.path /prometheus \ --objstore.config-file /etc/thanos/minio-bucket.yaml \ @@ -105,7 +105,7 @@ docker run -d --net=host --rm \ -v ${CURR_DIR}/prom-us1-replica0:/prometheus \ --name prom-us1-0-sidecar \ -u root \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ sidecar \ --tsdb.path /prometheus \ --objstore.config-file /etc/thanos/minio-bucket.yaml \ @@ -130,7 +130,7 @@ Let's run Store Gateway server: docker run -d --net=host --rm \ -v ${CURR_DIR}/minio-bucket.yaml:/etc/thanos/minio-bucket.yaml \ --name store-gateway \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ 
store \ --objstore.config-file /etc/thanos/minio-bucket.yaml \ --http-address 0.0.0.0:19094 \ @@ -143,7 +143,7 @@ docker run -d --net=host --rm \ docker stop querier && \ docker run -d --net=host --rm \ --name querier \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ query \ --http-address 0.0.0.0:9090 \ --grpc-address 0.0.0.0:19190 \ @@ -162,7 +162,7 @@ Visit https://[[HOST_SUBDOMAIN]]-9090-[[KATACODA_HOST]].environments.katacoda.co docker run -d --net=host --rm \ -v ${CURR_DIR}/minio-bucket.yaml:/etc/thanos/minio-bucket.yaml \ --name compactor \ - quay.io/thanos/thanos:v0.22.0 \ + quay.io/thanos/thanos:v0.23.1 \ compact \ --wait --wait-interval 30s \ --consistency-delay 0s \ diff --git a/website/data/adopters.yml b/website/data/adopters.yml index b5e41f6d09b..86762b08431 100644 --- a/website/data/adopters.yml +++ b/website/data/adopters.yml @@ -155,4 +155,10 @@ adopters: logo: pagbank.png - name: Itaú Unibanco url: https://www.itau.com.br/ - logo: itau-unibanco.png \ No newline at end of file + logo: itau-unibanco.png +- name: LabyrinthLabs + url: https://lablabs.io + logo: lablabs.png +- name: Darwinbox Digital Solutions + url: https://darwinbox.com + logo: darwinbox.png \ No newline at end of file diff --git a/website/static/logos/darwinbox.png b/website/static/logos/darwinbox.png new file mode 100644 index 00000000000..02be75ac9e5 Binary files /dev/null and b/website/static/logos/darwinbox.png differ diff --git a/website/static/logos/lablabs.png b/website/static/logos/lablabs.png new file mode 100644 index 00000000000..946c0ef790b Binary files /dev/null and b/website/static/logos/lablabs.png differ