Skip to content

Commit

Permalink
Merge remote-tracking branch 'origin/main' into aws-sdk-auth
Browse files Browse the repository at this point in the history
  • Loading branch information
sylr committed Oct 22, 2021
2 parents 552c7e4 + 1804950 commit d8c781e
Show file tree
Hide file tree
Showing 102 changed files with 2,789 additions and 1,508 deletions.
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ kube/.minikube

# Ignore e2e working dirs.
data/
test/e2e/e2e_integration_test*
test/e2e/e2e_*

# Ignore promu artifacts.
/.build
Expand Down
53 changes: 33 additions & 20 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,44 +10,57 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re

## Unreleased

### Added

- [#4667](https://github.com/thanos-io/thanos/pull/4667) Add a pure aws-sdk auth for s3 storage.
- [#4680](https://github.com/thanos-io/thanos/pull/4680) Query: add `exemplar.partial-response` flag to control partial response.
- [#4679](https://github.com/thanos-io/thanos/pull/4679) Added `enable-feature` flag to enable negative offsets and @ modifier, similar to Prometheus.
- [#4696](https://github.com/thanos-io/thanos/pull/4696) Query: add cache name to tracing spans.
- [#4736](https://github.com/thanos-io/thanos/pull/4736) S3: Add capability to use custom AWS STS Endpoint.
- [#4764](https://github.com/thanos-io/thanos/pull/4764) Compactor: add `block-viewer.global.sync-block-timeout` flag to set the timeout of synchronization block metas.

### Fixed

- [#4663](https://github.com/thanos-io/thanos/pull/4663) Fetcher: Fix discovered data races
- [#4508](https://github.com/thanos-io/thanos/pull/4508) Adjust and rename `ThanosSidecarUnhealthy` to `ThanosSidecarNoConnectionToStartedPrometheus`; Remove `ThanosSidecarPrometheusDown` alert; Remove unused `thanos_sidecar_last_heartbeat_success_time_seconds` metrics.
- [#4663](https://github.com/thanos-io/thanos/pull/4663) Fetcher: Fix discovered data races.
- [#4754](https://github.com/thanos-io/thanos/pull/4754) Query: Fix possible panic on stores endpoint.
- [#4753](https://github.com/thanos-io/thanos/pull/4753) Store: validate block sync concurrency parameter
- [#4792](https://github.com/thanos-io/thanos/pull/4792) Store: Fix data race in BucketedBytes pool.

### Added
## [v0.23.1](https://github.com/thanos-io/thanos/tree/release-0.23) - 2021.10.1

- [#4680](https://github.com/thanos-io/thanos/pull/4680) Query: add `exemplar.partial-response` flag to control partial response.
- [#4714](https://github.com/thanos-io/thanos/pull/4714) EndpointSet: Do not use unimplemented yet new InfoAPI to obtain metadata (avoids unnecessary HTTP roundtrip, instrumentation/alerts spam and logs).

## v0.23.0 - In Progress
## [v0.23.0](https://github.com/thanos-io/thanos/tree/release-0.23) - 2021.09.23

### Added

- [#4453](https://github.com/thanos-io/thanos/pull/4453) Tools: Add flag `--selector.relabel-config-file` / `--selector.relabel-config` / `--max-time` / `--min-time` to filter served blocks.
- [#4482](https://github.com/thanos-io/thanos/pull/4482) COS: Add http_config for cos object store client.
- [#4487](https://github.com/thanos-io/thanos/pull/4487) Query: Add memcached auto discovery support.
- [#4444](https://github.com/thanos-io/thanos/pull/4444) UI: Add search block UI.
- [#4509](https://github.com/thanos-io/thanos/pull/4509) Logging: Adds duration_ms in int64 to the logs.
- [#4462](https://github.com/thanos-io/thanos/pull/4462) UI: Add find overlap block UI.
- [#4469](https://github.com/thanos-io/thanos/pull/4469) Compact: Add flag `compact.skip-block-with-out-of-order-chunks` to skip blocks with out-of-order chunks during compaction instead of halting
- [#4506](https://github.com/thanos-io/thanos/pull/4506) `Baidu BOS` object storage, see [documents](docs/storage.md#baidu-bos) for further information.
- [#4552](https://github.com/thanos-io/thanos/pull/4552) Compact: Adds `thanos_compact_downsample_duration_seconds` histogram.
- [#4594](https://github.com/thanos-io/thanos/pull/4594) reloader: Expose metrics in config reloader to give info on the last operation.
- [#4623](https://github.com/thanos-io/thanos/pull/4623) query-frontend: made HTTP downstream tripper (client) configurable via parameters `--query-range.downstream-tripper-config` and `--query-range.downstream-tripper-config-file`. If your downstream URL is localhost or 127.0.0.1 then it is strongly recommended to bump `max_idle_conns_per_host` to at least 100 so that `query-frontend` could properly use HTTP keep-alive connections and thus reduce the latency of `query-frontend` by about 20%.
- [#4636](https://github.com/thanos-io/thanos/pull/4636) Azure: Support authentication using user-assigned managed identity
- [#4453](https://github.com/thanos-io/thanos/pull/4453) Tools `thanos bucket web`: Add flag `--selector.relabel-config-file` / `--selector.relabel-config` / `--max-time` / `--min-time` to filter served blocks.
- [#4482](https://github.com/thanos-io/thanos/pull/4482) Store: Add `http_config` option for COS object store client.
- [#4487](https://github.com/thanos-io/thanos/pull/4487) Query/Store: Add memcached auto discovery support for all caching clients.
- [#4444](https://github.com/thanos-io/thanos/pull/4444) UI: Add search to the Block UI.
- [#4509](https://github.com/thanos-io/thanos/pull/4509) Logging: Add `duration_ms` in int64 to the logs for easier log filtering.
- [#4462](https://github.com/thanos-io/thanos/pull/4462) UI: Highlighting blocks overlap in the Block UI.
- [#4469](https://github.com/thanos-io/thanos/pull/4469) Compact: Add flag `compact.skip-block-with-out-of-order-chunks` to skip blocks with out-of-order chunks during compaction instead of halting.
- [#4506](https://github.com/thanos-io/thanos/pull/4506) Store: Add `Baidu BOS` object storage, see [documents](docs/storage.md#baidu-bos) for further information.
- [#4552](https://github.com/thanos-io/thanos/pull/4552) Compact: Add `thanos_compact_downsample_duration_seconds` histogram metric.
- [#4594](https://github.com/thanos-io/thanos/pull/4594) Reloader: Expose metrics in config reloader to give info on the last operation.
- [#4619](https://github.com/thanos-io/thanos/pull/4619) Tracing: Added consistent tags to Series call from Querier about number important series statistics: `processed.series`, `processed.samples`, `processed.samples` and `processed.bytes`. This will give admin idea of how much data each component processes per query.
- [#4623](https://github.com/thanos-io/thanos/pull/4623) Query-frontend: Make HTTP downstream tripper (client) configurable via parameters `--query-range.downstream-tripper-config` and `--query-range.downstream-tripper-config-file`. If your downstream URL is localhost or 127.0.0.1 then it is strongly recommended to bump `max_idle_conns_per_host` to at least 100 so that `query-frontend` could properly use HTTP keep-alive connections and thus reduce the latency of `query-frontend` by about 20%.

### Fixed

- [#4468](https://github.com/thanos-io/thanos/pull/4468) Rule: Fix temporary rule filename composition issue.
- [#4476](https://github.com/thanos-io/thanos/pull/4476) UI: fix incorrect html escape sequence used for '>' symbol.
- [#4532](https://github.com/thanos-io/thanos/pull/4532) Mixin: Fixed "all jobs" selector in thanos mixin dashboards.
- [#4607](https://github.com/thanos-io/thanos/pull/4607) Azure: Fix Azure MSI Rate Limit
- [#4476](https://github.com/thanos-io/thanos/pull/4476) UI: Fix incorrect html escape sequence used for '>' symbol.
- [#4532](https://github.com/thanos-io/thanos/pull/4532) Mixin: Fix "all jobs" selector in thanos mixin dashboards.
- [#4607](https://github.com/thanos-io/thanos/pull/4607) Azure: Fix Azure MSI Rate Limit.

### Changed

- [#4519](https://github.com/thanos-io/thanos/pull/4519) Query: switch to miekgdns DNS resolver as the default one.
- [#4519](https://github.com/thanos-io/thanos/pull/4519) Query: Switch to miekgdns DNS resolver as the default one.
- [#4586](https://github.com/thanos-io/thanos/pull/4586) Update Prometheus/Cortex dependencies and implement LabelNames() pushdown as a result; provides massive speed-up for the labels API in Thanos Query.
- [#4421](https://github.com/thanos-io/thanos/pull/4421) *breaking :warning:*: `--store` (in the future, to be renamed to `--endpoints`) now supports passing any APIs from Thanos gRPC APIs: StoreAPI, MetadataAPI, RulesAPI, TargetsAPI and ExemplarsAPI (in oppose in the past you have to put it in hidden `--targets`, `--rules` etc flags). `--store` will now automatically detect what APIs server exposes.
- [#4669](https://github.com/thanos-io/thanos/pull/4669) Moved Prometheus dependency to v2.30.

## [v0.22.0](https://github.com/thanos-io/thanos/tree/release-0.22) - 2021.07.22

Expand Down
9 changes: 6 additions & 3 deletions cmd/thanos/compact.go
Original file line number Diff line number Diff line change
Expand Up @@ -448,7 +448,7 @@ func runCompact(

// TODO(bwplotka): Find a way to avoid syncing if no op was done.
if err := sy.SyncMetas(ctx); err != nil {
return errors.Wrap(err, "sync before first pass of downsampling")
return errors.Wrap(err, "sync before retention")
}

if err := compact.ApplyRetentionPolicyByResolution(ctx, logger, bkt, sy.Metas(), retentionByResolution, compactMetrics.blocksMarked.WithLabelValues(metadata.DeletionMarkFilename, "")); err != nil {
Expand Down Expand Up @@ -537,14 +537,14 @@ func runCompact(
}

g.Add(func() error {
iterCtx, iterCancel := context.WithTimeout(ctx, conf.waitInterval)
iterCtx, iterCancel := context.WithTimeout(ctx, conf.blockViewerSyncBlockTimeout)
_, _, _ = f.Fetch(iterCtx)
iterCancel()

// For /global state make sure to fetch periodically.
return runutil.Repeat(conf.blockViewerSyncBlockInterval, ctx.Done(), func() error {
return runutil.RetryWithLog(logger, time.Minute, ctx.Done(), func() error {
iterCtx, iterCancel := context.WithTimeout(ctx, conf.waitInterval)
iterCtx, iterCancel := context.WithTimeout(ctx, conf.blockViewerSyncBlockTimeout)
defer iterCancel()

_, _, err := f.Fetch(iterCtx)
Expand Down Expand Up @@ -576,6 +576,7 @@ type compactConfig struct {
blockSyncConcurrency int
blockMetaFetchConcurrency int
blockViewerSyncBlockInterval time.Duration
blockViewerSyncBlockTimeout time.Duration
cleanupBlocksInterval time.Duration
compactionConcurrency int
downsampleConcurrency int
Expand Down Expand Up @@ -634,6 +635,8 @@ func (cc *compactConfig) registerFlag(cmd extkingpin.FlagClause) {
Default("32").IntVar(&cc.blockMetaFetchConcurrency)
cmd.Flag("block-viewer.global.sync-block-interval", "Repeat interval for syncing the blocks between local and remote view for /global Block Viewer UI.").
Default("1m").DurationVar(&cc.blockViewerSyncBlockInterval)
cmd.Flag("block-viewer.global.sync-block-timeout", "Maximum time for syncing the blocks between local and remote view for /global Block Viewer UI.").
Default("5m").DurationVar(&cc.blockViewerSyncBlockTimeout)
cmd.Flag("compact.cleanup-interval", "How often we should clean up partially uploaded blocks and blocks with deletion mark in the background when --wait has been enabled. Setting it to \"0s\" disables it - the cleaning will only happen at the end of an iteration.").
Default("5m").DurationVar(&cc.cleanupBlocksInterval)

Expand Down
27 changes: 25 additions & 2 deletions cmd/thanos/query.go
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,11 @@ import (
"github.com/thanos-io/thanos/pkg/ui"
)

const (
promqlNegativeOffset = "promql-negative-offset"
promqlAtModifier = "promql-at-modifier"
)

// registerQuery registers a query command.
func registerQuery(app *extkingpin.App) {
comp := component.Query
Expand Down Expand Up @@ -146,6 +151,8 @@ func registerQuery(app *extkingpin.App) {
enableMetricMetadataPartialResponse := cmd.Flag("metric-metadata.partial-response", "Enable partial response for metric metadata endpoint. --no-metric-metadata.partial-response for disabling.").
Hidden().Default("true").Bool()

featureList := cmd.Flag("enable-feature", "Comma separated experimental feature names to enable.The current list of features is "+promqlNegativeOffset+" and "+promqlAtModifier+".").Default("").Strings()

enableExemplarPartialResponse := cmd.Flag("exemplar.partial-response", "Enable partial response for exemplar endpoint. --no-exemplar.partial-response for disabling.").
Hidden().Default("true").Bool()

Expand All @@ -163,6 +170,16 @@ func registerQuery(app *extkingpin.App) {
return errors.Wrap(err, "parse federation labels")
}

var enableNegativeOffset, enableAtModifier bool
for _, feature := range *featureList {
if feature == promqlNegativeOffset {
enableNegativeOffset = true
}
if feature == promqlAtModifier {
enableAtModifier = true
}
}

if dup := firstDuplicate(*stores); dup != "" {
return errors.Errorf("Address %s is duplicated for --store flag.", dup)
}
Expand Down Expand Up @@ -266,6 +283,8 @@ func registerQuery(app *extkingpin.App) {
*defaultMetadataTimeRange,
*strictStores,
*webDisableCORS,
enableAtModifier,
enableNegativeOffset,
component.Query,
)
})
Expand Down Expand Up @@ -329,6 +348,8 @@ func runQuery(
defaultMetadataTimeRange time.Duration,
strictStores []string,
disableCORS bool,
enableAtModifier bool,
enableNegativeOffset bool,
comp component.Component,
) error {
// TODO(bplotka in PR #513 review): Move arguments into struct.
Expand Down Expand Up @@ -456,6 +477,9 @@ func runQuery(
cancelRun()
})

engineOpts.EnableAtModifier = enableAtModifier
engineOpts.EnableNegativeOffset = enableNegativeOffset

ctxUpdate, cancelUpdate := context.WithCancel(context.Background())
g.Add(func() error {
for {
Expand All @@ -479,7 +503,6 @@ func runQuery(
}
}, func(error) {
cancelUpdate()
close(fileSDUpdates)
})
}
// Periodically update the addresses from static flags and file SD by resolving them using DNS SD if necessary.
Expand Down Expand Up @@ -546,7 +569,7 @@ func runQuery(

api := v1.NewQueryAPI(
logger,
endpoints,
endpoints.GetEndpointStatus,
engineFactory(promql.NewEngine, engineOpts, dynamicLookbackDelta),
queryableCreator,
// NOTE: Will share the same replica label as the query for now.
Expand Down
33 changes: 16 additions & 17 deletions cmd/thanos/rule.go
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ import (
"github.com/prometheus/prometheus/util/strutil"
"github.com/thanos-io/thanos/pkg/errutil"
"github.com/thanos-io/thanos/pkg/extkingpin"
"github.com/thanos-io/thanos/pkg/httpconfig"

extflag "github.com/efficientgo/tools/extkingpin"
"github.com/thanos-io/thanos/pkg/alert"
Expand All @@ -43,12 +44,10 @@ import (
"github.com/thanos-io/thanos/pkg/discovery/dns"
"github.com/thanos-io/thanos/pkg/extprom"
extpromhttp "github.com/thanos-io/thanos/pkg/extprom/http"
http_util "github.com/thanos-io/thanos/pkg/http"
"github.com/thanos-io/thanos/pkg/logging"
"github.com/thanos-io/thanos/pkg/objstore/client"
"github.com/thanos-io/thanos/pkg/prober"
"github.com/thanos-io/thanos/pkg/promclient"
"github.com/thanos-io/thanos/pkg/query"
thanosrules "github.com/thanos-io/thanos/pkg/rules"
"github.com/thanos-io/thanos/pkg/runutil"
grpcserver "github.com/thanos-io/thanos/pkg/server/grpc"
Expand Down Expand Up @@ -266,29 +265,29 @@ func runRule(
) error {
metrics := newRuleMetrics(reg)

var queryCfg []query.Config
var queryCfg []httpconfig.Config
var err error
if len(conf.queryConfigYAML) > 0 {
queryCfg, err = query.LoadConfigs(conf.queryConfigYAML)
queryCfg, err = httpconfig.LoadConfigs(conf.queryConfigYAML)
if err != nil {
return err
}
} else {
queryCfg, err = query.BuildQueryConfig(conf.query.addrs)
queryCfg, err = httpconfig.BuildConfig(conf.query.addrs)
if err != nil {
return err
return errors.Wrap(err, "query configuration")
}

// Build the query configuration from the legacy query flags.
var fileSDConfigs []http_util.FileSDConfig
var fileSDConfigs []httpconfig.FileSDConfig
if len(conf.query.sdFiles) > 0 {
fileSDConfigs = append(fileSDConfigs, http_util.FileSDConfig{
fileSDConfigs = append(fileSDConfigs, httpconfig.FileSDConfig{
Files: conf.query.sdFiles,
RefreshInterval: model.Duration(conf.query.sdInterval),
})
queryCfg = append(queryCfg,
query.Config{
EndpointsConfig: http_util.EndpointsConfig{
httpconfig.Config{
EndpointsConfig: httpconfig.EndpointsConfig{
Scheme: "http",
FileSDConfigs: fileSDConfigs,
},
Expand All @@ -302,16 +301,16 @@ func runRule(
extprom.WrapRegistererWithPrefix("thanos_rule_query_apis_", reg),
dns.ResolverType(conf.query.dnsSDResolver),
)
var queryClients []*http_util.Client
var queryClients []*httpconfig.Client
queryClientMetrics := extpromhttp.NewClientMetrics(extprom.WrapRegistererWith(prometheus.Labels{"client": "query"}, reg))
for _, cfg := range queryCfg {
cfg.HTTPClientConfig.ClientMetrics = queryClientMetrics
c, err := http_util.NewHTTPClient(cfg.HTTPClientConfig, "query")
c, err := httpconfig.NewHTTPClient(cfg.HTTPClientConfig, "query")
if err != nil {
return err
}
c.Transport = tracing.HTTPTripperware(logger, c.Transport)
queryClient, err := http_util.NewClient(logger, cfg.EndpointsConfig, c, queryProvider.Clone())
queryClient, err := httpconfig.NewClient(logger, cfg.EndpointsConfig, c, queryProvider.Clone())
if err != nil {
return err
}
Expand Down Expand Up @@ -381,13 +380,13 @@ func runRule(
)
for _, cfg := range alertingCfg.Alertmanagers {
cfg.HTTPClientConfig.ClientMetrics = amClientMetrics
c, err := http_util.NewHTTPClient(cfg.HTTPClientConfig, "alertmanager")
c, err := httpconfig.NewHTTPClient(cfg.HTTPClientConfig, "alertmanager")
if err != nil {
return err
}
c.Transport = tracing.HTTPTripperware(logger, c.Transport)
// Each Alertmanager client has a different list of targets thus each needs its own DNS provider.
amClient, err := http_util.NewClient(logger, cfg.EndpointsConfig, c, amProvider.Clone())
amClient, err := httpconfig.NewClient(logger, cfg.EndpointsConfig, c, amProvider.Clone())
if err != nil {
return err
}
Expand Down Expand Up @@ -706,7 +705,7 @@ func removeDuplicateQueryEndpoints(logger log.Logger, duplicatedQueriers prometh

func queryFuncCreator(
logger log.Logger,
queriers []*http_util.Client,
queriers []*httpconfig.Client,
duplicatedQuery prometheus.Counter,
ruleEvalWarnings *prometheus.CounterVec,
httpMethod string,
Expand Down Expand Up @@ -762,7 +761,7 @@ func queryFuncCreator(
}
}

func addDiscoveryGroups(g *run.Group, c *http_util.Client, interval time.Duration) {
func addDiscoveryGroups(g *run.Group, c *httpconfig.Client, interval time.Duration) {
ctx, cancel := context.WithCancel(context.Background())
g.Add(func() error {
c.Discover(ctx)
Expand Down
10 changes: 2 additions & 8 deletions cmd/thanos/sidecar.go
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ import (
"github.com/thanos-io/thanos/pkg/exthttp"
"github.com/thanos-io/thanos/pkg/extkingpin"
"github.com/thanos-io/thanos/pkg/extprom"
thanoshttp "github.com/thanos-io/thanos/pkg/http"
"github.com/thanos-io/thanos/pkg/httpconfig"
"github.com/thanos-io/thanos/pkg/logging"
meta "github.com/thanos-io/thanos/pkg/metadata"
thanosmodel "github.com/thanos-io/thanos/pkg/model"
Expand Down Expand Up @@ -138,10 +138,6 @@ func runSidecar(
Name: "thanos_sidecar_prometheus_up",
Help: "Boolean indicator whether the sidecar can reach its Prometheus peer.",
})
lastHeartbeat := promauto.With(reg).NewGauge(prometheus.GaugeOpts{
Name: "thanos_sidecar_last_heartbeat_success_time_seconds",
Help: "Timestamp of the last successful heartbeat in seconds.",
})

ctx, cancel := context.WithCancel(context.Background())
g.Add(func() error {
Expand Down Expand Up @@ -191,7 +187,6 @@ func runSidecar(
)
promUp.Set(1)
statusProber.Ready()
lastHeartbeat.SetToCurrentTime()
return nil
})
if err != nil {
Expand All @@ -213,7 +208,6 @@ func runSidecar(
promUp.Set(0)
} else {
promUp.Set(1)
lastHeartbeat.SetToCurrentTime()
}

return nil
Expand All @@ -234,7 +228,7 @@ func runSidecar(
t := exthttp.NewTransport()
t.MaxIdleConnsPerHost = conf.connection.maxIdleConnsPerHost
t.MaxIdleConns = conf.connection.maxIdleConns
c := promclient.NewClient(&http.Client{Transport: tracing.HTTPTripperware(logger, t)}, logger, thanoshttp.ThanosUserAgent)
c := promclient.NewClient(&http.Client{Transport: tracing.HTTPTripperware(logger, t)}, logger, httpconfig.ThanosUserAgent)

promStore, err := store.NewPrometheusStore(logger, reg, c, conf.prometheus.url, component.Sidecar, m.Labels, m.Timestamps, m.Version)
if err != nil {
Expand Down
Loading

0 comments on commit d8c781e

Please sign in to comment.