Merge branch 'main' into native-histograms
rabenhorst authored Jan 24, 2023
2 parents 4a5d074 + b58aeda commit 204b20d
Showing 6 changed files with 74 additions and 1 deletion.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -23,6 +23,8 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re

- [#5995](https://github.com/thanos-io/thanos/pull/5995) Sidecar: Loads the TLS certificate during startup.
- [#6044](https://github.com/thanos-io/thanos/pull/6044) Receive: mark out-of-window errors as conflict, if out-of-window samples ingestion is activated
- [#6066](https://github.com/thanos-io/thanos/pull/6066) Tracing: fixed panic because of nil sampler
- [#6067](https://github.com/thanos-io/thanos/pull/6067) Receive: fixed panic when querying uninitialized TSDBs.

### Changed

Binary file added docs/img/distributed-execution-proposal-6.png
30 changes: 30 additions & 0 deletions docs/proposals-accepted/202301-distributed-query-execution.md
@@ -143,6 +143,36 @@ Store groups can be created by either partitioning TSDBs by time (time-based par

<img src="../img/distributed-execution-proposal-5.png" alt="Distributed query execution" width="400"/>

### Distributed execution against Receive components

There is currently no mechanism to configure a Querier against a subset of TSDBs, unless that Querier is attached exclusively to the Stores that own those TSDBs. In the case of Receivers, TSDBs are created and pruned dynamically, which makes it hard to apply the distributed query model to this component.

To resolve this issue, this proposal suggests adding a `selector.relabel-config` command-line flag to the Query component that will work the same way as the Store Gateway selector. For each query, the Querier will apply the given relabel config against each Store's external label set and decide whether to keep or drop that TSDB from the query. After the relabeling is applied, the query will be rewritten to target only the TSDBs that match the selector.

An example config that only targets TSDBs with external labels `tenant=a` would be:

```yaml
- source_labels: [tenant]
  action: keep
  regex: a
```
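As a rough sketch of what such a rule does, the following Go snippet (a hypothetical helper, not the actual Thanos implementation) filters external label sets with a fully anchored regex, the way relabel `keep` rules match:

```go
package main

import (
	"fmt"
	"regexp"
)

// keepMatching is a minimal sketch of the proposed selector: a TSDB's
// external label set is kept only if the given label's value matches the
// regex. Relabel regexes are fully anchored, hence the ^(?:...)$ wrapper.
func keepMatching(labelSets []map[string]string, label, pattern string) []map[string]string {
	re := regexp.MustCompile("^(?:" + pattern + ")$")
	var kept []map[string]string
	for _, ls := range labelSets {
		if re.MatchString(ls[label]) {
			kept = append(kept, ls)
		}
	}
	return kept
}

func main() {
	// Hypothetical external label sets of two tenant TSDBs.
	tsdbs := []map[string]string{
		{"tenant": "a", "replica": "0"},
		{"tenant": "b", "replica": "0"},
	}
	for _, ls := range keepMatching(tsdbs, "tenant", "a") {
		fmt.Println(ls) // prints only the tenant=a label set
	}
}
```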

With this mechanism, a user can run a pool of Queriers with a selector config as follows:

```yaml
- source_labels: [ext_label_a, ext_label_b]
  action: hashmod
  target_label: query_shard
  modulus: ${query_shard_replicas}
- action: keep
  source_labels: [query_shard]
  regex: ${query_shard_instance}
```
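To illustrate how such a `hashmod` rule spreads TSDBs across Querier shards, here is a self-contained Go sketch of the MD5-based scheme Prometheus relabeling uses (simplified: the source label values are joined with the default `;` separator, hashed, and reduced modulo the shard count):

```go
package main

import (
	"crypto/md5"
	"encoding/binary"
	"fmt"
)

// hashmod mimics the relabel "hashmod" action: join the source label
// values, hash them with MD5, and take the lower 8 bytes modulo the
// shard count. Deterministic, so every Querier computes the same shard.
func hashmod(values []string, modulus uint64) uint64 {
	h := md5.New()
	for i, v := range values {
		if i > 0 {
			h.Write([]byte{';'})
		}
		h.Write([]byte(v))
	}
	sum := h.Sum(nil)
	return binary.BigEndian.Uint64(sum[8:]) % modulus
}

func main() {
	// Hypothetical external label values of three tenant TSDBs.
	tsdbs := [][]string{
		{"a1", "b1"},
		{"a2", "b2"},
		{"a3", "b3"},
	}
	const replicas = 2
	for _, ext := range tsdbs {
		fmt.Printf("labels=%v shard=%d\n", ext, hashmod(ext, replicas))
	}
}
```

Each shard's Querier keeps only the TSDBs whose computed `query_shard` equals its own `${query_shard_instance}`.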

<img src="../img/distributed-execution-proposal-6.png" alt="Distributed query execution against Receive" width="400"/>

This approach can also be used to create Querier shards against Store Gateways, or any other pool of Store components.

## 7 Alternatives

A viable alternative to the proposed method is to add support for Query Pushdown in the Thanos Querier. By improving query plan extraction, as described in https://github.com/thanos-io/thanos/issues/5984, we can decide to execute a query in a local Querier, similar to how the sidecar does against Prometheus.
8 changes: 7 additions & 1 deletion pkg/receive/multitsdb.go
@@ -164,6 +164,9 @@ func (t *tenant) client() store.Client {
	defer t.mtx.RUnlock()

	store := t.store()
	if store == nil {
		return nil
	}
	client := storepb.ServerAsClient(store, 0)
	return newLocalClient(client, store.LabelSet, store.TimeRange)
}
@@ -429,7 +432,10 @@ func (t *MultiTSDB) TSDBLocalClients() []store.Client {

	res := make([]store.Client, 0, len(t.tenants))
	for _, tenant := range t.tenants {
		client := tenant.client()
		if client != nil {
			res = append(res, client)
		}
	}

	return res
34 changes: 34 additions & 0 deletions pkg/receive/multitsdb_test.go
@@ -22,6 +22,7 @@ import (
"golang.org/x/sync/errgroup"

"github.com/efficientgo/core/testutil"

"github.com/thanos-io/thanos/pkg/block/metadata"
"github.com/thanos-io/thanos/pkg/exemplars/exemplarspb"
"github.com/thanos-io/thanos/pkg/runutil"
@@ -521,6 +522,39 @@ func TestMultiTSDBStats(t *testing.T) {
}
}

// Regression test for https://github.com/thanos-io/thanos/issues/6047.
func TestMultiTSDBWithNilStore(t *testing.T) {
	dir := t.TempDir()

	m := NewMultiTSDB(dir, log.NewNopLogger(), prometheus.NewRegistry(),
		&tsdb.Options{
			MinBlockDuration:  (2 * time.Hour).Milliseconds(),
			MaxBlockDuration:  (2 * time.Hour).Milliseconds(),
			RetentionDuration: (6 * time.Hour).Milliseconds(),
		},
		labels.FromStrings("replica", "test"),
		"tenant_id",
		nil,
		false,
		metadata.NoneFunc,
	)
	defer func() { testutil.Ok(t, m.Close()) }()

	const tenantID = "test-tenant"
	_, err := m.TenantAppendable(tenantID)
	testutil.Ok(t, err)

	// Get LabelSets of newly created TSDB.
	clients := m.TSDBLocalClients()
	for _, client := range clients {
		testutil.Ok(t, testutil.FaultOrPanicToErr(func() { client.LabelSets() }))
	}

	// Wait for tenant to become ready before terminating the test.
	// This allows the tear down procedure to cleanup properly.
	testutil.Ok(t, appendSample(m, tenantID, time.Now()))
}

func appendSample(m *MultiTSDB, tenant string, timestamp time.Time) error {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
1 change: 1 addition & 0 deletions pkg/tracing/jaeger/config_yaml.go
@@ -141,6 +141,7 @@ func getSampler(config Config) tracesdk.Sampler {
		sampler = jaegerremote.New(config.ServiceName, remoteOptions...)
	// Fallback always to default (rate limiting).
	case SamplerTypeRateLimiting:
		fallthrough
	default:
		// The same config options are applicable to both remote and rate-limiting samplers.
		remoteOptions := getRemoteOptions(config)
Expand Down
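The one-line `fallthrough` fix matters because in Go a matching `case` with an empty body exits the `switch` without running `default`, which left the sampler nil and caused the panic fixed in #6066. A minimal sketch of the behavior (hypothetical names, not the Thanos code):

```go
package main

import "fmt"

// pickSampler mimics the bug: without fallthrough, the matching empty case
// skips the default branch and nothing is assigned. With fallthrough,
// execution continues into default and the sampler is constructed.
func pickSampler(typ string) string {
	var sampler string
	switch typ {
	case "ratelimiting":
		fallthrough // removing this line would leave sampler empty (nil in the real code)
	default:
		sampler = "ratelimiting-sampler"
	}
	return sampler
}

func main() {
	fmt.Println(pickSampler("ratelimiting")) // ratelimiting-sampler
}
```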
