Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query Frontend: forward tenant information downstream #6595

Merged
merged 29 commits into from
Oct 25, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
f4506f5
Make query frontend forward tenant info downstream
douglascamata Aug 3, 2023
4b2ca14
Add TODO for client certificate
douglascamata Aug 3, 2023
aa33d04
Finish qfe tenant forward features
douglascamata Aug 7, 2023
6e8ed12
Test more scenarios
douglascamata Aug 7, 2023
ffcbed6
Goimports files
douglascamata Aug 8, 2023
92e3d71
Merge branch 'main' of github.com:thanos-io/thanos into qfe-forward-t…
douglascamata Aug 8, 2023
e4ea657
Rerun CI
douglascamata Aug 8, 2023
85506ab
Rerun goimports without any line breaks
douglascamata Aug 9, 2023
399a365
Resort imports in groups (std, 3rd-party, local)
douglascamata Aug 9, 2023
6e9ffeb
No need to forward a custom tenant header downstream
douglascamata Aug 9, 2023
aa9f3d3
Make query frontend tenant cert config hidden
douglascamata Aug 16, 2023
f955f71
Error out when org ig and tenant header are provided at the same time
douglascamata Aug 16, 2023
500d29b
Merge branch 'main' of github.com:thanos-io/thanos into qfe-forward-t…
douglascamata Aug 16, 2023
08dd0c4
Fix typo
douglascamata Aug 16, 2023
dd619e3
Update docs
douglascamata Aug 16, 2023
531ded7
Improve comments around forward and org id headers for tenancy
douglascamata Aug 16, 2023
9d62032
Add some test comments
douglascamata Aug 16, 2023
01f8071
Improve logic for detecting tenant and org id headers were provided
douglascamata Aug 16, 2023
60e093e
Fix order of check for org id headers
douglascamata Aug 16, 2023
4e8ef60
Add some TODO comments for the future
douglascamata Aug 16, 2023
8cf2ba0
Rerun CI
douglascamata Aug 17, 2023
f5c5cba
Make default tenant flag in qfe match receive's
douglascamata Sep 13, 2023
82a7e3a
Merge branch 'main' of github.com:thanos-io/thanos into qfe-forward-t…
douglascamata Sep 13, 2023
f1667e2
Merge branch 'main' of github.com:thanos-io/thanos into qfe-forward-t…
douglascamata Sep 27, 2023
1033baf
Remove unnecessary struct fields
douglascamata Sep 27, 2023
41c3f6e
Add instant query to tenant forward tests
douglascamata Sep 27, 2023
fec5df9
Merge branch 'main' of github.com:thanos-io/thanos into qfe-forward-t…
douglascamata Oct 2, 2023
d6ec416
Merge branch 'main' of github.com:thanos-io/thanos into qfe-forward-t…
douglascamata Oct 24, 2023
6daddf7
Rerun build
douglascamata Oct 24, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
33 changes: 30 additions & 3 deletions cmd/thanos/query_frontend.go
Original file line number Diff line number Diff line change
Expand Up @@ -34,14 +34,15 @@ import (
"github.com/thanos-io/thanos/pkg/queryfrontend"
httpserver "github.com/thanos-io/thanos/pkg/server/http"
"github.com/thanos-io/thanos/pkg/server/http/middleware"
"github.com/thanos-io/thanos/pkg/tenancy"
"github.com/thanos-io/thanos/pkg/tracing"
)

type queryFrontendConfig struct {
queryfrontend.Config
http httpConfig
webDisableCORS bool
queryfrontend.Config
orgIdHeaders []string
orgIdHeaders []string
}

func registerQueryFrontend(app *extkingpin.App) {
Expand Down Expand Up @@ -140,13 +141,19 @@ func registerQueryFrontend(app *extkingpin.App) {
cmd.Flag("query-frontend.log-queries-longer-than", "Log queries that are slower than the specified duration. "+
"Set to 0 to disable. Set to < 0 to enable on all queries.").Default("0").DurationVar(&cfg.CortexHandlerConfig.LogQueriesLongerThan)

cmd.Flag("query-frontend.org-id-header", "Request header names used to identify the source of slow queries (repeated flag). "+
cmd.Flag("query-frontend.org-id-header", "Deprecation Warning - This flag will be soon deprecated in favor of query-frontend.tenant-header"+
" and both flags cannot be used at the same time. "+
"Request header names used to identify the source of slow queries (repeated flag). "+
"The values of the header will be added to the org id field in the slow query log. "+
"If multiple headers match the request, the first matching arg specified will take precedence. "+
"If no headers match 'anonymous' will be used.").PlaceHolder("<http-header-name>").StringsVar(&cfg.orgIdHeaders)

cmd.Flag("query-frontend.forward-header", "List of headers forwarded by the query-frontend to downstream queriers, default is empty").PlaceHolder("<http-header-name>").StringsVar(&cfg.ForwardHeaders)

cmd.Flag("query-frontend.tenant-header", "HTTP header to determine tenant").Default(tenancy.DefaultTenantHeader).Hidden().StringVar(&cfg.TenantHeader)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is the right place to discuss this, but the query frontend also has a query-frontend.org-id-header arg. Thanos side, I believe this is only used when logging slow queries, but probably has some more use for Cortex.

Do we want to keep both the org-id-header and the tenant-header long term ? Could perhaps be nice to unify that into one cmd line arg if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long term we should indeed converge on only one to avoid confusion for users and ourselves, and for consistency.

As part of the migration process, I can mark the query-frontend.org-id-header as deprecated for now, mention the new one, and store the configured org-id-header in the tenant header config. This way someone already using this will have a warning about the change and can easily adapt to the new CLI arg, allowing us to completely drop the org ID config in a future breaking change.

Internally and more technically speaking, I am copying the tenant header field's value to the Org ID one because of some middlewares we are importing/reusing from Cortex, but I think over time we can adapt them to avoid this and clean things up.

What do you think, @jacobbaungard?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a good plan to me. What would happen if one define both, can we error out in that case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jacobbaungard we could error out or print out a warning & give priority to the tenant header config, similar to what happens if you configure the header and the cert field in Receive. I personally have no preference. Will update to error out when both are provided. 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think the orgID was introduced only as a method of identifying sources of slow queries and passing that information to Cortex. It's also a repeatable flag, with significance in how headers are ordered, afaiu reading original issue/PR.

Thus I don't think it is mutually exclusive with this tenant header, and both have different semantics as well, as you can have multiple orgID headers, but only a single tenant one. So let's not error or warn the user, and keep it for now.

cc: @yeya24 in case he has some other thoughts on this! 🙂

cmd.Flag("query-frontend.default-tenant-id", "Default tenant ID to use if tenant header is not present").Default(tenancy.DefaultTenant).Hidden().StringVar(&cfg.DefaultTenant)
cmd.Flag("query-frontend.tenant-certificate-field", "Use TLS client's certificate field to determine tenant for requests. Must be one of "+tenancy.CertificateFieldOrganization+", "+tenancy.CertificateFieldOrganizationalUnit+" or "+tenancy.CertificateFieldCommonName+". This setting will cause the query-frontend.tenant-header flag value to be ignored.").Hidden().Default("").EnumVar(&cfg.TenantCertField, "", tenancy.CertificateFieldOrganization, tenancy.CertificateFieldOrganizationalUnit, tenancy.CertificateFieldCommonName)

cmd.Flag("query-frontend.vertical-shards", "Number of shards to use when distributing shardable PromQL queries. For more details, you can refer to the Vertical query sharding proposal: https://thanos.io/tip/proposals-accepted/202205-vertical-query-sharding.md").IntVar(&cfg.NumShards)

reqLogConfig := extkingpin.RegisterRequestLoggingFlags(cmd)
Expand Down Expand Up @@ -224,6 +231,26 @@ func runQueryFrontend(
cfg *queryFrontendConfig,
comp component.Component,
) error {
tenantHeaderProvided := cfg.TenantHeader != "" && cfg.TenantHeader != tenancy.DefaultTenantHeader
// If tenant header is set and different from the default tenant header, add it to the list of org id headers.
// In this case we don't need to add the tenant header to `cfg.ForwardHeaders` because tripperware will modify
// the request, renaming the tenant header to the default tenant header, before the header propagation logic runs.
// TODO: This should be removed once the org id header is fully removed in Thanos.
if tenantHeaderProvided {
// If tenant header is provided together with the org id header, error out.
if len(cfg.orgIdHeaders) != 0 {
return errors.New("query-frontend.org-id-header and query-frontend.tenant-header cannot be used together")
}

cfg.orgIdHeaders = append(cfg.orgIdHeaders, cfg.TenantHeader)
}

// Temporarily manually adding the default tenant header into the list of headers to forward and org id headers.
// This facilitates the transition from org id to tenant id with minimal amount of changes.
cfg.ForwardHeaders = append(cfg.ForwardHeaders, tenancy.DefaultTenantHeader)
// TODO: This should be removed once the org id header is fully removed in Thanos.
cfg.orgIdHeaders = append(cfg.orgIdHeaders, tenancy.DefaultTenantHeader)

queryRangeCacheConfContentYaml, err := cfg.QueryRangeConfig.CachePathOrContent.Content()
if err != nil {
return err
Expand Down
18 changes: 11 additions & 7 deletions docs/components/query-frontend.md
Original file line number Diff line number Diff line change
Expand Up @@ -260,13 +260,17 @@ Flags:
duration. Set to 0 to disable. Set to < 0 to
enable on all queries.
--query-frontend.org-id-header=<http-header-name> ...
Request header names used to identify the
source of slow queries (repeated flag).
The values of the header will be added to
the org id field in the slow query log. If
multiple headers match the request, the first
matching arg specified will take precedence.
If no headers match 'anonymous' will be used.
Deprecation Warning - This flag
will be soon deprecated in favor of
query-frontend.tenant-header and both flags
cannot be used at the same time. Request header
names used to identify the source of slow
queries (repeated flag). The values of the
header will be added to the org id field in
the slow query log. If multiple headers match
the request, the first matching arg specified
will take precedence. If no headers match
'anonymous' will be used.
--query-frontend.vertical-shards=QUERY-FRONTEND.VERTICAL-SHARDS
Number of shards to use when
distributing shardable PromQL queries.
Expand Down
3 changes: 3 additions & 0 deletions pkg/queryfrontend/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -205,6 +205,9 @@ type Config struct {
DownstreamURL string
ForwardHeaders []string
NumShards int
TenantHeader string
DefaultTenant string
TenantCertField string
}

// QueryRangeConfig holds the config for query range tripperware.
Expand Down
11 changes: 10 additions & 1 deletion pkg/queryfrontend/roundtrip.go
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,8 @@ import (
"strings"
"time"

"github.com/thanos-io/thanos/pkg/tenancy"

douglascamata marked this conversation as resolved.
Show resolved Hide resolved
"github.com/thanos-io/thanos/pkg/querysharding"

"github.com/go-kit/log"
Expand Down Expand Up @@ -78,7 +80,14 @@ func NewTripperware(config Config, reg prometheus.Registerer, logger log.Logger)
config.ForwardHeaders,
)
return func(next http.RoundTripper) http.RoundTripper {
return newRoundTripper(next, queryRangeTripperware(next), labelsTripperware(next), queryInstantTripperware(next), reg)
tripper := newRoundTripper(
next,
queryRangeTripperware(next),
labelsTripperware(next),
queryInstantTripperware(next),
reg,
)
return tenancy.InternalTenancyConversionTripper(config.TenantHeader, config.TenantCertField, tripper)
}, nil
}

Expand Down
27 changes: 26 additions & 1 deletion pkg/tenancy/tenancy.go
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,10 @@ func GetTenantFromHTTP(r *http.Request, tenantHeader string, defaultTenantID str
var err error
tenant := r.Header.Get(tenantHeader)
if tenant == "" {
tenant = defaultTenantID
tenant = r.Header.Get(DefaultTenantHeader)
if tenant == "" {
tenant = defaultTenantID
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think its worth logging down below if we end up overwriting based on the cert? or maybe we could just handle this on startup with a log or an error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be too spammy. We had similar logs when parsing the tenant information from the request in the Store GW and they were removed: 510c054

}

if certTenantField != "" {
Expand All @@ -65,6 +68,28 @@ func GetTenantFromHTTP(r *http.Request, tenantHeader string, defaultTenantID str
return tenant, nil
}

type roundTripperFunc func(*http.Request) (*http.Response, error)

func (r roundTripperFunc) RoundTrip(request *http.Request) (*http.Response, error) {
return r(request)
}

// InternalTenancyConversionTripper is a tripperware that rewrites the configurable tenancy header in the request into
// the hardcoded tenancy header that is used for internal communication in Thanos components. If any custom tenant
// header is configured and present in the request, it will be stripped out.
func InternalTenancyConversionTripper(customTenantHeader, certTenantField string, next http.RoundTripper) http.RoundTripper {
return roundTripperFunc(func(r *http.Request) (*http.Response, error) {
tenant, _ := GetTenantFromHTTP(r, customTenantHeader, DefaultTenant, certTenantField)
r.Header.Set(DefaultTenantHeader, tenant)
// If the custom tenant header is not the same as the default internal header, we want to exclude the custom
// one from the request to keep things simple.
if customTenantHeader != DefaultTenantHeader {
r.Header.Del(customTenantHeader)
}
return next.RoundTrip(r)
})
}

// getTenantFromCertificate extracts the tenant value from a client's presented certificate. The x509 field to use as
// value can be configured with Options.TenantField. An error is returned when the extraction has not succeeded.
func getTenantFromCertificate(r *http.Request, certTenantField string) (string, error) {
Expand Down
7 changes: 7 additions & 0 deletions test/e2e/e2ethanos/services.go
Original file line number Diff line number Diff line change
Expand Up @@ -978,6 +978,13 @@ func NewQueryFrontend(e e2e.Environment, name, downstreamURL string, config quer
flags["--query-range.split-interval"] = "0"
}

if config.TenantHeader != "" {
flags["--query-frontend.tenant-header"] = config.TenantHeader
}
if config.DefaultTenant != "" {
flags["--query-frontend.default-tenant"] = config.DefaultTenant
}

return e2eobs.AsObservable(e.Runnable(fmt.Sprintf("query-frontend-%s", name)).
WithPorts(map[string]int{"http": 8080}).
Init(e2e.StartOptions{
Expand Down
128 changes: 127 additions & 1 deletion test/e2e/query_frontend_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -5,24 +5,31 @@ package e2e_test

import (
"context"
"crypto/sha256"
"fmt"
"net/http"
"net/http/httptest"
"reflect"
"sort"
"testing"
"time"

"github.com/efficientgo/core/testutil"
"github.com/efficientgo/e2e"
e2emon "github.com/efficientgo/e2e/monitoring"
"github.com/efficientgo/e2e/monitoring/matchers"
"github.com/pkg/errors"
"github.com/prometheus/client_golang/api"
v1 "github.com/prometheus/client_golang/api/prometheus/v1"
"github.com/prometheus/common/model"
"github.com/prometheus/prometheus/model/labels"
"github.com/prometheus/prometheus/model/timestamp"

"github.com/efficientgo/core/testutil"
"github.com/thanos-io/thanos/pkg/block/metadata"
"github.com/thanos-io/thanos/pkg/cacheutil"
"github.com/thanos-io/thanos/pkg/promclient"
"github.com/thanos-io/thanos/pkg/queryfrontend"
"github.com/thanos-io/thanos/pkg/tenancy"
"github.com/thanos-io/thanos/pkg/testutil/e2eutil"
"github.com/thanos-io/thanos/test/e2e/e2ethanos"
)
Expand Down Expand Up @@ -810,3 +817,122 @@ func TestInstantQueryShardingWithRandomData(t *testing.T) {
})
}
}

func TestQueryFrontendTenantForward(t *testing.T) {
t.Parallel()

tests := []struct {
name string
customTenantHeaderName string
tenantName string
}{
{
name: "default tenant header name with a tenant name",
customTenantHeaderName: tenancy.DefaultTenantHeader,
tenantName: "test-tenant",
},
{
name: "default tenant header name without a tenant name",
customTenantHeaderName: tenancy.DefaultTenantHeader,
},
{
name: "custom tenant header name with a tenant name",
customTenantHeaderName: "X-Foobar-Tenant",
tenantName: "test-tenant",
},
{
name: "custom tenant header name without a tenant name",
customTenantHeaderName: "X-Foobar-Tenant",
},
}
for _, tc := range tests {
tc := tc
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
if tc.tenantName == "" {
tc.tenantName = tenancy.DefaultTenant
}
// Use a shorthash of tc.name as e2e env name because the random name generator is having a collision for
// some reason.
e2ename := fmt.Sprintf("%x", sha256.Sum256([]byte(tc.name)))[:8]
e, err := e2e.New(e2e.WithName(e2ename))
testutil.Ok(t, err)
t.Cleanup(e2ethanos.CleanScenario(t, e))

ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusNoContent)
// The tenant header present in the outgoing request should be the default tenant header.
testutil.Equals(t, tc.tenantName, r.Header.Get(tenancy.DefaultTenantHeader))

// In case the query frontend is configured with a custom tenant header name, verify such header
// is not present in the outgoing request.
if tc.customTenantHeaderName != tenancy.DefaultTenantHeader {
testutil.Equals(t, "", r.Header.Get(tc.customTenantHeaderName))
}

// Verify the outgoing request will keep the X-Scope-OrgID header for compatibility with Cortex.
testutil.Equals(t, tc.tenantName, r.Header.Get("X-Scope-OrgID"))
}))
t.Cleanup(ts.Close)
tsPort := urlParse(t, ts.URL).Port()

inMemoryCacheConfig := queryfrontend.CacheProviderConfig{
Type: queryfrontend.INMEMORY,
Config: queryfrontend.InMemoryResponseCacheConfig{
MaxSizeItems: 1000,
Validity: time.Hour,
},
}
queryFrontendConfig := queryfrontend.Config{
TenantHeader: tc.customTenantHeaderName,
}
queryFrontend := e2ethanos.NewQueryFrontend(
e,
"qfe",
fmt.Sprintf("http://%s:%s", e.HostAddr(), tsPort),
queryFrontendConfig,
inMemoryCacheConfig,
)
testutil.Ok(t, e2e.StartAndWaitReady(queryFrontend))

ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
t.Cleanup(cancel)

promClient, err := api.NewClient(api.Config{
Address: "http://" + queryFrontend.Endpoint("http"),
RoundTripper: tenantRoundTripper{
tenant: tc.tenantName,
rt: http.DefaultTransport,
},
})
testutil.Ok(t, err)
v1api := v1.NewAPI(promClient)

r := v1.Range{
Start: time.Now().Add(-time.Hour),
End: time.Now(),
Step: time.Minute,
}

_, _, _ = v1api.QueryRange(ctx, "rate(prometheus_tsdb_head_samples_appended_total[5m])", r)
_, _, _ = v1api.Query(ctx, "rate(prometheus_tsdb_head_samples_appended_total[5m])", time.Now())
})
}
}

type tenantRoundTripper struct {
tenant string
tenantHeader string
rt http.RoundTripper
}

var _ http.RoundTripper = tenantRoundTripper{}

// RoundTrip implements the http.RoundTripper interface.
func (u tenantRoundTripper) RoundTrip(r *http.Request) (*http.Response, error) {
if u.tenantHeader == "" {
u.tenantHeader = tenancy.DefaultTenantHeader
}
r.Header.Set(u.tenantHeader, u.tenant)
return u.rt.RoundTrip(r)
}
Loading