Add single compactor http client for delete and gennumber clients #7453
Conversation
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

```
+ ingester            0.1%
+ distributor         0%
+ querier             0%
+ querier/queryrange  0%
+ iter                0%
+ storage             0%
+ chunkenc            0%
+ logql               0%
- loki               -0.1%
```

Force-pushed from bb6f93e to 0f3d83d. Compare
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

```
+ ingester            0%
+ distributor         0%
+ querier             0%
+ querier/queryrange  0%
+ iter                0%
+ storage             0%
+ chunkenc            0%
+ logql               0%
- loki               -0.2%
```

Force-pushed from 0f3d83d to 9132927. Compare
@periklis thanks for adding the backport label. Aiming for end of week for actually cutting 2.7, so that should be plenty of time to get this in.
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

```
+ ingester            0%
+ distributor         0%
+ querier             0%
+ querier/queryrange  0%
+ iter                0%
+ storage             0%
+ chunkenc            0%
+ logql               0%
- loki               -0.2%
```

Force-pushed from 9132927 to 54d7c88. Compare
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

```
+ ingester            0%
+ distributor         0%
+ querier             0%
+ querier/queryrange  0.1%
+ iter                0%
+ storage             0%
+ chunkenc            0%
+ logql               0%
- loki               -0.7%
```
@trevorwhitney Reminder about this one.
Looks good Peri! One question and one suggestion:
- Why did the simplification of the client require so much additional checking of whether or not retention is enabled that wasn't previously needed? My assumption is that previously we were creating the client but just not using it when retention was disabled, but I'm assuming something about this change makes that problematic?
- How would you feel about moving the `t.Cfg.CompactorConfig.RetentionEnabled` check into `t.supportIndexDeleteRequest`, or are there cases where we would support the index delete requests even if retention was not enabled?
```diff
@@ -395,7 +395,7 @@ func (t *Loki) initQuerier() (services.Service, error) {
 	toMerge := []middleware.Interface{
 		httpreq.ExtractQueryMetricsMiddleware(),
 	}
-	if t.supportIndexDeleteRequest() {
+	if t.supportIndexDeleteRequest() && t.Cfg.CompactorConfig.RetentionEnabled {
```
What's this additional check for, and why wasn't it previously needed? You check it in combination with `t.supportIndexDeleteRequest` in multiple places, so I'm curious whether it should be moved into the `t.supportIndexDeleteRequest` function?
As mentioned above in the description, this extra check is needed when deletes-based retention is entirely disabled. This is the case by default IIRC, and it produces red-herring errors in the querier and ruler logs like this:

```
level=error ts=2022-10-18T14:13:41.598649348Z caller=delete_requests_client.go:211 msg="error getting delete requests from the store" err="unexpected status code: 404"
ts=2022-10-18T14:13:41.598697295Z caller=spanlogger.go:80 user=application level=error msg="failed loading deletes for user" err="unexpected status code: 404"
```

I found that moving it into `supportIndexDeleteRequest` obscures the fact that this method is dedicated to checking whether the underlying index store supports deletes. Adding `RetentionEnabled` in there would make the retention check hidden/implicit within the store capabilities. OTOH, as proposed in the implementation, it is more explicit for the reader: we need an index store that supports deletes and an explicit opt-in to use the feature.
```diff
@@ -1160,7 +1167,7 @@ func (t *Loki) initUsageReport() (services.Service, error) {
 }

 func (t *Loki) deleteRequestsClient(clientType string, limits *validation.Overrides) (deletion.DeleteRequestsClient, error) {
-	if !t.supportIndexDeleteRequest() {
+	if !t.supportIndexDeleteRequest() || !t.Cfg.CompactorConfig.RetentionEnabled {
```
Again, this seems to add to the argument that this additional check should be moved into that function.
Same as above: it is more explicit for the reader that we need an index store that supports deletes plus an explicit opt-in to use the feature.
```diff
@@ -214,7 +217,7 @@ func (s resultsCache) Do(ctx context.Context, r Request) (Response, error) {
 		return s.next.Do(ctx, r)
 	}

-	if s.cacheGenNumberLoader != nil {
+	if s.cacheGenNumberLoader != nil && s.retentionEnabled {
```
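The effect of this gating can be sketched in isolation: when retention is disabled, the results cache never consults the generation-number loader, so no compactor request (and no spurious 404) is made. This is a simplified stand-in, assuming a function-valued loader rather than Loki's actual interface.

```go
package main

import "fmt"

// resultsCache is a simplified stand-in for the query-range results cache.
// cacheGenNumberLoader is a hypothetical function-valued loader; in Loki it
// is an interface backed by the compactor's generation-number endpoint.
type resultsCache struct {
	cacheGenNumberLoader func(userID string) string
	retentionEnabled     bool
}

// genNumberFor returns the cache generation number only when a loader is
// configured AND retention is enabled, mirroring the diff above.
func (s resultsCache) genNumberFor(userID string) (string, bool) {
	if s.cacheGenNumberLoader != nil && s.retentionEnabled {
		return s.cacheGenNumberLoader(userID), true
	}
	return "", false // skipped: no compactor call, hence no 404
}

func main() {
	s := resultsCache{
		cacheGenNumberLoader: func(string) string { return "42" },
		retentionEnabled:     false,
	}
	_, ok := s.genNumberFor("tenant-a")
	fmt.Println(ok) // false: loader configured, but retention disabled
	s.retentionEnabled = true
	gen, ok := s.genNumberFor("tenant-a")
	fmt.Println(gen, ok) // 42 true
}
```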
Again, I'm not quite sure why we have to add this check now when it wasn't needed before. It seems out of scope for adding TLS support to the client, so I'm curious what motivated it?
It does seem out of scope, but on the other hand, fixing TLS is what makes the 404s visible in the querier/ruler logs. Even if you run the entire system without TLS, you see the 404s anyway when retention is disabled. In any case, both issues, the missing TLS support and the 404s on disabled retention, are oversights in the compactor client. IMHO this PR just removes those oversights, and we may want to re-title it and change the changelog entry to be more informative in that direction. WDYT?
> fixing TLS makes the 404s visible in the querier/ruler logs

Sounds like good enough justification to me! Thanks for the added context.
Thanks for the added context. LGTM!
Force-pushed from 54d7c88 to 5348eb8. Compare
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

```
+ ingester            0%
+ distributor         0%
+ querier             0%
- querier/queryrange -0.1%
+ iter                0%
+ storage             0%
+ chunkenc            0%
+ logql               0%
- loki               -0.7%
```
The backport to `release-2.7.x` could not be applied automatically. To backport manually, run these commands in your terminal:

```shell
# Fetch latest updates from GitHub
git fetch
# Create a new branch
git switch --create backport-7453-to-release-2.7.x origin/release-2.7.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x e0a7b28a61e1ef83d62493c52ad1d2a51ba76902
# Push it to GitHub
git push --set-upstream origin backport-7453-to-release-2.7.x
git switch main
# Remove the local backport branch
git branch -D backport-7453-to-release-2.7.x
```

Then, create a pull request where the base branch is `release-2.7.x` and the compare/head branch is `backport-7453-to-release-2.7.x`.
(grafana#7453) (cherry picked from commit e0a7b28)
What this PR does / why we need it:
This PR completes work started with #7426. There are two types of HTTP clients for the compactor retention API: one for deletes and one for generation numbers. However, #7426 introduced, through an oversight regarding the existence of the generation number client, the `boltdb.shipper.compactor.delete_client` setting, which this PR replaces with a single `boltdb.shipper.compactor.client`. The latter enables setting the TLS options for the underlying HTTP client when calling the compactor retention API.

Furthermore, this change set also includes a fix to enable the generation number middlewares only if the compactor retention API is enabled. Previously this produced non-blocking but confusing 404 errors like the following on each query and rule execution:
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Added backport label to fix the CLI flags from `boltdb.shipper.compactor.delete_client` to `boltdb.shipper.compactor.client` before #7455 goes out, or soon after. (cc @MichelHollands @trevorwhitney)

Checklist
- Reviewed the `CONTRIBUTING.md` guide
- `CHANGELOG.md` updated
- `docs/sources/upgrading/_index.md`