-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
receive: Unify error handling #2899
receive: Unify error handling #2899
Conversation
f3f2cc5
to
02f2957
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kakkoyun this PR introduces pretty invasive changes to how errors are counted and handled and it seems like more than just a unification IMO. In general, I am worried about such large changes to the core of the receive forwarding as this could easily introduce regressions. I would like to be very cautious about changing this logic as significantly as this PR unless there is a strong need for it. I would find it really helpful for the commit message to have a summary describing what problem(s) this PR is hoping to address to that we can see if maybe there is a simpler way to do it.
cb2cf99
to
44ebc68
Compare
@squat I have removed the nested counting, it should be good now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I like the direction, but I would work on those helpers more. I think we could work on simplifying them more, it might be hard to follow.
f0f8259
to
4554982
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this refactoring a lot! I think we can do a bit better on naming so it's easier to understand what the functions do from a caller perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
love it
6d6695e
to
e754ad7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm but I think I would like another pair of eyes on this
@bwplotka I think I had addressed your requests for this one. Could you have another look? |
e754ad7
to
a4beeaf
Compare
@bwplotka friendly ping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just small nits 👍🏽
Signed-off-by: Kemal Akkoyun <[email protected]>
Signed-off-by: Kemal Akkoyun <[email protected]>
Signed-off-by: Kemal Akkoyun <[email protected]>
Signed-off-by: Kemal Akkoyun <[email protected]>
Signed-off-by: Kemal Akkoyun <[email protected]>
8039222
to
2fcb946
Compare
Signed-off-by: Kemal Akkoyun <[email protected]>
2fcb946
to
8ed993c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks 💪🏽
Signed-off-by: Chans321 <[email protected]> fixes in e2e test for new receieve architecture - more fixes pending fixes in receive and e2e test of receive Signed-off-by: Chans321 <[email protected]> fixed connection reset by peer error in receive e2e tests. all e2e tests pass Signed-off-by: Chans321 <[email protected]> pkg/route: Fix unit tests route: Make hashring only work on labels route: Incorporate thanos-io#2899 fixed import issues and changed the query test signature Signed-off-by: Yash Sharma <[email protected]> made some corrections while rebasing receiver changes Signed-off-by: Yash Sharma <[email protected]> Added a method Error() to implement golang error interface Signed-off-by: Yash Sharma <[email protected]> linting Signed-off-by: Yash Sharma <[email protected]> use the new promauto package Signed-off-by: Yash Sharma <[email protected]> removed unused params Signed-off-by: Yash Sharma <[email protected]> rename method and add comments Signed-off-by: Yash Sharma <[email protected]> removed unused flags in receiver service in e2e tests Signed-off-by: Yash Sharma <[email protected]> replaced promauto with prometheus, as earlier was causing unnecessary panic Signed-off-by: Yash Sharma <[email protected]> changes for debug Signed-off-by: Yash Sharma <[email protected]> removed entry for receive test for e2e Signed-off-by: Yash Sharma <[email protected]> added a readiness probe Signed-off-by: Yash Sharma <[email protected]> fix E2E tests for receive router Signed-off-by: yeya24 <[email protected]>
Signed-off-by: Chans321 <[email protected]> fixes in e2e test for new receieve architecture - more fixes pending fixes in receive and e2e test of receive Signed-off-by: Chans321 <[email protected]> fixed connection reset by peer error in receive e2e tests. all e2e tests pass Signed-off-by: Chans321 <[email protected]> pkg/route: Fix unit tests route: Make hashring only work on labels route: Incorporate thanos-io#2899 fixed import issues and changed the query test signature Signed-off-by: Yash Sharma <[email protected]> made some corrections while rebasing receiver changes Signed-off-by: Yash Sharma <[email protected]> Added a method Error() to implement golang error interface Signed-off-by: Yash Sharma <[email protected]> linting Signed-off-by: Yash Sharma <[email protected]> use the new promauto package Signed-off-by: Yash Sharma <[email protected]> removed unused params Signed-off-by: Yash Sharma <[email protected]> rename method and add comments Signed-off-by: Yash Sharma <[email protected]> removed unused flags in receiver service in e2e tests Signed-off-by: Yash Sharma <[email protected]> replaced promauto with prometheus, as earlier was causing unnecessary panic Signed-off-by: Yash Sharma <[email protected]> changes for debug Signed-off-by: Yash Sharma <[email protected]> removed entry for receive test for e2e Signed-off-by: Yash Sharma <[email protected]> added a readiness probe Signed-off-by: Yash Sharma <[email protected]> fix E2E tests for receive router Signed-off-by: yeya24 <[email protected]>
Signed-off-by: Chans321 <[email protected]> fixes in e2e test for new receieve architecture - more fixes pending fixes in receive and e2e test of receive Signed-off-by: Chans321 <[email protected]> fixed connection reset by peer error in receive e2e tests. all e2e tests pass Signed-off-by: Chans321 <[email protected]> pkg/route: Fix unit tests route: Make hashring only work on labels route: Incorporate thanos-io#2899 fixed import issues and changed the query test signature Signed-off-by: Yash Sharma <[email protected]> made some corrections while rebasing receiver changes Signed-off-by: Yash Sharma <[email protected]> Added a method Error() to implement golang error interface Signed-off-by: Yash Sharma <[email protected]> linting Signed-off-by: Yash Sharma <[email protected]> use the new promauto package Signed-off-by: Yash Sharma <[email protected]> removed unused params Signed-off-by: Yash Sharma <[email protected]> rename method and add comments Signed-off-by: Yash Sharma <[email protected]> removed unused flags in receiver service in e2e tests Signed-off-by: Yash Sharma <[email protected]> replaced promauto with prometheus, as earlier was causing unnecessary panic Signed-off-by: Yash Sharma <[email protected]> changes for debug Signed-off-by: Yash Sharma <[email protected]> removed entry for receive test for e2e Signed-off-by: Yash Sharma <[email protected]> added a readiness probe Signed-off-by: Yash Sharma <[email protected]> fix E2E tests for receive router Signed-off-by: yeya24 <[email protected]> revert changes for cmd/main Signed-off-by: Yash Sharma <[email protected]> nitpicks Signed-off-by: Yash Sharma <[email protected]> removed redundant health check Signed-off-by: Yash Sharma <[email protected]> Add receive router registration Signed-off-by: Yash Sharma <[email protected]> not to be commited Signed-off-by: Yash Sharma <[email protected]> modified pkg/receive/handler.go Signed-off-by: Yash Sharma <[email protected]> modified pkg/receive/multitsdb.go Signed-off-by: Yash Sharma <[email protected]> modified pkg/route Signed-off-by: Yash Sharma <[email protected]> modified cmd/thanos/receive_route.go Signed-off-by: Yash Sharma <[email protected]> replaced MultiError with NonNilMultiError Signed-off-by: Yash Sharma <[email protected]> changed to promauto Signed-off-by: Yash Sharma <[email protected]> removed unused variables Signed-off-by: Yash Sharma <[email protected]> changed labelpb.Zlabel to timeseries Signed-off-by: Yash Sharma <[email protected]> whitespace removal and revert the labels Signed-off-by: Yash Sharma <[email protected]> use the main branch's implementation of MultiError Signed-off-by: Yash Sharma <[email protected]>
The Receiver has a couple of similar error modes that could be originated from different resources (i.e local tsdb, forward requests for other peers in the ring), this PR attempts to unify how errors are returned from those paths and how the errors handled, especially for the nested errors, in order to provide consistent fail mode behaviour and make the code easier to follow.
One other reason was, for some cases because of the wrapped errors, it was wrongly categorising the root cause of the errors, this PR also attempts to fix it.
Signed-off-by: Kemal Akkoyun [email protected]
Changes
Verification
make test-local