Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metrics: don't increment objstore_bucket_operation_failures_total if context cancelled while reading #117

Conversation

charleskorn
Copy link
Contributor

@charleskorn charleskorn commented May 1, 2024

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

This PR fixes an issue where the objstore_bucket_operation_failures_total metric is incremented if the context is cancelled while reading an object from a bucket but after the Get or GetRange method has returned.

This makes the behaviour consistent with other operations, and makes the behaviour consistent with a cancellation that occurs while Get or GetRange is executing.

Verification

I have added unit tests to cover this scenario.

@charleskorn charleskorn changed the title all: don't increment objstore_bucket_operation_failures_total if context cancelled while reading metrics: don't increment objstore_bucket_operation_failures_total if context cancelled while reading May 1, 2024
Signed-off-by: Charles Korn <[email protected]>
Signed-off-by: Charles Korn <[email protected]>
Signed-off-by: Charles Korn <[email protected]>
@charleskorn charleskorn force-pushed the cancellation-during-read-is-not-an-error branch from b92b4ef to 374a7dc Compare May 1, 2024 02:02
Copy link
Contributor

@MichaHoffmann MichaHoffmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we get the same by marking canceled context as expected error?

@charleskorn
Copy link
Contributor Author

Couldn't we get the same by marking canceled context as expected error?

Yes. However, cancelled context is currently treated as a special case everywhere else that checks for expected errors (eg. here and here), so this PR makes reading an object consistent with that.

We could remove all the special checks for context cancellation, but that would then be a change in current behaviour for all consumers of this library.

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (modulo a minor comment)

objstore.go Outdated Show resolved Hide resolved
@charleskorn charleskorn requested a review from MichaHoffmann May 6, 2024 01:24
Copy link
Contributor

@MichaHoffmann MichaHoffmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@MichaHoffmann MichaHoffmann merged commit 63052b4 into thanos-io:main May 6, 2024
7 checks passed
@MichaHoffmann
Copy link
Contributor

Isn't this actually hiding an error? If the initial s3 read that we do to check for 404 fails because of connection issue we don't get timing or error metrics

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants