Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configure retry on Rate-Limiting from remote-write config #8477

Conversation

Harkishen-Singh
Copy link
Contributor

Signed-off-by: Harkishen-Singh [email protected]

Fixes: #8474

cc @csmarchbanks @roidelapluie

@Harkishen-Singh Harkishen-Singh force-pushed the configure-rate-limit-via-config branch from 90a9f97 to 72d031e Compare February 11, 2021 18:15
@@ -1724,6 +1724,9 @@ url: <string>
headers:
[ <string>: <string> ... ]

# Retry on receiving a rate-limiting status code in response from remote-write storage.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you put a note here saying that this configuration may change in any release, similar to the note above metadata_config. I imagine we may get rid of it once there is an agreed upon solution to how to handle rate limited retries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTP 429 should be mentioned.

I am not confortable to change this behavior in a minor release.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doc string should also say retry on 429 status code, and the variable name two lines down still needs to be updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

variable name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retry_on_rate_limit should be retry_on_http_429 now a couple lines below this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I hope the message is good.

storage/remote/client.go Outdated Show resolved Hide resolved
storage/remote/client.go Outdated Show resolved Hide resolved
storage/remote/client_test.go Outdated Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
@@ -1724,6 +1724,9 @@ url: <string>
headers:
[ <string>: <string> ... ]

# Retry on receiving a rate-limiting status code in response from remote-write storage.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTP 429 should be mentioned.

I am not confortable to change this behavior in a minor release.

@Harkishen-Singh Harkishen-Singh force-pushed the configure-rate-limit-via-config branch from 72d031e to 653e37d Compare February 11, 2021 19:51
@Harkishen-Singh Harkishen-Singh force-pushed the configure-rate-limit-via-config branch from 653e37d to 4a175dd Compare February 11, 2021 20:32
Copy link
Member

@roidelapluie roidelapluie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I read the documentation, I think that we should move this setting into the queue_config, next to the other retries control, like min_backoff and max_backoff.

docs/configuration/configuration.md Outdated Show resolved Hide resolved
@csmarchbanks
Copy link
Member

Let's hold off a bit on this since #8237 will be reverted for the release while we figure out what to do about limiting retries.

config/config.go Outdated Show resolved Hide resolved
@csmarchbanks
Copy link
Member

The release branch has been merged back into master, so we can add this flag without worrying about merge conflicts now once the final comments are addressed.

@Harkishen-Singh Harkishen-Singh force-pushed the configure-rate-limit-via-config branch from a7bc54d to 77c20fd Compare February 16, 2021 09:23
@roidelapluie
Copy link
Member

When I see this, I think it is a mistake to put that in the queue config because it is actually managed by the client. Sorry I did not spot this before. Even if it looks better next to backoff, it is actually really a client thing. Maybe we need to revert to having it at the client level again. so sorry about this. cc @csmarchbanks what is your opinion here?

@Harkishen-Singh
Copy link
Contributor Author

That's completely fine. No problem. I am happy to revert back if Chris agrees too.

Copy link
Member

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I slightly prefer having it in the queue_config. The client happens to be where the implementation uses the flag (right now at least), but it is the queue that actually does the retries. I can imagine a future where there is a RateLimitedError in addition to RecoverableError at which point the flag would likely be used by the queue. Not a strong opinion and I would be fine having the toggle near the client config as well.

Approving as i am happy with this as stands, @roidelapluie if you agree with my slight preference feel free to merge, otherwise let's go back to what it was before.

@roidelapluie roidelapluie merged commit 5f92a82 into prometheus:master Feb 16, 2021
@roidelapluie
Copy link
Member

Okay, I agree from a user perspective, it makes more sense in the queue config.

Thanks @Harkishen-Singh !!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remote write: Make HTTP 429 errors recoverable only when the user opts-in
3 participants