Fix exponential backoff for api.LifetimeWatcher #26383
Thanks! LGTM.
Please could you add a changelog file before it's merged?
@peteski22 Thanks for the review. Added a changelog entry, not entirely sure of the vault project labeling conventions for changelog entries, so let me know if I should do something other than |
Looks good, thanks for adding the changelog entry!
@ccapurso / @peteski22: Just FYI, it looks like I don't have permission to merge since I'm not part of the Vault org, so I think someone from your team will have to. Thanks!
I can merge. Thanks again for the fix!
I just opened a bug for this PR; this fixes it. Here's the original context from the bug:
Describe the bug
We were using the Vault `LifetimeWatcher` from the `api` package in an internal project and noticed an issue with the backoff behavior of token renewal that was causing a bunch of our tests to fail when we upgraded to a new version of Vault.
The bug is here:
vault/api/lifetime_watcher.go
Lines 348 to 354 in 1274f2d
`sleepDuration` appears to be the `time.Duration` used prior to re-running the renewal loop. In the case when `errBackoff` is `nil`, then a simple backoff duration is calculated based on the call to `calculateSleepDuration`. If `errBackoff` is not nil then `sleepDuration` is never set and the timeout in the following select block immediately fires again.
In our testing environment this was caught because our mock Vault server was returning an invalid response, so the renew operation was failing and we were getting an inordinate amount of immediate retries.
The fix is just refactoring the above block to capture the `errBackoff.NextBackoff()` value as `sleepDuration`.
fixes: #26382