fix(catalog): fix issue where subscriptions sometimes get "stuck" #847
We were not resetting the client when updating a catalogsource, so the client could go stale and never attempt to reconnect if it didn't become unhealthy "in time" for us to detect it and reconnect.
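The failure mode described above can be illustrated with a minimal sketch. This is not the operator's code: `client`, `cache`, `get`, and `reset` are hypothetical names, and the addresses are made up. It only shows why a cached client that is never dropped on update keeps pointing at the old server.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a hypothetical stand-in for a registry client bound to one address.
type client struct{ addr string }

// cache is a hypothetical stand-in for a lock-guarded map of clients keyed by source.
type cache struct {
	mu      sync.Mutex
	clients map[string]*client
}

// get returns the cached client for key and only builds a new one if none is cached.
func (c *cache) get(key, addr string) *client {
	c.mu.Lock()
	defer c.mu.Unlock()
	if cl, ok := c.clients[key]; ok {
		return cl
	}
	cl := &client{addr: addr}
	c.clients[key] = cl
	return cl
}

// reset drops the cached client so that the next get builds a fresh one.
func (c *cache) reset(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.clients, key)
}

func main() {
	c := &cache{clients: map[string]*client{}}
	c.get("ns/catalog", "10.0.0.1:50051")

	// The source is updated and its server moves. Without a reset, the old client is reused.
	stale := c.get("ns/catalog", "10.0.0.2:50051")
	fmt.Println(stale.addr) // prints 10.0.0.1:50051

	// Dropping the entry on update forces a fresh client on the next lookup.
	c.reset("ns/catalog")
	fresh := c.get("ns/catalog", "10.0.0.2:50051")
	fmt.Println(fresh.addr) // prints 10.0.0.2:50051
}
```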
@@ -446,6 +446,12 @@ func (o *Operator) syncCatalogSources(obj interface{}) (syncError error) {
o.sourcesLastUpdate = timeNow()
logger.Debug("registry server recreated")

func() {
This is the fix; everything else is small things I noticed while reviewing.
Any reason you use a closure instead of just lock/unlock without defer?
o.sourcesLock.Lock()
delete(o.sources, sourceKey)
o.sourcesLock.Unlock()
It's purely to keep the lock/unlock always next to each other, so that if we need to do additional work with sources here in the future, it's harder to make a mistake.
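The two styles discussed in this thread can be compared side by side in a small sketch. The `registry` type and both method names are hypothetical, and the closure body is an assumption about what the pattern typically looks like, since the diff above only shows its opening line.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for a sources map and the lock guarding it.
type registry struct {
	lock    sync.Mutex
	sources map[string]string
}

// removeInline pairs Lock and Unlock by hand. Code added between them later
// must not return or panic before Unlock, or the lock stays held.
func (r *registry) removeInline(key string) {
	r.lock.Lock()
	delete(r.sources, key)
	r.lock.Unlock()
}

// removeScoped wraps the critical section in a closure so that Lock and the
// deferred Unlock sit next to each other and the lock is released when the
// closure returns, whatever is added to its body.
func (r *registry) removeScoped(key string) {
	func() {
		r.lock.Lock()
		defer r.lock.Unlock()
		delete(r.sources, key)
	}()
	// Anything placed here runs without the lock held.
}

func main() {
	r := &registry{sources: map[string]string{"a": "x", "b": "y"}}
	r.removeInline("a")
	r.removeScoped("b")
	fmt.Println(len(r.sources)) // prints 0
}
```

Both versions behave the same today; the closure only narrows the scope of the deferred unlock so that the rest of the enclosing function does not run under the lock.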
Cool, just curious if I could learn something here, thanks.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ecordell, tkashem
/retest
I ran InstallPlanWithCSVsAcrossMultipleCatalogSources 20 times in a row to verify that it no longer flakes (previously it would error within ~5 tries).