Fix race condition in iptables partial sync handling #122204
Please note that we're already in Test Freeze for the Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Wed Dec 6 10:11:59 UTC 2023.
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: danwinship The full list of commands accepted by this bot can be found here. The pull request process is described here
902aaf3 to abbcd67
-func (cache *EndpointSliceCache) checkoutChanges() []*endpointsChange {
-	changes := []*endpointsChange{}
+func (cache *EndpointSliceCache) checkoutChanges() map[types.NamespacedName]*endpointsChange {
+	changes := make(map[types.NamespacedName]*endpointsChange)
this is obtained from trackerByServiceMap
so there is no risk of collapsing two different changes associated to the same service
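The point above can be illustrated with a minimal sketch. It assumes a simplified cache in which the tracker map holds at most one pending change per service; the types `NamespacedName`, `endpointsChange`, and `cache` below are hypothetical stand-ins, not the real kube-proxy types, and clearing the tracker on checkout is also an assumption of this sketch.

```go
package main

import "fmt"

// NamespacedName is a hypothetical stand-in for types.NamespacedName.
type NamespacedName struct{ Namespace, Name string }

// endpointsChange is a toy change record (assumed shape).
type endpointsChange struct{ previous, current []string }

// cache is a simplified model: the tracker map is already keyed by
// service, so it can hold at most one pending change per service.
type cache struct {
	trackerByService map[NamespacedName]*endpointsChange
}

// checkoutChanges returns the pending changes keyed by service and
// resets the tracker. Because the source map has unique keys, copying
// it into another map keyed the same way cannot merge two changes.
func (c *cache) checkoutChanges() map[NamespacedName]*endpointsChange {
	changes := make(map[NamespacedName]*endpointsChange, len(c.trackerByService))
	for name, change := range c.trackerByService {
		changes[name] = change
	}
	c.trackerByService = map[NamespacedName]*endpointsChange{}
	return changes
}

func main() {
	c := &cache{trackerByService: map[NamespacedName]*endpointsChange{
		{Namespace: "default", Name: "web"}: {current: []string{"10.0.0.1"}},
		{Namespace: "default", Name: "db"}:  {current: []string{"10.0.0.2"}},
	}}
	fmt.Println(len(c.checkoutChanges())) // one entry per service
	fmt.Println(len(c.checkoutChanges())) // nothing left after checkout
}
```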
pkg/proxy/endpoints_test.go
-for i, change := range changes {
-	expectedChange := tc.expectedChanges[i]
+for _, change := range changes {
+	// The test only supports 0 or 1 changes, so if we're here,
I didn't check in detail, but is this because the test cannot support more than 1 change, or because the existing test cases only have 0 or 1?
the latter
@@ -1207,7 +1201,7 @@ func (proxier *Proxier) syncProxyRules() {
 // them in activeNATChains, so they won't get deleted.) However, we have
 // to still figure out how many chains we _would_ have written to make the
 // metrics come out right, so we just compute them and throw them away.
-if tryPartialSync && !serviceChanged.Has(svcName.NamespacedName.String()) && !endpointsChanged.Has(svcName.NamespacedName.String()) {
+if tryPartialSync && !serviceUpdateResult.UpdatedServices.Has(svcName.NamespacedName) && !endpointUpdateResult.UpdatedServices.Has(svcName.NamespacedName) {
this seems much better and easier to think about
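As a rough illustration of why the new condition is easier to reason about, here is a minimal sketch of the check using sets keyed by a struct instead of by formatted strings. `NamespacedName`, `Set`, `UpdateResult`, and `canSkipRewrite` are hypothetical names for this example only; the real code uses the Kubernetes types shown in the diff.

```go
package main

import "fmt"

// NamespacedName is a hypothetical stand-in for types.NamespacedName.
type NamespacedName struct{ Namespace, Name string }

// Set is a minimal generic set (assumed helper, not the k8s sets package).
type Set[T comparable] map[T]struct{}

func (s Set[T]) Insert(items ...T) {
	for _, item := range items {
		s[item] = struct{}{}
	}
}

func (s Set[T]) Has(item T) bool {
	_, ok := s[item]
	return ok
}

// UpdateResult models what a single Update() call reports back.
type UpdateResult struct {
	UpdatedServices Set[NamespacedName]
}

// canSkipRewrite mirrors the shape of the condition in the diff: during a
// partial sync, a service's rules are left alone only if neither the
// service update nor the endpoint update touched it.
func canSkipRewrite(tryPartialSync bool, svc NamespacedName, svcRes, epRes UpdateResult) bool {
	return tryPartialSync && !svcRes.UpdatedServices.Has(svc) && !epRes.UpdatedServices.Has(svc)
}

func main() {
	web := NamespacedName{Namespace: "default", Name: "web"}
	db := NamespacedName{Namespace: "default", Name: "db"}

	svcRes := UpdateResult{UpdatedServices: Set[NamespacedName]{}}
	epRes := UpdateResult{UpdatedServices: Set[NamespacedName]{}}
	epRes.UpdatedServices.Insert(db) // only db's endpoints changed

	fmt.Println(canSkipRewrite(true, web, svcRes, epRes)) // true
	fmt.Println(canSkipRewrite(true, db, svcRes, epRes))  // false
}
```

Keying the sets by the struct avoids the String() round trip at every lookup, and both sets come from the results of the updates that were actually applied.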
only one question about one test, but LGTM
abbcd67 to 0c1ffec
/lgtm
LGTM label has been added. Git tree hash: d37505ac6dbd12087479890dcc2c5a76946b3872
@danwinship the bot says it needs rebase
it doesn't; the tag is stale...
ServicePortMap.Update() and EndpointsMap.Update() were just tiny wrappers around the corresponding apply() methods, which had no other callers. So squash them together. (Also fix the variable naming in ServicePortMap.Update() to match other methods.)
…ults This fixes a race condition where the tracker could be updated in between us calling .PendingChanges() and .Update().
0c1ffec to 626f349
/retest-required
/lgtm
LGTM label has been added. Git tree hash: b3c85bffa5dd1428437dc6809eece403275fd28f
/retest-required
/cherrypick release-1.28
…04-upstream-release-1.27 Automated cherry pick of #122204: Fix race condition in iptables partial sync handling
…04-upstream-release-1.28 Cherry pick of #122204: Fix race condition in iptables partial sync handling
…04-upstream-release-1.29 Automated cherry pick of #122204: Fix race condition in iptables partial sync handling
What type of PR is this?
/kind bug
/kind regression
What this PR does / why we need it:
Fixes a race condition in the iptables proxy partial sync handling; the methods that process apiserver events (proxier.OnServiceUpdated, etc.) rely on the change trackers' mutexes rather than the proxier's mutexes (since they don't use any mutable variables in the proxier), but syncProxyRules currently makes two separate calls into the change trackers, PendingUpdates() and Update(), so it's possible that an event could arrive between them, causing the two results to reflect inconsistent state. (#121362 (comment))
Which issue(s) this PR fixes:
Fixes #121362
Special notes for your reviewer:
This does not add any regression test since the bug is a race condition and I don't think there's any easy way we could force it?
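Although the real race is hard to force, the interleaving from the description can be modeled deterministically in a toy. The sketch below is not the kube-proxy code: `tracker`, `OnEvent`, and `racySync` are hypothetical names, and the `between` hook simply simulates an event arriving between the two calls.

```go
package main

import (
	"fmt"
	"sync"
)

// tracker is a toy change tracker guarded by its own mutex.
type tracker struct {
	mu      sync.Mutex
	pending map[string]bool
}

func newTracker() *tracker { return &tracker{pending: map[string]bool{}} }

// OnEvent records a changed service; event handlers take only this lock.
func (t *tracker) OnEvent(name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.pending[name] = true
}

// PendingChanges returns a snapshot of the services with pending changes.
func (t *tracker) PendingChanges() map[string]bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	out := make(map[string]bool, len(t.pending))
	for name := range t.pending {
		out[name] = true
	}
	return out
}

// Update applies and clears the pending changes and reports which
// services it applied, all under a single lock acquisition.
func (t *tracker) Update() map[string]bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	applied := t.pending
	t.pending = map[string]bool{}
	return applied
}

// racySync models the two-call pattern; between() runs in the window
// where the tracker lock is not held.
func racySync(t *tracker, between func()) (reported, applied map[string]bool) {
	reported = t.PendingChanges()
	between()
	applied = t.Update()
	return reported, applied
}

func main() {
	tr := newTracker()
	tr.OnEvent("ns/a")
	reported, applied := racySync(tr, func() { tr.OnEvent("ns/b") })
	// ns/b was applied but never reported as changed.
	fmt.Println(len(reported), len(applied)) // 1 2

	// Single-call pattern: the result of Update() is the only source of truth.
	tr.OnEvent("ns/c")
	fmt.Println(len(tr.Update())) // 1
}
```

In the two-call version, a partial sync that trusts the reported set would skip rewriting rules for ns/b even though its change was applied. Returning the set of updated services from Update() itself closes that window.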
Does this PR introduce a user-facing change?
/sig network
/priority important-soon
cc @aojea @juliantaylor