Fix a race condition where machineset reconciles too quickly
The race condition occurs when the machineset reconciles on the same key
too quickly: the second reconciliation does not detect the machines
created or deleted by the first, causing it to create or delete
additional machines.

I attempted to use WaitForCacheSync, but that is also insufficient in
preventing the race condition.

The fix here is to add a 1-second sleep before releasing the mutex lock
when reconciling, which gives the system a chance to recognize the
changes made by the first reconciliation.

Issue kubernetes-sigs#245 was created to improve this hacky fix.
k4leung4 committed May 30, 2018
1 parent 90100b1 commit 2d0cfe2
Showing 2 changed files with 6 additions and 0 deletions.
5 changes: 5 additions & 0 deletions pkg/controller/machineset/controller.go
@@ -39,6 +39,10 @@ import (
// controllerKind contains the schema.GroupVersionKind for this controller type.
var controllerKind = v1alpha1.SchemeGroupVersion.WithKind("MachineSet")

// reconcileMutexSleepSec is the duration to sleep before releasing the mutex lock that is held for reconciliation.
// See https://github.com/kubernetes-sigs/cluster-api/issues/245
var reconcileMutexSleepSec = time.Second

// +controller:group=cluster,version=v1alpha1,kind=MachineSet,resource=machinesets
type MachineSetControllerImpl struct {
	builders.DefaultControllerFns
@@ -118,6 +122,7 @@ func (c *MachineSetControllerImpl) Reconcile(machineSet *v1alpha1.MachineSet) er
	mux := c.msKeyMuxMap[key]
	mux.Lock()
	defer mux.Unlock()
	defer time.Sleep(reconcileMutexSleepSec)

	glog.V(4).Infof("Reconcile machineset %v", machineSet.Name)
	allMachines, err := c.machineLister.Machines(machineSet.Namespace).List(labels.Everything())
1 change: 1 addition & 0 deletions pkg/controller/machineset/reconcile_test.go
@@ -132,6 +132,7 @@ func TestMachineSetControllerReconcileHandler(t *testing.T) {
		},
	}

	reconcileMutexSleepSec = 0
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			// setup the test scenario
