Fix cell iteration order #202
The iteration order of a map is undefined in Go. The nova controller iterates the cellTemplates map to create NovaCell CRs. There is also a dependency check: a cell that needs API access is not created while cell0 is not ready, because cell0 syncs the API DB. However, the cell template iteration order is random, so in some cases another cell checks the status of cell0 before the loop has processed cell0. This can lead to a situation where cell0 is ready but cell1 reconciliation is not kicked off, because cell0 has not yet been iterated in that loop. This caused random test failures, and it can also cause the cell1 status to flip between Ready and Not Ready. This patch makes sure that the cells are iterated in an order where cell0 is always handled first.
        if cellName != Cell0Name {
            orderedCellNames = append(orderedCellNames, cellName)
        }
    }
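For readers who only see the fragment above, here is a minimal, self-contained sketch of the "cell0 first" ordering idea. It is not the code from this patch: the `CellTemplate` type, the `orderCellNames` helper, and the value of `Cell0Name` are assumptions made for illustration, and sorting the remaining names is an extra step added here only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// Cell0Name mirrors the constant referenced in the diff; its value is assumed.
const Cell0Name = "cell0"

// CellTemplate is a hypothetical stand-in for the real template type.
type CellTemplate struct {
	HasAPIAccess bool
}

// orderCellNames returns the template names with cell0 first (when present).
// The remaining names are sorted; that is an addition of this sketch, not
// something the patch is known to do.
func orderCellNames(templates map[string]CellTemplate) []string {
	ordered := make([]string, 0, len(templates))
	if _, ok := templates[Cell0Name]; ok {
		ordered = append(ordered, Cell0Name)
	}
	rest := make([]string, 0, len(templates))
	for name := range templates { // map iteration order is unspecified
		if name != Cell0Name {
			rest = append(rest, name)
		}
	}
	sort.Strings(rest)
	return append(ordered, rest...)
}

func main() {
	templates := map[string]CellTemplate{
		"cell2": {HasAPIAccess: false},
		"cell0": {HasAPIAccess: true},
		"cell1": {HasAPIAccess: true},
	}
	fmt.Println(orderCellNames(templates)) // prints [cell0 cell1 cell2]
}
```

Iterating over the returned slice instead of ranging over the map directly gives the same visiting order on every reconcile.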
so this works but we could also use https://pkg.go.dev/github.com/wk8/go-ordered-map
nice find either way.
I'll admit I had not checked whether the hash order was consistent in golang. Given that the tests run with a random seed, which would presumably also impact the random number generator used for the hashing, it makes sense that this would be non-deterministic.
As you noted below, simply having an insertion-order-preserving map is not enough, as we need to process cell0 before all the other cells that depend on cell0 (those where hasAPIAccess = True).
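To illustrate why the visiting order matters independently of how the map stores its keys, here is a deliberately simplified model of one reconcile pass. It is not the controller's actual code: `template`, `reconcilePass`, and the `clusterReady` map are hypothetical, and the model assumes the pass only learns a cell's status when it visits that cell.

```go
package main

import "fmt"

const cell0 = "cell0"

// template is a hypothetical, reduced cell template.
type template struct{ hasAPIAccess bool }

// reconcilePass visits the cells in the given order and returns the ones it
// reconciled. A cell with API access is skipped unless cell0 has already been
// visited in this pass and found ready.
func reconcilePass(order []string, templates map[string]template, clusterReady map[string]bool) []string {
	seenReady := map[string]bool{} // statuses recorded during this pass only
	var reconciled []string
	for _, name := range order {
		if name != cell0 && templates[name].hasAPIAccess && !seenReady[cell0] {
			continue // dependency on cell0 not satisfied yet in this pass
		}
		reconciled = append(reconciled, name)
		seenReady[name] = clusterReady[name]
	}
	return reconciled
}

func main() {
	templates := map[string]template{
		cell0:   {},
		"cell1": {hasAPIAccess: true},
	}
	clusterReady := map[string]bool{cell0: true} // cell0 is already ready

	// An order that happens to visit cell1 first skips it in this pass.
	fmt.Println(reconcilePass([]string{"cell1", cell0}, templates, clusterReady)) // prints [cell0]
	// Visiting cell0 first lets cell1 proceed in the same pass.
	fmt.Println(reconcilePass([]string{cell0, "cell1"}, templates, clusterReady)) // prints [cell0 cell1]
}
```

In this model an insertion-ordered map would still produce the first result whenever cell1 was inserted before cell0, which is the point made in the comment above.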
This is the minimal required fix to resolve the flaky test execution.
While I think I would prefer to address this via the type system by using ordered maps if possible, I think we can proceed with this for now and come back to it in January.
My concern with the current approach is that it is easy to forget to do this where order matters, so I would prefer to use an ordered map type.
Let's look into this when we have more time.
Hopefully we can modify the CRDs to preserve order; if not, then we just need to be careful when we iterate going forward.
With that said, this is actually a more robust fix than just using an ordered map, as it does not depend on users putting cell0 first, which is why I'm approving this while we discuss whether there are other advantages to using an ordered map type later.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: gibizer, SeanMooney