Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cell iteration order #202

Conversation

gibizer
Copy link
Contributor

@gibizer gibizer commented Dec 16, 2022

The iteration order of a map is undefined in golang. The nova controller iterates the cellTemplates map to create NovaCell CRs. Also there is a dependency check that a cell is not created if it needs API access and the cell0 is not ready as that cell syncs the API DB. However the cell template iteration order is random so in some cases cell0 is not checked by the loop before another cell wants to check the status of the cell0. This can lead to situation where cell0 was ready but cell1 reconciliation is not kicked off as cell0 was not yet iterated in that loop. This caused random test failures. And this can cause that cell1 status is flipping between Ready and Not Ready.

This patch makes sure that the cells are iterated in an order where cell0 is always handled first.

The iteration order of a map is undefined in golang. The nova controller
iterates the cellTemplates map to create NovaCell CRs. Also there is a
dependency check that a cell is not created if it needs API access and
the cell0 is not ready as that cell syncs the API DB. However the cell
template iteration order is random so in some cases cell0 is not checked
by the loop before another cell wants to check the status of the cell0.
This can lead to situation where cell0 was ready but cell1
reconciliation is not kicked off as cell0 was not yet iterated in that
loop. This caused random test failures. And this can cause that cell1
status is flipping between Ready and Not Ready.

This patch makes sure that the cells are iterated in an order where
cell0 is always handled first.
@gibizer gibizer requested a review from SeanMooney December 16, 2022 19:10
@openshift-ci openshift-ci bot requested a review from bogdando December 16, 2022 19:10
if cellName != Cell0Name {
orderedCellNames = append(orderedCellNames, cellName)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this works but we could also use https://pkg.go.dev/github.com/wk8/go-ordered-map
nice find either way.
ill admit i had not checked if the hash order was consistent in golang and given the test run with a random seed which would also presumably impact the random number generator used for the hashing it makes sesce that this would be non deterministic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you noted below simply having an insertion order preserving map is not enough as we need to process cell0 before all the other cells depending on cell0 (those where hasAPIAccess = True).

Copy link
Contributor

@SeanMooney SeanMooney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the minimal required fix to resolve the flaky test execution.

while i think i would prefer to address this via the type system by using ordered maps if possible i think we can proceed with this for now and come back to it in January.

my concern with the current approach is its easy to forget to do so where order matters to use i would prefer to use an ordered map type.

lets look into this when we have more time.
hopefully we can modify the CRDs to preserver order if not then we just need to be carful when we iterate going forward.

with that said this is actually a more robust fix then just using an ordered map as it does not depend on uses putting cell0 first which is why I'm approving this while we discuss if there are other advnatgtes to using an order map type later.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 16, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gibizer, SeanMooney

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit e5faf6b into openstack-k8s-operators:master Dec 16, 2022
@gibizer gibizer deleted the fix-cell-iteration branch December 18, 2022 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants