WIP: Render osImageURL, handle "bootstrap" case in MCD #324
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cgwalters
@@ -91,21 +91,16 @@ func generateMachineConfigs(config *RenderConfig, templateDir string) ([]*mcfgv1
	return cfgs, nil
}

// GenerateMachineConfigsForRole is part of generateMachineConfigs; it operates
Thank you for adding the documentation!
/test e2e-aws-op
Terraform/AWS flake and
/retest
fb3a622
to
fbd06d1
Compare
OK rolled in #329 to this and also added a hacky patch to give us some debugging data when our e2e fails.
fbd06d1
to
842693d
Compare
842693d
to
2349419
Compare
/retest
548b2cd
to
3964b8e
Compare
I think we have this in CI but we're not noticing it. If it's happening we need to fix it. Ref: openshift#301
Since something is doing that, and I want to know if it's somehow succeeding.
Have the MCC take `osImageURL` as provided by the cluster update/release payload and generate a `00-{master,worker}-osimageurl` MC from it, which ensures the MCD will update the node to it.

However, we need special handling for the *initial* case where we boot into a target config, but we may be using an old OS image. Currently the MCD would treat this as "config drift" and go degraded.

Today we write the node annotations to a file in `/etc` as part of the rendered Ignition. Use that as a "bootstrap may be required" flag, and handle it specially - if we need to pivot, do *just* that and reboot.

We also clean things up by unlinking that node annotation file; after that, if the `osImageURL` drifts from the expected config, we'll go degraded, just like if someone modified a file.

Closes: openshift#183
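The startup decision described in this commit message can be sketched roughly as follows. This is only an illustration, not the code in this PR: the `Action` type, its constants, and `decideStartupAction` are hypothetical names, and the sketch assumes the caller has already checked whether the node annotation file exists and has read the booted and desired `osImageURL` values. Unlinking the annotation file is left to the caller and its exact timing is not modeled here.

```go
package main

import "fmt"

// Action is a hypothetical type naming what the daemon does at startup.
type Action int

const (
	ActionNone           Action = iota // booted OS image matches the desired config
	ActionPivotAndReboot               // bootstrap case: pivot to the desired image, then reboot
	ActionDegraded                     // drift outside of bootstrap: treat like a modified file
)

func (a Action) String() string {
	switch a {
	case ActionNone:
		return "none"
	case ActionPivotAndReboot:
		return "pivot-and-reboot"
	case ActionDegraded:
		return "degraded"
	}
	return "unknown"
}

// decideStartupAction is a hypothetical helper. bootstrapFlagPresent means the
// node annotation file written by the rendered Ignition still exists, which the
// commit message uses as a "bootstrap may be required" flag.
func decideStartupAction(bootstrapFlagPresent bool, bootedOSImageURL, desiredOSImageURL string) Action {
	if bootedOSImageURL == desiredOSImageURL {
		// Nothing to pivot to; the caller can unlink the flag file if present.
		return ActionNone
	}
	if bootstrapFlagPresent {
		// Initial boot on an old OS image: do just the pivot and reboot.
		return ActionPivotAndReboot
	}
	// The flag is gone, so a mismatch is real drift.
	return ActionDegraded
}

func main() {
	// Example image references are placeholders, not real pullspecs.
	fmt.Println(decideStartupAction(true, "example/os@old", "example/os@new"))
	fmt.Println(decideStartupAction(false, "example/os@old", "example/os@new"))
	fmt.Println(decideStartupAction(false, "example/os@new", "example/os@new"))
}
```

Keeping the decision in a pure function like this would make the bootstrap and drift cases easy to unit test without touching the filesystem.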
As I understand it today.
I'd like to know how long the parts of the initial sync take.
The MCC is what's going to start churning the MachineConfigs.
And also after this we should have the pool pause until all nodes are out of Bootstrap. This should also help with races.
0528d24
to
ed76ea6
Compare
This is an ugly hack; it would be better to do the config first with osimageurl, but I don't quite understand the bootstrap code yet.
ed76ea6
to
85c1e17
Compare
/retest
Errors trying to read the e2e log file
/retest
ISE trying to see logs
/retest
/retest
@cgwalters: The following tests failed.
@cgwalters: PR needs rebase.
OK so here's the problem I'm struggling with... the "bootstrap MC" and the "initial rendered MC" need to be the same. If we land the installer change without this PR, or vice versa, things will break because booted nodes won't be able to find the MC they expect. That was the core problem that brought this PR to a halt. We need to figure out how to "ratchet" the PRs to both the installer and here. What I'm vaguely thinking about is that we somehow detect whether or not the installer set up the osimageurl. Maybe stuff something in the
…le-baseimages-to-mach-ocp-build-data-config Bug 1878163: Updating Dockerfile baseimages to mach ocp-build-data config
This is option 2 for #273
Still a WIP; it builds, but I haven't tested updates with it.