Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buildx: log errors in initializing builders #1206

Merged
merged 2 commits into from
Aug 1, 2022

Conversation

jedevc
Copy link
Collaborator

@jedevc jedevc commented Jul 12, 2022

Previously, errors within the driver config would not be reported to the user until they tried to use the driver, even though they are easily accessible from the node group info.

For example:

# before
$ buildx create --bootstrap --name=kube --driver=kubernetes --driver-opt=namespace=buildkit,replicas=4,loadbalance=test
kube

# after
$ buildx create --bootstrap --name=kube --driver=kubernetes --driver-opt=namespace=buildkit,replicas=4,loadbalance=test
ERROR: failed to initialize builder kube (kube0): invalid loadbalance "test"
kube

This is in a similar vein to #1188

Previously, errors within the driver config would not be reported to the
user until they tried to use the driver, even though they are easily
accessible from the node group info.

This patch reports these errors (but will not fail because of them,
since the data is already saved) - this should help improve
debuggability of some of the more complex drivers, and prevent error
messages being suppressed.

Signed-off-by: Justin Chadwell <[email protected]>
@jedevc jedevc requested a review from crazy-max July 12, 2022 12:13
Copy link
Member

@crazy-max crazy-max left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the command is still successful in these cases? Maybe we should error and delete the builder?

This builds on the added warnings from initialized builders, now
erroring the command, and additionally attempting to revert to the
previous configuration.

To preserve the previous configuration, we add a deep Copy() function to
the NodeGroup and Node so that we can easily restore it later if we
encounter a failure.

Signed-off-by: Justin Chadwell <[email protected]>
@jedevc
Copy link
Collaborator Author

jedevc commented Jul 27, 2022

Done, this should now properly error and attempt to rollback to the previous configuration if it failed.

@jedevc jedevc added this to the v0.9.0 milestone Aug 1, 2022
Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Letting this in for now but we should refactor this so that we can validate the driver inputs without creating the nodegroup instance and then deleting it. There is a race condition in the current implementation when the rollback runs.

@tonistiigi tonistiigi merged commit c1fbebe into docker:master Aug 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants