-
Notifications
You must be signed in to change notification settings - Fork 503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
controller: avoid "context canceled" error on cleanup #1746
Conversation
Did you manage to work this out? |
// wait for Build() completion(or timeout) to ensure the Build's finalizing and avoiding an error "context canceled" | ||
select { | ||
case <-buildDoneCh: | ||
case <-time.After(5 * time.Second): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should have a delay here, I think this should block?
Actually, instead of introducing another channel, could we keep the custom context we had before but use context.WithCancelCause
and then use a custom error type to cancel it in the implementation of gwDone
, then we can just detect that error when loading from ctx.Err()
and return a nil
error in that case? I think that might be a neater implementation that doesn't require we go and add another channel.
With the current refactor gwCtx
isn't cancelled after gwDone
is called, so we need to make sure it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why we get different results between the local and remote controller? I'm guessing one gets cancelled, and the other doesn't?
Did you manage to work this out?
With this patch, the behaviour of both of the controllers seem to be the same. On --sbom=false
, they successfully cleanup without "context canceled" error. On --sbom=true
, they cause panic (#1640 (comment)) on buildkitd.
I don't think we should have a delay here, I think this should block?
BuildKit's client.Build()
seems to block indefinitely when buildkitd panics. So timeout is needed if we want to avoid buildx blocking indefinitely when buildkitd panics.
context.WithCancelCause
Fixed to use context.WithCancelCause
. We still need a channel to ensure gwDone
returns after Build()
returns. Otherwise buildx process (running local controller) can exit after completion of gwDone
but before completion of Build()
, which results in BuildKit reporting cancellation error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, this seems better to me, thanks 🎉
That seems a lot better from the buildkit side thanks! I do get some weird errors on the buildx side when using the local controller though - I switch away from the created process from on-error, and then ctrl-d to exit the monitor:
or, sometimes:
|
Heads-up, it looks like these issues are resolved in #1750, by simplifying to only use a single |
Signed-off-by: Kohei Tokunaga <[email protected]>
Rebased onto master to resolve conflicts. |
Fixes #1640 (comment)
This is because buildx exits before
Build()
(invoked in a goroutine) completes and returns.This commit fixes the code to ensure completion of
Build()
on cleanup.cc @jedevc