Remove stargz from builder bob #9492

Merged 1 commit into main from fo/remove-stargz on May 20, 2022

Conversation

@Furisto
Member

Furisto commented Apr 22, 2022

Description

Remove stargz from builder bob

Related Issue(s)

n.a.

How to test

n.a.

Release Notes

Remove stargz snapshotter from image build

Documentation

@csweichel
Contributor

Can we move this along? We're still seeing stuck image builds in prod.

cc @Furisto @aledbf

@aledbf
Member

aledbf commented Apr 26, 2022

@csweichel I am not sure this will help. I prefer to finish the image builder transition to workspace clusters

@csweichel
Contributor

> I prefer to finish the image builder transition to workspace clusters

Same here - that's still a bit out though.
@Furisto if I understood you correctly, a possible cause of this issue is the use of FUSE when building the image, no? If so, removing it now and reverting later seems like a low-effort way to sort this out.

@aledbf
Member

aledbf commented Apr 26, 2022

Can we reproduce the issue in a workspace cluster?

@Furisto
Member Author

Furisto commented Apr 26, 2022

> Can we reproduce the issue in a workspace cluster?

Checking now...

@Furisto
Member Author

Furisto commented Apr 26, 2022

/werft run

👍 started the job as gitpod-build-fo-remove-stargz.1
(with .werft/ from main)

@aledbf
Member

aledbf commented Apr 26, 2022

@csweichel can we update the meta clusters without merging this change?

@csweichel
Contributor

> @csweichel can we update the meta clusters without merging this change?

Good call.
@geropl wdyt?

@csweichel
Contributor

it would be a change to the registry-facade configmap which automatically reloads. No service restart is required.

@kylos101
Contributor

> it would be a change to the registry-facade configmap which automatically reloads. No service restart is required.

Dear @csweichel, first, thank you so much for your suggestion and guidance with this. 💯

I looked at registry-facade, and could not find a reference to image-builder-bob. Do you mean the image-builder-mk3 configmap, and its builderImage property?

I assume yes, but wanted to double-check. It looks like this: "builderImage": "eu.gcr.io/gitpod-core-dev/build/image-builder-mk3/bob:5ba78ec619601c7c08f485a0d4db3d9303bc3444"

@geropl wdyt? This seems like a nice way to solve the problem where image builds are lingering in a Terminating status. Worst case, we'd have to do a rolling restart of image-builder-mk3, if image-builder-mk3 lacks hot reload after a configmap change.

Stuck builds in EU:

imagebuild-0c027fc6-5b8a-40fd-b669-534a62e3c746   0/1     Terminating   0          8h
imagebuild-34b06ba9-54bb-46e3-9bb6-4e891b5d0b8a   0/1     Terminating   0          11h
imagebuild-984b3310-11f2-4d10-94a5-4c128c854574   0/1     Terminating   0          8h
imagebuild-99166f14-7a0e-4a7b-8b36-85d36597237e   0/1     Terminating   0          7h13m
imagebuild-9a542d54-66b5-4f8b-baa9-3aabd480c7bf   0/1     Terminating   0          18h
imagebuild-dba39041-f3ed-4673-8dd2-8ecd375c0eb7   0/1     Terminating   0          17h
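
For anyone triaging a listing like the one above, here is a minimal Python sketch (not part of this PR) that picks out pods stuck in Terminating beyond a given age. It assumes the five-column layout shown above (name, ready, status, restarts, age); the function names and the threshold are illustrative, not existing tooling.

```python
import re

# The listing posted above, copied verbatim.
LISTING = """
imagebuild-0c027fc6-5b8a-40fd-b669-534a62e3c746   0/1     Terminating   0          8h
imagebuild-34b06ba9-54bb-46e3-9bb6-4e891b5d0b8a   0/1     Terminating   0          11h
imagebuild-984b3310-11f2-4d10-94a5-4c128c854574   0/1     Terminating   0          8h
imagebuild-99166f14-7a0e-4a7b-8b36-85d36597237e   0/1     Terminating   0          7h13m
imagebuild-9a542d54-66b5-4f8b-baa9-3aabd480c7bf   0/1     Terminating   0          18h
imagebuild-dba39041-f3ed-4673-8dd2-8ecd375c0eb7   0/1     Terminating   0          17h
"""


def parse_age_hours(age):
    """Convert an age string such as '7h13m' or '18h' into hours."""
    hours_per_unit = {"d": 24.0, "h": 1.0, "m": 1.0 / 60, "s": 1.0 / 3600}
    parts = re.findall(r"(\d+)([dhms])", age)
    return sum(int(number) * hours_per_unit[unit] for number, unit in parts)


def stuck_pods(listing, min_hours=1.0):
    """Return (name, age) for pods in Terminating status for at least min_hours."""
    stuck = []
    for line in listing.strip().splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip header rows or malformed lines
        name, _ready, status, _restarts, age = fields
        if status == "Terminating" and parse_age_hours(age) >= min_hours:
            stuck.append((name, age))
    return stuck


if __name__ == "__main__":
    for name, age in stuck_pods(LISTING, min_hours=10):
        print(f"{name} stuck for {age}")
```

Raising `min_hours` narrows the result to the longest-running offenders, which may help decide which pods to inspect first.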

@csweichel
Contributor

csweichel commented Apr 27, 2022

> I looked at registry-facade, and could not find a reference to image-builder-bob. Do you mean the image-builder-mk3 configmap, and its builderImage property?

🤦
My bad - it is in fact the image-builder config where we choose this value, not registry-facade.
Changes to that config require an image-builder restart.
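
To illustrate the kind of config change being discussed, here is a rough Python sketch that swaps the tag of the builderImage value. It assumes the image-builder config is JSON with a top-level "builderImage" key and that the image reference ends in ":tag"; only the key name and the image string come from this thread, while the function name, the config layout, and the example tag are hypothetical.

```python
import json


def with_builder_tag(config_text, new_tag):
    """Return the config JSON with the tag of builderImage replaced by new_tag."""
    config = json.loads(config_text)
    # Split at the last colon so the registry and repository path stay intact.
    repository, _, _old_tag = config["builderImage"].rpartition(":")
    config["builderImage"] = f"{repository}:{new_tag}"
    return json.dumps(config, indent=2)


# Only the builderImage value is taken from this thread; the layout is assumed.
EXAMPLE_CONFIG = json.dumps({
    "builderImage": "eu.gcr.io/gitpod-core-dev/build/image-builder-mk3/bob:5ba78ec619601c7c08f485a0d4db3d9303bc3444"
})

updated = json.loads(with_builder_tag(EXAMPLE_CONFIG, "example-tag"))

if __name__ == "__main__":
    print(updated["builderImage"])
```

Editing the config alone is not enough: as noted above, the new value only takes effect after an image-builder restart.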

@geropl
Member

geropl commented Apr 27, 2022

> @csweichel can we update the meta clusters without merging this change?
>
> Good call.
> @geropl wdyt?

@csweichel Sure, we can update right away.

It feels like I'm missing some context, though: Since when do we see stuck image builds? Yesterday's deployment? 🤔

@csweichel
Contributor

> Since when do we see stuck image builds?

Not entirely sure - but since Friday last week for sure

@geropl
Member

geropl commented Apr 27, 2022

> Not entirely sure - but since Friday last week for sure

👍

@csweichel @kylos101 Going to bump bob to eu.gcr.io/gitpod-core-dev/build/image-builder-mk3/bob:9c0c6ae5a34dab7507fa94ab357a004868111fdc, taken from the last green build on this branch.

Update: Internal link for x-ref: https://gitpod.slack.com/archives/C021LT6GUJG/p1651053749792269

@aledbf
Member

aledbf commented May 3, 2022

This change is not enough. The daemon still uses --oci-worker-snapshotter=stargz (the source of the error)
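
As a rough illustration of what dropping that flag from the daemon arguments means, here is a small Python sketch. It is not the change made in this PR: the helper name and the surrounding argument list are hypothetical, and only `--oci-worker-snapshotter=stargz` comes from this thread.

```python
def drop_snapshotter_flag(args, flag="--oci-worker-snapshotter"):
    """Return a copy of args without the flag, in '--flag=value' or '--flag value' form."""
    cleaned = []
    skip_next = False
    for arg in args:
        if skip_next:
            # This is the value that followed a bare '--flag'.
            skip_next = False
            continue
        if arg.startswith(flag + "="):
            continue
        if arg == flag:
            skip_next = True
            continue
        cleaned.append(arg)
    return cleaned


# Hypothetical daemon invocation; only the snapshotter flag is from the thread.
EXAMPLE_ARGS = [
    "buildkitd",
    "--oci-worker-snapshotter=stargz",
    "--addr=unix:///run/buildkit/buildkitd.sock",
]

if __name__ == "__main__":
    print(drop_snapshotter_flag(EXAMPLE_ARGS))
```

With the flag removed, the daemon would fall back to whatever snapshotter it selects by default rather than stargz.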

@AlexTugarev
Member

@Furisto, what's the timeline on this fix? I'm asking because we unintentionally undid the workaround with the recent WebApp deployment.

@Furisto force-pushed the fo/remove-stargz branch from 9c0c6ae to b609249 on May 4, 2022 15:03

@Furisto
Member Author

Furisto commented May 4, 2022

@aledbf I have removed the stargz snapshotter from the daemon args as well.

@Furisto marked this pull request as ready for review on May 4, 2022 15:10
@Furisto requested a review from a team on May 4, 2022 15:10
@github-actions bot added the "team: workspace" label (Issue belongs to the Workspace team) on May 4, 2022

@aledbf
Member

aledbf commented May 4, 2022

@Furisto I would still apply this change only in the meta clusters.

We need to review this after the migration of image builder to the workspace cluster.

@geropl
Member

geropl commented May 4, 2022

@aledbf @Furisto Thank you both! So we're going to apply the bob image, as we did last time, to both prod app clusters, right?

Edit: waiting for this build

@aledbf
Member

aledbf commented May 4, 2022

@geropl yes please 🙏

@Furisto marked this pull request as draft on May 4, 2022 15:13

@csweichel
Contributor

csweichel commented May 20, 2022

/werft run no-preview

👍 started the job as gitpod-build-fo-remove-stargz.4
(with .werft/ from main)

@csweichel marked this pull request as ready for review on May 20, 2022 09:34
@roboquat merged commit 20ae141 into main on May 20, 2022
@roboquat deleted the fo/remove-stargz branch on May 20, 2022 09:37
@roboquat added the "deployed: workspace" (Workspace team change is running in production) and "deployed" (Change is completely running in production) labels on May 25, 2022
Labels
deployed: workspace (Workspace team change is running in production), deployed (Change is completely running in production), release-note, size/XS, team: workspace (Issue belongs to the Workspace team)

7 participants