Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[server] Add and use supervisor image config #6669

Merged
merged 1 commit into from
Nov 18, 2021

Conversation

corneliusludmann
Copy link
Contributor

@corneliusludmann corneliusludmann commented Nov 11, 2021

Description

This PR adds the supervisor image to the server-ide-configmap and passes this value to the registry facade. That allows to change the supervisor image just by altering the config and thus allows to deploy a new supervisor image by the IDE team on its own.

Related Issue(s)

Fixes #6512

This aspect is still not solved: #6512 (comment). That is why this PR does not close the issue yet.

How to test

  • Start a workspace and see in the logs the supervisor version. Log filter:
    resource.labels.namespace_name="staging-clu-server-supervisor-config"
    jsonPayload.serviceContext.service="supervisor"
    
    jsonPayload.serviceContext.version has the version.
  • Change the supervisor version in server-ide-config configmap, e.g. use version commit-a7166daa720e8c05c1f200257f567646d427b79b. No server restart necessary.
  • Start another workspace and verify that the version in the logs has been changed.

Note: The version in the configmap can differ from what it is in the logs because images can have multiple tags. E.g. when you have commit-93104671b75c9f81b2cc8b5a542dc8680e36928f in the configmap, it's logged as version commit-20e06321092f0e658622332880889ddda2b77f59 but you can see in the registry that both are valid tags for the same image.

Release Notes

NONE

@roboquat roboquat added release-note-none team: webapp Issue belongs to the WebApp team labels Nov 11, 2021
@codecov
Copy link

codecov bot commented Nov 11, 2021

Codecov Report

Merging #6669 (9310467) into main (a7166da) will decrease coverage by 26.73%.
The diff coverage is n/a.

❗ Current head 9310467 differs from pull request most recent head 259a3a1. Consider uploading reports for the commit 259a3a1 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##             main   #6669       +/-   ##
==========================================
- Coverage   32.74%   6.00%   -26.74%     
==========================================
  Files         122      12      -110     
  Lines       22193    1116    -21077     
==========================================
- Hits         7266      67     -7199     
+ Misses      14309    1048    -13261     
+ Partials      618       1      -617     
Flag Coverage Δ
components-blobserve-app ?
components-content-service-app ?
components-content-service-lib ?
components-ee-agent-smith-app ?
components-ee-agent-smith-lib ?
components-ee-kedge-app ?
components-ee-ws-scheduler-app ?
components-image-builder-app ?
components-image-builder-mk3-app ?
components-local-app-app-darwin-amd64 ?
components-local-app-app-darwin-arm64 ?
components-local-app-app-linux-amd64 ?
components-local-app-app-linux-arm64 ?
components-local-app-app-windows-386 ?
components-local-app-app-windows-amd64 ?
components-local-app-app-windows-arm64 ?
components-registry-facade-app ?
components-service-waiter-app ?
components-supervisor-app ?
components-workspacekit-app ?
components-ws-daemon-app ?
components-ws-daemon-lib ?
components-ws-manager-app ?
components-ws-proxy-app ?
components-ws-proxy-lib ?
dev-poolkeeper-app ?
installer-raw-app 6.00% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
components/ws-manager/pkg/clock/clock.go
...ts/ee/ws-scheduler/pkg/scheduler/priority-queue.go
components/ws-proxy/pkg/proxy/auth.go
components/supervisor/pkg/ports/tunnel.go
components/supervisor/pkg/terminal/ring-buffer.go
components/ee/agent-smith/pkg/agent/metrics.go
components/ws-manager/pkg/manager/manager_ee.go
...omponents/registry-facade/pkg/registry/registry.go
components/ws-manager/pkg/manager/metrics.go
...mponents/ee/ws-scheduler/pkg/scheduler/strategy.go
... and 100 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a7166da...259a3a1. Read the comment docs.

@csweichel
Copy link
Contributor

@corneliusludmann We haven't really introduced the installer to the teams yet, but it would probably make sense to update that part accordingly.

@corneliusludmann
Copy link
Contributor Author

@corneliusludmann We haven't really introduced the installer to the teams yet, but it would probably make sense to update that part accordingly.

Thanks for the very useful hint! If you don't object, I would like to handle this in a separate PR that also updates other missing changes in the installer: #6700

@corneliusludmann corneliusludmann force-pushed the clu/server-supervisor-config branch from 9310467 to 259a3a1 Compare November 15, 2021 08:56
@JanKoehnlein
Copy link
Contributor

Hmm, maybe I got it wrong, but it always reports

serviceContext: {
service: "supervisor"
version: "commit-20e06321092f0e658622332880889ddda2b77f59"
}

here, no matter what I put in as supervisorImage

@corneliusludmann
Copy link
Contributor Author

corneliusludmann commented Nov 15, 2021

Thanks for carefully testing, @JanKoehnlein! It took me what felt like an eternity of debugging until I figured out what's wrong here. It seems the code is correct but the “How To Test” is a bit tricky.

According to the logs, the server config has been reloaded at the time you tested this PR a couple of times but the supervisor image was always commit-93104671b75c9f81b2cc8b5a542dc8680e36928f. Only the last time, the version has been changed to commit-c55bdc265f2e999994668087283e7900a6daa7e5. Don't know how many versions you tried. Maybe you were too fast and the server hasn't reloaded the version until you tried the next version. You can see when the server reads a new version with this logs search query:

resource.labels.namespace_name="staging-clu-server-supervisor-config"
labels.k8s-pod/component="server"
"ide config updated"

And now comes the tricky part. Version commit-c55bdc265f2e999994668087283e7900a6daa7e5 as well as commit-93104671b75c9f81b2cc8b5a542dc8680e36928f of supervisor logs the same version commit-20e06321092f0e658622332880889ddda2b77f59. The reason is, that the build that has been built the new version commit-c55bdc265f2e999994668087283e7900a6daa7e5 has changes in the supervisor Dockerfile only and the Docker build used the cached version of the supervisor app. That's why the 2 different supervisor Docker images use the same supervisor app binary with the same version. However, when you use a version like main.1830 or commit-a7166daa720e8c05c1f200257f567646d427b79b you get actually a different version.

TL;DR:

Using a different Docker version tag does not necessarily mean that you get another supervisor app binary with a different logged version. However, use these 2 tags and you should get different versions logged:

commit-20e06321092f0e658622332880889ddda2b77f59
commit-a7166daa720e8c05c1f200257f567646d427b79b

@akosyakov
Copy link
Member

akosyakov commented Nov 18, 2021

@gitpod-io/engineering-meta Could someone finish the review please? 🙏

Copy link
Member

@geropl geropl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot test myself bc of cert issues but:

  • I see others have tested before
  • I understand the mechanisms at play thx to @corneliusludmann nice explanation
  • code LGTM
    👍

@roboquat
Copy link
Contributor

LGTM label has been added.

Git tree hash: 29f1dcf44fb9215a062737f59983525ac37cb839

@roboquat
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: geropl

Associated issue: #6512

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@roboquat roboquat merged commit 44d50bc into main Nov 18, 2021
@roboquat roboquat deleted the clu/server-supervisor-config branch November 18, 2021 09:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

deploy supervisor image via server ide config
6 participants