[ws-manager] Wait for workspace pod to be ready #8289

Merged (2 commits) on Feb 18, 2022

sagor999 (Contributor) commented Feb 17, 2022

Description

Wait for the workspace pod's container to start up, so that out-of-memory (OOM) and other startup errors are caught and handled.

Related Issue(s)

Fixes #8253

How to test

Spin up a new cluster in the workspace preview environment:
./new-vm.sh -v pavel-oom-fix.6 -z us-west1-c

Start a workspace. It should start without issues.
Now cordon the node (since we cannot simulate an OOM error reliably).
Try to start a workspace. Observe that the pod is created and stays pending. After 5 seconds, ws-manager deletes the pending pod (since it is now considered failed) and creates a new one.
Uncordon the node.
Observe that the pod is created. You should now be able to load into your workspace.

Caveat: if you leave the node cordoned for longer than 30 seconds, the startWorkspace context gets cancelled and the server attempts to create the workspace again.
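The behaviour exercised by these steps can be sketched roughly as follows. This is a minimal illustration, not the code in manager.go: the `podClient` interface, `startPodWithRetry`, `waitForRunning`, `fakeClient` and the two demo helpers are hypothetical names, and the fake client stands in for the real Kubernetes API. The demos use millisecond timeouts in place of the 5 second wait and the 30 second startWorkspace context mentioned above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// podClient is a hypothetical stand-in for the Kubernetes calls involved.
type podClient interface {
	Create(ctx context.Context, name string) error
	Delete(ctx context.Context, name string) error
	IsRunning(ctx context.Context, name string) (bool, error)
}

// startPodWithRetry creates a pod and waits up to readyTimeout for it to run.
// A pod that is still pending after readyTimeout is treated as failed: it is
// deleted and created again, until ctx is cancelled.
func startPodWithRetry(ctx context.Context, c podClient, name string, readyTimeout, poll time.Duration) (int, error) {
	attempts := 0
	for {
		if err := ctx.Err(); err != nil {
			return attempts, err
		}
		attempts++
		if err := c.Create(ctx, name); err != nil {
			return attempts, fmt.Errorf("create pod: %w", err)
		}
		ready, err := waitForRunning(ctx, c, name, readyTimeout, poll)
		if err != nil {
			return attempts, err
		}
		if ready {
			return attempts, nil
		}
		if err := c.Delete(ctx, name); err != nil {
			return attempts, fmt.Errorf("delete pending pod: %w", err)
		}
	}
}

// waitForRunning polls until the pod runs (true), timeout elapses (false),
// or ctx is done (error).
func waitForRunning(ctx context.Context, c podClient, name string, timeout, poll time.Duration) (bool, error) {
	deadline := time.NewTimer(timeout)
	defer deadline.Stop()
	tick := time.NewTicker(poll)
	defer tick.Stop()
	for {
		running, err := c.IsRunning(ctx, name)
		if err != nil {
			return false, err
		}
		if running {
			return true, nil
		}
		select {
		case <-ctx.Done():
			return false, ctx.Err()
		case <-deadline.C:
			return false, nil
		case <-tick.C:
		}
	}
}

// fakeClient simulates a cordoned node: pods stay pending until the create
// attempt numbered schedulableFrom (0 means the node is never uncordoned).
type fakeClient struct {
	creates, deletes, schedulableFrom int
}

func (f *fakeClient) Create(ctx context.Context, name string) error { f.creates++; return nil }
func (f *fakeClient) Delete(ctx context.Context, name string) error { f.deletes++; return nil }
func (f *fakeClient) IsRunning(ctx context.Context, name string) (bool, error) {
	return f.schedulableFrom > 0 && f.creates >= f.schedulableFrom, nil
}

// demoCordonedThenUncordoned: the first two pods stay pending, the third runs.
func demoCordonedThenUncordoned() (attempts, deletes int, err error) {
	c := &fakeClient{schedulableFrom: 3}
	attempts, err = startPodWithRetry(context.Background(), c, "ws-demo", 20*time.Millisecond, 5*time.Millisecond)
	return attempts, c.deletes, err
}

// demoContextCancelled: the node stays cordoned until the outer context expires.
func demoContextCancelled() bool {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	_, err := startPodWithRetry(ctx, &fakeClient{}, "ws-demo", 20*time.Millisecond, 5*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}
```

Calling `demoCordonedThenUncordoned()` from a `main` function or a test models the cordon and uncordon steps, while `demoContextCancelled()` models the caveat: once the outer context expires, the retry loop stops and the caller has to start over.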

Release Notes

Improve handling of "Out of Memory" error when starting up workspaces

Documentation

codecov bot commented Feb 17, 2022

Codecov Report

Merging #8289 (b866d4a) into main (6497329) will increase coverage by 2.28%.
The diff coverage is 0.00%.

❗ Current head b866d4a differs from pull request most recent head 4bb5490. Consider uploading reports for the commit 4bb5490 to get more accurate results

@@            Coverage Diff             @@
##             main    #8289      +/-   ##
==========================================
+ Coverage   31.23%   33.52%   +2.28%     
==========================================
  Files          39       31       -8     
  Lines        5910     4567    -1343     
==========================================
- Hits         1846     1531     -315     
+ Misses       3923     2920    -1003     
+ Partials      141      116      -25     
| Flag | Coverage Δ |
| --- | --- |
| components-gitpod-cli-app | 11.17% <ø> (ø) |
| components-local-app-app-darwin-amd64 | ? |
| components-local-app-app-darwin-arm64 | ? |
| components-local-app-app-linux-amd64 | ? |
| components-local-app-app-linux-arm64 | ? |
| components-local-app-app-windows-386 | ? |
| components-local-app-app-windows-amd64 | ? |
| components-local-app-app-windows-arm64 | ? |
| components-supervisor-app | ? |
| components-ws-manager-app | 39.73% <0.00%> (?) |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
| --- | --- |
| components/ws-manager/pkg/manager/manager.go | 20.81% <0.00%> (ø) |
| components/supervisor/pkg/terminal/ring-buffer.go | |
| components/supervisor/pkg/supervisor/ssh.go | |
| components/local-app/pkg/auth/pkce.go | |
| components/supervisor/pkg/dropwriter/dropwriter.go | |
| components/supervisor/pkg/config/gitpod-config.go | |
| components/local-app/pkg/auth/auth.go | |
| components/supervisor/pkg/supervisor/config.go | |
| components/supervisor/pkg/terminal/terminal.go | |
| components/supervisor/pkg/supervisor/user.go | |

... and 25 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6497329...4bb5490.

sagor999 marked this pull request as ready for review February 17, 2022 22:00
sagor999 requested a review from a team February 17, 2022 22:00
github-actions bot added the "team: workspace" label Feb 17, 2022
sagor999 marked this pull request as ready for review February 18, 2022 03:54
sagor999 requested a review from a team February 18, 2022 04:19
princerachit (Contributor) commented:

/hold

princerachit (Contributor) left a comment

Code changes look good to me. I have not tested this. Putting this on hold to avoid auto-merge. Pavel, feel free to merge this when you are ready.

sagor999 (Contributor, Author) commented:

/unhold

roboquat added the "deployed: workspace" and "deployed" labels Feb 18, 2022
Labels: deployed: workspace, deployed, release-note, size/L, team: workspace
Development

Successfully merging this pull request may close these issues.

Starting new workspace causes it to fail with OOM error from kubelet
6 participants