[ws-daemon] Refactor unmount #5240

Merged 1 commit on Aug 25, 2021

aledbf (Member) commented Aug 17, 2021

Issue:

  • A side effect of ws-daemon restarts is that workspaces (pods) stay in the Terminating state indefinitely.

Changes:

  • Ensure pending connections are closed on exit.
  • Move Teardown from ring0 to ring1 (ring0 is too late for the IWS socket).
  • Ensure finalizeWorkspaceContent runs without the Containerd4214 workaround.
  • Preserve the manual backup feature (using the gitpod.io/containerIsGone annotation).
  • Ensure the mark is not mounted before trying to remove workspace content on disk, so that the pod can terminate.

Scenarios:

Scenario 1 (manual stop):

  • start workspace
  • wait until finish loading
  • change a file
  • stop workspace
  • check the workspace is terminated and a backup is created
  • workspace pod should be removed
  • open workspace and check the change is present

Scenario 2 (timeout):

  • start workspace
  • close window
  • wait until timeout
  • check the workspace is terminated and a backup is created
  • workspace pod should be removed
  • open workspace and check the change is present

Scenario 3 (ws-daemon killed, manual stop):

  • start workspace
  • kill ws-daemon pod in the same node
  • stop the workspace (it will stay closing...)
  • check the workspace is terminated and a backup is created
  • workspace pod should be removed
  • open workspace and check the change is present

Scenario 4 (ws-daemon killed, timeout):

  • start workspace
  • kill ws-daemon pod in the same node
  • close window
  • wait until timeout
  • check the workspace is terminated and a backup is created
  • workspace pod should be removed
  • open workspace and check the change is present

Scenario 5 (socket activation):

  • start workspace
  • wait until finish loading
  • start a container: docker run alpine (to trigger the socket activation)
  • stop workspace
  • check the workspace is terminated and a backup is created
  • workspace pod should be removed
  • open workspace and check the change is present

Scenario 6 (lift socket):

  • start workspace
  • wait until finish loading
  • run workspacekit lift bash (to use lift socket)
  • stop workspace
  • check the workspace is terminated and a backup is created
  • workspace pod should be removed
  • open workspace and check the change is present

  • /werft with-clean-slate-deployment

roboquat requested a review from csweichel, August 17, 2021 14:02

codecov bot commented Aug 17, 2021

Codecov Report

Merging #5240 (87ad830) into main (2d66afc) will increase coverage by 22.98%.
The diff coverage is 0.00%.

❗ Current head 87ad830 differs from pull request most recent head bdb9c33. Consider uploading reports for the commit bdb9c33 to get more accurate results

@@            Coverage Diff            @@
##           main    #5240       +/-   ##
=========================================
+ Coverage      0   22.98%   +22.98%     
=========================================
  Files         0       11       +11     
  Lines         0     1945     +1945     
=========================================
+ Hits          0      447      +447     
- Misses        0     1439     +1439     
- Partials      0       59       +59     
Flag                       Coverage Δ
components-ws-daemon-app   22.98% <0.00%> (?)

Impacted Files                                        Coverage Δ
components/ws-daemon/pkg/content/service.go           0.00% <0.00%> (ø)
components/ws-daemon/pkg/quota/size.go                87.30% <0.00%> (ø)
components/ws-daemon/pkg/resources/dispatch.go        0.00% <0.00%> (ø)
components/ws-daemon/pkg/content/tar.go               46.71% <0.00%> (ø)
components/ws-daemon/pkg/content/initializer.go       0.00% <0.00%> (ø)
components/ws-daemon/pkg/resources/limiter.go         77.77% <0.00%> (ø)
components/ws-daemon/pkg/resources/controller.go      33.69% <0.00%> (ø)
components/ws-daemon/pkg/content/config.go            62.50% <0.00%> (ø)
components/ws-daemon/pkg/internal/session/store.go    19.38% <0.00%> (ø)
components/ws-daemon/pkg/content/archive.go           58.88% <0.00%> (ø)
... and 2 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2d66afc...bdb9c33.

geropl (Member) commented Aug 23, 2021

@aledbf I finally managed to test on devstaging (without the "monitor.go" changes): Sadly we still see the underlying "unexpected waiting for container to stop" (link) 😢

Would be awesome to discuss in sync again.

aledbf (Member, Author) commented Aug 23, 2021

@geropl did you sync with main and this PR? (I created several PRs from the initial content)

geropl (Member) commented Aug 23, 2021

> @geropl did you sync with main and this PR? (I created several PRs from the initial content)

Yes, I branched from this PR this morning, and I think I have all relevant changes.

aledbf (Member, Author) commented Aug 23, 2021

/hold

aledbf (Member, Author) commented Aug 23, 2021

@csweichel @geropl I moved the unmountMark as we agreed, but the behavior is not the same (it doesn't work).

roboquat added size/M and removed size/L labels, Aug 24, 2021
aledbf force-pushed the aledbf/unmount branch 4 times, most recently from 87ad830 to 8414dfd, August 25, 2021 10:53

aledbf (Member, Author) commented Aug 25, 2021

/hold cancel

csweichel (Contributor) commented

/lgtm

roboquat (Contributor) commented

LGTM label has been added.

Git tree hash: 59290d9c32d2b90591d3188a1882b37f94845a2d

csweichel (Contributor) commented

/approve no-issue

roboquat (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: csweichel, JanKoehnlein

Associated issue requirement bypassed by: csweichel

roboquat merged commit a1da634 into main, Aug 25, 2021
roboquat deleted the aledbf/unmount branch, August 25, 2021 14:09