Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add maximum lifetime constraint for a workspace #8307

Merged
merged 1 commit into from
Feb 22, 2022

Conversation

princerachit
Copy link
Contributor

@princerachit princerachit commented Feb 18, 2022

Description

Add maximum lifetime constraint for a workspace. Timeout the workspace if it breaches the limit. The default value is 36 hours or 2160m

Related Issue(s)

Fixes #7318

How to test

  1. Edit ws-manager configmap and update the maxLifetime value to something short e.g. 3m00s
  2. Restart ws-manager
  3. Start a new workspace
  4. Make some changes in the workspace
  5. The workspace should timeout automatically after 3 mins even when there was a recent activity
  6. Check the status of the ws in the db. It should correctly show the reason of timeout:
status: {"repo": {"branch": "master", "latestCommit": "e81cb46b130e72e96096a91523ed76d00d3263d5", "uncommitedFiles": ["README.md"], "totalUntrackedFiles": 0, "totalUncommitedFiles": 1, "totalUnpushedCommits": 0}, "phase": "stopped", "nodeIp": "10.132.0.41", "message": "", "podName": "ws-10d2368f-69ea-4b10-a7a7-7c6e59c37d6c", "timeout": "60m", "nodeName": "gke-core-dev-workspace-4-adb83b08-505z", "conditions": {"failed": "", "timeout": "workspace timed out after maximum lifetime (00h03m) took longer than 00h03m", "deployed": false, "pullingImages": false, "stoppedByRequest": false, "headlessTaskFailed": ""}, "ownerToken": "RYhpErXpCr6U2MJvkFpUSxWp6ZmqGAXO", "exposedPorts": []}
  1. Start the workspace again from thew dashboard
  2. Verify that the changes persisted

Release Notes

Add max lifetime timeout for a workspace

Documentation

@codecov
Copy link

codecov bot commented Feb 18, 2022

Codecov Report

Merging #8307 (58dc389) into main (26e2777) will decrease coverage by 2.94%.
The diff coverage is 80.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #8307      +/-   ##
==========================================
- Coverage   31.10%   28.15%   -2.95%     
==========================================
  Files          39       44       +5     
  Lines        5913     5689     -224     
==========================================
- Hits         1839     1602     -237     
- Misses       3934     3970      +36     
+ Partials      140      117      -23     
Flag Coverage Δ
components-gitpod-cli-app 11.17% <ø> (ø)
components-installer-raw-app 5.62% <0.00%> (?)
components-local-app-app-darwin-amd64 ?
components-local-app-app-darwin-arm64 ?
components-local-app-app-linux-amd64 ?
components-local-app-app-linux-arm64 ?
components-local-app-app-windows-386 ?
components-local-app-app-windows-amd64 ?
components-local-app-app-windows-arm64 ?
components-supervisor-app ?
components-ws-manager-app 40.63% <100.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...s/installer/pkg/components/ws-manager/configmap.go 29.88% <0.00%> (ø)
components/ws-manager/pkg/manager/monitor.go 9.46% <ø> (ø)
components/ws-manager/pkg/manager/status.go 75.13% <100.00%> (ø)
components/local-app/pkg/auth/pkce.go
components/supervisor/pkg/supervisor/user.go
components/supervisor/pkg/ports/served-ports.go
components/supervisor/pkg/terminal/terminal.go
components/supervisor/pkg/terminal/ring-buffer.go
components/supervisor/pkg/ports/tunnel.go
components/supervisor/pkg/ports/ports.go
... and 40 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 26e2777...58dc389. Read the comment docs.

@princerachit princerachit marked this pull request as ready for review February 18, 2022 15:21
@princerachit princerachit requested a review from a team February 18, 2022 15:21
@github-actions github-actions bot added the team: workspace Issue belongs to the Workspace team label Feb 18, 2022
@sagor999
Copy link
Contributor

/hold

sagor999
sagor999 previously approved these changes Feb 18, 2022
Copy link
Contributor

@sagor999 sagor999 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes look good to me, but added hold so that @csweichel can weigh in as well.

as far as UX goes, when workspace will timeout, do we show to user the reason for this anywhere? Or is it just will show up as stopped? as it might be confusing for those few who have workspaces running longer than 36 hours.
(if we don't have anything right now, @princerachit can you create a new issue for webapp team?)

@princerachit
Copy link
Contributor Author

@sagor999 The reason for workspace timeout is not displayed. We can have this information but impact wise I am not so sure if it will be useful given that very few users actually run workspace for 36 hours.
I will create an issue for this.

Copy link
Contributor

@csweichel csweichel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than a bunch of nits, this looks spot on.

Re showing feedback when a workspace times out: we currently show that the workspace timed out, but not why it did that: https://github.com/gitpod-io/gitpod/blob/main/components/dashboard/src/start/StartWorkspace.tsx#L526-L527

@@ -82,6 +82,7 @@ data:
"headlessWorkspace": "60m",
"initialization": "30m",
"regularWorkspace": "30m",
"maxLifetime": "2160m",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"maxLifetime": "2160m",
"maxLifetime": "36h",

@@ -41,6 +41,7 @@
"headlessWorkspace": "60m",
"initialization": "30m",
"regularWorkspace": "30m",
"maxLifetime": "2160m",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"maxLifetime": "2160m",
"maxLifetime": "36h",

@@ -87,6 +87,7 @@ func forIntegrationTestGetManager(t *testing.T) *Manager {
Initialization: util.Duration(30 * time.Minute),
TotalStartup: util.Duration(45 * time.Minute),
RegularWorkspace: util.Duration(60 * time.Minute),
MaxLifetime: util.Duration(2160 * time.Minute),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MaxLifetime: util.Duration(2160 * time.Minute),
MaxLifetime: util.Duration(36 * time.Hour),

Comment on lines 721 to 723
timeout := m.Config.Timeouts.MaxLifetime
activity := activityMaxLifetime
if msg, err := decide(start, timeout, activity); msg != "" {
return msg, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can be simplified (we don't need the variable assignment here because there's no variance in the decision)

Suggested change
timeout := m.Config.Timeouts.MaxLifetime
activity := activityMaxLifetime
if msg, err := decide(start, timeout, activity); msg != "" {
return msg, err
}
if msg, err := decide(start, m.Config.Timeouts.MaxLifetime, activityMaxLifetime); msg != "" {
return msg, err
}

@@ -56,6 +56,7 @@ func forTestingOnlyManagerConfig() config.Configuration {
Initialization: util.Duration(30 * time.Minute),
TotalStartup: util.Duration(45 * time.Minute),
RegularWorkspace: util.Duration(60 * time.Minute),
MaxLifetime: util.Duration(2160 * time.Minute),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MaxLifetime: util.Duration(2160 * time.Minute),
MaxLifetime: util.Duration(36 * time.Hour),

Copy link
Contributor

@csweichel csweichel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@princerachit
Copy link
Contributor Author

/un-hold

@princerachit
Copy link
Contributor Author

/unhold

@roboquat roboquat merged commit 4d48ccb into main Feb 22, 2022
@roboquat roboquat deleted the princerachit/workspace-instances-7318 branch February 22, 2022 10:55
@roboquat roboquat added deployed: workspace Workspace team change is running in production deployed Change is completely running in production labels Feb 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deployed: workspace Workspace team change is running in production deployed Change is completely running in production release-note size/XL team: workspace Issue belongs to the Workspace team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Workspace instances should have a max lifetime
4 participants