Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ws-manager] config terminationGracePeriodSeconds #13352

Closed
wants to merge 1 commit into from

Conversation

svenefftinge
Copy link
Member

@svenefftinge svenefftinge commented Sep 27, 2022

Description

Allow extending the termination grace period of workspace pods through gitpod.yml configuration.

Related Issue(s)

Fixes #

How to test

Release Notes

Allow configuration of 'terminationGracePeriodSeconds' to get more time for gracefully stopping services in workspaces

Documentation

Werft options:

  • /werft with-local-preview
    If enabled this will build install/preview
  • /werft with-preview
  • /werft with-integration-tests=all
    Valid options are all, workspace, webapp, ide

@werft-gitpod-dev-com
Copy link

started the job as gitpod-build-se-flexible-termination-period.1 because the annotations in the pull request description changed
(with .werft/ from main)

@svenefftinge svenefftinge force-pushed the se/flexible-termination-period branch from 4d83eb9 to 65b686f Compare September 27, 2022 08:46
@roboquat roboquat added size/XL and removed size/S labels Sep 27, 2022
@svenefftinge
Copy link
Member Author

svenefftinge commented Sep 27, 2022

/werft run

👍 started the job as gitpod-build-se-flexible-termination-period.3
(with .werft/ from main)

@svenefftinge svenefftinge marked this pull request as ready for review September 27, 2022 11:23
@svenefftinge svenefftinge requested a review from a team September 27, 2022 11:23
@svenefftinge svenefftinge requested a review from a team September 27, 2022 11:23
@github-actions github-actions bot added team: webapp Issue belongs to the WebApp team team: workspace Issue belongs to the Workspace team labels Sep 27, 2022
Copy link
Member

@easyCZ easyCZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the general release plan for this feature? Are we looking to control access to the termination grace period somehow?

/hold for Qs

@@ -114,6 +114,10 @@
"additionalProperties": false
}
},
"terminationGracePeriodSeconds": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the protocol it's marked as optional, does it need to be marked optional here too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By making this change, all requests will now be able to set this despite it not being actually implemented by workspaces. Is that intended? Do we want to control release of this in some way?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the protocol it's marked as optional, does it need to be marked optional here too?

it is optional by default

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it used to have optional in the v2 version of the spec. in v4 it only has required. Thanks!

@@ -567,6 +567,9 @@ message StartWorkspaceSpec {

// ssh_public_keys is user's uploaded ssh public keys
repeated string ssh_public_keys = 15;

// termination_grace_period_seconds sets the TerminationGracePeriodSeconds, defaults to 30 seconds
int64 termination_grace_period_seconds = 16;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth keeping in mind that ws-manager running in WS clusters will have a different rollout cadence to webapp. This change is backwards compatible so it shouldn't be a system problem, just an expectation.

@svenefftinge
Copy link
Member Author

btw. the bash process we are starting doesn't seem to propagate SIGINT to the process it is waiting for.

@csweichel
Copy link
Contributor

We have quite a few places where we pass in a termination grace period, e.g. all call sites of stopWorkspace. It's not clear which value (pod or arg) gets precedence.

@sagor999
Copy link
Contributor

@csweichel
https://github.com/kubernetes/apiserver/blob/710ad8da8e0ae8c5dcb4a12744d231c26c7e9905/pkg/registry/rest/delete.go#L123-L128
According to this only shorter grace period can be provided into delete. Which means our stopWorkspace call (and any other place where we do this) can effectively be a maximum grace period? 🤔
though it changes current behaviour quite a bit.

@svenefftinge svenefftinge force-pushed the se/flexible-termination-period branch from 65b686f to 616fb25 Compare September 28, 2022 06:36
@svenefftinge svenefftinge requested a review from a team September 28, 2022 06:36
@roboquat roboquat added size/XXL and removed size/XL labels Sep 28, 2022
@svenefftinge svenefftinge force-pushed the se/flexible-termination-period branch 2 times, most recently from 55caad1 to 0e9d536 Compare September 28, 2022 09:05
@csweichel
Copy link
Contributor

csweichel commented Sep 28, 2022

@csweichel https://github.com/kubernetes/apiserver/blob/710ad8da8e0ae8c5dcb4a12744d231c26c7e9905/pkg/registry/rest/delete.go#L123-L128 According to this only shorter grace period can be provided into delete. Which means our stopWorkspace call (and any other place where we do this) can effectively be a maximum grace period? 🤔 though it changes current behaviour quite a bit.

Which indeed means that the value we use for stopWorkspace would take precedence, because it's either the default 30s or less than that.

We could either (in descending order of implementation complexity)

  1. use the actual termination grace period of the pod,
  2. use the configurable max termination grace period,
  3. just use a very long constant time (999 * time.Hour).

@svenefftinge
Copy link
Member Author

svenefftinge commented Sep 28, 2022

I've set it to the same hard-coded max (5 minutes). (see)

@svenefftinge svenefftinge force-pushed the se/flexible-termination-period branch 2 times, most recently from 377dfde to 222dbfb Compare September 29, 2022 13:33
@werft-gitpod-dev-com
Copy link

started the job as gitpod-build-se-flexible-termination-period.9 because the annotations in the pull request description changed
(with .werft/ from main)

@sagor999
Copy link
Contributor

/hold
for conflicts and failed build

@svenefftinge svenefftinge force-pushed the se/flexible-termination-period branch 3 times, most recently from e5b13f1 to 5939f9c Compare October 7, 2022 14:59
@svenefftinge svenefftinge force-pushed the se/flexible-termination-period branch from 5939f9c to 36923c6 Compare October 7, 2022 15:23
@svenefftinge
Copy link
Member Author

svenefftinge commented Oct 7, 2022

/werft run with-clean-slate-deployment

👍 started the job as gitpod-build-se-flexible-termination-period.14
(with .werft/ from main)

@geropl geropl marked this pull request as draft October 13, 2022 07:41
@stale
Copy link

stale bot commented Oct 28, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the meta: stale This issue/PR is stale and will be closed soon label Oct 28, 2022
@stale stale bot closed this Nov 9, 2022
@svenefftinge svenefftinge deleted the se/flexible-termination-period branch December 9, 2022 08:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/hold do-not-merge/work-in-progress meta: stale This issue/PR is stale and will be closed soon release-note size/XXL team: IDE team: webapp Issue belongs to the WebApp team team: workspace Issue belongs to the Workspace team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants