Workspaces are stuck in stopping #4156

meysholdt · 2021-05-05T19:13:01Z

On node gke-gp-ws-us04-us-west-workspace-pool-7059fb33-x2dc,
a number of pods show up in crictl pods but not in kubectl get pod:

5c92554172121       3 days ago           Ready               ws-313c6102-d2c5-403c-9343-e71b1e369157                          default              0
37025af212bef       3 days ago           Ready               ws-622dd973-f2d7-481c-be56-4db13c406ebf                          default              0
fc47eee06f890       3 days ago           Ready               ws-39b8a620-e1e2-49c7-99d3-9742d4589eb8                          default              0
46b3f9d45734e       3 days ago           Ready               ws-808eb049-8fbb-4e9e-a4fd-890c74c1aa12                          default              0
e0c80d524714d       3 days ago           Ready               ws-92f56389-afdf-4301-8d0b-687f1e16d02c                          default              0
72480bd3fd443       3 days ago           Ready               ws-f8ab866d-e726-4203-9581-0dd53d7fd0db                          default              0
d925886f7b3e8       3 days ago           Ready               ws-e6dda1c5-602a-4c3e-ab0e-88b708e89c8b                          default              0
67a6d69e78b00       4 days ago           Ready               ws-e9abe1e6-9b89-49bc-bbed-17c5498ffe33                          default              0
ca53cc59e67bd       4 days ago           Ready               ws-57da3955-f3a0-4b1d-997e-49ceed8b52ff                          default              0
1cb060ab3ef1b       4 days ago           Ready               ws-ab8458d9-44b5-47ef-8c40-ab8fd6bf8c70                          default              0
b2c80e77bf67c       4 days ago           Ready               ws-501d406d-75ed-4106-8509-7905da795b0f                          default              0
382ed4f28c4b1       5 days ago           Ready               ws-ec751702-f1e7-420b-86f9-216a0c4f5e03                          default              0
e071e1321684f       5 days ago           Ready               ws-16a8b763-843f-4cb8-8932-45f59cd25de4                          default              0
f6a98e2372cfa       5 days ago           Ready               ws-e87f9da6-587f-4e18-bb34-d7c709150e07                          default              0
ee0068f2ed4d4       5 days ago           Ready               ws-12085801-eb5e-468b-a99f-9a6647f167e6                          default              0
bf118653e09b6       5 days ago           Ready               ws-3281a471-50be-4e00-bda8-fc7c838cf236                          default              0
ffcf942e2877e       5 days ago           Ready               ws-e99352a2-2634-47f2-a323-97092ea92c71                          default              0
074d28d4c0c5f       5 days ago           Ready               ws-7d495d6e-b814-4f5e-a600-9f58cc291fa3                          default              0
2c4b1c073c8b1       5 days ago           Ready               ws-17233da7-8efb-403e-a770-fe0fc0364a51                          default              0
c6b670fed2222       5 days ago           Ready               ws-427cdd1f-d98a-4a29-baa0-2274b804bbdf                          default              0
5b552e3347d10       5 days ago           Ready               ws-184177be-7b29-456e-89a2-f2541cab74aa                          default              0
f8f139b8e4b61       5 days ago           Ready               ws-767048dc-dfe5-4abd-a729-3298f983bcbc                          default              0
443ae56d7ffd6       5 days ago           Ready               ws-819cf210-dc50-4c76-8e8c-6d0f4283fa24                          default              0
39ba9f092067f       5 days ago           Ready               ws-6d9f0d64-ab0d-4ea5-83b3-2399d1607d30                          default              0
836505b9b07e0       5 days ago           Ready               ws-e4d94bc7-f33a-4f84-82ea-6ba1ddea36f3                          default              0
51ea29b158fac       5 days ago           Ready               ws-a4eb26f0-84b6-4446-b122-486681c428ed                          default              0
bd7656affd9e5       5 days ago           Ready               ws-2356a2c9-fae7-4561-bb14-466e8e370f43                          default              0
078475c296344       5 days ago           Ready               ws-e7e676fa-82be-4fb0-a9c2-e7afe871fe30                          default              0
4fe7fa0077627       5 days ago           Ready               ws-2cdeb2c7-2aff-4390-9244-85a75d6e62fb                          default              0
2ef52bfdd0808       5 days ago           Ready               ws-739419fe-b029-4840-989b-16849369b52f                          default              0
29916edb4d2f0       5 days ago           Ready               ws-a8d76355-8b75-422f-b077-a4acccd8b4a1                          default              0
93ff275df5e82       5 days ago           Ready               ws-a3337af6-2932-43b0-8a48-9c17f54f9aa0                          default              0
b3e9d3ccfa2be       5 days ago           Ready               ws-b09751ed-f963-46cb-af2e-ed77934dfa7f                          default              0
cd461f05a674c       5 days ago           Ready               ws-0b9ae0a0-0010-489e-995d-c1720980b115                          default              0
818a219f26f05       5 days ago           Ready               ws-b320a5cf-dd54-42fa-8fe0-f7d728b5f43c                          default              0
7d69dda1bbbfa       5 days ago           Ready               ws-42127aab-e80c-47f4-96a6-dc73432d9c97                          default              0
ac33eb4b51c54       5 days ago           Ready               ws-321a0536-06ca-40c2-9f91-f743330aa08a                          default              0
1f082d993b860       5 days ago           Ready               ws-51e8fc5c-b324-4ad1-8558-25cd5586ffd4                          default              0
cce0cce4f1945       5 days ago           Ready               ws-79c9a4d0-fb3c-4b87-9a42-1d50e3d3e290                          default              0
3b61760e56ed6       5 days ago           Ready               ws-0fbffe62-f4ae-4680-83f2-c10fbeab0abf                          default              0
bd96a593fdc36       5 days ago           Ready               ws-f3269d7c-7b48-4595-8c4b-7da144a12e1c                          default              0
6b6e767707f1c       5 days ago           Ready               ws-51459d72-2496-41a2-aa44-2e773cad7322                          default              0
644572fc5759c       6 days ago           Ready               ws-246f4d5f-54f3-4edf-befe-e83637b81e06                          default              0

I'm not sure what's preventing them from stopping. I became aware of them because of an out-of-disk-space alert. I suspect they fill up the logs, leading to #4135.
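
For anyone who wants to repeat the comparison, here is a minimal sketch in Go of diffing the two listings. It assumes the crictl pods output has been captured as text and that the pod names known to the API server are already available as a slice. The function names (sandboxNames, orphaned) and the sample rows are hypothetical. The parser also assumes the CREATED column is always three words ("3 days ago"), as in the listing above, so the name is the sixth whitespace-separated field.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// sandboxNames extracts the NAME column from `crictl pods`-style text output.
// Assumption: CREATED is printed as three words ("3 days ago"), so the pod
// name is the sixth whitespace-separated field. Rows with fewer fields and
// the header row (which starts with "POD") are skipped.
func sandboxNames(crictlOutput string) []string {
	var names []string
	sc := bufio.NewScanner(strings.NewReader(crictlOutput))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 6 || fields[0] == "POD" {
			continue
		}
		names = append(names, fields[5])
	}
	return names
}

// orphaned returns, in sorted order, the names that exist on the node
// but are not known to the Kubernetes API server.
func orphaned(onNode, inAPI []string) []string {
	known := make(map[string]struct{}, len(inAPI))
	for _, n := range inAPI {
		known[n] = struct{}{}
	}
	var out []string
	for _, n := range onNode {
		if _, ok := known[n]; !ok {
			out = append(out, n)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical sample data, not taken from the affected node.
	crictl := "POD ID CREATED STATE NAME NAMESPACE ATTEMPT\n" +
		"aaaaaaaaaaaaa 3 days ago Ready ws-example-1 default 0\n" +
		"bbbbbbbbbbbbb 5 days ago Ready ws-example-2 default 0\n"
	api := []string{"ws-example-2"}

	for _, name := range orphaned(sandboxNames(crictl), api) {
		fmt.Println("on node but not in API:", name)
	}
}
```

Using a set lookup keeps the comparison linear in the number of pods, which matters little here but avoids surprises on nodes with many sandboxes.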

csweichel · 2021-05-06T05:57:25Z

I think this is related to the runc/containerd issue that we're working around in ws-daemon because of containerd 1.2.10.

AlexTugarev · 2021-05-06T08:45:23Z

@csweichel do you mean this workaround here?

gitpod/components/ws-daemon/pkg/daemon/containerd4214.go

Line 91 in e863704

    func (c *Containerd4214Workaround) ensurePodGetsDeleted(rt container.Runtime, clientSet kubernetes.Interface, ws *dispatch.Workspace) (err error) {

I tried to find log entries showing the workaround succeeding or failing for the workspaces mentioned above, but there appears to be no trace of them apart from label updates.

JanKoehnlein · 2021-05-31T12:51:27Z

See #4171

svenefftinge · 2021-08-11T06:14:15Z

We still see regular workspace start failures that leave workspaces in a state where users can no longer start them.
Besides fixing the root cause of those failures, we also need to make sure that a failed workspace start doesn't prevent the workspace from starting again.

svenefftinge · 2021-08-11T06:57:41Z

Also, users themselves should be able to stop workspaces. Currently it seems the stop request just fails because ws-man no longer finds the pod.
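
One way to picture the behaviour asked for here is a stop call that treats a missing pod as "already stopped" instead of as an error. The Go sketch below is only an illustration under that assumption. It is not the ws-manager code, and every name in it (errPodNotFound, podStore, memStore, stopWorkspace) is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errPodNotFound is a hypothetical sentinel error standing in for whatever
// "not found" result the real pod lookup would return.
var errPodNotFound = errors.New("pod not found")

// podStore is a hypothetical interface, not the ws-manager API.
type podStore interface {
	Delete(workspaceID string) error
}

// memStore is an in-memory stand-in used only to make the sketch runnable.
type memStore map[string]bool

// Delete removes the pod if present and reports errPodNotFound otherwise.
func (m memStore) Delete(id string) error {
	if !m[id] {
		return errPodNotFound
	}
	delete(m, id)
	return nil
}

// stopWorkspace deletes the workspace pod. If the pod is already gone, it
// reports alreadyGone=true with a nil error, so the caller can still mark
// the workspace as stopped. Any other error is passed through unchanged.
func stopWorkspace(s podStore, id string) (alreadyGone bool, err error) {
	err = s.Delete(id)
	if errors.Is(err, errPodNotFound) {
		return true, nil
	}
	return false, err
}

func main() {
	s := memStore{"ws-example-1": true}

	gone, err := stopWorkspace(s, "ws-example-1")
	fmt.Println("first stop: alreadyGone =", gone, "err =", err)

	// Stopping again must not fail just because the pod has disappeared.
	gone, err = stopWorkspace(s, "ws-example-1")
	fmt.Println("second stop: alreadyGone =", gone, "err =", err)
}
```

Making the stop request idempotent in this way means a user-initiated stop can succeed even after the pod has disappeared from the API server.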

csweichel · 2021-08-11T10:51:05Z

/assign

csweichel · 2021-08-11T10:52:29Z

/schedule

mrsimonemms · 2021-08-12T10:13:20Z

Looked at this with @csweichel. It seems the node and cluster have since been destroyed, and this issue has subsequently been fixed. See #4683 and #5130.

meysholdt added the type: incident (Gitpod.io service is unstable) and type: bug (Something isn't working) labels May 5, 2021

svenefftinge added the priority: highest (user impact) label Aug 11, 2021

roboquat assigned csweichel Aug 11, 2021

roboquat added the groundwork: scheduled label Aug 11, 2021

csweichel removed their assignment Aug 11, 2021

mrsimonemms closed this as completed Aug 12, 2021

mrsimonemms removed the groundwork: scheduled label Aug 12, 2021
