workspaces which cannot be scheduled to a cluster get stuck in Pending #11397

Closed
1 task
kylos101 opened this issue Jul 15, 2022 · 8 comments · Fixed by #13350

@kylos101
Contributor

kylos101 commented Jul 15, 2022

Bug description

Workspaces which cannot be scheduled to a cluster get stuck in Pending.

Steps to reproduce

In a preview environment, kubectl cordon the node. Try starting a workspace. It'll stay stuck in Pending for-ev-er!

Workspace affected

n/a

Expected behavior

The workspace should fail and land in a permanent Failed phase.

Example repository

n/a

Anything else?

This becomes a problem when you're trying to delete a cluster, because ws-manager-bridge reports that there are still active workspaces in the database for the cluster, even though there are no workspace pods in the governed workspace cluster.

This prevented the deletion of us53 (which was empty, and had no workspace clusters) until we ran the following queries:

-- find pending workspaces that aren't really pending
SELECT dbwi.id
FROM gitpod.d_b_workspace dbw
INNER JOIN gitpod.d_b_workspace_instance dbwi
    ON dbw.id = dbwi.workspaceId
WHERE dbwi.phasePersisted NOT IN ('stopped', 'failed')
    -- change to match a region you're interested in
    AND dbwi.region = 'us53'
ORDER BY dbwi.creationTime ASC
;

-- force stop the pending workspaces
UPDATE gitpod.d_b_workspace_instance AS wsi
SET
    wsi.stoppedTime = IF(wsi.startedTime = '', wsi.creationTime, wsi.startedTime),
    wsi.stoppingTime = IF(wsi.startedTime = '', wsi.creationTime, wsi.startedTime),
    wsi.phasePersisted = 'stopped',
    wsi.status = '{"phase": "stopped", "conditions": {}}'
WHERE wsi.id IN ('fb52614b-927b-4412-b583-dd3cc7bc3087',
'04eedb5b-1f35-4e4b-a8b3-56ea13c0f2dd',
'9b3d3928-59dd-46f0-9876-cc2bab905a2e',
'261bbde5-100f-4a19-8085-125ef82a0f0a',
'7878cc2a-544e-4f24-82ee-62a24c4f9c34',
'5dda17bf-1fe9-4cc6-b5ae-82aa6603049c',
'ad40bce6-a74c-47a9-b9a4-d9490afbfd41',
'028d826e-49b2-45ce-b829-8b4f693e93d3',
'99b74f2b-0a49-45c6-9c59-c4339a6dcb48',
'3695365a-0b4c-4df2-96f3-700598fe3baf')
;
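
For readers who want to reason about the two queries without a database, here is a minimal Python sketch of the same logic. The `Instance` class and both function names are hypothetical stand-ins, not part of the Gitpod codebase; only the phase filter and the timestamp fallback are taken from the SQL above, and the `status` column is left out for brevity.

```python
from dataclasses import dataclass, replace

# Phases treated as finished, mirroring the NOT IN ('stopped', 'failed') filter.
TERMINAL_PHASES = {"stopped", "failed"}


@dataclass(frozen=True)
class Instance:
    # Hypothetical in-memory stand-in for a d_b_workspace_instance row.
    id: str
    region: str
    phase_persisted: str
    creation_time: str
    started_time: str = ""
    stopping_time: str = ""
    stopped_time: str = ""


def active_in_region(instances, region):
    """Return instances in `region` that still count as active, oldest first."""
    active = [
        i for i in instances
        if i.region == region and i.phase_persisted not in TERMINAL_PHASES
    ]
    return sorted(active, key=lambda i: i.creation_time)


def force_stop(instance):
    """Mirror the UPDATE: use creation_time when started_time is empty."""
    ts = instance.started_time or instance.creation_time
    return replace(
        instance, stopping_time=ts, stopped_time=ts, phase_persisted="stopped"
    )


# Example rows are made up for illustration.
rows = [
    Instance("a", "us53", "pending", "2022-07-15T10:00:00Z"),
    Instance("b", "us53", "stopped", "2022-07-15T09:00:00Z"),
]
stuck = active_in_region(rows, "us53")
cleaned = [force_stop(i) for i in stuck]
```

Once every stuck instance has been force-stopped, `active_in_region` returns an empty list for the region, which is the condition the cluster deletion was waiting on.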
@kylos101 kylos101 added the type: bug Something isn't working label Jul 15, 2022
@kylos101 kylos101 moved this to Scheduled in 🌌 Workspace Team Jul 15, 2022
@kylos101
Contributor Author

kylos101 commented Jul 15, 2022

Note: in hindsight, this may not be a problem with ws-manager. It might be a problem on the webapp side, but that needs work to confirm.

@utam0k utam0k self-assigned this Jul 26, 2022
@utam0k utam0k moved this from Scheduled to In Progress in 🌌 Workspace Team Jul 26, 2022
@utam0k utam0k removed their assignment Jul 26, 2022
@utam0k utam0k moved this from In Progress to Scheduled in 🌌 Workspace Team Jul 26, 2022
@jenting
Contributor

jenting commented Jul 29, 2022

Similar to #11673.

@utam0k
Contributor

utam0k commented Jul 31, 2022

@jenting Has this issue already been solved by #11673?

> Similar to #11673.

@jenting
Contributor

jenting commented Aug 1, 2022

> @jenting Has this issue already been solved by #11673?
>
> > Similar to #11673.

No, #11673 adds a retry within 30 minutes (we wait for the workspace pod to be running, with a 7-minute timeout).

The good news is that the workspace pod is removed after 30 minutes. The bad news is that the dashboard still displays the workspace in a Pending state.
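
One way to picture the expected behavior from the bug description is a timeout on the Pending phase. The sketch below is an assumption for illustration only: `next_phase` and `PENDING_TIMEOUT` are hypothetical names, the 30-minute bound is borrowed from the pod cleanup window mentioned above, and nothing here is claimed to match the fix that eventually landed.

```python
from datetime import datetime, timedelta, timezone

# Illustrative bound only, borrowed from the 30-minute pod cleanup window.
PENDING_TIMEOUT = timedelta(minutes=30)


def next_phase(phase, creation_time, now, timeout=PENDING_TIMEOUT):
    """Return 'failed' for an instance pending longer than `timeout`.

    Any other phase, or a pending instance still inside the window,
    is returned unchanged.
    """
    if phase == "pending" and now - creation_time > timeout:
        return "failed"
    return phase


# Example timestamps are made up for illustration.
created = datetime(2022, 7, 15, 10, 0, tzinfo=timezone.utc)
late = next_phase("pending", created, created + timedelta(minutes=31))
early = next_phase("pending", created, created + timedelta(minutes=5))
```

With a check like this, an instance whose pod has already been removed would stop showing as Pending on the dashboard once the window has passed.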

@kylos101 kylos101 moved this from Scheduled to Backlog in 🌌 Workspace Team Aug 1, 2022
@kylos101
Contributor Author

kylos101 commented Aug 5, 2022

I believe Pending is a phase that indicates webapp has not selected a cluster (yet) for the workspace to land on. @geropl reassigning for now. wdyt?

@geropl geropl moved this to Clarification in 🍎 WebApp Team Aug 8, 2022
@kylos101 kylos101 changed the title [ws-manager] workspaces which cannot be scheduled to a cluster get stuck in Pending workspaces which cannot be scheduled to a cluster get stuck in Pending Aug 11, 2022
@kylos101 kylos101 removed the status in 🍎 WebApp Team Aug 23, 2022
@kylos101
Contributor Author

👋 @geropl hey, we had to do some workspace clean-up in the database last week when decommissioning gen61, as we shifted to gen62. May we ask for your help in scheduling this for the next couple of weeks?

@geropl geropl moved this to Scheduled in 🍎 WebApp Team Aug 23, 2022
@geropl
Member

geropl commented Sep 9, 2022

@kylos101 Done, set to next week.

@geropl
Member

geropl commented Sep 26, 2022

Linked to: #6770

@geropl geropl moved this to Scheduled in 🍎 WebApp Team Sep 26, 2022
@laushinka laushinka self-assigned this Sep 27, 2022
@laushinka laushinka moved this from Scheduled to In Progress in 🍎 WebApp Team Sep 27, 2022
Repository owner moved this from In Progress to In Validation in 🍎 WebApp Team Sep 29, 2022
@laushinka laushinka moved this from In Validation to Done in 🍎 WebApp Team Oct 25, 2022