[ws-manager-bridge] the number of workspace instances remaining seems to be wrong #11399
Comments
@jenting Could you elaborate on what the actual problem is here? The fact that we're querying the DB when deregistering the cluster? Or that you feel the numbers aren't correct? If it's the latter: What was the number you expected to see? E.g. in the cluster/by asking ws-manager?
Well, the problem is that we run the werft job
We are not sure whether the problem is a bug in the code or whether we ran the wrong SQL query against the production DB. You can check the https://gitpod.slack.com/archives/C02F19UUW6S/p1657831120699349 thread to see the SQL query we ran. Thank you.
@geropl both: it seems wrong to query the database (but I may be missing historical context as to why we do that), and the numbers between the workspace cluster and the database do not match. Here is an example for
# see how there are zero workspaces in the cluster?
gitpod /workspace/gitpod (main) $ kubectl get pods
NAME                                 READY   STATUS    RESTARTS       AGE
agent-smith-49rd2                    2/2     Running   0              2d3h
agent-smith-ccplz                    2/2     Running   0              3d14h
agent-smith-gsbpp                    2/2     Running   0              3d18h
agent-smith-n7m9s                    2/2     Running   0              2d3h
agent-smith-nr9zg                    2/2     Running   0              5d21h
image-builder-mk3-65f487c8c5-p6fw8   2/2     Running   0              6d
registry-facade-2nlbt                3/3     Running   0              3d14h
registry-facade-9fqnh                3/3     Running   0              3d18h
registry-facade-f9xct                3/3     Running   0              2d3h
registry-facade-r5578                3/3     Running   1 (5d8h ago)   5d22h
registry-facade-wwg7n                3/3     Running   0              2d3h
ws-daemon-5rcpt                      3/3     Running   0              3d18h
ws-daemon-9gjsx                      3/3     Running   0              2d3h
ws-daemon-g2lsz                      3/3     Running   0              3d14h
ws-daemon-kv22v                      3/3     Running   0              5d22h
ws-daemon-pvrx8                      3/3     Running   0              2d3h
ws-manager-84bb5cffd6-6pq5h          2/2     Running   0              2d7h
ws-proxy-c4cb5d5cf-77m27             2/2     Running   0              6d
ws-proxy-c4cb5d5cf-89lnl             2/2     Running   0              6d
ws-proxy-c4cb5d5cf-rp469             2/2     Running   0              6d
In this job, we get:
It looks like the workspaces that are being counted in this case are the
The query that would have rendered this I think looks similar to:
SELECT dbwi.id, dbwi.phasePersisted, dbw.id, dbwi.deleted
FROM gitpod.d_b_workspace dbw
inner join gitpod.d_b_workspace_instance dbwi
on dbw.id = dbwi.workspaceId
WHERE
dbwi.phasePersisted not in ('stopped', 'stopping')
and dbwi.deleted = 0
# change to match a region you're interested in
and dbwi.region = 'us58'
ORDER by 2 asc
;
Which yields 14 pending workspaces.
We've been having phase management problems lately; I imagine this may be a symptom? I would have expected there to be zero returned by
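To see which phases make up a non-zero count, the same filter can be applied in code and grouped by phase. This is only a rough sketch: the `InstanceRow` shape and the function name are hypothetical, and the only things taken from the query above are the column names `phasePersisted`, `deleted` and `region`. It is not the actual ws-manager-bridge logic.

```typescript
// Hypothetical, trimmed-down row shape: only the columns used by the SQL query above.
interface InstanceRow {
    id: string;
    phasePersisted: string;
    deleted: boolean;
    region: string;
}

// Mirrors the WHERE clause of the query: skip deleted rows, rows from other
// regions, and rows whose phase is 'stopped' or 'stopping'. The remaining rows
// are counted per phase.
function countOpenInstancesByPhase(rows: InstanceRow[], region: string): Map<string, number> {
    const excludedPhases = new Set<string>(["stopped", "stopping"]);
    const counts = new Map<string, number>();
    for (const row of rows) {
        if (row.deleted || row.region !== region || excludedPhases.has(row.phasePersisted)) {
            continue;
        }
        counts.set(row.phasePersisted, (counts.get(row.phasePersisted) ?? 0) + 1);
    }
    return counts;
}
```

If every remaining row lands in a single phase such as `pending` while the cluster has no workspace pods, that would point at stale phase data in the DB rather than at workspaces that are really running.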
Bug description
We wanted to delete the us53 cluster. There were no workspace pods left in it, so we triggered the werft job to delete the cluster. But somehow, ws-manager-bridge reported that 2 instances were still running.
After that, we checked the DB, following the logic in the code:
gitpod/components/ws-manager-bridge/src/cluster-service-server.ts
Line 263 in c752a16
gitpod/components/gitpod-db/src/typeorm/workspace-db-impl.ts
Lines 450 to 485 in c752a16
The number of workspaces returned by a SQL query using the same criteria is 10, rather than 2.
We are not sure whether this is a bug in ws-manager-bridge or whether we used the wrong query parameters.
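One way to narrow this down would be to compare the instance IDs the DB still treats as not stopped with the instance IDs the cluster actually knows about. The sketch below assumes both lists have already been collected by hand (for example from the SQL result and from the cluster); `findStaleInstanceIds` is a hypothetical helper, not something that exists in the Gitpod codebase.

```typescript
// Returns the instance IDs that the DB still considers live but that have no
// counterpart in the cluster. Both inputs are plain ID lists gathered separately.
function findStaleInstanceIds(dbInstanceIds: string[], clusterInstanceIds: string[]): string[] {
    const inCluster = new Set<string>(clusterInstanceIds);
    return dbInstanceIds.filter((id) => !inCluster.has(id));
}
```

With an empty cluster list, every ID from the DB comes back as stale, which matches the situation described here: no workspace pods in the cluster, but a non-zero count from the DB.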
Steps to reproduce
Internal slack thread.
Workspace affected
No response
Expected behavior
No response
Example repository
No response
Anything else?
No response