
[container-runtime] if workspace pod has more than 2 containers, ws-daemon ExtractCGroupPathFromContainer fails #14103

Closed
sagor999 opened this issue Oct 21, 2022 · 5 comments · Fixed by #14146
Labels
type: bug Something isn't working

Comments

@sagor999
Contributor

Bug description

In this PR I am adding init container (when using PVC) to workspace pod: #14096
It causes workspace to fail to start up as it times out waiting for daemon.sock to appear.
Looking at ws-daemon logs, it errors out with:

{"@type":"type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.ReportedErrorEvent","error":"open /mnt/node-cgroups/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod96d24469_ad2f_45bf_9d97_79170477153f.slice/cri-containerd-834b47c55a6bdc5ef9de4b491aa4ffbd0589ebd4af1af69fc382db1d6fa8be7c.scope/workspace/user/cgroup.procs: no such file or directory","file":"plugin_process_priority_v2.go:64","func":"func1","level":"error","message":"cannot read cgroup.procs file","path":"/mnt/node-cgroups/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod96d24469_ad2f_45bf_9d97_79170477153f.slice/cri-containerd-834b47c55a6bdc5ef9de4b491aa4ffbd0589ebd4af1af69fc382db1d6fa8be7c.scope","serviceContext":{"service":"ws-daemon","version":"commit-a0f3e3d2cba39ddb43b668b9f38d7c5a76705259"},"severity":"ERROR","time":"2022-10-21T22:25:59Z"}

For some reason, something is not working as we expect when there is more than one container in the workspace pod.
I double-checked the container ID in

func (s *Containerd) handleNewContainer(c containers.Container) {
and it does grab the workspace container and uses it to build the cgroup path.
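As an aside, the way the three containers in the pod differ can be seen in their CRI labels in the log below: the pause sandbox, the init container, and the workspace container. A minimal sketch of telling them apart by label (an illustration based on those labels; `isWorkspaceContainer` is a hypothetical helper, not the actual ws-daemon code):

```go
package main

import "fmt"

// isWorkspaceContainer reports whether a container's CRI labels identify it
// as the actual workspace container, as opposed to the pause sandbox or an
// init container such as zz-chown-workspace.
func isWorkspaceContainer(labels map[string]string) bool {
	return labels["io.cri-containerd.kind"] == "container" &&
		labels["io.kubernetes.container.name"] == "workspace"
}

func main() {
	sandbox := map[string]string{"io.cri-containerd.kind": "sandbox"}
	initContainer := map[string]string{
		"io.cri-containerd.kind":       "container",
		"io.kubernetes.container.name": "zz-chown-workspace",
	}
	workspace := map[string]string{
		"io.cri-containerd.kind":       "container",
		"io.kubernetes.container.name": "workspace",
	}
	fmt.Println(isWorkspaceContainer(sandbox), isWorkspaceContainer(initContainer), isWorkspaceContainer(workspace))
	// prints: false false true
}
```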
Here is the full log (with additional debug output that I added):

{
   "ID":"c6a3884abdabb18e604755b4f2bce2d58cb4d16294dd8e640a1962b42230adae",
   "file":"containerd.go:185",
   "func":"handleNewContainer",
   "image":"docker.io/rancher/mirrored-pause:3.6",
   "labels":{
      "app":"gitpod",
      "component":"workspace",
      "gitpod.io/networkpolicy":"default",
      "gitpod.io/pvcFeature":"true",
      "gitpod.io/workspaceClass":"default",
      "gpwsman":"true",
      "headless":"false",
      "io.cri-containerd.kind":"sandbox",
      "io.kubernetes.pod.name":"ws-02d30297-0a6e-4d4f-94fd-f08599211724",
      "io.kubernetes.pod.namespace":"default",
      "io.kubernetes.pod.uid":"96d24469-ad2f-45bf-9d97-79170477153f",
      "metaID":"gitpodio-templategolang-w2ix0tm8ihy",
      "owner":"cb275faa-c58a-4dc2-bb24-24a62f3ea47e",
      "workspaceID":"02d30297-0a6e-4d4f-94fd-f08599211724",
      "workspaceType":"regular"
   },
   "level":"info",
   "message":"new container",
   "serviceContext":{
      "service":"ws-daemon",
      "version":"commit-a0f3e3d2cba39ddb43b668b9f38d7c5a76705259"
   },
   "severity":"INFO",
   "time":"2022-10-21T22:25:47Z"
}{
   "file":"containerd.go:218",
   "func":"handleNewContainer",
   "instanceId":"02d30297-0a6e-4d4f-94fd-f08599211724",
   "labels":{
      "app":"gitpod",
      "component":"workspace",
      "gitpod.io/networkpolicy":"default",
      "gitpod.io/pvcFeature":"true",
      "gitpod.io/workspaceClass":"default",
      "gpwsman":"true",
      "headless":"false",
      "io.cri-containerd.kind":"sandbox",
      "io.kubernetes.pod.name":"ws-02d30297-0a6e-4d4f-94fd-f08599211724",
      "io.kubernetes.pod.namespace":"default",
      "io.kubernetes.pod.uid":"96d24469-ad2f-45bf-9d97-79170477153f",
      "metaID":"gitpodio-templategolang-w2ix0tm8ihy",
      "owner":"cb275faa-c58a-4dc2-bb24-24a62f3ea47e",
      "workspaceID":"02d30297-0a6e-4d4f-94fd-f08599211724",
      "workspaceType":"regular"
   },
   "level":"info",
   "message":"found sandbox - adding to label cache",
   "podname":"ws-02d30297-0a6e-4d4f-94fd-f08599211724",
   "serviceContext":{
      "service":"ws-daemon",
      "version":"commit-a0f3e3d2cba39ddb43b668b9f38d7c5a76705259"
   },
   "severity":"INFO",
   "time":"2022-10-21T22:25:47Z",
   "userId":"cb275faa-c58a-4dc2-bb24-24a62f3ea47e",
   "workspaceId":"gitpodio-templategolang-w2ix0tm8ihy"
}{
   "ID":"bf27a01609849bd3188d25c777213926feed48ffb9ecc87314a12659704ef891",
   "file":"containerd.go:185",
   "func":"handleNewContainer",
   "image":"docker.io/library/busybox:latest",
   "labels":{
      "io.cri-containerd.kind":"container",
      "io.kubernetes.container.name":"zz-chown-workspace",
      "io.kubernetes.pod.name":"ws-02d30297-0a6e-4d4f-94fd-f08599211724",
      "io.kubernetes.pod.namespace":"default",
      "io.kubernetes.pod.uid":"96d24469-ad2f-45bf-9d97-79170477153f"
   },
   "level":"info",
   "message":"new container",
   "serviceContext":{
      "service":"ws-daemon",
      "version":"commit-a0f3e3d2cba39ddb43b668b9f38d7c5a76705259"
   },
   "severity":"INFO",
   "time":"2022-10-21T22:25:47Z"
}{
   "ID":"834b47c55a6bdc5ef9de4b491aa4ffbd0589ebd4af1af69fc382db1d6fa8be7c",
   "file":"containerd.go:185",
   "func":"handleNewContainer",
   "image":"reg.pavel-14003.preview.gitpod-dev.com:30955/remote/02d30297-0a6e-4d4f-94fd-f08599211724:latest",
   "labels":{
      "io.cri-containerd.kind":"container",
      "io.kubernetes.container.name":"workspace",
      "io.kubernetes.pod.name":"ws-02d30297-0a6e-4d4f-94fd-f08599211724",
      "io.kubernetes.pod.namespace":"default",
      "io.kubernetes.pod.uid":"96d24469-ad2f-45bf-9d97-79170477153f"
   },
   "level":"info",
   "message":"new container",
   "serviceContext":{
      "service":"ws-daemon",
      "version":"commit-a0f3e3d2cba39ddb43b668b9f38d7c5a76705259"
   },
   "severity":"INFO",
   "time":"2022-10-21T22:25:49Z"
}{
   "CGroupPath":"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod96d24469_ad2f_45bf_9d97_79170477153f.slice/cri-containerd-834b47c55a6bdc5ef9de4b491aa4ffbd0589ebd4af1af69fc382db1d6fa8be7c.scope",
   "ID":"834b47c55a6bdc5ef9de4b491aa4ffbd0589ebd4af1af69fc382db1d6fa8be7c",
   "file":"containerd.go:248",
   "func":"handleNewContainer",
   "instanceId":"02d30297-0a6e-4d4f-94fd-f08599211724",
   "labels":{
      "io.cri-containerd.kind":"container",
      "io.kubernetes.container.name":"workspace",
      "io.kubernetes.pod.name":"ws-02d30297-0a6e-4d4f-94fd-f08599211724",
      "io.kubernetes.pod.namespace":"default",
      "io.kubernetes.pod.uid":"96d24469-ad2f-45bf-9d97-79170477153f"
   },
   "level":"info",
   "message":"found workspace container - updating label cache",
   "podname":"ws-02d30297-0a6e-4d4f-94fd-f08599211724",
   "serviceContext":{
      "service":"ws-daemon",
      "version":"commit-a0f3e3d2cba39ddb43b668b9f38d7c5a76705259"
   },
   "severity":"INFO",
   "time":"2022-10-21T22:25:49Z",
   "userId":"cb275faa-c58a-4dc2-bb24-24a62f3ea47e",
   "workspaceId":"gitpodio-templategolang-w2ix0tm8ihy"
}{
   "@type":"type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.ReportedErrorEvent",
   "error":"open /mnt/node-cgroups/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod96d24469_ad2f_45bf_9d97_79170477153f.slice/cri-containerd-834b47c55a6bdc5ef9de4b491aa4ffbd0589ebd4af1af69fc382db1d6fa8be7c.scope/workspace/user/cgroup.procs: no such file or directory",
   "file":"plugin_process_priority_v2.go:64",
   "func":"func1",
   "level":"error",
   "message":"cannot read cgroup.procs file",
   "path":"/mnt/node-cgroups/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod96d24469_ad2f_45bf_9d97_79170477153f.slice/cri-containerd-834b47c55a6bdc5ef9de4b491aa4ffbd0589ebd4af1af69fc382db1d6fa8be7c.scope",
   "serviceContext":{
      "service":"ws-daemon",
      "version":"commit-a0f3e3d2cba39ddb43b668b9f38d7c5a76705259"
   },
   "severity":"ERROR",
   "time":"2022-10-21T22:25:59Z"
}
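Reading the error together with the `CGroupPath` entry above: the path the plugin fails to open is the container's cgroup path prefixed with the node cgroup mountpoint and suffixed with `workspace/user/cgroup.procs`, a child cgroup that evidently did not exist at that point. A sketch of that path construction (`procsFile` is a hypothetical helper for illustration; the mountpoint `/mnt/node-cgroups` is taken from the log):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// procsFile joins the host cgroup mountpoint, the container's cgroup path
// (the CGroupPath field in the log above), and the child cgroup whose
// cgroup.procs file the process-priority plugin reads.
func procsFile(mountpoint, cgroupPath string) string {
	return filepath.Join(mountpoint, cgroupPath, "workspace", "user", "cgroup.procs")
}

func main() {
	fmt.Println(procsFile("/mnt/node-cgroups", "/kubepods.slice/kubepods-burstable.slice/example.scope"))
}
```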

Steps to reproduce

Use the preview environment from this PR: #14096
Enable PVC.
Try to open any workspace.

Workspace affected

No response

Expected behavior

No response

Example repository

No response

Anything else?

No response

@Furisto
Member

Furisto commented Oct 23, 2022

I don't think the error you are seeing in the log is the cause of the socket not being available. It comes from the process priority cgroup plugin, which should not affect startup.

@utam0k utam0k self-assigned this Oct 24, 2022
@utam0k utam0k moved this from Scheduled to In Progress in 🌌 Workspace Team Oct 24, 2022
@sagor999
Contributor Author

@Furisto if I remove the init container, it works. So it is definitely something to do with the second container in the pod.
The log error is indeed from the plugin, but I think it is a symptom of something earlier not working as expected.

@utam0k
Contributor

utam0k commented Oct 24, 2022

It may be related to this issue, which I'm fixing:
#11986

@sagor999
Contributor Author

@utam0k I rebased my PR on top of yours: #14111
It did not fix the problem. 😞 Still the same issue: the daemon socket does not appear when the workspace starts.

@utam0k
Contributor

utam0k commented Oct 25, 2022

@utam0k I rebased my PR on top of yours: #14111
and it did not fix the problem. Still the same issue, daemon socket does not appear when workspace starts.

Thanks for confirming. I'll continue working on the fix.

Repository owner moved this from In Progress to Awaiting Deployment in 🌌 Workspace Team Oct 25, 2022
@utam0k utam0k moved this from Awaiting Deployment to Done in 🌌 Workspace Team Oct 28, 2022