Add user systemd service and socket #3662

Merged

marusak
Contributor

@marusak marusak commented Jul 29, 2019

This enables a regular user to interact with varlink.
Before this PR, a user could only run sudo varlink call unix:/run/podman/io.podman/io.podman.ListContainers to list containers owned by root. With this PR, a normal user can also run varlink call unix:/run/user/1000/podman/io.podman/io.podman.ListContainers, which lists the user's own containers.
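
The per-user address above can be derived rather than hard-coded. This is a minimal sketch, assuming the user socket lives under $XDG_RUNTIME_DIR (falling back to /run/user/<uid>); the helper name podman_user_socket is hypothetical and not part of this PR.

```shell
#!/bin/sh
# Build the varlink address of the per-user Podman socket.
# Assumption: the socket sits under $XDG_RUNTIME_DIR, or /run/user/<uid>
# when that variable is unset.
podman_user_socket() {
    runtime_dir="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}"
    printf 'unix:%s/podman/io.podman\n' "$runtime_dir"
}

# Example (needs the varlink CLI and a started io.podman.socket user unit):
#   varlink call "$(podman_user_socket)/io.podman.ListContainers"
```

For the user with UID 1000 this yields the same address as the call quoted above.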

I have one problem though. Can someone please give me some hints? I admit I don't really understand how and why pause.pid is used, but it seems to cause a problem. Here is how to reproduce it:

This works fine right after I boot my machine. The steps I take:

  1. systemctl --user start io.podman.socket
  2. varlink call unix:/run/user/1000/podman/io.podman/io.podman.ListContainers
 "containers": [
    {
      "command": [
        "/bin/bash"
      ],
      "containerrunning": false,
      "createdat": "2019-07-29T10:25:30+02:00",
      "id": "11d619fcd6ec3d4f9f6ebe663c87d270797980703089af594a4a13fd82b5f3d4",
      "image": "docker.io/library/fedora:latest",
      "imageid": "d09302f77cfcc3e867829d80ff47f9e7738ffef69730d54ec44341a9fb1d359b",
      "labels": {
        "maintainer": "Clement Verna <[email protected]>"
      },
      "mounts": [
        {
...
  3. When I make any other varlink call (for example, the same one as in step 2), I get Unable to connect: CannotConnect
$ systemctl --user status io.podman.socket
● io.podman.socket - Podman Remote API Socket
   Loaded: loaded (/usr/lib/systemd/user/io.podman.socket; disabled; vendor preset: enabled)
   Active: failed (Result: service-start-limit-hit) since Mon 2019-07-29 10:35:00 CEST; 7s ago
     Docs: man:podman-varlink(1)
   Listen: /run/user/1000/podman/io.podman (Stream)

Jul 29 10:34:16 dhcppc4 systemd[1338]: Listening on Podman Remote API Socket.
Jul 29 10:35:00 dhcppc4 systemd[1338]: io.podman.socket: Failed with result 'service-start-limit-hit'.

$ systemctl --user status io.podman.service
● io.podman.service - Podman Remote API Service
   Loaded: loaded (/usr/lib/systemd/user/io.podman.service; disabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Mon 2019-07-29 10:35:00 CEST; 19s ago
     Docs: man:podman-varlink(1)
  Process: 16351 ExecStart=/usr/bin/podman varlink unix:/run/user/1000/podman/io.podman (code=exited, status=1/FAILURE)
 Main PID: 16351 (code=exited, status=1/FAILURE)

Jul 29 10:35:00 dhcppc4 systemd[1338]: Started Podman Remote API Service.
Jul 29 10:35:00 dhcppc4 podman[16351]: time="2019-07-29T10:35:00+02:00" level=error msg="cannot join pause process.  You may need to remove /run/user/1000/libpod/pause.pid and stop all containers"
Jul 29 10:35:00 dhcppc4 podman[16351]: time="2019-07-29T10:35:00+02:00" level=error msg="you can use `system migrate` to recreate the pause process"
Jul 29 10:35:00 dhcppc4 podman[16351]: time="2019-07-29T10:35:00+02:00" level=error msg="open /proc/16176/ns/user: no such file or directory"
Jul 29 10:35:00 dhcppc4 systemd[1338]: io.podman.service: Main process exited, code=exited, status=1/FAILURE
Jul 29 10:35:00 dhcppc4 systemd[1338]: io.podman.service: Failed with result 'exit-code'.
Jul 29 10:35:00 dhcppc4 systemd[1338]: io.podman.service: Start request repeated too quickly.
Jul 29 10:35:00 dhcppc4 systemd[1338]: io.podman.service: Failed with result 'exit-code'.
Jul 29 10:35:00 dhcppc4 systemd[1338]: Failed to start Podman Remote API Service.

If I remove /run/user/1000/libpod/pause.pid and restart both the socket and the service, I get back to the same state as after boot, but the next varlink call puts it back into the same failed state. (That varlink call returns Connection closed.)

If I reboot the machine, I can make exactly one varlink call before it ends up in this failed state again.
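
The manual recovery described above (remove pause.pid, then restart the units) could be scripted roughly as follows. This is a sketch only: the function names and the DRY_RUN switch are hypothetical, and the runtime directory is assumed to be $XDG_RUNTIME_DIR or /run/user/<uid>. It only restores the post-boot state and does not address why the service fails on the second call.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Remove the stale pause.pid, clear the failed state and start-limit
# counter of both user units, then start the socket again.
reset_podman_user_units() {
    runtime_dir="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}"
    run rm -f "$runtime_dir/libpod/pause.pid"
    run systemctl --user reset-failed io.podman.service io.podman.socket
    run systemctl --user start io.podman.socket
}

# Example: DRY_RUN=1 reset_podman_user_units
```

The error log also says running containers may need to be stopped before pause.pid is removed; that step is left out here.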

@openshift-ci-robot
Collaborator

Hi @marusak. Thanks for your PR.

I'm waiting for a containers or openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-ok-to-test label Jul 29, 2019
@rh-atomic-bot
Collaborator

Can one of the admins verify this patch?
I understand the following commands:

  • bot, add author to whitelist
  • bot, test pull request
  • bot, test pull request once

@marusak marusak changed the title from "foo: Add user systemd service and socket" to "Add user systemd service and socket" Jul 29, 2019
@mheon
Member

mheon commented Jul 29, 2019

@baude PTAL

@rhatdan
Member

rhatdan commented Jul 29, 2019

I think the latest podman commands work without setting up a systemd unit file/service.

podman-remote commands are supposed to automatically launch a podman varlink session to handle the remote connection.

@baude
Member

baude commented Jul 29, 2019

this is true for podman-remote when using the bridge.

@giuseppe
Member

I have one problem though. Can someone please give me some hints, I admit I don't really understand how and why pause.pid is used, but it seems it causes some problem. Here is how to reproduce my problem:

pause.pid is used for rootless containers to keep the user namespace alive so that all containers can run from the same namespace.
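
The failure in the log ("open /proc/16176/ns/user: no such file or directory") suggests that pause.pid pointed at a process that no longer existed. A rough way to check for that condition is sketched below. It assumes pause.pid holds a plain decimal PID; the function name and the optional proc-root argument are hypothetical.

```shell
#!/bin/sh
# Return 0 if the PID recorded in pause.pid still has a user namespace
# entry under /proc, non-zero otherwise.
# $1: path to pause.pid
# $2: proc root (defaults to /proc; overridable for testing)
pause_process_alive() {
    pid_file="$1"
    proc_root="${2:-/proc}"
    [ -r "$pid_file" ] || return 1
    pid="$(cat "$pid_file")"
    # Reject empty or non-numeric content.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ -e "$proc_root/$pid/ns/user" ] || [ -L "$proc_root/$pid/ns/user" ]
}

# Example:
#   pause_process_alive /run/user/1000/libpod/pause.pid || echo "stale pause.pid"
```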

Looks like systemd is killing all the processes in the cgroup when you start it from a unit file.

Can you try with KillMode=none in the service file? I think we need to require it for rootless containers.
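
To try this locally without editing the installed unit, a systemd drop-in is one option. The sketch below is an assumption-laden illustration, not what the PR ships: the helper name and the drop-in file name killmode.conf are hypothetical, and the default directory is the usual per-user systemd configuration path.

```shell
#!/bin/sh
# Write a drop-in that sets KillMode=none for the io.podman user service.
# $1: systemd user config directory (defaults to ~/.config/systemd/user)
write_killmode_dropin() {
    unit_dir="${1:-$HOME/.config/systemd/user}/io.podman.service.d"
    mkdir -p "$unit_dir" || return 1
    cat > "$unit_dir/killmode.conf" <<'EOF'
[Service]
KillMode=none
EOF
}

# Afterwards, reload the user manager and restart the socket:
#   systemctl --user daemon-reload
#   systemctl --user restart io.podman.socket
```

With KillMode=none, systemd does not kill the remaining processes in the unit's cgroup when the service stops, so the pause process can outlive a single varlink call.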

@marusak
Contributor Author

marusak commented Jul 30, 2019

Can you try with KillMode=none in the service file?

Awesome, adding it to the service file fixes it. Is KillMode=none unwanted for root containers? Can it go into the same service file, or do we need to create two different ones?

@giuseppe
Member

Awesome, adding it to the service file fixes it. Is KillMode=none unwanted for root containers? Can it go into the same service file, or do we need to create two different ones?

root containers don't need a pause process as they don't need to join the same user namespace.

@giuseppe
Member

Awesome, adding it to the service file fixes it. Is KillMode=none unwanted for root containers? Can it go into the same service file, or do we need to create two different ones?

although we can probably simplify and add it for root containers as well

@giuseppe
Member

@marusak could you amend your patch to include it?

@marusak
Contributor Author

marusak commented Jul 30, 2019

could you amend your patch to include it?

Of course, I'll push here in a bit.
Are there tests for the varlink code where I should also check that it works for a normal user?

@giuseppe
Member

I don't think we have tests for io.podman.service

@marusak marusak force-pushed the user_socket_service branch from 254b600 to 241c46c Compare July 30, 2019 10:37
@marusak
Contributor Author

marusak commented Jul 30, 2019

Updated

@giuseppe
Member

/ok-to-test
/approve

@openshift-ci-robot openshift-ci-robot added ok-to-test and removed needs-ok-to-test labels Jul 30, 2019
@openshift-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe, marusak

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved label Jul 30, 2019
@giuseppe
Member

please add a Signed-off-by: to the commit and repush.

You can do it with git commit --amend -s

@marusak marusak force-pushed the user_socket_service branch from 241c46c to d106ab2 Compare July 30, 2019 10:41
@marusak
Contributor Author

marusak commented Jul 30, 2019

Added Signed-off-by line

@marusak
Contributor Author

marusak commented Aug 5, 2019

I found one issue. I'm not sure whether it is related to these changes, but I cannot call GetContainerStats:

$ podman run -dit --name foo fedora
fe2a46137a99962f25b152c851f0e58d8974b7d990ff49e40705e2a4e87cb309

$ varlink call unix:/run/user/1000/podman/io.podman/io.podman.GetContainerStats '{ "name": "foo" }'
Call failed with error: io.podman.ErrorOccurred
{
  "reason": "unable to load cgroup at /libpod_parent/libpod-fe2a46137a99962f25b152c851f0e58d8974b7d990ff49e40705e2a4e87cb309: cgroups: cgroup deleted"
}

$ podman ps
CONTAINER ID  IMAGE                            COMMAND    CREATED         STATUS             PORTS  NAMES
fe2a46137a99  docker.io/library/fedora:latest  /bin/bash  20 seconds ago  Up 19 seconds ago         foo
0eaa3d1b2e57  docker.io/library/fedora:latest  /bin/bash  3 hours ago     Up 3 hours ago            stoic_kapitsa

@mheon
Member

mheon commented Aug 5, 2019

I think stats is an expected failure for rootless - probably need to touch up the error message.

It requires CGroups to be present, but rootless Podman has no privileges to make CGroups.

@rhatdan
Member

rhatdan commented Aug 5, 2019

No privs for cgroupsV1...
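
Since the limitation is stated for cgroups v1, it can help to check which hierarchy a host uses before expecting stats to work. A small sketch, relying on the fact that the unified (v2) hierarchy exposes a cgroup.controllers file at its root; the function name and the overridable mount point are hypothetical. Whether rootless stats work on v2 is not confirmed in this thread.

```shell
#!/bin/sh
# Print "v2" when the unified cgroup hierarchy is mounted, otherwise "v1".
# $1: cgroup mount point (defaults to /sys/fs/cgroup; overridable for testing)
cgroup_version() {
    root="${1:-/sys/fs/cgroup}"
    if [ -f "$root/cgroup.controllers" ]; then
        echo v2
    else
        echo v1
    fi
}

# Example:
#   cgroup_version
```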

@marusak
Contributor Author

marusak commented Aug 7, 2019

I opened #3749 so as not to block this PR, since the failure happens on master as well.

@marusak
Contributor Author

marusak commented Aug 12, 2019

Does this need some additional attention from my side?

@rhatdan
Member

rhatdan commented Aug 12, 2019

@marusak Can you rebase and repush? There were lots of fixes in the CI system on Friday that should make this PR mergeable.

@marusak marusak force-pushed the user_socket_service branch from d106ab2 to d14c23a Compare August 12, 2019 17:56
@marusak
Contributor Author

marusak commented Aug 12, 2019

Can you rebase and repush.

Done

Makefile Outdated
@@ -397,7 +398,9 @@ install.docker: docker-docs
install.systemd:
install ${SELINUXOPT} -m 755 -d ${DESTDIR}${SYSTEMDDIR} ${DESTDIR}${TMPFILESDIR}
Member

You need to create your ${DESTDIR}${USERSYSTEMDDIR} here.

Contributor Author

Thanks for spotting! fixed

@@ -397,7 +398,9 @@ install.docker: docker-docs
install.systemd:
install ${SELINUXOPT} -m 755 -d ${DESTDIR}${SYSTEMDDIR} ${DESTDIR}${TMPFILESDIR}
install ${SELINUXOPT} -m 644 contrib/varlink/io.podman.socket ${DESTDIR}${SYSTEMDDIR}/io.podman.socket
install ${SELINUXOPT} -m 644 contrib/varlink/io.podman.socket ${DESTDIR}${USERSYSTEMDDIR}/io.podman.socket
Member

Or do we need a -D in install?
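
The two options raised in this review thread can be compared outside the Makefile. The sketch below shows the "create the directory first" variant as a function and the -D variant as a comment; the function name is hypothetical and ${SELINUXOPT} is left out for brevity.

```shell
#!/bin/sh
# Install a unit file, creating the target directory first
# (the 'install -d' approach).
# $1: source file; $2: destination directory
install_unit() {
    src="$1"
    dest_dir="$2"
    install -m 755 -d "$dest_dir" || return 1
    install -m 644 "$src" "$dest_dir/$(basename "$src")"
}

# Alternative with GNU install: -D creates the missing parent directories
# of the destination file in the same step.
#   install -m 644 -D io.podman.socket "$dest_dir/io.podman.socket"
```

Either way, the user unit directory must exist before the second install line in the diff above runs.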

This enables user to interact with varlink and create/manage rootless
containers through it.

Using as:
`varlink call unix:/run/user/1000/podman/io.podman/io.podman.ListContainers`

Signed-off-by: Matej Marusak <[email protected]>
@marusak marusak force-pushed the user_socket_service branch from d14c23a to daf7044 Compare August 13, 2019 05:01
@rhatdan
Member

rhatdan commented Aug 13, 2019

LGTM
@baude @mheon @jwhonce PTAL

@rhatdan
Member

rhatdan commented Aug 13, 2019

@giuseppe @vrothberg PTAL

Member

@vrothberg vrothberg left a comment

LGTM

@rhatdan
Member

rhatdan commented Aug 13, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm label Aug 13, 2019
@openshift-merge-robot openshift-merge-robot merged commit b6c9b10 into containers:master Aug 13, 2019
@github-actions github-actions bot added the locked - please file new issue/PR label Sep 26, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 26, 2023