Rootless managing of additional host groups #18333
Comments
There is |
Kind of, but not really: According to https://docs.podman.io/en/latest/markdown/podman-run.1.html#uidmap-container-uid-from-uid-amount, My point here is that, as a rootless podman user:
Since I can't rely on the first intermediate mapping being stable (because my sysadmin may trust me with some extra subordinate ids), my usage of Reproducible example:
List files through podman using uidmap and gidmap:
My sysadmin adds gid 1001 as subordinate GID
My script is now invalid, since the gidmap has changed:
|
I am pretty sure the intermediate mapping is deterministic and depends on the order in /etc/sub?id, you can always check with Since there is only ever one intermediate namespace (pause process) there is no way to check for an image at the point where we create the namespace to see that the mapping will not conflict with an existing groups in the image. @giuseppe WDYT? |
Thanks for the feedback @Luap99. I added a reproducible example to hopefully clarify the issue. I see your point. I can use However dealing with groups in this way in a script seems a bit error prone to me 😓 And besides I seem to need to provide the full list of |
that is correct. The user namespace is used as the parent for any Podman operation, so it has no knowledge of the images that you are going to use. You need to parse the |
My point is that the first mapping podman makes by default is not very reasonable when there are low subordinated ids. It is reasonable for podman to assume that subordinate ids above 99999 are meant to be unused at the host and available to be used as low IDs at the container. But I do not think it is reasonable for podman to assume that with low subordinate ids, since they are usually used at the host for something. Low IDs typically correspond to some host user or group, and therefore they should be mapped to predictable container IDs. I argue that a good default range for that mapping is to offset the host ID by a large number (e.g. 100 000). This yields a symmetric behaviour between mappings at the host and at the container, because 100 000 is also the default starting range for assigning unused subordinate ids:
I do not have go experience and I am not familiar with podman's source code, but I would be willing to do my best to attempt to provide a pull request if you think this feature makes sense. Some directions would be very welcome |
it is too late to change the mappings we perform, some users might be depending on that, and it is the same mapping done by We usually need to have a root inside the user namespace because otherwise all the files from the image owned by root will show up as owned by nobody. I suggest you take a look at the |
I understand your desire to preserve backwards compatibility. Please consider that:
I haven't verified this (my unshare version is not recent enough) but the docs available at https://man7.org/linux/man-pages/man1/unshare.1.html state that (bold is mine)
If that is true, then So podman's behaviour of mapping all subordinate blocks and not just the first one is actually different from the one of Maybe I would be more convincing if I went to util-linux and convinced the unshare developers of providing an option to map "subordinate blocks different than the first one" to "predictable high IDs" as I am suggesting here? If raising this issue there helps I am willing to go for it.
Yes, that's perfectly fine. No plans to change that. I don't understand why anyone would want to change that. Maybe I did not explain something well.
I want my user mapped inside the container as root. I want additional groups in the host mapped inside the container to predictable IDs, and that's something podman does not seem to be able to do by default since the mapping and the gidmap I have to provide depend on which groups I have subordinated to my user. |
I've used I don't think the current behavior is buggy, it is just accommodating the most common case. Usually, additional IDs are added when the user realized there are not enough IDs available, so the most common case is to use them all contiguously so that we can pull and deal with images having higher IDs. e.g. I expect that the following add all the listed IDs, not that the second line means mapping 165536 to 165536.
so I am against treating the additional entries with a different meaning. We could think of adding a function like |
This would make things simpler and the syntax is straightforward to understand. I like it a lot. I have also noticed that when I provide a So instead of:
I can use:
This would make group mapping in rootless containers much easier to use and understand for many people in my environment. I don't know if I will be capable of implementing this in a pull request, but if we agree on the design then we can see how to implement it. Thanks for all the time you are spending on this. |
there are no default mappings added when you specify |
I would expect this command:
To behave like this:
When there is no Currently, if I use |
this could probably also be expressed using some special annotation for the
to insert that mapping and break the existing mappings (if any) |
Being able to use
When running a container would make a whole community around me (and myself) really happy. In summary, the features proposed after this discussion are two extensions to the
I don't know if I will be able to implement these two things, but if nobody else has the bandwidth to do it I suppose that it's either me doing this or letting this linger. I don't have experience with go, but I have been able to follow instructions and build podman myself, so that's a start. I believe I may need to modify the containers/storage repository, since that's where I believe the parsing of id mappings happens. I'm feeling however this may be quite daunting as a first issue for me so your help (or anyone with more experience or time) taking over would make things easier, I guess. Advice and suggestions are very much welcome. |
I'll go step by step, so I will start by exploring the Supporting
|
Use the --gidmap "+100000:@1001:1" |
Tagging @cboettig since he suggested me to write something here I have a solution for running the container rootless: - rocker-org/rocker-versioned2#636 There are many advantages on rootless containers, the main one being security. The main caveat with rootless containers is when we want to map additional groups to the container (for instance when we have an additional group that owns a "shared_data" directory we want to access). In that case, we still need to learn quite a bit about id mapping. I've done my best to explain how things work and to provide a step by step guide in this pull request. Hopefully this will eventually be simplified. It may be that I have overlooked something - containers/podman#18333 I guess we can wait some days to see how the issue evolves. It may be that I've missed something and my solution is overly complicated or that some feature needs to land in podman to simplify additional group management. English is not my primary language. I would appreciate feedback or change in wordings. Besides I've been writing this for too long. I may need to take some time to get some perspective and re-read it again, but I believe it is worth a first read. --------- Co-authored-by: eitsupi <[email protected]>
I managed to provide an implementation at #18713. It lacks unit testing and documentation, because I still have to figure out where they go. Comments and feedback are very welcome. |
This topic has been discussed at length at containers#18333, with @giuseppe, @Luap99 and with the feedback from @rhatdan. The requirements were defined there and this aims to be the implementation.

Motivation
===========

This series of patches aims to make --uidmap and --gidmap easier to use, especially in rootless podman setups.

(I will focus here on the --gidmap option, although the same applies for --uidmap.)

In rootless podman, the user namespace mapping happens in two steps, through an intermediate mapping. See https://docs.podman.io/en/latest/markdown/podman-run.1.html#uidmap-container-uid-from-uid-amount for further detail; here is a summary:

First the user GID is mapped to 0 (root), and all subordinate GIDs (defined at /etc/subgid, and usually >100000) are mapped starting at 1.

If we want to change it further, we can use the --gidmap option, to map that intermediate mapping to the final mapping that will be seen by the container.

As an example, let's say we have as main GID the group 1000, and we also belong to the additional GID 2000, that we want to make accessible inside the container.

We first ask the sysadmin to subordinate the group to us, by adding "$user:2000:1" to /etc/subgid.

Then we need to use --gidmap to specify that we want to map GID 2000 into some GID inside the container.

And here is the first trouble: Since the --gidmap option operates on the intermediate mapping, we first need to figure out where podman has placed our GID 2000 in that intermediate mapping using:

    podman unshare cat /proc/self/gid_map

Then, we may see that GID 2000 was mapped to intermediate GID 5. So our --gidmap option should include:

    --gidmap 20000:5:1

This intermediate mapping may change in the future if further groups are subordinated to us (or we stop having its subordination), so we are forced to verify the mapping with `podman unshare cat /proc/self/gid_map` every time, and parse it if we want to script it.
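The lookup described in the commit message above (finding where host GID 2000 landed in the intermediate mapping) could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, and `SAMPLE_GID_MAP` is made-up data chosen so that host GID 2000 sits at intermediate GID 5, as in the example. The only fact relied on is the three-column layout of `/proc/self/gid_map` (ID inside the namespace, ID outside the namespace, length).

```python
def parse_id_map(text):
    """Parse gid_map/uid_map text into (inside_start, outside_start, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(field) for field in fields)
        entries.append((inside, outside, length))
    return entries


def intermediate_id_for(entries, host_id):
    """Return the ID that host_id has inside the namespace, or None if unmapped."""
    for inside, outside, length in entries:
        if outside <= host_id < outside + length:
            return inside + (host_id - outside)
    return None


# Made-up sample of what `podman unshare cat /proc/self/gid_map` might print.
SAMPLE_GID_MAP = """\
         0       1000          1
         1     100000          4
         5       2000          1
         6     100004      65532
"""

entries = parse_id_map(SAMPLE_GID_MAP)
intermediate = intermediate_id_for(entries, 2000)
print(f"--gidmap 20000:{intermediate}:1")
```

In a real script the sample text would be replaced by the captured output of `podman unshare cat /proc/self/gid_map`, and the lookup would have to be repeated whenever /etc/subgid changes.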
**The first usability improvement** we agreed on containers#18333 is to be able to use:

    --gidmap 20000:@2000:1

so podman does this lookup in the parent user namespace for us.

But this is only part of the problem. We must specify a full gidmap and not only what we want:

    --gidmap 0:0:5 --gidmap 5:6:15000 --gidmap 20000:5:1

This is becoming complicated. We had to break the gidmap at 5, because the intermediate 5 had to be mapped to another value (20000), and then we had to keep mapping all other subordinate ids... up to close to the maximum number of subordinate ids that we have (or some reasonable value). This is hard to explain to someone who does not understand how the mappings work internally.

**The second usability improvement** is to be able to use:

    --gidmap "+20000:@2000:1"

where the plus sign (`+`) states that we want to start with an identity mapping, and break it where necessary so this mapping gets included.

One final improvement related to this is the following:

By default, when podman gets a --gidmap argument but not a --uidmap argument, it copies the mapping. With the new syntax this copying does not make sense. Having a GID subordinated to us does not imply that the same UID will be subordinated as well.

This means, that when we wanted to use:

    --gidmap 0:0:5 --gidmap 5:6:15000 --gidmap 20000:5:1

We also had to include:

    --gidmap 0:0:5 --gidmap 5:6:15000 --gidmap 20000:5:1 --uidmap 0:0:65000

making everything even harder to understand without proper context.

In this series of patches, when a "break and insert" gidmap is given (using the described `+` syntax) without a --uidmap, we assume that we want the "identity mapping" as --uidmap (0:0:65000).

To preserve backwards compatibility, this different default mapping is only used when the `+` syntax is used, so users who rely on the previous behaviour don't suffer any changes.

Signed-off-by: Sergio Oller <[email protected]>
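To make the "break the gidmap at 5" step concrete, here is a small sketch that reproduces the hand-written full gidmap from the commit message above. It mirrors the manual procedure only; it is not podman's implementation of the `+` syntax. The function names and the `intermediate_count` parameter (how many intermediate IDs we choose to map, 15006 in this example) are assumptions made for illustration.

```python
def manual_gidmap(target_container_id, intermediate_id, intermediate_count):
    """Build (container_start, intermediate_start, length) triples by hand.

    One intermediate ID is pulled out and sent to target_container_id; the
    remaining intermediate IDs are packed, in order, from container ID 0.
    """
    if not 0 <= intermediate_id < intermediate_count:
        raise ValueError("intermediate_id is outside the intermediate range")
    packed = intermediate_count - 1  # container IDs 0..packed-1 get used
    if target_container_id < packed:
        raise ValueError("target container ID collides with the packed range")
    triples = []
    if intermediate_id > 0:
        # Everything below the pulled-out ID keeps its position.
        triples.append((0, 0, intermediate_id))
    rest = intermediate_count - intermediate_id - 1
    if rest > 0:
        # Everything above it shifts down by one container ID.
        triples.append((intermediate_id, intermediate_id + 1, rest))
    triples.append((target_container_id, intermediate_id, 1))
    return triples


def as_flags(triples, option="--gidmap"):
    """Render the triples as command-line options."""
    return " ".join(f"{option} {c}:{i}:{n}" for c, i, n in triples)


print(as_flags(manual_gidmap(20000, 5, 15006)))
```

With these inputs the three triples match the options written out by hand above, which shows how much bookkeeping a single remapped group requires.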
A friendly reminder that this issue had no activity for 30 days. |
Thanks bot. I have been busy with work but I haven't forgotten this, I just need a bit more time |
A friendly reminder that this issue had no activity for 30 days. |
@zeehio Reminder. |
Motivation
===========

This feature aims to make --uidmap and --gidmap easier to use, especially in rootless podman setups.

(I will focus here on the --gidmap option, although the same applies for --uidmap.)

In rootless podman, the user namespace mapping happens in two steps, through an intermediate mapping. See https://docs.podman.io/en/latest/markdown/podman-run.1.html#uidmap-container-uid-from-uid-amount for further detail; here is a summary:

First the user GID is mapped to 0 (root), and all subordinate GIDs (defined at /etc/subgid, and usually >100000) are mapped starting at 1.

One way to customize the mapping is through the `--gidmap` option, that maps that intermediate mapping to the final mapping that will be seen by the container.

As an example, let's say we have as main GID the group 1000, and we also belong to the additional GID 2000, that we want to make accessible inside the container.

We first ask the sysadmin to subordinate the group to us, by adding "$user:2000:1" to /etc/subgid.

Then we need to use --gidmap to specify that we want to map GID 2000 into some GID inside the container.

And here is the first trouble: Since the --gidmap option operates on the intermediate mapping, we first need to figure out where podman has placed our GID 2000 in that intermediate mapping using:

    podman unshare cat /proc/self/gid_map

Then, we may see that GID 2000 was mapped to intermediate GID 5. So our --gidmap option should include:

    --gidmap 20000:5:1

This intermediate mapping may change in the future if further groups are subordinated to us (or we stop having its subordination), so we are forced to verify the mapping with `podman unshare cat /proc/self/gid_map` every time, and parse it if we want to script it.

**The first usability improvement** we agreed on containers#18333 is to be able to use:

    --gidmap 20000:@2000:1

so podman does this lookup in the parent user namespace for us.

But this is only part of the problem. We must specify a **full** gidmap and not only what we want:

    --gidmap 0:0:5 --gidmap 5:6:15000 --gidmap 20000:5:1

This is becoming complicated. We had to break the gidmap at 5, because the intermediate 5 had to be mapped to another value (20000), and then we had to keep mapping all other subordinate ids... up to close to the maximum number of subordinate ids that we have (or some reasonable value). This is hard to explain to someone who does not understand how the mappings work internally.

To simplify this, **the second usability improvement** is to be able to use:

    --gidmap "+20000:@2000:1"

where the plus flag (`+`) states that the given mapping should extend any previous/default mapping, overriding any previous conflicting assignment.

Podman will set that mapping and fill the rest of mapped gids with all other subordinated gids, leading to the same (or an equivalent) full gidmap that we were specifying before.

One final usability improvement related to this is the following:

By default, when podman gets a --gidmap argument but not a --uidmap argument, it copies the mapping. This is convenient in many scenarios, since usually subordinated uids and gids are assigned in chunks simultaneously, and the subordinated IDs in /etc/subuid and /etc/subgid for a given user match.

For scenarios with additional subordinated GIDs, this map copying is annoying, since it forces the user to provide a --uidmap, to prevent the copy from being made. This means that when the user wants:

    --gidmap 0:0:5 --gidmap 5:6:15000 --gidmap 20000:5:1

The user has to include a uidmap as well:

    --gidmap 0:0:5 --gidmap 5:6:15000 --gidmap 20000:5:1 --uidmap 0:0:65000

making everything even harder to understand without proper context.

For this reason, besides the "+" flag, we introduce the "u" and "g" flags. Those flags applied to a mapping tell podman that the mapping should only apply to users or groups, and be ignored otherwise.

Therefore we can use:

    --gidmap "+g20000:@2000:1"

So the mapping only applies to groups and is ignored for uidmaps. If neither the "u" nor the "g" flag is given, podman assumes the mapping applies to both users and groups as before, so we preserve backwards compatibility.

Co-authored-by: Tom Sweeney <[email protected]>
Signed-off-by: Sergio Oller <[email protected]>
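As a reading aid for the flag syntax described in this commit message, the sketch below splits a mapping such as `+g20000:@2000:1` into its parts. It is a hypothetical parser, not podman's. It assumes the flags come before the container ID in any order, that `@` prefixes the host-side ID, and that the size field is always present.

```python
import re
from dataclasses import dataclass


@dataclass
class MappingSpec:
    extend: bool           # "+" flag: extend/override the default mapping
    applies_to_uids: bool  # "u" flag, or no u/g flag at all
    applies_to_gids: bool  # "g" flag, or no u/g flag at all
    container_id: int
    from_id: int
    from_is_host_id: bool  # "@" prefix: from_id is a host ID, not an intermediate ID
    size: int


# Assumed grammar for this sketch: [flags]container_id:[@]from_id:size
_SPEC = re.compile(r"^([+ug]*)(\d+):(@?)(\d+):(\d+)$")


def parse_mapping_spec(spec):
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"unrecognised mapping: {spec!r}")
    flags, container_id, at_sign, from_id, size = match.groups()
    no_kind_flag = "u" not in flags and "g" not in flags
    return MappingSpec(
        extend="+" in flags,
        applies_to_uids="u" in flags or no_kind_flag,
        applies_to_gids="g" in flags or no_kind_flag,
        container_id=int(container_id),
        from_id=int(from_id),
        from_is_host_id=at_sign == "@",
        size=int(size),
    )


print(parse_mapping_spec("+g20000:@2000:1"))
```

A mapping without flags, such as `0:0:5`, parses as applying to both users and groups, which is the backwards-compatible behaviour the commit message describes.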
Feature request description
This feature request describes how podman could allow rootless management of
additional host groups using better semantics.
One of the current issues with `--group-add keep-groups` is that while groups
are added, the groups are not mapped, so when listing the groups or a
mounted volume the overflowgid `nobody` appears. This is confusing.

Besides, some container images need the users to be mapped to operate, for
instance because they create processes based on user logins into the container and
those created processes do not have the unmapped groups set (because they
can't appear in the container's `/etc/group`). As an example the `rocker/rstudio`
image expects users to log in to RStudio Server through a web interface, and the
web logged-in user does not have those groups kept.
Suggest potential solution
This problem can be addressed if the system administrator is willing to give
some additional trust to users (giving the group as a subordinate GID) and
if podman improves its semantics a bit. This is what this feature request is
about.
Asking permission to the system administrator
The system administrator is able to subordinate arbitrary GID usage to a user
through `/etc/subgid`.

The most common usage of subordinate GIDs is for the sysadmin to give a large
non-overlapping chunk of unused GIDs to each user. Large GIDs are usually available
so we can easily give them away to users.

However, the sysadmin may also consider giving an existing GID as subordinate to
several users. These users can act as if they owned that GID.

Reads as:

This has security implications, as it essentially means that user `alice` can
"impersonate" GID 1001.
Telling rootless podman to map GID 1001
To run a container, we need to be able to act as if we had many users and groups
available. For instance, container images may include files owned by different
users and groups and as a regular user we should be able to create and manage
those files. The way to do this is for us (regular users) to create a mapping
between the UIDs and GIDs our container needs (or may need) and a subordinate
UID (GID) range we have.
By default, we may also want to map the root user and root group to our UID/GID
at the host.
This is what rootless podman essentially does:
There is no need for podman to map all the subordinate GIDs we have, as long
as all the GIDs the container needs are mapped to some GID we have.
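The default behaviour can be sketched in a few lines, under the assumption stated elsewhere in this thread that our own GID becomes 0 and the subordinate ranges are packed from 1 in the order they appear in /etc/subgid. The user name `alice`, her GID 1000 and the subgid contents are illustrative, and the helper names are hypothetical.

```python
def parse_subid(text, user):
    """Return the (start, count) ranges listed for `user` in subuid/subgid text."""
    ranges = []
    for line in text.splitlines():
        fields = line.strip().split(":")
        if len(fields) != 3 or fields[0] != user:
            continue  # skip blank lines, malformed lines and other users
        ranges.append((int(fields[1]), int(fields[2])))
    return ranges


def default_intermediate_map(own_id, ranges):
    """Own ID becomes 0; subordinate ranges are packed from 1 in file order."""
    mapping = [(0, own_id, 1)]
    next_id = 1
    for start, count in ranges:
        mapping.append((next_id, start, count))
        next_id += count
    return mapping


BEFORE = "alice:100000:65536\n"
AFTER = "alice:1001:1\nalice:100000:65536\n"  # sysadmin subordinates GID 1001

print(default_intermediate_map(1000, parse_subid(BEFORE, "alice")))
print(default_intermediate_map(1000, parse_subid(AFTER, "alice")))
```

In the second case host GID 1001 takes intermediate GID 1 and the large range shifts up by one, which is why a script with hard-coded `--gidmap` values stops being valid once the subordinate GIDs change.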
What we would like to do is to have control over host GID 1001. However, we
would like to map it not to an arbitrary GID in the 1-n range, but to a higher stable
GID the container does not actually use.
By choosing to map to a high GID (e.g. container GID 100000+1001), we avoid the risk of
overlapping with GIDs the container needs.
Podman should add a group entry into the container's `/etc/group`, with
the same group name as the host group 1001, with group id 101001 and with
the root user (or whatever user we have mapped onto) as a member, if we are
members of group 1001 in the host.
Suggested improvement
Semantically, it would be helpful to define something in `containers.conf` that
would tell podman to use host GID 1001 as a "higher container GID", that would not
collide with existing GIDs in the container image, and let podman add the
corresponding entry in `/etc/group`.

For instance, the `containers.conf` could have a field like:

If 1001 is in the `subgid` for the user running the container, this setting would
tell podman to map 1001 to `local_group_offset+1001`.
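A minimal sketch of the proposed rule follows, assuming a `local_group_offset` of 100000 as in the 101001 example. Nothing here exists in podman: the function names are hypothetical, and the group name `shared_data` is only a placeholder for whatever name group 1001 has on the host.

```python
def proposed_container_gid(host_gid, subordinate_ranges, local_group_offset=100_000):
    """Return local_group_offset + host_gid if host_gid is subordinated, else None."""
    for start, count in subordinate_ranges:
        if start <= host_gid < start + count:
            return local_group_offset + host_gid
    return None


def group_entry(name, gid, members):
    """Format one line of a group file: name:password:GID:member,member,..."""
    return f"{name}:x:{gid}:{','.join(members)}"


# Illustrative subordinate ranges: host GID 1001 plus the usual large chunk.
ranges = [(1001, 1), (100000, 65536)]
container_gid = proposed_container_gid(1001, ranges)
print(group_entry("shared_data", container_gid, ["root"]))
```

Because the container GID depends only on the host GID and the offset, it stays the same no matter how many other subordinate GIDs are later added or removed.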
.Have you considered any alternatives?
Currently podman treats all subordinate GIDs the same. This means that with
the `/etc/subgid` defined above, podman would map the host group 1001 to
the container GID 1, and the rest of subordinate GIDs would follow.

This is confusing, because GID 1 is usually assigned in many containers and it
is defined as the `bin` or `daemon` group or some other system group.

Besides, as shown in the reproducible example below, the usage of `--gidmap`
is not invariant to changes in the subgids, making the scripts of the rootless user
unnecessarily dependent on the subordinate GIDs available.
Additional context
No response