Add --overlay and related options #547
Please add a Signed-off-by to your commit(s) to indicate that you agree to the Developer Certificate of Origin terms.
The commit message should have |
742d3af
to
f995ca0
Compare
f995ca0
to
09429c6
Compare
bc8044a
to
23ff0f8
Compare
The only outstanding thread here should be getting feedback on my reasoning for option ordering and naming. Please let me know what you think and if there's anything I can do to move this forward. |
Thanks for updating this. A re-review is on my list, but the PR is non-trivial and my list is not short.
I ran into a race when running two bwrap processes consecutively using the same overlay, and the kernel has
On my Arch Linux system, this example:
#!/usr/bin/env sh
set -eu
rm -rf "$HOME/bwrap_ofs_cleanup_test"
mkdir "$HOME/bwrap_ofs_cleanup_test" && cd "$HOME/bwrap_ofs_cleanup_test"
mkdir -p rw wrk
bwrap --unshare-user --unshare-pid --dev-bind / / --overlay-src "$PWD" --overlay "$PWD"/rw "$PWD"/wrk "$PWD" \
    dd if=/dev/zero of=dummy bs=1M count=10 status=none
bwrap --unshare-user --unshare-pid --dev-bind / / --overlay-src "$PWD" --overlay "$PWD"/rw "$PWD"/wrk "$PWD" \
    echo SUCCESS
fails with this error:
And the error on
Inserting a short sleep or a
Cause
I have taken a look and I believe that the cause is: Since I'm using Trying
|
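The sleep-based workaround mentioned above could be generalized into a small retry wrapper for scripts that run bwrap back to back on the same overlay. This is only a sketch: `retry_cmd` and its attempt and delay parameters are hypothetical names, not part of bubblewrap, and the wrapper papers over the race rather than fixing it.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of bubblewrap): run a command and, if it
# fails, retry it a limited number of times with a pause in between.
# usage: retry_cmd ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_cmd() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        # Success on any attempt ends the loop immediately.
        "$@" && return 0
        status=$?
        # Out of attempts: report the last exit status of the command.
        if [ "$n" -ge "$attempts" ]; then
            return "$status"
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}
```

With this in place, the second bwrap invocation in the reproducer could be run as `retry_cmd 5 1 bwrap` followed by its original arguments. Note that POSIX sh has no local variables, so the helper's variable names leak into the calling script.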
Honestly, this is what I thought Maybe it could be enabled by a separate flag? I would imagine that waiting for everything in the sandbox to be well and truly finished is a feature that would be useful in other cases. |
TBH I thought the same, but you can see that that's not the case by running I guess it could be useful if you are launching something that daemonizes, but it's going to be trouble for anything that locks a resource like the overlayfs inode index. So I also think that having flag(s) to wait or kill all processes instead of just for the main process could be useful for those scenarios. Also, I suppose the |
FWIW: if pid1 exists (no matter how) everything in the pid namespace will get a SIGKILL. I.e. there is no way pid1 has exited and daemons are still running. Of course you only have a pid1 when you use unshare-pid. |
You mean if pid 1 exits, I think (unfortunately either word makes sense here, but only one is correct). Yes, if we are unsharing the pid namespace for the sandbox, when pid 1 within that namespace exits (that's the user-specified process if you used However, I don't know whether they are killed synchronously before pid 1 is allowed to exit, or whether pid 1 can exit while the overlayfs resource is still in use...
If there is any situation in which the reaper process would not exit promptly, then that would make sense as an opt-in thing. Or if the reaper process would always exit promptly in any case, then it might as well be on by default, in all cases where the reaper process gets run. (In situations where the reaper is not run, of course we must not wait for it.) If this is likely to be a practical problem, it might make sense to have an Or, you might find that your use-case is better served by creating a long-running sandbox containing an IPC server, and then poking commands into it, analogous to |
If I understand the relevant terms correctly, we know that the reaper process doesn't exit until all children have terminated (the
It might indeed make sense to have this sort of option (as well as for the other options overlayfs exposes—there are a few of them), but I'm reluctant to design and implement them without an actual need. Turning off the index for this issue is more of a workaround for not having |
I found the problem while trying to create a sandbox wrapper over an existing command line tool ( Perhaps my specific use case is a bit obscure, but I think the idea of being able to use overlays in drop-in tool wrappers that are used from scripts is something that ought to work.
I agree that creating a new issue/PR is the right approach, ultimately the cause of the problem I ran into is not related to this PR, but rather a more general issue with the exit/cleanup code and locked resources, and the conditions to run into it are pretty specific.
Agreed, I think bwrap should strive for simplicity and turning off the overlayfs index is just a workaround. If someone needs to tune their mounts for some specific use case, this can be done like in #412 (comment). |
23ff0f8
to
31d8b9e
Compare
A few nits but this looks great! Just from a simplicity perspective I think it would be easier to follow what's going on if the ops were added directly and then processed in The other big thing is how a user like flatpak should test for availability of this feature. How is this done with other features? |
@smcv, in #596 (comment) you said ‘solving its remaining issues’. As far as I'm aware the only outstanding issue here is your decision on whether the design choices we discussed in #547 (comment) are acceptable. Is there anything else for anyone to do? Let me know and I'll be happy to work on it. |
FWIW, the order as implemented right now makes a lot of sense when you build the command line for it from a program like flatpak. |
I have a flatpak branch now which is basically doing Letting bwrap chown the directory doesn't work and fails with EINVAL. For me
or just bind mounting it without the tmp overlay like this:
In both cases $masking-dir contains the file Unfortunately the masking dir layer is on the oldroot. What I really want is a way to add a layer from the newroot:
This also works for bind mounting:
Long story short, I think an option to create an overlayfs layer from a directory from the newroot would be really helpful. |
I've rebased onto the 0.9.0 release and implemented the @smcv, any chance this can get your eyes again? |
Sorry, this got preempted by a security vulnerability in Flatpak which required a new bubblewrap feature to be added under embargo. Please could you rebase onto 0.10.0? Is this otherwise in a ready-to-review state, or are there implementation concerns remaining? Please mark the threads that various people have raised as "Resolved", if you believe those concerns have been addressed (or if you can't do that because you aren't a maintainer, the next best thing is to reply to the thread and say "Resolved"). |
Originally I was trying to ensure some level of isolation, but now I realize that's contrary to my original goal: a clean $HOME. Given that, I have changed the bwrap to bind all relevant mounts from root and the user home. In order to prevent propagation of the managed configs, directories are recursively bind mounted, which at the moment is incurring a performance penalty, but may be improved if containers/bubblewrap#547 gets merged.
15eb0b5
to
3f10549
Compare
Rebased and ready for review. |
3f10549
to
8195a26
Compare
Thank you for your patience with this!
8195a26
to
b927140
Compare
Sorry if I have to barge in again, but this seems to be a related issue: Overlayfs (in the kernel) seems to have a few lines of code where it checks whether the user and group of a file that is modified have a mapping in the current userns (see torvalds/linux@4f11ada). In my use case (trying to alter files of a different uid, but same gid) this leads to various errors with the error text "Value too large for defined data type" (which is imho not really helpful). So far I don't see how I can make bwrap identity-map all users and groups (except maybe for root) into the user namespace used to set up the unprivileged mount namespace, in order to not trigger this error in overlayfs. (The solution might even be that bwrap does this automatically when using an overlayfs?) Any thoughts? |
As far as I'm aware, this is impossible: the kernel does not allow it. While unprivileged, we are only allowed to map one uid and one gid (our own uid and primary gid). Everyone else is mapped to the overflow uid/gid (usually Sorry, but in this project we have a strict policy of not doing impossible things.
We do not have any control over that error message, it comes from the kernel (which reports Unfortunately, kernel-level error reporting does not have a way to clarify the real meaning of error messages like this. You can see other effects of this design in various places inside bubblewrap, for example where we have to substitute a different error message when So I think your use-case is going to be impossible, even after merging this PR. You can't use the overlay to modify files that are owned by a different user. The closest you can get is that if you know in advance which files will be modified, you can make a copy in advance that is owned by you, and mount it over the top of the original file. For more advanced use-cases with multiple uids, please consider using a fully-featured container manager like podman instead. |
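To see in advance which files under a lower layer would run into this limitation, one could list everything that is not owned by the invoking uid and gid, since those are the only IDs that can be mapped while unprivileged. A minimal sketch, assuming GNU find's `-uid` and `-gid` tests are available; `list_unmappable` is a hypothetical name, not something bubblewrap provides:

```shell
#!/usr/bin/env sh
# Hypothetical helper: print every path under DIR whose owner or group
# differs from the caller's own uid / gid. Per the discussion above, these
# are the files an unprivileged overlay would be unable to modify.
# usage: list_unmappable DIR
list_unmappable() {
    dir=$1
    # -uid / -gid are GNU find extensions that compare numeric IDs.
    find "$dir" \( ! -uid "$(id -u)" -o ! -gid "$(id -g)" \) -print
}
```

Any path this prints is a candidate for the copy-in-advance approach described above: copy it somewhere you own and mount the copy over the original.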
Fair point. I didn't know that.
No need to be an obnoxious muppet. ❤️
Okay, buddy. Just for you in simple terms: I pointed to the source code of the kernel, because the check there was relevant to my comment. I did complain about the error message as a general side note, not with the expectation for you to fix it. I wanted to point out how the issue I ran into with this PR manifests, to hopefully prevent this happening to others further down the line. There's really no need to be so condescending.
Fair point. I thought accessing files of the same group is not too extravagant of a use-case, but as this seems to require a lot more preparation, I can see how it is out of scope. I'll think about your suggestion. Thanks. |
I'm sorry I didn't initially understand that. There does seem to be a tendency in other issue reports for bubblewrap users to assume that any container problem is a bug that should be fixed in bubblewrap (including kernel limitations that are outside our control), so it's difficult to distinguish between that expectation and an awareness that not everything can be in-scope here. |
Back to the subject of this PR, the updated version with overflow checks is looking good - this is a relatively complicated addition, so I want to give it one more review pass from first principles before hitting merge, but it's certainly close. |
b927140
to
94f8aa9
Compare
I pushed a version that has been rebased onto #660 (using true and false for booleans).
94f8aa9
to
5e9f066
Compare
I think the last thing blocking this now is that it needs testing in older build environments, to make sure the test is run or skipped appropriately. I'll aim to do that soon. Fixing #547 (comment) would be nice but is not a blocker. |
This commit adds --overlay, --tmp-overlay, --ro-overlay, and --overlay-src options to enable bubblewrap to create overlay mounts. These options are only permitted when bubblewrap is not installed setuid.

Resolves: containers#412
Co-authored-by: William Manley <[email protected]>
Signed-off-by: Ryan Hendrickson <[email protected]>
[smcv: Fix merge conflicts with containers#660]
Signed-off-by: Simon McVittie <[email protected]>
5e9f066
to
f371022
Compare
This commit adds --overlay, --tmp-overlay, --ro-overlay, and --overlay-src options to enable bubblewrap to create overlay mounts. These options are only permitted when bubblewrap is not installed setuid.
This is a continuation/partial rewrite of #167, addressing the feedback given there.
Other improvements of note:
--tmp-overlay is a new contribution for a use case I frequently have: I want to mount an overlay that is writable but I don't want to persist the writes between runs.
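As an illustration of that use case, the sketch below wraps a command so that a directory is writable inside the sandbox while all writes are thrown away on exit. It assumes that `--tmp-overlay` takes only the destination path and uses the preceding `--overlay-src` as its lower layer, which should be checked against the bwrap man page for the installed version. `run_with_tmp_overlay` and the `BWRAP` override variable are hypothetical names introduced here.

```shell
#!/usr/bin/env sh
# Hypothetical wrapper: run COMMAND with DIR overlaid by a throwaway
# writable layer, so changes under DIR do not persist between runs.
# usage: run_with_tmp_overlay DIR COMMAND [ARGS...]
run_with_tmp_overlay() {
    dir=$1
    shift
    # BWRAP is an assumed override hook (e.g. BWRAP=echo for a dry run).
    # --unshare-user and --dev-bind / / mirror the reproducer in this thread.
    "${BWRAP:-bwrap}" --unshare-user --dev-bind / / \
        --overlay-src "$dir" --tmp-overlay "$dir" \
        "$@"
}
```

Setting `BWRAP=echo` before calling the function prints the argument list instead of launching a sandbox, which is a cheap way to inspect the generated options.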