-
Notifications
You must be signed in to change notification settings - Fork 316
Introduce support for syscall filtering in containers #237 #384
Conversation
@mheon The api branch was just merged into master. Would you mind rebasing? Thanks! |
@mrunalp Rebased, though it looks like I might have broken the test suite along the way - looking into it. |
@mheon Could you rebase again? Github is complaining about conflicts. I will test this out today. |
848420b
to
ee60d35
Compare
@mrunalp Rebased, unit tests passing. Worth noting that there's a bug in libseccomp upstream at the moment which breaks conditional syscall blocking. A fix is incoming, but for the moment there are no tests on blocking a syscall based on multiple conditions |
@mheon Thanks! |
The seccomp tests aren't actually getting run in jenkins because of the seccomp tag. |
"user": "daemon", | ||
"seccomp_config": { | ||
"enable": true, | ||
"whitelist": false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whitelist_toggle?
I've added the seccomp tag to the Dockerfile so the tests should be running automatically now. There were previously issues with running Seccomp tests on the CI server, but these seem to be resolved. It should even be possible to remove the seccomp tag completely at this point, but it could still be desirable to build libcontainer without libseccomp support. |
Rebased on latest master. Some changes in integration tests, which broke on the new version of busybox. |
@mheon Thanks! I will try to get to this by end of today. |
@crosbymichael @LK4D4 ping |
I took a quick look and overall looks good. |
@mheon Can you rebase pls? |
@gabrtv here you go :) |
@LK4D4 Rebased |
@mheon I have question. Is this makes sense to release goseccomp as separate package under your name? |
Conditions []SyscallCondition `json:"conditions,omitempty"` | ||
} | ||
|
||
type SeccompConfig struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer call this type Config
because seccomp
part always obvious from package name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here probably docstrings too. At least for WhitelistToggle
@LK4D4 From an upstream perspective, it makes things easier for the Libseccomp maintainers. However, given it does create some issues with vendoring the bindings, I'd be willing to maintain a copy of the bindings independent of the main repository. |
) | ||
|
||
var ( | ||
actAllow libseccomp.ScmpAction = libseccomp.ActAllow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
libseccomp.ScmpAction
is redundant, type will be inferred from right side.
Can someone please explain why including the libseccomp docker bindings with the rest of libseccomp is problematic? I'm willing to make reasonable adjustments, but no one has explained the problem. |
@pcmoore All the Go dependencies for Libcontainer are included in this repository. The way of consistently acquiring or updating this externally-available dependencies is by a vendoring script, which clones their Git repositories. If the Libseccomp Go bindings are in the Libseccomp repository, this would grab the entire repository, including all the C source code and test suite. Ideally we'd just want the Go bindings. Right now, my solution is to clone the entire repo, and then remove every file that's not the Go bindings and license, which is kind of gross. |
@pcmoore @mheon Based on the libct backend dependencies (C + go) that we will end up pulling into libcontainer, I think it should be fine to pull all of libseccomp. |
@mrunalp I don't care. |
This PR introduces the ability to filter system calls on a per-container basis on Linux, using libseccomp to support multiple architectures. This adds another layer of security between containers and the kernel. System calls which are unnecessary in a container or problematic from a security perspective can be restricted to prevent their use. Most of the truly problematic syscalls are already restricted by dropping capabilities; this adds an additional, finer-grained layer of protection. This PR adds a vendored library dependency (Go bindings for libseccomp) and a build dependency on libseccomp >= v2.1. The actual changes to libcontainer are fairly minimal, most of the delta is in the libseccomp bindings. Docker-DCO-1.1-Signed-off-by: Dan Walsh <[email protected]> (github: rhatdan) Docker-DCO-1.1-Signed-off-by: Matt Heon <[email protected]> (github: mheon)
Also update library itself to latest master. Signed-off-by: Matthew Heon <[email protected]>
Signed-off-by: Matthew Heon <[email protected]>
Requested changes have been made. Most notably, we pull in the entire Libseccomp repository now, meaning the diffstat just exploded. |
Signed-off-by: Matthew Heon <[email protected]>
Okay, so all is good now with respect to the repo? |
Certainly would be good to get this into docker-1.7. |
@rhatdan agreed. I think the big question is what a default profile will look like. |
@diogomonica I think Matt (And I) have come up with a pretty good default blocking profile. Mostly stolen from other container tools and through suggestions from kernel developers. If we could at least start getting some of this merged, then we could release an experimental patch to Rawhide, to see if anyone sees issues with the default list of blocked syscalls. |
@rhatdan There were some licensing concerns around merging this. (LGPL). |
@mrunalp We're looking into it. I wouldn't be opposed to relicensing, but the bindings are going in the upstream repository as well, so we want to examine options there. Updates on that (hopefully) soon! On defaults: I think the most sane policy for a default blocklist is a blacklist, as I think we should allow syscalls introduced by new kernel versions by default, and in general whitelisting is more cumbersome than blacklisting for syscalls (an absolutely minimal whitelist to get from Docker to "return 0" is around 25 syscalls, and that goes up sharply in number if you want to do basic I/O, file operations, networking, etc). |
Can we take a look at this PR #529, it made it in another way. |
Interesting, it handles the seccomp BPF filter generation completely in Golang. I'm obviously a little biased so I won't comment on which approach is better, but I will mention that just a quick glance at the code in PR #529 revealed a few issues/concerns:
I suspect if I look closer there are more issues as well, not to mention the long term maintenence issues (who will keep the syscall tables updated?). |
To me it would seem prudent to merge in libseccomp support and then allow the golang implementation to grow in support til it matches libseccomp. At that point, I would suggest we look at this again. But I really do not want this effort to get derailed. |
@rhatdan agreed. This said, BPF filter generation completely in Golang is totally the right way to go. |
@mheon what about the license? Even if it's in upstream LGPL does not make sense for Go because of the static compiles. |
@crosbymichael I'd be fine with relicensing the bindings, though I'd have to go through @pcmoore to verify that he's alright with it as well. Big issue from my perspective is that the actual libseccomp library, itself LGPL, is going to have to be statically-linked into Docker. |
@mheon we can dynamically link it into docker but i still don't think it makes sense for Go because every go pkg is static. what do you think? |
Why is this even a problem. Don't we already staticly link in libc.so which is LGPL? |
@rhatdan I think no, we're not linking libc, that's why we have our own |
#613 was merged with a pure go implementation. Thanks for your help and input on this feature. |
We would like to question this. We have strong reservations about the go lang bindings. |
To add to this: I think the comments from @pcmoore on #529 provide a good summary of our reasoning for an alternative implementation (ease of maintenance, feature completeness, performance). This PR hadn't seen much love in a long time (I'd moved on to other projects and not found time to rebase/solve the licensing issues). However, the way forward is fairly clear (relicense the bindings to a non-copyleft license... I was thinking BSD?), and I think I could have an updated version free of licensing hurdles by the end of tomorrow. I'd like this at least to be considered as an alternative before locking in to supporting a Golang implementation for the foreseeable future. |
Obsoletes #263
This introduces Seccomp system call filtering, which restricts system calls, preventing them from being made inside containers.
Compared to the previous pull request, several changes have been made:
The vendored bindings for libseccomp are less than optimal right now - they're located in the main libseccomp repository, so checking them out also checks out fifty-odd C files. If there are suggestions for a better way of doing things, I'd be open to changing this.