-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
procfs: add safe procfs API and harden proc operations #42
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
cyphar
force-pushed
the
procfs-hardening
branch
3 times, most recently
from
July 24, 2024 18:11
bb0c038
to
250bca5
Compare
cyphar
force-pushed
the
procfs-hardening
branch
10 times, most recently
from
July 28, 2024 10:56
30f4811
to
c3b8c39
Compare
The new We're only really missing tests for the helper |
cyphar
force-pushed
the
procfs-hardening
branch
2 times, most recently
from
July 28, 2024 15:41
8dd2a3e
to
332a551
Compare
This was referenced Jul 28, 2024
This will make matching against Error types in our tests much simpler. Maybe we will want to export this at some point. Signed-off-by: Aleksa Sarai <[email protected]>
At the moment this interface is not entirely safe, so it's not exported yet. It will gain hardening in future commits. Signed-off-by: Aleksa Sarai <[email protected]>
To protect against basic bind-mount attacks, we can use statx to fetch the mount id of the path we land on. If it is a bind-mount (even a procfs bind-mount) we'll detect it. This doesn't protect against: 1. Path components that are bind-mounts (either tmpfs or of symlinks) that jump to a different place in the original procfs. Fixing this requires adding a specialised resolver that uses openat2 or a restricted subset of resolvers::opath. 2. Mounts occurring on the host that race with open_follow. Unfortunately, there is no way to re-open a magic-link and so there is a race-window where a mount could be applied on top of the magic-link after we check the stx_mnt_id of the link but before we resolve it. The only real solution is to create a private procfs (with fsopen(2) or open_tree(2)) which cannot have racing mounts. Both issues are fixed in future patches. Signed-off-by: Aleksa Sarai <[email protected]>
This logic is taken from ProcThreadSelf in runc and filepath-securejoin. Some old RHEL systems have pre-3.17 kernels which require this workaround. Unfortunately, runc has to do /proc/... operations after joining a PID namespace but while still using the host /proc, which results in errors if we just try /proc/$pid/task/$tid (the TIDs are from our PID namespace but the TIDs used by the procfs are based on the PID namespace used when it was mounted, which means that host PID namespace in most cases). In this case, the only other option is to just use /proc/self and hope nothing breaks. We could panic, but users would probably then want to use an API that does this fallback anyway (and there's no real way to conclusively verify that /proc/self is "okay" to use). Signed-off-by: Aleksa Sarai <[email protected]>
This is loosely based on the opath resolver, with support for using openat2 if possible. The new procfs_beneath resolver currently only supports doing O_PATH (and O_DIRECTORY) lookups. This is enough for our needs within libpathrs but when we expose ProcfsHandle as a public API we will need to support more modes. Unfortunately, we can't just do a /proc/self/fd/... re-open because the resolver is used within the ProcfsHandle implementation to do reopening. Signed-off-by: Aleksa Sarai <[email protected]>
(This is adapted from filepath-securejoin.) Because we depend on procfs to be correct when operating on other filesystems, having a safe procfs handle is vital. Unfortunately, if we are an administrative program running inside a container that can modify its mount configuration, our current checks are not sufficient to avoid being tricked into thinking a path is real. Luckily, with fsopen() and open_tree() it is possible to create a completely private procfs instance that an attacker cannot modify. Note that while they are both safe, they are safe for different reasons: 1. fsopen() is safe because the created mount is completely separate to any other procfs mount, so any changes to mounts on the host are irrelevant. fsopen() can fail if we trip the mnt_too_revealing() check, so we may have to fall back to open_tree() in some cases. 2. open_tree() creates a clone of a snapshot of the mount tree (or just the top mount if can avoid using AT_RECURSIVE, but mnt_too_revealing() may force us to use AT_RECURSIVE). While the tree we clone might have been messed with by an attacker, after cloning there is no way for the attacker to affect our clone (even mount propagation won't propagate into a clone[1]). The only risk is whether there are over-mounts with AT_RECURSIVE. Because anonymous mounts don't show up in mountinfo, the best we can do is check the mount id through statx to see whether a target has an overmount (this is identical to the logic we already had, but at least it is now safe against races). openat2 is sufficient for non-magic-links but for magic-links we need to check this explicitly. listmounts(2) would let us detect any overmounts at creation time, but it's not clear whether you want to error out if there is a mount over a path you never use (lxcfs has overmounts of /proc/cpuinfo and /proc/meminfo, which we never use). Unfortunately, we can only use these when running as a privileged user. In theory we could create a user namespace to gain the necessary privileges to create these handles, but this would require spawning a proper subprocess (CLONE_NEWUSER must be done in a single-threaded program) which won't always work when libpathrs is used as a library. It's also far too complicated to deal with in practice. [1]: This is true since at least Linux 5.12. See commit ee2e3f50629f ("mount: fix mounting of detached mounts onto targets that reside on shared mounts"). Signed-off-by: Aleksa Sarai <[email protected]>
Provide some nice APIs to do procfs operations in a safe way. In theory we could just provide pathrs_proc_open(), but so many people will want to do pathrs_proc_readlink() that it makes sense to just provide it (to ensure users don't misuse readlinkat() and accidentally make the operation unsafe). Here are some toy examples, based on some of the usecases we have for these APIs in runc. 1. In order to get a reference to /proc/self/exe we can use for doing a re-exec or doing memfd binary cloning, we can use pathrs_proc_open(): /* Safely get an fd to /proc/self/exe. */ int get_self_exe() { /* This follows the trailing magic-link! */ int fd = pathrs_proc_open(PATHRS_PROC_SELF, "exe", O_PATH); if (fd < 0) { pathrs_error_t *error = pathrs_errorinfo(fd); /* ... print the error ... */ pathrs_errorinfo_free(error); return -1; } return fd; } 2. When writing to AppArmor labels, we want to be absolutely sure there are no bind-mounts that would trick us into not actually writing the labels. The key risk is something like a bind-mount of /proc/self/sched (which is a procfs file that is no-op for all writes) or a bind-mount. int write_apparmor_label(char *label) { int fd, err, saved_errno = 0; fd = pathrs_proc_open(PATHRS_PROC_SELF, "attr/exec", O_WRONLY|O_NOFOLLOW); if (fd < 0) { pathrs_error_t *error = pathrs_errorinfo(fd); /* ... print the error ... */ pathrs_errorinfo_free(error); return -1; } err = write(fd, label, strlen(label)); close(fd); return err; } 3. Sometimes we need to get the "real" path of a given file descriptor. This is fundamentally racy, and you would only really want to do this for debugging information, but it is something you need to do sometimes: char *get_unsafe_path(int fd) { char *fdpath; if (asprintf(&fdpath, "fd/%d", fd) < 0) return NULL; int linkbuf_size = 128; char *linkbuf = malloc(size); if (!linkbuf) goto err; for (;;) { int len = pathrs_proc_readlink(PATHRS_PROC_THREAD_SELF, fdpath, linkbuf, linkbuf_size); if (len < 0) { pathrs_error_t *error = pathrs_errorinfo(fd); /* ... print the error ... */ pathrs_errorinfo_free(error); goto err; } if (len <= linkbuf_size) break; linkbuf_size = len; linkbuf = realloc(linkbuf, linkbuf_size); if (!linkbuf) goto err; } free(fdpath); return name_buf; err: free(fdpath); free(name_buf); return NULL; } Signed-off-by: Aleksa Sarai <[email protected]>
The main Go-specific thing we need to keep in mind when opening files from /proc/thread-self is that we have to runtime.LockOSThread to avoid hitting cases where we are migrated to a different thread and the original thread dies while we're using the handle (for most procfs files this results in spurious errors or blank files). Signed-off-by: Aleksa Sarai <[email protected]>
Signed-off-by: Aleksa Sarai <[email protected]>
These tests are based on the filepath-securejoin tests, and notably include tests of the key cases when operating on procfs. Signed-off-by: Aleksa Sarai <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
A toy example of using this API to do a few operations runc needs to do internally:
Fixes #7
Fixes #15
Signed-off-by: Aleksa Sarai [email protected]