procfs: add safe procfs API and harden proc operations #42

cyphar · 2024-07-24T15:56:09Z

A toy example of using this API to do a few operations runc needs to do internally:

int get_self_exe()
{
	/* This follows the trailing magic-link! */
	int fd = pathrs_proc_open(PATHRS_PROC_SELF, "exe", O_PATH);
	if (fd < 0) {
		pathrs_error_t *error = pathrs_errorinfo(fd);
		/* ... print the error ... */
		pathrs_errorinfo_free(error);
		return -1;
	}
	return fd;
}

int write_apparmor_label(char *label)
{
	int fd, err, saved_errno = 0;

	fd = pathrs_proc_open(PATHRS_PROC_SELF, "attr/exec", O_WRONLY|O_NOFOLLOW);
	if (fd < 0) {
		pathrs_error_t *error = pathrs_errorinfo(fd);
		/* ... print the error ... */
		pathrs_errorinfo_free(error);
		return -1;
	}

	err = write(fd, label, strlen(label));
	close(fd);
	return err;
}

char *get_unsafe_path(int fd)
{
	char *fdpath;

	if (asprintf(&fdpath, "fd/%d", fd) < 0)
		return NULL;

	int linkbuf_size = 128;
	char *linkbuf = malloc(size);
	if (!linkbuf)
		goto err;
	for (;;) {
		int len = pathrs_proc_readlink(PATHRS_PROC_THREAD_SELF, fdpath, linkbuf, linkbuf_size);
		if (len < 0) {
			pathrs_error_t *error = pathrs_errorinfo(fd);
			/* ... print the error ... */
			pathrs_errorinfo_free(error);
			goto err;
		}

		if (len <= linkbuf_size)
			break;

		linkbuf_size = len;
		linkbuf = realloc(linkbuf, linkbuf_size);
		if (!linkbuf)
			goto err;
	}

	free(fdpath);
	return name_buf;

err:
	free(fdpath);
	free(name_buf);
	return NULL;
}

Fixes #7
Fixes #15
Signed-off-by: Aleksa Sarai [email protected]

cyphar · 2024-07-28T11:05:24Z

The new O_PATH resolver has extensive tests, as does ProcfsHandle (based on the filepath-securejoin test suite but more extensive).

We're only really missing tests for the helper Root::... methods. The C API is not included because it isn't called from Rust. In theory we could add test cases using the C API from Rust, but we can look at whether that makes sense once we have tests for everything.

This will make matching against Error types in our tests much simpler. Maybe we will want to export this at some point. Signed-off-by: Aleksa Sarai <[email protected]>

At the moment this interface is not entirely safe, so it's not exported yet. It will gain hardening in future commits. Signed-off-by: Aleksa Sarai <[email protected]>

To protect against basic bind-mount attacks, we can use statx to fetch the mount id of the path we land on. If it is a bind-mount (even a procfs bind-mount) we'll detect it. This doesn't protect against: 1. Path components that are bind-mounts (either tmpfs or of symlinks) that jump to a different place in the original procfs. Fixing this requires adding a specialised resolver that uses openat2 or a restricted subset of resolvers::opath. 2. Mounts occurring on the host that race with open_follow. Unfortunately, there is no way to re-open a magic-link and so there is a race-window where a mount could be applied on top of the magic-link after we check the stx_mnt_id of the link but before we resolve it. The only real solution is to create a private procfs (with fsopen(2) or open_tree(2)) which cannot have racing mounts. Both issues are fixed in future patches. Signed-off-by: Aleksa Sarai <[email protected]>

This logic is taken from ProcThreadSelf in runc and filepath-securejoin. Some old RHEL systems have pre-3.17 kernels which require this workaround. Unfortunately, runc has to do /proc/... operations after joining a PID namespace but while still using the host /proc, which results in errors if we just try /proc/$pid/task/$tid (the TIDs are from our PID namespace but the TIDs used by the procfs are based on the PID namespace used when it was mounted, which means that host PID namespace in most cases). In this case, the only other option is to just use /proc/self and hope nothing breaks. We could panic, but users would probably then want to use an API that does this fallback anyway (and there's no real way to conclusively verify that /proc/self is "okay" to use). Signed-off-by: Aleksa Sarai <[email protected]>

This is loosely based on the opath resolver, with support for using openat2 if possible. The new procfs_beneath resolver currently only supports doing O_PATH (and O_DIRECTORY) lookups. This is enough for our needs within libpathrs but when we expose ProcfsHandle as a public API we will need to support more modes. Unfortunately, we can't just do a /proc/self/fd/... re-open because the resolver is used within the ProcfsHandle implementation to do reopening. Signed-off-by: Aleksa Sarai <[email protected]>

(This is adapted from filepath-securejoin.) Because we depend on procfs to be correct when operating on other filesystems, having a safe procfs handle is vital. Unfortunately, if we are an administrative program running inside a container that can modify its mount configuration, our current checks are not sufficient to avoid being tricked into thinking a path is real. Luckily, with fsopen() and open_tree() it is possible to create a completely private procfs instance that an attacker cannot modify. Note that while they are both safe, they are safe for different reasons: 1. fsopen() is safe because the created mount is completely separate to any other procfs mount, so any changes to mounts on the host are irrelevant. fsopen() can fail if we trip the mnt_too_revealing() check, so we may have to fall back to open_tree() in some cases. 2. open_tree() creates a clone of a snapshot of the mount tree (or just the top mount if can avoid using AT_RECURSIVE, but mnt_too_revealing() may force us to use AT_RECURSIVE). While the tree we clone might have been messed with by an attacker, after cloning there is no way for the attacker to affect our clone (even mount propagation won't propagate into a clone[1]). The only risk is whether there are over-mounts with AT_RECURSIVE. Because anonymous mounts don't show up in mountinfo, the best we can do is check the mount id through statx to see whether a target has an overmount (this is identical to the logic we already had, but at least it is now safe against races). openat2 is sufficient for non-magic-links but for magic-links we need to check this explicitly. listmounts(2) would let us detect any overmounts at creation time, but it's not clear whether you want to error out if there is a mount over a path you never use (lxcfs has overmounts of /proc/cpuinfo and /proc/meminfo, which we never use). Unfortunately, we can only use these when running as a privileged user. In theory we could create a user namespace to gain the necessary privileges to create these handles, but this would require spawning a proper subprocess (CLONE_NEWUSER must be done in a single-threaded program) which won't always work when libpathrs is used as a library. It's also far too complicated to deal with in practice. [1]: This is true since at least Linux 5.12. See commit ee2e3f50629f ("mount: fix mounting of detached mounts onto targets that reside on shared mounts"). Signed-off-by: Aleksa Sarai <[email protected]>

Provide some nice APIs to do procfs operations in a safe way. In theory we could just provide pathrs_proc_open(), but so many people will want to do pathrs_proc_readlink() that it makes sense to just provide it (to ensure users don't misuse readlinkat() and accidentally make the operation unsafe). Here are some toy examples, based on some of the usecases we have for these APIs in runc. 1. In order to get a reference to /proc/self/exe we can use for doing a re-exec or doing memfd binary cloning, we can use pathrs_proc_open(): /* Safely get an fd to /proc/self/exe. */ int get_self_exe() { /* This follows the trailing magic-link! */ int fd = pathrs_proc_open(PATHRS_PROC_SELF, "exe", O_PATH); if (fd < 0) { pathrs_error_t *error = pathrs_errorinfo(fd); /* ... print the error ... */ pathrs_errorinfo_free(error); return -1; } return fd; } 2. When writing to AppArmor labels, we want to be absolutely sure there are no bind-mounts that would trick us into not actually writing the labels. The key risk is something like a bind-mount of /proc/self/sched (which is a procfs file that is no-op for all writes) or a bind-mount. int write_apparmor_label(char *label) { int fd, err, saved_errno = 0; fd = pathrs_proc_open(PATHRS_PROC_SELF, "attr/exec", O_WRONLY|O_NOFOLLOW); if (fd < 0) { pathrs_error_t *error = pathrs_errorinfo(fd); /* ... print the error ... */ pathrs_errorinfo_free(error); return -1; } err = write(fd, label, strlen(label)); close(fd); return err; } 3. Sometimes we need to get the "real" path of a given file descriptor. This is fundamentally racy, and you would only really want to do this for debugging information, but it is something you need to do sometimes: char *get_unsafe_path(int fd) { char *fdpath; if (asprintf(&fdpath, "fd/%d", fd) < 0) return NULL; int linkbuf_size = 128; char *linkbuf = malloc(size); if (!linkbuf) goto err; for (;;) { int len = pathrs_proc_readlink(PATHRS_PROC_THREAD_SELF, fdpath, linkbuf, linkbuf_size); if (len < 0) { pathrs_error_t *error = pathrs_errorinfo(fd); /* ... print the error ... */ pathrs_errorinfo_free(error); goto err; } if (len <= linkbuf_size) break; linkbuf_size = len; linkbuf = realloc(linkbuf, linkbuf_size); if (!linkbuf) goto err; } free(fdpath); return name_buf; err: free(fdpath); free(name_buf); return NULL; } Signed-off-by: Aleksa Sarai <[email protected]>

The main Go-specific thing we need to keep in mind when opening files from /proc/thread-self is that we have to runtime.LockOSThread to avoid hitting cases where we are migrated to a different thread and the original thread dies while we're using the handle (for most procfs files this results in spurious errors or blank files). Signed-off-by: Aleksa Sarai <[email protected]>

Signed-off-by: Aleksa Sarai <[email protected]>

These tests are based on the filepath-securejoin tests, and notably include tests of the key cases when operating on procfs. Signed-off-by: Aleksa Sarai <[email protected]>

cyphar force-pushed the procfs-hardening branch 3 times, most recently from bb0c038 to 250bca5 Compare July 24, 2024 18:11

cyphar force-pushed the main branch from 1730f5a to e906ac0 Compare July 25, 2024 07:19

cyphar force-pushed the procfs-hardening branch 10 times, most recently from 30f4811 to c3b8c39 Compare July 28, 2024 10:56

cyphar marked this pull request as ready for review July 28, 2024 11:01

cyphar force-pushed the procfs-hardening branch 2 times, most recently from 8dd2a3e to 332a551 Compare July 28, 2024 15:41

This was referenced Jul 28, 2024

upstream kernel hardening: pidfd_open("exe") #17

Open

upstream kernel hardening: O_EMPTYPATH #16

Open

cyphar added 10 commits July 29, 2024 02:18

error: add (currently-test-only) ErrorKind

1a266c4

This will make matching against Error types in our tests much simpler. Maybe we will want to export this at some point. Signed-off-by: Aleksa Sarai <[email protected]>

procfs: move internal handle to a new ProcfsHandle API

e5fc3c4

At the moment this interface is not entirely safe, so it's not exported yet. It will gain hardening in future commits. Signed-off-by: Aleksa Sarai <[email protected]>

python bindings: add pathrs_proc_* wrappers

83aff8c

Signed-off-by: Aleksa Sarai <[email protected]>

tests: add ProcfsHandle tests

59fdb76

These tests are based on the filepath-securejoin tests, and notably include tests of the key cases when operating on procfs. Signed-off-by: Aleksa Sarai <[email protected]>

cyphar force-pushed the procfs-hardening branch from 332a551 to 59fdb76 Compare July 29, 2024 01:05

cyphar merged commit 5d37a90 into openSUSE:main Jul 29, 2024
6 checks passed

cyphar deleted the procfs-hardening branch July 29, 2024 01:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

procfs: add safe procfs API and harden proc operations #42

procfs: add safe procfs API and harden proc operations #42

cyphar commented Jul 24, 2024 •

edited

Loading

cyphar commented Jul 28, 2024 •

edited

Loading

procfs: add safe procfs API and harden proc operations #42

procfs: add safe procfs API and harden proc operations #42

Conversation

cyphar commented Jul 24, 2024 • edited Loading

cyphar commented Jul 28, 2024 • edited Loading

cyphar commented Jul 24, 2024 •

edited

Loading

cyphar commented Jul 28, 2024 •

edited

Loading