Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require at least libseccomp 2.5.5 #10591

Merged
merged 1 commit into from
Apr 22, 2024
Merged

Conversation

Ma27
Copy link
Member

@Ma27 Ma27 commented Apr 22, 2024

Motivation

Closes #10585

As it turns out, libseccomp maintains an internal syscall table and validates each rule against it. This means that when using libseccomp 2.5.4 or older, one may pass 452 as syscall number against it, but since it doesn't exist in the internal structure, libseccomp will refuse to create a filter for that. This happens with nixpkgs-23.11, i.e. on stable NixOS and when building Nix against the project's flake.

To work around that

  • a backport of libseccomp 2.5.5 on upstream nixpkgs has been scheduled[1].

  • the package now uses libseccomp 2.5.5 on its own already. This is to provide a quick fix since the correct fix for 23.11 is still a staging cycle away.

It must not be possible to build a Nix with an incompatible libseccomp version (nothing can be built in a sandbox on Linux!), so configure.ac rejects libseccomp if __SNR_fchmodat2 is not defined.

We still need the compat header though since SCMP_SYS(fchmodat2) internally transforms this into __SNR_fchmodat2 which points to __NR_fchmodat2 from glibc 2.39, so it wouldn't build on glibc 2.38. The updated syscall table from libseccomp 2.5.5 is NOT used for that step, but used later, so we need both, our compat header and their syscall table 🤷

[1] NixOS/nixpkgs#306070


cc @max-privatevoid @Gerg-L
cc @NixOS/nix-team

Will backport this with #10501 as soon as this got reviewed.

  • Can build stuff with this patch built using the package from nixpkgs
  • Can build stuff with this patch built using the project's flake
  • tests.remoteBuilds builds again.

Context

Priorities and Process

Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

Closes NixOS#10585

As it turns out, libseccomp maintains an internal syscall table and
validates each rule against it. This means that when using libseccomp
2.5.4 or older, one may pass `452` as syscall number against it, but
since it doesn't exist in the internal structure, `libseccomp` will refuse
to create a filter for that. This happens with nixpkgs-23.11, i.e. on
stable NixOS and when building Nix against the project's flake.

To work around that

* a backport of libseccomp 2.5.5 on upstream nixpkgs has been
  scheduled[1].

* the package now uses libseccomp 2.5.5 on its own already. This is to
  provide a quick fix since the correct fix for 23.11 is still a staging cycle
  away.

It must not be possible to build a Nix with an incompatible libseccomp
version (nothing can be built in a sandbox on Linux!), so configure.ac
rejects libseccomp if `__SNR_fchmodat2` is not defined.

We still need the compat header though since `SCMP_SYS(fchmodat2)`
internally transforms this into `__SNR_fchmodat2` which points to
`__NR_fchmodat2` from glibc 2.39, so it wouldn't build on glibc 2.38.
The updated syscall table from libseccomp 2.5.5 is NOT used for that
step, but used later, so we need both, our compat header and their
syscall table 🤷

[1] NixOS/nixpkgs#306070
@Ma27 Ma27 requested a review from edolstra as a code owner April 22, 2024 21:01
], [], [
echo "libseccomp is missing __SNR_fchmodat2. Please provide libseccomp 2.5.5 or later"
exit 1
])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does what I want, but other than that I don't know much about autoconf, so somebody may want to double-check that.

@edolstra edolstra enabled auto-merge April 22, 2024 21:12
@edolstra edolstra merged commit df42466 into NixOS:master Apr 22, 2024
9 checks passed
@Ma27 Ma27 deleted the require-libseccomp-2.5.5 branch April 23, 2024 05:43
Comment on lines +252 to +258
] ++ lib.optional stdenv.isLinux (libseccomp.overrideAttrs (_: rec {
version = "2.5.5";
src = fetchurl {
url = "https://github.com/seccomp/libseccomp/releases/download/v${version}/libseccomp-${version}.tar.gz";
hash = "sha256-JIosik2bmFiqa69ScSw0r+/PnJ6Ut23OAsHJqiX7M3U=";
};
}))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, this sort of thing should go in the flake.nix not the package.nix

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(#10610 fixes)

lf- pushed a commit to lix-project/lix that referenced this pull request May 4, 2024
With Linux kernel >=6.6 & glibc 2.39 a `fchmodat2(2)` is available that
isn't filtered away by the libseccomp sandbox.

Being able to use this to bypass that restriction has surprising results
for some builds such as lxc[1]:

> With kernel ≥6.6 and glibc 2.39, lxc's install phase uses fchmodat2,
> which slips through https://github.com/NixOS/nix/blob/9b88e5284608116b7db0dbd3d5dd7a33b90d52d7/src/libstore/build/local-derivation-goal.cc#L1650-L1663.
> The fixupPhase then uses fchmodat, which fails.
> With older kernel or glibc, setting the suid bit fails in the
> install phase, which is not treated as fatal, and then the
> fixup phase does not try to set it again.

Please note that there are still ways to bypass this sandbox[2] and this is
mostly a fix for the breaking builds.

This change works by creating a syscall filter for the `fchmodat2`
syscall (number 452 on most systems). The problem is that glibc 2.39
is needed to have the correct syscall number available via
`__NR_fchmodat2` / `__SNR_fchmodat2`, but this flake is still on
nixpkgs 23.11. To have this change everywhere and not dependent on the
glibc this package is built against, I added a header
"fchmodat2-compat.hh" that sets the syscall number based on the
architecture. On most platforms its 452 according to glibc with a few
exceptions:

    $ rg --pcre2 'define __NR_fchmodat2 (?!452)'
    sysdeps/unix/sysv/linux/x86_64/x32/arch-syscall.h
    58:#define __NR_fchmodat2 1073742276

    sysdeps/unix/sysv/linux/mips/mips64/n32/arch-syscall.h
    67:#define __NR_fchmodat2 6452

    sysdeps/unix/sysv/linux/mips/mips64/n64/arch-syscall.h
    62:#define __NR_fchmodat2 5452

    sysdeps/unix/sysv/linux/mips/mips32/arch-syscall.h
    70:#define __NR_fchmodat2 4452

    sysdeps/unix/sysv/linux/alpha/arch-syscall.h
    59:#define __NR_fchmodat2 562

I added a small regression-test to the setuid integration-test that
attempts to set the suid bit on a file using the fchmodat2 syscall.
I confirmed that the test fails without the change in
local-derivation-goal.

Additionally, we require libseccomp 2.5.5 or greater now: as it turns
out, libseccomp maintains an internal syscall table and
validates each rule against it. This means that when using libseccomp
2.5.4 or older, one may pass `452` as syscall number against it, but
since it doesn't exist in the internal structure, `libseccomp` will refuse
to create a filter for that. This happens with nixpkgs-23.11, i.e. on
stable NixOS and when building Lix against the project's flake.

To work around that

* a backport of libseccomp 2.5.5 on upstream nixpkgs has been
  scheduled[3].

* the package now uses libseccomp 2.5.5 on its own already. This is to
  provide a quick fix since the correct fix for 23.11 is still a staging cycle
  away.

We still need the compat header though since `SCMP_SYS(fchmodat2)`
internally transforms this into `__SNR_fchmodat2` which points to
`__NR_fchmodat2` from glibc 2.39, so it wouldn't build on glibc 2.38.
The updated syscall table from libseccomp 2.5.5 is NOT used for that
step, but used later, so we need both, our compat header and their
syscall table 🤷

Relevant PRs in CppNix:

* NixOS/nix#10591
* NixOS/nix#10501

[1] NixOS/nixpkgs#300635 (comment)
[2] NixOS/nixpkgs#300635 (comment)
[3] NixOS/nixpkgs#306070

(cherry picked from commit ba68045)
Change-Id: I6921ab5a363188c6bff617750d00bb517276b7fe
lf- pushed a commit to lix-project/lix that referenced this pull request May 4, 2024
With Linux kernel >=6.6 & glibc 2.39 a `fchmodat2(2)` is available that
isn't filtered away by the libseccomp sandbox.

Being able to use this to bypass that restriction has surprising results
for some builds such as lxc[1]:

> With kernel ≥6.6 and glibc 2.39, lxc's install phase uses fchmodat2,
> which slips through https://github.com/NixOS/nix/blob/9b88e5284608116b7db0dbd3d5dd7a33b90d52d7/src/libstore/build/local-derivation-goal.cc#L1650-L1663.
> The fixupPhase then uses fchmodat, which fails.
> With older kernel or glibc, setting the suid bit fails in the
> install phase, which is not treated as fatal, and then the
> fixup phase does not try to set it again.

Please note that there are still ways to bypass this sandbox[2] and this is
mostly a fix for the breaking builds.

This change works by creating a syscall filter for the `fchmodat2`
syscall (number 452 on most systems). The problem is that glibc 2.39
is needed to have the correct syscall number available via
`__NR_fchmodat2` / `__SNR_fchmodat2`, but this flake is still on
nixpkgs 23.11. To have this change everywhere and not dependent on the
glibc this package is built against, I added a header
"fchmodat2-compat.hh" that sets the syscall number based on the
architecture. On most platforms its 452 according to glibc with a few
exceptions:

    $ rg --pcre2 'define __NR_fchmodat2 (?!452)'
    sysdeps/unix/sysv/linux/x86_64/x32/arch-syscall.h
    58:#define __NR_fchmodat2 1073742276

    sysdeps/unix/sysv/linux/mips/mips64/n32/arch-syscall.h
    67:#define __NR_fchmodat2 6452

    sysdeps/unix/sysv/linux/mips/mips64/n64/arch-syscall.h
    62:#define __NR_fchmodat2 5452

    sysdeps/unix/sysv/linux/mips/mips32/arch-syscall.h
    70:#define __NR_fchmodat2 4452

    sysdeps/unix/sysv/linux/alpha/arch-syscall.h
    59:#define __NR_fchmodat2 562

I added a small regression-test to the setuid integration-test that
attempts to set the suid bit on a file using the fchmodat2 syscall.
I confirmed that the test fails without the change in
local-derivation-goal.

Additionally, we require libseccomp 2.5.5 or greater now: as it turns
out, libseccomp maintains an internal syscall table and
validates each rule against it. This means that when using libseccomp
2.5.4 or older, one may pass `452` as syscall number against it, but
since it doesn't exist in the internal structure, `libseccomp` will refuse
to create a filter for that. This happens with nixpkgs-23.11, i.e. on
stable NixOS and when building Lix against the project's flake.

To work around that

* a backport of libseccomp 2.5.5 on upstream nixpkgs has been
  scheduled[3].

* the package now uses libseccomp 2.5.5 on its own already. This is to
  provide a quick fix since the correct fix for 23.11 is still a staging cycle
  away.

We still need the compat header though since `SCMP_SYS(fchmodat2)`
internally transforms this into `__SNR_fchmodat2` which points to
`__NR_fchmodat2` from glibc 2.39, so it wouldn't build on glibc 2.38.
The updated syscall table from libseccomp 2.5.5 is NOT used for that
step, but used later, so we need both, our compat header and their
syscall table 🤷

Relevant PRs in CppNix:

* NixOS/nix#10591
* NixOS/nix#10501

[1] NixOS/nixpkgs#300635 (comment)
[2] NixOS/nixpkgs#300635 (comment)
[3] NixOS/nixpkgs#306070

(cherry picked from commit ba68045)
Change-Id: I6921ab5a363188c6bff617750d00bb517276b7fe
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fchmodat2 seccomp filter breaks sandboxed builds with glibc 2.38
3 participants