Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/postgresql: set up sandboxing #344925

Merged
merged 4 commits into from
Nov 2, 2024
Merged

Conversation

mweinelt
Copy link
Member

Reduces the general exposure of the postgresql.service through systemd hardening options.

I fully expect that I've missed a number of use cases (e.g. through extensions), feedback is welcome!

Here's what systemd-analye security postgresql.service thinks of the remaining options:

machine:   NAME                                                        DESCRIPTION                                                                    EXPOSURE
✗ RootDirectory=/RootImage=                                   Service runs within the host's root directory                                       0.1
✗ RestrictAddressFamilies=~AF_UNIX                            Service may allocate local sockets                                                  0.1
✗ RestrictAddressFamilies=~AF_(INET|INET6)                    Service may allocate Internet sockets                                               0.3
✗ PrivateNetwork=                                             Service has access to the host's network                                            0.5
✗ PrivateUsers=                                               Service has access to other users                                                   0.2
✗ PrivateTmp=                                                 Service has access to other software's temporary files                              0.2
✗ DeviceAllow=                                                Service has a device ACL with some special devices: char-rtc:r                      0.1
✗ IPAddressDeny=                                              Service does not define an IP address allow list                                    0.2

→ Overall exposure level for postgresql.service: 1.4 OK :-)

Description of changes

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable: (nixosTests.postgresql, postgresql.tests)
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Sep 27, 2024
@wolfgangwalther
Copy link
Contributor

I'm only using nixpkgs, not NixOS, so no experience with those services/modules. I have only very little experience with running PostgreSQL via systemd either, so.. can't help too much I guess.

@wolfgangwalther wolfgangwalther removed their request for review September 27, 2024 17:54
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Sep 27, 2024
@wolfgangwalther
Copy link
Contributor

I just found #113100, so maybe @Izorkin, @aanderse or @aszlig want to chime in.

@Izorkin
Copy link
Contributor

Izorkin commented Sep 28, 2024

I've been using my option for a long time. I haven't found any problems.

@Izorkin
Copy link
Contributor

Izorkin commented Sep 28, 2024

Can you add ReadWritePaths?

@aszlig
Copy link
Member

aszlig commented Sep 28, 2024

I've been using my option for a long time. I haven't found any problems.

Same here with confinement, running since years. The confinement variant would be also more restrictive than the implementation here because all you get is the Nix closure of PostgreSQL. If the chroot-only mode is used you don't even need the Protect* options because those mounts aren't there in the first place.

The downside is that systemd-analyze doesn't detect this properly.

Nevertheless, using the options here is better than nothing, so 👍 from me.

@mweinelt mweinelt force-pushed the postgresql-hardening branch from 57e9e92 to 64565d6 Compare September 29, 2024 11:12
@Ma27 Ma27 force-pushed the postgresql-hardening branch from 066f749 to 376f9c1 Compare September 30, 2024 15:37
Copy link
Member

@Ma27 Ma27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good.
I guess I'll throw this into my deployment and see how it goes.

Copy link
Member

@Ma27 Ma27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess release-notes would also be nice.

PrivateTmp = true;
ProtectHome = true;
ProtectSystem = "strict";
MemoryDenyWriteExecute = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if this conflicts with JIT compilation. Maybe some of these should be put behind lib.mkDefault?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the extent tthat the nixos tests cover the JIT package testing has been done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the nixos JIT tests work, then there's probably nothing to worry about. I'll mark it as resolved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is broken though:

machine # WARNING:  error during JITing: Permission denied
machine # [   14.012280] postgres[913]: [913] WARNING:  error during JITing: Permission denied
machine # ERROR:  failed to look up symbol "evalexpr_0_1": Failed to materialize symbols: { (main, { evalexpr_0_1, evalexpr_0_0 }) }

I confirmed that this gets fixed with MemoryDenyWriteExecute = false;.

I reduced jit_above_cost by a large factor for now on one of my databases to see if JIT is good now and so far it's looking reasonable:

matrix-synapse=# select sum(jit_functions) from pg_stat_statements where jit_functions > 0;
 sum  
------
 2175
(1 row)

(and my synapse is still usable).

mweinelt and others added 2 commits October 1, 2024 10:31
Reduces the general exposure of the postgresql.service through systemd
hardening options.
…ver test

The issue was that the old test-case used `/tmp` to share data. Using
`JoinsNamespaceOf=` wasn't a real workaround since the private `/tmp` is
recreated when a service gets stopped/started which is the case here, so
the wals were still lost.

To keep the test building with `PrivateTmp=yes`, create a dedicated
directory in `/var/cache` with tmpfiles and allow the hardened
`postgresql.service` to access it via `ReadWritePaths`.
@hacscred
Copy link

hacscred commented Oct 1, 2024

I think the only way for such a change to not result in angry users is if the existing postgresql module gets some additional features. Essentially, every postgres extension module (like postgis) should specify a set of systemd features it is compatible with (or alternatively incompatible with) and the computed intersection of those options is how it will run then. So, right now your approach is to hard code some options, but that design won't scale.

The design should be such that someone can use a postgresql extension involving GPUs (incompatible with your current set of options) by enabling it via some compositional method and the required configuration should automatically work. (In fact, one could argue Postgresql is designed in the wrong way to not have per module hardening via a capabilities system. )

If this is merged someone will get error messages in the best case, which are likely hard to comprehend and certainly they won't get an error at configuration time.

It's somewhat useful that you presumably found a set of which 90% probably won't cause any issues die anyone, but making an actual improvement (for every user, ignoring the slower evaluation) would require my design or another design (which would require evidence of it also offering the compositional guarantee I outlined).

I think it would be better if a prototype were to be developed based on my compositional method and then the discussion could continue, since perhaps I missed something. (The actual implementation (which could easily be thousands of lines of code) and method of maintenance are independent of providing the basic guarantee.)

@Ma27
Copy link
Member

Ma27 commented Oct 12, 2024

@mweinelt I'm actually wondering if we should do some passthru.sandboxAdjustments to each extension (although it's a little weird how a package exposes a partial module here). These could be mkMerged together. Bottom line is that each existing extension gets checked.

cc extension maintainers @1000101 @bbigras @chayleaf @danbst @DeeUnderscore @DerTim1 @diogotcorreia @esclear @euank @gaelreyrol @ggPeti @idontgetoutmuch @ivan @jopejoe1 @k0001 @MarcWeber @mmusnjak @netcrns @nialov @qoelet @shivaraj-bh @sikmir @spinus @steve-chavez @takeda @thenonameguy @typetetris @willcohen @WilliButz @wolfgangwalther @zimbatm @sikmir @imincik would be helpful if you could check which of your extensions will have compatibility problems.

I don't think we can do much for stuff that isn't packaged in nixpkgs though except for good release notes.

@mweinelt
Copy link
Member Author

Yes, conditional hardening is the ideal outcome, thanks for pinging everyone.

Copy link
Member

@Scrumplex Scrumplex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just deployed this on some of my servers, one of which hosts an instance of Immich, which uses several postgres extensions: unaccent uuid-ossp vectors (pgvecto.rs) cube earthdistance and pg_trgm.

Seems to be working well

@euank
Copy link
Member

euank commented Oct 13, 2024

I'd expect wal2json to be fine with this if the tests pass, and I can't think of any reason sandboxing would break it.

@ofborg test postgresqlPackages.wal2json.passthru.tests

@takeda
Copy link
Contributor

takeda commented Oct 13, 2024

I currently don't have NixOS running to test, but regarding pgmq which I just added, that extension only contains sql files, nothing extra is built, so if any extension work I would expect this one would too.

@DeeUnderscore
Copy link
Contributor

I don't anticipate any problems running the rum extension under this, and I did not show any in a quick test.

The test breaks like this otherwise:

    machine # WARNING:  error during JITing: Permission denied
    machine # [   14.012280] postgres[913]: [913] WARNING:  error during JITing: Permission denied
    machine # ERROR:  failed to look up symbol "evalexpr_0_1": Failed to materialize symbols: { (main, { evalexpr_0_1, evalexpr_0_0 }) }
@mweinelt mweinelt force-pushed the postgresql-hardening branch from 4083be3 to 0f1e2a1 Compare October 14, 2024 21:57
@Ma27
Copy link
Member

Ma27 commented Oct 27, 2024

OK so from what I've seen so far, there's no extension packaged in nixpkgs that requires additional changes to the hardening, at least no maintainer gave feedback.

I'm still wondering if we need some kind of passthru.sandboxAdjustments, however, I'm not sure how: on the same level are also settings that shouldn't be overridden by an extension (e.g. ExecStart).
Also, which priority do we give things in sandboxAdjustments and the module? Like, otherwise we'd need extensive mkForce usage in the extensions (though that's probably the most reasonable thing).

However this makes it pretty hard to catch incompatibilities between extensions with different adjustments. If everything is written down in conditionals in the module, I'd expect it to be way easier to catch issues.

And if people package things out of nixpkgs, I think it's perfectly fine to expect them to override the relevant settings on their own. Hardening the service even more is a breaking change and shouldn't be backported then, however.


So tl;dr I'm leaning towards leaving the change code wise as-is and writing docs (release-notes and probably even a note in the manual).

WDYT @mweinelt @wolfgangwalther @h7x4 ?

@github-actions github-actions bot added 8.has: documentation This PR adds or changes documentation 8.has: changelog labels Nov 1, 2024
Copy link
Member

@Ma27 Ma27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed a commit mentioning the change in the release notes and the manual.

I'd say we got for it in 24.11: I'm using this for quite a while on my own and haven't encountered any issues (short of the JIT issue at the very beginning, heh). Since we got no feedback about potential breakage with extensions, I don't consider this a breaking change.

Would you be OK with that @mweinelt @wolfgangwalther

@Ma27 Ma27 force-pushed the postgresql-hardening branch from 765effb to 70a6092 Compare November 1, 2024 15:31
@Ma27 Ma27 merged commit 500d745 into NixOS:master Nov 2, 2024
25 checks passed
@mweinelt mweinelt deleted the postgresql-hardening branch November 2, 2024 14:46
RestrictRealtime = true;
RestrictSUIDSGID = true;
SystemCallArchitectures = "native";
SystemCallFilter = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use plv8 extension and the database server crashes every time a query using plv8 function is called:

postgres[2325667]: [2325667] LOG:  server process (PID 2327131) was terminated by signal 31: Bad system call

Is there a way to determine which system calls are needed?

I tried

SystemCallFilter = [ "" ];
SystemCallLog = [ "@privileged ~@resources ~@system-service" ];

but it only made it crash with

Nov 10 15:43:06 azazel postgres[2336657]: #
Nov 10 15:43:06 azazel postgres[2336657]: # Fatal error in , line 0
Nov 10 15:43:06 azazel postgres[2336657]: # Check failed: 12 == (*__errno_location ()).
Nov 10 15:43:06 azazel postgres[2336657]: #
Nov 10 15:43:06 azazel postgres[2336657]: #
Nov 10 15:43:06 azazel postgres[2336657]: #
Nov 10 15:43:06 azazel postgres[2336657]: #FailureMessage Object: 0x7ffff76dee30
Nov 10 15:43:06 azazel postgres[2336657]: ==== C stack trace ===============================
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::base::debug::StackTrace::StackTrace()+0x16) [0x7f056f00e386]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(+0xb12e4a) [0x7f056ead7e4a]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(V8_Fatal(char const*, ...)+0x17a) [0x7f056e8aacea]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(+0x8de83d) [0x7f056e8a383d]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::MemoryAllocator::SetPermissionsOnExecutableMemoryChunk(v8::internal::VirtualMemory*, unsigned long, unsigned long, unsigned long)+0xfa) [0x7f056ea4222a]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::MemoryAllocator::AllocateAlignedMemory(unsigned long, unsigned long, unsigned long, v8::internal::AllocationSpace, v8::internal::Executability, void*, v8::internal::VirtualMemory*)+0x1f5) [0x7f056ea42505]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::MemoryAllocator::AllocateUninitializedChunkAt(v8::internal::BaseSpace*, unsigned long, v8::internal::Executability, unsigned long, v8::internal::PageSize)+0x96) [0x7f056ea42606]
Nov 10 15:43:06 azazel kernel: traps: postgres[387266] trap int3 ip:7f056e8a3b2e sp:7ffff76dee08 error:0 in plv8-3.2.3.so[7f056e7e7000+1b90000]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::MemoryAllocator::AllocatePage(v8::internal::MemoryAllocator::AllocationMode, v8::internal::Space*, v8::internal::Executability)+0x7e) [0x7f056ea42d8e]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::PagedSpaceBase::TryExpandImpl()+0x2c) [0x7f056ea56d0c]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::PagedSpaceBase::TryExpand(int, v8::internal::AllocationOrigin)+0x1f) [0x7f056ea58a0f]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::PagedSpaceBase::RawRefillLabMain(int, v8::internal::AllocationOrigin)+0x416) [0x7f056ea593d6]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::PagedSpaceBase::RefillLabMain(int, v8::internal::AllocationOrigin)+0x28) [0x7f056ea596c8]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment)+0x5e1) [0x7f056e9c6201]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::Factory::CodeBuilder::AllocateInstructionStream(bool)+0x7d) [0x7f056e9ac52d]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::Factory::CodeBuilder::BuildInternal(bool)+0x21b) [0x7f056e9ac95b]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::compiler::CodeGenerator::FinalizeCode()+0x16c) [0x7f056fe8ba7c]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::compiler::PipelineImpl::FinalizeCode(bool)+0x4ef) [0x7f056f1753ef]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::compiler::PipelineCompilationJob::FinalizeJobImpl(v8::internal::Isolate*)+0x70) [0x7f056f1763e0]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::OptimizedCompilationJob::FinalizeJob(v8::internal::Isolate*)+0x44) [0x7f056e8b8234]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::Compiler::FinalizeTurbofanCompilationJob(v8::internal::TurbofanCompilationJob*, v8::internal::Isolate*)+0x40b) [0x7f056e8be13b]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::OptimizingCompileDispatcher::InstallOptimizedFunctions()+0xc9) [0x7f056e8f24a9]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::StackGuard::HandleInterrupts()+0x1e0) [0x7f056e981590]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(v8::internal::Runtime_StackGuard(int, unsigned long*, v8::internal::Isolate*)+0xb9) [0x7f056f3adef9]
Nov 10 15:43:06 azazel postgres[2336657]:     /nix/store/5nvfkms6x5nkaaq44w0xw7ph5dg051y6-postgresql-and-plugins-14.13/lib/plv8-3.2.3.so(+0xf1def6) [0x7f056eee2ef6]
Nov 10 15:43:06 azazel postgres[2336019]: [2336019] LOG:  server process (PID 2336657) was terminated by signal 5: Trace/breakpoint trap
Nov 10 15:43:06 azazel postgres[2336019]: [2336019] DETAIL:  Failed process was running: with recent_event as (select id from event where level = 'world' and "end" < current_date and "end" > '2022-01-01' order by "end" desc limit 2), past_team as (select *, 0.5 as coef, 6 as ceiling, position_in_class(age, 'ultraveteran') over (w order by score desc, time asc) as position_uv, count(nullif(is_in_class(age, 'ultraveteran'), false)) over w as limit_uv, position_in_class(age, 'superveteran') over (w order by score desc, time asc) as position_sv, count(nullif(is_in_class(age, 'superveteran'), false)) over w as limit_sv, position_in_class(age, 'veteran') over (w order by score desc, time asc) as position_v, count(nullif(is_in_class(age, 'veteran'), false)) over w as limit_v, position_in_class(age, 'open') over (w order by score desc, time asc) as position_o, count(nullif(is_in_class(age, 'open'), false)) over w as limit_o, position_in_class(age, 'junior') over (w order by score desc, time asc) as position_j, count(nullif(is_in_class(age, 'junior'), false)) over w as limit_j from team where event_id in
Nov 10 15:43:06 azazel postgres[2336019]: [2336019] LOG:  terminating any other active server processes
Nov 10 15:43:06 azazel node[2333506]: Error: Connection terminated unexpectedly
Nov 10 15:43:06 azazel node[2333506]:     at /nix/store/yq48bpc7clgjrybghnz5a57fnsw4scb4-wrcq-0-unstable-2024-10-22/node_modules/pg-pool/index.js:45:11
Nov 10 15:43:06 azazel node[2333506]:     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
Nov 10 15:43:06 azazel node[2333506]:     at async file:///nix/store/yq48bpc7clgjrybghnz5a57fnsw4scb4-wrcq-0-unstable-2024-10-22/index.js:309:32
Nov 10 15:43:06 azazel postgres[2336665]: [2336665] FATAL:  the database system is in recovery mode
Nov 10 15:43:06 azazel node[2333506]: ::1 - - [10/Nov/2024:14:43:06 +0000] "GET /qualified HTTP/1.0" 500 -
Nov 10 15:43:06 azazel node[2333506]: error: the database system is in recovery mode
Nov 10 15:43:06 azazel node[2333506]:     at /nix/store/yq48bpc7clgjrybghnz5a57fnsw4scb4-wrcq-0-unstable-2024-10-22/node_modules/pg-pool/index.js:45:11
Nov 10 15:43:06 azazel node[2333506]:     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
Nov 10 15:43:06 azazel node[2333506]:     at async PGStore._asyncQuery (/nix/store/yq48bpc7clgjrybghnz5a57fnsw4scb4-wrcq-0-unstable-2024-10-22/node_modules/connect-pg-simple/index.js:322:21)
Nov 10 15:43:06 azazel postgres[2336019]: [2336019] LOG:  all server processes terminated; reinitializing
Nov 10 15:43:06 azazel postgres[2336666]: [2336666] LOG:  database system was interrupted; last known up at 2024-11-10 14:42:55 GMT
Nov 10 15:43:06 azazel postgres[2336666]: [2336666] LOG:  database system was not properly shut down; automatic recovery in progress
Nov 10 15:43:06 azazel postgres[2336666]: [2336666] LOG:  redo starts at 2/9F5C7BE0
Nov 10 15:43:06 azazel postgres[2336666]: [2336666] LOG:  invalid record length at 2/9F5CEE68: wanted 24, got 0
Nov 10 15:43:06 azazel postgres[2336666]: [2336666] LOG:  redo done at 2/9F5CEE28 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
Nov 10 15:43:06 azazel postgres[2336019]: [2336019] LOG:  database system is ready to accept connections

Reverting this whole PR works as a workaround.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if the system call filter is the issue, you should see an error in the kernel log (or auditd, I'm not entirely sure).

But given that this is v8, there might be another problem: can you try setting MemoryDenyWriteExecute to false?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to set up a vm test for plv8 - then we could just bisect through the list of changes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then we could just bisect through the list of changes?

I guess I would've just turned off more and more flags until it works. But yeah.

@jtojnar never used this extension, but any chance you can share a reproducer?

Copy link
Member

@jtojnar jtojnar Nov 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even just calling ALTER EXTENSION plv8 UPDATE; was enough to trigger the crash for me. Maybe the example select from https://plv8.github.io/ would work too. Not sure if I will be able to create a test today, maybe sometime during the week.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, tried adding SystemCallFilter = [ "@pkey" ]; and got the errno fatal error above. Adding MemoryDenyWriteExecute = false; as well made it work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what triggers the second error? Would be nice to be able to repro.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a pretty big SQL query including custom window functions:

Original query
WITH
  recent_event AS (
    SELECT
      id
    FROM
      event
    WHERE
      level = 'world'
      AND "end" < CURRENT_DATE
      AND "end" > '2022-01-01'
    ORDER BY
      "end" DESC
    LIMIT
      2
  ),
  past_team AS (
    SELECT
      *,
      0.5 AS coef,
      6 AS CEILING,
      position_in_class (age, 'ultraveteran') OVER (
        w
        ORDER BY
          score DESC,
          time ASC
      ) AS position_uv,
      count(nullif(is_in_class (age, 'ultraveteran'), FALSE)) OVER w AS limit_uv,
      position_in_class (age, 'superveteran') OVER (
        w
        ORDER BY
          score DESC,
          time ASC
      ) AS position_sv,
      count(nullif(is_in_class (age, 'superveteran'), FALSE)) OVER w AS limit_sv,
      position_in_class (age, 'veteran') OVER (
        w
        ORDER BY
          score DESC,
          time ASC
      ) AS position_v,
      count(nullif(is_in_class (age, 'veteran'), FALSE)) OVER w AS limit_v,
      position_in_class (age, 'open') OVER (
        w
        ORDER BY
          score DESC,
          time ASC
      ) AS position_o,
      count(nullif(is_in_class (age, 'open'), FALSE)) OVER w AS limit_o,
      position_in_class (age, 'junior') OVER (
        w
        ORDER BY
          score DESC,
          time ASC
      ) AS position_j,
      count(nullif(is_in_class (age, 'junior'), FALSE)) OVER w AS limit_j
    FROM
      team
    WHERE
      event_id IN (
        SELECT
          *
        FROM
          recent_event
      )
    WINDOW
      w AS (
        PARTITION BY
          event_id,
          gender
      )
  ),
  past_team_crit AS (
    SELECT
      *,
      (
        position_uv <= CEILING
        AND position_uv <= ceil(limit_uv * coef)
      ) AS prequalified_uv,
      (
        position_sv <= CEILING
        AND position_sv <= ceil(limit_sv * coef)
      ) AS prequalified_sv,
      (
        position_v <= CEILING
        AND position_v <= ceil(limit_v * coef)
      ) AS prequalified_v,
      (
        position_o <= CEILING
        AND position_o <= ceil(limit_o * coef)
      ) AS prequalified_o,
      (
        position_j <= CEILING
        AND position_j <= ceil(limit_j * coef)
      ) AS prequalified_j
    FROM
      past_team
  )
SELECT
  *
FROM
  member
  LEFT JOIN past_team_crit ON member.event_id = past_team_crit.event_id
  AND member.team_id = past_team_crit.id
WHERE
  prequalified_uv
  OR prequalified_sv
  OR prequalified_v
  OR prequalified_o
  OR prequalified_j;

I managed to pare it down slightly but it is not clear what is causing it:

Reduced query
WITH
  recent_event AS (
    SELECT
      id
    FROM
      event
    WHERE
      level = 'world'
      AND "end" < CURRENT_DATE
      AND "end" > '2022-01-01'
    ORDER BY
      "end" DESC
    LIMIT
      2
  ),
  past_team AS (
    SELECT
      *,
      0.5 AS coef,
      6 AS CEILING,
      position_in_class (age, 'ultraveteran') OVER (
        w
        ORDER BY
          score DESC,
          time ASC
      ) AS position_uv,
      count(nullif(is_in_class (age, 'ultraveteran'), FALSE)) OVER w AS limit_uv
    FROM
      team
    WHERE
      event_id IN (
        SELECT
          *
        FROM
          recent_event
      )
    WINDOW
      w AS (
        PARTITION BY
          event_id,
          gender
      )
  )
SELECT
  *,
  (
    position_uv <= CEILING
    AND position_uv <= ceil(limit_uv * coef)
  ) AS prequalified_uv
FROM
  past_team

Curiously, the crash triggers if I remove ultraveteran category or if I only keep ultraveteran category, but not when I keep any other category. I should really move the logic out of SQL.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, that is probably too big for a reproducer in a nixos test. Either way #355010 should be ready to go. Feel free to pick that into your tree, if that fits your workflow.

Copy link
Member

@jtojnar jtojnar Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so looking at nodejs/node#55509, it is probably indeed caused by V8 turning on JIT.

I managed to make PLV8 (presumably) enable JIT with the following:

DO $$
let xs = [];
for (let i = 0, n = 4000000000; i < n; i++) {
    xs.push(Math.round(Math.random() * n))
}
console.log(Math.sum(xs));
$$ LANGUAGE plv8;

jtojnar added a commit to jtojnar/nixfiles that referenced this pull request Nov 11, 2024
We use PLV8 for PQE, which uses some extra syscalls and, as a result, the database will be killed:
LOG:  server process (PID 2327131) was terminated by signal 31: Bad system call

Partially reverts NixOS/nixpkgs#344925
@dali99
Copy link
Member

dali99 commented Dec 5, 2024

The readWritePaths breaks users who use TABLESPACE

Affected users can fix it by setting systemd.services.postgresql.serviceConfig.ReadWritePaths = [ "/path/to/tablespace" ]

Ma27 added a commit to Ma27/nixpkgs that referenced this pull request Dec 11, 2024
It was brought up that the restricted file-system access breaks
tablespaces[1]. I'd argue that this is the desired behavior, the whole
point of the hardening is the lock the service down and I don't consider
tablespaces common enough to elevate privileges again. Especially since
the workaround is trivial as shown in the diff.

For completeness sake, this adds the necessary `ReadWritePaths` change
to the postgresql section of the manual.

This also adds a small correction about the state of
`ensurePermissions`.

[1] NixOS#344925 (comment)
Ma27 added a commit to Ma27/nixpkgs that referenced this pull request Dec 12, 2024
It was brought up that the restricted file-system access breaks
tablespaces[1]. I'd argue that this is the desired behavior, the whole
point of the hardening is the lock the service down and I don't consider
tablespaces common enough to elevate privileges again. Especially since
the workaround is trivial as shown in the diff.

For completeness sake, this adds the necessary `ReadWritePaths` change
to the postgresql section of the manual.

This also adds a small correction about the state of
`ensurePermissions`.

[1] NixOS#344925 (comment)
nix-backports bot pushed a commit that referenced this pull request Dec 12, 2024
It was brought up that the restricted file-system access breaks
tablespaces[1]. I'd argue that this is the desired behavior, the whole
point of the hardening is the lock the service down and I don't consider
tablespaces common enough to elevate privileges again. Especially since
the workaround is trivial as shown in the diff.

For completeness sake, this adds the necessary `ReadWritePaths` change
to the postgresql section of the manual.

This also adds a small correction about the state of
`ensurePermissions`.

[1] #344925 (comment)

(cherry picked from commit 51a6938)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: changelog 8.has: documentation This PR adds or changes documentation 8.has: module (update) This PR changes an existing module in `nixos/` 9.needs: documentation 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10
Projects
None yet
Development

Successfully merging this pull request may close these issues.