test: run dbus-broker under ASan and UBsan #359

mrc0mmand · 2024-04-17T17:08:22Z

Let's introduce a test that runs dbus-broker under Address Sanitizer and Undefined Behavior Sanitizer, while running other tests against it.

The setup to achieve this is slightly convoluted, since we need to run (and restart) sanitized dbus-broker without nuking the host machine. For that we setup an nspawn-container that re-uses host's rootfs (to some degree) and overlays our additions on top of that. This way we can test (not-only) the full user-space boot with sanitized dbus-broker without risking "damage" to the host machine.

mrc0mmand · 2024-04-17T18:41:16Z

@evverx as promised here's a PoC of a test that runs dbus-broker under ASan and UBSan while we hammer it with other tests on the side. It got slightly more involved to protect the host machine, and LSan needed some extra care as well (see the comments in the test code).

It's different from the original idea of running existing tests on a sanitized build, since I currently have no clue how to incorporate this into the whole Packit/TestingFarm infra (and I'm also not quite sure how well it would handle dying dbus and collecting artifacts from such machines).

If I run the test in one of my Arch VMs against the latest upstream, it reports a leak after I run dfuzzer just on the D-Bus control interface, so it looks like it's doing something :)

# TMT_TEST_DATA=~/logs DBUS_BROKER_TREE=$PWD test/integration/fuzz/sanitizers/test.sh
...
[688762.734598] dbus-broker[66]: A security policy denied :1.16 to send method call /org/freedesktop/DBus:org.freedesktop.DBus.Peer.GetMachineId to org.freedesktop.DBus.
[688762.735575] dbus-broker[66]: A security policy denied :1.16 to send method call /org/freedesktop/DBus:org.freedesktop.DBus.Peer.Ping to org.freedesktop.DBus.
[688763.062724] dbus-broker[66]: Dispatched 1954 messages @ 198(±297)μs / message.
[688763.063133] systemd[1]: Stopping D-Bus System Message Bus...
[688763.105139] dbus-broker-launch[65]: =================================================================
[688763.105437] dbus-broker-launch[65]: ==65==ERROR: LeakSanitizer: detected memory leaks
[688763.105437] dbus-broker-launch[65]: Direct leak of 880 byte(s) in 10 object(s) allocated from:
[688763.105564] dbus-broker-launch[65]:     #0 0x75fe014e2cc1 in __interceptor_calloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
[688763.105564] dbus-broker-launch[65]:     #1 0x5b9d8d5fe8f7 in policy_record_new_xmit ../src/launch/policy.c:56
[688763.105564] dbus-broker-launch[65]:     #2 0x5b9d8d609364 in policy_import_send ../src/launch/policy.c:460
[688763.105564] dbus-broker-launch[65]:     #3 0x5b9d8d60f44f in policy_import ../src/launch/policy.c:696
[688763.105564] dbus-broker-launch[65]:     #4 0x5b9d8d5c018b in launcher_load_policy ../src/launch/launcher.c:1096
[688763.105564] dbus-broker-launch[65]:     #5 0x5b9d8d5c1943 in launcher_reload_config ../src/launch/launcher.c:1195
[688763.105564] dbus-broker-launch[65]:     #6 0x5b9d8d5c240c in bus_method_reload_config ../src/launch/launcher.c:1296
[688763.105564] dbus-broker-launch[65]:     #7 0x75fe0131b45c  (/usr/lib/libsystemd.so.0+0x2f45c) (BuildId: 540b0b05a1f8668a797c3ff583cd0b19ff6962f4)
[688763.105564] dbus-broker-launch[65]:     #8 0x75fe01334b1a  (/usr/lib/libsystemd.so.0+0x48b1a) (BuildId: 540b0b05a1f8668a797c3ff583cd0b19ff6962f4)
[688763.105734] dbus-broker-launch[65]: Direct leak of 88 byte(s) in 1 object(s) allocated from:
[688763.105877] dbus-broker-launch[65]:     #0 0x75fe014e2cc1 in __interceptor_calloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
[688763.105877] dbus-broker-launch[65]:     #1 0x5b9d8d5fe8f7 in policy_record_new_xmit ../src/launch/policy.c:56
[688763.105877] dbus-broker-launch[65]:     #2 0x5b9d8d609364 in policy_import_send ../src/launch/policy.c:460
[688763.105877] dbus-broker-launch[65]:     #3 0x5b9d8d60f44f in policy_import ../src/launch/policy.c:696
[688763.105877] dbus-broker-launch[65]:     #4 0x5b9d8d5c018b in launcher_load_policy ../src/launch/launcher.c:1096
[688763.105877] dbus-broker-launch[65]:     #5 0x5b9d8d5c313f in launcher_run ../src/launch/launcher.c:1347
[688763.105877] dbus-broker-launch[65]:     #6 0x5b9d8d5b270c in run ../src/launch/main.c:151
[688763.105877] dbus-broker-launch[65]:     #7 0x5b9d8d5b2949 in main ../src/launch/main.c:174
[688763.105877] dbus-broker-launch[65]:     #8 0x75fe00b1cccf  (/usr/lib/libc.so.6+0x25ccf) (BuildId: c0caa0b7709d3369ee575fcd7d7d0b0fc48733af)
[688763.106017] dbus-broker-launch[65]: SUMMARY: AddressSanitizer: 968 byte(s) leaked in 11 allocation(s).
[688763.115803] systemd[1]: dbus-broker.service: Main process exited, code=exited, status=1/FAILURE
[688763.115956] systemd[1]: dbus-broker.service: Failed with result 'exit-code'.
[688763.116408] systemd[1]: Stopped D-Bus System Message Bus.
[688763.116595] systemd[1]: dbus-broker.service: Consumed 2.112s CPU time, 142.2M memory peak, 0B memory swap peak

But it doesn't appear to ... appear in CI, interesting.

evverx · 2024-04-18T00:59:02Z

If I run the test in one of my Arch VMs against the latest upstream, it reports a leak after I run dfuzzer just on the D-Bus control interface, so it looks like it's doing something. But it doesn't appear to ... appear in CI, interesting

I'm guessing the policy triggering it isn't included in the Fedora base image used by the CI. It should probably be possible to track that policy down by removing the policies on the Arch VM one by one.

evverx · 2024-04-18T01:36:59Z

test/integration/fuzz/sanitizers/test.sh

+Environment=ASAN_OPTIONS=$ASAN_OPTIONS
+Environment=UBSAN_OPTIONS=$UBSAN_OPTIONS
+# Useful for debugging LSan errors, but it's very verbose, hence disabled by default
+#Environment=LSAN_OPTIONS=verbosity=1:log_threads=1


As far as I understand the errors were caused by dropping privileges when it was run as dbus. With <user>root</user> it should probably be fine to remove it.

evverx · 2024-04-18T01:58:23Z

test/integration/fuzz/sanitizers/test.sh

+# To make the test a bit more robust without too much effort, let's use systemd-nspawn to run an ephemeral
+# container on top of the current rootfs. To get the "sanitized" dbus-broker into that container, we need to
+# prepare a special rootfs with just the sanitized dbus-broker (and a couple of other things) which we then
+# simply overlay on top of the ephemeral rootfs in the container.


I can't say I fully understand what's going on here :-) but assuming the next step would be to run it all under Valgrind too I wonder if it's possible to somehow split the script and move the test suite itself to a separate bash script that could be run in two different places?

That being said I think it would make sense to avoid complicating things by splitting the script and make Asan/UBSan work first.

It should be relatively easy to factor the container shenanigans out into a separate helper, I'll check what's the actual reality :)

I have one more question. Is it run on VMs/bare-metal machines where coredumps can be collected? I'm not sure what should happen when dbus-broker crashes/segfaults or whatever.

Ah, good point. Until recently systemd-coredump would leave coredumps from containers on the host, but since systemd/systemd@a108c43 it forwads the coredumps to the container to handle, so this might need some extra attention as well. We could either disable the forwarding (CoredumpReceive=no in the respective [email protected] instance) and just use coredumpctl on the host, configure systemd-coredump to store the coredumps in the journal (since the container journal is already exported on the host), or bind mount a directory from the host to /var/lib/systemd/coredump so the coredumps don't vanish together with the container.

I like the first option the most (and since we use host's /usr we shouldn't have any issues with mismatches symbols), but I'll play with this a bit to see how well it works in practice.

(Apologies for dropping the ball on this, got very busy with some internal stuff. Hopefully I'll get back to this soon.)

We need all the listed packages so upgrade the dependency appropriately.

mrc0mmand · 2024-05-15T12:26:11Z

@evverx I factored out the common parts into a separate utility script, prepped another test that runs dbus-broker under Valgrind, and it seems to work (or at least Valgrind seem to complain a lot). However, it will need a bit more polish I'll move the last two commits into a separate branch with the next push (which might take a bit, as I'll like to gather some coverage reports for the sanitized dbus-broker to see how we could improve the ASan+UBSan test), so it doesn't block the sanitizer test.

evverx · 2024-05-15T12:38:10Z

@mrc0mmand I agree that Valgrind shouldn't block this PR. As far as I can remember the launcher should be tweaked too to run dbus-broker under Valgrind and some syscalls should be instrumented. The backtraces came with PID fds as far as I can remember.

evverx · 2024-05-15T12:48:42Z

test/integration/fuzz/sanitizers/main.fmf

+    - dbus-daemon
+    - dfuzzer
+    - expat-devel
+    - gcc


I think it would be better to build it with clang because it comes with more checks. (Ideally it would be great to build it with both but if it's either gcc or clang I'd pick clang :-))

Hehe, I'm just playing around with sancov (which works best with clang), so I'm inclined to using clang by default as well :)

evverx · 2024-05-15T13:09:21Z

test/integration/fuzz/sanitizers/test.sh

+# issues:
+#
+#   1) We need to restart dbus-broker (and hence the machine we're currently running on)
+#   2) If dbus-broker crashes due to ASan/UBSan error, the whole machine is hosed


As far as I understand it should be applicable to "plain" builds too in the sense that if dbus-broker crashes/gets stuck there the testbed is hosed as well. I wonder if this container trick should be used in the integration tests generally? (Just to clarify I'm not saying it should be implemented here and I don't fully understand what's going on under the hood. I'm guessing it isn't possible to prevent the package from being installed and started)

I thought about that as well, but I'm not sure if it's worth the trouble. I opted into the container shenanigans just because both the ASan and the Valgrind tests change how dbus-broker is invoked (and under which user), which has slightly higher probability of going south (and some services don't like restarting dbus in general).

However, even if dbus dies the machine still remains (somewhat) usable (and this scenario should be very unlikely in "plain" tests). I plan on adding an "at exit" test/task that'll run after all tests (or maybe after each test) and collects possible coredumps and other useful logs, so even if we manage to crash dbus-broker in a "plain" test, we should still have necessary dumps and logs available to debug it.

evverx · 2024-05-15T13:26:30Z

test/integration/util.sh

+    mkdir -p "/run/systemd/system/systemd-nspawn@$CONTAINER_NAME.service.d"
+    cat >"/run/systemd/system/systemd-nspawn@$CONTAINER_NAME.service.d/override.conf" <<EOF
+    [Service]
+# We'll handle the coredumps on the host instead


It's really cool that it's possible to collect coredumps on the host there. GH Actions are run in containers so coredumps go to the underlying host actions have no access to. I guess it's convincing enough to switch to this testing infrastructure :-)

_exit() calls skip at-exit hooks, which also skips a call to __gcov_dump() when collecting coverage with gcov, resulting in inaccurate coverage reports. To mitigate this, define a custom _exit() function which injects __gcov_dump() just before _exit(), and use a macro to override the already existing _exit() function. To make this work without a bunch of includes scattered across the codebase, inject the coverage-specific include into the compiler command line when -Db_coverage=true is used.

Let's introduce a test that runs dbus-broker under Address Sanitizer and Undefined Behavior Sanitizer, while running other tests against it. The setup to achieve this is slightly convoluted, since we need to run (and restart) sanitized dbus-broker without nuking the host machine. For that we setup an nspawn-container that re-uses host's rootfs (to some degree) and overlays our additions on top of that. This way we can test (not-only) the full user-space boot with sanitized dbus-broker without risking "damage" to the host machine.

mrc0mmand force-pushed the test-run-dbus-broker-under-sanitizers branch 3 times, most recently from 69331b9 to 4dd4726 Compare April 17, 2024 18:08

evverx mentioned this pull request Apr 18, 2024

There seems to be a memory leak in policy_import #360

Closed

evverx reviewed Apr 18, 2024

View reviewed changes

mrc0mmand force-pushed the test-run-dbus-broker-under-sanitizers branch 3 times, most recently from d03a43d to 1c7a0f8 Compare May 13, 2024 15:00

test: show dbus-broker's service status after running dfuzzer

506e25a

mrc0mmand force-pushed the test-run-dbus-broker-under-sanitizers branch 8 times, most recently from 6bb14bb to 8f32458 Compare May 14, 2024 10:11

test: require packages for the test instead of just recommending them

a2d78c3

We need all the listed packages so upgrade the dependency appropriately.

mrc0mmand force-pushed the test-run-dbus-broker-under-sanitizers branch 3 times, most recently from b86c1d0 to 718bb38 Compare May 14, 2024 19:36

evverx reviewed May 15, 2024

View reviewed changes

mrc0mmand force-pushed the test-run-dbus-broker-under-sanitizers branch from 718bb38 to 596eecb Compare May 15, 2024 16:03

mrc0mmand force-pushed the test-run-dbus-broker-under-sanitizers branch 3 times, most recently from c7495e3 to 50c0c52 Compare May 20, 2024 13:11

mrc0mmand added 3 commits May 20, 2024 17:34

debug-only: skip the UBSan checks

ddedfdd

mrc0mmand force-pushed the test-run-dbus-broker-under-sanitizers branch from 50c0c52 to ddedfdd Compare May 20, 2024 15:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: run dbus-broker under ASan and UBsan #359

test: run dbus-broker under ASan and UBsan #359

mrc0mmand commented Apr 17, 2024

mrc0mmand commented Apr 17, 2024 •

edited

Loading

evverx commented Apr 18, 2024

evverx Apr 18, 2024

evverx Apr 18, 2024 •

edited

Loading

mrc0mmand Apr 18, 2024

evverx Apr 18, 2024

mrc0mmand Apr 18, 2024 •

edited

Loading

mrc0mmand May 6, 2024

mrc0mmand commented May 15, 2024

evverx commented May 15, 2024

evverx May 15, 2024

mrc0mmand May 15, 2024

evverx May 15, 2024

mrc0mmand May 15, 2024 •

edited

Loading

evverx May 15, 2024

test: run dbus-broker under ASan and UBsan #359

Are you sure you want to change the base?

test: run dbus-broker under ASan and UBsan #359

Conversation

mrc0mmand commented Apr 17, 2024

mrc0mmand commented Apr 17, 2024 • edited Loading

evverx commented Apr 18, 2024

Choose a reason for hiding this comment

evverx Apr 18, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mrc0mmand Apr 18, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mrc0mmand commented May 15, 2024

evverx commented May 15, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mrc0mmand May 15, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mrc0mmand commented Apr 17, 2024 •

edited

Loading

evverx Apr 18, 2024 •

edited

Loading

mrc0mmand Apr 18, 2024 •

edited

Loading

mrc0mmand May 15, 2024 •

edited

Loading