When running Mutter on the native backend, it takes over the input and output of the system. This poses a problem when we have to debug it: when a debugger stops Mutter, the input and output will not be released and the debugger cannot be controlled.
A second system is required in that case to continue controlling the debugger. Usually we login to a tty on the target system, create a "screen" session, and then detach from it. On the other system, we log in to the target system with ssh, resume the screen session (screen -rd
), and then start Mutter the usual way. This even works in toolbox.
gdb usually works like with any other application. There are situations, especially when debugging the native backend, where breaking the execution can screw up the timing of the thing that's being debugged. The gdb scripting feature and conditional breakpoints can be a great help.
When our code creates a warning or a critical message, we can make glib send a signal when this code is hit, to make gdb stop there. This can be done with setting G_DEBUG
to either fatal-warnings
, fatal-criticals
, or both fatal-warnings,fatal-criticals
.
It's possible to make Mutter much more verbose by turning on some debugging topics with the MUTTER_DEBUG
environment variable.
The different topics are defined in src/core/util.c
as meta_debug_keys
. It's possible to enable multiple topics:
MUTTER_DEBUG="focus,stack" dbus-run-session mutter --wayland --nested
Gnome Shell has a build-in debugger called Looking Glass. It's possible to enable Mutter debug topics at runtime, show damage, run arbitrary JS code in the shell, and much more. It can be opened with the "Run a Command" prompt (Alt+F2) by executing lg
.
When a test fails when running meson test
, it makes sense to re-run this specific test with --print-errorlogs
. Meson's --wrapper
argument is not useful here because the test cases run in their own special environment (meta-dbus-runner.py
). Use the META_DBUS_RUNNER_WRAPPER
environment variable instead to specify a custom wrapper binary. Meson takes over input and output which means it might be required to run the test outside of meson. A failing test prints out the command line that was invoked:
11/243 mutter:cogl+cogl/conform / cogl-test-backface-culling-gl3 OK 1.49s
12/243 mutter:cogl+cogl/conform / cogl-test-backface-culling-gles2 OK 1.46s
^CWARNING: CTRL-C detected, interrupting mutter:cogl+cogl/conform / cogl-test-just-vertex-shader-gl3
13/243 mutter:cogl+cogl/conform / cogl-test-just-vertex-shader-gl3 INTERRUPT 0.75s killed by signal 15 SIGTERM
>>> COGL_DRIVER=gl3 G_TEST_BUILDDIR=/var/home/swick/Projects/mutter/build/src/tests/cogl/conform MALLOC_PERTURB_=83 G_TEST_SRCDIR=/var/home/swick/Projects/mutter/src/tests/cogl/conform G_ENABLE_DIAGNOSTIC=0 LD_LIBRARY_PATH=/var/home/swick/Projects/mutter/build/cogl/cogl:/var/home/swick/Projects/mutter/build/src/tests:/var/home/swick/Projects/mutter/build/mtk/mtk:/var/home/swick/Projects/mutter/build/clutter/clutter:/var/home/swick/Projects/mutter/build/src /var/home/swick/Projects/mutter/src/tests/meta-dbus-runner.py -- /var/home/swick/Projects/mutter/build/src/tests/cogl/conform/cogl-test-just-vertex-shader
Adding gdb:
[...] /var/home/swick/Projects/mutter/src/tests/meta-dbus-runner.py -- gdb --args /var/home/swick/Projects/mutter/build/src/tests/cogl/conform/cogl-test-just-vertex-shader
Some tests spawn a test client to exercise some functionality of the compositor. If the test client has a bug we have to help gdb out a bit. Usually we just break on the function spawning a test client (for example subsurface_corner_cases
) and then set follow-fork-mode child
. This can be quite annoying to do every time we re-run the test, so making use of gdb's scripting abilities can help out a lot:
set breakpoint pending on
break buffer_ycbcr_basic
commands 1
set follow-fork-mode child
break wayland_buffer_shm_allocate
continue
end
run
Debugging GL based applications can be hard without insights into the GL state that's actually in effect. Cogl has its own set of debug options in cogl/cogl/cogl-debug.c
, including one to show the shader sources (COGL_DEBUG=show-source
).
For more detailed state dumps, it's possible to use graphics debuggers such as apitrace and renderdoc.
Mutter has built-in Renderdoc support (see meta_backend_renderdoc_capture
and lg global.context.get_backend().renderdoc_capture()
).
To get renderdoc to capture mutter, LD_PRELOAD
must be set to the path of librenderdoc.so
. If any process is in between setting the environment variable and mutter, it will fail.
For example, LD_PRELOAD=path/to/librenderdoc.so dbus-run-session gnome-shell --wayland
will not work, but dbus-run-session env LD_PRELOAD=path/to/librenderdoc.so gnome-shell --wayland
will.
The way gdb starts the target also involves another process, so use gdb -ex 'set exec-wrapper env LD_PRELOAD=path/to/librenderdoc.so'
in those cases.
Apitraces can help by capturing multiple frames. The mutter invocation can be wrapped with apitrace trace --api=egl
to capture a trace which can be replay with eglretrace
. Apitrace can also be captured by renderdoc to select a single frame to investigate further.
-
Create a podman that can run gdb locally using the same image used in CI. The example below uses the tag
x86_64-2022-01-20
but this will depend on the image used by the failed CI job. The Fedora version may also differ.podman run -t -i --cap-add=SYS_PTRACE registry.gitlab.gnome.org/gnome/mutter/fedora/35:x86_64-2022-01-20 bash -l
-
Clone, build and install Mutter inside the container
git clone https://gitlab.gnome.org/[your-user]/mutter.git -b [merge-request-branch] cd mutter meson build ninja -C build install
-
Install debug utilities
dnf install -y gdb
-
Run the test case
meson test -C build failing-test-case
-
Run the test case in gdb
./src/tests/meta-dbus-runner.py meson test -C build --setup plain --gdb failing-test-case