-
Notifications
You must be signed in to change notification settings - Fork 588
Debugging protips
Protips both for debugging rr and tracees.
All .cc files (should) have a line near the top that looks like
//#define DEBUGTAG "Sched" //... #include "dbg.h"
Uncomment the definition and you'll get full, verbose, debug-level logging output.
In the .cc file you want to redirect
static FILE* locallog = fopen("/tmp/rr-sched.log", "w"); #define LOG_FILE locallog //... #include "dbg.h"
rr doesn't implement gdb multi-process support yet, so you're using 1980s/1990s-era debugging technology. If you gdb the-wrong-image
, gdb will get very confused. If you're trying to attach to a tracee after seeing
(rr debug server listening on :X)
then you can make use of the fact that X is the tid of the task for which a debugger server has been started. Launch gdb using the following command
gdb /proc/X/exe
and gdb will load the right executable image for the tracee.
Useful for many things. For example
$ pstree -p $(pidof rr) rr(1969)───firefox(1975)─┬─Browser(2178)─┬─{Browser}(2179) │ ├─{Browser}(2180) │ ├─{Browser}(2181) │ ├─{Browser}(2182) │ ├─{Browser}(2183) │ ├─{Browser}(2184) │ ├─{Browser}(2185) │ ├─{Browser}(2186) │ ├─{Browser}(2187) │ ├─{Browser}(2188) │ ├─{Browser}(2189) │ ├─{Browser}(2190) │ ├─{Browser}(2191) │ └─{Browser}(2192) ├─{firefox}(1982) ├─{firefox}(1983) ├─{firefox}(1984) ├─{firefox}(1985) ├─{firefox}(1986) ├─{firefox}(1987) ├─{firefox}(1988) ├─{firefox}(1989) ├─{firefox}(1990) ├─{firefox}(1991) ├─{firefox}(1992) ├─{firefox}(1993) ├─{firefox}(1994) ├─{firefox}(1995) ├─{firefox}(1996) ├─{firefox}(1997) ├─{firefox}(1998) ├─{firefox}(2000) ├─{firefox}(2001) ├─{firefox}(2002) ├─{firefox}(2003) ├─{firefox}(2004) ├─{firefox}(2005) ├─{firefox}(2006) ├─{firefox}(2008) ├─{firefox}(2010) ├─{firefox}(2011) ├─{firefox}(2012) ├─{firefox}(2013) ├─{firefox}(2014) ├─{firefox}(2016) └─{firefox}(2115)
Listed tasks that are in {curly-brackets}(tid)
are clone children, aka threads. Other tasks listed not-in-curly-brackets(tid)
are fork children, aka subprocesses.
(gdb) call 'Task::dump_all'(0)
dumps information about all known tracees to stderr. F.e.
Task<0x81bae80>(tid:7373 rec_tid:7373 status:0x857f) [INFO] (log_event) SYSCALL: futex
Or to dump a specific task, (gdb) call r->dump(0)
.
Sometimes you'll be debugging and see output like
[FATAL] (/home/cjones/rr/rr/src/replayer/replayer.cc:1729:emergency_debug: errno: None) (trace line 297399) -> (session doesn't look interactive, aborting emergency debugging)
This is to guard against rr deadlocking when run from a script, where the user may not be able to see that an rr assertion has failed. For unit test scripts that tee output, e.g., where the user can see it but rr's can-the-user-see-this heuristic fails, this behavior is not at all what you want. Pass the command line flag rr -f ...
to override the heuristic and always launch a gdbserver when requested.
If you want to debug a tracee t
a known point in rr code, add a call like the following assert_exec(t, false, "")
. A gdbserver will launch for t
with a message like
[EMERGENCY] (file:line:function: errno: None) (task X (rec:Y) at trace line Z) -> Assertion `false' failed to hold: ' (rr debug server listening on :X)
A gdb tracee-program
then (gdb) target remote :X
will attach to the gdbserver.
In your debugger (gdb) p Task::find([tid])
.
Sometimes there's not a single point at which you know you'll want to launch the gdbserver. This often happens in live-lock-esque scenarios. You can do this by attaching a debugger to rr and then manually starting a gdbserver for a given task
$ gdb -p $(pidof rr) ... (gdb) p 'Task::find'([tid]) $1 = (Task *) 0x818c1c8 (gdb) call start_debug_server($1) (rr debug server listening on :X)
Then in another shell, follow the instructions above for attaching to a tracee.
Task
has a helper method next_roundrobin()
that returns a successor task in round-robin order, meaning each task is cycled through circularly. So if you can locate a single Task*
in a debugger, then you can find all the others as well by successive calls to t->next_roundrobin()
.
cgjones/rr-workbench: collection of helpers for automating rr tasks (especially running FF unit tests), and for random things like stringifying waitpid
status codes (status2text). It also contains an llvm Bell-Larus path logger and some proof-of-concept programs that have been written to prototype rr features.