Debugging divergence
Robert O'Callahan edited this page Oct 31, 2023 · 3 revisions
Debugging divergence is hard. Here are some things to try.
- In a failed replay, try following the emergency debugger instructions to get a stack. Are you somewhere suspicious, e.g. in code that might be manipulating memory shared outside the trace?
If you can reproduce the bug by re-recording:
- Try using Intel PT to check for control flow divergence; see below.
- Try using memory checksums to identify changes in memory values before the divergence was detected.
To check control flow with Intel PT, first make sure that the "max locked memory" limit is very high (e.g. 1073741824 KB). Then record with Intel PT data collection enabled, e.g.
rr record --intel-pt ls
This captures the full tracee control flow into Intel's PT compressed trace representation. If you crash with an error about overflowing buffers, try increasing PT_PERF_AUX_SIZE in PerfCounters.cc. If you run out of memory, try reducing it.
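Before recording, it can help to confirm that the "max locked memory" limit is as high as suggested above. The following is a minimal sketch, not part of rr: the helper name memlock_ok and the variable required_kb are hypothetical, and it assumes a shell whose ulimit -l reports the limit in KB (or "unlimited"), as bash does.

```shell
# Hypothetical helper (not part of rr): succeeds if the current soft
# "max locked memory" limit is at least the given number of KB.
memlock_ok() {
  limit=$(ulimit -l)            # prints a number of KB, or "unlimited"
  if [ "$limit" = "unlimited" ]; then
    return 0
  fi
  [ "$limit" -ge "$1" ]
}

# Threshold taken from the example value in the text above (an assumption
# about what counts as "very high" on your machine).
required_kb=1073741824

if memlock_ok "$required_kb"; then
  echo "max locked memory limit looks high enough"
else
  echo "max locked memory limit is $(ulimit -l) KB; raise it before recording"
fi
```

If the check fails, the limit has to be raised in the shell that will launch rr record, since a limit changed elsewhere does not apply to that shell.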
Install libipt. Build rr with cmake -Dintel_pt_decoding=TRUE. Then
rr replay --intel-pt-start-checking-event=1 -a
This captures tracee control flow during replay and checks it (starting at event 1) against the control flow during recording. The instructions at the first divergence in control flow will be reported.