Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRASH: tool pc-rel instru mangled by DR can be mistaken as clean call instru and mis-translated #5786

Closed
derekbruening opened this issue Dec 14, 2022 · 0 comments · Fixed by #5791
Assignees

Comments

@derekbruening
Copy link
Contributor

We are seeing crashes when using drbbdup on x86 on detach. The drbbdup state restore function seems perfectly correct: it is DR that is messing things up. We finally tracked it down to DR thinking that drbbdup's case load instructions are in fact a clean call setup. DR then tries to restore the context from the dstack and gets incorrect values.

This is with drmemtrace where the drbbdup case operand is a pc-relative load. DR has a feature where it will mangle tool-inserted pc-relative references into two-part sequences if they won't reach in their code cache destination. Thus we end up with something like this (this is with no tool instrumentation aside from drbbdup, so we just have 2 copies of this simple cmp;jnz app block) where diagnostics are marking 3 instrs considered "our_mangling":

 +0    m4 @0x00007f6b07b75f70  65 48 89 3c 25 20 01 mov    %rdi -> %gs:0x00000120[8byte]
                               00 00
 +9    m4MANGLING @0x00007f6b0b183178  48 bf 88 34 35 6d 17 mov    $0x000056176d353488 -> %rdi
                               56 00 00
 +19   m4MANGLING @0x00007f6b0b1832c8  48 8b 3f             mov    (%rdi)[8byte] -> %rdi
 +22   m4 @0x00007f6b07b75d90                       <label note=0x000000000000004e>
 +22   m4 @0x00007f6b0b183260  e9 49 00 00 00       jmp    @0x00007f6b0b1830a8[8byte]
 +27   m4 @0x00007f6b0b183040  65 48 8b 3c 25 20 01 mov    %gs:0x00000120[8byte] -> %rdi
                               00 00
 +36   L3 @0x00007f6b07b75ec8  80 7d bc 00          cmp    0xffffffbc(%rbp)[1byte] $0x00
 +40   m4 @0x00007f6b0b183348                       <label note=0x0000000000000001>
 +40   m4 @0x00007f6b07b75c40  e9 4a 00 00 00       jmp    @0x00007f6b07b75cc0[8byte]
 +45   m4 @0x00007f6b07b75d28                       <label note=0x0000000000000002>
 +45   m4 @0x00007f6b0b1830a8                       <label note=0x000000000000004e>
 +45   m4 @0x00007f6b0b1831e0  65 48 8b 3c 25 20 01 mov    %gs:0x00000120[8byte] -> %rdi
                               00 00
 +54   L3 @0x00007f6b0b1833c8  80 7d bc 00          cmp    0xffffffbc(%rbp)[1byte] $0x00
 +58   m4 @0x00007f6b07b75e60                       <label note=0x0000000000000001>
 +58   m4 @0x00007f6b07b75cc0                       <label note=0x000000000000004f>
 +58   m4 @0x00007f6b07b75df8                       <label note=0x0000000000000002>
 +58   L4 @0x00007f6b07b75b98  0f 85 62 53 e2 09    jnz    $0x000056176cab2268
 +64   L4MANGLING @0x00007f6b0b183430  e9 f2 52 e2 09       jmp    $0x000056176cab21f7

The first 2 of those are the 2-part case load.

The current translation code has no precise way to identify clean calls and instead it identifies everything else and says that anything in "our_mangling" with no translation must then be a clean call. But DR mangling tool code breaks this assumption.

We do not want to not mark DR-mangled tool code as not our_mangling as there can be spills and restores in there and we want DR to properly restore those.

We need to either precisely identify mangle_rel_addr() sequences, or precisely identify clean call sequences. Both are complex. Since the pc-rel doesn't need any action beyond regular spill tracking, it seems better to identify clean calls. There are plenty of bits in translation_info_t, but not in instr_t: so I'm thinking of using labels for the instrlist case, and flags for the stored-info case.

@derekbruening derekbruening self-assigned this Dec 14, 2022
derekbruening added a commit that referenced this issue Dec 15, 2022
Adds new labels delimiting clean call sequences.
Converts into a translation record flag when storing translations.

Uses the new labels and flag to precisely identify clean call
mangling, replacing the previous scheme which incorrectly thought
mangled tool pc-relative was a clean call, resulting in incorrect
translations and crashes.

Adds a test case to api.detach_state by adding a client (by converting
it to use static DR) which inserts a pc-relative load.  This
reproduces the crash on detach, and is fixed with this fix.
The added instrumentation caused periodic detach failures which were
solved by setting the translation and adding a restore-state event:
i#4232 covers trying to improve the situation.

Issue: #5786, #4232
Fixes #5786
derekbruening added a commit that referenced this issue Jan 14, 2023
Adds new labels delimiting clean call sequences.
Converts into a translation record flag when storing translations.

Uses the new labels and flag to precisely identify clean call
mangling, replacing the previous scheme which incorrectly thought
mangled tool pc-relative was a clean call, resulting in incorrect
translations and crashes.

Adds a test case to api.detach_state by adding a client (by converting
it to use static DR) which inserts a pc-relative load.  This
reproduces the crash on detach, and is fixed with this fix.
The added instrumentation caused periodic detach failures which were
solved by setting the translation and adding a restore-state event:
i#4232 covers trying to improve the situation.

Adds a new instr_t.offset field.
Stops using instr_t.note to hold encoding offsets for pc-releative
operands.  Adds a new field instr_t.offset which is used for this
purpose.  This leaves note values in place across encodings, which is
needed for new clean call marking labels and also simplifies rseq
handling code.

This instr_t field is a compatibility break and we bump the version and 
OLDEST_COMPATIBLE_VERSION here to 990.

Updates dr_get_note docs.

Augments logging of xl8 info with new flag info.

Reduces DR_NOTE_FIRST_RESERVED to give DR more reserved labels.
This is another compatibility break, while at it.

Fixes several issues hit in tests that happened to trigger on the
heap bucket size and other changes:
+ Fixes a rank order violation at loglevel 5: xref #1649
+ Writes real xstate_bv into signal frame when setting the xstate context to
   avoid lazy AVX restore problems.
+ Tweaks the thread_churn test to work around non-linearities.

Issue: #5786, #4232
Fixes #5786
dolanzhao pushed a commit that referenced this issue Jan 30, 2023
Adds new labels delimiting clean call sequences.
Converts into a translation record flag when storing translations.

Uses the new labels and flag to precisely identify clean call
mangling, replacing the previous scheme which incorrectly thought
mangled tool pc-relative was a clean call, resulting in incorrect
translations and crashes.

Adds a test case to api.detach_state by adding a client (by converting
it to use static DR) which inserts a pc-relative load.  This
reproduces the crash on detach, and is fixed with this fix.
The added instrumentation caused periodic detach failures which were
solved by setting the translation and adding a restore-state event:
i#4232 covers trying to improve the situation.

Adds a new instr_t.offset field.
Stops using instr_t.note to hold encoding offsets for pc-releative
operands.  Adds a new field instr_t.offset which is used for this
purpose.  This leaves note values in place across encodings, which is
needed for new clean call marking labels and also simplifies rseq
handling code.

This instr_t field is a compatibility break and we bump the version and 
OLDEST_COMPATIBLE_VERSION here to 990.

Updates dr_get_note docs.

Augments logging of xl8 info with new flag info.

Reduces DR_NOTE_FIRST_RESERVED to give DR more reserved labels.
This is another compatibility break, while at it.

Fixes several issues hit in tests that happened to trigger on the
heap bucket size and other changes:
+ Fixes a rank order violation at loglevel 5: xref #1649
+ Writes real xstate_bv into signal frame when setting the xstate context to
   avoid lazy AVX restore problems.
+ Tweaks the thread_churn test to work around non-linearities.

Issue: #5786, #4232
Fixes #5786
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant