-
Notifications
You must be signed in to change notification settings - Fork 567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i#2491 ARM unallocated encodings: Better handling of undecoded instrs. #2495
Conversation
In check_encode_decode_consistency(), do not try to reencode an OP_UNDECODED instruction. In instr_encode_arch(), use instr->opcode, not instr_get_opcode(instr), as the latter can call the decoder, which is unhelpful when we are trying to encode. Fixes #2491 Change-Id: I66c42dc87268f1722eae4600b364026896796fc8
Modify drreg-test.c to test an unallocated encoding on ARM, rather than the officially undefined UDF, and modify this test on both ARM and AArch64 so that it can distinguish between an unallocated encoding, which causes SIGILL, and a NOP. Change-Id: I544031ec55b8c2ecf0b60599471a79d94ac1c64a
AppVeyor failure is #2286. |
@@ -2766,7 +2766,7 @@ instr_encode_arch(dcontext_t *dcontext, instr_t *instr, byte *copy_pc, byte *fin | |||
} | |||
|
|||
decode_info_init_for_instr(&di, instr); | |||
di.opcode = instr_get_opcode(instr); | |||
di.opcode = instr->opcode; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems better to match what x86 does: move handling of raw-bits-valid instrs up above this decode_info_t init. That will be more efficient and more symmetric. Then we have an invariant that it's level 3+ by the time we get here, which the operands_valid assert will double-check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume you mean: copy the 7 lines starting with "/* first handle the already-encoded instructions */" from x86/encode.c and insert them at line 2767. I'll test that tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, not from x86, just reorder the lines in this routine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the comment at line 2776: "We need to track the IT block state even for raw-bits-valid instrs." So we can't significantly reorder this stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are perhaps two general, cross-architecture things here which I don't want to address in this commit:
-
instr_get_opcode
looks like a harmless accessor but may invoke the (full) decoder. -
OP_UNDECODED
might mean that we haven't got round to decoding this instruction, or it might mean that we have tried and found that the encoding does not correspond to any known instruction (except on AArch64 where the decoder always succeeds). Perhaps this distinction is well-defined somehow, but it's not clear to me, at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the original problem... I think it's that INSTR_RAW_BITS_VALID
gets cleared by the use of instr_get_opcode
, perhaps by a call to instr_reuse
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or perhaps because of the early return from decode_opcode
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The x86 code has an early return from decode_opcode
, before the call to instr_set_raw_bits
, but the x86 code does not call instr_get_opcode
from instr_encode_arch
before testing instr_raw_bits_valid
. An awful lot of twiddling of INSTR_RAW_BITS_VALID
can occur because of an innocent-looking instr_get_opcode
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
encode_track_it_block_di() needs to know whether this is OP_it, so it seems like it has to call the decoder from this encode routine if the instr is undecoded.
For instructions that the manual calls "unpredictable": see #1685.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the original problem... I think it's that INSTR_RAW_BITS_VALID gets cleared by the use of instr_get_opcode, perhaps by a call to instr_reuse.
Looks like decode_opcode should set the raw bits before returning NULL on an invalid instr (and probably that CLIENT_ASSERT should not be there since it's called internally like this). Messy -- but so is the IT block tracking so I don't think there's a simpler solution.
Why is it OP_UNDECODED in the first place -- ARM doesn't have a fast decoder so isn't the opcode always set or set to OP_INVALID?
Could this be tested in dis-armA32? Maybe I am missing something, but as long as the binary is not executed, could we add an invalid instruction encoding to the test binary? The decoder should use OP_UNDECODED and not encode it again (and not crash at it does now I assume). |
There are already several invalid encodings in dis-armA32. The crash only happens when you try to execute one of them under DynamoRIO. |
Is the real, original problem, perhaps, that the A32 decoder is returning OP_UNDECODED instead of OP_INVALID for 0xf2800f00? |
Do we know why this happened? ARM doesn't have a fast decoder so isn't the opcode always set or set to OP_INVALID? |
Correction: the decoder originally gives |
In check_encode_decode_consistency(), do not try to reencode an
OP_UNDECODED instruction.
In instr_encode_arch(), use instr->opcode, not instr_get_opcode(instr),
as the latter can call the decoder, which is unhelpful when we are
trying to encode.
Fixes #2491
Change-Id: I66c42dc87268f1722eae4600b364026896796fc8