Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i#2491 ARM unallocated encodings: Better handling of undecoded instrs. #2495

Closed
wants to merge 2 commits into from

Conversation

egrimley
Copy link
Contributor

@egrimley egrimley commented Jul 4, 2017

In check_encode_decode_consistency(), do not try to reencode an
OP_UNDECODED instruction.

In instr_encode_arch(), use instr->opcode, not instr_get_opcode(instr),
as the latter can call the decoder, which is unhelpful when we are
trying to encode.

Fixes #2491

Change-Id: I66c42dc87268f1722eae4600b364026896796fc8

In check_encode_decode_consistency(), do not try to reencode an
OP_UNDECODED instruction.

In instr_encode_arch(), use instr->opcode, not instr_get_opcode(instr),
as the latter can call the decoder, which is unhelpful when we are
trying to encode.

Fixes #2491

Change-Id: I66c42dc87268f1722eae4600b364026896796fc8
Modify drreg-test.c to test an unallocated encoding on ARM, rather than
the officially undefined UDF, and modify this test on both ARM and
AArch64 so that it can distinguish between an unallocated encoding,
which causes SIGILL, and a NOP.

Change-Id: I544031ec55b8c2ecf0b60599471a79d94ac1c64a
@egrimley
Copy link
Contributor Author

egrimley commented Jul 4, 2017

AppVeyor failure is #2286.

@@ -2766,7 +2766,7 @@ instr_encode_arch(dcontext_t *dcontext, instr_t *instr, byte *copy_pc, byte *fin
}

decode_info_init_for_instr(&di, instr);
di.opcode = instr_get_opcode(instr);
di.opcode = instr->opcode;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems better to match what x86 does: move handling of raw-bits-valid instrs up above this decode_info_t init. That will be more efficient and more symmetric. Then we have an invariant that it's level 3+ by the time we get here, which the operands_valid assert will double-check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume you mean: copy the 7 lines starting with "/* first handle the already-encoded instructions */" from x86/encode.c and insert them at line 2767. I'll test that tomorrow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not from x86, just reorder the lines in this routine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the comment at line 2776: "We need to track the IT block state even for raw-bits-valid instrs." So we can't significantly reorder this stuff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are perhaps two general, cross-architecture things here which I don't want to address in this commit:

  • instr_get_opcode looks like a harmless accessor but may invoke the (full) decoder.

  • OP_UNDECODED might mean that we haven't got round to decoding this instruction, or it might mean that we have tried and found that the encoding does not correspond to any known instruction (except on AArch64 where the decoder always succeeds). Perhaps this distinction is well-defined somehow, but it's not clear to me, at least.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the original problem... I think it's that INSTR_RAW_BITS_VALID gets cleared by the use of instr_get_opcode, perhaps by a call to instr_reuse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or perhaps because of the early return from decode_opcode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The x86 code has an early return from decode_opcode, before the call to instr_set_raw_bits, but the x86 code does not call instr_get_opcode from instr_encode_arch before testing instr_raw_bits_valid. An awful lot of twiddling of INSTR_RAW_BITS_VALID can occur because of an innocent-looking instr_get_opcode...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

encode_track_it_block_di() needs to know whether this is OP_it, so it seems like it has to call the decoder from this encode routine if the instr is undecoded.

For instructions that the manual calls "unpredictable": see #1685.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the original problem... I think it's that INSTR_RAW_BITS_VALID gets cleared by the use of instr_get_opcode, perhaps by a call to instr_reuse.

Looks like decode_opcode should set the raw bits before returning NULL on an invalid instr (and probably that CLIENT_ASSERT should not be there since it's called internally like this). Messy -- but so is the IT block tracking so I don't think there's a simpler solution.

Why is it OP_UNDECODED in the first place -- ARM doesn't have a fast decoder so isn't the opcode always set or set to OP_INVALID?

@fhahn
Copy link
Contributor

fhahn commented Jul 5, 2017

Could this be tested in dis-armA32? Maybe I am missing something, but as long as the binary is not executed, could we add an invalid instruction encoding to the test binary? The decoder should use OP_UNDECODED and not encode it again (and not crash at it does now I assume).

@egrimley
Copy link
Contributor Author

egrimley commented Jul 5, 2017

There are already several invalid encodings in dis-armA32. The crash only happens when you try to execute one of them under DynamoRIO.

@egrimley
Copy link
Contributor Author

egrimley commented Jul 5, 2017

Is the real, original problem, perhaps, that the A32 decoder is returning OP_UNDECODED instead of OP_INVALID for 0xf2800f00?

@derekbruening
Copy link
Contributor

Is the real, original problem, perhaps, that the A32 decoder is returning OP_UNDECODED instead of OP_INVALID for 0xf2800f00?

Do we know why this happened? ARM doesn't have a fast decoder so isn't the opcode always set or set to OP_INVALID?

@egrimley
Copy link
Contributor Author

egrimley commented Jul 5, 2017

Correction: the decoder originally gives OP_INVALID, but bb_process_invalid_instr(), called from build_bb_ilist(), changes it to OP_UNDECODED by calling instr_set_raw_bits(). It's a maze of twisty little passages, most of them fiddling with INSTR_RAW_BITS_VALID.

@egrimley egrimley closed this Jul 10, 2017
@egrimley egrimley deleted the i2491-undecoded branch July 10, 2017 09:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants