Fix #41, #169 - Handle pointers with segment override correctly #391
The regression test bugs.idioms-reg-couples.TestCryptoLocker failed with these changes. Any reason why this regression test verifies that the bug I am fixing here is still present?
@@ -182,6 +182,7 @@ ShPtr<Expression> LLVMConverter::llvmConstantToExpression(llvm::Constant *c) {
             return IntToPtrCastExpr::create(op, llvmTypeToType(ce->getType()));

         case llvm::Instruction::BitCast:
+        case llvm::Instruction::AddrSpaceCast:
Thank you for the PR. Changes in capstone2llvmir and bin2llvmir will be reviewed by @PeterMatula, but I have a comment concerning the change in llvmir2hll. I think we should add case llvm::Instruction::AddrSpaceCast also to the following two places:
src/llvmir2hll/llvm/llvmir2bir_converters/orig_llvmir2bir_converter/llvm_converter.cpp (in LLVMConverter::visitCastInst()):
    case llvm::Instruction::BitCast: return BitCastExpr::create(op, llvmTypeToType(i.getType()));
src/llvmir2hll/llvm/llvmir2bir_converters/new_llvmir2bir_converter/llvm_instruction_converter.cpp (in LLVMInstructionConverter::convertConstExprToExpression):
    case llvm::Instruction::BitCast: return convertCastInstToExpression<BitCastExpr>(*cExpr);
Including adding a unit test into tests/llvmir2hll/llvm/llvmir2bir_converters/new_llvmir2bir_converter/llvm_value_converter_tests/llvm_instruction_converter_tests.cpp. This code is part of a new LLVMIR to BIR converter that is under development. To use it, you need to run retdec-decompiler.py with --backend-llvmir2bir-converter new.
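The request above amounts to letting the address space cast share the bitcast branch of each opcode switch. A minimal standalone sketch of that pattern follows; the CastOpcode enum and birCastKindFor() are hypothetical stand-ins for illustration only, not RetDec or LLVM API, and the function returns the name of the expression class rather than constructing one.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the LLVM cast opcodes discussed in the review.
enum class CastOpcode { IntToPtr, BitCast, AddrSpaceCast };

// Returns the name of the BIR expression class a cast opcode would map to.
// An address space cast deliberately shares the bitcast branch.
std::string birCastKindFor(CastOpcode opcode) {
    switch (opcode) {
        case CastOpcode::IntToPtr:
            return "IntToPtrCastExpr";
        case CastOpcode::BitCast:
        case CastOpcode::AddrSpaceCast: // treated the same as a bitcast
            return "BitCastExpr";
    }
    return "unknown"; // unreachable for valid enum values
}
```

With this shape, a unit test only needs to check that both opcodes yield the same expression kind.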
There we go. I made sure that the new LLVMIR to BIR converter can also deal with address space casts, and I added a unit test.
Thank you for this. We knew that work with I did not merge it directly to
int main(void)
{
    DWORD x = __readfsdword (0x18);
    // mov eax,dword ptr fs:[00000018h]
    return x;
}
gets decompiled to:
int main(int argc, char ** argv) {
    return *(int32_t *)24;
}
Hex-Rays generates this:
int __cdecl main(int argc, const char **argv, const char **envp)
{
    return __readfsdword(0x18u);
}
It looks like it generates the same functions MSVC uses (__readfsbyte, __writefsbyte).
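To illustrate the intrinsic-style output mentioned above, here is a rough sketch of how a name such as __readfsdword could be derived from the segment and the access width. The Segment enum and readIntrinsicName() are hypothetical helpers made up for this example; they are not part of RetDec, and only the resulting intrinsic names come from MSVC.

```cpp
#include <cassert>
#include <string>

// Hypothetical segment tag; only FS and GS have MSVC read intrinsics.
enum class Segment { FS, GS };

// Builds the MSVC intrinsic name for reading `bytes` bytes through a segment.
// Returns an empty string when no intrinsic exists for that width.
std::string readIntrinsicName(Segment seg, unsigned bytes) {
    std::string suffix;
    switch (bytes) {
        case 1: suffix = "byte"; break;
        case 2: suffix = "word"; break;
        case 4: suffix = "dword"; break;
        case 8: suffix = "qword"; break;
        default: return "";
    }
    return std::string("__read") + (seg == Segment::FS ? "fs" : "gs") + suffix;
}
```

For the 4-byte load from fs:[18h] in the example, this yields the same name Hex-Rays prints.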
We chose the second option, because it contains the load/store semantics for LLVM passes - there are actual
This will solve most/many problems described in #41 and #169, since it removes one source of work with
Also, if you have any objections/ideas about the things I wrote, let's discuss them here.
No objections on my end. I don't know the product too well, so I took the easy route - feel free to implement a proper solution. And - sure, ensuring that LLVM won't simply remove code it deems unreachable would be great, but that's also beyond my capabilities.
Merged to
See my analysis of the issue in #41 (#169 seems to be a duplicate). capstone2llvmir currently ignores segment overrides on pointers like fs:[0]. This makes LLVM think that the code is dereferencing null pointers, so it removes that code as unreachable.
The first step was to treat segment overrides correctly; the matching LLVM concept is address spaces. Unfortunately, the only place resembling documentation for the mapping between segments and address spaces is https://github.com/avast-tl/llvm/blob/725d0cee133c6ab9b95c493f05de3b08016f5c3c/lib/Target/X86/X86ISelDAGToDAG.cpp#L1427, so that's what I implemented.
Having address spaces on pointers caused issues further down the pipeline. bin2llvmir was always producing bitcasts for pointer casts; this had to be changed to address space casts when the address space changes.
And llvmir2hll didn't know about address space casts. I made it treat them the same as bitcasts, which should do in the short term. In fact, I don't really know how that add byte ptr gs:[ebx], dh instruction (the one that produced the address space cast) got into this driver or what it could be converted into.
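As a rough standalone sketch of the two steps described here: the LLVM X86 backend associates address space 256 with GS, 257 with FS and 258 with SS, and a pointer cast needs to become an address space cast only when source and destination address spaces differ. The enum and function names below are hypothetical, chosen for illustration; they are not the capstone2llvmir or bin2llvmir code.

```cpp
#include <cassert>

// Hypothetical tag for the segment override found on a memory operand.
enum class Segment { None, GS, FS, SS };

// Address space numbers used by the LLVM X86 backend for segment-relative
// pointers: 256 = GS, 257 = FS, 258 = SS. 0 is the default address space.
unsigned addressSpaceFor(Segment seg) {
    switch (seg) {
        case Segment::GS: return 256;
        case Segment::FS: return 257;
        case Segment::SS: return 258;
        case Segment::None: break;
    }
    return 0;
}

// Hypothetical tag for the kind of pointer-to-pointer cast to emit.
enum class PointerCast { BitCast, AddrSpaceCast };

// A plain bitcast is only valid within one address space; crossing address
// spaces requires an address space cast.
PointerCast pointerCastFor(unsigned srcAddrSpace, unsigned dstAddrSpace) {
    return srcAddrSpace == dstAddrSpace ? PointerCast::BitCast
                                        : PointerCast::AddrSpaceCast;
}
```

Under these assumptions, a pointer formed from fs:[0] lands in address space 257, so it is no longer indistinguishable from a null pointer in address space 0.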