Support for decompilation of 64-bit x86 files? #9

Closed
Manouchehri opened this issue Dec 13, 2017 · 10 comments

Comments

@Manouchehri
Contributor

Manouchehri commented Dec 13, 2017

Since this question is going to get asked sooner or later, might as well ask it now: what needs to be implemented for 64-bit support?

llvmir2hll was able to decompile a simple hello world LLVM IR file compiled by a 64-bit Linux and macOS host. I didn't test anything complex or McSema (which someone should definitely try and let us know!).

Is bin2llvmir the main roadblock?

@PeterMatula
Collaborator

Hi,
I actually already hacked my local RetDec and tried a full x86-64 decompilation of a simple ack.c program (the one from retdec-regression-tests). It passed the whole chain all the way to C. It even looked somewhat OK-ish. The biggest problem was that x86-64 uses different calling conventions that we currently do not handle at all. Also, even if this were not an issue, we still would not just add the support without at least a few regression tests.
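
To illustrate the calling-convention gap: on 32-bit x86, cdecl passes every argument on the stack, while the System V AMD64 ABI (used on 64-bit Linux and macOS) passes the first six integer or pointer arguments in registers. The following is only a minimal sketch, not RetDec code; the function name `int_arg_location` and the convention labels are hypothetical, and only integer/pointer arguments are modelled.

```python
# Integer/pointer argument registers defined by the System V AMD64 ABI.
SYSV_AMD64_INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def int_arg_location(convention, index):
    """Return where the index-th (0-based) integer argument lives at function entry.

    `convention` is a hypothetical label: "cdecl32" or "sysv64".
    """
    if convention == "cdecl32":
        # At entry, [esp] holds the return address; arguments follow in 4-byte slots.
        return f"[esp+{4 * (index + 1)}]"
    if convention == "sysv64":
        if index < len(SYSV_AMD64_INT_REGS):
            return SYSV_AMD64_INT_REGS[index]
        # Remaining arguments are on the stack above the 8-byte return address.
        stack_slot = index - len(SYSV_AMD64_INT_REGS)
        return f"[rsp+{8 * (stack_slot + 1)}]"
    raise ValueError(f"unknown convention: {convention}")


if __name__ == "__main__":
    for i in range(8):
        print(i, int_arg_location("cdecl32", i), int_arg_location("sysv64", i))
```

An analysis that only looks for stack-passed arguments, as is sufficient for cdecl, would miss the register-passed arguments entirely on x86-64.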

When adding a new architecture, most of the work needs to be done in capstone2llvmir. This library already supports 64-bit for x86, mips, and ppc. However, other changes, like in this case, might be required to get some reasonable results.

So, what needs to be done to enable x86-64:

  1. Disable RetDec's defences against unsupported formats -- currently x86-64 won't even get through the scripts.
  2. In the decoder, create the x86 decoder set to 64-bit mode.
  3. Add support for x86-64 calling conventions.
  4. Test it a bit. If other major problems occur, modify other parts.
  5. Write some regression tests.
  5. Write some regression tests.
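
As a rough illustration of why step 2 matters: the same bytes decode differently depending on the decoder mode. In 64-bit mode the bytes 0x40-0x4F are REX prefixes, whereas in 32-bit mode they are one-byte `inc`/`dec` instructions. For example, `48 89 e5` is `mov rbp, rsp` in 64-bit mode but `dec eax` followed by `mov ebp, esp` in 32-bit mode. The sketch below is not RetDec or Capstone code; `classify_byte` is a hypothetical helper that handles only this one byte range.

```python
# Register numbering used by the one-byte INC/DEC encodings in 32-bit mode.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def classify_byte(byte, mode_bits):
    """Classify a byte in the 0x40-0x4F range for a 32-bit or 64-bit decoder."""
    if not 0x40 <= byte <= 0x4F:
        return "other"
    if mode_bits == 64:
        # In 64-bit mode the byte is a REX prefix with bit layout 0100WRXB.
        w = (byte >> 3) & 1
        r = (byte >> 2) & 1
        x = (byte >> 1) & 1
        b = byte & 1
        return f"REX(W={w},R={r},X={x},B={b})"
    # In 32-bit mode: 0x40-0x47 is INC r32, 0x48-0x4F is DEC r32.
    op = "inc" if byte < 0x48 else "dec"
    return f"{op} {REGS32[byte & 7]}"


if __name__ == "__main__":
    print(classify_byte(0x48, 64))
    print(classify_byte(0x48, 32))
```

With Capstone, this difference corresponds to opening the x86 handle with `CS_MODE_64` instead of `CS_MODE_32`.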

We would like to get to it pretty soon, but before that, some refactorings are in order:

  1. Refactor capstone2llvmir -- v1.0 is kind of a prototype.
  2. Refactor decoder pass in bin2llvmir and merge it with control-flow pass -- this is a relic from before capstone2llvmir that was just forced to use it. It does not make sense to have these two parts separated. Once this is done, all decompilation results should be much better and adding x86-64 much easier.

@s3rvac s3rvac changed the title 64-bit Support? Support for decompilation of 64-bit files? Dec 13, 2017
@breznak breznak mentioned this issue Dec 14, 2017
@PeterMatula PeterMatula self-assigned this Dec 14, 2017
@PeterMatula
Collaborator

I have been asked what changes I made in order to try x86-64 decompilation. Instead of listing them somewhere, I decided to create a branch where it is enabled. So if anyone wants to play with it, here you go. I was able to decompile the simplest hello world program (hello-x86_64.zip). You can see in the hello.c.frontend.dsm file that 64-bit instructions were indeed created. I did not try anything more complex. There is a good chance it would not work.

Keep in mind that everything I wrote above still holds. This does not mean RetDec supports x86-64, or anything like that. All I did was let these files go through the scripts into decompilation (commit). Much more work will be needed to properly support this.

Please do not report any issues related to this. But if while you are playing with it you fix/improve something, feel free to contribute.

@Mcilie

Mcilie commented Apr 28, 2018

Guys, how's it coming along? When do you think it will be done?

@PeterMatula
Collaborator

@Mcilie I'm spending more time than I thought on #116, so this has not really moved much.

@bannsec

bannsec commented Jul 8, 2018

+1 on this. Given that the majority of binaries are 64-bit now (especially Linux ELFs), not having 64-bit decompilation support is a major usability issue.

@PeterMatula PeterMatula changed the title Support for decompilation of 64-bit files? Support for decompilation of 64-bit x86 files? Sep 12, 2018
@PeterMatula
Collaborator

I changed this issue to be specific to 64-bit x86 (x64), since this architecture was mainly discussed here, and it is better to have this separated from issues dealing with 64-bit support of other architectures (e.g. #268).

@PeterMatula PeterMatula added this to the x64 support milestone Sep 12, 2018
@PeterMatula
Collaborator

PeterMatula commented Sep 12, 2018

This is being worked on by one student as his bachelor thesis - see milestone and the referenced forked repository.

@jonahharris

@PeterMatula I've updated the Python-based replacement of the shell script with similar changes and tested it. I haven't had any x86-64 issues with the decompiler (yet). Since it's been a long time since your x86_64-enabled branch was updated, it was a huge merge with master plus the Python change, so I opted to make a separate branch, x86-64-support, from which you can pull the changes yourself.

@PeterMatula
Collaborator

@jonahharris This will let x64 files go through, and RetDec will probably generate some output, but as I wrote above, the output is not very good. What the mentioned student is doing at the moment:

  • Adding ABI specifications for supported architectures and possible ABIs (including x64 ABIs).
  • Rewriting analysis of functions' parameters and returns to uniformly use these specifications.
  • Adding support for x64 in other architecture-specific parts like static code detection, main detection, etc.
  • Maybe even adding basic support for extended instruction sets. This is a topic for another bachelor thesis, but XMM registers might be used to pass arguments on x64, so this is needed for this topic as well.
  • Adding tests for all this.
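
To make the first two bullets more concrete, here is a minimal sketch of what a data-driven ABI specification and a uniform argument analysis could look like. It is not taken from RetDec; the names `AbiSpec` and `assign_args` are hypothetical, and only scalar integer and floating-point arguments are modelled. It also shows why XMM registers matter: both x64 ABIs pass floating-point arguments in them, but the System V AMD64 ABI counts integer and floating-point registers independently, while the Microsoft x64 convention assigns registers by argument position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbiSpec:
    """Hypothetical description of how an ABI passes scalar arguments."""
    int_regs: tuple
    float_regs: tuple
    shared_slots: bool  # True if int and float arguments share positional slots


SYSV_AMD64 = AbiSpec(
    int_regs=("rdi", "rsi", "rdx", "rcx", "r8", "r9"),
    float_regs=tuple(f"xmm{i}" for i in range(8)),
    shared_slots=False,
)
MS_X64 = AbiSpec(
    int_regs=("rcx", "rdx", "r8", "r9"),
    float_regs=("xmm0", "xmm1", "xmm2", "xmm3"),
    shared_slots=True,
)


def assign_args(abi, kinds):
    """Map a sequence of "int"/"float" argument kinds to a register or "stack"."""
    locations = []
    used = {"int": 0, "float": 0}
    for position, kind in enumerate(kinds):
        if kind not in used:
            raise ValueError(f"unknown argument kind: {kind}")
        regs = abi.int_regs if kind == "int" else abi.float_regs
        # Positional ABIs pick the register by argument index; others count per class.
        slot = position if abi.shared_slots else used[kind]
        locations.append(regs[slot] if slot < len(regs) else "stack")
        used[kind] += 1
    return locations


if __name__ == "__main__":
    signature = ["int", "float", "int"]
    print(assign_args(SYSV_AMD64, signature))
    print(assign_args(MS_X64, signature))
```

Keeping the register lists in data rather than in the analysis code is what would let one parameter analysis serve several architectures and ABIs.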

The thesis is due at the end of the summer semester 2019 (May), but I hope we will merge parts of it much sooner and enable experimental support for x64.

@PeterMatula
Collaborator

After #513, RetDec has basic support for the 64-bit x86 architecture. Moreover, function parameter & return analysis, ABI, and other related modules were completely rewritten; this should improve quality for other architectures as well.

Big thanks to Peter Kubov (@xkubov).

  • We will put together a new release in the near future.
  • We will continue to improve quality of x64 decompilation.
  • ARM64 should also be ready in a few months.

I'm closing this, since basic x64 support has been added. We are aware of many more issues we need to fix/implement in order to improve x64 decompilation quality, but these can be solved in dedicated issues and not here.

6 participants