This tool experiments with a formal method for recovering the control flow graph of a program from a binary protected by virtualization obfuscation, even when the binary has been virtualized multiple times. Currently, it considers the transformations applied by:
- Tigress
- VMProtect
- Code Virtualizer
- O-LLVM [1]
- Other ad-hoc implementations [2].
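To make the target of the method concrete, here is a minimal sketch of what a virtualizing transformation does to a small function. It is hand-written for illustration and is not the output of any of the obfuscators listed above; the opcode set, the `Insn` layout, and the names `original`, `virtualized`, and `check_equivalence` are all hypothetical. The point is only that the loop of `original` no longer appears as branches in the native code: it is encoded as data (`program`) and replayed by a single dispatcher loop, so the native control flow graph of `virtualized` looks the same whatever the bytecode computes.

```cpp
#include <cassert>
#include <cstddef>

// Original function: sums x + (x-1) + ... + 1, with a visible loop in its CFG.
int original(int x) {
    int acc = 0;
    while (x > 0) {
        acc += x;
        x -= 1;
    }
    return acc;
}

// Hypothetical virtual instruction set (an assumption made for this sketch).
enum class Op { LOADI, ADD, SUBI, JLEZ, JMP, RET };

struct Insn {
    Op op;
    int a;  // register index, or jump target for JMP
    int b;  // register index, immediate, or jump target, depending on op
};

// The same computation, expressed as bytecode and run by a dispatcher.
// Register 0 holds x, register 1 holds acc.
int virtualized(int x) {
    static const Insn program[] = {
        {Op::LOADI, 1, 0},  // 0: acc = 0
        {Op::JLEZ,  0, 5},  // 1: if (x <= 0) goto 5
        {Op::ADD,   1, 0},  // 2: acc += x
        {Op::SUBI,  0, 1},  // 3: x -= 1
        {Op::JMP,   1, 0},  // 4: goto 1
        {Op::RET,   1, 0},  // 5: return acc
    };
    int reg[2] = {x, 0};
    std::size_t pc = 0;  // virtual program counter
    for (;;) {           // dispatcher: the only loop visible in native code
        const Insn& in = program[pc];
        switch (in.op) {
            case Op::LOADI: reg[in.a] = in.b;       ++pc; break;
            case Op::ADD:   reg[in.a] += reg[in.b]; ++pc; break;
            case Op::SUBI:  reg[in.a] -= in.b;      ++pc; break;
            case Op::JLEZ:  pc = (reg[in.a] <= 0) ? static_cast<std::size_t>(in.b) : pc + 1; break;
            case Op::JMP:   pc = static_cast<std::size_t>(in.a); break;
            case Op::RET:   return reg[in.a];
        }
    }
}

// Sanity check that both versions agree on a small range of inputs.
void check_equivalence() {
    for (int x = -3; x <= 20; ++x) {
        assert(original(x) == virtualized(x));
    }
}
```

Recovering the control flow graph in this toy setting would mean reconstructing the loop of `original` from executions of `virtualized`, without knowing the opcode encoding in advance. Virtualizing the interpreter again, with a second bytecode, gives the "virtualized multiple times" case.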
The code is in active development, still buggy and difficult to use. The underlying concolic execution engine is not fully published yet [3], though the currently published code can work with any concolic/fuzzing engine. Moreover, the strength of this tool depends only on the execution engine, which is a reasonable theoretical limit of the method.
Though I follow a mathematical approach, the main idea is simple. It may well have been used implicitly in many "unpack tutorials" by great hackers and crackers. The only original contribution here is a more solid theoretical basis that explains these concrete techniques, which leads to a "less ad-hoc" deobfuscation technique.
The tool is written mostly in C++ and OCaml, and uses the following great software:
Currently there is no documentation (if you are interested, I am very happy to answer any questions). I am also preparing a paper on this, but there is still a lot to do.
[1] O-LLVM does not yet support virtualization transformations, though control-flow-graph flattening can be considered a "light-weight" virtualization.
[2] Collected from crackmes.de.
[3] BinSec is in very active development and will be opened when it is ready; some technical documents and (rather old) source code can be found here.