Description
Currently, radare2 only parses the executable and reads the GCC version from the metadata embedded in it. This only works on ELF files. I would like to implement a toolchain provenance tool in radare2. The idea is to determine the exact version of the compiler, disassemble the malware, decompile it, and then get the same signature, proving that we have recovered exactly the same source code.
Toolchain analysis is still a work in progress in academic research. It is not even available in Ghidra.
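To make the limitation concrete, here is a minimal Python sketch of a metadata-only check. It is not radare2's actual implementation; it only assumes that GCC normally leaves an ident string such as "GCC: (GNU) 12.2.0" in the ELF .comment section, and it scans the raw bytes for that pattern. The names GCC_IDENT, find_gcc_idents and find_gcc_idents_in_file are made up for this illustration.

```python
import re

# Assumption: GCC ident strings look like b"GCC: (GNU) 12.2.0" or
# b"GCC: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0".
GCC_IDENT = re.compile(rb"GCC: \(([^)]*)\) (\d+\.\d+(?:\.\d+)?)")


def find_gcc_idents(data: bytes) -> list[tuple[str, str]]:
    """Return the unique (vendor, version) pairs found in the raw bytes."""
    seen = []
    for m in GCC_IDENT.finditer(data):
        pair = (m.group(1).decode("ascii", "replace"), m.group(2).decode("ascii"))
        if pair not in seen:
            seen.append(pair)
    return seen


def find_gcc_idents_in_file(path: str) -> list[tuple[str, str]]:
    """Hypothetical convenience wrapper: scan a file on disk."""
    with open(path, "rb") as f:
        return find_gcc_idents(f.read())
```

If the ident string has been removed, or the binary was never an ELF file with such a section, this returns an empty list, which is exactly the case the request below is about.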
Please describe what are you missing or wanting to be improved
The command
i | grep cc
should report the real version of GCC, based on toolchain provenance rather than on program metadata. We would then get:
- the compiler name
- the exact commit of the GCC release if the compiler is open source, otherwise the compiler version
- the compiler options used by GCC (-O0, etc.)
- results even if the binary is stripped or is a firmware/driver binary
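As a rough illustration of what "provenance instead of metadata" could mean, here is a toy Python sketch that guesses a toolchain label from the instruction stream alone. It compares bigrams of mnemonics against labeled reference profiles using cosine similarity. This is a deliberately simplified stand-in, not the method of the papers linked below and not anything radare2 provides; the function names and the idea of passing plain mnemonic lists are assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def bigrams(mnemonics):
    """Count adjacent mnemonic pairs, e.g. ("push", "mov")."""
    return Counter(zip(mnemonics, mnemonics[1:]))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def guess_toolchain(mnemonics, profiles):
    """Return (label, score) of the reference profile closest to the sample.

    `profiles` maps a hypothetical label such as "gcc -O0" to a list of
    mnemonics taken from binaries built with that known configuration.
    It must contain at least one entry.
    """
    sample = bigrams(mnemonics)
    scores = {label: cosine(sample, bigrams(ref)) for label, ref in profiles.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

A usage example with made-up toy sequences (real profiles would need to be built from many known binaries per compiler version and option set):

```python
profiles = {
    "gcc -O0": ["push", "mov", "sub", "mov", "mov", "leave", "ret"],
    "gcc -O2": ["xor", "ret", "lea", "ret"],
}
label, score = guess_toolchain(["push", "mov", "sub", "mov", "leave", "ret"], profiles)
```

Since this only looks at the code itself, it does not depend on symbols or on an ident string being present.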
Provide images, ascii-art, test files and anything that may help us understand your request
repo example:
https://github.com/dyninst/toolchain-origin
With neural network:
https://yuede.github.io/files/21_ACNS_Vestige.pdf
https://www.youtube.com/watch?v=wdzjVfwFAPc&ab_channel=IEEESANER2021
Without a neural network:
https://www.researchgate.net/profile/Barton-Miller/publication/220854600_Recovering_the_Toolchain_Provenance_of_Binary_Code/links/0deec52ceab25bf292000000/Recovering-the-Toolchain-Provenance-of-Binary-Code.pdf?origin=publication_detail
Stack Exchange resources:
https://reverseengineering.stackexchange.com/questions/11/what-hints-in-machine-code-can-point-me-to-the-compiler-which-was-used-to-genera