r2dec is written mostly in JavaScript and runs on the QuickJS engine.
To add a new architecture, first create a new .js file inside js/libdec/arch/ (for example js/libdec/arch/arch9999.js). It needs to follow this minimal base JavaScript template:
    import Base from '../core/base.js';

    export default {
        instructions: {
            add: function(instr, context, instructions) {
                return Base.add(instr.parsed.opd[0], instr.parsed.opd[1], instr.parsed.opd[2]);
            },
            nop: function() {
                return Base.nop();
            },
            invalid: function(instr, context, instructions) {
                return Base.nop();
            }
        },
        parse: function(assembly) {
            var tokens = assembly.trim().split(' ');
            return { mnem: tokens.shift(), opd: tokens };
        },
        context: function() {
            return { cond: { a: '?', b: '?' } };
        },
        preanalisys: function(instructions, context) {},
        postanalisys: function(instructions, context) {},
        localvars: function(context) {
            return [];
        },
        globalvars: function(context) {
            return [];
        },
        arguments: function(context) {
            return [];
        },
        returns: function(context) {
            return 'void';
        }
    };
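As a rough illustration of how the handlers in such a template might be exercised, the sketch below walks a list of already-parsed instructions, dispatches each one by mnemonic, and falls back to `invalid` for unknown ones. Everything outside the template is an assumption made for the example: `Base` is a local stand-in that returns plain strings, the `cmp` and `beq` handlers are made up to show values being passed through the context, and the `run` helper is hypothetical. r2dec's real driver may work differently.

```javascript
// Hypothetical stand-in for '../core/base.js'; plain strings keep the sketch runnable.
const Base = {
    add: (dst, a, b) => `${dst} = ${a} + ${b}`,
    nop: () => null
};

const arch = {
    instructions: {
        add: function(instr, context, instructions) {
            return Base.add(instr.parsed.opd[0], instr.parsed.opd[1], instr.parsed.opd[2]);
        },
        // Illustrative handler: remembers the compared operands for a later instruction.
        cmp: function(instr, context, instructions) {
            context.cond.a = instr.parsed.opd[0];
            context.cond.b = instr.parsed.opd[1];
            return Base.nop();
        },
        // Illustrative handler: reads back what cmp stored in the shared context.
        beq: function(instr, context, instructions) {
            return `if (${context.cond.a} == ${context.cond.b}) goto ${instr.parsed.opd[0]}`;
        },
        invalid: function(instr, context, instructions) {
            return Base.nop();
        }
    },
    context: function() {
        return { cond: { a: '?', b: '?' } };
    }
};

// Hypothetical driver: one context per run, one handler call per instruction.
function run(arch, instructions) {
    const context = arch.context();
    return instructions.map(function(instr) {
        const mnem = instr.parsed.mnem;
        const known = Object.prototype.hasOwnProperty.call(arch.instructions, mnem);
        const handler = known ? arch.instructions[mnem] : arch.instructions.invalid;
        return handler(instr, context, instructions);
    });
}

const output = run(arch, [
    { parsed: { mnem: 'add', opd: ['r0', 'r1', 'r2'] } },
    { parsed: { mnem: 'cmp', opd: ['r0', 'r3'] } },
    { parsed: { mnem: 'beq', opd: ['0x1000'] } },
    { parsed: { mnem: 'frobnicate', opd: [] } }
]);
console.log(output);
```

The own-property check keeps mnemonics such as `constructor` from resolving to inherited object members instead of the `invalid` handler.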
After saving the new arch (arch9999.js in the example), add it to the file libdec/Archs.js. The new architecture must have the same name as the value reported by the command e asm.arch, because the architecture is chosen from that value alone, regardless of the bits or other settings of the arch.
For example:
    import arm from './arch/arm.js';
    import arch9999 from './arch/arch9999.js';
    import x86 from './arch/x86.js';

    export default {
        arm: arm,
        arch9999: arch9999,
        x86: x86
    };
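To make the naming rule concrete, here is a minimal sketch of a lookup keyed only by the `asm.arch` string. The `selectArch` helper and the placeholder arch objects are assumptions for illustration; they are not taken from the r2dec sources.

```javascript
// Hypothetical stand-ins for the modules imported in Archs.js.
const arm = { name: 'arm' };
const arch9999 = { name: 'arch9999' };
const x86 = { name: 'x86' };

const Archs = { arm: arm, arch9999: arch9999, x86: x86 };

// Hypothetical helper: asmArch is the string reported by `e asm.arch`.
// Only the name is used for the lookup; bits and other settings play no role.
function selectArch(archs, asmArch) {
    if (!Object.prototype.hasOwnProperty.call(archs, asmArch)) {
        throw new Error('unsupported architecture: ' + asmArch);
    }
    return archs[asmArch];
}

console.log(selectArch(Archs, 'arch9999').name);
```

If the key in Archs.js differed from the `asm.arch` value (for example `Arch9999`), a lookup like this would fail, which is why the names have to match exactly.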
The codebase uses the Base object:
- All the common instructions are under Base.*; they follow the form fcn(destination, reg0, reg1, ...).
- Base.composed allows building a set of readable instructions that together express a complex opcode/instruction (see for example rlwimi under ppc.js).
- Variable.* includes the creators for known arguments, like:
  - Variable.functionPointer(value, type_or_bits, is_signed) defines a function pointer (uint16_t (*mypointer)(...)) as an argument for Base.*.
  - Variable.pointer(value, type_or_bits, is_signed) defines a pointer (uint16_t* mypointer) as an argument for Base.*.
  - Variable.local(value, type_or_bits, is_signed) defines a local (int32_t mylocal) as an argument for Base.*.
  - Variable.string(content) is meant to be used for strings as an argument of Base.* functions; e.g.: return Base.assign('r0', Base.string('"wooooow"'));
- var Long = require('libdec/long'); can be used to support 64-bit values in JavaScript.
- All the instructions added under arch.instructions.* receive (instr, context, instructions) as input, where:
  - instr is the current instruction being analyzed.
  - context is an object that can be used to store values that will be used by instructions analyzed later.
  - instructions is the array with all the libdec/instruction.js derived objects, which can be used to recover any additional information that might be needed.

  One last thing: the context used by instructions to store/retrieve data is generated by arch.context().
- arch.parse is used to parse the instruction into a JSON object:
  - an example: "add r0, r1, r2" to { "mnem": "add", "opd": ["r0", "r1", "r2"] }.
  - assembly is a string containing the r2 enriched assembly (for example: call sym.imp.__libc_start_main).
  - simplified is a string containing the standard assembly (for example: call 0x4a234).
  - the returned object is available under instr.parsed, while the original string can still be found under instr.assembly or instr.simplified.
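The parse step described above can be sketched as follows. This is only one possible way to turn "add r0, r1, r2" into the documented object: it takes the first whitespace-delimited token as the mnemonic and splits the rest on commas. The `makeInstr` helper and the exact shape of the instruction object are assumptions for illustration, not the real libdec/instruction.js.

```javascript
// Sketch of an arch.parse implementation that strips the operand separators.
function parse(assembly) {
    const text = assembly.trim();
    const space = text.search(/\s/);
    if (space < 0) {
        // Mnemonic only, e.g. "nop".
        return { mnem: text, opd: [] };
    }
    const opd = text.slice(space + 1)
        .split(',')
        .map(function(token) { return token.trim(); })
        .filter(function(token) { return token.length > 0; });
    return { mnem: text.slice(0, space), opd: opd };
}

// Hypothetical instruction wrapper: keeps both original strings next to the parsed form.
function makeInstr(assembly, simplified) {
    return { assembly: assembly, simplified: simplified, parsed: parse(assembly) };
}

const addInstr = makeInstr('add r0, r1, r2', 'add r0, r1, r2');
const callInstr = makeInstr('call sym.imp.__libc_start_main', 'call 0x4a234');

console.log(JSON.stringify(addInstr.parsed));
console.log(callInstr.parsed.opd[0], callInstr.simplified);
```

Splitting on commas is a simplification: architectures whose operands contain commas inside brackets or parentheses would need a more careful tokenizer.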
Deroad.