Tutorial utils

magrazia edited this page May 24, 2016 · 1 revision

First, let's look at jsondisass. The script offers several options, all of which are listed by the -h flag on the command line. Let's print the first instructions of a trace (here with -b 0 -e 10) in a format that asmcompare can consume.

       emdel -> python jsondisass.py -f 25-04-16_part1_0-78100_hwcontext.json -b 0 -e 10
	[-- ROPMEMU framework - jsondisass --]
	[+] Getting 25-04-16_part1_0-78100_hwcontext.json
	1) ret 
	2) pop rax
	3) ret 
	4) mov qword ptr [rax], rdx
	5) ret 
	6) pop rax
	7) ret 
	8) mov rdx, rcx
	9) ret 
	10) mov qword ptr [rax], rdx
	11) ret 

To save the listing for asmcompare, filter out the banner lines and redirect the output to a file:

           python jsondisass.py -b 0 -e 40 -f 20_hwcontext.json | grep -vi "ropmemu\|getting" > /tmp/20instr.jdisass
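
The grep filter above simply drops the two banner lines that jsondisass prints. As a minimal Python 3 sketch of the same filtering step (the function name `strip_banner` and the sample list are illustrative only and are not part of ROPMEMU):

```python
def strip_banner(lines):
    """Drop lines containing 'ropmemu' or 'getting', case-insensitively.

    This has the same effect as the grep -vi filter in the shell pipeline.
    """
    skip = ("ropmemu", "getting")
    return [ln for ln in lines if not any(word in ln.lower() for word in skip)]


# Sample lines copied from the jsondisass output shown earlier.
sample = [
    "[-- ROPMEMU framework - jsondisass --]",
    "[+] Getting 20_hwcontext.json",
    "1) ret ",
    "2) pop rax",
]

for line in strip_banner(sample):
    print(line)
```

Only the numbered instruction lines survive, which is the content asmcompare is given in /tmp/20instr.jdisass.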

Another feature that is useful during a debugging session is the search flag (-s). It lets the user follow a register or an instruction through the trace.

       emdel -> python jsondisass.py -f 25-04-16_part1_0-78100_hwcontext.json -B 0 -E 20 -s rax
	[-- ROPMEMU framework - jsondisass --]
	[+] Getting 25-04-16_part1_0-78100_hwcontext.json
	0xffff88001b800008-2 2 pop rax
	0xffff88001b800018-3 4 mov qword ptr [rax], rdx
	0xffff88001b800020-4 6 pop rax
	0xffff88001b800038-6 10 mov qword ptr [rax], rdx
	0xffff88001b8000a0-13 24 pop rax
	0xffff88001b8000b0-14 26 mov qword ptr [rax], rdx
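
When the search output has to be post-processed, each result line can be split into its three whitespace-separated parts. The sketch below is an assumption about how one might do that; `parse_search_line` is a hypothetical helper. The middle column lines up with the instruction numbers of the earlier listing (2, 4, 6, 10), while the meaning of the number after the dash is not stated on this page, so it is kept as an uninterpreted `index`.

```python
def parse_search_line(line):
    """Split one search result line such as
    '0xffff88001b800008-2 2 pop rax' into its fields."""
    key, number, instruction = line.strip().split(None, 2)
    address, _, index = key.partition("-")
    return {
        "address": int(address, 16),   # value before the dash
        "index": int(index),           # value after the dash (meaning assumed unknown)
        "number": int(number),         # instruction number in the listing
        "instruction": instruction,
    }


hit = parse_search_line("0xffff88001b800008-2 2 pop rax")
print(hex(hit["address"]), hit["number"], hit["instruction"])
```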

Now let's look at the asmcompare script. It takes two inputs: a text file generated by jsondisass and another text file generated by the unrop GDB script in mode 0. The script compares the two traces up to the last instruction of the shorter one.

            emdel -> python asmcompare.py /tmp/20instr.jdisass unrop_300.txt 
	[+] ropemu - total instructions: 40
	[+] GDB (unrop) - total instructions: 299
	[+] Results: 
		 - match: 40 - mismatch: 0
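
The comparison rule (stop at the end of the shorter trace, count matches and mismatches) can be sketched in a few lines of Python 3. This is not the asmcompare source: `normalize` and `compare` are hypothetical names, and the normalization rules (dropping the "N) " prefix, ignoring case and spacing) are assumptions chosen so that the two disassembly styles seen on this page compare equal.

```python
import re


def normalize(instr):
    """Reduce an instruction string to a canonical form for comparison."""
    instr = re.sub(r"^\d+\)\s*", "", instr.strip())  # drop a leading "N) "
    instr = re.sub(r"\s*,\s*", ",", instr)           # unify spacing around commas
    return re.sub(r"\s+", " ", instr).lower()        # collapse whitespace, ignore case


def compare(trace_a, trace_b):
    """Return (match, mismatch); zip() stops at the shorter trace."""
    match = mismatch = 0
    for a, b in zip(trace_a, trace_b):
        if normalize(a) == normalize(b):
            match += 1
        else:
            mismatch += 1
    return match, mismatch


emu = ["1) ret ", "2) pop rax", "3) ret ", "4) mov qword ptr [rax], rdx"]
gdb = ["ret", "pop    rax", "ret", "mov    QWORD PTR [rax],rdx", "ret"]
print("match: %d - mismatch: %d" % compare(emu, gdb))
```

Using `zip` mirrors the behaviour described above: the 299-instruction GDB trace is only examined for its first 40 entries when the emulator trace holds 40.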

Another useful script is cpucompare. It takes several inputs: the JSON trace generated by ropemu, the JSON trace from the unrop GDB script, and the number of gadgets. A possible usage:

      emdel -> python cpucompare.py 20_hwcontext.json unrop_300.json 2 4
	[-- ROPMEMU framework -- cpucompare --]

	[+] emu instruction: ret 
	[+] Unrop instruction: mov    QWORD PTR [rax],rdx

	[+] Results:
		 - Mismatch EFLAGS 0x0 0x97
		 - Mismatch RAX 0xffff88001bc00000 0xffff880024c00000
		 - Mismatch RBP 0x0 0x4
		 - Mismatch RBX 0x0 0xc9e660
		 - Match RCX 0x0 0x0
		 - Mismatch RDI 0x0 0x4
		 - Mismatch RDX 0x0 0x10
		 - Mismatch RSI 0x0 0x7fffddef6f10
		 - Match R8 0x0 0x0
		 - Mismatch R9 0x0 0x6d3
		 - Match R10 0x0 0x0
		 - Mismatch R11 0x0 0x293
		 - Mismatch R12 0x0 0x7fffffff
		 - Mismatch R13 0x0 0xf01470
		 - Mismatch R14 0x0 0x7fd5f5539560
		 - Mismatch R15 0x0 0x1
		 - Mismatch RSP 0xffff88001b800018 0xffff880024800018
		 - Match RIP 0xffffffff8115c832 0xffffffff8115c832

This is the output for the first instructions of the copy chain in a debugging session. Note that the unrop and ropemu JSON traces use different formats, and that the traces generated by GDB contain the state before the emulation. RSP does not match because the stack address of the copy chain changes at every run of the rootkit.
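
A register-by-register comparison of this kind can be sketched as follows. The sketch assumes both CPU states have already been loaded into plain dictionaries mapping register names to integers; the real JSON layouts differ between ropemu and unrop and are not reproduced here, and `compare_registers` is a hypothetical name.

```python
def compare_registers(emu_state, gdb_state):
    """Return one 'Match'/'Mismatch' report line per shared register."""
    report = []
    for reg in sorted(emu_state.keys() & gdb_state.keys()):
        status = "Match" if emu_state[reg] == gdb_state[reg] else "Mismatch"
        report.append("%s %s %s %s" % (
            status, reg.upper(), hex(emu_state[reg]), hex(gdb_state[reg])))
    return report


# Three registers taken from the output above.
emu_state = {"rax": 0xffff88001bc00000, "rcx": 0x0, "rip": 0xffffffff8115c832}
gdb_state = {"rax": 0xffff880024c00000, "rcx": 0x0, "rip": 0xffffffff8115c832}

for line in compare_registers(emu_state, gdb_state):
    print(" - " + line)
```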

An important step after unchain is loops. This script identifies and compresses unrolled ROP loops. A typical usage:

	emdel -> time python loops.py -f copyold_0.bin -b 1 -e 276178 -o /tmp/bin_copy_loop.bin
	[-- ROPMEMU framework - loops --]

	[+] Mode: x64
	[+] Filename: copyold_0.bin
	[+] rcx under analysis
		 * possible relation: rcx-rdi
	[+] rsi under analysis
	[+] rbx under analysis
	[+] rdi under analysis
	[+] rdx under analysis
	[+] rsp under analysis
	[+] rax under analysis
		 * possible relation: rax-rdx
	[+] Main loop with: rax
	[+] Detected ranges:
		0xffff88001bc00000 - 0xffff88001bc00020
		0xffff880026f02000 - 0xffff880026f5be10
	[+] Generating /tmp/bin_copy_loop.bin
	------------------------------
	push rbp
	mov rbp, rsp

	[+] Generating /tmp/loops_0-2.asm
	[+] Generating /tmp/loops_0-2
	[+] Compressing loop (0xffff880026f02000, 0xffff880026f5be10)
	[+] Inserting instructions from 1 to 54
	[+] Write from 1 to 54
	[+] Loop from instruction 54 to 276162
	------------------------------
	movabs rax, 0xffff880026f02000
	movabs rdx, 0xffffffff8100a4de
	movabs rdx, 0xffffffff8100a4de
	mov qword ptr [rax], rdx
	movabs rdi, 8
	add rdi, rax
	------------------------------
	mov qword rax, 0xffff880026f02000
	mov qword rbx, 0xffff880026f5be10
	loop_2:
	pop rdx
	add rdi, rax
	add rax, 0x08
	cmp rax, rbx
	jne loop_2

	[+] Generating /tmp/loops_54-276162.asm
	[+] Generating /tmp/loops_54-276162
	[+] Apply template...
	[+] Delta: 6
	[+] Enum + delta: 276168
	[+] Write from 276168 to 276178
	------------------------------
	leave
	ret

	[+] Generating /tmp/loops_276178-2.asm
	[+] Generating /tmp/loops_276178-2

	real    2m45.439s
	user    2m45.072s
	sys     0m0.460s

Here copyold_0.bin has been generated by unchain. In addition, the script adds a function prologue and epilogue to simplify function identification in IDA.
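
To illustrate the idea of detecting an unrolled loop, the sketch below looks for runs of memory writes whose addresses advance by a constant stride. It is an assumption about one possible heuristic, not the algorithm implemented in loops; `detect_ranges`, the stride of 8 and the minimum run length are all illustrative choices. The last lines check that the numbers in the log above are mutually consistent: the detected range, walked in 8-byte steps, gives as many iterations as the compressed instruction span (54 to 276162) divided by the reported delta of 6.

```python
def detect_ranges(addresses, stride=8, min_len=3):
    """Return (first, last) address pairs for runs advancing by `stride`."""
    ranges = []
    start = prev = None
    count = 0
    for addr in addresses:
        if prev is not None and addr - prev == stride:
            count += 1
        else:
            # The previous run ended; keep it if it was long enough.
            if start is not None and count >= min_len:
                ranges.append((start, prev))
            start, count = addr, 1
        prev = addr
    if start is not None and count >= min_len:
        ranges.append((start, prev))
    return ranges


writes = [0x1000, 0x1008, 0x1010, 0x1018, 0x5000, 0x2000, 0x2008, 0x2010]
print([(hex(a), hex(b)) for a, b in detect_ranges(writes)])

# Consistency check on the values printed by loops in the log above.
iterations = (0xffff880026f5be10 - 0xffff880026f02000) // 8
unrolled_instructions = 276162 - 54
print(iterations, unrolled_instructions // 6)
```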

The first step in recovering the control flow graph is to run blocks on the JSON traces generated by ropemu. blocks splits the traces into basic traces (splitting on the pushf instruction) and draws a raw graph. It then passes over all the traces again to find the maximum overlap among the leaves: every pair of leaves is compared from the bottom up until a mismatch is found. If the shared part forms a sink, each leaf is split into two blocks, the 'mismatch part' and the sink, and the final graph is generated. A quick example:

        emdel -> time python blocks.py 03-05-16_disp_p1_def0_hwcontext.json,04-05-16_disp_p2_def0_hwcontext.json,04-05-16_disp_p2_def1_hwcontext.json blocks_test_disp/
	...
	...
	--- VISUALIZATION ---
	[+] Generating final-image.png
	--- SERIALIZE ---
	[+] Dumping block-list.json
	--- SERIALIZE ---
	[+] Dumping metadata.json

	real    0m33.605s
	user    0m32.524s
	sys     0m1.060s
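
The two ideas behind blocks, splitting a trace on pushf and finding the overlap of two leaves from the bottom up, can be sketched as below. The helpers `split_on_pushf`, `common_suffix_len` and `split_sink` are hypothetical, traces are modelled as plain lists of instruction strings, and whether the pushf instruction closes a block (as assumed here) or opens the next one is a guess.

```python
def split_on_pushf(trace):
    """Cut a trace into blocks, ending a block after each pushf instruction."""
    blocks, current = [], []
    for instr in trace:
        current.append(instr)
        if instr.strip().startswith("pushf"):
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def common_suffix_len(a, b):
    """Walk both leaves from the bottom up until the first mismatch."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def split_sink(a, b):
    """Return (mismatch part of a, mismatch part of b, shared sink)."""
    n = common_suffix_len(a, b)
    return a[:len(a) - n], b[:len(b) - n], a[len(a) - n:]


print(split_on_pushf(["pop rax", "pushfq", "mov rdx, rcx", "ret"]))
print(split_sink(["pop rax", "leave", "ret"], ["pop rdx", "pop rcx", "leave", "ret"]))
```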

The output is quite verbose, so only the relevant part is shown here. final-image.png contains the CFG, while block-list.json lists the blocks of the CFG:

        emdel -> cat block-list.json 
	  ...
	  "e0c0bc25b3c220bf6aa6fdcff5342b52", 
	  "ee305c2aad361cc48126da35e3904d4b", 
	  "e0c0bc25b3c220bf6aa6fdcff5342b52",
              ...

metadata.json stores the metadata information, describing how the blocks are connected:

            emdel -> cat metadata.json
            ...
	"943baa7d5081e4b627001e6c4008dc60": [
	    "fd24d90948734bec4e9a447231673dad^0", 
	    "f0a291c15ff9b799ff542e603f7ec4d2^1"
	  ], 
             ...
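
Judging from the excerpt, each key is a parent block and each entry has the form `child^flag`. A small Python 3 sketch for turning that into an edge list follows; `parse_edges` is a hypothetical helper, and reading the part after `^` as the ZF value that selects the child is an assumption based on the "ZF:" fields printed by the later scripts. The flag is kept as a string because values other than 0 and 1 may appear.

```python
import json


def parse_edges(metadata):
    """Turn {parent: ['child^flag', ...]} into (parent, child, flag) tuples."""
    edges = []
    for parent, entries in metadata.items():
        for entry in entries:
            child, _, flag = entry.partition("^")
            edges.append((parent, child, flag))
    return edges


# The fragment shown above, wrapped into a complete JSON object.
text = """{
  "943baa7d5081e4b627001e6c4008dc60": [
    "fd24d90948734bec4e9a447231673dad^0",
    "f0a291c15ff9b799ff542e603f7ec4d2^1"
  ]
}"""

for parent, child, flag in parse_edges(json.loads(text)):
    print("%s -> %s (flag %s)" % (parent, child, flag))
```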

The previous phase (blocks) generates many traces, and we are interested only in the unique blocks. tcollect is the script that easily extracts these traces:

         emdel -> python tcollect.py block-list.json blocks_test_disp/ tcollect_test/
	[+] Loaded 8 labels
	tcollect_test/fd24d90948734bec4e9a447231673dad.json
	tcollect_test/ee305c2aad361cc48126da35e3904d4b.json
	tcollect_test/943baa7d5081e4b627001e6c4008dc60.json
	tcollect_test/d41d8cd98f00b204e9800998ecf8427e.json
	tcollect_test/f0a291c15ff9b799ff542e603f7ec4d2.json
	tcollect_test/4aabd758ce2c71be35bfe7b629e4485d.json
	tcollect_test/0a26886708280a78dbc0a531c54d47e5.json
	tcollect_test/e0c0bc25b3c220bf6aa6fdcff5342b52.json
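
The core of this step is deduplicating the labels in block-list.json, which may repeat (e0c0bc25… appears twice in the excerpt above). A minimal sketch, assuming the file is a flat JSON list of label strings; `unique_labels` and `output_paths` are hypothetical names. As a side observation, the label d41d8cd98f00b204e9800998ecf8427e equals the MD5 digest of empty input, which suggests, without confirming, that labels are MD5 digests of block contents.

```python
import hashlib
import os


def unique_labels(labels):
    """Remove duplicate labels while keeping first-seen order."""
    return list(dict.fromkeys(labels))


def output_paths(labels, out_dir):
    """Build one '<out_dir>/<label>.json' path per unique label."""
    return [os.path.join(out_dir, label + ".json") for label in unique_labels(labels)]


block_list = [
    "e0c0bc25b3c220bf6aa6fdcff5342b52",
    "ee305c2aad361cc48126da35e3904d4b",
    "e0c0bc25b3c220bf6aa6fdcff5342b52",
]

print("[+] Loaded %d labels" % len(unique_labels(block_list)))
for path in output_paths(block_list, "tcollect_test"):
    print(path)

# MD5 of empty input, for comparison with the d41d8cd9... label.
print(hashlib.md5(b"").hexdigest())
```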

Finally, premove removes the pushf blocks and dumps the final blocks together with the updated metadata information. A quick example:

       emdel -> time python premove.py block-list.json metadata.json tcollect_test/
	set([0, u'0a26886708280a78dbc0a531c54d47e5', u'ee305c2aad361cc48126da35e3904d4b', u'943baa7d5081e4b627001e6c4008dc60', u'e0c0bc25b3c220bf6aa6fdcff5342b52', u'4aabd758ce2c71be35bfe7b629e4485d', u'f0a291c15ff9b799ff542e603f7ec4d2', u'fd24d90948734bec4e9a447231673dad', u'd41d8cd98f00b204e9800998ecf8427e'])
	[+] Loaded 9 labels
	[+] Getting traces...
	[+] Got 8 traces
	[+] Pass pushf-block..
	...
	...
	[+] Parent: ee305c2aad361cc48126da35e3904d4b
		 - Child: fd24d90948734bec4e9a447231673dad ZF: 0
		 - Child @: tcollect_test/fd24d90948734bec4e9a447231673dad.json
		 - Hash pushf block: 23cf1d97fa676e2845d0d7de27ede7c5 - Until: 0xffff880026f02270-185219
		 - Before: 43 - After: 25 - Diff: 18
		 + Creating 2e6b283f41ed432593edc1dbe728ea2b
		 - Removing tcollect_test/fd24d90948734bec4e9a447231673dad.json
		 + Dumping tcollect_test/2e6b283f41ed432593edc1dbe728ea2b.json
		 - Child: 0a26886708280a78dbc0a531c54d47e5 ZF: 1
		 - Child @: tcollect_test/0a26886708280a78dbc0a531c54d47e5.json
		 - Hash pushf block: 23cf1d97fa676e2845d0d7de27ede7c5 - Until: 0xffff880026f02270-184283
		 - Before: 10387 - After: 10369 - Diff: 18
		 + Creating a0c846494fdf927876b9cc1990d7ce45
		 - Removing tcollect_test/0a26886708280a78dbc0a531c54d47e5.json
		 + Dumping tcollect_test/a0c846494fdf927876b9cc1990d7ce45.json
	...
	...
	--- VISUALIZATION ---
	[+] Generating premove-image.png
	--- SERIALIZE ---
	[+] Dumping premove-metadata.json

	real    0m7.852s
	user    0m7.564s
	sys     0m0.276s

The output is a new image showing the CFG with the new blocks and the updated metadata file.
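
The "Before / After / Diff" lines in the log show each child trace shrinking by the length of the pushf block (18 instructions in both cases). The sketch below illustrates that bookkeeping only. It assumes the pushf block occurs as one contiguous run inside the child trace and removes its first occurrence; where the block actually sits, and how premove locates it, is not stated on this page. `remove_block` is a hypothetical helper.

```python
def remove_block(trace, block):
    """Return a copy of `trace` without the first contiguous run equal to `block`."""
    n = len(block)
    if n == 0:
        return list(trace)
    for i in range(len(trace) - n + 1):
        if trace[i:i + n] == block:
            return trace[:i] + trace[i + n:]
    return list(trace)  # block not found: leave the trace unchanged


# Toy data: a three-instruction stand-in for the pushf block.
pushf_block = ["pushfq", "pop rax", "ret"]
child = ["pushfq", "pop rax", "ret", "mov rdx, rcx", "ret"]

trimmed = remove_block(child, pushf_block)
print("Before: %d - After: %d - Diff: %d" % (
    len(child), len(trimmed), len(child) - len(trimmed)))
```

Because the trimmed trace has different contents, it gets a new label in the log above (for example fd24d909… is replaced by 2e6b283f…).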

The last step is to connect together all the binary blobs generated by unchain and refined by loops and premove. This phase is implemented by glue. A quick example:

        emdel -> time python glue.py -d tcollect_test_bins/ -j premove-metadata.json -o /tmp/test_glue.bin
	:: Info: 
	::: Mode: x64
	::: Directory: tcollect_test_bins/
	::: Metadata: premove-metadata.json
	::: Output: /tmp/test_glue.bin

	:: Analysis:
	- Head: e0c0bc25b3c220bf6aa6fdcff5342b52
	- Leaves:  [u'd41d8cd98f00b204e9800998ecf8427e']
	- Sink: d41d8cd98f00b204e9800998ecf8427e
	- Under analysis: e0c0bc25b3c220bf6aa6fdcff5342b52
	- Current label: label_e0c0bc25b3c220bf6aa6fdcff5342b52
		 - Plugging child: label_2e6b283f41ed432593edc1dbe728ea2b
	- Under analysis: 2e6b283f41ed432593edc1dbe728ea2b
	- Current label: label_2e6b283f41ed432593edc1dbe728ea2b
		 - Plugging child: label_1123edfcdfa9d07e93da89c908e27c94
		 Child: 1123edfcdfa9d07e93da89c908e27c94 - ZF: 0
		 Child: 67ad853f7c1bab15a79e6d3b7fa6a59b - ZF: 0
	- Under analysis: 1123edfcdfa9d07e93da89c908e27c94
	- Current label: label_1123edfcdfa9d07e93da89c908e27c94
		 - Plugging child: label_6a74663b927f1c72acad8f5d62173618
		 Child: 2e6b283f41ed432593edc1dbe728ea2b - ZF: 0
		 Child: 6a74663b927f1c72acad8f5d62173618 - ZF: 1
	- Under analysis: 67ad853f7c1bab15a79e6d3b7fa6a59b
	- Current label: label_67ad853f7c1bab15a79e6d3b7fa6a59b
		 Child: d41d8cd98f00b204e9800998ecf8427e - ZF: F
	- Under analysis: 6a74663b927f1c72acad8f5d62173618
	- Current label: label_6a74663b927f1c72acad8f5d62173618
		 Child: d41d8cd98f00b204e9800998ecf8427e - ZF: F
	- Plugging sink: label_d41d8cd98f00b204e9800998ecf8427e
	::::::::::::: Generating /tmp/label_prologue.asm
	::::::::::::: Generating /tmp/label_glue.asm

	real    0m0.288s
	user    0m0.232s
	sys     0m0.048s
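
The analysis part of the log (head, leaves, order of the blocks under analysis) can be approximated with a simple graph walk over the metadata. The sketch below is an assumption, not the glue source: it treats the head as the node that is never listed as a child, the leaves as nodes without children, and visits blocks breadth-first with a visited set so that back-edges (such as a child pointing to an earlier block) do not loop forever. The short labels H, A, B, C, D, S stand in for the real hashes.

```python
from collections import deque


def child_labels(entries):
    """Strip the '^flag' suffix from each metadata entry."""
    return [entry.partition("^")[0] for entry in entries]


def heads_and_leaves(metadata):
    """Heads are never listed as a child; leaves have no children."""
    children = {c for entries in metadata.values() for c in child_labels(entries)}
    nodes = set(metadata) | children
    heads = sorted(nodes - children)
    leaves = sorted(n for n in nodes if not metadata.get(n))
    return heads, leaves


def bfs_order(metadata, head):
    """Breadth-first visiting order starting from `head`."""
    order, seen, queue = [], {head}, deque([head])
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in child_labels(metadata.get(node, [])):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order


graph = {
    "H": ["A^0"],
    "A": ["B^0", "C^0"],
    "B": ["A^0", "D^1"],   # back-edge to A
    "C": ["S^F"],
    "D": ["S^F"],
    "S": [],
}
print(heads_and_leaves(graph))
print(bfs_order(graph, "H"))
```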

The output is a binary blob, in this case called /tmp/test_glue.bin. This file is the input for dust, a bash script that generates the final ELF file:

         emdel -> bash bin_chains/dust.sh /tmp/test_glue.bin /tmp/glue.c glue_test /tmp/gluez
	:: Compiling /tmp/glue.c
	:: OEP: 0000000000400400
	:: Change .text section name
	:: Feeding .text section
	:: Removing useless sections...
	:: OEP: 0000000000400400
	:: DONE

And the output is:

        emdel -> file *
	glue_test: ELF 64-bit LSB  executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), BuildID[sha1]=33773d67b51d83f25df26e9c0b1f2e78a2fc3df2, not stripped
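
The class and byte order reported by file come from the first bytes of the ELF header: the magic `\x7fELF`, then one byte for the class (1 for 32-bit, 2 for 64-bit) and one for the data encoding (1 for LSB, 2 for MSB). As a minimal Python 3 sketch of that check (`describe_elf` is a hypothetical helper, and the sample header is hand-built rather than read from glue_test):

```python
def describe_elf(header):
    """Describe an ELF file from the first bytes of its header.

    Returns None when the bytes do not start with the ELF magic.
    """
    if len(header) < 6 or header[:4] != b"\x7fELF":
        return None
    bits = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")
    order = {1: "LSB", 2: "MSB"}.get(header[5], "unknown byte order")
    return "ELF %s %s" % (bits, order)


# Hand-built 16-byte identification block: 64-bit, little-endian, version 1.
sample_header = b"\x7fELF\x02\x01\x01" + b"\x00" * 9
print(describe_elf(sample_header))
```

To check a real file, the first 16 bytes can be read in binary mode and passed to the function.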