Registers


  • Registers are small storage locations built into the CPU. Aside from the General Purpose Registers (GPRs), most registers are dedicated to specific purposes
  • The six 16-bit selector (segment) registers of the x86 architecture are CS, DS, ES, FS, GS, and SS
    • A selector register identifies a specific block (segment) of memory that can be read from or written to. These registers were introduced with the Intel 8086's memory segmentation model to give access to the full 1 MB of physical memory even though the machine word was only 16 bits wide (addressing 1 MB directly would require a 20-bit word). Modern operating systems use the flat memory model instead, since modern processors no longer have this mismatch between machine word size and addressable memory
    • Although memory segmentation is largely irrelevant today, the FS and GS selector registers are still used in OS-specific ways, for example to locate per-thread data
  • Flags register: EFLAGS. It holds the arithmetic (status) flags along with control and system flags
    • Arithmetic flags are used by Jcc (conditional jump) instructions to decide whether to branch or not
    • The other flags are more specific. For example, the Direction Flag (DF) is only used by string instructions (STOS, SCAS, LODS, MOVS, CMPS) to decide whether to process data from low to high addresses (DF = 0) or from high to low addresses (DF = 1)
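
The 8086 segmentation scheme mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the classic real-mode rule (16-bit segment shifted left by 4 bits, plus a 16-bit offset, wrapped to 20 address bits); the function name `real_mode_physical_address` is made up for this example.

```python
def real_mode_physical_address(segment: int, offset: int) -> int:
    """Model 8086 real-mode translation of a segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shifting the segment left by 4 bits yields a 20-bit base address.
    # The 8086 has only 20 address lines, so the sum wraps at 1 MB.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(real_mode_physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(real_mode_physical_address(0xB800, 0x0000)))  # 0xb8000
```

Two 16-bit values are enough to reach any of the 2^20 byte addresses, which is why the scheme did not need a 20-bit machine word.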

Assembly to Machine Code Is Not One-To-One


  • An opcode can have multiple mnemonics associated with it, and a mnemonic can have multiple opcodes associated with it
  • Example 1: 0x75 is the opcode for both JNZ and JNE
  • Example 2: the byte sequences 0xb142 and 0xc6c142 both correspond to the instruction MOV CL, 66 (66 decimal is 0x42)
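
Example 2 can be checked with a tiny decoder. The sketch below is not a general disassembler: it only covers the two encodings involved (`B0+r ib` and register-direct `C6 /0 ib`), and the helper name `decode_mov_r8_imm8` is hypothetical.

```python
# 8-bit register names in x86 encoding order (register number 0..7).
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]


def decode_mov_r8_imm8(code: bytes) -> str:
    """Decode the two MOV r8, imm8 encodings used in Example 2."""
    # Short form, B0+r ib: the register number is folded into the opcode byte.
    if len(code) == 2 and 0xB0 <= code[0] <= 0xB7:
        return f"MOV {REG8[code[0] - 0xB0]}, {code[1]}"
    # Long form, C6 /0 ib: a ModRM byte selects the destination.
    if len(code) == 3 and code[0] == 0xC6:
        modrm = code[1]
        mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
        # Only register-direct operands (mod == 0b11) are handled here.
        if mod == 0b11 and reg == 0:
            return f"MOV {REG8[rm]}, {code[2]}"
    raise ValueError("encoding not covered by this sketch")


print(decode_mov_r8_imm8(bytes.fromhex("b142")))    # MOV CL, 66
print(decode_mov_r8_imm8(bytes.fromhex("c6c142")))  # MOV CL, 66
```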

Loss Of Type Information


  • There is no way to tell the datatype of bytes stored in memory just by looking at where they are stored. Instead, the datatype is implied by the operations performed on them
  • Type Inference From Jcc Instructions: Type inference is generally a non-trivial task, but when the data determines the outcome of a Jcc instruction (branch or not), we can easily tell whether the data is signed or unsigned, since certain Jcc instructions are sign-specific (for example, JL/JG are signed while JB/JA are unsigned)
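
To see why the choice of Jcc reveals signedness, here is a rough model of `CMP` followed by `JB` (unsigned below, taken when CF = 1) and `JL` (signed less, taken when SF != OF). The function names are made up for this sketch, and only the flags needed for these two jumps are computed.

```python
def cmp_flags(a: int, b: int, bits: int = 32) -> dict:
    """Compute the flags CMP a, b would set (CMP is a subtraction that discards the result)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": result == 0,
        "CF": a < b,                                 # unsigned borrow
        "SF": bool(result & sign),                   # sign bit of the result
        "OF": bool((a ^ b) & (a ^ result) & sign),   # signed overflow
    }


def jb_taken(flags: dict) -> bool:
    return flags["CF"]                 # unsigned comparison


def jl_taken(flags: dict) -> bool:
    return flags["SF"] != flags["OF"]  # signed comparison


flags = cmp_flags(0xFFFFFFFF, 1)
print(jb_taken(flags))  # False: as unsigned, 4294967295 is not below 1
print(jl_taken(flags))  # True: as signed, -1 is less than 1
```

The same bit pattern gives opposite branch decisions, so a compiler's choice of `JB` or `JL` after a comparison tells us how the source treated the value.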

Floating Point Arithmetic


  • Floating point operations are performed using the FPU Register Stack, or the "x87 stack". It consists of 8 registers, ST0 to ST7, where ST0 is the top of the stack. A typical FPU operation pops its operand(s) off the stack, operates on them, and pushes the result back onto the stack
  • The FLD instruction loads a value onto the FPU Register Stack
  • The FST instruction stores the value in ST0 to memory (FSTP does the same and then pops the stack)
  • The FPU Register Stack can be accessed only by FPU instructions
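
The stack discipline can be pictured with a toy model. This sketch assumes Python floats in place of the 80-bit x87 format and ignores the status, control, and tag words; the class `X87Stack` and its method names are illustrative only.

```python
import struct


class X87Stack:
    """Toy model of the x87 register stack; regs[0] plays the role of ST0."""

    def __init__(self):
        self.regs = []

    def fld(self, value: float) -> None:
        # FLD pushes a value; the previous ST0 becomes ST1, and so on.
        if len(self.regs) == 8:
            raise OverflowError("x87 stack overflow: all 8 registers in use")
        self.regs.insert(0, float(value))

    def faddp(self) -> None:
        # FADDP adds ST0 into ST1 and pops, leaving the sum in the new ST0.
        st0 = self.regs.pop(0)
        self.regs[0] += st0

    def fst_m32(self) -> bytes:
        # FST stores ST0 to memory (here as a 4-byte little-endian float) without popping.
        return struct.pack("<f", self.regs[0])


fpu = X87Stack()
fpu.fld(1.5)
fpu.fld(2.25)
fpu.faddp()
print(fpu.regs)                                # [3.75]
print(struct.unpack("<f", fpu.fst_m32())[0])   # 3.75
```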

Variable-Length Instruction


  • Even though most instruction opcodes are only 1 byte long, the total size for those instructions can still range from 1 to 15 bytes. To understand how x86 instructions are encoded, check out this article

[Image: one-byte x86 instructions]


Commonly Used But Hard To Remember x86 Instructions With Side Effects


  • Side effects: effects on memory and registers caused by executing an instruction that are not explicitly stated in the instruction itself
    • Binary Ninja does a good job of revealing implicit register usage

[Image: registers implicitly used by the movsd and stosd instructions are revealed in the assembly view]

[Image: although the implicit register use of the imul, idiv, and cld instructions is not shown in the assembly view, it is shown in the LLIL view]

  • IMUL reg/mem: the operand is multiplied with AL, AX, or EAX (depending on its size) and the result is stored in AX, DX:AX, or EDX:EAX
  • IDIV reg/mem: takes one parameter (the divisor). Depending on the divisor's size, IDIV uses AX, DX:AX, or EDX:EAX as the dividend, and the resulting quotient/remainder pair is stored in AL/AH, AX/DX, or EAX/EDX
  • STOS(B/W/D): writes the value in AL/AX/EAX to the memory address held in EDI, then advances EDI. Commonly used to initialize a buffer to a constant value
  • SCAS(B/W/D): compares AL/AX/EAX with the data at the memory address held in EDI, then advances EDI
  • LODS(B/W/D): reads a 1, 2, or 4 byte value from the memory address held in ESI and stores it in AL, AX, or EAX, then advances ESI
  • MOVS(B/W/D): moves data with 1, 2, or 4 byte granularity between two memory addresses. It implicitly uses EDI/ESI as the destination/source address
  • CLD/STD: clears/sets the direction flag (DF). If DF is 0, addresses are incremented; if DF is 1, addresses are decremented. DF is used by STOS(B/W/D), SCAS(B/W/D), LODS(B/W/D), and MOVS(B/W/D)
  • REP: prefix that repeats a string instruction up to ECX times, decrementing ECX on each iteration
  • PUSHAD/POPAD: pushes/pops all 8 general-purpose registers (EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI)
  • PUSHFD/POPFD: pushes/pops the EFLAGS register
  • MOVSX/MOVZX: both work like a MOV from a smaller source to a larger destination register, except that MOVSX sign-extends the value while MOVZX zero-extends it
  • CMOVcc: if the condition code (cc) holds for the current EFLAGS state, the MOV is performed. Otherwise, it behaves like a NOP
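
A few of the implicit effects listed above can be made concrete with a small emulation. This is a sketch under simplifying assumptions: memory is a flat `bytearray`, segment registers are ignored, and all helper names (`movzx8`, `movsx8`, `rep_stosd`, `idiv32`) are invented for illustration.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value: int, bits: int) -> int:
    """Reinterpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def movzx8(value: int) -> int:
    # MOVZX r32, r/m8: the upper 24 bits are filled with zeros.
    return value & 0xFF


def movsx8(value: int) -> int:
    # MOVSX r32, r/m8: the upper 24 bits are copies of bit 7.
    return to_signed(value, 8) & MASK32


def rep_stosd(memory: bytearray, eax: int, edi: int, ecx: int, df: int = 0):
    """REP STOSD: store EAX at [EDI] ECX times; returns the final (EDI, ECX)."""
    step = -4 if df else 4  # DF = 1 walks toward lower addresses
    while ecx != 0:
        if edi < 0 or edi + 4 > len(memory):
            raise IndexError("EDI points outside the modeled memory")
        memory[edi:edi + 4] = (eax & MASK32).to_bytes(4, "little")
        edi += step
        ecx -= 1
    return edi & MASK32, ecx


def idiv32(edx: int, eax: int, divisor: int):
    """IDIV r/m32: signed EDX:EAX / divisor; returns (EAX=quotient, EDX=remainder)."""
    dividend = to_signed((edx << 32) | (eax & MASK32), 64)
    d = to_signed(divisor, 32)
    if d == 0:
        raise ZeroDivisionError("divide error (#DE)")
    # x86 truncates toward zero, unlike Python's floor division.
    quotient = abs(dividend) // abs(d)
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d
    if not -(1 << 31) <= quotient <= (1 << 31) - 1:
        raise OverflowError("quotient does not fit in EAX (#DE)")
    return quotient & MASK32, remainder & MASK32


print(hex(movzx8(0x80)), hex(movsx8(0x80)))  # 0x80 0xffffff80

buf = bytearray(16)
print(rep_stosd(buf, 0x41414141, edi=4, ecx=2))  # (12, 0)
print(bytes(buf))  # b'\x00\x00\x00\x00AAAAAAAA\x00\x00\x00\x00'

# -7 / 2 with the dividend sign-extended into EDX:EAX
print([hex(r) for r in idiv32(0xFFFFFFFF, 0xFFFFFFF9, 2)])  # ['0xfffffffd', '0xffffffff']
```

Note that none of EAX, EDI, ECX, or EDX appears as an operand of `rep stosd` or `idiv ecx` in a listing, yet all of them are read or modified.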
