Understanding x86 assembly and its instruction set is essential for grasping how programs interact with hardware. This guide provides an overview of the structure of assembly programs, the x86 instruction set, and how various elements like registers, memory, and the stack operate at runtime. By focusing on the essentials, this article helps you decode disassembled programs and uncover the logic behind them.
Assembly Programs: Structure and Components
Assembly programs consist of four main components:
- Instructions: These are the actual operations executed by the CPU (e.g., `MOV EAX, 1`).
- Directives: Commands that guide the assembler but don't translate directly into machine code (e.g., `.text`, `.data`).
- Labels: Symbolic names that reference specific locations in the program (e.g., `label1:`).
- Comments: Human-readable annotations ignored by the assembler (e.g., `; This is a comment` in NASM and MASM; the GNU assembler uses `#` on x86).
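To make the four categories concrete, here is a minimal Python sketch that sorts source lines into them. The function name `classify_line` is hypothetical, and the rules are deliberately simplified: it assumes `;` starts a comment and that a label sits alone on its line, which is not true of every assembler or every program.

```python
def classify_line(line: str) -> str:
    """Classify one line of assembly source (simplified; ';' starts a comment)."""
    text = line.strip()
    if not text:
        return "blank"
    if text.startswith(";"):
        return "comment"
    # Drop a trailing comment before looking at the statement itself.
    statement = text.split(";", 1)[0].strip()
    if statement.endswith(":"):
        return "label"        # e.g. "label1:" or ".LC0:"
    if statement.startswith("."):
        return "directive"    # e.g. ".text" or ".data"
    return "instruction"      # everything else, e.g. "MOV EAX, 1"


for source_line in ["MOV EAX, 1", ".text", "label1:", "; This is a comment"]:
    print(f"{source_line!r:25} -> {classify_line(source_line)}")
```

Labels are tested before directives so that a local label such as `.LC0:` is not mistaken for a directive.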
Key Sections in an Assembly Program
- Data Section (`.data` or `.rodata`): Stores constants and initialized data, such as string literals; `.rodata` holds read-only data.
  Example (GNU assembler directives, where `#` starts a comment on x86):
  ```asm
  .section .rodata
  .LC0:
      .string "Hello, World!"
  ```
- Text Section (`.text`): Contains the executable code of the program.
  Example:
  ```asm
  .text
  .global main
  main:
      MOV EDI, OFFSET FLAT:.LC0   # Load address of the string
      CALL puts                   # Print the string
  ```
x86 Instruction Set: Basics
In Intel syntax, most two-operand x86 instructions follow the format:
Mnemonic Destination, Source
- Mnemonic: Represents the operation (e.g., `MOV`, `ADD`, `CMP`).
- Operands: Specify the data the instruction operates on (e.g., registers, memory addresses, or constants).
Instruction Types
- Data Movement: Transfers data between registers, memory, or constants.
  - `MOV EAX, 10` – Moves 10 into the `EAX` register.
  - `LEA EAX, [EBX+4]` – Computes the effective address `EBX+4` and stores it in `EAX` without reading memory.
- Arithmetic: Performs mathematical operations.
  - `ADD EAX, 5` – Adds 5 to the value in `EAX`.
  - `SUB EBX, EAX` – Subtracts `EAX` from `EBX` and stores the result in `EBX`.
- Logical: Executes bitwise operations.
  - `AND EAX, EBX` – Performs a bitwise AND between `EAX` and `EBX`, storing the result in `EAX`.
  - `OR EAX, 0x1` – Sets the least significant bit in `EAX`.
- Control Flow: Directs the program execution path.
  - `JMP label` – Unconditionally jumps to a label.
  - `JE label` – Jumps to a label if the zero flag is set.
- Stack Operations: Manages data on the stack.
  - `PUSH EAX` – Places the value of `EAX` onto the stack.
  - `POP EAX` – Removes the top value from the stack and stores it in `EAX`.
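The data movement, arithmetic, and logical instructions above can be traced with a small Python model. This is only an illustrative sketch: the `run` function and its tuple-based program format are hypothetical, flags and memory are ignored, and registers are simply wrapped to 32 bits.

```python
MASK32 = 0xFFFFFFFF  # registers such as EAX are 32 bits wide


def run(program, regs=None):
    """Execute a list of (mnemonic, dest, src) tuples on a toy register file."""
    regs = dict(regs or {})

    def value(operand):
        # An operand is either a register name (str) or an immediate (int).
        return regs.get(operand, 0) if isinstance(operand, str) else operand

    for mnemonic, dest, src in program:
        if mnemonic == "MOV":
            result = value(src)
        elif mnemonic == "ADD":
            result = value(dest) + value(src)
        elif mnemonic == "SUB":
            result = value(dest) - value(src)
        elif mnemonic == "AND":
            result = value(dest) & value(src)
        elif mnemonic == "OR":
            result = value(dest) | value(src)
        else:
            raise ValueError(f"unsupported mnemonic: {mnemonic}")
        regs[dest] = result & MASK32  # wrap around like a 32-bit register
    return regs


final = run([
    ("MOV", "EAX", 10),     # EAX = 10
    ("ADD", "EAX", 5),      # EAX = 15
    ("MOV", "EBX", 20),     # EBX = 20
    ("SUB", "EBX", "EAX"),  # EBX = 20 - 15 = 5
    ("OR", "EAX", 0x1),     # EAX = 15 | 1 = 15 (bit 0 was already set)
])
print(final)
```

In every case the destination operand is both an input and the place where the result lands, which is why `SUB EBX, EAX` changes `EBX` and leaves `EAX` alone.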
Registers in x86 Architecture
- General-Purpose Registers:
  - `EAX`, `EBX`, `ECX`, `EDX` – Used for arithmetic, data storage, and loop control.
  - `ESI`, `EDI` – Source and destination indexes, often for string operations.
  - `ESP`, `EBP` – Stack pointer and base pointer, used for stack management.
- Special-Purpose Registers:
  - Instruction Pointer (`EIP`): Points to the next instruction to execute.
  - Flags Register (`EFLAGS`): Tracks the results of operations (e.g., zero flag, sign flag).
- Segment Registers:
  - `CS`, `DS`, `SS`, `ES`, `FS`, `GS` – Define memory segments for code, data, and stack.
- Control and Debug Registers:
  - `CR0`-`CR4` – Control CPU operation modes.
  - `DR0`-`DR7` – Provide hardware support for breakpoints and debugging.
Memory Operands in x86
x86 allows memory operands using the formula:
Base + (Index * Scale) + Displacement
Example:
```asm
MOV EAX, [EBX+4*ECX+8]
```
- Base: `EBX`
- Index: `ECX` (multiplied by a scale factor of 4; the scale must be 1, 2, 4, or 8)
- Displacement: `8`
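The address arithmetic can be checked by hand with a short Python sketch. The function name `effective_address`, the register values, and the dictionary standing in for memory are all assumptions chosen for illustration.

```python
def effective_address(base=0, index=0, scale=1, displacement=0):
    """Compute Base + (Index * Scale) + Displacement, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


# Assumed register contents for MOV EAX, [EBX+4*ECX+8]
ebx, ecx = 0x1000, 3
address = effective_address(base=ebx, index=ecx, scale=4, displacement=8)
print(hex(address))  # 0x1000 + 3*4 + 8 = 0x1014

# A dict stands in for memory; MOV loads the value stored at that address.
memory = {0x1014: 42}
eax = memory[address]
print(eax)
```

This pattern maps naturally onto array access: with 4-byte elements, the base points at a structure, the displacement selects the array inside it, and the scaled index selects the element.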
The Stack in x86
The stack is a Last-In-First-Out (LIFO) data structure used for function calls, local variables, and return addresses.
- PUSH: Adds data to the top of the stack.
- POP: Removes data from the top of the stack.
Stack Example
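```asm
PUSH 5    ; Pushes 5 onto the stack
PUSH 10   ; Pushes 10 onto the stack
POP EAX   ; Removes 10 and stores it in EAX
```
The x86 stack grows downward: each `PUSH` decrements `ESP` before storing the value, and each `POP` increments it afterward. After this sequence, `EAX` holds the value 10 and 5 remains on top of the stack.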
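To see how `ESP` moves during that sequence, here is a toy Python model of a 32-bit stack. The class name `Stack32` and the starting address `0x1000` are arbitrary assumptions; real code also shares the stack with return addresses and saved registers.

```python
class Stack32:
    """Toy model of a 32-bit x86 stack; memory is a dict keyed by address."""

    def __init__(self, esp=0x1000):
        self.esp = esp      # stack pointer: address of the current top
        self.memory = {}

    def push(self, value):
        self.esp -= 4       # the stack grows toward lower addresses
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory[self.esp]
        self.esp += 4       # the old slot is abandoned, not erased
        return value


stack = Stack32(esp=0x1000)
stack.push(5)               # ESP = 0x0FFC
stack.push(10)              # ESP = 0x0FF8
eax = stack.pop()           # EAX = 10, ESP back to 0x0FFC
print(eax, hex(stack.esp))
```

Note that `pop` leaves the old value in memory below `ESP`; it is simply treated as free space from then on.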
AT&T vs. Intel Syntax
x86 assembly is written in two common syntaxes:
- Intel Syntax (used in this guide):
  - Destination operand appears first (`MOV EAX, 10`).
  - Operand size is inferred from the register operands or stated with a size specifier (e.g., `MOV BYTE PTR [EBX], 1` stores a single byte).
- AT&T Syntax:
  - Source operand appears first (`MOVL $10, %EAX`); immediates take a `$` prefix and registers a `%` prefix.
  - Uses mnemonic suffixes to indicate operand size (`B` for byte, `W` for word, `L` for long).
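The differences can be illustrated with a deliberately tiny Python converter. The function `intel_to_att` is hypothetical and handles only one shape of instruction: a mnemonic, a 32-bit register destination, and a register or immediate source. Memory operands, other operand sizes, and one-operand instructions are out of scope.

```python
REGISTERS_32 = {"EAX", "EBX", "ECX", "EDX", "ESI", "EDI", "ESP", "EBP"}


def intel_to_att(instruction: str) -> str:
    """Convert e.g. 'MOV EAX, 10' to 'MOVL $10, %EAX' (toy converter)."""
    mnemonic, operands = instruction.split(None, 1)
    dest, src = [part.strip() for part in operands.split(",")]

    def convert(operand):
        if operand.upper() in REGISTERS_32:
            return "%" + operand   # registers get a % prefix
        int(operand, 0)            # raises ValueError if not an immediate
        return "$" + operand       # immediates get a $ prefix

    if dest.upper() not in REGISTERS_32:
        raise ValueError("this sketch only supports a 32-bit register destination")
    # The L suffix marks a 32-bit operation; the operand order is swapped.
    return f"{mnemonic}L {convert(src)}, {convert(dest)}"


print(intel_to_att("MOV EAX, 10"))   # MOVL $10, %EAX
print(intel_to_att("SUB EBX, EAX"))  # SUBL %EAX, %EBX
```

For real code, disassemblers and assemblers such as `objdump` and the GNU assembler can emit or accept either syntax, so a hand-written converter like this is only useful for learning to read both.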
Conditional Branching and Status Flags
The `CMP` instruction subtracts its second operand from the first, discards the result, and updates flags in the `EFLAGS` register. Conditional jumps then test those flags:
- Zero Flag (`ZF`): Set if the result is zero.
- Sign Flag (`SF`): Set if the result is negative (its most significant bit is 1).
- Overflow Flag (`OF`): Set if an operation causes a signed overflow.
Example:
```asm
CMP EAX, EBX   ; Compare EAX with EBX
JE label       ; Jump to label if EAX == EBX
```
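As a rough illustration of how these three flags come out of a comparison, the following Python sketch models `CMP` as a 32-bit subtraction whose result is only used to set flags. The function name `cmp_flags` and the dictionary it returns are assumptions for illustration; the real instruction updates other flags as well, which are omitted here.

```python
def cmp_flags(dest, src, bits=32):
    """Model CMP dest, src: compute dest - src and return ZF, SF, and OF."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    dest &= mask
    src &= mask
    result = (dest - src) & mask
    return {
        "ZF": result == 0,               # operands were equal
        "SF": bool(result & sign_bit),   # top bit of the result is set
        # Signed overflow: the operands had different signs and the
        # result's sign differs from the sign of dest.
        "OF": bool((dest ^ src) & (dest ^ result) & sign_bit),
    }


flags = cmp_flags(5, 5)
print(flags)                    # ZF is set, so JE would be taken
print(cmp_flags(3, 5))          # result is negative, so SF is set
print(cmp_flags(0x80000000, 1)) # most negative value minus 1 overflows, so OF is set
```

`JE` looks only at `ZF`, so it is taken in the first case and not in the other two.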
Conclusion
Understanding x86 assembly and its instruction set is invaluable for reverse engineering, malware analysis, and low-level programming. By breaking down programs into their components, mastering registers, and analyzing memory operations, you can decode the behavior of even complex binaries.