Understanding Format String Vulnerabilities in C

Introduction

A format string vulnerability is a security flaw that occurs when user-controlled input is passed as the format string to functions such as printf, sprintf, or fprintf. It can lead to information disclosure, arbitrary memory writes, and even remote code execution.

Often referred to as “buffer overflow’s nasty little brother,” format string vulnerabilities are particularly dangerous because they allow attackers to:
Read memory contents, including addresses (information disclosure)
Modify arbitrary memory locations (code execution)
Bypass security mechanisms

This article explains how format string vulnerabilities work, how attackers exploit them, and how to prevent them.

How Format String Vulnerabilities Work

In C, functions like printf use format specifiers to determine how data should be interpreted and displayed.

Example: Safe Use of printf

#include <stdio.h>

int main(void) {
    int value = 42;
    printf("The value is: %d\n", value);
    return 0;
}

Expected Output:

The value is: 42

Here, "The value is: %d\n" is the format string, and %d tells printf to expect one int argument (value) and print it in decimal.

Vulnerable Code: Missing Format Specifier

Consider the following vulnerable program:

#include <stdio.h>

void vulnerable_function(const char *user_input) {
    printf(user_input); // VULNERABLE: user input is used as the format string
}

int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("Usage: %s <input>\n", argv[0]);
        return 1;
    }
    vulnerable_function(argv[1]);
    return 0;
}

Why is this dangerous?

The program passes user input directly to printf as the format string.
If the input contains format specifiers (%x, %s, %n), printf interprets them as conversion directives instead of printing them literally, and it reads arguments that were never passed.

Example Attack: Information Leak

An attacker can input:

./vulnerable_program "%x %x %x %x"

Possible Output:

bffffdf0 080483f4 00000001 b7fdcaf0

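Each %x prints a value taken from wherever printf expects its next argument, which on 32-bit x86 (as in this output) means successive words from the stack. The leaked values and addresses can help an attacker bypass protections such as ASLR.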

Exploiting Format String Vulnerabilities

1. Reading Arbitrary Memory (Information Disclosure)

Attackers can use %x to print raw stack values, or %s to print the string at an address found on the stack, and so leak data such as:

Stack canaries (used for buffer overflow protection)
Pointers to critical system functions (used to bypass ASLR)

Example: Reading Memory from the Stack

./vulnerable_program "%08x %08x %08x"

Possible Output:

deadbeef 080484b0 ffffffff

The %08x form pads every value to eight hexadecimal digits, which makes the output easy to parse and lets an attacker extract sensitive memory addresses.

2. Writing to Arbitrary Memory (%n Attack)

The %n format specifier prints nothing. Instead, it stores the number of characters written so far by the current printf call into the int that its corresponding pointer argument refers to.

Example: Overwriting a Variable

#include <stdio.h>

int main(void) {
    int secret = 0;

    // "Before: 0\n" is 10 characters; %n then stores that count in 'secret'.
    // The count restarts at zero for every printf call.
    printf("Before: %d\n%n", secret, &secret);
    printf("After: %d\n", secret);
    return 0;
}

Output:

Before: 0
After: 10

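When the attacker controls the format string and can also get a target address into the position where printf looks for the pointer argument, the same mechanism can overwrite return addresses, function pointers, or security-critical variables.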

Exploiting Format String Vulnerabilities for Code Execution

To gain full control over a system, attackers often use a combination of techniques, such as:

Leaking memory addresses to defeat ASLR
Overwriting function pointers (e.g., GOT entries)
Redirecting execution to shellcode

Example Attack: Overwriting Global Offset Table (GOT) Entry

The GOT stores the resolved addresses of dynamically linked functions. If it is writable, modifying an entry lets an attacker redirect every later call to that function.

Find the address of printf in the GOT:

objdump -R vulnerable_program | grep printf

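Use %n to overwrite the GOT entry with the address of attacker-controlled code (for example, shellcode), so that the next call to printf jumps there. This only works while the GOT is writable; binaries linked with full RELRO (-Wl,-z,relro,-z,now) map it read-only.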

Defending Against Format String Attacks

1. Always Use Explicit Format Strings

Good Practice:

printf("%s", user_input); // Explicitly specify format

Bad Practice:

printf(user_input); // Vulnerable to format string attacks

2. Disable Executable Stack (NX Bit & DEP)

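A non-executable stack prevents shellcode injected onto the stack from running, although it does not stop the leak or the write itself. Most modern toolchains mark the stack non-executable by default; to request it explicitly: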

gcc -z noexecstack -o secure_program program.c

3. Enable Address Space Layout Randomization (ASLR)

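ASLR randomizes where the stack, heap, and libraries are loaded, which makes memory addresses difficult to predict. On Linux, full randomization is enabled by running the following as root: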

echo 2 > /proc/sys/kernel/randomize_va_space

4. Stack Canaries & Fortify Source

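Compile with stack protection and fortified library functions. _FORTIFY_SOURCE only takes effect when optimization is enabled, so include -O1 or higher:

gcc -O2 -fstack-protector-all -D_FORTIFY_SOURCE=2 -o secure_program program.c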

Conclusion

Format string vulnerabilities can leak sensitive memory information, modify program behavior, and even lead to full system compromise. Developers should always use explicit format specifiers, enable compiler security features, and follow secure coding practices to mitigate these risks.