Fuzzying

Material on fuzzing

Papers

An empirical study of the reliability of UNIX utilities
The Art, Science, and Engineering of Fuzzing: A Survey
AFL++: Combining Incremental Steps of Fuzzing Research
A systematic review of fuzzing based on machine learning techniques

Practical

Fuzzing Like a Caveman, A series on the practical techniques to writing a fuzzer in C++
OWASP fuzzing guide

Practice

Fuzzgoat, An application to test fuzzers

Tools and fuzzers

AFL (mutation, coverage-guided)
AFL++ (more advanced than above, a fork)
LibFuzzer (advanced, llvm-based)
Radamsa (mutational, test-case generator, prototyping)
Fuzzowski (Network)

Advanced

Driller

Web fuzzing

Restler
APIFuzzer

Plan for this lecture

What is fuzzying
Build your own fuzzer
American Fuzzy Lop (AFL)

Intro: What is fuzzying?

QA Engineer walks into a bar. Orders a beer. Orders 0 beers. Orders 999999999 beers. Orders a lizard. Orders -1 beers. Orders a sfdeljknesv.

Fuzzying is (almost) like that

Fuzzying

Often used for file formats or networking protocols
Easy to set up
Almost no compute requirements for simple programs
Generating/mutating inputs for a program
- Large set of valid inputs (corpus) mixed together and corrupted
- Read a specification for a protocol, design something to make valid traffic
Goal of creating an exceptional condition
- Program crash
- Unique errors
- Hangs

The fuzz cycle

Limitations of fuzzying

Generating all valid inputs is nearly impossible
Custom protocol extensions hard to know about without reversing
Non-crashing bugs typically not found
- Arbitrary file read
- SQL injection
- Magical privilege escalation
Hard to interpret progress
- No crashes? Are there no bugs? Is your fuzzer not working?
- Loads of crashes? Maybe it’s working? Maybe you’re finding 1% of the bugs?
Most off the shelf tooling requires source

Harnessing

The tooling used to observe program behavior
- Debugger watching for crashes
- Code coverage instrumentation by a compiler
- Emulators/hypervisors to observe a whole system
What are they looking for?
- Crashes
- Code coverage
- Unexpected program states
- Error messages
- Information leakage

Examples

(off-the-shelf fuzzers)

Just invoking the program

Usually the right place to start
Write some tool that generates a mutated file
Run the program to parse the file
What if we get a crash?
- Ehh… just attach GDB or WinDbg or enable core dumps :D
Reproducibility can be a huge issue
Can be impossible to scale as the program can only have one instance
Program startup times can be long (seconds to open up Word)

AFL++

The gold standard
- Looks to be being replaced slowly by libfuzzer in popularity
Coverage guided fuzzing
Requires source code
Relies on having source in standard configuration
Can use QEMU for coverage

if x < 10:
   doSomething() # covered
   took path 1
else:
   doSomethingElse() # covered
   took path 2
for x in range(10):
   dosomething()

test 1: x = 10 test 2: x = 0

libfuzzer

Designed to be baked into your target application
Part of LLVM, easily used when building a target with clang
Coverage guided
Requires source
Extremely fast as it’s in-memory fuzzing
- Not dropping files to disk every iteration
Similar to AFL it’s corpus based
- Need to have some well formed inputs to start with

Architectural Improvements to Fuzzing

Coverage Guided fuzzying

Gather which code has been hit based on an input
Input saved when a new unique codepath is observed
Input is used as a basis for future inputs
One of the biggest improvements that can be made to a fuzzer
Can ultimately turn exponential complexity into linear complexity

Coverage Guided Fuzzing Example

Write a program to remove all occurrences of the word “the” in a sentence

Coverage Guided Fuzzing Visualized

Crash Amplification

Increase program sensitivity to malformed input
- ASAN / PageHeap / Electric Fence
Heatmaps direct fuzzying to inputs that generate more crashes
Add hooks to find logic bugs (e.g. crash on auth success)
Limitations:
- Many programs won’t start with ASAN
- Some incorrect memory access does not result in crashes

Roll your own fuzzer

Inspired from fuzzying like a caveman

Scheleton

def get_bytes(filename):
    f = open(filename, "rb").read()
    return bytearray(f)

def create_new(data):
    f = open("mutated.jpg", "wb+")
    f.write(data)
    f.close()

N = 100000

def exif(counter,data):
    command = "./exif mutated.jpg -verbose"
    out, returncode = run(command, withexitstatus=1)
    if b"ERROR" in out:
        f = open("crashes/crash.{}.jpg".format(str(counter)), "ab+")
        f.write(data)
        f.close()
    if counter % 100 == 0:
        print(counter, end="\r")
    
def fuzz(filename):
    for counter in range(N):
        data = get_bytes(filename)
        mutated = mutate(data)
        create_new(mutated)
        exif(counter,mutated)

Mutation

def bit_flip(data):
    num_of_flips = int((len(data) - 4) * .01)
    indexes = range(4, (len(data) - 4))
    chosen_indexes = random.sample(indexes, num_of_flips)
    for x in chosen_indexes:
        data[x] ^= 1 << random.randint(0,7)
    return data

Gynvael’s Magic Numbers

Gynvael Coldwind ‘Basics of fuzzing’ enumerates several ‘magic numbers’ that typically produce errors
These numbers relate to data type sizes and arithmetic-induced errors

0xFF
0x7F
0x00
0xFFFF
0x0000
0xFFFFFFFF
0x00000000
0x80000000 <- minimum 32-bit int
0x40000000 <- just half of that amount
0x7FFFFFFF <- max 32-bit int

If used as parameters to malloc() or other array operations, overflows are common
For instance 0x1 plus 0xFF on a one-byte register overflows to 0x00 and can produce unintended behavior
HEVD actually has an integer overflow bug similar to this concept.
If we choose to use 0x7FFFFFFF as the magic number then we need to change four bytes

Magic number mutation

def magic(data):

    magic_vals = [(1, 255), (1, 255), (1, 127), (1, 0), (2, 255), (2, 0), 
                  (4, 255), (4, 0), (4, 128), (4, 64), (4, 127) ]

    (picked_size, picked_magic) = random.choice(magic_vals)

    picked_index = random.randint(4, len(data)-4)
    
    for i in range(picked_size):
        data[picked_index + i] = picked_magic

    return data

Mutation - putting it together

def mutate(data):
    f = random.choice([bit_flip, magic])
    return f(data)

Input:
Target: https://github.com/mkttanabe/exif
Compile with ASAN: -fsanitize=address -ggdb

Triaging

def get_files():
    return os.listdir("crashes/")

def triage_files(files):
    for x in files:
        original_output = os.popen(f"./exif crashes/{x} -verbose 2>&1").read()
        output = original_output

        # Getting crash reason
        crash = "SEGV" if "SEGV" in output else "HBO" if "heap-buffer-overflow" in output else None

        if crash == "HBO":
            output = output.split("\n")
            counter = 0
            while counter < len(output):
                if output[counter] == "=================================================================":
                    target_line = output[counter + 1]
                    target_line2 = output[counter + 2]
                    counter += 1
                else:
                    counter += 1
            target_line = target_line.split(" ")
            address = target_line[5].replace("0x","")


            target_line2 = target_line2.split(" ")
            operation = target_line2[0]


        elif crash == "SEGV":
            output = output.split("\n")
            counter = 0
            while counter < len(output):
                if output[counter] == "=================================================================":
                    target_line = output[counter + 1]
                    target_line2 = output[counter + 2]
                    counter += 1
                else:
                    counter += 1
            if "unknown address" in target_line:
                address = "00000000"
            else:
                address = None

            if "READ" in target_line2:
                operation = "READ"
            elif "WRITE" in target_line2:
                operation = "WRITE"
            else:
                operation = None

        if crash:
            log_name = (x.replace(".jpg","") + "." + crash + "." + address + "." + operation)
            f = open(log_name,"w+")
            f.write(original_output)
            f.close()

Trying AFL

Download fuzzgoat and harness it!

Non-binary targets? we’ve got you covered!

REST APIs

Try Restler
or APIFuzzer

Disclaimer: check before you fuzz a live environment!

API fuzzing on live environments is bad practice and can get you in trouble
Often VDPs explicitely disallow it

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

5-fuzzying.org

5-fuzzying.org

Fuzzying

Material on fuzzing

Papers

Practical

Practice

Tools and fuzzers

Advanced

Web fuzzing

Plan for this lecture

Intro: What is fuzzying?

Fuzzying

The fuzz cycle

Limitations of fuzzying

Harnessing

Examples

Just invoking the program

AFL++

libfuzzer

Architectural Improvements to Fuzzing

Coverage Guided fuzzying

Coverage Guided Fuzzing Example

Coverage Guided Fuzzing Visualized

Coverage Guided Fuzzing Visualized

Crash Amplification

Roll your own fuzzer

Scheleton

Mutation

Gynvael’s Magic Numbers

Magic number mutation

Mutation - putting it together

Triaging

Trying AFL

Non-binary targets? we’ve got you covered!

REST APIs

Disclaimer: check before you fuzz a live environment!

Files

5-fuzzying.org

Latest commit

History

5-fuzzying.org

File metadata and controls

Fuzzying

Material on fuzzing

Papers

Practical

Practice

Tools and fuzzers

Advanced

Web fuzzing

Plan for this lecture

Intro: What is fuzzying?

Fuzzying

The fuzz cycle

Limitations of fuzzying

Harnessing

Examples

Just invoking the program

AFL++

libfuzzer

Architectural Improvements to Fuzzing

Coverage Guided fuzzying

Coverage Guided Fuzzing Example

Coverage Guided Fuzzing Visualized

Coverage Guided Fuzzing Visualized

Crash Amplification

Roll your own fuzzer

Scheleton

Mutation

Gynvael’s Magic Numbers

Magic number mutation

Mutation - putting it together

Triaging

Trying AFL

Non-binary targets? we’ve got you covered!

REST APIs

Disclaimer: check before you fuzz a live environment!