Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to get 100% stability for a target in under 5 minutes (TM) #677

Closed
vanhauser-thc opened this issue Aug 14, 2020 · 3 comments
Closed

Comments

@vanhauser-thc
Copy link
Collaborator

vanhauser-thc commented Aug 14, 2020

  1. select a fuzzbench target export TARGET=...

  2. checkout fuzzbench, make two edits and make the target:

git clone --depth=1 https://github.com/google/fuzzbench
cd fuzzbench

vi fuzzers/aflplusplus_optimal/fuzzer.py

the build function should look like this afterwards:

def build():
    aflplusplus_fuzzer.build("lto", "dynamic))
    for copy_file in glob.glob("/afl/libc*"):
        shutil.copy(copy_file, os.environ['OUT'])
    return

then:

vi benchmarks/${TARGET}/build.sh

before the compilation of the fuzzer add this line:

export AFL_LLVM_DOCUMENT_IDS=$OUT/ids.txt

Note: some targets compile the fuzzer in the overall make. in this case after the make rm the fuzzer, set the env and again put a make command
Then make the target:

make test-run-aflplusplus_optimal-${TARGET}
  1. in the docker container: grab the ids.txt and Identify the ids
docker run -ti -v /tmp:/tmp --entrypoint /bin/bash gcr.io/fuzzbench/builders/aflplusplus_optimal/${TARGET}
cp ids.txt /tmp/
# might need to unzip seeds into the seeds/ directory
AFL_SKIP_CPUFREQ=1 AFL_DEBUG=1 ./afl-fuzz -i seeds -o /tmp/out -m none -d -- ./fuzz-target
# let it run for a few minutes then Control-C
exit
  1. Identify the unstable function and create the denylist.txt for the compiler
cd /tmp
for i in `grep var_bytes out/fuzzer_stats | sed 's/^.*://'`; do
  egrep "edgeID=$i\$" ids.txt
done | awk '{print$2}' | sed 's/Function=/fun: /' | sort -u > denylist.txt
  1. Add the denylist for afl++ or sancov based fuzzers
    For non-PCGUARD afl++:
    AFL_LLVM_DENYLIST=/path/to/denylist.txt
    For PCGUARD and any sancov compilers (needs llvm 12):
    CFLAGS=-fsanitize-coverage-blocklist=/path/to/denylist.txt

for ${TARGET} == freetype2-2017 this is:

cat denylist.txt 
fun: cff_subfont_load
@jonathanmetzman
Copy link
Contributor

Thank you for this excellent guide.
I actually think it might be good to know people's thoughts about how bad instability actually is.

Two years ago, an intern on our team added a feature in libFuzzer to automatically ignore unstable edges. But we couldn't find any noticeable improvement that resulted from this on ClusterFuzz (probably because looking for real improvements on CF is super hard) so we got rid of this feature. Since then, I don't get so worried about instability, and I never really had any cases where I thought it really burnt me.
I'm not sure if AFL(++) already automatically ignores unstable edges, I believe it does not, but wouldn't that provide the benefits of this guide for free?
I know syzkaller implements this feature.

Is instability that terrible? Quite a few targets in OSS-Fuzz/ClusterFuzz land are unstable and they seem to find reproducible bugs anyway. I sometimes wonder if this is an issue we take too seriously because otherwise AFL will display evil red text at you.
I might be biased on this issue because ClusterFuzz only reports non-deterministically reproducible bugs that happen very frequently, so the time I spend triaging unreproducible bugs is probably lower than normal and if a lot of bugs would be reproducible but aren't due to instability, I probably wouldn't notice.

When implementing this feature, I did some investigation of unstable targets. One thing interesting that I noticed is a lot of instability is caused by initialization. For example if my target calls this code:

int f(char* buffer) {
static foo_t* foo = new Foo();
}

The first time f is called it's going to initialize foo but on subsequent runs that constructor won't be called. I think this sort of instability is less likely to lead to unreproducible crashes though it might bias the fuzzer in favor of the input that first calls f.

An easy "fix" for that problem is to set persistent executions to 1, this might be worth trying but I haven't done so because I am worried about the performance impact.

With respect to the aflplusplus_same* experiments, if there are significant differences between the same* fuzzers on any benchmarks, even the unstable ones, I think we need to fix something in FuzzBench (though maybe the fix will involve changing the benchmark). Off-topic for this conversation, but I think if there are significant differences between the exact same fuzzer even without the same random seed, we need to fix something in FuzzBench.

@vanhauser-thc
Copy link
Collaborator Author

vanhauser-thc commented Aug 14, 2020

I am starting with this:

I'm not sure if AFL(++) already automatically ignores unstable edges, I believe it does not, but wouldn't that provide the benefits of this guide for free?

it doesnt and we would not.

I am also not advocating 100% stability as a must for realworld fuzzing. you want coverage in the target.
if you ignore blocks then you loose a bit of coverage information.

for real world fuzzing a 100% stable target that covers all edges is the best. a 90% stable target that covers all edges is however better than a 100% stable target that ignores 10% of the edges.

with instability you basically have a partial coverage loss on an edge, with ignore you have a full loss on that edge.
I think this explains best what my experience with that is.

Why do I still want 100% (or a high number) stability?

For my tests where it is about speed comparisons for example (so no functional change) I need a 100% deterministic target otherwise I have too much entropy in there to be measuring something useful. the system the fuzzer runs on is noisy already.

And there are functions that are unstable, but also provide value to coverage. init functions for with data that comes from input for example.
If however it is a function does has nothing to do with the input data, e.g. checking jitter, or is a hash function, or reads some system state etc. - then I definetly will want to blacklist this function, because it will generate new queue entries that are dummies and that (slightly) negatively affects the fuzzing.

and to be able to make this decision - that is where the denylist.txt comes in. you look at the text file, leave in what is uninteresting and remove what should be part of the coverage.

EDIT: some phrases were nonsense and stated the opposite of what I meant :)

@jonathanmetzman
Copy link
Contributor

it doesnt and we would not.

I am also not advocating 100% stability as a must for realworld fuzzing. you want coverage in the target.
if you ignore blocks then you loose a bit of coverage information.

for real world fuzzing a 100% stable target that covers all edges is the best. a 90% stable target that covers all edges is however better than a 100% stable target that ignores 10% of the edges.

ACK. I think we are in agreement.

For my tests where it is about speed comparisons for example (so no functional change)

Not directly relevant to this stability conversation but you might want to wait for #648 it will probably land in september since I will be busy at the end of august but it hopefully will allow reporting of speed and CPU time.

and to be able to make this decision - that is where the denylist.txt comes in. you look at the text file, leave in what is uninteresting and remove what should be part of the coverage.

Makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants