Archive coverage data alongside corpus archives #2020

Closed
wants to merge 11 commits

Conversation

@addisoncrump (Contributor)

Currently, only corpora are saved in the archive and the summaries of coverage are provided at the end of the experiment. This change simply incorporates the saving of the coverage data snapshots next to the trial corpus snapshots.
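
For illustration, the core of the idea is to copy each measurement cycle's coverage summary into the trial's filestore directory at the same point where the corpus snapshot is archived. A minimal sketch follows, with hypothetical helper and path names, assuming FuzzBench's `filestore_utils.cp` helper; the actual PR wires this into the existing snapshot logic:

```python
import os
import posixpath

from common import filestore_utils  # FuzzBench helper for filestore copies.


def archive_coverage_snapshot(coverage_json_path, trial_filestore_dir, cycle):
    """Copy this cycle's coverage summary next to the corpus snapshot for
    the same cycle (illustrative only; names are hypothetical)."""
    if not os.path.exists(coverage_json_path):
        return  # Nothing has been measured for this cycle yet.
    destination = posixpath.join(trial_filestore_dir, 'coverage',
                                 'coverage-%04d.json' % cycle)
    filestore_utils.cp(coverage_json_path, destination)
```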

@addisoncrump (Contributor Author)

Forgot to format...

addisoncrump marked this pull request as draft August 8, 2024 13:34
@addisoncrump (Contributor Author)

The saving doesn't seem to work as expected. I'm going to keep trying, but it's quite difficult to debug.

addisoncrump marked this pull request as ready for review August 8, 2024 15:12
@addisoncrump (Contributor Author)

Okay, this should work now. I had originally confused the direction of the copy.

@DonggeLiu (Contributor)

Thanks @addisoncrump!
The code looks great to me. But before merging this, let's run an experiment on this PR to triple-check that this also works in the cloud instances : )
Could you please make a trivial modification to service/gcbrun_experiment.py?
This will allow me to launch experiments in this PR for final validation. Here is an example to add a dummy comment.
We can revert this after the experiment.
Thanks!

@addisoncrump (Contributor Author)

> let's run an experiment on this PR to triple-check that this also works in the cloud instances

Sure, and also to collect the corresponding coverage data for the "standard" fuzzers. I'll make that change shortly.

@addisoncrump (Contributor Author)

Also, a local experiment shows that we get warning info in the JSON (!):

warning: 6 functions have mismatched data
{"data":[{"files":[{"branches":[[102,22,102,36,0,0,0,0,4],[103,9,103,41,0,0,0,0,4],...]}]}]}

Should we remove this?

@DonggeLiu (Contributor)

> Should we remove this?

Do you happen to know the cause of this?

@addisoncrump (Contributor Author)

addisoncrump commented Aug 9, 2024

To be honest, I've looked around a bit now and do not see the root cause.

It seems to be using new_process.execute, but that redirects stdout only. I presume, then, that llvm-cov is actually producing warnings on stdout (!). I'll see if I can find the appropriate command-line switch to suppress this.

@addisoncrump (Contributor Author)

It seems to be a known issue btw; get_coverage_infomation (typo: information) already handles this.
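
For reference, the workaround boils down to parsing only the last line of the summary file, since llvm-cov prints its warnings before the single-line JSON document. A rough sketch of that approach (not necessarily the exact upstream code):

```python
import json


def load_coverage_summary(coverage_summary_file):
    """Parse an llvm-cov export summary, skipping warning lines such as
    'warning: 6 functions have mismatched data' that precede the JSON."""
    with open(coverage_summary_file, encoding='utf-8') as summary:
        # llvm-cov writes the JSON document on a single (final) line,
        # so anything before it can be ignored.
        return json.loads(summary.readlines()[-1])
```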

@addisoncrump (Contributor Author)

addisoncrump commented Aug 9, 2024

That seems to have done it. The get_coverage_infomation function can remain as-is without loss of functionality.

Running a quick local test and then will stage the cloud test.

@addisoncrump (Contributor Author)

addisoncrump commented Aug 9, 2024

Okay, so I spent quite a while debugging a weird change that occurred whenever presubmit ran: make presubmit was modifying the file analysis/test_data/pairwise_unique_coverage_heatmap-failed-diff.png. This was a result of the seaborn version being incompatible with the installed matplotlib version, which I fixed by updating the dependency in requirements.txt. Even after that, the regenerated image still had metadata changes that modified the file on disk, so, since it is the output of a test, I added it to the .gitignore.

This also implies to me that the test should be failing, but isn't. I think this is a minor difference in how seaborn now emits heatmaps (seems to be some offset change).

@addisoncrump (Contributor Author)

Also, experimenting with compression, because the coverage dumps are quite large and easily compressible.

@addisoncrump (Contributor Author)

llvm-cov export: Unknown command line argument '-no-warn'. Try: 'llvm-cov export --help'

Well, the version of llvm-cov used is too old. I'll revert this now.

@addisoncrump (Contributor Author)

Compression reduces the dumps from roughly 15 MB to 1 MB, so it seems worth it. This is now in a stable state and ready for a test run!
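
For illustration, the compression step amounts to something like the following sketch (the helper name is hypothetical; the PR may apply it elsewhere in the archiving path):

```python
import gzip
import shutil


def compress_coverage_dump(json_path):
    """Gzip a coverage JSON dump before archiving it; the ~15 MB dumps are
    highly repetitive and shrink to roughly 1 MB."""
    gz_path = json_path + '.gz'
    with open(json_path, 'rb') as src, gzip.open(gz_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)
    return gz_path
```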

@DonggeLiu (Contributor)

Nice! Let's start with a simple one.

> collect the corresponding coverage data for the "standard" fuzzers.

Then we collect these.

@DonggeLiu (Contributor)

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-09-dg-2020 --fuzzers libfuzzer --benchmarks libxml2_xml

@addisoncrump (Contributor Author)

The data directory was generated as expected, but the report was not.

If none of the measurements have happened yet, it won't have created a report, no?

@tokatoka (Contributor)

I guess we need to update libafl.
@addisoncrump
Can you change the commit we are using for libafl?
And also use fuzzers/fuzzbench/fuzzbench instead of fuzzers/fuzzbench.

@addisoncrump (Contributor Author)

@DonggeLiu Any complaints if I make the libafl change in this PR as well?

@DonggeLiu (Contributor)

> @DonggeLiu Any complaints if I make the libafl change in this PR as well?

Ah, we would really appreciate it if you could do it in a different PR, given that it is a stand-alone change.
Hope that won't cause too much trouble : )

Thanks!

@DonggeLiu (Contributor)

Thanks for the info, @tokatoka.

> can you change the commit we are using for libafl?

What is the preferred commit to use?

@tokatoka (Contributor)

I'd say we can just use the latest.

@addisoncrump (Contributor Author)

addisoncrump commented Aug 12, 2024

Wait, something is going wrong with the 2024-08-10-base.

@DonggeLiu, was the root cause ever discovered?

@DonggeLiu (Contributor)

> @DonggeLiu, was the root cause ever discovered?

I think this is the reason: #2023.

There are other warnings/errors, but I reckon this is the reason.

@DonggeLiu (Contributor)

Also seeing a lot of this, but I presume that's unrelated to your PR?

Traceback (most recent call last):
  File "/work/src/experiment/measurer/coverage_utils.py", line 74, in generate_coverage_report
    coverage_reporter.generate_coverage_summary_json()
  File "/work/src/experiment/measurer/coverage_utils.py", line 141, in generate_coverage_summary_json
    result = generate_json_summary(coverage_binary,
  File "/work/src/experiment/measurer/coverage_utils.py", line 269, in generate_json_summary
    with open(output_file, 'w', encoding='utf-8') as dst_file:
FileNotFoundError: [Errno 2] No such file or directory: '/work/measurement-folders/lcms_cms_transform_fuzzer-centipede/merged.json'

@addisoncrump (Contributor Author)

I don't think so; the modifications that were applied there were made by the formatter. I can just revert that whole file if needed.

@DonggeLiu (Contributor)

> I can just revert that whole file if needed.

No need, I've addressed this in #2023.
Later we can merge that into here.

@DonggeLiu (Contributor)

Oh, thanks for doing this.
I don't think that is caused by your modification, but since you have reverted it, let's run an experiment for it.

@DonggeLiu (Contributor)

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-08-12-2020 --fuzzers aflplusplus centipede honggfuzz libfuzzer

@addisoncrump (Contributor Author)

👍 I figure that since I didn't make any meaningful changes to that file anyway, it's better to leave it untouched. If the experiment magically starts working, I have no idea what that means, but I'll be happy about it lol

@DonggeLiu (Contributor)

Experiment 2024-08-12-2020 data and results will be available later at:
The experiment data.
The experiment report.
The experiment report (experimental).

@addisoncrump (Contributor Author)

Yeah, looks like it's not working. This run should probably be cancelled, if only to save some CPU time.

@DonggeLiu (Contributor)

DonggeLiu commented Aug 13, 2024

Yep, I suspect this is due to a benchmark compatibility issue.
Let me verify this.


Also, seeing a lot of instances in this experiment being preempted:
[screenshot: preempted instances]

@addisoncrump (Contributor Author)

Superseded by #2028.

DonggeLiu added a commit that referenced this pull request Aug 15, 2024
1. Fix `TypeError: expected str, bytes or os.PathLike object, not
NoneType` in
[`2024-08-10-test`](#2020 (comment)).
```python
Traceback (most recent call last):
  File "/src/experiment/runner.py", line 468, in experiment_main
    runner.conduct_trial()
  File "/src/experiment/runner.py", line 290, in conduct_trial
    self.set_up_corpus_directories()
  File "/src/experiment/runner.py", line 275, in set_up_corpus_directories
    _unpack_clusterfuzz_seed_corpus(target_binary, input_corpus)
  File "/src/experiment/runner.py", line 144, in _unpack_clusterfuzz_seed_corpus
    seed_corpus_archive_path = get_clusterfuzz_seed_corpus_path(
  File "/src/experiment/runner.py", line 98, in get_clusterfuzz_seed_corpus_path
    fuzz_target_without_extension = os.path.splitext(fuzz_target_path)[0]
  File "/usr/local/lib/python3.10/posixpath.py", line 118, in splitext
    p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType
```
This happens on [many
benchmarks+fuzzers](https://pantheon.corp.google.com/logs/query;query=%222024-08-10-test%22%0Aseverity%3E%3DERROR%0A--Hide%20similar%20entries%0A-%2528jsonPayload.message%3D~%22Error%20watching%20metadata:%20context%20canceled%22%2529%0A--End%20of%20hide%20similar%20entries;cursorTimestamp=2024-08-10T11:04:34.735815901Z;duration=P7D?project=fuzzbench&mods=logs_tg_prod).
To be investigated later:
1. Why `fuzz_target_path` is `None`.
2. Why this did not happen in other recent experiments.
3. I thought I had seen this long ago; déjà vu?

2. Fix `No such file or directory: '/work/measurement-folders/<benchmark>-<fuzzer>/merged.json'`:
```python
Traceback (most recent call last):
  File "/work/src/experiment/measurer/coverage_utils.py", line 74, in generate_coverage_report
    coverage_reporter.generate_coverage_summary_json()
  File "/work/src/experiment/measurer/coverage_utils.py", line 141, in generate_coverage_summary_json
    result = generate_json_summary(coverage_binary,
  File "/work/src/experiment/measurer/coverage_utils.py", line 269, in generate_json_summary
    with open(output_file, 'w', encoding='utf-8') as dst_file:
FileNotFoundError: [Errno 2] No such file or directory: '/work/measurement-folders/lcms_cms_transform_fuzzer-centipede/merged.json'
```

3. Remove incompatible benchmarks: `openh264_decoder_fuzzer`,
`stb_stbi_read_fuzzer`
DonggeLiu pushed a commit that referenced this pull request Aug 16, 2024
Changing forks so @tokatoka can collab with me on this. Supersedes
#2021.
As requested in #2020.
DonggeLiu pushed a commit that referenced this pull request Oct 13, 2024
…2028)

Supersedes #2020. Moving so we (AFL++ people) can collaborate on this
PR.

From the original:

> Currently, only corpora are saved in the archive and the summaries of
coverage are provided at the end of the experiment. This change simply
incorporates the saving of the coverage data snapshots next to the trial
corpus snapshots.

---------

Co-authored-by: Toka <[email protected]>
ardier pushed a commit to ardier/fuzzbench that referenced this pull request Nov 25, 2024
ardier pushed a commit to ardier/fuzzbench that referenced this pull request Nov 25, 2024
ardier pushed a commit to ardier/fuzzbench that referenced this pull request Nov 25, 2024
ardier pushed a commit to ardier/fuzzbench that referenced this pull request Nov 25, 2024