Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CIFuzz/CFLite should warn (or maybe optionally fail) when coverage goes down significantly #7190

Open
evverx opened this issue Jan 26, 2022 · 8 comments

Comments

@evverx
Copy link
Contributor

evverx commented Jan 26, 2022

CIFuzz can be used to make sure that PRs don't break fuzz targets and that new bugs aren't introduced, which is helpful assuming coverage is intact. But when PRs introduce changes preventing fuzz targets from covering their usual codepaths CIFuzz can't detect that and passes. I think it would be great if by analogy with, say, coveralls.io CIFuzz could compare coverage reports and at least warn when fuzz targets no longer cover anything. This issue was brought by a systemd fuzz target that was supposed to cover its dhcp-server but didn't actually fuzz much for almost two months as far as I can tell.

@jonathanmetzman
Copy link
Contributor

I think this would be interesting to do, just like coverage for unittests.
The issue regarding that one particular fuzz target might be helped by a changed we are going to make anyway, having seperate coverage reports per target in addition to the one aggregate target.
Another idea that's harder but which I like a lot is letting users know if the changed code was actually covered during CI-fuzzing and how much it was hit. Maybe this could even determine how long we fuzz for.

@evverx
Copy link
Contributor Author

evverx commented Jan 27, 2022

The issue regarding that one particular fuzz target might be helped by a changed we are going to make anyway, having seperate coverage reports per target in addition to the one aggregate target.

As far as I understand those reports will be on OSS-Fuzz (which is great) but I don't think issues like that should reach OSS-Fuzz given that they can be discovered much earlier.

Another idea that's harder but which I like a lot is letting users know if the changed code was actually covered during CI-fuzzing and how much it was hit.

I'm just spitballing but I think it should be possible to download the public OSS-Fuzz corpora, run the latest fuzzers and the fuzzers built in PRs (with coverage) against those corpora and compare their coverage reports. This way, CIFuzz/CFLite could make sure "coverage" isn't broken in general and on top of that it should make it possible to account for the runtime environment (which on Clusterfuzz is different from GHActions).

@evverx
Copy link
Contributor Author

evverx commented Jan 27, 2022

Just to clarify, coverage reports on OSS-Fuzz are helpful but I don't think anybody looks at them when fuzz targets are believed to be mature enough. If coverage fluctuations aren't reported explicitly they are ignored until they are discovered by accident. For example, I ran into that accidentally trying to figure out why CIFuzz missed two issues that should have been caught. It was due to #7011 but along the way I looked at the PR where they were introduced and eventually ended up looking at the coverage report.

@evverx
Copy link
Contributor Author

evverx commented Jan 21, 2023

I came across https://fitzgeraldnick.com/2022/10/24/how-fuzzy-are-your-fuzzers.html today where basically the same feature was mentioned:

As an aside, a super neat feature for OSS-Fuzz to grow would be automatically filing an issue whenever a fuzz target’s coverage dramatically drops after pulling the latest code from upstream or something like that.

To judge from GHSA-5fhj-g3p3-pq9g that bug could have been caught earlier:

This bug was discovered when we discovered that Wasmtime's fuzz target for exercising GC and stack maps, table_ops, was mistakenly not performing any actual work, and hadn't been for some time now. This meant that while the fuzzer was reporting success it wasn't actually doing anything substantive. After the fuzz target was fixed to exercise what it was meant to, it quickly found this issue.

@DavidKorczynski I think it's probably somewhat related to ossf/fuzz-introspector#734 in the sense that it's another example of where diffs between two reports can be used.

@DavidKorczynski
Copy link
Collaborator

DavidKorczynski commented Jan 21, 2023

@DavidKorczynski I think it's probably somewhat related to ossf/fuzz-introspector#734 in the sense that it's another example of where diffs between two reports can be used.

Interesting, thanks for pointing out. FYI, the diffing is being done atm and you can see some documentation here: https://fuzz-introspector.readthedocs.io/en/latest/user-guides/comparing-introspector-reports.html#example-of-runtime-coverage-improvements

Am not sure if it's feasible to run Fuzz Introspector in the CI, however:

  • We could maybe take out logic that is related only to code coverage as this can mostly be run without the heavy compilation process of FI.
  • With the diffing being implemented atm, and also the exporting of Fuzz Introspector data into summary.json (e.g. https://storage.googleapis.com/oss-fuzz-introspector/dbus-broker/inspector-report/20230121/summary.json) an idea I'm considering is adding historical analysis by default -- also including it in the main HTML page for each project. Using this we can create email alerts if something "odd" (as in, decrease in code coverage/reachability) happens?

@evverx
Copy link
Contributor Author

evverx commented Jan 21, 2023

FYI, the diffing is being done atm and you can see some documentation here

I didn't know that. Thanks for the link!

Am not sure if it's feasible to run Fuzz Introspector in the CI

Agreed.

We could maybe take out logic that is related only to code coverage as this can mostly be run without the heavy compilation process of FI.

I think that would probably be ideal in terms of catching those sudden drops in coverage because runtime coverage should indeed be enough in almost all cases probably.

e.g. https://storage.googleapis.com/oss-fuzz-introspector/dbus-broker/inspector-report/20230121/summary.json

Nice! Thanks!

an idea I'm considering is adding historical analysis by default -- also including it in the main HTML page for each project. Using this we can create email alerts if something "odd" (as in, decrease in code coverage/reachability) happens?

Sounds like a plan! Though I keep forgetting that for example systemd can't be built with FI upstream so it can't rely on anything apart from plain coverage reports.

@evverx
Copy link
Contributor Author

evverx commented Oct 22, 2023

For the record FI shows nice graphs.

@DavidKorczynski I wonder if FI links are stable and can be embedded into READMEs to make the state of projects more visible and hopefully attract more contributors (at least as long as LLMs aren't quiet there yet :-))?

@evverx
Copy link
Contributor Author

evverx commented Oct 22, 2023

Never mind. #10924 is probably what I'm looking for.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants