Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add libfuzzer compatible fuzz harness #7512

Merged
merged 11 commits into from
May 1, 2023
Merged

Conversation

nathaniel-brough
Copy link
Contributor

As discussed in #4088 the fuzz_simplify harness is an easy candidate to begin support for coverage-driven feedback required for high-performance fuzz testing.

@nathaniel-brough
Copy link
Contributor Author

I wasn't sure on how best to manage the Cmake build files for fuzz tests, so I'm certainly open to suggestions on some of the changes here :)

@alexreinking
Copy link
Member

alexreinking commented Apr 15, 2023

Because of the special toolchain requirements, I think this test should be moved to test/fuzz/simplify.cpp. That folder should get its own CMakeLists.txt containing the following:

tests(GROUPS fuzz
      SOURCES
      simplify.cpp
      )
target_compile_options(fuzz_simplify PRIVATE -fsanitize=fuzzer,address)
target_link_options(fuzz_simplify PRIVATE -fsanitize=fuzzer,address)

The test/CMakeLists.txt should be modified to contain the following:

# near the top:
include(CheckCXXCompilerFlag)

# toward the bottom
check_cxx_compiler_flag("-fsanitize=fuzzer,address" HAS_FUZZ_FLAGS)
cmake_dependent_option(
    WITH_TEST_FUZZ "Build fuzz tests" ON
    HAS_FUZZ_FLAGS OFF
)
if (WITH_TEST_FUZZ)
    add_subdirectory(fuzz)
endif ()

The guiding principle here is "ask for what you want", here the -fsanitize=fuzzer,address flag. Checking a proxy like the compiler/platform combo is suspect. In fact, there are compilers for both Windows and macOS that support libFuzzer.

@nathaniel-brough
Copy link
Contributor Author

nathaniel-brough commented Apr 15, 2023

Thanks for the quick response, I've made the suggested changes and also added some of the precursors for multi-fuzzer engine builds for oss-fuzz :)

test/CMakeLists.txt Outdated Show resolved Hide resolved
@nathaniel-brough
Copy link
Contributor Author

Looks like I've got at least a couple of formatting things to fix.

On a different note, I've got a corresponding draft PR to integrate with OSS-fuzz. A couple of things I'll need for that integration to go through;

  • A primary contact email address (associated with a google/gmail account).
  • A list of email addresses that you want to have CC'd into automatic bug reports from oss-fuzz.
  • A contributor here with write access to Halide to comment mentioning that you approve of integration with oss-fuzz.

If you are happy for me to do so, I can pull the email addresses from the git log, it's worth noting however that these emails will be listed in plaintext in the oss-fuzz repo.

test/fuzz/CMakeLists.txt Outdated Show resolved Hide resolved
test/CMakeLists.txt Outdated Show resolved Hide resolved
@steven-johnson
Copy link
Contributor

Part of the point of checking is that it's less necessary to know the complete matrix of compiler vendors and versions

No, but when it inevitably fails somewhere, having comments and/or messages to help explain would be nice

As discussed in halide#4088 the fuzz_simplify harness is an easy
candidate to begin support for coverage driven feedback
that is required for high performance fuzz testing.
test/CMakeLists.txt Outdated Show resolved Hide resolved
test/CMakeLists.txt Show resolved Hide resolved
@nathaniel-brough
Copy link
Contributor Author

Ok so I think there is a bit of a bug with cmakes CHECK_CXX_COMPILER_FLAG as it doesn't seem to support checking sanitiser flags. Something to do with it needing those sanitizer flags during linking as well, so the check fails. I'm going to have a bit of a go at using CheckCXXSourceCompiles instead as it seems to give a bit more control over how/what flags are used during compiling/linking.

@steven-johnson steven-johnson added the buildbot_test_everything Buildbots should run all available tests on this PR (unless build_test_nothing is set). label Apr 19, 2023
test/CMakeLists.txt Outdated Show resolved Hide resolved
@nathaniel-brough
Copy link
Contributor Author

nathaniel-brough commented Apr 20, 2023

I think I've covered most things here. Is this build failure related to my changes? The error is;

The following tests FAILED:
	243 - correctness_multi_way_select (Failed)
Errors while running CTest
program finished with exit code 8
elapsedTime=578.168036

@steven-johnson
Copy link
Contributor

LGTM, waiting on @alexreinking approval

Copy link
Member

@alexreinking alexreinking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Thanks for the contribution!

Copy link
Contributor Author

@nathaniel-brough nathaniel-brough left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@steven-johnson would you mind double checking these? I'll have a PR up shortly with these changes.

Type fuzz_types[] = {UInt(1), UInt(8), UInt(16), UInt(32), Int(8), Int(16), Int(32)};
const int fuzz_type_count = sizeof(fuzz_types) / sizeof(fuzz_types[0]);

std::string fuzz_var(int i) {
return std::string(1, 'a' + i);
}

Expr random_var() {
int fuzz_count = rng() % fuzz_var_count;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: change as per #7546

std::vector<int> divisors = {t.lanes()};
for (int dd = 2; dd < t.lanes(); dd++) {
if (t.lanes() % dd == 0) {
divisors.push_back(dd);
}
}

return divisors[rng() % divisors.size()];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine as per #7546

if (T.is_int() && T.bits() == 32) {
overflow_undef = true;
}
if (T.is_scalar()) {
int var = rng() % fuzz_var_count + 1;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fdp.ConsumeIntegralInRange<int>(0, fuzz_var_count) as per #7546 NOTE: drop the + 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The semantics of fuzz_var_count aren't clear to me -- in various for loops we do for (int i = 0; i < fuzz_var_count; i++) which implies that we need to limit the range to max=fuzz_var_count-1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah looking at this again the var variable seems to only get used once on line 56, so I could probably just get rid of it and replace the condition on line 56 with fdp.ConsumeBool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing it was +1 in the beginning so that the condition on line 56 was sometimes true/false. In any case it seems like a really roundabout way of doing it. I'm just going to replace it with the consumebool method.

return cast(T, v1);
} else {
if (overflow_undef) {
// For Int(32), we don't care about correctness during
// overflow, so just use numbers that are unlikely to
// overflow.
return cast(T, (int)(rng() % 256 - 128));
return cast(T, fdp.ConsumeIntegralInRange<int>(-128, 128));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fdp.ConsumeIntegralInRange<int>(-128, 127) as per #7546

} else {
return cast(T, (int)(rng() - RAND_MAX / 2));
return cast(T, fdp.ConsumeIntegralInRange<int>(-RAND_MAX / 2, RAND_MAX / 2));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fdp.ConsumeIntegralInRange<int>(-RAND_MAX / 2, RAND_MAX / 2 -1) as per #7546

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...except, RAND_MAX applies to the rand() call in stdlib; does libfuzzer also make the same promise about range?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the allowable range is going to be std::numeric_limits<int>::min() -> std::numeric_limits<int>::min() which is what ConsumeIntegral<int> will do but then if you want to constrain it further you can use the ConsumeIntegralInRange so no there aren't any promises other than the given range.

The FuzzedDataProvider class is just taking a stream of bytes from the fuzzer and turning them into other types,in this case an integral. The FuzzedDataProvider class just makes managing this easier as it essentially moves a cursor along the byte stream so that you aren't consuming the the same data twice.

int op = rng() % op_count;
Expr a = random_expr(fdp, T, depth);
Expr b = random_expr(fdp, T, depth);
int op = fdp.ConsumeIntegralInRange<int>(0, op_count);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fdp.ConsumeIntegralInRange<int>(0, op_count-1) as per #7546

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or better yet, use PickValueInArray()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True that would be better

}

const int bin_op_count = sizeof(make_bin_op) / sizeof(make_bin_op[0]);
const int bool_bin_op_count = sizeof(make_bool_bin_op) / sizeof(make_bool_bin_op[0]);
const int op_count = bin_op_count + bool_bin_op_count + 5;

int op = rng() % op_count;
int op = fdp.ConsumeIntegralInRange<int>(0, op_count);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fdp.ConsumeIntegralInRange<int>(0, op_count-1)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.. but looking at this now, the logic is a little fragile; the switch statement has implicit assumptions about the count and ordering of the two op arrays. What happens if someone inserts a new one? Would be nice to make this a bit more robust, e.g.:

  • Use custom enum values to associate the op functions with the switch statements?
  • Or maybe, rework make_bin_op and make_bool_bin_op arrays so that each entry is a pair<> with the existing op + a lambda function that contains the code from the relevant switch case, so that the switch goes away entirely?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I realize the fragility was pre-existing before your changes, but improving the resilience here seems easy and worthwhile while we are here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, looking at this again a second time I'm struggling to understand what is going on here. Where did this 5 come from.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nevermind I see what is going on here... that's a bit wack. I'll refactor as per your suggestions.

Type random_type(int width) {
Type T = fuzz_types[rng() % fuzz_type_count];
Type random_type(FuzzedDataProvider &fdp, int width) {
Type T = fdp.PickValueInArray(fuzz_types);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hadn't noticed this call existed -- perhaps we should aggressively use it in other cases too where possible (e.g. line 47, line 102)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't work on line 47 as its a vector and even though it would have been trivial to implement it doesn't work on vectors. But 102 could definetely use that.

ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buildbot_test_everything Buildbots should run all available tests on this PR (unless build_test_nothing is set).
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants