Differential fuzzing for proto-lens #280

blackgnezdo · 2018-12-13T18:52:57Z

As we improve the speed of proto-lens, we are incurring more of a complexity tax. We should start seriously thinking about a test suite that is more adversarial (and closer to real-world usage).

One idea from @kcc was to create a differential fuzzing test. The idea here is to have two parallel implementations, e.g.

haskellReparse :: ByteString -> Maybe ByteString   -- implemented in Haskell
cxxReparse :: ByteString -> Maybe ByteString  -- implemented in C++

Both functions would parse the given byte string (or fail) and then serialize the result back. The fuzzer driver would then invoke an assertion function that, for example, uses a proto diff to confirm that the results match. The magic comes from the fuzzer driving corpus generation by collecting coverage. Haskell doesn't normally provide the coverage instrumentation a fuzzer needs (at least, I've never heard of the GHC LLVM backend supporting coverage generation), but the C++ branch is coverage-enabled, and the fuzzer framework will exploit that for corpus generation.
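
The assertion step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Verdict`, `Reparse`, `differential`, `toyReparseA` and `toyReparseB` are hypothetical names that do not exist in proto-lens, the two toy functions stand in for `haskellReparse` and `cxxReparse`, and plain byte equality stands in for a real proto diff (two serializations can differ byte-for-byte and still describe the same message).

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as B

-- | Outcome of running both implementations on a single fuzzer input.
data Verdict
  = Agree                                -- both rejected, or both produced the same bytes
  | AcceptMismatch                       -- exactly one side rejected the input
  | OutputMismatch ByteString ByteString -- both accepted, but the outputs differ
  deriving (Eq, Show)

-- | Shape shared by the two implementations: parse, then re-serialize.
type Reparse = ByteString -> Maybe ByteString

-- | Compare two reparse functions on one input.
-- Assumption: byte equality is used here in place of a semantic proto diff.
differential :: Reparse -> Reparse -> ByteString -> Verdict
differential left right input =
  case (left input, right input) of
    (Nothing, Nothing) -> Agree
    (Just a, Just b)
      | a == b    -> Agree
      | otherwise -> OutputMismatch a b
    _ -> AcceptMismatch

-- Hypothetical stand-in for one implementation: accepts any non-empty input.
toyReparseA :: Reparse
toyReparseA bs
  | B.null bs = Nothing
  | otherwise = Just bs

-- Hypothetical stand-in for the other: additionally rejects a leading 0xFF byte.
toyReparseB :: Reparse
toyReparseB bs
  | B.null bs         = Nothing
  | B.head bs == 0xFF = Nothing
  | otherwise         = Just bs
```

In a real harness the fuzzer would call something like `differential haskellReparse cxxReparse` on every generated input and abort on anything other than `Agree`, so that the offending input is saved as a reproducer.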

An example of such code is this cross-checking test between OpenSSL and libgcrypt.

judah · 2018-12-13T20:06:51Z

This is an interesting idea. Are you also thinking of having the fuzzer generate the .proto files themselves? (Or, I guess equivalently, the DescriptorProto structure.) If so, are you aware of any existing implementations? I found one, but it's in Python: https://github.com/trailofbits/protofuzz

There are also a few known edge cases where we do differ from the C++ implementation:
https://github.com/google/proto-lens#current-differences-from-the-standard
This task could be a motivation to finish them off.

blackgnezdo · 2018-12-14T02:57:28Z

> This is an interesting idea. Are you also thinking of having the fuzzer generate the .proto files themselves? (Or, I guess equivalently, the DescriptorProto structure.) If so, are you aware of any existing implementations? I found one, but it's in Python: https://github.com/trailofbits/protofuzz

That's probably going too far as it requires compiling generated C++.

> There are also a few known edge cases where we do differ from the C++ implementation:
> https://github.com/google/proto-lens#current-differences-from-the-standard
> This task could be a motivation to finish them off.

Great, we want to eventually be compatible.
