Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Roundtrip tests with QuickCheck #233

Closed
wants to merge 13 commits into from

Conversation

mbarbin
Copy link
Contributor

@mbarbin mbarbin commented Jan 16, 2024

While working with ocaml-protoc, I encountered a message type that caused an RPC
deserialization error when the client attempted to decode the server's response.
This prompted me to explore an implementation of roundtrip property tests for
ocaml-protoc.

In this PR, I've created a new test directory that adds roundtrip tests using
QuickCheck. The property under test is the ability of a message to roundtrip
through protoc serialization (using both protobuf and json). The quickcheck
generation is focused on generating inputs of the message type under test.

This test was able to automagically find the problematic value I was dealing
with.

This new test directory is structured as a separate test package and does not
introduce new dependencies to the ocaml-protoc codebase. However, it does have a
dependency footprint that might not align with your preferences. Therefore, I'm
submitting this PR as a draft for your review, and start a discussion.

As for the original issue, I haven't investigated its root cause. I'm also
unsure if the message type I'm trying to express is a valid proto specification.
I would greatly appreciate your insights on this matter.

The message in question is the Unit case of the UnitOrError message type.

Thank you for your time and consideration!

@mbarbin mbarbin marked this pull request as draft January 16, 2024 10:10
@mbarbin mbarbin changed the title Fuzzy Roundtrip Tests Roundtrip tests with QuickCheck Jan 16, 2024
@mbarbin
Copy link
Contributor Author

mbarbin commented Jan 16, 2024

Note: Upon further reflection, I realized that I had inadvertently conflated
fuzz testing with QuickCheck property testing in my previous description.
I've updated the description above to more accurately reflect this distinction.

Also, a note for @c-cube: I just came across qcheck, a QuickCheck
implementation that I see you have contributed to. I want to clarify that my
selection of a different QuickCheck library (base_quickcheck) in this PR
was simply due to my prior experience with it, and not a reflection on
comparing both libraries. This is really just a draft I put together without
thinking much about all the details, to get some feedback on the approach
first. I am certainly open to exploring qcheck too and would be interested
in incorporating it into the tests if you believe it would be a more beneficial
dependency for the project.

@mbarbin mbarbin mentioned this pull request Jan 16, 2024
@Lupus
Copy link
Contributor

Lupus commented Jan 17, 2024

I would suggest going with AFL fuzzer, and a very good https://github.com/stedolan/crowbar library that supports it. I've had a lot of success finding obsure bugs in my code with this combo. AFL supports dictionary-based mode which saves some CPU cycles for coming up with protobuf syntax on its own (although it definitely can do that, it can even pull JPEGs out of thin air). It maximizes branch coverage in the tested process by producing new inputs, and it can also minimimize the test cases so that you have minimal set that has the same coverage.

I was thinking that we could also add some official protobuf implementation into the loop, like Golang one, and have ocaml-protoc encode a message, and Golang generated code decode it and vice versa, this way we can property-test compatibility with official Google tooling.

@Lupus
Copy link
Contributor

Lupus commented Jan 17, 2024

Hm, probably AFL is not really required for roundtrip testing that you suggest, it's powers are not really helpful here. Given some protobuf message some quick-check style fuzzing is more than sufficient to cover the input space...

We could use AFL/crowbar though to come up with a minimal collection of input .proto files that cover most of what ocaml-protoc can parse though. Filter out incorrect inputs with some official tool like buf, and feed the rest into ocaml-protoc. This will find cases where ocaml-protoc fails to parse valid .proto code. Maybe we could just use some already-created set of .proto files used as test corpus for other protobuf implementations for this purpose...

Having a decent collection of .proto files, we could come up with code generation plugin that will emit quickcheck code for all the generated types. Actually you can tell ocaml-protoc to add quickcheck ppx annotations to all types by its own, this way you don't have to make duplicates of types in your quickcheck code only to add the necessary ppxes. Having automatically generated code for quickcheck for all types, you can just generate the whole testing code for whole collection of .proto's and run it! This will ensure that whatever is encoded by ocaml-protoc is properly decoded back, and the only thing that would be missing is compatibility with other tooling.

@mbarbin
Copy link
Contributor Author

mbarbin commented Jan 18, 2024

Hi @Lupus !

I want to express my gratitude for your innovative thinking and consideration of
compatibility with other frameworks. Your ideas have also guided my thoughts in
valuable new directions. Thank you!

While I may not be able to fully articulate the essence of my feelings at this
time, I find it important to distinguish between compile-time and runtime
errors. The prophecy "If it compiles, it works" is well-known: I find myself
more concerned about bugs where ocaml-protoc accepts a definition but generates
suspicious code that doesn't roundtrip correctly. These runtime issues, whether
due to a failure to reject an invalid proto definition or otherwise, are my
primary focus. Compile-time issues, like a supposedly valid proto specification
that fails to parse, provide us with ample time and notice to handle.

While I currently have reservations about the accuracy of proto files generated
by a fuzzing framework in representing real-world specifications, I recognize
this may be due to my own bias and lack of experience with such generation
methods. I'm open to the possibility of unexpected outcomes, and I must admit,
the idea of a solution that generates not only the roundtrip tests with their
QuickCheck inputs, but also the type definitions from the start, is quite
impressive. For now my current focus will be on real cases still.

Your suggestion about adding ppx deriving directly to the generated code has not
gone unnoticed. I understand that injecting ppx deriving constructs could
simplify user code. However, I have concerns about the need to overwrite a
QuickCheck generator for a specific type, which could potentially undermine the
whole approach. This scenario may not occur, but it's something I'm keeping in
mind.

I greatly appreciate your long-term vision. If we were to think about near-term,
actionable steps, while keeping your broader goals in mind, what would it look
like? As a first step, I think building a facility to roundtrip proto
specifications, whether they're manually collected, gathered from known
datasets, or fuzzily generated, would be a immediate win.

To reduce the amount of code needed to wrap the generated code, we could
introduce a type-class that encapsulates the functionality I've manually defined
in user land. For instance:

module T = struct
  type t = Messages.person

  let encode_pb = Messages.encode_pb_person
  let decode_pb = Messages.decode_pb_person
  let encode_json = Messages.encode_json_person
  let decode_json = Messages.decode_json_person
end

This binding could be automatically generated and made available as a type-class
(perhaps a record) alongside each generated type. This could be done
conditionally, based on a new flag for the protoc compiler. The newly generated
value could then be directly used by the tests, thereby minimizing their
footprint. This approach would streamline the testing process by reducing the
amount of manual code required.

Thank you again for your valuable input and inspiring vision. I look forward to
further discussions on these topics.

@mbarbin
Copy link
Contributor Author

mbarbin commented Jan 19, 2024

Latest Developments:

  • I've experimented with qcheck. At one point in the PR history, both
    libraries were operational side by side, and I confirmed that qcheck could
    identify the same problematic input.

  • Subsequently, I removed the base_quickcheck components, hoping that qcheck's
    dependencies would be more compatible with the project and the CI.

  • I've attempted to incorporate some elements from our discussion. I've included
    a type class with most of the bindings, but unfortunately, the most crucial
    one (the quickcheck generator gen) cannot be generated with ppx_deriving.
    The obstacle is that the --ocaml_all_types_ppx option causes the generator
    to output the deriving construct in both the .ml and the .mli, but
    ppx_deriving_quickcheck doesn't seem to support signatures. To progress,
    I've relied on the existing structure and continued to create the generator
    from the user land.

  • It's worth noting that I'm not familiar with qcheck and the unit test
    framework it's likely designed to work with. I don't believe we need
    ppx_expect here, which is primarily used to print out the discovered values.
    I've enabled the option in the PR to accept external contributions, so if
    anyone has time and interest, I'd welcome a few commits to convert this into a
    standard test that best suits the project (perhaps alcotest?).

I'm eager to discuss whether you believe this PR, in its current state, is
converging towards something that could be merged. It has already identified
some useful problematic values, and the new directory can serve as a repository
for new tests as we progress.

For further improvements:

  1. Contribute to ppx_deriving_qcheck to make it compatible with
    --ocaml_all_types_ppx (support for mli seems useful for
    ppx_deriving_check as well), or
  2. Introduce a new option in ocaml-protoc to inject ppx in *.ml files only.
    This would be a more immediate solution that doesn't necessitate upstream
    changes.

While both options are viable in the long term, I'm questioning whether it's
worth pursuing option 2 in this PR.

@c-cube
Copy link
Collaborator

c-cube commented Jan 19, 2024

A few comments off the top of my head:

Thank you for your work, it is very much appreciated. Your thoughtfulness in comments is also noted :-).

The further improvement 1. would be nice, the ppx for qcheck should work for files with .mli in general. It should be too hard as it's a signature per type. I don't think 2. is useful because then it means we can't access the printers from the outside.

QCheck can be used with alcotest or ounit, but it also has its own runner library in qcheck-core.runner. It just takes a list of tests and CLI arguments, and it might be enough.

@mbarbin
Copy link
Contributor Author

mbarbin commented Jan 19, 2024

I don't think 2. is useful because then it means we can't access the printers
from the outside.

I believe you're referring to the qcheck generators, correct? Just to make sure
we're on the same page, the idea that would make option 2 viable in this case is
that the generator would be exported via the type_class that the mli already
exports. For instance, see in the messages.mli:

(** {2 QuickCheck} *)

val quickcheck_person : person Pbrt_quickcheck.Type_class.t
(** [quickcheck_person] provides helpers to test the type person with quickcheck *)

This would be accompanied by a modification to the type_class type to include
the missing field:

------ /ocaml-protoc/src/runtime-qcheck/pbrt_quickcheck.ml
++++++ /ocaml-protoc/src/runtime-qcheck/pbrt_quickcheck.ml
@|-1,13 +1,14 ============================================================
 |(** Runtime for QuickCheck based tests. *)
 |
 |(** A type class generated for each type *)
 |module Type_class = struct
 |  type 'a t = {
 |    pp: Format.formatter -> 'a -> unit;
+|    gen: 'a QCheck2.Gen.t
 |    equal: 'a -> 'a -> bool;
 |    encode_pb: 'a -> Pbrt.Encoder.t -> unit;
 |    decode_pb: Pbrt.Decoder.t -> 'a;
 |    encode_json: 'a -> Yojson.Basic.t;
 |    decode_json: Yojson.Basic.t -> 'a;
 |  }
 |end

QCheck can be used with alcotest or ounit, but it also has its own runner
library in qcheck-core.runner. It just takes a list of tests and CLI
arguments, and it might be enough.

Thanks for this. I plan to try this when I next work on the PR.

@mbarbin
Copy link
Contributor Author

mbarbin commented Jan 20, 2024

I decided to take one more step while the details are still fresh in my mind. I've modified the test to use qcheck's core runner, as you suggested. Here's the outcome:

$ dune exec ./src/tests/roundtrip/main.exe -- --verbose --colors
random seed: 130848014
generated error fail pass / total     time test name
[✓]  100    0    0  100 /  100     0.0s error.protobuf-roundtrip
[✓]  100    0    0  100 /  100     0.0s error.json-roundtrip
[✓]  100    0    0  100 /  100     1.5s person.protobuf-roundtrip
[✓]  100    0    0  100 /  100     1.5s person.json-roundtrip
[✓]  100    0    0  100 /  100     0.0s empty.protobuf-roundtrip
[✓]  100    0    0  100 /  100     0.0s empty.json-roundtrip
[✓]  100    0    0  100 /  100     0.0s error.protobuf-roundtrip
[✓]  100    0    0  100 /  100     0.0s error.json-roundtrip
[✗]    1    1    0    0 /  100     0.0s unit_or_error.protobuf-roundtrip
[✓]  100    0    0  100 /  100     0.0s unit_or_error.json-roundtrip

=== Error ======================================================================

Test unit_or_error.protobuf-roundtrip errored on (0 shrink steps):

Unit

exception Pbrt.Decoder.Failure(Malformed_variant("unit_or_error"))

================================================================================
failure (0 tests failed, 1 tests errored, ran 10 tests)

Currently, I haven't found a way to incorporate this failing test in a manner that doesn't disrupt the build, without resorting to cram tests, which I believe aren't enabled in this repository. This is something to be determined.

Regarding the issue with qcheck2 in *.mli: This draft employs a rather crude workaround of filtering out qcheck2 if the generator detects that it's generating an mli. While I don't recommend this for the live version, it allows me to demonstrate progress on the PR.

I believe this latest round of changes brings us significantly closer to the level of automation discussed by @Lupus.

@c-cube
Copy link
Collaborator

c-cube commented Jan 21, 2024 via email

@mbarbin
Copy link
Contributor Author

mbarbin commented Jan 25, 2024

Do you not use proptests in TDD?

Long-term, users submitting failing tests 1 in PRs could be a valuable workflow.
This lets maintainers prioritize asynchronously, and use pre-existing
expect-tests to see the impact of their changes when they get to it. The effect
of fixes on these tests also improves the code review experience, and surviving
tests become effective regression tests 2.

In the near term, we could pragmatically just disable the only failing item just
before merging.

Regarding the bug, I want to manage expectations: I am not currently
investigating the root cause. My focus is on making it easier to use
ocaml-protoc through my interfacing work on ocaml-grpc, and discussing testing
aspects of ocaml-protoc, as I am doing in this PR. I would prefer to avoid
delving into the actual runtime for now, if that makes sense.

For the qcheck part, I've opened a PR to add the missing support for signature
to move this PR forward.

Footnotes

  1. A test that demonstrates an issue in a reproducible fashion, but doesn't affect the succesful exit code of the overall build process.

  2. A similar approach seemed to work well in the ocp-ident project.

@c-cube
Copy link
Collaborator

c-cube commented Jan 26, 2024

I don't necessarily use TDD. I like the regression tests part, but I've never before used a failing qcheck property as a positive test (if you make it work, why not though!). Pragmatically we can comment/disable the failing test and re-enable it once the bug is fixed, that works for me.

- Now it's possible to add failing tests without breaking the build.

- The new stanza added required bumping dune-lang to `2.2`.
This reverts commit 5b1a17e.

The ci needs to know what dependencies to install. I'm not sure where
to declare them if not in an unused package definition? TBD.
@mbarbin
Copy link
Contributor Author

mbarbin commented Jan 28, 2024

I've added the following:

  1. The ability to override the QuickCheck generator during test construction.

  2. The option to disable certain tests from a generated test suite, providing
    flexibility for specific use cases.

if you make it work, why not though!

Thanks! I've made a new attempt:

I've observed that the qcheck-runner, when executed without the --verbose
flag, only outputs to stdout. It also provides a --seed option for setting a
fixed seed for random generation. Leveraging these features, I've managed to run
the runner stably, enabling the capture of its output in a dune expect test.

This looks like this:

(rule
 (target test.outputs)
 (action
  (with-accepted-exit-codes
   (or 0 1)
   (with-outputs-to
    %{target}
    (run ./%{dep:main.exe} --no-colors --seed 382079347)))))

To use the with-accepted-exit-codes construct, I had to upgrade the dune-lang
version to 2.2. I trust this won't pose any issues?

I'm pleased to report that the CI is now passing. Looking forward to your next
round of feedback!

@c-cube
Copy link
Collaborator

c-cube commented Jan 29, 2024

That looks good! Dune 2.2 is a reasonable dependency. Do you plan to add more tests, or do you want to mark this PR as ready?

@mbarbin
Copy link
Contributor Author

mbarbin commented Jan 29, 2024

Thank you, @c-cube, for your continued involvement with this PR. I'm OK with the
current tests for this PR, and am envisioning a growing collection of tests
contributed from users once the directory is merged.

I've published a preview package1 and listed it2 for experimentation. I've
also integrated it into one of my projects3.

@Lupus, I'd love to hear your thoughts too before transitioning this PR to the
review stage. I'm particularly interested in hearing about applications in a
broader environment. Would you consider testing this in one of your protoc
interfaces? Thank you!

Footnotes

  1. https://github.com/mbarbin/ocaml-protoc/releases/tag/3.0.2-preview.4

  2. https://github.com/mbarbin/opam-repository

  3. https://github.com/mbarbin/eio-rpc/commit/f89ec6dcf880831f1299eb5a230725d74ff4f69a

@Lupus
Copy link
Contributor

Lupus commented Mar 5, 2024

Hey @mbarbin , sorry for late response. I doubt I'll have time to test this with our proto interfaces. We don't have a large collection of protos as of now anyways, so I doubt that it will bring much value. There is that Google's test suite for protobuf, have you tried testing those?

@mbarbin
Copy link
Contributor Author

mbarbin commented May 31, 2024

Hi! 👋

I've been away from the protoc project for a while, but I'm now revisiting my open PRs.

I've been using the quickcheck generation implemented in this PR for several months experimentally, but I'm now reconsidering the approach.

Firstly, the workaround I've used here to bypass the lack of support for [@@deriving qcheck2] in signatures doesn't hold up in real-world scenarios. This becomes evident when trying to build a protoc specification for a type across multiple protoc files. While I could wait for progress on this issue, it leads me to my second point: I'm hesitant to commit to a specific quickcheck package. Since drafting this feature, a new quickcheck library (bam) has been released, bringing the total number of options (I know of) to four (base_quickcheck, qcheck, qcheck2, and bam).

For my current use case, I'm not directly manipulating the types generated by ocaml-protoc. Instead, I'm working with intermediate types defined manually in a user library. I already have roundtrip tests in place for these, which include the protoc serialization (similar to this example). As I'm already using base.quickcheck, I'm now more reluctant to add another quickcheck framework dependency to my projects.

Looking back at the protoc code generator, I notice that the boundaries for the abstraction of a generation plugin are already very well established in the compilerlib's implementation, in the form of the Pb_codegen_plugin.S signature. I feel that protoc's compilerlib would be close to supporting user plugins. I believe supporting generation plugins could potentially be a good long-term direction for the project. Specifically, if the ocaml-protoc cli could accept user-developed code generation plugins as extra arguments (I am thinking about OCaml's dynamic loading capability), these features could be implemented as third-party packages. This would allow users to choose the quickcheck library that best suits their needs.

Given these considerations, I'm inclined to close this PR for now. I could revisit it in the future if the plugin route seems appealing. I'd appreciate your thoughts on this.

Thank you for your time!

@mbarbin mbarbin closed this Jun 7, 2024
@mbarbin
Copy link
Contributor Author

mbarbin commented Jun 7, 2024

cc @c-cube - I don't know how GitHub notifications works on draft or closed PRs. Just wanted to make sure you'd see the reasons I closed this one. Thanks a lot, and looking forward to potential future work together!

@c-cube
Copy link
Collaborator

c-cube commented Jun 7, 2024

That's fair, no worries! :-)

(Notifications work just the same btw.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants