Add round-trip fuzz test #55
Thanks for bringing this up! I agree that round-tripping would be a great property, but I believe that the Markdown parser, at least in its current incarnation, is inherently lossy in a couple of places. I am sure this has been improved by now, and maybe entirely rectified, but it would require some research to be sure of that.
With that said, I hope there are more deterministic ways to add round-trip testing, maybe even for cases that are known not to round-trip correctly, instead of using a fuzzer that will also run in CI thanks to `make tests`.
My disposition here is not to merge, because the test is non-deterministic where it could be fully deterministic, but I wanted to hear your thoughts first. Thanks for sharing.
This checks that the library can accurately go from Markdown to events and back to the same Markdown.
I agree that there will be a lossy step here, for example for list items. That is expected, and I'm actually talking about round-tripping after such normalization. That is why my fuzzer does this:
    let round_trip_1 = round_trip(&text);
    let round_trip_2 = round_trip(&round_trip_1);
    assert_eq!(round_trip_1, round_trip_2);
The hope would be that this assertion holds. However, some brief testing showed me that this is not the case. I've tried round-tripping more and more times in the hope of finding a fixpoint, but it seems that I'm just generating more elaborate test cases 😄 Here is an input which changes shape at least five times:
I updated the PR to have this code in case it's interesting.
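The search for a fixpoint described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in the PR: `find_fixpoint` and `toy_round_trip` are hypothetical names, and `toy_round_trip` is a toy stand-in that only mimics the `"**"` escaping behavior instead of calling the real parser and `cmark`.

```rust
/// Applies `normalize` repeatedly until the output stops changing.
/// Returns the stable text and the number of rounds that changed it,
/// or `None` if no fixpoint was reached within `max_rounds`.
fn find_fixpoint<F>(normalize: F, input: &str, max_rounds: usize) -> Option<(String, usize)>
where
    F: Fn(&str) -> String,
{
    let mut current = input.to_string();
    for round in 0..max_rounds {
        let next = normalize(&current);
        if next == current {
            return Some((current, round));
        }
        current = next;
    }
    None
}

/// Toy stand-in for `round_trip` (an assumption, NOT the real crate
/// behavior): each call escapes only the first `*` that is not already
/// preceded by a backslash, so the text needs several rounds to settle.
fn toy_round_trip(text: &str) -> String {
    let mut out = String::with_capacity(text.len() + 1);
    let mut escaped_one = false;
    let mut prev_backslash = false;
    for c in text.chars() {
        if c == '*' && !prev_backslash && !escaped_one {
            out.push('\\');
            escaped_one = true;
        }
        out.push(c);
        prev_backslash = c == '\\';
    }
    out
}
```

With the real `round_trip` passed in place of the toy function, a `None` result would flag an input that is still changing shape after `max_rounds` rounds.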
Yes, definitely! This PR is not ready to be merged, so let me mark it as a WIP to make that clearer to others. The fuzz test here is more of a future goal, and perhaps a way to find small low-hanging fruit that spoils the round-tripping today.
The corresponding events to the
It looks like a bug to me that
Thanks so much for elaborating! I understand now that after one round it could stabilize even if it isn't generally round-trippable. It would also be my expectation that issues with multi-step round-tripping are solely to be attributed to this crate rather than the parser.
Yeah, I'll add normal tests for individual cases in a new PR. I see this PR as a bit of a test-generating machine 😄
The "using a fuzzer to generate test cases for invariants" technique definitely goes into the tool belt :). Thanks for pointing it out.
Yeah, it's a new tool for me too! 😄 It's something I want to try applying more in various projects since it is super powerful.
This checks that the library can accurately go from Markdown to events and back to the same Markdown. I'm trying to use the library in google/mdbook-i18n-helpers#19.
My assumption was that I can round-trip Markdown through the library. While there is a lot of Markdown input which can produce a given sequence of `pulldown_cmark::Event`s, I was hoping to normalize the output by running it through `cmark`. So I hoped that the normalized output would round-trip unchanged, where `round_trip` runs `text` through `cmark(Parser::new(&text), &mut result)`, see `round-trip.rs`.
The fuzz test fails. To run it yourself, install `cargo fuzz` and run the fuzz target.
Input like `"**"` becomes `"\\**"` after one iteration and then becomes `"\\*\\*"` after a second round of normalization.
The events for the original input become `"\\**"`, which is then parsed into a new sequence of events. These events are finally turned into `"\\*\\*"`.
I can work around this case by merging adjacent `Text` events, but if I do that, I find a more complex input that fails the fuzz test.
Is there a chance to have the library support this kind of round-tripping?
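The workaround of merging adjacent `Text` events could look roughly like this. It is only a sketch: the `Event` enum here is a simplified, hypothetical stand-in for `pulldown_cmark::Event` (the real enum has many more variants and borrows its text), and `merge_adjacent_text` is an assumed helper name, not something the library provides.

```rust
/// Simplified stand-in for `pulldown_cmark::Event` (assumption for this
/// sketch; only the variants needed for the illustration are modeled).
#[derive(Debug, PartialEq)]
enum Event {
    Text(String),
    SoftBreak,
}

/// Collapses each run of consecutive `Text` events into a single `Text`
/// event and passes every other event through unchanged.
fn merge_adjacent_text(events: impl IntoIterator<Item = Event>) -> Vec<Event> {
    let mut merged: Vec<Event> = Vec::new();
    for event in events {
        if let Event::Text(next) = &event {
            // Extend the previous event if it is also text.
            if let Some(Event::Text(prev)) = merged.last_mut() {
                prev.push_str(next);
                continue;
            }
        }
        merged.push(event);
    }
    merged
}
```

Feeding the merged events to the serializer means it sees one contiguous piece of text and can decide on escaping once, instead of once per fragment. As noted above, this only handles the simple case and does not make the fuzz test pass in general.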