Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discrepancy between Gen and Arbitrary for Option #401

Closed
ceedubs opened this issue Apr 25, 2018 · 10 comments
Closed

Discrepancy between Gen and Arbitrary for Option #401

ceedubs opened this issue Apr 25, 2018 · 10 comments

Comments

@ceedubs
Copy link
Contributor

ceedubs commented Apr 25, 2018

The Gen instance for Option generates a Some 90% of the time (see #286).

The Arbitrary instance for Option doesn't delegate through to the Gen instance but instead uses the size as a factor for the frequency of Some.

It seems to me like Scalacheck should settle on one of these two approaches for generating Option, have the Gen use it, and have the Arbitrary instance simply delegate to the Gen instance. The idea of using the size for the frequency of Some is an interesting one, though it's not immediately obvious to me whether one approach is better than the other. Also the Arbitrary instance is currently resizing the inner Gen[A] instance to half of size which stumped me a little and appears to have done the same to @rickynils.

I'm happy to contribute the change if someone suggests to me which of these approaches seems like the better option.

@ceedubs
Copy link
Contributor Author

ceedubs commented Apr 25, 2018

cc @non

@ashawley
Copy link
Contributor

ashawley commented Apr 25, 2018

I vote for 90% of the timeSome, 10% None.

Really, the Arbitrary instance should reuse the Gen instance,Arbitrary(Gen.option(arbitrary[T])), as you suggested. No point repeating it.

@ashawley
Copy link
Contributor

Looking at the git history, it seems the current implementation for Arbitrary uses halving simply because it was modifying the previous version that also used sized and resize, but was doing arithmetic subtraction. The old method was defective at producing None enough. Introducing the frequency turned out to be the fix, but it looks like maybe the halving was added for good measure. That this code ends up working seems like an accident of history to me.

@ceedubs
Copy link
Contributor Author

ceedubs commented Apr 25, 2018

Currently #397 is tripping me up and preventing me from submitting a PR for this.

@martijnhoekstra
Copy link
Contributor

@ceedubs I see that #397 is now closed

@ashawley
Copy link
Contributor

ashawley commented Jul 17, 2019

It has been revealed that there was a use case of ScalaCheck for generating classes with recursive definitions using Arbitrary[Option[T] in a few projects. Changing the bias of Arbitrary.arbOption away from Noneand to Some in #408 caused their tests to suddenly stackoverflow.

Here's an example of this use case:

case class RecursiveOption(opt: Option[RecursiveOption])

implicit def arbRecursiveOption: Arbitrary[RecursiveOption] = Arbitrary {
  Arbitrary.arbitrary[Option[RecursiveOption]]
    .map(RecursiveOption(_))
}

Weighting None to 10% would require more code to defend against stackoverlow in these tests:

implicit def arbRecursiveOption: Arbitrary[RecursiveOption] = Arbitrary {
  Gen.oneOf(
    Gen.const(RecursiveOption(None)),
    Gen.delay(Arbitrary.arbitrary[Option[RecursiveOption]] // !
      .map(RecursiveOption(_))))
}

There are arguments to be made for both biased weighting and for these recursive definitions, but we can't really satisfy both. I suppose we should just roll back to the old definition of equally weighting Some and None. Alternatively, we could go back to having different implementations for Gen and Arbitrary. The former, rolling back, seems more appropriate in the context of getting a bug fix version released. We could still keep the fix of the discrepancy between Gen and Arbitrary that this PR addressed. However, the weighting change to Gen was released in 1.14.0. So I guess we'd effectively be rolling back #286.

@dwijnand
Copy link
Contributor

Can the issue be solved by providing a Gen/Arb-creating function for such recursive cases?

I find the reasoning given in #286 and #408 to be good ones, and not worth reverting for this.

@ashawley
Copy link
Contributor

I don't believe there is a way.

Indeed, the reasoning was good. However, this would not be a good change in a bug-fix release.

@ceedubs
Copy link
Contributor Author

ceedubs commented Jul 17, 2019

I'm inclined to think that the higher Some weighting should be the long-term solution and that libraries that are htting stack overflows should use a depth tracker to avoid generating structures that are too deeply nested. I would imagine that they already have a chance to stack overflow, but it's just very small.

Having said that, if this change has been too disruptive for a patch release, I have no strong objection to reverting it until the next minor version. Sorry to cause trouble.

@ashawley
Copy link
Contributor

Yes, that is a good argument. The termination does seem accidental. It could be put back for a minor version release. We could document this change, as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants