Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

seed: introduce seed user option #165

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

fenollp
Copy link
Contributor

@fenollp fenollp commented May 17, 2018

Any reason this wasn't added earlier? This is an otherwise pretty common option for a QuickCheck implementation.

Note: not sure how much type-checking I need to do on the option's value as not much checking is done on the other options. Plus this might prove useful for RNGs that aren't seeded with the usual triple.

@fenollp
Copy link
Contributor Author

fenollp commented May 17, 2018

Seeing CI failures I expect they are due to older OTP releases using different RNGs.
If that’s true and there’s no backport of the RNG used in >=19 I should probably update the test to be otp-vsn dependent. Thoughts?

@kostis
Copy link
Collaborator

kostis commented May 17, 2018

I am a bit skeptical about introducing a test that relies on the RNG producing the same random number based on a given seed. Looks to me that this test is a perfect candidate to break not only in some past releases but also in some future ones. I would suggest another test instead; perhaps a weaker one that just checks that PropEr accepts this option.

@TheGeorge
Copy link
Contributor

TheGeorge commented May 17, 2018

In your PR the seed get's only parsed from the options but it does not actually get set as seed for the RNG. The reason for failing the test case is that >=19 PropEr uses random and afterwards rand (see proper_internal.hrl the macro ?RANDOM_MOD and ?SEED_NAME)

You have to set the seed using proper_arith:rand_start(Seed) for it to have any effect.

Furthermore, I think that the test is not good as it depends on the implementation of the RNG. What you want to test instead is that if you set the seed by hand it is set to exactly that value (probably using the ?SETUP macro since that gets evaluated before any input value is generated)

(see also Kostis comments)

@fenollp
Copy link
Contributor Author

fenollp commented May 17, 2018

Well I’d rather test what comes out of the RNG that way I can be sure I have all the inputs needed for deterministic/reproducible runs.
Maybe along with the seed I should add the necessary RNG functions as parameters too?

I’m not sure where start_rand should be called if at all since I can reproduce my test’s output with and with the call.
Also I tried reading ?SEED_RAND from ?SETUP but that does not seem to have much to do with the triplet seed. Should I be comparing ?SEED_RAND during cleanup with what was in setup?

Thanks!

@TheGeorge
Copy link
Contributor

Thinking about it a little bit, what you want to achieve with setting the seed is that runs of the same property with the same seed produce the same input as you say. So I think that running a property twice with the same seed and then comparing that the generated input is the same (e.g. fails on the same value) should be a good test. I still don't think that the test should rely on a specific implementation of a RNG.

In proper:global_state_init() proper uses the seed option already to set the seed. Sorry for the confusion earlier.

@fenollp
Copy link
Contributor Author

fenollp commented May 17, 2018

Disregard commit that just went through... WIP!

Maybe along with the seed I should add the necessary RNG functions as parameters too?

What do you think about me also adding the following user options:

  • {rng_seed, fun rng_seed/1}
    • default: fun ?RANDOM_MOD:seed/1
  • {rng_uniform, fun rng_uniform/0}
    • default: fun ?RANDOM_MOD:uniform/0
  • {rng_uniform, fun rng_uniform/1}
    • default: fun ?RANDOM_MOD:uniform/1

There are still a few calls to os:timestamp() in proper_arith:rand_reseed/0: shouldn't these be replaced by some function of Seed?

Something like fun ({A,B,C}) -> {B,2*C,A} end? Something deterministic I mean.

Thanks a lot for your help!

@TheGeorge
Copy link
Contributor

I am working on a fix for cleaning up the targeted part from the state.

QC = fun (Prop) ->
R = proper:quickcheck(Prop, Opts),
proper:clean_garbage(),
catch proper_target:cleanup_strategy(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheGeorge this fixed the issue of cleaning up TPBT.

@TheGeorge
Copy link
Contributor

TheGeorge commented May 20, 2018 via email

@fenollp fenollp mentioned this pull request May 22, 2018
state_is_clean() ->
get() =:= [].
-define(state_is_clean(), ?assertEqual([],get())).
-define(_state_is_clean(), ?_assertEqual([],get())).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the advantage of using state_is_clean() to using ?assert(state_is_clean())? For me the old notation is easier to read (an assertion that checks that the state is clean)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially prefixed it with assert_ but then thought it was getting too long. Would you be happy if I prefixed it?

The advantage is to show what’s in the get() when it’s not empty and to report the line where the macro is called.

@codecov-io
Copy link

codecov-io commented Jun 26, 2018

Codecov Report

Merging #165 into master will increase coverage by 0.29%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #165      +/-   ##
==========================================
+ Coverage   83.23%   83.53%   +0.29%     
==========================================
  Files          13       13              
  Lines        3132     3140       +8     
==========================================
+ Hits         2607     2623      +16     
+ Misses        525      517       -8     
Impacted Files Coverage Δ
src/proper.erl 86.15% <100.00%> (+0.48%) ⬆️
src/proper_arith.erl 86.11% <100.00%> (-0.30%) ⬇️
src/proper_target.erl 90.62% <100.00%> (+0.30%) ⬆️
src/proper_gen_next.erl 77.34% <0.00%> (+0.27%) ⬆️
src/proper_statem.erl 94.34% <0.00%> (+2.17%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b0fbe32...81d9968. Read the comment docs.

@fenollp
Copy link
Contributor Author

fenollp commented Jun 27, 2018

@TheGeorge Any progress on proper_target:cleanup_strategy?

@TheGeorge
Copy link
Contributor

I'm sorry for being so slow. I have a patch that fixes the issue and will upload it in a minute.

@TheGeorge
Copy link
Contributor

#174 should now clean up the strategy correctly.

@fenollp
Copy link
Contributor Author

fenollp commented Jun 27, 2018

Amazing! Will amend once #174 is merged :) Thanks a lot

@fenollp
Copy link
Contributor Author

fenollp commented Jun 27, 2018

Ready!

@kostis
Copy link
Collaborator

kostis commented Jun 27, 2018

Ready!

No, it's not.

For starters, I do not "buy" the reasoning to introduce all these unnecessary ?assert() diffs (see how the current diff looks). They have absolutely nothing to do with the primary goal of this pull request.

Also, please use the current style for the test -- i.e., no commas in the beginning of lines, etc.

Finally, and most importantly, I do not really understand what the test is actually doing. It cannot be setting the seed to some thing that is changing (os:timestamp/0 call) but it needs to be setting it to something stable and be checking that the behavior we get is

  1. the "expected" one and
  2. different than that that one would get without specifying a seed.

@fenollp
Copy link
Contributor Author

fenollp commented Jun 27, 2018

Ready!

No, it's not.

Yes it is: for review.

WRT the asserts commit (it is a separate commit, no need to show a full diff...):

The advantage is to show what’s in the get() when it’s not empty and to report the line where the macro is called.

That helped a lot debugging failing tests.

WRT style: what's in the "etc"? you mention no leading commas, I'm guessing ] shouldn't be on a line on their own I guess? what else?
Is there a coding style document I missed somewhere?

WRT the test:

  • yes the seed changes for each run of the test. That is to ensure the property is valid not just for a specific seed.
  • I believe running the same property with the same seed twice and comparing the counterexample is enough to ensure the seeding works. I set a very high size & disabled shrinking so that the generated counterexamples "have very unexpected values", not the common values that any seed could find such as false, 0, 4, .... Also it is trickier to test the property against a specific value since the RNG can be changed and has changed over OTP releases.
  • Now I should probably test seeded shrinking of a hard-to-shrink property. What do you think?
  • I like the idea of testing that the counterexample is different without specifying a seed! I'll probably also test that same property using a different seed.

Thanks for reviewing, I'll push an update soon.

@fenollp
Copy link
Contributor Author

fenollp commented Jul 17, 2018

Hi there, I've replied and pushed a commit 20 days ago now. Anything more I need to do?

@evnu
Copy link
Contributor

evnu commented Mar 18, 2019

Is this PR still under consideration? This feature might be useful in PropCheck (issue) to check if a persisted counterexample is still valid, or if the generator used to produce that example no longer exist.

@x4lldux
Copy link
Contributor

x4lldux commented Mar 29, 2020

@kostis @TheGeorge is there something that still needs to be finished with this? How can I help to get this merged?

@fenollp
Copy link
Contributor Author

fenollp commented Mar 31, 2020

Rebased ;)

@fenollp
Copy link
Contributor Author

fenollp commented Apr 2, 2020

Rebased again & enhanced the tests:

  • now there's a flakyness flag that allows some failures (for now)
  • the test failure reports which "property runner" failed
    Right now the reproducibility test is failing with and only with the targeted runner. I haven't run a git-bisect yet but I know test passed when I opened the PR 2 years ago. I'd say there's a good chance this comes from the somewhat recent changes to TPBT but of course this is not finger pointing.

I'm currently investigating whether reproducibility can be achieved by making proper_arith:rand_reseed/0 deterministic.

@fenollp fenollp force-pushed the seed branch 2 times, most recently from 398bd17 to f4f9f03 Compare June 25, 2020 17:20
@fenollp
Copy link
Contributor Author

fenollp commented Nov 8, 2020

Rebased again. Is CI disabled?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants