`nat_split`, `pos_split` #114

gasche · 2021-05-09T20:26:25Z

The goal of this PR is to implement a function that takes a natural number n and a parameter k, and splits n uniformly at random into k numbers n1..nk that sum back to n (n1 + ... + nk = n). This is useful in a random generator when splitting "fuel" or "size" among k components instead of just two. (cc @olivier-martinot, with whom I am collaborating on a random generator that would benefit.)

Two variants are provided: in `nat_split`, the n1..nk are natural numbers (range [0;n]); in `pos_split` they are strictly positive (range [1;n], or in fact [1;n-k+1], since each of the other k-1 parts takes at least 1).

The code for these functions is simple and relies on a generic helper that is also useful on its own. However, getting them right is tricky, and random generators are hard to test. I checked the implementation in an experimental version of my random-generator library that computes finite distributions.
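For readers who want the idea spelled out, here is a minimal sketch of one way to get uniform splits from a subset-picking helper. It is not the code of this PR: it runs on a plain `Random.State.t` rather than on the `Gen` type, it returns arrays, and the `range_subset` below is a hypothetical stand-in whose signature is only assumed to resemble the helper added in the first commit. The idea is that a positive k-split of n is the same thing as a choice of k-1 distinct cut points among the n-1 gaps between n unit tokens, and a natural k-split of n is a positive k-split of n+k with 1 subtracted from each part.

```ocaml
(* Sketch under stated assumptions: Stdlib only, not QCheck's Gen type. *)

(* Pick [size] distinct integers in [low; high], returned in increasing
   order. Hypothetical stand-in for the PR's generic helper. *)
let range_subset ~size low high st =
  let range = high - low + 1 in
  if size < 0 || size > range then invalid_arg "range_subset";
  let arr = Array.init range (fun i -> low + i) in
  (* Partial Fisher-Yates shuffle: the first [size] cells end up holding
     a uniformly chosen sample without replacement. *)
  for i = 0 to size - 1 do
    let j = i + Random.State.int st (range - i) in
    let tmp = arr.(i) in
    arr.(i) <- arr.(j);
    arr.(j) <- tmp
  done;
  let sub = Array.sub arr 0 size in
  Array.sort compare sub;
  sub

(* Split [n] into [k] strictly positive parts, uniformly at random. *)
let pos_split ~size:k n st =
  if k <= 0 || k > n then invalid_arg "pos_split";
  (* k-1 cut points among the n-1 gaps between n unit tokens. *)
  let cuts = range_subset ~size:(k - 1) 1 (n - 1) st in
  Array.init k (fun i ->
      let lo = if i = 0 then 0 else cuts.(i - 1) in
      let hi = if i = k - 1 then n else cuts.(i) in
      hi - lo)

(* Split [n] into [k] natural numbers: shift a positive split of n+k. *)
let nat_split ~size:k n st =
  if k <= 0 || n < 0 then invalid_arg "nat_split";
  Array.map (fun x -> x - 1) (pos_split ~size:k (n + k) st)
```

Because every subset of cut points corresponds to exactly one split, a uniform subset gives a uniform split, which is the property the distribution tables in this description check.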

# open Random_generator.Generator.Make(Random_generator.Prob_monad.Distr);;
# Gen.Nat.split ~size:3 5 |> Gen.run ~cmp:Stdlib.compare;;
- : ((int * int) * int list) list =
[((1, 21), [0; 0; 5]); ((1, 21), [0; 1; 4]); ((1, 21), [0; 2; 3]);
 ((1, 21), [0; 3; 2]); ((1, 21), [0; 4; 1]); ((1, 21), [0; 5; 0]);
 ((1, 21), [1; 0; 4]); ((1, 21), [1; 1; 3]); ((1, 21), [1; 2; 2]);
 ((1, 21), [1; 3; 1]); ((1, 21), [1; 4; 0]); ((1, 21), [2; 0; 3]);
 ((1, 21), [2; 1; 2]); ((1, 21), [2; 2; 1]); ((1, 21), [2; 3; 0]);
 ((1, 21), [3; 0; 2]); ((1, 21), [3; 1; 1]); ((1, 21), [3; 2; 0]);
 ((1, 21), [4; 0; 1]); ((1, 21), [4; 1; 0]); ((1, 21), [5; 0; 0])]

utop # Gen.Nat.pos_split ~size:3 5 |> Gen.run ~cmp:Stdlib.compare;;
- : ((int * int) * int list) list =
[((1, 6), [1; 1; 3]); ((1, 6), [1; 2; 2]); ((1, 6), [1; 3; 1]);
 ((1, 6), [2; 1; 2]); ((1, 6), [2; 2; 1]); ((1, 6), [3; 1; 1])]

Apparently there are 21 different ways to split 5 into three natural numbers, and 6 ways to split it into three strictly-positive numbers. (The pair (a, b) before an output represents the rational number a/b, the probability of observing this output.)
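These counts agree with the usual "stars and bars" formulas: there are C(n+k-1, k-1) ways to split n into k natural numbers and C(n-1, k-1) ways to split it into k strictly positive numbers, so C(7, 2) = 21 and C(4, 2) = 6 here. As a small independent check (the names `splits` and `binomial` are mine, not part of the PR), one can enumerate the splits exhaustively and compare against the formula:

```ocaml
(* All lists of [k] integers >= [low] that sum to [n], in lexicographic
   order. Exhaustive enumeration, only meant for small inputs. *)
let rec splits ~low k n =
  if k = 0 then (if n = 0 then [ [] ] else [])
  else
    List.init (max 0 (n - low + 1)) (fun i -> low + i)
    |> List.concat_map (fun first ->
           splits ~low (k - 1) (n - first)
           |> List.map (fun rest -> first :: rest))

(* Binomial coefficient C(n, k); each intermediate division is exact. *)
let binomial n k =
  if k < 0 || k > n then 0
  else begin
    let r = ref 1 in
    for i = 1 to k do
      r := !r * (n - k + i) / i
    done;
    !r
  end

let () =
  let n = 5 and k = 3 in
  (* Natural splits versus C(n+k-1, k-1). *)
  Printf.printf "%d %d\n"
    (List.length (splits ~low:0 k n))
    (binomial (n + k - 1) (k - 1));
  (* Strictly positive splits versus C(n-1, k-1). *)
  Printf.printf "%d %d\n"
    (List.length (splits ~low:1 k n))
    (binomial (n - 1) (k - 1))
```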

c-cube · 2021-05-10T00:17:47Z

That's a good idea. If I recall correctly you use this in recursive sized generators, to allocate size "tokens" between recursive calls, correct?

I wonder about the interaction with #109. It seems like this shouldn't "shrink" in any meaningful way (apart from the fact it's probably used with >>= in most cases), so minimal interaction?

gasche · 2021-05-10T23:09:13Z

> I wonder about the interaction with #109. It seems like this shouldn't "shrink" in any meaningful way (apart from the fact it's probably used with >>= in most cases), so minimal interaction?

Yes, a common way to use this (but not the only way) would be to first pick a random number k, then split the current size/fuel k-ways. The split itself cannot be shrunk, but you can re-split after shrinking k.
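To make that usage pattern concrete, here is a rough sketch of a fuel-driven generator for rose trees. Everything in it is illustrative: it uses `Random.State.t` directly instead of `Gen`, the `tree` type, `gen_tree`, the cap of 3 children and the local `pos_split` are my own assumptions rather than code from this PR, and it does no shrinking at all. Each node spends one token, picks a number of children k, and splits the remaining fuel k ways so that the total size stays bounded by the initial fuel.

```ocaml
(* Sketch under stated assumptions: Stdlib only, no shrinking. *)
type tree = Node of tree list

(* Local stand-in for a positive split: k-1 distinct cut points among
   the n-1 gaps, chosen by a partial Fisher-Yates shuffle. *)
let pos_split ~size:k n st =
  if k <= 0 || k > n then invalid_arg "pos_split";
  let gaps = Array.init (n - 1) (fun i -> i + 1) in
  for i = 0 to k - 2 do
    let j = i + Random.State.int st (n - 1 - i) in
    let tmp = gaps.(i) in
    gaps.(i) <- gaps.(j);
    gaps.(j) <- tmp
  done;
  let cuts = Array.sub gaps 0 (k - 1) in
  Array.sort compare cuts;
  Array.init k (fun i ->
      let lo = if i = 0 then 0 else cuts.(i - 1) in
      let hi = if i = k - 1 then n else cuts.(i) in
      hi - lo)

(* Generate a tree with at most [max 1 fuel] nodes. *)
let rec gen_tree fuel st =
  if fuel <= 1 then Node []
  else begin
    (* One token pays for this node; the rest goes to the children. *)
    let rest = fuel - 1 in
    (* First pick k in [0; min 3 rest] (the cap of 3 is arbitrary)... *)
    let k = Random.State.int st (min 3 rest + 1) in
    if k = 0 then Node []
    else
      (* ...then split the remaining fuel k ways, at least 1 per child. *)
      let shares = pos_split ~size:k rest st in
      Node (Array.to_list (Array.map (fun f -> gen_tree f st) shares))
  end

(* Number of nodes in a tree. *)
let rec size (Node children) =
  List.fold_left (fun acc c -> acc + size c) 1 children
```

The positive variant is used here so that every child receives at least one token to pay for its own node; with the natural variant a child could get zero fuel and the size bound would need a different accounting.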

gasche added 2 commits May 9, 2021 22:20

Gen: range_subset, array_subset

3e9ce83

Gen: {nat,pos}_split{,2}

400a695

c-cube merged commit e47b105 into c-cube:master May 15, 2021

This was referenced Sep 22, 2021

nat_split2 and pos_split2 are reversed #180

Closed

fix Gen.{nat,pos}_split{2,} #183

Merged

jmid mentioned this pull request Apr 19, 2022

Add range_subset, pos_split2, nat_split2, pos_split, nat_split to QCheck2 #238

Open
