nat_split, pos_split #114

Merged (2 commits) on May 15, 2021

gasche (Contributor) commented May 9, 2021

The goal of this PR is to implement a function that takes a natural number n and a parameter k, and splits n uniformly at random into k numbers n1..nk that sum back to n (n1 + ... + nk = n). This is useful when splitting "fuel" or "size" in a random generator among k components instead of just two. (cc @olivier-martinot, with whom I am collaborating on a random generator that would benefit.)

Two variants are provided: in nat_split, the n1..nk are natural numbers (range [0;n]); in pos_split, they are strictly positive (range [1;n], or in fact [1;n-k+1], since the other k-1 components each take at least 1).

The code for these functions is simple and relies on a generic helper that is also useful on its own. However, getting them right is tricky, and random generators are hard to test. I checked the implementation in an experimental version of my random-generator library that computes finite distributions.
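For readers who want to see how such a uniform split can be obtained, here is a minimal sketch of one standard approach. It is not taken from the PR: the names `range_subset`, `pos_split` and `nat_split` below are hypothetical stand-ins written against the standard `Random.State` module, and they return lists for simplicity. The idea is to draw k-1 distinct cut points in [1;n-1] and read the k gaps between consecutive cuts as the positive split; the natural-number variant reduces to it by splitting n+k and subtracting 1 from each part.

```ocaml
(* Hypothetical helper, not taken from the PR: draw [size] distinct integers
   from [low; high], in increasing order, every subset being equally likely.
   Uses selection sampling: scan the range once and keep each candidate with
   probability (still needed) / (candidates left). *)
let range_subset ~size low high st =
  if size < 0 || size > high - low + 1 then invalid_arg "range_subset";
  let rec go acc needed i =
    if needed = 0 then List.rev acc
    else if Random.State.int st (high - i + 1) < needed
    then go (i :: acc) (needed - 1) (i + 1)
    else go acc needed (i + 1)
  in
  go [] size low

(* Split [n] into [k] strictly positive parts: choose k-1 distinct cut
   points in [1; n-1] and return the gaps between 0, the cuts, and n. *)
let pos_split ~size:k n st =
  if k <= 0 || k > n then invalid_arg "pos_split";
  let cuts = range_subset ~size:(k - 1) 1 (n - 1) st in
  let rec gaps prev = function
    | [] -> [ n - prev ]
    | c :: rest -> (c - prev) :: gaps c rest
  in
  gaps 0 cuts

(* Split [n] into [k] naturals: adding 1 to every part is a bijection with
   the positive splits of n + k, so uniformity is preserved.
   This sketch simply rejects k = 0. *)
let nat_split ~size:k n st =
  if k <= 0 || n < 0 then invalid_arg "nat_split";
  List.map (fun part -> part - 1) (pos_split ~size:k (n + k) st)
```

For instance, `nat_split ~size:3 5 (Random.State.make_self_init ())` returns a list of three naturals summing to 5. The linear scan in `range_subset` keeps the sketch short; a real implementation may well choose a different sampling strategy.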

# open Random_generator.Generator.Make(Random_generator.Prob_monad.Distr);;
# Gen.Nat.split ~size:3 5 |> Gen.run ~cmp:Stdlib.compare;;
- : ((int * int) * int list) list =
[((1, 21), [0; 0; 5]); ((1, 21), [0; 1; 4]); ((1, 21), [0; 2; 3]);
 ((1, 21), [0; 3; 2]); ((1, 21), [0; 4; 1]); ((1, 21), [0; 5; 0]);
 ((1, 21), [1; 0; 4]); ((1, 21), [1; 1; 3]); ((1, 21), [1; 2; 2]);
 ((1, 21), [1; 3; 1]); ((1, 21), [1; 4; 0]); ((1, 21), [2; 0; 3]);
 ((1, 21), [2; 1; 2]); ((1, 21), [2; 2; 1]); ((1, 21), [2; 3; 0]);
 ((1, 21), [3; 0; 2]); ((1, 21), [3; 1; 1]); ((1, 21), [3; 2; 0]);
 ((1, 21), [4; 0; 1]); ((1, 21), [4; 1; 0]); ((1, 21), [5; 0; 0])]

# Gen.Nat.pos_split ~size:3 5 |> Gen.run ~cmp:Stdlib.compare;;
- : ((int * int) * int list) list =
[((1, 6), [1; 1; 3]); ((1, 6), [1; 2; 2]); ((1, 6), [1; 3; 1]);
 ((1, 6), [2; 1; 2]); ((1, 6), [2; 2; 1]); ((1, 6), [3; 1; 1])]

Apparently there are 21 different ways to split 5 into three natural numbers, and 6 ways to split it into three strictly-positive numbers. (The pair (a, b) before an output represents the rational number a/b, the probability of observing this output.)
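These counts agree with the usual "stars and bars" formulas: there are C(n+k-1, k-1) ways to split n into k naturals and C(n-1, k-1) ways to split it into k strictly positive numbers, which gives C(7,2) = 21 and C(4,2) = 6 for n = 5, k = 3. As an independent check, the small brute-force sketch below enumerates the splits directly. The function names are hypothetical and unrelated to the library; it assumes OCaml 4.10 or later for `List.concat_map`.

```ocaml
(* All lists of [k] naturals summing to [n], in lexicographic order. *)
let rec nat_splits k n =
  if k = 0 then (if n = 0 then [ [] ] else [])
  else
    List.concat_map
      (fun first ->
        List.map (fun rest -> first :: rest) (nat_splits (k - 1) (n - first)))
      (List.init (n + 1) Fun.id)

(* Same, keeping only the splits whose parts are all strictly positive. *)
let pos_splits k n =
  List.filter (List.for_all (fun part -> part > 0)) (nat_splits k n)

(* Binomial coefficient via C(n, k) = C(n-1, k-1) * n / k (exact division). *)
let rec binomial n k =
  if k = 0 then 1 else binomial (n - 1) (k - 1) * n / k

let () =
  Printf.printf "nat: %d (formula %d), pos: %d (formula %d)\n"
    (List.length (nat_splits 3 5)) (binomial 7 2)
    (List.length (pos_splits 3 5)) (binomial 4 2)
```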

c-cube (Owner) commented May 10, 2021

That's a good idea. If I recall correctly you use this in recursive sized generators, to allocate size "tokens" between recursive calls, correct?

I wonder about the interaction with #109. It seems like this shouldn't "shrink" in any meaningful way (apart from the fact that it's probably used with >>= in most cases), so minimal interaction?

gasche (Contributor, Author) commented May 10, 2021

> I wonder about the interaction with #109. It seems like this shouldn't "shrink" in any meaningful way (apart from the fact it's probably used with >>= in most cases), so minimal interaction?

Yes, a common way to use this (but not the only way) would be to first pick a random number k, then split the current size/fuel k-ways. The split itself cannot be shrunk, but you can re-split after shrinking k.
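To illustrate the "pick k, then split the fuel k ways" pattern, here is a small self-contained sketch of a sized generator for rose trees. Everything in it is an assumption made for illustration: the `tree` type, `gen_tree`, the cap of 4 children, and the local `pos_split` (a compact cut-point splitter written against the standard `Random.State` module, not the function added by this PR). Shrinking is not modelled.

```ocaml
type tree = Node of tree list

(* Number of nodes in a tree. *)
let rec size (Node children) =
  List.fold_left (fun acc child -> acc + size child) 1 children

(* Illustrative splitter: split [n] into [k] strictly positive parts by
   selection-sampling k-1 distinct cut points in [1; n-1]. *)
let pos_split ~size:k n st =
  if k <= 0 || k > n then invalid_arg "pos_split";
  let rec cuts acc needed i =
    if needed = 0 then List.rev acc
    else if Random.State.int st (n - i) < needed
    then cuts (i :: acc) (needed - 1) (i + 1)
    else cuts acc needed (i + 1)
  in
  let rec gaps prev = function
    | [] -> [ n - prev ]
    | c :: rest -> (c - prev) :: gaps c rest
  in
  gaps 0 (cuts [] (k - 1) 1)

(* Spend one unit of fuel on the current node, pick a number of children k
   (arbitrarily capped at 4), then split the remaining fuel k ways. *)
let rec gen_tree fuel st =
  if fuel <= 1 then Node []
  else begin
    let rest = fuel - 1 in
    let k = 1 + Random.State.int st (min rest 4) in
    Node (List.map (fun f -> gen_tree f st) (pos_split ~size:k rest st))
  end
```

Because every child receives at least one unit of fuel, a tree generated with `fuel >= 1` has exactly `fuel` nodes. Using a natural-number split instead would allow children with no fuel, which is a reasonable choice when fuel is only an upper bound on size.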
