-
Notifications
You must be signed in to change notification settings - Fork 430
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more Kolmogorov–Smirnov test #1504
Conversation
@benjamin-lieser (in case you didn't see this already)
For now, it would be helpful if you would comment out each failing parameter set.
Passing the set of tests with multiple seeds does slightly increase the robustness of those tests, though it may not be an efficient approach. Lets see how long the tests take (so far, it looks like Miri is still the slowest). |
Thanks a lot for the additional tests. I will have a look at the failing ones. I was thinking that the tests are probably better placed in the unit test section for the distributions. As it is right now, it just takes a new seed for a new parameter set, a constant seed could only improve things if we save the output from the rng, which does not seem worth it. |
136453f
to
78f6d0d
Compare
@benjamin-lieser I think it's a good idea. However, the implementation related to Kolmogorov-Smirnov might need to be moved to But it might be inconvenient to check which distributions haven't implemented the KS test. |
e899930
to
dce3073
Compare
|
||
let parameters = [ | ||
0.1_f32, 1.0, 7.5, | ||
// 1.844E+19, // fail case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case fails because your cdf does not work for values bigger than 1e8
. The problem is in the inc_gamma though, it just return 1.0 when lambda > 1e8
. Maybe you can find a different implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might not be related to the implementation of lgamma
, but rather that the probability is too small to be stored in f64
. In fact, cdf(1e8 as i64, 1e8)
can output the result 0.5000265872649796
, but cdf((1e8 - 1e5) as i64, 1e8)
rounds to 0
. I tried using f128
, but encountered linker-related issues, and even if it worked, it wouldn't be usable in the stable version.
I will also check other implementations of the CDF to see if there is a possibility of underflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cdf(2e9 as i64, 1e9)
should be 1.0
but it is 0.0
cdf(2e7 as i64, 1e7)
is correctly 1.0
because it is under the threshold where inc_gamma
stops being correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the latest commit, I copied the gamma_lr
implementation from statrs
and used it to calculate the CDF:
println!("{}", cdf(1e9 as i64, 1e9)); // 0.5000077942086617
println!("{}", cdf(2e9 as i64, 1e9)); // 1
println!("{}", cdf((1e9 - 1e4) as i64, 1e9)); // 0.3759229845139799
For reference, here is the calculation in R:
> ppois(1e9, 1e9)
[1] 0.5000084
> ppois(2e9, 1e9)
[1] 1
> ppois(1e9-1e4, 1e9)
[1] 0.3759226
I'd prefer we keep this separate; it's slow and requires a dev-dependency. You can use multiple modules. |
If we have multiple testing methods in the future, we can place the algorithm implementations in
This approach allows tests to be distributed across their respective distributions, but it will result in a larger number of files. |
If we run the |
Regarding the integrations, you're correct that this is not ideal. Also, testing using My suggestion is that we move sparkline and KS tests under the Logically we might want to rename Note: at some point, |
4035e06
to
dbfda2b
Compare
I think that's a good idea. I can move the KS tests to the |
rand_distr/tests/cdf.rs
Outdated
(1.0, 100.0), | ||
]; | ||
|
||
println!("{}", cdf(5.0, 2.0, 1.5)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is still a debug print.
By the way there is a dbg!
macro for this.
rand_distr/tests/cdf.rs
Outdated
(10.0, 0.1), | ||
(0.1, 10.0), | ||
(15.0, 20.0), | ||
// (1000.0, 0.001), // Fail case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case does not really make sense because of the small k
(the expected values is bigger than the largest finite f64), I would use (1000, 0.01)
instead which passes.
rand_distr/tests/cdf.rs
Outdated
} | ||
|
||
let parameters = [ | ||
// 0.01, // Fail case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case should also be fine to drop (for now). The distribution has very high Kurtosis, so there will a significant amount of mass in the tails.
(60, 10, 7), | ||
(70, 20, 50), | ||
(100, 50, 10), | ||
// (100, 50, 49), // Fail case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bug in the hypergeometric sampling, after this was merged this test should be included.
|
||
1.0 - gamma_lr(k as f64 + 1.0, lambda) | ||
} | ||
let parameters = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make this a test over f64
? Then 1e9
should also pass.
We have to think about how to test over different floating point, but I would do everything in f64
for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't use f64
here because its sample
method has both f64
and u64
implementations, and version 1.61.0 cannot specify which implementation to use. This causes the MSRV test to fail during compilation.
However, if I move the KS test to benches
, it will only be tested in nightly, allowing it to be performed on f64
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can change the definition of test_discrete
to replace the impl Trait with a generic
rand_distr/tests/cdf.rs
Outdated
t | ||
} | ||
|
||
fn tf(h: f64, a: f64, ah: f64) -> f64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this function? Where did the code come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above code is mentioned in the paper cited in the comment of the owen_t
function:
TF(H,A,AH) (H ≥ 0, 0 ≤ A ≤ 1). This in turn selects the appropriate method (T1,…, T6) for computing T(H,A) based on the input values of H and A according to the ranges given in Figure 2.
The paper provides Fortran code at the end, and I just translated it into Rust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could embed this function inside owen_t
then. Or it would be neater to move the lot to a new file.
rand_distr/tests/cdf.rs
Outdated
(2.0, 3.5), | ||
(10.0, 1.0), | ||
(100.0, 50.0), | ||
// (10.0, 0.1), // Fail case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also a very extreme distribution concentrated around 1.0
I will look into it in more detail later, for now we can drop this case.
@JamboChen yes, ideally the restructure is in a different PR. Since this PR already exists, it may be easiest to merge this first (without any restructuring). I'll let you resolve @benjamin-lieser's concerns first then give your changes a glance over. |
I would also decrease the sample size so the tests run quick enough. This looses us sensitivity, but it should be fine until the release and we can think about a more permanent solution. I imagine some more exhaustive tests over the parameter space, maybe some code coverage tool to make sure we get all the different sampling strategies. |
@benjamin-lieser the total run-time isn't too bad here, if tested once per CI run. My inclination is that we should move this to For now, lets just get this merged then move to a new repo and fix the CI jobs. |
Here are the runtimes for each test (as a reference for relative speed):
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good (without looking too closely).
There are a couple of disabled tests; I'll note in #357.
rand_distr/tests/cdf.rs
Outdated
t | ||
} | ||
|
||
fn tf(h: f64, a: f64, ah: f64) -> f64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could embed this function inside owen_t
then. Or it would be neater to move the lot to a new file.
Enhance Kolmogorov–Smirnov Test Coverage for Various Distributions
CHANGELOG.md
entrySummary
Enhance Kolmogorov–Smirnov Test Coverage for Various Distributions
Motivation
Improve CDF test coverage
Details
I have implemented Kolmogorov–Smirnov tests for the following distributions:
I've noticed that some samplers may not pass the test with extreme parameters (e.g.,
(10000, 0.0001)
), and I'm unsure if these should be retained in the test set. Additionally, I'm considering whether using a fixed set of seeds combined with the distribution's parameters is beneficial, as it may increase testing time.