Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rand( TDist(Inf) ) gives NaN in v0.20.0 #929

Closed
atteson opened this issue Jul 10, 2019 · 8 comments
Closed

rand( TDist(Inf) ) gives NaN in v0.20.0 #929

atteson opened this issue Jul 10, 2019 · 8 comments

Comments

@atteson
Copy link

atteson commented Jul 10, 2019

I noticed that rand( TDist(Inf) ) gives NaN in v0.20.0 but gives a value in v0.16.4 (which I assume corresponds to a standard normal variate).

pdf( TDist(Inf), 1.0 ) works fine in both versions.

@matbesancon
Copy link
Member

Several issues here, the distribution parameter should not be taking a real type but an integer value. The type parameter could capture in some sense the infinite / finite thing and dispatch accordingly.

The documentation is also pointing to the "student" wikipedia page, which is not the Student distribution page.
PR welcome

@atteson
Copy link
Author

atteson commented Jul 11, 2019

Actually, the t distribution is perfectly well defined for integer values of nu and I (and others) use it that way.

@johnczito
Copy link
Member

johnczito commented Jul 12, 2019

I think the change happened here, where

rand(d::TDist) = StatsFuns.RFunctions.tdistrand(d.ν)

became

rand(rng::AbstractRNG, d::TDist) = randn(rng) / sqrt(rand(rng, Chisq(d.ν))/d.ν)

The denominator becomes sqrt(Inf/Inf) = NaN. So do we want something like this?

rand(rng::AbstractRNG, d::TDist) = randn(rng) / ( isinf(d.ν) ? 1 : sqrt(rand(rng, Chisq(d.ν))/d.ν) )

@atteson
Copy link
Author

atteson commented Jul 12, 2019

My preference is to allow for the nu=Inf to mean the normal distribution since this is always the limiting behavior. I like the way the R function works. However, there are other limiting cases which are not handled in Distributions. For example, a normal with sigma=0 doesn't become a delta distribution.

@johnczito
Copy link
Member

Right, so you want TDist(Inf) to simply return Normal(0,1). That makes sense, though I'm not sure what the type stability implications of that are.

@atteson
Copy link
Author

atteson commented Jul 12, 2019

Actually, my intent is that TDist( Inf ) remain that type but act like Normal(0,1) in rand, pdf, cdf, etc. I think that's probably better because of type stability as you mention. Someone might assign these to a Vector{TDist} or some such.

The bigger question is "what's the policy of Distributions for edge cases"? Some approaches that I see are:

  1. Make the distribution behave as the limiting distribution in the edge cases (at least when there is a limiting distribution under no other conditions).

  2. Have the distribution fail in the edge cases.

  3. Have the functions of the distribution return NaN.

#1 is what TDist(Inf) used to do. #2 is what Normal(0,0) does. #3 is what TDist now does.

My preference is for #1 though this becomes subtle for multivariate distributions. A multivariate normal with a singular covariance matrix is a normal on a lower dimensional linear subspace.

@johnczito
Copy link
Member

That all sounds right to me. You said pdf is behaving. So for rand we could do something like this and it should work again:

rand(rng::AbstractRNG, d::TDist) = randn(rng) / ( isinf(d.ν) ? 1 : sqrt(rand(rng, Chisq(d.ν))/d.ν) )

@atteson
Copy link
Author

atteson commented Jul 12, 2019

Sounds good.

matbesancon pushed a commit that referenced this issue Jul 18, 2019
* rand that handles TDist(Inf)

* begin unit tests for TDist(Inf)

* get var to agree

* make entropy, cf, and gradlogpdf of TDist(Inf) agree with Normal(0,1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants