Type of return value of density, cdf and quantile is inconsistent #99

venpopov · 2024-04-01T15:12:05Z

While doing some testing I noticed that density, cdf and quantile sometimes return vectors, sometimes lists:

library(distributional)

density(dist_normal(), c(0,1))                     # returns list
#> [[1]]
#> [1] 0.3989423 0.2419707
density(c(dist_normal(),dist_normal()), c(0,1))    # returns list
#> [[1]]
#> [1] 0.3989423 0.2419707
#> 
#> [[2]]
#> [1] 0.3989423 0.2419707
density(dist_normal(), 0)                          # returns vector
#> [1] 0.3989423
density(c(dist_normal(),dist_normal()), 0)         # returns vector
#> [1] 0.3989423 0.3989423


cdf(dist_normal(), c(0,1))                         # returns list
#> [[1]]
#> [1] 0.5000000 0.8413447
cdf(c(dist_normal(),dist_normal()), c(0,1))        # returns list
#> [[1]]
#> [1] 0.5000000 0.8413447
#> 
#> [[2]]
#> [1] 0.5000000 0.8413447
cdf(dist_normal(), 0)                              # returns vector
#> [1] 0.5
cdf(c(dist_normal(),dist_normal()), 0)             # returns vector
#> [1] 0.5 0.5


quantile(dist_normal(), c(0,1))                    # returns list
#> [[1]]
#> [1] -Inf  Inf
quantile(c(dist_normal(),dist_normal()), c(0,1))   # returns list
#> [[1]]
#> [1] -Inf  Inf
#> 
#> [[2]]
#> [1] -Inf  Inf
quantile(dist_normal(), 0)                         # returns vector
#> [1] -Inf
quantile(c(dist_normal(),dist_normal()), 0)        # returns vector
#> [1] -Inf -Inf


# all return lists
generate(dist_normal(), 2)                         
#> [[1]]
#> [1] 1.7134674 0.4435964
generate(c(dist_normal(),dist_normal()), 2)
#> [[1]]
#> [1] -0.8288996  1.3283409
#> 
#> [[2]]
#> [1]  0.3532802 -0.7265644
generate(c(dist_normal(),dist_normal()), c(5,10))
#> [[1]]
#> [1] -0.24069096 -0.69155357  0.75666937  1.41401325 -0.07069785
#> 
#> [[2]]
#>  [1] -0.5421478 -1.2236593 -0.4537853 -0.3443427  0.2597258 -1.8010175
#>  [7]  0.8553006 -0.9100003 -0.3662697 -0.1896853

^{Created on 2024-04-01 with reprex v2.1.0}

And the expected output type is not documented for these functions. I think it would be good to have consistent output type. For example, I was surprised that density(dist_norma(), 0) returns a scalar, but that density(dist_normal(), c(0,1)) returns a list, while with two distributions and single point we get a vector.

One possibility is to always return a list, where each element corresponds to one distribution, like it is now for some of the outputs. Although I believe this would be desirable, it is a breaking change that might affect other packages. In any case, the return value type should be documented.

Additionally, it could also be nice to have an option simplify2array as an argument.

The text was updated successfully, but these errors were encountered:

mitchelloharawild · 2024-04-19T06:37:01Z

The documentation should definitely be improved to explain how the distributions and arguments are vectorised. Essentially the operation's argument (e.g. density(at=)) is vectorised across the distributions and the result is simplified if possible. While I agree that returning lists with one element is more consistent, it would be less efficient and less usable - it could be a non-default option, like drop = FALSE.

The nature of vectorisation in distributional is discussed here: Vectorisation of distributions and operation arguments #52
This specific behaviour was considered here: Vectorisation of distributions and operation arguments #52 (comment)

The issue with simplify2array() is that the vectorisation needs to work with multivariate distributions which already produces arrays.

mitchelloharawild added this to the v1.0.0 milestone Apr 19, 2024

mitchelloharawild mentioned this issue Apr 19, 2024

Write a vignette about how operations are vectorised #107

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Type of return value of density, cdf and quantile is inconsistent #99

Type of return value of density, cdf and quantile is inconsistent #99

venpopov commented Apr 1, 2024

mitchelloharawild commented Apr 19, 2024

Type of return value of density, cdf and quantile is inconsistent #99

Type of return value of density, cdf and quantile is inconsistent #99

Comments

venpopov commented Apr 1, 2024

mitchelloharawild commented Apr 19, 2024