All distributions should implement suitable statistics module traits. #276

Open
YeungOnion opened this issue Sep 5, 2024 · 2 comments

@YeungOnion
Contributor

Noticed while reviewing #275 that we're missing some of the support traits (Min, Max) for multinomial.
There may be others to find as well.

@soumyasen1809
Contributor

@YeungOnion I can try working on this one, if that is okay.

Also, one question:
Minimum Value is 0, and
Maximum Value is the total number of trials, n.
Is my understanding correct?

@YeungOnion
Contributor Author

Sorry about the delay in replying.

> Minimum Value is 0, and
> Maximum Value is the total number of trials, n.

I believe I posted the issue thinking that the above was correct: a vector of all 0 for the minimum and all n for the maximum, analogous to the impl for MultivariateNormal. However, I suppose that the meaning of Max and Min for multivariate distributions is not clear.
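To make the "all 0 or all n" reading concrete, here is a minimal sketch. The `Min` and `Max` traits below are local stand-ins assumed to resemble the ones in statistics::traits, and `ToyMultinomial` is a hypothetical simplified type, not the crate's actual multinomial.

```rust
// Local stand-ins for the Min/Max traits (assumed shape, not imported from statrs).
pub trait Min<T> {
    fn min(&self) -> T;
}
pub trait Max<T> {
    fn max(&self) -> T;
}

/// Hypothetical, simplified multinomial: `k` categories, `n` trials.
pub struct ToyMultinomial {
    k: usize,
    n: u64,
}

impl ToyMultinomial {
    pub fn new(k: usize, n: u64) -> Self {
        Self { k, n }
    }
}

impl Min<Vec<u64>> for ToyMultinomial {
    /// Componentwise lower bound: every count is at least 0.
    fn min(&self) -> Vec<u64> {
        vec![0; self.k]
    }
}

impl Max<Vec<u64>> for ToyMultinomial {
    /// Componentwise upper bound: every count is at most `n`.
    fn max(&self) -> Vec<u64> {
        vec![self.n; self.k]
    }
}
```

One thing this sketch makes visible: with more than one category and n > 0, neither returned vector is itself a possible outcome, because the counts of a multinomial sample must sum to n. That may be part of why the componentwise reading is not "obviously" the right one.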

Perhaps we could flesh out the use case of these kinds of values and then reimplement the useful trait, whether or not it maintains its name and signature. At the moment, I don't see a generic programming usage for Max and Min, i.e. I wouldn't use them as trait bounds. Further, we don't express inclusivity/exclusivity on those bounds, but I'm rambling, apologies.

--

For actionable steps, I've a list below. [1]

  • verifying that all traits defined in statistics::traits that have an "obviously" unique implementation for a given distribution under the current constraint are implemented as such for each distribution
  • enumerating instances where there are implementations of traits in statistics::traits that are not "obviously" unique
    • clearly denoting the ambiguity and the behavior chosen in the docs at the impl
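One possible way to do the verification step mechanically is a compile-time check that fails to build when an expected impl is missing. This is only a sketch: the traits are local stand-ins for those in statistics::traits, and `ToyUniform`, `assert_has_bounds`, and `check_all` are hypothetical names.

```rust
// Local stand-ins for the Min/Max traits (assumed shape, not imported from statrs).
pub trait Min<T> {
    fn min(&self) -> T;
}
pub trait Max<T> {
    fn max(&self) -> T;
}

/// Hypothetical univariate distribution on the interval [lo, hi].
pub struct ToyUniform {
    lo: f64,
    hi: f64,
}

impl ToyUniform {
    pub fn new(lo: f64, hi: f64) -> Self {
        Self { lo, hi }
    }
}

impl Min<f64> for ToyUniform {
    fn min(&self) -> f64 {
        self.lo
    }
}

impl Max<f64> for ToyUniform {
    fn max(&self) -> f64 {
        self.hi
    }
}

/// Has no runtime effect; it only compiles if `D` implements both traits for `T`.
pub fn assert_has_bounds<D, T>()
where
    D: Min<T> + Max<T>,
{
}

/// One line per distribution; a missing impl becomes a compile error here.
pub fn check_all() {
    assert_has_bounds::<ToyUniform, f64>();
}
```

A check like this only covers the first bullet (that an impl exists). It says nothing about whether the chosen behavior is unambiguous, which still needs the docs at the impl.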

As an open question for some design, how might we reformulate this API to express something that users or contributors to statrs may wish to generically program over or use? Just to toss my own ideas into the air,

  • bounds checking in pdf and cdf to test if a value is between max and min would be supported by the existing traits,
    • expressing that a value is within the sample space is possible with max and min but a notion of pdf "support" could fill the bounds checking need
  • observing extreme values, like maximums or minimums from samples could be of interest as well.
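As a rough sketch of the "support" idea, the following uses `std::ops::Bound` so that inclusivity and exclusivity are part of the returned value. The `Support` trait, `ToyExponential`, and `OpenUnit` are hypothetical names for illustration and are not part of the existing API.

```rust
use std::ops::{Bound, RangeBounds};

/// Hypothetical trait: the support of a univariate density, with explicit
/// inclusive/exclusive endpoints.
pub trait Support {
    fn support(&self) -> (Bound<f64>, Bound<f64>);

    /// Bounds check built on the standard `RangeBounds::contains`.
    fn in_support(&self, x: f64) -> bool {
        self.support().contains(&x)
    }
}

/// Hypothetical exponential distribution with the given rate; support is [0, inf).
pub struct ToyExponential {
    pub rate: f64,
}

impl Support for ToyExponential {
    fn support(&self) -> (Bound<f64>, Bound<f64>) {
        (Bound::Included(0.0), Bound::Unbounded)
    }
}

impl ToyExponential {
    /// Density with the bounds check expressed through `in_support`.
    pub fn pdf(&self, x: f64) -> f64 {
        if self.in_support(x) {
            self.rate * (-self.rate * x).exp()
        } else {
            0.0
        }
    }
}

/// Hypothetical distribution whose support is the open interval (0, 1).
pub struct OpenUnit;

impl Support for OpenUnit {
    fn support(&self) -> (Bound<f64>, Bound<f64>) {
        (Bound::Excluded(0.0), Bound::Excluded(1.0))
    }
}
```

Compared with separate max and min values, a single support value like this could be used as a trait bound for generic bounds checking. It is still univariate only, so it does not settle what the multivariate case should look like.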

Feel free to rename or open a new issue for that part as well. Thanks for your help!

Footnotes

  1. I use the word "obvious" to convey a very low degree of unexpectedness, or very little ambiguity in behavior, when a user sees the trait with its trait-level docs and sees that a given type implements it, but does not see the implementation or the impl-level docs.
