This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

PDF operators for each distribution #14579

Closed
wants to merge 1 commit into apache:master from david-seiler:v1.3.x

Conversation

david-seiler
Contributor

Description

This PR adds operators for computing the densities of samples drawn from any of the various distributions defined in operator/random, as well as their gradients. There are lots of changes to test_random.py to test each PDF alongside its distribution; aside from that, the patch should be entirely stand-alone. See pdf_op.cc for more-detailed description strings.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

produces output of dimension *n+1* such that each *n*-dimensional index *i*
in the output array holds the PDFs of the samples at index *i* in *sample*,
parameterized by the values of *low* and *high* at index *i*.
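For illustration, a minimal NumPy sketch (not part of the patch, and not the MXNet API) of the shape convention described above, using made-up values for a pair of uniform distributions:

```python
import numpy as np

# Parameters of shape (2,): two uniform distributions, U(0, 1) and U(0, 10).
low  = np.array([0.0, 0.0])
high = np.array([1.0, 10.0])

# Samples of shape (2, 3): three samples per distribution, i.e. one more
# dimension than the parameter arrays.
sample = np.array([[0.2, 0.5, 0.9],
                   [2.0, 5.0, 9.0]])

# The output has the same shape as `sample`: entry [i, j] is the density of
# sample[i, j] under the distribution parameterized by low[i] and high[i].
pdf = 1.0 / (high - low)[:, None] * ((sample >= low[:, None]) & (sample <= high[:, None]))
print(pdf)
# [[1.  1.  1. ]
#  [0.1 0.1 0.1]]
```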

Contributor

A special case is that sample is also an n-dimensional tensor, which means that we have exactly one sample per distribution. In that case, the output will also be an n-dimensional tensor. I think the code does (and should) support this and that this is an important use case, so it should be documented here.

Contributor Author

Called that out explicitly.
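For reference, the special case called out above (one sample per distribution, so sample has the same shape as the parameters and so does the output), again as a plain NumPy sketch rather than the actual operator:

```python
import numpy as np

low  = np.array([0.0, 0.0])
high = np.array([1.0, 10.0])

# One sample per distribution: `sample` is n-dimensional like the parameters,
# and the output keeps that same shape.
sample = np.array([0.5, 5.0])
pdf = 1.0 / (high - low) * ((sample >= low) & (sample <= high))
print(pdf)  # [1.  0.1]
```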


Examples::

random_pdf_uniform(sample=[[1,2,3,4]], low=[0], high=[10]) = [0.1 0.1 0.1 0.1]

Contributor

Commas missing in the output array.


random_pdf_exponential(sample=[[1, 2, 3]], lam=[1]) =
[[0.36787945 0.13533528 0.04978707]]

Contributor

We should have consistent types of examples for all distributions.

  1. single distribution, multiple samples
  2. multiple distributions, multiple samples
  3. multiple distributions, single sample per distribution

Contributor Author

Added second examples for the exponential and Dirichlet distributions.
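The documented example values can be sanity-checked against scipy (a reference computation of the same densities, not the operators added in this PR; this mirrors the kind of check one might use in test_random.py):

```python
from scipy import stats

# Uniform example: one U(0, 10) distribution, four samples.
print(stats.uniform(loc=0, scale=10).pdf([1, 2, 3, 4]))
# [0.1 0.1 0.1 0.1]

# Exponential example: one distribution with rate lam = 1
# (scipy parameterizes by scale = 1 / lam).
print(stats.expon(scale=1.0).pdf([1, 2, 3]))
# [0.36787944 0.13533528 0.04978707]
```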

bool is_log;
DMLC_DECLARE_PARAMETER(PdfParam) {
  DMLC_DECLARE_FIELD(is_log).set_default(false)
    .describe("Whether to compute the log PDF or not.");

Contributor

Maybe that can be phrased a bit better.

Contributor Author

It's now "If set, compute the density of the log-probability instead of the probability."
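A small numerical illustration of what the flag toggles, using scipy as a stand-in for the operator (the `is_log` name comes from this patch; the scipy calls are only a reference):

```python
import numpy as np
from scipy import stats

sample = np.array([1.0, 2.0, 3.0])
dist = stats.expon(scale=1.0)  # exponential with rate lam = 1

pdf = dist.pdf(sample)
log_pdf = dist.logpdf(sample)

# With is_log set, the operator returns log(pdf) instead of pdf, which is
# numerically safer for very small densities.
assert np.allclose(log_pdf, np.log(pdf))
print(log_pdf)  # [-1. -2. -3.]
```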

template<typename DType, typename IType1, typename IType2>
MSHADOW_XINLINE static void Map(int start, int length, int sample_size, int index,
                                DType *out, IType1 *sample, IType2 *lower, IType2 *upper) {
  const DType l(lower[index]), h(upper[index]);

Contributor

Providing "index" to this and all similar functions below is redundant. According to the processing in LaunchExWrapper it always hold that index == start/sample_size. Still we may want to keep "index" as an explicit parameter for clarity. So o.k. with either choice.

Contributor Author

It's gone.
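A sketch (in Python, for readability) of the indexing relation discussed above, assuming each Map call starts at the beginning of a contiguous block of sample_size samples; the names follow the snippet, not the actual LaunchExWrapper code:

```python
# Flattened layout: distribution i owns samples
# [i * sample_size, (i + 1) * sample_size).  Because each call starts at the
# beginning of such a block, the distribution index can be recovered from the
# start offset alone, which is why the explicit `index` parameter was dropped.
def map_block(start, length, sample_size, out, sample, lower, upper):
    index = start // sample_size  # == the removed `index` argument
    l, h = lower[index], upper[index]
    for j in range(start, start + length):
        out[j] = 1.0 / (h - l) if l <= sample[j] <= h else 0.0
```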

@asmushetzel
Contributor

It should be noted that this PR allows computing both the PDF and the log-PDF, and that it also adds the PDF of the Dirichlet distribution, which does not yet exist in MXNet.
For consistency, we should also add a sampler for Dirichlet distributions after this PR gets merged.
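For reference, the Dirichlet density being added here can be cross-checked against scipy.stats.dirichlet (a reference implementation, not the new operator); the concentration values below are made up:

```python
import numpy as np
from scipy import stats

alpha = np.array([2.0, 3.0, 5.0])  # concentration parameters
x = np.array([0.2, 0.3, 0.5])      # a point on the simplex (sums to 1)

print(stats.dirichlet(alpha).pdf(x))     # ≈ 8.505
print(stats.dirichlet(alpha).logpdf(x))  # ≈ 2.141, cf. the is_log flag
```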

@asmushetzel
Contributor

This addresses several requests from issue #12932

)code");
}

inline std::string dirichlet_desc() {

Member

I don't think we have a function to draw samples from a Dirichlet distribution? This only computes the PDF of the distribution?

Contributor

Yep, the sampler does not exist yet. As noted in my previous comments, we should add it. We may have the bandwidth to do so in our team (and will actually need it). But we should not make the existence of a sampler a precondition for this PR, as having the PDF is beneficial by itself (and we are already using it in a project).
My hope is that by adding serious functionality to the random namespace in MXNet, we get more people interested (for example, in supplying CDFs as well).
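Until an MXNet Dirichlet sampler exists, one stop-gap is to draw the samples outside MXNet, e.g. with numpy.random.dirichlet; this is only a sketch of a possible workflow, not part of the PR:

```python
import numpy as np

alpha = np.array([2.0, 3.0, 5.0])

# Draw 4 samples from Dir(alpha); each row lies on the simplex.
samples = np.random.dirichlet(alpha, size=4)
print(samples.shape)        # (4, 3)
print(samples.sum(axis=1))  # [1. 1. 1. 1.]
```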

Contributor Author

Updated the commit message to clarify that, unlike the other distributions, we don't have a Dirichlet sampler yet.

@marcoabreu
Contributor

Is this PR pointing against the 1.3 branch on purpose?

@david-seiler force-pushed the v1.3.x branch 2 times, most recently from a2e601c to 971ae83 on April 2, 2019, 15:37
@piyushghai
Contributor

@david-seiler Seems like you want to add a particular feature to a previous version of MXNet.
The correct process to do this would be:

  • Raise a PR against the master branch.
  • After that PR is merged into master, cherry-pick the same commit in another PR raised against the v1.3.x (or a different version) branch.

@mxnet-label-bot Add [pr-awaiting-review, operator]

@marcoabreu added the Operator and pr-awaiting-review (PR is waiting for code review) labels on Apr 2, 2019
@david-seiler
Contributor Author

> @david-seiler Seems like you want to add a particular feature to a previous version of MXNet.

I started with 1.3.x because it's what we're using internally (for now); I'll close this in favor of a new PR against master once I've updated to build against master and addressed the rest of the review comments.

…r (plus also the PDF of the Dirichlet). Supports probabilities and log-probabilities, as well as gradients.
@david-seiler
Contributor Author

This PR now correctly targets master -- still from a branch named v1.3.x on my side, since I couldn't see how to reset that, but that branch now follows apache:master with this one commit on top. Sorry for the confusion, but it seems better on net than throwing away the existing conversation.

@david-seiler
Contributor Author

The tests in Jenkins have been stuck ever since I switched this PR to master, and I haven't found a way to unstick them, so please join me in a new PR of the same code: #14617

All the comments from this PR should be addressed there.

Labels
Operator, pr-awaiting-review (PR is waiting for code review)
5 participants