Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix logpdf_grad for BroadcastedNormal. #433

Merged
merged 20 commits into from
Aug 2, 2021

Conversation

MrVPlusOne
Copy link
Contributor

As requested by #432, this PR fixes the gradient of BroadcastedNormal for the non-scalar cases.

Now it should correctly handle non-scalar arguments.
@ztangent
Copy link
Member

Thanks for making this PR! I'm not sure why the Travis build errored, but separate of that, it'd great if you could adjust the implementation to be overflow-safe in line with #321!

Copy link
Contributor

@bzinberg bzinberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MrVPlusOne! A couple of initial comments now; I'll come back later to suggest unit / regression tests.

src/modeling_library/distributions/normal.jl Outdated Show resolved Hide resolved
src/modeling_library/distributions/normal.jl Outdated Show resolved Hide resolved
Comment on lines 129 to 131
@test isapprox(actual[1], finite_diff(f, args, 1, dx; broadcast=true))
@test isapprox(actual[2], finite_diff(f, args, 2, dx; broadcast=true))
@test isapprox(actual[3], finite_diff(f, args, 3, dx; broadcast=true))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Contributor Author

@MrVPlusOne MrVPlusOne Jul 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually a little unsure how to fix these tests. It seems that the first argument of isapprox is a zero-dimensional array (which I believe is the expected behavior) but the second argument is a Float64, which causes Julia to complain that no method of isapprox matches the given argument types. Any suggestions on how to fix this?

Copy link
Contributor

@bzinberg bzinberg Jul 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After investigating this a bit, I think it boils down to JuliaLang/julia#28866, which I consider to be a bug in Julia. Yes -- in the version of the code before this PR, both the LHS and the RHS should be 0-dimensional arrays. For the RHS, the issue appears to be that these lines

Gen.jl/test/runtests.jl

Lines 22 to 23 in d79aded

pos_args[i] = copy(args[i]) .+ dx
neg_args[i] = copy(args[i]) .- dx

do ::Aray{T,0} .+ ::T and ::Aray{T,0} .- ::T, which should give an ::Array{T,0} but due to JuliaLang/julia#28866 gives a scalar. So the LHS of the ./ on the following line should have an Array{T,0} as the ith argument to f, thus (since broadcast = true) should output an Array{T,0}, and consequently the division operation should be ::Array{T,0} ./ ::T which should output an Array{T,0}, but that is not what happens.

I think the cleanest way to fix this is to add a workaround to the (test-only) function finite_diff so that it handles zero-dimensional array valued arguments correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm currently drafting this.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, in addition to the above, the function f being supplied to finite_diff does not satisfy the requirements in its docstring: it takes array-valued arguments, but is not the broadcast of a function that takes scalar-valued arguments (i.e. logpdf_grad(vector-valued distribution parameters) is not a broadcast of a bunch of logpdf_grad(scalar-valued distribution parameters)'s). So I guess we should add a variant of finite_diff that has the semantics we need.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think to properly test this, we will need a more general finite_diff function or use an autodiff library implementation as a reference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you clearly trust autodiff more than I do 🙂

@bzinberg bzinberg changed the title Update logpdf_grad for BroadcastedNormal. Fix logpdf_grad for BroadcastedNormal. Jul 14, 2021
@bzinberg
Copy link
Contributor

bzinberg commented Jul 14, 2021

@MrVPlusOne, I've drafted a fix to the unit test breakage. Take a look and LMKWYT?

@MrVPlusOne
Copy link
Contributor Author

@bzinberg Thanks! The fix looks good to me!

@bzinberg
Copy link
Contributor

@MrVPlusOne, I've made a few more changes that are mostly by way of tidying.

  • Rename the vars in the unbroadcast helper functions to generically describe what they do, rather than coupling them to how they're used in logpdf_grad; e.g., unbroadcast_for_arg(arg, grad) becomes _unbroadcast_like(a, full_arr)
  • Add some documentation, mainly for what "unbroadcast" means since I can imagine that being ambiguous or confusing to a reader who doesn't have the same context we have
  • A bit of code tidying without changing the logic
  • Line length and indentation

The one functional change I made was to add to a couple of the unit tests an explicit check on the shape of the returned arrays.

I think regression testing is covered by the shape checks that were fixed in e712c71. They passed before, but that is because the unit tests were incorrect.

@MrVPlusOne, any comments you'd like to make on the above? If looks good to you, I think this is ready to merge.

@MrVPlusOne
Copy link
Contributor Author

Thanks for the cleaning up; they all look good to me! I think one more nice thing to have would be to add a unit test for the case where the two arguments of broadcasted_norm have different numbers of dimensions. We currently don't have a test for that, right?

@bzinberg
Copy link
Contributor

Great idea @MrVPlusOne, how's this?

@MrVPlusOne
Copy link
Contributor Author

Thanks, looks good to me!

@bzinberg bzinberg merged commit acd7005 into probcomp:master Aug 2, 2021
@bzinberg
Copy link
Contributor

bzinberg commented Aug 2, 2021

Thanks @MrVPlusOne for using and improving Gen!

@bzinberg bzinberg linked an issue Aug 4, 2021 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

logpdf_grad(::BroadcastedNormal, ...) sums over the wrong dims
3 participants