Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't try and convert to FloatX except if AbstractFloat (option 2) #418

Closed
wants to merge 2 commits into from

Conversation

oxinabox
Copy link
Member

Alternative to #417
which also says to leave integers alone as well.
Which @DhairyaLGandhi likes the idea of.
(I can see that it might lead to some faster code and avoid allocations)

One option to fix the issues with tying to use ForwardDiff on a rrule pullback that makes use of ProjectTo
See discussion on JuliaDiff/ForwardDiff.jl#538.

Which fixes the following test that fails in FluxML/Zygote.jl#1035

  using Zygote
  dx, dy = diaghessian(f34, xs, y)
  @test size(dx) == size(xs)
  @test vec(dx)  diag(hessian(x -> f34(x,y), xs))
  @test dy  hessian(y -> f34(xs,y), y)

This option basically says:
All subtypes of Real represent the same subspace: all of Real.
So if you give me one that is not a AbstractFloat subtype (.e.g Integer or Dual) I am just going to leave it alone.
But if you give me a AbstractFloat, I will fix the precision to match what i saw in the primal pass.

@codecov-commenter
Copy link

codecov-commenter commented Jul 27, 2021

Codecov Report

Merging #418 (bd91dcb) into master (424a0b7) will increase coverage by 0.13%.
The diff coverage is 100.00%.

❗ Current head bd91dcb differs from pull request most recent head 325480d. Consider uploading reports for the commit 325480d to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #418      +/-   ##
==========================================
+ Coverage   92.25%   92.39%   +0.13%     
==========================================
  Files          14       14              
  Lines         775      776       +1     
==========================================
+ Hits          715      717       +2     
+ Misses         60       59       -1     
Impacted Files Coverage Δ
src/projection.jl 96.09% <100.00%> (+0.53%) ⬆️
src/accumulation.jl 97.14% <0.00%> (-0.08%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 424a0b7...325480d. Read the comment docs.

Comment on lines 155 to 156
(::ProjectTo{<:Number})(dx::Number) = dx
(::ProjectTo{<:Real})(dx::Real) = dx
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't these work already, via the generic fallback, ProjectTo{T}(dx::T)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems not.
Without this it breaks.
Which is weird as the method error says the method is there.

LinearAlgebra: dense structured matrices: Error During Test at /home/oxinabox/JuliaEnvs/ChainRulesWorld/ChainRulesCore.jl/test/projection.jl:175
  Test threw exception
  Expression: pherm(reshape(1:9, 3, 3) .+ im) == [1.0 3.0 5.0; 3.0 5.0 7.0; 5.0 7.0 9.0]
  MethodError: no method matching (::ProjectTo{ComplexF64, NamedTuple{(), Tuple{}}})(::Complex{Int64})
  Closest candidates are:
    (::ProjectTo{T, D} where D<:NamedTuple)(::T) where T at /home/oxinabox/JuliaEnvs/ChainRulesWorld/ChainRulesCore.jl/src/projection.jl:111
    ```

src/projection.jl Outdated Show resolved Hide resolved
src/projection.jl Show resolved Hide resolved
src/projection.jl Outdated Show resolved Hide resolved
src/projection.jl Outdated Show resolved Hide resolved
src/projection.jl Outdated Show resolved Hide resolved
@oxinabox oxinabox changed the title Don't try and convert to FloatX except if Integer or AbstractFloat (option 2) Don't try and convert to FloatX except if AbstractFloat (option 2) Jul 27, 2021
@mcabbott
Copy link
Member

NB: This isn't just an alternative implementation of #417 to address edge cases like JuliaDiff/ForwardDiff.jl#538 with strange number types. It also completely removes the idea that gradients of Integers should be floats, on which I thought there was consensus.

We can discuss of course, but we did already discuss at length. What's the argument for changing this?

@oxinabox
Copy link
Member Author

NB: This isn't just an alternative implementation of #417 to address edge cases like JuliaDiff/ForwardDiff.jl#538 with strange number types. It also completely removes the idea that gradients of Integers should be floats, on which I thought there was consensus.

We can discuss of course, but we did already discuss at length. What's the argument for changing this?

Correct, sorry for the incorrect title. Fixed now

I am happy with option 1 or option 2.
I openned this option 2 on the idea that if we are willing to treat any thing that said it was a Real subtypes a being good enough for any other real subtype.
Then why not include integer in that defintion.

But I am more than happy to stick to option 1.
We can come back to this if we want to later after option 1 is in.

Apply suggestions from code review

Co-authored-by: Michael Abbott <[email protected]>
remove redundant real
@mcabbott
Copy link
Member

My vote is option 1 then, with the tidier code here copied back & a few lines like

(::ProjectTo{T})(dx::Integer) where T<:AbstractFloat = convert(T, dx)

@oxinabox
Copy link
Member Author

I like that also. Good.

@oxinabox oxinabox closed this Jul 27, 2021
@mcabbott
Copy link
Member

any thing that said it was a Real subtypes a being good enough for any other real subtype. Then why not include integer in that defintion.

For reference, I think the argument we had was the integers really are special. It's a compromise between knowing that they are really categorical, and every calculus 101 lecture which uses them as representatives of the real line, which happen to be quick to write on the blackboard. In this lecture there is certainly no expectation that the gradients also be integers; but there is an expectation they be real.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants