Add rules for dense matrix exponential #351

sethaxen · 2021-01-18T11:33:09Z

Fixes #331

Because I plan to add a number of matrix functions, and they are long, I have added these rules to a new file, matfun.jl.
These rules unfortunately require quite a bit of code duplicated from LinearAlgebra. Following the discussion in JuliaLang/julia#5840, that duplication is only avoidable if we refactor these functions in LinearAlgebra to return their intermediates.

sethaxen · 2021-01-18T11:34:11Z

src/rulesets/LinearAlgebra/matfun.jl

+# NOTE: for matrix functions whose power series representation has real coefficients,
+# the pullback and pushforward are related by an adjoint.
+# Specifically, if the pushforward of f(A) is (f_*)_A(ΔA), then the pullback at Y=f(A) is
+# (f^*)_Y(ΔY) = (f_*)_{A'}(ΔY) = ((f_*)_A(ΔY'))'


oxinabox · 2021-01-19T11:34:45Z

Well that is hideous, but notation is hard, and harder in unicode.

Unicode's missing subscripts make it extra hard.

Idea that might not be worth doing:
What if we just made a section for this in the docs (maybe as internal notes or something),
wrote the LaTeX there, and then linked to that?

But yeah, notation for pullbacks and pushforwards is hard.
It has to convey so much state.

oxinabox · 2021-01-19T12:06:30Z

This bit makes sense:
(f^*)_Y(ΔY) = ((f_*)_A(ΔY'))'
so the pullback at A, i.e. the pullback from Y (though that's not well defined since not all functions are monotonic?),
is equal to the adjoint of the pushing forward at A, the adjoint of the output sensitivity.
The fact that that is also equal to (f_*)_{A'}(ΔY) is pretty magic.

Magical exponential symmetry? (I feel like I made the same surprised sounds for the same reason on your last PR.)

sethaxen · Jan 19, 2021

> What if we just made a section for this in the docs, (maybe as internal notes or something)
> and wrote the latex and then linked to that?

Hm, that's an idea. I'll consider it, potentially for a future PR.

> This bit makes sense:
> (f^*)_Y(ΔY) = ((f_*)_A(ΔY'))'
> so the pullback at A, i.e. the pullback from Y...is equal to the adjoint of the pushing forward at A, the adjoint of the output sensitivity.

Ah yes, your description is correct (although it's the adjoint of the pushing forward of the adjoint). I just checked Lee, and this should be the right notation:

Suggested change

-# (f^*)_Y(ΔY) = (f_*)_{A'}(ΔY) = ((f_*)_A(ΔY'))'
+# (f^*)_A(ΔY) = (f_*)_{A'}(ΔY) = ((f_*)_A(ΔY'))'

> (though that's not well defined since not all functions are monotonic?)

I'm not sure what you mean by this.

> Magical exponential symmetry? (I feel like I made the same surprised sounds for the same reason on your last PR.)

It's still surprising to me. This property holds for all of the matrix functions defined in LinearAlgebra, not just exp. It doesn't hold for all matrix functions, though, just those whose convergent power series have real coefficients.
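For readers following the thread, here is a short sketch of why the relation holds when the power series has real coefficients. The symbols are assumptions of this sketch rather than notation from the PR: $c_k$ are the series coefficients, $L_f(A, E)$ stands for the pushforward $(f_*)_A(E)$, $X^H$ is the conjugate transpose (written `'` in the code comment), and the inner product is taken to be $\langle X, Y \rangle = \operatorname{Re}\operatorname{tr}(X^H Y)$.

```latex
\begin{aligned}
f(A) &= \sum_{k \ge 0} c_k A^k, \qquad c_k \in \mathbb{R}, \\
L_f(A, E) &= \sum_{k \ge 1} c_k \sum_{j=0}^{k-1} A^j E A^{k-1-j}, \\
\langle Y, L_f(A, E) \rangle
  &= \sum_{k \ge 1} c_k \sum_{j=0}^{k-1}
     \operatorname{Re}\operatorname{tr}\!\left( Y^H A^j E A^{k-1-j} \right) \\
  &= \sum_{k \ge 1} c_k \sum_{j=0}^{k-1}
     \operatorname{Re}\operatorname{tr}\!\left(
       \left[ (A^H)^j \, Y \, (A^H)^{k-1-j} \right]^H E \right) \\
  &= \langle L_f(A^H, Y), E \rangle, \\
L_f(A^H, Y) &= \left( L_f(A, Y^H) \right)^H .
\end{aligned}
```

The third line uses the cyclic property of the trace. Pulling $c_k$ through the first slot of the inner product is the step that needs $c_k$ to be real, and the last line follows by taking the conjugate transpose term by term and reindexing $j \mapsto k-1-j$.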

src/rulesets/LinearAlgebra/matfun.jl

oxinabox

I started to comment on the formatting, but it's a bit much.
I think just run https://github.com/domluna/JuliaFormatter.jl/ over it:
format_file("matfun.jl", BlueStyle())
It's pretty good.

I will review after that, since then I won't be spending time on the basic stuff.

src/rulesets/LinearAlgebra/matfun.jl

sethaxen · 2021-01-18T19:48:16Z

> I started to comment on the formatting, but it's a bit much.

Yeah, it's the formatting used by exp! in LinearAlgebra, which is not great.

> I think just run https://github.com/domluna/JuliaFormatter.jl/ over it
> format_file("matfun.jl", BlueStyle())

Done!

codecov-io · 2021-01-18T19:51:30Z

Codecov Report

Merging #351 (8c27276) into master (9004ee0) will decrease coverage by 11.21%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           master     #351       +/-   ##
===========================================
- Coverage   97.64%   86.43%   -11.22%     
===========================================
  Files          18       19        +1     
  Lines        1231     1172       -59     
===========================================
- Hits         1202     1013      -189     
- Misses         29      159      +130

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/ChainRules.jl | `66.66% <ø> (-33.34%)` | ⬇️ |
| src/rulesets/LinearAlgebra/matfun.jl | `100.00% <100.00%> (ø)` | |
| src/rulesets/LinearAlgebra/symmetric.jl | `83.15% <100.00%> (-15.55%)` | ⬇️ |
| src/rulesets/Base/evalpoly.jl | `0.00% <0.00%> (-97.68%)` | ⬇️ |
| src/rulesets/Base/utils.jl | `0.00% <0.00%> (-80.00%)` | ⬇️ |
| src/rulesets/Statistics/statistics.jl | `66.66% <0.00%> (-23.34%)` | ⬇️ |
| src/rulesets/LinearAlgebra/utils.jl | `66.66% <0.00%> (-20.00%)` | ⬇️ |
| src/rulesets/LinearAlgebra/structured.jl | `92.04% <0.00%> (-6.84%)` | ⬇️ |
| ... and 10 more | | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

sethaxen · 2021-01-20T05:57:23Z

@oxinabox I added a _matfun_frechet_adjoint, changed the signature to put the differential first, and substantially modified the docstrings and comments. Would you mind re-reviewing the comments and docstrings at the top of matfun.jl before I merge?
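As an illustration of the adjoint relation discussed in the review thread, the following is a minimal numerical sketch for `exp`. It is not the PR's implementation: `frechet_exp` is a hypothetical helper that reads the Fréchet derivative off the upper-right block of `exp([A E; 0 A])`, and the test matrices are arbitrary values chosen for the example.

```julia
using LinearAlgebra

# Hypothetical helper (not the PR's `_matfun_frechet`): Fréchet derivative of
# `exp` at A in direction E, taken from the upper-right block of exp([A E; 0 A]).
function frechet_exp(A::AbstractMatrix, E::AbstractMatrix)
    n = size(A, 1)
    M = [A E; zero(A) A]
    return exp(M)[1:n, (n + 1):(2n)]
end

# Arbitrary complex test data (assumed values, for illustration only).
A = [0.1+0.2im 0.2; -0.3 0.4-0.1im]
E = [1.0 -2.0im; 0.5 0.3]      # input tangent ΔA
Y = [0.7 0.1im; -0.4 2.0]      # output cotangent ΔY

# Adjoint identity: ⟨ΔY, L_A(ΔA)⟩ == ⟨L_{A'}(ΔY), ΔA⟩ with ⟨X, Z⟩ = real(tr(X'Z)).
lhs = real(dot(Y, frechet_exp(A, E)))
rhs = real(dot(frechet_exp(Matrix(A'), Y), E))

# Second form of the pullback: L_{A'}(ΔY) == (L_A(ΔY'))'.
pullback_1 = frechet_exp(Matrix(A'), Y)
pullback_2 = frechet_exp(A, Matrix(Y'))'

println(isapprox(lhs, rhs; rtol=1e-10))
println(isapprox(pullback_1, pullback_2; rtol=1e-10))
```

The block-matrix formula is used here only because it is short; it exponentiates a matrix of twice the size, so it is a check on the identity and not a suggestion for how the rule should be computed.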

oxinabox

LGTM

888: Remove rules for matrix exponential r=DhairyaLGandhi a=sethaxen

JuliaDiff/ChainRules.jl#351 added rules for the dense matrix exponential to ChainRules. This PR removes the corresponding adjoint from Zygote.

Co-authored-by: Seth Axen <[email protected]>

sethaxen added 12 commits January 18, 2021 02:44

Add matfun.jl file

6e8358c

Add matfun docstrings

f987f35

Add exp matrix function

e6b92c9

At least store one intermediate

7624ee5

Test exp!

a7792a5

Make pullback type-inferrable

6d6b4cb

Add clearer test label

3645d75

Create as hermitian

937f2ac

Test rrule

b48204c

Add comment about relationship between pushforward and pullback

2c19bba

Add header

58f6005

Add reference to Frechet deriv paper

6ee1759

oxinabox reviewed Jan 18, 2021

sethaxen added 3 commits January 18, 2021 11:31

Run JuliaFormatter

b1a2980

Reduce comment spacing from code

e860b3e

Update src/rulesets/LinearAlgebra/matfun.jl

8f665ac

sethaxen added 4 commits January 18, 2021 13:04

Correctly handle balancing

9e565ae

Test imbalanced matrix A

71134fd

Increment version number

bd48565

Merge branch 'exp2' of https://github.com/sethaxen/ChainRules.jl into…

062b11d

… exp2

oxinabox approved these changes Jan 19, 2021

sethaxen and others added 4 commits January 19, 2021 11:49

Apply suggestions from code review

dc1b1ab

Co-authored-by: Lyndon White <[email protected]>

Change signature of _matfun_frechet

57aea17

Give math for Frechet derivative

e2e6605

Change Frechet notation

976af09

sethaxen added 11 commits January 19, 2021 17:17

Add _matfun_frechet_adjoint

d7d20ba

Simplify hermitian code

b0ae61c

Correct comment

62b963b

Remove comments

87e4c53

Use abbreviated SHA

9bd06b1

Link

2ed06e6

Update comment

156e6f5

Move comment up

9a63d13

Move comment further up

5ba193d

Update docstrings

49df929

Push header to same level as rules

8c27276

oxinabox approved these changes Jan 20, 2021

sethaxen merged commit b92da50 into JuliaDiff:master Jan 20, 2021

sethaxen deleted the exp2 branch January 20, 2021 18:34

sethaxen mentioned this pull request Jan 20, 2021

Remove rules for matrix exponential FluxML/Zygote.jl#888

Merged
