-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Accumulate NamedTuple + Tangent #88
Conversation
fc55c22
to
d50f6e8
Compare
Codecov Report
@@ Coverage Diff @@
## main #88 +/- ##
==========================================
- Coverage 53.89% 51.69% -2.20%
==========================================
Files 21 21
Lines 2171 2124 -47
==========================================
- Hits 1170 1098 -72
- Misses 1001 1026 +25
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
@@ -5,11 +5,25 @@ struct DiffractorRuleConfig <: RuleConfig{Union{HasReverseMode,HasForwardsMode}} | |||
@Base.constprop :aggressive accum(a::Tuple, b::Tuple) = map(accum, a, b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get some errors from this adding tuples of mis-matched lengths, which I haven't tracked down. Not touched in this PR.
What I don't know here is whether accum
is only ever called on a single tangent, or also on a tuple of a function's arguments. IIRC there is no distinction in Zygote, but in CR the former can be Array or Tangent or NoTangent, but the latter can only ever be a Tuple (of the right length).
Base.real(z::ZeroTangent) = z # TODO should be in CRC | ||
Base.real(z::NoTangent) = z |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is JuliaDiff/ChainRulesCore.jl#581 now
I don't think this is the right long term solution, but it seems right for now. |
Yes for sure it's a hack. But it doesn't seem like it will hide serious bugs, and having tests that work for now is good. Last commit disables tests on 1.8, since judging by "┌ Warning: ir verification broken. Either use 1.9 or 1.7" there isn't an intention to make everything work there. (The actual failures are inference tests.) Edit: and now nightly has some unexpected passes, inference tests... Maybe we need to split CI into two files, to allow failures on 1.8 + nightly while returning a useful overall pass/fail? |
On the topic of longer-term solutions, why is |
I don't know. Where should we put things like this, where a failing test has uncovered bad behaviour, but possibly no longer bad enough to make the test fail? (Fun Zygote example where re-using a variable name caused a 5x slowdown for ages, later became a bug: Z#1290) |
This is a hack to fix problems like this:
Diffractor sometimes produces ChainRulesCore's Tangent types, and sometimes produces plain NamedTuples instead. If these happen for structural gradients of the same object, then you will get an error.
Ideally it should probably make up its mind. Maybe
Tangent
s are a pain to deal with at higher order?