Revert "Remove number / vector (#44358)" #49915
This reverts commit 24d5268.
I don't get what particular method of the chain rule for / and \ was broken by the PR? Is it just vector \ number or something more? |
I could have it wrong, but the linked PR has tests which you can run to see the primal (forward) code that it broke. Here is an example of something that broke.

Julia master (and 1.9):

    julia> y, pb = rrule(/, [1.0, 2.0], [1.0, 2.0]);

    julia> y
    2×2 adjoint(::Matrix{Float64}) with eltype Float64:
     0.2  0.4
     0.4  0.8

    julia> unthunk.(pb(y))
    ERROR: MethodError: no method matching /(::Float64, ::Vector{Float64})

Julia 1.6 (and 1.8):

    julia> using ChainRules, ChainRulesCore

    julia> y, pb = rrule(/, [1.0, 2.0], [1.0, 2.0]);

    julia> y
    2×2 adjoint(::Matrix{Float64}) with eltype Float64:
     0.2  0.4
     0.4  0.8

    julia> unthunk.(pb(y))
    (NoTangent(), [0.20000000000000004, 0.4000000000000001], [-0.20000000000000007, -0.40000000000000013])
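For anyone who wants to check which situation their own Julia is in, a small sketch: it asks whether any method of `/` applies to the signature named in the `MethodError` above. Loading `LinearAlgebra` explicitly is my assumption about where the method lives; the variable names are made up for illustration.

```julia
using LinearAlgebra

# Signature copied from the MethodError above: /(::Float64, ::Vector{Float64}).
sig = Tuple{Float64, Vector{Float64}}

# `hasmethod` reports whether some method of `/` is applicable to this signature.
has_number_div_vector = hasmethod(/, sig)

println("number / vector defined on Julia ", VERSION, ": ", has_number_div_vector)
```

The answer depends on the Julia version, so no expected output is shown.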
OK, so indeed the PR broke the chain rule for vector / vector and vector \ vector, which are both legitimate 1.9 operations (though I'd love to get rid of them in the 2.0 timeframe). Note that both these operations are highly unusual, so it's relatively unlikely to have actually broken people's code. I've only seen vector \ vector used once (in the pkgeval run for the PR, in a stats package that used it for linear regression without offset, which is very cute), and I'm pretty sure nobody ever used vector / vector, except as a typo for vector ./ vector. This is because the rule is
When A and B are vectors, Y is a scalar, so the second line hits vector' \ number, which hits number / vector. This is indeed because vector'vector is a scalar. An easy fix is to change the chainrule to cast Y to a vector if it is a scalar (or to compute the second line with an explicit pseudoinverse in the vector case, which it's going to do anyway, but not in the matrix case). It's a bit concerning that pkgeval didn't catch this; maybe it got into chainrules after the pkgeval run? I agree that removing the method was breaking, and I was a bit on the fence about merging it in a pre-2.0 release. Fixing the massive footgun outweighed the esoteric breakage, though. Is it acceptable to just do the quick fix above to chainrules? |
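To make the shape argument above concrete, here is a minimal sketch. It is not the ChainRules code; the names `Ybar` and `Bbar` are made up for illustration, and the explicit-pseudoinverse line is only one way to read the "explicit pseudoinverse" suggestion.

```julia
using LinearAlgebra

A = [1.0, 2.0]
B = [2.0, 4.0]

# vector \ vector is a least-squares solve: the scalar y minimising norm(A*y - B).
Y = A \ B
@assert Y isa Number
@assert Y ≈ dot(A, B) / dot(A, A)

# The cotangent of a scalar output is itself a scalar (illustrative value).
Ybar = 1.0

# Writing `A' \ Ybar` here would be "adjoint vector \ number".
# Spelling the pseudoinverse out avoids relying on that method.
Bbar = pinv(A)' * Ybar
@assert Bbar ≈ A .* (Ybar / dot(A, A))
```

Because `pinv` is called directly, the sketch should not depend on whether number / vector exists in the running Julia.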
It's been in ChainRules for over 3 years. |
The pkgeval run did catch this, but then returned a timeout and so it was not reported: https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/62300a8_vs_a5dfbb4/ChainRules.primary.log. I'll open a separate issue for maybe making pkgeval report failures even if there's a subsequent timeout |
adding to triage because this will be controversial, but I do think we do actually have to merge this (even though I hate it). |
I won't be on triage, but my opinion is that this is a relatively minor breakage: it only affects the chain rule for vector / vector and vector \ vector. Basically nobody uses these methods, so I doubt the chainrule issue is actually bothering people (apart from fixing the CR CI, of course). An easy fix is Reverting it now is a bit weird (it'd make for weird 1.10 release notes). We knew it was breaking when merging it, so this issue doesn't change the calculus that was made at the time. But I'm not super clear on the policy wrt breaking stuff, so... |
There is still no evidence that anyone actually uses this so I agree with @antoine-levitt that nothing has changed since the last discussion was had on this. |
The deleted code is literally used in the ChainRules.jl code. That is a package that exists. Cos I am pretty sure they do, since it is the rrule for It's one thing to say "Do minor breaking thing that no one uses" based on PkgEval The calculus has changed from when this was decided due to the fact that this break was missed when running PkgEval due to it timing out and not reporting the error. We do vaguely like to say that we take SemVer seriously.
This is a perfectly valid thing to do by semver. |
Also vector \ vector probably. But yes, that's the point: I don't expect anybody ever hit that code path (but I don't know)
I'd imagine this is a case of overzealous testing and there's no actual use case for this, but again I don't know. I think the point is that as far as breakage go, this one is pretty minor. It's breaking a massive footgun (number / vector) that is, I'm pretty sure, never used except in error. In doing so, it unintentionally also broke the chain rule for another feature (vector \ vector) which is less of a footgun (because nobody reaches for \ unless they mean linalg) and does have its (rare) uses. It broke the chain rule in a way that is easy to fix, with a one-line workaround that will go away in 2.0. I'm not defending the breakage itself: if breakage is forbidden under any circumstance, then this PR should never have been merged and let's revert. But if there's some leeway, then the calculus hasn't fundamentally changed since the PR was merged, and so it shouldn't be reverted. |
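For readers who have not met the footgun, a small sketch of the difference. It assumes, as I understand the removed method, that number / vector meant `x * pinv(v)`; `pinv` is called directly so the snippet does not depend on the method being present.

```julia
using LinearAlgebra

v = [1.0, 2.0, 4.0]

# What people usually mean: elementwise reciprocal, a 3-element vector.
elementwise = 1.0 ./ v

# What number / vector is assumed to have computed: a 1×3 row, v' / (v'v).
pseudo = 1.0 * pinv(v)

@assert elementwise == [1.0, 0.5, 0.25]
@assert size(pseudo) == (1, 3)
@assert pseudo ≈ v' / dot(v, v)
```

The two results differ in both shape and values, which is why a missing dot goes unnoticed until much later.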
If a package does:

    for method in get_all_julia_methods()
        wrapper_expr = provide_wrapper_for_method(method)
        @eval $wrapper_expr
    end

I wouldn't say that this is a "usage" of every julia method. ChainRules tries to provide a chain rule for every Julia function but that doesn't on its own constitute a usage (even though its tests will break if any of those methods are removed). What is interesting is the actual code that ends up calling this chain rule. |
but it isn't the method for |
I support reverting this (any) feature if/since used in a package.
Can't we then deprecate it? Even if it has a few/one users.
I don't know too much about this/vector math, but there is some vector math alternative code, that should be used? Is using deprecated bad for speed?
I.e. this was thought "technically breaking", and all such gets documented in NEWS, so I support at least the documenting. This was basically API, not syntax, but in either case, we likely need to be strict to not ever break. Except in 2.0, and I see this as just one more argument we should get to that sooner rather than later. |
Yes, deprecating it would be fine.
It was in the NEWS.md, I just didn't spot it. |
There seems to be no use of this in the ecosystem (defining use as a call to this method that is not just testing the method itself). And the method that is getting added back is the extremely confusing
So why add it back again? No one has missed this method in over a year! |
This was only removed in 1.9 (according to NEWS), not all live on master, so only been out of Julia a few days officially? Maybe we can get away not having it in, see if we get (more) feedback before 1.9.1? |
Triage agrees that this was breaking. However this method is somewhat sketchy and easy to misuse (as determined by the original pkgeval). As such, we believe that this functionality should be re-added for 1.9.1, but it should be deprecated (in 1.9.1) and slated for re-removal in 2.0. |
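As a rough illustration of what "re-add but deprecate" could look like, here is a sketch using a stand-in function. `number_div_vector` is hypothetical and is not the definition that was restored; the message text is also made up. `Base.depwarn` is the standard hook for deprecation warnings.

```julia
using LinearAlgebra

# Hypothetical stand-in for a deprecated `number / vector` method.
function number_div_vector(x::Number, v::AbstractVector)
    Base.depwarn(
        "dividing a number by a vector is deprecated; use `x * pinv(v)` or `x ./ v`",
        :number_div_vector,
    )
    # Keep the old behaviour so existing callers continue to work.
    return x * pinv(v)
end

result = number_div_vector(1.0, [1.0, 2.0])
```

As far as I know, deprecation warnings are hidden unless Julia is started with `--depwarn=yes`, so existing callers would keep working silently until they opt in to the warnings.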
The breakage of this was never in question; it was deemed seldom enough used and confusing enough that the breakage was worth it. And if the method gets deprecated, ChainRules will have to update to not use it anyway, but the confusing behavior is still present. Worst of both worlds? I mean, the only reason this is discussed is because @oxinabox does not like the workaround for ChainRules due to the removal of this, but a deprecation causes the exact same workload. So it seems pointless to reintroduce the confusing behavior (which again has not been mentioned by anyone for a year). |
I think the key point is that
only happened due to a pkgeval bug. had we seen that chainrules had an explicit test that this worked, we wouldn't have merged the pr. it is a good target for removal since it goes against the spirit of #4774, and is likely to be a typo, but removal is actually breaking a real package that was using an exported function correctly. deprecating will mean that chainrules keeps working, and sets the stage for us to properly remove this in the future. |
Can we schedule this in to be talked about at Triage on the 22rd? |
I think triage already agrees that this was breaking and should be fixed |
Does this mean we should merge and potentially backport to 1.9.2? |
@ViralBShah yes. |
Can we add a Forget-me-not label? |
Triage thought we should do this, in part because the original PR's pkgeval looks pretty bad (IIRC triage looked at a sample of the reported breakage and found several to be real). We can always re-triage and re-land the original PR. |
Really? I did look at those pretty closely, in both that pkgeval and that of #40758, and I didn't see any that were caused by this PR, just segfaults. The only "real" breakage I've seen is ChainRules, discussed extensively here. I can either try to reland this directly, or make a PR to deprecate it, just tell me which. |
I just now manually reviewed 9/174 of the packages that failed tests only after #44358. Of these, I found one real breakage
This usage is via ChainRules.jl, and its existence serves as a counterexample to the (false) prevailing notion in this thread that no non-ChainRules.jl package depends on ChainRules.jl's usage of this method. Triage recommended deprecation, so a PR to deprecate is likely to succeed, but @oxinabox might object? |
Nice! I wonder why my screening did not flag it, I should have been more thorough.
This is likely fixed by JuliaDiff/ChainRules.jl#718. My argument was not that nobody uses vector / vector and vector \ vector (which was indeed broken in chainrules as an unintended consequence, and is now fixed by JuliaDiff/ChainRules.jl#718), but that nobody uses (correctly) real / vector and vector \ real. If deprecation is the way to go, can we go all the way and deprecate vector / vector and vector \ vector while we're at it? |
Backported PRs:
- [x] #47782 <!-- Generalize Bool parse method to AbstractString -->
- [x] #48634 <!-- Remove unused "deps" mechanism in internal sorting keywords [NFC] -->
- [x] #49931 <!-- Lock finalizers' lists at exit -->
- [x] #50064 <!-- Fix numbered prompt with input only with comment -->
- [x] #50474 <!-- docs: Fix a `!!! note` which was miscapitalized -->
- [x] #50516 <!-- Fix visibility of assert on GCC12/13 -->
- [x] #50635 <!-- `versioninfo()`: include build info and unofficial warning -->
- [x] #49915 <!-- Revert "Remove number / vector (#44358)" -->
- [x] #50781 <!-- fix `bit_map!` with aliasing -->
- [x] #50845 <!-- fix #50438, use default pool for at-threads -->
- [x] #49031 <!-- Update inference.md -->
- [x] #50289 <!-- Initialize prev_nold and nold in gc_reset_page -->
- [x] #50559 <!-- Expand kwcall lowering positional default check to vararg -->
- [x] #49582 <!-- Update HISTORY.md for `DelimitedFiles` -->
- [x] #50341 <!-- invokelatest docs should say not exported before 1.9 -->
- [x] #50525 <!-- only check that values are finite in `generic_lufact` when `check=true` -->
- [x] #50444 <!-- Optimize getfield lowering to avoid boxing in some cases -->
- [x] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [x] #50860 <!-- Add `Base.get_extension` to docs/API -->
- [x] #50164 <!-- codegen: handle dead code with unsafe_store of FCA pointers -->
- [x] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [x] #50871 <!-- macOS: Don't inspect dead threadtls during exception handling. -->

Need manual backport:
- [ ] #48542 <!-- Add docs on task-specific buffering using multithreading -->
- [ ] #50591 <!-- build: fix various makefile bugs -->

Non-merged PRs with backport label:
- [ ] #50842 <!-- Avoid race conditions with recursive rm -->
- [ ] #50823 <!-- Make ranges more robust with unsigned indexes. -->
- [ ] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [ ] #49716 <!-- Update varinfo() docstring signature -->
- [ ] #49713 <!-- prevent REPL from erroring in numbered mode in some situations -->
- [ ] #49573 <!-- Implement jl_cpu_pause on PPC64 -->
- [ ] #48726 <!-- fix macro expansion of property destructuring -->
- [ ] #48642 <!-- Use gc alloc instead of alloc typed in lowering -->
- [ ] #48183 <!-- Don't use pkgimage for package if any includes fall in tracked path for coverage or alloc tracking -->
- [ ] #48050 <!-- improve `--heap-size-hint` arg handling -->
- [ ] #47615 <!-- Allow threadsafe access to buffer of type inference profiling trees -->
This reverts commit 24d5268.
undoing #44358
I propose backporting this PR to bring the operation back in 1.9.1, and holding off on #44358 until Julia 2.0, where matching changes can be made as a whole set. cc: @antoine-levitt
#44358 was originally made on the assumption that no one was using this feature except in error.
That assumption was checked using PkgEval.
The assumption was incorrect, and the check for some reason or another missed or skipped over ChainRules.jl.
This broke ChainRules.jl for anyone using reverse mode AD to go through `/` or `\`.
The use in ChainRules.jl is intentional and is required to achieve generality between different types.
Quoting @sethaxen on slack
The trivial fix to make ChainRules work again in 1.9.0 (JuliaDiff/ChainRules.jl#718) is uglier, less performant and less numerically good.
I think we do not need to discuss if the feature in #44358 is desirable or not in this PR.
Merely if it was too breaking or not.