RFC: Refactor LinearAlgebra to use `isa` instead of method dispatch to avoid ambiguity hell #33072
base: master
Conversation
Agreed.

---
This special-cases Base types at the expense of extendability in packages. The way @mbauman and I handled this for

---
Could you elaborate on when it would be a problem with

```julia
module LinearAlgebra

function * end

# definitions of "local" * here

Base.:*(A::AbstractMatrix, B::AbstractMatrix) = A * B

end
```

In current (say Julia 1.2) LinearAlgebra you can only define methods that are more specific than the ones defined in LinearAlgebra and Base (without type piracy). So the above extreme version of LinearAlgebra seems to have the same property. Or am I missing something?

---
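For context, here is a minimal sketch (with invented wrapper types, not ones from LinearAlgebra) of the kind of ambiguity the current design runs into whenever two parts of the code each add a one-sided `*` method:

```julia
# Two hypothetical wrapper types, each with a "fast" one-sided * method.
struct WrapA{T} <: AbstractMatrix{T}
    data::Matrix{T}
end
struct WrapB{T} <: AbstractMatrix{T}
    data::Matrix{T}
end
Base.size(A::WrapA) = size(A.data)
Base.size(B::WrapB) = size(B.data)
Base.getindex(A::WrapA, i::Int, j::Int) = A.data[i, j]
Base.getindex(B::WrapB, i::Int, j::Int) = B.data[i, j]

Base.:*(A::WrapA, B::AbstractMatrix) = A.data * B   # left-specialized path
Base.:*(A::AbstractMatrix, B::WrapB) = A * B.data   # right-specialized path

# WrapA * WrapB matches both methods and neither is more specific, so the
# call is ambiguous; resolving this for every pair is what forces the many
# extra method definitions in LinearAlgebra.
A, B = WrapA(rand(2, 2)), WrapB(rand(2, 2))
err = try
    A * B
    nothing
catch e
    e
end
@assert err isa MethodError
```

---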
I'm not sure the work on

I think this approach of privileging transposes to auto-unwrap in the abstract/generic implementation makes a lot of sense. Sure, it's not fully extensible (in that others cannot insert similarly privileged types into the abstract implementation), but it addresses one of the biggest pain points here. One of the best ways to handle this sort of symmetric dispatch problem in general is through the sort of promotion/conversion numeric arithmetic uses. The challenge with arrays is twofold:

1. We don't know if the promoted method exists at all.
2. Even if it does, we don't know if it's advantageous to do the conversions; if we're not careful we might pay the cost for the conversions and still end up with a generic slow implementation.

In other words, the ambiguities are real, but we may want to paper over them.

---
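For reference, the numeric promotion/conversion pattern the comment refers to works like this: mixed-type arithmetic first promotes both operands to a common type, so only same-type methods need to exist and symmetric ambiguities never arise.

```julia
# The Number promotion pattern: Int and Float64 promote to a common type
# before the actual arithmetic method is selected.
x, y = promote(1, 2.5)
@assert (x, y) == (1.0, 2.5)
@assert typeof(1 + 2.5) === Float64  # dispatched to the Float64 method after promotion
```

---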
What kind of conversion/promotion are you thinking of? Can they be done lazily so that they don't have to be expensive at run-time?

It's not based on promotion, but I can imagine doing something based on cost estimates (yet still possible to be compiled away):

```julia
"""
    MulImpl{Cost, OtherType}(impl)

`Cost` is a pair `(p, c)` such that `impl(A, B)` does about `c * n^p` multiplications
when all the (non-unit) sizes of `A` and `B` are `n`.
"""
struct MulImpl{Cost, OtherType, F}
    impl::F
end
MulImpl{Cost, OtherType}(impl::F) where {Cost, OtherType, F} =
    MulImpl{Cost, OtherType, F}(impl)

cost(::MulImpl{Cost}) where {Cost} = Cost
othertype(::MulImpl{<:Any, OtherType}) where {OtherType} = OtherType

"""
    lmulimpls(A) :: Tuple{Vararg{MulImpl}}
    rmulimpls(B) :: Tuple{Vararg{MulImpl}}

Implementations available for multiplying `A` from the left / `B` from the right.
"""
lmulimpls(A) = ()
rmulimpls(B) = ()

function (*)(A::AbstractMatrix, B::AbstractMatrix)
    candidates = (
        filter(impl -> B isa othertype(impl), lmulimpls(A))...,
        filter(impl -> A isa othertype(impl), rmulimpls(B))...,
        MulImpl{(Inf,), Any}(_mulfallback),
    )
    impl = first(sort(candidates, by = cost))
    return impl.impl(A, B)
end

lmulimpls(::UpperTriangular) = (
    MulImpl{(2, 0.5), Number}(...),  # UpperTriangular * Number needs about 0.5 * n^2 multiplications
    # and so on...
)
```

---
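As a small aside on how those `(p, c)` cost pairs would order under `sort`: Julia compares tuples lexicographically, so the exponent dominates and the generic fallback's `(Inf,)` always sorts last. A toy check (the values here are invented):

```julia
# Tuples compare lexicographically, so for (p, c) cost pairs the
# exponent p dominates and the constant c only breaks ties.
@assert (2, 0.5) < (3, 1.0)  # an n^2 algorithm beats an n^3 one
@assert (3, 0.5) < (3, 1.0)  # same exponent: the smaller constant wins
@assert (3, 1.0) < (Inf,)    # the fallback's (Inf,) sorts after everything
```

---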
Here's an example:

The other alternative, of course, is to request a coarser "memory layout" that specifies a more abstract API and then dispatch on that. While that would be very powerful, I think it just effectively pushes the ambiguity problem down one layer. Ref. #25558.

---
Thanks, that's a very clear case.
I see. I guess I misread your last comment, because I thought the conversion idea could be used for complicated nested types like

I was vaguely aware of #25558, but I thought it was mostly about in-place implementations. If so, while it is very relevant, it does not directly solve the problem with binary operations like

I wonder if it makes sense to implement the layered dispatch idea only for "algebraic" wrappers like

---
Agreed that here there isn't a privileged item, and I don't have time now to think this through carefully, but wouldn't

```julia
*(A, B) = _mul(MulTrait(A), MulTrait(B), A, B)
_mul(trait1, trait2, A, B) = _mul(MulTrait(trait1, trait2), A, B)
_mul(::HandleByReordering, A, B) = _permutedmul(A, B)

MulTrait(::Adjoint) = HandleByReordering()
...

MulTrait(::T, ::T) where T = T()
MulTrait(::HandleByReordering, ::Any) = HandleByReordering()
MulTrait(::Any, ::HandleByReordering) = HandleByReordering()
```

go a long way towards "splitting off" these special cases?

---
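A self-contained toy version of that trait merge (all names are hypothetical, following the sketch above). Note that one extra method is needed: the `(HandleByReordering, HandleByReordering)` pair would otherwise itself be ambiguous between the three merge methods.

```julia
using LinearAlgebra

# Toy trait types (hypothetical names from the sketch above).
struct HandleByReordering end
struct Plain end

# Per-argument trait: adjoint-wrapped matrices request reordering.
MulTrait(::Adjoint) = HandleByReordering()
MulTrait(::AbstractMatrix) = Plain()

# Pairwise merge: reordering "wins" whenever either argument asks for it.
MulTrait(::T, ::T) where {T} = T()
MulTrait(::HandleByReordering, ::Any) = HandleByReordering()
MulTrait(::Any, ::HandleByReordering) = HandleByReordering()
# Ironically, the pair of two HandleByReordering traits is ambiguous among
# the three methods above, so it needs its own disambiguating method:
MulTrait(::HandleByReordering, ::HandleByReordering) = HandleByReordering()

A = rand(2, 2)
@assert MulTrait(MulTrait(A'), MulTrait(A)) isa HandleByReordering
@assert MulTrait(MulTrait(A), MulTrait(A)) isa Plain
```

---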
The redesign of broadcasting (#23939) would have probably been a better analogy.

---
But, because of this unpredictability, I still think cost-based dispatch #33072 (comment) is easier to handle than

---
I think there are too many method definitions in LinearAlgebra just to resolve ambiguities. One simple solution might be to use "type-level" if branches using `isa`. That is to say, remove definitions like this and inline them into a more generic definition.

An obvious benefit is that this lets us remove many (7 by the above change; c7fadec) method definitions that existed just for ambiguity resolution. Also, I think it would make it easier for us to understand when/how a certain combination of matrices is computed in a certain way.

A disadvantage may be that it would make `@edit A * B` useless when the given combination is handled inside the `if` block. But I think this effect is small now that we have debuggers.

Admittedly, this is not a super elegant solution. If anyone has a more elegant plan to resolve it, I'll be happy to wait for it. But I'm also guessing that `if` branches might not be so bad. I think they'll likely be better than the very fragile state of the current method definitions. What do you think?
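As an illustration of the proposed style (not the actual change in c7fadec; the function name and the particular branch here are invented), a single generic method with a type-level `isa` branch might look like:

```julia
using LinearAlgebra

# Hypothetical sketch of the proposed style: one generic method whose
# special cases live in type-level `isa` branches rather than in separate
# methods that would each need ambiguity-resolving definitions.
function mul_sketch(A::AbstractMatrix, B::AbstractMatrix)
    if A isa Adjoint && B isa Adjoint
        # X' * Y' == (Y * X)': unwrap both and adjoint the result once.
        return (parent(B) * parent(A))'
    else
        # Stand-in for the generic fallback path.
        return Matrix(A) * Matrix(B)
    end
end

X, Y = rand(3, 3), rand(3, 3)
@assert mul_sketch(X', Y') ≈ X' * Y'
@assert mul_sketch(X, Y) ≈ X * Y
```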