Taking matrix transposes seriously #408

StefanKarpinski · 2017-03-10T20:54:40Z

Currently, transpose is recursive. This is pretty unintuitive and leads to this unfortunateness:

julia> A = [randstring(3) for i=1:3, j=1:4]
3×4 Array{String,2}:
 "J00"  "oaT"  "JGS"  "Gjs"
 "Ad9"  "vkM"  "QAF"  "UBF"
 "RSa"  "znD"  "WxF"  "0kV"

julia> A.'
ERROR: MethodError: no method matching transpose(::String)
Closest candidates are:
  transpose(::BitArray{2}) at linalg/bitarray.jl:265
  transpose(::Number) at number.jl:100
  transpose(::RowVector{T,CV} where CV<:(ConjArray{T,1,V} where V<:(AbstractArray{T,1} where T) where T) where T) at linalg/rowvector.jl:80
  ...
Stacktrace:
 [1] transpose_f!(::Base.#transpose, ::Array{String,2}, ::Array{String,2}) at ./linalg/transpose.jl:54
 [2] transpose(::Array{String,2}) at ./linalg/transpose.jl:121

For some time now, we've been telling people to do permutedims(A, (2,1)) instead. But I think we all know, deep down, that's terrible. How did we get here? Well, it's pretty well-understood that one wants the ctranspose or "adjoint" of a matrix of matrices to be recursive. A motivating example is that you can represent complex numbers using 2x2 matrices, in which case the "conjugate" of each "element" (actually a matrix), is its adjoint as a matrix – in other words, if ctranspose is recursive, then everything works. This is just an example but it generalizes.

The reasoning seems to have been the following:

ctranspose should be recursive
ctranspose == conj ∘ transpose == conj ∘ transpose
transpose therefore should also be recursive

I think there are a few problems here:

There's no reason that ctranspose == conj ∘ transpose == conj ∘ transpose has to hold, although the name makes this seem almost unavoidable.
The behavior of conj operating elementwise on arrays is kind of an unfortunate holdover from Matlab and isn't really a mathematically justifiable operations, much as exp operating elementwise isn't really mathematically sound and what expm does would be a better definition.
It is actually taking the conjugate (i.e. adjoint) of each element that implies that ctranspose should be recursive; in the absence of conjugation, there's no good reason for transpose to be recursive.

Accordingly, I would propose the following changes to remedy the situation:

Rename ctranspose (aka ') to adjoint – that is really what this operation does, and it frees us from the implication that it must be equivalent to conj ∘ transpose.
Deprecate vectorized conj(A) on arrays in favor of conj.(A).
Add a recur::Bool=true keyword argument to adjoint (née ctranspose) indicating whether it should call itself recursively. By default, it does.
Add a recur::Bool=false keyword argument to transpose indicating whether it should call itself recursively. By default, it does not.

At the very minimum, this would let us write the following:

julia> A.'
4×3 Array{String,2}:
 "J00"  "Ad9"  "RSa"
 "oaT"  "vkM"  "znD"
 "JGS"  "QAF"  "WxF"
 "Gjs"  "UBF"  "0kV"

Whether or not we could shorten that further to A' depends on what we want to do with conj and adjoint of non-numbers (or more specifically, non-real, non-complex values).

[This issue is the second of an ω₁-part series.]

The text was updated successfully, but these errors were encountered:

jiahao · 2017-03-10T20:57:16Z

The logical successor to a previous issue... 👍

stevengj · 2017-03-10T21:09:48Z

The behavior of conj operating elementwise on arrays is kind of an unfortunate holdover from Matlab and isn't really a mathematically justifiable operations

This is not true at all, and not at all analogous to exp. Complex vector spaces, and conjugations thereof, are a perfectly well-established mathematical concept. See also JuliaLang/julia#19996 (comment)

StefanKarpinski · 2017-03-10T21:13:38Z

Complex vector spaces, and conjugation thereof, are a perfectly well-established mathematical concept.

Unless I'm mistaken, the correct mathematical conjugation operation in that context is ctranspose rather than conj (which was precisely my point):

julia> v = rand(3) + rand(3)*im
3-element Array{Complex{Float64},1}:
 0.0647959+0.289528im
  0.420534+0.338313im
  0.690841+0.150667im

julia> v'v
0.879291582684847 + 0.0im

julia> conj(v)*v
ERROR: DimensionMismatch("Cannot multiply two vectors")
Stacktrace:
 [1] *(::Array{Complex{Float64},1}, ::Array{Complex{Float64},1}) at ./linalg/rowvector.jl:180

stevengj · 2017-03-10T21:13:50Z

The problem with using a keyword for recur is that, last I checked, there was a big performance penalty for using keywords, which is a problem if you end up recursively calling these functions with a base case that ends up on scalars.

StefanKarpinski · 2017-03-10T21:14:19Z

We need to fix the keyword performance problem in 1.0 anyway.

stevengj · 2017-03-10T21:15:40Z

@StefanKarpinski, you're mistaken. You can have complex conjugation in a vector space without having adjoints — adjoints are a concept that requires a Hilbert space etc., not just a complexified vector space.

Furthermore, even when you do have a Hilbert space, the complex conjugate is distinct from the adjoint. e.g. the conjugate of a complex column vector in ℂⁿ is another complex vector, but the adjoint is a linear operator (a "row vector").

(Complex conjugation does not imply that you can multiply conj(v)*v!)

StefanKarpinski · 2017-03-10T21:18:19Z

Deprecating vectorized conj is independent of the rest of the proposal. Can you provide some references for the definition of complex conjugation in a vector space?

stevengj · 2017-03-10T21:18:57Z

https://en.wikipedia.org/wiki/Complexification#Complex_conjugation

(This treatment is rather formal; but if you google "complex conjugate matrix" or "complex conjugate vector" you will find zillions of usages.)

StefanKarpinski · 2017-03-10T21:33:28Z

If mapping a complex vector to the conjugate vector (what conj does now) is an important operation distinct from mapping it to is conjugate covector (what ' does), then we can certainly keep conj to express that operation on vectors (and matrices, and higher arrays, I guess). That provides a distinction between conj and adjoint since they would agree on scalars but behave differently on arrays.

In that case, should conj(A) call conj on each element or call adjoint? The representation of complex numbers as 2x2 matrices example would suggest that conj(A) should actually call adjoint on each element, rather than calling conj. That would make adjoint, conj and conj. all different operations:

adjoint: swap i and j indices and recursively map adjoint over elements.
conj: map adjoint over elements.
conj.: map conj over elements.

stevengj · 2017-03-10T21:39:28Z

conj(A) should call conj on each element. If you are representing complex numbers by 2x2 matrices, you have a different complexified vector space.

For example, a commonplace usage of conjugation of vectors is in the analysis of eigenvalues of real matrices: the eigenvalues and eigenvectors come in complex-conjugate pairs. Now, suppose you have a block matrix represented by a 2d array A of real 2x2 matrices, acting on blocked vectors v represented by 1d arrays of 2-component vectors. For any eigenvalue λ of A with eigenvector v, we expect a second eigenvalue conj(λ) with eigenvector conj(v). This won't work if conj calls adjoint recursively.

stevengj · 2017-03-10T21:44:41Z

(Note that the adjoint, even for a matrix, might be different from a conjugate-transpose, because the adjoint of a linear operator defined in the most general way depends also on the choice of inner product. There are lots of real applications where some kind of weighted inner product is appropriate, in which case the appropriate way to take the adjoint of a matrix changes! But I agree that, given a Matrix, we should take the adjoint corresponding to the standard inner product given by dot(::Vector,::Vector). However, it's entirely possible that some AbstractMatrix (or other linear-operator) types would want to override adjoint to do something different.)

stevengj · 2017-03-10T21:53:20Z

Correspondingly, there is also some difficulty in defining an algebraically reasonable adjoint(A) for e.g. 3d arrays, because we haven't defined 3d arrays as linear operators (e.g. there is no built-in array3d * array2d operation). Maybe adjoint should only be defined by default for scalars, 1d, and 2d arrays?

Update: Oh, good: we don't define ctranspose now for 3d arrays either. Carry on.

timholy · 2017-03-10T22:05:29Z

(OT: I'm already looking forward to "Taking 7-tensors seriously," the next installment in the highly-successful 6-part miniseries...)

StefanKarpinski · 2017-03-10T22:23:44Z

If you are representing complex numbers by 2x2 matrices, you have a different complexified vector space.

I'm not following this – the 2x2 matrix representation of complex numbers should behave exactly like having complex scalars as elements. I would think, e.g. that if we define

m(z::Complex) = [z.re -z.im; z.im z.re]

and we have an arbitrary complex vector v then we'd want this identity to hold:

conj(m.(v)) == m.(conj(v))

I would spell the example out more and make a comparison with ' which is supposed to already commute with m but I can't because of #409, which accidentally breaks that commutation. Once that bug is fixed then m.(v)' == m.(v') will hold, but if I'm understanding you correctly, @stevengj, then conj(m.(v)) == m.(conj(v)) should not?

stevengj · 2017-03-11T03:17:26Z

conj(m(z)) should == m(z).

Your m(z) is an isomorphism to complex numbers under addition and multiplication, but it is not the same object in other ways for linear algebra, and we shouldn't pretend it is for conj. For example, if z is a complex scalar, then eigvals(z) == [z], but eigvals(m(z)) == [z, conj(z)]. When the m(z) trick is extended to complex matrices, this doubling of the spectrum has tricky consequences for e.g. iterative methods (see e.g. this paper).

Another way of putting it is that z is already a complex vector space (a vector space over the complex numbers), but the 2x2 real matrix m(z) is a vector space over the real numbers (you can multiply m(z) by a real number) … if you "complexify" m(z) by multiplying it by a complex number e.g. m(z) * (2+3im) (not m(z) * m(2+3im)) then you get a different complex vector space not isomorphic anymore to the original complex vector space z.

felixrehren · 2017-03-11T07:08:35Z

This proposal looks like it will produce a real improvement 👍 I'm thinking about the mathematical side:

as I understand, conjugation is an action on the ring (read: numbers) which provide the coefficients for our vector space/vectors. This action induces another action on the vector space (and on its mappings, read:matrices), also called conjugation, by the linearity of the construction of the vector space over the ring. Hence conjugation necessarily works by elementwise application to the coefficients. For Julia, that means that at heart for a vector v, conj(v) = conj.(v), and it's a design choice whether conj has methods also for arrays or only for scalars, both of which seem ok? (Steven's point about the complex-numbers-as-2x2-matrices example is that this is a vector space whose coefficients are formally/actually real, so conjugation has no effect here.)
adjoint has an algebraic meaning that's fundamental in lots of important domains intimately involving linear algebra (see 1 and 2 and 3, all with big applications in physics too). If adjoint is added, it should allow keyword arguments for the action under consideration -- in a vector spaces/functional analysis, this action is typically induced by the inner product, hence the keyword argument can be the form instead. Whatever is designed for Julia's transposes should avoid conflicting with these applications, so I think adjoint is a bit highly-charged?

stevengj · 2017-03-11T12:54:43Z

@felixrehren, I don't think you would use a keyword argument to specify the inner product that induces the adjoint. I think you would just use a different type instead, the same as if you wanted to change the meaning of dot for a vector.

stevengj · 2017-03-11T13:00:31Z

My preference would be a bit simpler:

Keep conj as-is (elementwise).
Make a' correspond to adjoint(a), and make it recursive always (and hence fail for arrays of strings etc where adjoint is not defined for the elements).
Make a.' correspond to transpose(a), and make it never recursive (just a permutedims), and hence ignore the type of the elements.

I really don't see a use case for a non-recursive adjoint. As a stretch, I can sort of imagine use-cases for a recursive transpose, but in any application where you have to change a.' to transpose(a, recur=true) it seems just as easy to call another function transposerecursive(a). We could define transposerecursive in Base, but I think the need for this will be so rare that we should wait to see if it actually comes up.

andyferris · 2017-03-12T12:51:46Z

I personally feel that keeping this simple will be the most straightforward for users (and for implementation), and still be quite defensible on the linear algebra front.

So far (most) of our linear algebra routines are pretty strongly rooted to standard array structures and the standard inner product. In each slot of your matrix or vector, you put an element of your "field". I will argue that for elements of fields, we care about +, *, etc, and conj, but not transpose. If your field was a special complex representation which looks a bit like a 2x2 real matrix but changes under conj, then that's OK - it's a property of conj. If it's just a 2x2 real AbstractMatrix that doesn't change under conj, then arguably such elements of a matrix shouldn't change under the adjoint (@stevengj said it better than I can describe - there might be an isomorphism to complex numbers but that doesn't mean it behaves the same in all ways).

Anyway, the 2x2 complex example feels a little bit of a red herring to me. The real cause of the recursive matrix behavior was as a shortcut to doing linear algebra on block matrices. Why don't we treat this special case with proper care, and simplify the underlying system?

So my "simplified" suggestion would be:

Keep conj as-is (or make it a view for AbstractArrays)
Make a' a non-recursive view such that (a')[i,j] == conj(a[j,i])
Make a.' a non-recursive view such that (a.')[i,j] == a[j,i]
Introduce a BlockArray type for handling block-matrices and so-on (in Base, or possibly in a well-supported package). Arguably, this would be much more powerful and flexible than a Matrix{Matrix} for this purpose, but equally efficient.

I think these rules would be simple enough for users to embrace, and build upon.

PS - @StefanKarpinski For practical reasons, a boolean keyword argument for recursion won't work with view's for transposition. The type of the view may depend on the boolean value.

andyferris · 2017-03-12T13:01:33Z

Also, I mentioned elsewhere, but I'll add it here for completeness: recursive transpose views have the annoying property that the element type might change compared to the array it is wrapping. E.g. transpose(Vector{Vector}) -> RowVector{RowVector}. I haven't thought of a way of getting this element type for RowVector without a run-time penalty or by invoking inference to compute the output type. I'm guessing the current behavior (invoking inference) is undesirable from a language standpoint.

NB: there's also nothing stopping users from defining conj to return a different type - so the ConjArray view also suffers from this problem, whether transposition is recursive or not.

StefanKarpinski · 2017-03-12T20:23:27Z

@stevengj – you point about "complex" 2x2 matrices being a formally real vector space rather than a complex vector space makes sense to me, but then that point calls into question for me the original motivation for recursive adjoint, which leads me to wonder if @andyferris's proposal wouldn't be better (non-recursive transpose and adjoint). I guess the fact that both the complex 2x2 example and the block matrix representation "want" adjoint to be recursive is suggestive, but given your comments about that first example, I have to wonder if there aren't other cases where non-recursive adjoint is more correct/convenient.

stevengj · 2017-03-12T22:42:48Z

If the adjoint isn't recursive, it isn't an adjoint. It's just wrong.

StefanKarpinski · 2017-03-12T22:43:41Z

Can you give a little bit more of a justification for that when you've got a moment?

stevengj · 2017-03-12T23:13:28Z

The adjoint of a vector must be a linear operator mapping it to a scalar. That is, a'*a must be a scalar if a is a vector. And this induces a corresponding adjoint on matrices, since the defining property is a'*A*a == (A'*a)'*a.

If a is a vector of vectors, this implies that a' == adjoint(a) must be recursive.

andyferris · 2017-03-13T00:32:41Z

OK, I think I follow this.

We also have recursive inner products:

julia> norm([[3,4]])
5.0

julia> dot([[3,4]], [[3,4]])
25

Clearly, the "adjoint" or "dual" or whatever should be similarly recursive.

I think the central question then is, do we require that a' * b == dot(a,b) for all vectors a, b?

The alternative is to say that ' doesn't necessarily return the adjoint - it is just an array operation that transposes the elements and passes them through conj. This just happens to be the adjoint for the real or complex elements.

stevengj · 2017-03-13T00:52:22Z

There's one and only one reason that we have a special name and special symbols for adjoints in linear algebra, and that is the relationship to inner products. That's the reason why "conjugate transpose" is a important operation and "conjugate rotation of matrices by 90 degrees" is not. There's no point in having something that is "just an array operation that swaps rows and columns and conjugates" if it is not connected to dot products.

stevengj · 2017-03-13T01:00:21Z

One can define a similar relationship between transpose and an unconjugated "dot product", which would argue for a recursive transpose. However, unconjugated "dot products" for complex data, which aren't really inner products at all, don't show up nearly as often as true inner products — they arise from bi-orthogonality relations when when an operator is written in complex-symmetric (not Hermitian) form — and don't even have a built-in Julia function or a common terminology in linear algebra, whereas wanting to swap rows and columns of arbitrary non-numeric arrays seems far more common, especially with broadcasting. This is why I can support making transpose non-recursive while leaving adjoint recursive.

JeffBezanson · 2017-11-30T22:07:15Z

Triage thinks @Sacha0 should move ahead with proposal 2 so we can try it out.

andyferris · 2017-11-30T23:17:48Z

I strongly agree with @ttparker that recursive adjoint is a choice, and not the only mathematically consistent option. For example, we could simply state:

1 - to LinAlg, an AbstractVector v is a length(v)-dimensional vector with (scalar) basis weights v[1], v[2], ..., v[length(v)].

(and similarly for AbstractMatrix).

This would probably be the assumption many people would make coming from other libraries, and having such simple definition of basis vectors, rank, etc, helps keep implementation simple. (Many would probably say that numerical linear algebra is feasible to implement on a computer precisely because we have a nice basis set to work in.)

Our current approach is more like:

2 - to LinAlg, an AbstractVector v is a direct sum of length(v) abstract vectors. We also include enough definitions on scalar types like Number so that to LinAlg they are valid one-dimensional linear operators / vectors.

and similarly for (block) matrices. This is wildly more generalized than the linear algebra implementations in MATLAB, numpy, eigen, etc, and it's a reflection of Julia's powerful type/dispatch system that this is even feasible.

The overarching reason I see option 2 as being desirable is again that Julia's type/dispatch system allows us to have a much broader goal, which vaguely goes like:

3 - In LinAlg, we are attempting to create a generic linear algebra interface that works for objects which satisfy linearity (etc) under +, *, conj (etc), treating such objects as linear operators / members of a Hilbert space / whatever as appropriate.

Which is a really cool goal (surely much beyond any other programming language/library that I'm aware of), completely motivates recursive adjoint and 2 (because +, * and conj are themselves recursive) and is why @Sacha0's proposal 2 and the triage decision is a good choice :)

Sacha0 · 2017-12-01T01:43:26Z

I'd really be happy to discuss further about this, but I guess we better do this elsewhere. But really, contact me as I am actually very interested to learn more about this.

Cheers, let's do! I look forward to chatting further offline :). Best!

Sacha0 · 2017-12-01T01:44:11Z

Nice summary Andy! :)

Jutho · 2017-12-01T10:51:13Z

Fully agreed Andy, at least for adjoint (which was the topic of your comment).

However, one final plea for a non-recursive transpose, before I forever hold my peace (hopefully).
I see a non-recursive transpose having the following advantages:

It can be used by all people working with matrices, even containing non-numerical data. That's also how they will likely know this operation and look for it, from other languages and from basic math extrapolated to their non-numerical use case
No need to write extra code to have the lazy flip type or PermutedDimsArray interact with LinAlg. What if I have a numerical matrix which I flipped instead of transposed; will I still be able to multiply it with other matrices (preferably using BLAS)?
With a non-recursive transpose and a recursive adjoint, we can easily have a non-recursive adjoint as conj(transpose(a)) and a recursive transpose conj(adjoint(a)). And still everything will interact nicely with LinAlg.

So what are the disadvantages. I see none. I still stand by my point that really nobody is using transpose in its mathematical sense. But instead of trying to argue any further, can anybody give me an concrete example where a recursive transpose is needed or useful, and where really it is a mathematical tanspose? This excludes any example where you actually intended to use adjoint but are just having real numbers. So an application where there is a mathematical reason to transpose a vector or matrix filled with more matrices which are itself complex valued.

pablosanjose · 2017-12-01T11:02:35Z

I can say that at least Mathematica (which one would expect to have devoted considerable thought to this) does not do recursive transpose:

A = Array[1 &, {2, 3, 4, 5}];
Dimensions[A]  # returns {2, 3, 4, 5}
Dimensions[Transpose[A]] # returns {3, 2, 4, 5}

EDIT: Ooops, this was also commented above, sorry

ttparker · 2017-12-01T15:59:09Z

So I'm confused. There seemed to be pretty solid consensus that transpose should be made non-recursive - e.g. #408, #408, #408, #408, and JuliaLang/julia#23424. Then @Sacha0 gave two proposals, one of which would leave transpose recursive but introduce a non-recursive flip function, which got strong support despite (as far as I can tell) not really having been raised as a possibility before. Then @JeffBezanson suggested that we don't need flip after all if we give permutedims a default second argument, which also got strong support.

So now the consensus seems to be that the only real changes to transpose should be "behind the scenes": regarding special lowering and lazy vs. eager evaluation, which the typical end user probably won't know or care about. The only really visible changes are essentially just spelling changes (depreciating .' and giving permutedims a default second argument).

So the community consensus seems to have changed almost 180 degrees in a very short time (around the time of @Sacha0's post #408). Was Sacha's post so eloquent that it just changed everyone's mind? (That's fine if so, I just want to understand why we seem to moving forward on a path that just a few days ago we'd all seemed to agree was the wrong one.)

ttparker · 2017-12-01T16:28:04Z

I forget if anyone's suggested this, but could we just make transpose(::AbstractMatrix{AbstractMatrix}) (and possibly transpose(::AbstractMatrix{AbstractVector}) as well) recursive and transpose non-recursive otherwise? That seems like it would cover all the bases and I can't think of any other use case where you'd want tranpose to be recursive.

Sacha0 · 2017-12-01T17:10:09Z

So the community consensus seems to have changed almost 180 degrees in a very short time (around the time of @Sacha0's post #408 (comment)). Was Sacha's post so eloquent that it just changed everyone's mind? (That's fine if so, I just want to understand why we seem to moving forward on a path that just a few days ago we'd all seemed to agree was the wrong one.)

If only I were so eloquent 😄. What you are seeing is that consensus had not actually formed. Rather, (1) participants that favor the status quo, but had withdrawn from discussion due to attrition, returned to express an opinion; and (2) other parties that had not considered what moving away from the status quo would entail in practice (and how that might play with release considerations) formed a stronger opinion in favor of the status quo and expressed that opinion.

Please consider that this discussion has been ongoing in one form or another on github since 2014, and likely earlier offline. For long-term participants, such discussions become exhausting and cyclic. There being meaningful work to do other than engage in this discussion --- like writing code, which is more enjoyable --- the result is attrition among those long-term participants. Consequently, the conversation appears lopsided during one period or another. Personally, I am about at that attrition threshold, so I am going to focus on writing code now rather than continue engaging in this discussion. Thanks all and best! :)

braydenware · 2017-12-01T18:36:25Z

I'll cast a small vote in favor of non-recursive transpose and ctranspose for AbstractArrays, with both being recursive on AbstractArray{T} where T<:AbstractArray.

I agree that recursive behavior is 'correct' in some cases, and I see the question as how do we achieve the correct behavior with the least amount of surprise for those using and developing packages.
In this proprosal, recursive transpose behavior for custom types is opt-in: you opt-in by making your type an AbstractArray or by defining the appropriate method
Base.transpose(AbstractArray{MyType}) or Base.transpose(AbstractArray{T}) where T<: MyAbstractType.
I think the duck typing strategy for recursive transposes (just recursing without asking) produces a few surprises as documented above. If you introduce distinct ctranspose and adjoint, or more complicated proposals like conjadjoint and flip, users will encounter these and try to use them, and package maintainers will try to support them all.

As an example of something that would be hard to support under the new proposals: normal, transpose, ctranspose, and conj arrays all should be able to have views (or lazy evaluation) which interop with ReshapedArray and SubArray views. (I'm agnostic about whether these produce views by default or only when using @view. ) This connects to the work on the A*_mul_B* lowering and on lower level BLAS calls with flags 'N', 'T', and 'C' for dense arrays, as has been noted elsewhere. These would be easier to reason about if they treated normal, transpose, ctranspose, and conj
on equal footing. Note that BLAS itself only supports 'N' for normal, 'T' for transpose, and 'C' for ctranspose and has no flag for conj, which I think is a mistake.

Finally, for consistency with higher dimensional Arrays and reshapes, I believe the appropriate generalization of transpose and ctranspose is to reverse all of the dimensions, i.e.
transpose(A::Array{T, 3}) = permutedims(A, (3, 2, 1)).

Cheers!

Jutho · 2017-12-01T21:44:16Z

I very much appreciate the people actually doing the work. What has been discussed at way too much length is vector adjoints/transpose (but never the recursive aspect of it), until @andyferris stepped up and implemented this, and it works wonderfully well. Similarly, I also greatly appreciate the ongoing redesign of array constructors. Thumbs up for all of that.

That being said, matrix transpose and adjoint/ctranspose never got much discussion, especially not the recursive aspect of it, which was almost silently introduced in JuliaLang/julia#7244 with as single motivation block matrices. Various reasons and motivations for recursive adjoint have been given (after the facts), and most people can agree on that being a good (but not the only) choice. Transpose however is lacking even a single motivation or actual use case.

andyferris · 2017-12-01T22:26:58Z

There's a few separate things going on in these discussions, and it happens right now that we need a plan that can be implemented quickly.

We've discussed whether supporting block matrices (and more exotic structures) is worthwhile in LinAlg. Implementation choices are: no recursive stuff at all (except +, * and conj because that's the nature of generic functions in Julia), recursive everything (the status quo), or trying for some kind of type-check or trait for whether an element should do recursive linear algebra or be treated as scalar.
We want a nice way for users to permute the dimensions of a 2D array of data. We've got non-recursive transpose, flip, shortened permutedims syntax (that PR was submitted first purely because it is the fewest characters to implement, and probably makes sense even if we do something else as well), some kind of type-check or trait for whether an element should do recursive transpose (perhaps even reintroducing transpose(x::Any) = x...).
The Julia parser has strange behavior like x' * y -> Ac_mul_B(x, y) which is a bit of a wart, which ideally won't exist in v1.0. This hasn't been seen as feasible until we can support fast BLAS (no extra copies) without it, thus lazy matrix transpose & adjoint.
The code in LinAlg is quite large and built up over a number of years. Many things like matrix multiplication could be refactored to be more trait-friendly, perhaps with a dispatch system more like the new broadcast. I think this is where we can make it easier to send the right arrays (I'm thinking PermuteDimsArray of conjugated reshaped ajdointed views of strided matrices) to BLAS. However, this won't make v1.0 and we are also trying to avoid performance regressions without making the code much worse. As Sacha pointed out (and I'm discovering) having transpose views with a wide range of behaviors on the elements (recursive adjoint, recursive transpose, conjugation, nothing) makes for additional complexity and a bunch of new methods to keep things working as they are.

If we think about v1.0 as somewhat stabilizing the language, then in some senses the biggest priority to make a change in behavior is the third one. I'd say: the language (including parser) should be most stable, followed by Base, followed by the stdlib (which may or may not include LinAlg, but I think almost definitely will include BLAS, Sparse, etc one day). It's a change that doesn't really affect users (mostly library developers), so I wouldn't be surprised if people's opinions differ here.

Sacha0 · 2017-12-01T23:21:09Z

Spot on Andy! :)

JeffBezanson · 2017-12-31T01:09:10Z

I think the only thing left to do here is make adjoint and transpose lazy by default?

JeffBezanson · 2018-01-10T23:20:17Z

Can this be closed now?

StefanKarpinski · 2018-01-15T17:19:13Z

Next up: "Taking scalar transposes seriously"

ChrisRackauckas · 2018-01-15T17:22:43Z

But seriously though, can we have a good interface for specifying the different 3D transposes and tensor multiplications which are used in PDE solvers? Kind of serious, but I'm not sure if I could handle being the OP to the next iteration of this madness.

KristofferC · 2018-01-15T19:22:08Z

no

:)

StefanKarpinski · 2018-01-15T23:56:44Z

can we have a good interface for specifying the different 3D transposes and tensor multiplications which are used in PDE solvers

Definitely seems like a good subject for a package.

andyferris · 2018-01-16T00:11:31Z

a good interface for specifying the different 3D transposes and tensor multiplications

Does TensorOperations.jl not do what you need here? (Note that at this level "a good interface" means something like a tensor network diagram, which is slightly challenging to write in code any more succinctly than the syntax of TensorOperations).

ChrisRackauckas · 2018-01-16T01:23:56Z

Yeah, TensorOperations.jl looks good. I was slightly kidding, but I got what I needed out of it 👍 .

jiahao added the linear algebra label Mar 10, 2017

StefanKarpinski mentioned this issue Mar 10, 2017

bug involving broadcast and vector ctranspose #409

Closed

JeffBezanson added the breaking This change will break code label Mar 12, 2017

JeffBezanson assigned Sacha0 Nov 30, 2017

andyferris mentioned this issue Jan 6, 2018

recursive vecdot and vecnorm JuliaLang/julia#25093

Closed

JeffBezanson closed this as completed Jan 15, 2018

andyferris mentioned this issue Nov 9, 2018

Add +/- for Number and AbstractMatrix to LinearAlgebra JuliaLang/julia#29951

Closed

arturgower mentioned this issue Apr 28, 2019

Change the type of FrequencySimulationResult.ω from RowVector in to just Vector? JuliaWaveScattering/MultipleScattering.jl#24

Closed

jebej mentioned this issue Sep 26, 2020

Small feature request: hvec function JuliaLang/julia#37768

Open

nsajko mentioned this issue Oct 22, 2023

transpose should not be recursive #1034

Closed

stevengj mentioned this issue Mar 16, 2017

new syntax for transpose #410

Closed

KristofferC transferred this issue from JuliaLang/julia Nov 26, 2024

Taking matrix transposes seriously #408

Taking matrix transposes seriously #408

Comments

StefanKarpinski commented Mar 10, 2017 • edited Loading

jiahao commented Mar 10, 2017

stevengj commented Mar 10, 2017 • edited Loading

StefanKarpinski commented Mar 10, 2017 • edited Loading

stevengj commented Mar 10, 2017

StefanKarpinski commented Mar 10, 2017

stevengj commented Mar 10, 2017 • edited Loading

StefanKarpinski commented Mar 10, 2017

stevengj commented Mar 10, 2017 • edited Loading

StefanKarpinski commented Mar 10, 2017

stevengj commented Mar 10, 2017 • edited Loading

stevengj commented Mar 10, 2017 • edited Loading

stevengj commented Mar 10, 2017 • edited Loading

timholy commented Mar 10, 2017

StefanKarpinski commented Mar 10, 2017 • edited Loading

stevengj commented Mar 11, 2017 • edited Loading

felixrehren commented Mar 11, 2017 • edited Loading

stevengj commented Mar 11, 2017

stevengj commented Mar 11, 2017 • edited Loading

andyferris commented Mar 12, 2017 • edited Loading

andyferris commented Mar 12, 2017 • edited Loading

StefanKarpinski commented Mar 12, 2017

stevengj commented Mar 12, 2017 • edited Loading

StefanKarpinski commented Mar 12, 2017

stevengj commented Mar 12, 2017

andyferris commented Mar 13, 2017

stevengj commented Mar 13, 2017

stevengj commented Mar 13, 2017 • edited Loading

JeffBezanson commented Nov 30, 2017

andyferris commented Nov 30, 2017 • edited Loading

Sacha0 commented Dec 1, 2017

Sacha0 commented Dec 1, 2017

Jutho commented Dec 1, 2017

pablosanjose commented Dec 1, 2017 • edited Loading

ttparker commented Dec 1, 2017 • edited Loading

ttparker commented Dec 1, 2017

Sacha0 commented Dec 1, 2017 • edited Loading

braydenware commented Dec 1, 2017 • edited Loading

Jutho commented Dec 1, 2017

andyferris commented Dec 1, 2017

Sacha0 commented Dec 1, 2017

JeffBezanson commented Dec 31, 2017

JeffBezanson commented Jan 10, 2018

StefanKarpinski commented Jan 15, 2018

ChrisRackauckas commented Jan 15, 2018

KristofferC commented Jan 15, 2018 • edited Loading

StefanKarpinski commented Jan 15, 2018

andyferris commented Jan 16, 2018 • edited Loading

ChrisRackauckas commented Jan 16, 2018

StefanKarpinski commented Mar 10, 2017 •

edited

Loading

stevengj commented Mar 10, 2017 •

edited

Loading

StefanKarpinski commented Mar 10, 2017 •

edited

Loading

stevengj commented Mar 10, 2017 •

edited

Loading

stevengj commented Mar 10, 2017 •

edited

Loading

stevengj commented Mar 10, 2017 •

edited

Loading

stevengj commented Mar 10, 2017 •

edited

Loading

stevengj commented Mar 10, 2017 •

edited

Loading

StefanKarpinski commented Mar 10, 2017 •

edited

Loading

stevengj commented Mar 11, 2017 •

edited

Loading

felixrehren commented Mar 11, 2017 •

edited

Loading

stevengj commented Mar 11, 2017 •

edited

Loading

andyferris commented Mar 12, 2017 •

edited

Loading

andyferris commented Mar 12, 2017 •

edited

Loading

stevengj commented Mar 12, 2017 •

edited

Loading

stevengj commented Mar 13, 2017 •

edited

Loading

andyferris commented Nov 30, 2017 •

edited

Loading

pablosanjose commented Dec 1, 2017 •

edited

Loading

ttparker commented Dec 1, 2017 •

edited

Loading

Sacha0 commented Dec 1, 2017 •

edited

Loading

braydenware commented Dec 1, 2017 •

edited

Loading

KristofferC commented Jan 15, 2018 •

edited

Loading

andyferris commented Jan 16, 2018 •

edited

Loading