0.6: vec*mat throws "Cannot left-multiply a matrix by a vector" even when mat is 1 x n #400
Comments
julia> ones(5) * ones(5)'
5×5 Array{Float64,2}:
1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 |
I know, but it doesn't change the fact that vector*ones(1,5) is well-defined mathematically
(Emailed reply to Fredrik Ekre's comment above, dated 2 Feb 2017, 22:23.)
|
It looks like it is a bug introduced in JuliaLang/julia#19670; cc @andyferris |
@stevengj this was intentional on my part, rather than a bug. Perhaps a misfeature - but I'm pretty confident that it isn't. EDIT: Damn, I meant to stick this in the NEWS.md, but it seems I forgot. |
I'll copy my comments from Discourse: To me, the question is in reverse - why did this method exist in v0.5? The (only) reason is it was a way of obtaining the outer product of two vectors. You can now do Otherwise Believe me, I pained over this one, but really it's not difficult to write |
If we view a vector as a 1-column matrix, then vector * (1-row matrix) makes perfect sense in standard linear algebra. (I agree that there is some tension between this viewpoint and viewing vectors as living in some kind of abstract finite-dimensional Hilbert space with matrices as linear operators on them. But this seems like a case where we might as well continue to support both viewpoints, since most users will expect it.) |
(This question seems in much the same vein as whether a |
This strikes me as a slightly fishy thing to do in the first place, but I guess I can't really see the harm in allowing it. |
I worry that "slightly fishy" just leads to worse/less clear code, and prefer to have crystal clear semantics about things. I personally can't see that a user really wants this thing, where they are in this situation where "I have this matrix, but I know it's a single row but not a We don't treat
I was aiming for a degree of symmetry here, where we don't treat |
It's not treating a 1 x N array like a row vector, it's treating a column vector like an N x 1 matrix. |
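The "column vector as an N x 1 matrix" reading can be sketched directly. The snippet below is only an illustration with made-up variable names (`v`, `M`, `V`, `P`), not the method in Base; it reshapes the vector explicitly, so it does not rely on whether `v * M` itself is accepted by the Julia version under discussion.

```julia
# Illustrative names only; a sketch of the N x 1 viewpoint, not Base's implementation.
v = ones(5)        # a Vector: one-dimensional, size (5,)
M = ones(1, 5)     # a 1 x 5 Matrix

# View the vector explicitly as a 5 x 1 matrix (reshape shares the underlying data).
V = reshape(v, length(v), 1)

# Ordinary matrix-matrix product: (5 x 1) * (1 x 5) gives 5 x 5.
P = V * M

@assert size(V) == (5, 1)
@assert size(P) == (5, 5)
@assert P == v * v'   # same entries as the outer product at the top of the thread
```

Writing the reshape out makes the intent visible at the call site, which is the readability point raised against the implicit method; the counterargument in the thread is that most users expect the implicit form to work.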
OK, but what I didn't get is why do this for
We could go and implement this, but where would it end? Following that path would be a development nightmare. I'm not sure I care enough about this to hold out :) I don't even disagree that we can find a semantic of |
Because |
@dlfivefifty if mathematicians would really find the missing method quite suspicious, that is a pretty powerful argument.
This is also a good argument. We want an easy-to-use language. (EDIT: a well written message would help with such an explanation... you see @dlfivefifty it's all part of my plan to secretly train the next generation of mathematicians in Dirac notation... :) ). |
(I used to like Dirac notation until I started teaching with it and found out how badly most students mis-use it. They mostly seem to think it is just a funny kind of parentheses, and happily write things like "Â|ψ⟩=|Âψ⟩" to the point where it seemed to cause more harm than good.) |
(@stevengj That does not seem a good argument to ditch Dirac notation. Maybe the students should start to use it properly. Once you get used to it, it certainly helps to keep track of what a bra and what a ket are.) |
(I find that the benefits I get from it, in a non-quantum context, are outweighed by the time it takes to teach people to use it properly, particularly when I'm dealing with physics students who should, in principle, have already learned it.) |
Sorry for the off-topic digression. In principle, a deprecation message could try to explain this. But my feeling is that trying to explain abstract vector spaces and dual spaces, and how vectors differ from the vector space of 1-column matrices that they are isomorphic to, is too much to deal with in a deprecation message. It will be much less confusing to just allow the operation. |
Haha - that is quite a math lesson to learn from a short string. :) I'm still a bit uneasy about this. The point was never Dirac notation (which is definitely a confusing set of brackets for newcomers), but the appreciation of vectors, their duals and linear operators. MATLAB always just called all of these arrays of numbers; I would like to take a more (relatively elementary) abstract linear algebra approach. To disentangle the argument, rather than consider Dirac notation, the corresponding mathematical notation I was aiming for would be this: "Consider vectors v, w, etc and their duals vᵀ, wᵀ, etc, as well as linear operators A, B, etc. The inner product induced by the duals (returning a scalar) is written vᵀ.w and the linear mapping as A.v = w; the dual of this implies vᵀ.Aᵀ = wᵀ. We can compose linear operators as A.B, and we can show the dual map is (A.B)ᵀ = Bᵀ.Aᵀ. Composing scalar-vector multiplication with the inner product means we can define a linear map that takes v and emits u.(wᵀ.v) and we choose to write this linear map as A = u.wᵀ." In the universities I've been associated with, a good student at the end of first-year math should be comfortable with this (plus eigenvalues, inverses, etc). To me, it seems exceedingly inelegant to add to the end of this mathematical notation description "Oh, and by the way, in the special case that an operator A is a mapping from a 1-dimensional vector space to an n-dimensional vector space, we also allow you to write v.A." I feel this associates one-dimensional vector spaces with scalars (in Julia, that is
If we replace wᵀ with a linear operator from a N-dimensional vector space to a 1-dimensional vector space (remember Julia says a 1D vector space is distinct from a scalar) then that implies u is an operator from 1-dimensional vector space to an M-dimensional vector space. And since Julia treats vectors and matrices as distinct, that implies that u must be a matrix. That is, we can multiply Mx1 matrices by 1xN matrices - which is fine and already exists, but it doesn't justify the method requested in the OP. IMO, all this follows directly from pre-existing Julia choices To give a bit of context of where I'm coming from, over time, my views have been quite hardline about this kind of thing. For instance, IMO I'm very sorry for my extremely long rant, especially the last paragraph - please don't take it the wrong way as I'm trying to admit here that my views are very strict and that some softening in an implementation which is to be used by a very large number of users is quite acceptable in my book. :) But when I read e.g. this comment from Stefan I thought "well, it's only because it makes sense". I agree that this change is painful and it should have been a deprecation not a breaking change (sorry, my bad) but IMO long-term clarity is also important. I will try to bow out and let you guys figure out what to do now. @alanedelman and @mbauman have also discussed dual vectors, etc amongst themselves and with me quite a bit, so I feel it would be valuable to have their opinions before moving on. |
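For reference, the notation set out in the comment above can be restated in LaTeX as follows. This is only a transcription using the comment's own symbols (v, w, u, A, B); the dimension labels M and N in the last line follow the comment's closing argument and assume u has length M and w has length N.

```latex
\begin{align*}
  v^{\mathsf T} w &\quad \text{is a scalar (the inner product)} \\
  A v = w &\;\Longrightarrow\; v^{\mathsf T} A^{\mathsf T} = w^{\mathsf T} \\
  (A B)^{\mathsf T} &= B^{\mathsf T} A^{\mathsf T} \\
  (u\, w^{\mathsf T})\, v &= u\, (w^{\mathsf T} v)
    \qquad \text{so } A = u\, w^{\mathsf T} \text{ is a linear map} \\
  \underbrace{u}_{M \times 1}\;\underbrace{w^{\mathsf T}}_{1 \times N}
    &\;\longmapsto\; \text{an } M \times N \text{ matrix}
\end{align*}
```

The last line is the case the comment says already exists: once wᵀ is taken to be a 1 x N matrix, u has to be an M x 1 matrix for the product to be a composition of linear operators.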
I agree that any math major could comprehend this. But lots of users of linear algebra are not math majors, even at the first year level. As long as we don't stray into ambiguity or incorrectness, a certain flexibility doesn't seem to hurt. I'm also a bit concerned that it will be difficult to write a clear deprecation message here. What you want to replace it with really seems to depend on where your 1-column matrix came from. |
Using |
Your viewpoint is also not canonical for all mathematicians. To numerical analysts a column vector is an n x 1 matrix. So we would view Abstract algebra is a generalisation of linear algebra. Its concepts do not necessarily supersede or invalidate a simpler viewpoint. So I don't think your strict view is any more correct mathematically (just a different set of definitions), and would confuse a large set of users. |
This example is consistent with my point: julia> ones(5,1)+ones(5)
5×1 Array{Float64,2}:
2.0
2.0
2.0
2.0
2.0 |
Yes, it's true there are multiple self-consistent interpretations here. The goal of #42 was to decide what kind of universe we want our multidimensional arrays to live in. Across many issues, we seem to be moving away from Matlab's trailing singleton dimensions and moving towards a universe where dimensionality is intrinsically very important. This intimately connects with JuliaLang/julia#14770. Personally, I like this restriction; to me it is both pedagogically and practically useful. I've had bugs that this would have caught. And I say that with full knowledge that it's in direct opposition with my viewpoint on JuliaLang/julia#14770. |
I completely agree @dlfivefifty that there are multiple entirely consistent conventions for linear algebra, and my one is not the one true canonical viewpoint. I am wondering which are consistent with pre-existing facts and conventions in Julia like |
Ha! I would have made a PR to deprecate that if I had seen that already. :) (There is no harm in using |
In any case, they should be consistent. so either also deprecate One way I would be happy with the current setting is if julia> v*ones()
ERROR: MethodError: no method matching *(::Array{Int64,1}, ::Array{Float64,0})
Closest candidates are:
*(::Any, ::Any, ::Any, ::Any...) at operators.jl:281
*{T}(::Number, ::AbstractArray{T,N} where N) at arraymath.jl:31
*{T}(::AbstractArray{T,N} where N, ::Number) at arraymath.jl:34
... |
At some point in the future, I feel we might start to seriously consider the multi-dimensional (arbitrary tensor contraction) case with 0-dimensional being an obvious case. Among other things, we've been thinking about ways of representing or thinking about "upness" and "downness" of each dimension (vector spaces and their duals) and the method in which
Right, yes in all of the arguments above I am probably asserting a stronger difference between vectors and matrices than in v0.5 and before, and more like where we are heading towards/aspiring for in the future. We might even see deprecating (edit: curiously, this leads to an argument that, from the linear algebra point of view as well as many interfaces like linear indexing, |
You can have a complex vector space without a multiplication algebra, so I don't think that defining It's not clear to me what we gain by deprecating |
I interpreted the vectorization roadmap (knowing full well that this is your roadmap) as suggesting we don't automatically vectorize functions over "unknowns on a grid". In particular, that the user should indicate that specifically with a dot-call. (edited) (by the way, I'm a huge fan of dot-call, both for the fusing and the semantic clarity it brings). |
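As a small illustration of the dot-call style mentioned above (a sketch with made-up variable names, assuming Julia 0.6 or later, where dotted calls and operators broadcast and fuse):

```julia
# Illustrative names only; shows explicit elementwise intent via dot-calls.
x = [0.0, 0.5, 1.0]        # sample values standing in for "unknowns on a grid"

y = sin.(x)                # apply sin to each element, requested explicitly with the dot
z = 2 .* x .+ y .^ 2       # dotted operations in one expression fuse into a single loop

@assert y == map(sin, x)
@assert length(z) == length(x)

# Broadcasting also combines a length-5 vector with a 1 x 5 matrix elementwise,
# expanding both to a common 5 x 5 shape.
v = ones(5)
M = ones(1, 5)
B = v .* M
@assert size(B) == (5, 5)
```

The dot marks the operation as elementwise rather than as linear algebra, which is the semantic clarity the comment refers to.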
@andyferris, "unknowns on a grid" can still form a vector space. e.g. the solutions to a discretized PDE are normally thought of as being in a Hilbert space. Just because a vector space has |
Sorry, I definitely misused the word "tensor" here - 100% agreed. It's good that all arrays are treated like a vector space for |
Just letting everyone know that there are now PRs for both options (restoring this functionality at JuliaLang/julia#20423, or else turning the breaking change into a nice deprecation warning message at JuliaLang/julia#20429). |
I'm a bit torn here but I have to say on the whole I'm with the scruffies (i.e. allow it):
There are certainly correctness counterarguments to be made here, but I think that if we're going to make that change it should be made far more comprehensively than disallowing this one operation. Taking one step in that direction from where we are now – which is generally considering lower dimensional arrays to be identified behaviorally with higher dimensional arrays with trailing singleton dimensions – is a step into inconsistency. An argument can be made for going all the way in the other direction, but I suspect that if we make all those behavioral changes, not only will we break a lot of code that works just fine right now, but we may risk making the language pretty unpleasant for linear algebra and related tasks. |
While If the vector |
The code you need to execute depends rather heavily on the value of |
I'm not honestly suggesting removing the type parameter, merely pointing out that it's kind of part of the inconsistency between a liberal/lazy approach and a more strict approach: is a Vector in R^{n} or in R^{n x 1 x 1 ....}. |
That's why I described the identification as conceptual: we can't actually identify vectors with column matrices, but we can make them behave as similarly, which is effectively behavioral identification – that's as good as we can do in computers. Your argument seems to be that because they are not actually the same, they shouldn't behave the same, which doesn't make sense to me. |
If we want to chalk this up as the same as trailing singleton dimensions, then that seems somewhat logical. Although I think code would be more readable if people didn't use this method, and it could potentially hide bugs, Stefan makes good points that hiding bugs isn't actually that likely in this case, and it is somewhat consistent with some other parts of Julia. (OTOH I have seen code in production with bugs that occur only when some dimension is To be clear, I'd rather use the type parameter While one may make an argument saying that removing trailing singleton indices from Finally, there is a new semantic thing happening if we support |
One way to think of it is that |
The following throws an error, even though a column vector times a 1 x n matrix have consistent shapes.