Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow reduce(cat(dims=4), A), with efficient method for simple cases #37196

Closed
wants to merge 10 commits into from

Conversation

mcabbott
Copy link
Contributor

@mcabbott mcabbott commented Aug 25, 2020

Updated:

This adds an efficient method to do things like reduce((x,y) -> cat(x,y,dims=3), A) for a vector of matrices. This can be called as reduce(cat(dims=3), A), where keyword-only cat now returns a Fix1 function. reduce(cat(dims=1), A) will also dispatch to reduce(vcat, A), etc.

The new internal _typed_cat function handles stacking an M-array of uniform-size N-arrays such that B[I,J] == A[J][I], and can be called by B = reduce(cat, A). The explicit reduce(cat(dims=N+1), A) dispatches to this when M=1. As does reduce(cat(dims=N+2), A) by inserting a trivial dimension, i.e. B[I,1,j] == A[j][I]. For example:

julia> reduce(cat(dims=4), [ones(2,3) for _ in 1:5]) |> size      # M=1 outer, N=2 inner, insert 1 trivial
(2, 3, 1, 5)
  
julia> reduce(cat, [rand(3,5) for μ in 1:7, ν in 0:10]) |> size   # M=N=2, no one fixed dims
(3, 5, 7, 11)

Other concatenations such as reduce(cat(dims=(1,2)), A) will simply proceed pairwise. It would be possible to add a more efficient method, later, if there is a need.

Earlier versions of this PR also added a method for cat(A,B,C) to infer dimensions, but (1) perhaps that's too confusing, and (2) it doesn't fit well with the above reduce, as after the first reduction, ndims has changed, and (3) anyway it doesn't allow anything new to be done easily. So this has been removed. Which means that reduce(cat, A) on anything that isn't a uniform M-array of N-arrays will still be an error.

In the beginning, this said:

This lets you omit the dims keyword in cat, when acting on arrays of consistent ndims. One reason to do so is that reduce((x,y) -> cat(x,y,dims=3), A) can then be written reduce(cat, A), which can have an efficient method like existing reduce(hcat, A). (They agree for a vector of vectors.)

Needs tests, and a less crude implementation than using reduce(hcat, vec.(A)) when ndims(first(A)) > 2. But first, is this a good idea?

@mcabbott
Copy link
Contributor Author

mcabbott commented Sep 4, 2020

The slightly weird feature of reduce(cat, [rand(1,2,3) for _ in 1:4]) is that, without a special method, the first cat(A1, A2) produces a 4-array by inferring dims, but the second application cat(prev, A3) would fail. The existing reduce(hcat, As) doesn't have this issue.

@BioTurboNick
Copy link
Contributor

One question would be whether the natural default should be cat along the highest input dimension, or to add one new dimension?

I can see the argument for the latter, and was initially going to support it. But your second point makes me think the former would be a more consistent API. If someone desired concatenation along the next-highest dimension then, they could reshape the result or reshape at least one of the inputs to have a 1-length extra dimension.

@dkarrasch dkarrasch added the arrays [a, r, r, a, y, s] label Oct 6, 2020
@mcabbott
Copy link
Contributor Author

For a vector of arrays, cat along their ndims not ndims+1 would indeed avoid this weirdness. But would also remove the most obvious use case, of concatenating a lot of matrices into a 3-array, since it would then be identical to reduce(hcat, ...). And the resulting reshaping is a bit of a pain.

With a version of Base.Fix2 holding keywords, something like reduce(cat(dims=3), arrays) could dispatch to an efficient implementation. (xref #36181 maybe.)

How strange would reduce(cat, vec_of_mats; dims=3) be? Compared to reduce(+, matrix; dims), I suppose you should think of dims as being the directions of the output array coming from the operation.

@timholy
Copy link
Member

timholy commented Oct 11, 2020

I worry that some people might also think that this has a different "obvious" behavior:

julia> cat([1,2,3,4], [1,2,3])
ERROR: UndefKeywordError: keyword argument dims not assigned
Stacktrace:
 [1] cat(::Vector{Int64}, ::Vararg{Vector{Int64}, N} where N)
   @ Base ./abstractarray.jl:1725
 [2] top-level scope
   @ REPL[4]:1

Because the two arrays have different lengths, "clearly" I want to concatenate along dimension 1. Yet with this PR I will get the strange error message

ERROR: DimensionMismatch("mismatch in dimension 1 (expected 4 got 3)")

That error message should probably be clarified anyway, but it has to be thrown after the default for dims is assigned and at that point the nature of the problem is no longer obvious unless we pass down lots of state indicate what was and wasn't deliberately set by the user. In contrast, the current error message "keyword argument dims not assigned" is quite clear about the nature of the problem.

So on balance I am against this change.

@mcabbott
Copy link
Contributor Author

Good point, to think about someone who doesn't have vcat / hcat internalised. I guess the method could exclude ndims=1 (since you can use hcat for what this does) which would leave "keyword argument dims not assigned" as the error, but that's one more corner case to know.

Maybe it should not provide a default for cat, only for reduce(cat, ...). Saving one keyword isn't much, but avoiding things like reduce((x,y)->cat(x,y;dims=4), arrays) is what would be more useful.

@timholy
Copy link
Member

timholy commented Oct 11, 2020

I guess the method could exclude ndims=1 (since you can use hcat for what this does) which would leave "keyword argument dims not assigned" as the error, but that's one more corner case to know.

I could just generate a 2-dimensional example of the same thing.

My underlying point is that successful mind-reading usually depends on which mind, in which state, you're trying to read, and the diversity of minds is impressively large.

@mcabbott mcabbott changed the title Default dims=ndims+1 for cat Add efficient reduce(cat, A) which infers dims=ndims+1 Oct 24, 2020
@mcabbott
Copy link
Contributor Author

I agree that some mind-reading is necessary to guess the ways in which people may think something ought to work, despite documented behaviour. (Otherwise it would be enough to give every function a random string of letters as a name.) I am persuaded that letting cat infer dims is probably too confusing, for too little benefit. So I have removed that.

What would still be useful, IMO, is an efficient method like reduce(hcat, A) for higher-dimensional arrays. It seems fairly natural to call this reduce(cat, A) since that doesn't at present do anything, yet does fit the pattern for anyone who knows what reduce(hcat, A) does. But it could be called something else.

If A contains arrays of differing size, the error message is pretty explicit. If it contains arrays of differing ndims, it could be better... perhaps it should explicitly catch that case?

julia> reduce(cat, [ones(2), rand(3)])
ERROR: ArgumentError: expected arrays of consistent size, got (Base.OneTo(3),) for argument 2, compared to (Base.OneTo(2),) for the first

julia> reduce(cat, [ones(2), rand(2,2)])
ERROR: MethodError: no method matching _typed_cat(::Type{Float64}, ::Vector{Array{Float64, N} where N})

@timholy
Copy link
Member

timholy commented Nov 1, 2020

I'd be all in favor of reduce(cat(dims=3), A), where cat(dims=3) would create a Base.Fix2.

@mcabbott
Copy link
Contributor Author

mcabbott commented Nov 5, 2020

Well the crude solution is

@eval Base begin 
    cat(; dims) = Fix1(_cat, dims)
    (f::Fix1)(ys...) = f.f(f.x, ys...)
end

reduce(cat(dims=4), [fill(i, 4,3,2) for i in 1:5])   # this PR's case, done inefficiently

reduce(cat(dims=(2,3,4)), [fill(i, 4,3,2) for i in 1:5])

Presumably an extension of Fix1 should be done with a bit more thought, ref #36181 again.

While the first of these could be done by the efficient _typed_cat of this PR, the ability to accept dims sort-of implies that things like the second example should work. Which means that a more complicated function would be needed. And also that it can't be type-stable.

Adding such things later would not conflict with the simple reduce(cat, A) here.

@timholy
Copy link
Member

timholy commented Nov 5, 2020

Perhaps not in a technical sense, but if we're going to add a mechanism to explicitly specify the dimension, why add a somewhat-arbitrary default choice at all? Currently dims is a required argument for cat (no default is provided), why should it be different for reduce(cat, ...)?

@mcabbott
Copy link
Contributor Author

mcabbott commented Nov 5, 2020

I guess we differ on how arbitrary this is. In my mind:

  • There is exactly one dense packing which follows from column-major storage, i.e. a preferred vec(out) order, with no extra zeros introduced.
  • Reshaping this result to an Array{T, N+M} which keeps all the the dimensions of the contents + container is also the canonical choice, for arrays of uniform size. The result has combined[I,J] == separate[J][I] with I,J appropriate CartesianIndices for inner / outer arrays. If any axes have offsets, then this can simply keep all of them, without making choices about how to combine any.
  • Wanting to do exactly this is a pretty common request. In languages which don't distinguish between an N-dim array and an array-of-arrays, these are the two things being identified. (Or would be if they were column-major.)

Concatenating arrays of varying size has to be done along the dimension which varies, and that can't be seen in the type, and often won't result in this simple storage packing. This has to be specified. But wanting to concatenate many 3-arrays along (say) a varying-sized 2nd dimension is a pretty rare request.

Maybe concatenating many diagonal blocks into a matrix dims=(1,2) is more common, I don't know; it's inherently pretty wasteful to make all those zeros, compared to keeping track of the block structure. There's no really easy way to use cat to convert [fill(i+j, 2,2) for i in 1:2, j in 1:3] from a block matrix into a Matrix, whatever does that would presumably also be the natural way to collect Diagonal([trues(2,2) for _ in 1:3]) similarly.

@BioTurboNick
Copy link
Contributor

BioTurboNick commented Nov 5, 2020

It seems like the fundamental issue, @mcabbott, is that to achieve what you want, you really need a preliminary step to figure out the dimensions, which reduce isn't meant to do. Do you agree with that?

I'm not sure you can get away without a preparatory step that either determines the dimensions to concatenate along, and so can be explicitly partially applied to the cat, or by reshaping the inputs so that they have the dimensionality you intend the output to have, where cat can infer the highest dimension uniformly in each call (if it was given that default).

@mcabbott
Copy link
Contributor Author

mcabbott commented Nov 5, 2020

I'm not sure I follow about there being multiple steps, sorry.

The operation performed by _typed_cat here seems useful and is, in the sense I tried to describe, the canonical map from a uniform array of arrays, to one column-major array. It keeps all the dimensions with their existing lengths. As mentioned above, it doesn't have to be spelled any variant of reduce(..., and other suggestions would be welcome. For a vector of vectors, it agrees with reduce(hcat, A), which inspired this suggestion. 


There is a zoo of other possible ways to map many arrays into one. Some of them are covered by the existing cat function, and would also be covered by my hacky reduce(cat(dims=(4,5)), A) above. A more efficient implementation of that could no doubt be written.

In numpy, the function stack is (by default) the row-major version of this canonical map:


>>> arrays = [np.random.randn(3, 5, 7) for _ in range(11)]
>>> np.stack(arrays).shape  # default is axis=0
(11, 3, 5, 7)
>>> np.stack(arrays, axis=1).shape
(3, 11, 5, 7)
>>> np.stack(arrays, axis=3).shape
(3, 5, 7, 11)

Its behaviour with non-default keywords is permutedims of the standard one — all dimensions are still preserved, but they appear in a different order. This cat doesn’t do.

@mcabbott
Copy link
Contributor Author

mcabbott commented Nov 7, 2020

I begin to think maybe the Fix1 thing should be made official. I've added a slightly less hacky version (which still needs tests), which dispatches to one of the efficient methods when possible. And a docstring for cat(; dims) which perhaps does a better job of explaining.

Edit -- this causes @test cat(; dims=1) == Any[] to fail. Not sure how seriously that ought to be taken, if it can't be changed, then reduce(cat(dims=3), A) isn't possible.

@mcabbott mcabbott closed this Nov 7, 2020
@mcabbott mcabbott reopened this Nov 7, 2020
@timholy
Copy link
Member

timholy commented Nov 7, 2020

This mostly looks good. I'd be willing to see the result of @test cat(; dims=1) == Any[] change.

It still contains the mind-reading, though. Since you really seem to want that, I think this needs to be discussed at triage.

@timholy timholy added the triage This should be discussed on a triage call label Nov 7, 2020
@mcabbott mcabbott changed the title Add efficient reduce(cat, A) which infers dims=ndims+1 Allow reduce(cat(dims=4), A), with efficient method for simple cases Nov 9, 2020
@mcabbott
Copy link
Contributor Author

mcabbott commented Nov 9, 2020

OK, sounds good. I've updated the first post (and title) to hopefully make things clearer to anyone tuning in now.

@mcabbott
Copy link
Contributor Author

To bump this, here's an example from a few days ago (on slack). The outer container is 2-dimensional, and reduce(cat, a) is exactly the desired behaviour, preserving all 3 dimensions, while cat(a..., dims=3) flattens the container.

How a matrix of vectors can be converted to a 3D array?
In Mathematica, I'm used to have Lists of Lists (of Lists...), which can be treated just as multidimensional arrays. In Julia, I can use cat to convert a Vector of arrays into an array of +1 dimensions. But is there a simple (or correct) solution to convert a multidimensional array of arrays in a higher-dimensional array? The solution I use now seems to me quite awkward (shown here on example of a matrix of vectors):

julia> b = [1,2,3]
3-element Array{Int64,1}:
 1
 2
 3

julia> a = [[b] [2b]; [3b] [4b]]
2×2 Array{Array{Int64,1},2}:
 [1, 2, 3]  [2, 4, 6]
 [3, 6, 9]  [4, 8, 12]

julia> с = reshape(vcat(a[:]...), (size(b)..., size(a)...))
3×2×2 Array{Int64,3}:
[:, :, 1] =
 1  3
 2  6
 3  9

[:, :, 2] =
 2   4
 4   8
 6  12

I would expect something like cat(a..., dims=3) or cat(a, dims=3) to work, but of course, the former looses the 2D structure of a , and the latter gives me a 3D array of vectors.

Just checked in Python, with numpy it's as simple as wrapping a in array

In [38]: b = np.array([1, 2,3])

In [39]: a=[[b, 2*b], [3*b, 4*b]]

In [40]: c=np.array(a)

In [41]: c
Out[41]:
array([[[ 1,  2,  3],
        [ 2,  4,  6]],
       [[ 3,  6,  9],
        [ 4,  8, 12]]])

Python's a is a vector of vectors of vectors, but it blurs this distinction a bit, c[0] == c[0] is a matrix slice.

@timholy
Copy link
Member

timholy commented Dec 11, 2020

Did this ever get discussed at triage?

Keep in mind that my only objection is the mind-reading---I actively dislike it, but I wouldn't stand in the way if it's popular with a wider audience. But with just the two of us discussing this, it's hard to know how others feel.

If you split the mind-reading into a separate PR, I think we can merge the rest without needing triage. Then that part can get discussion on its own schedule.

@simeonschaub
Copy link
Member

I am a bit worried about changing the behavior of 0-argument cat, since it seems to be not uncommon to splat into cat. (https://juliahub.com/ui/RepoSearch?q=%5B%5Cs%5C%28%5Dcat%5C%28%5Cw%2B%5C.%5C.%5C.&r=true) Perhaps returning Any[] is pretty useless in a lot of these cases already, but I could certainly imagine code relying on this. See also the discussion in #35293

@BioTurboNick
Copy link
Contributor

BioTurboNick commented Dec 11, 2020

How a matrix of vectors can be converted to a 3D array?
In Mathematica, I'm used to have Lists of Lists (of Lists...), which can be treated just as multidimensional arrays. In Julia, I can use cat to convert a Vector of arrays into an array of +1 dimensions. But is there a simple (or correct) solution to convert a multidimensional array of arrays in a higher-dimensional array? The solution I use now seems to me quite awkward (shown here on example of a matrix of vectors):

julia> b = [1,2,3]
3-element Array{Int64,1}:
 1
 2
 3

julia> a = [[b] [2b]; [3b] [4b]]
2×2 Array{Array{Int64,1},2}:
 [1, 2, 3]  [2, 4, 6]
 [3, 6, 9]  [4, 8, 12]

julia> с = reshape(vcat(a[:]...), (size(b)..., size(a)...))
3×2×2 Array{Int64,3}:
[:, :, 1] =
 1  3
 2  6
 3  9

[:, :, 2] =
 2   4
 4   8
 6  12

Worth mentioning that this operation is slightly less awkward with my PR #33697

hvncat((1, size(a)...), a...)

That's because it infers how big the final array should be from the array elements.

Is there a technical term for "promoting" an array of arrays to a single multdimensional array by adding a dimension? A function with an appropriate name could map to the above.

EDIT: For lack of something better, upcat(a::AbstractArray{T}) where T <: AbstractArray = hvncat((1, size(a)...), a...)

@simeonschaub
Copy link
Member

We didn't really have time to talk about this very thoroughly last triage, but one thing that was proposed was maybe introducing a functor Cat(dims::Int) that could be used here instead.

@mcabbott
Copy link
Contributor Author

OK, that would avoid messing with zero-arg cat(; dims). So it would look something like this:

reduce(Cat(4), [ones(2,3) for _ in 1:5])           # size (2, 3, 1, 5)   # M=1 outer, N=2 inner
reduce(Cat(), [rand(3,5) for μ in 1:7, ν in 0:10]) # size (3, 5, 7, 11)  # M = N = 2

Maybe another possibility is to consider cat to be, like max, the function which acts on elements. And to give some other name to the function which, like maximum, acts on collections, without explicitly mentioning reduce.

concatenate([ones(2,3) for _ in 1:5]; dims=4)

Then it would make sense for this also to be used instead of reduce(hcat, A). Seems like a bigger change somehow.

A third idea might be to pass dims to reduce instead. It sometimes accepts this keyword already, e.g. reduce(+, rand(2,3); dims=1) is sum. Is this weird or conflicting?

reduce(cat, [ones(2,3) for _ in 1:5]; dims=4)           # size (2, 3, 1, 5), perhaps
reduce(+, [rand(3,5) for μ in 1:7, ν in 0:10]; dims=1)  # size (7, 1), containing arrays (3, 5)

@JeffBezanson
Copy link
Member

JeffBezanson commented Jan 7, 2021

I think such methods of reduce should only be optimizations, and not change the result. I also somewhat dislike the "mind-reading" default.

@mbauman
Copy link
Member

mbauman commented Jan 7, 2021

I was initially in favor of a default "higher dimension if not specified" rule (and didn't really think of it as mind reading), but I'd disagree with the implementation here. That is, I'd expect the following to be of size (3, 5, 77).

reduce(Cat(), [rand(3,5) for μ in 1:7, ν in 0:10]) # size (3, 5, 7, 11)  # M = N = 2

On triage, others thought that it'd automatically append on one of the existing dimensions... so perhaps this is indeed mind reading.

@mbauman mbauman removed the triage This should be discussed on a triage call label Jan 7, 2021
@mcabbott
Copy link
Contributor Author

mcabbott commented Apr 15, 2021

Thanks all for looking. I guess this is evidence that tying this to the name cat is too confusing.

But the functionality issue remains: combining a vector of N-dimensional arrays into one N+1 array (or worse, and M-dimensional container into an M+N-array) is at present quite awkward. Except for N==M==1, when reduce(hcat, ...) works.

For inspiration, #32310 will allow eachslice to produce precisely such arbitrary-dimensional arrays of arrays, e.g. A = eachslice(rand(3,4,5,6), dims=(3,4)) is N==M==2. The operation proposed here is the inverse of that. Maybe it should be something like allslices(As) with dims=(3,4) as the default.

Or collect(allslices(As)) if, like JuliennedArrays & #32310, this allslices were to be lazy. That would also move it further from stepping on cat's toes -- reduce(hcat, [rand(3) for _ in 1:4]) would be the eager cousin of allslices([rand(3) for _ in 1:4]).

Edit:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s]
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants