Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Containers] support ArrayInterface.jl traits #3214

Closed
ParadaCarleton opened this issue Feb 10, 2023 · 57 comments · Fixed by #3413
Closed

[Containers] support ArrayInterface.jl traits #3214

ParadaCarleton opened this issue Feb 10, 2023 · 57 comments · Fixed by #3413
Labels
Category: Containers Related to the Containers submodule

Comments

@ParadaCarleton
Copy link

As far as I'm aware, JuMP's containers/labeled arrays aren't used outside of JuMP itself. This can cause a lot of problems and incompatibilities when trying to work with data from packages other than JuMP.

DimensionalData.jl is more widely-used, e.g. throughout the Julia Geostats ecosystem and in ArviZ.jl. Would it make sense to switch JuMP's containers out with DimensionalData.jl DimArrays for compatibility?

@jd-foster
Copy link
Collaborator

I guess the short answer is that the principle in JuMP / MathOptInterface is to try to keep external dependencies to a minimum. In this regard, there seems to be small gains from "compatibility" in the abstract considering that the required functionality is fully implemented within the package with an AbstractArray interface.

@blegat
Copy link
Member

blegat commented Feb 10, 2023

What we need is that the array directly generalizes Array so that the user does not have to know about whether the container Array or DenseAxisArray or SparseAxisArray.
This is what prevents us to use a Dict for SpaceAxisArray, because a Dict does not just behave like an Array with custom indices with missing entries. What we want there is really like a SparseArray with custom indexing.
The issue with AxisArrays is that if you pass integer indices, it assumes that you are directly indexing the underlying array:

julia> using AxisArrays

julia> A = AxisArray(1:2; i = 2:3)
1-dimensional AxisArray{Int64,1,...} with axes:
    :i, 2:3
And data, a 2-element UnitRange{Int64}:
 1
 2

julia> A[2]
2

We would need A[2] to be 1 here.
There is the same issue with DimensionalData:

julia> A = ones(X(2:3), Y(3:4))
2×2 DimArray{Float64,2} with dimensions: 
  X Sampled{Int64} 2:3 ForwardOrdered Regular Points,
  Y Sampled{Int64} 3:4 ForwardOrdered Regular Points
    3    4
 2  1.0  1.0
 3  1.0  1.0

julia> A[2, 3]
ERROR: BoundsError: attempt to access 2×2 Matrix{Float64} at index [2, 3]
Stacktrace:
 [1] getindex
   @ ./array.jl:925 [inlined]
 [2] getindex(::DimArray{Float64, 2, Tuple{X{DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Y{DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}, Tuple{}, Matrix{Float64}, DimensionalData.NoName, DimensionalData.Dimensions.LookupArrays.NoMetadata}, ::Int64, ::Int64)
   @ DimensionalData ~/.julia/dev/DimensionalData/src/array/indexing.jl:6
 [3] top-level scope
   @ REPL[10]:1

julia> A[At(2), At(3)]
1.0

Here, it seems you need to use At to get the behavior we would want by default.

If there was a package that would define arrays that we can use without having to change the syntax to our users, I would be in favor of using it but there is none at the moment as far as I know which is why we have to define our own containers.

@odow odow added Category: Containers Related to the Containers submodule breaking labels Feb 10, 2023
@odow
Copy link
Member

odow commented Feb 10, 2023

The largest reason against doing this (for the default behavior) is that it would be breaking.

However, adding support for @variable(model, x[2:3, 3:4], container = DimArray) is do-able in an extension package in the upcoming Julia v1.9.

@mlubin
Copy link
Member

mlubin commented Feb 11, 2023

Just for some historical context, we did evaluate AxisArrays as a replacement for JuMP's custom containers. The indexing issue that @blegat described was a blocker: JuliaArrays/AxisArrays.jl#117.

@metab0t
Copy link
Contributor

metab0t commented Feb 11, 2023

The extension method is an attractive approach to support custom container type.
@odow What are the functions we should specialize in extension to support multiple container types?

@odow
Copy link
Member

odow commented Feb 11, 2023

@ParadaCarleton
Copy link
Author

Just for some historical context, we did evaluate AxisArrays as a replacement for JuMP's custom containers. The indexing issue that @blegat described was a blocker: JuliaArrays/AxisArrays.jl#117.

I think this is a reasonable concern. To deal with this, could we just wrap a DimArray around an OffsetArray?

@odow
Copy link
Member

odow commented Feb 12, 2023

To deal with this, could we just wrap a DimArray around an OffsetArray?

I'm not sure I understand how this would work. JuMP's DenseAxisArray can use arbitrary indices of any Julia type. For example:

julia> model = Model();

julia> a, b = [1, 2], [3, 4]
([1, 2], [3, 4])

julia> @variable(model, x[1:2, [:a, :b], ["c", "d"], [a, b]])
4-dimensional DenseAxisArray{VariableRef,4,...} with index sets:
    Dimension 1, Base.OneTo(2)
    Dimension 2, [:a, :b]
    Dimension 3, ["c", "d"]
    Dimension 4, [[1, 2], [3, 4]]
And data, a 2×2×2×2 Array{VariableRef, 4}:
[:, :, "c", [1, 2]] =
 x[1,a,c,[1, 2]]  x[1,b,c,[1, 2]]
 x[2,a,c,[1, 2]]  x[2,b,c,[1, 2]]

[:, :, "d", [1, 2]] =
 x[1,a,d,[1, 2]]  x[1,b,d,[1, 2]]
 x[2,a,d,[1, 2]]  x[2,b,d,[1, 2]]

[:, :, "c", [3, 4]] =
 x[1,a,c,[3, 4]]  x[1,b,c,[3, 4]]
 x[2,a,c,[3, 4]]  x[2,b,c,[3, 4]]

[:, :, "d", [3, 4]] =
 x[1,a,d,[3, 4]]  x[1,b,d,[3, 4]]
 x[2,a,d,[3, 4]]  x[2,b,d,[3, 4]]

julia> x[2, :a, "d", b]
x[2,a,d,[3, 4]]

Since there's a public interface to defining new container types (#3214 (comment)), I think the first step is to demonstrate what it would look like in a separate package. I haven't looked at the details, but I assume the necessary code is something like:

function Containers.container(f::Function, indices, ::Type{DimensionalData.DimArray}, names)
    data = [f(i...) for i in indices]
    return DimensionalData.DimArray(data, names)
end

@ParadaCarleton
Copy link
Author

ParadaCarleton commented Feb 12, 2023

I'm not sure I understand how this would work. JuMP's DenseAxisArray can use arbitrary indices of any Julia type. For example:

Right, so what I mean is that the container macro would create a DimArray wrapped around an OffsetArray. If the user creates an index with unusual integer indices, e.g. -12:100 or 0:2:100, the indices of the underlying array will be offset to match the requested labels. Then, indexing with x[-12] will work as usual, since the integer indices still refer to the right thing.

@odow
Copy link
Member

odow commented Feb 12, 2023

The axis could also be 3:-1:1, or [1, 5, 7, 2], so I don't think this works in general?

@ParadaCarleton
Copy link
Author

The axis could also be 3:-1:1, or [1, 5, 7, 2], so I don't think this works in general?

Hmm, I know OffsetArrays handles the first case, but I don't know if it handles the second. If not, it's possible we could modify DimensionalData.jl to fix this.

@odow
Copy link
Member

odow commented Feb 12, 2023

If not, it's possible we could modify DimensionalData.jl to fix this.

Potentially. But it's another argument to leave JuMP's defaults alone and work on the JuMP<->DimensionalData interface in a separate opt-in package.

@rafaqz
Copy link

rafaqz commented Feb 13, 2023

1.9 should make this fairly easy to do.

But I wasn't aware you had an in-house axis array implementation here. Its a bit tragic how much of this work we have duplicated in the last few years, although I understand the desire to keep things in-house for JuMP. FWIW AbstractDimArray (in DimensionalData.jl) is extensible and could have custom JuMP specific logic added without having to implement everything here.

There are also some efforts at ArrayInterface.jl to add a compatability layer for these array labelling packages. It may be that we can add a weak dependency for that instead and make things generic so e.g. AxisKeys.jl will work as well.

@odow DimensionalData.jl lookups can contain any julia objects, 3:-1:1, or [1, 5, 7, 2] are no problem at all. The vector will be classified as Irregular Unordered and will use findfirst rather than searchsortedfirst when indexing.

@odow
Copy link
Member

odow commented Feb 13, 2023

Its a bit tragic how much of this work we have duplicated in the last few years, although I understand the desire to keep things in-house for JuMP

If you go spelunking in the source code, you'll find that JuMP has supported something similar for 10(!!) years:
79570c4

We developed in-house because at the start there were no other options, and then we never switched to alternatives because we had something that met our needs.

There are also some efforts at ArrayInterface.jl to add a compatability layer for these array labelling packages. It may be that we can add a weak dependency for that instead and make things generic so e.g. AxisKeys.jl will work as well.

Yes. I think the right approach is for people interested in this to develop a proof-of-concept package, and then once 1.9 is released, we could consider adding it as an extension package.

@rafaqz
Copy link

rafaqz commented Feb 13, 2023

Oh wow ten years! you may not have duplicated orhers efforts then, but others have certainly duplicated yours.

Some kind of traits interface may be best, as you won't want to fix something that's not broken.

The problem is it's hard and boring work no one has time for - it's not something I will do myself.

Probably someone will need to get paid to do it, maybe a numfocus grant "Compatability for named dimension objects in Julia".

@odow
Copy link
Member

odow commented Feb 13, 2023

Aside from it being nice to have JuMP and other packages share the underlying functionality, are there any current issues that prevent JuMP from being used with one of the other array packages? I've never seen an issue on GitHub or a post on the forum with someone complaining.

@ParadaCarleton
Copy link
Author

I've never seen an issue on GitHub or a post on the forum with someone complaining.

You're in one right now 😄 This post was inspired by my annoyance when DimArrays didn't "Just work" the way I expect for most Julia stuff, and also being forced to learn a completely different interface from the one I'm used to for array labeling.

@odow
Copy link
Member

odow commented Feb 14, 2023

Define not work though. What did you try? What did you want to happen?

@odow
Copy link
Member

odow commented Feb 14, 2023

This seems to be working as expected:

julia> using JuMP, DimensionalData

julia> model = Model();

julia> @variable(model, x1[1:3,1:2])
3×2 Matrix{VariableRef}:
 x1[1,1]  x1[1,2]
 x1[2,1]  x1[2,2]
 x1[3,1]  x1[3,2]

julia> y1 = DimArray(x1, (:a, :b))
3×2 DimArray{VariableRef,2} with dimensions: Dim{:a}, Dim{:b}
 x1[1,1]  x1[1,2]
 x1[2,1]  x1[2,2]
 x1[3,1]  x1[3,2]

julia> y1[a=1, b=2]
x1[1,2]

julia> @variable(model, x2[2:3,[:b, :c]])
2-dimensional DenseAxisArray{VariableRef,2,...} with index sets:
    Dimension 1, 2:3
    Dimension 2, [:b, :c]
And data, a 2×2 Matrix{VariableRef}:
 x2[2,b]  x2[2,c]
 x2[3,b]  x2[3,c]

julia> y2 = DimArray(Array(x2), (i = 2:3, j = [:b, :c]))
2×2 DimArray{VariableRef,2} with dimensions: 
  Dim{:i} Sampled{Int64} 2:3 ForwardOrdered Regular Points,
  Dim{:j} Categorical{Symbol} Symbol[b, c] ForwardOrdered
    :b       :c
 2  x2[2,b]  x2[2,c]
 3  x2[3,b]  x2[3,c]

julia> y2[i=2, j=:b]
ERROR: ArgumentError: Invalid index `:b`. Did you mean `At(:b)`? Use stardard indices, `Selector`s, or `Val` for compile-time `At`.
Stacktrace:
 [1] selectindices
   @ ~/.julia/packages/DimensionalData/imYoU/src/LookupArrays/selector.jl:902 [inlined]
 [2] _dims2indices
   @ ~/.julia/packages/DimensionalData/imYoU/src/Dimensions/indexing.jl:114 [inlined]
 [3] macro expansion
   @ ~/.julia/packages/DimensionalData/imYoU/src/Dimensions/indexing.jl:56 [inlined]
 [4] _dims2indices(lookups::Tuple{DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}, DimensionalData.Dimensions.LookupArrays.Categorical{Symbol, Vector{Symbol}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, dims::Tuple{Dim{:i, DimensionalData.Dimensions.LookupArrays.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Int64}, DimensionalData.Dimensions.LookupArrays.Points, DimensionalData.Dimensions.LookupArrays.NoMetadata}}, Dim{:j, DimensionalData.Dimensions.LookupArrays.Categorical{Symbol, Vector{Symbol}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}, I::Tuple{Dim{:i, Int64}, Dim{:j, Symbol}})
   @ DimensionalData.Dimensions ~/.julia/packages/DimensionalData/imYoU/src/Dimensions/indexing.jl:56
 [5] dims2indices
   @ ~/.julia/packages/DimensionalData/imYoU/src/Dimensions/indexing.jl:51 [inlined]
 [6] dims2indices
   @ ~/.julia/packages/DimensionalData/imYoU/src/Dimensions/indexing.jl:34 [inlined]
 [7] #getindex#39
   @ ~/.julia/packages/DimensionalData/imYoU/src/array/indexing.jl:49 [inlined]
 [8] top-level scope
   @ REPL[57]:1

julia> y2[i=2, j=At(:b)]
x2[3,b]

julia> y2[i=At(2), j=At(:b)]
x2[2,b]

The mis-matched indexing if an axis is an AbstractVector{Int} is a foot-gun though.

@rafaqz
Copy link

rafaqz commented Feb 14, 2023

That's the reason for At being a selector like Near... so that regular indexing and lookups are always distinct operations. Do your arrays just not have regular Int indexing? I would suggest any indexing with Int not being regular julia indexing is a foot-gun ;)

But it would be nice if the constructors of DimArray and DenseAxisArray and e.g. KeyedArray from AxisKeys.jl could all get the lookups from the other array type, there is no reason to pass in the lookup to DimArray if the parent array already has them.

Thats probably what we can do with ArrayInterface.jl.

@ParadaCarleton
Copy link
Author

ParadaCarleton commented Feb 14, 2023

Define not work though. What did you try? What did you want to happen?

In order:

  1. Tried to use a DimArray as the constraint matrix for a linear optimization problem, while also trying to make the vector a solution by using container=DimArray. Got obscure errors about how the shapes of the array and vectors didn't match, because according to the error messages, a matrix with a size of n * n and a vector of size n have incompatible sizes (clearly wrong).
  2. When I managed to figure out it wasn't working because JuMP.jl doesn't let me use DimArrays as containers for variables/solutions, I tried using in combination with Containers.
  3. Tried using containers with a regular matrix (failed, gave up on figuring out if they worked).
  4. Tried using just containers (failed, had trouble figuring out how they worked, decided it wasn't worth the trouble).
  5. Gave up on labeling entirely and just checked everything was right and lined up by looking at my code really hard.

I want to note that none of this is because JuMP containers are unusually complicated, poorly-designed, or badly-documented. It's because they're not what I'm used to, and aren't standardized throughout the rest of the ecosystem. No matter how well-written a piece of documentation is, it still takes time to read it, and spending more time trying to figure this out just wasn't worth it. This is why we standardize things. For something as basic as array-labeling, we could probably learn from Python's "there should be one--and only one--way to do it."

@odow
Copy link
Member

odow commented Feb 14, 2023

Tried to use a DimArray as the constraint matrix for a linear optimization problem
trying to make the vector a solution by using container=DimArray.

Do you have code samples? I don't really understand what you're doing.

I can see that this would be a problem:

julia> using JuMP, DimensionalData

julia> model = Model();

julia> @variable(model, x[1:3,1:2])
3×2 Matrix{VariableRef}:
 x[1,1]  x[1,2]
 x[2,1]  x[2,2]
 x[3,1]  x[3,2]

julia> y = DimArray(x, (:a, :b))
3×2 DimArray{VariableRef,2} with dimensions: Dim{:a}, Dim{:b}
 x[1,1]  x[1,2]
 x[2,1]  x[2,2]
 x[3,1]  x[3,2]

julia> @constraint(model, y[a=At(1), b=At(2)] <= 1)
ERROR: LoadError: Unexpected assignment in expression `sum(y[a = (:), b = At(2)])`.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] _rewrite(vectorized::Bool, minus::Bool, inner_factor::Expr, current_sum::Nothing, left_factors::Vector{Any}, right_factors::Vector{Any}, new_var::Symbol)
    @ MutableArithmetics ~/.julia/packages/MutableArithmetics/Hf30J/src/rewrite.jl:706
  [3] _rewrite(vectorized::Bool, minus::Bool, inner_factor::Expr, current_sum::Nothing, left_factors::Vector{Any}, right_factors::Vector{Any})
    @ MutableArithmetics ~/.julia/packages/MutableArithmetics/Hf30J/src/rewrite.jl:462
  [4] _rewrite(vectorized::Bool, minus::Bool, inner_factor::Expr, current_sum::Nothing, left_factors::Vector{Any}, right_factors::Vector{Any}, new_var::Symbol)
    @ MutableArithmetics ~/.julia/packages/MutableArithmetics/Hf30J/src/rewrite.jl:488
  [5] _rewrite
    @ ~/.julia/packages/MutableArithmetics/Hf30J/src/rewrite.jl:462 [inlined]
  [6] rewrite_and_return(expr::Expr; move_factors_into_sums::Bool)
    @ MutableArithmetics ~/.julia/packages/MutableArithmetics/Hf30J/src/rewrite.jl:315
  [7] rewrite_and_return
    @ ~/.julia/packages/MutableArithmetics/Hf30J/src/rewrite.jl:314 [inlined]
  [8] rewrite(x::Expr; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ MutableArithmetics ~/.julia/packages/MutableArithmetics/Hf30J/src/rewrite.jl:301
  [9] rewrite
    @ ~/.julia/packages/MutableArithmetics/Hf30J/src/rewrite.jl:300 [inlined]
 [10] parse_constraint_call(_error::Function, vectorized::Bool, operator::Val{:<=}, lhs::Expr, rhs::Int64)
    @ JuMP ~/.julia/packages/JuMP/yYfHy/src/macros.jl:510
 [11] parse_constraint_head(::Function, ::Val{:call}, ::Symbol, ::Expr, ::Int64)
    @ JuMP ~/.julia/packages/JuMP/yYfHy/src/macros.jl:362
 [12] parse_constraint
    @ ~/.julia/packages/JuMP/yYfHy/src/macros.jl:303 [inlined]

We're trying to rewrite statements inside a Ref.

@ParadaCarleton
Copy link
Author

Do you have code samples? I don't really understand what you're doing.

I don't; as mentioned, I rewrote everything to just use numeric indexing. I didn't understand what I was doing either, thus the thread 😅

@rafaqz
Copy link

rafaqz commented Feb 14, 2023

@ParadaCarleton but it would be nice to write it again so we have an MWE to discuss.

We need to understand exactly what is hard or confusing or this issue will likely not progress.

@odow
Copy link
Member

odow commented Feb 14, 2023

We'll fix the macros so that the syntax x[i=At(2)] works: jump-dev/MutableArithmetics.jl#197.

Other than that, I don't know what actionable items there are from here. The syntax to convert a JuMP container to a DimArray works well-enough.

@odow
Copy link
Member

odow commented Feb 15, 2023

With jump-dev/MutableArithmetics.jl#197, this now works:

julia> using JuMP, DimensionalData

julia> model = Model();

julia> @variable(model, x[1:3,1:2])
3×2 Matrix{VariableRef}:
 x[1,1]  x[1,2]
 x[2,1]  x[2,2]
 x[3,1]  x[3,2]

julia> y = DimArray(x, (:a, :b))
3×2 DimArray{VariableRef,2} with dimensions: Dim{:a}, Dim{:b}
 x[1,1]  x[1,2]
 x[2,1]  x[2,2]
 x[3,1]  x[3,2]

julia> @constraint(model, sum(y[a=At(i), b=At(2)] for i in 1:3) <= 1)
x[1,2] + x[2,2] + x[3,2]  1.0

and also something like

julia> using JuMP, DimensionalData

julia> model = Model();

julia> @variable(model, x[2:4,["a", "b"]])
2-dimensional DenseAxisArray{VariableRef,2,...} with index sets:
    Dimension 1, 2:4
    Dimension 2, ["a", "b"]
And data, a 3×2 Matrix{VariableRef}:
 x[2,a]  x[2,b]
 x[3,a]  x[3,b]
 x[4,a]  x[4,b]

julia> y = DimArray(Array(x), NamedTuple(i => j for (i, j) in zip((:a, :b), axes(x))))
3×2 DimArray{VariableRef,2} with dimensions: 
  Dim{:a} Sampled{Int64} 2:4 ForwardOrdered Regular Points,
  Dim{:b} Categorical{String} String[a, b] ForwardOrdered
    "a"     "b"
 2  x[2,a]  x[2,b]
 3  x[3,a]  x[3,b]
 4  x[4,a]  x[4,b]

julia> @constraint(model, sum(y[a=At(i), b=At("a")] for i in 2:4) <= 1)
x[2,a] + x[3,a] + x[4,a]  1.0

and integrating the container is a two line function:

julia> using JuMP, DimensionalData

julia> function JuMP.Containers.container(
           f::Function, 
           indices::JuMP.Containers.VectorizedProductIterator,
           c::Type{DimensionalData.DimArray},
           names,
       )
           dims = NamedTuple(i => j for (i, j) in zip(names, indices.prod.iterators))
           return DimensionalData.DimArray(map(i -> f(i...), indices), dims)
       end

julia> model = Model();

julia> @variable(model, x[i=2:4, j=["a", "b"]], container = DimArray)
3×2 DimArray{VariableRef,2} with dimensions: 
  Dim{:i} Sampled{Int64} 2:4 ForwardOrdered Regular Points,
  Dim{:j} Categorical{String} String[a, b] ForwardOrdered
    "a"     "b"
 2  x[2,a]  x[2,b]
 3  x[3,a]  x[3,b]
 4  x[4,a]  x[4,b]

julia> @variable(model, y[i=1:3, j=1:2], container = DimArray)
3×2 DimArray{VariableRef,2} with dimensions: 
  Dim{:i} Sampled{Int64} Base.OneTo(3) ForwardOrdered Regular Points,
  Dim{:j} Sampled{Int64} Base.OneTo(2) ForwardOrdered Regular Points
    1        2
 1   y[1,1]   y[1,2]
 2   y[2,1]   y[2,2]
 3   y[3,1]   y[3,2]

julia> @variable(model, z[i=1:3, j=i:2], container = DimArray)
ERROR: Unable to build a container with the provided type DimArray. Implement `Containers.container(::Function, indices, ::Type{DimArray})`.
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] container(#unused#::Function, #unused#::JuMP.Containers.NestedIterator{Tuple{var"#10#13", var"#11#14"}, JuMP.Containers.var"#36#38"}, D::Type)
   @ JuMP.Containers ~/.julia/packages/JuMP/H4UGR/src/Containers/container.jl:185
 [3] container(f::Function, indices::JuMP.Containers.NestedIterator{Tuple{var"#10#13", var"#11#14"}, JuMP.Containers.var"#36#38"}, D::Type, names::Vector{Any})
   @ JuMP.Containers ~/.julia/packages/JuMP/H4UGR/src/Containers/container.jl:71
 [4] macro expansion
   @ ~/.julia/packages/JuMP/H4UGR/src/macros.jl:136 [inlined]
 [5] top-level scope
   @ REPL[6]:1

julia> @variable(model, w[1:3, j=1:2], container = DimArray)
3×2 DimArray{VariableRef,2} with dimensions: 
  Dim{:##285} Sampled{Int64} Base.OneTo(3) ForwardOrdered Regular Points,
  Dim{:j} Sampled{Int64} Base.OneTo(2) ForwardOrdered Regular Points
    1        2
 1   w[1,1]   w[1,2]
 2   w[2,1]   w[2,2]
 3   w[3,1]   w[3,2]

Although in practice you might want to provide a more informative error, and catch any containers with missing names (like w).

@odow
Copy link
Member

odow commented Feb 24, 2023

There are a few more of these packages

Updated.

Just make it easy for people to use whatever they want I guess.

👍 If anyone can identify concrete examples where JuMP fails to work (like the indexing issue with DimensionalData) then we'll fix that.

cc @oxinabox since they have been on a multi-year crusade to fix this wrong.

@oxinabox
Copy link

oxinabox commented Mar 1, 2023

cc @BSnelling who had JuMP working on AxisKeys at one point

@odow
Copy link
Member

odow commented Mar 13, 2023

It looks like @BSnelling's solution was to convert everything to a DenseAxisArray when working with JuMP:

https://github.com/invenia/FullNetworkModels.jl/blob/b5b2cfbf6cb05f5ea58b74efcc7142bdb88af534/src/utils/internal.jl#L97-L99

https://github.com/invenia/FullNetworkModels.jl/blob/b5b2cfbf6cb05f5ea58b74efcc7142bdb88af534/src/model/general/constraints.jl#L50-L57

I'll wait another few days and then close, unless someone has more comments.

@BSnelling
Copy link

It looks like @BSnelling's solution was to convert everything to a DenseAxisArray when working with JuMP:

Yes, that's what we landed on in the end. I don't have all the details to hand but I experimented with using AxisKeys as @odow described in this post: https://discourse.julialang.org/t/using-alternative-array-container-types-in-jump/77707/3 (both by overloading the Containers.container function and as a wrapper around JuMP containers). But in the end we decided this didn't have much benefit over converting everything to DenseAxisArray.

@odow
Copy link
Member

odow commented Mar 13, 2023

But in the end we decided this didn't have much benefit over converting everything to DenseAxisArray.

Yeah. I think this is often the right solution. (Or alternatively, constructing a matrix in JuMP and then converting it to some other type.) We don't need explicit support in JuMP for these transformations.

@rafaqz
Copy link

rafaqz commented Mar 14, 2023

Honestly everyone working out their own conversion solution sounds painfull to me. We need to solve this with traits so it just works.

@odow
Copy link
Member

odow commented Mar 14, 2023

Honestly everyone working out their own conversion solution sounds painfull to me

The methods are pretty simple. I haven't tested, but they're something like:

KeyedArray(x::DenseAxisArray) = KeyedArray(Array(x), axes(x))
DenseAxisArray(arr::KeyedArray) = DenseAxisArray(arr, axiskeys(arr)...)

The issue is, where should these methods live? Neither JuMP nor AxisKeys want a dependency on each other. So someone needs to make an extension package.

The trait issue could be made to work for DenseAxisArray. But we essentially already have that. Convert to a regular backing array with Array and get the axes with axes(x).

But this wouldn't work for Containers.SparseAxisArray (the other widely used named axis array type in JuMP) since it models a sparse array of eltype T for which zero(T) isa T == false.

@blegat
Copy link
Member

blegat commented Mar 14, 2023

Honestly everyone working out their own conversion solution sounds painfull to me. We need to solve this with traits so it just works.

I agree. In JuMP, we have a pretty inflexible condition for an array: the user should notice the difference between the different arrays (Array, DenseAxisArray, SparseAxisArray). That is, since each array is a generalization of the previous one, you should be able to create a DenseAxisArray (resp. SparseAxisArray) that mimic an Array (resp. DenseAxisArray) and everything you do with one should do the same with the other one.
There is currently no package that does that (to the best of my knowledge) so we have to define our own. So there is two options here:

  1. There is a new array package that gets created and satisfy these requirements, in which case, we could consider switching to it (although it would be breaking so not before JuMP v2.0)
  2. If there is enough interest, I wouldn't be against moving JuMP.Containers into its own separate package. Then people could use these arrays outside of JuMP as well hence no need for conversion if you use these arrays are what you use for everything ^^ I personally use DenseAxisArray whenever I need an array with custom indices even when I'm not solving any optimization problem.

@rafaqz
Copy link

rafaqz commented Mar 14, 2023

There is currently no package that does that (to the best of my knowledge) so we have to define our own. So there is two options here:

@blegat I don't understand why you have dense and sparse separate and what you need that DimensionalData.jl or AxisKeys.jl don't do? But it would be good to be aware of what the limitations are. Is it for forwading or sparse method specialisations?

Would something like this help?
JuliaLang/julia#48861

@blegat
Copy link
Member

blegat commented Mar 14, 2023

It's much simpler than that. Simply looking at getindex singles out DenseAxisArray as the only possible option at the moment as I detailed in #3214 (comment)

@rafaqz
Copy link

rafaqz commented Mar 14, 2023

Oh right, you will have the same problem with AxisKeys.jl. We all intentionally avoid breaking the base array interface like that. I guess it's actually a different kind of thing you are doing here then, more like OffsetArrays.jl. Thanks

@blegat
Copy link
Member

blegat commented Mar 14, 2023

Yes, we know that breaking the Base interface is the hard route but we don't have the choice. It's quite ambitious which is why the option 2) would make sense

@aplavin
Copy link

aplavin commented Mar 14, 2023

Following @rafaqz comment, seems like what you need is directly achieved by offset arrays.
Of course, they can be combined with named-dim arrays to get both named dimensions and offset indices:

julia> using NamedDims, OffsetArrays

julia> A = NamedDimsArray{(:i,)}(OffsetArray(1:2, 1))
2-element NamedDimsArray(OffsetArray(::UnitRange{Int64}, 2:3), :i) with indices 2:3:
 i  1
     2

julia> A[3]
2

julia> A[i=3]
2

@odow
Copy link
Member

odow commented Mar 14, 2023

seems like what you need is directly achieved by offset arrays.

Not quite. Here's an example that demonstrates some of the features of DenseAxisArray:

julia> using JuMP.Containers

julia> Containers.@container(x[r = ["A", 1, :C], c = 1:4], (r, c))
2-dimensional DenseAxisArray{Tuple{Any, Int64},2,...} with index sets:
    Dimension 1, Any["A", 1, :C]
    Dimension 2, Base.OneTo(4)
And data, a 3×4 Matrix{Tuple{Any, Int64}}:
 ("A", 1)  ("A", 2)  ("A", 3)  ("A", 4)
 (1, 1)    (1, 2)    (1, 3)    (1, 4)
 (:C, 1)   (:C, 2)   (:C, 3)   (:C, 4)

julia> x["A", 1]
("A", 1)

julia> x[:C, 2:3]
1-dimensional DenseAxisArray{Tuple{Any, Int64},1,...} with index sets:
    Dimension 1, [2, 3]
And data, a 2-element Vector{Tuple{Any, Int64}}:
 (:C, 2)
 (:C, 3)

julia> x[["A", :C], 2:3]
2-dimensional DenseAxisArray{Tuple{Any, Int64},2,...} with index sets:
    Dimension 1, Any["A", :C]
    Dimension 2, [2, 3]
And data, a 2×2 Matrix{Tuple{Any, Int64}}:
 ("A", 2)  ("A", 3)
 (:C, 2)   (:C, 3)

julia> y = x[1, :]
1-dimensional DenseAxisArray{Tuple{Any, Int64},1,...} with index sets:
    Dimension 1, Base.OneTo(4)
And data, a 4-element Vector{Tuple{Any, Int64}}:
 (1, 1)
 (1, 2)
 (1, 3)
 (1, 4)

julia> y[2]
(1, 2)

julia> y = x[:, 3]
1-dimensional DenseAxisArray{Tuple{Any, Int64},1,...} with index sets:
    Dimension 1, Any["A", 1, :C]
And data, a 3-element Vector{Tuple{Any, Int64}}:
 ("A", 3)
 (1, 3)
 (:C, 3)

julia> y[1]
(1, 3)

julia> Array(x)
3×4 Matrix{Tuple{Any, Int64}}:
 ("A", 1)  ("A", 2)  ("A", 3)  ("A", 4)
 (1, 1)    (1, 2)    (1, 3)    (1, 4)
 (:C, 1)   (:C, 2)   (:C, 3)   (:C, 4)

julia> axes(x)
(Any["A", 1, :C], Base.OneTo(4))

julia> x[1, 1]
(1, 1)

julia> x[CartesianIndex(1, 1)]
("A", 1)

julia> x[CartesianIndex(1, 2)]
("A", 2)

julia> x[CartesianIndex(3, 2)]
(:C, 2)

To summarize:

  • Axes may be any AbstractVector type. They need not have the same eltype, or be integers, or be contiguous, etc.
  • getindex indexes in axis-space. To index the underlying array with integers, use CartesianIndex.
  • Slicing etc occurs in axis-space.
  • To access the underlying array, use Array(x).
  • To access the axes, use axes(x).

The main difference to other packages is that DenseAxisArray uses axis-space indexing by default, and makes you fight to use regular indexing on the underlying array. It seems that most other packages let regular indexing work by default, and you opt-in to accessing via the axes, via something like x[At(:C), At(2)].

@aplavin
Copy link

aplavin commented Mar 14, 2023

getindex indexes in axis-space. To index the underlying array with integers, use CartesianIndex.

Hm, I think this cannot be a proper AbstractArray in Julia, severely limiting interoperability with array libraries. Many packages assume sequential integer indices, following the AbstractArray interface.

Why is it important to use getindex specifically to access the array by "axis keys"? In AxisKeys.jl, they do x("A", 1) for that (round brackets), leaving getindex (square brackets) alone to fulfil the basic interface.

@blegat
Copy link
Member

blegat commented Mar 14, 2023

This is because of what I detailed in #3214 (comment)

@odow
Copy link
Member

odow commented Mar 15, 2023

I think this cannot be a proper AbstractArray in Julia, severely limiting interoperability with array libraries. Many packages assume sequential integer indices, following the AbstractArray interface.

Yeah I think this is the underlying tension in the ecosystem.

A DenseAxisArray is it's own thing, and it sits on a different side of the interoperability/usability trade-off than the other packages.

I also think that it's wrong to treat axis-type arrays as an Array with a bit of metadata. Here's an example:

julia> Containers.@container(x[i = [1, 2]], i)
1-dimensional DenseAxisArray{Int64,1,...} with index sets:
    Dimension 1, [1, 2]
And data, a 2-element Vector{Int64}:
 1
 2

julia> Containers.@container(y[i = [2, 1]], i)
1-dimensional DenseAxisArray{Int64,1,...} with index sets:
    Dimension 1, [2, 1]
And data, a 2-element Vector{Int64}:
 2
 1

julia> x .+ y
ERROR: Unable to broadcast over DenseAxisArrays with different axes.

Compare

julia> using DimensionalData

julia> x = DimArray([1, 2], X([1, 2]))
2-element DimArray{Int64,1} with dimensions: 
  X Sampled{Int64} Int64[1, 2] ForwardOrdered Irregular Points
 1  1
 2  2

julia> y = DimArray([1, 2], X([2, 1]))
2-element DimArray{Int64,1} with dimensions: 
  X Sampled{Int64} Int64[2, 1] ReverseOrdered Irregular Points
 2  1
 1  2

julia> x .+ y
2-element DimArray{Int64,1} with dimensions: 
  X Sampled{Int64} Int64[1, 2] ForwardOrdered Irregular Points
 1  2
 2  4

Which you can debate about whether is the right or wrong answer.

But I don't think this detracts from the gist of this issue. The reason that the ecosystem hasn't standardized on a single implementation is that there are some orthogonal design choices that different packages make and interoperabiblity is hard 😢

@aplavin
Copy link

aplavin commented Mar 15, 2023

I also think that it's wrong to treat axis-type arrays as an Array with a bit of metadata. Here's an example: <...>

How does the example relate to whether the "axisarrays" are arrays with a bit of metadata? What's the alternative exactly?

AxisKeys also throw an error here, so not sure what this argument shows.

julia> using AxisKeys

julia> x = KeyedArray(1:2, i=[1,2])
1-dimensional KeyedArray(NamedDimsArray(...)) with keys:
   i  2-element Vector{Int64}
And data, 2-element UnitRange{Int64}:
 (1)  1
 (2)  2

julia> y = KeyedArray(1:2, i=[2, 1])
1-dimensional KeyedArray(NamedDimsArray(...)) with keys:
   i  2-element Vector{Int64}
And data, 2-element UnitRange{Int64}:
 (2)  1
 (1)  2

julia> x .+ y
ERROR: ArgumentError: key vectors must agree; got [1, 2] != [2, 1]

@odow
Copy link
Member

odow commented Mar 15, 2023

How does the example relate to whether the "axisarrays" are arrays with a bit of metadata?

Oops. Obviously poorly worded.

I guess I meant that DimArray is mostly an AbstractArray that composes well with other array libraries, but it has special indexing for the axes with x[At(:A), At(:B)]. It seems like you could use a DimArray without ever knowing about the extra axes that it is carrying around.

In contrast, the axes in DenseAxisArray are a fundamental part of the type. So you can only combine DenseAxisArray if they have the same axes, etc, and you have to fight with Array(x) or CartesianIndex to access the underlying array. So if you want to use a DenseAxisArray you pretty much have to be aware of the axes.

AxisKeys also throw an error here, so not sure what this argument shows.

That different packages have adopted different conventions. Some want x .+ y to work, and others want an error. That's a pretty irreconcilable difference.

@rafaqz
Copy link

rafaqz commented Mar 15, 2023

DimArray does have some of these checks in places and that broadcast could and should probably fail in your example.

That kind of security is just not something people have asked for much.

The broadcast_dims method actually goes much further and automatically checks and permutes the dimensions for you.

@odow
Copy link
Member

odow commented Mar 21, 2023

So coming back to this. I don't know if there is anything actionable from JuMP's point of view.

Containers.DenseAxisArray (ab)uses the AbstractArray interface by not providing regular indexing, but we won't be changing it, and I don't think the other array types meet our needs, even if it meant that interoperability would improve between JuMP and other packages.

So if anyone has concrete suggestions for improvements to JuMP's containers, please comment, otherwise I really will close the issue this time 😆

@rafaqz
Copy link

rafaqz commented Mar 21, 2023

Maybe it got lost in the thread, but my concrete suggestion is that we all implement the ArrrayInterface.jl labeled array traits so interop can just work.

@odow
Copy link
Member

odow commented Mar 21, 2023

but my concrete suggestion is that we all implement the ArrrayInterface.jl labeled array traits so interop can just work.

Open a PR? I don't know what this entails. Some concrete examples of how this would help interop would also be useful.

@odow odow changed the title Suggestion: Switch from Containers to DimensionalData.jl [Containers] support ArrayInterface.jl traits Mar 21, 2023
@rafaqz
Copy link

rafaqz commented Mar 21, 2023

I dont use jump, nor have the time to even add this PR to my own package. Its just a concrete siggestion you asked for

@odow
Copy link
Member

odow commented May 24, 2023

An alternative is to support other array types as an extension: #3383

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Category: Containers Related to the Containers submodule
Development

Successfully merging a pull request may close this issue.

10 participants