
Implement ImmutableArray #41777

Closed
wants to merge 1 commit into from

Conversation

Keno
Member

@Keno Keno commented Aug 3, 2021

This rebases #31630 with several fixes and modifications.
After #31630, we had originally decided to hold off on said
PR in favor of implementing either more efficient layouts for
tuples or some sort of variable-sized struct type. However, in
the two years since, neither of those has happened (I had a go
at improving tuples and made some progress, but there is much
still to be done there). In the meantime, all across the package
ecosystem, we've seen an increasing creep of pre-allocation and
mutating operations, primarily caused by our lack of sufficiently
powerful immutable array abstractions and array optimizations.

This works fine for the individual packages in question, but it
causes a fair bit of trouble when trying to compose these packages
with transformation passes such as AD or domain specific optimizations,
since many of those passes do not play well with mutation. More
generally, we would like to avoid people needing to pierce
abstractions for performance reasons.

Given these developments, I think it's getting quite important
that we start to seriously look at arrays and try to provide
performant and well-optimized arrays in the language. More
importantly, I think this is somewhat independent from the
actual implementation details. To be sure, it would be nice
to move more of the array implementation into Julia by making
use of one of the above-mentioned language features, but that
is a bit of an orthogonal concern and not absolutely required.

This PR provides an `ImmutableArray` type that is identical
in functionality and implementation to `Array`, except that
it is immutable. Two new intrinsics `Core.arrayfreeze` and
`Core.arraythaw` are provided which are semantically copies
and turn a mutable array into an immutable array and vice
versa.

In the original PR, I additionally provided generic functions
`freeze` and `thaw` that would simply forward to these
intrinsics. However, said generic functions have been omitted
from this PR in favor of simply using constructors to go
between mutable and immutable arrays at the high level.
Generic `freeze`/`thaw` functions can always be added later,
once we have a more complete picture of how these functions
would work on non-Array datatypes.
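
As a rough illustration of the constructor-based API described above (hypothetical, since this PR was not merged; the type and constructor names are taken from the PR itself):

```julia
a = [1.0, 2.0, 3.0]      # ordinary mutable Vector{Float64}
ia = ImmutableArray(a)   # semantically a copy; `ia` cannot be mutated
# ia[1] = 0.0            # would throw, since `ia` is immutable
b = Array(ia)            # back to a mutable array, again semantically a copy
```

Presumably these constructors forward to the `Core.arrayfreeze` and `Core.arraythaw` intrinsics mentioned above.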

Some basic compiler support is provided to elide these copies
when the compiler can prove that the original object is
dead after the copy. For instance, in the following example:
```julia
function simple()
    a = Vector{Float64}(undef, 5)
    for i = 1:5
        a[i] = i
    end
    ImmutableArray(a)
end
```

the compiler will recognize that the array `a` is dead after
its use in `ImmutableArray` and the optimized implementation
will simply rewrite the type tag in the originally allocated
array to now mark it as immutable. It should be pointed out
however, that *semantically* there is still no mutation of the
original array, this is simply an optimization.

At the moment this compiler transform is rather limited, since
the analysis requires escape information in order to compute
whether or not the copy may be elided. However, more complete
escape analysis is being worked on at the moment, so hopefully
this analysis should become more powerful in the very near future.

I would like to get this cleaned up and merged reasonably quickly,
and then crowdsource some improvements to the Array APIs more
generally. There are still a number of APIs that are quite bound
to the notion of mutable `Array`s. StaticArrays and other packages
have been inventing conventions for how to generalize those, but
we should form a view in Base of what those APIs should look like and
harmonize them. Having the `ImmutableArray` in Base should help
with that.
@Tokazama
Contributor

Tokazama commented Aug 7, 2021

Is the goal here to be able to replace `SArray` with something like this:

```julia
struct SArray{S,T,N,L}
    data::ImmutableArray{T,N}
end
```

...or is it an entirely unique thing?

@timholy
Member

timholy commented Aug 7, 2021

These are heap-allocated (when they allocate...) and so it's a little different, but the basic idea is the same. For example, the construct

```julia
X .+= 1
```

might be the best way to implement an elementwise increment if `X` is mutable, but it's an error if `X` is immutable. As you know well from your work on ArrayInterface, that makes it harder to write generic code. If we have immutable arrays, then it should be easier to write this as

```julia
X += 1
```

(without the dot) and have the compiler determine that there are circumstances where the operation could be performed in place. In other words, it should allow code to become more generic without loss of performance.

However, the escape-analysis that @Keno refers to would be important for making this a reality.
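
A minimal sketch of the kind of generic code this comment describes (illustrative only; whether the copy is actually elided depends on the escape analysis being worked on):

```julia
# Non-mutating, so it is valid for both mutable and immutable arrays.
increment(X) = X .+ 1

# For a mutable input, a sufficiently smart compiler could reuse X's
# storage when X is dead after the call, matching `X .+= 1` in cost.
Y = increment(ImmutableArray([1, 2, 3]))
```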

@AriMKatz

AriMKatz commented Aug 8, 2021

Related: @tkf 's https://github.com/tkf/Mutabilities.jl

For reference, I believe this is the escape analysis work? https://github.com/aviatesk/EscapeAnalysis.jl

@tkf
Member

tkf commented Aug 8, 2021

(I think Keno's strategy mentioned in the OP was to keep it minimal and postpone harder design decisions. So, I posted my comment in the original PR, which already contained various discussions: #31630 (comment))

@chriselrod
Contributor

Any plans to communicate immutability via alias info or invariance to LLVM?

@fingolfin
Contributor

What is the rationale for providing a "thaw" function, though? Doesn't the mere possibility that one can "thaw" an "immutable" array render some optimizations impossible?

@andyferris
Member

Doesn't the mere possibility that one can "thaw" an "immutable" array render some optimizations impossible?

My understanding is no: thaw would make a mutable copy in the default case, unless the compiler can prove you are thawing the only live reference to the array, in which case it is safe to simply make it mutable without copying.
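
A sketch of the semantics described here (hypothetical, since `ImmutableArray` is the type proposed by this PR and never landed in Base):

```julia
ia = ImmutableArray([1, 2, 3])
a = Array(ia)   # "thaw": a fresh mutable copy in the general case
a[1] = 10       # fine, and never observable through `ia`
# Only if the compiler proves `ia` is dead after the conversion may it
# skip the copy and simply re-tag the existing buffer as mutable.
```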

@tecosaur
Contributor

tecosaur commented Sep 5, 2021

I'm curious, will this allow for similar performance to that currently seen with StaticArrays for small matrix operations? If so, that would be brilliant to have in Base.

Multiplying a 2x2 matrix ~1350x faster with StaticArrays and Julia 1.6

@timholy
Member

timholy commented Sep 5, 2021

@tecosaur, times that are much less than the duration of a CPU clock tick indicate that the compiler is just eliminating the entire workload. Here's the right way to do it:

```
julia> @btime A*A setup=(A=rand(SMatrix{2,2}));
  1.485 ns (0 allocations: 0 bytes)

julia> @btime A*A setup=(A=rand(2,2));
  36.058 ns (1 allocation: 112 bytes)
```

And no, this proposal won't narrow the entire gap. The sizes aren't static, the values are. But if the compiler doesn't actually have to instantiate the array, then most of that time may disappear.

@tecosaur
Contributor

tecosaur commented Sep 5, 2021

Ah, thanks for explaining that @timholy 👍. Given the still huge performance difference that your benchmark shows, I think it would be nice if I didn't need a package for high-performance small matrix operations, but it's nice to hear that this proposal may narrow the gap.

@chriselrod
Contributor

chriselrod commented Sep 5, 2021

It's the stack allocation and static sizing that are good for performance.
Immutability is largely orthogonal to these (but can potentially enable some optimizations, like removing alias checks when used with mutated arrays).

@jpsamaroo
Member

Immutability can also be good for performance. If I understand this PR correctly, when we freeze an array, we've frozen its size, shape, and values, and thus multiple loads of the same attributes or values may potentially be coalesced by the optimizer.

@chriselrod
Contributor

Immutability can also be good for performance. If I understand this PR correctly, when we freeze an array, we've frozen its size, shape, and values, and thus multiple loads of the same attributes or values may potentially be coalesced by the optimizer.

This should be happening in many cases anyway, but TBAA information often fails to propagate.

I also don't know if it's just that the associated LLVM pass isn't turned on (like why @llvm.expect doesn't work), but invariant information doesn't really seem to work / enable the optimizations I would expect it to in Julia.

@tkf
Member

tkf commented Sep 5, 2021

when we freeze an array, we've frozen its size, shape, and values

In principle, we can freeze values and "shape" separately (as a possible concrete API, see freezevalue and freezeindex in Mutabilities.jl). A value-frozen but shape-not-frozen vector can act as an append-only data structure where the compiler can assume a loaded value won't change even after new values are appended. I'm not sure how much is doable at the LLVM level, since it'd look like the buffer is swapped for a larger one and then memcpy'ed. Maybe it could be useful if you incrementally build an array while passing intermediate results to some other outlined functions.

This should be happening in many cases anyway, but TBAA information often fails to propagate.

I think adding invariance at Julia's type level helps with things LLVM (alone) cannot reason about. For example, in

```julia
xs::ImmutableArray
x = xs[1]
dynamic_dispatch(xs)
x = xs[1]
```

the second load can be elided by the Julia compiler (or Julia helping LLVM?).

@vtjnash
Member

vtjnash commented Feb 1, 2022

Moved to #42465

@vtjnash vtjnash closed this Feb 1, 2022
@vtjnash vtjnash deleted the kf/immutablearray branch February 1, 2022 23:05