-
Notifications
You must be signed in to change notification settings - Fork 72
Primitive array types in ArrayOps / ArrayBuilder / ImmutableArray #278
Conversation
0378cb4
to
8a711b7
Compare
The test class doesn't compile on Dotty anymore with these changes. The error message is not very helpful:
|
// contract all empty ones must be equal, so discriminating based on the reference | ||
// equality of an empty array should not come up) but we may as well be | ||
// conservative since wrapRefArray contributes most of the unnecessary allocations. | ||
def make[T](x: AnyRef): ArrayView[T] = (x match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the type of x
AnyRef
rather than Array[_]
(for instance)? Do we want this method to be public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's currently done with AnyRef
but it should work with Array[_]
(which also erases to Object
). Since this is the only way of creating an ArrayView
that takes primitive arrays into account, it needs to be public.
def elemTag = ClassTag.Unit | ||
def length: Int = array.length | ||
def apply(index: Int): Unit = array(index) | ||
override def hashCode = arrayViewHash(array) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want to override ArrayView
’s hashcode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make use of specialization
case x: Array[Unit] => new ofUnit(x) | ||
}).asInstanceOf[ImmutableArray[T]] | ||
|
||
final class ofRef[T <: AnyRef](val array: Array[T]) extends ImmutableArray[T] with Serializable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should follow the well established naming convention where class names begin with an upper case letter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're public and copied directly from the current API. I wouldn't want to do any unnecessary breaking changes here.
while (newsize < size) newsize *= 2 | ||
resize(newsize) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an idea of the loss of performance we would get if we defined these methods in ArrayBuilder
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't benchmarked it but I wouldn't expect any performance difference. ensureSize
and sizeHint
don't touch the actual array. They should be OK to pull up.
* | ||
* @tparam T type of elements for the array builder, subtype of `AnyRef` with a `ClassTag` context bound. | ||
*/ | ||
final class ofRef[T <: AnyRef : ClassTag] extends ArrayBuilder[T] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make those private
? Scala.js does not like them. We have an entirely different implementation of ArrayBuilder
in Scala.js. For example see https://github.com/scala-js/scala-js/blob/7ea476aa62349f681bb2dbba16b7ffa6470bd786/scalalib/overrides-2.13/scala/collection/mutable/ArrayBuilder.scala#L32-L73
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These types are exposed as part of the API so that you can use them directly with unboxed primitive element types in all cases where you don't need polymorphism over primitive and reference types. Having to code against a non-specialized superclass means you need to box and unbox every single element you append to the Builder.
8a711b7
to
ff24f7d
Compare
Updated:
|
ff24f7d
to
6ba4846
Compare
|
||
def toIterable = ArrayView(xs) | ||
def toIterable: ImmutableArray[A] = ImmutableArray.wrapArray(xs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think it is not safe to expose it as an immutable
one because the underlying array xs
is still available and can be mutated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a general risk with ImmutableArray
but it is unavoidable if we want to make the default Seq
a collection.immutable.Seq
and support varargs as Seq
.
Or should we limit the implicit conversion from Array
to produce a generic Seq
? Varargs could still expose it as an immutable one. It may still be somewhat confusing once we make the immutable Seq
the default one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would expose collection.IndexedSeq
as the public result type of toIterable
. We can still safely use ImmutableArray
to support varargs since the reference to the underlying mutable Array
will not leak (or will it?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will but I'd classify that as an abuse of varargs. If you mutate an array after passing it to a varags method all bets are off.
@@ -104,7 +104,7 @@ class LowPriority { | |||
import strawman.collection._ | |||
|
|||
/** Convert array to iterable via view. Lower priority than ArrayOps */ | |||
implicit def arrayToView[T](xs: Array[T]): ArrayView[T] = ArrayView[T](xs) | |||
implicit def arrayToImmutableArray[T](xs: Array[T]): immutable.ImmutableArray[T] = immutable.ImmutableArray.wrapArray[T](xs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too, we should use collection.IndexedSeq[T]
as a return type instead of immutable.ImmutableArray[T]
|
||
final class ofRef[T <: AnyRef](val array: Array[T]) extends ImmutableArray[T] with Serializable { | ||
protected def elements: Array[T] = array | ||
def length: Int = array.length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we pull up length
’s implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that would require a polymorphic array
field (which erases to Object
) in ImmutableArray
. All methods which operate on the actual array
field need to be in the subclasses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the comments I left: we shouldn’t publish the Array
as an immutable
collection as long as the reference to the underlying mutable Array
has also publicly leaked.
6ba4846
to
987cf68
Compare
Updated to return a generic |
|
||
protected def fromTaggedIterable[B: ClassTag](coll: Iterable[B]): Array[B] = coll.toArray[B] | ||
|
||
// Note: Declaring the return type as ArrayOps.WithFilter[A] breaks Dotty 0.4.0-RC1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that mean that we won’t get back an Array
but an IndexedSeq
if we call map
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you file an issue in the dotty repository (lampepfl/dotty)? These issues have high priority and might be fixed in a few days.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strangely enough I was unable to reproduce the error again. Let's see how CI fares.
ImmutableArray.wrapArray(elements) | ||
} | ||
|
||
def wrapArray[T](x: AnyRef): ImmutableArray[T] = (x match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this method have to be public? We should at least document what it does and also warn about the fact that it doesn’t create a copy of the given Array
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It needs to be public in the bytecode so that the codegen can generate code that calls it, in cases that are known to be safe (i.e., varargs generation). However it could be private-ish at the language level.
One way is to make it protected[immutable]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it will buy us anything to hide the method. The specialized subclasses expose the constructors and the wrapped arrays, and both are necessary to write high-performance code that uses ImmutableArray. Also exposing wrapArray on top of that does not make it any less safe. The current collections do the same and I have not seen any problems in the wild that were caused by this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having it non-public would just prevent confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current collections do the same and I have not seen any problems in the wild that were caused by this.
Hum ... the current collections don't have ImmutableArray
s to begin with, so I'm not sure what you're talking about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's called WrappedArray
in the current collections. (WrappedArray
is a mutable collection though, so maybe it would be more problematic for ImmutableArray
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WrappedArray is a mutable collection though
Ah, well, yes. That changes everything!
987cf68
to
2b4a83c
Compare
Opened scala/scala3#3467 against Dotty |
So what should we do about the
|
In such a case I would make it as clear as possible in the method name. E.g. |
I definitely think we need The current |
@sjrd What about the following implementation? package object collection {
type WrappedArray[A] <: Seq[A]
def wrap[A](array: Array[A]): WrappedArray[A] = ImmutableArray.fromArrayNoCopy(array)
} My point is that we can always statically expose the |
In addition to the run-time type check issue, that implementation is not helpful when one wants to wrap an |
@sjrd - |
I still think we need non-private wrapping and unwrapping even in the presence of |
Looking at |
The difference is that |
|
Right @sjrd I missed that one and I agree it’s important. |
We'll need to make some bigger changes for that. Currently every |
aa72779
to
4be1730
Compare
Everything should be ready here, including |
I’m a bit lost with the builders. Why do we need |
def newBuilder[A](): Builder[A, WrappedArray[A]] = | ||
ArrayBuffer.newBuilder[A]().mapResult(fromArrayBuffer) | ||
|
||
def newBuilder[A](implicit t: ClassTag[A]): Builder[A, WrappedArray[A]] = new WrappedArrayBuilder[A](t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need WrappedArrayBuilder
? Can’t we use ArrayBuilder.mapResult(fromArray)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any reason why we couldn't.
} | ||
|
||
def fromArrayBuffer[A](arr: ArrayBuffer[A]): WrappedArray[A] = | ||
make[A](arr.asInstanceOf[ArrayBuffer[Any]].toArray) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why fromArrayBuffer
rather than fromArray
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have that in the form of make
. fromArrayBuffer
is only used in two places. We could inline it or make the method private.
At least in the existing collections, |
4be1730
to
b0cbdf9
Compare
Updated: Removed |
class ArrayOps[A](val xs: Array[A]) extends AnyVal | ||
with IterableOnce[A] | ||
with IndexedSeqOps[A, immutable.IndexedSeq, Array[A]] | ||
with StrictOptimizedIterableOps[A, Seq, Array[A]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StrictOptimizedSeqOps
is even better!
@throws[ArrayIndexOutOfBoundsException] | ||
def apply(n: Int) = xs(n) | ||
override def className = "ArrayView" | ||
def append[B >: A : ClassTag](x: B): Array[B] = fromTaggedIterable(View.Append(toIterable, x)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These methods have been renamed appended
and prepended
.
We should also add appendedAll
, prependedAll
, and concat
, BTW.
extends IndexedSeq[A] | ||
with IndexedSeqOps[A, ImmutableArray, ImmutableArray[A]] | ||
with StrictOptimizedSeqOps[A, ImmutableArray, ImmutableArray[A]] { | ||
|
||
def iterableFactory: SeqFactory[ImmutableArray] = ImmutableArray | ||
|
||
/* The wrapped mutable `Array` */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should clearly remind that mutating the Array
will break the immutability contract.
/* The wrapped mutable `Array` */ | ||
def unsafeArray: Array[A @uncheckedVariance] | ||
// uncheckedVariance should be safe: Array[A] for reference types A is covariant at the JVM level. Array[A] for | ||
// primitive types A can only be widened to Array[Any] which erases to Object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it is not safe:
val rectangles: ImmutableArray[Rectangle] = ImmutableArray(rect1, rect2)
val shapes: ImmutableArray[Shape] = rectangles // assuming Rectangle is a Shape
val shapesArray: Array[Shape] = shapes.unsafeArray
shapesArray(0) = someCircle // assuming Circle is a Shape which is not a Rectangle
rectangles.foreach(rect => println(rect.height)) // ClassCastException! Circle can not be cast to Rectangle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you test this? I would expect it to fail with an ArrayStoreException
in shapesArray(0) = someCircle
. But both this error and the one you demonstrate are due to the inherent unsoundness of covariant arrays.
This could be mitigated in ImmutableArray
by making it invariant, but for an immutable array this should not be necessary. The JVM array actually is covariant and. If you use it in a mutable way all bets are off. That's why it is called unsafe in the first place. Any immutable use should be sound.
} | ||
|
||
/** Wrap an existing `Array` into an `ImmutableArray` of the proper primitive specialization type without copying. */ | ||
def unsafeWrapArray[T](x: AnyRef): ImmutableArray[T] = (x match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious here: x: Array[_]
doesn’t work?
case x: Array[Short] => new ofShort(x) | ||
case x: Array[Boolean] => new ofBoolean(x) | ||
case x: Array[Unit] => new ofUnit(x) | ||
}).asInstanceOf[ImmutableArray[T]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the end we never statically know that we have an underlying ofInt
instance. We always manipulate an ImmutableArray[X]
. How does the JVM know that it doesn’t have to box/unbox elements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't. Any access through ImmutableArray
is boxed. If you want unboxed access you need to cast down or instantiate a primitive subclass directly. It's the same for WrappedArray
. In theory a @specialized
annotation should be beneficial for both cases but since WrappedArray
isn't currently specialized I didn't want to do any speculative changes without more complete benchmarks.
if(s > 0) b.sizeHint(s) | ||
b ++= coll | ||
WrappedArray.make(b.result()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like iterableFactory.from(coll)
would be a better implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the IterableFactory
cannot make use of the existing elemTag
. It would always have to build a boxed array.
b ++= coll | ||
WrappedArray.make(b.result()) | ||
} | ||
protected[this] def newSpecificBuilder(): Builder[T, WrappedArray[T]] = ArrayBuilder.make()(elemTag).mapResult(WrappedArray.make) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have different implementations here and in WrappedArray.newBuilder
?
Did you measure the performance difference? |
This is similar to the old implementation. Primitive array types are not polymorphic on the JVM so we need to special-case them in order to be able to efficiently wrap existing arrays as collections. We also remove ArrayView in favor of WrappedArray and ImmutableArray. With the new equality semantics (where Views are never equal unless they are the same reference) using a View as a default wrapper type for Array is problematic. ImmutableArray is basically the same as ArrayView, except it is an immutable.Seq and WrappedArray is the same for mutable.Seq, so we get the expected equality semantics and strict operations for both of them. Implicit conversions from Array to Seq always produce a WrappedArray. This is the safe choice because there is no guarantee that the array is effectively immutable. ImmutableArray is intended for use in compiler-generated varargs bridges and manually by the user via an explicit conversion call. Unlike the 2.12 collections the strawman currently requires a mutable Seq to be Growable and Shrinkable. This needs to be changed in order to implement WrappedArray. The new definition of mutable.Seq allows for fixed-size collections like WrappedArray. A mutable.Seq which is growable and shrinkable is a mutable.Buffer. We also introduce mutable.IndexedSeq as a new type. It does not define any new operations of its own but merely combines mutable.Seq with collection.IndexedSeq for orthogonality with immutable.IndexedSeq.
b0cbdf9
to
848f5d5
Compare
Addressed review comments. I didn't do any performance tests since this only brings the implementation up to the previous standard in 2.12. |
This is similar to the old implementation. Primitive array types are not
polymorphic on the JVM so we need to special-case them in order to be
able to efficiently wrap existing arrays as collections.