Don’t force views elements prematurely. #343

julienrf · 2018-01-16T10:46:31Z

IndexedView[A] previously tried to return an IndexedView[A] from
transformation operations such as filter or take. Doing so sometimes
required evaluating the view’s elements to build an indexed sequence
from which to get an IndexedView (see the last case):

 override protected[this] def fromSpecificIterable(it: Iterable[A]): IndexedView[A] =
   it match {
     case v: IndexedView[A] => v
     case i: IndexedSeq[A] => i.view
     case _ => it.to(IndexedSeq).view
   }

Now IndexedView[A] returns only a View[A] on such operations.
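
To make the cost of that last case concrete, here is a minimal sketch. It assumes the Scala 2.13 standard library (`scala.collection`, which grew out of this strawman) rather than the `strawman.collection` package as it stood in this PR, and the `ForcingSketch` object and its counter are purely illustrative.

```scala
import scala.collection.View

object ForcingSketch {
  /** Returns how many times the predicate ran (before forcing, after forcing). */
  def predicateCalls(): (Int, Int) = {
    var calls = 0
    val xs = Vector(1, 2, 3, 4)

    // Building the filtered view is lazy: the predicate has not run yet.
    val filtered: View[Int] = xs.view.filter { x => calls += 1; x % 2 == 0 }
    val before = calls

    // Same shape as the removed last case: copy into a strict IndexedSeq,
    // then take a view of that copy. The copy evaluates every element.
    val forced = filtered.to(IndexedSeq).view
    val after = calls

    assert(forced.length == 2) // the two even elements survived the filter
    (before, after)
  }

  def main(args: Array[String]): Unit = {
    val (before, after) = predicateCalls()
    println(s"predicate calls before forcing: $before, after forcing: $after")
  }
}
```

The point of the sketch is only that `it.to(IndexedSeq).view` runs the whole pipeline once up front, which defeats the purpose of asking for a view.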

Another solution would have been to provide IndexedView-based versions of all the View-based transformation operations (e.g. IndexedView.Filter, IndexedView.Map). The drawback is that every transformation operation would be duplicated (View.Filter and IndexedView.Filter, etc.); the advantage is that the indexed nature of a collection would be preserved across transformations applied to its view.

I discovered this issue while benchmarking the following code:

      xs.view
        .filter(x => x % 2L == 0L)
        .map(x => x * x)
        .sum

When xs was an IndexedSeq, the filter operation on its view returned fromSpecificIterable(View.Filter(…)), which converted the View.Filter(…) into a strict IndexedSeq in order to take its view.

As a consequence, that code was (up to 2×) slower than the same code using strict transformation operations only (in red on the following chart):

With this PR, the version that uses views rather than strict transformations is slightly faster than before but still a bit slower than the strict version:

I’ve run a profiler and noticed that most of the time is spent in the hasNext operation. It is not clear to me why the view-based version is still slower, because in the strict version we also iterate over the same transformations and call the same hasNext operations…
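
For reference, the two variants being compared can be sketched side by side. The strict variant is not shown in the description, so its exact shape here is an assumption, as are the element type `Long`, the `ViewVsStrict` object, and the tiny input; the actual benchmark harness is not reproduced.

```scala
object ViewVsStrict {
  // Lazy pipeline: filter and map are only evaluated while sum traverses
  // the view, so no intermediate collection is built.
  def viaView(xs: IndexedSeq[Long]): Long =
    xs.view
      .filter(x => x % 2L == 0L)
      .map(x => x * x)
      .sum

  // Strict pipeline: filter and map each allocate an intermediate collection.
  def strict(xs: IndexedSeq[Long]): Long =
    xs.filter(x => x % 2L == 0L)
      .map(x => x * x)
      .sum

  def main(args: Array[String]): Unit = {
    val xs = (1L to 10L).toVector
    println(viaView(xs))
    println(strict(xs))
  }
}
```

Both functions compute the same sum; only the evaluation strategy differs, which is what the benchmark measures.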


Ichoran · 2018-01-16T15:02:02Z

Doesn't this change clobber the performance of xs.map(f).drop(1000) because it can't jump to the 1000th element any more?

julienrf · 2018-01-16T15:02:46Z

Yes.

Ichoran · 2018-01-16T15:04:45Z

Well, maybe it doesn't because map is overridden, but we should be careful to think about which operations are going to be mysteriously slow because views are used. Often we don't think very much about chained operations, but in the case of views there isn't much reason to use them explicitly unless they're going to be chained. I don't have time now to think through them all, though.

julienrf · 2018-01-16T15:08:53Z

Well to be more precise: xs.view.map(f).drop(1000) (given xs: IndexedSeq[?]) will not take advantage of xs being an IndexedSeq to perform the drop part. I agree that this is a severe limitation. I can try to fix it by following the strategy mentioned in the PR description (override all view operations inherited from View such that they all return an IndexedView now).

Ichoran · 2018-01-16T16:09:43Z

I think there's a big difference between non-indexable and indexable operations. filter inherently can't be indexed lazily because you need to build up an intermediate data structure that is almost as expensive as just eagerly doing the filter. Same with flatMap and collect. But span and drop and map and splitAt and so on have fast ways to compute the index, so are view-friendly. Even if there isn't anything in the inheritance hierarchy that distinguishes these two types of operations, it would be good for there to be an override for all the relevant ones in IndexedView. If it's overridden, it doesn't really matter whether the type drops back to View or not; you'll still get the better performance from dynamic dispatch.
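
The distinction drawn here can be sketched with a small stand-alone model. Everything below (`MiniIndexedView`, `from`, the demo object) is hypothetical and is not the strawman API; it only illustrates why map and drop can keep constant-time indexing while filter cannot.

```scala
// Hypothetical minimal model; names are illustrative, not the strawman API.
trait MiniIndexedView[+A] { self =>
  def length: Int
  def apply(i: Int): A

  // map keeps O(1) indexing: element i of the result is f(element i).
  def map[B](f: A => B): MiniIndexedView[B] = new MiniIndexedView[B] {
    def length: Int = self.length
    def apply(i: Int): B = f(self(i))
  }

  // drop keeps O(1) indexing: element i of the result is element i + n.
  def drop(n: Int): MiniIndexedView[A] = new MiniIndexedView[A] {
    private val skipped = math.max(0, math.min(n, self.length))
    def length: Int = self.length - skipped
    def apply(i: Int): A = self(i + skipped)
  }

  // filter cannot locate element i without scanning, so this model falls
  // back to a plain lazy iterator and gives up the indexed type.
  def filter(p: A => Boolean): Iterator[A] = iterator.filter(p)

  def iterator: Iterator[A] = Iterator.range(0, length).map(i => apply(i))
}

object MiniIndexedView {
  def from[A](xs: IndexedSeq[A]): MiniIndexedView[A] = new MiniIndexedView[A] {
    def length: Int = xs.length
    def apply(i: Int): A = xs(i)
  }
}

object MiniIndexedViewDemo {
  def main(args: Array[String]): Unit = {
    var calls = 0
    val v = MiniIndexedView.from(Vector.range(0, 2000))
      .map { x => calls += 1; x * 2 }
      .drop(1000)
    println(v(0))  // reads element 1000 of the mapped view directly
    println(calls) // f ran only for that one element
  }
}
```

In this model the static result type of filter is weaker, but map and drop still dispatch to the indexed implementations, which is the behaviour the comment argues for.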

julienrf · 2018-01-17T12:50:39Z

Oh sorry, we were talking about map but I was thinking of filter. So, of course xs.view.map(f).drop(1000) will jump to the 1000th element.
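
One way to check this claim is to count how often the function runs. The sketch below assumes the Scala 2.13 standard library, where views of indexed sequences override map and drop; the strawman code at the time of this PR may have differed in detail, and the `DropAfterMap` object is illustrative only.

```scala
object DropAfterMap {
  /** First element after drop(1000), and how many times `f` ran to get it. */
  def mapCalls(xs: IndexedSeq[Int]): (Int, Int) = {
    var calls = 0
    val v = xs.view.map { x => calls += 1; x + 1 }.drop(1000)
    val first = v(0) // indexed access on the dropped view
    (first, calls)
  }

  /** Same measurement with filter in place of map. */
  def filterCalls(xs: IndexedSeq[Int]): (Int, Int) = {
    var calls = 0
    val v = xs.view.filter { x => calls += 1; true }.drop(1000)
    val first = v.head // no indexed access here; the view has to iterate
    (first, calls)
  }

  def main(args: Array[String]): Unit = {
    val xs = Vector.range(0, 2000)
    println(mapCalls(xs))
    println(filterCalls(xs))
  }
}
```

With map, the function runs once because the dropped view indexes straight into the source; with filter, the predicate has to run for every skipped element before the first result is available.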

szeiger

Looks correct. I think this was the intention all along. We already have the overrides of IndexedView methods to return an IndexedView where appropriate.

szeiger · 2018-01-19T10:45:20Z

collections/src/main/scala/strawman/collection/View.scala

@@ -294,19 +294,7 @@ trait ArrayLike[+A] extends Any {
 }

 /** View defined in terms of indexing a range */
-trait IndexedView[+A] extends View[A] with ArrayLike[A] with SeqOps[A, View, IndexedView[A]] { self =>
-
-  final override def toSeq: immutable.Seq[A] = to(immutable.IndexedSeq)


We should consider making this the default for Iterable.toSeq and build something like an ArrayBuffer. If you care about linear characteristics for consing / unconsing you have to request an appropriate type, otherwise you get the most efficient representation.

Most efficient representation for what?

Building a List is faster than building a Vector, IIRC.

julienrf requested review from szeiger, Ichoran and lrytz January 16, 2018 10:46

szeiger approved these changes Jan 19, 2018

julienrf merged commit 5f39e81 into scala:master Jan 22, 2018

julienrf deleted the indexed-view branch January 22, 2018 08:46
