This repository has been archived by the owner on Dec 22, 2021. It is now read-only.

optimise arrayOps in strawman collection #317

Closed

@ackratos ackratos commented Dec 10, 2017

I benchmarked array slice performance. @mkeskells's optimization is faster than the current scala.ArrayOps implementation and much faster than the strawman collection. I am trying to optimize the strawman implementation, though I am not sure whether that is a good idea, considering that the current implementation has a good, unified abstraction built on View.

ArrayBaselineBenchmark
[info] Benchmark                               (size)  (vLoSize)  Mode  Cnt         Score         Error  Units
[info] ArrayBaselineBenchmark.sliceLaterHalf       39         39  avgt   30        66.163 ±      14.964  ns/op
[info] ArrayBaselineBenchmark.sliceLaterHalf      282         39  avgt   30       701.774 ±     320.821  ns/op
[info] ArrayBaselineBenchmark.sliceLaterHalf    73121         39  avgt   30    115300.047 ±   40758.378  ns/op
[info] ArrayBaselineBenchmark.sliceLaterHalf  7312102         39  avgt   30  15430154.435 ± 3136356.950  ns/op

ArrayView implementation
[info] Benchmark                           (size)  (vLoSize)  Mode  Cnt         Score         Error  Units
[info] ArrayViewBenchmark.sliceLaterHalf       39         39  avgt   30       767.224 ±     257.456  ns/op
[info] ArrayViewBenchmark.sliceLaterHalf      282         39  avgt   30      5511.769 ±    2177.406  ns/op
[info] ArrayViewBenchmark.sliceLaterHalf    73121         39  avgt   30   1000808.203 ±   12135.514  ns/op
[info] ArrayViewBenchmark.sliceLaterHalf  7312102         39  avgt   30  86467759.270 ± 1719425.377  ns/op

Mike's optimization (https://github.com/scala/scala/pull/5652)
[info] Benchmark                               (size)  (vLoSize)  Mode  Cnt         Score         Error  Units
[info] ArrayBaselineBenchmark.sliceLaterHalf       39         39  avgt   30        33.081 ±       0.700  ns/op
[info] ArrayBaselineBenchmark.sliceLaterHalf      282         39  avgt   30       221.226 ±       4.331  ns/op
[info] ArrayBaselineBenchmark.sliceLaterHalf    73121         39  avgt   30     67573.426 ±    9092.555  ns/op
[info] ArrayBaselineBenchmark.sliceLaterHalf  7312102         39  avgt   30  17210746.261 ± 4301349.666  ns/op

@ackratos ackratos changed the title optimise arrayOps in strawman optimise arrayOps in strawman collection Dec 10, 2017
@julienrf
Contributor

Hey, thanks for taking this initiative but I’m afraid this is going to overlap with #278.

Contributor

@szeiger szeiger left a comment

Looks good in general but should be rebased on top of #278 once that has been merged. This will remove ArrayView and leave us with ArrayOps, WrappedArray and ImmutableArray as possible candidates for this optimization.

@@ -12,7 +12,20 @@ package object collection extends LowPriority {
implicit def stringToStringOps(s: String): immutable.StringOps = new immutable.StringOps(s)

/** Decorator to add collection operations to arrays. */
implicit def arrayToArrayOps[A](as: Array[A]): ArrayOps[A] = new ArrayOps[A](as)
object ArrayOpsDecorators {

Needs a fallback for Array[Any] that dispatches dynamically
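
To illustrate what a dynamically dispatching fallback could look like, here is a minimal sketch. It is not code from this PR: `SliceDispatchSketch` and `sliceDynamic` are hypothetical names, and the sketch assumes that matching on the runtime array class and delegating to `java.util.Arrays.copyOfRange` is an acceptable strategy when the static element type is unknown (for example `Array[Any]`).

```scala
import java.util.Arrays.copyOfRange

object SliceDispatchSketch {
  /** Hypothetical helper, not part of the PR: copies xs[from, until) by
    * matching on the runtime array class, so it also works when the
    * static element type is not known at the call site. */
  def sliceDynamic[A](xs: Array[A], from: Int, until: Int): Array[A] = {
    // Clamp the bounds so that 0 <= lo <= hi <= xs.length always holds.
    val lo = math.min(math.max(from, 0), xs.length)
    val hi = math.min(math.max(until, lo), xs.length)
    val copy = ((xs: Array[_]): @unchecked) match {
      // Reference arrays (including Array[Any]) keep their runtime component type.
      case x: Array[AnyRef]  => copyOfRange(x, lo, hi)
      case x: Array[Int]     => copyOfRange(x, lo, hi)
      case x: Array[Long]    => copyOfRange(x, lo, hi)
      case x: Array[Double]  => copyOfRange(x, lo, hi)
      case x: Array[Float]   => copyOfRange(x, lo, hi)
      case x: Array[Char]    => copyOfRange(x, lo, hi)
      case x: Array[Byte]    => copyOfRange(x, lo, hi)
      case x: Array[Short]   => copyOfRange(x, lo, hi)
      case x: Array[Boolean] => copyOfRange(x, lo, hi)
    }
    copy.asInstanceOf[Array[A]]
  }
}
```

Statically typed decorators for each primitive array type could skip the match entirely, and a fallback of this shape would only be reached when the element type is not known at compile time.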

@ackratos
Contributor Author

@julienrf thanks for the reminder. Fortunately, this gives me a chance to learn from #278.

@ackratos
Contributor Author

@szeiger yes, I will use #278 as my baseline. Thanks for the changes.

@julienrf
Contributor

Hey @ackratos, now that #278 has been merged you can continue this work. I think this optimization is useful. I’m wondering what the cost of its implementation is in terms of bytecode size.
Also, I noticed that slice could be optimized in general for indexed sequences, simply by implementing it in terms of view.take(…).drop(…) instead of View.Drop(View.Take(…), …). Would you be interested in finishing this PR, adding the optimization I just mentioned, and then benchmarking everything?
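
The idea behind the take-then-drop suggestion can be sketched as follows. This is only an illustration under stated assumptions, not the strawman's actual View API: `IndexedSliceSketch`, `IdxView`, and `fromArray` are hypothetical names. The sketch assumes that an indexed view knows its length and supports random access, so `take` and `drop` can each return another indexed view by adjusting bounds without visiting any element.

```scala
object IndexedSliceSketch {
  /** Hypothetical minimal indexed view: random access plus a known length. */
  trait IdxView[A] { self =>
    def length: Int
    def apply(i: Int): A

    // take only shrinks the visible length; indices are unchanged.
    def take(n: Int): IdxView[A] = new IdxView[A] {
      val length: Int = math.min(math.max(n, 0), self.length)
      def apply(i: Int): A = self(i)
    }

    // drop shifts every index by the number of skipped elements.
    def drop(n: Int): IdxView[A] = new IdxView[A] {
      private val skipped = math.min(math.max(n, 0), self.length)
      val length: Int = self.length - skipped
      def apply(i: Int): A = self(i + skipped)
    }

    // slice expressed as take followed by drop; the result stays indexed.
    def slice(from: Int, until: Int): IdxView[A] = take(until).drop(from)

    // Materialises the view; only here are elements actually read.
    def toVector: Vector[A] = Vector.tabulate(length)(i => apply(i))
  }

  /** Wraps an array without copying it. */
  def fromArray[A](xs: Array[A]): IdxView[A] = new IdxView[A] {
    def length: Int = xs.length
    def apply(i: Int): A = xs(i)
  }
}
```

With this shape, slicing costs two small wrapper allocations regardless of the collection size, whereas a generic iterator-based drop has to step over the skipped elements one by one.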

@ackratos
Contributor Author

ackratos commented Jan 15, 2018

collection-strawman's Array slice still has room for improvement. In the results below, the first block is collection-strawman and the second is Mike's optimization:
scala/scala#5652
https://github.com/scala/scala/compare/2.12.x...rorygraves:cong/2.12.x_ArrayTest?expand=1

[info] Result "access_slice":
[info]   2935893.274 ±(99.9%) 874438.239 ns/op [Average]
[info]   (min, avg, max) = (2003209.475, 2935893.274, 5066064.713), stdev = 858815.273
[info]   CI (99.9%): [2061455.035, 3810331.513] (assumes normal distribution)
[info] 
[info] 
[info] # Run complete. Total time: 00:09:07
[info] 
[info] Benchmark                              (size)  Mode  Cnt        Score        Error  Units
[info] ImmutableArrayBenchmark.access_slice        0  avgt   16       41.207 ±      2.480  ns/op
[info] ImmutableArrayBenchmark.access_slice        1  avgt   16       41.881 ±      3.234  ns/op
[info] ImmutableArrayBenchmark.access_slice        2  avgt   16       39.314 ±      0.550  ns/op
[info] ImmutableArrayBenchmark.access_slice        3  avgt   16       40.444 ±      0.839  ns/op
[info] ImmutableArrayBenchmark.access_slice        4  avgt   16       40.220 ±      1.409  ns/op
[info] ImmutableArrayBenchmark.access_slice        7  avgt   16       40.070 ±      0.402  ns/op
[info] ImmutableArrayBenchmark.access_slice        8  avgt   16       41.119 ±      1.410  ns/op
[info] ImmutableArrayBenchmark.access_slice       15  avgt   16       40.376 ±      1.254  ns/op
[info] ImmutableArrayBenchmark.access_slice       16  avgt   16       40.768 ±      0.497  ns/op
[info] ImmutableArrayBenchmark.access_slice       17  avgt   16       40.328 ±      0.406  ns/op
[info] ImmutableArrayBenchmark.access_slice       39  avgt   16       42.270 ±      0.349  ns/op
[info] ImmutableArrayBenchmark.access_slice      282  avgt   16       83.971 ±      4.216  ns/op
[info] ImmutableArrayBenchmark.access_slice     4096  avgt   16     1492.959 ±    476.279  ns/op
[info] ImmutableArrayBenchmark.access_slice   131070  avgt   16    34739.869 ±   4089.021  ns/op
[info] ImmutableArrayBenchmark.access_slice  7312102  avgt   16  2935893.274 ± 874438.239  ns/op
[info] 
[info] Benchmark result is saved to /Users/zhaocong/Developer/collection-strawman/benchmarks/time/target/scala-2.13.0-M2/jmh-result.json
[success] Total time: 559 s, completed Jan 16, 2018 12:06:16 AM


[info] Result "scala.collection.mutable.ArrayBaselineBenchmark.access_slice":
[info]   1307237.310 ±(99.9%) 478467.896 ns/op [Average]
[info]   (min, avg, max) = (1020158.565, 1307237.310, 1817316.572), stdev = 250247.961
[info]   CI (99.9%): [828769.414, 1785705.207] (assumes normal distribution)
[info] 
[info] 
[info] # Run complete. Total time: 00:04:44
[info] 
[info] Benchmark                             (size)  Mode  Cnt        Score        Error  Units
[info] ArrayBaselineBenchmark.access_slice        0  avgt    8       14.009 ±      7.245  ns/op
[info] ArrayBaselineBenchmark.access_slice        1  avgt    8       18.760 ±     10.962  ns/op
[info] ArrayBaselineBenchmark.access_slice        2  avgt    8       14.603 ±      4.859  ns/op
[info] ArrayBaselineBenchmark.access_slice        3  avgt    8       12.027 ±      5.900  ns/op
[info] ArrayBaselineBenchmark.access_slice        4  avgt    8       18.682 ±      6.070  ns/op
[info] ArrayBaselineBenchmark.access_slice        7  avgt    8       26.929 ±     26.067  ns/op
[info] ArrayBaselineBenchmark.access_slice        8  avgt    8       30.137 ±     26.866  ns/op
[info] ArrayBaselineBenchmark.access_slice       15  avgt    8       40.062 ±     36.232  ns/op
[info] ArrayBaselineBenchmark.access_slice       16  avgt    8       42.642 ±     34.040  ns/op
[info] ArrayBaselineBenchmark.access_slice       17  avgt    8       19.694 ±      1.995  ns/op
[info] ArrayBaselineBenchmark.access_slice       39  avgt    8       21.482 ±      5.269  ns/op
[info] ArrayBaselineBenchmark.access_slice      282  avgt    8       93.949 ±    120.360  ns/op
[info] ArrayBaselineBenchmark.access_slice     4096  avgt    8      468.349 ±    197.093  ns/op
[info] ArrayBaselineBenchmark.access_slice   131070  avgt    8    11484.873 ±   3698.272  ns/op
[info] ArrayBaselineBenchmark.access_slice  7312102  avgt    8  1307237.310 ± 478467.896  ns/op
[success] Total time: 305 s, completed Jan 15, 2018 11:24:13 PM

@ackratos
Contributor Author

Closing this, as I have a new PR that contains fewer changes and is easier to review: #354

@ackratos ackratos closed this Jan 20, 2018