Update benchmarks to use Scala 2.12 or 2.13 #242

lbulej · 2021-04-20T09:18:36Z

While this branch started with the simple idea of updating Scala in various benchmarks to recent versions, it turned out to be a much more involved endeavor, with a lot of fallout that needed handling. To make the benchmarks compatible with newer Scala, their dependencies needed to be updated, sometimes substantially (e.g., Spark, Neo4J, Dotty). However, when the dust settles, the suite should be more compatible with modern JVMs. There are a few things that still need to be done before release, but these should come through much smaller PRs.

Key changes

Suite uses Scala versions 2.12.13 and 2.13.5 only, Scala 2.11 is no longer used
Spark benchmarks now use Spark 3.0.1, Spark 2.0.0 is no longer used
Benchmarks can be in more than one benchmark group
Benchmarks can specify requirements on JVM versions
Harness avoids executing benchmarks on incompatible JVM
Harness provides better control over scratch directories
Build system uses SBT 1.4.9 (and some updated plugins)

Follow-up changes

Replace the contents of sources.zip in dotty benchmark with scalap sources
Clean-up scratch file and directory handling in all benchmarks
Capture requirements on JVM versions for the rest of the benchmarks
Update README.md generator to output benchmark JVM version requirements
Handle Spark parameters consistently in Spark benchmarks

Used Scala versions

The following benchmarks now use Scala 2.12:

reactors
apache-spark group (als, chi-square, dec-tree, gauss-mix, log-regression, movie-lens, naive-bayes, page-rank)
neo4j-analytics
scala-stm group (philosophers, scala-stm)
twitter-finagle group (finagle-chirper, finagle-http)

The following benchmarks use Scala 2.13:

akka-uct
dotty
scala-doku
scala-kmeans

The following are Java benchmarks with wrappers using Scala 2.13:

db-shootout
jdk-concurrent group (fj-kmeans, future-genetic)
jdk-streams group (mnemonics, par-mnemonics, scrabble)
rx-scrabble

Benchmark changes

Benchmark can be in more than one benchmark group. This can be achieved by providing the benchmark class with multiple @Group annotations.
Benchmarks can specify requirements on JVM versions. There are two new annotations for this purpose: the @JvmRequired annotation, which allows setting the minimum required JVM version a benchmark needs to run (inclusive, defaults to "1.8" if unset), and the @JvmSupported annotation, which allows setting the maximum supported JVM version a benchmark runs on (also inclusive, unset by default).
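
A minimal sketch of how annotations like these could be declared and read through reflection. The annotation names follow the description above, but everything else is an assumption made for illustration: the declarations are local stand-ins rather than the suite's own types, and the `Groups` container, the element names, the `ExampleBenchmark` class, and the helper methods are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// All types are nested in one class only to keep the sketch in a single file.
final class AnnotationSketch {

  // Hypothetical stand-ins for the suite's annotations; the real declarations may differ.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @Repeatable(Groups.class)
  @interface Group {
    String value();
  }

  // Container annotation that Java requires for any repeatable annotation.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface Groups {
    Group[] value();
  }

  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface JvmRequired {
    String value();
  }

  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface JvmSupported {
    String value();
  }

  // Hypothetical benchmark that belongs to two groups and runs on JVM 11 through 15.
  @Group("group-a")
  @Group("group-b")
  @JvmRequired("11")
  @JvmSupported("15")
  static final class ExampleBenchmark {}

  // Hypothetical benchmark without any annotations.
  static final class PlainBenchmark {}

  // Collects the names of all groups, whether @Group is present once or repeated.
  static List<String> groupsOf(Class<?> benchmark) {
    List<String> names = new ArrayList<>();
    for (Group group : benchmark.getAnnotationsByType(Group.class)) {
      names.add(group.value());
    }
    return names;
  }

  // Minimum JVM version (inclusive); falls back to "1.8" when the annotation is absent.
  static String minJvmOf(Class<?> benchmark) {
    JvmRequired required = benchmark.getAnnotation(JvmRequired.class);
    return required != null ? required.value() : "1.8";
  }

  // Maximum JVM version (inclusive); null means there is no upper bound.
  static String maxJvmOf(Class<?> benchmark) {
    JvmSupported supported = benchmark.getAnnotation(JvmSupported.class);
    return supported != null ? supported.value() : null;
  }

  public static void main(String[] args) {
    System.out.println("groups: " + groupsOf(ExampleBenchmark.class));
    System.out.println("min JVM: " + minJvmOf(ExampleBenchmark.class));
    System.out.println("max JVM: " + maxJvmOf(ExampleBenchmark.class));
  }
}
```

The fallback to "1.8" in `minJvmOf` mirrors the stated default for an unset minimum version; the `null` upper bound mirrors the maximum being unset by default.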

The following provides a more detailed summary of changes in individual benchmarks.

actors/akka-uct (Scala 2.13)

Updated Scala version (2.11.8 -> 2.13.5)
Split off into separate subproject to make it independent of reactors
Updated com.typesafe.actor.akka-actor dependency (2.3.11 -> 2.6.12)

actors/reactors (Scala 2.12)

Updated Scala version (2.11.8 -> 2.12.13)
Split off into separate subproject to make it independent of akka-uct
Updated io.reactors.reactors-core dependency (0.7 -> 0.9-renaissance-83d194)
- The dependency was ported/cleaned from Scala 2.11 to Scala 2.12
- https://github.com/renaissance-benchmarks/dependency-reactors/tree/renaissance/export

apache-spark/* (Scala 2.12)

Updated Scala version (2.11.8 -> 2.12.13)
- Spark 3.0 does not support Scala 2.13
Updated org.apache.spark.spark-core dependency (2.0.0 -> 3.0.1)
Updated org.apache.spark.spark-sql dependency (2.0.0 -> 3.0.1)
Updated org.apache.spark.spark-mllib dependency (2.0.0 -> 3.0.1)

database/db-shootout (Scala 2.13)

Scala wrapper over Java benchmark
Updated Scala version (2.11.8 -> 2.13.5)

dummy/* (Java only)

Internal benchmarks for harness testing
Made into pure Java subproject

jdk-concurrent/* (Scala 2.13)

Scala wrapper over Java benchmarks
Updated Scala version (2.12.8 -> 2.13.5)

jdk-streams/* (Scala 2.13)

Scala wrapper over Java benchmarks
Updated Scala version (2.12.8 -> 2.13.5)

neo4j/neo4j-analytics (Scala 2.12)

Updated Scala version (2.12.8 -> 2.12.13)
Benchmark ported to Neo4J 4.x API
Cleaned up scratch file handling
Updated net.liftweb.lift-json dependency (3.2.0 -> 3.4.3)
Updated org.neo4j.neo4j dependency (3.5.12 -> 4.2.4)
- Neo4J 4.2 does not support Scala 2.13
Removed direct commons-io.commons-io dependency

rx/rx-scrabble (Scala 2.13)

Scala wrapper over Java benchmark
Updated Scala version (2.11.8 -> 2.13.5)

scala-dotty/dotty (Scala 2.13)

Updated Scala version (2.12.8 -> 2.13.5)
Updated contents of sources.zip to make it compatible with updated Dotty
- The plan is to replace the contents with scalap sources
Added org.scala-lang.modules.scala-collection-compat dependency (2.4.2)
Added org.scala-lang.scala3-compiler_3.0.0-RC1 dependency (3.0.0-RC1)
Removed direct ch.epfl.lamp.dotty-compiler_0.12 dependency
Removed direct commons-io.commons-io dependency

scala-sat/scala-doku (Scala 2.13)

Updated Scala version (2.11.7 -> 2.13.5)
Updated bundled dependency on CafeSat (0.01 -> 0.01-28-gd0edeaa)
- The dependency was ported/cleaned from Scala 2.11 to Scala 2.13
- https://github.com/renaissance-benchmarks/dependency-cafesat/tree/renaissance/export
Updated bundled dependency on com.regblanc.scala-smtlib (0.1 -> 0.2.1-52-ga71d6b0)
- The dependency was ported/cleaned from Scala 2.11 to Scala 2.13
- https://github.com/renaissance-benchmarks/dependency-scala-smtlib/tree/renaissance/export

scala-stdlib/scala-kmeans (Scala 2.13)

Updated Scala version (2.12.8 -> 2.13.5)

scala-stm/* (Scala 2.12)

Updated Scala version (2.12.3 -> 2.12.13)
The bundled org.scala-stm.scala-stm-library dependency will need updating for Scala 2.13

twitter-finagle/* (Scala 2.12)

Updated Scala version (2.11.8 -> 2.12.13)
Removed direct com.twitter.twitter-server dependency (19.4.0)
Removed direct com.twitter.common.metrics dependency (0.0.39)
Removed direct com.twitter.common.io dependency (0.0.69)
Removed direct com.twitter.util-events dependency (7.0.0)
Added com.google.guava.guava dependency (19.0)
Added commons-io.commons-io dependency (2.4)

Externally visible core/harness changes

The BenchmarkContext interface has been changed to reduce the clutter caused by multiple methods for accessing benchmark parameters. The interface now provides only a single parameter() method returning an instance of BenchmarkParameter, which in turn provides convenience methods for parameter value conversion, allowing the two interfaces to evolve independently.
The handling of scratch directories has been cleaned up, triggering a number of changes:
- The BenchmarkContext interface provides the scratchDirectory() method, which provides a benchmark-specific scratch directory, which is removed when JVM exits. The other methods related to creating temporary directories will be removed when all benchmarks are updated to use the new interface.
- The DirUtils class provides methods to recursively clean or delete directories so that benchmarks don't need to depend on the Apache commons-io library just for that.
- The suite accepts two new command-line options --scratch-base and --keep-scratch. The first allows setting a base directory for scratch directories created by the suite, and the second prevents removal of those directories on JVM exit (for testing/debugging purposes).
- The JMH wrapper (JmhRenaissanceBenchmark) provides similar functionality through system settings org.renaissance.jmh.scratchBase (path) and org.renaissance.jmh.keepScratch (boolean).
The suite now checks the JVM version requirements of the benchmarks selected for execution and excludes benchmarks that are incompatible with the current JVM.
- Consequently, the --raw-list option, which is intended for use in shell scripts, will only list benchmarks compatible with the JVM.
- In both cases, the filtering can be disabled by using the --no-jvm-check option.
The JMH wrapper can substitute (and execute) the dummy-empty benchmark in place of an incompatible benchmark (after issuing a textual warning). This is only useful for CI automation, because JMH does not provide a way to report/skip an incompatible benchmark without failure, and (unlike the normal Renaissance bundle) the JMH-enabled bundle cannot provide a list of compatible benchmarks. This behavior must be explicitly enabled by setting the org.renaissance.jmh.fakeIncompatible system property to true.
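
The scratch directory handling described in this list can be illustrated with a small sketch that uses only `java.nio.file`, which is the kind of helper that makes a commons-io dependency unnecessary for recursive deletion. This is not the suite's DirUtils code: the `ScratchSketch` class, its method names, the `scratch-` prefix, and the fallback to `java.io.tmpdir` are assumptions made for illustration. Only the two system property names are taken from the description above.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper; names and behavior are illustrative only.
final class ScratchSketch {

  // Deletes a directory tree bottom-up: files first, then each directory after its contents.
  static void deleteRecursively(Path root) throws IOException {
    if (!Files.exists(root)) {
      return;
    }
    Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
      @Override
      public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        Files.delete(file);
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult postVisitDirectory(Path dir, IOException error) throws IOException {
        if (error != null) {
          throw error;
        }
        Files.delete(dir);
        return FileVisitResult.CONTINUE;
      }
    });
  }

  // Creates a fresh scratch directory under the base and, unless asked to keep it,
  // registers a shutdown hook that removes it when the JVM exits.
  static Path createScratch(Path base, boolean keep) throws IOException {
    Files.createDirectories(base);
    Path scratch = Files.createTempDirectory(base, "scratch-");
    if (!keep) {
      Runtime.getRuntime().addShutdownHook(new Thread(() -> {
        try {
          deleteRecursively(scratch);
        } catch (IOException e) {
          System.err.println("Failed to remove " + scratch + ": " + e);
        }
      }));
    }
    return scratch;
  }

  public static void main(String[] args) throws IOException {
    // Property names come from the PR description; the temp-dir fallback is an assumption.
    String base = System.getProperty(
      "org.renaissance.jmh.scratchBase", System.getProperty("java.io.tmpdir"));
    boolean keep = Boolean.getBoolean("org.renaissance.jmh.keepScratch");
    System.out.println("scratch directory: " + createScratch(Paths.get(base), keep));
  }
}
```

Deleting in `postVisitDirectory` rather than `preVisitDirectory` matters here, because a directory can only be removed once it is empty.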

Internal core/harness changes

The Launcher, RenaissanceSuite, and the JmhRenaissanceBenchmark classes now create independent scratch directories to reflect the different contexts in which they execute.
The ModuleLoader class is no longer a class-based (static) singleton, but is supposed to be instantiated and needs to be provided with a scratch directory into which it can write module libraries.
The BenchmarkInfo class now provides a method to load a particular benchmark using a given ModuleLoader instance. The method has been moved from the BenchmarkRegistry class.

Keeps the dependencies untouched (yet), but adds comments to indicate potential upgrade paths. This may actually require splitting the akka-uct and reactors benchmarks into two sub-projects. AkkaUct depends on akka-actor (which is still being developed). Versions 2.5.x and 2.6.x of akka-actor support Scala up to 2.13, version 2.3.x supports Scala only up to 2.11. Reactors depends on reactors-core, which (even in version 0.8) only supports Scala up to 2.11. Version 0.9 is in development, but it is not clear which version of Scala it supports.

Neo4j apparently needs a lot more work, because versions 3.5.x do not work with Scala 2.12.12 or 2.13.x. When compiled with Scala 2.12.12, the benchmark crashes during initialization, failing to find the class definition for scala.Product$class despite everything being on the class path.

This is done by loading and executing the `dummy-empty` benchmark instead of the incompatible benchmark. This must be enabled by setting the `org.renaissance.jmh.fakeIncompatible` property to `true`. This is now used in the Travis configuration.

farquet

This is an incredible work on both the quality and quantity sides!

I have a few minor comments based on the code review. I will now test the branch (and its new options) locally before giving the approval.

benchmarks/apache-spark/src/main/scala/org/renaissance/apache/spark/SparkUtil.scala

benchmarks/scala-dotty/build.sbt

benchmarks/scala-dotty/src/main/scala/org/renaissance/scala/dotty/Dotty.scala

benchmarks/scala-sat/cafesat/build.sbt

benchmarks/scala-sat/src/main/scala/org.renaissance.scala.sat/ScalaDoku.scala

This was triggered by the need to provide access to a boolean parameter value. Instead of adding another method to BenchmarkContext, I have modified BenchmarkContext to provide only a single method to access benchmark parameters, which returns BenchmarkParameter instance. The BenchmarkParameter class then provides access to the parameter value (as string) along with a number of convenience methods that convert the string to a typed value. This will prevent changes to the BenchmarkContext interface whenever we want to add a common parameter access method. In addition, we can remove some of the (duplicated) code implementing BenchmarkContext in the ExecutionDriver and JmhRenaissanceBenchmark. Benchmark updates to the interface follow.

In most cases, we also use a toPositiveInteger() method to indicate that the parameter value has to be positive (greater than 0), and in some cases, we use a common toList() method to get a list of typed values from a comma-separated list of elements.
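
The split between a raw string value and typed conversion methods could look roughly like the sketch below. It is not the suite's BenchmarkParameter: the `ParameterSketch` class, its constructor, and the exact signatures are assumptions, and only the method names `toPositiveInteger()` and `toList()` come from the text above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a parameter that wraps a string value.
final class ParameterSketch {
  private final String name;
  private final String value;

  ParameterSketch(String name, String value) {
    this.name = name;
    this.value = value;
  }

  // Raw access to the parameter value as a string.
  String value() {
    return value;
  }

  // Converts the value to an integer and rejects zero and negative numbers.
  int toPositiveInteger() {
    int result = Integer.parseInt(value.trim());
    if (result <= 0) {
      throw new IllegalArgumentException(
        "parameter '" + name + "' must be positive, got " + result);
    }
    return result;
  }

  // Splits a comma-separated value and converts each non-empty element with the given parser.
  <T> List<T> toList(Function<String, T> parser) {
    List<T> result = new ArrayList<>();
    for (String element : value.split(",")) {
      String trimmed = element.trim();
      if (!trimmed.isEmpty()) {
        result.add(parser.apply(trimmed));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    ParameterSketch threads = new ParameterSketch("thread_count", "4");
    ParameterSketch sizes = new ParameterSketch("sizes", "10, 20,30");
    System.out.println(threads.toPositiveInteger());
    System.out.println(sizes.toList(Integer::parseInt));
  }
}
```

With this shape, adding another conversion only touches the parameter class, so the context interface that hands out parameters stays unchanged.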

benchmarks/scala-dotty/src/main/scala/org/renaissance/scala/dotty/Dotty.scala

One less option to worry about, based on the assumption that the --raw-list option is meant for machines (scripts), which should get a list of compatible benchmarks by default. Using the --no-jvm-check option together with --raw-list will produce a list of all benchmarks, regardless of compatibility. The output of the --list option, which is meant for humans, was modified to indicate whether a benchmark is compatible with the current JVM.
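
The compatibility decision behind this filtering can be sketched as an inclusive range check on JVM versions. The code below is an illustration under stated assumptions, not the harness implementation: the `JvmCompatSketch` class and its methods are hypothetical, and comparing only the feature version (so that "1.8" is treated as 8) is a simplification.

```java
// Hypothetical helper; the harness may compare versions differently.
final class JvmCompatSketch {

  // Extracts the feature version from strings such as "1.8", "1.8.0_292", "11.0.10" or "16-ea".
  static int featureVersion(String version) {
    String[] parts = version.trim().split("[._+-]");
    int first = Integer.parseInt(parts[0]);
    if (first == 1 && parts.length > 1) {
      // Legacy scheme used up to Java 8, where the feature version is the second component.
      return Integer.parseInt(parts[1]);
    }
    return first;
  }

  // Both bounds are inclusive; a null upper bound means any newer JVM is acceptable.
  static boolean isCompatible(String current, String required, String supported) {
    int jvm = featureVersion(current);
    if (jvm < featureVersion(required)) {
      return false;
    }
    return supported == null || jvm <= featureVersion(supported);
  }

  public static void main(String[] args) {
    String current = System.getProperty("java.specification.version");
    System.out.println("JVM " + current + " within [1.8, 15]: "
      + isCompatible(current, "1.8", "15"));
  }
}
```

A script-facing listing would then keep only the benchmarks for which such a check succeeds, while a flag that disables the check would skip the call entirely.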

farquet

After careful local experimenting with all features from the branch, everything works like a charm as expected.

Feel free to merge it so that we can iterate on smaller PRs for upcoming changes :)

lbulej · 2021-04-27T16:33:10Z

Thanks for taking the time to review and experiment with this pile of changes! I'll try to go back to much smaller PRs :-)

lbulej added 30 commits September 28, 2020 12:14

Update sbt to version 1.3.13

c7e591b

Update scalafmt in root project to version 2.4.2

d93e675

Use default value for scalafmtConfig, the new version does not like Some() as a value.

Remove unused renaissanceScalaVersion setting from the root project

28670cc

Update filtering hint in root build.sbt

5d9820c

Update scalafmt in renaissance-core to version 2.4.2

0967934

Also add local .scalafmt.conf to including configuration from the root project to make the project work locally.

Update scalafmt in renaissance-harness to version 2.4.2

8b6b5ea

Update scala in renaissance-harness to version 2.13.3

b6f4cfe

Includes minor changes to avoid using deprecated features.

Update scalafmt in benchmarks/dummy to version 2.4.2

1b71f2b

Update scalafmt in benchmarks/jdk-concurrent to version 2.4.2

6bc8128

Update scala in benchmarks/jdk-concurrent to 2.13.3

183bea0

Update scalafmt in benchmarks/jdk-streams to version 2.4.2

bb9db36

Update scala in benchmarks/jdk-concurrent to version 2.13.3

6aaf72d

Update scalafmt in benchmarks/scala-stdlib to version 2.4.2

b99b67e

Update scala in benchmarks/scala-stdlib to version 2.13.3

966a47e

Clean up code style in ScalaKmeans

ffe7f18

Update scalafmt in benchmarks/actors to version 2.4.2

d7ded63

Clean up minor Scala coding style issues in benchmarks/actors

00ce5dd

Update scalafmt in benchmarks/scala-dotty to version 2.4.2

6a2d40a

Clean up minor Scala coding style issues in benchmarks/scala-dotty

b369a65

Update scala in benchmarks/scala-dotty to version 2.12.12

82a56e1

Using Scala 2.13 will require using at least version 0.18.1 of the dotty-compiler library, which will in turn require updating the input sources, because since version 0.13.0, dotty starts to dislike them.

Clean up minor Scala coding style issues in benchmarks/scala-dotty

aa5f08b

Update scalafmt in benchmarks/rx to version 2.4.2

c54dd9b

Update scala in benchmarks/rx to version 2.13.3

5d16e2a

Clean up minor Scala coding style issues in benchmarks/rx

757db94

Update scalafmt in benchmarks/neo4j to version 2.4.2

dfcebb4

Revert "Update filtering hint in root build.sbt"

a309eba

This reverts commit 5d9820c.

Update filtering hint in build.sbt

ce563ee

This time without breaking the loop by completely removing the list of benchmarks.

Enable config-style argument formatting in Scala

b90cf5c

Finally a bit more civilized version to break down arguments

lbulej added 4 commits April 20, 2021 00:04

Use separate ModuleLoader instances instead of static methods

ecda8a4

Avoids ugly dependency on Launcher scratch root directory field, and allows fixing JMH wrappers.

Update JMH wrapper to use ModuleLoader instance

90f3bdb

This allows initializing the ModuleLoader using scratch directory managed by the JMH wrapper, not by the Launcher class.

Update README.md

ad46d31

farquet reviewed Apr 21, 2021

lbulej added 12 commits April 21, 2021 20:49

Fix ScalaDoku package specifier

0e62160

Rephrase a few comments in ScalaDoku

4f66c8d

Add comment regarding scala3 compiler to scala-dotty build file

6cd8805

Use idiomatic initializers in DbShootout

1d6b445

Unify exception messages in BenchmarkInfo

d0eb88f

Remove unsupported method from BenchmarkContext

0da282e

Make resource path explicit at declaration in Dotty

40a8b7f

Update classpath-related comment in Dotty

a999e6b

Update README.md generator to list JVM version requirements

8030d73

Pick up a fix from scala-smtlib repo

235b425

axel22 approved these changes Apr 21, 2021

lbulej added 5 commits April 22, 2021 12:57

Update Travis configuration to match the change in CLI options

a3128a3

Use version relative to upstream master in cafesat

3d17414

Update SBT to version 1.4.9

d23a38c

This is the last of 1.4.x series, 1.5.0 forces the use of the slash notation when referring to settings (which we don't use everywhere yet).

Update README.md

ddf9fcd

farquet approved these changes Apr 27, 2021

farquet mentioned this pull request Apr 27, 2021

Compatibility with Java 16 #241

Closed

lbulej merged commit 8e3f720 into master Apr 27, 2021

lbulej deleted the topic/update-scala-sbt branch April 28, 2021 11:44

lbulej mentioned this pull request Apr 30, 2021

Upgrade Spark version #152

Closed

farquet mentioned this pull request May 3, 2021

Dotty benchmark is failing on master #251

Closed
