[WIP] Zinc + Persistent Bazel Worker Processes #12
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed, please reply here (e.g.
progress_message="scala ijar %s" % ctx.label,)

def _compile(ctx, jars, buildijar, usezinc):
Refactored this a bit so that it can be reused for both scalac and zinc invocations
@@ -254,7 +348,8 @@ _implicit_deps = {
  "_ijar": attr.label(executable=True, default=Label("//tools/defaults:ijar"), single_file=True, allow_files=True),
  "_scalac": attr.label(executable=True, default=Label("@scala//:bin/scalac"), single_file=True, allow_files=True),
  "_scalalib": attr.label(default=Label("@scala//:lib/scala-library.jar"), single_file=True, allow_files=True),
  "_scalaxml": attr.label(default=Label("@scala//:lib/scala-xml_2.11-1.0.4.jar"), single_file=True, allow_files=True),
  # "_scalaxml": attr.label(default=Label("@scala//:lib/scala-xml_2.11-1.0.4.jar"), single_file=True, allow_files=True),
  "_scalaxml": attr.label(default=Label("@scala//:lib/scala-library.jar"), single_file=True, allow_files=True),
what is this change about?
ahh, was this a hack to work around 2.11 being hardwired in here?
Exactly. I need to find a better way to handle this.
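One possible direction, sketched here for illustration only and not taken from this PR: build the jar label from a version parameter instead of hardcoding 2.11. The helper name scala_xml_label and its parameters are hypothetical, and a Scala distribution that does not ship a separate scala-xml jar would still need its own fallback.

```python
def scala_xml_label(scala_binary_version, xml_version="1.0.4", repo="@scala"):
    """Return the label string of the scala-xml jar for a given Scala binary version.

    Hypothetical helper: the label layout mirrors the one hardwired in the diff above.
    """
    return "%s//:lib/scala-xml_%s-%s.jar" % (repo, scala_binary_version, xml_version)
```

With scala_binary_version set to "2.11" this reproduces the label that is currently hardwired.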
Note that the author of this PR hasn't signed the CLA. I will meet with our open-source office on Friday and ask them if we need the Google CLA for this repo. Until then we cannot accept this PR unless the author signs the CLA.
"main_class": attr.string(),
"exports": attr.label_list(allow_files=False),
# Worker Args
"worker": attr.label(
is this a private label? Should users ever change this? If private, can we use _worker?
related to #8
Thanks so much for working on this! Very exciting! Open issues:
Btw, /cc @philwo FYI, a PR to use worker for Scala/Zinc :)
for (input <- request.getInputsList().asScala) {
  inputs.put(input.getPath(), input.getDigest().toStringUtf8())
}
nit: idiomatic scala would be:
request.getInputsList.asScala.foreach { input =>
  inputs.put(input.getPath, input.getDigest.toStringUtf8)
}
related to #14
@ahirreddy great! I'd be interested to see if we can just jump in here: and make an invocation. For instance:
@annotation.tailrec
def loop(): Unit = {
  val args = readArgsFromProto
  val comp = new MainClass
  // see: https://github.com/scala/scala/blob/2.11.x/src/compiler/scala/tools/nsc/Driver.scala#L37
  comp.process(args)
  if (comp.reporter.hasErrors) reportErrors(comp.reporter) else reportSuccess()
  loop()
}
might do it, and as long as there are no static vars, that should be exactly like running scalac except we get the jit to warm up. What do you think of that?
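To make the shape of that loop concrete, here is a rough, language-neutral sketch in Python. It assumes one fresh compiler object per request so that no state carries over between calls. FakeCompiler, CompileResult and serve are hypothetical names, not scalac or Bazel APIs, and a real worker would read and write Bazel's worker protocol messages on stdin/stdout rather than iterate over a list.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class CompileResult:
    exit_code: int
    output: str


class FakeCompiler:
    """Hypothetical stand-in for a compiler driver; not a real scalac API."""

    def __init__(self) -> None:
        self.errors: List[str] = []

    def process(self, args: List[str]) -> None:
        # Pretend that any argument ending in ".bad" fails to compile.
        self.errors = ["error: cannot compile %s" % a for a in args if a.endswith(".bad")]


def serve(requests: Iterable[List[str]],
          new_compiler: Callable[[], FakeCompiler] = FakeCompiler) -> Iterator[CompileResult]:
    """Answer each request with a fresh compiler so nothing leaks between calls."""
    for args in requests:
        compiler = new_compiler()
        compiler.process(args)
        if compiler.errors:
            yield CompileResult(1, "\n".join(compiler.errors))
        else:
            yield CompileResult(0, "")
```

The process stays alive across requests, which is where the JIT warm-up would come from, while each request still sees a new compiler instance.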
@ahirreddy Thanks. I have met with our open-source office, and they confirmed that we need a CLA for everything that is contributed to a bazelbuild repository.
@johnynek Currently, Bazel starts up to 4 workers per "key", where a key is a unique combination of command-line to start the worker, environment variables, working directory and Spawn mnemonic. (In other words, if you use both the Javac persistent worker and the Scala persistent workers, there will be up to 4 workers spawned for each.) You can tune that via the --worker_max_instances flag: https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/worker/WorkerOptions.java#L48
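As a toy model of the keying described above (an illustration under stated assumptions, not Bazel's actual implementation): workers are grouped by a key made of command line, environment, working directory and mnemonic, and each key gets its own cap. WorkerKey and WorkerPool are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple


class WorkerKey(NamedTuple):
    command_line: Tuple[str, ...]
    env: Tuple[Tuple[str, str], ...]
    working_dir: str
    mnemonic: str


class WorkerPool:
    """Toy pool that caps the number of workers per key (default 4, as described above)."""

    def __init__(self, max_instances: int = 4) -> None:
        self.max_instances = max_instances
        self._workers: Dict[WorkerKey, List[int]] = defaultdict(list)
        self._next_id = 0

    def start(self, key: WorkerKey) -> Optional[int]:
        """Start a worker for key, or return None if the per-key cap is reached."""
        if len(self._workers[key]) >= self.max_instances:
            return None
        worker_id = self._next_id
        self._next_id += 1
        self._workers[key].append(worker_id)
        return worker_id

    def count(self, key: WorkerKey) -> int:
        return len(self._workers.get(key, []))
```

Because the cap applies per key, a Javac key and a Scala key can each reach the limit independently.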
@ahirreddy I'd love to see this in Bazel - is there anything we could do to help you get it ready for merging?
Hey @philwo I'll try to get to it this weekend! I've been somewhat backed up with other tasks, but I really want to get this merged :)
Any updates on this?
we are using this internally right now. I work with ahir. There is currently an issue with the caching as well because the zinc worker emits a temporary directory instead of a jar. This breaks caching because it can't generate a digest for a directory as an input. I'm working on this issue right now.
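One way such a fix could look, sketched in Python purely for illustration (this is not the worker's actual code): pack the output directory into a single jar so there is one file to digest. The sorted entry order and the constant timestamp are assumptions added here so that identical contents give an identical digest; pack_directory and FIXED_TIME are hypothetical names.

```python
import hashlib
import os
import zipfile

# Assumption: any constant timestamp works; 1980-01-01 is the earliest the zip format allows.
FIXED_TIME = (1980, 1, 1, 0, 0, 0)


def pack_directory(src_dir: str, jar_path: str) -> str:
    """Write src_dir into jar_path deterministically and return the jar's SHA-256 hex digest."""
    entries = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, src_dir).replace(os.sep, "/")
            entries.append((rel, full))
    entries.sort()  # stable entry order, independent of filesystem listing order
    with zipfile.ZipFile(jar_path, "w", zipfile.ZIP_STORED) as jar:
        for rel, full in entries:
            info = zipfile.ZipInfo(rel, date_time=FIXED_TIME)
            info.external_attr = 0o644 << 16  # fixed permissions as well
            with open(full, "rb") as f:
                jar.writestr(info, f.read())
    with open(jar_path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()
```

The jar path should sit outside src_dir so that the archive does not try to include itself.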
I'm pessimistic about the incremental compiler aspect of That said, I think doing a worker process just using the normal compiler should be pretty easy and we can reinstantiate a new compiler on each call, and I think the JIT will give us a pretty good win. We are using bazel in the mode where CI does not do a full rebuild, so we care a lot about the correctness part and would not want to trade that for a slightly faster compile.
+1 on correctness.
+1 on favoring correctness over speed and leaving out the incremental compilation. JIT and no more JVM start-up overhead should already give a pretty nice performance boost in itself.
@xynny @ahirreddy
This PR doesn't actually use the incremental compiler component of Zinc, so we have no issues around correctness or reproducibility. Zinc just happens to provide a nice persistent build server wrapper around the Scala compiler. Non-reproducible builds on our end have usually been caused by timestamp issues in jars. I've actually worked on this a lot internally, adding code coverage support and support for Scala 2.10 and 2.11 (another Zinc benefit is that it can host Scala 2.10 and 2.11 compilers simultaneously). I'll try to push these upstream once I have some time.
Can one of the admins verify this patch?
Hi @ahirreddy,
I'd also be interested in how this project is going :) It would be an awesome and very welcome contribution to Bazel.
closed by #91
This is a WIP PR that integrates Zinc with Bazel's support for persistent worker processes. The main advantage of running with persistent Zinc workers is that warm Scala compilers are reused across builds (this includes reusing warm compiler instances across different build targets). I've found significant compile time improvements on large builds (a 3x speedup for our internal build), although I've done no concrete benchmarking.
This PR does not enable the incremental compiler. It is unclear if it makes sense to enable incremental re-compilation within Bazel, as I do not believe the results of incremental compilation are guaranteed to be reproducible.
To use the persistent worker, use the scala_worker rule and run:
bazel build --strategy=Scala=worker --worker_max_instances=<n> <target>
You can also add these default parameters to your .bazelrc (which can be global or per repo) so that you don't have to constantly type them:
Notes:
java_binary rule instead of the scala_binary because the scala rule changes the present working directory before invoking the JVM. The issue here is that Bazel provides input paths and arguments to the worker process as relative paths that only resolve in the initial working directory.
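A small sketch of why the working directory matters, and of one possible workaround (an assumption for illustration, not what this PR does): capture the initial working directory once at startup and resolve every relative path against it, so a later directory change no longer breaks lookups. PathResolver is a hypothetical name.

```python
import os


class PathResolver:
    """Resolve relative paths against the directory the process started in."""

    def __init__(self, initial_cwd=None):
        # Capture the directory once, before anything has a chance to chdir.
        self.initial_cwd = os.path.abspath(initial_cwd or os.getcwd())

    def resolve(self, path):
        if os.path.isabs(path):
            return path
        return os.path.normpath(os.path.join(self.initial_cwd, path))
```

A worker would build the resolver first thing on startup and pass every input path from a request through resolve before handing it to the compiler.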