Multiplex persistent worker protocol #2832

jart · 2017-04-16T19:04:02Z

Background

Bazel spawns 4 persistent worker processes and sends them requests serially via stdin/stdout.

Requirements

Option to spawn 1 process instead of 4 in ctx.action execution_requirements.
Ability to handle multiple requests simultaneously

Justification

JVM has high memory overhead.
CacheBuilder and SoftReference turn Java GC into super fast LRU cache for ASTs.
Why have 4 caches?

Design No. 1: Multiplex

Continue using stdin/stdout
Add request_id field to worker protocol request and response message protos
Add is_trying boolean field to response proto, sort of like 100 Trying in SIP
Bazel can send another request in parallel if it gets an is_trying response
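As a rough illustration of Design No. 1, the sketch below tags each request with a `request_id` and routes responses back to the right caller, even when they arrive out of order. Only `request_id` and `is_trying` come from the proposal. The `Multiplexer` class, the dict-shaped messages, and the `arguments`, `exit_code` and `output` fields are assumptions made for the example and stand in for the real protos on stdin/stdout.

```python
import threading
from concurrent.futures import Future


class Multiplexer:
    """Hypothetical helper: matches worker responses to callers by request_id."""

    def __init__(self, send):
        self._send = send      # callable that writes one message to the worker
        self._pending = {}     # request_id -> Future awaiting the final response
        self._lock = threading.Lock()
        self._next_id = 1

    def submit(self, arguments):
        """Send a request without waiting for earlier ones to finish."""
        with self._lock:
            request_id = self._next_id
            self._next_id += 1
            future = Future()
            self._pending[request_id] = future
        self._send({"request_id": request_id, "arguments": arguments})
        return future

    def on_response(self, response):
        """Called for every message read back from the worker."""
        if response.get("is_trying"):
            # Provisional acknowledgement: the final response comes later.
            return
        with self._lock:
            future = self._pending.pop(response["request_id"])
        future.set_result(response)


def demo():
    inbox = []  # stands in for the worker's stdin
    mux = Multiplexer(inbox.append)
    first = mux.submit(["a.java"])
    second = mux.submit(["b.java"])
    # Fake worker: acknowledge both requests, then finish them in reverse order.
    for request in inbox:
        mux.on_response({"request_id": request["request_id"], "is_trying": True})
    for request in reversed(inbox):
        mux.on_response({
            "request_id": request["request_id"],
            "exit_code": 0,
            "output": "compiled " + request["arguments"][0],
        })
    return first.result(timeout=1), second.result(timeout=1)
```

Because each caller only waits on its own future, a slow request does not block a faster one that was sent later.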

Design No. 2: TCP

Add upgrade field to response proto that redirects Bazel to a HostAndPort.
All future requests get sent there via TCP
Don't use gRPC just send the raw protos
Maybe allow multiple requests per socket
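Sending raw protos over a byte stream needs some way to tell where one message ends and the next begins. The sketch below assumes a 4-byte big-endian length prefix, which is a choice made for this example and not something the proposal specifies. The payloads are opaque bytes standing in for serialized protos, and `socket.socketpair()` stands in for the TCP connection so the example runs without opening a port.

```python
import socket
import struct


def write_frame(sock, payload):
    """Write one message as a 4-byte big-endian length followed by the bytes."""
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def read_exactly(sock, count):
    """Read exactly `count` bytes, looping because recv may return fewer."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise EOFError("peer closed the connection mid-frame")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def read_frame(sock):
    """Read one length-prefixed message."""
    (length,) = struct.unpack(">I", read_exactly(sock, 4))
    return read_exactly(sock, length)


def demo():
    bazel_side, worker_side = socket.socketpair()
    try:
        # Two requests back to back on the same stream.
        write_frame(bazel_side, b"request-1")
        write_frame(bazel_side, b"request-2")
        return [read_frame(worker_side), read_frame(worker_side)]
    finally:
        bazel_side.close()
        worker_side.close()
```

Framing alone only separates messages. Allowing several requests in flight on one socket would still need an identifier such as the `request_id` from Design No. 1 to match responses to requests.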

CC: @lberki, @meisterT, with whom I socialized the idea offline, IIRC

buchgr · 2017-04-18T06:08:36Z

Don't use gRPC just send the raw protos
Maybe allow multiple requests per socket

Those two seem contradictory. That is, if you want multiple requests per socket, you again need to implement something like gRPC.

lberki · 2017-04-18T07:32:28Z

Do you have data on the costs of not implementing this? Sure, it can be done, but it's yet another feature we have to support, so I'd rather it pays its rent.

abergmeier-dsfishlabs · 2017-04-18T08:07:41Z

Why not have a frontend-process, which delegates to one background-process? Why does it have to be implemented on the Bazel level?

philwo · 2017-04-18T08:42:24Z

Fun fact, a multiplexing TCP-based version of this was implemented in the very first version of persistent workers and worked perfectly fine with a multi-threaded version of the JavaBuilder worker - but was then deemed unnecessary complexity by me and teammates and the code was deleted (AFAIR without even submitting the CL, so we can't restore it from history, ouch) and replaced with the simpler, serialized, multi-process stdin/stdout mechanism. :| Maybe we should have gone with the more complex version in the first place. Hindsight is best sight.

I'll have a look at this! Thanks for writing this proposal down so cleanly.

jart · 2017-04-18T13:30:46Z

@philwo My pleasure. Did you consider using the multiplexing technique described in Design No. 1? That would avoid the socket complexity and should hopefully be pretty straightforward. The user could continue doing things the simple way if he wants.

@lberki It's far too easy to accidentally link the wrong thing in our internal repo and end up with so many jars that the JVM takes up gigs of memory. The JVM is amazing at threads and garbage collection so it makes sense to me to utilize those strengths, just like Bazel does.

@abergmeier-dsfishlabs The requests would still get sent to the frontend in parallel. Maybe the four frontends could scheme together to launch a single backend, possibly by locking a single input file, but I'm not sure if Bazel would consider that hermetic.

pauldraper · 2018-09-04T03:35:04Z

Why not have a frontend-process, which delegates to one background-process? Why does it have to be implemented on the Bazel level?

Because that would be really easy to mess up. Who would start it? Who would stop it? Who would check to see if files have changed?

Do you have data on the costs of not implementing this? Sure, it can be done, but it's yet another feature we have to support, so I'd rather it pays its rent.

Lucid Software has seen a massive memory regression in transitioning from sbt (Scala) to Bazel. sbt used a single JVM process, and so it could JIT the Scala compiler once and have reasonable memory overhead. Then Bazel says, "Hey, if you want the same performance you had before, take your machine apart, cram a bunch more RAM into it, and start 8x the number of processes, each doing the exact same JIT."

As @jart said, it's insane to have multiple local caches. And the only reason to use workers is to cache things, no? (Typically caching JIT, sometimes just caching loading the executable, and I suppose you can get fancier.) Is there any situation that wouldn't use significantly less memory with this proposal?

buchgr · 2018-09-04T08:58:17Z

Is there any situation that wouldn't use significantly less memory with this proposal?

For compilers that don't support multi-threaded compilation, it should be neither a win nor a loss. I remember talking to @philwo offline and I believe we agreed that it's a good idea to support your use case. Would you be interested in working on this, @pauldraper?

pauldraper · 2018-09-04T14:06:58Z

For compilers that don't support multi-threaded compilation

That's true. Say, Node.js-based compilers.

Would you be interested in working on this @pauldraper ?

I'm no longer working with this, but @jjudd may be interested.

pauldraper · 2018-09-04T15:20:55Z

My 2c though: I suppose TCP would help the #4897 issues. But I like the simplicity of stdin/stdout (even when multiplexed). And not fiddling with Nagle, etc.

FWIW, the apt transport protocol is multiplexed stdin/stdout with an executable.

jjudd · 2018-09-04T16:48:25Z

We are definitely interested in this. Launching lots of JVMs consumes lots of resources.

I'm not sure when we will have time to work on it, but it is something we are interested in.

@buchgr do you have a ballpark estimate of how large of an effort you think developing this would be? I lack enough context on the Bazel codebase to effectively tell if it is a 2 day, 2 week, or 2 month effort.

buchgr · 2018-09-04T17:40:57Z

@buchgr do you have a ballpark estimate of how large of an effort you think developing this would be? I lack enough context on the Bazel codebase to effectively tell if it is a 2 day, 2 week, or 2 month effort.

As the original author of the workers feature @philwo should be able to answer this question best and provide guidance.

jin · 2018-09-04T18:03:56Z

/sub

jjudd · 2018-09-17T23:31:31Z

@philwo friendly ping. In your opinion how large of a task is this? Hours, days, weeks, months?

philwo · 2018-09-27T14:56:43Z

I think it shouldn't take long - days for a first prototype, weeks for a first complete version maybe? This would only concern the Bazel side though, I can't speak about updating existing workers to take advantage of the new protocol.

All the worker related code in Bazel is concentrated here: https://source.bazel.build/bazel/+/master:src/main/java/com/google/devtools/build/lib/worker/ - so you don't need much context about how Bazel works.

There's an integration test, too: https://source.bazel.build/bazel/+/master:src/test/shell/integration/bazel_worker_test.sh

Regarding the protocol, I'm open to whatever you'd come up with that works well and is easily integrated into various languages out there. I think I've seen persistent workers written in Java, JavaScript, TypeScript, Dart so far.

@cushon (Java), @mprobst (TypeScript) and @davidmorgan (Dart) might want to comment on this with their ideas / wishes. :)

davidmorgan · 2018-09-27T15:18:03Z

From the Dart side: parallel requests in one worker isn't super exciting since Dart is single threaded. We're planning on experimenting with build performance in Q4, we might have some suggestions for worker protocol changes of our own. (Not super high probability, though; 20% maybe).

mprobst · 2018-09-27T16:14:26Z

Same on the TypeScript side, our workers are necessarily single threaded, so this wouldn't help us.

cushon · 2018-09-27T16:21:26Z

@kevin1e100 might be interested in this for kotlin.

Javac is single-threaded, but I think this would allow us to run multiple instances of it in one worker and share a cache and memory footprint, instead of having multiple workers, which start to use a lot of memory and mean any caching takes a long time to warm up. How do Dart and TypeScript avoid those issues with the current approach?
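The idea of several single-threaded compiles sharing one cache inside a single worker can be sketched roughly as follows. Everything here is hypothetical: `SharedCache`, `compile_target` and the string "ASTs" are placeholders, and a real worker would cache parsed sources or loaded jars instead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedCache:
    """One cache for the whole worker process, guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}
        self.loads = 0  # how many times an entry had to be computed

    def get(self, key, load):
        # Holding the lock while loading keeps the sketch simple; it also
        # guarantees each key is loaded at most once.
        with self._lock:
            if key not in self._entries:
                self._entries[key] = load(key)
                self.loads += 1
            return self._entries[key]


def compile_target(cache, sources):
    """Placeholder 'compiler': one single-threaded run per request."""
    return [cache.get(name, lambda n: "AST(" + n + ")") for name in sources]


def demo():
    cache = SharedCache()
    # Three requests that all depend on Base.java.
    requests = [
        ["Base.java", "A.java"],
        ["Base.java", "B.java"],
        ["Base.java", "C.java"],
    ]
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(lambda srcs: compile_target(cache, srcs), requests))
    return results, cache.loads
```

With one shared cache, `Base.java` is loaded once for all three requests. With three separate worker processes it would be loaded once per process.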

kevin1e100 · 2018-09-27T18:35:37Z

Right, even with single-threaded underlying tools, if they can safely be run in parallel, that can still be a win, I would think. But from my point of view this really shines when the worker wants to do some kind of caching (example below) or incremental scheme (e.g., Java compilation is typically incremental in the Eclipse IDE, IIUC). Bazel's DexBuilder worker for Android builds, for instance, uses caching, but as @cushon mentioned, all worker instances have their own cache, which can be unfortunate.

jjudd · 2018-09-27T22:50:53Z

Thanks for the estimate @philwo. We are starting work on this. @borkaehw is leading the implementation on our end. We'll keep people updated as we make progress, propose designs, etc.

buchgr · 2018-09-28T11:10:37Z

@jjudd it would be great if you could share a design document with bazel-discuss / bazel-dev before doing the implementation. We are happy to review it and give pointers! Thanks so much!

davidmorgan · 2018-09-28T12:56:51Z

@cushon users of bazel+workers+Dart are Google-internal; we just use a lot of RAM.

ittaiz · 2018-09-28T15:20:19Z

Cc @johnynek since we (rules_scala) also use a persistent worker and I think we’d love this feature as well.


cushon · 2018-09-28T16:15:47Z

@davidmorgan are you using the workers for caching/incrementality, or mostly to keep a VM warm? If you're using caching, have you seen issues with the hit rate from having a separate cache for each worker instance?

davidmorgan · 2018-09-29T09:43:20Z

@cushon Right now mostly to keep a VM warm. We hope to gain more from caching/incrementality in future.

For each unique WorkerKey, Bazel can optionally launch a multiplexer that talks to one multi-threaded worker process. We use fewer JVM processes while maintaining approximately the same performance, and hence save memory. The worker process should be able to handle multiple requests concurrently to fully utilize this feature. Fix: bazelbuild#2832

This is an attempt to solve issue #2832. The design doc has been approved: [Multiplex persistent worker](https://docs.google.com/document/d/1OC0cVj1Y1QYo6n-wlK6mIwwG7xS2BJX7fuvf1r5LtnU/edit?usp=sharing).

Two minor design changes from the design doc:

- The number of WorkerProxy instances is still limited by `--worker_max_instances`.
- We merged the worker multiplexer sender and receiver into one WorkerMultiplexer; WorkerProxy sends requests to the worker process directly.

Closes #6857.

PiperOrigin-RevId: 274560006
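The relationship described in these commit messages (one multiplexer per unique WorkerKey, shared by several proxies whose count is capped) might be pictured with the sketch below. It is not Bazel's Java implementation. The class bodies, the tuple-shaped key and the `max_instances` parameter are assumptions that only borrow names from the text above.

```python
import threading


class WorkerMultiplexer:
    """Placeholder for the object that owns the single worker process."""

    def __init__(self, key):
        self.key = key
        self.proxies = 0  # number of proxies currently sharing this multiplexer


class MultiplexerRegistry:
    """Hands out one shared multiplexer per key, with a cap on proxies."""

    def __init__(self, max_instances):
        self._max_instances = max_instances  # stands in for --worker_max_instances
        self._lock = threading.Lock()
        self._by_key = {}

    def acquire_proxy(self, key):
        with self._lock:
            mux = self._by_key.get(key)
            if mux is None:
                # The first proxy for this key creates the multiplexer.
                mux = WorkerMultiplexer(key)
                self._by_key[key] = mux
            if mux.proxies >= self._max_instances:
                raise RuntimeError("proxy limit reached for " + repr(key))
            mux.proxies += 1
            return mux


def demo():
    registry = MultiplexerRegistry(max_instances=2)
    javac_key = ("Javac", "--persistent_worker")    # hypothetical WorkerKey
    scalac_key = ("Scalac", "--persistent_worker")  # hypothetical WorkerKey
    first = registry.acquire_proxy(javac_key)
    second = registry.acquire_proxy(javac_key)
    other = registry.acquire_proxy(scalac_key)
    return first, second, other
```

Here both Javac proxies share a single multiplexer, and therefore a single worker process, while the Scalac key gets its own.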

jart added the type: feature request label Apr 16, 2017

lberki assigned philwo Apr 18, 2017

philwo added the P2 We'll consider working on this in future. (Assignee optional) label Apr 18, 2017

dslomov added the category: sandboxing label May 16, 2017

cushon mentioned this issue Mar 21, 2018

Failed worker protocol error doesn't explain that stdout is reserved for worker protocol #4897

Closed

borkaehw mentioned this issue Oct 2, 2018

Output Runner and Multiple Runs bazelbuild/bazel-watcher#133

Closed

cushon mentioned this issue Apr 11, 2019

Allow persistent workers to communicate with Bazel using JSON instead of protos #7998

Closed

borkaehw mentioned this issue Oct 25, 2019

Multiplex Workers documentation #10108

Closed

bazel-io closed this as completed in ac04cd9 Nov 12, 2019
