Multiplex persistent worker protocol #2832
Comments
Those two seem contradictory. That is, if you want multiple requests per socket you again need to implement something like gRPC. |
Do you have data on the costs of not implementing this? Sure, it can be done, but it's yet another feature we have to support, so I'd rather it pay its rent. |
Why not have a frontend process that delegates to one background process? Why does it have to be implemented at the Bazel level? |
Fun fact, a multiplexing TCP-based version of this was implemented in the very first version of persistent workers and worked perfectly fine with a multi-threaded version of the JavaBuilder worker - but was then deemed unnecessary complexity by me and teammates and the code was deleted (AFAIR without even submitting the CL, so we can't restore it from history, ouch) and replaced with the simpler, serialized, multi-process stdin/stdout mechanism. :| Maybe we should have gone with the more complex version in the first place. Hindsight is best sight. I'll have a look at this! Thanks for writing this proposal down so cleanly. |
@philwo My pleasure. Did you consider using the multiplexing technique described in Design No. 1? That would avoid the socket complexity and should hopefully be pretty straightforward. The user could continue doing things the simple way if they want.

@lberki It's far too easy to accidentally link the wrong thing in our internal repo and end up with so many jars that the JVM takes up gigs of memory. The JVM is amazing at threads and garbage collection, so it makes sense to me to utilize those strengths, just like Bazel does.

@abergmeier-dsfishlabs The requests would still get sent to the frontend in parallel. Maybe the four frontends could scheme together to launch a single backend, possibly by locking a single input file, but I'm not sure if Bazel would consider that hermetic. |
Because that would be really easy to mess up. Who would start it? Who would stop it? Who would check to see if files have changed?
Lucid Software has seen a massive memory regression in transitioning from sbt (Scala) to Bazel. sbt used a single JVM process, and so it could JIT the Scala compiler once and have reasonable memory overhead. Then Bazel says, "Hey, if you want the same performance you had before, take your machine apart, cram a bunch more RAM into it, and start 8x the number of processes, each doing the exact same JIT." As @jart said, it's insane to have multiple local caches. And the only reason to use workers is to cache things, no? (Typically caching JIT, sometimes just caching loading of the executable, and I suppose you can get fancier.) Is there any situation that wouldn't use significantly less memory with this proposal? |
For compilers that don't support multi-threaded compilation it should be neither a win nor a loss. I remember talking to @philwo offline, and I believe we agreed that it's a good idea to support your use case. Would you be interested in working on this, @pauldraper? |
That's true. Say, Node.js-based compilers.
I'm no longer working with this, but @jjudd may be interested. |
My 2c though: I suppose TCP would help the #4897 issues. But I like the simplicity of stdin/stdout (even when multiplexed). And not fiddling with Nagle, etc. FWIW, the apt transport protocol is multiplexed stdin/stdout with an executable. |
We are definitely interested in this. Launching lots of JVMs consumes lots of resources. I'm not sure when we will have time to work on it, but it is something we are interested in. @buchgr do you have a ballpark estimate of how large of an effort you think developing this would be? I lack enough context on the Bazel codebase to effectively tell if it is a 2 day, 2 week, or 2 month effort. |
As the original author of the workers feature @philwo should be able to answer this question best and provide guidance. |
/sub |
@philwo friendly ping. In your opinion how large of a task is this? Hours, days, weeks, months? |
I think it shouldn't take long - days for a first prototype, weeks for a first complete version maybe? This would only concern the Bazel side though, I can't speak about updating existing workers to take advantage of the new protocol. All the worker related code in Bazel is concentrated here: https://source.bazel.build/bazel/+/master:src/main/java/com/google/devtools/build/lib/worker/ - so you don't need much context about how Bazel works. There's an integration test, too: https://source.bazel.build/bazel/+/master:src/test/shell/integration/bazel_worker_test.sh Regarding the protocol, I'm open to whatever you'd come up with that works well and is easily integrated into various languages out there. I think I've seen persistent workers written in Java, JavaScript, TypeScript, Dart so far. @cushon (Java), @mprobst (TypeScript) and @davidmorgan (Dart) might want to comment on this with their ideas / wishes. :) |
From the Dart side: parallel requests in one worker isn't super exciting since Dart is single threaded. We're planning on experimenting with build performance in Q4, we might have some suggestions for worker protocol changes of our own. (Not super high probability, though; 20% maybe). |
Same on the TypeScript side, our workers are necessarily single threaded, so this wouldn't help us. |
@kevin1e100 might be interested in this for kotlin. Javac is single-threaded, but I think this would allow us to run multiple instances of it in one worker and share a cache and memory footprint, instead of having multiple workers which starts to use a lot of memory and means any caching takes a long time to work up. How do Dart and TypeScript avoid those issues with the current approach? |
Right, even with single-threaded underlying tools, if they can safely be run in parallel, that can still be a win, I would think. But from my point of view this really shines when the worker wants to do some kind of caching (example below) or an incremental scheme (e.g., Java compilation is typically incremental in the Eclipse IDE, IIUC). Bazel's DexBuilder worker for Android builds, for instance, uses caching, but as @cushon mentioned, all worker instances have their own cache, which can be unfortunate. |
@jjudd it would be great if you could share a design document with bazel-discuss / bazel-dev before doing the implementation. We are happy to review it and give pointers! Thanks so much! |
@cushon users of bazel+workers+Dart are google internal--we just use a lot of RAM. |
Cc @johnynek since we (rules_scala) also use a persistent worker and I think we'd love this feature as well.
|
@davidmorgan are you using the workers for caching/incrementality, or mostly to keep a VM warm? If you're using caching, have you seen issues with the hit rate from having a separate cache for each worker instance? |
@cushon Right now mostly to keep a VM warm. We hope to gain more from caching/incrementality in future. |
For each unique WorkerKey, Bazel can optionally launch a multiplexer that talks to one multi-threaded worker process. We use fewer JVM processes while maintaining approximately the same performance, hence saving memory. The worker process should be able to handle multiple requests concurrently to fully utilize this feature. Fix: bazelbuild#2832
This is an attempt to solve issue #2832. The design doc, [Multiplex persistent worker](https://docs.google.com/document/d/1OC0cVj1Y1QYo6n-wlK6mIwwG7xS2BJX7fuvf1r5LtnU/edit?usp=sharing), has been approved. Two minor design changes from the design doc:
- The number of WorkerProxy instances is still limited by `--worker_max_instances`.
- We merged the worker multiplexer sender and receiver into one WorkerMultiplexer; WorkerProxy sends requests to the worker process directly.
Closes #6857. PiperOrigin-RevId: 274560006
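The WorkerProxy/WorkerMultiplexer split described above can be sketched roughly as follows. This is a hedged illustration, not the actual Bazel code: the class and method names (`WorkerMultiplexerSketch`, `expectResponse`, `dispatch`) are hypothetical, and plain strings stand in for the real WorkResponse protos. The core idea is that one shared reader thread demultiplexes responses from the single worker process by request id and hands each to the proxy waiting on it.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: one multiplexer per worker process. Proxies register a
// request id before sending, then block until the shared reader thread
// delivers the matching response.
final class WorkerMultiplexerSketch {
  private final Map<Integer, CompletableFuture<String>> pending =
      new ConcurrentHashMap<>();

  // Called by a WorkerProxy before writing its request to the worker's stdin.
  CompletableFuture<String> expectResponse(int requestId) {
    CompletableFuture<String> future = new CompletableFuture<>();
    pending.put(requestId, future);
    return future;
  }

  // Called by the single reader thread for each response read from the
  // worker's stdout; routes it to whichever proxy is waiting on that id.
  void dispatch(int requestId, String response) {
    CompletableFuture<String> future = pending.remove(requestId);
    if (future != null) {
      future.complete(response);
    }
  }
}

public class Demo {
  public static void main(String[] args) throws Exception {
    WorkerMultiplexerSketch mux = new WorkerMultiplexerSketch();
    CompletableFuture<String> r1 = mux.expectResponse(1);
    CompletableFuture<String> r2 = mux.expectResponse(2);
    // Responses may arrive out of order; each proxy still gets its own.
    mux.dispatch(2, "compiled b.jar");
    mux.dispatch(1, "compiled a.jar");
    System.out.println(r1.get() + " / " + r2.get());
  }
}
```

In the real implementation the reader thread would loop over delimited protos on the worker's stdout; the sketch only shows the demultiplexing step.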
Background
Bazel spawns 4 persistent worker processes and sends them requests serially via stdin/stdout.
Requirements
Justification
Design No. 1: Multiplex
- Add a `request_id` field to the worker protocol request and response message protos.
- Add an `is_trying` boolean field to the response proto, sort of like 100 Trying in SIP: the worker can send an `is_trying` response while a request is still in progress.
Design No. 2: TCP
- Add an `upgrade` field to the response proto that redirects Bazel to a `HostAndPort`.
CC: @lberki, @meisterT, with whom I socialized this idea offline IIRC
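As a rough illustration of Design No. 1, the proto changes might look like the following. This is a sketch, not the actual `worker_protocol.proto`: the existing fields are abbreviated and the field numbers are illustrative.

```proto
message WorkRequest {
  repeated string arguments = 1;
  // New (illustrative): lets one worker process serve many in-flight requests.
  int32 request_id = 3;
}

message WorkResponse {
  int32 exit_code = 1;
  string output = 2;
  // New (illustrative): echoes the id of the request this response answers.
  int32 request_id = 3;
  // New (illustrative): provisional "still working" signal, like 100 Trying in SIP.
  bool is_trying = 4;
}
```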