Actions fetched from remote cache / executed remotely should not count towards parallelism limit #6394
Comments
Thanks @ob! A colleague of mine has a prototype of this, however I still expect this to be at least 3-6 months away to land in a released Bazel version! |
FWIW, we "fix" this in Google by actually running with --jobs=200 (or higher). Local actions are still limited by local resources, so this is fine in general. |
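To illustrate the workaround described above, here is a minimal sketch of two-level gating: a large overall job limit plus a separate, smaller limit for local work. It is a toy asyncio model, not Bazel's scheduler. The function `run_build`, its parameters, and the sleep durations are all hypothetical and chosen only for illustration.

```python
import asyncio


async def run_build(jobs, local_slots, n_remote, n_local):
    """Toy model: returns the peak number of concurrent actions per kind."""
    jobs_gate = asyncio.Semaphore(jobs)          # stands in for --jobs
    local_gate = asyncio.Semaphore(local_slots)  # stands in for local resource limits
    in_flight = {"remote": 0, "local": 0}
    peak = {"remote": 0, "local": 0}

    async def work(kind, seconds):
        # Track how many actions of this kind are running at once.
        in_flight[kind] += 1
        peak[kind] = max(peak[kind], in_flight[kind])
        await asyncio.sleep(seconds)  # simulated network wait or local work
        in_flight[kind] -= 1

    async def action(kind):
        async with jobs_gate:
            if kind == "local":
                # Local actions need a second permit, so they stay bounded
                # even when the overall job limit is very high.
                async with local_gate:
                    await work(kind, 0.02)
            else:
                await work(kind, 0.01)

    kinds = ["remote"] * n_remote + ["local"] * n_local
    await asyncio.gather(*(action(k) for k in kinds))
    return peak


# Hypothetical numbers: a high job limit with only 4 local slots.
peak = asyncio.run(run_build(jobs=200, local_slots=4, n_remote=50, n_local=10))
print(peak)
```

In this model, raising `jobs` lets many simulated remote fetches overlap while local actions never exceed `local_slots`, which is the behaviour the comment relies on.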
Means you run with |
@ulfjack what's the status of your work in this area? is there a tracking bug on github? |
Parts of the prototype have been submitted, but I haven't even sent out some critical parts. Making it work also requires rewriting the RemoteSpawnRunner to be async and use ListenableFuture, for which I do not currently have plans. I am not aware of a tracking bug on GitHub apart from this one. |
That I'd be happy to take over :-) |
Happy for you to start working on the RSR in parallel to me landing the Skyframe changes that are also required. |
Broke this out into #7182. |
Commit 9beabe0 is related. |
45a9bc2 was a change to the resource requirements of runfiles trees, which allowed more parallelism. Probably this should be discussed in another issue, though. |
Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team ( |
We are working on this again, in a different way. The next step is to upgrade to a modern JDK, which will allow us to use Loom/virtual threads. |
Reading through this thread: I'm using local execution with a remote cache, and almost all tests are cached. With --jobs set to 4, I see. (currently on Bazel 6.0.0)
With --jobs set to 8, I see a max of 16 actions. This would mean that jobs fetching from the remote cache are not counting towards parallelism. Is there a way we can configure them to do so? |
The expensive local actions will be separately gated to not overwhelm the machine, and currently highly asynchronous actions are a dominant part of our builds due to downloading cached artifacts. Without a high concurrency, these are downloaded roughly 2-at-a-time currently. I've verified that on a small machine without a good cache this doesn't seem to generate huge amounts of work and local build and test actions are successfully gated on the local flags. There is already a Bazel issue tracking this limitation: bazelbuild/bazel#6394
bazelbuild/bazel#22785 fixes a remote caching bug that prevented us from fetching only top-level targets from cache. Enabling that option should speed up CI runs. Also increase the number of concurrent Bazel jobs, see bazelbuild/bazel#6394
When a fairly large application is built using the remote cache and all the actions are fully cached, Bazel still keeps the parallelism set to the number of cores in the machine. However, most actions are just waiting for network I/O.
With the default settings on an 8-core machine, I get:
But if I bump up the number of jobs to a crazy number:
I think the --jobs option should only apply to local actions.
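As a rough back-of-the-envelope model of why this matters: if every action is a cache hit that spends its time waiting on the network, wall time scales with the number of "waves" of fetches the job limit allows. The action count and fetch latency below are made-up illustrative numbers, not measurements from this issue, and `cached_build_ms` is a hypothetical helper.

```python
import math


def cached_build_ms(n_actions, jobs, fetch_latency_ms):
    """Estimate wall time when every action is a network-bound cache fetch.

    Assumes each fetch takes the same time and at most `jobs` run at once,
    so the build proceeds in ceil(n_actions / jobs) waves.
    """
    waves = math.ceil(n_actions / jobs)
    return waves * fetch_latency_ms


# Hypothetical build: 10,000 cached actions, 50 ms per fetch.
for jobs in (8, 200):
    print(jobs, cached_build_ms(10_000, jobs, 50))
```

Under these assumptions the job limit, rather than CPU count, determines how long a fully cached build takes.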