Sharing Java outputs across platforms (Linux/Mac) #4714
Comments
Greg, |
Interesting. Did you figure out a way to ask Bazel for the list of all files that participate in the fingerprint calculation? I'd love to get a sense of how difficult that nut is to crack. The Docker idea is interesting. Are you thinking of running Bazel within Docker? How would that work? Would you somehow pass the commands across the Docker barrier? Would IntelliJ work in that setup? |
Re: the first question, no.
Re: Docker, yes, run Bazel in Docker. The rough idea is to have a script named bazel which will call “docker run bazel” with the arguments.
There are of course rough edges around mounts and such, but they’re probably solvable.
This should also please IntelliJ since it will just call Bazel as is.
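A minimal sketch of the wrapper idea described above, assuming a hypothetical image name (`example/bazel-ci:latest`) and a hypothetical mount point (`/workspace`); neither comes from this thread. It only shows how the arguments could be forwarded unchanged, and ignores the rough edges around mounts, users, and caches.

```python
#!/usr/bin/env python3
"""Hypothetical `bazel` wrapper that forwards its arguments to Bazel in Docker."""
import os
import subprocess
import sys

IMAGE = "example/bazel-ci:latest"   # assumption: the same image the CI agents use
CONTAINER_WORKSPACE = "/workspace"  # assumption: mount point inside the container


def docker_command(bazel_args, workspace):
    """Build the `docker run` argv that executes `bazel <args>` in the container."""
    return [
        "docker", "run", "--rm",
        "-v", f"{workspace}:{CONTAINER_WORKSPACE}",  # expose the source tree
        "-w", CONTAINER_WORKSPACE,                   # run from the workspace root
        IMAGE,
        "bazel", *bazel_args,
    ]


def main(argv=None):
    """Run the containerized Bazel and return its exit code."""
    args = sys.argv[1:] if argv is None else argv
    return subprocess.call(docker_command(args, os.getcwd()))

# As a real wrapper script, finish with: sys.exit(main())
```

Placed ahead of the real Bazel on `PATH`, a script like this would let a tool such as IntelliJ keep invoking `bazel` as usual.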
|
Re: the first question, you mean the fingerprint calculation for Bazel's action cache?
What specific kind of cross-platform caching are you interested in? Are you only interested in cross-platform hosts (building a Java library on a Mac or Linux machine, but in both cases targeting the same output architecture, i.e. --host_cpu <https://docs.bazel.build/versions/master/user-manual.html#flag--host_cpu> is different but --cpu <https://docs.bazel.build/versions/master/user-manual.html#flag--cpu> is the same)? Or are you also interested in cases where --cpu differs?
I've been prototyping platform-independent Java compilation by making input and output paths consistent. In other words, when --cpu=k8, all outputs today look like:
bazel-out/k8-fastbuild/my/project/some.output
This is unconditionally true: for C++ actions (where this makes sense), Java compilation (where it doesn't), and everything else. Since command input and output paths are part of Bazel's cache key, this alone breaks meaningful cross-platform caching.
I've discovered some interesting results in my prototype, but the main gist is that fixing this isn't trivial. One particular problem is generated code, e.g. if some Java source comes from a proto then host architecture info could make its way into that source. Another complication is select <https://docs.bazel.build/versions/master/be/functions.html#select>, which could swap out one of the source files in an arbitrarily platform-dependent way. Annotation processors <https://docs.bazel.build/versions/master/be/java.html#java_plugin> could also be a problem.
But you can imagine many classes of Java compilation don't use protos or select, etc. So at least in those cases these problems should be avoidable. The broader challenge is how to accurately identify these cases. |
Yes, that’s the one.
We’re interested in Java/Scala but also proto generation, and hopefully we’ll get to js/node/go/general docker containers in the coming quarters.
That’s why the Docker solution sounds interesting: you can base the container on the same image as the CI agents, and if we can get a smooth and performant protocol it should be good.
|
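To make the output-path point above concrete, here is a deliberately simplified model of a cache key. It is not Bazel's actual key computation; `output_path` and `action_key` are hypothetical helpers, and `darwin` is used only as an example of a second CPU value. It shows that hashing a command line that embeds `bazel-out/<cpu>-<mode>/...` yields different keys even when the Java source is byte-for-byte identical.

```python
import hashlib


def output_path(cpu, mode, relative):
    """Mimic the bazel-out/<cpu>-<mode>/... layout described above."""
    return f"bazel-out/{cpu}-{mode}/{relative}"


def action_key(argv, input_digests):
    """Toy key: hash the command line plus (path, digest) of every input."""
    h = hashlib.sha256()
    for part in argv:
        h.update(part.encode() + b"\0")
    for path in sorted(input_digests):
        h.update(path.encode() + b"\0" + input_digests[path].encode() + b"\0")
    return h.hexdigest()


inputs = {"my/project/Foo.java": "same-source-digest"}  # identical on both hosts
linux_key = action_key(
    ["javac", "-d", output_path("k8", "fastbuild", "my/project")], inputs)
mac_key = action_key(
    ["javac", "-d", output_path("darwin", "fastbuild", "my/project")], inputs)

# The only difference is the CPU segment of the output path.
print(linux_key == mac_key)
```

In this toy model, removing the CPU segment from paths of platform-independent actions is what would let the two keys coincide.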
My question about action digest/key calculation is about understanding the following scenario. I'm trying to build a tiny example java target on two different machines that are supposed to be configured exactly the same. On one machine:
and then on another:
You can see that digest for The action key for |
To see which files these ultimately resolve to, you can run:
I believe all these files are staged for all compilation actions, i.e. here, which Bazel collects from One way to see which system files these resolve to:
|
I followed your instructions to go to execroot and then:
And then I got:
I was sure I did similar steps yesterday and didn't see any difference. Thanks for the instructions, this is handy. A suggestion: could we have a slow mode in Bazel that automates what I just did? Aside from that, I don't understand why these files are different on the two machines. I suspect that we've built the JDK from sources and somehow that build is not reproducible. These files are large binaries so it's hard to tell what the problem is. Is there an escape hatch that lets me say "oh, I'm sorry that these files are different but can you just carry on pretending they're not"? |
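The manual comparison described above could be automated outside Bazel along these lines. This is a sketch, not a Bazel feature: `digest_tree` and `diff_manifests` are hypothetical helpers, and the idea is to run `digest_tree` on the same directory on each machine, ship one manifest to the other machine (for example as JSON), and diff the two.

```python
import hashlib
import os


def digest_tree(root):
    """Map each file under `root` (relative path) to its SHA-256 digest."""
    digests = {}
    # followlinks=True because execroot-style trees consist largely of symlinks.
    for dirpath, _dirs, files in os.walk(root, followlinks=True):
        for name in files:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):  # skip broken symlinks
                continue
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def diff_manifests(a, b):
    """Return the sorted paths that are missing on one side or differ in digest."""
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

Any path returned by `diff_manifests` is a candidate for the input that makes the action keys diverge.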
Both |
Re: running bazel inside a docker container, you may be interested in https://github.com/nadirizr/dazel . |
I tried the idea of using bazel build just from the docker container to enforce reproducibility but that doesn't work due to localjdk including non-reproducible bits: #4769 |
I filed a separate issue for bazel's own build not being reproducible on Mac: #4770 |
I wouldn't be surprised if @local_jdk//:jdk-default is overly broad, especially if this is inherited from the Google implementation where platforms are more consistent and extra files aren't as big a deal. We've encountered this kind of issue in the past for other toolchains and have often successfully mitigated by simply declaring these rules more finely or breaking them down into sub-rules which are more selectively opted into actions that need them. This can be a win/win all around. As an experiment, would you like to try excluding these files from the globs in your workspace and see if you get the caching you want? If that works for you we can absolutely explore a more proper change. |
thanks greg! is there a way to exclude these files declaratively from my workspace without forking bazel and modifying it there? The relevant filegroup that needs modification is this: https://github.com/bazelbuild/bazel/blob/master/src/main/tools/jdk.BUILD#L104 |
Good point. Can you clone the file before making your changes, then redirect Blaze to that JDK with |
will setting |
I just tried setting the |
We can examine on a per-action basis, but I'd still like to see if we can work through the query approach, since there are fewer targets than actions. I forgot above to include About setting these options with query, if you're using the latest Bazel you can use the new So here's what I see:
|
First of all, neat! Do I understand correctly that by specifying I reproduced your configuration in https://github.com/gkossakowski/bazel-scala-example and tried it on two linux machines configured exactly the same. I continue to see a disagreement of digestKeys:
vs
|
I recalled from my notes that I did debug the problem with |
I emptied out What command did you run exactly? Do you see any diffs in I'd expect As for |
Sorry got sidetracked by other work. I just tried I'll check the |
The contents of |
gkossakowski and I have been looking a bit at the minimal hello world, using javabase flags. When I run These jars look the same when extracted, but I notice that Also, I accidentally built the test program with Bazel 0.6.1 and the cache keys matched on both systems. So I strongly suspect this non-determinism was introduced between 0.6.1 and 0.11.1. |
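When two jars extract to identical files but still hash differently, archive metadata is a common place to look. The sketch below is a generic check using Python's standard `zipfile` module, not a reconstruction of what was observed in this thread; `jar_entries` and `compare_jars` are hypothetical helper names.

```python
import zipfile


def jar_entries(path):
    """Return (name, timestamp, crc) for every entry, in archive order."""
    with zipfile.ZipFile(path) as jar:
        return [(i.filename, i.date_time, i.CRC) for i in jar.infolist()]


def compare_jars(path_a, path_b):
    """List metadata differences that change a jar's bytes but not its contents."""
    a, b = jar_entries(path_a), jar_entries(path_b)
    names_a = [e[0] for e in a]
    names_b = [e[0] for e in b]
    if sorted(names_a) != sorted(names_b):
        return ["entry names differ"]
    findings = []
    if names_a != names_b:
        findings.append("entry order differs")
    by_name_b = {name: (ts, crc) for name, ts, crc in b}
    for name, ts, crc in a:
        ts_b, crc_b = by_name_b[name]
        if crc != crc_b:
            findings.append(f"{name}: contents differ")
        elif ts != ts_b:
            findings.append(f"{name}: timestamp differs")
    return findings
```

An empty result means names, order, contents, and timestamps all match, in which case the difference lies elsewhere (for example compression settings or extra fields).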
When compiling |
edit: never mind, this mismatch was due to a stray newline in the source file on one of the machines. |
Re: Bazel in Docker |
Be aware that building large projects in Docker on OSX resulted in a ~3x slowdown the last time I looked into it. IMO, it is a completely inadequate solution to the problem. |
@rdnetto4 disk I/O is at a premium for any virtualization (and Docker on Mac runs in a VM). Did you happen to
1. Use the delegated <https://docs.docker.com/docker-for-mac/osxfs-caching/> mode for mounting the workspace.
2. Ensure the Bazel output directory is *not* a shared mount with the host.
3. Put the Bazel output directory on a separate Docker volume (which I believe will use a more efficient type than Docker's layered container file system). |
It was about a year ago, so I can't remember the exact details. I have a vague recollection I didn't spend much time fine-tuning it, as for our application even a 20% slowdown would have been too much.
|
The various consistency modes for bind mounts arrived in Docker last April,
so you may have been using that feature.
It's pretty much a requirement for any Docker dev on MacOS. Sped up example
builds 2-3x
https://blog.docker.com/2017/05/user-guided-caching-in-docker-for-mac/
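The three mount suggestions earlier in the thread could be expressed as `docker run` arguments roughly as follows. This is a sketch under assumptions: the volume name `bazel-output` and the container paths are made up for illustration, and `docker_mount_args` is a hypothetical helper. The `consistency=delegated` bind-mount option and Bazel's `--output_user_root` startup option are existing options.

```python
def docker_mount_args(workspace, output_volume="bazel-output"):
    """Build mount arguments: delegated workspace, outputs on a named volume."""
    return [
        # 1. Workspace is a bind mount from the host, in delegated mode.
        "--mount",
        f"type=bind,source={workspace},target=/workspace,consistency=delegated",
        # 2 + 3. Bazel outputs live on a Docker volume, not a host-shared mount.
        "--mount",
        f"type=volume,source={output_volume},target=/bazel-output",
    ]


def bazel_startup_args():
    """Point Bazel's output tree at the volume mounted above."""
    return ["--output_user_root=/bazel-output"]
```

These lists would be spliced into a `docker run ... bazel <startup args> build ...` command line. As noted later in the thread, outputs kept on a volume are then not directly visible to tools running on the host.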
|
@pauldraper and @rdnetto4 - Note that the consistency mode We have been experimenting for some time now with developing with Bazel inside Docker, and we use a local NFS mount instead of a Docker volume/mount. This has been working quite well for us so far. If I understand what you meant by (3) correctly, a "separate Docker volume, which bypasses Docker's layered container filesystem" - this will indeed allow Bazel to happily run and output everything at native speed; however, you will still have to wait (a lot) in order to actually see and use these outputs from your IDE, which is running locally on your Mac (similar to what happens when using the docker-sync project as a workaround for this exact bottleneck) |
Hm, is that often an issue? Do IDEs need any of the outputs? (Assuming you are building, testing, running, etc. in the container.)
Interesting. And then do you do file watch (ibazel) on the host? |
IDEs need the outputs - the intellij-info.txt files which are how re file watch - no need to, the nfs mount is from the mac's host native filesystem, and is available to the docker container via a type:nfs docker volume. |
I have also had some issues with high cache miss rates when the build clients are not "identical". So far I've had the most luck from running in docker to "control" the environment. Configured my containers to build a few large example apps using BuildFarm for remote execution + caching. Still looking into how I can relax the identical computer requirement. Here is my docker setup in case anyone is interested/has suggestions for how to improve it: https://github.com/thelgevold/remote-builder |
Been doing some experiments in this space recently: see #6431 (comment). I'm personally not sure how Docker affects the story and this thread has gotten pretty big since I last visited it. What's the most actionable request we're looking for on this issue now? Is this all contingent on docker now? |
@gregestren I don't think it should be about Docker. The requirement for us is:
- sharing Java outputs in Bazel remote cache across platforms (Linux/Mac)
Basically, we want to populate a remote cache from a set of blessed/trusted environments for all developers. These trusted builds run on Linux. Some developers use Macs. It's mostly Java output I'm concerned about (class files and/or jars). I would like to add:
- don't require building inside Docker
|
Sure, that's a good goal.
|
Hmm, the comments in #4714 (comment) and below look promising. So it seems like some people are getting caching and others aren't? For those who are, are you targeting the same CPU on both machines? i.e. @guw - are you finding this problem on any java builds (even trivial example ones)? Can you share more details of what you're seeing? |
Seconding @guw's comment above. Populating a build cache with platform-agnostic artifacts (e.g. java/kotlin/android artifacts) from blessed machines (which are nearly certainly Linux machines) and consuming cache content from developer machines (which are nearly certainly Mac machines). |
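The read/write split implied by "blessed machines populate, developer machines consume" could be sketched like this, using Bazel's `--remote_cache` and `--remote_upload_local_results` flags. The cache URL is a placeholder and `remote_cache_flags` is a hypothetical helper. This only controls who may upload; it does nothing about the cross-platform key mismatch that this issue is about.

```python
def remote_cache_flags(cache_url, trusted):
    """Return Bazel flags: trusted (CI) hosts upload, everyone else only reads."""
    upload = "true" if trusted else "false"
    return [
        f"--remote_cache={cache_url}",
        f"--remote_upload_local_results={upload}",
    ]


# Placeholder URL, for illustration only.
ci_flags = remote_cache_flags("https://cache.example.com", trusted=True)
dev_flags = remote_cache_flags("https://cache.example.com", trusted=False)
```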
FYI Experimental Content-Based Output Paths is a proposal to address the output path side of the problem. I'd like to implement it as an |
I'm going to mark this a dupe of #8339. Technically I believe the issue is broader. But that one should better track our focused efforts, and naturally leads here anyway. |
@gregestren you said it was a dupe of itself. Wild concept. If there is another issue it is a dupe of can you edit your comment? |
Apologies - updated the link. |
Following up on #4558 (comment)
I'm starting this thread to discuss options for sharing JDK-derived (java rules, scala rules, etc.) outputs across platforms. The JDK itself is designed to be cross-platform so bytecode can be shared. However, the current design of Bazel makes it difficult to share outputs across platforms. See the linked ticket for details.
The linked ticket suggests a hacky workaround. Binaries that are not registered with Bazel or WORKSPACE do not participate in the fingerprint calculation. The idea: set up bash scripts in a workspace that dynamically route to either Linux or macOS java binaries (they are not hermetic). Register these scripts and override local_sdk to use these scripts instead of what ships with Bazel.
Questions:
Finally, how far are we from having native support in Bazel for sharing JDK outputs across platforms?