Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep repackaged 'allmodules' and 'minimized' JDKs up to date #14785

Closed
cushon opened this issue Feb 10, 2022 · 18 comments
Closed

Keep repackaged 'allmodules' and 'minimized' JDKs up to date #14785

cushon opened this issue Feb 10, 2022 · 18 comments
Assignees
Labels
P2 We'll consider working on this in future. (Assignee optional) team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website type: bug

Comments

@cushon
Copy link
Contributor

cushon commented Feb 10, 2022

[This is a continuation of discussion in internal review cl/427236699]

095f1e2 consolidated the handling the versions of remote JDKs referenced in Bazel in distdir_deps.bzl, as a follow-up to b741116.

The only exception are the 'minimized' and 'allmodules' JDKs that are used for Bazel's embedded JDK:

  • bazel/WORKSPACE

    Lines 135 to 141 in 06a3d82

    # OpenJDK distributions used to create a version of Bazel bundled with the OpenJDK.
    http_file(
    name = "openjdk_linux",
    downloaded_file_path = "zulu-linux.tar.gz",
    sha256 = "65bfe4e0ffa74a680ee4410db46b17e30cd9397b664a92a886599fe1f3530969",
    urls = ["https://mirror.bazel.build/openjdk/azul-zulu11.37.17-ca-jdk11.0.6/zulu11.37.17-ca-jdk11.0.6-linux_x64-linux_x64-allmodules-b23d4e05466f2aa1fdcd72d3d3a8e962206b64bf-1581689070.tar.gz"],
    )
  • bazel/WORKSPACE

    Lines 148 to 153 in 06a3d82

    http_file(
    name = "openjdk_linux_minimal",
    downloaded_file_path = "zulu-linux-minimal.tar.gz",
    sha256 = "91f7d52f695c681d4e21499b4319d548aadef249a6b3053e306308992e1e29ae",
    urls = ["https://mirror.bazel.build/openjdk/azul-zulu11.37.17-ca-jdk11.0.6/zulu11.37.17-ca-jdk11.0.6-linux_x64-minimal-b23d4e05466f2aa1fdcd72d3d3a8e962206b64bf-1581689068.tar.gz"],
    )

Those JDKs are currently out of sync with the other versions.

They should be updated, and if possible the process for creating those derived JDKs should be improve to keep them in sync with the other versions instead of requiring manual updates. It would be nice if it was possible to create them at build-time instead.

The current process uses:

@meteorcloudy meteorcloudy added P2 We'll consider working on this in future. (Assignee optional) team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website type: bug labels Feb 11, 2022
@meteorcloudy
Copy link
Member

@meisterT Why did we use prebuilt 'allmodules' and 'minimized' JDKs? Was it for shorten the total build time of Bazel?

@cushon
Copy link
Contributor Author

cushon commented Feb 11, 2022

I think one of the reasons was that the repackaged JDK is significantly smaller, which makes the bazel installer smaller:

191M zulu11.37.17-ca-jdk11.0.6-linux_x64.tar.gz
 19M zulu11.37.17-ca-jdk11.0.6-linux_x64-minimal-b23d4e05466f2aa1fdcd72d3d3a8e962206b64bf-1581689068.tar.gz
 51M zulu11.37.17-ca-jdk11.0.6-linux_x64-linux_x64-allmodules-b23d4e05466f2aa1fdcd72d3d3a8e962206b64bf-1581689070.tar.gz

@meisterT
Copy link
Member

There are two main reasons: it really takes quite a while (multiple minutes) so you don't want to do this on every build. Secondly, it is unfortunately non-deterministic. So it throw all caching out of the window.

@meteorcloudy
Copy link
Member

Oh, I understand why we are embedding the repacked JDK, but I'm asking the same question as yours: why not "create them at build-time instead."

@meteorcloudy
Copy link
Member

@meisterT I see..

@meteorcloudy
Copy link
Member

Secondly, it is unfortunately non-deterministic. So it throw all caching out of the window.

Do we know why it is non-deterministic?

@cushon
Copy link
Contributor Author

cushon commented Feb 11, 2022

How are the allmodules JDKs used at this point? I remember some history from when the embedded JDK could be used by other tools that ran during the build, which might need modules Bazel doesn't. Is that still supported, or is everything else using local or remote JDKs instead of the embedded one?

@meisterT
Copy link
Member

re non-determinism: when I last looked into it, the jlink tool internally used data structures to collect information and later process that information in a non-deterministic order (think HashMap vs LinkedHashMap).

re allmodules JDK: that's a good question, and your recollection of history matches mine. I hope @comius knows the answer whether / how it is used today.

@comius
Copy link
Contributor

comius commented Feb 11, 2022

re allmodules JDK: I hope @comius knows the answer whether / how it is used today.

I don't. I would need to investigate.

@cushon
Copy link
Contributor Author

cushon commented Feb 12, 2022

I found some internal discussion about allmodules from 2019 that included

with suffix allmodules: this is a full blown JDK with all the modules but doesn't include examples/documentation and so on --> this should be fine to use as --host_javabase

That matches my recollection that --host_javabase was a motivating use-case.

I'm not sure there's a reason to keep the allmodules JDK around at this point. The vanilla remote JDKs seem to be fine for the JDKs that are used in toolchains. I think it's worth considering dropping it, unless we discover it's still being used for something important.

re non-determinism: when I last looked into it, the jlink tool internally used data structures to collect information and later process that information in a non-deterministic order (think HashMap vs LinkedHashMap).

If the non-determinism is the main blocker and you're interested in generating these during the build, I can try to help chase that down. It sounds like something we ought to be able to upstream a fix for.

@meisterT
Copy link
Member

I did look into the time it takes to minimize the JDK and it it is less than 15s on my laptop so I guess if we can fix the non-determinism we can actually get rid of the cached version.

@meteorcloudy
Copy link
Member

Fixing the non-determinism sounds great!

@meteorcloudy
Copy link
Member

meteorcloudy commented Feb 14, 2022

BTW, I'm wondering how does Bazel know the action is non-deterministic and then decide to throw away the cache?

Update: Oh I guess the output of jink is the input of the next action?

@meisterT
Copy link
Member

Right, the output of jlink is input to the next action, in this case the bazel binary itself, potentially making all integration tests uncached.

@cushon
Copy link
Contributor Author

cushon commented Feb 14, 2022

Did you verify you're still seeing non-determinism from jlink? Any chance you'd be willing to put together a minimal repro for that?

@meisterT
Copy link
Member

Yep, it's still there:

$ wget https://mirror.bazel.build/zulu/bin/zulu11.52.13-ca-jdk11.0.13-linux_x64.tar.gz
$ tar xf zulu11.52.13-ca-jdk11.0.13-linux_x64.tar.gz
$ cd zulu11.52.13-ca-jdk11.0.13-linux_x64
$ for i in 1 2 3 4 5; do
   ./bin/jlink --module-path ./jmods/ --add-modules "java.base,java.compiler,java.instrument,java.logging,java.management,java.naming,java.sql,java.xml,jdk.management,jdk.sctp,jdk.unsupported,jdk.crypto.ec" --vm=server --strip-debug --no-man-pages --output attempt${i}
   md5sum attempt${i}/lib/modules
done
4aaa27887a7b0a76e77995a60283307e  attempt1/lib/modules
164fb4bbccafa9cc4738c98e7373bac2  attempt2/lib/modules
9813c9c12cf47c4c462415ac37a087a1  attempt3/lib/modules
4aaa27887a7b0a76e77995a60283307e  attempt4/lib/modules
9813c9c12cf47c4c462415ac37a087a1  attempt5/lib/modules

@cushon
Copy link
Contributor Author

cushon commented Feb 15, 2022

The jlink reproducibility issues have been fixed in JDK 16:

for i in $(seq 0 5); do
   ${JAVA_HOME}/bin/jlink --module-path ./jmods/ --add-modules "java.base,java.compiler,java.instrument,java.logging,java.management,java.naming,java.sql,java.xml,jdk.management,jdk.sctp,jdk.unsupported,jdk.crypto.ec" --vm=server --strip-debug --no-man-pages --output attempt${i}
   md5sum attempt${i}/lib/modules
done
b1d59124b35b455b24513615cbd6dd67  attempt0/lib/modules
b1d59124b35b455b24513615cbd6dd67  attempt1/lib/modules
b1d59124b35b455b24513615cbd6dd67  attempt2/lib/modules
b1d59124b35b455b24513615cbd6dd67  attempt3/lib/modules
b1d59124b35b455b24513615cbd6dd67  attempt4/lib/modules
b1d59124b35b455b24513615cbd6dd67  attempt5/lib/modules

@meisterT what do you think about trying to use JDK 17 as the embedded JDK? Since the JDK is packaged in Bazel, and only used by Bazel, I don't think that would necessarily be an invasive or user-visible change.

@meisterT
Copy link
Member

Great! I think it's reasonable to update to JDK17.

benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 24, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 25, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 26, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 26, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 26, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 26, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Oct 27, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
benjaminp added a commit to benjaminp/bazel that referenced this issue Nov 18, 2022
This allows removing the "cached" allmodules and minimized jdk archives, since `jlink` is deterministic in newer JDK versions.

Fixes bazelbuild#14785.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 We'll consider working on this in future. (Assignee optional) team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website type: bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants