--lockfile_mode doesn't actually work in Bazel 6.2 #18455

Closed
aaronmondal opened this issue May 19, 2023 · 15 comments
Labels
area-Bzlmod Bzlmod-specific PRs, issues, and feature requests awaiting-user-response Awaiting a response from the author P2 We'll consider working on this in future. (Assignee optional) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug

Comments

@aaronmondal

aaronmondal commented May 19, 2023

Description of the bug:

My guess is that there is some bugfix missing. Invoking a build with --lockfile_mode=update creates a MODULE.bazel.lock file with the content {} on the initial invocation. All subsequent invocations then crash with the following stacktrace:

FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.RuntimeException: Unrecoverable error while evaluating node 'com.google.devtools.build.lib.bazel.bzlmod.BazelLockFileValue$$Lambda$215/0x0000000800297040@4eb05709' 
(requested by nodes 'com.google.devtools.build.lib.bazel.bzlmod.BazelDepGraphValue$$Lambda$341/0x00000008004bf840@328bbd2c')
        at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:642)
        at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:382)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.RuntimeException: Failed to invoke public com.google.devtools.build.lib.bazel.bzlmod.BazelLockFileValue() with no args
        at com.google.gson.internal.ConstructorConstructor$3.construct(ConstructorConstructor.java:113)
        at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:212)
        at com.google.gson.Gson.fromJson(Gson.java:932)
        at com.google.gson.Gson.fromJson(Gson.java:897)
        at com.google.gson.Gson.fromJson(Gson.java:846)
        at com.google.gson.Gson.fromJson(Gson.java:817)
        at com.google.devtools.build.lib.bazel.bzlmod.BazelLockFileFunction.compute(BazelLockFileFunction.java:81)
        at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:571)
        ... 4 more
Caused by: java.lang.InstantiationException
        at java.base/jdk.internal.reflect.InstantiationExceptionConstructorAccessorImpl.newInstance(InstantiationExceptionConstructorAccessorImpl.java:48)
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
        at com.google.gson.internal.ConstructorConstructor$3.construct(ConstructorConstructor.java:110)
        ... 11 more

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

  • Use Bazel 6.2.
  • Add build --lockfile_mode=update (and --enable_bzlmod) to .bazelrc
  • Run a command.
  • Attempt to run another command.

Which operating system are you running Bazel on?

Gentoo x86_64

What is the output of bazel info release?

release 6.2.0- (@non-git)

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

Release from nixpkgs.

Have you found anything relevant by searching the web?

This might be relevant: 23518b8

@Wyverald added the team-ExternalDeps and area-Bzlmod labels May 19, 2023
@Wyverald assigned SalmaSamy and unassigned sgowroji and Pavank1992 May 19, 2023
@SalmaSamy
Contributor

SalmaSamy commented May 19, 2023

Hi @aaronmondal,
I tried the simple scenario mentioned and was not able to reproduce the error. Can you please provide a minimal reproducible case?

@meteorcloudy added the P1 label and removed the untriaged label May 23, 2023
@meteorcloudy
Member

I believe in Bazel 6.2, the lock file flag should be --lockfile_mode instead of --enable_lockfile. Are you sure this bug is reproducible with 6.2.0?

@meteorcloudy
Member

meteorcloudy commented May 26, 2023

Invoking a build with --lockfile_mode=update creates a MODULE.bazel.lock file with the content {} on the initial invocation.

Sorry, I missed this. Now I can reproduce a similar issue with an invalid MODULE.bazel.lock file.
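
For anyone checking whether their workspace is in this state, here is a minimal sketch in Python (not part of Bazel) that classifies a lockfile before the next invocation. The function name lockfile_state and its return labels are made up for illustration; the only assumption taken from this thread is that the problematic file contains an empty JSON object.

```python
import json
import sys
from pathlib import Path


def lockfile_state(path: Path) -> str:
    """Classify a lockfile as missing, invalid-json, not-an-object, empty-object, or populated."""
    if not path.exists():
        return "missing"
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        # Covers a zero-byte file as well as truncated or malformed JSON.
        return "invalid-json"
    if not isinstance(data, dict):
        return "not-an-object"
    if not data:
        # The state reported in this issue: the file contains just {}.
        return "empty-object"
    return "populated"


if __name__ == "__main__":
    # Defaults to the lockfile name used in this thread; pass another path to override.
    target = Path(sys.argv[1]) if len(sys.argv) > 1 else Path("MODULE.bazel.lock")
    print(f"{target}: {lockfile_state(target)}")
```

If this prints empty-object, deleting the file before the next run should at least avoid the parse step that crashes, though it does not explain why the empty file was written.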

@SalmaSamy
Contributor

SalmaSamy commented May 26, 2023

@meteorcloudy True, we can handle the crashing part.
But getting an empty lockfile generated in the initial run shouldn't be a valid case. So we need to understand how that happened.

@meteorcloudy
Member

So we need to understand how that happened.

Indeed, @aaronmondal did you intentionally create a MODULE.bazel.lock file with {}?

@aaronmondal
Author

Ok I just retested this again (and sorry for the late reply).

I believe in Bazel 6.2, the lock file flag should be --lockfile_mode instead of --enable_lockfile. Are you sure this bug is reproducible with 6.2.0?

Sorry that was a typo. It's indeed just --lockfile_mode=update. Edited the OP.

Indeed, @aaronmondal did you intentionally create a MODULE.bazel.lock file with {}?

No, that lockfile is created automatically by Bazel. I'm getting this by adding --lockfile_mode=update to the .bazelrc in the examples directory in rules_ll. To fully reproduce this (not minimal, but has pretty much the entire world pinned):

git clone git@github.com:eomii/rules_ll
cd rules_ll/examples
nix develop
# And then add `--lockfile_mode=update` to `examples/.bazelrc` and run e.g. `bazel test cpp`.

rules_ll is somewhat customized and uses RBE toolchains during local development. The .bazelrc itself might be relevant:

# Don't inherit PATH and LD_LIBRARY_PATH.
build --incompatible_strict_action_env

# Forbid network access unless explicitly enabled.
build --sandbox_default_allow_network=false

# Use correct runfile locations.
build --nolegacy_external_runfiles

# Enable sandboxing for exclusive tests like GPU performance tests.
test --incompatible_exclusive_test_sandboxed

# Make sure rules_cc uses the correct transition mechanism.
build --incompatible_enable_cc_toolchain_resolution

# Propagate tags such as no-remote for precompilations to downstream actions.
common --incompatible_allow_tags_propagation

# Bzlmod configuration.
common --enable_bzlmod
common --lockfile_mode=update
common --registry=https://raw.githubusercontent.com/bazelbuild/bazel-central-registry/main/
common --registry=https://raw.githubusercontent.com/eomii/bazel-eomii-registry/main/

# Make sure to use the correct java runtime.
build --java_runtime_version=rbe_jdk
build --tool_java_runtime_version=rbe_jdk

# Always act as if using remote execution.
build --action_env=BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1
build --define=EXECUTOR=remote

# Remote optimizations.
build --experimental_remote_cache_compression
build --experimental_remote_build_event_upload=minimal
build --remote_download_minimal
build --nolegacy_important_outputs

# Smaller profiling. Careful. Disabling this might explode remote cache usage.
build --slim_profile
build --experimental_profile_include_target_label
build --noexperimental_profile_include_primary_output

# Allow user-side customization.
try-import %workspace%/.bazelrc.user

@aaronmondal
Author

Hmm I thought that maybe the fact that we're using a local_path_override in the examples might matter, but that doesn't seem to be the case. Removing local_path_override still leads to the same issue.

@SalmaSamy
Contributor

@aaronmondal Yes, that shouldn't matter. I will take a close look and get back to you.

@SalmaSamy
Contributor

@aaronmondal Hi, I tried the mentioned scenario (though I am getting an error running that target) but I got a correct lockfile generated:
[image attachment]
Also, rerunning is giving me the same error, but nothing related to the lockfile.

So, can you try to provide a minimal repro, preferably one that doesn't require installing nix? (I tried but hit errors: "error: failed to configure synthetic.conf")

@aaronmondal
Author

@SalmaSamy Thanks a lot for looking into this!

The rules_ll toolchains don't work without the nix environment. We wrap the entire Bazel executable twice to ensure full reproducibility beyond labels. The upside of this is that it lets you use remote execution toolchains locally (e.g. to share the same remote cache between locally running builds of different trusted users), but it won't work at all without the nix environment which contains all the pinned tools. We also only support linux environments, so these toolchains most likely won't work even if nix was set up.

That shouldn't be related to this issue though. If the lockfile is generated correctly with your local toolchains it's probably related to some Java or glibc incompatibility. Since the lockfile is working even with the .bazelrc from rules_ll/examples, I think we can cross out bazel flag incompatibilities.

FYI the Java version that is used at the moment is 17.0.6 or more specifically openjdk-headless-17.0.6+10: https://github.com/eomii/rules_ll/blob/d5b919596e4d5896a7996ea4b7f481d4439fd4e0/rbe/default/java/BUILD#L29. JAVA_HOME is using that exact same version as well.

I suspect this is related to a Java version mismatch. Which Java version is running on your system? I can try cycling through various Java versions to see whether the issue persists. I might be able to create a non-nix reproducer by using the --java_runtime_version/--tool_java_runtime_version flags to reference a local JDK.

@meteorcloudy
Member

@aaronmondal Thanks for the context! But I'm not sure it's related to the Java version, since the lock file support in 6.2 only takes Bazel modules into account, and the JDK repos are not Bazel modules.

@SalmaSamy
Contributor

@aaronmondal I see, I was running openjdk 11.0.19, but I also downloaded openjdk 17.0.7 and got the same results.

copybara-service bot pushed a commit that referenced this issue Jun 6, 2023
Add new exception for this function to handle any syntax errors or missing data within the lockfile

Related: #18455
PiperOrigin-RevId: 538144339
Change-Id: I82160f3bff6598c26b3f99c824fe85ea86086c1f
@meteorcloudy added the P2 and awaiting-user-response labels and removed the P1 label Jun 6, 2023
SalmaSamy added a commit that referenced this issue Jul 10, 2023
Add new exception for this function to handle any syntax errors or missing data within the lockfile

Related: #18455
PiperOrigin-RevId: 538144339
Change-Id: I82160f3bff6598c26b3f99c824fe85ea86086c1f
SalmaSamy added a commit that referenced this issue Jul 11, 2023
Add new exception for this function to handle any syntax errors or missing data within the lockfile

Related: #18455
PiperOrigin-RevId: 538144339
Change-Id: I82160f3bff6598c26b3f99c824fe85ea86086c1f
keertk pushed a commit that referenced this issue Jul 11, 2023
* Update lockfile function exception

Add new exception for this function to handle any syntax errors or missing data within the lockfile

Related: #18455
PiperOrigin-RevId: 538144339
Change-Id: I82160f3bff6598c26b3f99c824fe85ea86086c1f

* Update lockfile writing logic to be event triggered.

The writing will only happen at the end of resolution in the after command function (at this point, the module and all needed module extensions are resolved). Which solves the problem of reading and writing into the lockfile multiple times in one invocation.

PiperOrigin-RevId: 540552139
Change-Id: I4a78412a388bde2ff7949d119831318c40d49047

# Conflicts:
#	src/main/java/com/google/devtools/build/lib/bazel/bzlmod/BazelDepGraphFunction.java
#	src/main/java/com/google/devtools/build/lib/bazel/bzlmod/BazelLockFileFunction.java
#	src/test/java/com/google/devtools/build/lib/bazel/bzlmod/BazelLockFileFunctionTest.java

* Add module extension to lockfile

PiperOrigin-RevId: 543445157
Change-Id: Ib32a2c9fdc22c5b228d78141fc9648b3ef5edf7d

# Conflicts:
#	src/main/java/com/google/devtools/build/lib/bazel/bzlmod/GsonTypeAdapterUtil.java
#	src/main/java/com/google/devtools/build/lib/bazel/bzlmod/SingleExtensionEvalFunction.java

* fixes

* Fix tests

* Enable lockfile by default and fix warning

PiperOrigin-RevId: 545927412
Change-Id: I34e8531b8e396ccdfe0eecbee22cb68f76f969fc

# Conflicts:
#	src/test/java/com/google/devtools/build/lib/analysis/RunfilesRepoMappingManifestTest.java
#	src/test/java/com/google/devtools/build/lib/bazel/bzlmod/BazelDepGraphFunctionTest.java
#	src/test/java/com/google/devtools/build/lib/bazel/bzlmod/BazelModuleResolutionFunctionTest.java
#	src/test/java/com/google/devtools/build/lib/query2/testutil/SkyframeQueryHelper.java
#	src/test/java/com/google/devtools/build/lib/rules/starlarkdocextract/StarlarkDocExtractTest.java

* Add lockfile documentation

PiperOrigin-RevId: 546085523
Change-Id: If287e6e143f8858d185f83a1630275520db65e44

# Conflicts:
#	site/en/_book.yaml

* Fix null event issue

If a change "only" occurred in a module extension, then the value of ModuleResolutionEvent is null -the skyvalue is cached and the event was never sent- then "combineModuleExtensions" function would crash with null pointer exception while trying to get the old usages.

In this case we can just re-add the old module extensions from the lockfile because if the module resolution is the same, this means the usage didn't change.

PiperOrigin-RevId: 546821911
Change-Id: Ie685cbab654d1c41403aebd31ddad91033be1d56

* Fix tests

* Remove unrelated updates

* reverse unrelated changes
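
Two of the changes described in the commit message above (writing the lockfile only once at the end of a command, and falling back to the old lockfile contents when no module resolution event was sent) can be sketched roughly as follows. This is an illustrative Python model, not Bazel's Java implementation: the class LockfileUpdater, its method names, and the "modules"/"extensions" keys are all hypothetical, and it simplifies by discarding old extension entries whenever module resolution was redone.

```python
import json
from pathlib import Path
from typing import Dict, Optional


class LockfileUpdater:
    """Hypothetical model: buffer resolution results, write the lockfile once per command."""

    def __init__(self, lockfile_path: Path) -> None:
        self.lockfile_path = lockfile_path
        # Stays None when module resolution was served from cache and no event fired.
        self.module_event: Optional[Dict] = None
        self.extension_events: Dict[str, Dict] = {}
        self.write_count = 0

    def on_module_resolved(self, dep_graph: Dict) -> None:
        # Only remembers the result; nothing is written here.
        self.module_event = dep_graph

    def on_extension_resolved(self, extension_id: str, result: Dict) -> None:
        self.extension_events[extension_id] = result

    def after_command(self) -> None:
        """Called once when the command finishes; the only place that writes."""
        if self.module_event is None and not self.extension_events:
            return  # Nothing was re-resolved, so leave the lockfile untouched.

        old: Dict = {}
        if self.lockfile_path.exists():
            old = json.loads(self.lockfile_path.read_text(encoding="utf-8"))

        if self.module_event is None:
            # No module event: reuse the old modules and extensions rather than
            # dereferencing an event that was never sent.
            modules = old.get("modules", {})
            extensions = dict(old.get("extensions", {}))
        else:
            # Simplifying assumption of this sketch: start extensions from scratch.
            modules = self.module_event
            extensions = {}
        extensions.update(self.extension_events)

        new = {"modules": modules, "extensions": extensions}
        self.lockfile_path.write_text(json.dumps(new, indent=2), encoding="utf-8")
        self.write_count += 1
```

The point of the design is that the event handlers never touch the file, so a single invocation cannot interleave reads and writes of the lockfile.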
@Wyverald
Member

Closing this as inactionable.

@Wyverald closed this as not planned Jul 12, 2023
@csmulhern
Contributor

I'm having the same exact issue with a very minimal example (repro.zip) that sets up a simple python binary. No custom .bazelrc at all.

Running:

bazel run --enable_bzlmod --lockfile_mode=update //:main

Results in the creation of a MODULE.bazel.lock file that contains an empty JSON object:

{}

The next attempt to run the bazel command above results in the following crash:

FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.RuntimeException: Unrecoverable error while evaluating node 'com.google.devtools.build.lib.bazel.bzlmod.BazelLockFileValue$$Lambda$219/0x00000008002a1440@4b69424b' (requested by nodes 'com.google.devtools.build.lib.bazel.bzlmod.BazelDepGraphValue$$Lambda$361/0x00000008004c0840@2403fbb8')
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:633)
	at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:365)
	at java.base/java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1407)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
Caused by: java.lang.RuntimeException: Failed to invoke public com.google.devtools.build.lib.bazel.bzlmod.BazelLockFileValue() with no args
	at com.google.gson.internal.ConstructorConstructor$3.construct(ConstructorConstructor.java:113)
	at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:212)
	at com.google.gson.Gson.fromJson(Gson.java:932)
	at com.google.gson.Gson.fromJson(Gson.java:897)
	at com.google.gson.Gson.fromJson(Gson.java:846)
	at com.google.gson.Gson.fromJson(Gson.java:817)
	at com.google.devtools.build.lib.bazel.bzlmod.BazelLockFileFunction.getLockfileValue(BazelLockFileFunction.java:101)
	at com.google.devtools.build.lib.bazel.bzlmod.BazelLockFileFunction.compute(BazelLockFileFunction.java:79)
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:562)
	... 7 more
Caused by: java.lang.InstantiationException
	at java.base/jdk.internal.reflect.InstantiationExceptionConstructorAccessorImpl.newInstance(InstantiationExceptionConstructorAccessorImpl.java:48)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
	at com.google.gson.internal.ConstructorConstructor$3.construct(ConstructorConstructor.java:110)
	... 15 more

Bazel version:

> bazel --version
bazel 6.3.2-homebrew

macOS version:

> sw_vers
ProductName:		macOS
ProductVersion:		14.0
BuildVersion:		23A344

@csmulhern
Contributor

@Wyverald can we reopen?
