Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

action cache inputs with old ~ external module naming poison fresh builds #23180

Closed
benjaminp opened this issue Aug 1, 2024 · 6 comments
Closed
Assignees
Labels
area-Bzlmod Bzlmod-specific PRs, issues, and feature requests P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug

Comments

@benjaminp
Copy link
Collaborator

Using a7264d9, I saw this server crash:

Caused by: java.util.concurrent.CompletionException: com.google.devtools.build.lib.cmdline.LabelSyntaxException: invalid repository name 'abseil-cpp~': repo names may contain only A-Z, a-z, 0-9, '-', '_', '.' and '+'
	at com.github.benmanes.caffeine.cache.LocalLoadingCache.lambda$newMappingFunction$3(LocalLoadingCache.java:204)
	at com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$13(BoundedLocalCache.java:2451)
	at java.base/java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1955)
	at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2449)
	at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2432)
	at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:107)
	at com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:57)
	at com.google.devtools.build.lib.cmdline.RepositoryName.createUnvalidated(RepositoryName.java:97)
	at com.google.devtools.build.lib.cmdline.PackageIdentifier.discoverFromExecPath(PackageIdentifier.java:107)
	at com.google.devtools.build.lib.skyframe.ActionExecutionFunction$PackageRootResolverWithEnvironment.findPackageRootsForFiles(ActionExecutionFunction.java:733)
	at com.google.devtools.build.lib.actions.ArtifactFactory.resolveSourceArtifacts(ArtifactFactory.java:489)
	at com.google.devtools.build.lib.actions.ActionCacheChecker.getCachedInputs(ActionCacheChecker.java:828)
	at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.getActionCachedInputs(SkyframeActionExecutor.java:805)
	at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.collectInputs(ActionExecutionFunction.java:633)
	at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.computeInternal(ActionExecutionFunction.java:264)
	at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:189)
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:467)
	... 7 more
Caused by: com.google.devtools.build.lib.cmdline.LabelSyntaxException: invalid repository name 'abseil-cpp~': repo names may contain only A-Z, a-z, 0-9, '-', '_', '.' and '+'
	at com.google.devtools.build.lib.cmdline.LabelParser.syntaxErrorf(LabelParser.java:208)
	at com.google.devtools.build.lib.cmdline.RepositoryName.validate(RepositoryName.java:167)
	at com.google.devtools.build.lib.cmdline.RepositoryName.lambda$static$0(RepositoryName.java:67)
	at com.github.benmanes.caffeine.cache.LocalLoadingCache.lambda$newMappingFunction$3(LocalLoadingCache.java:197)
	... 23 more

Everything worked after a clean --expunge. I didn't try to build a repro, but I assume this happens when a C++ action from an external repo is rerun after the ~ to + naming conversion.

@fmeum
Copy link
Collaborator

fmeum commented Aug 1, 2024

@Wyverald This would probably also affect anybody who flips the flag manually. Should we fix this in an rc2 if so?

@Wyverald Wyverald added P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. area-Bzlmod Bzlmod-specific PRs, issues, and feature requests type: bug labels Aug 5, 2024
@Wyverald Wyverald self-assigned this Aug 5, 2024
@tpudlik
Copy link
Contributor

tpudlik commented Aug 6, 2024

I used to think this issue was not a huge deal, just bazelisk clean --expunge once locally and be done. But unfortunately, we've observed it in our CI builds (which use both remote and local disk cache; I'm not sure which one is poisoned). So I'm very reluctant to roll past 60924fd, which flipped the flag by default.

@fmeum
Copy link
Collaborator

fmeum commented Aug 6, 2024

PackageIdentifier.discoverFromExecPath could probably just throw a checked exception that is handled in ActionCacheChecker.getCachedInputs to consider the entry corrupted.

@Wyverald
Copy link
Member

Wyverald commented Aug 6, 2024

This does seem bad enough to warrant a fix in 7.3.0. I'm working on it now -- sorry for not realizing how bad this was before.

The fix is simple enough (as Fabian noted above), but I'm having trouble finding a place for unit tests. Coverage in this area seems pretty bad.

Wyverald added a commit that referenced this issue Aug 6, 2024
60924fd changed the canonical repo name separator from `~` to `+`. Older repo names containing `~` now trigger a syntax error. If an action cache entry refers to an exec path from a previous version of Bazel that used `~`, we need to be resilient and treat the cache entry as corrupted, rather than just crash.

Fixes #23180.
@Wyverald
Copy link
Member

Wyverald commented Aug 6, 2024

@bazel-io fork 7.3.0

Wyverald added a commit that referenced this issue Aug 6, 2024
60924fd changed the canonical repo name separator from `~` to `+`. Older repo names containing `~` now trigger a syntax error. If an action cache entry refers to an exec path from a previous version of Bazel that used `~`, we need to be resilient and treat the cache entry as corrupted, rather than just crash.

Fixes #23180.

Closes #23227.

PiperOrigin-RevId: 660105601
Change-Id: Iea5d86c635056d12ba20598383da463bdde03ab0
github-merge-queue bot pushed a commit that referenced this issue Aug 6, 2024
…23230)

60924fd
changed the canonical repo name separator from `~` to `+`. Older repo
names containing `~` now trigger a syntax error. If an action cache
entry refers to an exec path from a previous version of Bazel that used
`~`, we need to be resilient and treat the cache entry as corrupted,
rather than just crash.

Fixes #23180.

Closes #23227.

PiperOrigin-RevId: 660105601
Change-Id: Iea5d86c635056d12ba20598383da463bdde03ab0
@iancha1992
Copy link
Member

A fix for this issue has been included in Bazel 7.3.0 RC2. Please test out the release candidate and report any issues as soon as possible.
If you're using Bazelisk, you can point to the latest RC by setting USE_BAZEL_VERSION=7.3.0rc2. Thanks!

keertk added a commit that referenced this issue Aug 12, 2024
60924fd
changed the canonical repo name separator from `~` to `+`. Older repo
names containing `~` now trigger a syntax error. If an action cache
entry refers to an exec path from a previous version of Bazel that used
`~`, we need to be resilient and treat the cache entry as corrupted,
rather than just crash.

Fixes #23180.

Closes #23227.

PiperOrigin-RevId: 660105601
Change-Id: Iea5d86c635056d12ba20598383da463bdde03ab0

Co-authored-by: Xdng Yng <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-Bzlmod Bzlmod-specific PRs, issues, and feature requests P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants