Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

downloader config from --experimental_downloader_config is cached in the server and requires shutdown to change #24166

Closed
novas0x2a opened this issue Oct 31, 2024 · 15 comments
Labels
P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug

Comments

@novas0x2a
Copy link

novas0x2a commented Oct 31, 2024

Description of the bug:

The value of --experimental_downloader_config seems to be cached in the server on startup and requires a shutdown in order to change either the value of the flag or the contents of the file that the flag points to.

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

$ bazel --ignore_all_rc_files info release
release 8.0.0rc2

$ bazel --ignore_all_rc_files clean --expunge
INFO: Starting clean (this may take a while). Use --async if the clean takes more than several minutes.

$ rm -f MODULE.bazel.lock

$ cat allow-bcr
allow *

$ cat deny-bcr
block bcr.bazel.build
allow *

$ bazel --ignore_all_rc_files mod deps --lockfile_mode=update --repository_cache= --experimental_downloader_config=deny-bcr
Starting local Bazel server and connecting to it...
ERROR: Error accessing registry https://bcr.bazel.build/: Failed to fetch registry file https://bcr.bazel.build/modules/platforms/0.0.10/MODULE.bazel: Configured URL rewriter blocked all URLs: [https://bcr.bazel.build/modules/platforms/0.0.10/MODULE.bazel]. Type 'bazel help mod' for syntax and help.

$ bazel --ignore_all_rc_files mod deps --lockfile_mode=update --repository_cache= --experimental_downloader_config=allow-bcr
ERROR: Error accessing registry https://bcr.bazel.build/: Failed to fetch registry file https://bcr.bazel.build/modules/platforms/0.0.10/MODULE.bazel: Configured URL rewriter blocked all URLs: [https://bcr.bazel.build/modules/platforms/0.0.10/MODULE.bazel]. Type 'bazel help mod' for syntax and help.

$ bazel --ignore_all_rc_files shutdown

$ bazel --ignore_all_rc_files mod deps --lockfile_mode=update --repository_cache= --experimental_downloader_config=allow-bcr
# download proceeds normally

Which operating system are you running Bazel on?

linux

What is the output of bazel info release?

release 8.0.0rc2 and release 7.4.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

bazelisk doesn't support this use case, but I manually bisected release versions, and this appears to be a regression. The bug is present in:

  • 7.4.0rc1
  • 7.4.0
  • 8.0.0rc2

The bug is not present in:

  • 7.3.2

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@novas0x2a
Copy link
Author

novas0x2a commented Oct 31, 2024

I did a deeper manual bisect and it appears to have been introduced in cab54be (relative to a git bisect between 7.3.2 and 7.4.0)

@fmeum
Copy link
Collaborator

fmeum commented Oct 31, 2024

@criemen

@fmeum
Copy link
Collaborator

fmeum commented Oct 31, 2024

@bazel-io fork 7.4.1

@fmeum
Copy link
Collaborator

fmeum commented Oct 31, 2024

@bazel-io fork 8.0.0

@criemen
Copy link
Contributor

criemen commented Nov 1, 2024

Thanks for the bisect and ping, I'll see if I can whip up an easy patch for this.

@criemen
Copy link
Contributor

criemen commented Nov 1, 2024

Okay I need some help here, as I've never worked with the SkyFunction concept.

This bug is as far as I can see only applicable to bzlmod registry access, and not to general URLs.
What's happening is the following:

  • we instantiate a new download manager (with a new URL rewriter, as well as all the other config) in beforeCommand in BazelRepositoryModule. This is correct, and doesn't cache across command invocations, as it shouldn't.
  • We instantiate a global RegistryFactoryImpl in the constructor of BazelRepositoryModule
  • That registry factory gets a download manager set into itself in beforeCommand, and it gets called through the SkyFunction RegistryFunction.
  • On the second command, the registry factory implementation dutifully gets the new DownloadManager instance (as it should)
  • However, the SkyFunction mechanism caches the previous registry instantiation, so that never gets the new download manager with the updated config.

Does this analysis help someone from the bazel core team to come up with a patch for this issue?

@iancha1992 iancha1992 added the team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. label Nov 1, 2024
@fmeum
Copy link
Collaborator

fmeum commented Nov 1, 2024

I think that this root cause analysis is spot-on, thanks!

How we fix this depends on the semantics we want to achieve:

  • If we want the downloader config to only invalidate unsuccessful downloads, we should make sure that it doesn't end up in SkyValues. For example, we could inject it into the individual SkyFunctions that use Registry instances and pass DownloadManager into Registry functions as an argument.
  • If we want the downloader config to invalidate all downloads (e.g. so that changing it to something broken after a successful build actually breaks the build), we could add a dependency in the contents of the file (FileValue) to the RegistryFunction.

@fmeum
Copy link
Collaborator

fmeum commented Nov 1, 2024

@Wyverald

@criemen
Copy link
Contributor

criemen commented Nov 1, 2024

How we fix this depends on the semantics we want to achieve:

I think the problem goes further - the other options from RepoOptions that affect the download manager are also cached for bzlmod downloads - updates to the max parallel downloads option are as far as I can see by a quick look also ignored.

If there's a chance that the registry is used in parallel with other stuff, then we have two download managers active at the same time that don't share the same underlying semaphore in the httpdownloader, which in theory is also a correctness problem.

So injecting DownloadManagers into registry instances seems reasonable to me, as we otherwise break the assumption that the DownloadManager is command-scoped.

@dws
Copy link
Contributor

dws commented Nov 2, 2024

I think that this root cause analysis is spot-on, thanks!

How we fix this depends on the semantics we want to achieve:

  • If we want the downloader config to only invalidate unsuccessful downloads, we should make sure that it doesn't end up in SkyValues. For example, we could inject it into the individual SkyFunctions that use Registry instances and pass DownloadManager into Registry functions as an argument.
  • If we want the downloader config to invalidate all downloads (e.g. so that changing it to something broken after a successful build actually breaks the build), we could add a dependency in the contents of the file (FileValue) to the RegistryFunction.

Speaking as a Bazel user here, I do not want an entry in the cache to mask an error in a modified downloader config. So I'd want a change in the downloader config to invalidate all downloads.

@fmeum
Copy link
Collaborator

fmeum commented Nov 4, 2024

Speaking as a Bazel user here, I do not want an entry in the cache to mask an error in a modified downloader config. So I'd want a change in the downloader config to invalidate all downloads.

This is pretty tricky as the repo cache isn't aware of downloader config rewrites. So even if we make the Bzlmod SkyFunctions dependent on them, you could still get cache hits if your config is invalid. We could add new logic to rewrite URL-like canonical IDs using the rewriter, but that's more of a FR than a bug fix.

@fmeum
Copy link
Collaborator

fmeum commented Nov 4, 2024

@criemen Would you be willing to send a PR to inject DownloadManager into the individual SkyFunctions? That should fix the regression, everything else could be a follow-up.

@criemen
Copy link
Contributor

criemen commented Nov 5, 2024

@fmeum I'll give it a timeboxed try this afternoon.

@meteorcloudy meteorcloudy added P1 I'll work on this now. (Assignee required) and removed untriaged labels Nov 5, 2024
@criemen
Copy link
Contributor

criemen commented Nov 5, 2024

PR here: #24212

bazel-io pushed a commit to bazel-io/bazel that referenced this issue Nov 6, 2024
This PR removes the DownloadManager from the registry factory implementation. As the registry created by the factory is cached by the SkyFunction mechanism, the DownloadManager instance was living too long - it is supposed to be re-created for every command instantiation to respect changes in command line options, but for the registry, it ignored those changes.

Instead, the DownloadManager is set directly into the affected SkyFunctions that require access to it. This way, the per-command DownloadManager instance is correctly used.

This fixes bazelbuild#24166.

Note for reviewers: This is my first time touching code with SkyFunctions, so I don't really know what I'm doing.

Closes bazelbuild#24212.

PiperOrigin-RevId: 693644409
Change-Id: I7b16684e52673043615290d114f078ab7ab99fcf
meteorcloudy pushed a commit to meteorcloudy/bazel that referenced this issue Nov 6, 2024
This PR removes the DownloadManager from the registry factory implementation. As the registry created by the factory is cached by the SkyFunction mechanism, the DownloadManager instance was living too long - it is supposed to be re-created for every command instantiation to respect changes in command line options, but for the registry, it ignored those changes.

Instead, the DownloadManager is set directly into the affected SkyFunctions that require access to it. This way, the per-command DownloadManager instance is correctly used.

This fixes bazelbuild#24166.

Note for reviewers: This is my first time touching code with SkyFunctions, so I don't really know what I'm doing.

Closes bazelbuild#24212.

PiperOrigin-RevId: 693644409
Change-Id: I7b16684e52673043615290d114f078ab7ab99fcf
meteorcloudy added a commit that referenced this issue Nov 6, 2024
…4228)

This PR removes the DownloadManager from the registry factory
implementation. As the registry created by the factory is cached by the
SkyFunction mechanism, the DownloadManager instance was living too long
- it is supposed to be re-created for every command instantiation to
respect changes in command line options, but for the registry, it
ignored those changes.

Instead, the DownloadManager is set directly into the affected
SkyFunctions that require access to it. This way, the per-command
DownloadManager instance is correctly used.

This fixes #24166.

Note for reviewers: This is my first time touching code with
SkyFunctions, so I don't really know what I'm doing.

Closes #24212.

PiperOrigin-RevId: 693644409
Change-Id: I7b16684e52673043615290d114f078ab7ab99fcf

Co-authored-by: Cornelius Riemenschneider <[email protected]>
@iancha1992
Copy link
Member

A fix for this issue has been included in Bazel 7.4.1 RC2. Please test out the release candidate and report any issues as soon as possible.
If you're using Bazelisk, you can point to the latest RC by setting USE_BAZEL_VERSION=7.4.1rc2. Thanks!

github-merge-queue bot pushed a commit that referenced this issue Nov 7, 2024
…4220)

This PR removes the DownloadManager from the registry factory
implementation. As the registry created by the factory is cached by the
SkyFunction mechanism, the DownloadManager instance was living too long
- it is supposed to be re-created for every command instantiation to
respect changes in command line options, but for the registry, it
ignored those changes.

Instead, the DownloadManager is set directly into the affected
SkyFunctions that require access to it. This way, the per-command
DownloadManager instance is correctly used.

This fixes #24166.

Note for reviewers: This is my first time touching code with
SkyFunctions, so I don't really know what I'm doing.

Closes #24212.

PiperOrigin-RevId: 693644409
Change-Id: I7b16684e52673043615290d114f078ab7ab99fcf

Commit
f57f672

Co-authored-by: Cornelius Riemenschneider <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants