Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repository_ctx.download* does not use URL as default canonical ID #22652

Closed
Silic0nS0ldier opened this issue Jun 6, 2024 · 5 comments
Closed
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Documentation Documentation improvements that cannot be directly linked to other team labels team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug

Comments

@Silic0nS0ldier
Copy link
Contributor

Description of the bug:

Prior to Bazel v7, .download and .download_and_extract were beholden to the flag --repository_cache_urls_as_default_canonical_id which in Bazel v6 (and possibly earlier) was enabled by default.

As of #20047 this logic was lifted out of src/main/java/com/google/devtools/build/lib/bazel/repository/downloader/DownloadManager.java, making it specific to the following repository rules/macros (via tools/build_defs/repo/cache.bzl).

  • http_archive
  • http_file
  • http_jar
  • jvm_maven_import_external
  • jvm_import_external

Bazel v7 release notes do not mention that repository_ctx.download and repository_ctx.download_and_extract no longer use URLs as the default canonical ID.

Which category does this issue belong to?

Configurability, External Dependency, Rules API

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

  1. Create a simple repository_rule which downloads a file, passing in a value for integrity.
  2. Run it.
  3. Modify the URL to anything else (file does not need to exist).
  4. Run it again, observe no failure (pulled from cache using integrity).

Which operating system are you running Bazel on?

macOS, Ubuntu (Linux)

What is the output of bazel info release?

release 7.1.2

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

The release notes do not explicitly mention this behaviour change, so I believe this is an unintended regression.

Commit: a6f8923

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@github-actions github-actions bot added team-Configurability platforms, toolchains, cquery, select(), config transitions team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts labels Jun 6, 2024
@fmeum
Copy link
Collaborator

fmeum commented Jun 6, 2024

Prior to Bazel v7, .download and .download_and_extract were beholden to the flag --experimental_repository_cache_urls_as_default_canonical_id...

This is correct. As of Bazel 7, the flag is now a no-op and the repo rules you listed have been updated to use this logic by default.

... which in Bazel v6 (and possibly earlier) was enabled by default.

This is not correct, the flag wasn't enabled by default in previous versions of Bazel:
https://cs.opensource.google/bazel/bazel/+/release-6.5.0:src/main/java/com/google/devtools/build/lib/bazel/repository/RepositoryOptions.java;l=243

If you are using rctx.download or rctx.download_and_extract in a custom repo rule and were previously using the --experimental_repository_cache_urls_as_default_canonical_id flag to set a default canonical ID, you would have to update the logic to manually pass URLs into the canonical ID field. You can use the new public helper function get_default_canonical_id in @bazel_tools//tools/build_defs/repo:cache.bzl if you want to match the behavior of the special-cased rules.

@comius comius removed the team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts label Jun 7, 2024
@Silic0nS0ldier
Copy link
Contributor Author

Silic0nS0ldier commented Jun 11, 2024

Ah, my mistake. Perhaps the flag had been in use in our repo, or perhaps I had simply made a bad assumption.

In this case I have some suggestions;

  • rctx.download* documentation should be updated to include what the default canonical_id is.
  • rctx.download* documentation should include examples for common patterns (such as using get_default_canonical_id).
  • Co-ordinate with ruleset authors so API usage uses an appropriate canonical_id (not having it set is probably a good signal that the behaviour hasn't been considered).

Some extra context for my request which I neglected to include in the OP. Due to the default behaviour, a TypeScript version upgrade was merged in a broken state due to an overlooked integrity hash.

load("@aspect_rules_ts//ts:repositories.bzl", "rules_ts_dependencies")
rules_ts_dependencies(
    ts_version_from = "//:package.json", # 5.4.2 -> 5.4.5
-   ts_integrity = "sha512-+2/g0Fds1ERlP6JsakQQDXjZdZMM+rqpamFZJEKh4kwTIn3iDkgKtby0CeNd5ATNZ4Ry1ax15TMx0W2V+miizQ==",
+   ts_integrity = "sha512-vcI4UpRgg81oIRUFwR0WSIHKt11nJ7SAVlYNIu+QpqeyXP+gpQJy/Z4+F0aGxSE4MqwjyXvW/TzgkLAx2AGHwQ==",
)

This specific instance of the issue should not reoccur as we've updated usage to automatically pick up the hash.

rules_ts_dependencies(
    ts_version_from = "@pnpm//:typescript/resolved.json",
)

@Wyverald Wyverald added team-Documentation Documentation improvements that cannot be directly linked to other team labels P2 We'll consider working on this in future. (Assignee optional) and removed team-Configurability platforms, toolchains, cquery, select(), config transitions untriaged labels Jun 11, 2024
@Silic0nS0ldier
Copy link
Contributor Author

I looked into using @bazel_tools//tools/build_defs/repo:cache.bzl in an internal repo rule, without success. The helper has visibility("private") preventing usage, it is not currently part of the public API.

@fmeum
Copy link
Collaborator

fmeum commented Jun 13, 2024

Feel free to send a PR to remove that restriction!

copybara-service bot pushed a commit that referenced this issue Jul 19, 2024
Documenting the default caching behaviour for `rctx.download` and `rctx.download_and_extract`.
 
Related to #22652

Closes #22716.

PiperOrigin-RevId: 653878029
Change-Id: I129f9eede2b841483a98d5335651c1d1198aaf76
iancha1992 pushed a commit to iancha1992/bazel that referenced this issue Jul 19, 2024
Documenting the default caching behaviour for `rctx.download` and `rctx.download_and_extract`.
 
Related to bazelbuild#22652

Closes bazelbuild#22716.

PiperOrigin-RevId: 653878029
Change-Id: I129f9eede2b841483a98d5335651c1d1198aaf76
github-merge-queue bot pushed a commit that referenced this issue Jul 22, 2024
Documenting the default caching behaviour for `rctx.download` and
`rctx.download_and_extract`.
 
Related to #22652

Closes #22716.

PiperOrigin-RevId: 653878029
Change-Id: I129f9eede2b841483a98d5335651c1d1198aaf76

Commit
d4b416f

Co-authored-by: Jordan Mele <[email protected]>
copybara-service bot pushed a commit that referenced this issue Aug 14, 2024
When `repository_ctx.download` or `repository_ctx.download_and_extract` are not given an explicit `canonical_id` the default behaviour can lead to some counterintuitive results (e.g. URL changed but old asset restored from cache due to unchanged checksum).

This PR seeks to bring greater attention to `canonical_id` in these low level API (relative to `http_archive` which uses `get_default_canonical_id` by default).

URLs are usually the most appropriate `canonical_id` choice, so `get_default_canonical_id` has been added to the public API and sample usage added to documentation.

Related to #22652

## TODO

I need some pointers to make these happen.

- [x] Add `get_default_canonical_id` to the Bazel docs (I believe there are other `load`ed API that is missing from docs in one way or another).
- [x] Confirm examples render correctly in docs.

Closes #22742.

PiperOrigin-RevId: 662822225
Change-Id: I1617aa16a62da2d8dff2034fef8ca19aecd33d58
fmeum pushed a commit to fmeum/bazel that referenced this issue Aug 22, 2024
Documenting the default caching behaviour for `rctx.download` and `rctx.download_and_extract`.
 
Related to bazelbuild#22652

Closes bazelbuild#22716.

PiperOrigin-RevId: 653878029
Change-Id: I129f9eede2b841483a98d5335651c1d1198aaf76
fmeum pushed a commit to fmeum/bazel that referenced this issue Aug 22, 2024
When `repository_ctx.download` or `repository_ctx.download_and_extract` are not given an explicit `canonical_id` the default behaviour can lead to some counterintuitive results (e.g. URL changed but old asset restored from cache due to unchanged checksum).

This PR seeks to bring greater attention to `canonical_id` in these low level API (relative to `http_archive` which uses `get_default_canonical_id` by default).

URLs are usually the most appropriate `canonical_id` choice, so `get_default_canonical_id` has been added to the public API and sample usage added to documentation.

Related to bazelbuild#22652

I need some pointers to make these happen.

- [x] Add `get_default_canonical_id` to the Bazel docs (I believe there are other `load`ed API that is missing from docs in one way or another).
- [x] Confirm examples render correctly in docs.

Closes bazelbuild#22742.

PiperOrigin-RevId: 662822225
Change-Id: I1617aa16a62da2d8dff2034fef8ca19aecd33d58
@Silic0nS0ldier
Copy link
Contributor Author

Doc change and helper are merged. All that remains is making sure rulesets match user expectations. Not enough reason to keep this open IMO, so closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Documentation Documentation improvements that cannot be directly linked to other team labels team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug
Projects
None yet
Development

No branches or pull requests

7 participants