Vendor mode: move the external repo instead of copying #22668
FileSystemUtils.moveFile(markerUnderExternal, tMarker);
// 3. Move the external repo to vendor dir. It's fine if this step fails or is interrupted, because the marker
// file under external is gone anyway.
FileSystemUtils.moveTreesBelow(repoUnderExternal, repoUnderVendor);
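The ordering in the two quoted steps can be illustrated with a minimal sketch. It uses plain java.nio.file rather than Bazel's FileSystemUtils, covers only the two steps quoted above, and the class and method names are hypothetical. It also assumes both directories are on the same filesystem, since Files.move may fail for a non-empty directory otherwise.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class MoveOrderSketch {
  /**
   * Moves the marker file away first, then moves the repo directory.
   * If the second move fails or is interrupted, the marker under external
   * is already gone, so the half-moved repo is not mistaken for a valid one.
   */
  static void moveRepoToVendor(
      Path markerUnderExternal, Path tMarker, Path repoUnderExternal, Path repoUnderVendor)
      throws IOException {
    // Step 2 in the quoted code: take the marker out of the external dir.
    Files.move(markerUnderExternal, tMarker);
    // Step 3 in the quoted code: move the repo itself (a rename, not a copy).
    Files.move(repoUnderExternal, repoUnderVendor);
  }
}
```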
How does this behave if a repo symlinks files from another repo and one is vendored while the other is not? It looks like it may be necessary to follow relative symlinks but not absolute symlinks.
moveTreesBelow doesn't follow any symlinks. Judging from the code here, it's actually impossible to create a relative symlink with the ctx.symlink API.
I tested with
ctx.symlink("/tmp/foo", "path_abs")
ctx.symlink("data", "path_rel")
ctx.symlink(ctx.path(Label("@bar//:data")), "path_bar")
ctx.symlink("../_main~ext~bar~/data", "path_bar_2")
and it resulted in
path_abs@ -> /tmp/foo
path_bar@ -> /private/var/tmp/_bazel_pcloudy/d278f827a729facdbfb1ff0fc0002042/external/_main~ext~bar/data
path_bar_2@ -> /private/var/tmp/_bazel_pcloudy/d278f827a729facdbfb1ff0fc0002042/external/_main~ext~bar~/data
path_rel@ -> /private/var/tmp/_bazel_pcloudy/d278f827a729facdbfb1ff0fc0002042/external/_main~ext~foo/data
in both the external and the vendor dir.
This is fine if only foo is vendored, since eventually <output_base>/external/_main~ext~bar would exist and point to the right location. However, I noticed there is a problem if the output base is changed after vendoring.
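To make the failure mode concrete, here is a small sketch in plain java.nio.file (not Bazel's own file system API; class and method names are hypothetical). It shows that moving a symlink keeps the stored target text unchanged, and it offers a check for links whose absolute target sits below a given output base, which are the ones that would break when the output base changes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class SymlinkMoveSketch {
  /** Moves a symlink (the link itself, not its target) and returns the target stored in it. */
  static Path moveLink(Path link, Path dest) throws IOException {
    // Files.move on a symbolic link moves the link and leaves the target text as-is.
    Files.move(link, dest);
    return Files.readSymbolicLink(dest);
  }

  /** Returns true if the link stores an absolute target below the given output base. */
  static boolean dependsOnOutputBase(Path link, Path outputBase) throws IOException {
    Path target = Files.readSymbolicLink(link);
    return target.isAbsolute() && target.startsWith(outputBase);
  }
}
```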
While this is the current behavior, wouldn't we have to change it so that symlinks in vendored repos do not contain absolute paths? I think there was another issue about this filed recently.
To deal with a potential output base change, maybe we could:
1. Create a symlink under the vendor dir that points to the external repo root.
2. Rewrite every symlink that points to a path under the external repo root into a relative path that goes through the symlink created in step 1.
I have an experimental implementation in meteorcloudy@bf0ec69, which results in
$ ll vendor_src/_bazel-external
lrwxr-xr-x 1 pcloudy primarygroup 73 Jun 10 15:17 vendor_src/_bazel-external@ -> /private/var/tmp/_bazel_pcloudy/d278f827a729facdbfb1ff0fc0002042/external
pcloudy@pcloudy-macbookpro2:~/workspace/my_tests/simple_cpp_test (master)
$ ll vendor_src/_main~ext~foo/
total 8
drwxr-xr-x 9 pcloudy primarygroup 288 Jun 10 15:17 ./
drwxr-xr-x 7 pcloudy primarygroup 224 Jun 10 15:17 ../
-rwxr-xr-x 1 pcloudy wheel 0 Jun 10 15:17 BUILD*
-rwxr-xr-x 1 pcloudy wheel 0 Jun 10 15:17 REPO.bazel*
-rwxr-xr-x 1 pcloudy wheel 15 Jun 10 15:17 data*
lrwxr-xr-x 1 pcloudy wheel 8 Jun 10 15:17 path_abs@ -> /tmp/foo
lrwxr-xr-x 1 pcloudy primarygroup 37 Jun 10 15:17 path_bar@ -> ../_bazel-external/_main~ext~bar/data
lrwxr-xr-x 1 pcloudy primarygroup 38 Jun 10 15:17 path_bar_2@ -> ../_bazel-external/_main~ext~bar/data2
lrwxr-xr-x 1 pcloudy primarygroup 37 Jun 10 15:17 path_rel@ -> ../_bazel-external/_main~ext~foo/data
Please let me know what you think, and preferably I'll do it in another PR.
/cc @Wyverald @fmeum
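The rewriting idea can be sketched roughly as follows. This is not the code in the experimental commit; it is a minimal illustration in plain java.nio.file with hypothetical class and method names, and it assumes all paths passed in are absolute and that the `_bazel-external` link under the vendor dir is created separately.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class VendorSymlinkRewriteSketch {
  // Name taken from the listing above; treat it as illustrative.
  static final String EXTERNAL_LINK_NAME = "_bazel-external";

  /** Rewrites links in repoDir that point below externalRoot to go through vendorDir/_bazel-external. */
  static void rewriteSymlinks(Path vendorDir, Path repoDir, Path externalRoot) throws IOException {
    Path bridge = vendorDir.resolve(EXTERNAL_LINK_NAME);
    List<Path> links;
    // Files.walk does not follow symlinks unless asked to, so links are reported as links.
    try (Stream<Path> walk = Files.walk(repoDir)) {
      links = walk.filter(Files::isSymbolicLink).collect(Collectors.toList());
    }
    for (Path link : links) {
      Path target = Files.readSymbolicLink(link);
      // Leave relative links and absolute links outside the external root (e.g. /tmp/foo) alone.
      if (!target.isAbsolute() || !target.startsWith(externalRoot)) {
        continue;
      }
      // <vendor>/_bazel-external/<path below the external root>
      Path viaBridge = bridge.resolve(externalRoot.relativize(target));
      // Express that path relative to the directory containing the link.
      Path relative = link.getParent().relativize(viaBridge);
      Files.delete(link);
      Files.createSymbolicLink(link, relative);
    }
  }
}
```

With this shape, only the single `_bazel-external` link stores a machine-specific path; the links inside the vendored repos stay valid as long as that one link is refreshed.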
> While this is the current behavior, wouldn't we have to change it so that symlinks in vendored repos do not contain absolute paths? I think there was another issue about this filed recently.

#22303, probably
> To deal with potential output base, maybe we could
> - create a symlink pointing to the external repo root under the vendor dir
> - Rewrite all symlinks pointing some path under external repo root to a relative path to the symlink created in 1.

this is quite clever! but what will version-control systems do with this special symlink? Usually people put bazel-* symlinks in the workspace root in .gitignore, so presumably this new special symlink will also need to be ignored? And the symlink is generated on demand if it's not there, etc.? (I agree that this should be done in a separate PR)
Either way, some sort of symlink rewriting will need to happen, and we'll probably need to do something similar for the true repo cache.
> so presumably this new special symlink will also need to be ignored? And the symlink is generated on demand if it's not there, etc.? (I agree that this should be done in a separate PR)

Yes, I also think it should be gitignored since it's machine specific. And to keep the code simple, we can just always re-create the symlink, since that's quite cheap.
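A sketch of the "always re-create" approach, again in plain java.nio.file with hypothetical names, assuming the link is called `_bazel-external` as in the listing earlier in the thread:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class ExternalLinkSketch {
  /** Unconditionally replaces vendorDir/_bazel-external with a link to the current external root. */
  static Path recreate(Path vendorDir, Path externalRoot) throws IOException {
    Path link = vendorDir.resolve("_bazel-external");
    // Deletes the link itself (not its target); a no-op if the link is absent.
    Files.deleteIfExists(link);
    return Files.createSymbolicLink(link, externalRoot);
  }
}
```

Skipping the "is the existing link still correct?" check avoids a separate code path for a stale link left over from a previous output base.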
mostly LGTM, just nits!
src/main/java/com/google/devtools/build/lib/bazel/bzlmod/VendorManager.java
This drastically improves the speed of vendoring external repositories. Related: bazelbuild#19563 Closes bazelbuild#22668. PiperOrigin-RevId: 642338030 Change-Id: Idcba16c491711cf8fa6637d1e9c42cfc65e87599