Cannot successfully extract go_sdk because of unicode filename #2771

Closed
swsnider opened this issue Dec 23, 2020 · 15 comments · Fixed by #2836

@swsnider
Contributor

What version of rules_go are you using?

0.25.0

What version of gazelle are you using?

0.22.2

What version of Bazel are you using?

3.7.1

Does this issue reproduce with the latest releases of all the above?

Yes

What operating system and processor architecture are you using?

docker container (via dazel) on macOS Big Sur (uname -a == Linux 45ddc4b6c7ee 4.19.121-linuxkit #1 SMP Tue Dec 1 17:50:32 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux)

Any other potentially useful information about your toolchain?

We use a custom C++ toolchain to build against musl, but that doesn't seem relevant to this issue. We're on Go 1.15.6, with this go_download_sdk invocation:

go_download_sdk(
    name = "go_sdk",
    #    version = "1.15.6",
    sdks = {
        "darwin_amd64": ("go_sdk-darwin.tar.gz", "940a73b45993a3bae5792cf324140dded34af97c548af4864d22fd6d49f3bd9f"),
        "linux_amd64": ("go_sdk-linux.tar.gz", "3918e6cc85e7eaaa6f859f1bdbaac772e7a825b0eb423c63d3ae68b21f84b844"),
    },
    urls = ["https://doesnotexistipromise.local/{}"],
)

What did you do?

Ran dazel test on a go target in the repo

What did you expect to see?

Some test output

What did you see instead?

Extracting Bazel installation...
Starting local Bazel server and connecting to it...
INFO: Repository go_sdk instantiated at:
  /Volumes/Projects/repo/WORKSPACE:62:16: in <toplevel>
  /Users/swsnider/.cache/bazel/_bazel_swsnider/external/io_bazel_rules_go/go/private/sdk.bzl:129:21: in go_download_sdk
Repository rule _go_download_sdk defined at:
  /Users/swsnider/.cache/bazel/_bazel_swsnider/external/io_bazel_rules_go/go/private/sdk.bzl:116:35: in <toplevel>
ERROR: An error occurred during the fetch of repository 'go_sdk':
   Traceback (most recent call last):
	File "/Users/swsnider/.cache/bazel/_bazel_swsnider/external/io_bazel_rules_go/go/private/sdk.bzl", line 100, column 16, in _go_download_sdk_impl
		_remote_sdk(ctx, [url.format(filename) for url in ctx.attr.urls], ctx.attr.strip_prefix, sha256)
	File "/Users/swsnider/.cache/bazel/_bazel_swsnider/external/io_bazel_rules_go/go/private/sdk.bzl", line 180, column 29, in _remote_sdk
		ctx.download_and_extract(
Error in download_and_extract: java.io.IOException: Error extracting /Users/swsnider/.cache/bazel/_bazel_swsnider/external/go_sdk/temp18079659004892177486/go_sdk-linux.tar.gz to /Users/swsnider/.cache/bazel/_bazel_swsnider/external/go_sdk/temp18079659004892177486: /Users/swsnider/.cache/bazel/_bazel_swsnider/external/go_sdk/test/fixedbugs/issue27836.dir/?foo.go (Input/output error)
ERROR: Analysis of target '[REDACTED]' failed; build aborted: java.io.IOException: Error extracting /Users/swsnider/.cache/bazel/_bazel_swsnider/external/go_sdk/temp18079659004892177486/go_sdk-linux.tar.gz to /Users/swsnider/.cache/bazel/_bazel_swsnider/external/go_sdk/temp18079659004892177486: /Users/swsnider/.cache/bazel/_bazel_swsnider/external/go_sdk/test/fixedbugs/issue27836.dir/?foo.go (Input/output error)
INFO: Elapsed time: 267.761s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (33 packages loaded, 19 targets configured)
FAILED: Build did NOT complete successfully (33 packages loaded, 19 targets configured)
    Fetching /Users/swsnider/.cache/bazel/_bazel_swsnider/external/go_sdk; Extracting /Users/swsnider/.cache/bazel/_bazel_swsnider/external/go_sdk/temp18079659004892177486/go_sdk-linux.tar.gz 223s
@swsnider
Contributor Author

This appears to be due to #2729. I'm trying a custom patch to see if that fixes it.

@swsnider
Contributor Author

Applying a partial revert to 0.25.0 using the following patch gets me past this error (I skipped adding the 'if tar.gz' stuff because I only ever use tar.gz).

diff --git a/go/private/sdk.bzl b/go/private/sdk.bzl
index c148c9e2..482bebbd 100644
--- a/go/private/sdk.bzl
+++ b/go/private/sdk.bzl
@@ -177,11 +177,15 @@ def _remote_sdk(ctx, urls, strip_prefix, sha256):
     if len(urls) == 0:
         fail("no urls specified")
     ctx.report_progress("Downloading and extracting Go toolchain")
-    ctx.download_and_extract(
+    ctx.download(
         url = urls,
-        stripPrefix = strip_prefix,
         sha256 = sha256,
+        output = "go_sdk.tar.gz",
     )
+    res = ctx.execute(["tar", "-xf", "go_sdk.tar.gz", "--strip-components=1"])
+    if res.return_code:
+        fail("error extracting Go SDK:\n" + res.stdout + res.stderr)
+    ctx.execute(["rm", "go_sdk.tar.gz"])
 
 def _local_sdk(ctx, path):
     for entry in ["src", "pkg", "bin"]:
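
Outside Bazel, the same download-then-system-tar approach can be tried by hand. The following is a minimal Python sketch, not part of rules_go: the function name `extract_with_system_tar` is hypothetical, and it assumes a `tar` binary on the PATH that accepts `--strip-components`, as GNU tar and bsdtar do.

```python
import subprocess
from pathlib import Path


def extract_with_system_tar(archive, dest):
    """Extract `archive` into `dest` with the system tar, dropping the top-level directory."""
    archive = Path(archive).resolve()  # absolute path, because tar runs with cwd=dest
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    res = subprocess.run(
        ["tar", "-xf", str(archive), "--strip-components=1"],
        cwd=dest,
        capture_output=True,
        text=True,
    )
    # Mirror the patch: fail loudly and include tar's own output.
    if res.returncode != 0:
        raise RuntimeError("error extracting Go SDK:\n" + res.stdout + res.stderr)
```

If this succeeds on a machine where Bazel's extraction fails, that points at Bazel's archive handling rather than at the archive itself.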

@swsnider
Contributor Author

Interestingly, this somehow doesn't replicate on macOS for me.

@jayconrod
Contributor

Thanks for reporting. That code was meant to work around bazelbuild/bazel#7055, which has been fixed for a long time (in all supported versions of Bazel), so I didn't think it was necessary anymore. Perhaps it's a new regression in Bazel or something new in the Go archive.

I'm not at all familiar with dazel. Can you reproduce this with Bazel on its own, either on macOS or within a Docker container?

@swsnider
Contributor Author

I cannot on macOS. I'm retrying on my own in Docker (I'll also try on a real machine).

@swsnider
Contributor Author

This replicates using the following Dockerfile in dazel only, not manually:

FROM centos:7

RUN yum install -y \
    gcc  \
    gcc-c++  \
    make  \
    openldap-devel \
    openssl  \
    openssl-devel \
    libevent-devel \
    yum-utils \
    rpm-build \
    expect \
    tar \
    curl \
    rpm-sign \
    curl-devel \
    expat-devel \
    gettext-devel \
    zlib-devel \
    perl-ExtUtils-MakeMaker \
    which \
    python3 \
    util-linux \
    binutils \
    fakeroot \
    dnf \
    dnf-plugins-core \
    python2-dnf-plugin-versionlock \
    strace \
    autoreconf \
    automake \
    autoconf \
    libtool \
    gperf \
    device-mapper-devel \
    jq \
    git

@swsnider
Contributor Author

Any hints on stuff to look at to see what might be causing this in dazel? I assume maybe a locale flag or something?

@jayconrod
Contributor

Sorry for slow response, still catching up after the holidays.

So if this doesn't reproduce with standalone Bazel or Bazel in Docker, it sounds like it's either a bug in Dazel or perhaps the bug in Bazel was not completely fixed. Maybe report the issue upstream in one of those projects? I'm not at all familiar with Dazel, but if I had to guess, I'd look for something related to proxying the macOS file system, which is case-insensitive and does unicode normalization.
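
To illustrate what Unicode normalization means for a file name, here is a small Python sketch. The name `Äfoo.go` is only an illustrative stand-in (the log above shows the real name as `?foo.go`). The same visible name has two different byte encodings, so a file system that normalizes names can hand back different bytes than were written.

```python
import unicodedata

name = "\u00c4foo.go"  # illustrative file name with one non-ASCII letter

nfc = unicodedata.normalize("NFC", name)  # precomposed form: U+00C4
nfd = unicodedata.normalize("NFD", name)  # decomposed form: "A" followed by U+0308

print(nfc == nfd)           # False
print(nfc.encode("utf-8"))  # b'\xc3\x84foo.go'
print(nfd.encode("utf-8"))  # b'A\xcc\x88foo.go'
```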

@mrene

mrene commented Jan 6, 2021

I can replicate this without Docker (Ubuntu 20.04 with ZFS) with Bazel 3.7.2. The file system has the utf8only option set to on (the default value).

Upgrading to 0.25.1 yields:

Error in download_and_extract: java.io.IOException: 
Error extracting /<snip>/external/go_sdk/temp12515746824519403713/go1.14.11.linux-amd64.tar.gz to /<snip>/external/go_sdk/temp12515746824519403713: 
/<snip>/external/go_sdk/test/fixedbugs/issue27836.dir/foo.go (Invalid or incomplete multibyte or wide character)

tar xvf has no problem extracting the file, though.
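
One way to find out which archive entries might be involved is to list the members whose names contain non-ASCII bytes. The sketch below uses only Python's standard `tarfile` module; the helper name `non_ascii_members` is hypothetical, and nothing here is taken from Bazel or rules_go.

```python
import tarfile


def non_ascii_members(fileobj):
    """Return the raw byte names of tar entries that contain non-ASCII bytes."""
    found = []
    with tarfile.open(
        fileobj=fileobj,
        mode="r:*",  # auto-detect compression; needs a seekable file object
        encoding="utf-8",
        errors="surrogateescape",
    ) as tf:
        for member in tf:
            # surrogateescape lets us recover the exact bytes stored in the archive,
            # even if they are not valid UTF-8.
            raw = member.name.encode("utf-8", "surrogateescape")
            if any(b > 0x7F for b in raw):
                found.append(raw)
    return found
```

A possible use is `with open("go_sdk-linux.tar.gz", "rb") as f: print(non_ascii_members(f))`, with the file name taken from the report above.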

@titanous

titanous commented Feb 1, 2021

I'm also experiencing this issue with Bazel 4.0.0 and Docker using a zfs bind mount with utf8only on.

@jvolkman

jvolkman commented Feb 3, 2021

Yep, I also have this issue with Bazel 4.0.0 and zfs with utf8only enabled, which is the default (apparently) when choosing the zfs option during Ubuntu installation.

@jvolkman

I created bazelbuild/bazel#12986 explaining the root cause. This may break rules_go on macOS at some point as well if it's not fixed upstream.

@jvolkman

@jayconrod would you consider restoring the previous workaround until bazelbuild/bazel#12986 is resolved? Looks like that's slated for maybe Q2 of this year but until then I'm not aware of a workaround on affected machines without changes to rules_go.

@luna-duclos
Contributor

Mirroring the above request: some of our engineers are hitting this issue as well.

jayconrod added this to the v0.26 milestone Feb 17, 2021
@jayconrod
Contributor

Agreed it makes sense to restore this workaround. This should be backported to release-0.24 and release-0.25.

jayconrod pushed a commit to jayconrod/rules_go that referenced this issue Mar 5, 2021
Use 'tar' installed on the system to extract .tar.gz archives instead
of Bazel's download_and_extract. Go has at least one test with an
invalid unicode file name, and on some configurations (macOS + Docker
+ some particular file system binding?), this causes an error.

Fixes bazel-contrib#2771
jayconrod pushed a commit that referenced this issue Mar 5, 2021
Use 'tar' installed on the system to extract .tar.gz archives instead
of Bazel's download_and_extract. Go has at least one test with an
invalid unicode file name, and on some configurations (macOS + Docker
+ some particular file system binding?), this causes an error.

Fixes #2771
QIvan added a commit to QIvan/envoy that referenced this issue Apr 20, 2021
In the recent version of rules_go, the issue bazel-contrib/rules_go#2771 was fixed.
It should address the Bazel build issue on some Linux or macOS systems (bazelbuild/bazel#12986).

Signed-off-by: Ivan Zemlyanskiy <[email protected]>
lizan pushed a commit to envoyproxy/envoy that referenced this issue Apr 21, 2021
In the recent version of rules_go, the issue bazel-contrib/rules_go#2771 was fixed.
It should address the Bazel build issue on some Linux or macOS systems (bazelbuild/bazel#12986).

Signed-off-by: izemlyanskiy <[email protected]>
gokulnair pushed a commit to gokulnair/envoy that referenced this issue May 6, 2021
In the recent version of rules_go, the issue bazel-contrib/rules_go#2771 was fixed.
It should address the Bazel build issue on some Linux or macOS systems (bazelbuild/bazel#12986).

Signed-off-by: izemlyanskiy <[email protected]>
Signed-off-by: Gokul Nair <[email protected]>
timothytrippel added a commit to lowRISC/rules_go that referenced this issue Sep 26, 2022
As we learned when attempting to use `rules_rust` in an airgapped
environment (see
lowRISC/rules_rust@3ea1eda),
the `ctx.download` action does not seem to cache downloads in the
repository cache. As such, files will not be available on an airgapped
system simply by prefetching/populating the repository cache. This
patches `rules_go` to only download go toolchains with the
`ctx.download_and_extract` action until Issue bazel-contrib#2771
(bazel-contrib#2771) is properly
resolved.

Signed-off-by: Timothy Trippel <[email protected]>
jayconrod added a commit to jayconrod/rules_go that referenced this issue May 17, 2023
The Go distribution contains at least one test file with an invalid
Unicode name. Bazel cannot extract the distribution archive on some
operating systems and file systems; Darwin with APFS at least is affected.

For .tar.gz files, we work around the failure in ctx.download_and_extract
by using the native system tar.

This PR applies a similar workaround for .zip files on non-Windows OSs.
Windows itself is not affected (ctx.download_and_extract works),
so the workaround is not applied there. This is only really needed
when you have a Darwin host and a Windows executor (don't ask).

For bazel-contrib#2771
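
The decision described in this commit message can be sketched as a small predicate. This is a Python paraphrase of the prose only, not the Starlark code in rules_go; the function name `needs_extraction_workaround` and the `host_os` strings are hypothetical.

```python
def needs_extraction_workaround(filename, host_os):
    """Return True if ctx.download_and_extract should be bypassed for this archive."""
    if filename.endswith(".tar.gz"):
        # .tar.gz archives are extracted with the native system tar.
        return True
    if filename.endswith(".zip"):
        # Windows hosts extract zips correctly, so the workaround is skipped there.
        return host_os != "windows"
    return False
```

Under these assumptions, a Darwin host fetching a Windows `.zip` SDK takes the workaround path, which is the case the message singles out.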
fmeum pushed a commit that referenced this issue Jun 2, 2023
…Ss (#3563)

* go_download_sdk: apply extraction workaround to zips on non-windows OSs

The Go distribution contains at least one test file with an invalid
Unicode name. Bazel cannot extract the distribution archive on some
operating systems and file systems; Darwin with APFS at least is affected.

For .tar.gz files, we work around the failure in ctx.download_and_extract
by using the native system tar.

This PR applies a similar workaround for .zip files on non-Windows OSs.
Windows itself is not affected (ctx.download_and_extract works),
so the workaround is not applied there. This is only really needed
when you have a Darwin host and a Windows executor (don't ask).

For #2771

* use rename_files; rewrite comment

* version check
tingilee pushed a commit to tingilee/rules_go that referenced this issue Jul 19, 2023
…Ss (bazel-contrib#3563)

* go_download_sdk: apply extraction workaround to zips on non-windows OSs

The Go distribution contains at least one test file with an invalid
Unicode name. Bazel cannot extract the distribution archive on some
operating systems and file systems; Darwin with APFS at least is affected.

For .tar.gz files, we work around the failure in ctx.download_and_extract
by using the native system tar.

This PR applies a similar workaround for .zip files on non-Windows OSs.
Windows itself is not affected (ctx.download_and_extract works),
so the workaround is not applied there. This is only really needed
when you have a Darwin host and a Windows executor (don't ask).

For bazel-contrib#2771

* use rename_files; rewrite comment

* version check
copybara-service bot pushed a commit to bazelbuild/bazel that referenced this issue Oct 9, 2023
When creating a `PathFragment` from a ZIP or TAR entry file name, the raw bytes of the name are now wrapped into a Latin-1 encoded String, which is how Bazel internally represents file paths.

Previously, ZIP entries as well as TAR entries with PAX headers would result in ordinary decoded Java strings, resulting in corrupted file names when passed to Bazel's file system operations.

Fixes #12986

Fixes bazel-contrib/rules_go#2771

Closes #18448.

PiperOrigin-RevId: 571857847
Change-Id: Ie578724e75ddbefbe05255601b0afab706835f89
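
The difference between wrapping raw bytes in a Latin-1 string and decoding them normally can be modelled in a few lines. This is a Python sketch of the idea only (Bazel itself is written in Java); the helper names are hypothetical and `Äfoo.go` is an illustrative name, assumed here to be stored as UTF-8 in the archive.

```python
def latin1_wrap(raw):
    """Wrap bytes in a string with one character per byte (lossless)."""
    return raw.decode("latin-1")


def misdecoded_bytes(raw):
    """Decode the name as UTF-8, then write it out as if it were a Latin-1 wrapper."""
    return raw.decode("utf-8").encode("latin-1")


raw = "\u00c4foo.go".encode("utf-8")  # b'\xc3\x84foo.go'

# Wrapped names round-trip to exactly the bytes stored in the archive.
print(latin1_wrap(raw).encode("latin-1") == raw)  # True

# Ordinarily decoded names collapse to a different byte sequence.
print(misdecoded_bytes(raw))  # b'\xc4foo.go'
```

The second result is not valid UTF-8, which would be consistent with the errors reported earlier in this thread on file systems that only accept UTF-8 names.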
fmeum added a commit to fmeum/bazel that referenced this issue Oct 9, 2023
When creating a `PathFragment` from a ZIP or TAR entry file name, the raw bytes of the name are now wrapped into a Latin-1 encoded String, which is how Bazel internally represents file paths.

Previously, ZIP entries as well as TAR entries with PAX headers would result in ordinary decoded Java strings, resulting in corrupted file names when passed to Bazel's file system operations.

Fixes bazelbuild#12986

Fixes bazel-contrib/rules_go#2771

Closes bazelbuild#18448.

PiperOrigin-RevId: 571857847
Change-Id: Ie578724e75ddbefbe05255601b0afab706835f89
meteorcloudy pushed a commit to bazelbuild/bazel that referenced this issue Oct 9, 2023
…mes (#19765)

When creating a `PathFragment` from a ZIP or TAR entry file name, the
raw bytes of the name are now wrapped into a Latin-1 encoded String,
which is how Bazel internally represents file paths.

Previously, ZIP entries as well as TAR entries with PAX headers would
result in ordinary decoded Java strings, resulting in corrupted file
names when passed to Bazel's file system operations.

Fixes #12986

Fixes bazel-contrib/rules_go#2771

Closes #18448.

PiperOrigin-RevId: 571857847
Change-Id: Ie578724e75ddbefbe05255601b0afab706835f89

Fixes #19671