Bazel fails to unzip archives containing files with non-latin characters in their name #11670

Closed
rickwebiii opened this issue Jun 29, 2020 · 4 comments
Assignees
Labels
more data needed P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Bazel General Bazel product/strategy issues type: bug
Milestone

Comments

@rickwebiii

Hello,
I'm porting our build over to Bazel and am running into a problem that only reproduces on Mac. Namely, I created an http_archive repo rule that points to a zip file containing an é in a filename. This results in the following error:

/Users/***/maestro/platform-monolith/BUILD.bazel:3:10: //platform-monolith:artifacts depends on @monolith_osx//:artifacts in repository @monolith_osx which failed to fetch. no such package '@monolith_osx//': java.io.IOException: Error extracting /private/var/tmp/_bazel_rweber/47c212f6310157f06bb1214862485fca/external/monolith_osx/desktop_linux.zip to /private/var/tmp/_bazel_rweber/47c212f6310157f06bb1214862485fca/external/monolith_osx: /private/var/tmp/_bazel_rweber/47c212f6310157f06bb1214862485fca/external/monolith_osx/tableau-1.3/install/defaults/Datasources/fr_FR-EU/Exemple - Hypermarch?.tds (Illegal byte sequence)

The offending filename contains an é (an e with an acute accent).

I did a little more investigation and found I can repro this problem by zipping an archive containing just the one offending file and writing a repository_rule that simply tries to call extract on that file:

rules.bzl:

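def _http_archive_workaround(ctx):
  # Extract the local archive directly into the repository root.
  ctx.extract(
    archive = ctx.attr.file,
    output = "./",
  )

http_archive_workaround = repository_rule(
  implementation = _http_archive_workaround,
  attrs = {
    # Absolute path to the zip file to extract.
    "file": attr.string(),
  },
)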

WORKSPACE:

load("//:bazel/rules/mac/rules.bzl", "http_archive_workaround")

http_archive_workaround(
  name = "monolith_osx",
  file = "/Users/***/maestro/repro.zip",
)
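
For reference, an archive like the repro.zip above could be generated with a short script. This is a minimal sketch in Python 3, not the exact steps used for this report: the entry name is borrowed from the error message, the é is written in precomposed form (U+00E9) as an assumption, and the helper names are made up for illustration. Whether an archive built this way triggers the failure has not been verified.

```python
import io
import zipfile

# Assumption: entry name modeled on the file in the error message,
# with é as the single precomposed code point U+00E9.
ENTRY_NAME = "Exemple - Hypermarch\u00e9.tds"


def build_repro_zip_bytes():
    """Return the bytes of a zip holding one empty file with an é in its name."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Python's zipfile stores non-ASCII names as UTF-8.
        archive.writestr(ENTRY_NAME, b"")
    return buffer.getvalue()


def entry_names(zip_bytes):
    """List the entry names stored in an in-memory zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.namelist()
```

Writing the result of build_repro_zip_bytes() to a file named repro.zip gives an archive that the rule above can be pointed at.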

I'm using Bazel 3.3.0, and this doesn't repro on Windows or Linux, only on Mac.

As a workaround, I created a repository rule that downloads the zip file and calls the built-in unzip program to extract it, then only specifies filegroups that don't contain the offending files (I don't actually need them, but they're stopping the archive extraction).

@rickwebiii rickwebiii changed the title Bazel fails to unzip archives containing non-latin characters Bazel fails to unzip archives containing files with non-latin characters in their name Jun 29, 2020
@aiuto
Contributor

aiuto commented Jun 30, 2020

Do you have a small github repo which contains a self contained repro?

There could be any of several things going on. The Mac difference might have to do with macOS's strange choice of how it encodes file names. (Nice summary here: https://superuser.com/questions/999232/unicode-filenames-in-windows-vs-mac-os-x).
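
To make the encoding point concrete: the same visible name can be stored either with a precomposed é (NFC) or as a plain e followed by a combining accent (NFD), and the two forms have different bytes. The snippet below is only an illustrative sketch in Python 3 of that difference. It does not establish that normalization is the cause of this failure, and the sample name and the utf8_hex helper are chosen here for illustration.

```python
import unicodedata

# Sample name with a precomposed é (U+00E9); chosen for illustration.
name = "Hypermarch\u00e9"

nfc = unicodedata.normalize("NFC", name)  # é as one code point
nfd = unicodedata.normalize("NFD", name)  # e + combining acute (U+0301)


def utf8_hex(text):
    """Return the UTF-8 bytes of text as space-separated hex pairs."""
    return text.encode("utf-8").hex(" ")


# Both strings render identically but compare unequal and differ in length.
print(len(nfc), utf8_hex(nfc))
print(len(nfd), utf8_hex(nfd))
print(nfc == nfd)
```

A tool that compares or re-encodes file names byte by byte will treat these two forms as different names unless it normalizes them first.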

@gregestren gregestren added z-team-Apple Deprecated. Send to rules_apple, or label team-Rules-CPP + platform:apple untriaged type: bug labels Jul 1, 2020
@jmmv jmmv added P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Bazel General Bazel product/strategy issues and removed z-team-Apple Deprecated. Send to rules_apple, or label team-Rules-CPP + platform:apple untriaged labels Jul 9, 2020
@aiuto aiuto removed their assignment Oct 28, 2020
@aiuto aiuto added this to the unicode milestone Apr 20, 2021
@y3llowcake

Not a particularly small repro, but we just ran into this issue extracting a recent golang release:

If the definition of 'golang_sdk_osx-x86_64' was updated, verify that the hashes were also updated.
ERROR: An error occurred during the fetch of repository 'golang_sdk_osx-x86_64':
  Traceback (most recent call last):
	File "/private/var/tmp/_bazel_snagaraj/bf4a8035aa09cf910ceaead73516862a/external/bazel_tools/tools/build_defs/repo/http.bzl", line 111, column 45, in _http_archive_impl
		download_info = ctx.download_and_extract(
Error in download_and_extract: java.io.IOException: Error extracting /private/var/tmp/_bazel_snagaraj/bf4a8035aa09cf910ceaead73516862a/external/golang_sdk_osx-x86_64/temp17079543164555012796/go1.16.12.darwin-amd64.tar.gz to /private/var/tmp/_bazel_snagaraj/bf4a8035aa09cf910ceaead73516862a/external/golang_sdk_osx-x86_64/temp17079543164555012796: /private/var/tmp/_bazel_snagaraj/bf4a8035aa09cf910ceaead73516862a/external/golang_sdk_osx-x86_64/test/fixedbugs/issue27836.dir/?foo.go (Illegal byte sequence)
ERROR: Error fetching repository: Traceback (most recent call last):
	File "/private/var/tmp/_bazel_snagaraj/bf4a8035aa09cf910ceaead73516862a/external/bazel_tools/tools/build_defs/repo/http.bzl", line 111, column 45, in _http_archive_impl
		download_info = ctx.download_and_extract(
Error in download_and_extract: java.io.IOException: Error extracting /private/var/tmp/_bazel_snagaraj/bf4a8035aa09cf910ceaead73516862a/external/golang_sdk_osx-x86_64/temp17079543164555012796/go1.16.12.darwin-amd64.tar.gz to /private/var/tmp/_bazel_snagaraj/bf4a8035aa09cf910ceaead73516862a/external/golang_sdk_osx-x86_64/temp17079543164555012796: /private/var/tmp/_bazel_snagaraj/bf4a8035aa09cf910ceaead73516862a/external/golang_sdk_osx-x86_64/test/fixedbugs/issue27836.dir/?foo.go (Illegal byte sequence)
ERROR: /Users/snagaraj/dev/goslackgo/BUILD.bazel:8:7: //:go depends on @golang_sdk_osx-x86_64//:toolchain_info in repository @golang_sdk_osx-x86_64 which failed to fetch. no such package '@golang_sdk_osx-x86_64//': java.io.IOException: Error extracting /private/var/tmp/_bazel_snagaraj/bf4a8035aa09cf910ceaead73516862a/external/golang_sdk_osx-x86_64/temp17079543164555012796/go1.16.12.darwin-amd64.tar.gz to /private/var/tmp/_bazel_snagaraj/bf4a8035aa09cf910ceaead73516862a/external/golang_sdk_osx-x86_64/temp17079543164555012796: /private/var/tmp/_bazel_snagaraj/bf4a8035aa09cf910ceaead73516862a/external/golang_sdk_osx-x86_64/test/fixedbugs/issue27836.dir/?foo.go (Illegal byte sequence)

@y3llowcake

Seems like this may be a dupe of #12986?

@aiuto
Contributor

aiuto commented Dec 15, 2021

Indeed, it is a dup. Closing.

@aiuto aiuto closed this as completed Dec 15, 2021