Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

yarn_install munges binary data #680

Closed
skinner opened this issue Apr 9, 2019 · 3 comments
Closed

yarn_install munges binary data #680

skinner opened this issue Apr 9, 2019 · 3 comments
Assignees
Labels

Comments

@skinner
Copy link
Contributor

skinner commented Apr 9, 2019

🐞 bug report

Affected Rule

yarn_install

Is this a regression?

Yes, it worked in 0.16.2

Description

We've been using .tar.gz files for some dependencies and putting them in yarn_install's data. To copy those dependencies, yarn_install runs those inputs through repository_ctx.template, with no substitutions:
https://github.com/bazelbuild/rules_nodejs/blob/843d8e48f763d4a9efce49d260c1e65529985586/internal/npm_install/npm_install.bzl#L127

But that ends up munging those inputs. For example, here's a snippet from the .tar.gz in this reproduction:

$ od -t xC minimal-package-1.0.0.tgz | head -2
0000000    1f  8b  08  00  00  00  00  00  00  13  ed  58  6d  6f  9b  48
0000020    10  ee  67  7e  c5  c8  fd  10  5b  22  f8  2d  71  a4  56  3d

which gets munged to

$ od -t xC /private/var/tmp/_bazel_mskinner/53ab97bee3c2fb0fa8236c27a5268864/external/binary_tgz_js_deps/minimal-package-1.0.0.tgz | head -2
0000000    1f  ef  bf  bd  08  00  00  00  00  00  00  13  ef  bf  bd  58
0000020    6d  6f  ef  bf  bd  48  10  ef  bf  bd  67  7e  ef  bf  bd  ef

where e.g. the second byte 8b in the input becomes ef bf bd (the utf-8 encoding of the unicode replacement character) in the output.

🔬 Minimal Reproduction

bazel run :example in https://github.com/skinner/template_test_repo

🔥 Exception or Error

error Extracting tar content of "/private/var/tmp/_bazel_mskinner/53ab97bee3c2fb0fa8236c27a5268864/external/binary_tgz_js_deps/minimal-package-1.0.0.tgz" failed, the file appears to be corrupt: "Invalid tar header. Maybe the tar is corrupted or it needs to be gunzipped?"

🌍 Your Environment

Operating System:

macOS Mojave and Ubuntu Xenial

Output of bazel version:

Build label: 0.23.2
Build target: bazel-out/darwin-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Mon Mar 11 16:50:03 2019 (1552323003)
Build timestamp: 1552323003
Build timestamp as int: 1552323003

Rules version (SHA):

0.27.8 release

Workaround

We've worked around it locally by copying with mkdir and cp:
https://github.com/skinner/rules_nodejs/tree/npm-install-copy-data-deps-0.27.8

that works for us (no windows users) but I don't know if y'all would accept that as a pull request.

cc @siddharthab

@alexeagle
Copy link
Collaborator

awesome report, thanks for tracking it down. we had similar bugs where we copied files in nodejs using fs.readFileSync/writeFileSync. seems hard to consistently have test cases with binary files :(

@gregmagolan
Copy link
Collaborator

Thanks for the report.

#704 will resolve this as the data attribute will no longer move any files around. Its just waiting on a feature to land in Bazel. Hopefully in the next Bazel release. If that is delayed then we should find a way to fix this in the meantime.

@gregmagolan
Copy link
Collaborator

#704 landed so data is no longer required unless you set symlink_node_modules to False

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants