Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(builtin): split node_loader into node_patcher #1003

Closed
wants to merge 3 commits into from

Conversation

Toxicable
Copy link

@Toxicable Toxicable commented Aug 13, 2019

PR Checklist

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)

PR Type

  • Feature (please, look at the "Scope of the project" section in the README.md file)

What is the current behavior?

Unable to use bazels module resolution under a child_process or worker_threads

Issue Number: #312 #227

What is the new behavior?

This change splits up node_launcher into node_loader.js + node_patcher.js
node_loader.js - job is to run any bootstrap modules, setup source map support and finally call the users main entry point.
node_patcher - job is to only patch the module resolution logic of require so it can resolve modules under bazels runfile layout.
Part of this change also adds logic so that any worker (child process or thread) also has their require patched correctly so that they can require modules just the same as in the main thread.
Meaning we'll modify node_launcher.sh to call node with --require=${target}node_patcher.js.
In my testing, this change alone will enable works for child_process on nodejs 10.x and greater, however, we will need nodejs 12.x to enable this fix for worker_threads.
With this in mind, we'll make a test for either case but allow the worker_threads one to fail untill 12.x is LTS and we can enable it as the default.

Does this PR introduce a breaking change?

  • Yes
  • No
    Even those i'd consider it an "internal" api, if anyone is relying on the current behaviour of node_loader then this change will break it.

Other information

Copy link
Collaborator

@soldair soldair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still getting up to speed here but these are my notes on this change. Please note that some of these comments are more about the project and maybe out of scope for this pr.

the upgrade to node 12 LGTM except we need to pick up 12.8.1 which has some critical security fixes. If there are any existing docs on how we've decided to manage the nodejs version for users that would be useful. If there is a default we should discuss the merits of pinning it to a semver range so users by default get patch level fixes.

I'm not sure why we refactored out patcher from loader aside from the good reason that i could be and it's cleaner.

Some background in the issue description or link to a related issue would be really useful.

The effect is that the entry point gets a lot more boostrap modules loaded than child processes or worker threads. For things like source-map-support this means inconsistent debugging experience for those that compiled from typescript etc but maybe we shouldn't be helping them with that in the first place.

internal/node/node_patcher.js Outdated Show resolved Hide resolved
internal/node/test/BUILD.bazel Outdated Show resolved Hide resolved
internal/node/node_loader.js Outdated Show resolved Hide resolved
// we patch require here
module.constructor._load(patchModulePath)
// then we also patch process.execArgs sothat any child process also does this patch
process.execArgv = process.execArgv.concat([`--require=${patchModulePath}`]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will nodejs will already have a --require=...loader.js arg in execArgv?
So when a child process or worker executes it again i would expect this logic to run node with node --require=loader.js --require=patcher.js and add another --require=patcher.js
the extra --require=...patcher.js are a no op because loader requires it but the failure mode here is using up the processes argv space unexpectedly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this isn't right at all, I was just testing what would happen if you modify the execArgv from inside the program - turns out it's ignored by node in the child process (maybe it should be marked as readonly then?)
Either way, we'll move this into the node_launcher.sh script

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

migrated to the --require from launcher implementation, so we don't need this anymore

@Toxicable
Copy link
Author

@soldair Thanks for the review, i've updated the description with a better explaination of the goals here. Which should answer some of the questions you had around here, will try get some time tonight to make some changes so it reflects the description :P

@Toxicable Toxicable marked this pull request as ready for review August 23, 2019 23:03
@Toxicable
Copy link
Author

The only failure now is because of this logic

if [[ $patcher = /* ]]
then
  NODE_OPTIONS+=( "--require=$patcher" )
else
  NODE_OPTIONS+=( "--require=./$patcher" )
fi

Is there a proper way to check for absolute/relative or to normalize this path in node_laucher?

@soldair soldair self-assigned this Aug 27, 2019
@soldair
Copy link
Collaborator

soldair commented Aug 27, 2019

support for spawning node.

This is a really long answer to the question about the patcher file path handling.

I Think we need to add support for a few other ways nodejs can be spawned.
for example.

As a process looked up in the PATH

child_process.spawnSync('node',['-e','require("lib")'])

As a process invoked indirectly from the shell (very common with npm scripts)

child_process.execSync('node -e \'require("lib")\'')

The thing I think is missing to make these examples work is node needs to be resolved to a script from the PATH. We add directory containing an executable script called exactly node and (node.bat for windows) that has the absolute path to the nodejs binary and the file we intend to --require as our loader.

I think that it should be pretty much exactly this.

#!/bin/sh
export RULES_NODE_PROXY_SCRIPT=$0
exec $(templated_absolute_path_to_real_node)/bin/node -r $(templated absolute path to loader) "$@"

exec to drop the sh process, the absolute path to nodejs, the absolute path to the loader, and forward all other arguments.

Inside of the loader we set

process.execPath = path.resolve(process.env.RULES_NODE_PROXY_SCRIPT)

this supports programs that attempt to preserve execPath themselves (like npm), child_process.fork, and worker_threads i guess

Then we make sure the directory that contains our patched bin is in the path.

let binDir = path.dirname(process.env.RULES_NODE_PROXY_SCRIPT);
if(process.env.PATH.split(path.delimeter).indexOf(binDir) === -1){
  process.env.PATH = binDir+path.delimiter+process.env.PATH;
}

@Toxicable
Copy link
Author

@soldair Those look like some pretty easy changes to make. Although I don't think ill get any time on this week till the weekend to make those changes. If you want to pick it up however im always on slack.

@Toxicable Toxicable force-pushed the workers branch 3 times, most recently from ff10a5a to 006325d Compare August 28, 2019 08:19
@Toxicable
Copy link
Author

Toxicable commented Aug 28, 2019

I've addressed those points and looks like it's working.

It'll still fail some tests since I don't quite know how to resolve the issue I mentioned above about the patcher path.

It's also failing in a few places with /usr/bin/env: 'bash': No such file or directory

Not sure what hashbang I should be using for the node_proxy script.

@Toxicable Toxicable force-pushed the workers branch 4 times, most recently from b087991 to 64772d8 Compare August 28, 2019 08:44
@alexeagle
Copy link
Collaborator

Note, with the upcoming linker approach, the node_modules directory on disk when the action starts should already be in the state all node programs expect, so we may not need this.

(remains to be seen whether all features of node_loader.js can be replaced by the linker, eg. runfiles, but I think it's eminently possible)

@Toxicable
Copy link
Author

Sounds good to me. If we're able to remove th need for node_patcher.js then im happy

@Toxicable Toxicable closed this Sep 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants