Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/node/node_loader.js does not support NODE_PATH #304

Closed
lexor90 opened this issue Sep 2, 2018 · 5 comments
Closed

internal/node/node_loader.js does not support NODE_PATH #304

lexor90 opened this issue Sep 2, 2018 · 5 comments

Comments

@lexor90
Copy link

lexor90 commented Sep 2, 2018

Hello there,

I'm facing issues loading modules from a node_modules directory which does not sit on the project root. I've done a bit of inspection of the new node_loader introduced by #253 and it seems it sets the NODE_MODULES_ROOT always as workspace_name/node_modules, ignoring any NODE_PATH.

In particular, with older releases (0.10.1 seems to be the latest working - after that #253 was merged) I was able to keep node_modules in //third_party/nodejs/yarnpkg.org/node_modules/ (checked in the repo).

I notice 2 issues:

  • If the filegroup(name="node_modules",...)'s "srcs" is not a label it doesn't work (nothing goes into runtimes using, say, a "glob", while referencing it through a sub-//third_party BUILD file - which includes proper license requirements - works again. I guess this might be forced by bazel itself, as third_party is special. [N.B. this was also happening with previous releases of bazel and rules_nodejs]. Not a drama, by the way.
  • Being not able to set a NODE_PATH I can't just "require" files and get the node resolver to work on it, I need to declare them using the workspace-relative path (which is not so stupid, btw). BTW even when I'm able to "satisfy" node_loader requirements, I get runtime issues about "module not found" for the same modules I was forced to require in a workspace-relative fashion.

I'm a bit confused about the intended bahaviour, what if my rule is this:


nodejs_binary(
  name = "frontend",
  data = [
    "@//lib/framework",
    "@//lib/auth-sdk",
    ":sources",
    "//product/tools/keys"
  ],
  node_modules = "@//:node_modules",
  entry_point = "./product/nodejs/frontend/index.js", # note: using "product/" or "depot/product" - depot being the workspace name - it does not work
)

Where node_modules is

filegroup(
    name = "node_modules",
    srcs = ["@//third_party/nodejs/yarnpkg.org/node_modules:node_modules"],
    visibility = ["//visibility:public"],
)

And third_party/nodejs/yarnpkg.org/node_modules:node_modules is

licenses(["notice", "restricted", "reciprocal"]) # check each package licenses

filegroup(
  name = "node_modules",
  srcs = glob(
    ["**/*"],
    exclude = ["@types/**/*"]
  ),
  visibility =  ["//visibility:public"],
)

(this is why having fine-grained or automated tools like gazelle to do this on each single packages would be awesome).

By using NODE_PATH=/<absolute-path>/lib:/<absolute-path>/third_party/nodejs/yarnpkg.org/node_modules/ bazel run //product/nodejs/frontend:frontend I was fine. Now I cannot satisfy both requires (I cannot anymore "require("lodash"); require("framework"); // /lib/framework and I cannot even make it work by using require("lib/framework").), I tried patching the resolver, supporting NODE_PATH and trying to resolveRunfiles() on every single path, but I could not come up with something working: when the application spawns childrens the require start failing again, the initial checks executed by "node_loader.js" are fine, then fails with a node runtime error trying to import "framework" which initializes the application. Maybe we could pass somehow the NODE_PATH to the actual binary execution (I think somewhere in the bash script)? Or what should be changed in the current setup to work with the new loader?

Tests executed using bazel 0.16.0 and 0.16.1 and rules_node, from 0.12.2 down to 0.10.1 on a MacOS 10.14-beta and Ubuntu LTS.

P.S. I'd also love to know what you guys do to keep trace of lib requirements using the current "global node_modules", to avoid keeping unreferenced or outdated dependencies, and how do you manage vendoring in a proper way, given issues that arise with "node-gyp" for native modules, which only compiles on the current architecture. We currently only execute "npm install" on a target machine having the same OS/Setup like production environments and then check-in only these outputs. Of course locally it's a mess if you try to require native modules, which have been built on a remote architecture, and it would be nice to being able to compile native modules with bazel itself from sources. Maybe RBE will be helpful here to always run tests on the same architecture as the checked-in modules regardless of the dev machine.

@lexor90
Copy link
Author

lexor90 commented Sep 2, 2018

I've gone a bit deeper and discovered that by using js_library we can exploit the "module mappings" (thanks to a comment by @alexeagle here), which would allow me to both fix the fine-grained dependency tracking (using bazel query language and specifying all of the "deps" without wildcards allows us to make queries like "yarn why" but without having the need to keep a package.json per project) and the difference between "require('module');" and it's effective location (given I'm using a non standard <workspace>/node_modules/ path).

I've scripted a quick npm postinstall.js script to generate a global "//third_party/nodejs/yarnpkg.org/node_modules/BUILD" file containing all of the js_library it finds in that directory (it uses the the node_modules/ sub-folder name to fill the name and module_name attributes of every js_library).

This way whenever the package manager is run to alter the "node_modules" directory it would also keep the BUILD file updated and you can access direct dependencies using your nodejs_binary's deps = ["//third_party/nodejs/yarnpkg.org/node_modules:lodash", ...]. This would also allow me to use the bazel query language to build dep graphs, I think. It would be awesome to have js_library to support deps, but if I'm correctly understanding the node_loader, it should be able to resolve transitive dependencies anyway if I link them to the "nodejs_binary" rule. Another nice addiction would be to have licenses supported on js_library rather than having to set a global "licenses([])" in my third_party BUILD file. Isn't it this a property that should be accepted globally?

I am actually trying down this path, it could even make the node_binary's attribute "node_modules" unnecessary in my case.

I'll keep you updated as I discover more.

@lexor90
Copy link
Author

lexor90 commented Sep 3, 2018

I can confirm that by automatically listing all node_modules as js_library I can properly map files from a non-standard <workspace>/node_modules. The original issue should be resolved (would still know more about others' ideas to proper vendoring of nodejs packages), maybe marked as "lack of documentation", but I remember about js_library not being publicly available.

I had faced two issues during this process (let me know if you prefer to discuss them in a new issue):

  1. using cluster.fork() breaks the resolution, as the child process seems to have the original node resolver. Do you have any elegant idea to patch this situation without having to do tricks in app code?
  2. I had to replace the following regex, otherwise the resolver would mess things up if you require both bunyan and bunyan-sentry as an example, (the latter would end up as /module/root/-sentry) as the loop runs for every module but does not check for an exact match (why not?)
diff -r rules_nodejs/internal/node/node.bzl rules_nodejs_2/internal/node/node.bzl
35c35
<         mapping = r"{module_name: /^%s\b$/, module_root: '%s'}" % (escaped, mr)
---
>         mapping = r"{module_name: /^%s$/, module_root: '%s'}" % (escaped, mr)

@gregmagolan
Copy link
Collaborator

gregmagolan commented Sep 7, 2018

@lexor90 Thanks for the detailed write up on your issue and work-around. We're also working on fine grained npm deps with #302 although it wouldn't apply to your case with checked-in node_modules. #302 creates fine grained npm deps when using yarn_install and npm_install. Your setup brings up an interesting case where it would be useful to generate a BUILD file similar to the one generated in #302 but for a checked-in node_modules folder. @alexeagle thoughts on this?

The two issues you found look valid. I haven't run into any cases where the patched node resolver is needed on a spawned node process but your observation makes sense. And the second issue looks like a problem with resolveToModuleRoot() in node_loader.js. Can you please open separate issues for them?

@lexor90
Copy link
Author

lexor90 commented Sep 7, 2018

Hey there, thanks for the interest.

Regarding #302 I'm following the discussion and the design process, I agree it's definitely useful to track down single dependencies (we made some custom tooling to track them, but it looks weird as it works with genrules).

I made some tests on our internal artifacts, produced in conjunction with rules_docker: the final image size dropped consistently without having the whole node_modules/ folder in it (just 'cause the current rules implementation takes them in).

Also, it makes sense to me a vision where the BUILD file replaces the package.json for every single project, this way one can rely on a customized setup to get the dependencies (for example, one package.json per project, or one for the whole repository and place dependencies where they want to -- it doesn't make assumptions). And then use the BUILD file to gather only the needed ones on the final artifact. Of course, if you use a single package.json per project it could make sense to just reference the package.json and specify dependencies in a single file.

Let me know if I can help with it.

I'm going to open two tickets to track down the issues above and provide a use case for the 1st issue, closing this.

@lexor90 lexor90 closed this as completed Sep 7, 2018
@lexor90
Copy link
Author

lexor90 commented Sep 7, 2018

@gregmagolan maybe you want to keep this open to track down the NODE_MODULES env support?
Is it on the roadmap or it's an intended behavior to ignore it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants