Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src: improve module loader readability #16536

Closed
wants to merge 1 commit into from

Conversation

addaleax
Copy link
Member

@addaleax addaleax commented Oct 26, 2017

Various improvements on readability, performance and
conformity to the Node core coding style in the ESM loader C++ code:

  • isolate for the Isolate*, context for the Local<Context>
  • Less reliance on auto where it’s unnecessary/increases cognitive
    overhead
  • Shorter paths to failure via .ToLocal() & co
  • Do not keep pending exceptions around e.g. for failed JSON parsing
  • Use Maybe types to get explicit error status forwarding
  • Remove an unnecessary handle scope
  • Add nullptr checks for unwrapped host objects
  • Remove unused code
  • Use CamelCase for function names
  • Use const Foo& instead of copying values whenever possible
  • Pass along the Environment* explicitly
Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

src

@addaleax addaleax added the esm Issues and PRs related to the ECMAScript Modules implementation. label Oct 26, 2017
@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. labels Oct 26, 2017
@addaleax
Copy link
Member Author

btw, re:

Use Maybe types to get explicit error status forwarding

… I realize this is still something a lot of people aren’t familiar with, but we’re stuck with that anyway because of V8. So see this as “you only have to understand one error forwarding system instead of two”, not as “this solves all problems & everybody gets a unicorn”. :)

Copy link
Contributor

@guybedford guybedford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me. It would be interesting as well to start getting some performance benchmarks going on this code too in due course.

static URL INITIAL_CWD(__init_cwd());
inline bool is_relative_or_absolute_path(std::string specifier) {
// Tests whether a path starts with /, ./ or ../
inline bool PathIsBasedOffCurrentDirectory(const std::string& specifier) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this is a URL-related function, designed to mimic the resolution steps under https://html.spec.whatwg.org/#integration-with-the-javascript-module-system.

I think isRelativeOrAbsolute remains a good name here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it seems like you're flipping the polarity of this function? If that is the case perhaps a better name might be isBareSpecifier to match the WhatWG terminology.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although since this function relies on parseAsURL already being checked, isNotRelativeOrAbsolute may be the most descriptive!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@guybedford I can’t find what exactly you’re referring to in that spec (also, linking to that page is a great way to DoS somebody’s browser)

At least outside of web specs, relative paths are the paths that are not absolute, so IsRelativeOrAbsolute really is not a helpful name?

Maybe IsBareSpecifier works, but even that isn’t really saying anything about what’s meant by it … I can add a comment indicating what this matches in WhatWG terminology, but the part of the goal of this change is to make it more accessible to people who don’t know all the spec phrases.

Copy link
Contributor

@guybedford guybedford Oct 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@addaleax here's the appropriate spec text, which is what is exactly what the function here is distinguishing:

To resolve a module specifier given a script script and a JavaScript string specifier, perform the following steps. It will return either a URL record or failure.

  1. Apply the URL parser to specifier. If the result is not failure, return the result.

  2. If specifier does not start with the character U+002F SOLIDUS (/), the two-character sequence U+002E FULL STOP, U+002F SOLIDUS (./), or the three-character sequence U+002E FULL STOP, U+002E FULL STOP, U+002F SOLIDUS (../), return failure and abort these steps.

  3. This restriction is in place so that in the future we can allow custom module loaders to give special meaning to "bare" import specifiers, like import "jquery" or import "web/crypto". For now any such imports will fail, instead of being treated as relative URLs.

  4. Return the result of applying the URL parser to specifier with script's base URL as the base URL.

IsNotRelativeOrAbsolute is trying to capture the concept of what is remaining being a plain or bare name (eg lodash).

Note that for example, we do not check .\ or handle internal .. segments, we are exactly just checking the above features. That is "bare names" are detected identically to the WhatWG URL spec, which then get resolved as packages.

It's nits though certainly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but this bikeshedding might be worth it. :)

So how does something like ShouldBeTreatedAsRelativeOrAbsolutePath sound to you? Does that capture the essence of the semantics well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds about right, yes. Although with the function polarity change, I think it would make it something like ShouldNotBeTreatedAsRelativeOrAbsolutePath? My comment below then may make more sense as well in terms of it being that confusing concept to humans of the double negative :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, we can just change polarity back … ;)

}
if (is_relative_or_absolute_path(specifier)) {
if (!PathIsBasedOffCurrentDirectory(specifier)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be slightly easier to read by reversing the branch order instead of using negation:

if (IsBareSpecifier(specifier)) {
  return ResolveModule(env, specifier, base);
}
else {
  // ...
}

URL(URL&&) = default;
URL& operator=(URL&&) = default;

URL() : URL("") {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer a blank line before private members (no idea whether we lint this?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the linter complained, I just overlooked it. (I run into this particular one about 20 times a day … I’ll never get used to it 😄)

args.GetReturnValue().Set(that);
}

void ModuleWrap::Link(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
Isolate* iso = args.GetIsolate();
EscapableHandleScope handle_scope(iso);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes using v8::EscapableHandleScope obsolete.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. I thought the linter would catch that.

ModuleWrap* obj = Unwrap<ModuleWrap>(that);
CHECK_NE(obj, nullptr);
auto mod_context = that->CreationContext();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we have an appropriate Local abstraction for this auto?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, thanks for pointing it out :)

Various improvements on readability, performance and
conformity to the Node core coding style in the ESM loader C++ code:

- `isolate` for the `Isolate*`, `context` for the `Local<Context>`
- Less reliance on `auto` where it’s unnecessary/increases cognitive
  overhead
- Shorter paths to failure via `.ToLocal()` & co
- Do not keep pending exceptions around e.g. for failed `JSON` parsing
- Use `Maybe` types to get explicit error status forwarding
- Remove an unnecessary handle scope
- Add `nullptr` checks for unwrapped host objects
- Remove unused code
- Use `CamelCase` for function names
- Use `const Foo&` instead of copying values whenever possible
- Pass along the `Environment*` explicitly
@addaleax
Copy link
Member Author

@addaleax
Copy link
Member Author

addaleax commented Nov 3, 2017

Landed in e22a8f1

@addaleax addaleax closed this Nov 3, 2017
addaleax added a commit that referenced this pull request Nov 3, 2017
Various improvements on readability, performance and
conformity to the Node core coding style in the ESM loader C++ code:

- `isolate` for the `Isolate*`, `context` for the `Local<Context>`
- Less reliance on `auto` where it’s unnecessary/increases cognitive
  overhead
- Shorter paths to failure via `.ToLocal()` & co
- Do not keep pending exceptions around e.g. for failed `JSON` parsing
- Use `Maybe` types to get explicit error status forwarding
- Remove an unnecessary handle scope
- Add `nullptr` checks for unwrapped host objects
- Remove unused code
- Use `CamelCase` for function names
- Use `const Foo&` instead of copying values whenever possible
- Pass along the `Environment*` explicitly

PR-URL: #16536
Reviewed-By: Stephen Belanger <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Reviewed-By: Gireesh Punathil <[email protected]>
@addaleax addaleax deleted the esm-readability-c++ branch November 3, 2017 00:18
cjihrig pushed a commit to cjihrig/node that referenced this pull request Nov 6, 2017
Various improvements on readability, performance and
conformity to the Node core coding style in the ESM loader C++ code:

- `isolate` for the `Isolate*`, `context` for the `Local<Context>`
- Less reliance on `auto` where it’s unnecessary/increases cognitive
  overhead
- Shorter paths to failure via `.ToLocal()` & co
- Do not keep pending exceptions around e.g. for failed `JSON` parsing
- Use `Maybe` types to get explicit error status forwarding
- Remove an unnecessary handle scope
- Add `nullptr` checks for unwrapped host objects
- Remove unused code
- Use `CamelCase` for function names
- Use `const Foo&` instead of copying values whenever possible
- Pass along the `Environment*` explicitly

PR-URL: nodejs#16536
Reviewed-By: Stephen Belanger <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Reviewed-By: Gireesh Punathil <[email protected]>
@cjihrig cjihrig mentioned this pull request Nov 6, 2017
@MylesBorins MylesBorins added baking-for-lts PRs that need to wait before landing in a LTS release. lts-watch-v6.x and removed baking-for-lts PRs that need to wait before landing in a LTS release. lts-watch-v6.x labels Nov 14, 2017
gibfahn pushed a commit that referenced this pull request Nov 14, 2017
Various improvements on readability, performance and
conformity to the Node core coding style in the ESM loader C++ code:

- `isolate` for the `Isolate*`, `context` for the `Local<Context>`
- Less reliance on `auto` where it’s unnecessary/increases cognitive
  overhead
- Shorter paths to failure via `.ToLocal()` & co
- Do not keep pending exceptions around e.g. for failed `JSON` parsing
- Use `Maybe` types to get explicit error status forwarding
- Remove an unnecessary handle scope
- Add `nullptr` checks for unwrapped host objects
- Remove unused code
- Use `CamelCase` for function names
- Use `const Foo&` instead of copying values whenever possible
- Pass along the `Environment*` explicitly

PR-URL: #16536
Reviewed-By: Stephen Belanger <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Tobias Nießen <[email protected]>
Reviewed-By: Gireesh Punathil <[email protected]>
@gibfahn gibfahn mentioned this pull request Nov 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. esm Issues and PRs related to the ECMAScript Modules implementation. lib / src Issues and PRs related to general changes in the lib or src directory.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants