Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some proposals for changes before this module starts getting used everywhere #3

Closed
gotwarlost opened this issue Dec 6, 2015 · 20 comments

Comments

@gotwarlost
Copy link

  • Could we simplify the method signature of the hook function and make it extensible for future changes instead of doing the dance in the parse args method? My proposal:
function addHook(hook, opts) {
}

where hook is the hook to be added, the only required argument (that hooks .js files by default) and
opts is an optional object with extensions and matcher properties supported on it.

  • Please do not bake the node_modules logic in your code. istanbul, for instance, allows you to disregard its default excludes and hook all files under node_modules as well. You could support a hookNodeModules property that is false by default but can be turned on by the caller.

cc @bcoe @jamestalmage @novemberborn for inputs

@gotwarlost
Copy link
Author

Also while we are at it, why do you need the thisDir thing in the code? Can't you just add a comment to your file /* @pirates: ignore */ to ensure that you don't self-hook?

@jamestalmage
Copy link

I agree with basically everything here.

@ariporad
Copy link
Collaborator

ariporad commented Dec 6, 2015

@gotwarlost: I agree too, This looks great. I'll do impliment this first thing tomorrow morning.

@gotwarlost
Copy link
Author

Thinking about this some more, it seems to me that we don't need a separate hookNodeModules property. What you could do instead is have a default matcher that ignores node_modules and if someone passes in a matcher, then it is their responsibility to correctly ignore or process node_modules. Seems to me that this strategy will provide reasonable defaults and allow overrides when needed.

@novemberborn
Copy link

@ariporad nice! Great to see a common solution rather than each project reinventing the wheel.

Why does pirates need to avoid self-hooking? Normally the module is already in the require cache so it won't be reloaded. And even if it is, why would that be bad? The @pirates: ignore pragma causing hooks to be ignored seems iffy. It's not that unique a string. Why would a file opt out of the hooks? If this is needed just for the tests then maybe they can be done differently.

Agreed with @gotwarlost to implement exclusions using custom matchers. As Babel, nyc etc have their own config, only they know how to do correctly exclude modules. But if each library using pirates ends up customizing the exclusion matcher then I'm not sure excluding even node_modules by default is necessary.

If the default exclusions are kept then it should match on the path separators, e.g. my_fancy_node_modules shouldn't be excluded.

The code calls console.error() when a hook fails to return code. This console output is likely to get lost amongst other logs. I'd rather see my program crash. Incidentally it logs 'hook' (a string). Should that be hook (the function)?


From #1:

After looking into having getters and setters for dealing with naughty require hooks, I ended up dropping that idea (for now), because basically any hook that misbehaves is using Module.prototype._compile instead of module._compile, which I haven't been able to figure out how to fix.

Could you elaborate on this? nyc uses the setter to detect the custom require hooks. If it switches to pirates` it won't be compatible with code that does not (yet) use pirates. Why makes that code "naughty"?

@ariporad
Copy link
Collaborator

ariporad commented Dec 6, 2015

@novemberborn:

The tests involve some stuff which should be hooked, and some which shouldn't. I agree the regex is iffy, (that's why it's undocumented). I'd like to remove it, but I'm not sure how.

Noted, will do.

Yes, the log is a problem, I'll fix it. No, I don't think it should crash, because as someone who has had to deal with misbehaving require hooks before, I'd rather it scream but still work. Maybe there can be a strict option.

Actually, NYC is one of the bad ones. Here's the way to properly implement a require hook (it's what pirates does):

var oldLoader = Module.extensions[ext] || Module.extensions['.js'];
Module.extensions[ext] = function (mod, filename) {
  var _compile = mod._compile;
  mod._compile = function (code, filename) {
    code = this.instrument(code);
    _compile(code, filename);
  }
  oldLoader(mod, filename);
}

This will work for any number of hooks, because it's just injecting a step in between the loader and _compile. NYC does something similar, but it does it in a way so that only two hooks (nyc and something else) can be active at any time. It does this by basically intercepting new loaders, then wrapping that new loader as above. The part that they do wrong is they call Module.prototype._compile, instead of module._compile.

Babel does even worse. Babel never calls old loader (which means it never even goes into the node.js code), it just calls fs.readFileSync, then transpiles the result, then Module.prototype._compile. (I'll write a wiki page on this.)

So by switching to pirates, nyc would actually gain compatibility, not loose it. And over the course of doing this, I've noticed that the problem is not that require hooks don't work together, it's that most are implemented wrong (Shout out to istanbul, who did it 110% perfectly).

@gotwarlost
Copy link
Author

To the point made by @novemberborn - I think we should leave out the node_modules logic out of pirates. Also the regexp can have unintended consequences - for example if the hook is applied after bundling a bunch of files through browserify for example , the entire bundle will be ignored because one file has the Pirates comment in it. Not to mention that running a regexp over a massive bundle is likely to be horribly expensive. I don't think we should support the Pirates comment for these reasons

@ariporad
Copy link
Collaborator

ariporad commented Dec 6, 2015

@gotwarlost: As I said, I agree. I think what I'll do for V2 is get rid of the regex. The reason it was there is that I needed a way to ignore the test compilers. I wanted to put them in test/fixtures/node_modules, but they got gitignored, so this was a lazy solution. I think I'll just figure out a way to not git ignore them, and remove the comment.

@ariporad
Copy link
Collaborator

ariporad commented Dec 6, 2015

Ok, @gotwarlost, @jamestalmage @novemberborn: I think I've properly implemented all your suggestions, take a look at the feedback-round-1 branch, and let me know. By default, it still ignores node_modules, but it is configurable.

@ariporad ariporad added this to the 2.0.0 milestone Dec 6, 2015
@ariporad
Copy link
Collaborator

ariporad commented Dec 6, 2015

Also, here's a todo list, for the milestone:

  • Switch to options object
  • Make ignoring node_modules configurable
  • Ignore node_modules with the path separator
  • Remove the ignore comment

@bcoe
Copy link

bcoe commented Dec 6, 2015

Could we simplify the method signature of the hook function and make it extensible for future changes instead of doing the dance in the parse args method?

I like the idea of moving the matcher into a configuration setting, and default to perhaps *.


@ariporad great work so far, I think it's great to be pulling this logic into a common module., a couple initial comments:

  • as I understand it right now, pirates relies on adoption to work properly; to support the babel + istanbul use-case, as an example, both libraries would need to be using pirates. A thought I have, is that it would be nice to start with the __defineGetter__, __defineSetter__ logic that nyc uses with this feature being configurable -- this way folks could immediately start moving towards pirates, and then gradually turn that configuration option off as it gains traction across a few of the modules that hook require.
  • the logic for managing a stack of require hooks is sticky, and there's a lot of room for weird interactions -- what would make me most confident to make a move to this library would be a very thorough test suite, this is where I would put a lot of upfront work 👍

This is looking slick!

@ariporad
Copy link
Collaborator

ariporad commented Dec 6, 2015

@bcoe: agree on the tests, but your bit about adoption is incorrect. If everyone had a properly written require hook, like istanbul, there would be no need for this, but everyone doesn't. Both babel and nyc have require hooks that are implemented improperly. (See above for a more detailed explanation). So NYC's getters and setters are actually hurting compatibility, not helping it.

@bcoe
Copy link

bcoe commented Dec 6, 2015

@ariporad I now fully grok what you're getting at. As soon as we get babel using a more compliant require hook I'd happily simply nyc's logic -- the two main use-cases that come up for hooks are code-coverage, and ES2015 transpilation so I'll have to block on babel for the time being.

@ariporad
Copy link
Collaborator

ariporad commented Dec 6, 2015

EDIT: I am wrong, never mind.

@ariporad ariporad mentioned this issue Dec 7, 2015
Merged
1 task
@jamestalmage
Copy link

Babel's problem is that it completely disregards both module._compile and the old loader

Wrong on the first part (it uses module._compile correctly), correct on the second (it manually loads from the disk).

@ariporad
Copy link
Collaborator

ariporad commented Dec 7, 2015

@jamestalmage: Oops! You're right! My bad!

@jamestalmage
Copy link

@ariporad fake-module-system will help you write much simpler tests, that don't actually need to hit the file system (though I would probably recommend you actually include a separate test suite that just tests one or two files).

@novemberborn
Copy link

@ariporad:

Yes, the log is a problem, I'll fix it. No, I don't think it should crash, because as someone who has had to deal with misbehaving require hooks before, I'd rather it scream but still work. Maybe there can be a strict option.

If I happen to catch the log message, even though my program appears to work correctly, I'm going to have to investigate to make sure. If I don't catch the message I won't investigate and my program may not work correctly. Thus I'd rather it crashes. Fixing it should be easy enough (remember: I should be investigating the problem anyway) and then I can have more confidence in my program.

A strict option doesn't really help here, now I need to know that some dependency used pirates and I need to somehow configure it to be strict if it doesn't do so already.

Both babel and nyc have require hooks that are implemented improperly. (See above for a more detailed explanation). So NYC's getters and setters are actually hurting compatibility, not helping it.

nyc hacks child process spawning to inject itself at the very top of the new process. Consequently it's require override is meant to be the first, no earlier hooks can exist. I appreciate it's not a perfect hook implementation but it's sufficient for its purposes.

Without observing getters and setters for require.extensions I'm not even sure how two pirates instances can cooperate. This would occur if my program has two dependencies which depend on incompatible pirates versions. npm won't dedupe them.

@ariporad
Copy link
Collaborator

ariporad commented Dec 7, 2015

@novemberborn: Ok, I see what you mean on the crashing. I'll make it throw.

As for the getters/setters for NYC:

  1. Oh, that makes more sense, but it still is broken.
  2. Two instances of pirates can run together just like two good require hooks can work together. Since when you override the loader, you store a reference to the old one, and later call it, it forms a chain. NYC's setup breaks the chain, which causes a problem.

@novemberborn
Copy link

@ariporad

Ok, I see what you mean on the crashing. I'll make it throw.

Cool!

Two instances of pirates can run together just like two good require hooks can work together. Since when you override the loader, you store a reference to the old one, and later call it, it forms a chain.

Yea, @jamestalmage's write-up made that a lot clearer for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants