Skip to content
This repository has been archived by the owner on Feb 12, 2022. It is now read-only.

Add build option for Node CLI #397

Merged
merged 5 commits into from
May 18, 2017

Conversation

sebmarkbage
Copy link
Contributor

This adds a compatibility option for node-cli. The goal of this is to Prepack a whole Node.js CLI environment including Node's module system. This is useful for an end application but not so useful for a small library yet.

This takes a bit different approach than compiling a single bundle. Instead of running the target bundle in the Prepack realm, it actually runs Node.js's JavaScript code inside the Prepack realm. This code is embedded within the runtime and we can get to it through process.binding('natives') when we run Prepack.

The script file will be passed as a process.argv.

Node's CLI code will then pick that up and start loading modules from the file system. The idea is that reads from fs while Prepacking will be considered constant. That way all the relevant modules will be automatically picked up and executed in the Prepack realm.

Finally, we can snap shot the entire Node system.

This PR takes the approach of building intrinsics only for the native code in Node.js (which are all on the process object). An alternative solution would be to build intrinsics from the Node.js public APIs. The tradeoff is that the public API is much bigger surface area but moves slower, where as the native APIs are considered more private and therefore moves faster but have smaller surface area.

My guess is that it will be pretty easy to keep up with the native APIs but not the JS APIs so I picked that solution. It guarantees that the JS parts are 100% compatible with the Node version you're running.

This is still a WIP. I have more intrinsics to implement before this works.

);
}

// Next call the bootstrap function with the process intrinsic.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NTillmann This is the Call I was referring to. I want to do the equivalent of realm.$GlobalEnv.execute but call a function value that was the completion value of the previous execution. This is a very common way to initialize internals in JS engines/browsers in a way that doesn't leak those internals onto the global object (unlike React Native).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable to me. I think you should break it out into a separate function and co-located it with execute. Likewise, you may want to break out lines 176-197 and co-locate it with execute in environment.js.

case 'function':
throw new Error('Functions could be supported but are not yet.');
case 'object': {
if (value === 'null') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can this be? Do you mean value === null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops. yea.

[realm.intrinsics.process]
);
} catch (err) {
if (err instanceof JoinedAbruptCompletions) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exception catching for ...Completions seems dubious, but I can see how to copy&pasted it from evaluateCompletion. Let me check with Herman what should happen here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, here, you can collapse all of those ...Completions cases into a single "if (err instanceof Completion) res = err; else ...". This will log the issue in a reasonable way, and there is no way to recover at this point anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like call of this belongs in a helper somewhere but I'm not sure where.

import { ThrowCompletion } from "../../completions.js";

function throwError(realm, name) {
throw new ThrowCompletion(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have realm.createErrorThrowCompletion to make this easier.

}

// node
if (realm.compatibility === 'node' ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You filed in assume to make the "jsc" compatibility more precise by adding a version identifier. #395 Maybe. But then you probably also want to make "node"/"node-cli" versioned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't quite understand what "node" is. Looks like it is currently just for testing source maps. It probably should have some general "CommonJS" mode where you can't assume a particular version and many of the variables are abstract.

"node-cli" is interesting because I think we can make the versioning very automatic based on what node version you're running but then the environment is locked to that version of node. So it doesn't quite make sense to specify it.

src/global.js Outdated
@@ -415,5 +414,18 @@ export default function (realm: Realm): ObjectValue {
});
}

if (realm.compatibility === "browser") {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to hide this for the React Native builds or does it need it for some feature detection thing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, let's just keep it under this compatibility flag.

Copy link
Contributor

@hermanventer hermanventer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure I understand what you are trying to do here. What exactly does the bootstrap function do? Where does it get is input from? How do you designate input as unknown?

);
}

// Takes a value from the host realm and clones it into a Prepack Realm.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm none too sure about this comment. It seems to me that you are taking JS value and making a corresponding Prepack value.

);
}

// Next call the bootstrap function with the process intrinsic.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable to me. I think you should break it out into a separate function and co-located it with execute. Likewise, you may want to break out lines 176-197 and co-locate it with execute in environment.js.

@sebmarkbage
Copy link
Contributor Author

@hermanventer The bootstrap function will get its input from process.argv. Not yet implemented here. That's an array of the arguments passed on the command-line to the node executable.

The first argument is normally the JavaScript module that is going to be executed.

I was thinking that will be set up as a constant string value equal to the file about to be Prepacked. This will cause the Node environment to load that module file into memory via a call to the fs binding. That becomes the input source code.

The abstract input is the rest of the argv array, which lets the user pass arguments to their Prepacked node.js CLI.

Since the Node.js uses dynamic module source code loading as the program executes, it needs this different mechanism to be compatible with the ecosystem. Unlike browser/react native where bundles are created by statically analyzing dependency graphs.

@sebmarkbage
Copy link
Contributor Author

Seeing what bootstrap_node does might be helpful: https://github.com/nodejs/node/blob/master/lib/internal/bootstrap_node.js#L146

@hermanventer
Copy link
Contributor

The most I can figure out quickly from all this is that you might be able to prepack the bootstrap function. I'm wondering just how much value this would unlock. What am I missing?

@sebmarkbage
Copy link
Contributor Author

@hermanventer That is correct. I still need to implement contextify and fs module. The next follow ups on the contextify stuff will demonstrate that. The contextify built-in will execute further scripts in the Prepack Realm when the bootstrap function calls it.

'transform-es2015-parameters',
],
retainLines: true,
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this depends on a "devDependency" atm as a hack. I didn't want to add it as a hard dependency since this is still experimental and we'll just implement these directly eventually.

sebmarkbage added a commit to sebmarkbage/prepack that referenced this pull request May 5, 2017
The purpose of this is to be able to land facebookarchive#397 with some test coverage.
In its current form it is too specific in its implementation and only
works on a specific version of Node. In the future we'll want to make that
more abstract and general so that it can work on ranges and other major
versions.
sebmarkbage added a commit to sebmarkbage/prepack that referenced this pull request May 5, 2017
The purpose of this is to be able to land facebookarchive#397 with some test coverage.
In its current form it is too specific in its implementation and only
works on a specific version of Node. In the future we'll want to make that
more abstract and general so that it can work on ranges and other major
versions.
sebmarkbage added a commit that referenced this pull request May 6, 2017
The purpose of this is to be able to land #397 with some test coverage.
In its current form it is too specific in its implementation and only
works on a specific version of Node. In the future we'll want to make that
more abstract and general so that it can work on ranges and other major
versions.
@sebmarkbage sebmarkbage force-pushed the node-cli branch 2 times, most recently from 9634a7e to ef7448f Compare May 10, 2017 03:11
@sebmarkbage
Copy link
Contributor Author

sebmarkbage commented May 10, 2017

I added a unit test that is only enabled in CI for now (not yarn test) since it's locked to particular node version. This is now only blocked on #598 to be in a landable state (although far from complete).

@sebmarkbage
Copy link
Contributor Author

I added an additional test case that tests the path where ArrayBuffer #598 is needed. CI should now fail until that's fixed.

@sebmarkbage sebmarkbage force-pushed the node-cli branch 2 times, most recently from f0dcbe6 to c3c3261 Compare May 13, 2017 02:20
@sebmarkbage sebmarkbage changed the title [WIP] Add build option for Node CLI Add build option for Node CLI May 13, 2017
@sebmarkbage
Copy link
Contributor Author

After rebasing on top of the ArrayBuffer fix I think that this should now be in a landable state. After land, I'll follow up with Issues for a few ideas for future improvements.

@@ -0,0 +1,17 @@
# Prepack test input
node ./bin/prepack.js ./test/node-cli/Simple.js --out Simple-test.js --compatibility node-cli
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does prepack.js get into ./bin?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's checked into the repo already. https://github.com/facebook/prepack/tree/master/bin

It's an alias for lib/prepack-cli.js which gets there from the build.


declare var process: any;

function getBufferFromArg(realm: Realm, value: Value): Uint8Array {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just type value as an ObjectValue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And add the invariant at each callsite instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is actually preferable because it makes it easier for the reader to verify that the invariant is true.

declare var process: any;

function getBufferFromArg(realm: Realm, value: Value): Uint8Array {
invariant(value instanceof ObjectValue && value.$ViewedArrayBuffer);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This invariant is redundant.


declare var process: any;

function getBufferFromArg(realm: Realm, value: Value): Uint8Array {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably use DataBlock instead of Uint8Array.

Copy link
Contributor Author

@sebmarkbage sebmarkbage May 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is supposed to return a native Uint8Array in the Node environment. Not the Prepack value. It's what I pass to nativeFS.read etc. If the internal representation of DataBlock changes, I want that to be an error so that we can do a conversion from that format to Uint8Array here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, but it makes me wonder why we have DataBlock type in the first place.


export default function (realm: Realm): ObjectValue {
let nativeBuffer = process.binding('buffer');
let nativeBufferPrototype = (require('buffer'): any).Buffer.prototype;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what this is doing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. This is defined in native C code. This is related to the setupBufferJS thing below. The runtime we're currently running in has already called setupBufferJS, read from a mutated object and that's how it gets a handle on this object.

https://github.com/nodejs/node/blob/v7.9.0/src/node_buffer.cc#L1197-L1230

We can't do the same thing at this point. Luckily this same object is reachable another way through another module which I found here.

let utf8Slice = new NativeFunctionValue(realm, "Buffer.prototype.utf8Slice", "utf8Slice", 0, (context, args) => {
let self = getBufferFromArg(realm, context);
let decodedArgs = args.map((arg, i) => {
return ToInteger(realm, arg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a good place to follow the arrow with an expression rather than a block with a return.

let decodedArgs = args.map((arg, i) => {
return ToInteger(realm, arg);
});
let utf8String = nativeBufferPrototype.utf8Slice.apply(self, decodedArgs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit surprised by this. The utf8Slice I can find does not expect a this parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let self = getBufferFromArg(realm, context);
let decodedArgs = args.map((arg, i) => {
if (i === 0) {
return getBufferFromArg(realm, arg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems inconsistent with utf8Slice and I can't think of a good reason why it should be.

Copy link
Contributor Author

@sebmarkbage sebmarkbage May 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utf8Slice creates a string from a Buffer. No side-effects. The rest of the arguments are all integers if they're defined.

copy copies bytes from this Buffer by mutating a target Buffer. The rest of the arguments are integers if defined. The return value is an integer.

https://github.com/nodejs/node/blob/v7.9.0/src/node_buffer.cc#L524-L557

return ToInteger(realm, arg);
}
});
let bytesCopied = nativeBufferPrototype.copy.apply(self, decodedArgs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you unpacked args with a pattern, you might get away with calling copy directly.

throw new Error("TODO");
});

Set(realm, obj, "setupBufferJS", setupBufferJS, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be more readable if moved to just after the definition of setupBufferJS.

Copy link
Contributor

@hermanventer hermanventer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only really skimmed over this since it is quite large and somewhat strange to me. It seems self contained enough that I'm not overly concerned that it will break things, so I'm going to let you go ahead with this.


declare var process: any;

function getBufferFromArg(realm: Realm, value: Value): Uint8Array {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a duplicate of the one in buffer.js.

return realm.intrinsics.undefined;
});
obj.defineNativeMethod("internalModuleStat", 0, (context, args) => {
const fileName = ToString(realm, args[0]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather use a pattern with more descriptive names.

enumerable: true,
});

// TODO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment can be a bit more descriptive. When your future self comes back it he might appreciate a hint as to what your present self had in mind.

declare var process: any;

// Takes a value from the host realm and create it into a Prepack Realm.
// This seems like it could be a useful primitive elsewhere.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps move this to Realm.js.

enumerable: true,
configurable: true
});
// TODO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what?

return realm.intrinsics.undefined;
});
TTYPrototype.defineNativeMethod("writeUtf8String", 0, (context, args) => {
// let req = args[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

let fd = ToInteger(realm, args[0]);
return new StringValue(realm, nativeTTYWrap.guessHandleType(fd));
// TODO: Make this abstract so that changing the pipe at runtime is
// possible. Currently this cause an introspection error.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

causes

);
return realm.createAbstract(types, values, [], buildNode, undefined, `(process.binding('tty_wrap').isTTY(${fd}))`);
});
// TODO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

}

export function prepackNodeCLISync(filename: string, options: Options = defaultOptions) {
if (process.version !== "v7.9.0") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems rather specific. Wouldn't later versions work as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at the moment. The strategy is very fragile because I only model the minimal needed. Which is why this whole thing is so hacky. The ideal would be to have more of an automatic pass-through or making more things abstract.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, I'm modeling private APIs here. They do change between minor versions. The rationale for picking this strategy over modeling public APIs is outlined in the PR comment.

Expose `process` on intrinsics. We won't expose it on the global through the normal route. Instead, we'll let Node's on script set it up for us.
This lets us run from within a module scope for convenience.
This is just a sufficient amount of hacks to get the base line running. Needs more implementations and more of this needs to be automated.
I enable this mode only on CI for now. Not to `yarn test`. Since it is
locked to a particular version of Node so we don't want to force people
to run it locally.
The sync API doesn't accept a callback.
@sebmarkbage sebmarkbage merged commit 1402638 into facebookarchive:master May 18, 2017
@jdalton
Copy link

jdalton commented Jul 29, 2017

I've noticed this may still throw errors when attempting to load built-in modules like "zlib".

@sebmarkbage
Copy link
Contributor Author

@jdalton I think it'll be expected that many things won't work out of the box until the ecosystem shifts towards supporting "heap snapshot". E.g. Any hanging open socket at the end won't work 100%. We'll have to keep chasing the minimal set needed. Ideally Prepack can help minimize the things that can and can't be.

I'll take a look at why zlib doesn't work. Keep em coming. :)

@jdalton
Copy link

jdalton commented Jul 29, 2017

@sebmarkbage Cool, thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants