esm: support loading data: URLs #28614

Closed
30 changes: 29 additions & 1 deletion doc/api/esm.md
@@ -312,13 +312,38 @@ There are four types of specifiers:
Bare specifiers, and the bare specifier portion of deep import specifiers, are
strings; but everything else in a specifier is a URL.

-Only `file://` URLs are supported. A specifier like
+Only `file:` and `data:` URLs are supported. A specifier like
`'https://example.com/app.js'` may be supported by browsers but it is not
supported in Node.js.

Specifiers may not begin with `/` or `//`. These are reserved for potential
future use. The root of the current volume may be referenced via `file:///`.

#### `data:` Imports

<!-- YAML
added: REPLACEME
-->

[`data:` URLs][] are supported for importing with the following MIME types:

* `text/javascript` for ES Modules
* `application/json` for JSON
* `application/wasm` for WASM.

`data:` URLs only resolve [_Bare specifiers_][Terminology] for builtin modules
and [_Absolute specifiers_][Terminology]. Resolving
[_Relative specifiers_][Terminology] will not work because `data:` is not a
[special scheme][]. For example, attempting to load `./foo`
from `data:text/javascript,import "./foo";` will fail to resolve since there
is no concept of relative resolution for `data:` URLs. An example of `data:`
URLs being used:

```mjs
import 'data:text/javascript,console.log("hello!");'
import _ from 'data:application/json,"world!"'
```
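
The resolution rule above can be observed directly with the WHATWG `URL` parser. The following is a minimal sketch, assuming the global `URL` class of current Node.js releases; `tryResolve` is a hypothetical helper name used only for illustration and is not part of Node.js.

```javascript
// Hypothetical helper: resolve `specifier` against `base` with the WHATWG URL
// parser, returning null instead of throwing when resolution is impossible.
function tryResolve(specifier, base) {
  try {
    return new URL(specifier, base).href;
  } catch {
    return null;
  }
}

const parent = 'data:text/javascript,import "./foo";';

// Relative specifier: a data: URL has an opaque path, so it cannot act as a
// base URL and the parser rejects the input.
const relative = tryResolve('./foo', parent);

// Absolute specifier: it parses on its own, so the base is never consulted.
const absolute = tryResolve('file:///tmp/foo.mjs', parent);

console.log(relative); // null
console.log(absolute); // file:///tmp/foo.mjs
```

The same helper resolves `./foo` without trouble when the base is a `file:` URL, which is why the restriction only affects modules loaded from `data:` URLs.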

## import.meta

* {Object}
@@ -869,6 +894,8 @@ $ node --experimental-modules --es-module-specifier-resolution=node index
success!
```
[Terminology]: #esm_terminology
[`data:` URLs]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URIs
[`export`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/export
[`import`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import
[`import()`]: #esm_import-expressions
@@ -877,6 +904,7 @@ success!
[CommonJS]: modules.html
[ECMAScript-modules implementation]: https://github.com/nodejs/modules/blob/master/doc/plan-for-new-modules-implementation.md
[Node.js EP for ES Modules]: https://github.com/nodejs/node-eps/blob/master/002-es-modules.md
[special scheme]: https://url.spec.whatwg.org/#special-scheme
[WHATWG JSON modules specification]: https://html.spec.whatwg.org/#creating-a-json-module-script
[ES Module Integration Proposal for Web Assembly]: https://github.com/webassembly/esm-integration
[dynamic instantiate hook]: #esm_dynamic_instantiate_hook
22 changes: 21 additions & 1 deletion lib/internal/modules/esm/default_resolve.js
@@ -12,7 +12,7 @@ const typeFlag = getOptionValue('--input-type');
const experimentalWasmModules = getOptionValue('--experimental-wasm-modules');
const { resolve: moduleWrapResolve,
getPackageType } = internalBinding('module_wrap');
-const { pathToFileURL, fileURLToPath } = require('internal/url');
+const { URL, pathToFileURL, fileURLToPath } = require('internal/url');
const { ERR_INPUT_TYPE_NOT_ALLOWED,
ERR_UNKNOWN_FILE_EXTENSION } = require('internal/errors').codes;

@@ -45,12 +45,32 @@ if (experimentalWasmModules)
  extensionFormatMap['.wasm'] = legacyExtensionFormatMap['.wasm'] = 'wasm';

function resolve(specifier, parentURL) {
  try {
    const parsed = new URL(specifier);
    if (parsed.protocol === 'data:') {
      const [ , mime ] = /^([^/]+\/[^;,]+)(;base64)?,/.exec(parsed.pathname) || [ null, null, null ];
      const format = ({
        '__proto__': null,
        'text/javascript': 'module',
Member

text/javascript doesn't necessarily mean Module, it could also mean Script.

Member Author

Under no JS loading spec does Script get checked against the MIME text/javascript. Script is effectively without a MIME and this table matches web standards.

Contributor @jkrems (Jul 9, 2019)

I don't think we support (browser-style) Script anywhere, so it would feel weird to do that here. A CommonJS script would be something like text/vnd.node.js according to nodejs/TSC#371.

Member

then can application/node be added to this object?

The expression though — /^([^/]+\/[^;,]+)(;base64)?,/ seems to assume the presence of mime type before the [;,] — regardless of if it is mandatory, it might make sense to test it against long and malformed urls to potentially refine it if necessary.

Sorry for not wanting to muddy this with a bad attempt to wing it here, but I will try to locate the ones I worked on a while back for that very same purpose if it helps.

Member Author

@SMotaal we could, but i can't think of much we could do except limiting the size? Right now this lacks a variety of things, including MIME parameter parsing and the PR for parsing MIMEs is stuck.

Since it is anchored, it is certainly possible to add efficient guards in the current expression. I'd like to take on exploring how we can do that here, which is mainly just to carve a limited allowed chars when delimited per spec (I did this a while back just need to dig).

@bmeck I looked into the various options for the expression and recommend:

/^(?:((?:text|application)\/(?:[A-Z][-.0-9A-Z]*)?[A-Z]+)((?:;[A-Z][!%'()*\-.0-9A-Z_~]*=[!%'()*\-.0-9A-Z_~]*)*)(;base64)?),/i

This would match any text/ and application/ subtype, along with the attribute-value parameters like charset= (to be parsed separately), and optional base64 (captured separately from previous parameters).

For now, simply being more restrictive of the character ranges for greedy * and + captures is likely all we need to avoid unpredictable performance hazards with very long crafted/malformed strings.

See gist for more details.

Please let me know how to proceed, if this is worth incorporating.

Member Author

you can just post a suggestion change and that looks fine

bmeck marked this conversation as resolved.
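
For readers comparing the two expressions in this thread, here is a minimal sketch that runs both against a few sample `pathname` strings. The sample inputs and the `mimeOf` helper are assumptions made for illustration; they are not part of the PR, and the sketch says nothing about performance.

```javascript
// Pattern used by the PR at this point in the review.
const ORIGINAL = /^([^/]+\/[^;,]+)(;base64)?,/;
// Stricter pattern proposed in the review thread, copied verbatim.
const PROPOSED = /^(?:((?:text|application)\/(?:[A-Z][-.0-9A-Z]*)?[A-Z]+)((?:;[A-Z][!%'()*\-.0-9A-Z_~]*=[!%'()*\-.0-9A-Z_~]*)*)(;base64)?),/i;

// Hypothetical helper: return the captured MIME type, or null when the
// pattern does not match the pathname at all.
function mimeOf(pattern, pathname) {
  const match = pattern.exec(pathname);
  return match ? match[1] : null;
}

// Illustrative inputs only; each is the pathname of a data: URL.
const samples = [
  'text/javascript,console.log(1)',
  'application/json;charset=utf-8,{}',
  'text/javascript;base64,YQ==',
  'image/png,xyz',
  'invalid,null',
];

const results = samples.map((pathname) => ({
  pathname,
  original: mimeOf(ORIGINAL, pathname),
  proposed: mimeOf(PROPOSED, pathname),
}));

console.table(results);
```

With these samples, the original pattern rejects the `charset` parameter form and accepts `image/png`, while the proposed pattern accepts the parameter form and rejects any type outside `text/` and `application/`.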
        'application/json': 'json',
        'application/wasm': experimentalWasmModules ? 'wasm' : null
      })[mime] || null;
      return {
        url: specifier,
        format
      };
    }
  } catch {}
  if (NativeModule.canBeRequiredByUsers(specifier)) {
    return {
      url: specifier,
      format: 'builtin'
    };
  }
  if (parentURL && parentURL.startsWith('data:')) {
    // This is gonna blow up, we want the error
    new URL(specifier, parentURL);
  }

  const isMain = parentURL === undefined;
  if (isMain)
7 changes: 5 additions & 2 deletions lib/internal/modules/esm/loader.js
@@ -102,9 +102,12 @@ class Loader {
}
}

-    if (format !== 'dynamic' && !url.startsWith('file:'))
+    if (format !== 'dynamic' &&
+      !url.startsWith('file:') &&
+      !url.startsWith('data:')
+    )
       throw new ERR_INVALID_RETURN_PROPERTY(
-        'file: url', 'loader resolve', 'url', url
+        'file: or data: url', 'loader resolve', 'url', url
       );

return { url, format };
86 changes: 61 additions & 25 deletions lib/internal/modules/esm/translators.js
@@ -9,6 +9,8 @@ const {
StringPrototype
} = primordials;

const { Buffer } = require('buffer');

const {
stripBOM,
loadNativeModule
@@ -23,6 +25,8 @@ const { debuglog } = require('internal/util/debuglog');
const { promisify } = require('internal/util');
const esmLoader = require('internal/process/esm_loader');
const {
ERR_INVALID_URL,
ERR_INVALID_URL_SCHEME,
ERR_UNKNOWN_BUILTIN_MODULE
} = require('internal/errors').codes;
const readFileAsync = promisify(fs.readFile);
@@ -33,6 +37,31 @@ const debug = debuglog('esm');
const translators = new SafeMap();
exports.translators = translators;

const DATA_URL_PATTERN = /^[^/]+\/[^,;]+(;base64)?,([\s\S]*)$/;
function getSource(url) {
  const parsed = new URL(url);
  if (parsed.protocol === 'file:') {
    return readFileAsync(parsed);
  } else if (parsed.protocol === 'data:') {
    const match = DATA_URL_PATTERN.exec(parsed.pathname);
    if (!match) {
      throw new ERR_INVALID_URL(url);
    }
    const [ , base64, body ] = match;
    return Buffer.from(body, base64 ? 'base64' : 'utf8');
  } else {
    throw new ERR_INVALID_URL_SCHEME(['file', 'data']);
  }
}

function errPath(url) {
  const parsed = new URL(url);
  if (parsed.protocol === 'file:') {
    return fileURLToPath(parsed);
  }
  return url;
}
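
A standalone sketch of the `data:` branch of `getSource` above, assuming the global `URL` and `Buffer` of current Node.js releases. `decodeDataURL` and `PATTERN` are hypothetical names chosen for this illustration, and plain `TypeError`s stand in for the internal error codes. Like the diff, the sketch neither percent-decodes the body nor parses MIME parameters.

```javascript
// Same expression as DATA_URL_PATTERN in the diff: MIME type, optional
// ";base64" marker, a comma, then the body.
const PATTERN = /^[^/]+\/[^,;]+(;base64)?,([\s\S]*)$/;

// Hypothetical helper mirroring the data: branch of getSource().
function decodeDataURL(url) {
  const parsed = new URL(url);
  if (parsed.protocol !== 'data:') {
    throw new TypeError(`Expected a data: URL, got ${parsed.protocol}`);
  }
  const match = PATTERN.exec(parsed.pathname);
  if (!match) {
    throw new TypeError(`Malformed data: URL: ${url}`);
  }
  const [, base64, body] = match;
  // The body is taken as-is; only the base64 marker changes the decoding.
  return Buffer.from(body, base64 ? 'base64' : 'utf8');
}

// Plain form: the body follows the comma unchanged.
const plain = decodeDataURL('data:application/json,{"x":1}').toString();

// Base64 form: encode at runtime so the example stays self-checking.
const b64 = Buffer.from('export default 1;').toString('base64');
const encoded = decodeDataURL(`data:text/javascript;base64,${b64}`).toString();

console.log(plain);
console.log(encoded);
```

A URL such as `data:text/plain;charset=utf-8,hi` does not match this pattern, which is the missing MIME parameter parsing mentioned in the review thread.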

function initializeImportMeta(meta, { url }) {
meta.url = url;
}
@@ -44,7 +73,7 @@ async function importModuleDynamically(specifier, { url }) {

// Strategy for loading a standard JavaScript module
translators.set('module', async function moduleStrategy(url) {
-  const source = `${await readFileAsync(new URL(url))}`;
+  const source = `${await getSource(url)}`;
debug(`Translating StandardModule ${url}`);
const { ModuleWrap, callbackMap } = internalBinding('module_wrap');
const module = new ModuleWrap(source, url);
@@ -111,26 +140,32 @@ translators.set('builtin', async function builtinStrategy(url) {
 translators.set('json', async function jsonStrategy(url) {
   debug(`Translating JSONModule ${url}`);
   debug(`Loading JSONModule ${url}`);
-  const pathname = fileURLToPath(url);
-  const modulePath = isWindows ?
-    StringPrototype.replace(pathname, winSepRegEx, '\\') : pathname;
-  let module = CJSModule._cache[modulePath];
-  if (module && module.loaded) {
-    const exports = module.exports;
-    return createDynamicModule([], ['default'], url, (reflect) => {
-      reflect.exports.default.set(exports);
-    });
+  const pathname = url.startsWith('file:') ? fileURLToPath(url) : null;
+  let modulePath;
+  let module;
+  if (pathname) {
+    modulePath = isWindows ?
+      StringPrototype.replace(pathname, winSepRegEx, '\\') : pathname;
+    module = CJSModule._cache[modulePath];
+    if (module && module.loaded) {
+      const exports = module.exports;
+      return createDynamicModule([], ['default'], url, (reflect) => {
+        reflect.exports.default.set(exports);
+      });
+    }
   }
-  const content = await readFileAsync(pathname, 'utf-8');
-  // A require call could have been called on the same file during loading and
-  // that resolves synchronously. To make sure we always return the identical
-  // export, we have to check again if the module already exists or not.
-  module = CJSModule._cache[modulePath];
-  if (module && module.loaded) {
-    const exports = module.exports;
-    return createDynamicModule(['default'], url, (reflect) => {
-      reflect.exports.default.set(exports);
-    });
+  const content = `${await getSource(url)}`;
+  if (pathname) {
+    // A require call could have been called on the same file during loading and
+    // that resolves synchronously. To make sure we always return the identical
+    // export, we have to check again if the module already exists or not.
+    module = CJSModule._cache[modulePath];
+    if (module && module.loaded) {
+      const exports = module.exports;
+      return createDynamicModule(['default'], url, (reflect) => {
+        reflect.exports.default.set(exports);
+      });
+    }
   }
   try {
     const exports = JsonParse(stripBOM(content));
@@ -143,10 +178,12 @@ translators.set('json', async function jsonStrategy(url) {
     // parse error instead of just manipulating the original error message.
     // That would allow to add further properties and maybe additional
     // debugging information.
-    err.message = pathname + ': ' + err.message;
+    err.message = errPath(url) + ': ' + err.message;
     throw err;
   }
-  CJSModule._cache[modulePath] = module;
+  if (pathname) {
+    CJSModule._cache[modulePath] = module;
+  }
   return createDynamicModule([], ['default'], url, (reflect) => {
     debug(`Parsing JSONModule ${url}`);
     reflect.exports.default.set(module.exports);
@@ -155,14 +192,13 @@ translators.set('json', async function jsonStrategy(url) {

 // Strategy for loading a wasm module
 translators.set('wasm', async function(url) {
-  const pathname = fileURLToPath(url);
-  const buffer = await readFileAsync(pathname);
+  const buffer = await getSource(url);
   debug(`Translating WASMModule ${url}`);
   let compiled;
   try {
     compiled = await WebAssembly.compile(buffer);
   } catch (err) {
-    err.message = pathname + ': ' + err.message;
+    err.message = errPath(url) + ': ' + err.message;
     throw err;
   }

63 changes: 63 additions & 0 deletions test/es-module/test-esm-data-urls.js
@@ -0,0 +1,63 @@
// Flags: --experimental-modules
'use strict';
const common = require('../common');
const assert = require('assert');
function createURL(mime, body) {
  return `data:${mime},${body}`;
}
function createBase64URL(mime, body) {
  return `data:${mime};base64,${Buffer.from(body).toString('base64')}`;
}
(async () => {
  {
    const body = 'export default {a:"aaa"};';
    const plainESMURL = createURL('text/javascript', body);
    const ns = await import(plainESMURL);
    assert.deepStrictEqual(Object.keys(ns), ['default']);
    assert.deepStrictEqual(ns.default.a, 'aaa');
    const importerOfURL = createURL(
      'text/javascript',
      `export {default as default} from ${JSON.stringify(plainESMURL)}`
    );
    assert.strictEqual(
      (await import(importerOfURL)).default,
      ns.default
    );
    const base64ESMURL = createBase64URL('text/javascript', body);
    assert.notStrictEqual(
      await import(base64ESMURL),
      ns
    );
  }
  {
    const body = 'export default import.meta.url;';
    const plainESMURL = createURL('text/javascript', body);
    const ns = await import(plainESMURL);
    assert.deepStrictEqual(Object.keys(ns), ['default']);
    assert.deepStrictEqual(ns.default, plainESMURL);
  }
  {
    const body = '{"x": 1}';
    const plainESMURL = createURL('application/json', body);
    const ns = await import(plainESMURL);
    assert.deepStrictEqual(Object.keys(ns), ['default']);
    assert.deepStrictEqual(ns.default.x, 1);
  }
  {
    const body = '{"default": 2}';
    const plainESMURL = createURL('application/json', body);
    const ns = await import(plainESMURL);
    assert.deepStrictEqual(Object.keys(ns), ['default']);
    assert.deepStrictEqual(ns.default.default, 2);
  }
  {
    const body = 'null';
    const plainESMURL = createURL('invalid', body);
    try {
      await import(plainESMURL);
      common.mustNotCall()();
    } catch (e) {
      assert.strictEqual(e.code, 'ERR_INVALID_RETURN_PROPERTY_VALUE');
    }
  }
})().then(common.mustCall());
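
The last test case depends on the resolver producing a `null` format for an unrecognized MIME type. The following is a minimal sketch of that lookup, mirroring the table and expression from `default_resolve.js` in this diff. `FORMATS` and `formatOf` are hypothetical names, the `application/wasm` entry is left out for brevity, and the sketch assumes the global `URL` class of current Node.js releases.

```javascript
// Mirrors the lookup table in the diff. With a null prototype, only the keys
// listed here can ever match a lookup.
const FORMATS = {
  '__proto__': null,
  'text/javascript': 'module',
  'application/json': 'json',
};

// Hypothetical helper: map a data: URL to a loader format, or null when the
// MIME type is missing, malformed, or not in the table.
function formatOf(url) {
  const { protocol, pathname } = new URL(url);
  if (protocol !== 'data:') return null;
  const match = /^([^/]+\/[^;,]+)(;base64)?,/.exec(pathname);
  const mime = match ? match[1] : null;
  return FORMATS[mime] || null;
}

console.log(formatOf('data:text/javascript,export default 1')); // module
console.log(formatOf('data:application/json,{"x": 1}'));        // json
console.log(formatOf('data:invalid,null'));                     // null
```

In the PR, a `null` format reaches the loader, which then rejects the resolve result; the test asserts on the resulting `ERR_INVALID_RETURN_PROPERTY_VALUE` code.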