Skip to content
This repository has been archived by the owner on Sep 2, 2023. It is now read-only.

import assertions RFC #427

Closed
bmeck opened this issue Nov 11, 2019 · 53 comments
Closed

import assertions RFC #427

bmeck opened this issue Nov 11, 2019 · 53 comments

Comments

@bmeck
Copy link
Member

bmeck commented Nov 11, 2019

Cross org linking is broken, see tc39/proposal-import-attributes#25

@bmeck bmeck added the modules-agenda To be discussed in a meeting label Nov 11, 2019
@ljharb
Copy link
Member

ljharb commented Nov 11, 2019

I’m not sure it’s worth meeting time until it’s been presented to tc39.

@jkrems
Copy link
Contributor

jkrems commented Nov 11, 2019

It looks like Daniel is looking for signals from the node side for the December meeting:

The plan is to collect feedback in issues, and present for Stage 1 in the December 2019 TC39 meeting.

So it seems like it should be discussed before the next TC39 meeting?

@ljharb
Copy link
Member

ljharb commented Nov 11, 2019

Our group should provide feedback, surely, but that can be done on GitHub - it doesn’t require consensus, so it doesn’t seem like it should take meeting time.

If folks want to talk about it, ofc we can.

@bmeck
Copy link
Member Author

bmeck commented Nov 11, 2019

@ljharb i'd expect the meeting time currently to only be devoted towards gaining agreement to move discussion to w/e location is apt. If the WG wants to not be involved we can avoid spending time on it as a group, but currently our model is to use meetings as our consensus building point. Without consensus discussion on a per individual level is somewhat awkward when acting on behalf of the WG.

@ljharb
Copy link
Member

ljharb commented Nov 11, 2019

Fair point.

@littledan
Copy link

I'll be happy to get more feedback from you all here. @ljharb has commented on integration with Node, e.g., in this comment. Does this comment reflect the consensus position of the modules team?

So far, the biggest concerns I've heard about this proposal have been from participants in this group. Feedback from some other people I talked with so far (including folks who didn't file issues) was generally positive/working through details, though of course we'll get broader feedback as time goes on.

If you all have time, I'd really appreciate your feedback here. I'm really trying to understand if this is a viable approach from a Node perspective.

@GeoffreyBooth
Copy link
Member

I wouldn’t say that @ljharb’s statements necessarily reflect a consensus of the modules team, no. We haven’t discussed the issue as a group yet.

In particular, the quote that you have in your readme:

"The author is the only true arbiter of the parse goal of content" -- Jordan H.

Is not a universally held opinion. It isn’t clear to me whether you presented this as attributed to Jordan meaning that it’s just his opinion, or rather a pearl of wisdom brought forth from him, so I’m not sure what you mean by including it. But there have been many long debates within this group regarding author-defined versus consumer-defined disambiguation, and there is no consensus on that point.

littledan added a commit to tc39/proposal-import-attributes that referenced this issue Nov 18, 2019
The quote is on a debated point, as explained in nodejs/modules#427 (comment) . We can discuss this sort of thing further in the issues.
littledan added a commit to tc39/proposal-import-attributes that referenced this issue Nov 18, 2019
The quote is on a debated point, as explained in nodejs/modules#427 (comment) . We can discuss this sort of thing further in the issues.
@bmeck
Copy link
Member Author

bmeck commented Nov 18, 2019

@GeoffreyBooth the disambiguation argument is quite different from the quote at hand. I disagree with your statement of us having discussed that specific quote at length regarding it in isolation. However, removing the quote seems fine.

@littledan
Copy link

I've removed the quote; thanks for the context.

@GeoffreyBooth
Copy link
Member

I wasn't saying that we've discussed the quote, just that we've discussed disambiguation; and that discussion hasn't led to any firm conclusion.

@bmeck
Copy link
Member Author

bmeck commented Nov 21, 2019

We discussed this in the last meeting https://github.com/nodejs/modules/blob/master/doc/meetings/2019-11-06.md . It seems there is a variety of opinions for/against; of note, one potential implementation concern that was brought up is how CJS based loading could be a bit problematic.


Looking at our code this affects our current loading of JSON, since JSON Modules are still experimental in Node, how do people @here feel about decoupling the JSON translator from the CJS cache?

I personally feel it is fine to separate them even if it could lead to loading the same location in 2 different ways and thus resulting in multiple load operations/JSON values.

@bmeck
Copy link
Member Author

bmeck commented Nov 22, 2019

I've made a PR to at least kick start discussion of removing the CJS cache synchronization for JSON modules.

@MylesBorins MylesBorins removed the modules-agenda To be discussed in a meeting label Nov 25, 2019
@MylesBorins
Copy link
Contributor

dropping agenda label as we discussed this last week

@bmeck
Copy link
Member Author

bmeck commented Dec 6, 2019

Module Attributes has achieved Stage 1

@MylesBorins
Copy link
Contributor

FYI there is a discussion happening in the openjs standards working group about this proposal. If folks could chime in it would be helpful

openjs-foundation/standards#91

@bmeck bmeck changed the title Module Attributes RFC import assertions RFC Aug 4, 2020
@bmeck
Copy link
Member Author

bmeck commented Aug 4, 2020

in the latest spec for this proposal (now called "import assertions") the cache key computed for a module takes assertions as components, this needs to be integrated somehow with our loader designs.

@jkrems
Copy link
Contributor

jkrems commented Aug 4, 2020

the cache key computed for a module takes assertions as components, this needs to be integrated somehow with our loader designs.

Are we still trying to prevent multiple entries with the same URL but different assertions? Or are we accepting potential duplication based on assertions? If we want to ensure a single entry per URL, I assume that for our implementation using just the URL as the key would be valuable?

@weswigham
Copy link
Contributor

Isn't the only security-important attribute something like {declarative: true} (to indicate that the import should not trigger execution)? Can node just... not support type? Since node doesn't load non-code imports (except json), doing so would allow package authors to still have transparent (ie, non breaking) migrations of implementations?

@ljharb
Copy link
Member

ljharb commented Aug 4, 2020

The intention, i believe, is that everyone is required to support type json, at least (but not required to restrict json imports to those with the type)

@bmeck bmeck added the modules-agenda To be discussed in a meeting label Aug 4, 2020
@bmeck
Copy link
Member Author

bmeck commented Aug 4, 2020

Added to agenda since it could block upstream

@weswigham
Copy link
Contributor

weswigham commented Aug 4, 2020

Errr, but, eg non-script html modules, or xml or css documents may also be declarative. I feel like hitching the security aspect to specific content encodings is maybe not sufficient. Minimally, if the runtime supports, say, yaml modules in addition to json (which is somewhat reasonable in many places), one could reasonably expect a package author to safely migrate from json to yaml or vice-versa, and it'd be really odd, if under the hood import x from "mod/conf" if {type: "json"} failed because it was a loadable yaml document instead, when all you really wanted to imply was import x from "mod/conf" if {declarative: true}. Asserting the invariants you actually want upheld just seems way more straightforward than content type malarkey. Who's to say json8 in some hypothetical future doesn't implement some kind of execution-like behavior (looking at you xml) that should be similarly limited? Minimally, asserting things like the executability of a thing are much easier to derive author intent from.

@ljharb
Copy link
Member

ljharb commented Aug 5, 2020

Hopefully the proposal champions can weigh in on why a “no execute” bit was insufficient.

@bmeck
Copy link
Member Author

bmeck commented Aug 5, 2020

@ljharb no execute is a vastly more complex and ambiguous task than asserting a content type (consider a module that only has function declarations but no executed code for example). You have to take into account nesting sub resources locked into being limited, content type is a single layer deep. I feel strongly that "no execute" is not viable in any short term and requires a massive undertaking and is more than what is needed to unblock the main driving feature of import assertions.

@dandclark
Copy link

I agree with @bmeck's comment regarding noexecute. @rniwa also pointed out here that:

Execution vs. no execution is an important distinction but changing the parser mode based on MIME type isn't great either. We've definitely had security attacks which used an existing content by reinterpreting it in a different text encoding & MIME type.

@dandclark
Copy link

Regarding the other discussion in this thread:

I’m pretty sure the current proposed spec merely allows creating a new module record for a new specifier/assertions combination, and never requires it.

That is correct. The proposal leaves it to hosts to decide whether assertions are part of the module cache key (see https://tc39.es/proposal-import-assertions/#sec-hostresolveimportedmodule), although it recommends that assertions are not part of the key. An earlier version required that they were not part of the cache key, but this restriction was loosened based on feedback from proposal's integration with HTML. HTML is going to consider assertions part of the module cache key.

However, it is still a requirement that assertions do not affect the interpretation of the module (other than potentially causing it to fail to load). So, things like changing the HTTP headers depending on the assertions would not be permitted.

Thus, given these two statements it is up to the host whether or not these result in two module instances (assuming neither set of assertions results in failure to load):

import thing from "./url" assert {foo: "bar"};
import thing2 from "./url" assert {foo2: "bar2"};

However, if the host chooses to treat these as two module instances, then the instances must be interpreted exactly the same. They must have the same HTTP request headers, they must be parsed and executed the same, etc.

@ljharb
Copy link
Member

ljharb commented Aug 5, 2020

In other words, they must be the same conceptual module, although nobody has figured out how to better word that normative requirement in the spec yet.

The easiest way to fulfill both the letter and the intent of the spec is to make import * as one from './url' assert {foo: "bar"}; import * as two from './url' assert {foo2: "bar2"}; one === two result in true. The only environments where this isn't possible are web browsers (and deno) because they fetch content from an untrusted location (the internet), and the web server can't be properly constrained in this way. node does not have this security/usability/certainty hole, and as such, it shouldn't need to consider the assertions as part of the cache key.

@jkrems
Copy link
Contributor

jkrems commented Aug 5, 2020

node does not have this security/usability/certainty hole, and as such, it shouldn't need to consider the assertions as part of the cache key.

But that means that code using assertions wouldn't be portable between browsers and node - assuming unknown assertions are ignored[1]. It would be valid to write otherwise portable code that uses assertion to "bust the cache" in the browser - but that code would fail in node (and vice versa). I would argue strongly for following the execution semantics of the browser in node. And if adding an "nonsense assertion" ({thisWillNeverBeActually: "checked"}) causes an additional evaluation of the imported module in one but not in the other, that would be a change in execution semantics to me.

[1] Are unknown assertions ignored? I assume they must be because of future compat.

@ljharb
Copy link
Member

ljharb commented Aug 5, 2020

@jkrems import assertions are explicitly not designed, intended, or permitted to "cache bust", so any code written with them to do so is already incorrect/broken.

@dandclark
Copy link

[1] Are unknown assertions ignored? I assume they must be because of future compat.

This is also left up to hosts. The current status of the HTML integration is to fail on unrecognized assertions.

This seems important from a security perspective. If I'm asserting some security-related restriction on the thing I'm importing, if the implementation can't guarantee that the assertion is true because the assertion is not yet implemented, then I'd rather it cause a failure than just be ignored.

@weswigham
Copy link
Contributor

no execute is a vastly more complex and ambiguous task than asserting a content type (consider a module that only has function declarations but no executed code for example).

That's a huge broadening of what I'd expected a "declarative" assertion to mean - it shouldn't introsoect specific code file contents; it should just be banning all potentially executable formats; which seems much easier. (And also isn't "deep" in any way)

@ljharb
Copy link
Member

ljharb commented Aug 5, 2020

@weswigham it would have to be deep, a module that lacks permissions to execute shouldn't be able to pull in malicious code that can execute, directly or transitively.

@jkrems
Copy link
Contributor

jkrems commented Aug 5, 2020

@ljharb You can define the requirements also as "only things that are necessarily leaf nodes" (like JSON is) are allowed. The argument here is that you can "easily" define a set of importable files that by their nature don't lead to execution, without looking at the file contents. And any file where you would have to look at the contents, would always fail that assertion. That seems to be what {type: "json"} is meant to express, just in a somewhat roundabout way.

@weswigham
Copy link
Contributor

it would have to be deep, a module that lacks permissions to execute shouldn't be able to pull in malicious code that can execute, directly or transitively.

What nonexecutable formats can pull in executable dependencies? I'd struggle to call the format "nonexecutable" if that were the case.

@ljharb
Copy link
Member

ljharb commented Aug 6, 2020

@weswigham HTML Modules.

@weswigham
Copy link
Contributor

I'd be inclined to say that if they can include executable content (even via import), then they should just be considered executable themselves.

@xtuc
Copy link
Contributor

xtuc commented Aug 8, 2020

About

import 'foo' if { type: 'commonjs' };

I don't think we should encourage hosts to define subsets of JavaScript, even if they are allowed to. That will make the module less portable across environments.
For this specific example, I think this could be a evaluator/transformation attribute. The host could translate the module to the user's preference.


Wasm is another instance of module that a user can mark as non-executable but that could import executable JavaScript on its own.

@bmeck
Copy link
Member Author

bmeck commented Sep 8, 2020

The Import Assertions champions are seeking to ask for Stage 3 this coming TC39 meeting, we should try and iron out if there are any actionable feedback items we wish to ask for. In particular the web browser restriction on cache key seems something I would remain uncomfortable with keeping in the import assertions proposals since node may wish to make use of matching the web.

@MylesBorins MylesBorins removed the modules-agenda To be discussed in a meeting label Sep 23, 2020
@bmeck
Copy link
Member Author

bmeck commented Jan 21, 2021

closing as i believe there is nothing else needed here

@bmeck bmeck closed this as completed Jan 21, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants