-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: dom.createPort() #679
base: main
Are you sure you want to change the base?
Conversation
This adds a proposal for a new dom.createPort() method, whic will allow communication between different javscript "worlds". It leverages the proposed dom.execute() method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const foo = getFoo(); | ||
port.sendMessage({foo}); | ||
} else if (message == 'getBar') { | ||
const bar = getBar(); | ||
port.sendMessage({bar}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aligning with the code below to make the examples more readable / prevent perceived differences where they do not apply.
const foo = getFoo(); | |
port.sendMessage({foo}); | |
} else if (message == 'getBar') { | |
const bar = getBar(); | |
port.sendMessage({bar}); | |
port.sendMessage({foo: getFoo()}); | |
} else if (message == 'getBar') { | |
port.sendMessage({bar: getBar()}); |
} | ||
|
||
// The rest of this code executes in the content script world. | ||
const mainWorldPort = browser.dom.createPort({world: 'MAIN'}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering the port needs to be passed to the main world anyway. What is the reason to require explicitly specifying the world?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above from Oliver's comment. This is to make it explicit and obvious to both the browser and the developer that this port is implicitly bound to one world, and disallow any accidental passing to another world or multiple worlds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left some comments but also suggestions to specify underspecified behavior. For ease of editing, you can commit these suggestions with Github's UI (https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/incorporating-feedback-in-your-pull-request#applying-suggested-changes) and then use git pull origin refs/pull/679/head
to pull the new changes locally.
I'll use "Squash & Merge" before committing the PR, so the number of intermediate commits do not matter.
browser.dom.execute( | ||
{ | ||
func: setUpMainWorld, | ||
args: mainWorldPort, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
args: mainWorldPort, | |
args: [mainWorldPort], |
|
||
interface MessagePort { | ||
sendMessage(args?: any): void | ||
onMessage(args?: any): void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm that you are intentionally using a new design here:
There is no precedent in extension APIs for using properties to control API behavior - API interaction is always through methods. Some APIs can accept function members on objects passed to functions (e.g. contextMenus.update(id, { onclick })
), but there is no namespace whose behavior changes through properties.
If this new pattern is not intentional, setMessageListener( MessageListenerCallback? );
would fit the usual conventions (I don't mention getMessageListener
because there is presumably no need to change the listener later).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant to update this earlier. This should be an event, not a function. In our yet-to-be-finalized new syntax, this would be something like:
callback OnMessageListener = void (any message);
interface OnMessageEvent : ExtensionEvent {
static void addListener(OnMessageListener listener);
static void removeListener(OnMessageListener listener);
static boolean hasListener(OnMessageListener listener);
}
interface MessagePort {
static attribute OnMessageEvent onMessage;
...
}
Not sure if we want to use that here, or a more typescript-centric approach; lemme know if you have a preference
|
||
### Open Web API | ||
|
||
The open web is unaware of the concept of multiple Javascript worlds, so this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While the open web does not support the concept of multiple worlds, it does already have a concept of a MessagePort
(https://developer.mozilla.org/en-US/docs/Web/API/MessagePort). It's worth calling this out explicitly, and also mention that we cannot use this because (1) it is async and (2) the use of web prototypes would enable web pages to intercept messages (e.g. through MessagePort.prototype
or Event.prototype
).
Here is an example:
mc = new MessageChannel();
mc.port1.onmessage = e => console.log(e.data);
mc.port2.postMessage(2);
console.log(3);
Result:
3
2
// ( 2 logged after 3 means that the message delivery was async )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we rename MessagePort
to SyncMessagePort
? That would both address the naming collision called out in the previous comment and describe what is different about this interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we rename MessagePort to SyncMessagePort?
Rather MessagePortSync
because Sync
at the end is the established pattern in the web platform (FileEntrySync) and nodejs (fs.readFileSync), whereas Sync
at the beginning indicates it being the primary subject e.g. SyncEvent is for synchronization events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we cannot use MessagePort because it is async
Indeed, so maybe just make a new simpler API that doesn't use the concept of ports, but just establishes one secure synchronous channel.
|
||
## Implementation Notes | ||
|
||
N/A. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expand with notes on the prototype (maybe at the behavior section?):
N/A. | |
### Prototype | |
To avoid interception by web pages, the prototype chain of the `MessagePort` | |
instance should terminate at `null` instead of `Object.prototype`. | |
Concretely, this means that an instance of `MessagePort` could either inherit | |
from a (hidden) MessagePort class \* (which itself has a `null` Prototype), or | |
be a plain object with a null Prototype and all specified members as own | |
property descriptors. | |
\* hidden means: not exposed as a global class. If anyone (e.g. the web page) | |
ever gets access to a `MessagePort` instance, it may see the hidden prototype | |
via `port.__proto__` or `port.constructor.prototype` and through that intercept | |
& interfere with all future port communications. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to find an existing web specification that has interfaces that define an interface or object that inherits from null
. The closest thing I was able to find was the create an exports object algorithm from the WebAssembly JavaScript Interface working draft.
} | ||
}; | ||
|
||
// These notify the content script of changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// These notify the content script of changes. | |
// These notify the content script of changes. | |
// NOTE: addEventListener used in this example is not part of the API. |
Adding emphasis on the fact that addEventListener
is NOT part of the API. Because if it were, it would easily be intercepted by the web page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack; will incorporate
### Additional Security Considerations | ||
|
||
N/A. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm suggesting to include a short write-up of the brittle safety offered and not-offered by the API. This clarifies what the API does and does not do to those who are not experts in the subject.
#### Secrecy of data | |
This API is designed to offer a secure communication channel between one world | |
and an untrusted world. It is the responsibility of the extension to maintain | |
further secrecy beyond that. In general, this requires avoiding the use of | |
prototype members. | |
For example, imagine a func that reads `message.foo` in the main world. This | |
is safe if `message` is always an object with the `foo` member. If this is not | |
the case, then it is a language feature of JavaScript to look up the `foo` | |
member in the prototype chain. Most objects ultimately inherit from | |
`Object.prototype`, which can be manipulated by other (untrusted) scripts to | |
intercept the message. | |
Example: | |
```javascript | |
// Attack: | |
Object.defineProperty(Object.prototype, "foo", { | |
get() { console.log("Intercepted via foo:", this); return "intercepted" } | |
}); | |
// Safe use: foo always defined, web page cannot intercept access. | |
message = { secret: 1, foo: true }; | |
console.log("message.foo=", message.foo); | |
// Logs "message.foo= true" | |
// Unsafe use: foo not set; web page can intercept access attempt. | |
message = { secret: 1 }; | |
console.log("message.foo=", message.foo); | |
// Logs "Intercepted via foo: { secret: 1 } | |
// Logs "message.foo= intercepted" | |
``` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, will incorporate
sendMessage(args?: any): void | ||
onMessage(args?: any): void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sendMessage(args?: any): void | |
onMessage(args?: any): void | |
sendMessage(message?: any): void | |
onMessage(message?: any): void |
To avoid confusion with args
in dom.execute
, suggest renaming to message
.
A message port can be passed to another world (in order to establish a | ||
connection) by leveraging the new `dom.execute()` API. `dom.execute()` will be | ||
expanded to have special serialization logic for the MessagePort type. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a more explicit description of the message dispatch semantics. This defines synchronicity, error handling and non-errors.
### Message dispatch | |
Upon calling `sendMessage`, `onMessage` is synchronously called before | |
`sendMessage` returns. `sendMessage` returns successfully if `onMessage` has | |
been called, even if the function itself threw an error. | |
`sendMessage` may throw if the input message cannot be serialized. | |
`sendMessage` may throw if the return value cannot be serialized. | |
`sendMessage` may throw when the recipient is unavailable. For example, if the | |
recipient did not assign an `onMessage` listener. Or if the port was never | |
delivered to the other world. | |
I am also willing to use the phrase throws
instead of may throw
if you agree with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to touch on the synchronocity aspect a bit more here (and I think return values are a natural byproduct of that). I know we discussed this briefly before, but I'd like some more clarity on what this enables. Given we have a synchronous dom.execute(), what the use cases in which synchronous message passing (as opposed to just FIFO) to a port is critical? Is it for cases where the main world needs synchronous execution with the extension world? If so, why? Or for some other use case?
A not-necessarily-synchronous design leaves this more open -- we could potentially have a single world blocked while another is unblocked. I'm honestly not sure if this exists today in Chromium (I'm not sure about other browsers), but I could see definite situations in which it could (e.g., we might pause the main world, but allow extensions to continue running). If we guarantee this synchronocity of message passing, that means we also need to block execution in the extension world on any execution that may be happening in the main world (preventing extensions from doing work they may otherwise be able to).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, synchronous message passing enables the already injected main world script to reach back to the isolated world to look up relevant state that the script in the main world needs.
Other extension authors have already commented on some use cases at #284
As an example of a concrete use case from personal experience: I have an extension called Don't Track Me Google, which is just a content script that ensures that becomes
. Among other things, the extension intercepts the setter of
HTMLAnchorElement.prototype.href`, to hook in on calls from the web page. It sanitizes it and calls the original setter.
To minimize interference from the web page, the sanitization logic should run in an isolated world. And because the original caller expects a synchronous API, the messaging API has to be synchronous.
This specific use case also shows why sometimes it is desirable to be able to receive DOM objects, so that the page and caller can pass the DOM elements (that they already share!) without requiring appending to the DOM.
Note: this extension was originally a MV2 extension. To convert to MV3, I had to duplicate the sanitization logic in the main world and use a shared DOM element for communicating specific state (preferences from the options page) from the content script to the main world script: https://github.com/Rob--W/dont-track-me-google/blob/a73ab8b3bfa9794c75cb4e111c6424e31e6c019e/main_world_script.js. The shared DOM element can be detected by the web page. In my case it's a reasonable trade-off, especially because I don't expect Google web properties to intentionally break the extension. This assumption is not necessarily true for extensions that are privacy-focused and designed for the whole web.
A not-necessarily-synchronous design leaves this more open -- we could potentially have a single world blocked while another is unblocked. I'm honestly not sure if this exists today in Chromium (I'm not sure about other browsers), but I could see definite situations in which it could (e.g., we might pause the main world, but allow extensions to continue running). If we guarantee this synchronocity of message passing, that means we also need to block execution in the extension world on any execution that may be happening in the main world (preventing extensions from doing work they may otherwise be able to).
There is already a synchronous communication between worlds - they share the same DOM, and therefore DOM events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given we have a synchronous dom.execute()
Abusing it to pass messages synchronously would be very inefficient as running code requires a lot more internal work on top of passing the arguments. Userscripts may exchange thousands of messages between the worlds.
use cases in which synchronous message passing [...] is critical
It allows implementing simple super-performant API via object getters exposed to the userscript or main worlds that receive the actual info from the secure world synchronously.
Tampermonkey, Violentmonkey, Firemonkey would like to keep supporting the legacy GM_xxx API, which is synchronous. This alone seems critical as there are thousands of userscripts that still work even though they are abandoned and no longer developed.
we could potentially have a single world blocked while another is unblocked
They share the same JS/DOM thread, so what you describe is impossible.
If we guarantee this synchronocity of message passing, that means we also need to block execution in the extension world on any execution that may be happening in the main world (preventing extensions from doing work they may otherwise be able to).
No, we just abandon the idea of MessagePort altogether and use CustomEvent mechanism internally, which already does it synchronously and which is what userscript extensions have been always using. The side benefit would be the API may be simplified as there's no need for a port.
|
||
A message port can be passed to another world (in order to establish a | ||
connection) by leveraging the new `dom.execute()` API. `dom.execute()` will be | ||
expanded to have special serialization logic for the MessagePort type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expanded to have special serialization logic for the MessagePort type. | |
expanded to have special serialization logic for the MessagePort type. | |
Once a message port has been passed to a `dom.execute` call, it cannot be passed | |
again - an error will be thrown because it is not structurally cloneable. | |
I'm explicitly noting that the port cannot be reused. And also emphasizing that the port is not structurally cloneable, because making the type cloneable could be a trivial way to make it fit in the dom.execute
requirement of args
being strucurally cloneable (which would trigger a bunch of new issues, such as the "clone" potentially inverting roles without dom.execute
, e.g. when calling structuredClone(port)
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed; will incorporate
declare namespace dom { | ||
interface PortProperties { | ||
world: ExecutionWorld; | ||
worldId?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both @carlosjeurissen and @xeenon had mentioned that this might be redundant, and I can't immediately see a strong need for this (it is only useful if an extension sends the port to multiple worlds, which seems unlikely - and we might be able to just make it so the port only works in the first world it is sent to). Any thoughts on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wasn't to support multiple worlds (this wouldn't do that, since this still only supports one), but rather to ensure that:
a) a developer is aware that a port is bound to a single world. This is distinct from the existing uses of ports, which are a many-to-one relationship between receivers and senders.
b) a developer can't accidentally sending a port to an incorrect world.
Technically, this is redundant, since it could be derived from the first world it's sent to (and, since dom.executeScript() is proposed to be synchronous, there in theory aren't any TOCTOU issues here?), but that still seems a bit less obvious to me than binding the port at the start. (And makes browser implementation slightly more complex, though we can always work around that.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure what you mean by "this might be redundant". Are you saying that worldId
is not necessary, that world
should not be required if worldId
is present, or something else?
The Allow multiple user script worlds proposal introduced the concept of worldId
(along side world
) as a property of RegisteredUserScript
. Right now the presence of both properties on PortProperties
follows that pattern. The main difference between how these properties are used is that this proposal defines world
as a required property while in the other prooposal its optional. That allows developers to omit the world
property when targeting a specific user script world or to omit worldId
when targeting MAIN
or USER_SCRIPTS
worlds.
The only scenario I can think of where requiring world
might be useful is if we anticipate allowing other types of script injection to occur in separate, isolated worlds. Or, at least want to reserve the ability to introduce such a concept in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying that worldId is not necessary
This was the intended meaning of my original comment. Thinking about it some more (and as Devlin touched on below) the world
property is also not strictly necessary.
I can see the argument for requiring these, but right now I'm not convinced the benefit is worth it. If a developer tries to send a port to a second world, we can easily throw a "Port cannot be sent to multiple worlds" error.
That seems like enough for developers to discover the right path without making the API more verbose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, folks! Just responding to open comments to seed some more discussion; haven't made changes to the commit just yet.
declare namespace dom { | ||
interface PortProperties { | ||
world: ExecutionWorld; | ||
worldId?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wasn't to support multiple worlds (this wouldn't do that, since this still only supports one), but rather to ensure that:
a) a developer is aware that a port is bound to a single world. This is distinct from the existing uses of ports, which are a many-to-one relationship between receivers and senders.
b) a developer can't accidentally sending a port to an incorrect world.
Technically, this is redundant, since it could be derived from the first world it's sent to (and, since dom.executeScript() is proposed to be synchronous, there in theory aren't any TOCTOU issues here?), but that still seems a bit less obvious to me than binding the port at the start. (And makes browser implementation slightly more complex, though we can always work around that.)
|
||
interface MessagePort { | ||
sendMessage(args?: any): void | ||
onMessage(args?: any): void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant to update this earlier. This should be an event, not a function. In our yet-to-be-finalized new syntax, this would be something like:
callback OnMessageListener = void (any message);
interface OnMessageEvent : ExtensionEvent {
static void addListener(OnMessageListener listener);
static void removeListener(OnMessageListener listener);
static boolean hasListener(OnMessageListener listener);
}
interface MessagePort {
static attribute OnMessageEvent onMessage;
...
}
Not sure if we want to use that here, or a more typescript-centric approach; lemme know if you have a preference
|
||
A message port can be passed to another world (in order to establish a | ||
connection) by leveraging the new `dom.execute()` API. `dom.execute()` will be | ||
expanded to have special serialization logic for the MessagePort type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed; will incorporate
A message port can be passed to another world (in order to establish a | ||
connection) by leveraging the new `dom.execute()` API. `dom.execute()` will be | ||
expanded to have special serialization logic for the MessagePort type. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to touch on the synchronocity aspect a bit more here (and I think return values are a natural byproduct of that). I know we discussed this briefly before, but I'd like some more clarity on what this enables. Given we have a synchronous dom.execute(), what the use cases in which synchronous message passing (as opposed to just FIFO) to a port is critical? Is it for cases where the main world needs synchronous execution with the extension world? If so, why? Or for some other use case?
A not-necessarily-synchronous design leaves this more open -- we could potentially have a single world blocked while another is unblocked. I'm honestly not sure if this exists today in Chromium (I'm not sure about other browsers), but I could see definite situations in which it could (e.g., we might pause the main world, but allow extensions to continue running). If we guarantee this synchronocity of message passing, that means we also need to block execution in the extension world on any execution that may be happening in the main world (preventing extensions from doing work they may otherwise be able to).
} | ||
}; | ||
|
||
// These notify the content script of changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack; will incorporate
} | ||
|
||
// The rest of this code executes in the content script world. | ||
const mainWorldPort = browser.dom.createPort({world: 'MAIN'}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above from Oliver's comment. This is to make it explicit and obvious to both the browser and the developer that this port is implicitly bound to one world, and disallow any accidental passing to another world or multiple worlds.
### Additional Security Considerations | ||
|
||
N/A. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, will incorporate
But this won't support userscripts registered via userScripts.register() API in the target world (e.g. "main"), so we will have register them in the primary userscript world and then run them via dom.execute(), which is pretty wasteful due to the additional serialization/deserialization and various other JS checks. Possible solutions:
|
It's also imperative to have the ability to pass DOM nodes (live and detached, from either a shadow or the main tree) and a ❌ It means that a MessagePort cannot be used.
|
To clarify, it's not that using There is a way to use CustomEvent mechanism safely for a communication channel - just don't depend on current prototypes, i.e. addEventListener, dispatchEvent, CustomEvent constructor, CustomEvent.prototype.detail getter all should be used from a safe source, not from the current JS environment. Current workaround for extensions is to extract these safe original functions from a temporary iframe that they create in their secure world using a function extractor that runs synchronously in the target world ("main"). Browsers should be able to do it without creating an iframe:
In ether case the object produced by dom.createPort() will use these functions internally. One inconvenience of CustomEvent is the need to have a target such as |
Compiling the reasons why MessagePort cannot be used as the internal mechanism:
|
Note: "MessagePort" in this proposal refers to a new type. It is not the MessagePort from the web platform. I explicitly called out before that this concept already exists and that it cannot be used to serve the use case of extensions, at #679 (comment) Regarding the prototypes: I'd like the proposed API to be safe against prototype tampering. This was called out at #679 (comment) . There is also a special note on (un)safety of received values, at #679 (comment) |
Then it must be renamed, otherwise it'll be the "Local Storage" disaster all over again, when new developers confuse chrome.storage.local with localStorage en masse and mix up their usage patterns, limitations, availability, permissions, and inspectability in devtools. It's unclear to me that it should contain "port" in its name. What is clear to me is that it should be based on the existing robust synchronous mechanisms such as CustomEvent (for objects) and MouseEvent (for DOM nodes), which aren't "fragile or hacky" per se as I explained in my comments above in detail. |
A simpler solution might be to incorporate it all in dom.execute. Example 1,
|
This adds a proposal for a new
dom.createPort()
method, which will allow communication between different JavaScript "worlds". It leverages the proposeddom.execute()
method (#678).