Transferring the ability to use restricted APIs to another window.
Mustaq Ahmed (github.com/mustaqahmed)
- Github repository: WICG/capability-delegation
- Issue tracker: WICG/capability-delegation/issues
"Capability delegation" means allowing a frame to relinquish its ability to call
a restricted API and transfer the ability to another (sub)frame it trusts. The
focus here is a dynamic delegation mechanism which exposes the capability to the
target frame in a time-constrained manner (unlike the <iframe allow=...> attribute,
which is not time-constrained).
The API proposed here is based on postMessage(), where the sender frame uses a
new WindowPostMessageOptions member to specify the capability it wants to delegate.
Here are some practical scenarios that are enabled by the Capability Delegation API.
Many merchant websites perform payment processing through a Payment Service
Provider (PSP) site (e.g. Stripe) to comply with security
and regulatory complexities around card payments. When the end-user clicks on
the "Pay" button on the merchant website, the merchant website sends a message
to a cross-origin iframe from the PSP website to initiate payment processing,
and then the iframe uses the Payment Request API to complete the task.
But sites are only allowed to call the Payment Request API after transient user
activation (a recent click or other interaction) to prevent malicious attempts like
unattended or repeated payment requests. Since the user probably clicked on the
main site, and not the PSP iframe, this would prevent the PSP from using the
Payment Request API at all. Browsers today support such payment processing by
ignoring the user activation requirement altogether (see crbug.com/1114218)!
The Capability Delegation API provides a way to support this use-case while letting the browser enforce the user activation requirement, as follows:
// Top-frame (merchant website) code
checkout_button.onclick = () => {
  targetWindow.postMessage("process_payment", {
    targetOrigin: "https://example.com",
    delegate: "payment"
  });
};

// Sub-frame (PSP website) code
// The handler must be async because it awaits payment_request.show().
window.onmessage = async () => {
  const payment_request = new PaymentRequest(...);
  const payment_response = await payment_request.show();
  ...
};
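The PSP-side handler above acts on any message it receives. A slightly more defensive variant might check the sender's origin and the message payload before calling show(). The sketch below is only illustrative: makePaymentHandler, merchantOrigin, and createRequest are hypothetical names that are not part of the proposal, and the PaymentRequest arguments are left to the caller.

```javascript
// Hypothetical factory for the PSP frame's message handler.
// `merchantOrigin` is the origin the PSP expects messages from;
// `createRequest` is a caller-supplied function returning a PaymentRequest.
function makePaymentHandler(merchantOrigin, createRequest) {
  return function onMessage(event) {
    // Ignore messages from unexpected origins or with unexpected payloads.
    if (event.origin !== merchantOrigin || event.data !== "process_payment") {
      return null;
    }
    // Call show() immediately, while the delegated capability is still valid.
    return createRequest().show();
  };
}

// Possible registration in the PSP frame (all arguments are placeholders):
// window.onmessage = makePaymentHandler("https://merchant.example",
//     () => new PaymentRequest(methodData, details));
```

Returning null for ignored messages keeps the handler easy to test without a browser; the promise returned by show() is passed through unchanged.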
This is a work-in-progress in Chrome.
Consider a presentation/slide website where the main "control panel" window has spawned a few presentation windows, and the user wants to selectively make one presentation window fullscreen by clicking on the appropriate button on the main window (a feature request from a developer). Clicking on the "control panel" button does not make the user activation available to the presentation window, so this does not work today.
The Web does not support this use-case today, but the Capability Delegation API provides a solution:
// Main window ("control panel") code
let win1 = open("presentation1.html");
let win2 = open("presentation2.html");
button1.onclick = () => win1.postMessage("msg", {
  targetOrigin: "https://example.com",
  delegate: "fullscreen"
});
button2.onclick = () => win2.postMessage("msg", {
  targetOrigin: "https://example.com",
  delegate: "fullscreen"
});

// Popup ("presentation window") code
window.onmessage = () => document.body.requestFullscreen();
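With more than two presentation windows, the per-button wiring above can be generalized. The following is a minimal sketch, not part of the proposal: wireFullscreenButtons is a hypothetical helper, and it assumes the buttons and windows arrays are given in matching order.

```javascript
// Hypothetical helper: button i delegates fullscreen to window i only.
function wireFullscreenButtons(buttons, windows, targetOrigin) {
  buttons.forEach((button, index) => {
    button.onclick = () =>
      windows[index].postMessage("msg", { targetOrigin, delegate: "fullscreen" });
  });
}

// Possible use in the control panel (names taken from the example above):
// wireFullscreenButtons([button1, button2], [win1, win2], "https://example.com");
```

Because each click handler posts to exactly one window, the other presentation windows receive neither the message nor the delegated capability.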
Consider a web app in which you want to add video-conferencing capabilities.
You turn to a third-party solution that can be embedded in a cross-origin
iframe. There's a lot of logic behind the scenes, but UX-wise, maybe you
work out a scheme where it's mostly the video which is user-facing in the
video-conferencing iframe, and the user-facing controls (mute, leave,
share-screen) are all part of the web app, and receive its specific UX
styling. When those buttons are pressed, some messages are exchanged between
the web app and the embedded video-conferencing solution.
To let the third-party iframe prompt the user to share a tab, a window,
or a screen, the top frame would delegate the mediaDevices.getDisplayMedia()
permission to the iframe as follows:
// In the top frame, user clicks the "Share My Screen" button.
button.onclick = () =>
  frames[0].postMessage("msg", {
    // Placeholder for the iframe's origin; the default targetOrigin ("/")
    // would only reach same-origin windows.
    targetOrigin: "https://example.com",
    delegate: "display-capture"
  });

// In the cross-origin video-conferencing iframe, prompt the user
// to share a tab, a window, or a screen.
window.onmessage = () => navigator.mediaDevices.getDisplayMedia();
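getDisplayMedia() returns a promise that resolves to a MediaStream, so the iframe would typically do something with the captured stream. As a hedged sketch, the handler below attaches the stream to a video element for a local preview. makeShareHandler and its parameters are hypothetical names, and a real conferencing app would more likely forward the stream to its peers.

```javascript
// Hypothetical factory for the iframe's message handler.
// `mediaDevices` is expected to be navigator.mediaDevices;
// `previewElement` is a <video> element owned by the iframe.
function makeShareHandler(mediaDevices, previewElement) {
  return async function onMessage() {
    // Request the capture right away; the delegation is time-constrained.
    const stream = await mediaDevices.getDisplayMedia();
    previewElement.srcObject = stream; // show a local preview
    return stream;
  };
}

// Possible registration inside the iframe:
// window.onmessage = makeShareHandler(navigator.mediaDevices,
//     document.querySelector("video"));
```

Passing mediaDevices and the preview element as parameters is only a convenience for testing the handler outside a browser.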
- A web service that does not care about user location except for a "branch locator" functionality provided by a third-party map-provider app can delegate its own location access capability to the map iframe in a temporary manner right after the "branch locator" button is clicked.
- An authentication provider may wish to show a popup to complete the authentication flow before returning a token to the host site.
- A website may want a third-party chat app in an iframe to be able to vibrate the phone on message receipt, even when the user is not active in the iframe.
- This explainer is not about delegation of user activation (i.e., allowing the iframe to choose from all of the things the top frame could do after a user click or other interaction). See Considered Alternatives below for more details.
- This explainer does not determine which APIs could possibly support capability delegation. If any API needs the support, the designers of the API would decide the details of the delegated behavior. The PaymentRequest API case presented here (in collaboration with the owners of that API) serves as a guide for similar changes in other API specifications.
Developers would use Capability Delegation by just initiating the delegation
appropriately, as shown in the example code snippets above. In short, when a
browsing context wants to delegate a capability to another browsing context, it
sends a postMessage() to the second browsing context with an extra
WindowPostMessageOptions member called delegate specifying the capability.
After a successful delegation, the "user API" (the restricted API being
delegated) just works when called at the right moment. The general idea is
to call the restricted API in a MessageEvent handler or soon afterwards. In
the examples above, the restricted APIs are payment_request.show(),
element.requestFullscreen(), and mediaDevices.getDisplayMedia(),
respectively.
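The sender may also want to know whether the delegation went through. The sketch below assumes, in line with the draft specification but not stated in this explainer, that postMessage() throws when the delegation cannot be performed (for example, an unsupported capability name or no recent user activation). tryDelegate is a hypothetical helper name.

```javascript
// Hypothetical helper: attempts a delegation and reports whether the call
// went through. Assumes a failed delegation throws from postMessage().
function tryDelegate(targetWindow, message, targetOrigin, capability) {
  try {
    targetWindow.postMessage(message, { targetOrigin, delegate: capability });
    return { ok: true, error: null };
  } catch (error) {
    // For example: unsupported capability or no recent user activation.
    return { ok: false, error };
  }
}

// Possible use inside a click handler (names are placeholders):
// const result = tryDelegate(targetWindow, "process_payment",
//     "https://example.com", "payment");
// if (!result.ok) { /* fall back to another flow */ }
```

A result with ok set to true only means the message was posted with the delegation request; the receiver still has to call the restricted API soon afterwards.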
- Payment Request API: To see how this API works with Payment Request, run Chrome with the command-line flag --enable-blink-features=PaymentRequestRequiresUserActivation, then open this demo.
- Fullscreen API: Work in progress.
- Screen Capture API: Work in progress.
It may appear that we can delegate user activation to solve the same use-cases
and thus avoid specifying a feature in the postMessage() call. We attempted
this direction in the past from a few different perspectives, and decided not to
pursue this. In particular, user activation controls many Web APIs, so
delegating user activation for any of the mentioned use-cases is impossible
without causing problems with unrelated APIs. See the TAG discussion with one
past attempt.
Instead of piggy-backing the delegation request as a WindowPostMessageOptions
entry, we considered adding a new delegation-specific interface on the Window
object. While the latter may look cleaner from a developer’s perspective, to support
cross-origin communication this solution would require adding the new method on
the WindowProxy wrapper, which HTML's editor strongly disliked.
We will track the overall status through this Chrome Status entry.
Many thanks for valuable feedback and advice from:
- Anne van Kesteren (github.com/annevk)
- Jeffrey Yasskin (github.com/jyasskin)
- Robert Flack (github.com/flackr)