Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ban data URL worklets #985

Closed
annevk opened this issue Apr 27, 2020 · 15 comments
Closed

Ban data URL worklets #985

annevk opened this issue Apr 27, 2020 · 15 comments

Comments

@annevk
Copy link
Member

annevk commented Apr 27, 2020

Although I thought data URL worklets would be fine, per #506 and implementation reality, static imports are supported, which means worklets have some means of networking. In that case it would be better for data URLs to create their own agent cluster (see also whatwg/html#5476), but that would be kind of pointless, especially for audio worklets. So it makes more sense to network error data URL worklets (to me).

cc @padenot @nhiroki

@padenot
Copy link

padenot commented Apr 27, 2020

@hoch, fyi.

@nhiroki
Copy link
Contributor

nhiroki commented Apr 28, 2020

The main point of this issue is that a data URL worklets have an opaque origin, right?

Top-level worklet script import and static import from there are issued with the outside settings object (i.e., window) as the fetch client, not the inside settings object (i.e., worklet global scope), so I guess the origin of the worklet shouldn't affect them (double-check is needed). Also, data URL worklets wouldn't be able to run static import because the top-level data URL module script doesn't have a valid base URL and cannot resolve a module identifier for static import.

@nhiroki
Copy link
Contributor

nhiroki commented Apr 28, 2020

cc: @hiroshige-g

@nhiroki
Copy link
Contributor

nhiroki commented Apr 28, 2020

If we ban data URL worklets, I'd like to see the current usage before that. I guess audio worklets are mostly used among worklets. @hoch do you have an insight about this?

@annevk
Copy link
Member Author

annevk commented Apr 28, 2020

@nhiroki yeah, if they have an opaque origin they would ideally be distinct agent clusters, which doesn't make a lot of sense for worklets?

I could see giving them the same origin as outside settings, but that would be inconsistent with the decision we made for documents and workers. I think the principle was that if a data URL can result in an execution environment, that needs to have an opaque origin.

@hoch
Copy link

hoch commented Apr 28, 2020

@nhiroki I am not sure how we can collect such data. I know there are a couple of examples out there, but I don't think they are a good representation of the apps in production. Usually those examples generate codes for an AudioWorkletProcessor based on the user action. Blocking data URL will disable this type of applications and developers will have to set up a backend to generate the code on the fly.

I get that the motivation here is to use dynamic import() instead, but it will break some apps for sure.

@padenot
Copy link

padenot commented Apr 28, 2020

Blocking data URL will disable this type of applications and developers will have to set up a backend to generate the code on the fly.

I've seen those apps use the following pattern:

const blob = new Blob([processorSourceCode], {type: "application/javascript"});
audioContext.addModule(URL.createObjectURL(blob))

where processorSourceCode is the source code of the processor. My understanding is that this would continue working.

@hoch
Copy link

hoch commented Apr 28, 2020

Perhaps I am not understanding the issue correctly. Then what would be the breaking change? Can you provide an example?

@annevk
Copy link
Member Author

annevk commented Apr 28, 2020

audioContext.addModule("data:...")

@hoch
Copy link

hoch commented Apr 28, 2020

Thanks! I see. FWIW, I've seen both cases: Blobs and Data URLs. (I did not know it was renamed recently) I do not have a strong opinion on this. @nhiroki WDYT?

@AmeliaBR
Copy link

There is definitely a strong use case for having some way to keep simple module code encapsulated in the same file as the registration script, just to minimize the number of files (and fetches) required for something like a simple CSS Paint worklet.

But if Blob URLs are still OK, that could replace it. The WPT worklet tests all seem to use this pattern. If you disallow data URLs, it might be worth including an example of how to generate a blob URL from a string.

(Note, I don't quite understand how Blobs are different from data URLs when it comes to the security issues, but if it works, it works! Just make sure it's clear in the spec.)

@hiroshige-g
Copy link

data URL worklets wouldn't be able to run static import because the top-level data URL module script doesn't have a valid base URL and cannot resolve a module identifier for static import.

Static imports using absolute URL specifiers like import('https://...') work.

@css-meeting-bot
Copy link
Member

The CSS Working Group just discussed ban data: url worklets, and agreed to the following:

  • RESOLVED: ban data url from spanning a worker
The full IRC log of that discussion <TabAtkins> Topic: ban data: url worklets
<TabAtkins> github: https://github.com//issues/985
<fremy> TabAtkins: next topic comes from Ana Tudor
<AmeliaBR> s/Ana Tudor/Anne VK/
<fremy> s/Ana Tudor/AnneVK/
<TabAtkins> vK
<fremy> TabAtkins: you initialize a worklet by passing a url
<fremy> TabAtkins: you can do that using a data url to avoid using the network
<fremy> TabAtkins: annevk says that this can be an issue
<fremy> TabAtkins: for security reasons
<fremy> TabAtkins: blob urls would still work
<fremy> TabAtkins: but data urls apparently have some issue
<fremy> TabAtkins: any comment?
<fremy> AmeliaBR: can we add a note showing how to use a blob url?
<fremy> TabAtkins: why not?
<fremy> TabAtkins: any suggestion?
<fremy> RESOLVED: ban data url from spanning a worker
<fremy> s/spanning/spawning/

@domenic
Copy link
Contributor

domenic commented Oct 16, 2020

I think this issue was opened based on a false premise. The URL passed to addModule() is not related to "the worklet's URL", and does not impact agent cluster selection. The worklet's URL and origin are determined before any addModule() calls.

That said, we could still ban addModule("data:...") if we wanted. Should we?

@annevk
Copy link
Member Author

annevk commented Oct 19, 2020

Yeah, you're right. And they don't create an opaque response either so I think we are good here.

@annevk annevk closed this as completed Oct 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants