Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Module workers cannot be cross-origin #3109

Closed
annevk opened this issue Oct 10, 2017 · 34 comments · Fixed by #3656
Closed

Module workers cannot be cross-origin #3109

annevk opened this issue Oct 10, 2017 · 34 comments · Fixed by #3656
Labels
security/privacy There are security or privacy implications topic: script topic: workers

Comments

@annevk
Copy link
Member

annevk commented Oct 10, 2017

A service worker is either selected based on the "client" doing the request, or in case of request for a client, it's selected based on the URL.

Workers are clients. Dedicated workers didn't have to be a client necessarily, but shared workers had to be, because they can span multiple clients that might each have their own service worker and then selecting a service worker for them would not necessarily be as deterministic as can be. Anyway, we made dedicated workers clients too for consistency. This also has some benefits with referrer setup and such.

So since workers are clients, we cannot allow cross-origin URLs for them, as that would also mean we'd end up with a cross-origin service worker, which would violate all kinds of rules.

This was discovered in whatwg/fetch#527.

cc @wanderview @jakearchibald

@jakearchibald
Copy link
Contributor

Trying to figure out the difference between this and importScripts, which can be cross-origin. Is it because the global's client is used for importScripts, whereas with module scripts that doesn't exist yet?

It feels like all module imports should use the same service worker as the root, meaning it would behave like importScripts in terms of service worker usage.

Perhaps the easiest way is to allow the module fetching spec to decide which service worker it wants to use, which would need plumbed through fetch and service worker.

@annevk
Copy link
Member Author

annevk commented Oct 10, 2017

@jakearchibald the issue is about the initial fetch for the worker script, not for any import statements found inside that script.

@jakearchibald
Copy link
Contributor

Then I don't understand why it's bad for workers to load through their own service worker, but ok for iframes.

@annevk
Copy link
Member Author

annevk commented Oct 10, 2017

@jakearchibald workers get the origin of the environment that creates them, iframes do not. Maybe we should support actual cross-origin workers at some point, but these are not it.

@jakearchibald
Copy link
Contributor

Gotcha. Then I agree, if the worker takes on the origin of the host environment, they must be same-origin.

Proper cross-origin workers would be nice though.

@annevk
Copy link
Member Author

annevk commented Oct 10, 2017

Yeah, that could be nice as a feature if there's support. It's unclear if we should require CORS for those though. Correct MIME type seems good, but if they're more like <iframe> I don't necessarily see a reason to require CORS there.

@wanderview
Copy link
Member

Other random things that worried me about this after IRC last night:

  • Would self.location and self.origin point to different origin URLs? That seems surprising an somewhat unprecedented. Do we ever do that for anything besides about:blank today?
  • While about:blank iframes inherit the origin, we also create a completely new Client with a different origin if they are navigated to a cross-origin URL.
  • Initial about:blank iframes are really complicated with bad interop. I'm not excited to try to duplicate that behavior on workers.

@domenic
Copy link
Member

domenic commented Oct 10, 2017

I'm happy to rip this out, especially since module workers haven't seen any implementation work yet.

I appreciate the background about how this is driven by shared workers. I really had anticipated this just being that the source code is fetched cross-origin, then the worker is created the same as a normal same-origin one. But I guess I can see how that leads to nondeterminism for shared workers because there can be multiple service workers per origin. You can imagine fixing that hackily (e.g. saying the "service worker URL" is either the script URL or the page URL, whichever is same-origin). But, let's not do that, I guess.

However I'd like to get clarity on what we're doing for worklets first. Does it make sense for module workers to follow worklets, if they are allowed to be cross-origin? I guess not for shared module workers, is the point? It sounds like there's now some disagreement over whether worklets should work cross-origin anyway, according to whatwg/fetch#527.

It'd also be nice to resolve #3112. If that gets resolved in favor of making dedicated workers subresources, do we allow them to be cross-origin?

@annevk
Copy link
Member Author

annevk commented Oct 11, 2017

Subresources can be cross-origin, but it seems that due to the clients API it would really be better if all workers were clients. I don't think we should block on worklets as they could go either way and it doesn't really affect workers at all.

@domenic
Copy link
Member

domenic commented Oct 11, 2017

I would like worklets and workers to be consistent, perhaps only dedicated workers.

@annevk
Copy link
Member Author

annevk commented Oct 11, 2017

What justifies such a constraint?

@wanderview
Copy link
Member

I would like worklets and workers to be consistent, perhaps only dedicated workers.

Then make worklets same-origin like existing dedicated workers? I don't see why worklets are a reason to change dedicated workers.

@domenic
Copy link
Member

domenic commented Oct 11, 2017

What justifies such a constraint?

Both are ways of loading code in another global/thread using modules. It's good if they both work similarly.

Then make worklets same-origin like existing dedicated workers?

That's one option. But unlike you, I don't believe it's so bad to have subresources, and I do believe allowing cross-origin code-loading is a nice advantage, so there is another option. I understand you don't agree with that weighing of the tradeoff, but I'd like to see how the discussion with worklets goes.

I don't see why worklets are a reason to change dedicated workers.

I think I covered this above in my response to Anne.

@wanderview
Copy link
Member

Well, worklets have an opaque origin according to step 3 here:

https://drafts.css-houdini.org/worklets/#script-settings-for-worklets

Worklets also don't have any way to invoke network/storage currently. (I guess import() will raise some questions, though.)

If anything worklets are like data: URL workers which don't behave like you are asking for things to behave here.

(And BTW, in my testing it seems chrome follows the current spec and does not make these opaque data URL workers controlled if created from a controlled document. We do in firefox, but I'm going to fix that.)

@annevk
Copy link
Member Author

annevk commented Oct 12, 2017

Both are ways of loading code in another global/thread using modules. It's good if they both work similarly.

Well, why would that not generalize to shared workers and service workers? Why is there a stronger tie between worklets and dedicated workers than there is between all worker types? It seems somewhat better to me if all workers at least took the same approach, so you can take the lessons you learn from one worker to another and not be surprised about a wildly different setup.

@annevk
Copy link
Member Author

annevk commented Oct 12, 2017

Also, how do you address the concerns raised around the clients API? If dedicated workers end up not being a client (as you propose), how would you explain them exposing themselves from a JavaScript perspective as a client?

@domenic
Copy link
Member

domenic commented Oct 13, 2017

OK, so I'm glad that we waited for the worklets issue to be resolved, as it made it clear that worklets are going to be too different from workers (due to the opaque origin).

Still:

Well, why would that not generalize to shared workers and service workers? Why is there a stronger tie between worklets and dedicated workers than there is between all worker types? It seems somewhat better to me if all workers at least took the same approach, so you can take the lessons you learn from one worker to another and not be surprised about a wildly different setup.

I think there's a stronger tie between "non-shared" things like worklets and dedicated workers than there is between dedicated workers and shared workers.


Anyway, it sounds like we should remove cross-origin module workers. Let me do that.

@domenic
Copy link
Member

domenic commented Oct 13, 2017

Wait, no, we need to figure out #3112 first, i.e. whether we're going to align the spec with what's implemented (subresources) or with what it currently says (separate client) since that changes the story significantly.

@wanderview
Copy link
Member

To repeat, I'm opposed to changing the spec here and I am in the process of implementing it in Firefox. Chrome also already partially implements it by treating data URL workers as a separate Client without any service worker controlling it, etc.

@domenic
Copy link
Member

domenic commented Oct 13, 2017

I'm sure your testing whether it's web-compatible will be helpful information in letting other engines decide whether they want to align or not. Maybe we can table the discussion until that data is in and we have a consensus position among engines.

@wanderview
Copy link
Member

I'm implementing the agreed upon spec. I think service workers is still new enough and this is enough of a corner case that I'm not doing any interop measurements/analysis.

For the vast majority of sites this will work the same as it does today. The only time it will change is for same-origin worker scripts which are outside of the scope path of the document load. From most of the service worker deploys I have seen I think that is pretty rare at the moment.

@wanderview
Copy link
Member

Similarly, if we discuss this at TPAC and as a group decide to change the spec, I think it could be changed even if I've implemented the current spec. So far, though, I'm not convinced and I haven't heard anyone else asking for this change.

@domenic
Copy link
Member

domenic commented Oct 13, 2017

I mean, we have Chrome asking for the change in #3112.

@wanderview
Copy link
Member

Asking "can we do this since it kind of works this way in our partial implementations", is not the same as "I think we should do this because its better for reason X, Y, and Z."

Also the original post in #3112 raises a valid concern about how to handle dedicated workers nested under service workers. That is something we have long wanted to support and shouldn't block.

Anyway, at this point I'd prefer to discuss any changes to this part of the spec in person at the next SW F2F at TPAC.

@annevk
Copy link
Member Author

annevk commented Oct 16, 2017

The problem is that the status quo is wrong. This becomes especially problematic the moment someone tries to implement module workers. Problematic for shared workers, but also for dedicated workers given Firefox's goal of making them a client as well as per the standard.

So unless we don't see anyone implementing module workers anytime soon we should add a warning of sorts I think.

@annevk
Copy link
Member Author

annevk commented Oct 16, 2017

(I'm not sure why it's the opaque origin point that convinced you by the way. A dedicated/shared worker can have one too if fetched from a data URL or whenever we add sandboxing to workers.)

@surma
Copy link
Contributor

surma commented Feb 20, 2018

I have run into this issue and think it’s really confusing from a developer’s perspective that CORS headers are completely ignored. This makes hosting libraries with workers on CDNs practically impossible. In the near future, I see this becoming a problem for libraries.

As a workaround, I’m currently using something long these lines:

async function myWorker(path) {
  const blob = await fetch(path).then(resp => resp.blob());
  return new Worker(URL.createObjectURL(blob));
}

(An earlier version of this workaround would assemble a data URL that called importScripts(), but that would even circumvent CORS so it felt a bit heavy-handed.)

Considering that this work iff path is same-origin or CORS headers allow, is there any reason why we can’t bake the same (or similar) behavior into the platform?

@jakearchibald
Copy link
Contributor

@surma In your opinion, if foo.example.com were to load a worker script from bar.example.com, and it accessed IDB, which origin would it be reading data from?

@surma
Copy link
Contributor

surma commented Feb 20, 2018

AFAIU, workers are always on the origin of the hosting site, so I think the origin should be foo.example.com (same effect as my workaround via blob URLs).

@annevk
Copy link
Member Author

annevk commented Feb 20, 2018

@surma the reason it doesn't work is that we want to pick the service worker for a worker based on the worker URL (just like we do for documents), and therefore it needs to be same-origin.

@jakearchibald
Copy link
Contributor

We could discuss changing this again. We'd have to do some tests to check if it'd break anything on live.

It would mean workers would drop out of the clients API, meaning a service worker wouldn't be able to message a page's dedicated worker, except via broadcast channels.

@wanderview
Copy link
Member

I'm still not sure I completely understand why this is a critical issue. Is it really that onerous for the site to host a skeleton .js worker script on its origin that then pulls in the CDN content? Or the skeleton script can be a blob URL as it looks you settled on as the work-around.

I feel pretty strongly that things creating a global should get their origin from the thing being loaded. I don't think dedicated-worker globals should be different from iframe/shared-worker/service-worker globals in this regard.

We made an exception for worklets because that case is more like the parent loads the resource once and then spawns multiple globals from that loaded data. Kind of like you can do with blob URL workers now. Its not a 1:1 load-to-global. They're also spec'd to get opaque origins which makes them a bit safer to exempt from our normal origin setting logic.

If we need to add an exception for dedicated workers I think we should have a really strong reason. Making them different from other globals adds to the complexity of the platform and implementation. Getting the origin of globals right is obviously important and avoiding complexity helps there. IMO that means increasing consistency and reducing exceptions.

@guybedford
Copy link
Contributor

Surely we shouldn't be designing for a future where CDN-based worker code must rely on a Blob though? Because it seems like that's what will happen with all libraries published using workers to support CDN usage this way.

annevk added a commit that referenced this issue Apr 30, 2018
* Module scripts are always fetched with credentials mode "same-origin" by default and can no longer
  use "omit". This makes them match other high-level platform features.
* The top-level script for module workers is always fetched with request mode "same-origin" and
  credentials mode "same-origin". Cross-origin workers did not quite work due to service workers.
* Any module worker import scripts are fetched with request credentials mode "same-origin" and this
  cannot be configured for now.

Fixes #2557 and fixes #3109.
annevk added a commit that referenced this issue Apr 30, 2018
* Module scripts are always fetched with credentials mode "same-origin" by default and can no longer
  use "omit". This makes them match other high-level platform features.
* The top-level script for module workers is always fetched with request mode "same-origin" and
  credentials mode "same-origin". Cross-origin workers did not quite work due to service workers.
* Any module worker import scripts are fetched with request credentials mode "same-origin" and this
  cannot be configured for now.

Fixes #2557 and fixes #3109.
annevk added a commit that referenced this issue May 23, 2018
* Module scripts are always fetched with request credentials mode "same-origin" by default. Only
  worker module scripts can still set that to "omit". Non-worker module scripts cannot, keeping
  them aligned with how the crossorigin attribute works for other platform features.
* The top-level script for module workers is always fetched with request mode "same-origin".
  Cross-origin workers did not quite work due to service workers.

Fixes #2557 and fixes #3109.
annevk added a commit that referenced this issue Oct 8, 2018
* Module scripts are always fetched with request credentials mode "same-origin" by default. Only
  worker module scripts can still set that to "omit". Non-worker module scripts cannot, keeping
  them aligned with how the crossorigin attribute works for other platform features.
* The top-level script for module workers is always fetched with request mode "same-origin".
  Cross-origin workers did not quite work due to service workers.

Fixes #2557 and fixes #3109.
domenic pushed a commit that referenced this issue Oct 9, 2018
* Module scripts are always fetched with request credentials mode
  "same-origin" by default, instead of the previous default of "omit".
  Only worker module scripts can still set that to "omit", using the
  credentials option to the Worker constructor. Non-worker module
  scripts, which only have the crossorigin="" attribute available, can
  only toggle between "same-origin" and "include", similar to how
  crossorigin="" works for other platform features.
* Similarly, import() statements inside of classic scripts now use the
  "same-origin" credentials mode, instead of "omit". This affects both
  <script> elements, where the default can be changed using
  crossorigin="", and other contexts like javascript: URLs and classic
  worker scripts, where the default cannot be changed.
* The top-level script for module workers is always fetched with request
  mode "same-origin". Cross-origin workers did not quite work due to
  service workers.

Fixes #2557. Fixes #3109.

Tests:

* web-platform-tests/wpt#11274
* web-platform-tests/wpt#13176
* web-platform-tests/wpt#13426
@annevk
Copy link
Member Author

annevk commented Oct 10, 2018

@guybedford sorry that your question got left unanswered. This design decision was basically made when workers were added and cemented later when service workers treated them similar to documents. I don't see how we could have done this differently for shared workers and making dedicated workers special doesn't seem great either (and implementers objected to that as well).

This might be in part why some CDNs are now basically in charge of hosting your site completely.

mustaqahmed pushed a commit to mustaqahmed/html that referenced this issue Feb 15, 2019
* Module scripts are always fetched with request credentials mode
  "same-origin" by default, instead of the previous default of "omit".
  Only worker module scripts can still set that to "omit", using the
  credentials option to the Worker constructor. Non-worker module
  scripts, which only have the crossorigin="" attribute available, can
  only toggle between "same-origin" and "include", similar to how
  crossorigin="" works for other platform features.
* Similarly, import() statements inside of classic scripts now use the
  "same-origin" credentials mode, instead of "omit". This affects both
  <script> elements, where the default can be changed using
  crossorigin="", and other contexts like javascript: URLs and classic
  worker scripts, where the default cannot be changed.
* The top-level script for module workers is always fetched with request
  mode "same-origin". Cross-origin workers did not quite work due to
  service workers.

Fixes whatwg#2557. Fixes whatwg#3109.

Tests:

* web-platform-tests/wpt#11274
* web-platform-tests/wpt#13176
* web-platform-tests/wpt#13426
mustaqahmed pushed a commit to mustaqahmed/html that referenced this issue Feb 15, 2019
* Module scripts are always fetched with request credentials mode
  "same-origin" by default, instead of the previous default of "omit".
  Only worker module scripts can still set that to "omit", using the
  credentials option to the Worker constructor. Non-worker module
  scripts, which only have the crossorigin="" attribute available, can
  only toggle between "same-origin" and "include", similar to how
  crossorigin="" works for other platform features.
* Similarly, import() statements inside of classic scripts now use the
  "same-origin" credentials mode, instead of "omit". This affects both
  <script> elements, where the default can be changed using
  crossorigin="", and other contexts like javascript: URLs and classic
  worker scripts, where the default cannot be changed.
* The top-level script for module workers is always fetched with request
  mode "same-origin". Cross-origin workers did not quite work due to
  service workers.

Fixes whatwg#2557. Fixes whatwg#3109.

Tests:

* web-platform-tests/wpt#11274
* web-platform-tests/wpt#13176
* web-platform-tests/wpt#13426
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
security/privacy There are security or privacy implications topic: script topic: workers
Development

Successfully merging a pull request may close this issue.

6 participants