Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

caching of importScripts()-scripts #106

Closed
sicking opened this issue Oct 21, 2013 · 28 comments
Closed

caching of importScripts()-scripts #106

sicking opened this issue Oct 21, 2013 · 28 comments

Comments

@sicking
Copy link

sicking commented Oct 21, 2013

How is caching supposed to work of importScripts-loaded scripts?

Normally any network requests coming from a service worker is supposed to not trigger any fetch event in the worker itself, but rather result in network requests. If the user is offline, that will often result in a failed request.

However if a network requests fails during importScripts(), generally the service worker will not work at all.

I guess the implementation can be extra aggressive about caching resources fetched through importScripts() calls in a service worker.

The obvious answer here is ES6 modules. Even if we just had the static import syntax that might be enough to carry us through until full ES6 support is deployed in browsers.

But is it ok to have a hard dependency on event the ES6 import syntax?

@alecf
Copy link
Contributor

alecf commented Oct 21, 2013

Yes, I think all the service workers scripts have to be aggressively cached - meaning that they are stored alongside the original script. The idea is that during the initial eval, all calls to importScripts() are 'recorded' so that the same resources get "served" every time. If the service worker script itself gets updated, then we can re-fetch importScript'ed scripts.

More detail in the advanced topics section:
https://github.com/slightlyoff/ServiceWorker/blob/master/advanced_topics.md#caching-of-serviceworker-scripts

@sicking
Copy link
Author

sicking commented Oct 21, 2013

Makes sense. Unfortunately we have to rely on the heuristic of "what was loaded last time" as we can't catch stuff like

importScripts("library.js", (new Date()).getDay() == 4) ? "thursday.js" : undefined);

or, arguably somewhat more likely :)

if (...) {
importScripts(...);
}

Would be great to switch to using ES6 modules, and not allow importScripts, but maybe I'm being too optimistic.

@alecf
Copy link
Contributor

alecf commented Oct 21, 2013

Yep, I think that needs to be spelled out explicitly in the main spec - that it's only the resources seen during the first load.

it also doesn't account for craziness like

oninstall = function(e) {
  importScripts(...);

}

which might be tempting even though it's obviously also pretty broken..

Honestly I'd be happy making importScripts() throw an exception in all cases except during initial load/eval, though that might also make it a confusing developer experience.

@jakearchibald
Copy link
Contributor

Calling importScripts in an oninstall handler would be ok, right?

Once the oninstall handler finishes, makes sense for importScripts to throw.

There's still a usecase to bring scripts into the worker dynamically, perhaps the script is large and the dev wants to optimise by only parsing it if & when it's required. This can be done via xhr & eval, but es6 modules will become the "right" way to do that.

@alecf
Copy link
Contributor

alecf commented Oct 28, 2013

The problem with supporting it in oninstall is that oninstall only fires during installation. Which means if you call it during install, the imported scripts would only be available during the install phase. That means if the importScripts() happens during an install, some events are handled, and then service worker is restarted, the imported scripts are no longer available. Is that really desired? I guess I can definitely see how a savvy developer would probably appreciate that behavior.

The other problem is that it's synchronous, which isn't nice during event dispatching.

I guess it's true that the install phase is meant to be run while you're still "online"

Hmmm.

@jakearchibald
Copy link
Contributor

Yeah, I can't think of a usecase for calling importScripts oninstall. Throwing if it isn't called during initial execution sounds like the best idea.

@michael-nordman
Copy link
Collaborator

How is caching supposed to work of importScripts-loaded scripts?

Good question. Let's start with describing the behavior we'd like make possible and then figure out how to reflect that in an API. I see these three possible import cases...

  1. import a script resource across the network "as ususal"
  2. import a script resource that is cached coincident with the service worker script.
  3. import a script resource from a application managed Cache.

Most of the discussion here is focused on #2.

@jakearchibald
Copy link
Contributor

I think 2 is most useful here & would be the reason to use importScripts.

1 & 3 are possible via fetch()/xhr and eval(), but async. The ES Module loader will handle this case better, eventually. Both of these allow you to handle network failures and empty caches.

@sicking
Copy link
Author

sicking commented Feb 6, 2014

My proposal here is that during the initial eval of the ServiceWorker script we record any calls to importScripts. Those scripts gets cached along with the ServiceWorker script. The next time the ServiceWorker is spun up and we do the initial eval, any calls to impotScripts throw if they try to load URLs other than what was originally recorded.

Any calls to importScripts outside of the original eval always throws. I.e. during any events like "fetch" or "install", or during any callbacks from IDB, XHR or Promises any calls to importScripts always fails.

@jyasskin
Copy link
Member

jyasskin commented Feb 7, 2014

I'm slightly worried about load time: if a ServiceWorker does many different things, it may want to optimize the onfetch handling time by avoiding the imports it would need to service some other kinds of requests. I do think it makes sense to wait until we have measurements before changing things in response to this.

Re fetch()+eval(): doing this naively violates CSPs. importScript(AbstractResponse) could be arranged to respect CSPs.

@slightlyoff
Copy link
Contributor

@sicking : your proposal is what I've been assuming we'll do for some time. Subsequent start-ups that add to the cache via importScripts() can succeed and will extend the cache object that (conceptually) holds the SW+deps. That said, failure leads to exceptions which lead to the SW failing for a navigation (and, at a UA-dependent amount, probably being disabled until an update is detected).

@jakearchibald
Copy link
Contributor

@slightlyoff I prefer @sicking's idea. Imagine:

importScripts('hello' + Math.floor(Math.random() * 10) + '.js');

…with your suggestion, this would appear to work fine in a local environment, but in production it could cause huge delays as the worker is spun up for an onfetch, hits a new importScripts url, and has to download that before it can handle the request.

With @sicking's suggestion, it'd fail hard during development (the worker should be killed more frequently with devtools open to highlight issues with retaining state in the worker etc).

@alecf
Copy link
Contributor

alecf commented Feb 7, 2014

I was generally assuming we were going to do @sicking 's suggestion, but I think Jeffrey brings up an interesting point - that you may want to delay loading of some scripts to minimize install time. I'm not sure where to strike that balance - if we allow scripts to be loaded later, I think we should have an alternative asynchronous API and make importScripts() fail after initial eval.

@slightlyoff
Copy link
Contributor

As Alec and I discussed f2f on Friday, my conceptual model for this has been that there's always a "default cache object" that the SW and its deps are put into by the system when they are fetched.

One (I think decent) option is to just expose this cache and make it scriptable like all the others. That way apps can managed their imports manually should they want to but also explain the default behavior of "things get added to the cache when you call importScripts()".

Where should this cache object live? It's distinct from those in caches in that it's going to be bound to the SW and its lifetime (and perhaps not shared?). It'll be more like the "gc'd caches" of yore. How about this.caches.self?

Thoughts?

@jakearchibald
Copy link
Contributor

@alecf

I think we should have an alternative asynchronous API and make importScripts() fail after initial eval.

Will ES6 provide that api with modules? Would uses of that go through onfetch even if they're within the serviceworker?

@slightlyoff

Where should this cache object live? It's distinct from those in caches in that it's going to be bound to the SW and its lifetime (and perhaps not shared?). It'll be more like the "gc'd caches" of yore.

If we're going to have unshared & gc'd caches back I'd rather open them up to the developer more. this.workerCaches vs this.caches or whatever. However, the gc-like behaviour could be explained with an ondeactivate step that does this.caches.delete('__workercacheguid1234') or whatever.

@annevk
Copy link
Member

annevk commented Feb 10, 2014

Fetches from the service worker should not go through it. However, ES6 modules will have their own loader hooks that you could use from within the service worker to use the cache of the service worker.

@alecf
Copy link
Contributor

alecf commented Feb 10, 2014

If we expose this cache, then I don't think it should be gc'ed - I think trying to GC caches is simply adding complication to both implementation and API.

At that point I guess I'd do something like have a flag on caches to indicate if SW fetch() should go through it:

var cache = new Cache();
cache.usedByFetch = true;
this.caches.set('scriptcache', cache);

(strawman name)

@jakearchibald
Copy link
Contributor

Hmm, the cache doesn't control whether onfetch happens for a given url.

Maybe static routes could explain this? A static route could have an option so it would also apply to in-worker requests. You'd still need ondeactivate to explain the GC.

@jakearchibald
Copy link
Contributor

I've used this a few times for cache cleanup:

this.onactivate = function(event) {
  // remove caches that shouldn't be there
  event.waitUntil(
    caches.keys().then(function(cacheName) {
      return Promise.all(
        cacheName.filter(n => expectedCaches.indexOf(n) == -1)
          .map(n => caches.delete(n))
      );
    })
  );
};

This would fail if this "supercache" (that contained the worker and importScripts) were visible in this.caches, because I'd be accidently deleting it.

@slightlyoff
Copy link
Contributor

@alecf I don't understand how we do upgrades in a world where these aren't gc'd without introducing a huge footgun. The problem is how to explain/surface that the original script was downloaded and added when a subsequent version will need to populate a Cache object with potentially exactly the same set of URLs (only with different contents).

I also don't see how this makes the impl any harder. What am I missing?

@michael-nordman
Copy link
Collaborator

Is the conclusion about how importScripts() and caching of those resources is supposed to work written down somewhere?

@jakearchibald
Copy link
Contributor

Unfortunately not, we'll tackle this on Monday/Tuesday if we don't fix it in the meantime.

@KenjiBaheux
Copy link
Collaborator

On the Blink side I believe that crbug.com/364247 covered this.
Does the latest spec accurately captured the decision(s) made? Can we close this issue or is this still TBD?

@jakearchibald
Copy link
Contributor

I don't think we quite have this covered in the spec. Behaviour is as http://dev.w3.org/html5/workers/#dom-workerglobalscope-importscripts, except:

Instead of step 5:

  1. If serviceworker is undergoing [[Update]], then
    1. Attempt to fetch each resource identified by the resulting absolute URLs, from the origin specified by settings object, using the API referrer source specified by settings object, and with the synchronous flag set.
    2. Cache responses against the serviceworker instance, keyed by url.
  2. Else, then
    1. Lookup cached response from serviceworker instance.
    2. If no response is found, then
      1. Treat as NetworkError
  3. For each argument in turn, in the order given (back to step 6 in http://dev.w3.org/html5/workers/#dom-workerglobalscope-importscripts)

Errors thrown by importScripts should become update rejections.

This means importScripts resources are refetched on [[Update]], and can be GC'd when a serviceworker is GC'd.

@slightlyoff
Copy link
Contributor

hey all,

@wanderview and @tabatkins pointed out on IRC that we haven't closed this off and there was some confusion about what behavior Mozilla should implement. After a discussion, it seems we need to get something like @jakearchibald's language in.

@jungkees : do you agree with that?

@jungkees
Copy link
Collaborator

it seems we need to get something like @jakearchibald's language in.

Agreed. Let me put this item in my queue.

Spec'ing this in HTML would be more accurate than monkey patching somehow. @Hixie Is there any way I can make a PR or something for this and some of the stuffs that need hooks between Service Workers spec and HTML?

@annevk
Copy link
Member

annevk commented Feb 27, 2015

@jungkees I recommend filing a bug on HTML and adding a monkey patch in service workers (that is clearly identified as such) to aid implementers. Then once HTML is fixed the monkey patch can be turned into more proper layering.

@jungkees
Copy link
Collaborator

@annevk Okay, I'll do it that way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants