Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a way to skipWaiting when the last tab refreshes #993

Closed
dfabulich opened this issue Oct 20, 2016 · 10 comments
Closed

Provide a way to skipWaiting when the last tab refreshes #993

dfabulich opened this issue Oct 20, 2016 · 10 comments

Comments

@dfabulich
Copy link

dfabulich commented Oct 20, 2016

When a service worker updates, it doesn't take control of the page right way; it goes into a "waiting" state, waiting to be activated.

Surprisingly, the updated service worker doesn't even take control of the tab after refreshing the page. Google explains:

https://developers.google.com/web/fundamentals/instant-and-offline/service-worker/lifecycle

Even if you only have one tab open to the demo, refreshing the page isn't enough to let the new version take over. This is due to how browser navigations work. When you navigate, the current page doesn't go away until the response headers have been received, and even then the current page may stay if the response has a Content-Disposition header. Because of this overlap, the current service worker is always controlling a client during a refresh.

To get the update, close or navigate away from all tabs using the current service worker. Then, when you navigate to the demo again, you should see the horse [updated content].

This pattern is similar to how Chrome updates. Updates to Chrome download in the background, but don't apply until Chrome restarts. In the mean time, you can continue to use the current version without disruption. However, this is a pain during development, but DevTools has ways to make it easier, which I'll cover later in this article.

This behavior makes sense in the case of multiple tabs. My app is a bundle that needs to be updated atomically; we can't mix and match part of the old bundle and part of the new bundle. (In a native app, atomicity is automatic and guaranteed.)

But in the case of a single tab, which I'll call the "last open tab" of the app, this behavior is not what I want. I want refreshing the last open tab to update my service worker.

(It's hard to imagine that anybody actually wants the old service worker to continue running when the last open tab is refreshed. Google's "navigation overlap" argument sounds to me like a good excuse for an unfortunate bug.)

I'm developing an "app" (a game). My app is normally only used in a single tab. In production, I want my users to be able to use the latest code just by refreshing the page, but that won't work: the old service worker will remain in control across refreshes.

I don't want to have to tell users, "to receive updates, be sure to close or navigate away from my app." I want to tell them, "just refresh."

When I raised this on StackOverflow, @jakearchibald advised me to add an in-app update notification; clicking on it would post a message to the service worker, which would skipWaiting, and then window.location.reload() once the new service worker became active.

This seems like a hack, so we discussed it on Twitter. https://twitter.com/dfabu/status/789242276358651904

Is there any way to skipWaiting on refresh? Writing my own in app refresh button seems like a hack, doesn't it?

no, but I think such a thing would be useful (and not just on refresh, but any navigation)

I guess this would be a "navigate" event in the service worker, which would allow you to halt navigation until skipWaiting.

could you file a ticket on https://github.com/w3c/ServiceWorker?

EDIT: I now think the current behavior is just a bug (with a good excuse! but it's still a bug), so I'm going to clarify steps to reproduce, expected, actual.

Consider this repo. https://github.com/dfabulich/service-worker-refresh-bug

There are three directories in there: appcache, broken-refresh, and nonatomic-skip-waiting. Each directory contains an index.html file containing hard-coded content and a delay-loaded request for a script tag; the delay is to demonstrate the race condition.

The appcache directory is the simplest, and most clearly demonstrates the behavior I expect and desire. (AppCache is actually really nice! Please stop calling AppCache mean names 😉 )

Here's the appcache/index.html file:

<!DOCTYPE html>
<html manifest="manifest.appcache">
<title>offline behavior test</title>
<body>
<p>HTML: <span id="html">1</span></p>
<p>JS: <span id="js">Loading...</span></p>
<script>
// timeout to demonstrate race
setTimeout(function() {
    var script = document.createElement("script");
    script.src = "script.js";
    document.body.appendChild(script);
}, 2000);
</script>
</body>
</html>

And here's the appcache/script.js file. It writes the JS value into the page, and says "MATCH" or "MISMATCH" depending on whether it matches the hardcoded value in the HTML.

var value = 1;

var html = document.getElementById('html');
var js = document.getElementById('js');

if (html.innerHTML == value) {
    js.innerHTML = value + " MATCH";
} else {
    js.innerHTML = value + " MISMATCH";
}

Here's the AppCache manifest.

CACHE MANIFEST
# index.html 1
script.js

To see the expected behavior, open appcache/index.html on a web server. The page will say this, as expected:

HTML: 1
JS: 1 MATCH

Now, without closing the tab, change the number 1 to 2 in index.html, script.js, and manifest.appcache; then refresh the page.

On the initial refresh, the page will still say "1", as expected:

HTML: 1
JS: 1 MATCH

This is expected behavior because we're loading from the cache by default. The console log will explain:

Document was loaded from Application Cache with manifest https://mymachine/appcache/manifest.appcache
Application Cache Checking event
Application Cache Downloading event
Application Cache Progress event (0 of 2) https://mymachine/appcache/script.js
Application Cache Progress event (1 of 2) https://mymachine/appcache/index.html
Application Cache Progress event (2 of 2)
Application Cache UpdateReady event

Now refresh the page, and you'll see the page has updated:

HTML: 2
JS: 2 MATCH

Since I'm developing an "app" (a game), this is the exact behavior I want to emulate in serviceworker. But this is impossible.

First, try broken-refresh/index.html. Instead of declaring an AppCache manifest, it registers a service worker.

<!DOCTYPE html>
<html>
<title>offline behavior test</title>
<body>
<p>HTML: <span id="html">1</span></p>
<p>JS: <span id="js">Loading...</span></p>
<script>
// timeout to demonstrate race
setTimeout(function() {
    var script = document.createElement("script");
    script.src = "script.js";
    document.body.appendChild(script);
}, 2000);

if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('sw.js').then(function(registration) {
    console.log('ServiceWorker registration successful with scope: ', registration.scope);
  }).catch(function(err) {
    console.error('ServiceWorker registration failed: ', err);
  });
}
</script>
</body>
</html>

The script.js file is identical to the appcache variation, so I won't repeat it. The sw.js service worker says:

var CACHE_NAME = "1";

self.addEventListener('install', function(event) {
  event.waitUntil(
    caches.open(CACHE_NAME)
      .then(function(cache) {
        console.log('Opened cache');
        return cache.addAll(['index.html', 'script.js']);
      })
  );
});

self.addEventListener('fetch', function(event) {
  event.respondWith(
    caches.match(event.request)
      .then(function(response) {
        if (response) return response;
        return fetch(event.request);
      }
    )
  );
});

self.addEventListener('activate', function(e) {
  console.log('[ServiceWorker] Activate');
  e.waitUntil(
    caches.keys().then(function(keyList) {
      return Promise.all(keyList.map(function(key) {
        if (key !== CACHE_NAME) {
          console.log('[ServiceWorker] Removing old cache', key);
          return caches.delete(key);
        }
      }));
    })
  );
  return self.clients.claim();
});

An easy peasy boilerplate service worker.

To reproduce

Open broken-refresh/index.html on a web server. The page will show the expected initial results:

HTML: 1
JS: 1 MATCH

Now, without closing the tab, update the HTML, JS, and SW to replace "1" with "2" and refresh.

As we'd expect, the page initially shows "1" because the page is loaded from the cache, and the new service worker is installed and waiting.

HTML: 1
JS: 1 MATCH

Now, refresh the page again.

Expected
Refreshing the (only) tab again, now that the service worker is installed and waiting, should load activate the new service worker before loading the next page, just like it does in AppCache.

Actual
The new navigation starts before the old service worker can be unloaded, so the old values remain, no matter how many times you refresh the page.

HTML: 1
JS: 1 MATCH


The obvious thing to try next is to skipWaiting on install, which we can demonstrate in nonatomic-skip-waiting directory.

The nonatomic-skip-waiting directory is identical to the broken-refresh directory, except that we add self.skipWaiting() to the first line of the install event handler.

If you open nonatomic-skip-waiting/index.html, of course the page displays the right values at first:

HTML: 1
JS: 1 MATCH

But if you update the HTML, JS, and SW to say "2", the page displays a mismatch value.

HTML: 1
JS: 2 MISMATCH

The point of this diatribe is that it's impossible to reproduce the desired AppCache behavior.

@dfabulich
Copy link
Author

The more I think about it, the more I feel that activating the service worker when refreshing the last tab should be the automatic default, delaying the navigation automatically if necessary. If somebody wants the existing "refreshing doesn't update" behavior, they should have to opt into it.

@jakearchibald
Copy link
Contributor

My idea around the navigation event:

self.addEventListener('navigation', event => {
  event.waitUntil(p);
});

…which would pause the navigation (delaying the fetch) until p resolved. This would allow developers to do something like:

self.addEventListener('navigation', event => {
  if (registration.waiting) {
    event.waitUntil(
      clients.matchAll().then(clients => {
        if (clients.length < 2) {
          return registration.waiting.skipWaiting();
        }
      })
    );
  }
});

…the idea that the navigation would be delayed until the waiting worker activates, so the navigation request would go through the new worker.

However, it's not clear to my how this would play with the concurrent request proposal.

@jakearchibald
Copy link
Contributor

The more I think about it, the more I feel that activating the service worker when refreshing the last tab should be the automatic default

Then what happens if the response doesn't replace the current client?

@dfabulich
Copy link
Author

My imagination is poor, but I don't see how this can happen.

The waiting service worker activates by default when you close the last tab, even if you immediately click a bookmark to re-open the site. In my view, that's exactly what refreshing is: it's a navigation away, followed immediately by a navigation back to the current page.

@wanderview
Copy link
Member

The waiting service worker activates by default when you close the last tab, even if you immediately click a bookmark to re-open the site. In my view, that's exactly what refreshing is: it's a navigation away, followed immediately by a navigation back to the current page.

This isn't really accurate. The browser starts loading the next page, but doesn't tear down the current page until that new request completes. So when the navigation FetchEvent fires the page is still being shown to the user.

…the idea that the navigation would be delayed until the waiting worker activates, so the navigation request would go through the new worker.

I don't think you will have a waiting worker at this point. Both chrome and firefox wait to perform the update check until after the navigation load event.

I guess you could manually call update and then wait for the resulting promise to resolve. At that point you should have an installing worker you could wait on.

Given that update() must be called to get the SW before the refresh navigation I think pages should probably just build their own "check for updates" and "update found, click to reload" buttons. I suppose devtools could also have those buttons for the site.

@dfabulich
Copy link
Author

@wanderview I think I haven't been clear enough in my explanation, so I've provided a clearer statement of the bug in the initial description, now with a code sample, steps to reproduce, expected behavior, and actual behavior.

Specifically, the desired behavior is what AppCache does, so I know it must be possible.

In my view, that's exactly what refreshing is: it's a navigation away, followed immediately by a navigation back to the current page.
This isn't really accurate.

I know; I've read all about it. But it works perfectly in AppCache. That's the behavior I want.

Given that update() must be called to get the SW before the refresh navigation I think pages should probably just build their own "check for updates" and "update found, click to reload" buttons. I suppose devtools could also have those buttons for the site.

I think this is a misunderstanding about what I want. As I explained above, in AppCache, when the manifest updates, the first refresh gets stale content, as expected/desired. The AppCache re-installs itself in the background, so the second post-install refresh "just works."

I agree that there's no good way to make the first refresh update the service worker, but subsequent refreshes should just work. The refreshed page's navigation should simply be paused while the initial page unloads and the new service worker activates.

I could imagine a simple API like: self.activateOnLastRefresh(); that you could call on install that would do what I want.

Except, not really, because that should just be the default. If anything, if anybody wants to opt into this buggy behavior, they should be calling an API like, self.yieldToPreviousWorker().

And, to be clear, the proposal to build an in-app update UI is really not a good one. It's actually pretty tricky for a page to know with confidence how many in-app tabs are still open, especially since there's no way to add a lock/semaphore on this.

If that turns out to be the only long-term way to refresh a page atomically, then this is a big embarrassment for service workers.

Back in 2012, @jakearchibald wrote that infamous article, http://alistapart.com/article/application-cache-is-a-douchebag

As a "game" app, I guess I won the AppCache lottery or something, because AppCache was pretty straightforward. I had to set NETWORK: * and I had to add a version comment to the manifest, but other than that, I tell you, it just worked.

This bug is douchier than anything AppCache ever did to me.

@jakearchibald
Copy link
Contributor

@dfabulich

My imagination is poor, but I don't see how this can happen.

Content-Disposition response is one.

Specifically, the desired behavior is what AppCache does

Appcache allows two versions of the app to run concurrently, assigning caches to documents. So one tab can be running off v2 while another is running on v1. Is this the model you want?

The point of this diatribe is that it's impossible to reproduce the desired AppCache behavior.

This is why we introduced the client IDs https://jakearchibald.com/2016/service-worker-meeting-notes/#fetch-event-clients, so you can track clients and associate caches with them (via indexeddb or something).

It's actually pretty tricky for a page to know with confidence how many in-app tabs are still open

The clients API can do this pretty easily.

If that turns out to be the only long-term way to refresh a page atomically, then this is a big embarrassment for service workers.

This thread is getting pretty long, but I ask that you don't throw around words like "embarrassment", especially as you don't seem so sure about how appcache worked.

What's your favoured solution here? Assigning caches to a particular client, or providing a hook into the refresh button so you can call skipWaiting before the fetch is dispatched?

@dfabulich
Copy link
Author

I finally found time to do more research on this. You're quite right that I didn't know that AppCache assigns caches to documents, and so allows each tab to run its own cache.

re: assigning caches to particular clients (as AppCache does), I can see how the current API would allow for that, but, as you may have guessed, I really don't want to do that; I was wrong to think AppCache did it correctly.

I want all tabs to share the same cache, until the last tab is refreshed, at which point I want the new cache to immediately set in before the refresh loads.

Since I posted this issue last year, there's been other discussion around a "navigation" event, to which @wanderview replied:

How is this different than a FetchEvent with a "navigate" RequestMode?

addEventListener('fetch', evt => {
  if (evt.request.mode === 'navigate') {
    // navigation event stuff here
  }
});

I, a fool, didn't yet know that FetchEvent.request.mode existed. But I think that alone may give me the power to do what I want.

The idea is: on fetch in navigate mode, if there's just one client, post a message to the new SW, asking it to skipWaiting. It's too late for the new SW to directly handle the navigation that already started, so in that case, we can just coordinate with the new SW to do what the new SW would have done.

In my case, I know that all my SWs ever do is to check the cache and fallback to the network, so I can just open the newest cache (instead of the oldest, which is the default behavior for caches.match) and return cached results from there, falling back to the network.

I've demonstrated this in https://github.com/dfabulich/service-worker-refresh-bug/tree/master/possible-workaround

Here's the key section:

self.addEventListener('fetch', function(event) {
  console.log("[ServiceWorker] fetch", CACHE_NAME, event.request.url.replace(/^.*\//, ""));
  event.respondWith(clients.matchAll().then(function(clients) {
    if (clients.length < 2 && event.request.mode === "navigate" && registration.waiting) {
      console.log("[ServiceWorker]", CACHE_NAME, 'sending skipWaiting');
      registration.waiting.postMessage("skipWaiting");
      var lastKey;
      return caches.keys().then(function(keyList) {
        lastKey = keyList[keyList.length-1];
        return caches.open(lastKey);
      }).then(function(cache) {
        return cache.match(event.request);
      }).then(function(cached) {
        var response = cached || fetch(event.request);
        console.log("[ServiceWorker] response", CACHE_NAME, "from", lastKey, event.request.url.replace(/^.*\//, ""), response);
        return response;
      })
    } else {
      return caches.match(event.request).then(function(cached) {
        var response = cached || fetch(event.request);
        console.log("[ServiceWorker] response", CACHE_NAME, event.request.url.replace(/^.*\//, ""), response);
        return response;
      });
    }
  }));
});

If there's just one client, the request is a navigation, and there's a SW waiting, tell it to skipWaiting, and return index.html from the newest cache (the last cache key, instead of the first); otherwise, do the normal thing.

I tested this in Chrome 62 and it seemed to do what I want: with just one open tab, refreshing the tab twice (once to pick up the new SW and then again to activate it) updated the page, but with two open tabs, refreshing any number of times continued to use the old cache.

Unfortunately, the code didn't work at all in Firefox, because registration.waiting is always null. (In fact, registration.active is null, too, in the SW in Firefox.)

Can y'all confirm that my code is actually correct? Is there a cleverer or more standard way to handle this case? Do I just need to file a bug on Firefox?

@dfabulich
Copy link
Author

Thrashing around for a way to make this work on Firefox, this trick seems to work, but it seems too clever to be the right thing:

https://github.com/dfabulich/service-worker-refresh-bug/tree/master/fixed-atomic-skip-waiting

  1. subresource URLs have a ?v=1 URL parameter to uniquify them. (In production, they'd use hashes instead of version numbers.)
  2. Instead of deleting all non-current caches during activate, as all SW tutorials do, we .slice(0,-2) the cache key list, deleting all old caches except for the last two. Thus, when the v2 SW skips waiting during install, the existing tab can still access its ?v=1 URLs, but we'll eventually drop the v1 cache when the v3 SW activates.
  3. When I did this naively, I'd assumed that caches.match() would return index.html from the newest cache, but it did the opposite, returning index.html from the oldest cache. So the page effectively refused to update at all, relying on the old cache when a new one was available. So I tweaked the fetchevent handler to first query the current SW's own cache, and then fallback to querying all caches before giving up and hitting the network.
  4. That worked for refreshing one tab, but if I opened two or three tabs and tried to push an update, I'd get some tabs on the old version and some tabs on the new version, which is what I said I didn't want. So, for my last trick, we count uncontrolled clients in the install event handler, and only skipWaiting if there's exactly one client. This is a big hack compromise; refreshing the only tab works fine, but if a new SW installs while there are multiple tabs open, it will refuse activate even if you close all other tabs and refresh the last tab.

@dfabulich
Copy link
Author

This thread is too much full of my own confusions. I'm closing this issue and filing #1238 to replace it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants