PWA custom metric #198
I've also tested this on vice.com, which does use Workbox. However, the custom metric doesn't seem to find any useful info in the SW: https://www.vice.com/service-worker.js

const workboxPattern = /workbox\.([a-zA-Z]+\.?[a-zA-Z]*)/g;

This pattern doesn't match any of the lines in the SW. I based it off of this query, which @tunetheweb may have written. Let me know if anyone has suggestions for a better pattern detector.
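To make the mismatch concrete, here is a minimal sketch of what the pattern above captures. Both service worker snippets are hypothetical and written only for illustration (they are not taken from vice.com), and `findWorkboxCalls` is a helper name invented for this example. The assumption is that a script calling the same functions without the `workbox.` prefix produces no matches at all.

```javascript
// Pattern quoted in the comment above.
const workboxPattern = /workbox\.([a-zA-Z]+\.?[a-zA-Z]*)/g;

// Hypothetical SW source that calls Workbox through the `workbox.` namespace.
const dotStyleSw = `
workbox.setConfig({debug: false});
workbox.routing.registerRoute("/api", new workbox.strategies.NetworkFirst());
`;

// Hypothetical SW source where the same calls appear without the prefix.
const noPrefixSw = `
registerRoute("/api", new NetworkFirst());
self.addEventListener("fetch", () => {});
`;

// Illustrative helper: returns the first capture group of every match.
function findWorkboxCalls(source) {
  return Array.from(source.matchAll(workboxPattern), (match) => match[1]);
}

console.log(findWorkboxCalls(dotStyleSw));
console.log(findWorkboxCalls(noPrefixSw));
```

The first call lists the namespaced methods; the second returns an empty array, which is the behavior described for the vice.com script.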
Looks to me like only workbox-sw uses the I don't know enough about Workbox but is I checked last year's query, and vice.com does appear as a service worker site (and uses a similar SW script to the present one), but it isn't picked up by my query, so it, and other sites not using
@rviscomi, please assign @jeffposnick the review. He's the authoritative source when it comes to all current and historic ways to identify the library. What you have (with the
Modern Workbox usage can be identified via strings like So if both could be accounted for, that would be great.
Updated the workbox pattern to Now detecting workbox on Vice:
And still working on publicstorage.
Full version number ( ...unless there are any negative repercussions to breaking out the matches into individual entries instead of aggregating them more? Does it end up using up a lot more storage or something? In which case, we can opt for aggregation, both on the major version number and, for the older-style Workbox usage, on just one level of detail after the Whatever works best for you all to ensure that this stays scalable.
Updated to extract the full version number. Storage shouldn't be an issue and any redundancy could be accounted for in the BigQuery analysis.
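As a rough illustration of handling that redundancy at analysis time, the sketch below rolls full version strings up to their major version. The input array and the `countByMajorVersion` name are hypothetical; the real metric output and the eventual BigQuery query may be shaped differently.

```javascript
// Illustrative helper: counts entries per major version.
// Assumes each entry is a dotted version string such as "5.1.4".
function countByMajorVersion(versions) {
  const counts = {};
  for (const version of versions) {
    const major = version.split(".")[0];
    counts[major] = (counts[major] || 0) + 1;
  }
  return counts;
}

// Hypothetical sample of full version numbers, one per detected site.
const sampleVersions = ["4.3.1", "5.1.4", "5.1.3", "6.1.5"];
console.log(countByMajorVersion(sampleVersions));
```

Keeping the full version in the metric and aggregating later means the same data can answer both the coarse question (v4 vs. v5 vs. v6) and finer ones.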
Thanks so much, Rick! Now that this is merged, can you provide a bit more context about the "how" and "when" of querying the new data?
The new custom metrics will be picked up in the June crawl, the results of which should be queryable by the end of June. At that time I could provide a sample query to extract the PWA data. Do you have any specific metrics you want to query? For example, % of sites that use Workbox at all, version distribution, package/method popularity, etc. Now is also a good time to double check that the custom metric enables the kinds of use cases you have.
We'd be interested in questions like number of unique origins that use any version of Workbox, relative Workbox version popularity (Workbox v4 vs. v5 vs. v6), what percentage of sites that register a service worker use Workbox, and maybe a few other questions in that general vein. CC: @tropicadri
Ok sounds like we should be in good shape. The custom metric supports those use cases. 👍
Hi folks, As discussed in #2153, we are looking to expand this script to include (at a minimum) two more fields for the PWA section of the Web Almanac: one to detect the usage of service worker events and another one to obtain the different calls to I wanted to ask you if you could share details on how you usually test this code during development, to iterate on it and make sure it works. I initially thought I could just obtain the value of I hope this request is clear. I'm just trying to find a more efficient way that lets us iterate more quickly when maintaining these scripts. But maybe the one you're using is just to run the test in WPT and see the responses there. Thanks!
Yup, as I understand it we just do some example runs against WPT for testing. Best to test at least a PWA page and a non-PWA page, to confirm both that it works when expected and that it doesn't throw an error if it can't find what it's looking for. Plus, include links to WPT in any pull request to make it easier to review.
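For checking those example runs, a small helper along these lines might save some clicking. It is only a sketch: the path `data.runs[1].firstView.pwa` is the one quoted in the PR description, while `readPwaMetric`, the sample objects, and the `serviceWorkerHeuristic` field are made up for illustration. It also assumes the metric value may arrive either as an object or as a JSON-encoded string.

```javascript
// Illustrative helper: pulls the `pwa` custom metric out of a parsed
// WPT JSON result, returning null when the metric is absent.
function readPwaMetric(wptResult) {
  const raw = wptResult?.data?.runs?.[1]?.firstView?.pwa;
  if (raw === undefined || raw === null) {
    return null;
  }
  if (typeof raw !== "string") {
    return raw;
  }
  try {
    // Assumption: string values are JSON-encoded metric output.
    return JSON.parse(raw);
  } catch {
    return raw;
  }
}

// Hypothetical results standing in for a PWA page and a non-PWA page.
const pwaPageResult = {
  data: { runs: { 1: { firstView: { pwa: '{"serviceWorkerHeuristic":true}' } } } },
};
const plainPageResult = { data: { runs: { 1: { firstView: {} } } } };

console.log(readPwaMetric(pwaPageResult));
console.log(readPwaMetric(plainPageResult));
```

Running it against both kinds of result mirrors the advice above: the PWA page should yield data, and the non-PWA page should yield null without throwing.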
This is a custom metric that we can use as a workaround for the HTTP Archive response bodies being unavailable. It may also be useful for the PWA chapter of the 2021 Web Almanac.
The process to get this data in and out of BigQuery is:
pages dataset for the _pwa field in the HAR payload

Known issue: a page like https://developers.google.com/web/tools/chrome-user-experience-report/ installs a SW, but its JS is so minified that there's no way to reliably parse the registration with a regex. I'd love to hear if there are better ways to detect which JS resource is a SW. (Maybe we can borrow a technique from Lighthouse? cc @brendankenny)
Example WPT: https://www.webpagetest.org/result/210511_BiDc1V_ae6dbe0c19cd64c52b477fa46d3ab42a/?f=json
Output (addressable as $.data.runs[1].firstView.pwa):