WebVitals metrics measurements are incorrect for "slow" loading sites #960
I've been investigating this issue further. My main concern was to clarify whether the k6 browser implementation was incorrect at some point during Web Vitals metrics gathering, parsing or reporting: from the tooling script injected into the website, to the values being reported to k6 browser through a binding, and finally being pushed to k6 core. For this I defined a set of hypothetical issues that might be producing these metric mismatches and explored them individually. This was done by collecting timestamped events for the moments at which:
A clarification first on
grafana.com | k6 Browser | headful | Lighthouse | PageSpeed | WebPageTest |
---|---|---|---|---|---|
TTF | 105.1ms | 100.7ms | - | 500ms | 722ms |
FCP | 317.6ms | 353.9ms | 800ms | 1800ms | 2074ms |
LCP | 521.3ms | 554ms | 1600ms | 2100ms | 3989ms |
CLS | 0.001913 | 0.001959 | 0.003 | 0.02 | 0.029 |
slack.com | k6 Browser | headful | Lighthouse | PageSpeed | WebPageTest |
---|---|---|---|---|---|
TTF | 143.84ms | 248.59ms | - | 700ms | 899ms |
FCP | 573.6ms | 629.19ms | 900ms | 1500ms | 2821ms |
LCP | 603.1ms | 608.19ms | 1600ms | 2100ms | 2821ms |
CLS | 0.057071 | 0.061046 | 0.056 | 0.15 | 0.236 |

Headless/headful mode produced huge differences between runs for slack.com: LCP ranged from 600ms to 5.38s, and FCP reached up to 2.5s in a few executions in headful mode.

github.com | k6 Browser | headful | Lighthouse | PageSpeed | WebPageTest |
---|---|---|---|---|---|
TTF | 133.8ms | 142.09ms | - | 1000ms | 845ms |
FCP | 359.6ms | 450.39ms | 700ms | 1900ms | 3216ms |
LCP | 375ms | 406.19ms | 900ms | 2700ms | 3843ms |
CLS | 0.002217 | 0.001416 | 0 | 0.11 | 0 |
news.ycombinator.com | k6 Browser | headful | Lighthouse | PageSpeed | WebPageTest |
---|---|---|---|---|---|
TTF | 515.39ms | 509.7ms | - | 600ms | 895ms |
FCP | 721.59ms | 735.1ms | 600ms | 600ms | 1464ms |
LCP | 721.59ms | 735.1ms | 800ms | 600ms | 1464ms |
CLS | 0 | 0 | 0 | 0 | 0 |
youtube.com | k6 Browser | headful | Lighthouse | PageSpeed | WebPageTest |
---|---|---|---|---|---|
TTF | 250.35ms | 209.89ms | - | 700ms | 2093ms |
FCP | 439.1ms | 412.3ms | 700ms | 2600ms | 4234ms |
LCP | 1650ms | 1640ms | 2600ms | 4000ms | 6187ms |
CLS | 0.000816 | 0.000737 | 0.001 | 0.31 | 0 |
www.zara.com | k6 Browser | headful | Lighthouse | PageSpeed | WebPageTest |
---|---|---|---|---|---|
TTF | 148.1ms | 171.5ms | - | 400ms | 950ms |
FCP | 174.89ms | 431.6ms | 600ms | 1100ms | 3052ms |
LCP | 174.89ms | 464.9ms | 700ms | 3000ms | 9160ms |
CLS | 0 | 0 | 0.002 | 0.14 | 0 |
See also charts.
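
To put the gap between tools into one number per site, the LCP rows above can be turned into ratios. This is only a rough sketch: the values are copied from the "k6 Browser" and "Lighthouse" columns of the tables, while `ratiosToK6` is a hypothetical helper name, not part of k6 or Lighthouse.

```javascript
// LCP values in milliseconds, copied from the tables above
// ("k6 Browser" and "Lighthouse" columns).
const lcpMs = {
  'grafana.com': { k6: 521.3, lighthouse: 1600 },
  'slack.com': { k6: 603.1, lighthouse: 1600 },
  'github.com': { k6: 375, lighthouse: 900 },
  'news.ycombinator.com': { k6: 721.59, lighthouse: 800 },
  'youtube.com': { k6: 1650, lighthouse: 2600 },
  'www.zara.com': { k6: 174.89, lighthouse: 700 },
};

// Hypothetical helper: for each site, divide the other tool's value
// by the k6 browser value.
function ratiosToK6(table, tool) {
  const out = {};
  for (const [site, row] of Object.entries(table)) {
    out[site] = row[tool] / row.k6;
  }
  return out;
}

const lighthouseRatios = ratiosToK6(lcpMs, 'lighthouse');
for (const [site, ratio] of Object.entries(lighthouseRatios)) {
  console.log(`${site}: Lighthouse LCP is ${ratio.toFixed(2)}x the k6 browser LCP`);
}
```

A ratio close to 1 means both tools roughly agree for that site; larger ratios point at sites where k6 browser reports a much earlier LCP than Lighthouse.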
Next steps
- Revert "Report Web Vital metrics on every chance" (#949) in order to avoid incorrect LCP measurements.
- Apply "Web Vitals force page hide" (#953) in order to make Web Vitals metrics reporting more consistent.
- Investigate the impact of the new headless mode on Web Vitals metrics ("Investigate working with new type for headless option", #948).
Great bit of analysis 👏 🎉 It feels like there's not much we can do at the moment (apart from what is listed in the Next Steps, which I agree with). We are bound by the Web Vitals library and by the behaviour of all the different factors (the browser module, the website under test, the OS, the compute resources, the network, etc). I think what will be important is how consistent the measurements are from one test run to another: a test today against grafana.com should produce a similar result to a later one, with a small margin of error (or delta), assuming the factors remain the same.
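
One way to make "consistent from one run to another" concrete is to summarize repeated runs of the same test and look at the spread relative to the median. The sketch below is an assumption about how that could be checked, not something k6 does today; `summarizeRuns` and the sample values are hypothetical.

```javascript
// Hypothetical helper: summarize one metric (in ms) across repeated runs.
function summarizeRuns(samplesMs) {
  if (samplesMs.length === 0) {
    throw new Error('need at least one sample');
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const min = sorted[0];
  const max = sorted[sorted.length - 1];
  // Spread relative to the median: 0 means identical runs.
  const relativeSpread = median === 0 ? 0 : (max - min) / median;
  return { min, max, median, relativeSpread };
}

// Made-up LCP samples for illustration only.
const stable = summarizeRuns([500, 520, 510]);
const unstable = summarizeRuns([600, 620, 5380]);
console.log(stable.relativeSpread.toFixed(3), unstable.relativeSpread.toFixed(3));
```

A single outlier run, like the multi-second LCP values seen for slack.com in headful mode, shows up immediately as a large relative spread.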
See comparison charts for FCP and LCP metrics for the sites under test: grafana.com, slack.com, github.com, news.ycombinator.com, youtube.com, zara.com.
We can see that the trends between k6 browser and Lighthouse usually match, and that the differences between them in absolute values are not so significant compared to the differences with other tools.
An interesting test comparison between k6 browser and Grafana Faro: this test was done by instrumenting the k6-docs site with Grafana Faro, running the site locally and executing k6 browser tests against it. The interesting point about this test is that it lets us obtain metrics calculated by k6 browser and Grafana Faro under the same "environment conditions", including hardware, network, etc. In addition, both measurements for each sample come from the same page request/page load, as the k6 browser script just loads the page.
[Charts: FCP and LCP, k6 browser vs. Grafana Faro]
Try it yourself
k6-docs
Instrument web app with Faro:
Apply diff:
diff --git a/src/components/pages/doc-welcome/faro/faro.js b/src/components/pages/doc-welcome/faro/faro.js
new file mode 100644
index 00000000..0ee82df8
--- /dev/null
+++ b/src/components/pages/doc-welcome/faro/faro.js
@@ -0,0 +1,16 @@
+import { initializeFaro as coreInit } from '@grafana/faro-web-sdk';
+
+export function initializeFaro() {
+ const faro = coreInit({
+ url: 'http://localhost:8027/collect',
+ apiKey: 'api_key',
+ app: {
+ name: 'k6-docs-test',
+ version: '1.0.0',
+ },
+ });
+
+ faro.api.pushLog(['Faro was initialized']);
+}
diff --git a/src/components/pages/doc-welcome/faro/index.js b/src/components/pages/doc-welcome/faro/index.js
new file mode 100644
index 00000000..00cebe8e
--- /dev/null
+++ b/src/components/pages/doc-welcome/faro/index.js
@@ -0,0 +1 @@
+export * from './faro';
diff --git a/src/components/pages/doc-welcome/index.js b/src/components/pages/doc-welcome/index.js
index 6fbdb036..06203d92 100644
--- a/src/components/pages/doc-welcome/index.js
+++ b/src/components/pages/doc-welcome/index.js
@@ -1,3 +1,5 @@
+import { initializeFaro } from './faro';
+
export * from './cloud';
export * from './features';
export * from './help';
@@ -6,3 +8,6 @@ export * from './showcase';
export * from './what-is';
export * from './whats-new';
export * from './manifesto';
+export * from './faro';
+
+initializeFaro();
Run it:
Faro
The website will be available at http://localhost:8100 and Grafana at http://localhost:300. To see the Web Vitals metrics go to
Test
And run the following script:
Simple k6 browser test being used.
import { browser } from 'k6/x/browser';

export const options = {
  scenarios: {
    ui: {
      executor: 'shared-iterations',
      options: {
        browser: {
          type: 'chromium',
        },
      },
    },
  },
};

export default async function () {
  // The original snippet used `page` without defining it, so open one first.
  // `await` is harmless if newPage() is synchronous in this xk6-browser version.
  const page = await browser.newPage();
  try {
    // Navigate to the locally running k6-docs site.
    await page.goto('http://localhost:8100/');
  } finally {
    await page.close();
  }
}
Conclusion
So, all in all, it is good to verify that both k6 browser and Grafana Faro produce almost identical results when executed under the same conditions for a "real world" website (not a simple test site), even if in the end they serve different purposes. Still, as mentioned in the previous comment, it's important to stress again the impact that the "execution environment" can have on Web Vitals measurements, considering things like hardware and network conditions, but also headless/headful mode, viewport and screen resolution, and user interaction with the page.
Brief summary
After the modifications made in #949 and #943, and the comparisons done in #950, we concluded that, in some cases, k6 browser reports incorrect Web Vitals measurements. This happens especially for sites that are more dynamic and take longer to load: in these cases k6 browser seems to report "intermediate" values corresponding to samples taken before the page is fully loaded, an event that k6 browser does not seem to wait long enough for. This particularly affects the FCP and LCP metrics.
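
The "intermediate value" effect can be illustrated with a simplified model. The browser emits a new LCP candidate each time a larger element is painted, so whoever stops listening early only sees the earlier candidates. The sketch below is not the k6 browser implementation; `lcpAtCutoff` is a hypothetical helper and the candidate times are made up.

```javascript
// Hypothetical model: given the render times (ms) of successive LCP
// candidates and the moment measurement stops, return the value that
// would be reported, or null if no candidate was seen in time.
function lcpAtCutoff(candidateTimesMs, cutoffMs) {
  let reported = null;
  for (const t of candidateTimesMs) {
    if (t <= cutoffMs) {
      reported = reported === null ? t : Math.max(reported, t);
    }
  }
  return reported;
}

// Made-up candidates for a slow, dynamic page.
const candidates = [300, 800, 1600];
console.log(lcpAtCutoff(candidates, 1000)); // stops early: intermediate value
console.log(lcpAtCutoff(candidates, 2000)); // waits long enough: final value
```

Under this model, a tool that finalizes the metric before the last candidate arrives reports a lower LCP than one that waits for the page to finish loading.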
xk6-browser version
commit: c959b57
OS
Ubuntu 20.04.5 LTS
Chrome version
113.0.5672.126 (Official Build) (64-bit)
Docker version and image (if applicable)
No response
Steps to reproduce the problem
Run the following test:
Expected behaviour
Web Vitals metrics should be similar to the ones reported by Google Lighthouse:
Actual behaviour
Web Vitals metrics reported by k6 browser:
Notice the differences in FCP and LCP.