core: use the same scoring curve for desktop in all channels #9911
Conversation
       scoreMedian: 4000,
     };
+  static getDefaultScoreOptions(artifacts, context) {
+    if (context.settings.emulatedFormFactor === 'mobile' || artifacts.HostDevice === 'mobile') {
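(Filled out, the new method presumably reads like the following sketch; the desktop numbers match the lr-desktop-config curve discussed later in this thread.)

static getDefaultScoreOptions(artifacts, context) {
  if (context.settings.emulatedFormFactor === 'mobile' || artifacts.HostDevice === 'mobile') {
    // Mobile scoring curve (the current default).
    return {scorePODR: 2000, scoreMedian: 4000};
  }
  // Desktop scoring curve, matching lr-desktop-config.
  return {scorePODR: 800, scoreMedian: 1600};
}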
I think this logic is right. @paulirish can you confirm?
#9436 is scoring for desktop runs, which could either be emulated desktop or no emulation but actually running on a desktop. So don't we already have the artifact (setting aside the general usefulness of a HostFormFactor), and this line could be `if (artifacts.TestedAsMobileDevice) {`?
Emulating desktop on a mobile device doesn't make much sense to me. Shouldn't those runs be scored with numbers tuned for a mobile device?
it doesn't make much sense to me either why someone would be doing that :) but if lantern doesn't correctly emulate desktop there, it should be fixed at that level, not patched up at the scoring level
Yah, I agree that TestedAsMobileDevice gives us the signal we want.
real mobile hardware + desktop emulation is stupid but that's solved with a warning telling the developer they're stupid rather than mis-scoring them.
Thank god we don't have first-class support for tablet with a mouse or whatever. :)
I still like having HostFormFactor in the artifacts tho.. even if we aren't using it here.
I guess that could be a separate PR, but no big deal.
nice, I like this approach overall
types/artifacts.d.ts (outdated)

@@ -29,6 +29,8 @@ declare global {
     LighthouseRunWarnings: string[];
     /** Whether the page was loaded on either a real or emulated mobile device. */
     TestedAsMobileDevice: boolean;
+    /** Device which Chrome is running on. */
+    HostDevice: 'desktop'|'mobile';
🚲 🏠 HostFormFactor?
+1. What do others think?
+1 to hostformfactor
@@ -54,12 +64,13 @@ class FirstContentfulPaint extends Audit {
     const devtoolsLog = artifacts.devtoolsLogs[Audit.DEFAULT_PASS];
     const metricComputationData = {trace, devtoolsLog, settings: context.settings};
     const metricResult = await ComputedFcp.request(metricComputationData, context);
+    const scoreOptions = context.options || this.getDefaultScoreOptions(artifacts, context);
can we have this be performed by the runner? i.e. check if default options is a function that accepts arguments and invoke it for the default options, and/or permanently convert it to a function instead of a getter property?
I like this idea of passing in context to determine default options overall though :)
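Something like this minimal sketch, perhaps (names are illustrative, not the actual Runner internals):

// Sketch only: resolve an audit's default options whether `defaultOptions`
// is a plain getter property or a function taking (artifacts, context).
function resolveDefaultOptions(AuditClass, artifacts, context) {
  const defaults = AuditClass.defaultOptions;
  return typeof defaults === 'function' ?
    defaults.call(AuditClass, artifacts, context) :
    defaults;
}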
permanently convert it to a function instead of a getter property?
That was another approach I started with. @paulirish and I thought it best to avoid changing the interface for default options, but I didn't consider supporting both a function and the (current) getter property, which I like.
thought it best to avoid changing the interface for default options
If we had a bunch of users of our default options already I'd agree, but we're still in the early stages there so if we'd like to move IMO now would be the time rather than indefinitely support two modes :)
having a
static getDefaultOptions(artifacts, context) {
  if (artifacts.TestedAsMobileDevice) {
    return {
      scorePODR: 2000,
      scoreMedian: 4000,
    };
  } else {
    return {
      scorePODR: 800,
      scoreMedian: 1600,
    };
  }
}

static async audit(artifacts, context) {
  const scoreOptions = context.options;
  // ...
}
seems ergonomically (and functionally) equivalent to something like
static get defaultOptions() {
  return {
    mobile: {
      scorePODR: 2000,
      scoreMedian: 4000,
    },
    desktop: {
      scorePODR: 800,
      scoreMedian: 1600,
    },
  };
}

static async audit(artifacts, context) {
  const scoreOptions = context.options[artifacts.TestedAsMobileDevice ? 'mobile' : 'desktop'];
  // ...
}
and avoids churn for anyone using audit options.
but speaking of users of default options...is there any reason to keep score control points in the options? The main push for that was to be able to make a desktop config that defined its own score curves (#4873) to override the default ones, but if desktop scoring is being moved into the audits themselves, any reason to continue defining them circuitously like this instead of as regular properties?
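(To make "regular properties" concrete, a hypothetical shape, with names of my own invention:)

// Hypothetical: control points as plain static data on the audit class,
// not routed through audit options at all.
static get SCORE_CURVES() {
  return {
    mobile: {scorePODR: 2000, scoreMedian: 4000},
    desktop: {scorePODR: 800, scoreMedian: 1600},
  };
}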
tho I am assuming some user actually uses this
Looks like the only people using it are folks that are reusing the desktop config.
but speaking of users of default options...is there any reason to keep score control points in the options?
I was thinking the same thing.
I'm in favor of moving scoring out of options (roughly reverting #4927). If a user really wants different scoring, they can create a plugin that requires a core metric audit and then recomputes the score their way.
I'm happy to add friction there. I want audit options to be used for all sorts of other things but core perf metric scoring doesn't seem like something that should be part of our extensibility story.
@patrickhulce how sad would you be?
Looks like the only people using it are folks that are reusing the desktop config.
A GitHub search would find any open source tools that use Lighthouse + these config options, but it doesn't tell us anything about how power users might be using configs in their infra to run Lighthouse.
Setting it up with options allows folks to run with advanced throttling on different connection types with just a slightly modified config ala lr-desktop-config. To say that removing scoring from options adds friction feels like quite an understatement: if we removed it, they would need to completely reimplement every metric audit just to have meaningful scores on any connection type outside our two blessed ones, which sounds like an effective way to completely kill testing of other connection types.
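(For reference, the kind of config this enables looks roughly like the following sketch, in the spirit of lr-desktop-config; the throttling values and the audit list are illustrative, not copied from a real config.)

// Sketch of a custom config that re-tunes score curves via audit options.
module.exports = {
  extends: 'lighthouse:default',
  settings: {
    throttling: {rttMs: 150, throughputKbps: 1638.4, cpuSlowdownMultiplier: 4},
  },
  audits: [
    {path: 'metrics/first-contentful-paint', options: {scorePODR: 800, scoreMedian: 1600}},
    // ...one entry per metric audit...
  ],
};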
We also originally structured it this way to offer metric values on different connection types. We had dreams once upon a time of offering multiple views within the same report, and while no one has really talked about something like that recently, we get that for free when the inputs all come from options; it would be rather difficult to accomplish and/or require hard-coded copies without it. If #8374 had been able to land, I would have shipped my slow-3g performance plugin that does this :)
Nuking scoring in options kills both of these dreams, and I'm not personally convinced that a custom config isn't already enough friction to discourage it. AFAICT from my brief search there are exactly 0 usages of custom scorePODR in open source GitHub where the values aren't just copied from our guidance or lr-desktop-config (EDIT: Paul already said this, so I think we interpret that finding differently; agree with Connor that it's just power users then :)).
In summary, I would be very, very sad to see this die.
okay. :)
we'll keep the scoring parameters in audit options.
I don't think it's a huge burden to make users do `class Slow3gInteractive extends Interactive {static getScoreOptions() {return myAlteredScoreOptions;}}`. But we could also be rid of the (undocumented) passing of an audit's default options back into itself with a simple `const {options = this.defaultOptions} = context` in the audit and still have the override behavior without runner trickiness.
...but I can live with the status quo :)
But I would very much like to see no new options magic for something we all agree no one uses to do something complicated when we can do it in a simple way :)
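A minimal sketch of those two pieces together (Slow3gInteractive and myAlteredScoreOptions are hypothetical, as above):

// Sketch: the audit falls back to its static defaults when the config
// supplies no options, so the runner needs no special casing.
static async audit(artifacts, context) {
  const {options = this.defaultOptions} = context;
  // ...score with options.scorePODR / options.scoreMedian...
}

// And the override story for a hypothetical plugin audit:
class Slow3gInteractive extends Interactive {
  static get defaultOptions() {
    return myAlteredScoreOptions; // hypothetical values
  }
}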
     return {
       score: Audit.computeLogNormalScore(
         metricResult.timing,
-        context.options.scorePODR,
-        context.options.scoreMedian
+        scoreOptions.scorePODR,
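For context on what these two options control, here's a rough sketch of the curve (an approximation supplied here for illustration, not Lighthouse's actual statistics code):

// The median control point scores exactly 0.5. On Lighthouse's curve the
// PODR ("point of diminishing returns") scores roughly 0.92, which this
// sketch uses to pin the shape parameter.
function approximateLogNormalScore(timing, scorePODR, scoreMedian) {
  // erfc(-1) ~= 1.84, i.e. a score of ~0.92, so put the PODR one
  // standardized unit below the median in log space.
  const shape = Math.log(scoreMedian / scorePODR) / Math.SQRT2;
  const standardized = (Math.log(timing) - Math.log(scoreMedian)) / (shape * Math.SQRT2);
  return 0.5 * erfc(standardized);
}

// Numerical Recipes approximation of the complementary error function.
function erfc(x) {
  const z = Math.abs(x);
  const t = 1 / (1 + z / 2);
  const r = t * Math.exp(-z * z - 1.26551223 + t * (1.00002368 + t * (0.37409196 +
    t * (0.09678418 + t * (-0.18628806 + t * (0.27886807 + t * (-1.13520398 +
    t * (1.48851587 + t * (-0.82215223 + t * 0.17087277)))))))));
  return x >= 0 ? r : 2 - r;
}

// approximateLogNormalScore(4000, 2000, 4000) === 0.5  (the median)
// approximateLogNormalScore(2000, 2000, 4000) ~= 0.92  (the PODR)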
probably should still note this as a breaking change in release notes, but this WFM 👍
agreed. we should adopt a way to signify breaking changes for release-notes-writing time.
@@ -22,7 +22,7 @@ describe('Performance: first-meaningful-paint audit', () => {
     const context = {options, settings: {throttlingMethod: 'provided'}, computedCache: new Map()};
     const fmpResult = await FMPAudit.audit(artifacts, context);

-    assert.equal(fmpResult.score, 1);
+    assert.equal(fmpResult.score, 0.96);
kinda weird to see all these scores change in the tests. im assuming thats because TestedAsMobileDevice is falsy by default in these tests. that's a little odd, as LH is otherwise mobile by default.
we need something better here. options i can think of:
1. add {TestedAsMobileDevice: true} into all these tests to make it explicit
2. change the ternary in the metric audits to check TestedAsMobileDevice === false ? 'desktop' : 'mobile', because i guess it'd be undefined in all these test scenarios?
I considered both of those.
the first was annoying and a big change to the tests.
the second is ehhhh. I would do this if only there was one single place this condition lived, but it's in many places.
but "that's a little odd, as LH is otherwise mobile by default" is so convincing that we should do one of these. I choose the second. is there a good place to move this code to a function?
ehhhhh i just did 2)
Fixes #9436.
The first inclination was to add some overrides in config.js, kinda like this. But having the scoring curve numbers co-located with the audits is just so good, so let's try to keep them together.

The next inclination was to modify the audit context to have hostDevice. However, this was kind of awkward. For example, there's no great place to keep the logic of host agent checking, which is needed in creating the base artifacts and now here.

So instead of avoiding making a new artifact, I suggest we embrace it. This will be useful for #9713 too.

Draft to get consensus. I only did one audit.