Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests(lantern): add lantern regression test scripts #5435

Merged
merged 9 commits into from
Jun 15, 2018

Conversation

patrickhulce
Copy link
Collaborator

Adds lantern regression testing against ~100 URLs to travis (only when lantern files are affected).

A few things this PR highlighted:

  1. Our scripts are getting sophisticated and probably a candidate for some refactoring into libs and such
  2. We duplicate a decent bit of logic in our scripts, perhaps moving more to js and aforementioned refactoring would help.
  3. Names are hard and suggestions welcome. It felt weird to call the lantern expectations file we checkin a "golden" file since it's not really golden, it's where we are right now and WPT results are the golden expectations we're trying to reach. I sorta went with computed vs. expected but these aren't great and not 100% consistent.

@patrickhulce patrickhulce changed the title tests(lantern): add golden lantern test scripts tests(lantern): add lantern regression test scripts Jun 5, 2018
@patrickhulce
Copy link
Collaborator Author

I think this is the last thing standing in my way before #5362 #5482 #5483 etc 😃

@paulirish
Copy link
Member

I sorta went with computed vs. expected but these aren't great and not 100% consistent.

what do each of these groups represent? i'm not sure between WPT data, 'golden' lantern data, and freshly computed lantern data.. what we're dealing with here.

@patrickhulce
Copy link
Collaborator Author

what do each of these groups represent? i'm not sure between WPT data, 'golden' lantern data, and freshly computed lantern data.. what we're dealing with here.

we have a few main things

  1. lantern-data/lantern-expectations.json this contains URLs and the WPT metric values, this is our target against which accuracy is judged
  2. lantern-data/lantern-computed.json this contains the URLs and computed Lantern values for the current build
  3. core/test/fixtures/lantern-expectations.json this contains the URLs and computed Lantern values for master which is diff'd similar to our golden LHR. this is the one that is most in need of a name change, but I'm not sure what to call it :)

@paulirish
Copy link
Member

perfect. this helps enormously. thanks!

some name ideas below:

  1. lantern-data/lantern-expectations.json this contains URLs and the WPT metric values, this is our target against which accuracy is judged

wpt-baseline-data

  1. lantern-data/lantern-computed.json this contains the URLs and computed Lantern values for the current build

lantern-fresh-data

  1. core/test/fixtures/lantern-expectations.json this contains the URLs and computed Lantern values for master which is diff'd similar to our golden LHR. this is the one that is most in need of a name change, but I'm not sure what to call it :)

lantern-master-data


I'm also fine with switching the order of baseline/fresh/master and wpt/lantern.

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Is there going to be a failure condition?

# Testing lantern can be expensive, we'll only run the tests if we touched files that affect the simulations.
CHANGED_FILES=""
if [[ "$CI" ]]; then
CHANGED_FILES=$(git --no-pager diff --name-only $TRAVIS_COMMIT_RANGE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$TRAVIS_COMMIT_RANGE

cooool

@patrickhulce
Copy link
Collaborator Author

Is there going to be a failure condition?

Hm, yeah seems useful to match golden LHR pattern here for now, I'll exit with 1 if there was any difference 👍

@patrickhulce
Copy link
Collaborator Author

@paulirish I oversimplified the lantern-expectations one a bit which I only realized once wpt-baseline-data sounded wrong to me.

It contains the list of all the URLs we should compare, the WPT metric values, and the paths to the devtools log and trace of the unthrottled runs. Just WPT feels a bit weird in that scenario since it's an index of the unthrottled data around it. If I were going verbose I'd probably say something like
lantern-3g-golden-expectations-index.json
lantern-3g-HEAD-computed-values.json
lantern-3g-master-computed-values.json

Copy link
Member

@paulirish paulirish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry! lots of questions.
we probably have to work through this in voice. :)


  1. evaluate-results.js - this is evaluating the correlation of wpt data with our HEAD lantern data, right? print-correlations.js?
  2. diff-expectations.js - this has nothing to do with wpt data but is logging out any difference between master...HEAD of lantern metrics, right? do we need this script to pass? or is this a sanity check so we are cognizant of any changes we make? assert-master-lantern-values-unchanged.js?

  • lantern-3g-master-computed-values.json

this name lgtm, though i'd be fine with dropping 3g.

  • lantern-3g-golden-expectations-index.json

this one not lgtm.

first, i'm a little wary of using 'expectations' especially since we use it in smokehouse and it feels like the semantics here are different.

i still feel like calling this wpt-baseline is fair. you said:

Just WPT feels a bit weird in that scenario since it's an index of the unthrottled data around it.

but i dont understand. if the data has no throttling applied then why would it be ...-3g-... ? i dont get why it'd be called lantern either.

and perhaps a rather macro question: why are we correlating simulated 3g against unthrottled metrics? i totally was expecting that we're comparing all this to wpt-3g metrics.

site-index-plus-golden-expectations
site-index-plus-golden-expectations-plus-HEAD-computed

exitCode = 1;
}
} finally {
fs.unlinkSync(TMP_COMPUTED);
Copy link
Member

@paulirish paulirish Jun 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do these files compare to lantern-data/lantern-{computed|expected}.json?

can we move -data/lantern-computed to .tmp?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe these should have -for-diff in the filename? (if i'm understanding why they're separate)

tar -xzf lantern-traces.tar.gz
mv lantern-traces-subset lantern-data
mv lantern-traces-subset/ lantern-data/
Copy link
Member

@paulirish paulirish Jun 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can these live in something like lantern-data/unthrottledAssets/ ?

so that lantern-data/lantern-expected.json are not siblings to them?

const EXPECTATIONS_PATH = path.resolve(process.cwd(), INPUT_PATH);
const EXPECTATIONS_DIR = path.dirname(EXPECTATIONS_PATH);
const COMPUTED_PATH = path.join(EXPECTATIONS_DIR, 'lantern-computed.json');
const RUN_ALL_SCRIPT_PATH = path.join(__dirname, 'run-all-expectations.js');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like this isn't running expectations but "collecting computed values". wdyt?

DIRNAME="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
LH_ROOT_PATH="$DIRNAME/../../.."
cd $LH_ROOT_PATH

if [[ -f lantern-data/lantern-expectations.json ]] && ! [[ "$FORCE" ]]; then
echo "Lantern data already detected, done."
Copy link
Member

@paulirish paulirish Jun 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

within the lantern-data/lantern-expectations.json file....

image

can these metrics be within a wpt3g object? and then put the trace/log props inside a unthrottledAssets obj?

const path = require('path');

const GOOD_ABSOLUTE_THRESHOLD = 0.2;
const OK_ABSOLUTE_THRESHOLD = 0.5;

const GOOD_RANK_THRESHOLD = 0.1;

if (!process.argv[2]) throw new Error('Usage $0 <computed summary file>');
const COMPUTATIONS_INPUT_ARG = process.argv[2] || './lantern-data/lantern-computed.json';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one of these things doesn't look like the others. COMPUTATIONS?

@patrickhulce
Copy link
Collaborator Author

@paulirish how you feelin' about this now :)

@paulirish
Copy link
Member

merge when you're ready

@patrickhulce
Copy link
Collaborator Author

let's do it! 🎉

@patrickhulce patrickhulce merged commit b263f83 into master Jun 15, 2018
@patrickhulce patrickhulce deleted the lantern_each_commit branch June 15, 2018 23:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants