
Investigate ways we can keep track of performance metrics in PRs #3051

Closed · 5 tasks done · Tracked by #3279

romaricpascal opened this issue Nov 29, 2022 · 6 comments

romaricpascal commented Nov 29, 2022

What

Look for ways to automatically measure and display performance metrics when submitting PRs.

Why

With a big change to the codebase ahead of us, it'll be good to identify how it impacts our users. We're hoping that modernising our JavaScript and polyfilling strategy will help shave kilobytes off their final bundle, but to be sure we need to measure it.

We're also looking to potentially automate polyfilling and transpiling the latest ES syntax to match the browsers we support. We'll want to keep tabs on these metrics to make sure code written once we've modernised our approach to JavaScript doesn't inadvertently weigh down our users with a sneaky polyfill or a feature that's heavy to transpile.

Assumptions

Not really assumptions, but a handful of links:

Timebox

We should review progress after this period of time has elapsed, even if the spike has not been 'completed'

1 day [to work out which metrics we think we care about]

Who is working on this?

Spike lead: Brett

Spike buddy: Romaric

Questions to answer

  • Which metrics can we measure and how does each help our users and us?
  • Which tools can help us keep tabs on these metrics and how?

Done when

You may find it helpful to refer to our expected outcomes of spikes.

  • Questions have been answered or we have a clearer idea of how to get to our goal
  • Findings have been reviewed and agreed with at least one other person
  • Findings have been shared, e.g. via a write-up on the ticket, at a show & tell or team meeting
domoscargin commented Nov 29, 2022

domoscargin commented:
Danger with some plugins might also be worth checking out:

https://danger.systems/js/
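For context, Danger runs a JavaScript "dangerfile" against each pull request on CI (for example via `npx danger ci`) and can warn, fail or comment based on rules we define. A minimal, illustrative sketch, where the file path and size budget are assumptions rather than anything we've agreed:

```js
// dangerfile.js: illustrative sketch only; the bundle path and budget below
// are assumptions, not agreed values
import { warn } from 'danger'
import { statSync } from 'node:fs'

const BUNDLE = 'dist/govuk-frontend.min.js' // hypothetical built file
const BUDGET = 30 * 1024 // hypothetical 30 KB budget

const size = statSync(BUNDLE).size

if (size > BUDGET) {
  // Danger posts this as part of its comment on the pull request
  warn(`${BUNDLE} is ${size} bytes, which is over the ${BUDGET} byte budget`)
}
```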

domoscargin commented:
As a minimum, we should keep an eye on file/package sizes.

domoscargin commented Dec 6, 2022

We've populated a doc with some of the metrics we might want to test.

I've started a spike of getting [size-limit](https://github.com/ai/size-limit) up and running: #3076

An immediate concern is that its associated GitHub Action is not certified, so it can't be used within alphagov.

However, just running it as an npm task does work; a rough configuration sketch follows the note below.

  • The useful-sounding --why argument only works with the webpack plugin.
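For reference, a size-limit configuration can be as small as a list of paths and size budgets. The sketch below is illustrative only and assumes the @size-limit/file preset is installed; the globs and limits are placeholders, not the values used in #3076.

```js
// .size-limit.js: an illustrative sketch, assuming the @size-limit/file preset;
// paths and limits are placeholders, not the values from the spike
module.exports = [
  {
    // built JavaScript bundle (size-limit resolves the glob)
    path: 'dist/govuk-frontend-*.min.js',
    limit: '30 KB' // `npx size-limit` exits non-zero if the gzipped size exceeds this
  },
  {
    // built stylesheet
    path: 'dist/govuk-frontend-*.min.css',
    limit: '110 KB'
  }
]
```

Wired up as an npm script (for example `"size": "size-limit"` in package.json), this gives a pass/fail check locally and in CI without needing the uncertified GitHub Action.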

domoscargin commented Jan 12, 2023

Which metrics can we measure and how does each help our users and us?

We've considered 3 levels of analysis:

Analysing files

Things like:

  • The size of our distribution CSS and JS files
  • The size and number of files in our package and dist folders
  • The number of assets, by type
  • The size of our release zip
  • The size of our assets, by type

Analysing the code

Things like:

  • Size of first-party code
  • Details of all dependencies
  • Size of dependencies, polyfills
  • Number of modules
  • Duplicate modules
  • Duplicate code

Analysing web performance

Things like:

  • Time to interactive
  • Largest Contentful Paint
  • Cumulative Layout Shift
  • Lighthouse and other automated web perf scores
  • Number of first and third party requests
  • JS long-running jobs
  • Blocking JS
  • JS Errors

Which tools can help us keep tabs on these metrics and how?

For this spike, we focused on analysing files and analysing code. We do have access to Speedcurve for web performance stats, so we could possibly look into that later, though it's a bit unclear what we would be measuring. We can certainly measure the Design System website's performance, but that's not directly related to the upcoming changes to govuk-frontend's JavaScript. Potentially we could do some measurement on the review app to check some basic stuff.

Potential tools

As a general note, many of the tools we found rely on Webpack to put their stats and displays together. While this is certainly a way that some folks will ingest govuk-frontend, it feels better for us to go closer to the metal and try to get something that looks as much like our own compiled code as possible.

Relative CI

Pros: Allows for easy trend tracking, fancy graphs and not much work on our part
Cons: Paid for (there's a free open-source tier which might be viable for us), uses Webpack (this should change in v5, which is planned to have Rollup support)

Size limit

Pros: Small and simple, provides an easy way to keep track of file size (and a ready-made GitHub Action for comments), and allows further analysis with certain plugins.
Cons: Uses webpack; its GitHub Action is unverified, so it would need approval

Statoscope

Pros: Probably more customisable than size limit (this is what size limit uses under the hood)
Cons: Would require us to roll our own Webpack bundling to test (as it doesn’t self-build)

Rollup plugin visualizer

Pros: We use Rollup currently, so theoretically this is fairly accurate to what we compile; offers good visual data and the opportunity to drill down.
Cons: We're using an old version of Rollup which isn't compatible, so we'd have to run a standalone version; if we do move to Webpack at any point, we'd have to rejig.

Spikes

size limit

We've looked at adding a basic size limit configuration. Size limit has several options for how it builds and what data it gathers. At its core, on pull requests it measures the size difference between files you specify. Using the webpack plugin means it can also provide more in-depth module data using Statoscope, which we could finesse into a useful GitHub comment.

We considered size limit as a simple stopgap to make sure our package sizes don't climb too much. But we feel we'd need to replace it eventually, so if we can find another option that gives us an MVP quickly and allows for more detail later, that'd be better.

rollup-plugin-visualizer

We've also looked at rollup-plugin-visualizer. This feels like a better way to go, since Rollup is what we're using to compile these files. It provides good data which we could finesse into helpful GitHub comments.

One problem is that it relies on Rollup 3, and we're pinned to 0.59.4 in order to support IE8. We can work around this by running a standalone version of Rollup for gathering stats, as sketched below.
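As a rough sketch of that workaround, the stats build could be driven by a separate config run with a standalone Rollup 3 (for example `npx rollup@3 --config rollup.stats.config.mjs`), leaving the pinned 0.59.4 used for the real build untouched. The entry point, output paths and plugin options below are assumptions, not the spike's exact setup.

```js
// rollup.stats.config.mjs: illustrative only; entry point, output paths and
// plugin options are assumptions rather than the spike's exact configuration
import { visualizer } from 'rollup-plugin-visualizer'

export default {
  input: 'src/govuk/all.mjs', // assumed entry point
  output: {
    dir: 'build/stats',
    format: 'es'
  },
  plugins: [
    visualizer({
      filename: 'build/stats/stats.html', // interactive treemap we can drill into
      template: 'treemap',
      gzipSize: true,
      brotliSize: true
    })
  ]
}
```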

What remains to be done

Defining "good"

As a team, we'd need to agree which changes in the metrics are acceptable, and probably define a process for dealing with PRs that break performance checks but that we still want to merge (for example, a big new component that exceeds the file size constraints).

Implementing stats gathering with rollup-plugin-visualizer

The actual work of getting this up and running and outputting data, and failing builds if they exceed an agreed percentage increase in file size.

GitHub Actions

We need a GitHub Action to post comments on PRs. Something like:

  • compute the stats of the PR branch and cache them against the commit hash
  • try to access the stats of the reference branch from the cache
  • if they're not there, check out the reference branch, compute its stats and cache them against the commit hash
  • diff the stats to highlight changes in file sizes and in which files get bundled (see the sketch after this list)
  • comment with the stats of the PR branch and any changes from the diff
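To illustrate the diffing step, here's a rough sketch in Node. It assumes each branch's build has written a JSON map of file name to byte size (the shape and file names are hypothetical, not something rollup-plugin-visualizer emits by default):

```js
// diff-bundle-stats.mjs: a sketch of the diffing step only; the input format
// ({ "all.min.js": 24612, ... }) is an assumption for illustration
// usage: node diff-bundle-stats.mjs base-stats.json pr-stats.json
import { readFile } from 'node:fs/promises'

const [baseFile, headFile] = process.argv.slice(2)

const base = JSON.parse(await readFile(baseFile, 'utf8'))
const head = JSON.parse(await readFile(headFile, 'utf8'))

// union of files present in either branch, so additions and removals show up
const files = [...new Set([...Object.keys(base), ...Object.keys(head)])].sort()

// Markdown table that a later workflow step could post as the PR comment
console.log('| File | Base | PR | Diff |')
console.log('| --- | ---: | ---: | ---: |')

for (const file of files) {
  const before = base[file] ?? 0
  const after = head[file] ?? 0
  const diff = after - before
  console.log(`| ${file} | ${before} B | ${after} B | ${diff >= 0 ? '+' : ''}${diff} B |`)
}
```

The same numbers could also be used to fail the job when an increase crosses whatever threshold we agree as part of defining "good".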

Trends

What none of these tools offers out of the box, compared with something like Relative CI, is a way to look at trends over time. We don't think this is a particular issue: we're mostly interested in some kind of graph to make sure that file size, number of dependencies, etc. are tracking downward. We think it'd be relatively simple to generate this data and store it somewhere like Google Sheets to lightly monitor these trends. This is also something we could add iteratively once we've got an MVP working.

domoscargin commented:
Closing now in favour of #3188
