Improve performance testing #20802
Conversation
Size Change: +139 B (0%) Total Size: 864 kB
Added a commit applying rounding to the performance reporter.
item.dur &&
item.args &&
item.args.data &&
item.args.data.type === 'keypress';
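For context, a predicate like this would typically be used to pull keypress events out of a trace dump. A minimal sketch, assuming trace events follow Chrome's trace-event JSON shape; the `getKeyPressDurations` helper name is hypothetical:

```javascript
// Sketch: select completed keypress entries from an array of Chrome
// trace events and collect their durations (in microseconds).
function getKeyPressDurations( traceEvents ) {
	return traceEvents
		.filter(
			( item ) =>
				item.dur &&
				item.args &&
				item.args.data &&
				item.args.data.type === 'keypress'
		)
		.map( ( item ) => item.dur );
}
```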
I'm not sure that's enough; I know that previously keydown was also a problem.
You can see in the screenshot that there's a keydown and then a keypress. Both of these are gathered in a "Task", and after that there are follow-up async tasks.
Ideally, if we can measure all three segments, that would be great. I believe the most important thing is the synchronous behavior (keydown + keypress). A format like this might be best: keydown+keypress ( follow-ups: xx )
e.g.: 25ms ( follow-ups: 40ms )
What do you think?
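The suggested format could be sketched as follows; the helper name and the pre-computed `syncMs`/`followUpMs` values are hypothetical, standing in for durations extracted from the trace:

```javascript
// Hypothetical helper: formats the synchronous (keydown + keypress)
// duration and the follow-up async task total in the suggested style.
function formatKeyPressTiming( syncMs, followUpMs ) {
	return `${ syncMs }ms ( follow-ups: ${ followUpMs }ms )`;
}

// Usage: formatKeyPressTiming( 25, 40 ) → '25ms ( follow-ups: 40ms )'
```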
I don't think it will make much difference, since I expect the absolute values not to be as important as their evolution over time, and the keydown/keyup events don't seem to have a great deal of variability.
That said, for completeness' sake, I added a commit where I calculate the sum of the three event durations to get a total value for each virtual key press.
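A rough sketch of that summing approach, assuming the trace events have already been grouped into per-keystroke keydown/keypress/keyup arrays (the helper name and the parallel-array layout are assumptions, not the PR's actual code; trace `dur` values are in microseconds):

```javascript
// Sketch: given parallel arrays of keydown, keypress and keyup trace
// events (one entry per virtual key press), compute each key press's
// total duration in milliseconds.
function sumKeyPressDurations( keyDownEvents, keyPressEvents, keyUpEvents ) {
	return keyDownEvents.map(
		( keyDown, i ) =>
			( keyDown.dur + keyPressEvents[ i ].dur + keyUpEvents[ i ].dur ) /
			1000
	);
}
```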
I just noticed that you'd suggested separating out the values of the different event types in the output. Do you think that's necessary, or is the sum enough?
That sum is enough for the first part (the synchronous work). I don't know, though, whether it's somehow possible to compute the time spent in the asynchronous work. In the trace it's shown separately from the keydown, keyup, and keypress events, as multiple small tasks.
I'd say getting that value is less important, but if we do manage to get it, we shouldn't sum it with the rest.
Does that make sense?
I'd argue that whether or not the async value is important depends on what you're trying to measure. If you're trying to measure input latency as perceived by the user, then the async value isn't important, as that work happens after the screen has updated and probably does not limit interactivity in any way, as the tasks are very short.
In any case, I suspect it would be extremely difficult to get that value, as there wouldn't be a good way of correlating the async work with the key events that triggered it.
I didn't run it, but from the code, the changes look good to me 👍
Thanks @sgomes, this is great.
Description
Performance testing for Gutenberg was being done using puppeteer, by instrumenting a browser instance, loading a page, and performing some typing. However, these times were being measured across the instrumented browser / host context switch; this meant that the numbers were potentially unreliable, since they were subject to unpredictable factors such as OS scheduling.
This PR performs the measurements within the instrumented browser instance, by relying on browser performance timings for the load/DOMContentLoaded numbers, and by using in-browser tracing for the typing event duration numbers. The results won't be directly comparable to the previous way of testing, but they should be comparable to each other henceforth, assuming equivalent testing conditions (same machine, similar system load, etc.).
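As a rough illustration of the in-browser approach (a sketch under assumptions, not the PR's actual code): the timings can be computed from a `PerformanceTiming`-like object gathered inside the page, e.g. via puppeteer's `page.evaluate( () => window.performance.timing.toJSON() )`, so the numbers never cross the instrumented browser / host boundary.

```javascript
// Sketch: derive load / DOMContentLoaded durations (in ms) from a
// PerformanceTiming-like object collected inside the browser.
function navigationDurations( timing ) {
	return {
		load: timing.loadEventEnd - timing.navigationStart,
		domContentLoaded:
			timing.domContentLoadedEventEnd - timing.navigationStart,
	};
}
```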
How has this been tested?
The intermediate trace files were loaded in Chrome's performance tab, and it was experimentally verified that the numbers in the generated results.json matched the duration of the events visible in the DevTools timeline.
Types of changes
Test-only bugfixes.