Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core(network-recorder): remove quic-request-finished workaround #9744

Merged
merged 2 commits into from
Sep 27, 2019

Conversation

brendankenny
Copy link
Member

#5256 added a workaround for #5254, where many QUIC requests never got a final LoadingFinished or LoadingFailed event, so we were treating them as forever open, we never reached network quiet, and so we always hit the LH load timeout. The workaround was to check if the request was over QUIC and if there was end timing information, calling that "finished".

We broke this workaround I think when we finally dropped our direct dependence on devtools-frontend and wrote our own NetworkRequest.

NetworkRequest._recomputeTimesWithResourceTiming takes in some preliminary timing information that may or may not be updated later, but any QUIC requests always appear finished after this happens, whether or not they end up updated later.

This causes networkidle and network-2-idle to sometimes fire too early, which means we may be cutting off some runs before TTI would actually be possible. I'm not sure how bad the impact is in practice, however, as

  • the requests are often finished soon after anyways
  • the Network protocol events LH receives are often delayed well after they actually occurred
  • it's rare that we're in e.g. network-3-idle where one of the requests is QUIC and appears finished and we're not almost network-2-idle anyways

Originally I was going to try to update the QUIC heuristic, but after doing a bunch of tests, I don't think the original behavior is an issue anymore. All but a tiny handful of QUIC requests I've observed have a Network.loadingFinished event, and every time I've encountered a case of one without a finish event (developers.google.com sometimes hits this), extending networkQuietThresholdMs was enough to make the problem go away, so I think they're just legitimately slow loads.

So this PR just removes the workaround :)

@brendankenny
Copy link
Member Author

perf-wise, since this is in the n^2 path, we get like a 20% improvement in NetworkRecorder.recordsFromLogs execution time, 69ms to 54ms for a big (~800 request) theverge devtoolsLog.

That leaves about 20% of that total runtime spent on idle detection...we could make it O(1), but I'm not sure those last 10ms are worth it for how invasive the extra request counting would be.

@brendankenny
Copy link
Member Author

And here's the worst example I could find (in a relatively short search) of network-idle periods from the network events on master, from this PR, and vs the "true" network idle periods reconstructed after we have the full devtoolsLog and real network timestamps (NetworkRecorder.findNetworkQuietPeriods).

I used a hacked-together trace to visualize, forgive me :)

network idle timeline

This is loading https://google.com/search?q=ghost. In this example, master shows a long quiet period almost at the beginning of the load before eventually mostly aligning with the results from this PR and ground truth.

(raw trace if you want to do some zooming...will have to load manually, timeline-viewer CORS stuff isn't working right now)

@patrickhulce
Copy link
Collaborator

We broke this workaround I think when we finally dropped our direct dependence on devtools-frontend and wrote our own NetworkRequest.
NetworkRequest._recomputeTimesWithResourceTiming takes in some preliminary timing information that may or may not be updated later, but any QUIC requests always appear finished after this happens, whether or not they end up updated later.

I'm not sure I follow how it's more broken now than it ever was :) It seems like it's still working exactly as intended, just using the receipt of any response/headers as the "finished" state for QUIC requests because at the time we had no other event that could possibly represent finished. _recomputeTimesWithResourceTiming in our impl is still only called _onResponse.

Great news that the workaround isn't necessary anymore though and we can move our idle slightly closer reality!

@brendankenny
Copy link
Member Author

I'm not sure I follow how it's more broken now than it ever was :)

ha, well I guess I didn't anticipate that possibility :) Good that we can remove it, then.

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


const periods = NetworkRecorder.findNetworkQuietPeriods(records, 0);
assert.deepStrictEqual(periods, [
{start: 2000, end: Infinity},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we just change this to assert that there's no idle period here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we just change this to assert that there's no idle period here?

sounds good

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants