Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core(gather): handle crash if CDP target crashes #11840

Merged
merged 49 commits into from
Apr 17, 2024
Merged

core(gather): handle crash if CDP target crashes #11840

merged 49 commits into from
Apr 17, 2024

Conversation

paulirish
Copy link
Member

@paulirish paulirish commented Dec 16, 2020

fixes #11827

page could crash (what we saw there). we can't recover from that, and we should quit.

the CDP connection could also detach.. this happens if you manually close the tab while lighthouse is gathering. Update: our pptr layer handles this part just fine. we die fatally very quickly when this happens already.

@paulirish paulirish requested a review from a team as a code owner December 16, 2020 21:38
@paulirish paulirish requested review from patrickhulce and removed request for a team December 16, 2020 21:38
@google-cla google-cla bot added the cla: yes label Dec 16, 2020
Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

impl sounds right to me! but we also discussed the impl a bit haha

if anyone else wants to weigh in on the fact that it's a race at the highest gather level, feel free :)

lighthouse-core/gather/gather-runner.js Outdated Show resolved Hide resolved
*/
static async getGatherTerminatedPromise(driver) {
return new Promise((_, reject) => {
driver.on('Inspector.targetCrashed', _ => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if iframes crash should we still consider it fatal?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question.. if an iframe shares a process with its parent.. it takes down both. (naturally)

but here's a iframe page crash when that iframe is isolated (oopif):

image

and..........

if we didn't explicitly attach to that iframe target.. we'd just see a Target.targetCrashed event, but not a Inspector.targetCrashed one.

but we DO autoattach to iframe targets. so we're attached. and thus we'll get the full fledged Inspector.targetCrashed. :)

i don't think there's an easy call here, but ... probably yes.

Copy link
Collaborator

@patrickhulce patrickhulce Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice sleuthing thanks :)

I suppose if we ever have a single gatherer that tries to evaluate something in subtarget contexts we will need to support this (framehole maybe?), so SGTM 👍 (though I would love some broader testing system at a time like this ala chat convo 😄 )

const gatherPromise = GatherRunner.run(runnerOpts.config.passes, gatherOpts);
await Promise.race([
gatherPromise,
GatherRunner.getGatherTerminatedPromise(driver),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congratulations this is one of the only forked areas where we need to worry about duplication for FR 🎉 lucky you! 😃

would you be able to add this to the snapshot runner and use the shared FRTransitionalDriver type instead? :)

living in gather-runner feels a little weird at that point, but I don't have a clean new place or suggestion for it. ideas?

@@ -789,6 +793,25 @@ describe('Runner', () => {
expect(lhr.runtimeError.message).toBeDisplayString(/did not paint any content.*\(NO_FCP\)/);
});


it('includes a crash runtimeError when there\'s a crash during gathering', async () => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to get a smoketest of this, but it's too hard without puppeteer. Maybe with FR support one can be added to the pptr test suite there? :D

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe doable. looks like it requires a

driver.gotoURL('chrome://crash');

you can't navigate to that URL clientside, so i think it's gotta come via protocol.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea this really could use a smoke test

@@ -115,6 +115,24 @@ class GatherRunner {
return {};
}

/**
* Reject if the gathering terminates due to the CDP target crashing, etc
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-review comment below

Worth noting here for lurkers that this is a good example our long-standing tradition of memory leak listeners that are never removed and promises that never die. We have a ton of these, so shouldn't need to do anything differently here, but a few folks have asked recently about why we recommend child processes and disposable contexts, this is why :)

@connorjclark
Copy link
Collaborator

@paulirish we still want this?

@connorjclark
Copy link
Collaborator

Can you update the branch?

@connorjclark
Copy link
Collaborator

image

@paulirish
Copy link
Member Author

paulirish commented Apr 3, 2024

I spent a while trying to get a test for this but it's wildly hard. This impl needs a non-mock x-runner, driver and session. And... we don't have any tests that are that free of mocks. I managed to almost create one, but it failed.

Auditing chrome://crash was enough to get TARGET_CRASHED instantly, whereas before LH would hang then finally PROTOCOL_TIMEOUT.
A smoke test could be possible, but we need to rework how we handle no output being generated (we currently throw throw new ChildProcessError). Another option is to add a test in cli/index-test.js.

This got me thinking about it a little differently.

I had this rejecting out of LH extremely quickly. No LHR, nothing. But I also now remember taking the stance that we should ALWAYS create an LHR. The fact that we don't is why we (and probably others) have to do stuff like this.

I moved the rejection race elsewhere and now it gets an LHR built for free. (And our smoketest runtimeError assertions are also fine. yay we have a test)

also obv this depends on #15913

penguin733

This comment was marked as off-topic.

core/runner.js Outdated Show resolved Hide resolved
core/test/gather/mock-driver.js Outdated Show resolved Hide resolved
core/test/gather/mock-driver.js Outdated Show resolved Hide resolved
@paulirish
Copy link
Member Author

@connorjclark this is ready for another review. (sorry i forget our workflow conventions :)

Copy link
Collaborator

@connorjclark connorjclark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle Inspector.targetCrashed event
5 participants