Skip to content

Commit

Permalink
fix: detect chrome browser process and tab crashes to no longer hang …
Browse files Browse the repository at this point in the history
…in CI (#24338)

* Handle chrome tab and browser crashes.

* handle process crashes and add system tests

* handle crashing on windows

* updating comment

* update unit tests

* Update packages/server/lib/browsers/index.ts

Co-authored-by: Emily Rohrbough <[email protected]>

* Apply suggestions from code review

Co-authored-by: Ryan Manuel <[email protected]>

* fix flaky system test probably

Co-authored-by: Emily Rohrbough <[email protected]>
Co-authored-by: Ryan Manuel <[email protected]>
  • Loading branch information
3 people authored Oct 21, 2022
1 parent 208efbf commit 382e0b5
Show file tree
Hide file tree
Showing 18 changed files with 546 additions and 21 deletions.
48 changes: 48 additions & 0 deletions packages/errors/__snapshot-html__/BROWSER_CRASHED.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 1 addition & 2 deletions packages/errors/__snapshot-html__/RENDERER_CRASHED.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

15 changes: 14 additions & 1 deletion packages/errors/src/errors.ts
Original file line number Diff line number Diff line change
Expand Up @@ -570,7 +570,6 @@ export const AllCypressErrors = {
This can happen for a number of different reasons:
- You wrote an endless loop and you must fix your own code
- There is a memory leak in Cypress (unlikely but possible)
- You are running Docker (there is an easy fix for this: see link below)
- You are running lots of tests on a memory intense application
- You are running in a memory starved VM environment
Expand All @@ -581,6 +580,20 @@ export const AllCypressErrors = {
https://on.cypress.io/renderer-process-crashed`
},
BROWSER_CRASHED: (browser: string, code: string | number, signal: string) => {
return errTemplate`\
We detected that the ${fmt.highlight(browser)} process just crashed with code '${fmt.highlight(code)}' and signal '${fmt.highlight(signal)}'.
We have failed the current test and have relaunched ${fmt.highlight(browser)}.
This can happen for many different reasons:
- You wrote an endless loop and you must fix your own code
- You are running lots of tests on a memory intense application
- You are running in a memory starved VM environment
- There are problems with your GPU / GPU drivers
- There are browser bugs`
},
AUTOMATION_SERVER_DISCONNECTED: () => {
return errTemplate`The automation client disconnected. Cannot continue running tests.`
},
Expand Down
5 changes: 5 additions & 0 deletions packages/errors/test/unit/visualSnapshotErrors_spec.ts
Original file line number Diff line number Diff line change
Expand Up @@ -646,6 +646,11 @@ describe('visual error templates', () => {
default: [],
}
},
BROWSER_CRASHED: () => {
return {
default: ['Chrome', 'code', 'signal'],
}
},
AUTOMATION_SERVER_DISCONNECTED: () => {
return {
default: [],
Expand Down
30 changes: 25 additions & 5 deletions packages/server/lib/browsers/chrome.ts
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ import { fs } from '../util/fs'
import { CdpAutomation, screencastOpts } from './cdp_automation'
import * as protocol from './protocol'
import utils from './utils'
import * as errors from '../errors'
import type { Browser, BrowserInstance } from './types'
import { BrowserCriClient } from './browser-cri-client'
import type { CriClient } from './cri-client'
Expand Down Expand Up @@ -561,6 +562,17 @@ export = {
return args
},

/**
* Clear instance state for the chrome instance, this is normally called in on kill or on exit.
*/
clearInstanceState () {
debug('closing remote interface client')

// Do nothing on failure here since we're shutting down anyway
browserCriClient?.close().catch()
browserCriClient = undefined
},

async connectToNewSpec (browser: Browser, options: BrowserNewTabOpts, automation: Automation) {
debug('connecting to new chrome tab in existing instance with url and debugging port', { url: options.url })

Expand Down Expand Up @@ -595,6 +607,18 @@ export = {
async attachListeners (url: string, pageCriClient: CriClient, automation: Automation, options: BrowserLaunchOpts | BrowserNewTabOpts) {
const browserCriClient = this._getBrowserCriClient()

// Handle chrome tab crashes.
pageCriClient.on('Inspector.targetCrashed', () => {
const err = errors.get('RENDERER_CRASHED')

if (!options.onError) {
errors.log(err)
throw new Error('Missing onError in attachListeners')
}

options.onError(err)
})

if (!browserCriClient) throw new Error('Missing browserCriClient in attachListeners')

debug('attaching listeners to chrome %o', { url, options })
Expand Down Expand Up @@ -700,11 +724,7 @@ export = {
launchedBrowser.browserCriClient = browserCriClient

launchedBrowser.kill = (...args) => {
debug('closing remote interface client')

// Do nothing on failure here since we're shutting down anyway
browserCriClient?.close().catch()
browserCriClient = undefined
this.clearInstanceState()

debug('closing chrome')

Expand Down
12 changes: 9 additions & 3 deletions packages/server/lib/browsers/electron.ts
Original file line number Diff line number Diff line change
Expand Up @@ -146,9 +146,10 @@ export = {
onCrashed () {
const err = errors.get('RENDERER_CRASHED')

errors.log(err)

if (!options.onError) throw new Error('Missing onError in onCrashed')
if (!options.onError) {
errors.log(err)
throw new Error('Missing onError in onCrashed')
}

options.onError(err)
},
Expand Down Expand Up @@ -471,6 +472,11 @@ export = {
})
},

/**
* Clear instance state for the electron instance, this is normally called in on kill or on exit for electron there isn't state to clear.
*/
clearInstanceState () {},

async connectToNewSpec (browser: Browser, options: ElectronOpts, automation: Automation) {
if (!options.url) throw new Error('Missing url in connectToNewSpec')

Expand Down
18 changes: 12 additions & 6 deletions packages/server/lib/browsers/firefox.ts
Original file line number Diff line number Diff line change
Expand Up @@ -371,6 +371,17 @@ export function _createDetachedInstance (browserInstance: BrowserInstance, brows
return detachedInstance
}

/**
* Clear instance state for the chrome instance, this is normally called in on kill or on exit.
*/
export function clearInstanceState () {
debug('closing remote interface client')
if (browserCriClient) {
browserCriClient.close().catch()
browserCriClient = undefined
}
}

export async function connectToNewSpec (browser: Browser, options: BrowserNewTabOpts, automation: Automation) {
await firefoxUtil.connectToNewSpec(options, automation, browserCriClient)
}
Expand Down Expand Up @@ -552,13 +563,8 @@ export async function open (browser: Browser, url: string, options: BrowserLaunc
const originalBrowserKill = browserInstance.kill

browserInstance.kill = (...args) => {
debug('closing remote interface client')

// Do nothing on failure here since we're shutting down anyway
if (browserCriClient) {
browserCriClient.close().catch()
browserCriClient = undefined
}
clearInstanceState()

debug('closing firefox')

Expand Down
26 changes: 25 additions & 1 deletion packages/server/lib/browsers/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ import _ from 'lodash'
import Bluebird from 'bluebird'
import Debug from 'debug'
import utils from './utils'
import * as errors from '../errors'
import check from 'check-more-types'
import { exec } from 'child_process'
import util from 'util'
Expand Down Expand Up @@ -155,13 +156,36 @@ export = {
// TODO: normalizing opening and closing / exiting
// so that there is a default for each browser but
// enable the browser to configure the interface
instance.once('exit', () => {
instance.once('exit', async (code, signal) => {
ctx.browser.setBrowserStatus('closed')
// TODO: make this a required property
if (!options.onBrowserClose) throw new Error('onBrowserClose did not exist in interactive mode')

const browserDisplayName = instance?.browser?.displayName || 'unknown'

options.onBrowserClose()
browserLauncher.clearInstanceState()
instance = null

// We are being very narrow on when to restart the browser here. The only case we can reliably test the 'SIGTRAP' signal.
// We want to avoid adding signals in here that may intentionally be caused by a user.
// For example exiting firefox through either force quitting or quitting via cypress will fire a 'SIGTERM' event which
// would result in constantly relaunching the browser when the user actively wants to quit.
// On windows the crash produces 2147483651 as an exit code. We should add to the list of crashes we handle as we see them.
// In the future we may consider delegating to the browsers to determine if an exit is a crash since it might be different
// depending on what browser has crashed.
if (code === null && ['SIGTRAP', 'SIGABRT'].includes(signal) || code === 2147483651 && signal === null) {
const err = errors.get('BROWSER_CRASHED', browserDisplayName, code, signal)

if (!options.onError) {
errors.log(err)
throw new Error('Missing onError in attachListeners')
}

await options.onError(err)

await options.relaunchBrowser!()
}
})

// TODO: instead of waiting an arbitrary
Expand Down
4 changes: 4 additions & 0 deletions packages/server/lib/browsers/types.ts
Original file line number Diff line number Diff line change
Expand Up @@ -35,4 +35,8 @@ export type BrowserLauncher = {
* Used in Cypress-in-Cypress tests to connect to the existing browser instance.
*/
connectToExisting: (browser: Browser, options: BrowserLaunchOpts, automation: Automation) => void | Promise<void>
/**
* Used to clear instance state after the browser has been exited.
*/
clearInstanceState: () => void
}
9 changes: 8 additions & 1 deletion packages/server/lib/browsers/webkit.ts
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,13 @@ export async function connectToNewSpec (browser: Browser, options: BrowserNewTab
})
}

/**
* Clear instance state for the webkit instance, this is normally called in on kill or on exit.
*/
export function clearInstanceState () {
wkAutomation = undefined
}

export function connectToExisting () {
throw new Error('Cypress-in-Cypress is not supported for WebKit.')
}
Expand Down Expand Up @@ -120,7 +127,7 @@ export async function open (browser: Browser, url: string, options: BrowserLaunc
async kill () {
debug('closing pwBrowser')
await pwBrowser.close()
wkAutomation = undefined
clearInstanceState()
}

/**
Expand Down
2 changes: 2 additions & 0 deletions packages/server/lib/open_project.ts
Original file line number Diff line number Diff line change
Expand Up @@ -186,6 +186,8 @@ export class OpenProject {
return await browsers.connectToNewSpec(browser, { onInitializeNewBrowserTab, ...options }, automation)
}

options.relaunchBrowser = this.relaunchBrowser

return await browsers.open(browser, options, automation, this._ctx)
}

Expand Down
4 changes: 3 additions & 1 deletion packages/server/lib/project-base.ts
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,9 @@ export class ProjectBase<TServer extends Server> extends EE {
this.options = {
report: false,
onFocusTests () {},
onError () {},
onError (error) {
errors.log(error)
},
onWarning: this.ctx.onWarning,
...options,
}
Expand Down
1 change: 1 addition & 0 deletions packages/types/src/server.ts
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ export type BrowserLaunchOpts = {
isTextTerminal: boolean
onBrowserClose?: (...args: unknown[]) => void
onBrowserOpen?: (...args: unknown[]) => void
relaunchBrowser?: () => Promise<any>
} & Partial<OpenProjectLaunchOpts> // TODO: remove the `Partial` here by making it impossible for openProject.launch to be called w/o OpenProjectLaunchOpts
& Pick<ReceivedCypressOptions, 'userAgent' | 'proxyUrl' | 'socketIoRoute' | 'chromeWebSecurity' | 'downloadsFolder' | 'experimentalSessionAndOrigin' | 'experimentalModifyObstructiveThirdPartyCode' | 'experimentalWebKitSupport'>

Expand Down
Loading

5 comments on commit 382e0b5

@cypress-bot
Copy link
Contributor

@cypress-bot cypress-bot bot commented on 382e0b5 Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Circle has built the linux x64 version of the Test Runner.

Learn more about this pre-release platform-specific build at https://on.cypress.io/installing-cypress#Install-pre-release-version.

Run this command to install the pre-release locally:

npm install https://cdn.cypress.io/beta/npm/10.11.0/linux-x64/develop-382e0b525da203b9ade7e985b03a5ee8800e3565/cypress.tgz

@cypress-bot
Copy link
Contributor

@cypress-bot cypress-bot bot commented on 382e0b5 Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Circle has built the linux arm64 version of the Test Runner.

Learn more about this pre-release platform-specific build at https://on.cypress.io/installing-cypress#Install-pre-release-version.

Run this command to install the pre-release locally:

npm install https://cdn.cypress.io/beta/npm/10.11.0/linux-arm64/develop-382e0b525da203b9ade7e985b03a5ee8800e3565/cypress.tgz

@cypress-bot
Copy link
Contributor

@cypress-bot cypress-bot bot commented on 382e0b5 Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Circle has built the darwin arm64 version of the Test Runner.

Learn more about this pre-release platform-specific build at https://on.cypress.io/installing-cypress#Install-pre-release-version.

Run this command to install the pre-release locally:

npm install https://cdn.cypress.io/beta/npm/10.11.0/darwin-arm64/develop-382e0b525da203b9ade7e985b03a5ee8800e3565/cypress.tgz

@cypress-bot
Copy link
Contributor

@cypress-bot cypress-bot bot commented on 382e0b5 Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Circle has built the darwin x64 version of the Test Runner.

Learn more about this pre-release platform-specific build at https://on.cypress.io/installing-cypress#Install-pre-release-version.

Run this command to install the pre-release locally:

npm install https://cdn.cypress.io/beta/npm/10.11.0/darwin-x64/develop-382e0b525da203b9ade7e985b03a5ee8800e3565/cypress.tgz

@cypress-bot
Copy link
Contributor

@cypress-bot cypress-bot bot commented on 382e0b5 Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Circle has built the win32 x64 version of the Test Runner.

Learn more about this pre-release platform-specific build at https://on.cypress.io/installing-cypress#Install-pre-release-version.

Run this command to install the pre-release locally:

npm install https://cdn.cypress.io/beta/npm/10.11.0/win32-x64/develop-382e0b525da203b9ade7e985b03a5ee8800e3565/cypress.tgz

Please sign in to comment.