Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

graceful shutdown #60059

Merged
merged 35 commits into from
Jan 16, 2024
Merged
Show file tree
Hide file tree
Changes from 18 commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
5b004f2
fix: gracefully shutdown server
redbmk Dec 27, 2023
57f880a
fix: better support for graceful shutdown
redbmk Dec 28, 2023
e3b0427
fix: shutdown next dev immediately
redbmk Dec 28, 2023
7d94719
tests: add another test for graceful shutdown
redbmk Dec 29, 2023
7c8ff12
refactor: use once(pid, "kill") instead of complex waitForCondition
redbmk Dec 30, 2023
c739e11
fix: standalone server graceful-shudown
redbmk Dec 30, 2023
4f74aee
refactor: combine graceful-shutdown tests
redbmk Dec 30, 2023
90af3bb
refactor: move graceful-shutdown test-app to src folder
redbmk Dec 30, 2023
f3c875d
Merge branch 'canary' into graceful-shutdown
redbmk Jan 1, 2024
8de8407
Merge branch 'canary' into graceful-shutdown
redbmk Jan 3, 2024
b173496
fix: remove check for !this.isStopping
redbmk Jan 4, 2024
df3a4e5
fix: allow SIGINT or SIGTERM to kill build
redbmk Jan 4, 2024
c40330a
fix: css-test with proxy not properly exiting
redbmk Jan 4, 2024
778b5dc
Merge branch 'canary' into graceful-shutdown
redbmk Jan 4, 2024
49af857
Merge branch 'canary' into graceful-shutdown
redbmk Jan 5, 2024
2140702
Merge branch 'canary' into graceful-shutdown
redbmk Jan 5, 2024
5622b51
Merge branch 'canary' into graceful-shutdown
redbmk Jan 5, 2024
e2a7700
Merge branch 'canary' into graceful-shutdown
redbmk Jan 5, 2024
44fe46a
Merge branch 'canary' into graceful-shutdown
huozhi Jan 8, 2024
8111659
Merge remote-tracking branch 'upstream/canary' into graceful-shutdown
redbmk Jan 13, 2024
8e273b8
fix: use sigkill in tests by default, simplify types and isAppRunning
redbmk Jan 13, 2024
e340c41
fix: linting
redbmk Jan 13, 2024
3f65ded
fix: also check for signalCode
redbmk Jan 13, 2024
e49fb6d
fix: killApp instance is actually optional
redbmk Jan 13, 2024
c43fc86
Merge remote-tracking branch 'upstream/canary' into graceful-shutdown
redbmk Jan 14, 2024
0c0c73e
Merge branch 'canary' into graceful-shutdown
redbmk Jan 14, 2024
712643c
Merge branch 'canary' into graceful-shutdown
redbmk Jan 14, 2024
77bed79
Merge branch 'canary' into graceful-shutdown
redbmk Jan 14, 2024
aacc91c
Merge branch 'canary' into graceful-shutdown
redbmk Jan 15, 2024
c4af87d
Merge branch 'canary' into graceful-shutdown
redbmk Jan 15, 2024
f7faf58
Merge branch 'canary' into graceful-shutdown
redbmk Jan 15, 2024
e88a95b
Merge branch 'canary' into graceful-shutdown
redbmk Jan 15, 2024
fd84659
Merge branch 'canary' into graceful-shutdown
redbmk Jan 16, 2024
b14eab2
Merge branch 'canary' into graceful-shutdown
redbmk Jan 16, 2024
ec24f76
Merge branch 'canary' into graceful-shutdown
redbmk Jan 16, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 2 additions & 3 deletions packages/next/src/bin/next.ts
Original file line number Diff line number Diff line change
Expand Up @@ -108,12 +108,11 @@ if (process.env.NODE_ENV) {
;(process.env as any).NODE_ENV = process.env.NODE_ENV || defaultEnv
;(process.env as any).NEXT_RUNTIME = 'nodejs'

// Make sure commands gracefully respect termination signals (e.g. from Docker)
// Allow the graceful termination to be manually configurable
if (!process.env.NEXT_MANUAL_SIG_HANDLE && command !== 'dev') {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing this one is necessary for next dev and next start to gracefully shutdown. The env is still in start-server

if (command === 'build') {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we still want to stop the CLI immediately if SIGTERM or SIGINT is used for building. There's also a test for this already that was breaking if we remove these handlers for all commands.

process.on('SIGTERM', () => process.exit(0))
process.on('SIGINT', () => process.exit(0))
}

async function main() {
const currentArgsSpec = commandArgs[command]()
const validatedArgs = getValidatedArgs(currentArgsSpec, forwardedArgs)
Expand Down
7 changes: 0 additions & 7 deletions packages/next/src/build/utils.ts
Original file line number Diff line number Diff line change
Expand Up @@ -2054,13 +2054,6 @@ const dir = path.join(__dirname)
process.env.NODE_ENV = 'production'
process.chdir(__dirname)

// Make sure commands gracefully respect termination signals (e.g. from Docker)
// Allow the graceful termination to be manually configurable
if (!process.env.NEXT_MANUAL_SIG_HANDLE) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing this one is necessary for standalone mode to gracefully shutdown. The env is still in start-server

process.on('SIGTERM', () => process.exit(0))
process.on('SIGINT', () => process.exit(0))
}

const currentPort = parseInt(process.env.PORT, 10) || 3000
const hostname = process.env.HOSTNAME || '0.0.0.0'

Expand Down
38 changes: 20 additions & 18 deletions packages/next/src/cli/next-dev.ts
Original file line number Diff line number Diff line change
Expand Up @@ -33,22 +33,33 @@ import {
isPortIsReserved,
} from '../lib/helpers/get-reserved-port'
import os from 'os'
import { once } from 'node:events'

type Child = ReturnType<typeof fork>
type KillSignal = Parameters<Child['kill']>[0]

let dir: string
let child: undefined | ReturnType<typeof fork>
let child: undefined | Child
let config: NextConfigComplete
let isTurboSession = false
let traceUploadUrl: string
let sessionStopHandled = false
let sessionStarted = Date.now()

const handleSessionStop = async (signal: string | null) => {
const handleSessionStop = async (
signal: KillSignal | null,
childExited: boolean = false
) => {
if (child) {
child.kill((signal as any) || 0)
child.kill(signal ?? 0)
}
if (sessionStopHandled) return
sessionStopHandled = true

if (child && !childExited) {
await once(child, 'exit').catch(() => {})
}

try {
const { eventCliSessionStopped } =
require('../telemetry/events/session-stopped') as typeof import('../telemetry/events/session-stopped')
Expand Down Expand Up @@ -108,8 +119,11 @@ const handleSessionStop = async (signal: string | null) => {
process.exit(0)
}

process.on('SIGINT', () => handleSessionStop('SIGINT'))
process.on('SIGTERM', () => handleSessionStop('SIGTERM'))
process.on('SIGINT', () => handleSessionStop('SIGKILL'))
process.on('SIGTERM', () => handleSessionStop('SIGKILL'))
Copy link
Contributor Author

@redbmk redbmk Dec 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a little unsure about switching this to SIGKILL because it could potentially mess with someone using a custom signal but it looks like that was never supported in next dev anyway, at least officially:

https://nextjs.org/docs/pages/building-your-application/deploying

Good to know: Manual signal handling is not available in next dev.

Also because of the block in packages/next/src/bin/next.ts that's removed in this MR, it has the extra check command !== 'dev', so behavior should effectively be the same (exit the child immediately when the dev server receives SIGINT or SIGTERM)

So because of all that I don't believe this would be a breaking change, and should still be backwards compatible


// exit event must be synchronous
process.on('exit', () => child?.kill('SIGKILL'))

const nextDev: CliCommand = async (args) => {
if (args['--help']) {
Expand Down Expand Up @@ -287,7 +301,7 @@ const nextDev: CliCommand = async (args) => {
}
return startServer(options)
}
await handleSessionStop(signal)
await handleSessionStop(signal, true)
})
})
}
Expand Down Expand Up @@ -333,16 +347,4 @@ const nextDev: CliCommand = async (args) => {
await runDevServer(false)
}

function cleanup() {
if (!child) {
return
}

child.kill('SIGTERM')
}

process.on('exit', cleanup)
process.on('SIGINT', cleanup)
process.on('SIGTERM', cleanup)

export { nextDev }
13 changes: 6 additions & 7 deletions packages/next/src/server/lib/start-server.ts
Original file line number Diff line number Diff line change
Expand Up @@ -264,10 +264,9 @@ export async function startServer(
})

try {
const cleanup = (code: number | null) => {
const cleanup = () => {
debug('start-server process cleanup')
server.close()
process.exit(code ?? 0)
server.close(() => process.exit(0))
}
const exception = (err: Error) => {
if (isPostpone(err)) {
Expand All @@ -279,11 +278,11 @@ export async function startServer(
// This is the render worker, we keep the process alive
console.error(err)
}
process.on('exit', (code) => cleanup(code))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we don't need to handle 'exit' anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exit event is sent right before the process exits and it can only execute synchronous code. I suppose we could capture it if we wanted to force the exit code to be 0, but I was thinking letting it send the code it got made sense.
https://nodejs.org/api/process.html#event-exit

There is no way to prevent the exiting of the event loop at this point, and once all 'exit' listeners have finished running the Node.js process will terminate.

Listener functions must only perform synchronous operations. The Node.js process will exit immediately after calling the 'exit' event listeners causing any additional work still queued in the event loop to be abandoned.

Since server.close is asychronous, I don't think calling it on the 'exit' event would do anything useful

https://nodejs.org/api/net.html#serverclosecallback

This function is asynchronous

I could see there maybe being a couple things that server.close does synchronously, but it definitely would exit without calling the callback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also just realized it was already using the original exit code and just defaulting to 0, so the behavior should still be the same without listening for the event

// Make sure commands gracefully respect termination signals (e.g. from Docker)
// Allow the graceful termination to be manually configurable
if (!process.env.NEXT_MANUAL_SIG_HANDLE) {
// callback value is signal string, exit with 0
process.on('SIGINT', () => cleanup(0))
process.on('SIGTERM', () => cleanup(0))
process.on('SIGINT', cleanup)
process.on('SIGTERM', cleanup)
}
process.on('rejectionHandled', () => {
// It is ok to await a Promise late in Next.js as it allows for better
Expand Down
3 changes: 3 additions & 0 deletions test/integration/css-client-nav/test/index.test.js
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ import { remove } from 'fs-extra'
import {
findPort,
killApp,
killProcess,
launchApp,
nextBuild,
nextStart,
Expand Down Expand Up @@ -180,6 +181,8 @@ describe('CSS Module client-side navigation', () => {
})
afterAll(async () => {
proxyServer.close()
// something is hanging onto the process, so we need to SIGKILL
await killProcess(app.pid, 'SIGKILL')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to track down why the app wouldn't close. I tried closing proxyServer as well as proxy in different orders, but both of them would hang if I waited for the close callback. I tried killing both of those with killApp which seemed to shut them down, but killApp(app) would still hang.

Using SIGKILL was the only way I found to work. Before this change, it would have been calling process.exit(0) immediately on receiving SIGTERM, so the end result should be the same.

await killApp(app)
})

Expand Down
11 changes: 3 additions & 8 deletions test/lib/next-modes/base.ts
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ import { Span } from 'next/src/trace'
import webdriver from '../next-webdriver'
import { renderViaHTTP, fetchViaHTTP } from 'next-test-utils'
import cheerio from 'cheerio'
import { once } from 'events'
import { BrowserInterface } from '../browsers/base'
import escapeStringRegexp from 'escape-string-regexp'

Expand Down Expand Up @@ -59,7 +60,7 @@ export class NextInstance {
public testDir: string
protected isStopping: boolean = false
protected isDestroyed: boolean = false
protected childProcess: ChildProcess
protected childProcess?: ChildProcess
protected _url: string
protected _parsedUrl: URL
protected packageJson?: PackageJson = {}
Expand Down Expand Up @@ -345,13 +346,7 @@ export class NextInstance {
public async stop(): Promise<void> {
this.isStopping = true
if (this.childProcess) {
let exitResolve
const exitPromise = new Promise((resolve) => {
exitResolve = resolve
})
this.childProcess.addListener('exit', () => {
exitResolve()
})
const exitPromise = once(this.childProcess, 'exit')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had started making changes here and turned out they weren't needed, but I thought I'd keep this refactor since once seemed a little cleaner when I learned about it.

await new Promise<void>((resolve) => {
treeKill(this.childProcess.pid, 'SIGKILL', (err) => {
if (err) {
Expand Down
17 changes: 16 additions & 1 deletion test/lib/next-test-utils.ts
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ import { getRandomPort } from 'get-port-please'
import fetch from 'node-fetch'
import qs from 'querystring'
import treeKill from 'tree-kill'
import { once } from 'events'

import server from 'next/dist/server/next'
import _pkg from 'next/package.json'
Expand Down Expand Up @@ -523,10 +524,24 @@ export async function killProcess(
})
}

export function isAppRunning(instance: ChildProcess) {
if (!instance?.pid) return false

try {
// 0 is a special signal that tests the existence of a process
process.kill(instance.pid, 0)
return true
} catch {
return false
}
}

// Kill a launched app
export async function killApp(instance: ChildProcess) {
if (instance && instance.pid) {
if (instance?.pid && isAppRunning(instance)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to kill the app if it's already running.

const exitPromise = once(instance, 'exit')
await killProcess(instance.pid)
await exitPromise
}
}

Expand Down
Loading
Loading