Skip to content

Commit

Permalink
Merge pull request #69 from Shopify/spawn-timeout
Browse files Browse the repository at this point in the history
Implement `spawn_timeout` configuration
  • Loading branch information
casperisfine authored Sep 9, 2023
2 parents 9899ef4 + 18aa140 commit 541418f
Show file tree
Hide file tree
Showing 5 changed files with 50 additions and 1 deletion.
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
# Unreleased

- Implement `spawn_timeout` to protect against bugs causing workers to get stuck before they reach ready state.

# 0.8.0

- Add an `after_monitor_ready` callback, called in the monitor process at end of boot.
Expand Down
13 changes: 13 additions & 0 deletions docs/CONFIGURATION.md
Original file line number Diff line number Diff line change
Expand Up @@ -238,6 +238,19 @@ exit or be SIGKILL-ed due to timeouts.
See https://nginx.org/en/docs/http/ngx_http_upstream_module.html
for more details on nginx upstream configuration.

### `spawn_timeout`

```ruby
timeout 5
```

Sets the timeout for a newly spawned worker to be ready after being spawned.

This timeout is a safeguard against various low-level fork safety bugs that could cause
a process to dead-lock.

The default of `10` seconds is quite generous and likely doesn't need to be adjusted.

### `logger`

```ruby
Expand Down
5 changes: 5 additions & 0 deletions lib/pitchfork/configurator.rb
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ class Configurator
DEFAULTS = {
:soft_timeout => 20,
:cleanup_timeout => 2,
:spawn_timeout => 10,
:timeout => 22,
:logger => default_logger,
:worker_processes => 1,
Expand Down Expand Up @@ -174,6 +175,10 @@ def timeout(seconds, cleanup: 2)
set_int(:timeout, soft_timeout + cleanup_timeout, 5)
end

def spawn_timeout(seconds)
set_int(:spawn_timeout, seconds, 1)
end

def worker_processes(nr)
set_int(:worker_processes, nr, 1)
end
Expand Down
6 changes: 5 additions & 1 deletion lib/pitchfork/http_server.rb
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ def extend_deadline(extra_time)
end

# :stopdoc:
attr_accessor :app, :timeout, :soft_timeout, :cleanup_timeout, :worker_processes,
attr_accessor :app, :timeout, :soft_timeout, :cleanup_timeout, :spawn_timeout, :worker_processes,
:after_worker_fork, :after_mold_fork,
:listener_opts, :children,
:orig_app, :config, :ready_pipe,
Expand Down Expand Up @@ -556,6 +556,10 @@ def after_fork_internal
def spawn_worker(worker, detach:)
logger.info("worker=#{worker.nr} gen=#{worker.generation} spawning...")

# We set the deadline before spawning the child so that if for some
# reason it gets stuck before reaching the worker loop,
# the monitor process will kill it.
worker.update_deadline(@spawn_timeout)
Pitchfork.fork_sibling do
worker.pid = Process.pid

Expand Down
25 changes: 25 additions & 0 deletions test/integration/test_boot.rb
Original file line number Diff line number Diff line change
Expand Up @@ -84,4 +84,29 @@ def test_boot_broken_after_mold_fork

assert_exited(pid, 1)
end

def test_boot_worker_stuck_in_spawn
addr, port = unused_port

pid = spawn_server(app: APP, config: <<~RUBY)
listen "#{addr}:#{port}"
worker_processes 2
refork_after [50, 100, 1000]
spawn_timeout 2
after_worker_fork do |_server, worker|
if worker.nr == 1
sleep 20 # simulate a stuck worker
end
end
RUBY


assert_healthy("http://#{addr}:#{port}")

assert_stderr("worker=0 gen=0 ready")
assert_stderr(/worker=1 pid=\d+ registered/)
assert_stderr(/worker=1 pid=\d+ timed out, killing/, timeout: 4)

assert_clean_shutdown(pid)
end
end

0 comments on commit 541418f

Please sign in to comment.