Add prestop hook #665

Open: wants to merge 3 commits into base: master

Conversation

WalBeh
Contributor

@WalBeh WalBeh commented Oct 10, 2024

Summary of changes

For rolling restarts of the CrateDB pods, we want to notify CrateDB with an ALTER CLUSTER DECOMMISSION [1] so that running queries and shard relocation are handled properly:

  • The preStop lifecycle hook [2] executes dc_util, which is provided via https://cdn.crate.io/downloads/dc_util_x86_64
  • dc_util sets the CrateDB decommission timeout to 720s and sets decommission force to true, so that the decommission is forced in cases where CrateDB would otherwise roll it back.
  • termination_grace_period_seconds on the pod is set to 900s to allow the decommission to finish.
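
To make the timing concrete, here is a minimal sketch of the pod spec fragment described above, built as plain Python dicts. The container name `crate`, the helper names, and the shell command that downloads and runs dc_util are assumptions for illustration; the actual invocation and its arguments are not shown in this description, so none are included.

```python
from typing import Any

# Values taken from the PR description.
DECOMMISSION_TIMEOUT_SECONDS = 720      # timeout dc_util sets in CrateDB
TERMINATION_GRACE_PERIOD_SECONDS = 900  # grace period set on the pod
DC_UTIL_URL = "https://cdn.crate.io/downloads/dc_util_x86_64"


def build_prestop_lifecycle(command: list[str]) -> dict[str, Any]:
    """Return a container `lifecycle` fragment that runs `command` before SIGTERM."""
    return {"preStop": {"exec": {"command": command}}}


def build_pod_spec_fragment() -> dict[str, Any]:
    # Hypothetical command: download dc_util and run it without arguments.
    # The real command line used by the operator may differ.
    command = [
        "/bin/sh",
        "-c",
        f"curl -sLO {DC_UTIL_URL} && chmod u+x ./dc_util_x86_64 && ./dc_util_x86_64",
    ]
    return {
        "terminationGracePeriodSeconds": TERMINATION_GRACE_PERIOD_SECONDS,
        # "crate" is an assumed container name.
        "containers": [{"name": "crate", "lifecycle": build_prestop_lifecycle(command)}],
    }


fragment = build_pod_spec_fragment()
# Time left for CrateDB to shut down after the decommission timeout expires.
headroom = TERMINATION_GRACE_PERIOD_SECONDS - DECOMMISSION_TIMEOUT_SECONDS
print(headroom)  # 180
```

The grace period has to exceed the decommission timeout, since the time spent in preStop counts against it; with these values 180s remain after a decommission that runs into its timeout.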

If the StatefulSet is scaled to 0 (a scale-down, or manually triggered by an administrator), dc_util tries to detect this and does NOT trigger an ALTER CLUSTER DECOMMISSION; pod termination then proceeds with the kubelet sending SIGTERM.
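
The decision can be sketched as follows. How dc_util actually detects the scale-to-0 case is not described here; this sketch assumes it can read the desired replica count of the StatefulSet, and the function name is hypothetical.

```python
def should_decommission(desired_replicas: int) -> bool:
    """Decide whether preStop should issue the decommission.

    Assumption: the desired replica count of the StatefulSet is known.
    With 0 desired replicas the whole cluster is being stopped, so there
    is no other node left to take over shards and the step is skipped.
    """
    return desired_replicas > 0


print(should_decommission(3))  # True
print(should_decommission(0))  # False
```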

preStop seems to behave somewhat fail-safe. E.g., if dc_util is not available or cannot be executed, the kubelet continues by sending SIGTERM.

Checklist

  • Link to issue this PR refers to: https://github.com/crate/cloud/issues/2127
  • Relevant changes are reflected in CHANGES.rst
  • Added or changed code is covered by tests
  • Documentation has been updated if necessary
  • Changed code does not contain any breaking changes (or this is a major version change)

Footnotes

  1. https://cratedb.com/docs/crate/reference/en/latest/sql/statements/alter-cluster.html#decommission-nodeid-nodename

  2. https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/

@cla-bot cla-bot bot added the cla-signed label Oct 10, 2024
@WalBeh WalBeh force-pushed the bw/add-prestop-for-crate-container branch 2 times, most recently from 98d7a7d to 012fe7c Compare October 14, 2024 07:52
@WalBeh WalBeh marked this pull request as ready for review October 15, 2024 09:16
@WalBeh WalBeh force-pushed the bw/add-prestop-for-crate-container branch from 012fe7c to 06b898b Compare October 15, 2024 09:26
@WalBeh WalBeh force-pushed the bw/add-prestop-for-crate-container branch from 0ed30de to 4e1a214 Compare October 17, 2024 11:46
@hammerhead hammerhead removed their request for review October 25, 2024 07:56
1 participant