new ways to reset crash loop tracking metadata #15064

Merged
merged 4 commits into from
Nov 23, 2023

Conversation

bharathv
Contributor

Adds two more ways to reset crash loop tracking. The main changes in this PR are:

  • Every time the broker starts in recovery mode, the tracking is reset. Booting in recovery mode signals that the user intends to fix things, so aborting on the crash loop limit check makes no sense there; instead we reset the tracking metadata.
  • Adds a new admin API that explicitly resets the tracking, mainly as an escape hatch to call when a broker is close to hitting the limit.
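
The two reset paths described above can be modeled roughly as follows. This is only an illustrative sketch, not the code from this PR: the `CrashTracker` class, its method names, and the in-memory counter are all hypothetical, and the real broker persists this metadata and exposes the explicit reset through its admin API.

```python
from dataclasses import dataclass


@dataclass
class CrashTracker:
    """Hypothetical in-memory model of crash loop tracking metadata."""

    crash_loop_limit: int  # assumed: max tracked starts before the broker aborts
    crash_count: int = 0   # assumed: number of starts tracked so far

    def reset(self) -> None:
        """Explicit reset, standing in for what the new admin API triggers."""
        self.crash_count = 0

    def on_startup(self, recovery_mode: bool) -> bool:
        """Return True if the broker may continue booting."""
        if recovery_mode:
            # Recovery mode signals intent to fix things: reset instead of
            # running the crash loop limit check.
            self.reset()
            return True
        if self.crash_count >= self.crash_loop_limit:
            # Limit reached: abort startup.
            return False
        self.crash_count += 1
        return True


tracker = CrashTracker(crash_loop_limit=2)
tracker.on_startup(recovery_mode=False)              # tracked start 1
tracker.on_startup(recovery_mode=False)              # tracked start 2
blocked = not tracker.on_startup(recovery_mode=False)  # limit reached
recovered = tracker.on_startup(recovery_mode=True)     # resets and boots
```

In this model a normal start after the recovery-mode boot begins counting from zero again, which is the behavior the first bullet describes.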

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.2.x
  • v23.1.x
  • v22.3.x

Release Notes

Features

  • Adds an admin API to reset crash tracking. Additionally, the tracking metadata is reset every time the broker boots in recovery mode.

This commit ensures that every time the broker starts in recovery mode,
the crash tracker is reset. We do so because booting in recovery mode
shows user intent to fix broken things, and resetting the crash loop
tracking is a good starting point.
@bharathv bharathv requested review from mmaslankaprv and ztlpn and removed request for mmaslankaprv November 21, 2023 12:26
@bharathv bharathv merged commit c4c295b into redpanda-data:dev Nov 23, 2023
22 checks passed
@bharathv bharathv deleted the crash_tracking_reset branch November 23, 2023 03:27
2 participants