Auto-restart slurmctld on failure after 1 second. #2838

gmarciani
Contributor

@gmarciani gmarciani commented Nov 26, 2024

Description of changes

Auto-restart slurmctld on failure after 1 second.

Tests

  • Verified that slurmctld is auto-restarted after a forced crash (kill) and that job information is restored
  • Spec test
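
The diff itself is not shown in this conversation, so the following is only a sketch of what a check like the spec test above might assert. It assumes slurmctld runs as a systemd service and that the restart policy is expressed with the standard `Restart=` and `RestartSec=` directives. The unit text, the `ExecStart` path, and the `restart_policy` helper are hypothetical and are not taken from the PR.

```python
import configparser

# Hypothetical unit file content; the real template lives in the cookbook
# and may differ. Only Restart= and RestartSec= matter for this sketch.
UNIT_TEXT = """
[Unit]
Description=Slurm controller daemon

[Service]
ExecStart=/opt/slurm/sbin/slurmctld -D
Restart=on-failure
RestartSec=1
"""


def restart_policy(unit_text):
    """Return the (Restart, RestartSec) values from a systemd unit's [Service] section."""
    # strict=False tolerates repeated keys, which systemd units allow.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep directive names case-sensitive
    parser.read_string(unit_text)
    service = parser["Service"]
    return service.get("Restart"), service.get("RestartSec")


if __name__ == "__main__":
    print(restart_policy(UNIT_TEXT))
```

With `Restart=on-failure`, systemd restarts the service only after an unclean exit (for example, a kill signal), which matches the manual test above. A `RestartSec` value without a unit is interpreted as seconds.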

References

  • Link to impacted open issues.
  • Link to related PRs in other packages (e.g. cookbook, node).
  • Link to documentation useful to understand the changes.

Checklist

  • Make sure you are pointing to the right branch.
  • If you're creating a patch for a branch other than develop add the branch name as prefix in the PR title (e.g. [release-3.6]).
  • Check all commits' messages are clear, describing what and why vs how.
  • Make sure to have added unit tests or integration tests to cover the new/modified code.
  • Check if documentation is impacted by this change.

Please review the guidelines for contributing and Pull Request Instructions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@gmarciani gmarciani requested review from a team as code owners November 26, 2024 10:57
@gmarciani gmarciani force-pushed the wip/mgiacomo/3120/auto-restart-slurmctld-1126-1 branch from 951ee9e to 1ea47a3 Compare November 26, 2024 11:12
@gmarciani gmarciani force-pushed the wip/mgiacomo/3120/auto-restart-slurmctld-1126-1 branch from 1ea47a3 to d6c8b44 Compare November 28, 2024 12:36
@gmarciani gmarciani enabled auto-merge (rebase) December 2, 2024 15:03
@gmarciani gmarciani merged commit 4e23e3f into aws:develop Dec 2, 2024
29 of 31 checks passed
@gmarciani gmarciani deleted the wip/mgiacomo/3120/auto-restart-slurmctld-1126-1 branch December 2, 2024 15:22
2 participants