Commit

Merge branch 'develop' into efa
hgreebe authored Dec 2, 2024
2 parents f1d02a0 + 4e23e3f commit 02c9683
Showing 3 changed files with 5 additions and 0 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -20,6 +20,7 @@ This file is used to list changes made in each version of the AWS ParallelCluste
- Libfabric-aws: `libfabric-aws-1.22.0-1`
- Rdma-core: `rdma-core-54.0-1`
- Open MPI: `openmpi40-aws-4.1.7-1` and `openmpi50-aws-5.0.5`
- Auto-restart slurmctld on failure.

**BUG FIXES**
- Fix an issue in the way we get region when manage volumes so that it can correctly handle local zone.
@@ -32,6 +32,8 @@
it 'creates the service definition for slurmctld with the correct settings' do
is_expected.to render_file('/etc/systemd/system/slurmctld.service')
.with_content("After=network-online.target munge.service remote-fs.target")
.with_content("Restart=on-failure")
.with_content("RestartSec=1s")
end
end
end
@@ -12,6 +12,8 @@ ExecReload=/bin/kill -HUP $MAINPID
LimitNOFILE=562930
LimitMEMLOCK=infinity
LimitSTACK=infinity
Restart=on-failure
RestartSec=1s

[Install]
WantedBy=multi-user.target
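
Note on the two new directives: with `Restart=on-failure`, systemd restarts the service after an unclean exit (non-zero exit code, termination by an unclean signal, or a timeout), and `RestartSec=1s` makes it wait one second before each restart attempt. The spec in this commit checks for them as rendered strings. As a minimal sketch of checking the same directives outside ChefSpec, the Ruby below parses unit-file text and reads the keys from the `[Service]` section. The helper `unit_directives` and the constant `UNIT_EXCERPT` are hypothetical names introduced for illustration, and the excerpt holds only the lines visible in this hunk, not the full rendered template.

```ruby
# Hypothetical helper: collect Key=Value directives from one section of a
# systemd unit file. Handles only simple one-line directives; it ignores
# line continuations and repeated keys (the last value wins).
def unit_directives(unit_text, section)
  current = nil
  directives = {}
  unit_text.each_line do |line|
    line = line.strip
    # Skip blank lines and comments ('#' and ';' both start a comment).
    next if line.empty? || line.start_with?('#', ';')

    if (m = line.match(/\A\[(.+)\]\z/))
      current = m[1]
    elsif current == section && line.include?('=')
      key, value = line.split('=', 2)
      directives[key.strip] = value.strip
    end
  end
  directives
end

# Assumption: an excerpt built from the lines shown in this diff only.
UNIT_EXCERPT = <<~UNIT
  [Service]
  LimitNOFILE=562930
  LimitMEMLOCK=infinity
  LimitSTACK=infinity
  Restart=on-failure
  RestartSec=1s

  [Install]
  WantedBy=multi-user.target
UNIT

service = unit_directives(UNIT_EXCERPT, 'Service')
puts service['Restart']
puts service['RestartSec']
```

On a running head node, the effective values can also be read from systemd itself with `systemctl show slurmctld -p Restart -p RestartUSec`.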
