(3.3.0-3.5.1) Cluster updates can break Slurm accounting functionality #5151

Closed
enrico-usai opened this issue Apr 4, 2023 · 1 comment
enrico-usai commented Apr 4, 2023

The issue

In ParallelCluster versions 3.3.0 through 3.5.1, if the Slurm accounting feature is enabled, a cluster-update operation that changes any setting under the Scheduling section will break the Slurm accounting configuration on the system.

Slurm accounting will keep working until slurmdbd is restarted or, more generally, the head node is rebooted.
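
To check quickly whether a given installation falls in the affected range, the version comparison can be sketched as below. This is only an illustration: the function names `parse_version` and `is_affected` are hypothetical, and the sketch assumes a plain `MAJOR.MINOR.PATCH` version string with no suffixes.

```python
from typing import Tuple

# Affected range as stated in this issue: 3.3.0 through 3.5.1 (inclusive).
AFFECTED_MIN = (3, 3, 0)
AFFECTED_MAX = (3, 5, 1)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a 'MAJOR.MINOR.PATCH' string into a tuple of ints (hypothetical helper)."""
    return tuple(int(part) for part in version.strip().split("."))


def is_affected(version: str) -> bool:
    """Return True if the version lies inside the affected range of this issue."""
    return AFFECTED_MIN <= parse_version(version) <= AFFECTED_MAX


if __name__ == "__main__":
    for candidate in ("3.2.1", "3.3.0", "3.5.1", "3.6.0"):
        print(candidate, "affected" if is_affected(candidate) else "not affected")
```

The version string can be taken, for example, from the output of `pcluster version`. Being in the affected range only matters if Slurm accounting is enabled and a cluster update touched the Scheduling section.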

Mitigation

See https://github.com/aws/aws-parallelcluster/wiki/(3.3.0-3.5.1)-Cluster-updates-can-break-Slurm-accounting-functionality

enrico-usai added the bug, update (Update related issue), and 3.x labels Apr 4, 2023
jdeamicis added a commit to jdeamicis/aws-parallelcluster that referenced this issue Apr 14, 2023
Modify integration test test_slurm_accounting to verify that
the Slurm database password is unchanged after a cluster update
that modifies the Slurm queues without modifying the Slurm
accounting configuration.

Signed-off-by: Jacopo De Amicis <[email protected]>
jdeamicis added a commit to jdeamicis/aws-parallelcluster that referenced this issue Apr 17, 2023
Modify integration test test_slurm_accounting to verify that
the Slurm database password is unchanged after a cluster update
that modifies the Slurm queues without modifying the Slurm
accounting configuration.

Signed-off-by: Jacopo De Amicis <[email protected]>
jdeamicis added a commit that referenced this issue Apr 17, 2023
Modify integration test test_slurm_accounting to verify that
the Slurm database password is unchanged after a cluster update
that modifies the Slurm queues without modifying the Slurm
accounting configuration.

Signed-off-by: Jacopo De Amicis <[email protected]>
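
The check described in the commit message above (the database password must be unchanged after an update that only touches the queues) can be sketched as follows. This is not the repository's actual test: it assumes the password is available as a `StoragePass=` entry in the text of the slurmdbd configuration, captured before and after the update, and the helper names `read_setting` and `password_unchanged` are hypothetical.

```python
from typing import Optional


def read_setting(conf_text: str, key: str) -> Optional[str]:
    """Return the value of `key` from Slurm-style 'Key=Value' text, or None if absent."""
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and full-line comments.
        if not stripped or stripped.startswith("#"):
            continue
        name, sep, value = stripped.partition("=")
        # Slurm configuration keys are case-insensitive.
        if sep and name.strip().lower() == key.lower():
            return value.strip()
    return None


def password_unchanged(before: str, after: str, key: str = "StoragePass") -> bool:
    """True only if the key exists before the update and has the same value after it."""
    old_value = read_setting(before, key)
    return old_value is not None and old_value == read_setting(after, key)


if __name__ == "__main__":
    # Made-up configuration snippets, for illustration only.
    before = "DbdHost=localhost\nStoragePass=example-secret\n"
    after_ok = "# regenerated\nDbdHost=localhost\nStoragePass=example-secret\n"
    after_bad = "DbdHost=localhost\nStoragePass=placeholder\n"
    print(password_unchanged(before, after_ok))
    print(password_unchanged(before, after_bad))
```

Treating a missing key as a failure is a deliberate choice here: an update that drops the entry entirely should not pass as "unchanged".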
@enrico-usai

Closing since all versions > 3.5.0 contain a fix.
