
Description

This module is a wrapper around the slurm-controller-hybrid module by SchedMD, part of the slurm-gcp GitHub repository. The hybrid module creates the configurations needed to extend an on-premises Slurm cluster with one or more Google Cloud bursting partitions. These partitions create the requested nodes in a GCP project on demand and scale them down after a period of inactivity, in the same way that the schedmd-slurm-gcp-v5-controller module auto-scales VMs.

NOTE: This is an experimental module; its functionality and documentation will likely be updated in the near future. This module has only been tested in a limited capacity with the HPC Toolkit. On-premises Slurm configurations vary significantly, so this module should be used as a starting point, not a complete solution.

Usage

The hybrid module is intended to be run on the controller of the on-premises Slurm cluster, meaning terraform init/apply are executed against the deployment directory on that machine. This allows the module to infer settings such as the Slurm user and user ID when setting permissions on the created configurations.

If terraform and the other dependencies cannot be installed on the controller directly, the hybrid module can be deployed in a separate build environment and the created configurations copied to the on-premises controller manually. This requires additional configuration and verification of permissions. For more information see the hybrid.md documentation in slurm-gcp.

NOTE: The hybrid module requires a number of dependencies to be installed on the system deploying the module; see the hybrid.md documentation on slurm-gcp for details.

Manual Configuration

This module does not complete the installation of hybrid partitions on your Slurm cluster. After deploying, you must follow the steps listed in the hybrid.md documentation under manual steps.

Example Usage

The hybrid module can be added to a blueprint as follows:

- id: slurm-controller
  source: ./community/modules/scheduler/schedmd-slurm-gcp-v5-hybrid
  use:
  - debug-partition
  - compute-partition
  - pre-existing-storage
  settings:
    output_dir: ./hybrid
    slurm_bin_dir: /usr/local/bin
    slurm_control_host: static-controller

This defines an HPC module that creates a hybrid configuration with the following attributes:

  • Two partitions defined in earlier modules with the IDs debug-partition and compute-partition. These are the same partition modules used by schedmd-slurm-gcp-v5-controller (a sketch of these upstream modules follows this list).
  • Network storage to be mounted on the compute nodes when they are created, defined in pre-existing-storage.
  • output_dir set to ./hybrid. This is where the hybrid configurations will be created.
  • slurm_bin_dir located at /usr/local/bin. Set this to wherever the Slurm executables are installed on your system.
  • slurm_control_host set to static-controller. The name of the on-premises controller host is provided to the module for configuring NFS mounts and for communicating with the controller after VM creation.
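For context, the partition and storage modules referenced by use above might be defined earlier in the blueprint along the following lines. This is a minimal sketch only: the module sources shown, the network1 module ID, and all setting values are illustrative assumptions; consult the documentation of those modules for the exact variables required by your Toolkit version.

- id: pre-existing-storage
  source: ./modules/file-system/pre-existing-network-storage
  settings:
    server_ip: storage.example.com   # hypothetical on-premises NFS server
    remote_mount: /exports/apps
    local_mount: /opt/apps
    fs_type: nfs

- id: compute-partition
  source: ./community/modules/compute/schedmd-slurm-gcp-v5-partition
  use:
  - network1                         # a VPC module defined elsewhere in the blueprint
  settings:
    partition_name: compute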

Assumptions and Limitations

Shared directories from the controller: By default, the following directories are NFS mounted from the on-premises controller onto the created cloud VMs:

  • /home
  • /opt/apps
  • /etc/munge
  • /usr/local/slurm/etc

The expectation is that these directories exist on the controller and contain all files that slurmd needs to keep in sync with the controller.

If this does not match your Slurm cluster, these directories can be overridden with a custom NFS mount, either by using pre-existing-network-storage or by setting the network_storage variable directly on the hybrid module. Any value in network_storage, whether added directly or via use, overrides the default directories above (see the sketch below).
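For example, a /home directory exported from a dedicated NFS server rather than from the controller could be supplied through pre-existing-network-storage. This is a minimal sketch; the server address and mount options are illustrative assumptions:

- id: homefs
  source: ./modules/file-system/pre-existing-network-storage
  settings:
    server_ip: nfs.example.com       # hypothetical NFS server exporting /home
    remote_mount: /export/home
    local_mount: /home
    fs_type: nfs
    mount_options: defaults,hard,intr

- id: slurm-controller
  source: ./community/modules/scheduler/schedmd-slurm-gcp-v5-hybrid
  use:
  - compute-partition
  - homefs                           # supplies a custom /home mount in place of the default
  settings:
    output_dir: ./hybrid
    slurm_control_host: static-controller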

Setting the variable disable_default_mounts to true disables these defaults entirely. Note that, at a minimum, the cloud VMs require /etc/munge and /usr/local/slurm/etc to be mounted from the controller; these mounts must be managed manually if disable_default_mounts is set to true (a sketch follows).
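A minimal sketch of that approach, disabling the defaults and supplying only the two required directories directly through the network_storage setting; the server address and mount options are illustrative assumptions:

- id: slurm-controller
  source: ./community/modules/scheduler/schedmd-slurm-gcp-v5-hybrid
  use:
  - compute-partition
  settings:
    output_dir: ./hybrid
    slurm_control_host: static-controller
    disable_default_mounts: true
    network_storage:                 # the minimum mounts the cloud VMs still require
    - server_ip: static-controller   # assumes the controller exports these paths over NFS
      remote_mount: /etc/munge
      local_mount: /etc/munge
      fs_type: nfs
      mount_options: defaults,hard,intr
    - server_ip: static-controller
      remote_mount: /usr/local/slurm/etc
      local_mount: /usr/local/slurm/etc
      fs_type: nfs
      mount_options: defaults,hard,intr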

Power Saving Logic: The cloud partitions make use of Slurm's power saving logic, and the suspend and resume programs will be set accordingly. If any local partitions also use these slurm.conf variables, a conflict will likely occur. There is currently no support for partition-level suspend and resume scripts, so either the local partitions must stop using them or the hybrid module cannot be used.

Slurm versions: The version of Slurm on the on-premises cluster must match the Slurm version on the cloud VMs created by the hybrid partitions. The version on the cloud VMs is dictated by the disk image, which can be set when defining the partitions using schedmd-slurm-gcp-v5-partition.

If the publicly available images do not suffice, slurm-gcp provides packer templates for creating custom disk images.

SchedMD only supports the current and previous major versions of Slurm, therefore we strongly advise using only versions 21 or 22 with this module. Using this module with any version older than 21 may lead to unexpected results.
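As a sketch of pinning the disk image when defining a partition: the instance_image setting and its family/project fields are an assumption here, as are the family and project names; check the schedmd-slurm-gcp-v5-partition documentation and the image family produced by your packer build.

- id: compute-partition
  source: ./community/modules/compute/schedmd-slurm-gcp-v5-partition
  use:
  - network1                         # a VPC module defined elsewhere in the blueprint
  settings:
    partition_name: compute
    instance_image:
      family: slurm-gcp-5-custom     # hypothetical family from a slurm-gcp packer build
      project: my-project-id         # hypothetical project hosting the image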

License

Copyright 2022 Google LLC

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Requirements

Name Version
terraform >= 0.14.0
null ~> 3.0

Providers

Name Version
null ~> 3.0

Modules

Name: slurm_controller_instance
Source: github.com/SchedMD/slurm-gcp.git//terraform/slurm_cluster/modules/slurm_controller_hybrid
Version: 5.3.0

Resources

Name Type
null_resource.set_prefix_cloud_conf resource

Inputs

cloud_parameters
  Description: cloud.conf options.
  Type: object({ no_comma_params = bool, resume_rate = number, resume_timeout = number, suspend_rate = number, suspend_timeout = number })
  Default: { "no_comma_params": false, "resume_rate": 0, "resume_timeout": 300, "suspend_rate": 0, "suspend_timeout": 300 }
  Required: no

compute_startup_script
  Description: Startup script used by the compute VMs.
  Type: string
  Default: ""
  Required: no

compute_startup_scripts_timeout
  Description: The timeout (seconds) applied to the compute_startup_script. If any script exceeds this timeout, then the instance setup process is considered failed and handled accordingly. NOTE: When set to 0, the timeout is considered infinite and thus disabled.
  Type: number
  Default: 300
  Required: no

deployment_name
  Description: Name of the deployment.
  Type: string
  Default: n/a
  Required: yes

disable_default_mounts
  Description: Disable default global network storage from the controller: /usr/local/etc/slurm, /etc/munge, /home, /apps. If these are disabled, the slurm etc and munge dirs must be added manually, or some other mechanism must be used to synchronize the slurm conf files and the munge key across the cluster.
  Type: bool
  Default: false
  Required: no

enable_bigquery_load
  Description: Enables loading of cluster job usage into BigQuery. NOTE: Requires the Google BigQuery API.
  Type: bool
  Default: false
  Required: no

enable_cleanup_compute
  Description: Enables automatic cleanup of compute nodes and resource policies (e.g. placement groups) managed by this module when the cluster is destroyed. NOTE: Requires Python and script dependencies. WARNING: Toggling this may impact the running workload. Deployed compute nodes may be destroyed and their jobs will be requeued.
  Type: bool
  Default: false
  Required: no

enable_cleanup_subscriptions
  Description: Enables automatic cleanup of pub/sub subscriptions managed by this module when the cluster is destroyed. NOTE: Requires Python and script dependencies. WARNING: Toggling this may temporarily impact var.enable_reconfigure behavior.
  Type: bool
  Default: false
  Required: no

enable_devel
  Description: Enables development mode. Not for production use.
  Type: bool
  Default: false
  Required: no

enable_reconfigure
  Description: Enables automatic Slurm reconfiguration when the Slurm configuration changes (e.g. slurm.conf.tpl, partition details). Compute instances and resource policies (e.g. placement groups) will be destroyed to align with the new configuration. NOTE: Requires Python and the Google Pub/Sub API. WARNING: Toggling this will impact the running workload. Deployed compute nodes will be destroyed and their jobs will be requeued.
  Type: bool
  Default: false
  Required: no

epilog_scripts
  Description: List of scripts to be used for Epilog. Programs for the slurmd to execute on every node when a user's job completes. See https://slurm.schedmd.com/slurm.conf.html#OPT_Epilog.
  Type: list(object({ filename = string, content = string }))
  Default: []
  Required: no

google_app_cred_path
  Description: Path to Google Application Credentials.
  Type: string
  Default: null
  Required: no

install_dir
  Description: Directory where the hybrid configuration directory will be installed on the on-premises controller. This updates the prefix path for the resume and suspend scripts in the generated cloud.conf file. The value defaults to output_dir if not specified.
  Type: string
  Default: null
  Required: no

network_storage
  Description: Storage to be mounted on all instances.
    - server_ip: Address of the storage server.
    - remote_mount: The location in the remote instance filesystem to mount from.
    - local_mount: The location on the instance filesystem to mount to.
    - fs_type: Filesystem type (e.g. "nfs").
    - mount_options: Options to mount with.
  Type: list(object({ server_ip = string, remote_mount = string, local_mount = string, fs_type = string, mount_options = string }))
  Default: []
  Required: no

output_dir
  Description: Directory where this module will write its files to. These files include: cloud.conf, cloud_gres.conf, config.yaml, resume.py, suspend.py, and util.py. If not specified explicitly, this will also be used as the default value for the install_dir variable.
  Type: string
  Default: null
  Required: no

partition
  Description: Cluster partitions as a list.
  Type:
    list(object({
      compute_list = list(string)
      partition = object({
        enable_job_exclusive    = bool
        enable_placement_groups = bool
        network_storage = list(object({
          server_ip     = string
          remote_mount  = string
          local_mount   = string
          fs_type       = string
          mount_options = string
        }))
        partition_conf = map(string)
        partition_name = string
        partition_nodes = map(object({
          bandwidth_tier         = string
          node_count_dynamic_max = number
          node_count_static      = number
          enable_spot_vm         = bool
          group_name             = string
          instance_template      = string
          node_conf              = map(string)
          access_config = list(object({
            network_tier = string
          }))
          spot_instance_config = object({
            termination_action = string
          })
        }))
        partition_startup_scripts_timeout = number
        subnetwork        = string
        zone_policy_allow = list(string)
        zone_policy_deny  = list(string)
      })
    }))
  Default: []
  Required: no

project_id
  Description: Project ID to create resources in.
  Type: string
  Default: n/a
  Required: yes

prolog_scripts
  Description: List of scripts to be used for Prolog. Programs for the slurmd to execute whenever it is asked to run a job step from a new job allocation. See https://slurm.schedmd.com/slurm.conf.html#OPT_Prolog.
  Type: list(object({ filename = string, content = string }))
  Default: []
  Required: no

slurm_bin_dir
  Description: Path to the directory of Slurm binary commands (e.g. scontrol, sinfo). If 'null', then it will be assumed that binaries are in $PATH.
  Type: string
  Default: null
  Required: no

slurm_cluster_name
  Description: Cluster name, used for resource naming and Slurm accounting. If not provided it will default to the first 8 characters of the deployment name (removing any invalid characters).
  Type: string
  Default: null
  Required: no

slurm_control_addr
  Description: The IP address or a name by which the address can be identified. This value is passed to slurm.conf such that: SlurmctldHost={var.slurm_control_host}({var.slurm_control_addr}). See https://slurm.schedmd.com/slurm.conf.html#OPT_SlurmctldHost.
  Type: string
  Default: null
  Required: no

slurm_control_host
  Description: The short, or long, hostname of the machine where the Slurm control daemon is executed (i.e. the name returned by the command "hostname -s"). This value is passed to slurm.conf such that: SlurmctldHost={var.slurm_control_host}({var.slurm_control_addr}). See https://slurm.schedmd.com/slurm.conf.html#OPT_SlurmctldHost.
  Type: string
  Default: n/a
  Required: yes

slurm_log_dir
  Description: Directory where Slurm logs to.
  Type: string
  Default: "/var/log/slurm"
  Required: no
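As a reference for how a few of these inputs map onto blueprint settings, a minimal sketch follows; all values shown are illustrative assumptions only:

- id: slurm-controller
  source: ./community/modules/scheduler/schedmd-slurm-gcp-v5-hybrid
  use:
  - compute-partition
  settings:
    output_dir: ./hybrid
    install_dir: /etc/slurm/hybrid   # hypothetical final location on the controller
    slurm_control_host: static-controller
    slurm_bin_dir: /usr/local/bin
    slurm_log_dir: /var/log/slurm
    cloud_parameters:
      no_comma_params: false
      resume_rate: 0
      resume_timeout: 300
      suspend_rate: 0
      suspend_timeout: 300
    prolog_scripts:
    - filename: job-start-logger.sh  # hypothetical Prolog script
      content: |
        #!/bin/bash
        logger "Starting job ${SLURM_JOB_ID} on $(hostname)"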

Outputs

No outputs.