Integrate DDN Lustre install script with startup-script #553

43 changes: 42 additions & 1 deletion community/modules/file-system/DDN-EXAScaler/README.md
@@ -21,7 +21,45 @@ More information about the architecture can be found at
[marketplace]: https://console.developers.google.com/marketplace/product/ddnstorage/exascaler-cloud
[architecture]: https://cloud.google.com/architecture/lustre-architecture

## Mounting

To mount the DDN EXAScaler Lustre file system you must first install the DDN
Lustre client and then run the appropriate `mount` command.

When mounting to a Slurm resource, both of these steps are handled
automatically through the `use` field. See the
[hpc-cluster-high-io](../../../../examples/hpc-cluster-high-io.yaml) blueprint
for an example of using this module with Slurm.

When mounting to other compute resources such as `vm-instance` or
`cloud-batch-job`, the DDN-EXAScaler module outputs runners that can be used
with the startup-script module to install the client and mount the file system.
See the following example:

```yaml
  - id: lustrefs
    source: community/modules/file-system/DDN-EXAScaler
    use: [network1]
    settings: {local_mount: /scratch}

  - id: mount-at-startup
    source: modules/scripts/startup-script
    settings:
      runners:
      - $(lustrefs.install_ddn_lustre_client_runner)
      - $(lustrefs.mount_runner)

  - id: workstation
    source: modules/compute/vm-instance
    use: [network1, lustrefs, mount-at-startup]
```

See [additional documentation][ddn-install-docs] from DDN EXAScaler.

[ddn-install-docs]: https://github.com/DDNStorage/exascaler-cloud-terraform/tree/master/gcp#install-new-exascaler-cloud-clients

## Support

EXAScaler Cloud includes self-help support with access to publicly available
documents and videos. Premium support includes 24x7x365 access to DDN's experts,
along with support community access, automated notifications of updates and
@@ -101,8 +139,11 @@ No resources.

| Name | Description |
|------|-------------|
| <a name="output_client_config"></a> [client\_config](#output\_client\_config) | Script that will install DDN EXAScaler lustre client. The machine running this script must be on the same network & subnet as the EXAScaler. |
| <a name="output_http_console"></a> [http\_console](#output\_http\_console) | HTTP address to access the system web console. |
| <a name="output_mount_command"></a> [mount\_command](#output\_mount\_command) | Command to mount the file system. |
| <a name="output_install_ddn_lustre_client_runner"></a> [install\_ddn\_lustre\_client\_runner](#output\_install\_ddn\_lustre\_client\_runner) | Runner that encapsulates the `client_config` output on this module. |
| <a name="output_mount_command"></a> [mount\_command](#output\_mount\_command) | Command to mount the file system. `client_config` script must be run first. |
| <a name="output_mount_runner"></a> [mount\_runner](#output\_mount\_runner) | Runner to mount the DDN EXAScaler Lustre file system |
| <a name="output_network_storage"></a> [network\_storage](#output\_network\_storage) | Describes a EXAScaler system to be mounted by other systems. |
| <a name="output_private_addresses"></a> [private\_addresses](#output\_private\_addresses) | Private IP addresses for all instances. |
| <a name="output_ssh_console"></a> [ssh\_console](#output\_ssh\_console) | Instructions to ssh into the instances. |
35 changes: 33 additions & 2 deletions community/modules/file-system/DDN-EXAScaler/outputs.tf
@@ -24,16 +24,47 @@ output "ssh_console" {
  value = module.ddn_exascaler.ssh_console
}

output "client_config" {
  description = "Script that will install DDN EXAScaler lustre client. The machine running this script must be on the same network & subnet as the EXAScaler."
  value       = module.ddn_exascaler.client_config
}

output "install_ddn_lustre_client_runner" {
  description = "Runner that encapsulates the `client_config` output on this module."
  value = {
    "type"        = "shell"
    "content"     = module.ddn_exascaler.client_config
    "destination" = "install_ddn_lustre_client.sh"
  }
}

locals {
  split_mount_cmd               = split(" ", module.ddn_exascaler.mount_command)
  split_mount_cmd_wo_mountpoint = slice(local.split_mount_cmd, 0, length(local.split_mount_cmd) - 1)
  mount_cmd                     = "${join(" ", local.split_mount_cmd_wo_mountpoint)} ${var.local_mount}"
  mount_cmd_w_mkdir             = "mkdir -p ${var.local_mount} && ${local.mount_cmd}"
}
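
As a worked illustration of the `locals` block above, the sketch below runs the same split, slice, and join steps on a hypothetical upstream mount command. The IP address, file system name, upstream mount point, and every `example_*` name are assumptions made for illustration; the real value of `module.ddn_exascaler.mount_command` comes from the DDN module and may differ in form. The logic assumes the mount point is the final space-separated token of the upstream command.

```hcl
locals {
  # Hypothetical stand-ins for module.ddn_exascaler.mount_command and var.local_mount.
  example_upstream_cmd = "mount -t lustre 10.0.0.2@tcp:/exacloud /mnt/exacloud"
  example_local_mount  = "/scratch"

  # Tokenize the command and drop the final token (the upstream mount point).
  example_tokens  = split(" ", local.example_upstream_cmd)
  example_trimmed = slice(local.example_tokens, 0, length(local.example_tokens) - 1)

  # Re-join the remaining tokens and append the desired local mount point.
  example_mount_cmd = "${join(" ", local.example_trimmed)} ${local.example_local_mount}"

  # Create the mount point before mounting.
  example_mount_cmd_w_mkdir = "mkdir -p ${local.example_local_mount} && ${local.example_mount_cmd}"
}

output "example_mount_command" {
  # Evaluates to:
  # mkdir -p /scratch && mount -t lustre 10.0.0.2@tcp:/exacloud /scratch
  value = local.example_mount_cmd_w_mkdir
}
```

Because the command is split on single spaces, an upstream command with trailing whitespace or a mount point containing spaces would not be handled by this approach.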

output "mount_command" {
-  description = "Command to mount the file system."
-  value       = module.ddn_exascaler.mount_command
+  description = "Command to mount the file system. `client_config` script must be run first."
+  value       = local.mount_cmd_w_mkdir
}

output "mount_runner" {
  description = "Runner to mount the DDN EXAScaler Lustre file system"
  value = {
    "type"        = "shell"
    "content"     = local.mount_cmd_w_mkdir
    "destination" = "mount-ddn-lustre.sh"
  }
}

output "http_console" {
  description = "HTTP address to access the system web console."
  value       = module.ddn_exascaler.http_console
}


output "network_storage" {
  description = "Describes a EXAScaler system to be mounted by other systems."
  value = {
20 changes: 14 additions & 6 deletions tools/cloud-build/daily-tests/blueprints/lustre-with-new-vpc.yaml
@@ -41,15 +41,18 @@ deployment_groups:
    settings:
      local_mount: /home

  # Explicitly picking the local version of the module
  - id: scratchfs
    source: community/modules/file-system/DDN-EXAScaler
    kind: terraform
    use: [network1]
    settings:
      local_mount: /scratch
      network_self_link: $(network1.network_self_link)
      subnetwork_self_link: $(network1.subnetwork_self_link)
      subnetwork_address: $(network1.subnetwork_address)

  - id: mount-exascaler
    source: modules/scripts/startup-script
    settings:
      runners:
      - $(scratchfs.install_ddn_lustre_client_runner)
      - $(scratchfs.mount_runner)

  # Create a separate workstation to catch regressions in vm-instance
  - id: workstation
@@ -58,11 +61,16 @@ deployment_groups:
    use:
    - network1
    - homefs
    - scratchfs
    - mount-exascaler
    settings:
      name_prefix: test-workstation
      machine_type: c2-standard-4

  - id: wait0
    source: ./community/modules/scripts/wait-for-startup
    settings:
      instance_name: ((module.workstation.name[0]))

  - id: compute_partition
    source: ./community/modules/compute/SchedMD-slurm-on-gcp-partition
    kind: terraform