The flux-cluster.yaml blueprint describes a flux-framework cluster in which Flux is deployed as the native resource manager, as described in the Flux Administrator's Guide.

The cluster includes:

  • A management node
  • A login node
  • Four compute nodes, each an instance of the c2-standard-16 machine type

NOTE: Before running this HPC Toolkit example, the Flux Framework GCP Images must be created in your project.

Initial Setup for flux-framework Cluster

Before provisioning any infrastructure in this project, follow the Toolkit guidance to enable APIs and establish minimum resource quotas. In particular, the following APIs should be enabled

Deploy the flux-framework Cluster

Use ghpc to provision the blueprint

ghpc create community/examples/flux-framework --vars project_id=<<PROJECT_ID>>

This will create a directory containing Terraform modules.

Follow the ghpc instructions to deploy the cluster

terraform -chdir=flux-fw-cluster/primary init
terraform -chdir=flux-fw-cluster/primary validate
terraform -chdir=flux-fw-cluster/primary apply
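
The three Terraform steps above can be chained so that a failure in one stops the rest. The following is a minimal sketch, not part of the Toolkit: the function name `deploy_cluster` and the `TERRAFORM` override variable are hypothetical, introduced only so the sequence can be dry-run.

```shell
# Run init, validate, and apply in order, stopping at the first failure.
# TERRAFORM is a hypothetical override (defaults to "terraform"); setting it
# to "echo" prints the commands instead of running them.
deploy_cluster() {
  local dir="${1:-flux-fw-cluster/primary}"
  local tf="${TERRAFORM:-terraform}"
  "$tf" -chdir="$dir" init &&
    "$tf" -chdir="$dir" validate &&
    "$tf" -chdir="$dir" apply
}
```

Calling `deploy_cluster` with no argument uses the flux-fw-cluster/primary directory shown above. `terraform apply` still prompts for confirmation before making changes.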

Connect to the login node

Access the cluster via the login node from the command line.

gcloud compute ssh gfluxfw-login-001

Or via the Google Cloud Console

  1. Open the following URL in a new tab.

    https://console.cloud.google.com/compute

    This will take you to Compute Engine > VM instances in the Google Cloud Console.

    Select the project in which the flux-framework cluster was provisioned.

  2. Click on the SSH button associated with the gfluxfw-login-001 instance to open a window with a terminal into the cluster login node.

Verify the flux-framework Cluster

View the cluster resources

flux resource list

The output will look similar to

     STATE PROPERTIES NNODES   NCORES NODELIST
      free x86-64,e2       1        2 gfluxfw-login-001
      free x86-64,c2       4       32 gfluxfw-compute-[001-004]
 allocated                 0        0 
      down                 0        0 
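
To check the listing from a script rather than by eye, the free node count for a given property can be summed from this output. This is a sketch that assumes the five-column layout shown above, with a non-empty PROPERTIES column on each free row; the helper name `free_nodes_with` is hypothetical.

```shell
# Sum the NNODES column over "free" rows that carry the given property.
# Reads `flux resource list` output on stdin; assumes the column layout
# STATE PROPERTIES NNODES NCORES NODELIST.
free_nodes_with() {
  awk -v prop="$1" '
    $1 == "free" && NF >= 5 {
      n = split($2, props, ",")
      for (i = 1; i <= n; i++) if (props[i] == prop) total += $3
    }
    END { print total + 0 }
  '
}
```

For example, `flux resource list | free_nodes_with c2` would print the number of free compute nodes, which should be 4 on a freshly deployed cluster.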

Run a simple job that executes the hostname command on each of the cluster compute nodes

flux run -N4 --requires=c2 hostname

The output will be something like

gfluxfw-compute-001
gfluxfw-compute-004
gfluxfw-compute-003
gfluxfw-compute-002
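
The hostnames arrive in whatever order the tasks finish, so a scripted check should count distinct names rather than compare line by line. A minimal sketch, with `distinct_hosts` as a hypothetical helper name:

```shell
# Count the distinct lines (hostnames) read from stdin, ignoring order.
distinct_hosts() {
  sort -u | wc -l | tr -d '[:space:]'
}
```

With this helper, `flux run -N4 --requires=c2 hostname | distinct_hosts` should print 4 when each task landed on a different compute node.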

Create a two node allocation

flux alloc -N2 --requires=c2

View the resources associated with the allocation

flux resource list

The output will look similar to

     STATE PROPERTIES NNODES   NCORES NODELIST
      free x86-64,c2       2       16 gfluxfw-compute-[003-004]
 allocated                 0        0 
      down                 0        0 

Observe the impact on cluster resources

flux --parent resource list

The output will look similar to

     STATE PROPERTIES NNODES   NCORES NODELIST
      free x86-64,e2       1        2 gfluxfw-login-001
      free x86-64,c2       2       16 gfluxfw-compute-[001-002]
 allocated x86-64,c2       2       16 gfluxfw-compute-[003-004]
      down                 0        0 
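
To pull out just the nodes in a given state, for instance to confirm which nodes the parent instance reports as allocated, the NODELIST column can be filtered by state. This sketch assumes the same five-column layout as the listing above; `nodelist_for_state` is a hypothetical helper name.

```shell
# Print the NODELIST field of every row whose STATE matches the argument.
# Rows without a nodelist (fewer than five fields) are skipped.
nodelist_for_state() {
  awk -v state="$1" '$1 == state && NF >= 5 { print $5 }'
}
```

Inside the allocation, `flux --parent resource list | nodelist_for_state allocated` would print the node range held by the allocation.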

Press Ctrl-D to release the resources in the allocation and return to the login node.

Next Steps

To learn how to make the best use of Flux, follow the Introduction to Flux tutorial.