diff --git a/.codespellrc b/.codespellrc
new file mode 100644
index 0000000000..5aa4b5e75b
--- /dev/null
+++ b/.codespellrc
@@ -0,0 +1,3 @@
+[codespell]
+skip = .git,*.pdf,*.svg
+# ignore-words-list =
diff --git a/.github/workflows/codespell.yml b/.github/workflows/codespell.yml
new file mode 100644
index 0000000000..5768d7c636
--- /dev/null
+++ b/.github/workflows/codespell.yml
@@ -0,0 +1,19 @@
+---
+name: Codespell
+
+on:
+  push:
+    branches: [master]
+  pull_request:
+    branches: [master]
+
+jobs:
+  codespell:
+    name: Check for spelling errors
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v3
+      - name: Codespell
+        uses: codespell-project/actions-codespell@v1
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 1338d9da8e..a7199a2371 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -206,7 +206,7 @@ All notable changes to this project will be documented in this file.
- Fix potential race condition in loading BQ job data.
- Remove deployment manager support.
- Update Nvidia to 470.82.01 and CUDA to 11.4.4
-- Reenable gcsfuse in ansible and workaround the repo gpg check problem
+- Re-enable gcsfuse in ansible and workaround the repo gpg check problem
## \[4.1.5\]
@@ -257,7 +257,7 @@ All notable changes to this project will be documented in this file.
## \[4.0.4\]
-- Configure sockets, cores, threads on compute nodes for better performace with
+- Configure sockets, cores, threads on compute nodes for better performance with
`cons_tres`.
## \[4.0.3\]
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 809a61f773..73b9e94226 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -22,7 +22,7 @@ If you make an automated change (changing a function name, fixing a pervasive
spelling mistake), please send the command/regex used to generate the changes
along with the patch, or note it in the commit message.
-While not required, we encourage use of `git format-patch` to geneate the patch.
+While not required, we encourage use of `git format-patch` to generate the patch.
This ensures the relevant author line and commit message stay attached. Plain
`diff`'d output is also okay. In either case, please attach them to the bug for
us to review. Spelling corrections or documentation improvements can be
diff --git a/README.md b/README.md
index 1f38081a5a..4dfa95a1ca 100644
--- a/README.md
+++ b/README.md
@@ -43,7 +43,7 @@ to help you get up and running and stay running.
Issues and/or enhancement requests can be submitted to
[SchedMD's Bugzilla](https://bugs.schedmd.com).
-Also, join comunity discussions on either the
+Also, join community discussions on either the
[Slurm User mailing list](https://slurm.schedmd.com/mail.html) or the
[Google Cloud & Slurm Community Discussion Group](https://groups.google.com/forum/#!forum/google-cloud-slurm-discuss).
diff --git a/ansible/playbook.yml b/ansible/playbook.yml
index 06126fce7c..5d2737ff5f 100644
--- a/ansible/playbook.yml
+++ b/ansible/playbook.yml
@@ -47,7 +47,7 @@
msg: >
OS ansible_distribution version ansible_distribution_major_version is not
supported.
- Please use a suported OS in list:
+ Please use a supported OS in list:
- RHEL 7,8
- CentOS 7,8
- Debian 10
diff --git a/ansible/roles/lustre/tasks/main.yml b/ansible/roles/lustre/tasks/main.yml
index 28644c526a..d53ef43cb7 100644
--- a/ansible/roles/lustre/tasks/main.yml
+++ b/ansible/roles/lustre/tasks/main.yml
@@ -13,7 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
-- name: Install Dependancies
+- name: Install Dependencies
package:
name:
- wget
diff --git a/docs/cloud.md b/docs/cloud.md
index 7b41dba122..fba7a768da 100644
--- a/docs/cloud.md
+++ b/docs/cloud.md
@@ -28,7 +28,7 @@ There are two deployment methods for cloud cluster management:
This deployment method leverages
[GCP Marketplace](./glossary.md#gcp-marketplace) to make setting up clusters a
-breeze without leaving your browser. While this method is simplier and less
+breeze without leaving your browser. While this method is simpler and less
flexible, it is great for exploring what `slurm-gcp` is!
See the [Marketplace Guide](./marketplace.md) for setup instructions and more
diff --git a/docs/faq.md b/docs/faq.md
index 8013362ffe..898b78b627 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -90,7 +90,7 @@ extra_logging_flags:
### How do I move data for a job?
-Data can be migrated to and from external sources using a worflow of dependant
+Data can be migrated to and from external sources using a workflow of dependent
jobs. A [workflow submission script](../jobs/submit_workflow.py.py) and
[helper jobs](../jobs/data_migrate/) are provided. See
[README](../jobs/README.md) for more information.
@@ -205,8 +205,8 @@ it may be allocated jobs again.
### How do I limit user access to only using login nodes?
By default, all instances are configured with
-[OS Login](./glossary.md#os-login). This keeps UID and GID of users consistant
-accross all instances and allows easy user control with
+[OS Login](./glossary.md#os-login). This keeps UID and GID of users consistent
+across all instances and allows easy user control with
[IAM Roles](./glossary.md#iam-roles).
1. Create a group for all users in `admin.google.com`.
@@ -229,7 +229,7 @@ accross all instances and allows easy user control with
1. Select boxes for login nodes
1. Add group as a member with the **IAP-secured Tunnel User** role. Please see
[Enabling IAP for Compute Engine](https://cloud.google.com/iap/docs/enabling-compute-howto)
- for mor information.
+ for more information.
### What Slurm image do I use for production?
diff --git a/docs/federation.md b/docs/federation.md
index bad917a705..2d5431cde9 100644
--- a/docs/federation.md
+++ b/docs/federation.md
@@ -115,7 +115,7 @@ please refer to [multiple-slurmdbd](#multiple-slurmdbd) section.
### Additional Requirements
-- User UID and GID are consistant accross all federated clusters.
+- User UID and GID are consistent across all federated clusters.
## Multiple Slurmdbd
diff --git a/docs/hybrid.md b/docs/hybrid.md
index a06e96defb..a0c8379f0e 100644
--- a/docs/hybrid.md
+++ b/docs/hybrid.md
@@ -26,7 +26,7 @@ This guide focuses on setting up a hybrid [Slurm cluster](./glossary.md#slurm).
With hybrid, there are different challenges and considerations that need to be
taken into account. This guide will cover them and their recommended solutions.
-There is a clear seperation of how on-prem and cloud resources are managed
+There is a clear separation of how on-prem and cloud resources are managed
within your hybrid cluster. This means that you can modify either side of the
hybrid cluster without disrupting the other side! You manage your on-prem and
our [Slurm cluster module](../terraform/slurm_cluster/README.md) will manage the
@@ -71,7 +71,7 @@ and terminating nodes in the cloud:
- Creates compute node resources based upon Slurm job allocation and
configured compute resources.
- `slurmsync.py`
- - Synchronizes the Slurm state and the GCP state, reducing discrepencies from
+ - Synchronizes the Slurm state and the GCP state, reducing discrepancies from
manual admin activity or other edge cases.
- May update Slurm node states, create or destroy GCP compute resources or
other script managed GCP resources.
@@ -260,7 +260,7 @@ controller to be able to burst into the cloud.
### Manage Secrets
-Additionally, [MUNGE](./glossary.md#munge) secrets must be consistant across the
+Additionally, [MUNGE](./glossary.md#munge) secrets must be consistent across the
cluster. There are a few safe ways to deal with munge.key distribution:
- Use NFS to mount `/etc/munge` from the controller (default behavior).
@@ -277,7 +277,7 @@ connections to the munge NFS is critical.
- Isolate the cloud compute nodes of the cluster into their own project, VPC,
and subnetworks. Use project or network peering to enable access to other
- cloud infrastructure in a controlled mannor.
+ cloud infrastructure in a controlled manner.
- Setup firewall rules to control ingress and egress to the controller such that
only trusted machines or networks use its NFS.
- Only allow trusted private address (ranges) for communication to the
diff --git a/jobs/README.md b/jobs/README.md
index d4a676b41a..b5574879a3 100644
--- a/jobs/README.md
+++ b/jobs/README.md
@@ -18,7 +18,7 @@ $ sbatch --export=MIGRATE_INPUT=/tmp/seq.txt,MIGRATE_OUTPUT=/tmp/shuffle.txt \
## submit_workflow.py
This script is a runner that submits a sequence of 3 jobs as defined in the
-input structured yaml file. The three jobs submitted can be refered to as:
+input structured yaml file. The three jobs submitted can be referred to as:
`stage_in`; `main`; and `stage_out`. `stage_in` should move data for `main` to
consume. `main` is the main script that may consume and generate data.
`stage_out` should move data generated from `main` to an external location.
diff --git a/scripts/resume.py b/scripts/resume.py
index 06c0779856..bf64cbb0fd 100755
--- a/scripts/resume.py
+++ b/scripts/resume.py
@@ -173,7 +173,7 @@ def create_instances_request(nodes, placement_group, exclusive_job=None):
body.sourceInstanceTemplate = template
labels = dict(slurm_job_id=exclusive_job) if exclusive_job is not None else None
- # overwrites properties accross all instances
+ # overwrites properties across all instances
body.instanceProperties = instance_properties(
partition, model, placement_group, labels
)
diff --git a/terraform/slurm_cluster/README.md b/terraform/slurm_cluster/README.md
index 289a1860fe..12a8861937 100644
--- a/terraform/slurm_cluster/README.md
+++ b/terraform/slurm_cluster/README.md
@@ -38,7 +38,7 @@ use.
Partitions define what compute resources are available to the controller so it
may allocate jobs. Slurm will resume/create compute instances as needed to run
allocated jobs and will suspend/terminate the instances after they are no longer
-needed (e.g. IDLE for SuspendTimeout duration). Static nodes are persistant;
+needed (e.g. IDLE for SuspendTimeout duration). Static nodes are persistent;
they are exempt from being suspended/terminated under normal conditions. Dynamic
nodes are burstable; they will scale up and down with workload.
diff --git a/terraform/slurm_cluster/examples/slurm_instance_template/blank/README.md b/terraform/slurm_cluster/examples/slurm_instance_template/blank/README.md
index 1f27fc919d..a63428c7d3 100644
--- a/terraform/slurm_cluster/examples/slurm_instance_template/blank/README.md
+++ b/terraform/slurm_cluster/examples/slurm_instance_template/blank/README.md
@@ -14,7 +14,7 @@
## Overview
-This exmaple creates a
+This example creates a
[slurm_instance_template](../../../modules/slurm_instance_template/README.md).
It is compatible with:
diff --git a/terraform/slurm_cluster/examples/slurm_instance_template/compute/README.md b/terraform/slurm_cluster/examples/slurm_instance_template/compute/README.md
index c3a297ac52..c903405118 100644
--- a/terraform/slurm_cluster/examples/slurm_instance_template/compute/README.md
+++ b/terraform/slurm_cluster/examples/slurm_instance_template/compute/README.md
@@ -14,7 +14,7 @@
## Overview
-This exmaple creates a
+This example creates a
[slurm_instance_template](../../../modules/slurm_instance_template/README.md)
intended to be used by the
[slurm_partition](../../../modules/slurm_partition/README.md).
diff --git a/terraform/slurm_cluster/examples/slurm_instance_template/controller/README.md b/terraform/slurm_cluster/examples/slurm_instance_template/controller/README.md
index 6cc1aa6c80..763c370063 100644
--- a/terraform/slurm_cluster/examples/slurm_instance_template/controller/README.md
+++ b/terraform/slurm_cluster/examples/slurm_instance_template/controller/README.md
@@ -14,7 +14,7 @@
## Overview
-This exmaple creates a
+This example creates a
[slurm_instance_template](../../../modules/slurm_instance_template/README.md)
intended to be used by the
[slurm_controller_instance](../../../modules/slurm_controller_instance/README.md).
diff --git a/terraform/slurm_cluster/examples/slurm_instance_template/login/README.md b/terraform/slurm_cluster/examples/slurm_instance_template/login/README.md
index 7c1b062ffc..d294a67cc5 100644
--- a/terraform/slurm_cluster/examples/slurm_instance_template/login/README.md
+++ b/terraform/slurm_cluster/examples/slurm_instance_template/login/README.md
@@ -14,7 +14,7 @@
## Overview
-This exmaple creates a
+This example creates a
[slurm_instance_template](../../../modules/slurm_instance_template/README.md)
intended to be used by the
[slurm_login_instance](../../../modules/slurm_login_instance/README.md).
diff --git a/terraform/slurm_cluster/examples/slurm_partition/simple/README.md b/terraform/slurm_cluster/examples/slurm_partition/simple/README.md
index 02d26c5334..a1fe2c09b0 100644
--- a/terraform/slurm_cluster/examples/slurm_partition/simple/README.md
+++ b/terraform/slurm_cluster/examples/slurm_partition/simple/README.md
@@ -14,7 +14,7 @@
## Overview
-This exmaple creates a
+This example creates a
[Slurm partition](../../../modules/slurm_partition/README.md).
## Usage
diff --git a/terraform/slurm_cluster/modules/slurm_controller_hybrid/README_TF.md b/terraform/slurm_cluster/modules/slurm_controller_hybrid/README_TF.md
index 455df6f931..17b8d03f16 100644
--- a/terraform/slurm_cluster/modules/slurm_controller_hybrid/README_TF.md
+++ b/terraform/slurm_cluster/modules/slurm_controller_hybrid/README_TF.md
@@ -86,7 +86,7 @@ limitations under the License.
| [enable\_devel](#input\_enable\_devel) | Enables development mode. Not for production use. | `bool` | `false` | no |
| [enable\_reconfigure](#input\_enable\_reconfigure) | Enables automatic Slurm reconfigure on when Slurm configuration changes (e.g.
slurm.conf.tpl, partition details). Compute instances and resource policies
(e.g. placement groups) will be destroyed to align with new configuration.
NOTE: Requires Python and Google Pub/Sub API.
*WARNING*: Toggling this will impact the running workload. Deployed compute nodes
will be destroyed and their jobs will be requeued. | `bool` | `false` | no |
| [epilog\_scripts](#input\_epilog\_scripts) | List of scripts to be used for Epilog. Programs for the slurmd to execute
on every node when a user's job completes.
See https://slurm.schedmd.com/slurm.conf.html#OPT_Epilog. |
list(object({| `[]` | no |
-| [google\_app\_cred\_path](#input\_google\_app\_cred\_path) | Path to Google Applicaiton Credentials. | `string` | `null` | no |
+| [google\_app\_cred\_path](#input\_google\_app\_cred\_path) | Path to Google Application Credentials. | `string` | `null` | no |
| [install\_dir](#input\_install\_dir) | Directory where the hybrid configuration directory will be installed on the
filename = string
content = string
}))
list(object({| `[]` | no | | [munge\_mount](#input\_munge\_mount) | Remote munge mount for compute and login nodes to acquire the munge.key.
server_ip = string
remote_mount = string
local_mount = string
fs_type = string
mount_options = string
}))
object({|
server_ip = string
remote_mount = string
fs_type = string
mount_options = string
})
{| no |
@@ -95,7 +95,7 @@ limitations under the License.
| [partitions](#input\_partitions) | Cluster partitions as a list. |
"fs_type": "nfs",
"mount_options": "",
"remote_mount": "/etc/munge/",
"server_ip": null
}
list(object({| `[]` | no | | [project\_id](#input\_project\_id) | Project ID to create resources in. | `string` | n/a | yes | | [prolog\_scripts](#input\_prolog\_scripts) | List of scripts to be used for Prolog. Programs for the slurmd to execute
compute_list = list(string)
partition = object({
enable_job_exclusive = bool
enable_placement_groups = bool
network_storage = list(object({
server_ip = string
remote_mount = string
local_mount = string
fs_type = string
mount_options = string
}))
partition_conf = map(string)
partition_name = string
partition_nodes = map(object({
node_count_dynamic_max = number
node_count_static = number
access_config = list(object({
network_tier = string
}))
bandwidth_tier = string
enable_spot_vm = bool
group_name = string
instance_template = string
node_conf = map(string)
spot_instance_config = object({
termination_action = string
})
}))
partition_startup_scripts_timeout = number
subnetwork = string
zone_target_shape = string
zone_policy_allow = list(string)
zone_policy_deny = list(string)
})
}))
list(object({| `[]` | no |
-| [slurm\_bin\_dir](#input\_slurm\_bin\_dir) | Path to directroy of Slurm binary commands (e.g. scontrol, sinfo). If 'null',
filename = string
content = string
}))
list(object({| `[]` | no | | [partition\_conf](#input\_partition\_conf) | Slurm partition configuration as a map.
server_ip = string
remote_mount = string
local_mount = string
fs_type = string
mount_options = string
}))
list(object({| n/a | yes |
+| [partition\_nodes](#input\_partition\_nodes) | Compute nodes contained with this partition.
node_count_static = number
node_count_dynamic_max = number
group_name = string
node_conf = map(string)
additional_disks = list(object({
disk_name = string
device_name = string
disk_size_gb = number
disk_type = string
disk_labels = map(string)
auto_delete = bool
boot = bool
}))
access_config = list(object({
network_tier = string
}))
bandwidth_tier = string
can_ip_forward = bool
disable_smt = bool
disk_auto_delete = bool
disk_labels = map(string)
disk_size_gb = number
disk_type = string
enable_confidential_vm = bool
enable_oslogin = bool
enable_shielded_vm = bool
enable_spot_vm = bool
gpu = object({
count = number
type = string
})
instance_template = string
labels = map(string)
machine_type = string
metadata = map(string)
min_cpu_platform = string
on_host_maintenance = string
preemptible = bool
service_account = object({
email = string
scopes = list(string)
})
shielded_instance_config = object({
enable_integrity_monitoring = bool
enable_secure_boot = bool
enable_vtpm = bool
})
spot_instance_config = object({
termination_action = string
})
source_image_family = string
source_image_project = string
source_image = string
tags = list(string)
}))
list(object({| n/a | yes | | [partition\_startup\_scripts](#input\_partition\_startup\_scripts) | List of scripts to be ran on compute VM startup. |
node_count_static = number
node_count_dynamic_max = number
group_name = string
node_conf = map(string)
additional_disks = list(object({
disk_name = string
device_name = string
disk_size_gb = number
disk_type = string
disk_labels = map(string)
auto_delete = bool
boot = bool
}))
access_config = list(object({
network_tier = string
}))
bandwidth_tier = string
can_ip_forward = bool
disable_smt = bool
disk_auto_delete = bool
disk_labels = map(string)
disk_size_gb = number
disk_type = string
enable_confidential_vm = bool
enable_oslogin = bool
enable_shielded_vm = bool
enable_spot_vm = bool
gpu = object({
count = number
type = string
})
instance_template = string
labels = map(string)
machine_type = string
metadata = map(string)
min_cpu_platform = string
on_host_maintenance = string
preemptible = bool
service_account = object({
email = string
scopes = list(string)
})
shielded_instance_config = object({
enable_integrity_monitoring = bool
enable_secure_boot = bool
enable_vtpm = bool
})
spot_instance_config = object({
termination_action = string
})
source_image_family = string
source_image_project = string
source_image = string
tags = list(string)
}))
list(object({| `[]` | no | | [partition\_startup\_scripts\_timeout](#input\_partition\_startup\_scripts\_timeout) | The timeout (seconds) applied to each script in partition\_startup\_scripts. If
filename = string
content = string
}))