Refactor mount/mode setting for local SSD RAID #3214

Merged
44 changes: 28 additions & 16 deletions modules/scripts/startup-script/files/setup-raid.yml
@@ -53,10 +53,11 @@
[Unit]
After=local-fs.target
Before=slurmd.service
ConditionPathIsMountPoint=!{{ mountpoint }}
ConditionPathExists=!{{ array_dev }}

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/bash -c "/usr/sbin/mdadm --create {{ array_dev }} --name={{ raid_name }} --homehost=any --level=0 --raid-devices={{ local_ssd_devices.files | length }} /dev/disk/by-id/google-local-nvme-ssd-*{{ " --force" if local_ssd_devices.files | length == 1 else "" }}"
ExecStartPost=/usr/sbin/mkfs -t {{ fstype }}{{ " -m 0" if fstype == "ext4" else "" }} {{ array_dev }}

@@ -70,19 +71,30 @@
       enabled: true
       daemon_reload: true

-  - name: Mount RAID array
-    ansible.posix.mount:
-      src: "{{ array_dev }}"
-      path: "{{ mountpoint }}"
-      fstype: "{{ fstype }}"
-      # the nofail option is critical as it enables the boot process to complete on machines
-      # that were powered off and had local SSD contents discarded; without this option
-      # VMs may fail to join the network
-      opts: discard,defaults,nofail
-      state: mounted
+  - name: Install service to mount local SSD array
+    ansible.builtin.copy:
+      dest: /etc/systemd/system/mount-localssd-raid.service
+      mode: 0644
+      content: |
+        [Unit]
+        After=local-fs.target create-localssd-raid.service
+        Before=slurmd.service
+        Wants=create-localssd-raid.service
+        ConditionPathIsMountPoint=!{{ mountpoint }}

-  - name: Set mount permissions
-    ansible.builtin.file:
-      path: "{{ mountpoint }}"
-      state: directory
-      mode: "{{ mode }}"
+        [Service]
+        Type=oneshot
+        RemainAfterExit=yes
+        ExecStart=/usr/bin/systemd-mount -t {{ fstype }} -o discard,defaults,nofail {{ array_dev }} {{ mountpoint }}
+        ExecStartPost=/usr/bin/chmod {{ mode }} {{ mountpoint }}
+        ExecStop=/usr/bin/systemd-umount {{ mountpoint }}
+
+        [Install]
+        WantedBy=slurmd.service
+
+  - name: Mount RAID array and set permissions
+    ansible.builtin.systemd:
+      name: mount-localssd-raid.service
+      state: started
+      enabled: true
+      daemon_reload: true
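
As a reading aid, here is a minimal sketch of how the new `mount-localssd-raid.service` unit might render once the template variables are substituted. The values `ext4`, `/dev/md/localssd`, `/mnt/localssd`, and `0755` are hypothetical placeholders chosen for illustration. The real values come from the playbook's variables, which this diff does not show.

```ini
# Hypothetical rendering: fstype=ext4, array_dev=/dev/md/localssd,
# mountpoint=/mnt/localssd and mode=0755 are assumed values, not taken from the PR.
[Unit]
After=local-fs.target create-localssd-raid.service
Before=slurmd.service
Wants=create-localssd-raid.service
# Skip this unit when the mountpoint is already mounted.
ConditionPathIsMountPoint=!/mnt/localssd

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/systemd-mount -t ext4 -o discard,defaults,nofail /dev/md/localssd /mnt/localssd
ExecStartPost=/usr/bin/chmod 0755 /mnt/localssd
ExecStop=/usr/bin/systemd-umount /mnt/localssd

[Install]
WantedBy=slurmd.service
```

Since the unit is enabled with `WantedBy=slurmd.service` and ordered `Before=slurmd.service`, starting `slurmd.service` should pull in the mount and the permission change at boot, not only when the playbook runs.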