Fix redeploy regressions (#97)
* Fix lvm disks; support separate root volume.

* Fix 'Create a volume group' when root volume is defined.

* Fix redeploy regressions

+ Tolerate GCP not renaming disks
  + When redeploying and moving the disks (`_scheme_rmvm_keepdisk_rollback`), we cannot rename the disks in GCP to match the new host. Because of this, we must make allowances:
    + For non-lvm, we must not add `device_name` to the `.clusterversetest__` test files for GCP.
    + For lvm, we must not attempt to create the volume groups if the `device_name` is not found in the blockdevmap return (because the names won't match the hosts).
  + Shorten device names to accommodate GCP's 63-character limit.
  + Update blockdevmap.py from upstream to fix missing GCP parameters.
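For illustration, the naming scheme and the 63-character constraint can be sketched in plain Python. This is not the role's actual code (the helper name and the hard-failure behaviour are assumptions); it just shows the shape of the names the role generates:

```python
import os

# Illustrative sketch only: the role builds GCP disk names as
# hostname + '--' + basename(mountpoint), with a '-dN' index suffix for lvm
# volumes. GCP rejects disk names longer than 63 characters.
GCP_MAX_DISK_NAME_LEN = 63

def gcp_disk_name(hostname, mountpoint, lvm_index=None):
    name = f"{hostname}--{os.path.basename(mountpoint)}"
    if lvm_index is not None:
        name += f"-d{lvm_index}"  # per-volume index, used for lvm volumes
    if len(name) > GCP_MAX_DISK_NAME_LEN:
        raise ValueError(f"{name!r} exceeds GCP's {GCP_MAX_DISK_NAME_LEN}-character limit")
    return name

print(gcp_disk_name("test-sys-a0-1624967404", "/media/mysvc", lvm_index=1))
# → test-sys-a0-1624967404--mysvc-d1
```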

+ Redeploy failing due to undefined lvm variables on previous (redeploying) hosts in `config/tasks/disks_auto_aws_gcp_azure.yml`. Define defaults for these when unset, and add lvm tests.
  + Importantly though, we shouldn't be running the playbook on the previous hosts at all. We considered adding only `cluster_hosts_target` to the inventory, but doing so could break application roles that need the entire inventory when redeploying (previous hosts are still part of the cluster for some of the redeploy). Instead, create a new inventory group `not_target_hosts`, containing hosts that are _not_ part of `cluster_hosts_target`, which can be excluded in the main `cluster.yml` hosts section:
```
- name: clusterverse | Configure the cluster
  hosts: all:!not_target_hosts
```
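The group-partitioning logic itself is simple; a hypothetical sketch (function and field names are assumptions, not from the commit) of how `not_target_hosts` could be derived from the full inventory:

```python
# Hypothetical sketch: partition the inventory so cluster.yml can target
# "all:!not_target_hosts", while previous-generation hosts stay in the
# inventory for application roles that need the whole cluster.
def build_not_target_hosts(all_hosts, cluster_hosts_target):
    """Return inventory hosts that are not in cluster_hosts_target."""
    target_names = {h["hostname"] for h in cluster_hosts_target}
    return [h for h in all_hosts if h not in target_names]

# During a redeploy the previous hosts remain in the inventory but are
# excluded from configuration.
all_hosts = ["app-a0-1111", "app-b0-1111", "app-a0-2222", "app-b0-2222"]
cluster_hosts_target = [{"hostname": "app-a0-2222"}, {"hostname": "app-b0-2222"}]
print(build_not_target_hosts(all_hosts, cluster_hosts_target))
# → ['app-a0-1111', 'app-b0-1111']
```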

+ Don't fail the Jenkinsfile when parameters are missing (i.e. on the first run of a multibranch pipeline); just exit instead (prevents GitHub recording a failure against a PR on its first run).
dseeley-sky authored Jul 30, 2021
1 parent 7cddc64 commit 391ae1c
Showing 13 changed files with 142 additions and 115 deletions.
1 change: 1 addition & 0 deletions EXAMPLE/README.md
@@ -46,6 +46,7 @@ ansible-playbook cluster.yml -e buildenv=sandbox -e clusterid=test_gcp_euw1 --va
+ `-e metricbeat_install=false` - Does not install metricbeat
+ `-e wait_for_dns=false` - Does not wait for DNS resolution
+ `-e create_gcp_network=true` - Create GCP network and subnetwork (probably needed if creating from scratch and using public network)
+ `-e delete_gcp_network_on_clean=true` - Delete GCP network and subnetwork when run with `-e clean=_all_`
+ `-e debug_nested_log_output=true` - Show the log output from nested calls to embedded Ansible playbooks (i.e. when redeploying)
+ `-e cluster_vars_override='{"sandbox":{"hosttype_vars":{"sys":{"vms_by_az":{"b":1,"c":1,"d":0}}}}}'` - Ability to override cluster_vars dictionary elements from the command line. NOTE: there must be NO SPACES in this string.

2 changes: 1 addition & 1 deletion EXAMPLE/cluster.yml
@@ -19,7 +19,7 @@
tasks: [ {wait_for_connection: "", tags: ["always"] } ]

- name: clusterverse | Configure the cluster
hosts: all
hosts: all:!not_target_hosts
tasks: [ { include_role: { name: "clusterverse/config", apply: { tags: ["clusterverse_config"]} }, tags: ["clusterverse_config"] } ]


@@ -1,5 +1,8 @@
---

_ubuntu2004image: "ami-03caf24deed650e2c" # eu-west-1 20.04, amd64, hvm-ssd, 20210621. Ubuntu images can be located at https://cloud-images.ubuntu.com/locator/
_centos7image: "ami-01f35c358a86763b2" # eu-west-1 CentOS 7 by Banzai Cloud
_alma8image: "ami-05d7345cebf7a784f" # eu-west-1 Official AlmaLinux 8.x OS image

cluster_vars:
image: "ami-03caf24deed650e2c" # eu-west-1 20.04 amd64 hvm-ssd 20210621. Ubuntu images can be located at https://cloud-images.ubuntu.com/locator/
# image: "ami-0b850cf02cc00fdc8" # eu-west-1, CentOS7
image: "{{_ubuntu2004image}}"
@@ -48,11 +48,12 @@ cluster_vars:
version: "{{sysdisks_version | default('')}}"
vms_by_az: { a: 1, b: 1, c: 0 }

sysdisks3:
sysdiskslvm:
auto_volumes:
- { device_name: "/dev/sda1", mountpoint: "/", fstype: "ext4", volume_type: "gp2", volume_size: 8, encrypted: True, delete_on_termination: true }
- { device_name: "/dev/sdf", mountpoint: "/media/mysvc", fstype: "ext4", volume_type: "gp2", volume_size: 1, encrypted: True, delete_on_termination: true }
- { device_name: "/dev/sdg", mountpoint: "/media/mysvc2", fstype: "ext4", volume_type: "gp2", volume_size: 1, encrypted: True, delete_on_termination: true }
- { device_name: "/dev/sdh", mountpoint: "/media/mysvc3", fstype: "ext4", volume_type: "gp2", volume_size: 1, encrypted: True, delete_on_termination: true }
- { device_name: "/dev/sdg", mountpoint: "/media/mysvc", fstype: "ext4", volume_type: "gp2", volume_size: 1, encrypted: True, delete_on_termination: true }
lvmparams: { vg_name: "vg0", lv_name: "lv0", lv_size: "100%VG" }
flavor: t3a.nano
version: "{{sysdisks_version | default('')}}"
vms_by_az: { a: 1, b: 1, c: 0 }
7 changes: 5 additions & 2 deletions EXAMPLE/cluster_defs/gcp/cluster_vars__cloud.yml
@@ -1,8 +1,11 @@
---

_ubuntu2004image: "projects/ubuntu-os-cloud/global/images/ubuntu-2004-focal-v20210702"
_centos7image: "projects/centos-cloud/global/images/centos-7-v20210701"
_alma8image: "projects/almalinux-cloud/global/images/almalinux-8-v20210701"

cluster_vars:
image: "projects/ubuntu-os-cloud/global/images/ubuntu-2004-focal-v20210623" # Ubuntu images can be located at https://cloud-images.ubuntu.com/locator/
# image: "projects/ubuntu-os-cloud/global/images/centos-7-v20201216
image: "{{_ubuntu2004image}}"
dns_cloud_internal_domain: "c.{{ (_gcp_service_account_rawtext | string | from_json).project_id }}.internal" # The cloud-internal zone as defined by the cloud provider (e.g. GCP, AWS)
dns_server: "clouddns" # Specify DNS server. nsupdate, route53 or clouddns. If empty string is specified, no DNS will be added.
assign_public_ip: "no"
@@ -40,12 +40,13 @@ cluster_vars:
version: "{{sysdisks_version | default('')}}"
vms_by_az: { d: 1, b: 1, c: 0 }

sysdisks3:
sysdiskslvm:
auto_volumes:
- { auto_delete: true, interface: "SCSI", volume_size: 1, mountpoint: "/media/mysvc", fstype: "ext4" }
- { auto_delete: true, interface: "SCSI", volume_size: 3, mountpoint: "/media/mysvc2", fstype: "ext4" }
- { auto_delete: true, interface: "SCSI", volume_size: 1, mountpoint: "/media/mysvc3", fstype: "ext4" }
- { auto_delete: true, interface: "SCSI", volume_size: 1, mountpoint: "/media/mysvc", fstype: "ext4" }
lvmparams: { vg_name: "vg0", lv_name: "lv0", lv_size: "100%VG" }
flavor: "e2-micro"
rootvol_size: "25" # This is optional, and if set, MUST be bigger than the original image size (20GB on GCP)
version: "{{sysdisks_version | default('')}}"
vms_by_az: { d: 1, b: 1, c: 0 }

3 changes: 3 additions & 0 deletions _dependencies/library/blockdevmap.py
@@ -401,6 +401,9 @@ class cGCPMapper(cBlockDevMap):
def __init__(self, **kwds):
super(cGCPMapper, self).__init__(**kwds)

for os_device in self.device_map:
os_device.update({"device_name_cloud": os_device['SERIAL']})


class cAwsMapper(cBlockDevMap):
def __init__(self, **kwds):
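The `cGCPMapper` addition above works because, on GCP, the SCSI serial number a guest sees for a persistent disk is the device name assigned at attach time, so copying `SERIAL` into `device_name_cloud` lets the role match OS devices to cloud volumes. A standalone sketch of that loop (the `device_map` entries are illustrative):

```python
# Standalone sketch of the cGCPMapper change: on GCP a persistent disk's
# SERIAL, as seen by the guest OS, is the device name given at attach time,
# so it can be copied into device_name_cloud for matching against the
# expected per-host volume names.
device_map = [
    {"NAME": "sda", "SERIAL": "persistent-disk-0"},
    {"NAME": "sdb", "SERIAL": "myhost--mysvc-d1"},
]

for os_device in device_map:
    os_device.update({"device_name_cloud": os_device["SERIAL"]})

print(device_map[1]["device_name_cloud"])
# → myhost--mysvc-d1
```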
2 changes: 1 addition & 1 deletion _dependencies/tasks/main.yml
@@ -79,7 +79,7 @@
gcp_credentials_file: "gcp__{{ (cluster_vars[buildenv].gcp_service_account_rawtext if cluster_vars[buildenv].gcp_service_account_rawtext|type_debug == 'dict' else cluster_vars[buildenv].gcp_service_account_rawtext | string | from_json).project_id }}.json"
when: gcp_credentials_file is not defined

- name: dynamic_inventory | stat the gcp_credentials_file
- name: stat the gcp_credentials_file
stat: path={{gcp_credentials_file}}
register: r__stat_gcp_credentials_file

4 changes: 2 additions & 2 deletions clean/tasks/gcp.yml
@@ -51,12 +51,12 @@
project: "{{cluster_vars[buildenv].vpc_host_project_id}}"
with_items: "{{ cluster_vars.firewall_rules }}"

- name: clean/gcp | Delete the GCP network (if -e create_gcp_network=true)
- name: clean/gcp | Delete the GCP network (if -e delete_gcp_network=true)
gcp_compute_network:
name: "{{cluster_vars[buildenv].vpc_network_name}}"
auth_kind: "serviceaccount"
service_account_file: "{{gcp_credentials_file}}"
project: "{{cluster_vars[buildenv].vpc_host_project_id}}"
state: absent
when: create_gcp_network is defined and create_gcp_network|bool
when: delete_gcp_network is defined and delete_gcp_network|bool
when: clean is defined and clean == '_all_'
7 changes: 5 additions & 2 deletions cluster_hosts/tasks/get_cluster_hosts_target_gcp.yml
@@ -6,8 +6,11 @@
{%- for host in cluster_hosts_target -%}
{%- for vol in host.auto_volumes -%}
{%- if 'device_name' not in vol -%}
{%- set _dummy = vol.update({'device_name': host.hostname + '--' + vol.mountpoint | basename }) -%}
{%- set _dummy = vol.update({'initialize_params': {'disk_name': vol.device_name, 'disk_size_gb': vol.volume_size}}) -%}
{%- if 'lvmparams' in cluster_vars[buildenv].hosttype_vars[host.hosttype] -%}
{%- set lvm_device_index = '-d' + loop.index|string -%}
{%- endif -%}
{%- set _dummy = vol.update({'device_name': host.hostname + '--' + vol.mountpoint | basename + lvm_device_index|default('') }) -%}
{%- set _dummy = vol.update({'initialize_params': {'disk_name': vol.device_name, 'disk_size_gb': vol.volume_size }}) -%}
{%- endif -%}
{%- endfor %}
{%- endfor %}
114 changes: 60 additions & 54 deletions config/tasks/disks_auto_aws_gcp.yml
@@ -61,10 +61,10 @@

- name: disks_auto_aws_gcp | Check that we haven't mounted disks in the wrong place. Especially useful for redeploys when we're moving disks.
block:
- name: "disks_auto_aws_gcp | Touch a file with the mountpoint and device name for testing that disk attachment is correct. Note: Use a unique filename here instead of writing to a file, so that more than one file per device is an error."
- name: "disks_auto_aws_gcp | Touch a file with the mountpoint and device name for testing that disk attachment is correct. Note: Use a unique filename here instead of writing to a file, so that more than one file per device is an error. Note: don't add device_name for GCP, because we can't rename the disks when redeploying and keeping disks (_scheme_rmvm_keepdisk_rollback)"
become: yes
file:
path: "{{item.mountpoint}}/.clusterversetest__{{inventory_hostname | regex_replace('-(?!.*-).*')}}__{{ item.mountpoint | regex_replace('\\/', '_') }}__{{ item.device_name | regex_replace('\/', '_') }}"
path: "{{item.mountpoint}}/.clusterversetest__{{inventory_hostname | regex_replace('-(?!.*-).*')}}__{{ item.mountpoint | regex_replace('\\/', '_') }}{%- if cluster_vars.type != 'gcp'-%}__{{ item.device_name | regex_replace('\/', '_') }}{%- endif -%}"
state: touch
loop: "{{auto_vols}}"

@@ -110,62 +110,68 @@
become: yes
register: r__blockdevmap

- name: disks_auto_aws_gcp/lvm | r__blockdevmap (pre-filesystem create)
- name: disks_auto_aws_gcp/lvm | r__blockdevmap (pre raid create)
debug: msg={{r__blockdevmap}}

- name: disks_auto_aws_gcp/lvm | Create a volume group from all block devices
become: yes
lvg:
vg: "{{ lvmparams.vg_name }}"
pvs: "{{ r__blockdevmap.device_map | json_query(\"[?device_name_cloud && contains('\" + auto_vol_device_names + \"', device_name_cloud)].device_name_os\") | join(',')}}"
vars:
auto_vol_device_names: "{{raid_vols | map(attribute='device_name') | sort | join(',')}}"

- name: disks_auto_aws_gcp/lvm | Create a logical volume from volume group
become: yes
lvol:
vg: "{{ lvmparams.vg_name }}"
lv: "{{ lvmparams.lv_name }}"
size: "{{ lvmparams.lv_size }}"

- name: disks_auto_aws_gcp/lvm | Create filesystem(s) on attached volume(s)
become: yes
filesystem:
fstype: "{{ raid_vols[0].fstype }}"
dev: "/dev/{{ lvmparams.vg_name }}/{{ lvmparams.lv_name }}"
force: no
- block:
- name: disks_auto_aws_gcp/lvm | raid_vols_devices
debug: msg={{ raid_vols_devices }}

- name: disks_auto_aws_gcp/lvm | Mount created filesytem(s) persistently
become: yes
mount:
path: "{{ raid_vols[0].mountpoint }}"
src: "/dev/{{ lvmparams.vg_name }}/{{ lvmparams.lv_name }}"
fstype: "{{ raid_vols[0].fstype }}"
state: mounted
opts: _netdev

- name: disks_auto_aws_gcp/lvm | Check that we haven't mounted disks in the wrong place. Especially useful for redeploys when we're moving disks.
block:
- name: "disks_auto_aws_gcp/lvm | Touch a file with the mountpoint for testing that disk attachment is correct. Note: Use a unique filename here instead of writing to a file, so that more than one file per device is an error."
- name: disks_auto_aws_gcp/lvm | Create a volume group from all block devices
become: yes
file:
path: "{{ raid_vols[0].mountpoint }}/.clusterversetest__{{inventory_hostname | regex_replace('-(?!.*-).*')}}__{{ raid_vols[0].mountpoint | regex_replace('\\/', '_') }}"
state: touch

- name: disks_auto_aws_gcp/lvm | Find all .clusterversetest__ files in mounted disks
find:
paths: "{{ raid_vols[0].mountpoint }}"
hidden: yes
patterns: ".clusterversetest__*"
register: r__find_test

- debug: msg={{r__find_test}}
lvg:
vg: "{{ lvmparams.vg_name }}"
pvs: "{{ raid_vols_devices | map(attribute='device_name_os') | sort | join(',') }}"

- name: disks_auto_aws_gcp/lvm | Create a logical volume from volume group
become: yes
lvol:
vg: "{{ lvmparams.vg_name }}"
lv: "{{ lvmparams.lv_name }}"
size: "{{ lvmparams.lv_size }}"

- name: disks_auto_aws_gcp/lvm | Create filesystem(s) on attached volume(s)
become: yes
filesystem:
fstype: "{{ raid_vols[0].fstype }}"
dev: "/dev/{{ lvmparams.vg_name }}/{{ lvmparams.lv_name }}"
force: no

- name: disks_auto_aws_gcp/lvm | Mount created filesytem(s) persistently
become: yes
mount:
path: "{{ raid_vols[0].mountpoint }}"
src: "/dev/{{ lvmparams.vg_name }}/{{ lvmparams.lv_name }}"
fstype: "{{ raid_vols[0].fstype }}"
state: mounted
opts: _netdev

- name: disks_auto_aws_gcp/lvm | Check that we haven't mounted disks in the wrong place. Especially useful for redeploys when we're moving disks.
block:
- name: "disks_auto_aws_gcp/lvm | Touch a file with the mountpoint for testing that disk attachment is correct. Note: Use a unique filename here instead of writing to a file, so that more than one file per device is an error."
become: yes
file:
path: "{{ raid_vols[0].mountpoint }}/.clusterversetest__{{inventory_hostname | regex_replace('-(?!.*-).*')}}__{{ raid_vols[0].mountpoint | regex_replace('\\/', '_') }}"
state: touch

- name: disks_auto_aws_gcp/lvm | Find all .clusterversetest__ files in mounted disks
find:
paths: "{{ raid_vols[0].mountpoint }}"
hidden: yes
patterns: ".clusterversetest__*"
register: r__find_test

- debug: msg={{r__find_test}}

- name: disks_auto_aws_gcp/lvm | assert that only one device descriptor file exists per disk (otherwise, indicates that this run has mapped either more than one device per mount, or a different one to previous)
assert: { that: "'files' in r__find_test != '' and r__find_test.files | length == 1", fail_msg: "ERROR - Exactly one file should exist per LVM." }
when: test_touch_disks is defined and test_touch_disks|bool
vars:
raid_vols_devices: "{{ r__blockdevmap.device_map | json_query(\"[?device_name_cloud && contains('\" + (raid_vols | map(attribute='device_name') | sort | join(',')) + \"', device_name_cloud)]\") }}"
when: raid_vols_devices | length

- name: disks_auto_aws_gcp/lvm | assert that only one device descriptor file exists per disk (otherwise, indicates that this run has mapped either more than one device per mount, or a different one to previous)
assert: { that: "'files' in r__find_test != '' and r__find_test.files | length == 1", fail_msg: "ERROR - Exactly one file should exist per LVM." }
when: test_touch_disks is defined and test_touch_disks|bool
when: (lvmparams is defined and lvmparams != '') and (raid_vols | map(attribute='mountpoint') | list | unique | count == 1) and (raid_vols | map(attribute='mountpoint') | list | count >= 2) and (raid_vols | map(attribute='fstype') | list | unique | count == 1)
when: (lvmparams is defined and lvmparams != {}) and (raid_vols | map(attribute='mountpoint') | list | unique | count == 1) and (raid_vols | map(attribute='mountpoint') | list | count >= 2) and (raid_vols | map(attribute='fstype') | list | unique | count == 1)
vars:
_hosttype_vars: "{{ cluster_hosts_target | json_query(\"[?hostname == '\" + inventory_hostname + \"'] | [0]\") }}"
raid_vols: "{{ _hosttype_vars.auto_volumes | selectattr('mountpoint', '!=', '/')}}"
lvmparams: "{{ cluster_vars[buildenv].hosttype_vars[_hosttype_vars.hosttype].lvmparams | default('') }}"
raid_vols: "{{ (_hosttype_vars.auto_volumes | selectattr('mountpoint', '!=', '/') | default([])) if _hosttype_vars.auto_volumes is defined else [] }}"
lvmparams: "{{ (cluster_vars[buildenv].hosttype_vars[_hosttype_vars.hosttype].lvmparams | default({})) if _hosttype_vars.hosttype is defined else {} }}"
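The `raid_vols_devices` json_query in the vars above is dense; in plain Python the same guard (with hypothetical data, and field names taken from blockdevmap's return) looks roughly like this. Note that json_query's `contains()` is a substring test against the joined name list, which the sketch mirrors:

```python
# Rough Python equivalent of the raid_vols_devices json_query guard: keep only
# mapped devices whose cloud-side name appears in the joined list of expected
# volume names (a substring test, mirroring json_query's contains()), and skip
# volume-group creation when none match — e.g. after a GCP redeploy where the
# disks could not be renamed to match the new host.
def filter_raid_vols_devices(device_map, raid_vols):
    expected = ",".join(sorted(v["device_name"] for v in raid_vols))
    return [d for d in device_map
            if d.get("device_name_cloud") and d["device_name_cloud"] in expected]

device_map = [
    {"device_name_os": "/dev/sdf", "device_name_cloud": "host-a--mysvc-d1"},
    {"device_name_os": "/dev/sdg", "device_name_cloud": "stale-host--mysvc-d2"},
]
raid_vols = [{"device_name": "host-a--mysvc-d1"}, {"device_name": "host-a--mysvc-d2"}]

matched = filter_raid_vols_devices(device_map, raid_vols)
print([d["device_name_os"] for d in matched])
# → ['/dev/sdf']
if not matched:
    print("skipping volume-group creation: device names do not match this host")
```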
