Preservation of OS images when re-running 'epicli apply' #2329

Merged
merged 23 commits into from
May 28, 2021
Changes from 22 commits
Commits
a0f4415
First commit.
seriva May 18, 2021
4dc5b82
Minor updates for AWS.
seriva May 18, 2021
361eae1
Adding addition logic for preserving images.
seriva May 19, 2021
c5cfcb9
Fix minor typo.
seriva May 19, 2021
273d91a
Minor fix for AWS
seriva May 19, 2021
d52a988
Fixing tests.
seriva May 20, 2021
a762eb4
- Implemented a mechanism to use the current OS images for every OS a…
seriva May 21, 2021
f04b479
Fixed Infrastructure/cloud-os-image-defaults to infrastructure/cloud-…
seriva May 21, 2021
855b255
Update docs/home/howto/CLUSTER.md
seriva May 21, 2021
d8a08e1
Update docs/home/howto/CLUSTER.md
seriva May 21, 2021
601ad80
Update docs/home/howto/CLUSTER.md
seriva May 21, 2021
fd6aeb2
Update docs/home/howto/CLUSTER.md
seriva May 21, 2021
978e201
Update core/src/epicli/data/azure/defaults/infrastructure/cloud-os-im…
seriva May 21, 2021
d5faa24
Update core/src/epicli/data/aws/defaults/infrastructure/cloud-os-imag…
seriva May 21, 2021
5103067
- Fixed unit tests again
seriva May 21, 2021
7a19eee
- Removed questions for Azure.
seriva May 27, 2021
e98c60e
- Added documentation.
seriva May 27, 2021
db8e4ce
Update docs/home/howto/UPGRADE.md
seriva May 28, 2021
1074dd8
Update core/src/epicli/cli/engine/providers/aws/InfrastructureBuilder.py
seriva May 28, 2021
d310ec2
Update core/src/epicli/cli/engine/providers/azure/InfrastructureBuild…
seriva May 28, 2021
e698e31
Update docs/home/howto/CLUSTER.md
seriva May 28, 2021
9e3e318
Renamed os_image to default_os_image
seriva May 28, 2021
7e0c865
Fixed documentation.
seriva May 28, 2021
24 changes: 12 additions & 12 deletions core/src/epicli/cli/engine/ApplyEngine.py
@@ -5,7 +5,8 @@

from cli.helpers.Step import Step
from cli.helpers.doc_list_helpers import select_single, select_all
from cli.helpers.build_saver import save_manifest, get_inventory_path
from cli.helpers.build_saver import save_manifest, get_inventory_path, get_manifest_path, get_build_path
from cli.helpers.data_loader import load_manifest_docs
from cli.helpers.yaml_helpers import safe_load_all
from cli.helpers.Log import Log
from cli.helpers.os_images import get_os_distro_normalized
@@ -17,6 +18,7 @@
from cli.engine.terraform.TerraformFileCopier import TerraformFileCopier
from cli.engine.terraform.TerraformRunner import TerraformRunner
from cli.engine.ansible.AnsibleRunner import AnsibleRunner
from cli.version import VERSION


class ApplyEngine(Step):
@@ -31,6 +33,7 @@ def __init__(self, input_data):
self.input_docs = []
self.configuration_docs = []
self.infrastructure_docs = []
self.manifest_docs = []

def __enter__(self):
return self
@@ -61,9 +64,12 @@ def process_input_docs(self):
schema_validator.run()

def process_infrastructure_docs(self):
# Load any possible existing manifest docs
self.load_manifest_docs()

# Build the infrastructure docs
with provider_class_loader(self.cluster_model.provider, 'InfrastructureBuilder')(
self.input_docs) as infrastructure_builder:
self.input_docs, self.manifest_docs) as infrastructure_builder:
self.infrastructure_docs = infrastructure_builder.run()

# Validate infrastructure documents
@@ -84,16 +90,10 @@ def collect_infrastructure_config(self):
[*self.configuration_docs, *self.infrastructure_docs]) as config_collector:
config_collector.run()

def validate(self):
self.process_input_docs()

self.process_configuration_docs()

self.process_infrastructure_docs()

save_manifest([*self.input_docs, *self.configuration_docs, *self.infrastructure_docs], self.cluster_model.specification.name)

return 0
    def load_manifest_docs(self):
        path_to_manifest = get_manifest_path(self.cluster_model.specification.name)
        if os.path.isfile(path_to_manifest):
            self.manifest_docs = load_manifest_docs(get_build_path(self.cluster_model.specification.name))
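
The guard in load_manifest_docs can be sketched in isolation as follows. This is only an illustration under assumptions: the function name load_manifest_if_present, the loader callback, and the file name "manifest.yml" are hypothetical stand-ins (the real value of MANIFEST_FILE_NAME is not shown in this diff). It shows the idea that a first-time apply, with no manifest on disk, yields an empty list so the builders skip the image-preservation step.

```python
import os

# Assumption: placeholder file name; the real constant lives in build_saver.py
# and its value is not visible in this diff.
MANIFEST_FILE_NAME = "manifest.yml"


def load_manifest_if_present(build_path, loader):
    """Return the previously saved manifest docs, or [] on a first apply.

    `loader` is a hypothetical callback standing in for load_manifest_docs;
    like the original, it receives the build directory, not the file path.
    """
    manifest_path = os.path.join(build_path, MANIFEST_FILE_NAME)
    if not os.path.isfile(manifest_path):
        # No earlier run: nothing to preserve.
        return []
    return loader(build_path)
```

Returning an empty list rather than None keeps the later `if self.manifest_docs:` check in the builders a plain truthiness test.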

def assert_no_master_downscale(self):
components = self.cluster_model.specification.components
Original file line number Diff line number Diff line change
@@ -3,10 +3,11 @@
from cli.version import VERSION

class InfrastructureBuilder(Step):
def __init__(self, docs):
def __init__(self, docs, manifest_docs=[]):
super().__init__(__name__)
self.cluster_model = select_single(docs, lambda x: x.kind == 'epiphany-cluster')
self.docs = docs
self.manifest_docs = manifest_docs

def run(self):
infrastructure_docs = select_all(self.docs, lambda x: x.kind.startswith('infrastructure/'))
56 changes: 43 additions & 13 deletions core/src/epicli/cli/engine/providers/aws/InfrastructureBuilder.py
@@ -13,16 +13,18 @@
from cli.helpers.naming_helpers import resource_name
from cli.helpers.objdict_helpers import objdict_to_dict, dict_to_objdict
from cli.version import VERSION
from cli.helpers.query_yes_no import query_yes_no


class InfrastructureBuilder(Step):
def __init__(self, docs):
def __init__(self, docs, manifest_docs=[]):
super().__init__(__name__)
self.cluster_model = select_single(docs, lambda x: x.kind == 'epiphany-cluster')
self.cluster_name = self.cluster_model.specification.name.lower()
self.cluster_prefix = self.cluster_model.specification.prefix.lower()
self.use_network_security_groups = self.cluster_model.specification.cloud.network.use_network_security_groups
self.docs = docs
self.manifest_docs = manifest_docs

def run(self):
infrastructure = []
@@ -136,7 +138,7 @@ def get_efs_config(self):
return efs_config

def get_autoscaling_group(self, component_key, component_value, subnets_to_create, index):
autoscaling_group = dict_to_objdict(deepcopy(self.get_virtual_machine(component_value, self.cluster_model, self.docs)))
autoscaling_group = dict_to_objdict(deepcopy(self.get_virtual_machine(component_value)))
autoscaling_group.specification.cluster_name = self.cluster_name
autoscaling_group.specification.name = resource_name(self.cluster_prefix, self.cluster_name, 'asg' + '-' + str(index), component_key)
autoscaling_group.specification.count = component_value.count
@@ -245,6 +247,45 @@ def add_security_rules_inbound_efs(self, infrastructure, security_group):
rules.append(objdict_to_dict(rule))
security_group.specification.rules = rules

    def get_virtual_machine(self, component_value):
        machine_selector = component_value.machine
        model_with_defaults = select_first(self.docs, lambda x: x.kind == 'infrastructure/virtual-machine' and
                                                                 x.name == machine_selector)

        # Merge with defaults
        if model_with_defaults is None:
            model_with_defaults = merge_with_defaults(self.cluster_model.provider, 'infrastructure/virtual-machine',
                                                      machine_selector, self.docs)

        # Check if we have a cluster-config OS image defined that we want to apply cluster wide.
        cloud_os_image_defaults = self.get_config_or_default(self.docs, 'infrastructure/cloud-os-image-defaults')
        cloud_image = self.cluster_model.specification.cloud.default_os_image
        if cloud_image != 'default':
            if not hasattr(cloud_os_image_defaults.specification, cloud_image):
                raise NotImplementedError(f'default_os_image "{cloud_image}" is unsupported for "{self.cluster_model.provider}" provider.')
            model_with_defaults.specification.os_full_name = cloud_os_image_defaults.specification[cloud_image]

        # finally check if we are trying to re-apply a configuration.
        if self.manifest_docs:
            manifest_vm_config = select_first(self.manifest_docs, lambda x: x.name == machine_selector and x.kind == 'infrastructure/virtual-machine')
            manifest_firstvm_config = select_first(self.manifest_docs, lambda x: x.kind == 'infrastructure/virtual-machine')

            if manifest_vm_config is not None and model_with_defaults.specification.os_full_name == manifest_vm_config.specification.os_full_name:
                return model_with_defaults

            if model_with_defaults.specification.os_full_name == manifest_firstvm_config.specification.os_full_name:
                return model_with_defaults

            self.logger.warning(f"Re-applying a different OS image might lead to data loss and/or other issues. Preserving the existing OS image used for VM definition '{machine_selector}'.")

            if manifest_vm_config is not None:
                model_with_defaults.specification.os_full_name = manifest_vm_config.specification.os_full_name
            else:
                model_with_defaults.specification.os_full_name = manifest_firstvm_config.specification.os_full_name

        return model_with_defaults
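
The re-apply branch of get_virtual_machine above boils down to a small decision rule, which may be easier to review as a pure function. The sketch below is not part of the PR: resolve_os_image and its (name, image) tuple list are hypothetical simplifications of the manifest documents, and it assumes the manifest contains at least one virtual-machine entry whenever it is non-empty.

```python
def resolve_os_image(requested, machine_name, manifest_vms):
    """Pick the OS image to use when applying a cluster configuration.

    requested    -- image the new configuration asks for
    machine_name -- name of the VM definition being built
    manifest_vms -- hypothetical list of (name, image) pairs taken from the
                    previous manifest, in manifest order; empty on first apply
    """
    if not manifest_vms:
        # First apply: nothing to preserve.
        return requested

    by_name = next((image for name, image in manifest_vms if name == machine_name), None)
    first = manifest_vms[0][1]

    # Unchanged relative to the VM definition with the same name.
    if by_name is not None and requested == by_name:
        return requested
    # Unchanged relative to the first VM definition in the manifest.
    if requested == first:
        return requested

    # The image differs: keep the one already deployed.
    return by_name if by_name is not None else first
```

One property worth noting when reading the method: because the comparison against the first manifest entry runs even when a same-named entry exists, a request that matches the first entry is accepted although the same-named entry used a different image, e.g. resolve_os_image("img-b", "vm-a", [("vm-b", "img-b"), ("vm-a", "img-a")]) returns "img-b".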


@staticmethod
def efs_add_mount_target_config(efs_config, subnet):
target = select_first(efs_config.specification.mount_targets,
@@ -275,17 +316,6 @@ def get_config_or_default(docs, kind):
config['version'] = VERSION
return config

@staticmethod
def get_virtual_machine(component_value, cluster_model, docs):
machine_selector = component_value.machine
model_with_defaults = select_first(docs, lambda x: x.kind == 'infrastructure/virtual-machine' and
x.name == machine_selector)
if model_with_defaults is None:
model_with_defaults = merge_with_defaults(cluster_model.provider, 'infrastructure/virtual-machine',
machine_selector, docs)

return model_with_defaults

@staticmethod
def rule_exists_in_list(rule_list, rule_to_check):
for rule in rule_list:
55 changes: 42 additions & 13 deletions core/src/epicli/cli/engine/providers/azure/InfrastructureBuilder.py
@@ -10,9 +10,10 @@
from cli.helpers.objdict_helpers import objdict_to_dict, dict_to_objdict
from cli.helpers.os_images import get_os_distro_normalized
from cli.version import VERSION
from cli.helpers.query_yes_no import query_yes_no

class InfrastructureBuilder(Step):
def __init__(self, docs):
def __init__(self, docs, manifest_docs=[]):
super().__init__(__name__)
self.cluster_model = select_single(docs, lambda x: x.kind == 'epiphany-cluster')
self.cluster_name = self.cluster_model.specification.name.lower()
@@ -22,6 +23,7 @@ def __init__(self, docs):
self.use_network_security_groups = self.cluster_model.specification.cloud.network.use_network_security_groups
self.use_public_ips = self.cluster_model.specification.cloud.use_public_ips
self.docs = docs
self.manifest_docs = manifest_docs

def run(self):
infrastructure = []
@@ -44,7 +46,7 @@ def run(self):

# The vm config also contains some other stuff we use for network and security config.
# So get it here and pass it along.
vm_config = self.get_virtual_machine(component_value, self.cluster_model, self.docs)
vm_config = self.get_virtual_machine(component_value)
# Set property that controls cloud-init.
vm_config.specification['use_cloud_init_custom_data'] = cloud_init_custom_data.specification.enabled

@@ -221,21 +223,48 @@ def get_cloud_init_custom_data(self):
cloud_init_custom_data.specification.file_name = 'cloud-config.yml'
return cloud_init_custom_data

    def get_virtual_machine(self, component_value):
        machine_selector = component_value.machine
        model_with_defaults = select_first(self.docs, lambda x: x.kind == 'infrastructure/virtual-machine' and
                                                                 x.name == machine_selector)

        # Merge with defaults
        if model_with_defaults is None:
            model_with_defaults = merge_with_defaults(self.cluster_model.provider, 'infrastructure/virtual-machine',
                                                      machine_selector, self.docs)

        # Check if we have a cluster-config OS image defined that we want to apply cluster wide.
        cloud_os_image_defaults = self.get_config_or_default(self.docs, 'infrastructure/cloud-os-image-defaults')
        cloud_image = self.cluster_model.specification.cloud.default_os_image
        if cloud_image != 'default':
            if not hasattr(cloud_os_image_defaults.specification, cloud_image):
                raise NotImplementedError(f'default_os_image "{cloud_image}" is unsupported for "{self.cluster_model.provider}" provider.')
            model_with_defaults.specification.storage_image_reference = dict_to_objdict(deepcopy(cloud_os_image_defaults.specification[cloud_image]))

        # finally check if we are trying to re-apply a configuration.
        if self.manifest_docs:
            manifest_vm_config = select_first(self.manifest_docs, lambda x: x.name == machine_selector and x.kind == 'infrastructure/virtual-machine')
            manifest_firstvm_config = select_first(self.manifest_docs, lambda x: x.kind == 'infrastructure/virtual-machine')

            if manifest_vm_config is not None and model_with_defaults.specification.storage_image_reference == manifest_vm_config.specification.storage_image_reference:
                return model_with_defaults

            if model_with_defaults.specification.storage_image_reference == manifest_firstvm_config.specification.storage_image_reference:
                return model_with_defaults

            self.logger.warning(f"Re-applying a different OS image might lead to data loss and/or other issues. Preserving the existing OS image used for VM definition '{machine_selector}'.")

            if manifest_vm_config is not None:
                model_with_defaults.specification.storage_image_reference = dict_to_objdict(deepcopy(manifest_vm_config.specification.storage_image_reference))
            else:
                model_with_defaults.specification.storage_image_reference = dict_to_objdict(deepcopy(manifest_firstvm_config.specification.storage_image_reference))

        return model_with_defaults
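
Unlike AWS, where the image is a plain string, the Azure image reference is a nested mapping, which is presumably why the method above deep-copies it before assigning. The following sketch illustrates that point with plain dicts; apply_default_image and AZURE_IMAGE_DEFAULTS are hypothetical names, and the values are copied from the Azure cloud-os-image-defaults file added in this PR.

```python
from copy import deepcopy

# Values taken from the Azure cloud-os-image-defaults document in this PR.
AZURE_IMAGE_DEFAULTS = {
    "ubuntu-18.04-x86_64": {"publisher": "Canonical", "offer": "UbuntuServer",
                            "sku": "18.04-LTS", "version": "18.04.202103151"},
    "redhat-7-x86_64": {"publisher": "RedHat", "offer": "RHEL",
                        "sku": "7-LVM", "version": "7.9.2020111202"},
    "centos-7-x86_64": {"publisher": "OpenLogic", "offer": "CentOS",
                        "sku": "7_9", "version": "7.9.2021020400"},
}


def apply_default_image(vm_spec, default_os_image, defaults=AZURE_IMAGE_DEFAULTS):
    """Set storage_image_reference on a VM spec from the cluster-wide key."""
    if default_os_image == "default":
        # Keep whatever image the VM definition already carries.
        return vm_spec
    if default_os_image not in defaults:
        raise NotImplementedError(
            f'default_os_image "{default_os_image}" is unsupported for "azure" provider.')
    # Deep copy so later edits to this VM spec cannot leak into the shared defaults.
    vm_spec["storage_image_reference"] = deepcopy(defaults[default_os_image])
    return vm_spec
```

Since the Azure defaults list no centos-7-arm64 entry, that key passes schema validation but ends in NotImplementedError here, mirroring the hasattr check in the method.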

@staticmethod
def get_config_or_default(docs, kind):
config = select_first(docs, lambda x: x.kind == kind)
if config is None:
config = load_yaml_obj(types.DEFAULT, 'azure', kind)
config['version'] = VERSION
return config

@staticmethod
def get_virtual_machine(component_value, cluster_model, docs):
machine_selector = component_value.machine
model_with_defaults = select_first(docs, lambda x: x.kind == 'infrastructure/virtual-machine' and
x.name == machine_selector)
if model_with_defaults is None:
model_with_defaults = merge_with_defaults(cluster_model.provider, 'infrastructure/virtual-machine',
machine_selector, docs)

return model_with_defaults
20 changes: 0 additions & 20 deletions core/src/epicli/cli/epicli.py
@@ -93,9 +93,6 @@ def debug_level(x):
upgrade_parser(subparsers)
delete_parser(subparsers)
test_parser(subparsers)
'''
validate_parser(subparsers)
'''
backup_parser(subparsers)
recovery_parser(subparsers)

@@ -318,23 +315,6 @@ def run_test(args):
sub_parser.set_defaults(func=run_test)


'''
def validate_parser(subparsers):
sub_parser = subparsers.add_parser('verify', description='Validates the configuration from file by executing a dry '
'run without changing the physical '
'infrastructure/configuration')
sub_parser.add_argument('-f', '--file', dest='file', type=str,
help='File with infrastructure/configuration definitions to use.')

def run_validate(args):
adjust_paths_from_file(args)
with ApplyEngine(args) as engine:
return engine.validate()

sub_parser.set_defaults(func=run_validate)
'''


def backup_parser(subparsers):
"""Configure and execute backup of cluster components."""

4 changes: 4 additions & 0 deletions core/src/epicli/cli/helpers/build_saver.py
@@ -85,6 +85,10 @@ def get_inventory_path(cluster_name):
return os.path.join(get_build_path(cluster_name), INVENTORY_FILE_NAME)


def get_manifest_path(cluster_name):
return os.path.join(get_build_path(cluster_name), MANIFEST_FILE_NAME)


def get_inventory_path_for_build(build_directory):
build_version = check_build_output_version(build_directory)
inventory = os.path.join(build_directory, INVENTORY_FILE_NAME)
Original file line number Diff line number Diff line change
@@ -14,6 +14,7 @@ specification:
credentials:
key: XXXX-XXXX-XXXX
secret: XXXXXXXXXXXXXXXX
default_os_image: default
components:
repository:
count: 1
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
kind: infrastructure/cloud-os-image-defaults
title: "Cloud OS Image Defaults"
name: default
specification:
  ubuntu-18.04-x86_64: ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20210323
  redhat-7-x86_64: RHEL-7.9_HVM-20210208-x86_64-0-Hourly2-GP2
  centos-7-x86_64: CentOS 7.9.2009 x86_64
  centos-7-arm64: CentOS 7.9.2009 aarch64

Original file line number Diff line number Diff line change
@@ -11,6 +11,7 @@ specification:
cloud:
k8s_as_cloud_service: False
use_public_ips: False # When not using public IPs you have to provide connectivity via private IPs (VPN)
default_os_image: default
components:
repository:
count: 1
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
kind: infrastructure/cloud-os-image-defaults
title: "Cloud OS Image Defaults"
name: default
specification:
  ubuntu-18.04-x86_64:
    publisher: Canonical
    offer: UbuntuServer
    sku: 18.04-LTS
    version: "18.04.202103151"
  redhat-7-x86_64:
    publisher: RedHat
    offer: RHEL
    sku: 7-LVM
    version: "7.9.2020111202"
  centos-7-x86_64:
    publisher: OpenLogic
    offer: CentOS
    sku: "7_9"
    version: "7.9.2021020400"
1 change: 1 addition & 0 deletions core/src/epicli/data/common/defaults/epiphany-cluster.yml
@@ -20,6 +20,7 @@ specification:
secret: DADFAFHCJHCAUYEAk
network:
use_network_security_groups: True
default_os_image: default
components:
kubernetes_master:
count: 1
11 changes: 11 additions & 0 deletions core/src/epicli/data/common/validation/epiphany-cluster.yml
@@ -104,6 +104,17 @@ properties:
default: false
examples:
- true
default_os_image:
type: string
title: Set the latest cloud OS image verified for use by the Epiphany team for this Epiphany version.
default: 'default'
examples:
- default
- ubuntu-18.04-x86_64
- redhat-7-x86_64
- centos-7-x86_64
- centos-7-arm64
pattern: ^(default|ubuntu-18.04-x86_64|redhat-7-x86_64|centos-7-x86_64|centos-7-arm64)$
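
To see which values the default_os_image pattern above accepts, it can be exercised directly with Python's re module. This is a rough check only: it assumes the schema validator applies the pattern with search semantics comparable to re.search, and the helper name is_valid_default_os_image is hypothetical.

```python
import re

# The pattern exactly as written in the validation schema.
DEFAULT_OS_IMAGE_PATTERN = re.compile(
    r"^(default|ubuntu-18.04-x86_64|redhat-7-x86_64|centos-7-x86_64|centos-7-arm64)$"
)


def is_valid_default_os_image(value):
    """Return True when the value satisfies the schema pattern."""
    return DEFAULT_OS_IMAGE_PATTERN.search(value) is not None
```

Because the dot in "18.04" is unescaped, it matches any single character, so a string such as "ubuntu-18x04-x86_64" also passes the pattern. Such a value would still be rejected later by the builders, since it is not a key in the cloud-os-image-defaults document.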
components:
"$id": "#/properties/components"
type: object