Remove disabled and not loaded services before calling reset-failed and restart services (#2266)

What I did

Added logic to remove disabled and not loaded services before calling reset-failed/restart services. Certain services like telemetry can go down and become disabled, which would cause load_minigraph to fail when resetting failed services. Services that are not loaded or disabled should not impact reset or start of other services.

How I did it

Added logic to remove services that are disabled or not loaded from the list of services for a given operation, such as reset-failed or restart.
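
The filtering step described above can be sketched as a pure function. This is a minimal illustration rather than the commit's code: the name remove_invalid_services and its three-argument signature are hypothetical, and unlike the in-place list.remove() approach used in the commit, it returns a new list.

```python
def remove_invalid_services(services, disabled, not_loaded):
    """Return the services that are neither disabled nor not loaded.

    Hypothetical helper: the commit itself mutates the list in place.
    The original order of `services` is preserved.
    """
    invalid = set(disabled) | set(not_loaded)
    return [service for service in services if service not in invalid]


if __name__ == "__main__":
    remaining = remove_invalid_services(
        ["bgp", "pmon", "telemetry"],   # services listed for the operation
        disabled=["telemetry"],         # e.g. the feature was disabled
        not_loaded=["pmon"],            # e.g. the unit is masked
    )
    print(remaining)
```

Returning a new list leaves the caller's list untouched, which can make the helper easier to test; the in-place version in the commit avoids reassigning the list at each call site.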

How to verify it

Manual testing. Bring down a service such as telemetry, either by masking it or by running config feature state telemetry disabled, and confirm that load_minigraph is not affected.
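
The manual check can also be approximated without a real device. The sketch below copies _get_not_loaded_services_list from this commit's diff and patches subprocess.check_output with a hypothetical stand-in, fake_check_output, that reports telemetry as masked. It assumes systemctl prints "masked" as the LoadState of a masked unit; simulate_masked_telemetry is likewise a name invented for this illustration.

```python
import subprocess
from unittest import mock


def _get_not_loaded_services_list(services_list):
    # Copied from the diff in this commit.
    not_loaded_services_list = []
    for service in services_list:
        command = "sudo systemctl show -p LoadState --value %s" % service
        status = subprocess.check_output(command, shell=True).decode()
        if status != "loaded\n":
            not_loaded_services_list.append(service)

    return not_loaded_services_list


def fake_check_output(command, shell=False):
    # Hypothetical stand-in for systemctl: telemetry is masked,
    # every other unit is reported as loaded.
    if command.endswith(" telemetry"):
        return b"masked\n"
    return b"loaded\n"


def simulate_masked_telemetry(services):
    # Patch subprocess.check_output so no real systemctl call is made.
    with mock.patch("subprocess.check_output", side_effect=fake_check_output):
        return _get_not_loaded_services_list(services)


if __name__ == "__main__":
    print(simulate_masked_telemetry(["bgp", "telemetry", "swss"]))
```

This only exercises the LoadState comparison; it does not replace running load_minigraph on a device with a masked or disabled service.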

Previous command output (if the output of a command-line utility has changed)
New command output (if the output of a command-line utility has changed)
zbud-msft authored Jul 13, 2022
1 parent 09b4678 commit 62b7b56
Showing 1 changed file with 18 additions and 6 deletions.
24 changes: 18 additions & 6 deletions config/main.py
@@ -557,6 +557,15 @@ def _get_disabled_services_list():

     return disabled_services_list

+def _get_not_loaded_services_list(services_list):
+    not_loaded_services_list = []
+    for service in services_list:
+        command = "sudo systemctl show -p LoadState --value %s" % service
+        status = subprocess.check_output(command, shell=True).decode()
+        if status != "loaded\n":
+            not_loaded_services_list.append(service)
+
+    return not_loaded_services_list

 def _stop_services():
     # This list is order-dependent. Please add services in the order they should be stopped
@@ -577,6 +586,13 @@ def _stop_services():

     execute_systemctl(services_to_stop, SYSTEMCTL_ACTION_STOP)

+def _remove_invalid_services(services_list):
+    invalid_services_list = _get_disabled_services_list()
+    invalid_services_list += _get_not_loaded_services_list(services_list)
+    for invalid_service in invalid_services_list:
+        if invalid_service in services_list:
+            services_list.remove(invalid_service)
+

 def _reset_failed_services():
     # This list is order-independent. Please keep list in alphabetical order
@@ -601,6 +617,7 @@ def _reset_failed_services():
         'telemetry'
     ]

+    _remove_invalid_services(services_to_reset)
     execute_systemctl(services_to_reset, SYSTEMCTL_ACTION_RESET_FAILED)


@@ -623,12 +640,7 @@ def _restart_services():
         'telemetry'
     ]

-    disable_services = _get_disabled_services_list()
-
-    for service in disable_services:
-        if service in services_to_restart:
-            services_to_restart.remove(service)
-
+    _remove_invalid_services(services_to_restart)
     if asic_type == 'mellanox' and 'pmon' in services_to_restart:
         services_to_restart.remove('pmon')
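
As a possible variation on _get_not_loaded_services_list, the sketch below passes the command as an argument list instead of a shell string and strips the output before comparing it. It is not what the commit does: get_load_state, get_not_loaded_services, and the injectable run and load_state parameters are hypothetical names added for illustration and testing.

```python
import subprocess


def get_load_state(service, run=subprocess.run):
    """Return systemd's LoadState for a unit, e.g. 'loaded' or 'masked'.

    `run` is injectable (hypothetical parameter) so the function can be
    exercised without calling systemctl.
    """
    result = run(
        ["sudo", "systemctl", "show", "-p", "LoadState", "--value", service],
        capture_output=True,
        text=True,
        check=True,
    )
    # Strip the trailing newline instead of comparing against "loaded\n".
    return result.stdout.strip()


def get_not_loaded_services(services, load_state=get_load_state):
    """Return the services whose LoadState is anything other than 'loaded'."""
    return [service for service in services if load_state(service) != "loaded"]
```

Avoiding shell=True means the service name is never interpreted by a shell, and stripping the output makes the comparison independent of trailing whitespace.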
