[202012] warmboot failed as services performed cold restart #11416

Open
vaibhavhd opened this issue Jul 11, 2022 · 1 comment

@vaibhavhd
Contributor

Description

Warmboot failed: dataplane disruption was seen because services performed a cold restart instead of a warm restart.

Steps to reproduce the issue:

  1. Run the continuous warmboot test on a 202012 image.
  2. In one of the iterations, dataplane disruption is hit.

Describe the results you received:

Although a warmboot was performed, dataplane downtime was observed.

From the logs, it appears that services ended up doing a cold restart, which disrupted the dataplane.

At this point, I think services did a cold restart because db_migrator failed with the errors below and WARM_RESTART_ENABLE_TABLE was not found in the DB.

The database service had started, but the container hadn't finished initializing. This could be due to this race condition.
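If the suspected race is that STATE_DB is read before the database container has finished initializing, one way to think about it is a bounded poll for the key before acting on it. The sketch below is only an illustration of that idea, not how SONiC or db_migrator actually handles startup: `wait_for_value` is a hypothetical helper, and `read` stands in for whatever call fetches the `enable` field of `WARM_RESTART_ENABLE_TABLE|system`.

```python
import time


def wait_for_value(read, timeout_s=10.0, interval_s=0.5,
                   sleep=time.sleep, clock=time.monotonic):
    """Hypothetical helper: call read() until it returns a non-None value.

    Returns the value, or None if timeout_s elapses first. The caller still
    has to decide what a timeout means (here it would mean the warm-restart
    flag could not be confirmed).
    """
    deadline = clock() + timeout_s
    while True:
        value = read()
        if value is not None:
            return value
        if clock() >= deadline:
            return None
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.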

Jul  9 19:13:09 sonic database.sh[936]: True
Jul  9 19:13:09 sonic database.sh[936]: True
Jul  9 19:13:09 sonic db_migrator: :- operator(): DB '{APPL_DB}' is empty with pattern 'INTF_TABLE:*'!
Jul  9 19:13:09 sonic db_migrator: :- operator(): DB '{APPL_DB}' is empty with pattern 'INTF_TABLE:*'!
Jul  9 19:13:09 sonic db_migrator: :- operator(): DB '{APPL_DB}' is empty with pattern 'INTF_TABLE:*'!
Jul  9 19:13:09 sonic db_migrator: :- operator(): DB '{APPL_DB}' is empty with pattern 'INTF_TABLE:*'!
Jul  9 19:13:09 sonic db_migrator: :- operator(): Key 'WARM_RESTART_ENABLE_TABLE|system' field 'enable' unavailable in database 'STATE_DB'
Jul  9 19:13:09 sonic db_migrator: :- operator(): Key 'WARM_RESTART_ENABLE_TABLE|system' field 'enable' unavailable in database 'STATE_DB'
Jul  9 19:13:09 sonic db_migrator: :- operator(): Key 'BUFFER_MAX_PARAM_TABLE|global' field 'mmu_size' unavailable in database 'STATE_DB'
Jul  9 19:13:09 sonic db_migrator: :- operator(): Key 'BUFFER_MAX_PARAM_TABLE|global' field 'mmu_size' unavailable in database 'STATE_DB'
Jul  9 19:13:09 sonic db_migrator: Setting buffer_model to traditional
Jul  9 19:13:09 sonic db_migrator: Setting buffer_model to traditional
Jul  9 19:13:09 sonic db_migrator: :- operator(): DB '{APPL_DB}' is empty with pattern 'COPP_TABLE:*'!
Jul  9 19:13:09 sonic db_migrator: :- operator(): DB '{APPL_DB}' is empty with pattern 'COPP_TABLE:*'!
Jul  9 19:13:09 sonic db_migrator: Caught exception: argument of type 'NoneType' is not iterable
Jul  9 19:13:09 sonic db_migrator: Caught exception: argument of type 'NoneType' is not iterable
Jul  9 19:13:09 sonic database.sh[938]: Traceback (most recent call last):
Jul  9 19:13:09 sonic database.sh[938]: Traceback (most recent call last):
Jul  9 19:13:09 sonic database.sh[938]:   File "/usr/local/bin/db_migrator.py", line 668, in main
Jul  9 19:13:09 sonic database.sh[938]:   File "/usr/local/bin/db_migrator.py", line 668, in main
Jul  9 19:13:09 sonic database.sh[938]:     result = getattr(dbmgtr, operation)()
Jul  9 19:13:09 sonic database.sh[938]:     result = getattr(dbmgtr, operation)()
Jul  9 19:13:09 sonic database.sh[938]:   File "/usr/local/bin/db_migrator.py", line 627, in migrate
Jul  9 19:13:09 sonic database.sh[938]:   File "/usr/local/bin/db_migrator.py", line 627, in migrate
Jul  9 19:13:09 sonic database.sh[938]:     self.common_migration_ops()
Jul  9 19:13:09 sonic database.sh[938]:     self.common_migration_ops()
Jul  9 19:13:09 sonic database.sh[938]:   File "/usr/local/bin/db_migrator.py", line 600, in common_migration_ops
Jul  9 19:13:09 sonic database.sh[938]:   File "/usr/local/bin/db_migrator.py", line 600, in common_migration_ops
Jul  9 19:13:09 sonic database.sh[938]:     if self.asic_type == "broadcom" and 'Force10-S6100' in self.hwsku:
Jul  9 19:13:09 sonic database.sh[938]:     if self.asic_type == "broadcom" and 'Force10-S6100' in self.hwsku:
Jul  9 19:13:09 sonic database.sh[938]: TypeError: argument of type 'NoneType' is not iterable
Jul  9 19:13:09 sonic database.sh[938]: TypeError: argument of type 'NoneType' is not iterable
Jul  9 19:13:09 sonic database.sh[938]: argument of type 'NoneType' is not iterable
Jul  9 19:13:09 sonic database.sh[938]: argument of type 'NoneType' is not iterable
Jul  9 19:13:09 sonic database.sh[938]: usage: db_migrator.py [-h] [-o operation migrate, set_version, get_version]
Jul  9 19:13:09 sonic database.sh[938]: usage: db_migrator.py [-h] [-o operation migrate, set_version, get_version]
Jul  9 19:13:09 sonic database.sh[938]:                       [-s unix socket] [-n asic namespace]
Jul  9 19:13:09 sonic database.sh[938]:                       [-s unix socket] [-n asic namespace]
Jul  9 19:13:09 sonic database.sh[938]: optional arguments:
Jul  9 19:13:09 sonic database.sh[938]: optional arguments:
Jul  9 19:13:09 sonic database.sh[938]:   -h, --help            show this help message and exit
Jul  9 19:13:09 sonic database.sh[938]:   -h, --help            show this help message and exit
Jul  9 19:13:09 sonic database.sh[938]:   -o operation (migrate, set_version, get_version)
Jul  9 19:13:09 sonic database.sh[938]:   -o operation (migrate, set_version, get_version)
Jul  9 19:13:09 sonic database.sh[938]:                         operation to perform [default: get_version]
Jul  9 19:13:09 sonic database.sh[938]:                         operation to perform [default: get_version]
Jul  9 19:13:09 sonic database.sh[938]:   -s unix socket        the unix socket that the desired database listens on
Jul  9 19:13:09 sonic database.sh[938]:   -s unix socket        the unix socket that the desired database listens on
Jul  9 19:13:09 sonic database.sh[938]:   -n asic namespace     The asic namespace whose DB instance we need to
Jul  9 19:13:09 sonic database.sh[938]:   -n asic namespace     The asic namespace whose DB instance we need to
Jul  9 19:13:09 sonic database.sh[938]:                         connect
Jul  9 19:13:09 sonic database.sh[938]:                         connect
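The traceback ends in a membership test against `self.hwsku`. Because `and` short-circuits, the `in` test is evaluated only when `asic_type` equals `"broadcom"`, so in the failing run `asic_type` was set while `hwsku` was `None`. A minimal reproduction of that condition follows, together with a guarded variant. Both functions are hypothetical stand-ins written for illustration; the guard is not necessarily the fix that was or should be applied to db_migrator.py, since a missing hwsku is itself a symptom of the database not being populated.

```python
def unguarded(asic_type, hwsku):
    # Same expression as the line quoted in the traceback.
    return asic_type == "broadcom" and "Force10-S6100" in hwsku


def is_s6100(asic_type, hwsku):
    # Hypothetical guarded variant: treat a missing hwsku as "not an S6100"
    # instead of raising TypeError.
    return (
        asic_type == "broadcom"
        and hwsku is not None
        and "Force10-S6100" in hwsku
    )


if __name__ == "__main__":
    try:
        unguarded("broadcom", None)
    except TypeError as exc:
        print("unguarded check raised:", exc)
    print("guarded check returned:", is_s6100("broadcom", None))
```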

Describe the results you expected:

Warmboot should pass without dataplane impact.

Output of show version:

SONiC-OS-20201231.71

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

syslog (62).log

@vaibhavhd
Contributor Author

The issue was seen again. The previous occurrence was suspected to be caused by manual intervention during the test, which led to a cold reboot instead of a warm one.

A second occurrence makes this more worthy of investigation.

yxieca added the Triaged label Aug 17, 2022
rjthomson added a commit to rjthomson/sonic-utilities-db-migration that referenced this issue Jan 27, 2023