Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FW tools logging system is broken #5684

Closed
nazariig opened this issue Oct 21, 2020 · 7 comments · Fixed by #5707
Closed

FW tools logging system is broken #5684

nazariig opened this issue Oct 21, 2020 · 7 comments · Fixed by #5707

Comments

@nazariig
Copy link
Collaborator

Description
FW tools logging system has been broken by recent changes, when sonic-py-common package has been introduced.
The issue is caused by logger implementation which has several weak places:

  1. It implicitly suppress logging messages up to NOTICE level: logger.py
    The side effect is seen due to Mellanox Platform API implementation: chassis.py
  2. It is not flexible enough: there is no way to configure logoption

FW tools update for Azure/sonic-utilities master/201911 is not consistent: basically logger modifications is missing for 201911

Steps to reproduce the issue:

  1. Initiate firmware update
root@sonic:/home/admin# fwutil install chassis component CPLD1 fw /tmp/cpld_burn.vme -y
Warning: Power cycle (with 30 sec delay) or refresh image is required to complete CPLD1 firmware update.
Installing firmware:
    /tmp/cpld_burn.vme

Describe the results you received:

root@sonic:/home/admin# tail -F /var/log/syslog | grep fwutil

Describe the results you expected:

root@sonic:/home/admin# tail -F /var/log/syslog | grep fwutil
Oct 21 11:06:37.107183 sonic INFO fwutil: Firmware install started: component=MSN3700C/CPLD1, firmware=/tmp/cpld_burn.vme
Oct 21 11:06:37.107527 sonic INFO fwutil: Firmware install ended: component=MSN3700C/CPLD1, firmware=/tmp/cpld_burn.vme, status=success

Additional information you deem important (e.g. issue happens only occasionally):
Logging is suppressed:

Oct 21 11:10:17.030912 sonic ERR fwutil: => error
Oct 21 11:10:17.031530 sonic WARNING fwutil: => warning
Oct 21 11:10:17.031952 sonic NOTICE fwutil: => notice

Logging is restored:

Oct 21 11:06:37.095809 sonic INFO fwutil: Chassis loaded successfully
Oct 21 11:06:37.105335 sonic ERR fwutil: => error
Oct 21 11:06:37.105777 sonic WARNING fwutil: => warning
Oct 21 11:06:37.106136 sonic INFO fwutil: => info
Oct 21 11:06:37.106487 sonic NOTICE fwutil: => notice
Oct 21 11:06:37.106832 sonic DEBUG fwutil: => debug
Oct 21 11:06:37.107183 sonic INFO fwutil: Firmware install started: component=MSN3700C/CPLD1, firmware=/tmp/cpld_burn.vme
Oct 21 11:06:37.107527 sonic INFO fwutil: Firmware install ended: component=MSN3700C/CPLD1, firmware=/tmp/cpld_burn.vme, status=success

Output of show version:

SONiC Software Version: SONiC.201911.219-ce724860
Distribution: Debian 9.13
Kernel: 4.9.0-11-2-amd64
Build commit: ce724860
Build date: Mon Oct 19 03:24:23 UTC 2020
Built by: johnar@jenkins-worker-8

Platform: x86_64-mlnx_msn3700c-r0
HwSKU: ACS-MSN3700C
ASIC: mellanox
Serial Number: MT1852X03894
Uptime: 10:53:59 up  5:21,  1 user,  load average: 0.32, 0.64, 0.85

Docker images:
REPOSITORY                    TAG                   IMAGE ID            SIZE
docker-syncd-mlnx             201911.219-ce724860   3b09825682f0        397MB
docker-syncd-mlnx             latest                3b09825682f0        397MB
docker-router-advertiser      201911.219-ce724860   dee22e74a858        289MB
docker-router-advertiser      latest                dee22e74a858        289MB
docker-sonic-mgmt-framework   201911.219-ce724860   94b6d0d9a68b        431MB
docker-sonic-mgmt-framework   latest                94b6d0d9a68b        431MB
docker-platform-monitor       201911.219-ce724860   019938ad5c3f        664MB
docker-platform-monitor       latest                019938ad5c3f        664MB
docker-fpm-frr                201911.219-ce724860   81be3e71f1c6        335MB
docker-fpm-frr                latest                81be3e71f1c6        335MB
docker-sflow                  201911.219-ce724860   ac2ee59579ab        315MB
docker-sflow                  latest                ac2ee59579ab        315MB
docker-lldp-sv2               201911.219-ce724860   8b50265734de        312MB
docker-lldp-sv2               latest                8b50265734de        312MB
docker-dhcp-relay             201911.219-ce724860   5bccf1c9268f        299MB
docker-dhcp-relay             latest                5bccf1c9268f        299MB
docker-database               201911.219-ce724860   d3c1ffc95b5b        289MB
docker-database               latest                d3c1ffc95b5b        289MB
docker-teamd                  201911.219-ce724860   0eb11ad76ccc        314MB
docker-teamd                  latest                0eb11ad76ccc        314MB
docker-snmp-sv2               201911.219-ce724860   9571503904b0        348MB
docker-snmp-sv2               latest                9571503904b0        348MB
docker-orchagent              201911.219-ce724860   c878017dd768        333MB
docker-orchagent              latest                c878017dd768        333MB
docker-nat                    201911.219-ce724860   fc77ec2066db        316MB
docker-nat                    latest                fc77ec2066db        316MB
docker-sonic-telemetry        201911.219-ce724860   0c7ac5552100        353MB
docker-sonic-telemetry        latest                0c7ac5552100        353MB

Attach debug file sudo generate_dump:

(paste your output here)
@nazariig
Copy link
Collaborator Author

@jleveque please have a look

@jleveque
Copy link
Contributor

@nazariig: the Logger class in sonic_py_common defaults the minimum log level to NOTICE, however it offers a set_min_log_priority() method for changing this level. This is available in both master and 201911 branches.

Could you explain what is inconsistent here? Have changes been made to sonic_py_common in the master branch that have not been backported to 201911?

@nazariig
Copy link
Collaborator Author

@jleveque let me elaborate a little bit of that:
I think there are some issues with this set_min_log_priority() since it applies not to a logger instance but towards entire process.
Thus, it makes logging facility working not as expected, at least on Mellanox platforms. The reason for that is because we use logger twice:

  1. In Chassis API implementation: chassis.py
  2. In FW tools CLI: log.py

And this adds an implicit constraint to use set_min_log_priority() also in fwutil which is not obvious and breaks the existing functionality. In this case official syslog documentation makes a point:

 syslog.setlogmask(maskpri)
    Set the priority mask to maskpri and return the previous mask value.  
    Calls to syslog() with a priority level not set in maskpri are ignored.  
    The default is to log all priorities. The function LOG_MASK(pri) calculates the mask for the individual priority pri.  
    The function LOG_UPTO(pri) calculates the mask for all priorities up to and including pri.

Note: The default is to log all priorities.

Also i have noticed that original logger configuration of fwutil is not aligned with a new logger implementation: LOG_NDELAY option is missing.

Now regarding the backporting of sonic_py_common to 201911:
I have noticed that for master logging facility was replaced with sonic_py_common logger,
but for 201911 we still using Mellanox implementation. That makes a difference.

@jleveque
Copy link
Contributor

I don't understand this:

I think there are some issues with this set_min_log_priority() since it applies not to a logger instance but towards entire process.

set_min_log_priority() applies to a Logger instance, as it is a class method. Each instance can have its own minimum priority.

I understand that the syslog default behavior is to log all priorities, but for the SONiC logger we chose to implement a default minimum log level of NOTICE. We can eliminate this default, if you prefer, and the developer can explicitly call set_min_log_priority() to set it for their individual instance if desired.

Regarding LOG_NDELAY, we could set this flag by default. I don't think it would be a problem, unless you believe it should be confirgurable.

@nazariig
Copy link
Collaborator Author

nazariig commented Oct 21, 2020

@jleveque please have a look at this example:

root@sonic:/home/admin# python
Python 2.7.13 (default, Aug 22 2020, 10:03:02)
[GCC 6.3.0 20170516] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from sonic_py_common import logger
>>>
>>> log1 = logger.Logger(log_identifier='app1')
>>> log1.log_notice("=> notice")
>>> log1.log_info("=> info")
>>>
>>> log2 = logger.Logger(log_identifier='app2')
>>> log2.set_min_log_priority_error()
>>> log2.log_error("=> error")
>>> log1.log_notice("=> notice")
>>> log1.log_info("=> info")
>>> log1.log_error("=> error")

root@sonic:/home/admin# tail -F /var/log/syslog | grep app
Oct 21 19:10:21.916525 sonic NOTICE app1[18999]: => notice

Oct 21 19:10:55.418038 sonic ERR app2[18999]: => error
Oct 21 19:13:25.831293 sonic ERR app2[18999]: => error

It clearly shows us what would happen if syslog API will be used from two different logger instances.

Regarding the LOG_NDELAY flag: i feel it should be used by default.
IMHO, the logoption should be configurable, since for example we don't want to have a PID while logging from CLI - it is useless.

@jleveque
Copy link
Contributor

Interesting. This was not the behavior I expected. I was not aware that syslog sets the mask on a per-process level.

Would you mind drafting a PR to make the proposed changes to sonic-py-common/logger.py?

@nazariig
Copy link
Collaborator Author

@jleveque sure. I will take care.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants