[monit]: Fix process checker #5480

Merged 1 commit on Sep 30, 2020

nazariig
Collaborator

@nazariig nazariig commented Sep 28, 2020

Signed-off-by: Nazarii Hnydyn [email protected]

This PR fixes system health monitoring. The monit summary below reports snmp|snmp_subagent and bgp|bgpmon as failed:

root@sonic:/home/admin# monit summary -B
Monit 5.20.0 uptime: 30m
 Service Name                     Status                      Type
 sonic-switch                     Running                     System
 rsyslog                          Running                     Process
 root-overlay                     Accessible                  Filesystem
 var-log                          Accessible                  Filesystem
 routeCheck                       Status ok                   Program
 telemetry|telemetry              Status ok                   Program
 telemetry|dialout_client         Status ok                   Program
 teamd|teamsyncd                  Status ok                   Program
 teamd|teammgrd                   Status ok                   Program
 swss|orchagent                   Status ok                   Program
 swss|portsyncd                   Status ok                   Program
 swss|neighsyncd                  Status ok                   Program
 swss|vrfmgrd                     Status ok                   Program
 swss|vlanmgrd                    Status ok                   Program
 swss|intfmgrd                    Status ok                   Program
 swss|portmgrd                    Status ok                   Program
 swss|buffermgrd                  Status ok                   Program
 swss|nbrmgrd                     Status ok                   Program
 swss|vxlanmgrd                   Status ok                   Program
 snmp|snmpd                       Status ok                   Program
 snmp|snmp_subagent               Status failed               Program
 sflow|sflowmgrd                  Status ok                   Program
 lldp|lldpd_monitor               Status ok                   Program
 lldp|lldp_syncd                  Status ok                   Program
 lldp|lldpmgrd                    Status ok                   Program
 database|redis_server            Status ok                   Program
 bgp|zebra                        Status ok                   Program
 bgp|fpmsyncd                     Status ok                   Program
 bgp|bgpd                         Status ok                   Program
 bgp|staticd                      Status ok                   Program
 bgp|bgpcfgd                      Status ok                   Program
 bgp|bgpmon                       Status failed               Program

Shell output from inside the bgp and snmp containers, where bgpmon and sonic_ax_impl are nevertheless running:

root@sonic:/# ps -aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.2  28324 22776 pts/0    Ss+  17:59   0:01 /usr/bin/python2 /usr/bin/supervisord
root        22  0.0  0.2  22668 16720 pts/0    S    18:00   0:00 python /usr/bin/supervisor-proc-exit-listener --container-name bgp
root        26  0.0  0.0 225856  3468 pts/0    Sl   18:00   0:00 /usr/sbin/rsyslogd -n -iNONE
frr         30  0.0  0.2 503604 16500 pts/0    Sl   18:00   0:00 /usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M fpm -M snmp
frr         31  0.0  0.0  43308  6024 pts/0    S    18:00   0:00 /usr/lib/frr/staticd -A 127.0.0.1
frr         32  0.0  0.2 298924 23636 pts/0    Sl   18:00   0:01 /usr/lib/frr/bgpd -A 127.0.0.1 -M snmp
root        36  0.0  0.6  69224 56008 pts/0    S    18:00   0:01 /usr/bin/python /usr/local/bin/bgpcfgd
root        37  0.0  0.1  21028 15532 pts/0    S    18:00   0:00 /usr/bin/python /usr/local/bin/bgpmon
root        38  0.0  0.0  82116  4616 pts/0    Sl   18:00   0:00 fpmsyncd
root       254  0.0  0.0   3868  3168 pts/1    Ss+  18:34   0:00 bash
root       261  0.0  0.0   3868  3120 pts/2    Ss   18:41   0:00 bash
root       266  0.0  0.0   7640  2608 pts/2    R+   18:41   0:00 ps -aux

root@sonic:/# ps -aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.2  31756 22856 pts/0    Ss+  18:00   0:01 /usr/bin/python2 /usr/bin/supervisord
root         9  0.0  0.2  26244 17300 pts/0    S    18:00   0:00 python /usr/bin/supervisor-proc-exit-listener --container-name snmp
root        19  0.0  0.0 225856  3592 pts/0    Sl   18:00   0:00 /usr/sbin/rsyslogd -n -iNONE
Debian-+    23  0.0  0.1  32924 12660 pts/0    S    18:00   0:02 /usr/sbin/snmpd -f -LS4d -u Debian-snmp -g Debian-snmp -I -smux mteTrigger mteTriggerConf ifTable ifXT
root        24  2.7  0.4 120188 34772 pts/0    Sl   18:00   1:08 python3 -m sonic_ax_impl
root        26  0.0  0.0   7476  3604 pts/1    Ss   18:42   0:00 bash
root        31  0.0  0.0  11248  3044 pts/1    R+   18:42   0:00 ps -aux
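
One way to read the two outputs together: monit flags bgp|bgpmon and snmp|snmp_subagent as failed, yet ps lists /usr/local/bin/bgpmon and sonic_ax_impl as running. A checker that compares an expected command line against the running ones would behave this way if the expected string were stale. The sketch below illustrates that idea only. `parse_ps_commands`, `is_running`, and the "stale" expected string are hypothetical and are not taken from this PR's monit config files.

```python
from typing import List

# A few rows copied from the bgp container's `ps -aux` output above.
PS_SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        36  0.0  0.6  69224 56008 pts/0    S    18:00   0:01 /usr/bin/python /usr/local/bin/bgpcfgd
root        37  0.0  0.1  21028 15532 pts/0    S    18:00   0:00 /usr/bin/python /usr/local/bin/bgpmon
root        38  0.0  0.0  82116  4616 pts/0    Sl   18:00   0:00 fpmsyncd
"""


def parse_ps_commands(ps_output: str) -> List[str]:
    """Return the COMMAND column of each process row in `ps -aux` output."""
    commands = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        # COMMAND is the 11th column and may itself contain spaces,
        # so split at most 10 times and keep the remainder intact.
        fields = line.split(None, 10)
        if len(fields) == 11:
            commands.append(fields[10].strip())
    return commands


def is_running(expected_cmdline: str, commands: List[str]) -> bool:
    """Hypothetical check: exact match on the full command line."""
    return expected_cmdline in commands


commands = parse_ps_commands(PS_SAMPLE)

# Matches the command line exactly as ps reports it.
print(is_running("/usr/bin/python /usr/local/bin/bgpmon", commands))

# Hypothetical stale expectation (assumed for illustration only):
# the interpreter path differs, so an exact match fails even though
# the process is alive.
print(is_running("python /usr/local/bin/bgpmon", commands))
```

With exact matching, any change to the interpreter path or arguments in how a process is launched has to be mirrored in the checker's expected string, otherwise a healthy process is reported as failed.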

- Why I did it

  • To fix system health monitoring

- How I did it

  • Fixed monit config files

- How to verify it

  1. Run monit summary -B and check that no service reports Status failed.
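
The verification step can also be scripted. This is a minimal sketch, assuming the column layout shown in the `monit summary -B` output earlier in this PR (fields separated by two or more spaces) and treating only the statuses seen there (`Running`, `Accessible`, `Status ok`) as healthy. The names `HEALTHY`, `unhealthy_services`, and `check_live` are illustrative, not part of SONiC or monit.

```python
import re
import subprocess
from typing import List, Tuple

# Assumption: only the statuses that appear as healthy in the PR's
# sample output are accepted; anything else is reported.
HEALTHY = {"Running", "Accessible", "Status ok"}

# A few lines copied from the summary in the PR description.
SAMPLE = """\
Monit 5.20.0 uptime: 30m
 Service Name                     Status                      Type
 sonic-switch                     Running                     System
 snmp|snmp_subagent               Status failed               Program
 bgp|bgpd                         Status ok                   Program
 bgp|bgpmon                       Status failed               Program
"""


def unhealthy_services(summary: str) -> List[Tuple[str, str]]:
    """Return (service, status) pairs whose status is not in HEALTHY."""
    bad = []
    for line in summary.splitlines():
        # Columns are padded with runs of spaces; single spaces occur
        # inside values such as "Status ok", so split on 2+ spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) != 3 or fields[0] == "Service Name":
            continue  # banner line, header row, or unexpected layout
        name, status, _service_type = fields
        if status not in HEALTHY:
            bad.append((name, status))
    return bad


def check_live() -> int:
    """Run `monit summary -B` on the switch; return 0 if all healthy."""
    out = subprocess.run(
        ["monit", "summary", "-B"],
        capture_output=True, text=True, check=True,
    ).stdout
    bad = unhealthy_services(out)
    for name, status in bad:
        print(f"{name}: {status}")
    return 1 if bad else 0


for name, status in unhealthy_services(SAMPLE):
    print(f"{name}: {status}")
```

On a device, calling `check_live()` as root would run the same check against the live monit instance. On the sample above, the two entries flagged are snmp|snmp_subagent and bgp|bgpmon.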

- Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006

- Description for the changelog

  • N/A

- A picture of a cute animal (not mandatory but encouraged)

      .---.        .-----------
     /     \  __  /    ------
    / /     \(  )/    -----
   //////   ' \/ `   ---
  //// / // :    : ---
 // /   /  /`    '--
//          //..\\
       ====UU====UU====
           '//||\\`
             ''``

Signed-off-by: Nazarii Hnydyn <[email protected]>
@abdosi abdosi requested a review from yozhao101 September 29, 2020 00:02
@abdosi
Contributor

abdosi commented Sep 29, 2020

retest vsimage please

@abdosi
Contributor

abdosi commented Sep 29, 2020

retest mellanox please

@abdosi
Contributor

abdosi commented Sep 29, 2020

retest broadcom please

@liat-grozovik
Collaborator

retest this please

@abdosi
Contributor

abdosi commented Sep 29, 2020

retest vsimage please

@abdosi abdosi merged commit 79bda7d into sonic-net:master Sep 30, 2020
abdosi pushed a commit that referenced this pull request Sep 30, 2020
@qiluo-msft
Collaborator

LGTM

santhosh-kt pushed a commit to santhosh-kt/sonic-buildimage that referenced this pull request Feb 25, 2021
@nazariig nazariig deleted the master-monit-fix branch May 9, 2022 10:49
5 participants