[monit] Adding patch to enhance syslog error message generation for monit alert action when status is failed. #5720
is handle for clause like this: if status != 0 for x cycle then alert repeat every y cycle. With the above clause, an error syslog message is generated after x cycles and then every y cycles if the error persists. Signed-off-by: Abhishek Dosi <[email protected]>
@abdosi: When I glanced over the Monit source last week, I envisioned adding the alert logging to the handle_alert() function in alert.c. However, I hadn't thought about how to get the message to repeat. Just curious why you chose the _handleEvent() function.
@jleveque The change was based on this commit done for the exec action. I also thought it would mean fewer changes to patch.
@jleveque and @yozhao101 I have also updated the monit conf files in the same PR. Please review.
Retest this please
retest baseimage please
retest vsimage please
retest buildimage please
retest vsimage please
retest baseimage please
…onit alert action when status is failed. (#5720) Why/How I did: Make sure first error syslog is triggered based on FAULT TOLERANCE condition. Added support of repeat clause with alert action. This is used as trigger for generation of periodic syslog error messages if error is persistent Updated the monit conf files with repeat every x cycles for the alert action Signed-off-by: Guohan Lu <[email protected]>
Why/How I did:
Make sure the first error syslog message is triggered based on the FAULT TOLERANCE condition.
Added support for the repeat clause with the alert action. This is used as the trigger
for generating periodic syslog error messages if the error is persistent.
Updated the monit conf files with repeat every x cycles for the alert action.
For example:
Make sure monit honors the clause below when generating error syslog messages:
if status != 0 for x cycle then alert repeat every y cycle
With the above clause, an error syslog message is generated after x cycles and then
every y cycles for as long as the error persists.
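The timing described by the clause above can be sketched as a small simulation. This is only a model of the behavior as described in this PR, not Monit's implementation. The function name `alert_cycles` is hypothetical, and the sketch assumes that the first message is logged on the x-th consecutive failing cycle and that a passing cycle resets the count.

```python
def alert_cycles(failing, x, y):
    """Return the 1-based cycle numbers at which an error would be logged.

    failing: iterable of booleans, one per monit cycle (True = check failed).
    x: number of consecutive failing cycles before the first message.
    y: repeat interval, in cycles, while the failure persists.

    Assumption (not taken from the Monit source): the first message is
    logged on the x-th consecutive failure, and a passing cycle resets
    the streak.
    """
    logged = []
    streak = 0
    for cycle, failed in enumerate(failing, start=1):
        if not failed:
            streak = 0  # recovery clears the fault-tolerance counter
            continue
        streak += 1
        # First message at streak == x, then every y cycles after that.
        if streak >= x and (streak - x) % y == 0:
            logged.append(cycle)
    return logged


if __name__ == "__main__":
    # Ten failing cycles in a row with x = 3 and y = 3.
    print(alert_cycles([True] * 10, x=3, y=3))  # [3, 6, 9]
```

Under this model, a check that fails continuously with x = 3 and y = 3 logs on cycles 3, 6 and 9, while a single passing cycle in between delays the next message until three consecutive failures have accumulated again.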
How I verify:
a) check program routeCheck with path "/usr/bin/route_check.py"
every 1 cycles
if status != 0 for 3 cycle then alert repeat every 3 cycles
Oct 26 18:12:54.773398 ERR monit[480]: 'routeCheck' status failed (255) -- no output
Oct 26 18:15:54.940595 ERR monit[480]: 'routeCheck' status failed (255) -- no output
Oct 26 18:18:55.091670 ERR monit[480]: 'routeCheck' status failed (255) -- no output
b) Verify monit status is fine.
c) Verify that for processes that are not running, an ERR message is logged periodically after the first failure.
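The log lines in step a) can be checked against the configured repeat interval with a short script. This is a sketch, not part of the PR. The helper names and `CYCLE_SECONDS = 60` are assumptions: the cycle length is not stated here and is only inferred from the three-minute spacing of the messages. The year is an arbitrary placeholder because the syslog timestamps do not carry one.

```python
from datetime import datetime

# Syslog lines copied from step a) of the verification.
LOG_LINES = [
    "Oct 26 18:12:54.773398 ERR monit[480]: 'routeCheck' status failed (255) -- no output",
    "Oct 26 18:15:54.940595 ERR monit[480]: 'routeCheck' status failed (255) -- no output",
    "Oct 26 18:18:55.091670 ERR monit[480]: 'routeCheck' status failed (255) -- no output",
]

CYCLE_SECONDS = 60       # assumed monit cycle length, not stated in the PR
PLACEHOLDER_YEAR = 2000  # arbitrary; syslog timestamps omit the year


def parse_timestamp(line):
    """Parse the leading 'Mon DD HH:MM:SS.ffffff' fields of a syslog line."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(
        f"{PLACEHOLDER_YEAR} {month} {day} {clock}", "%Y %b %d %H:%M:%S.%f"
    )


def gaps_in_cycles(lines, cycle_seconds=CYCLE_SECONDS):
    """Return the gap between consecutive messages, rounded to whole cycles."""
    stamps = [parse_timestamp(line) for line in lines]
    return [
        round((later - earlier).total_seconds() / cycle_seconds)
        for earlier, later in zip(stamps, stamps[1:])
    ]


if __name__ == "__main__":
    print(gaps_in_cycles(LOG_LINES))  # [3, 3]
```

With a 60-second cycle, both gaps round to 3 cycles, which is consistent with `repeat every 3 cycles` in the tested configuration.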