-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change 'rabbitmqctl status' to a wget | grep to save CPU #5009
Change 'rabbitmqctl status' to a wget | grep to save CPU #5009
Conversation
- This reduces CPU usage from 250 millis on idle to 25 millis on idle - Default rabbitmq user needs administrator privileges
Build succeeded.
|
What ensures wget and ash are in the base container image? |
Build succeeded.
|
wget is provided by busybox, as is ash. They are both present in the official awx-rabbit image, which pulls from rabbitmq:${RABBITMQ_VERSION}-management-alpine, which pulls from rabbitmq:${RABBITMQ_VERSION}-alpine, which pulls from alpine:3.10, which has both installed by default. When you put in your own {{ kubernetes_rabbitmq_image }} it would depend. I can switch the shell to /bin/sh or /bin/bash, since they are present as well in the same image and probably present in more base images than ash. If you want to get rid of wget it will have to be replaced by curl, or a custom python script that gets added to the image. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matburt you have any thoughts on this?
This is pretty compelling! I'd be willing to merge this if you can show you verification methodology, how you sampled usage, and provide me a way to verify it. |
Without the change, with
With the change to http based healthchecks, grepping for
In prometheus: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am all for this change but we can't use /bin/ash
because our downstream images are RHEL-based. Please switch to /bin/sh
or /bin/bash
then I'll +1.
Build succeeded.
|
@shanecmd. I've changed Looking at the downstream image they do not seem to have wget installed, but they do have curl. That doesn't help in trying to unify the two. I've added another change that moves the healthcheck into a small inline python program. Going forward that healthcheck could be moved into a separate script file, but it would have to be added to both the awx and the downstream images. |
Build succeeded.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really impressive stuff. Thank you for the solid work and quick turnaround time.
I just noticed that we're templating out the username / password in the deployment object itself. Can we move this to read from an env var? |
…les for healthcheck
Yes we can :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the noisy feedback, juggling a lot atm and just returned from vacation today. Another thought: we could do something like this to pull this into a ConfigMap / mounted file which would be cleaner than inlining the whole script. |
Build succeeded.
|
@shanemcd That would definitely be nicer on the yaml for the healthchecks, but it might make it harder for people to debug. If wanted I can make those changes tomorrow. |
Build succeeded.
|
Build succeeded.
|
Thanks for this! |
Build succeeded (gate pipeline).
|
SUMMARY
In the default configuration for the awx-rabbit container, the healthchekcs (livenessProbe and readinessProbe) use the comand 'rabbitmqctl status'. This causes high CPU usage, even on idle (250 millis). By changing the healthchecks to an authenticated request to http://localhost:15672/api/healthchecks/node, and grepping for the desired result in the output, this can be reduced to less than 25 millis, a factor ten saving.
ISSUE TYPE
COMPONENT NAME
AWX VERSION
ADDITIONAL INFORMATION
There are other projects suffering the same from the CPU hungry behavior of
rabbitmqctl status
, see: