Change 'rabbitmqctl status' to a wget | grep to save CPU #5009

wanderboessenkool · 2019-10-14T13:03:55Z

This reduces CPU usage from 250 millis on idle to 25 millis on idle
Default rabbitmq user needs administrator privileges

SUMMARY

In the default configuration for the awx-rabbit container, the healthchecks (livenessProbe and readinessProbe) use the command 'rabbitmqctl status'. This causes high CPU usage, even when idle (about 250 millicores). Changing the healthchecks to an authenticated request to http://localhost:15672/api/healthchecks/node, and grepping the output for the desired result, reduces this to less than 25 millicores, a tenfold saving.
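
As a rough illustration of the HTTP-based check described above, here is a minimal Python sketch. It is not the code in this PR. It assumes the management API answers with a JSON body whose "status" field is "ok" when the node is healthy; the function names (basic_auth_header, node_is_healthy, check_node) and the timeout value are hypothetical, and the credentials are supplied by the caller.

```python
import base64
import json
from http.client import HTTPConnection, HTTPException


def basic_auth_header(user, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode("{}:{}".format(user, password).encode("utf-8"))
    return "Basic " + token.decode("ascii")


def node_is_healthy(body):
    """Return True only if the body is a JSON object with status == "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object: treat as unhealthy.
        return False


def check_node(user, password, host="localhost", port=15672, timeout=5):
    """Query /api/healthchecks/node; any error counts as unhealthy."""
    conn = HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request(
            "GET",
            "/api/healthchecks/node",
            headers={"Authorization": basic_auth_header(user, password)},
        )
        response = conn.getresponse()
        body = response.read().decode("utf-8")
        return response.status == 200 and node_is_healthy(body)
    except (OSError, HTTPException):
        return False
    finally:
        conn.close()
```

A probe command would typically call check_node and exit non-zero when it returns False, so that Kubernetes marks the container as failing.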

ISSUE TYPE

Bugfix Pull Request

COMPONENT NAME

Installer

AWX VERSION

awx: 7.0.0

ADDITIONAL INFORMATION

Other projects suffer from the same CPU-hungry behavior of rabbitmqctl status, see:

softwarefactory-project-zuul · 2019-10-14T13:16:51Z

Build succeeded.

awx-api-lint : SUCCESS in 5m 00s
awx-api : SUCCESS in 12m 02s
awx-ui : SUCCESS in 7m 14s
awx-ui-next : SUCCESS in 7m 59s
awx-swagger : SUCCESS in 12m 15s
awx-detect-schema-change : FAILURE in 12m 02s (non-voting)
awx-ansible-modules : SUCCESS in 5m 14s

wenottingham · 2019-10-14T15:31:03Z

What ensures wget and ash are in the base container image?

softwarefactory-project-zuul · 2019-10-14T16:03:55Z

Build succeeded.

awx-api-lint : SUCCESS in 2m 53s
awx-api : SUCCESS in 9m 55s
awx-ui : SUCCESS in 4m 48s
awx-ui-next : SUCCESS in 6m 11s
awx-swagger : SUCCESS in 10m 00s
awx-detect-schema-change : SUCCESS in 9m 47s (non-voting)
awx-ansible-modules : SUCCESS in 3m 16s

wanderboessenkool · 2019-10-14T16:05:24Z

> What ensures wget and ash are in the base container image?

wget is provided by busybox, as is ash. They are both present in the official awx-rabbit image, which pulls from rabbitmq:${RABBITMQ_VERSION}-management-alpine, which pulls from rabbitmq:${RABBITMQ_VERSION}-alpine, which pulls from alpine:3.10, which has both installed by default.

If you supply your own {{ kubernetes_rabbitmq_image }}, availability depends on that image. I can switch the shell to /bin/sh or /bin/bash, since both are also present in the same image and are probably present in more base images than ash.

If you want to get rid of wget it will have to be replaced by curl, or a custom python script that gets added to the image.

ryanpetrello

@matburt you have any thoughts on this?

matburt · 2019-10-17T15:01:22Z

This is pretty compelling! I'd be willing to merge this if you can show your verification methodology, how you sampled usage, and provide me a way to verify it.

wanderboessenkool · 2019-10-17T18:16:25Z

> This is pretty compelling! I'd be willing to merge this if you can show your verification methodology, how you sampled usage, and provide me a way to verify it.

Without the change, with rabbitmqctl status:

oc adm top pod --containers

POD                  NAME            CPU(cores)   MEMORY(bytes)                      
awx-0                awx-memcached   0m           7Mi                        
awx-0                awx-web         0m           552Mi              
awx-0                awx-celery      5m           458Mi                    
awx-0                awx-rabbit      206m         102M
postgresql-2-8w8qd   postgresql      4m           87Mi

With the change to HTTP-based healthchecks, grepping for {"status":"ok"}:

oc adm top pod --containers

POD                  NAME            CPU(cores)   MEMORY(bytes)
awx-0                awx-memcached   0m           7Mi                        
awx-0                awx-web         0m           552Mi                                   
awx-0                awx-celery      5m           702Mi                                   
awx-0                awx-rabbit      16m          118Mi
postgresql-2-8w8qd   postgresql      5m           91Mi

In Prometheus: rate(container_cpu_usage_seconds_total{namespace="awx", container_name="awx-rabbit"}[1m])
(replace namespace="awx" with whichever namespace AWX is running in.)
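
For anyone reproducing the comparison, the arithmetic on the two awx-rabbit samples above (206m and 16m) can be sketched as follows. The helper name millicores is hypothetical; it only converts the CPU(cores) column to an integer number of millicores.

```python
def millicores(value):
    """Convert a CPU(cores) cell such as '206m' or '1' to millicores."""
    value = value.strip()
    if value.endswith("m"):
        return int(value[:-1])
    # A bare number means whole cores.
    return int(float(value) * 1000)


before = millicores("206m")  # awx-rabbit with 'rabbitmqctl status'
after = millicores("16m")    # awx-rabbit with the HTTP-based check

saved = before - after
reduction_factor = before / after

print("saved {}m, reduction factor {:.1f}x".format(saved, reduction_factor))
```

These are single samples from one idle deployment, so the factor is indicative rather than exact.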

shanemcd

I am all for this change but we can't use /bin/ash because our downstream images are RHEL-based. Please switch to /bin/sh or /bin/bash then I'll +1.

softwarefactory-project-zuul · 2019-10-17T19:58:31Z

Build succeeded.

awx-api-lint : SUCCESS in 12m 01s
awx-api : SUCCESS in 18m 33s
awx-ui : SUCCESS in 14m 57s
awx-ui-next : SUCCESS in 16m 35s
awx-swagger : SUCCESS in 20m 06s
awx-detect-schema-change : SUCCESS in 19m 51s (non-voting)
awx-ansible-modules : SUCCESS in 14m 26s

wanderboessenkool · 2019-10-17T20:27:10Z

@shanemcd I've changed /bin/ash to /bin/sh.

Looking at the downstream image, it does not seem to have wget installed, but it does have curl. That doesn't help in trying to unify the two. I've added another change that moves the healthcheck into a small inline Python program.

Going forward that healthcheck could be moved into a separate script file, but it would have to be added to both the awx and the downstream images.

softwarefactory-project-zuul · 2019-10-17T20:39:01Z

Build succeeded.

awx-api-lint : SUCCESS in 5m 04s
awx-api : SUCCESS in 11m 22s
awx-ui : SUCCESS in 6m 28s
awx-ui-next : SUCCESS in 7m 01s
awx-swagger : SUCCESS in 11m 40s
awx-detect-schema-change : SUCCESS in 12m 16s (non-voting)
awx-ansible-modules : SUCCESS in 5m 53s

shanemcd

This is really impressive stuff. Thank you for the solid work and quick turnaround time.

shanemcd · 2019-10-17T20:55:44Z

I just noticed that we're templating out the username / password in the deployment object itself. Can we move this to read from an env var?

installer/roles/kubernetes/templates/deployment.yml.j2

wanderboessenkool · 2019-10-17T21:24:02Z

> I just noticed that we're templating out the username / password in the deployment object itself. Can we move this to read from an env var?

Yes we can :-)
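
A minimal sketch of what reading the credentials from the environment could look like on the Python side. The variable names RABBITMQ_DEFAULT_USER and RABBITMQ_DEFAULT_PASS are an assumption for illustration and may not match what the template actually sets; credentials_from_env is a hypothetical helper.

```python
import os


def credentials_from_env(environ=None,
                         user_var="RABBITMQ_DEFAULT_USER",
                         pass_var="RABBITMQ_DEFAULT_PASS"):
    """Return (user, password) read from environment variables.

    The variable names are assumptions; pass different ones if the
    deployment exposes the credentials under other names.
    """
    environ = os.environ if environ is None else environ
    user = environ.get(user_var)
    password = environ.get(pass_var)
    if not user or not password:
        raise RuntimeError(
            "missing {} or {} in the environment".format(user_var, pass_var)
        )
    return user, password
```

Reading the values at probe time keeps the username and password out of the rendered probe command in the deployment object.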

ryanpetrello

see: https://github.com/ansible/awx/pull/5009/files#r336229304

shanemcd · 2019-10-17T21:35:10Z

Apologies for the noisy feedback, juggling a lot atm and just returned from vacation today. Another thought: we could do something like this to pull this into a ConfigMap / mounted file which would be cleaner than inlining the whole script.

softwarefactory-project-zuul · 2019-10-17T21:35:21Z

Build succeeded.

awx-api-lint : SUCCESS in 2m 42s
awx-api : SUCCESS in 9m 14s
awx-ui : SUCCESS in 4m 45s
awx-ui-next : SUCCESS in 6m 10s
awx-swagger : SUCCESS in 11m 03s
awx-detect-schema-change : SUCCESS in 6m 54s (non-voting)
awx-ansible-modules : SUCCESS in 3m 48s

wanderboessenkool · 2019-10-17T21:42:07Z

> Apologies for the noisy feedback, juggling a lot atm and just returned from vacation today. Another thought: we could do something like this to pull this into a ConfigMap / mounted file which would be cleaner than inlining the whole script.

@shanemcd That would definitely be nicer on the yaml for the healthchecks, but it might make it harder for people to debug. If wanted I can make those changes tomorrow.

softwarefactory-project-zuul · 2019-10-17T21:48:07Z

Build succeeded.

awx-api-lint : SUCCESS in 3m 13s
awx-api : SUCCESS in 9m 58s
awx-ui : SUCCESS in 5m 03s
awx-ui-next : SUCCESS in 6m 15s
awx-swagger : SUCCESS in 11m 08s
awx-detect-schema-change : SUCCESS in 10m 10s (non-voting)
awx-ansible-modules : SUCCESS in 3m 25s

softwarefactory-project-zuul · 2019-10-18T08:27:45Z

Build succeeded.

awx-api-lint : SUCCESS in 3m 58s
awx-api : SUCCESS in 8m 43s
awx-ui : SUCCESS in 5m 00s
awx-ui-next : SUCCESS in 6m 07s
awx-swagger : SUCCESS in 10m 21s
awx-detect-schema-change : SUCCESS in 10m 59s (non-voting)
awx-ansible-modules : SUCCESS in 3m 33s

shanemcd · 2019-10-18T16:26:05Z

Thanks for this!

softwarefactory-project-zuul · 2019-10-18T16:48:38Z

Build succeeded (gate pipeline).

awx-api-lint : SUCCESS in 3m 15s
awx-api : SUCCESS in 10m 48s
awx-ui : SUCCESS in 5m 07s
awx-ui-next : SUCCESS in 6m 18s
awx-swagger : SUCCESS in 10m 10s
awx-detect-schema-change : SUCCESS in 11m 18s (non-voting)
awx-ansible-modules : SUCCESS in 3m 29s
awx-push-new-schema : SUCCESS in 10m 19s (non-voting)

Change 'rabbitmqctl status' to a wget | grep

e870550

- This reduces CPU usage from 250 millis on idle to 25 millis on idle
- Default rabbitmq user needs administrator privileges

awxbot added type:bug component:installer labels Oct 14, 2019

Properly escape quotes

038fd92

ryanpetrello reviewed Oct 16, 2019

shanemcd requested changes Oct 17, 2019

Change /bin/ash to /bin/sh as requested by @shanecmd

d6134fb

matburt approved these changes Oct 17, 2019

Change healthcheck from wget and grep to python with httplib

9ab58e9

shanemcd approved these changes Oct 17, 2019

ryanpetrello reviewed Oct 17, 2019

installer/roles/kubernetes/templates/deployment.yml.j2 (outdated)

Move installtime hardcoded rabbitmq credentials to environment variables for healthcheck

00c9d75

ryanpetrello suggested changes Oct 17, 2019

Make HTTPConnection import python 2,3 agnostic

c49e64e

ryanpetrello reviewed Oct 17, 2019

installer/roles/kubernetes/templates/deployment.yml.j2 (outdated)

ryanpetrello approved these changes Oct 17, 2019

Move python healthcheck script from probes to configMap

8ecc1f3

shanemcd approved these changes Oct 18, 2019

shanemcd added the mergeit label Oct 18, 2019

softwarefactory-project-zuul bot merged commit c262df0 into ansible:devel Oct 18, 2019

ryanpetrello mentioned this pull request Oct 22, 2019

AWX 8.0.0 Kubernetes deployment error #5061

Closed
