GitHub repo: https://github.com/myllynen/aap-troubleshooting-guide
Themed page: https://myllynen.github.io/aap-troubleshooting-guide
This page provides basic Ansible Automation Platform (AAP) troubleshooting tips and tricks.
For the Red Hat guide to troubleshooting Ansible Automation Platform, see Troubleshooting Ansible Automation Platform.
For the Red Hat automation controller troubleshooting documentation, see the AAP Troubleshooting page.
For the generic Red Hat automation controller user guide, see the AAP User Guide.
To understand an Ansible Automation Platform setup, check its topology view, instances, and settings pages in the Web UI. The locations of these pages would be something like:
- https://aap.example.com/#/topology_view
- https://aap.example.com/#/instances
- https://aap.example.com/#/settings
When working on the command line, use the following command to display AAP instances and topology:
awx-manage list_instances
To see currently running jobs, visit the Jobs page in the Web UI.
Jobs of type Workflow Job are parent jobs that run the configured number of actual jobs in parallel; see Job Slicing on the corresponding Template → Details page. Jobs are also subject to AAP-wide default configurations; see Settings → Job settings. Pay special attention to the Job Type parameter on the Template → Details page and always use Check or Run as appropriate.
To find jobs related to a certain host or group in the Web UI, use the filtering capabilities of the automation controller. For example, to find all the jobs where the limit included server23, use the following URL:
https://aap.example.com/#/jobs?job.job__limit__contains=server23
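When the same filter is needed for many hosts, the URL can be generated rather than typed. The following is a minimal sketch: the helper name `jobs_filter_url` is hypothetical, and the query key is simply copied from the example URL above, so it may differ between AAP versions.

```python
from urllib.parse import urlencode


def jobs_filter_url(base_url: str, limit_substring: str) -> str:
    """Return a Web UI jobs URL filtered on the job limit string."""
    # Query key taken verbatim from the example URL in this guide;
    # it is an assumption that other AAP versions use the same key.
    query = urlencode({"job.job__limit__contains": limit_substring})
    return f"{base_url.rstrip('/')}/#/jobs?{query}"


if __name__ == "__main__":
    for host in ("server23", "server42"):
        print(jobs_filter_url("https://aap.example.com", host))
```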
For details of a running or past job, navigate to Jobs → Number/Name → Details. The following information is found there:
- Project
- Inventory
- Job status
- Verbosity
- Name of the Execution Node
- Revision (can be used to check the contents in a git repo)
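The same details can also be read from the job's API representation, which may be handy when comparing several jobs. The sketch below assumes the job JSON has already been saved from a URL such as https://aap.example.com/api/v2/jobs/1234/ and that the field names (`status`, `verbosity`, `execution_node`, `scm_revision`, `limit`) match the deployed API version. Both the helper `summarize_job` and the sample data are hypothetical.

```python
import json

# Field names are assumptions based on the automation controller API v2
# job resource; verify them against the API of the deployed version.
FIELDS = ("name", "status", "verbosity", "execution_node", "scm_revision", "limit")


def summarize_job(job: dict) -> dict:
    """Pick the troubleshooting-relevant fields from a job's JSON document."""
    return {field: job.get(field) for field in FIELDS}


if __name__ == "__main__":
    # Hypothetical sample, not real output from any system.
    sample = json.loads(
        '{"id": 1234, "name": "Example job", "status": "failed",'
        ' "verbosity": 2, "execution_node": "en-1.example.com",'
        ' "scm_revision": "0123abcd", "limit": "server23"}'
    )
    for key, value in summarize_job(sample).items():
        print(f"{key}: {value}")
```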
For the job output, check the Output tab of the job. Hint: pressing Ctrl+- in your browser zooms out the page, which may make the output slightly easier to read. Another option is to download the output and open it in a text editor (other than Notepad, which doesn't handle Unix text files properly). A third, and sometimes the best, option is to open the API page, which shows the output in full-page mode. For instance (replace the job number as appropriate):
https://aap.example.com/api/v2/jobs/1234/stdout/
The output can be downloaded in plain text format by adding ?format=txt to the end of the URL.
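Downloading the plain text output can be scripted as well. This is a sketch under stated assumptions: the helper names are hypothetical, and authentication is assumed to use an OAuth2 token sent as a Bearer header, which may not match how a given AAP installation is configured.

```python
import urllib.request


def job_stdout_url(base_url: str, job_id: int, plain_text: bool = True) -> str:
    """Build the API URL for a job's output, optionally in plain text format."""
    url = f"{base_url.rstrip('/')}/api/v2/jobs/{job_id}/stdout/"
    if plain_text:
        url += "?format=txt"
    return url


def download_job_stdout(base_url: str, job_id: int, token: str, dest_path: str) -> None:
    """Save a job's plain text output to dest_path.

    Assumes token-based authentication; adjust to the local setup.
    """
    request = urllib.request.Request(
        job_stdout_url(base_url, job_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        out.write(response.read())


if __name__ == "__main__":
    print(job_stdout_url("https://aap.example.com", 1234))
```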
If a job is not completing as expected, it may be a good idea to increase verbosity and disable job slicing to better see what is going on. Visit the job template page, set Job Slicing to 1, and apply a suitable value for Verbosity. Note that high verbosity values (3 and above) may in some cases cause secrets to be logged. According to Ansible developers this is intended. A reasonable starting verbosity level is often 2.
If one or a few hosts in the current inventory are known to be problematic, they can be (temporarily) excluded by editing the template and adding the offending hosts to the Limit string, each prefixed by an exclamation mark (!).
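As an illustration of what such a Limit string looks like, the sketch below builds one from a list of hosts to exclude. The helper name `limit_excluding` is hypothetical, and starting from the `all` pattern is an assumption; if the template already has a limit, use that as the base pattern instead.

```python
def limit_excluding(hosts, base_pattern="all"):
    """Build an Ansible limit pattern that excludes the given hosts.

    Patterns are joined with ':' and excluded hosts are prefixed with '!'.
    """
    parts = [base_pattern] + [f"!{host}" for host in hosts]
    return ":".join(parts)


if __name__ == "__main__":
    # Paste the printed value into the template's Limit field.
    print(limit_excluding(["server23", "server42"]))
```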
In some cases it may be necessary to check the AAP and system logs on the AAP controller and/or execution node(s). First, check the AAP topology and, where available, the job details to identify the relevant nodes.
On the command line, awx-manage list_instances can be used to display AAP instances and topology. For details about automation mesh, use the receptorctl command. For instance, to display basic information about the current node and the automation mesh setup and status, use:
receptorctl --socket /run/awx-receptor/receptor.sock status
To ping a node that is part of the automation mesh, use:
receptorctl --socket /run/awx-receptor/receptor.sock ping en-1.example.com
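When several nodes need to be checked, the two commands above can be wrapped in a small script. This is a sketch only: the helper names are hypothetical, the socket path is the one used in this guide and may differ on other installations, and the script must run on a node where receptorctl is installed and the socket is readable.

```python
import subprocess

# Socket path as used in this guide; it is an assumption that the local
# installation uses the same location.
RECEPTOR_SOCKET = "/run/awx-receptor/receptor.sock"


def receptorctl_command(*args, socket_path=RECEPTOR_SOCKET):
    """Return the argument list for a receptorctl invocation."""
    return ["receptorctl", "--socket", socket_path, *args]


def run_receptorctl(*args):
    """Run receptorctl and return (exit code, standard output)."""
    result = subprocess.run(
        receptorctl_command(*args), capture_output=True, text=True, check=False
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Hypothetical node list; replace with the nodes from the topology view.
    for node in ("en-1.example.com",):
        code, output = run_receptorctl("ping", node)
        print(f"{node}: exit code {code}")
        print(output)
```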
On the controller nodes, the most relevant logs are typically in /var/log/tower.
On the execution nodes, the most relevant logs are probably /var/log/messages and those in /var/log/receptor.
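To search a whole log directory for a string such as a host name or an error keyword, something like the sketch below can be used. The helper name `find_log_lines` is hypothetical, and the `*.log` file name pattern is an assumption about how files in these directories are named. Reading them usually requires elevated privileges.

```python
from pathlib import Path


def find_log_lines(log_dir, needle, pattern="*.log"):
    """Return (file name, line number, line) for each line containing needle."""
    matches = []
    for path in sorted(Path(log_dir).glob(pattern)):
        try:
            with path.open(errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line:
                        matches.append((path.name, number, line.rstrip("\n")))
        except OSError:
            # Skip files that cannot be read, for example due to permissions.
            continue
    return matches


if __name__ == "__main__":
    for name, number, line in find_log_lines("/var/log/tower", "ERROR"):
        print(f"{name}:{number}: {line}")
```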
See also https://github.com/myllynen/rhel-troubleshooting-guide.
See also https://github.com/myllynen/aap-automation.