High Postgresql CPU usage #8978
-
After some more investigation I found that the PostgreSQL processes with high CPU usage have PIDs that correspond to pg_stat_activity entries running …
I've tried to interactively run select * from rhn_server.update_needed_cache(1) as result with various values (0, 1, 20), and they all return quickly with no results. So presumably the originating process is trying that query over and over again. I've tried killing the pg worker with …
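For anyone following along, here is a minimal sketch of how a busy backend can be matched to its pg_stat_activity entry and then stopped. It assumes a superuser (or the owning role) session in psql; the PID 12345 is a placeholder for whatever top or ps reports for the busy postgres process, not a value from this thread.

```sql
-- 12345 is a placeholder: substitute the PID of the busy postgres process.
-- Show what that backend is doing and how long its current statement has run.
SELECT pid, usename, state, wait_event_type, wait_event,
       now() - query_start AS running_for,
       query
FROM pg_stat_activity
WHERE pid = 12345;

-- Ask the backend to cancel its current statement (the session stays connected).
SELECT pg_cancel_backend(12345);

-- If cancelling is not enough, terminate the whole backend session instead.
SELECT pg_terminate_backend(12345);
```

Both functions return true when the signal was sent. Neither prevents the client (here presumably Taskomatic) from reconnecting and issuing the same statement again.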
-
I had the idea of looking at the Taskomatic log while running pg_cancel_backend on the PG worker that was at 100% CPU, and I saw these entries. The "canceling statement" exception matches the pg_cancel_backend call, of course. But the question is why com.redhat.rhn.manager.errata.cache.UpdateErrataCacheCommand keeps repeating that statement, and how I can fix it. Restarting Taskomatic doesn't stop it; it just picks up again after the restart.
I think I remember a sync log error in one of the Red Hat channels saying that an Errata field was too long, so currently that's my best guess as to the source of the problem. However, how would I go about clearing out whatever pending Errata update record is causing this error?
2024-08-08 04:11:26,388 [Thread-66] ERROR com.redhat.rhn.manager.errata.cache.UpdateErrataCacheCommand - Problem updating cache for server …
-
Definitely a problem with the errata cache task. The history for Bunch errata-cache-bunch shows most tasks getting skipped; the status is INTERRUPTED for the ones that were running when the server had to be rebooted for updates. The currently running errata-cache task has been running for 41992 seconds (over 11 hours). I suppose I can disable the schedule for that task, but then we won't get errata/patch information.
-
I tried switching to debug logging for the errata tasks by adding to /usr/share/rhn/classes/log4j2.xml …
by adjusting the obsolete troubleshooting instructions in the Taskomatic Wiki page. So I switched to doing the whole task module, and I get the start of many workers …
a long list of Putting worker / Put worker lines
-
Thanks for this report and continued investigation. Indeed the … For the time being, can you try whether calling …
-
Hi,
I didn't notice this at first after upgrading to 2024.05, but we now seem to have very high CPU usage on the PostgreSQL service: 2 of the CPUs are pinned. This was initially much higher. Someone else was running an LCM project promotion that was taking over a day. I restarted Taskomatic and the CPU usage maxed out as (I'm assuming) transactions were being rolled back. After some hours the PG CPU usage dropped again, but it never went below about 3 fully used CPUs. After a full reboot, PG is still maxing out 2 cores, but there appeared to be no query activity on it.
SELECT * FROM pg_stat_activity;
was just returning a large number of worker processes with idle status and one of two queries:
16384 | uyuni | 25316 | | 16385 | uyuni_db | PostgreSQL JDBC Driver | 127.0.0.1 | | 37134 | 2024-06-26 12:58:20.914405-04 | | 2024-06-26 12:59:35.914384-04 | 2024-06-26 12:59:35.914485-04 | Client | ClientRead | idle | | | | select 'c3p0 ping' from dual | client backend
16384 | uyuni | 25875 | | 16385 | uyuni_db | PostgreSQL JDBC Driver | 127.0.0.1 | | 40924 | 2024-06-26 12:59:35.915078-04 | | 2024-06-26 13:00:49.200844-04 | 2024-06-26 13:00:49.20085-04 | Client | ClientRead | idle | | | | COMMIT | client backend
Now there are more processes running queries, but the PG CPU usage is still steady at about 2 cores.
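To narrow down which sessions are actually doing work, a query along these lines can filter out the idle pool connections. This is only a sketch: the 120-character truncation and the column aliases are arbitrary choices, and it assumes a PostgreSQL version where pg_stat_activity has the backend_type column (the "client backend" value in the rows above suggests it does).

```sql
-- List client backends that are not idle, oldest statement first.
SELECT pid, application_name, state,
       now() - xact_start  AS xact_age,    -- how long the transaction has been open
       now() - query_start AS query_age,   -- how long the current statement has run
       left(query, 120)    AS query_head   -- first 120 characters of the statement
FROM pg_stat_activity
WHERE backend_type = 'client backend'
  AND state <> 'idle'
ORDER BY query_start;
```

Rows in state "idle in transaction" with a large xact_age are worth a look as well, since they hold a transaction open without running anything.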
Even after all of the above and the reboot, the GUI still shows the LCM Project status as being in the process of cloning channels.
Any idea what's going on or suggestions for further investigation?