-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add functionality for an automatic thread dump upon thread pool exhaustion #137
base: 2.4
Are you sure you want to change the base?
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -97,6 +97,25 @@ interface Messages extends BasicLogger { | |
@LogMessage(level = Logger.Level.ERROR) | ||
void taskSubmitFailed(@Cause RejectedExecutionException e, Runnable task); | ||
|
||
@Message(id = 15, value = "Thread pool has reached %d max threads in use. Performance may be impacted. Thread dump:\n"+ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These messages probably shouldn't have IDs like this; a thread dump should probably just go directly to the output, or be formatted otherwise like a normal JDK thread dump. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For reference, IDs have previously been assigned to auto generated thread dump to logging functions like in eap/wildfly-core itself: Or in activemq-artemis: Message ids can be quite helpful to support and I and others use them in simple tooling (https://github.com/aogburn/yala) to quickly search and handle critical known message ids or sum up key information in logging. So for a usability perspective, I'd request that it have an ID. I think to be truly formatted exactly like a normal JDK thread dump it would probably require being generated through normal JDK utilities, like forking an external process that's calling the JDK jstack command if available (then a dump can be missed if someone has not installed the jdk-devel package or is running a JRE). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, then let's make it one big message with one ID instead of multiple. Loggers can accept a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changed this to just one message id for the thread dump messages. I did leave one additional id to handle any exception from a dump attempt. Is that suitable? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @dmlloyd Just wanted to check if you'd have any further feedback here? Thanks! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've been ruminating on this. For too long I guess. :) I have a couple more comments. |
||
"*******************************************************************************\n" + | ||
"%s\n===============================================================================\n" + | ||
"End Thread dump\n*******************************************************************************\n") | ||
@LogMessage(level = Logger.Level.WARN) | ||
void exhaustedPoolDump(int maxPoolSize, String dump); | ||
|
||
@LogMessage(level = Logger.Level.WARN) | ||
@Message(id = 16, value = "Deadlock detected!\n"+ | ||
"*******************************************************************************\n" + | ||
"{0}\n===============================================================================\n" + | ||
"End Deadlock\n*******************************************************************************\n", | ||
format = Message.Format.MESSAGE_FORMAT) | ||
void deadLock(String dump); | ||
|
||
@Message(id = 17, value = "Exception thrown during generation of thread dump") | ||
@LogMessage(level = Logger.Level.WARN) | ||
void threadDumpException(@Cause Exception cause); | ||
|
||
// validation | ||
|
||
@Message(id = 100, value = "Keep-alive may only be set to 0 for this executor type") | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
/* | ||
* JBoss, Home of Professional Open Source. | ||
* | ||
* Copyright 2022 Red Hat, Inc., and individual contributors | ||
* as indicated by the @author tags. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
package org.jboss.threads; | ||
|
||
import java.io.PrintWriter; | ||
import java.io.StringWriter; | ||
import java.lang.management.LockInfo; | ||
import java.lang.management.ManagementFactory; | ||
import java.lang.management.MonitorInfo; | ||
import java.lang.management.ThreadInfo; | ||
import java.lang.management.ThreadMXBean; | ||
|
||
final class ThreadDumpUtil { | ||
|
||
public static void threadDump(int maxPoolSize) { | ||
|
||
try ( | ||
StringWriter dumpstr = new StringWriter(); | ||
PrintWriter dumpout = new PrintWriter(dumpstr); | ||
StringWriter deadlockstr = new StringWriter(); | ||
PrintWriter deadlockout = new PrintWriter(deadlockstr); | ||
) { | ||
|
||
ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean(); | ||
|
||
for (ThreadInfo threadInfo : threadMXBean.dumpAllThreads(true, true)) { | ||
dumpout.println(threadInfoToString(threadInfo)); | ||
} | ||
|
||
Messages.msg.exhaustedPoolDump(maxPoolSize, dumpstr.toString()); | ||
|
||
long[] deadlockedThreads = threadMXBean.findDeadlockedThreads(); | ||
|
||
if (deadlockedThreads != null && deadlockedThreads.length > 0) { | ||
for (ThreadInfo threadInfo : threadMXBean.getThreadInfo(deadlockedThreads, true, true)) { | ||
deadlockout.println(threadInfoToString(threadInfo)); | ||
} | ||
|
||
Messages.msg.deadLock(deadlockstr.toString()); | ||
} | ||
|
||
} catch (Exception e) { | ||
Messages.msg.threadDumpException(e); | ||
} | ||
} | ||
|
||
private static String threadInfoToString(ThreadInfo threadInfo) { | ||
StringBuilder sb = new StringBuilder("\"" + threadInfo.getThreadName() + "\"" + | ||
" Id=" + threadInfo.getThreadId() + " " + | ||
threadInfo.getThreadState()); | ||
if (threadInfo.getLockName() != null) { | ||
sb.append(" on " + threadInfo.getLockName()); | ||
} | ||
if (threadInfo.getLockOwnerName() != null) { | ||
sb.append(" owned by \"" + threadInfo.getLockOwnerName() + | ||
"\" Id=" + threadInfo.getLockOwnerId()); | ||
} | ||
if (threadInfo.isSuspended()) { | ||
sb.append(" (suspended)"); | ||
} | ||
if (threadInfo.isInNative()) { | ||
sb.append(" (in native)"); | ||
} | ||
sb.append('\n'); | ||
int i = 0; | ||
for (; i < threadInfo.getStackTrace().length; i++) { | ||
StackTraceElement ste = threadInfo.getStackTrace()[i]; | ||
sb.append("\tat " + ste.toString()); | ||
sb.append('\n'); | ||
if (i == 0 && threadInfo.getLockInfo() != null) { | ||
Thread.State ts = threadInfo.getThreadState(); | ||
switch (ts) { | ||
case BLOCKED: | ||
sb.append("\t- blocked on " + threadInfo.getLockInfo()); | ||
sb.append('\n'); | ||
break; | ||
case WAITING: | ||
sb.append("\t- waiting on " + threadInfo.getLockInfo()); | ||
sb.append('\n'); | ||
break; | ||
case TIMED_WAITING: | ||
sb.append("\t- waiting on " + threadInfo.getLockInfo()); | ||
sb.append('\n'); | ||
break; | ||
default: | ||
} | ||
} | ||
|
||
for (MonitorInfo mi : threadInfo.getLockedMonitors()) { | ||
if (mi.getLockedStackDepth() == i) { | ||
sb.append("\t- locked " + mi); | ||
sb.append('\n'); | ||
} | ||
} | ||
} | ||
|
||
LockInfo[] locks = threadInfo.getLockedSynchronizers(); | ||
if (locks.length > 0) { | ||
sb.append("\n\tNumber of locked synchronizers = " + locks.length); | ||
sb.append('\n'); | ||
for (LockInfo li : locks) { | ||
sb.append("\t- " + li); | ||
sb.append('\n'); | ||
} | ||
} | ||
sb.append('\n'); | ||
return sb.toString(); | ||
} | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be unconditional. Thread pools reaching capacity is not an error condition, it just means that the behavior changes from submitting tasks to new or waiting threads, to instead adding the task to a queue that the worker threads read from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should be the condition? This has a boolean to only log on the first occurrence of exhaustion. But if we'd accept such a thread dump utility, I think there should be some further configuration required to enable it for the pool if desired. So this would then prompt another change primarily in XNIO to enable this thread dumping for its task pool where such dumping would be desired. Such a dump would likely not be desired from the MSC thread pool upon its exhaustion for instance.
While the behavior of queuing upon is exhaustion is the expected behavior, the reality for customers using JBoss and support engineers assisting them is that this is universally a critical issue state for the XNIO task thread pool used by undertow as that begins to immediately degrade response times if not lead to full request unresponsiveness as well if already processing requests are not quickly completing. We often see multiple cases a day requiring diagnosis of such thread exhaustion induced by many varied issues and most often not from a conventional deadlock. Further details about this is recorded at https://issues.redhat.com/browse/EAPSUP-1325.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition as you suggest should be a configuration switch, and I'd also propose that there should be either two switches or a single three-state switch to separately configure whether a warning is logged when the thread pool queues, and separately whether the warning includes the thread dump.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improved this to make it configurable. Let me know any suggestions.