-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add other time accounting in HotThreads #79392
Conversation
Currently, the HotThreads CPU report contains only the CPU cycles that a thread executes, excluding time spent in I/O or the JVM runtime (e.g. GC). This PR introduces the 'other' time accounting to augment the CPU time for better understanding of user visible time impact.
@@ -158,6 +159,20 @@ static void initializeProbes() { | |||
JvmInfo.jvmInfo(); | |||
} | |||
|
|||
static void initializeRuntimeMonitoring() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enabling thread contention monitoring
on-demand, as we used to do it in HotThreads, is not precise enough to properly calculate the wait/blocked time. If we enable the support on demand, we need to introduce an artificial pause before we attempt to capture timings, which would make the API unnecessarily slow.
I ran various micro benchmarks and proved that enabling lock contention monitoring by default, doesn't impact the performance of either uncontended or heavily contended use case.
@@ -57,7 +57,7 @@ | |||
|
|||
// NOTE: these are JVM dependent and JVM version dependent | |||
private static final List<String> knownJDKInternalThreads = Arrays.asList( | |||
"Signal Dispatcher", "Finalizer", "Reference Handler", "Notification Thread", "Common-Cleaner", "process reaper" | |||
"Signal Dispatcher", "Finalizer", "Reference Handler", "Notification Thread", "Common-Cleaner", "process reaper", "DestroyJavaVM" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DestroyJavaVM is one more of the JVM threads that don't do anything in particular.
ManagementFactory.getThreadMXBean(), | ||
SunThreadInfo.INSTANCE, | ||
Thread.currentThread().getId(), | ||
(interval) -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This extra functional parameter was added to remove all timing delays when running the HotThreads unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thank you!
@@ -161,7 +198,7 @@ static boolean isIdleThread(ThreadInfo threadInfo) { | |||
continue; | |||
} | |||
long allocatedBytes = type == ReportType.MEM ? sunThreadInfo.getThreadAllocatedBytes(threadIds[i]) : 0; | |||
result.put(threadIds[i], new ThreadTimeAccumulator(threadInfos[i], cpuTime, allocatedBytes)); | |||
result.put(threadIds[i], new ThreadTimeAccumulator(threadInfos[i], interval, cpuTime, allocatedBytes)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We pass in the supplied testing interval to HotThreads to be able to calculate the 'other' time, by subtracting the CPU, wait and blocked timings.
@@ -180,16 +217,22 @@ static boolean isIdleThread(ThreadInfo threadInfo) { | |||
return result; | |||
} | |||
|
|||
private void setThreadWaitBlockTimeMonitoringEnabled(ThreadMXBean threadBean, boolean enabled) { | |||
private boolean isThreadWaitBlockTimeMonitoringEnabled(ThreadMXBean threadBean) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We remove the privileged call to enable contention monitoring and replace it by simply checking if it's enabled. Bootstrap sets it by default ON now.
// Enabling thread contention monitoring is required for capturing JVM thread wait/blocked times | ||
setThreadWaitBlockTimeMonitoringEnabled(threadBean, true); | ||
// Capture before and after thread state with timings | ||
Map<Long, ThreadTimeAccumulator> previousThreadInfos = getAllValidThreadInfos(threadBean, sunThreadInfo, currentThreadId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look bigger than they are, I simply removed the try/finally blocked since we don't need to disable thread contention monitoring at the end of the HotThreads innerDetect call anymore.
// Sort by delta CPU time on thread. | ||
List<ThreadTimeAccumulator> topThreads = new ArrayList<>(latestThreadInfos.values()); | ||
final ToLongFunction<ThreadTimeAccumulator> getter = ThreadTimeAccumulator.valueGetterForReportType(type); | ||
if (type == ReportType.CPU && sortOder == SortOrder.TOTAL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we sort by the 'total' runnable time, we first sort by the total time and then internally sort those at 100% by CPU.
|
||
// for each snapshot (2nd array index) find later snapshot for same thread with max number of | ||
// identical StackTraceElements (starting from end of each) | ||
boolean[] done = new boolean[threadElementsSnapshotCount]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code below is unchanged.
this.blockedTime = info.getBlockedTime(); | ||
this.waitedTime = info.getWaitedTime(); | ||
ThreadTimeAccumulator(ThreadInfo info, TimeValue interval, long cpuTime, long allocatedBytes) { | ||
this.blockedTime = millisecondsToNanos(info.getBlockedTime()); // Convert to nanos to standardize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This conversion here fixes an existing bug that I didn't notice before. Essentially, the ThreadInfo getWaitTime and getBlockedTime return the result in milliseconds, while CPU time is returned as nanoseconds. We used to produce incorrect wait times because we'd divide by interval.nanos().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I wonder if this should be documented somewhere? At least in the support database as it may come up, and the fact this was fixed here could easily be lost since the PR is not directly about this bug (it could also be fixed in a separate PR, but I understand this PR may have tests which rely on it being fixed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about that, we should probably document that wait and blocked are fixed now. The millis to nanos is only an issue in 7.16 since I fixed the wait/blocked accounting there, previously we'd just report 0 always, so it doesn't matter what we divided with.
@@ -32,6 +33,10 @@ | |||
|
|||
public class HotThreadsTests extends ESTestCase { | |||
|
|||
private long nanosHelper(long millis) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Helper to ensure we correctly match results of wait/blocked timings given we internally convert the input mills to nanoseconds so we can accurately calculate the timings and percentages.
assertThat(innerResult, containsString("30.0% (3nanos out of 10nanos) cpu usage by thread 'Thread 3'")); | ||
assertThat(innerResult, containsString("20.0% (2nanos out of 10nanos) cpu usage by thread 'Thread 2'")); | ||
assertThat(innerResult, containsString("10.0% (1nanos out of 10nanos) cpu usage by thread 'Thread 1'")); | ||
assertThat(innerResult, stringContainsInOrder( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this method mainly focus on the updated output showing the separate 'cpu' and 'other' timings, as well as ensuring the orderings of output, since we have two sort modes now for CPU.
assertThat(exception.getMessage(), equalTo("thread allocated memory is not supported on this JDK")); | ||
|
||
// making sure CPU type was not affected when isThreadAllocatedMemorySupported() == false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to do this extra test here, we already have tests that ensure CPU timings reporting works.
@@ -22,8 +22,6 @@ grant codeBase "${codebase.elasticsearch-secure-sm}" { | |||
grant codeBase "${codebase.elasticsearch}" { | |||
// needed for loading plugins which may expect the context class loader to be set | |||
permission java.lang.RuntimePermission "setContextClassLoader"; | |||
// needed by HotThreads to enable wait/block time accounting on demand |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer needed since we now enable the monitoring before we install the security manager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's great to see permissions removed. :)
Pinging @elastic/es-core-infra (Team:Core/Infra) |
@@ -20,8 +21,11 @@ | |||
|
|||
public class NodesHotThreadsRequest extends BaseNodesRequest<NodesHotThreadsRequest> { | |||
|
|||
private static final Version SORT_SUPPORTED_VERSION = Version.V_8_0_0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we land this in 7_16_0, I'll change 8.0.0 to reflect that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to see these versions directly in the streaminput ctor and writeTo methods. As a constant, it looks as if the value could just be changed, but it can't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also have a person preference for setting the version to the intended backport version from the start, but what you described is acceptable. Thanks for noting it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Changed.
@elasticmachine update branch |
@@ -55,7 +55,7 @@ public void testBWCSerialization() throws IOException { | |||
out.setVersion(previous); | |||
request.writeTo(out); | |||
try (StreamInput in = out.bytes().streamInput()) { | |||
in.setVersion(latest); | |||
in.setVersion(previous); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had this wrong before, out.bytes().streamInput()
resets the original version that the out had before, so I had to forcibly set it to the old version to simulate 7.x sending over the request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I have several comments, but they are mostly minor/ideas for followup.
@@ -20,8 +21,11 @@ | |||
|
|||
public class NodesHotThreadsRequest extends BaseNodesRequest<NodesHotThreadsRequest> { | |||
|
|||
private static final Version SORT_SUPPORTED_VERSION = Version.V_8_0_0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also have a person preference for setting the version to the intended backport version from the start, but what you described is acceptable. Thanks for noting it.
@@ -20,8 +21,11 @@ | |||
|
|||
public class NodesHotThreadsRequest extends BaseNodesRequest<NodesHotThreadsRequest> { | |||
|
|||
private static final Version SORT_SUPPORTED_VERSION = Version.V_8_0_0; | |||
|
|||
int threads = 3; | |||
HotThreads.ReportType type = HotThreads.ReportType.CPU; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should all be final, I'm surprised they are not. We should have a ctor to take all the members. That can be a followup, though (doesn't have to make 7.16 since this is already the state of the class).
@@ -34,6 +38,9 @@ public NodesHotThreadsRequest(StreamInput in) throws IOException { | |||
type = HotThreads.ReportType.of(in.readString()); | |||
interval = in.readTimeValue(); | |||
snapshots = in.readInt(); | |||
if (in.getVersion().onOrAfter(SORT_SUPPORTED_VERSION)) { | |||
sortOrder = HotThreads.SortOrder.of(in.readString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice (as a followup) to look at changing read/write enum to use the value, instead of ordinal. As it is, the enum methods for stream input/output are brittle since they rely on the enum definition not changing.
static void initializeRuntimeMonitoring() { | ||
// We'll let the JVM boot without this functionality, however certain APIs like HotThreads will not | ||
// function and report an error. | ||
if (ManagementFactory.getThreadMXBean().isThreadContentionMonitoringSupported() == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this in a static for hot threads still? Perhaps a static method could be called in initializeProbes? I think we should strive to keep Bootstrap as small as possible, without feature specific logic like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, I'll change this.
return order; | ||
} | ||
|
||
public static SortOrder of(String order) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remind me why valueOf
can't work here, and leave a comment to that effect explaining why this method is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regular valueOf is a static method, we can't override it, and gives an ugly error message for the end user to see, something like 'field' is not member of full class name. But now that I think about it, I can possibly avoid the string serialization for this enum, I can just do writeEnum and readEnum, since I version the sort anyway. Let me look into that and this method can simply go away.
ManagementFactory.getThreadMXBean(), | ||
SunThreadInfo.INSTANCE, | ||
Thread.currentThread().getId(), | ||
(interval) -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thank you!
// Sort by delta CPU time on thread. | ||
List<ThreadTimeAccumulator> topThreads = new ArrayList<>(latestThreadInfos.values()); | ||
final ToLongFunction<ThreadTimeAccumulator> getter = ThreadTimeAccumulator.valueGetterForReportType(type); | ||
if (type == ReportType.CPU && sortOder == SortOrder.TOTAL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we validate earlier that sortOrder is only present for type cpu, then we don't need to the type check on cpu?
this.blockedTime = info.getBlockedTime(); | ||
this.waitedTime = info.getWaitedTime(); | ||
ThreadTimeAccumulator(ThreadInfo info, TimeValue interval, long cpuTime, long allocatedBytes) { | ||
this.blockedTime = millisecondsToNanos(info.getBlockedTime()); // Convert to nanos to standardize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I wonder if this should be documented somewhere? At least in the support database as it may come up, and the fact this was fixed here could easily be lost since the PR is not directly about this bug (it could also be fixed in a separate PR, but I understand this PR may have tests which rely on it being fixed).
@@ -22,8 +22,6 @@ grant codeBase "${codebase.elasticsearch-secure-sm}" { | |||
grant codeBase "${codebase.elasticsearch}" { | |||
// needed for loading plugins which may expect the context class loader to be set | |||
permission java.lang.RuntimePermission "setContextClassLoader"; | |||
// needed by HotThreads to enable wait/block time accounting on demand |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's great to see permissions removed. :)
assertThat(innerResult, containsString("0.0% [cpu=0.0%, other=0.0%] (0s out of 10ms) cpu usage by thread 'Thread 2'")); | ||
assertThat(innerResult, containsString("0.0% [cpu=0.0%, other=0.0%] (0s out of 10ms) cpu usage by thread 'Thread 1'")); | ||
|
||
// Test with the legacy sort order |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a lot going on in this single test method. Can some of these cases be split into separate test methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good I'll get that done.
💔 Backport failed
You can use sqren/backport to manually backport by running |
This PR introduces the concept of 'other' time in the HotThreads report. Backport of #79392
* upstream/master: (24 commits) Implement framework for migrating system indices (elastic#78951) Improve transient settings deprecation message (elastic#79504) Remove getValue and getValues from Field (elastic#79516) Store Template's mappings as bytes for disk serialization (elastic#78746) [ML] Add queue_capacity setting to start deployment API (elastic#79433) [ML] muting rest compat test issue elastic#79518 (elastic#79519) Avoid redundant available indices check (elastic#76540) Re-enable BWC tests TEST Ensure password 14 chars length on Kerberos FIPS tests (elastic#79496) [DOCS] Temporarily remove APM links (elastic#79411) Fix CCSDuelIT for skipped shards (elastic#79490) Add other time accounting in HotThreads (elastic#79392) Add deprecation info API entries for deprecated monitoring settings (elastic#78799) Add note in breaking changes for nameid_format (elastic#77785) Use 'migration' instead of 'upgrade' in GET system feature migration status responses (elastic#79302) Upgrade lucene version 8b68bf60c98 (elastic#79461) Use Strings#EMPTY_ARRAY (elastic#79452) Quicker shared cache file preallocation (elastic#79447) [ML] Removing some code that's obsolete for 8.0 (elastic#79444) Ensure indexing_data CCR requests are compressed (elastic#79413) ...
The sort by change has been merged in 7.x now and therefore we should make 8.0 use the same BWC version. Relates to elastic#79392 and elastic#79507
The sort by change has been merged in 7.x now and therefore we should make 8.0 use the same BWC version. Relates to #79392 and #79507 Co-authored-by: Nhat Nguyen <[email protected]>
The sort by change has been merged in 7.x now and therefore we should make 8.0 use the same BWC version. Relates to elastic#79392 and elastic#79507 Co-authored-by: Nhat Nguyen <[email protected]>
The sort by change has been merged in 7.x now and therefore we should make 8.0 use the same BWC version. Relates to elastic#79392 and elastic#79507 Co-authored-by: Nhat Nguyen <[email protected]>
This PR introduces the concept of 'other' time in the HotThreads report. This 'other' time is calculated as the 'unknown' time the Java threads spend running, while they are not in a WAIT/BLOCKED state and are not using the CPU directly. Typically this 'other' time ends up being consumed by I/O or by the JVM itself, e.g. GC.
We are making the HotThreads CPU report sort by default by the 'Total' runnable time, which equals the CPU time + the 'Other' time. The original sort CPU report sort order can still be invoked through the REST API, by specifying
&sort=cpu
in the_nodes/hot_threads
API url parameters. For example:_nodes/hot_threads?type=cpu&threads=1000000&sort=cpu
.