Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add other time accounting in HotThreads #79392

Merged
merged 17 commits into from
Oct 19, 2021

Conversation

grcevski
Copy link
Contributor

This PR introduces the concept of 'other' time in the HotThreads report. This 'other' time is calculated as the 'unknown' time the Java threads spend running, while they are not in a WAIT/BLOCKED state and are not using the CPU directly. Typically this 'other' time ends up being consumed by I/O or by the JVM itself, e.g. GC.

We are making the HotThreads CPU report sort by default by the 'Total' runnable time, which equals the CPU time + the 'Other' time. The original sort CPU report sort order can still be invoked through the REST API, by specifying &sort=cpu in the _nodes/hot_threads API url parameters. For example: _nodes/hot_threads?type=cpu&threads=1000000&sort=cpu.

Nikola Grcevski added 10 commits October 14, 2021 15:53
Currently, the HotThreads CPU report contains only
the CPU cycles that a thread executes, excluding time
spent in I/O or the JVM runtime (e.g. GC). This PR
introduces the 'other' time accounting to augment the
CPU time for better understanding of user visible time
impact.
@@ -158,6 +159,20 @@ static void initializeProbes() {
JvmInfo.jvmInfo();
}

static void initializeRuntimeMonitoring() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enabling thread contention monitoring on-demand, as we used to do it in HotThreads, is not precise enough to properly calculate the wait/blocked time. If we enable the support on demand, we need to introduce an artificial pause before we attempt to capture timings, which would make the API unnecessarily slow.

I ran various micro benchmarks and proved that enabling lock contention monitoring by default, doesn't impact the performance of either uncontended or heavily contended use case.

@@ -57,7 +57,7 @@

// NOTE: these are JVM dependent and JVM version dependent
private static final List<String> knownJDKInternalThreads = Arrays.asList(
"Signal Dispatcher", "Finalizer", "Reference Handler", "Notification Thread", "Common-Cleaner", "process reaper"
"Signal Dispatcher", "Finalizer", "Reference Handler", "Notification Thread", "Common-Cleaner", "process reaper", "DestroyJavaVM"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DestroyJavaVM is one more of the JVM threads that don't do anything in particular.

ManagementFactory.getThreadMXBean(),
SunThreadInfo.INSTANCE,
Thread.currentThread().getId(),
(interval) -> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This extra functional parameter was added to remove all timing delays when running the HotThreads unit tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you!

@@ -161,7 +198,7 @@ static boolean isIdleThread(ThreadInfo threadInfo) {
continue;
}
long allocatedBytes = type == ReportType.MEM ? sunThreadInfo.getThreadAllocatedBytes(threadIds[i]) : 0;
result.put(threadIds[i], new ThreadTimeAccumulator(threadInfos[i], cpuTime, allocatedBytes));
result.put(threadIds[i], new ThreadTimeAccumulator(threadInfos[i], interval, cpuTime, allocatedBytes));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We pass in the supplied testing interval to HotThreads to be able to calculate the 'other' time, by subtracting the CPU, wait and blocked timings.

@@ -180,16 +217,22 @@ static boolean isIdleThread(ThreadInfo threadInfo) {
return result;
}

private void setThreadWaitBlockTimeMonitoringEnabled(ThreadMXBean threadBean, boolean enabled) {
private boolean isThreadWaitBlockTimeMonitoringEnabled(ThreadMXBean threadBean) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We remove the privileged call to enable contention monitoring and replace it by simply checking if it's enabled. Bootstrap sets it by default ON now.

// Enabling thread contention monitoring is required for capturing JVM thread wait/blocked times
setThreadWaitBlockTimeMonitoringEnabled(threadBean, true);
// Capture before and after thread state with timings
Map<Long, ThreadTimeAccumulator> previousThreadInfos = getAllValidThreadInfos(threadBean, sunThreadInfo, currentThreadId);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look bigger than they are, I simply removed the try/finally blocked since we don't need to disable thread contention monitoring at the end of the HotThreads innerDetect call anymore.

// Sort by delta CPU time on thread.
List<ThreadTimeAccumulator> topThreads = new ArrayList<>(latestThreadInfos.values());
final ToLongFunction<ThreadTimeAccumulator> getter = ThreadTimeAccumulator.valueGetterForReportType(type);
if (type == ReportType.CPU && sortOder == SortOrder.TOTAL) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we sort by the 'total' runnable time, we first sort by the total time and then internally sort those at 100% by CPU.


// for each snapshot (2nd array index) find later snapshot for same thread with max number of
// identical StackTraceElements (starting from end of each)
boolean[] done = new boolean[threadElementsSnapshotCount];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code below is unchanged.

this.blockedTime = info.getBlockedTime();
this.waitedTime = info.getWaitedTime();
ThreadTimeAccumulator(ThreadInfo info, TimeValue interval, long cpuTime, long allocatedBytes) {
this.blockedTime = millisecondsToNanos(info.getBlockedTime()); // Convert to nanos to standardize
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This conversion here fixes an existing bug that I didn't notice before. Essentially, the ThreadInfo getWaitTime and getBlockedTime return the result in milliseconds, while CPU time is returned as nanoseconds. We used to produce incorrect wait times because we'd divide by interval.nanos().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I wonder if this should be documented somewhere? At least in the support database as it may come up, and the fact this was fixed here could easily be lost since the PR is not directly about this bug (it could also be fixed in a separate PR, but I understand this PR may have tests which rely on it being fixed).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about that, we should probably document that wait and blocked are fixed now. The millis to nanos is only an issue in 7.16 since I fixed the wait/blocked accounting there, previously we'd just report 0 always, so it doesn't matter what we divided with.

@@ -32,6 +33,10 @@

public class HotThreadsTests extends ESTestCase {

private long nanosHelper(long millis) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Helper to ensure we correctly match results of wait/blocked timings given we internally convert the input mills to nanoseconds so we can accurately calculate the timings and percentages.

assertThat(innerResult, containsString("30.0% (3nanos out of 10nanos) cpu usage by thread 'Thread 3'"));
assertThat(innerResult, containsString("20.0% (2nanos out of 10nanos) cpu usage by thread 'Thread 2'"));
assertThat(innerResult, containsString("10.0% (1nanos out of 10nanos) cpu usage by thread 'Thread 1'"));
assertThat(innerResult, stringContainsInOrder(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this method mainly focus on the updated output showing the separate 'cpu' and 'other' timings, as well as ensuring the orderings of output, since we have two sort modes now for CPU.

assertThat(exception.getMessage(), equalTo("thread allocated memory is not supported on this JDK"));

// making sure CPU type was not affected when isThreadAllocatedMemorySupported() == false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to do this extra test here, we already have tests that ensure CPU timings reporting works.

@@ -22,8 +22,6 @@ grant codeBase "${codebase.elasticsearch-secure-sm}" {
grant codeBase "${codebase.elasticsearch}" {
// needed for loading plugins which may expect the context class loader to be set
permission java.lang.RuntimePermission "setContextClassLoader";
// needed by HotThreads to enable wait/block time accounting on demand
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer needed since we now enable the monitoring before we install the security manager.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great to see permissions removed. :)

@grcevski grcevski added :Core/Infra/Core Core issues without another label >enhancement auto-backport Automatically create backport pull requests when merged Team:Core/Infra Meta label for core/infra team labels Oct 18, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@@ -20,8 +21,11 @@

public class NodesHotThreadsRequest extends BaseNodesRequest<NodesHotThreadsRequest> {

private static final Version SORT_SUPPORTED_VERSION = Version.V_8_0_0;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we land this in 7_16_0, I'll change 8.0.0 to reflect that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to see these versions directly in the streaminput ctor and writeTo methods. As a constant, it looks as if the value could just be changed, but it can't.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also have a person preference for setting the version to the intended backport version from the start, but what you described is acceptable. Thanks for noting it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Changed.

@grcevski grcevski requested a review from rjernst October 18, 2021 20:04
@grcevski
Copy link
Contributor Author

@elasticmachine update branch

@@ -55,7 +55,7 @@ public void testBWCSerialization() throws IOException {
out.setVersion(previous);
request.writeTo(out);
try (StreamInput in = out.bytes().streamInput()) {
in.setVersion(latest);
in.setVersion(previous);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this wrong before, out.bytes().streamInput() resets the original version that the out had before, so I had to forcibly set it to the old version to simulate 7.x sending over the request.

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I have several comments, but they are mostly minor/ideas for followup.

@@ -20,8 +21,11 @@

public class NodesHotThreadsRequest extends BaseNodesRequest<NodesHotThreadsRequest> {

private static final Version SORT_SUPPORTED_VERSION = Version.V_8_0_0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also have a person preference for setting the version to the intended backport version from the start, but what you described is acceptable. Thanks for noting it.

@@ -20,8 +21,11 @@

public class NodesHotThreadsRequest extends BaseNodesRequest<NodesHotThreadsRequest> {

private static final Version SORT_SUPPORTED_VERSION = Version.V_8_0_0;

int threads = 3;
HotThreads.ReportType type = HotThreads.ReportType.CPU;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should all be final, I'm surprised they are not. We should have a ctor to take all the members. That can be a followup, though (doesn't have to make 7.16 since this is already the state of the class).

@@ -34,6 +38,9 @@ public NodesHotThreadsRequest(StreamInput in) throws IOException {
type = HotThreads.ReportType.of(in.readString());
interval = in.readTimeValue();
snapshots = in.readInt();
if (in.getVersion().onOrAfter(SORT_SUPPORTED_VERSION)) {
sortOrder = HotThreads.SortOrder.of(in.readString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice (as a followup) to look at changing read/write enum to use the value, instead of ordinal. As it is, the enum methods for stream input/output are brittle since they rely on the enum definition not changing.

static void initializeRuntimeMonitoring() {
// We'll let the JVM boot without this functionality, however certain APIs like HotThreads will not
// function and report an error.
if (ManagementFactory.getThreadMXBean().isThreadContentionMonitoringSupported() == false) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this in a static for hot threads still? Perhaps a static method could be called in initializeProbes? I think we should strive to keep Bootstrap as small as possible, without feature specific logic like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I'll change this.

return order;
}

public static SortOrder of(String order) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remind me why valueOf can't work here, and leave a comment to that effect explaining why this method is needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regular valueOf is a static method, we can't override it, and gives an ugly error message for the end user to see, something like 'field' is not member of full class name. But now that I think about it, I can possibly avoid the string serialization for this enum, I can just do writeEnum and readEnum, since I version the sort anyway. Let me look into that and this method can simply go away.

ManagementFactory.getThreadMXBean(),
SunThreadInfo.INSTANCE,
Thread.currentThread().getId(),
(interval) -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you!

// Sort by delta CPU time on thread.
List<ThreadTimeAccumulator> topThreads = new ArrayList<>(latestThreadInfos.values());
final ToLongFunction<ThreadTimeAccumulator> getter = ThreadTimeAccumulator.valueGetterForReportType(type);
if (type == ReportType.CPU && sortOder == SortOrder.TOTAL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we validate earlier that sortOrder is only present for type cpu, then we don't need to the type check on cpu?

this.blockedTime = info.getBlockedTime();
this.waitedTime = info.getWaitedTime();
ThreadTimeAccumulator(ThreadInfo info, TimeValue interval, long cpuTime, long allocatedBytes) {
this.blockedTime = millisecondsToNanos(info.getBlockedTime()); // Convert to nanos to standardize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I wonder if this should be documented somewhere? At least in the support database as it may come up, and the fact this was fixed here could easily be lost since the PR is not directly about this bug (it could also be fixed in a separate PR, but I understand this PR may have tests which rely on it being fixed).

@@ -22,8 +22,6 @@ grant codeBase "${codebase.elasticsearch-secure-sm}" {
grant codeBase "${codebase.elasticsearch}" {
// needed for loading plugins which may expect the context class loader to be set
permission java.lang.RuntimePermission "setContextClassLoader";
// needed by HotThreads to enable wait/block time accounting on demand
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great to see permissions removed. :)

assertThat(innerResult, containsString("0.0% [cpu=0.0%, other=0.0%] (0s out of 10ms) cpu usage by thread 'Thread 2'"));
assertThat(innerResult, containsString("0.0% [cpu=0.0%, other=0.0%] (0s out of 10ms) cpu usage by thread 'Thread 1'"));

// Test with the legacy sort order
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot going on in this single test method. Can some of these cases be split into separate test methods?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good I'll get that done.

@grcevski grcevski changed the title Feature/hot threads other Add other time accounting in HotThreads Oct 19, 2021
@grcevski grcevski merged commit b6439c5 into elastic:master Oct 19, 2021
@grcevski grcevski deleted the feature/hot_threads_other branch October 19, 2021 17:47
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
7.x Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 79392

grcevski added a commit to grcevski/elasticsearch that referenced this pull request Oct 19, 2021
grcevski added a commit that referenced this pull request Oct 19, 2021
This PR introduces the concept of 'other' time in the HotThreads report.

Backport of #79392
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Oct 20, 2021
* upstream/master: (24 commits)
  Implement framework for migrating system indices (elastic#78951)
  Improve transient settings deprecation message (elastic#79504)
  Remove getValue and getValues from Field (elastic#79516)
  Store Template's mappings as bytes for disk serialization (elastic#78746)
  [ML] Add queue_capacity setting to start deployment API (elastic#79433)
  [ML] muting rest compat test issue elastic#79518 (elastic#79519)
  Avoid redundant available indices check (elastic#76540)
  Re-enable BWC tests
  TEST Ensure password 14 chars length on Kerberos FIPS tests (elastic#79496)
  [DOCS] Temporarily remove APM links (elastic#79411)
  Fix CCSDuelIT for skipped shards (elastic#79490)
  Add other time accounting in HotThreads (elastic#79392)
  Add deprecation info API entries for deprecated monitoring settings (elastic#78799)
  Add note in breaking changes for nameid_format (elastic#77785)
  Use 'migration' instead of 'upgrade' in GET system feature migration status responses (elastic#79302)
  Upgrade lucene version 8b68bf60c98 (elastic#79461)
  Use Strings#EMPTY_ARRAY (elastic#79452)
  Quicker shared cache file preallocation (elastic#79447)
  [ML] Removing some code that's obsolete for 8.0 (elastic#79444)
  Ensure indexing_data CCR requests are compressed (elastic#79413)
  ...
grcevski pushed a commit to grcevski/elasticsearch that referenced this pull request Oct 20, 2021
The sort by change has been merged in 7.x now and therefore
we should make 8.0 use the same BWC version.

Relates to elastic#79392 and elastic#79507
grcevski added a commit that referenced this pull request Oct 25, 2021
The sort by change has been merged in 7.x now and therefore
we should make 8.0 use the same BWC version.

Relates to #79392 and #79507

Co-authored-by: Nhat Nguyen <[email protected]>
grcevski added a commit to grcevski/elasticsearch that referenced this pull request Oct 26, 2021
The sort by change has been merged in 7.x now and therefore
we should make 8.0 use the same BWC version.

Relates to elastic#79392 and elastic#79507

Co-authored-by: Nhat Nguyen <[email protected]>
lockewritesdocs pushed a commit to lockewritesdocs/elasticsearch that referenced this pull request Oct 28, 2021
The sort by change has been merged in 7.x now and therefore
we should make 8.0 use the same BWC version.

Relates to elastic#79392 and elastic#79507

Co-authored-by: Nhat Nguyen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged :Core/Infra/Core Core issues without another label >enhancement Team:Core/Infra Meta label for core/infra team v7.16.0 v8.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants