[APM] Java agent GC metrics visualization #36320
Pinging @elastic/apm-ui |
following up on the question about gc pools - #34708 (comment) says to use so new sample data including those labels:
The query(ies) will need to take into account this additional level of aggregation. |
Some input on this: GC names: Normally, there are two garbage collectors in HotSpot and similar JVMs: one that does minor collections and one that does major collections. Minor collections collect young objects (in this case - In addition, something I didn't see here is an issue with
Which to choose depends on how easy each option is to implement and how it behaves. To test with real data, just use the agent on Java 8. Then, to get a valid value for this metric, stop the JVM and restart it with the |
@roncohen I'm trying to figure out whether this issue is blocked by work needed in the agents. |
This is blocked on work by design, who are working to come up with next steps, AFAIK. cc @katrin-freihofner @graphaelli @nehaduggal |
yes, thanks for the link. |
[APM] Garbage collection metrics charts. Closes #36320.
* Review feedback
* Display average of delta in gc chart
#34708 implemented a metrics endpoint including 3 of the 5 metrics intended for the Java agent metrics UI. This issue is for tracking the other 2 metrics: GC rate and GC time.
GC rate is the number of garbage collection runs per pool
GC time is the amount of time spent in garbage collection per pool
Both of these are monotonically increasing counters. Therefore, both of these metrics require calculations per agent instance first, followed by some rollup to communicate values across all instances. To support that type of aggregation, agent.ephemeral_id will be stored with metrics per elastic/apm-server#2148.

Considering just GC count, given these 3 samples across 2 instances: agent abc has 1,1,8 GCs, def has 1,0,5 - the overall service graph would show 2,1,13.

One way to query per-instance values, including accounting for counter resets:
This will only consider the top X agents due to the terms aggregation. Also, I was unable to come up with a query to calculate the numbers to be graphed in a single query. One option is to calculate the sums per date histogram bucket post-query, similar to how TSVB series aggregation does it.
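To illustrate the post-query option, here is a minimal sketch of summing per date histogram bucket in application code. It is not the query from this issue: the function names are hypothetical, the cumulative counter values are made up so that the per-instance deltas match the 1,1,8 and 1,0,5 figures above, and it assumes every instance reports one counter sample per bucket with buckets aligned across instances. A drop in the counter is treated as a reset, and the new value is taken as the delta.

```python
from typing import Dict, List


def counter_deltas(samples: List[int]) -> List[int]:
    """Turn cumulative counter samples into per-bucket increases.

    Assumption: a sample lower than its predecessor means the counter
    was reset (for example, the JVM restarted), so the new value itself
    is used as the increase for that bucket.
    """
    deltas = []
    for previous, current in zip(samples, samples[1:]):
        if current >= previous:
            deltas.append(current - previous)
        else:
            deltas.append(current)  # counter reset
    return deltas


def sum_per_bucket(per_instance: Dict[str, List[int]]) -> List[int]:
    """Compute deltas per agent instance first, then sum them bucket by bucket."""
    per_instance_deltas = [counter_deltas(s) for s in per_instance.values()]
    return [sum(bucket) for bucket in zip(*per_instance_deltas)]


# Hypothetical cumulative GC counts keyed by agent.ephemeral_id.
# "def" resets between its second and third sample.
cumulative = {
    "abc": [10, 11, 12, 20],  # deltas: 1, 1, 8
    "def": [4, 5, 0, 5],      # deltas: 1, 0, 5
}

print(sum_per_bucket(cumulative))  # [2, 1, 13]
```

The first sample of each instance has no predecessor, so it produces no delta; the result has one fewer entry than the number of samples.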
To eliminate the terms aggregation limitation, a composite aggregation could be used. Another option is to use the metrics explorer as a backend for these calculations.
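As a rough sketch of the composite aggregation idea: the request below pages through every agent.ephemeral_id value instead of stopping at the top X. The aggregation name per_agent, the source name agent, the page size, and the search callable are all hypothetical; search stands in for whatever function sends a request body to Elasticsearch and returns the parsed response. Sub-aggregations for the GC metrics themselves are left out because the metric field names are not settled here.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def composite_request(
    after: Optional[Dict[str, Any]] = None, page_size: int = 1000
) -> Dict[str, Any]:
    """Build a search body with one composite source on agent.ephemeral_id."""
    composite: Dict[str, Any] = {
        "size": page_size,
        "sources": [{"agent": {"terms": {"field": "agent.ephemeral_id"}}}],
    }
    if after is not None:
        # Resume from the key returned by the previous page.
        composite["after"] = after
    return {"size": 0, "aggs": {"per_agent": {"composite": composite}}}


def iter_agent_buckets(
    search: Callable[[Dict[str, Any]], Dict[str, Any]], page_size: int = 1000
) -> Iterator[Dict[str, Any]]:
    """Yield every composite bucket, following after_key until no buckets remain."""
    after = None
    while True:
        response = search(composite_request(after, page_size))
        agg = response["aggregations"]["per_agent"]
        buckets = agg["buckets"]
        if not buckets:
            return
        yield from buckets
        after = agg.get("after_key")
        if after is None:
            return
```

Each yielded bucket carries its agent id under bucket["key"]["agent"], so per-instance calculations can run on all instances rather than only the largest ones.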
@eyalkoren Can you clarify what the pool means / which field that is in the elasticsearch document?
@sqren all yours, I hope this helps.