Attempt to align and cleanup some jmx metrics #11621

Draft
wants to merge 17 commits into
base: main
Changes from 16 commits
14 changes: 14 additions & 0 deletions instrumentation/jmx-metrics/javaagent/README.md
@@ -32,6 +32,20 @@ No targets are enabled by default. The supported target environments are listed
- [wildfly](wildfly.md)
- [hadoop](hadoop.md)

### Predefined metrics mapping

The predefined metrics do not provide an exhaustive mapping of every available JMX attribute: doing
so would be verbose, tedious to maintain, and brittle, because it relies on implementation details of
each supported target. The goal is to cover the essential metrics; advanced use cases require a
dedicated configuration.

The following guidelines are recommended when modifying or extending predefined metrics:
- stay consistent with the [semconv general guidelines](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/metrics.md#general-guidelines) for metrics.
- start metric names with the `{target}.` prefix, where `{target}` is the target system, for example `tomcat` or `hadoop`.
- align with semconv when the semantics are identical, for example:
  - `wildfly.network.io` is consistent with [`system.network.io`](https://opentelemetry.io/docs/specs/semconv/system/system-metrics/#metric-systemnetworkio)
- when a metric does not fit semconv, reuse the existing MBean attribute name as the metric suffix to preserve the semantics of the exposed MBeans, for example `tomcat.request.errorCount`: Tomcat reports any status >= 400 as an error, and another application server might define errors differently.
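
As an illustration of these guidelines, a rule for a hypothetical target named `myserver` could look like the sketch below. The MBean name, its attributes and the resulting metric names are made up for the example and do not correspond to any supported target; only the rule syntax is taken from the existing definitions.

```yaml
rules:
  # Hypothetical MBean of a hypothetical "myserver" target (assumption)
  - bean: com.example.myserver:type=RequestProcessor,name=*
    # Guideline: metric names start with the {target}. prefix
    prefix: myserver.
    metricAttribute:
      name: param(name)
    mapping:
      # Guideline: align with semconv when the semantics are identical,
      # here network.io with the network.io.direction attribute
      bytesReceived:
        metric: network.io
        type: counter
        unit: By
        desc: The number of bytes received
        metricAttribute:
          network.io.direction: const(receive)
      bytesSent:
        metric: network.io
        type: counter
        unit: By
        desc: The number of bytes sent
        metricAttribute:
          network.io.direction: const(transmit)
      # Guideline: no semconv equivalent, so the MBean attribute name
      # is reused as the metric suffix
      errorCount:
        metric: request.errorCount
        type: counter
        unit: "{request}"
        desc: The number of requests that the server reports as errors
```

With this sketch, the two byte counters are reported as a single `myserver.network.io` metric broken down by direction, while `myserver.request.errorCount` keeps whatever definition of "error" the server itself applies.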

## Configuration Files

To provide your own metric definitions, create one or more YAML configuration files, and specify their location using the `otel.jmx.config` property. Absolute or relative pathnames can be specified. For example
10 changes: 5 additions & 5 deletions instrumentation/jmx-metrics/javaagent/jetty.md
@@ -3,14 +3,14 @@
 Here is the list of metrics based on MBeans exposed by Jetty.

 | Metric Name | Type | Attributes | Description |
-| ------------------------------ | ------------- | ------------ | ---------------------------------------------------- |
+|--------------------------------|---------------|--------------|------------------------------------------------------|
 | jetty.session.sessionsCreated | Counter | resource | The number of sessions established in total |
 | jetty.session.sessionTimeTotal | Counter | resource | The total time sessions have been active |
 | jetty.session.sessionTimeMax | Gauge | resource | The maximum amount of time a session has been active |
 | jetty.session.sessionTimeMean | Gauge | resource | The mean time sessions remain active |
-| jetty.threads.busyThreads | UpDownCounter | | The current number of busy threads |
-| jetty.threads.idleThreads | UpDownCounter | | The current number of idle threads |
-| jetty.threads.maxThreads | UpDownCounter | | The maximum number of threads in the pool |
-| jetty.threads.queueSize | UpDownCounter | | The current number of threads in the queue |
+| jetty.thread.busyThreads | UpDownCounter | | The current number of busy threads |
+| jetty.thread.idleThreads | UpDownCounter | | The current number of idle threads |
+| jetty.thread.maxThreads | UpDownCounter | | The maximum number of threads in the pool |
+| jetty.thread.queueSize | UpDownCounter | | The current number of threads in the queue |
 | jetty.io.selectCount | Counter | resource, id | The number of select calls |
 | jetty.logging.LoggerCount | UpDownCounter | | The number of registered loggers by name |
@@ -8,7 +8,7 @@ rules:
       resource: param(context)
     mapping:
       sessionsCreated:
-        unit: "{sessions}"
+        unit: "{session}"
         type: counter
         desc: The number of sessions established in total
       sessionTimeTotal:
@@ -22,8 +22,8 @@ rules:
         desc: The mean time sessions remain active

   - bean: org.eclipse.jetty.util.thread:type=queuedthreadpool,id=*
-    prefix: jetty.threads.
-    unit: "{threads}"
+    prefix: jetty.thread.
+    unit: "{thread}"
     type: updowncounter
     mapping:
       busyThreads:
@@ -6,133 +6,129 @@
 rules:
   - bean: Catalina:type=GlobalRequestProcessor,name=*
     unit: "1"
-    prefix: http.server.tomcat.
+    prefix: tomcat.
     metricAttribute:
       name: param(name)
     mapping:
       errorCount:
-        metric: errorCount
+        metric: request.errorCount
         type: gauge
+        unit: "{request}"
         desc: The number of errors per second on all request processors
       requestCount:
-        metric: requestCount
+        metric: request.requestCount
         type: gauge
+        unit: "{request}"
         desc: The number of requests per second across all request processors
       maxTime:
-        metric: maxTime
+        metric: request.maxTime
         type: gauge
         unit: ms
         desc: The longest request processing time
       processingTime:
-        metric: processingTime
+        metric: request.processingTime
         type: counter
         unit: ms
         desc: Total time for processing all requests
       bytesReceived:
-        metric: traffic
+        metric: network.io
         type: counter
         unit: By
         desc: The number of bytes transmitted
         metricAttribute:
-          direction: const(received)
+          network.io.direction: const(receive)
       bytesSent:
-        metric: traffic
+        metric: network.io
         type: counter
         unit: By
         desc: The number of bytes transmitted
         metricAttribute:
-          direction: const(sent)
+          network.io.direction: const(transmit)
   - bean: Tomcat:type=GlobalRequestProcessor,name=*
     unit: "1"
-    prefix: http.server.tomcat.
+    prefix: tomcat.
     metricAttribute:
       name: param(name)
     mapping:
       errorCount:
-        metric: errorCount
+        metric: request.errorCount
         type: gauge
+        unit: "{request}"
         desc: The number of errors per second on all request processors
       requestCount:
-        metric: requestCount
+        metric: request.requestCount
         type: gauge
+        unit: "{request}"
         desc: The number of requests per second across all request processors
       maxTime:
-        metric: maxTime
+        metric: request.maxTime
         type: gauge
         unit: ms
         desc: The longest request processing time
       processingTime:
-        metric: processingTime
+        metric: request.processingTime
         type: counter
         unit: ms
         desc: Total time for processing all requests
       bytesReceived:
-        metric: traffic
+        metric: network.io
         type: counter
         unit: By
         desc: The number of bytes transmitted
         metricAttribute:
-          direction: const(received)
+          network.io.direction: const(receive)
       bytesSent:
-        metric: traffic
+        metric: network.io
         type: counter
         unit: By
         desc: The number of bytes transmitted
         metricAttribute:
-          direction: const(sent)
+          network.io.direction: const(transmit)

   - bean: Catalina:type=Manager,host=localhost,context=*
-    unit: "1"
-    prefix: http.server.tomcat.
+    unit: "{session}"
+    prefix: tomcat.session.
     type: updowncounter
     metricAttribute:
       context: param(context)
     mapping:
       activeSessions:
-        metric: sessions.activeSessions
+        metric: activeSessions
         desc: The number of active sessions
   - bean: Tomcat:type=Manager,host=localhost,context=*
-    unit: "1"
-    prefix: http.server.tomcat.
+    unit: "{session}"
+    prefix: tomcat.session.
     type: updowncounter
     metricAttribute:
       context: param(context)
     mapping:
       activeSessions:
-        metric: sessions.activeSessions
+        metric: activeSessions
         desc: The number of active sessions

   - bean: Catalina:type=ThreadPool,name=*
-    unit: "{threads}"
-    prefix: http.server.tomcat.
+    unit: "{thread}"
+    prefix: tomcat.thread.
     type: updowncounter
     metricAttribute:
       name: param(name)
     mapping:
       currentThreadCount:
-        metric: threads
-        desc: Thread Count of the Thread Pool
-        metricAttribute:
-          state: const(idle)
+        metric: currentThreadCount
+        desc: Total thread count of the thread pool
       currentThreadsBusy:
-        metric: threads
-        desc: Thread Count of the Thread Pool
-        metricAttribute:
-          state: const(busy)
+        metric: currentThreadsBusy
+        desc: Busy thread count of the thread pool
   - bean: Tomcat:type=ThreadPool,name=*
-    unit: "{threads}"
-    prefix: http.server.tomcat.
+    unit: "{thread}"
+    prefix: tomcat.thread.
     type: updowncounter
     metricAttribute:
       name: param(name)
     mapping:
       currentThreadCount:
-        metric: threads
-        desc: Thread Count of the Thread Pool
-        metricAttribute:
-          state: const(idle)
+        metric: currentThreadCount
+        desc: Total thread count of the thread pool
       currentThreadsBusy:
-        metric: threads
-        desc: Thread Count of the Thread Pool
-        metricAttribute:
-          state: const(busy)
+        metric: currentThreadsBusy
+        desc: Busy thread count of the thread pool
@@ -25,47 +25,49 @@ rules:
         unit: ns
       errorCount:
   - bean: jboss.as:subsystem=undertow,server=*,http-listener=*
+    metricPrefix: wildfly.
     metricAttribute:
       server: param(server)
       listener: param(http-listener)
     type: counter
     unit: By
     mapping:
       bytesSent:
-        metric: wildfly.network.io
+        metric: network.io
         desc: Total number of bytes transferred
         metricAttribute:
-          direction: const(out)
+          network.io.direction: const(transmit)
       bytesReceived:
-        metric: wildfly.network.io
+        metric: network.io
         desc: Total number of bytes transferred
         metricAttribute:
-          direction: const(in)
+          network.io.direction: const(receive)
   - bean: jboss.as:subsystem=datasources,data-source=*,statistics=pool
-    unit: "1"
+    unit: "{connection}"
     metricAttribute:
       data_source: param(data-source)
     mapping:
       ActiveCount:
-        metric: wildfly.db.client.connections.usage
+        metric: wildfly.db.client.connection.count
         metricAttribute:
-          state: const(used)
+          db.client.connection.state: const(used)
Contributor

Unless I'm missing something, the metric will have a metric attribute which will always have the same constant value. What is really the point of having such an attribute?

Contributor Author

You are correct: a constant metric attribute only makes sense when it provides a breakdown of the same metric. Occurrences of wildfly.db.client.connection.usage should be renamed to wildfly.db.client.connection.count. I am currently checking the implementation to validate whether the connection states overlap or form an effective partition (in which case we can provide a breakdown).

Contributor Author

When looking at the implementation and the wildfly test cases it seems that we have:

  • "available" seems close to the definition of "idle"
  • "in use" seems close to the definition of "busy"
  • "active" seems to mostly be the sum of "available" + "in use"
  • for "wait count", I haven't found any proper definition besides the docs

https://github.com/wildfly/wildfly/blob/841fea771567a71d06490a0d7e9a398dc6fdf5c0/testsuite/integration/basic/src/test/java/org/jboss/as/test/integration/jca/statistics/DataSourcePoolClearStatisticsTestCase.java#L69

https://github.com/wildfly/wildfly/blob/841fea771567a71d06490a0d7e9a398dc6fdf5c0/testsuite/integration/basic/src/test/java/org/jboss/as/test/integration/jca/capacitypolicies/ResourceAdapterCapacityPoliciesTestCase.java#L143

Given that the documentation describes both InUseCount and IdleCount as physical connection states, they effectively form a partition, so using a common metric with a constant attribute (idle | used) makes sense.

WaitCount is the number of logical connections that are waiting for a physical connection, so it would also make sense to report it through the same metric with a custom wait constant attribute.

When aggregating and removing attributes, this would return the total number of logical connections to the database pool; only the subset with the idle or used attribute value would be physical connections.

I have updated this PR to match this in fb71fa1
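
For readers following the thread, the mapping described in this comment could be sketched roughly as below. This is an illustration only, not the content of commit fb71fa1: the metric and attribute names are taken from the diff in this PR, while the `wait` attribute value, the `updowncounter` type and the descriptions are assumptions.

```yaml
rules:
  - bean: jboss.as:subsystem=datasources,data-source=*,statistics=pool
    unit: "{connection}"
    # Assumption: an additive type, so that values can be summed across states
    type: updowncounter
    metricAttribute:
      data_source: param(data-source)
    mapping:
      # Physical connections currently in use
      InUseCount:
        metric: wildfly.db.client.connection.count
        metricAttribute:
          db.client.connection.state: const(used)
        desc: The number of jdbc connections in use
      # Physical connections currently idle
      IdleCount:
        metric: wildfly.db.client.connection.count
        metricAttribute:
          db.client.connection.state: const(idle)
        desc: The number of idle jdbc connections
      # Logical connections waiting for a physical connection
      # ("wait" is a custom, assumed attribute value)
      WaitCount:
        metric: wildfly.db.client.connection.count
        metricAttribute:
          db.client.connection.state: const(wait)
        desc: The number of jdbc connections waiting for a physical connection
```

In this sketch every state is a distinct value of the same attribute, so dropping the attribute during aggregation sums the three series, which is the behavior the comment describes.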

         desc: The number of open jdbc connections
       IdleCount:
         metric: wildfly.db.client.connections.usage
         metricAttribute:
-          state: const(idle)
-        desc: The number of open jdbc connections
+          db.client.connection.state: const(idle)
+        desc: The number of idle jdbc connections
       WaitCount:
         metric: wildfly.db.client.connections.WaitCount
         type: counter
+        desc: The number of waiting jdbc connections
   - bean: jboss.as:subsystem=transactions
     type: counter
     prefix: wildfly.db.client.
-    unit: "{transactions}"
+    unit: "{transaction}"
     mapping:
       numberOfTransactions:
-        metric: transaction.NumberOfTransactions
+        metric: transaction.numberOfTransactions
       numberOfApplicationRollbacks:
         metric: rollback.count
         metricAttribute:
19 changes: 10 additions & 9 deletions instrumentation/jmx-metrics/javaagent/tomcat.md
@@ -2,12 +2,13 @@

 Here is the list of metrics based on MBeans exposed by Tomcat.

-| Metric Name | Type | Attributes | Description |
-| ------------------------------------------ | ------------- | --------------- | --------------------------------------------------------------- |
-| http.server.tomcat.sessions.activeSessions | UpDownCounter | context | The number of active sessions |
-| http.server.tomcat.errorCount | Gauge | name | The number of errors per second on all request processors |
-| http.server.tomcat.requestCount | Gauge | name | The number of requests per second across all request processors |
-| http.server.tomcat.maxTime | Gauge | name | The longest request processing time |
-| http.server.tomcat.processingTime | Counter | name | Represents the total time for processing all requests |
-| http.server.tomcat.traffic | Counter | name, direction | The number of bytes transmitted |
-| http.server.tomcat.threads | UpDownCounter | name, state | Thread Count of the Thread Pool |
+| Metric Name | Type | Attributes | Description |
+|----------------------------------|---------------|----------------------------|-----------------------------------------------------------------|
+| tomcat.session.activeSessions | UpDownCounter | context | The number of active sessions |
+| tomcat.request.errorCount | Gauge | name | The number of errors per second on all request processors |
+| tomcat.request.requestCount | Gauge | name | The number of requests per second across all request processors |
+| tomcat.request.maxTime | Gauge | name | The longest request processing time |
+| tomcat.request.processingTime | Counter | name | Represents the total time for processing all requests |
+| tomcat.network.io | Counter | name, network.io.direction | The number of bytes transmitted |
+| tomcat.thread.currentThreadCount | UpDownCounter | name | Total thread count of the thread pool |
+| tomcat.thread.currentThreadsBusy | UpDownCounter | name | Busy thread count of the thread pool |
28 changes: 14 additions & 14 deletions instrumentation/jmx-metrics/javaagent/wildfly.md
@@ -2,17 +2,17 @@

 Here is the list of metrics based on MBeans exposed by Wildfly.

-| Metric Name | Type | Attributes | Description |
-| -------------------------------------------------- | ------------- | ------------------ | ----------------------------------------------------------------------- |
-| wildfly.network.io | Counter | direction, server | Total number of bytes transferred |
-| wildfly.request.errorCount | Counter | server, listener | The number of 500 responses that have been sent by this listener |
-| wildfly.request.requestCount | Counter | server, listener | The number of requests this listener has served |
-| wildfly.request.processingTime | Counter | server, listener | The total processing time of all requests handed by this listener |
-| wildfly.session.expiredSession | Counter | deployment | Number of sessions that have expired |
-| wildfly.session.rejectedSessions | Counter | deployment | Number of rejected sessions |
-| wildfly.session.sessionsCreated | Counter | deployment | Total sessions created |
-| wildfly.session.activeSessions | UpDownCounter | deployment | Number of active sessions |
-| wildfly.db.client.connections.usage | Gauge | data_source, state | The number of open jdbc connections |
-| wildfly.db.client.connections.WaitCount | Counter | data_source | The number of requests that had to wait to obtain a physical connection |
-| wildfly.db.client.rollback.count | Counter | cause | The total number of transactions rolled back |
-| wildfly.db.client.transaction.NumberOfTransactions | Counter | | The total number of transactions (top-level and nested) created |
+| Metric Name | Type | Attributes | Description |
+|----------------------------------------------------|---------------|------------------------------------------|-------------------------------------------------------------------------|
+| wildfly.network.io | Counter | server, network.io.direction | Total number of bytes transferred |
+| wildfly.request.errorCount | Counter | server, listener | The number of 500 responses that have been sent by this listener |
+| wildfly.request.requestCount | Counter | server, listener | The number of requests this listener has served |
+| wildfly.request.processingTime | Counter | server, listener | The total processing time of all requests handed by this listener |
+| wildfly.session.expiredSession | Counter | deployment | Number of sessions that have expired |
+| wildfly.session.rejectedSessions | Counter | deployment | Number of rejected sessions |
+| wildfly.session.sessionsCreated | Counter | deployment | Total sessions created |
+| wildfly.session.activeSessions | UpDownCounter | deployment | Number of active sessions |
+| wildfly.db.client.connection.usage | Gauge | data_source, db.client.connections.state | The number of open jdbc connections |
+| wildfly.db.client.connection.WaitCount | Counter | data_source | The number of requests that had to wait to obtain a physical connection |
+| wildfly.db.client.rollback.count | Counter | cause | The total number of transactions rolled back |
+| wildfly.db.client.transaction.NumberOfTransactions | Counter | | The total number of transactions (top-level and nested) created |