Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove some metrics that are already covered in upstream #1005

Merged
merged 3 commits into from
Dec 13, 2022

Conversation

mateuszrzeszutek
Copy link
Contributor

Removed some of the duplicated metrics (OTel-based) that are already emitted in the upstream javaagent (mostly GC related)

@mateuszrzeszutek mateuszrzeszutek requested review from a team as code owners November 17, 2022 10:37
// runtime.jvm.threads.peak was removed; easy to replace with
// max(process.runtime.jvm.threads.count)
// runtime.jvm.threads.daemon is replaced by OTel process.runtime.jvm.threads.count{daemon=true}
// runtime.jvm.threads.live is replaced by OTel process.runtime.jvm.threads.count{daemon=false}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one more meter runtime.jvm.threads.states in this class: there is currently no equivalent in OTel metrics, but there's an issue requesting this: open-telemetry/opentelemetry-java-instrumentation#7006
I think we can leave it as it is for now, hopefully at some point it'll get implemented in the upstream

.setDescription("Size of long-lived heap memory pool after reclamation.")
.buildWithCallback(measurement -> measurement.record(liveDataSize.get()));
// runtime.jvm.gc.live.data.size is replaced by OTel
// process.runtime.jvm.memory.usage_after_last_gc{pool=<long lived pools>}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two other metrics left in this class: runtime.jvm.gc.memory.allocated and runtime.jvm.gc.memory.promoted. Both of these emit data that needs to be calculated on the instrumentation side, I think it's probably not possible to reproduce these by querying the backend (dashboards).

Still, I think we should remove them: both of them depend on hard-coded GC & memory pool names, long-term maintaining them and ensuring they're correct for all GC possible implementations is going to be painful.
Also I don't know how useful these really are, maybe the "standard" memory & GC metrics are enough anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I wonder if the process.runtime.jvm.memory.allocated could perhaps replace runtime.jvm.gc.memory.allocated?

Copy link
Contributor

@breedx-splk breedx-splk Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'm good with that. It does seem weird that we have memory.allocated inside a runtime.jvm.gc namespace anyway.

I'm not sure if you saw the JFR based approach in contrib...but it's also nontrivial and bakes in some expectations about names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if you saw the JFR based approach in contrib...but it's also nontrivial and bakes in some expectations about names.

Yeeaaaah... from what I read, it seems to be similar (I can't really tell if it's 100% equivalent) to what we're doing in the profiling metric.

Copy link
Contributor

@breedx-splk breedx-splk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Onward!

.setDescription("Size of long-lived heap memory pool after reclamation.")
.buildWithCallback(measurement -> measurement.record(liveDataSize.get()));
// runtime.jvm.gc.live.data.size is replaced by OTel
// process.runtime.jvm.memory.usage_after_last_gc{pool=<long lived pools>}

Copy link
Contributor

@breedx-splk breedx-splk Nov 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'm good with that. It does seem weird that we have memory.allocated inside a runtime.jvm.gc namespace anyway.

I'm not sure if you saw the JFR based approach in contrib...but it's also nontrivial and bakes in some expectations about names.

// process.runtime.jvm.gc.duration{gc!=<concurrent gcs>}
// runtime.jvm.gc.max.data.size is replaced by OTel
// process.runtime.jvm.memory.limit{pool=<long lived pools>}
// runtime.jvm.gc.live.data.size is replaced by OTel

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runtime.jvm.gc.pause and runtime.jvm.gc.live.data.size metrics are used by Splunk Memory Profiling, we should not remove them from the agent until there is full server-side support for the new metrics.

@mateuszrzeszutek mateuszrzeszutek merged commit dc3199a into main Dec 13, 2022
@delete-merged-branch delete-merged-branch bot deleted the metrics-cleanup branch December 13, 2022 11:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants