Reuse AggregatorHandle with cumulative temporality #5142
Conversation
```diff
  }
- T accumulation = entry.getValue().accumulateThenReset(entry.getKey());
+ T accumulation = entry.getValue().accumulateThenReset(entry.getKey(), reset);
```
This code is pretty hard to understand and only exists to serve the Bound{Instrument} concept, which has been shelved. If we don't see a realistic chance of this being incorporated into the spec, I suggest we rip it out and reap the benefits of reduced complexity.
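The diff above can be illustrated with a simplified handle. This is a hypothetical `LongSumHandle`, not the SDK's real class; it only shows what the new `reset` flag controls: whether the handle's state is cleared after reading.

```java
// Hypothetical sketch of the accumulateThenReset(key, reset) signature change
// shown in the diff: the boolean decides whether the accumulated state is
// cleared after reading, so cumulative collection can read without resetting.
class LongSumHandle {
  private long sum;

  void record(long value) {
    sum += value;
  }

  /** Read the current accumulation; clear it only when reset is requested. */
  long accumulateThenReset(boolean reset) {
    long result = sum;
    if (reset) {
      sum = 0;
    }
    return result;
  }
}
```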
Codecov Report: Base 91.11% // Head 90.96% // Decreases project coverage by -0.15%.

```diff
@@            Coverage Diff             @@
##              main    #5142      +/-  ##
============================================
- Coverage     91.11%   90.96%   -0.15%
+ Complexity     4903     4871      -32
============================================
  Files           549      545       -4
  Lines         14501    14465      -36
  Branches       1392     1390       -2
============================================
- Hits          13212    13158      -54
- Misses          890      906      +16
- Partials        399      401       +2
```

View full report at Codecov.
```diff
@@ -118,7 +118,7 @@ void staleMetricsDropped_synchronousInstrument() {
              (Consumer<SumData<LongPointData>>)
                  sumPointData ->
                      assertThat(sumPointData.getPoints().size())
-                         .isEqualTo(10))));
+                         .isEqualTo(2000))));
```
not clear to me...is this a significant behavioral change due to this refactoring?
What I mean by that is... is this change in behavior going to be a breaking change for anyone? Seems like suddenly getting 200x the number of points could be significant to some backends.
The behavior has changed, but the current behavior is odd and the new behavior is slightly less odd, so I think it's defensible.
Currently, if the temporality is cumulative, we:
- Record data in delta temporality, limiting the storage to 2000 distinct attributes. Any attributes in excess of 2000 are dropped.
- Merge the deltas into the cumulatives
- Limit the merged cumulative to 2000 distinct attributes. If, after merging the deltas into the cumulatives, we exceed the 2000 limit, trim the cumulatives down to only those attributes seen since the last collection.

So there are two rounds of cardinality-limit enforcement, and the behavior isn't very intuitive or expected.
The new behavior is much simpler: limit the storage to 2000 distinct attributes, and drop all measurements which exceed that. This makes cardinality enforcement behave the same for cumulative and delta. I don't think this will be surprising or breaking for any users, since the change is limited to reducing cardinality after the limit has already been hit. That is, you'll see a potential reduction in cardinality only after you've hit the cardinality limit.
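The simplified enforcement can be sketched as follows. This is illustrative only: the real SDK keys `AggregatorHandle`s by `Attributes`, not strings, and the class name here is hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of single-round cardinality enforcement: one limit, applied at
// record time, identical for delta and cumulative temporality.
class SimpleStorage {
  static final int MAX_CARDINALITY = 2000;
  private final Map<String, Long> handles = new ConcurrentHashMap<>();

  /** Record a value; measurements for new attribute sets beyond the limit are dropped. */
  void record(String attributes, long value) {
    if (!handles.containsKey(attributes) && handles.size() >= MAX_CARDINALITY) {
      return; // over the limit: drop the measurement entirely
    }
    handles.merge(attributes, value, Long::sum);
  }

  int size() {
    return handles.size();
  }
}
```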
There's some work active in the spec which would change this behavior further. The most likely outcome: if a configurable cardinality limit is exceeded, log a warning and emit a single point with a static attribute key / value which aggregates measurements across all dimensions.
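That in-flight spec behavior can be sketched like this. The overflow attribute name is an assumption (the spec work was not final), and the class is a simplification, not the SDK's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the spec proposal: once the cardinality limit is hit, excess
// measurements are aggregated into a single overflow point rather than
// dropped. The attribute below is an assumed placeholder name.
class OverflowStorage {
  static final String OVERFLOW_ATTRIBUTE = "otel.metric.overflow=true"; // assumed name
  private final int maxCardinality;
  private final Map<String, Long> points = new HashMap<>();

  OverflowStorage(int maxCardinality) {
    this.maxCardinality = maxCardinality;
  }

  void record(String attributes, long value) {
    if (!points.containsKey(attributes) && points.size() >= maxCardinality) {
      // Over the limit: fold the measurement into the single overflow series.
      points.merge(OVERFLOW_ATTRIBUTE, value, Long::sum);
      return;
    }
    points.merge(attributes, value, Long::sum);
  }

  Long get(String attributes) {
    return points.get(attributes);
  }
}
```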
The metric SDK's storage situation has an unnecessary layer of abstraction called `TemporalMetricStorage`. `AsynchronousMetricStorage` and `DefaultSynchronousMetricStorage` are responsible for storing metric state for asynchronous and synchronous instruments, respectively. If you look at their code, you'll notice that each has a reference to `TemporalMetricStorage`. This class takes in a map of accumulations and spits out a `MetricData` with points of the appropriate temporality for the reader. How it produces points with the appropriate temporality differs for async vs sync instruments.
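The two directions reduce to simple arithmetic, sketched below. The class and method names are illustrative; the real classes operate on accumulations keyed by attributes rather than bare longs.

```java
// Sketch of how sync vs async storage reconcile temporality:
// synchronous instruments record deltas, so cumulative output means merging
// each delta into the running cumulative; asynchronous instruments observe
// cumulative values, so delta output means diffing consecutive observations.
class TemporalityMath {
  /** Sync instrument: merge the latest delta into the running cumulative. */
  static long syncToCumulative(long previousCumulative, long delta) {
    return previousCumulative + delta;
  }

  /** Async instrument: subtract the previous observation to produce a delta. */
  static long asyncToDelta(long previousObservation, long currentObservation) {
    return currentObservation - previousObservation;
  }
}
```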
When you think about it, it's weird that temporal metric storage has to merge deltas into cumulatives. Why does `DefaultSynchronousMetricStorage` provide `TemporalMetricStorage` with deltas? Because it implements `MetricStorage#collectAndReset`, and needs to reset.

But it doesn't actually need to reset every time. Instead of always resetting, `DefaultSynchronousMetricStorage` could reset on collection only if the temporality is delta. If cumulative, just don't reset and let the `AggregatorHandle` continue to accumulate measurements.

Switching to this approach has some pretty significant benefits:
- `AggregatorHandle`s don't need to be reallocated after collections

Check out the before and after of `HistogramCollectBenchmark`, which is currently the best JMH test for measuring memory allocation during collection.
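The "reset only for delta" collection described above can be sketched like this. The class is a hypothetical simplification of `DefaultSynchronousMetricStorage`, with strings standing in for attributes.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: instead of always resetting handles and merging deltas elsewhere,
// the storage keeps its handles across collections when cumulative.
class ResettableStorage {
  enum Temporality { DELTA, CUMULATIVE }

  private final Map<String, Long> handles = new HashMap<>();
  private final Temporality temporality;

  ResettableStorage(Temporality temporality) {
    this.temporality = temporality;
  }

  void record(String attributes, long value) {
    handles.merge(attributes, value, Long::sum);
  }

  /** Collect current points; reset the accumulation only for delta temporality. */
  Map<String, Long> collect() {
    Map<String, Long> points = new HashMap<>(handles);
    if (temporality == Temporality.DELTA) {
      handles.clear(); // delta: start fresh for the next interval
    }
    // cumulative: handles keep accumulating, so nothing is reallocated
    return points;
  }
}
```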
Before:
After:
Note I've narrowed in on the cumulative results because there's no material change for delta (as expected).
This is one of those rare changes that manages to reduce complexity and increase performance at the same time.