In #1856, the behavior of StepTimer was changed to publish max only for the last publishing interval. For the purpose of availability monitoring, it is better to allow the maximum to decay over a longer period.

cc @crankydillo

Further explanation as to why:
Micrometer decays the maximum rather than aligning it to the publishing interval like it does for sum and count. If we perfectly aligned the view of maximum time to the publishing interval, a dropped metrics payload would mean we potentially miss out on seeing a particularly high maximum value (because in the next interval we'd only consider samples that occurred in that interval).
Practically, there are many reasons why a high maximum latency and a dropped metrics payload would be correlated. For example, if the application is under heavy resource pressure (like a saturated network interface), the response time for an API endpoint that is being timed (and for which a maximum value is being tracked) may be exceedingly high at the same time that a metrics post request to the monitoring system fails with a read timeout. But such conditions can be (and often are) temporary.
Perhaps you have a client-side load balancing strategy that recognizes that (from the client's perspective) API latency has gone up sharply for this instance that is under resource pressure, and begins preferring other instances. By relieving pressure on this instance it recovers.
In some subsequent interval, after the instance has recovered, it's nice to be able to push a maximum latency seen during this time of trouble that would otherwise have been skipped. In fact, it's precisely these times of duress that we care about the most, not the maximum latency under fair-weather conditions!
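To make the difference concrete, here is a minimal sketch of the two behaviors in plain Java. It is not Micrometer's implementation: StepMax and DecayingMax are hypothetical names, the buffer length of 3 is an arbitrary assumption, and interval rotation is driven by hand rather than by a clock.

```java
import java.util.Arrays;

final class MaxDecaySketch {

    /** Step-aligned max: each publish reports only the interval that just ended. */
    static final class StepMax {
        private long current;

        void record(long millis) {
            current = Math.max(current, millis);
        }

        /** Called once per publishing interval: returns the max, then resets it. */
        long publishAndReset() {
            long max = current;
            current = 0;
            return max;
        }
    }

    /** Decaying max: keeps one bucket per interval and reports the max over all buckets. */
    static final class DecayingMax {
        private final long[] buckets;
        private int head;

        DecayingMax(int bufferLength) {
            buckets = new long[bufferLength];
        }

        void record(long millis) {
            buckets[head] = Math.max(buckets[head], millis);
        }

        /** Max over the last bufferLength intervals, including the current one. */
        long poll() {
            return Arrays.stream(buckets).max().orElse(0);
        }

        /** Called at each interval boundary: the oldest bucket is cleared and reused. */
        void rotate() {
            head = (head + 1) % buckets.length;
            buckets[head] = 0;
        }
    }

    public static void main(String[] args) {
        StepMax step = new StepMax();
        DecayingMax decaying = new DecayingMax(3);

        // Interval 1: a latency spike occurs, and the metrics payload is dropped.
        step.record(5000);
        decaying.record(5000);
        step.publishAndReset(); // the value is computed, but the push fails
        decaying.rotate();

        // Interval 2: the instance has recovered and the push succeeds.
        step.record(40);
        decaying.record(40);
        System.out.println("step max published:     " + step.publishAndReset()); // 40
        System.out.println("decaying max published: " + decaying.poll());        // 5000
    }
}
```

With the step-aligned version the 5000 ms spike is never seen by the monitoring system. With the decaying version it is still reported in the next successful push, and it ages out once its bucket is reused.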
Thanks for the heads up. I can understand how the importance of max values makes its behavior different from other 'step values'. It appears you left in the refactoring that isolates the rolling logic, which will allow us to continue aligning max with the last step (and live with the ramifications of missing values) without copying too much of your code. If this comes up in the future, you might consider providing 2 versions of StepTimer.
Please consider making some additions to the javadoc/hosted doc/etc. that make it clear that max deviates from other 'step values' and why that is done.
Lastly, thanks for the project. It's working great for us so far!