feat: Add S3 latency metrics #397
if (sensor != null) {
    sensor.record();
    // metrics are reported per request, so 1 value can be assumed.
    if (metricValues.size() == 1) {
The API is not clear about it, but the docs mention the following: https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/metrics-list.html
"Metrics collected with each request"
So my interpretation (and what I observed while testing this) is that each instance of metric collection corresponds to a single API call.
OK, I looked into the SDK code and it seems you are right 👍
But OTOH, what harm would it do if it remains a loop? It'd be more future-proof, and we wouldn't expect this internal behavior to change in a casual dependency upgrade.
The thing is that, AFAIU, if there are multiple values we won't be able to tell which request type each latency metric belongs to. Otherwise I would agree.
Oh, I see. That's very unfortunate. Well, let's rely on our tests as much as we can then
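The attribution concern in this thread can be sketched roughly as follows. This is a minimal, stdlib-only illustration and not the code from this PR: the class `LatencyAttribution` and the method `attribute` are hypothetical, and it assumes the request type arrives as a list of operation names alongside the durations (in the AWS SDK this would presumably come from something like `CoreMetric.OPERATION_NAME`). A latency is only attributed when both lists hold exactly one value; otherwise the pairing is ambiguous and nothing is returned.

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, for illustration only; not part of the plugin or the AWS SDK.
class LatencyAttribution {

    // Pairs an operation name with its latency only when the pairing is unambiguous.
    static Optional<Map.Entry<String, Duration>> attribute(
            final List<String> operationNames, final List<Duration> durations) {
        if (operationNames.size() == 1 && durations.size() == 1) {
            return Optional.of(Map.entry(operationNames.get(0), durations.get(0)));
        }
        // With several values there is no reliable way to tell which duration
        // belongs to which request type, so nothing is attributed.
        return Optional.empty();
    }

    public static void main(final String[] args) {
        // One request, one duration: safe to attribute.
        System.out.println(attribute(
            List.of("GetObject"), List.of(Duration.ofMillis(120))));
        // Several values: ambiguous, so the result is empty.
        System.out.println(attribute(
            List.of("GetObject", "PutObject"),
            List.of(Duration.ofMillis(120), Duration.ofMillis(80))));
    }
}
```

A loop over all values would still run, but it would have to guess the request type for each one, which is why the single-value check is kept here.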
storage/s3/src/main/java/io/aiven/kafka/tieredstorage/storage/s3/MetricCollector.java
LGTM
if (sensor != null) {
    sensor.record();
    // metrics are reported per request, so 1 value can be assumed.
    if (metricValues.size() == 1) {
Hm, I'm thinking that maybe it's worth logging a warning if the size is different from 1.
I was thinking the same actually, so yeah, let's have it.
Makes sense. Adding it now.
86ec15f to dd2c69f
One more comment, but otherwise LGTM
}

final var durations = metricCollection.metricValues(CoreMetric.API_CALL_DURATION);
if (durations.size() == 1) {
I think we need the same warning in the else here.
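For readers following along, the requested shape (record when there is exactly one duration, warn in the `else`) might look roughly like this. It is a hedged, stdlib-only sketch rather than the PR's actual code: `LatencyRecorder`, `recordedMillis`, `skipped`, and the warning text are all made up for illustration, and a plain list stands in for the real metrics sensor.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical stand-in for the metric collector discussed above; names are illustrative.
class LatencyRecorder {
    private static final Logger LOG = Logger.getLogger(LatencyRecorder.class.getName());

    // Stand-in for a metrics sensor: keeps recorded latencies in milliseconds.
    final List<Long> recordedMillis = new ArrayList<>();
    // Number of collections skipped because the size was not 1.
    int skipped = 0;

    void recordLatency(final String operation, final List<Duration> durations) {
        if (durations.size() == 1) {
            recordedMillis.add(durations.get(0).toMillis());
        } else {
            // Unexpected size: do not guess, but make the situation visible in the logs.
            skipped++;
            LOG.warning("Expected exactly 1 duration for " + operation
                + " but got " + durations.size() + "; latency not recorded");
        }
    }

    public static void main(final String[] args) {
        final LatencyRecorder recorder = new LatencyRecorder();
        recorder.recordLatency("GetObject", List.of(Duration.ofMillis(120)));
        recorder.recordLatency("GetObject", List.of()); // triggers the warning
        System.out.println(recorder.recordedMillis + " recorded, " + recorder.skipped + " skipped");
    }
}
```

Counting skipped collections is an extra added here only so the behaviour is easy to assert in a test; the thread itself only asks for the warning.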
dd2c69f to fcf1817
For better monitoring, latency metrics help us understand how long the plugin waits for a response from S3.
See the commits for more details; a small refactor of the tests is included separately.