-
Notifications
You must be signed in to change notification settings - Fork 11.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add latency metrics and measurement #4536
Conversation
ba265f9
to
07e1f6a
Compare
@@ -420,6 +458,13 @@ impl AuthorityState { | |||
) -> Result<TransactionInfoResponse, SuiError> { | |||
let transaction_digest = *transaction.digest(); | |||
debug!(tx_digest=?transaction_digest, "handle_transaction. Tx data: {:?}", transaction.signed_data.data); | |||
let start_ts = Instant::now(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opinion: I think it would be useful to create some helper so this could be done in one line
E.g. have something like
let _latency_guard = measure_latency(&self.metrics.handle_transaction_latency)
We can also avoid cloning metrics every time in this guard
Something like this probably:
struct TimerGuard<'a> {
histogram: &'a Histogram
start: Instant,
}
impl Drop for TimerGuard {
// Record latency
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a bit awkward to make that a function due to the trait object thing so I made a macro for that.
I didn't change the clone()
as it clones a Arc
pointer (very cheap)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't mean to nit-pick, but
(a) cloning Arc
is not super expensive, but it is not 'very cheap' in comparison to other operations. It is more expensive than for example counting metric itself(which is relaxed atomic inc, where cloning Arc is a string atomic inc). Most importantly there is no reason we need to clone it, so why do it?
(b) What do you mean by 'awkward to make that a function due to the trait object thing'?
(c) I see that in macro we still define start_time in the caller and then pass to macro. Why? Could we just move it to macro itself? Also, I am really not sure why we choose to use macro here - I think rule of thumb is it can be function, then it should be function, not a macro.
Overall I think this works but it is a bit awkward solution, are there reasons we could not do it the more reasonable way with function returning guard object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry I didn't see the comment when I approve. @longbowlu please follow up with the feedback here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
07e1f6a
to
16e769c
Compare
@@ -519,11 +571,20 @@ impl AuthorityState { | |||
// to do this, since the false contention can be made arbitrarily low (no cost for 1.0 - | |||
// epsilon of txes) while solutions without false contention have slightly higher cost | |||
// for every tx. | |||
let tx_guard = self.database.acquire_tx_guard(&certificate).await?; | |||
let span = tracing::debug_span!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could just use the #[instrument()]
macro on the acquire_tx_guard function itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@velvia what would be the name of the prometheus histogram if we used #[instrument()]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tracing_span_latencies
... see
https://github.com/MystenLabs/mysten-infra/blob/main/crates/telemetry-subscribers/src/span_latency_prom.rs
No description provided.