Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add latency metrics and measurement #4536

Merged
merged 2 commits into from
Sep 9, 2022
Merged

add latency metrics and measurement #4536

merged 2 commits into from
Sep 9, 2022

Conversation

longbowlu
Copy link
Contributor

No description provided.

@@ -420,6 +458,13 @@ impl AuthorityState {
) -> Result<TransactionInfoResponse, SuiError> {
let transaction_digest = *transaction.digest();
debug!(tx_digest=?transaction_digest, "handle_transaction. Tx data: {:?}", transaction.signed_data.data);
let start_ts = Instant::now();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opinion: I think it would be useful to create some helper so this could be done in one line

E.g. have something like

let _latency_guard = measure_latency(&self.metrics.handle_transaction_latency)

We can also avoid cloning metrics every time in this guard
Something like this probably:

struct TimerGuard<'a> {
histogram: &'a Histogram
start: Instant,
}

impl Drop for TimerGuard {
// Record latency
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a bit awkward to make that a function due to the trait object thing so I made a macro for that.

I didn't change the clone() as it clones a Arc pointer (very cheap)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't mean to nit-pick, but
(a) cloning Arc is not super expensive, but it is not 'very cheap' in comparison to other operations. It is more expensive than for example counting metric itself(which is relaxed atomic inc, where cloning Arc is a string atomic inc). Most importantly there is no reason we need to clone it, so why do it?
(b) What do you mean by 'awkward to make that a function due to the trait object thing'?
(c) I see that in macro we still define start_time in the caller and then pass to macro. Why? Could we just move it to macro itself? Also, I am really not sure why we choose to use macro here - I think rule of thumb is it can be function, then it should be function, not a macro.

Overall I think this works but it is a bit awkward solution, are there reasons we could not do it the more reasonable way with function returning guard object?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry I didn't see the comment when I approve. @longbowlu please follow up with the feedback here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chatted wdith @andll off line. #4547 is one of the followups

@longbowlu longbowlu merged commit 19ba9e0 into main Sep 9, 2022
@longbowlu longbowlu deleted the latency-breakdown branch September 9, 2022 15:57
@@ -519,11 +571,20 @@ impl AuthorityState {
// to do this, since the false contention can be made arbitrarily low (no cost for 1.0 -
// epsilon of txes) while solutions without false contention have slightly higher cost
// for every tx.
let tx_guard = self.database.acquire_tx_guard(&certificate).await?;
let span = tracing::debug_span!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just use the #[instrument()] macro on the acquire_tx_guard function itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@velvia what would be the name of the prometheus histogram if we used #[instrument()]?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

longbowlu added a commit that referenced this pull request Sep 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants