Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

transaction: Collect stats about dropped spans #1132

Merged
merged 8 commits into from
Oct 7, 2021
Merged

transaction: Collect stats about dropped spans #1132

merged 8 commits into from
Oct 7, 2021

Conversation

marclop
Copy link
Contributor

@marclop marclop commented Sep 28, 2021

Description

Modifies the span timing reporting function to collect statistics about
dropped spans when they exceed the transaction_max_spans. Each dropped
span is now aggregated under the transaction.dropped_spans_stats array
with a maximum count of items of 128.

Each dropped span is aggregated on: destination_service_resource and
outcome reporting 2 metrics, their duration.sum.us and the count
for each of the aggregated dropped spans.

TODO

  • Changelog entry

Related issues

Closes #1113

Modifies the span timing reporting function to collect statistics about
dropped spans when they exceed the `transaction_max_spans`. Each dropped
span is now aggregated under the `transaction.dropped_spans_stats` array
with a maximum count of items of `128`.

Each dropped span is aggregated on 4 dimensions: `outcome`. `subtype`,
`type` and `destination_service_resource` and report 2 metrics, their
`duration.sum.us` and the `count` for each of the aggregated dropped
spans.

Signed-off-by: Marc Lopez Rubio <[email protected]>
@marclop marclop added enhancement New feature or request v7.16.0 labels Sep 28, 2021
@marclop marclop self-assigned this Sep 28, 2021
@apmmachine
Copy link
Contributor

apmmachine commented Sep 28, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2021-10-07T04:07:19.056+0000

  • Duration: 36 min 34 sec

  • Commit: d34abc6

Test stats 🧪

Test Results
Failed 0
Passed 11136
Skipped 268
Total 11404

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • run benchmark tests : Run the benchmark test.

@marclop
Copy link
Contributor Author

marclop commented Sep 30, 2021

We may remove the "type" and "subtype" dimensions since they're not sure on the apm-server to produce the metrics. The PR that introduces the change is elastic/apm#515.

https://github.com/elastic/apm-server/blob/9e0b8e32b63877765ab5996d926341fac2d32a94/x-pack/apm-server/aggregation/spanmetrics/aggregator.go#L255-L265

@marclop marclop marked this pull request as ready for review October 5, 2021 08:22
span.go Outdated Show resolved Hide resolved
span.go Outdated Show resolved Hide resolved
breakdown.go Outdated Show resolved Hide resolved
modelwriter.go Outdated Show resolved Hide resolved
transaction_test.go Outdated Show resolved Hide resolved
@@ -510,7 +510,7 @@ func spanSelfTimeMetrics(txName, txType, spanType, spanSubtype string, count int
},
Samples: map[string]model.Metric{
"span.self_time.count": {Value: float64(count)},
"span.self_time.sum.us": {Value: sum.Seconds() * 1000000},
"span.self_time.sum.us": {Value: float64(sum) / 1e3},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to change this since there was a case where 510000 turned into 510.00000000000006 due to the time.Duration.Seconds() logic. In any case, this should be the same as the other operation, perhaps a little more efficient, since we're only doing 1 division.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine, it's just test code anyway

Signed-off-by: Marc Lopez Rubio <[email protected]>
Copy link
Member

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I just noticed one major issue (I think) related to where dropped span metrics are accumulated. I'd like it if we can simplify the tests a bit but otherwise LGTM.

span.go Outdated Show resolved Hide resolved
transaction_test.go Outdated Show resolved Hide resolved
transaction_test.go Outdated Show resolved Hide resolved
@@ -510,7 +510,7 @@ func spanSelfTimeMetrics(txName, txType, spanType, spanSubtype string, count int
},
Samples: map[string]model.Metric{
"span.self_time.count": {Value: float64(count)},
"span.self_time.sum.us": {Value: sum.Seconds() * 1000000},
"span.self_time.sum.us": {Value: float64(sum) / 1e3},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine, it's just test code anyway

Signed-off-by: Marc Lopez Rubio <[email protected]>
Copy link
Member

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks :)

@marclop marclop merged commit 36bdfef into elastic:master Oct 7, 2021
@marclop marclop deleted the f/collect-statistics-about-dropped-spans branch October 7, 2021 06:26
@zube zube bot removed the [zube]: Done label Jan 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[META 497] Collecting statistics about dropped spans
3 participants