Adding is_trace_root tag for APM Stats #23302
Bloop Bleep... Dogbot Here
Regression Detector Results
Run ID: 864eeb51-4b29-42f8-ba92-310cbdd2d283
Performance changes are noted in the perf column of each table:
Experiments with missing or malformed data
Usually, this warning means that there is no usable optimization goal data for that experiment, which could be a result of misconfiguration.
No significant changes in experiment optimization goals
Confidence level: 90.00%
There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.
perf | experiment | goal | Δ mean % | Δ mean % CI |
---|---|---|---|---|
➖ | file_to_blackhole | % cpu utilization | +3.20 | [-3.41, +9.80] |
Fine details of change detection per experiment
perf | experiment | goal | Δ mean % | Δ mean % CI |
---|---|---|---|---|
➖ | file_to_blackhole | % cpu utilization | +3.20 | [-3.41, +9.80] |
➖ | uds_dogstatsd_to_api_cpu | % cpu utilization | +0.53 | [-0.88, +1.94] |
➖ | otel_to_otel_logs | ingress throughput | +0.52 | [-0.12, +1.15] |
➖ | trace_agent_json | ingress throughput | +0.00 | [-0.03, +0.04] |
➖ | tcp_dd_logs_filter_exclude | ingress throughput | -0.00 | [-0.00, +0.00] |
➖ | uds_dogstatsd_to_api | ingress throughput | -0.00 | [-0.00, +0.00] |
➖ | trace_agent_msgpack | ingress throughput | -0.02 | [-0.03, -0.01] |
➖ | process_agent_standard_check_with_stats | memory utilization | -0.12 | [-0.16, -0.07] |
➖ | tcp_syslog_to_blackhole | ingress throughput | -0.12 | [-0.18, -0.06] |
➖ | file_tree | memory utilization | -0.15 | [-0.25, -0.05] |
➖ | process_agent_standard_check | memory utilization | -0.23 | [-0.28, -0.18] |
➖ | process_agent_real_time_mode | memory utilization | -0.63 | [-0.67, -0.58] |
➖ | idle | memory utilization | -0.88 | [-0.93, -0.84] |
Explanation
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
- Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
- Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
- Its configuration does not mark it "erratic".
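The three criteria above can be sketched as a single predicate. This is only an illustration of the rule as described, not the detector's actual code: the `Result` type, its field names, and `isRegression` are assumptions made up for this example, and the two sample rows are copied from the table above.

```go
package main

import (
	"fmt"
	"math"
)

// Result holds one row of the detector table. The type and its field names
// are hypothetical; they are not taken from the detector's source.
type Result struct {
	Experiment   string
	DeltaMeanPct float64 // estimated Δ mean %
	CILow        float64 // lower bound of the 90% confidence interval
	CIHigh       float64 // upper bound of the 90% confidence interval
	Erratic      bool    // true if the experiment's configuration marks it erratic
}

// isRegression applies the three criteria: the change is large enough,
// the confidence interval excludes zero, and the experiment is not erratic.
func isRegression(r Result, tolerancePct float64) bool {
	bigEnough := math.Abs(r.DeltaMeanPct) >= tolerancePct
	excludesZero := r.CILow > 0 || r.CIHigh < 0
	return bigEnough && excludesZero && !r.Erratic
}

func main() {
	rows := []Result{
		{"file_to_blackhole", 3.20, -3.41, 9.80, false},
		{"idle", -0.88, -0.93, -0.84, false},
	}
	for _, r := range rows {
		fmt.Printf("%s: regression=%v\n", r.Experiment, isRegression(r, 5.0))
	}
}
```

Neither row is flagged: `file_to_blackhole` has an interval that contains zero, and `idle` has an interval that excludes zero but a change well under the 5.00% tolerance.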
pkg/trace/stats/aggregation_test.go
Outdated
@@ -85,7 +85,7 @@ func TestNewAggregation(t *testing.T) {
 Meta: map[string]string{"span.kind": "client", "peer.service": "remote-service"},
 },
 false,
-Aggregation{BucketsAggregationKey: BucketsAggregationKey{Service: "a", SpanKind: "client"}},
+Aggregation{BucketsAggregationKey: BucketsAggregationKey{Service: "a", SpanKind: "client", IsParentRoot: true}},
nit: I wouldn't expect a client span to be a parent root for the trace. Could we use a more realistic example?
We probably don't need to update all of these other test cases.
Code changes look good to me - definitely make sure you have a good testing plan here including an end to end test as part of QA process
pkg/trace/stats/concentrator_test.go
Outdated
-testSpan(now, 3, 0, 40, 2, "A2", "resource2", 2, nil),
-testSpan(now, 4, 0, 300000000000, 2, "A2", "resource2", 2, nil), // 5 minutes trace
+testSpan(now, 3, 1, 40, 2, "A2", "resource2", 2, nil),
+testSpan(now, 4, 1, 300000000000, 2, "A2", "resource2", 2, nil), // 5 minutes trace
Can we avoid making changes to the existing cases? The choice of parent ID here was likely intentional.
If there's something about this test that we want to evaluate with the new flag, let's make a separate test case to do so.
I understand having to update the other tests where parent_id is 0 so we now expect the flag to be set to true.
Again though, let's see if we can avoid making changes to the parent IDs for the input test data.
👍 Moved root tag testing to a new test function
releasenotes/notes/apm-adding-is_trace_root-tag-for-APM-Stats-f3f4384105897d11.yaml
Outdated
Minor suggestion, but approving.
…f3f4384105897d11.yaml Co-authored-by: Rosa Trieu <[email protected]>
We're just about there! I have some suggestions to incorporate and then I will ✅
pkg/trace/stats/concentrator_test.go
Outdated
@@ -418,7 +421,7 @@ func TestConcentratorStatsCounts(t *testing.T) {
 testSpan(now, 9, 0, 30, 1, "A2", "resource2", 2, nil),
 testSpan(now, 10, 0, 3600000000000, 1, "A2", "resourcefoo", 0, nil), // 1 hour trace
 // present data, part of the third flush
-testSpan(now, 6, 0, 24, 0, "A1", "resource2", 0, nil),
+testSpan(now, 6, 100, 24, 0, "A1", "resource2", 0, nil),
Why is the parentID changed here?
pkg/trace/stats/concentrator_test.go
Outdated
spans := []*pb.Span{
testSpan(now, 1, 0, 40, 10, "A1", "resource1", 0, nil),
testSpan(now, 1, 0, 40, 10, "A1", "resource1", 0, nil),
testSpan(now, 1, 100, 30, 10, "A1", "resource1", 0, nil),
This is a trace where we have three identical-looking spans, two of which are root spans and one of which is orphaned (its parent is 100 but there's no span here with that ID). It might be odd to have two root spans in a trace.
Let's set something up that is a little more of what we'd expect:
- First span as is
- Second span has span ID of 2 and parent ID of 1 (so it's a child of the first span)
- Third span is a child of the second span and has a span ID of 3; let's also pass a map[string]string as below in order to mark it as a client span
- Fourth span can be an orphaned case like the third span today
Here's the map to use for the third span:
map[string]string{"span.kind": "client"}
I would expect stats for only these spans:
- The first span because it's a root, top-level span
- The third span because it's a client span
- The fourth span because it gets marked as top-level (part of the logic for orphaned spans, see ComputeTopLevel(trace pb.Trace))
Only the stats for the first and fourth spans will have IsTraceRoot: true and the stats for the third span will have IsTraceRoot: false.
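As a rough sketch of the four-span layout proposed above, the snippet below builds the trace and lists which span IDs would be expected to produce stats. The `span` struct and `statsSpanIDs` are hypothetical stand-ins written for this example, not `pb.Span` or the concentrator's real logic. The eligibility rule (root, orphan, or client span) is a simplification of what the comment describes, and it assumes stats are computed for client spans. The `IsTraceRoot` value itself is not modeled here.

```go
package main

import "fmt"

// span is a hypothetical stand-in for pb.Span that carries only the
// fields this sketch needs.
type span struct {
	SpanID   uint64
	ParentID uint64
	Service  string
	Meta     map[string]string
}

// statsSpanIDs returns the IDs of the spans expected to produce stats:
// roots (ParentID == 0), orphans (parent absent from the trace), and
// client spans. This rule is an assumption made for illustration.
func statsSpanIDs(trace []span) []uint64 {
	present := make(map[uint64]bool, len(trace))
	for _, s := range trace {
		present[s.SpanID] = true
	}
	var ids []uint64
	for _, s := range trace {
		root := s.ParentID == 0
		orphan := !root && !present[s.ParentID]
		client := s.Meta["span.kind"] == "client" // reading a nil map yields ""
		if root || orphan || client {
			ids = append(ids, s.SpanID)
		}
	}
	return ids
}

func main() {
	trace := []span{
		{SpanID: 1, ParentID: 0, Service: "A1"},   // root span
		{SpanID: 2, ParentID: 1, Service: "A1"},   // child of span 1
		{SpanID: 3, ParentID: 2, Service: "A1", Meta: map[string]string{"span.kind": "client"}},
		{SpanID: 4, ParentID: 100, Service: "A1"}, // orphan: no span with ID 100
	}
	fmt.Println(statsSpanIDs(trace)) // prints [1 3 4]
}
```

The second span is skipped because its parent is present in the trace and it carries no `span.kind` tag.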
You may find it preferable to make your own spans instead of relying on testSpan. They should be simple enough structs.
Reworked test case
---
features:
  - |
    APM Stats now includes an is_trace_root field to indicate if the stats are from the root span of a trace.
-APM Stats now includes an is_trace_root field to indicate if the stats are from the root span of a trace.
+APM stats now include an is_trace_root field to indicate if the stats are from the root span of a trace.
Changed!
Test changes on VM
Use this command from test-infra-definitions to manually test this PR's changes on a VM:
inv create-vm --pipeline-id=30592004 --os-family=ubuntu
Regression Detector
Regression Detector Results
Run ID: d8f06fb4-2637-485f-b38e-3d7e120fd49b
Performance changes are noted in the perf column of each table:
No significant changes in experiment optimization goals
Confidence level: 90.00%
There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.
perf | experiment | goal | Δ mean % | Δ mean % CI |
---|---|---|---|---|
➖ | file_to_blackhole | % cpu utilization | -1.33 | [-7.67, +5.01] |
Fine details of change detection per experiment
perf | experiment | goal | Δ mean % | Δ mean % CI |
---|---|---|---|---|
➖ | pycheck_1000_100byte_tags | % cpu utilization | +1.27 | [-3.67, +6.22] |
➖ | idle | memory utilization | +0.72 | [+0.66, +0.77] |
➖ | tcp_syslog_to_blackhole | ingress throughput | +0.35 | [+0.26, +0.45] |
➖ | uds_dogstatsd_to_api_cpu | % cpu utilization | +0.28 | [-2.48, +3.03] |
➖ | tcp_dd_logs_filter_exclude | ingress throughput | +0.03 | [-0.01, +0.06] |
➖ | trace_agent_msgpack | ingress throughput | -0.01 | [-0.02, -0.00] |
➖ | uds_dogstatsd_to_api | ingress throughput | -0.02 | [-0.22, +0.19] |
➖ | trace_agent_json | ingress throughput | -0.03 | [-0.06, +0.00] |
➖ | otel_to_otel_logs | ingress throughput | -0.05 | [-0.46, +0.37] |
➖ | process_agent_real_time_mode | memory utilization | -0.07 | [-0.11, -0.03] |
➖ | process_agent_standard_check_with_stats | memory utilization | -0.35 | [-0.39, -0.31] |
➖ | process_agent_standard_check | memory utilization | -0.41 | [-0.45, -0.36] |
➖ | file_tree | memory utilization | -0.49 | [-0.59, -0.39] |
➖ | basic_py_check | % cpu utilization | -0.68 | [-3.23, +1.87] |
➖ | file_to_blackhole | % cpu utilization | -1.33 | [-7.67, +5.01] |
/merge
🚂 MergeQueue
Pull request added to the queue.
This build is going to start soon! (estimated merge in less than 29m)
* proto change to add is_parent_root
* move field
* proto and isParentRoot usage in the bucketAggKey
* test fixes
* test fixes
* rename to isTraceRoot and refactor tests
* unit tests
* root_tag unit tests
* change isTraceRoot from bool to enum
* made root tags a separate test in concentrator_test.go
* release notes
* whitespace
* Update releasenotes/notes/apm-adding-is_trace_root-tag-for-APM-Stats-f3f4384105897d11.yaml
  Co-authored-by: Rosa Trieu <[email protected]>
* changes
---------
Co-authored-by: Rosa Trieu <[email protected]>
What does this PR do?
This PR lets the apm-agent report whether stats come from the root span of a trace, and adds that information to APM Stats as the is_trace_root field. Document for context: https://docs.google.com/document/d/1Ha4MxePXqcI2sMCxMfOQTaeuP7_YkkzRpxsve55aUd4/edit?usp=sharing
Motivation
Having the is_trace_root tag can help APM features to distinguish between APM stats that come from root spans vs. non-root spans.
Additional Notes
Possible Drawbacks / Trade-offs
Adding a tag will increase the cardinality of trace metrics
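To make the cardinality concern concrete, here is a small sketch that counts distinct aggregation keys with and without a root flag. All types and names are hypothetical and much smaller than the real aggregation key. The flag is modeled as a bool for brevity, although the commit list mentions it became an enum, and "root" is assumed to mean a parent ID of 0, as in the review discussion.

```go
package main

import "fmt"

// keyWithout and keyWith are hypothetical, reduced aggregation keys.
type keyWithout struct{ Service, Resource string }

type keyWith struct {
	Service, Resource string
	IsTraceRoot       bool // modeled as a bool here for brevity
}

// obs is a hypothetical observed span reduced to what the keys need.
type obs struct {
	Service, Resource string
	ParentID          uint64
}

// countKeys returns the number of distinct keys without and with the
// root flag. A span is treated as a trace root when ParentID == 0.
func countKeys(spans []obs) (without, with int) {
	a := map[keyWithout]struct{}{}
	b := map[keyWith]struct{}{}
	for _, s := range spans {
		a[keyWithout{s.Service, s.Resource}] = struct{}{}
		b[keyWith{s.Service, s.Resource, s.ParentID == 0}] = struct{}{}
	}
	return len(a), len(b)
}

func main() {
	spans := []obs{
		{"A1", "resource1", 0}, // root
		{"A1", "resource1", 1}, // non-root, same service and resource
		{"A2", "resource2", 0}, // root
	}
	without, with := countKeys(spans)
	fmt.Println(without, with) // prints 2 3
}
```

A key splits only when the same service and resource are seen on both root and non-root spans, so in this model the growth is bounded by one extra key per existing key for each additional value the flag can take.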
Describe how to test/QA your changes
We'll own the QA for this PR