set trace log level for our flakyest tests #21595

akarpz · 2023-12-15T20:28:58Z

What does this PR do?

Sets a few tests to trace log level and adds trace logs for conntrack offset guessing specifically. This will make CI output noisy for a while, but we expect to remove it shortly, once the additional information lets us debug the failures.

Motivation

Additional Notes

Possible Drawbacks / Trade-offs

Describe how to test/QA your changes

Reviewer's Checklist

pr-commenter · 2023-12-15T22:10:02Z

Bloop Bleep... Dogbot Here

Regression Detector Results

Run ID: 4c863cf9-dcf0-4bf1-971a-533ec80d360b
Baseline: 2618fb9
Comparison: 5c7d4f4
Total CPUs: 7

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Declared stable experiments that are now erratic

An experiment is erratic (i.e., not stable) if its coefficient of variation is at least 0.10.

perf	experiment	goal	Δ mean %	Δ mean % CI	confidence
➖	otel_to_otel_logs	ingress throughput	-3.22	[-3.96, -2.48]	100.00%

Declared erratic experiments that are now stable

An experiment is stable (i.e., not erratic) if its coefficient of variation is less than 0.10.

perf	experiment	goal	Δ mean %	Δ mean % CI	confidence
➖	idle	memory utilization	+0.22	[+0.19, +0.25]	100.00%
➖	file_tree	memory utilization	-0.47	[-0.58, -0.37]	100.00%
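
The stability rule quoted in the two sections above can be sketched in Go as follows. This is a minimal illustration, not the regression detector's implementation: the function names and sample values are hypothetical, and the use of the population standard deviation is an assumption, since the report does not say which estimator it uses.

```go
package main

import (
	"fmt"
	"math"
)

// erraticThreshold is the coefficient-of-variation cutoff quoted in the report.
const erraticThreshold = 0.10

// coefficientOfVariation returns stddev / |mean| for the samples.
// It uses the population standard deviation (an assumption) and
// returns NaN for an empty slice.
func coefficientOfVariation(samples []float64) float64 {
	if len(samples) == 0 {
		return math.NaN()
	}
	var sum float64
	for _, s := range samples {
		sum += s
	}
	mean := sum / float64(len(samples))

	var sumSq float64
	for _, s := range samples {
		d := s - mean
		sumSq += d * d
	}
	stddev := math.Sqrt(sumSq / float64(len(samples)))
	return stddev / math.Abs(mean)
}

// isErratic reports whether the coefficient of variation is at least 0.10.
func isErratic(samples []float64) bool {
	return coefficientOfVariation(samples) >= erraticThreshold
}

func main() {
	steady := []float64{100, 101, 99, 100} // hypothetical measurements
	noisy := []float64{100, 140, 60, 100}  // hypothetical measurements
	fmt.Println(isErratic(steady), isErratic(noisy))
}
```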

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	confidence
➖	file_to_blackhole	% cpu utilization	+1.01	[-5.64, +7.66]	19.75%
➖	tcp_syslog_to_blackhole	ingress throughput	+0.70	[+0.64, +0.76]	100.00%
➖	idle	memory utilization	+0.22	[+0.19, +0.25]	100.00%
➖	process_agent_standard_check_with_stats	memory utilization	+0.18	[+0.14, +0.22]	100.00%
➖	process_agent_standard_check	memory utilization	+0.13	[+0.08, +0.18]	100.00%
➖	trace_agent_json	ingress throughput	+0.02	[-0.01, +0.04]	79.06%
➖	dogstatsd_string_interner_8MiB_100	ingress throughput	+0.02	[-0.02, +0.06]	51.90%
➖	dogstatsd_string_interner_64MiB_1k	ingress throughput	+0.00	[-0.03, +0.04]	15.44%
➖	dogstatsd_string_interner_8MiB_50k	ingress throughput	+0.00	[-0.04, +0.04]	0.00%
➖	dogstatsd_string_interner_8MiB_1k	ingress throughput	+0.00	[-0.04, +0.04]	0.00%
➖	dogstatsd_string_interner_8MiB_10k	ingress throughput	+0.00	[-0.04, +0.04]	0.00%
➖	dogstatsd_string_interner_64MiB_100	ingress throughput	+0.00	[-0.04, +0.04]	0.00%
➖	uds_dogstatsd_to_api	ingress throughput	-0.00	[-0.04, +0.04]	0.00%
➖	tcp_dd_logs_filter_exclude	ingress throughput	-0.00	[-0.07, +0.07]	0.00%
➖	dogstatsd_string_interner_128MiB_100	ingress throughput	-0.00	[-0.05, +0.05]	0.00%
➖	dogstatsd_string_interner_128MiB_1k	ingress throughput	-0.00	[-0.06, +0.06]	0.00%
➖	trace_agent_msgpack	ingress throughput	-0.02	[-0.04, -0.01]	98.48%
➖	dogstatsd_string_interner_8MiB_100k	ingress throughput	-0.03	[-0.05, -0.02]	99.97%
➖	process_agent_real_time_mode	memory utilization	-0.34	[-0.37, -0.31]	100.00%
➖	file_tree	memory utilization	-0.47	[-0.58, -0.37]	100.00%
➖	otel_to_otel_logs	ingress throughput	-3.22	[-3.96, -2.48]	100.00%

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we classify a change in performance as a "regression" (a change worth investigating further) if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".
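
The three criteria above can be sketched as a single predicate. This is a minimal illustration under stated assumptions, not the detector's code: the `result` type, its field names, and `isRegression` are hypothetical, the first two inputs are copied from the table in this report, and the third is an invented value chosen to satisfy all three criteria.

```go
package main

import (
	"fmt"
	"math"
)

// effectSizeTolerance is the |Δ mean %| threshold quoted in the report.
const effectSizeTolerance = 5.00

// result holds one row of the report (hypothetical type for this sketch).
type result struct {
	deltaMeanPct  float64 // Δ mean %
	ciLow, ciHigh float64 // bounds of the 90% confidence interval
	erratic       bool    // whether the experiment's configuration marks it erratic
}

// isRegression applies the three criteria: the effect is large enough,
// the confidence interval excludes zero, and the experiment is not
// configured as erratic.
func isRegression(r result) bool {
	bigEnough := math.Abs(r.deltaMeanPct) >= effectSizeTolerance
	ciExcludesZero := r.ciLow > 0 || r.ciHigh < 0
	return bigEnough && ciExcludesZero && !r.erratic
}

func main() {
	// Values from the report: the interval excludes zero, but |Δ| < 5%.
	otel := result{deltaMeanPct: -3.22, ciLow: -3.96, ciHigh: -2.48}
	// Values from the report: small effect and the interval contains zero.
	fileToBlackhole := result{deltaMeanPct: 1.01, ciLow: -5.64, ciHigh: 7.66}
	// Invented values that meet all three criteria.
	invented := result{deltaMeanPct: -6.0, ciLow: -7.0, ciHigh: -5.0}

	fmt.Println(isRegression(otel), isRegression(fileToBlackhole), isRegression(invented))
}
```

This matches the report's outcome for otel_to_otel_logs: its interval excludes zero, but the estimated change of -3.22% stays under the 5.00% tolerance, so it is not flagged.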

pkg/network/tracer/offsetguess_test.go

pkg/network/tracer/offsetguess/conntrack.go

akarpz · 2023-12-19T20:12:31Z

/merge

dd-devflow · 2023-12-19T20:12:36Z

🚂 MergeQueue

Pull request added to the queue.

There are 6 builds ahead! (estimated merge in less than 4h)

You can cancel this operation by commenting /merge -c on your pull request.

set debug log level for our flakyest tests

ddfe608

akarpz requested a review from a team as a code owner December 15, 2023 20:28

github-actions bot added the component/system-probe label Dec 15, 2023

debug logs for offsetguess test

1a722c2

hmahmood reviewed Dec 16, 2023

pkg/network/tracer/offsetguess_test.go (outdated, resolved)

improve logging

e4a208b

hmahmood reviewed Dec 18, 2023

pkg/network/tracer/offsetguess/conntrack.go (outdated, resolved)

akarpz added 2 commits December 18, 2023 15:36

added logs to conntrack guessing

0310702

log current offset value

50b374e

akarpz added the changelog/no-changelog, team/networks, and "[deprecated] qa/skip-qa - use other qa/ labels" labels Dec 19, 2023 (description of the last label: "[DEPRECATED] Please use qa/done or qa/no-code-change to skip creating a QA card")

akarpz added this to the 7.51.0 milestone Dec 19, 2023

set to trace to reduce spammy offsetguess logs

cfdfa8d

akarpz changed the title ~~set debug log level for our flakyest tests~~ set trace log level for our flakyest tests Dec 19, 2023

Merge branch 'main' into akarpowich/debug_logs_flaky_tests

5c7d4f4

leeavital approved these changes Dec 19, 2023

akarpz added the qa/done label (QA done before merge and regressions are covered by tests) Dec 19, 2023

dd-devflow bot added the mergequeue-status: queued label, then added mergequeue-status: in_progress and removed mergequeue-status: queued Dec 19, 2023

dd-mergequeue bot merged commit 890419a into main Dec 20, 2023
215 checks passed

dd-mergequeue bot deleted the akarpowich/debug_logs_flaky_tests branch December 20, 2023 03:51

dd-devflow bot added mergequeue-status: done and removed mergequeue-status: in_progress labels Dec 20, 2023