Support service name overrides for DSM #7798

kr-igor · 2024-10-17T20:56:54Z

This PR adds a mechanism for overriding the service name of DSM checkpoints on a per-thread basis.
Stats buckets are split by service name, so a single time interval may contain several "named" buckets.
This will be used to capture the correct service (application) names for Spark and Flink jobs.
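
To illustrate the bucket split described above, here is a minimal sketch, not the PR's actual code. `NamedBucketsSketch`, its plain counters, and the 10-second interval are hypothetical stand-ins. The only idea taken from the description is that a bucket is identified by its time interval plus a service name, with the default service used when no override is set.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the tracer's stats buckets; not the real dd-trace-java classes.
final class NamedBucketsSketch {
  // Assumed bucket width, chosen only for illustration.
  static final long BUCKET_NANOS = TimeUnit.SECONDS.toNanos(10);

  // bucket start time -> (service name -> number of checkpoints seen)
  private final Map<Long, Map<String, Long>> buckets = new HashMap<>();
  private final String defaultService;

  NamedBucketsSketch(String defaultService) {
    this.defaultService = defaultService;
  }

  // Records one checkpoint; a null override falls back to the default service name.
  void add(long timestampNanos, String serviceNameOverride) {
    long bucketStart = timestampNanos - (timestampNanos % BUCKET_NANOS);
    String service = serviceNameOverride != null ? serviceNameOverride : defaultService;
    buckets.computeIfAbsent(bucketStart, k -> new HashMap<>()).merge(service, 1L, Long::sum);
  }

  // Returns the "named" buckets of the interval that contains the given timestamp.
  Map<String, Long> bucketsAt(long timestampNanos) {
    long bucketStart = timestampNanos - (timestampNanos % BUCKET_NANOS);
    return buckets.getOrDefault(bucketStart, new HashMap<>());
  }
}
```

With this layout, two checkpoints recorded in the same interval under different service names end up in two separate entries, which is the "several named buckets per interval" case.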

pr-commenter · 2024-10-17T21:32:15Z

Kafka / producer-benchmark

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	kr-igor/dsm-service-name-override
git_commit_date	1731694945	1731711042
git_commit_sha	`2f767ab`	`6856dc5`

See matching parameters

	Baseline	Candidate
ci_job_date	1731712256	1731712256
ci_job_id	709543451	709543451
ci_pipeline_id	49193655	49193655
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
jdkVersion	11.0.21	11.0.21
jmhVersion	1.36	1.36
jvm	/usr/lib/jvm/java-11-openjdk-amd64/bin/java	/usr/lib/jvm/java-11-openjdk-amd64/bin/java
jvmArgs	-Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant	-Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant
vmName	OpenJDK 64-Bit Server VM	OpenJDK 64-Bit Server VM
vmVersion	11.0.21+9-post-Ubuntu-0ubuntu122.04	11.0.21+9-post-Ubuntu-0ubuntu122.04

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.

See unchanged results

scenario	Δ mean throughput
scenario:not-instrumented/KafkaProduceBenchmark.benchProduce	same
scenario:only-tracing-dsm-disabled-benchmarks/KafkaProduceBenchmark.benchProduce	same
scenario:only-tracing-dsm-enabled-benchmarks/KafkaProduceBenchmark.benchProduce	same

pr-commenter · 2024-10-17T21:35:46Z

Benchmarks

Startup

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	kr-igor/dsm-service-name-override
git_commit_date	1731694945	1731711042
git_commit_sha	`2f767ab`	`6856dc5`
release_version	1.43.0-SNAPSHOT~2f767ab81d	1.43.0-SNAPSHOT~6856dc5a90

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1731713628	1731713628
ci_job_id	709543447	709543447
ci_pipeline_id	49193655	49193655
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
module	Agent	Agent
parent	None	None
variant	iast	iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 53 metrics, 10 unstable metrics.

Startup time reports for petclinic

gantt
    title petclinic - global startup overhead: candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.078 s) : 0, 1078306
Total [baseline] (10.424 s) : 0, 10424468
Agent [candidate] (1.081 s) : 0, 1080627
Total [candidate] (10.433 s) : 0, 10432782
section appsec
Agent [baseline] (1.214 s) : 0, 1213661
Total [baseline] (10.66 s) : 0, 10660096
Agent [candidate] (1.222 s) : 0, 1222185
Total [candidate] (10.69 s) : 0, 10690301
section iast
Agent [baseline] (1.205 s) : 0, 1204762
Total [baseline] (10.85 s) : 0, 10849908
Agent [candidate] (1.206 s) : 0, 1205704
Total [candidate] (10.832 s) : 0, 10831551
section profiling
Agent [baseline] (1.274 s) : 0, 1274412
Total [baseline] (10.733 s) : 0, 10732795
Agent [candidate] (1.275 s) : 0, 1275480
Total [candidate] (10.724 s) : 0, 10724239

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.078 s	-
Agent	appsec	1.214 s	135.355 ms (12.6%)
Agent	iast	1.205 s	126.456 ms (11.7%)
Agent	profiling	1.274 s	196.106 ms (18.2%)
Total	tracing	10.424 s	-
Total	appsec	10.66 s	235.629 ms (2.3%)
Total	iast	10.85 s	425.441 ms (4.1%)
Total	profiling	10.733 s	308.327 ms (3.0%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.081 s	-
Agent	appsec	1.222 s	141.557 ms (13.1%)
Agent	iast	1.206 s	125.077 ms (11.6%)
Agent	profiling	1.275 s	194.853 ms (18.0%)
Total	tracing	10.433 s	-
Total	appsec	10.69 s	257.519 ms (2.5%)
Total	iast	10.832 s	398.769 ms (3.8%)
Total	profiling	10.724 s	291.457 ms (2.8%)

gantt
    title petclinic - break down per module: candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (685.024 ms) : 0, 685024
BytebuddyAgent [candidate] (686.286 ms) : 0, 686286
GlobalTracer [baseline] (314.364 ms) : 0, 314364
GlobalTracer [candidate] (315.291 ms) : 0, 315291
AppSec [baseline] (53.972 ms) : 0, 53972
AppSec [candidate] (54.121 ms) : 0, 54121
Remote Config [baseline] (678.013 µs) : 0, 678
Remote Config [candidate] (679.547 µs) : 0, 680
Telemetry [baseline] (10.663 ms) : 0, 10663
Telemetry [candidate] (10.595 ms) : 0, 10595
section appsec
BytebuddyAgent [baseline] (703.259 ms) : 0, 703259
BytebuddyAgent [candidate] (708.486 ms) : 0, 708486
GlobalTracer [baseline] (312.009 ms) : 0, 312009
GlobalTracer [candidate] (315.031 ms) : 0, 315031
AppSec [baseline] (166.298 ms) : 0, 166298
AppSec [candidate] (166.657 ms) : 0, 166657
IAST [baseline] (20.216 ms) : 0, 20216
IAST [candidate] (19.564 ms) : 0, 19564
Remote Config [baseline] (625.339 µs) : 0, 625
Remote Config [candidate] (648.237 µs) : 0, 648
Telemetry [baseline] (7.767 ms) : 0, 7767
Telemetry [candidate] (7.894 ms) : 0, 7894
section iast
BytebuddyAgent [baseline] (801.478 ms) : 0, 801478
BytebuddyAgent [candidate] (802.006 ms) : 0, 802006
GlobalTracer [baseline] (304.06 ms) : 0, 304060
GlobalTracer [candidate] (304.26 ms) : 0, 304260
AppSec [baseline] (55.466 ms) : 0, 55466
AppSec [candidate] (56.994 ms) : 0, 56994
IAST [baseline] (22.093 ms) : 0, 22093
IAST [candidate] (20.707 ms) : 0, 20707
Remote Config [baseline] (605.515 µs) : 0, 606
Remote Config [candidate] (612.18 µs) : 0, 612
Telemetry [baseline] (7.432 ms) : 0, 7432
Telemetry [candidate] (7.479 ms) : 0, 7479
section profiling
BytebuddyAgent [baseline] (680.13 ms) : 0, 680130
BytebuddyAgent [candidate] (679.904 ms) : 0, 679904
GlobalTracer [baseline] (397.177 ms) : 0, 397177
GlobalTracer [candidate] (396.676 ms) : 0, 396676
AppSec [baseline] (54.372 ms) : 0, 54372
AppSec [candidate] (54.624 ms) : 0, 54624
Remote Config [baseline] (664.52 µs) : 0, 665
Remote Config [candidate] (661.504 µs) : 0, 662
Telemetry [baseline] (12.656 ms) : 0, 12656
Telemetry [candidate] (11.355 ms) : 0, 11355
ProfilingAgent [baseline] (90.66 ms) : 0, 90660
ProfilingAgent [candidate] (93.528 ms) : 0, 93528
Profiling [baseline] (90.684 ms) : 0, 90684
Profiling [candidate] (93.552 ms) : 0, 93552

Startup time reports for insecure-bank

gantt
    title insecure-bank - global startup overhead: candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.09 s) : 0, 1089634
Total [baseline] (8.575 s) : 0, 8574684
Agent [candidate] (1.079 s) : 0, 1079497
Total [candidate] (8.541 s) : 0, 8540787
section iast
Agent [baseline] (1.205 s) : 0, 1205495
Total [baseline] (9.107 s) : 0, 9107250
Agent [candidate] (1.205 s) : 0, 1205215
Total [candidate] (9.097 s) : 0, 9096963
section iast_HARDCODED_SECRET_DISABLED
Agent [baseline] (1.208 s) : 0, 1207667
Total [baseline] (9.127 s) : 0, 9127055
Agent [candidate] (1.209 s) : 0, 1208986
Total [candidate] (9.122 s) : 0, 9122034
section iast_TELEMETRY_OFF
Agent [baseline] (1.21 s) : 0, 1209816
Total [baseline] (9.095 s) : 0, 9094555
Agent [candidate] (1.202 s) : 0, 1202332
Total [candidate] (9.138 s) : 0, 9137535

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.09 s	-
Agent	iast	1.205 s	115.86 ms (10.6%)
Agent	iast_HARDCODED_SECRET_DISABLED	1.208 s	118.033 ms (10.8%)
Agent	iast_TELEMETRY_OFF	1.21 s	120.182 ms (11.0%)
Total	tracing	8.575 s	-
Total	iast	9.107 s	532.566 ms (6.2%)
Total	iast_HARDCODED_SECRET_DISABLED	9.127 s	552.371 ms (6.4%)
Total	iast_TELEMETRY_OFF	9.095 s	519.871 ms (6.1%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.079 s	-
Agent	iast	1.205 s	125.718 ms (11.6%)
Agent	iast_HARDCODED_SECRET_DISABLED	1.209 s	129.489 ms (12.0%)
Agent	iast_TELEMETRY_OFF	1.202 s	122.835 ms (11.4%)
Total	tracing	8.541 s	-
Total	iast	9.097 s	556.175 ms (6.5%)
Total	iast_HARDCODED_SECRET_DISABLED	9.122 s	581.247 ms (6.8%)
Total	iast_TELEMETRY_OFF	9.138 s	596.748 ms (7.0%)

gantt
    title insecure-bank - break down per module: candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (692.591 ms) : 0, 692591
BytebuddyAgent [candidate] (685.267 ms) : 0, 685267
GlobalTracer [baseline] (317.565 ms) : 0, 317565
GlobalTracer [candidate] (314.435 ms) : 0, 314435
AppSec [baseline] (54.462 ms) : 0, 54462
AppSec [candidate] (54.055 ms) : 0, 54055
Remote Config [baseline] (680.376 µs) : 0, 680
Remote Config [candidate] (694.94 µs) : 0, 695
Telemetry [baseline] (10.561 ms) : 0, 10561
Telemetry [candidate] (11.398 ms) : 0, 11398
section iast
BytebuddyAgent [baseline] (801.955 ms) : 0, 801955
BytebuddyAgent [candidate] (801.34 ms) : 0, 801340
GlobalTracer [baseline] (303.942 ms) : 0, 303942
GlobalTracer [candidate] (304.42 ms) : 0, 304420
AppSec [baseline] (57.334 ms) : 0, 57334
AppSec [candidate] (55.937 ms) : 0, 55937
IAST [baseline] (20.545 ms) : 0, 20545
IAST [candidate] (21.927 ms) : 0, 21927
Remote Config [baseline] (618.431 µs) : 0, 618
Remote Config [candidate] (601.837 µs) : 0, 602
Telemetry [baseline] (7.442 ms) : 0, 7442
Telemetry [candidate] (7.36 ms) : 0, 7360
section iast_HARDCODED_SECRET_DISABLED
BytebuddyAgent [baseline] (802.893 ms) : 0, 802893
BytebuddyAgent [candidate] (804.503 ms) : 0, 804503
GlobalTracer [baseline] (304.745 ms) : 0, 304745
GlobalTracer [candidate] (304.718 ms) : 0, 304718
AppSec [baseline] (57.702 ms) : 0, 57702
AppSec [candidate] (57.356 ms) : 0, 57356
IAST [baseline] (20.586 ms) : 0, 20586
IAST [candidate] (20.498 ms) : 0, 20498
Remote Config [baseline] (604.12 µs) : 0, 604
Remote Config [candidate] (615.416 µs) : 0, 615
Telemetry [baseline] (7.47 ms) : 0, 7470
Telemetry [candidate] (7.52 ms) : 0, 7520
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (804.615 ms) : 0, 804615
BytebuddyAgent [candidate] (799.262 ms) : 0, 799262
GlobalTracer [baseline] (305.686 ms) : 0, 305686
GlobalTracer [candidate] (303.69 ms) : 0, 303690
AppSec [baseline] (57.684 ms) : 0, 57684
AppSec [candidate] (56.859 ms) : 0, 56859
IAST [baseline] (20.133 ms) : 0, 20133
IAST [candidate] (20.924 ms) : 0, 20924
Remote Config [baseline] (605.041 µs) : 0, 605
Remote Config [candidate] (613.766 µs) : 0, 614
Telemetry [baseline] (7.338 ms) : 0, 7338
Telemetry [candidate] (7.318 ms) : 0, 7318

Load

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
end_time	2024-11-15T23:04:09	2024-11-15T23:11:03
git_branch	master	kr-igor/dsm-service-name-override
git_commit_date	1731694945	1731711042
git_commit_sha	`2f767ab`	`6856dc5`
release_version	1.43.0-SNAPSHOT~2f767ab81d	1.43.0-SNAPSHOT~6856dc5a90
start_time	2024-11-15T23:03:55	2024-11-15T23:10:50

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1731712612	1731712612
ci_job_id	709543448	709543448
ci_pipeline_id	49193655	49193655
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant	iast	iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 17 unstable metrics.

Request duration reports for petclinic

gantt
    title petclinic - request duration [CI 0.99] : candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.333 ms) : 1313, 1353
.   : milestone, 1333,
appsec (1.746 ms) : 1722, 1771
.   : milestone, 1746,
appsec_no_iast (1.727 ms) : 1703, 1752
.   : milestone, 1727,
iast (1.466 ms) : 1444, 1489
.   : milestone, 1466,
profiling (1.511 ms) : 1487, 1535
.   : milestone, 1511,
tracing (1.467 ms) : 1442, 1492
.   : milestone, 1467,
section candidate
no_agent (1.331 ms) : 1312, 1350
.   : milestone, 1331,
appsec (1.741 ms) : 1717, 1765
.   : milestone, 1741,
appsec_no_iast (1.715 ms) : 1690, 1739
.   : milestone, 1715,
iast (1.502 ms) : 1480, 1524
.   : milestone, 1502,
profiling (1.478 ms) : 1454, 1501
.   : milestone, 1478,
tracing (1.468 ms) : 1444, 1492
.   : milestone, 1468,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.333 ms [1.313 ms, 1.353 ms]	-
appsec	1.746 ms [1.722 ms, 1.771 ms]	413.22 µs (31.0%)
appsec_no_iast	1.727 ms [1.703 ms, 1.752 ms]	393.941 µs (29.5%)
iast	1.466 ms [1.444 ms, 1.489 ms]	133.129 µs (10.0%)
profiling	1.511 ms [1.487 ms, 1.535 ms]	178.105 µs (13.4%)
tracing	1.467 ms [1.442 ms, 1.492 ms]	133.615 µs (10.0%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.331 ms [1.312 ms, 1.35 ms]	-
appsec	1.741 ms [1.717 ms, 1.765 ms]	410.019 µs (30.8%)
appsec_no_iast	1.715 ms [1.69 ms, 1.739 ms]	383.479 µs (28.8%)
iast	1.502 ms [1.48 ms, 1.524 ms]	170.454 µs (12.8%)
profiling	1.478 ms [1.454 ms, 1.501 ms]	146.429 µs (11.0%)
tracing	1.468 ms [1.444 ms, 1.492 ms]	136.832 µs (10.3%)

Request duration reports for insecure-bank

gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d
    dateFormat X
    axisFormat %s
section baseline
no_agent (370.99 µs) : 350, 392
.   : milestone, 371,
iast (482.034 µs) : 461, 503
.   : milestone, 482,
iast_FULL (638.608 µs) : 617, 660
.   : milestone, 639,
iast_GLOBAL (513.851 µs) : 492, 536
.   : milestone, 514,
iast_HARDCODED_SECRET_DISABLED (486.71 µs) : 465, 508
.   : milestone, 487,
iast_INACTIVE (447.753 µs) : 426, 469
.   : milestone, 448,
iast_TELEMETRY_OFF (478.151 µs) : 456, 500
.   : milestone, 478,
tracing (442.097 µs) : 422, 463
.   : milestone, 442,
section candidate
no_agent (367.339 µs) : 348, 387
.   : milestone, 367,
iast (486.566 µs) : 465, 509
.   : milestone, 487,
iast_FULL (642.278 µs) : 621, 664
.   : milestone, 642,
iast_GLOBAL (514.931 µs) : 493, 537
.   : milestone, 515,
iast_HARDCODED_SECRET_DISABLED (484.927 µs) : 464, 506
.   : milestone, 485,
iast_INACTIVE (446.73 µs) : 426, 468
.   : milestone, 447,
iast_TELEMETRY_OFF (474.606 µs) : 453, 496
.   : milestone, 475,
tracing (445.852 µs) : 425, 467
.   : milestone, 446,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	370.99 µs [349.64 µs, 392.34 µs]	-
iast	482.034 µs [460.812 µs, 503.256 µs]	111.044 µs (29.9%)
iast_FULL	638.608 µs [617.186 µs, 660.03 µs]	267.618 µs (72.1%)
iast_GLOBAL	513.851 µs [492.089 µs, 535.613 µs]	142.861 µs (38.5%)
iast_HARDCODED_SECRET_DISABLED	486.71 µs [464.978 µs, 508.442 µs]	115.72 µs (31.2%)
iast_INACTIVE	447.753 µs [426.496 µs, 469.01 µs]	76.763 µs (20.7%)
iast_TELEMETRY_OFF	478.151 µs [456.091 µs, 500.211 µs]	107.161 µs (28.9%)
tracing	442.097 µs [421.568 µs, 462.625 µs]	71.107 µs (19.2%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	367.339 µs [347.769 µs, 386.908 µs]	-
iast	486.566 µs [464.563 µs, 508.568 µs]	119.227 µs (32.5%)
iast_FULL	642.278 µs [621.017 µs, 663.539 µs]	274.939 µs (74.8%)
iast_GLOBAL	514.931 µs [493.346 µs, 536.516 µs]	147.593 µs (40.2%)
iast_HARDCODED_SECRET_DISABLED	484.927 µs [463.538 µs, 506.316 µs]	117.588 µs (32.0%)
iast_INACTIVE	446.73 µs [425.806 µs, 467.653 µs]	79.391 µs (21.6%)
iast_TELEMETRY_OFF	474.606 µs [452.88 µs, 496.332 µs]	107.267 µs (29.2%)
tracing	445.852 µs [424.993 µs, 466.711 µs]	78.514 µs (21.4%)

Dacapo

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	kr-igor/dsm-service-name-override
git_commit_date	1731694945	1731711042
git_commit_sha	`2f767ab`	`6856dc5`
release_version	1.43.0-SNAPSHOT~2f767ab81d	1.43.0-SNAPSHOT~6856dc5a90

See matching parameters

	Baseline	Candidate
application	biojava	biojava
ci_job_date	1731713153	1731713153
ci_job_id	709543449	709543449
ci_pipeline_id	49193655	49193655
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant	appsec	appsec

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics.

Execution time for biojava

gantt
    title biojava - execution time [CI 0.99] : candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.547 s) : 15547000, 15547000
.   : milestone, 15547000,
appsec (15.232 s) : 15232000, 15232000
.   : milestone, 15232000,
iast (18.435 s) : 18435000, 18435000
.   : milestone, 18435000,
iast_GLOBAL (18.084 s) : 18084000, 18084000
.   : milestone, 18084000,
profiling (15.03 s) : 15030000, 15030000
.   : milestone, 15030000,
tracing (15.077 s) : 15077000, 15077000
.   : milestone, 15077000,
section candidate
no_agent (15.166 s) : 15166000, 15166000
.   : milestone, 15166000,
appsec (15.12 s) : 15120000, 15120000
.   : milestone, 15120000,
iast (18.66 s) : 18660000, 18660000
.   : milestone, 18660000,
iast_GLOBAL (18.309 s) : 18309000, 18309000
.   : milestone, 18309000,
profiling (14.834 s) : 14834000, 14834000
.   : milestone, 14834000,
tracing (15.238 s) : 15238000, 15238000
.   : milestone, 15238000,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.547 s [15.547 s, 15.547 s]	-
appsec	15.232 s [15.232 s, 15.232 s]	-315.0 ms (-2.0%)
iast	18.435 s [18.435 s, 18.435 s]	2.888 s (18.6%)
iast_GLOBAL	18.084 s [18.084 s, 18.084 s]	2.537 s (16.3%)
profiling	15.03 s [15.03 s, 15.03 s]	-517.0 ms (-3.3%)
tracing	15.077 s [15.077 s, 15.077 s]	-470.0 ms (-3.0%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.166 s [15.166 s, 15.166 s]	-
appsec	15.12 s [15.12 s, 15.12 s]	-46.0 ms (-0.3%)
iast	18.66 s [18.66 s, 18.66 s]	3.494 s (23.0%)
iast_GLOBAL	18.309 s [18.309 s, 18.309 s]	3.143 s (20.7%)
profiling	14.834 s [14.834 s, 14.834 s]	-332.0 ms (-2.2%)
tracing	15.238 s [15.238 s, 15.238 s]	72.0 ms (0.5%)

Execution time for tomcat

gantt
    title tomcat - execution time [CI 0.99] : candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.469 ms) : 1458, 1481
.   : milestone, 1469,
appsec (2.348 ms) : 2307, 2390
.   : milestone, 2348,
iast (2.079 ms) : 2027, 2132
.   : milestone, 2079,
iast_GLOBAL (2.129 ms) : 2076, 2181
.   : milestone, 2129,
profiling (1.937 ms) : 1896, 1978
.   : milestone, 1937,
tracing (1.919 ms) : 1880, 1958
.   : milestone, 1919,
section candidate
no_agent (1.472 ms) : 1461, 1484
.   : milestone, 1472,
appsec (2.34 ms) : 2299, 2381
.   : milestone, 2340,
iast (2.079 ms) : 2027, 2131
.   : milestone, 2079,
iast_GLOBAL (2.121 ms) : 2068, 2174
.   : milestone, 2121,
profiling (1.946 ms) : 1904, 1988
.   : milestone, 1946,
tracing (1.925 ms) : 1885, 1965
.   : milestone, 1925,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.469 ms [1.458 ms, 1.481 ms]	-
appsec	2.348 ms [2.307 ms, 2.39 ms]	878.876 µs (59.8%)
iast	2.079 ms [2.027 ms, 2.132 ms]	609.631 µs (41.5%)
iast_GLOBAL	2.129 ms [2.076 ms, 2.181 ms]	659.151 µs (44.9%)
profiling	1.937 ms [1.896 ms, 1.978 ms]	467.39 µs (31.8%)
tracing	1.919 ms [1.88 ms, 1.958 ms]	449.609 µs (30.6%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.472 ms [1.461 ms, 1.484 ms]	-
appsec	2.34 ms [2.299 ms, 2.381 ms]	867.66 µs (58.9%)
iast	2.079 ms [2.027 ms, 2.131 ms]	606.268 µs (41.2%)
iast_GLOBAL	2.121 ms [2.068 ms, 2.174 ms]	648.743 µs (44.1%)
profiling	1.946 ms [1.904 ms, 1.988 ms]	473.524 µs (32.2%)
tracing	1.925 ms [1.885 ms, 1.965 ms]	452.645 µs (30.7%)

pr-commenter · 2024-10-17T21:44:22Z

Kafka / consumer-benchmark

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	kr-igor/dsm-service-name-override
git_commit_date	1731694945	1731711042
git_commit_sha	`2f767ab`	`6856dc5`

See matching parameters

	Baseline	Candidate
ci_job_date	1731712290	1731712290
ci_job_id	709543452	709543452
ci_pipeline_id	49193655	49193655
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
jdkVersion	11.0.21	11.0.21
jmhVersion	1.36	1.36
jvm	/usr/lib/jvm/java-11-openjdk-amd64/bin/java	/usr/lib/jvm/java-11-openjdk-amd64/bin/java
jvmArgs	-Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant	-Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant
vmName	OpenJDK 64-Bit Server VM	OpenJDK 64-Bit Server VM
vmVersion	11.0.21+9-post-Ubuntu-0ubuntu122.04	11.0.21+9-post-Ubuntu-0ubuntu122.04

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.

See unchanged results

scenario	Δ mean throughput
scenario:not-instrumented/KafkaConsumerBenchmark.benchConsume	same
scenario:only-tracing-dsm-disabled-benchmarks/KafkaConsumerBenchmark.benchConsume	unsure [-17367.061op/s; -1389.292op/s] or [-5.463%; -0.437%]
scenario:only-tracing-dsm-enabled-benchmarks/KafkaConsumerBenchmark.benchConsume	same

amarziali · 2024-10-18T11:31:46Z

...spark-executor/src/main/java/datadog/trace/instrumentation/spark/SparkExecutorDecorator.java

+
+    if (taskDescription != null) {
+      try {
+        Field prop = taskDescription.getClass().getDeclaredField("properties");


I think you should optimize this by unreflecting a method handle for this getter. It will save access-checking costs.
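
A minimal sketch of that suggestion, assuming a private `properties` field as in the snippet above. `FakeTaskDescription` and `PropertiesGetterSketch` are hypothetical stand-ins, not the Spark class or the code that was eventually committed. The field is looked up and unreflected once, so the access check is not repeated on every call.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Field;
import java.util.Properties;

final class PropertiesGetterSketch {
  // Hypothetical stand-in for the task description class with a private field.
  static final class FakeTaskDescription {
    private final Properties properties = new Properties();
  }

  // Resolved once at class initialization instead of on every task.
  private static final MethodHandle PROPERTIES_GETTER = lookupGetter();

  private static MethodHandle lookupGetter() {
    try {
      Field field = FakeTaskDescription.class.getDeclaredField("properties");
      field.setAccessible(true);
      return MethodHandles.lookup().unreflectGetter(field);
    } catch (ReflectiveOperationException e) {
      return null; // callers treat this as "no properties available"
    }
  }

  // Returns the task's properties, or null if they cannot be read.
  static Properties propertiesOf(FakeTaskDescription task) {
    if (PROPERTIES_GETTER == null || task == null) {
      return null;
    }
    try {
      return (Properties) PROPERTIES_GETTER.invoke(task);
    } catch (Throwable t) {
      return null;
    }
  }
}
```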

amarziali · 2024-10-18T11:32:49Z

...spark-executor/src/main/java/datadog/trace/instrumentation/spark/SparkExecutorDecorator.java

+        if (appName != null) {
+          AgentTracer.get()
+              .getDataStreamsMonitoring()
+              .setThreadServiceName(taskRunner.getThreadId(), appName);


Is it coherent, product-wise, for DSM to report a different service name from the one reported by tracing?

It makes sense in the case of Spark / Flink. It may not be possible to pass the service name from outside in managed environments (for instance, spans generated within Databricks for DJM may have service name == "databricks.all-purpose-cluster.dlt-execution-123"). At the same time, customers expect to see the Spark job name, not the cluster node name.

Also, there are cases where a single running application (Task Manager in Flink) runs multiple independent stream processing tasks. Each task has its own "service name", so using the node name is incorrect.

A better approach would be to set the global service name dynamically depending on the context. That seems like a much larger change. Happy to discuss this separately.

Actually, you can do it by calling `updatePreferredServiceName` (dd-trace-java/internal-api/src/main/java/datadog/trace/bootstrap/instrumentation/api/AgentTracer.java, line 293 in 00856e0):

void updatePreferredServiceName(String serviceName);

Is that what you were looking for?

OK, if it depends on the current context, then it does not apply globally to the JVM.

piochelepiotr · 2024-10-18T14:25:12Z

dd-trace-core/src/main/java/datadog/trace/core/datastreams/DefaultDataStreamsMonitoring.java

-    List<StatsBucket> includedBuckets = new ArrayList<>();
-    Iterator<Map.Entry<Long, StatsBucket>> mapIterator = timeToBucket.entrySet().iterator();
+    // stats are grouped by time buckets and service names
+    Map<String, List<StatsBucket>> includedBuckets = new HashMap<>();


We could also put the service name in the statsPoint. What do you think? It would avoid doing this grouping logic here.

The service name is reported per bucket. You could put it inside the stats point, but then you would have to modify the writer to split the bucket correctly, effectively doing a similar thing in a different place.

amarziali

I think this addition is good to have to improve the experience of DSM consumers. I still have some concerns, especially around adding an unbounded map to store thread ID to name correspondences. It should be done with a thread local, and it might not actually work if the Kafka producer (in this example) is called from a thread other than the Spark task's. It would also be nice to have more e2e testing with Spark and Kafka.

amarziali · 2024-11-08T08:42:42Z

.../src/main/java/datadog/trace/instrumentation/kafka_clients/KafkaProducerInstrumentation.java

@@ -200,7 +200,8 @@ public static void onEnter(@Advice.Argument(value = 0) int estimatedPayloadSize)
                saved.getTimestampNanos(),
                saved.getPathwayLatencyNano(),
                saved.getEdgeLatencyNano(),
-                estimatedPayloadSize);
+                estimatedPayloadSize,
+                saved.getServiceNameOverride());


Are messages produced in the same thread where the task is running? Also, are we testing it? I did not see Spark tests with Kafka, but perhaps I missed them.
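
The concern behind the first question can be shown with a small sketch. It assumes the override is stored per thread; `CrossThreadSketch` and `OVERRIDE` are hypothetical names, not the tracer's API. An override set on the task thread is not visible to a separate thread that performs the send.

```java
final class CrossThreadSketch {
  // Hypothetical per-thread override holder.
  static final ThreadLocal<String> OVERRIDE = new ThreadLocal<>();

  // Sets an override on the calling thread and returns what a separate thread observes.
  static String seenFromOtherThread(String override) {
    OVERRIDE.set(override);
    try {
      String[] seen = new String[1];
      Thread sender = new Thread(() -> seen[0] = OVERRIDE.get());
      sender.start();
      sender.join(); // join() makes the write to seen[0] visible to this thread
      return seen[0];
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      OVERRIDE.remove();
    }
  }
}
```

Here `seenFromOtherThread("spark-job-a")` returns null, so a checkpoint recorded on the sending thread would fall back to the default service name.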

amarziali · 2024-11-08T08:46:03Z

dd-trace-core/src/main/java/datadog/trace/core/datastreams/DefaultDataStreamsMonitoring.java

@@ -74,6 +75,8 @@ public class DefaultDataStreamsMonitoring implements DataStreamsMonitoring, Even
  private volatile boolean agentSupportsDataStreams = false;
  private volatile boolean configSupportsDataStreams = false;
  private final ConcurrentHashMap<String, SchemaSampler> schemaSamplers;
+  private static final ConcurrentHashMap<Long, String> threadServiceNames =


I have some concerns about adding unbounded maps here. It might be OK for the Spark instrumentation, but this has been added to a generic API on DataStreamsMonitoring. It could be misused in the future and start leaking. If service names have to be pinned to a thread, a thread-local variable should probably be used to ensure that the value goes away when the thread is collected.
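
A minimal sketch of the thread-local alternative. The method names `setThreadServiceName` and `clearThreadServiceName` appear elsewhere in this PR, but the signatures below, the `ThreadServiceNameSketch` class, and the `resolve` helper are assumptions for illustration only. Because the value lives in the thread's own storage, there is no shared map keyed by thread ID that could grow without bound.

```java
final class ThreadServiceNameSketch {
  // One slot per thread; the entry becomes unreachable once the thread is collected.
  private static final ThreadLocal<String> SERVICE_NAME_OVERRIDE = new ThreadLocal<>();

  static void setThreadServiceName(String serviceName) {
    SERVICE_NAME_OVERRIDE.set(serviceName);
  }

  static void clearThreadServiceName() {
    SERVICE_NAME_OVERRIDE.remove();
  }

  // Hypothetical helper: picks the override if present, otherwise the default service.
  static String resolve(String defaultService) {
    String override = SERVICE_NAME_OVERRIDE.get();
    return override != null ? override : defaultService;
  }
}
```

On pooled threads the override would still need to be cleared explicitly when a task finishes, since the thread itself outlives the task.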

Support service name overrides for DSM

ba30c88

Set thread service name for spark tasks

a0e834e

amarziali reviewed Oct 18, 2024


piochelepiotr reviewed Oct 18, 2024


kr-igor added 10 commits October 18, 2024 10:29

Merge branch 'master' into kr-igor/dsm-service-name-override

aa7341b

Merge branch 'master' into kr-igor/dsm-service-name-override

bf5af32

Fixed tests

3703c42

Restructured tests

71d5678

Updated failing test

f8c7b94

Fixed tests

dacc353

Merge branch 'master' into kr-igor/dsm-service-name-override

a055486

Reset service name override in test

5c484ed

Test now uses clearThreadServiceName

545c928

Merge branch 'master' into kr-igor/dsm-service-name-override

2df9920

kr-igor marked this pull request as ready for review October 29, 2024 16:12

kr-igor requested review from a team as code owners October 29, 2024 16:12

kr-igor requested a review from PerfectSlayer October 29, 2024 16:12

Merge branch 'master' into kr-igor/dsm-service-name-override

b4b235e

piochelepiotr approved these changes Nov 6, 2024


kr-igor and others added 3 commits November 6, 2024 10:28

Log service name

cef6dc7

Use unreflected method to get properties

02ce8d8

Simplify methodhandle initialization

a6d89db

amarziali requested changes Nov 8, 2024


wip: Added unit test

c2eb6cf

kr-igor added 3 commits November 15, 2024 12:58

Merge branch 'master' into kr-igor/dsm-service-name-override

1b32ece

Added service name overrides to test writer

0e9f3b6

Reverted instrumentation-specific code; switched to thread local

6856dc5

amarziali approved these changes Nov 18, 2024


kr-igor merged commit 6181783 into master Nov 18, 2024
104 checks passed

kr-igor deleted the kr-igor/dsm-service-name-override branch November 18, 2024 15:26

github-actions bot added this to the 1.43.0 milestone Nov 18, 2024

mcculls added the `type: feature request` and `comp: data streams` (Data Streams Monitoring) labels Nov 25, 2024
