Support service name overrides for DSM #7798

Merged
kr-igor merged 20 commits into master from kr-igor/dsm-service-name-override on Nov 18, 2024

Conversation

kr-igor
Contributor

@kr-igor kr-igor commented Oct 17, 2024

This PR adds a mechanism for overriding the service name of DSM checkpoints on a per-thread basis.
Stats buckets are split by service name, so a single time interval may produce several "named" buckets.
This will be used to correctly capture service (application) names for Spark / Flink jobs.
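
To make the bucket splitting concrete, here is a minimal sketch, not the code from this PR. `NamedStatsBuckets`, `record`, and the 10-second `BUCKET_NANOS` width are hypothetical names and values chosen for illustration. It assumes the override travels with each recorded point and that a `null` override falls back to the default service name.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: none of these names are taken from dd-trace-java. */
final class NamedStatsBuckets {
  // Assumed bucket width, for illustration only.
  static final long BUCKET_NANOS = 10_000_000_000L;

  private final String defaultServiceName;
  // bucket start time -> service name -> number of points recorded
  private final Map<Long, Map<String, Long>> timeToBucket = new HashMap<>();

  NamedStatsBuckets(String defaultServiceName) {
    this.defaultServiceName = defaultServiceName;
  }

  /** Records one point; a null override means "use the default service name". */
  synchronized void record(long timestampNanos, String serviceNameOverride) {
    String service = serviceNameOverride != null ? serviceNameOverride : defaultServiceName;
    long bucketStart = timestampNanos - (timestampNanos % BUCKET_NANOS);
    timeToBucket
        .computeIfAbsent(bucketStart, k -> new HashMap<>())
        .merge(service, 1L, Long::sum);
  }

  /** Number of "named" buckets that exist for one time interval. */
  synchronized int namedBuckets(long bucketStart) {
    return timeToBucket.getOrDefault(bucketStart, Collections.emptyMap()).size();
  }

  /** Number of points recorded for one service in one time interval. */
  synchronized long count(long bucketStart, String service) {
    return timeToBucket
        .getOrDefault(bucketStart, Collections.emptyMap())
        .getOrDefault(service, 0L);
  }
}
```

With this layout, two points recorded in the same interval under different service names land in two separate named buckets, which is the behaviour the description refers to.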

@pr-commenter

pr-commenter bot commented Oct 17, 2024

Kafka / producer-benchmark

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master kr-igor/dsm-service-name-override
git_commit_date 1731694945 1731711042
git_commit_sha 2f767ab 6856dc5
See matching parameters
Baseline Candidate
ci_job_date 1731712256 1731712256
ci_job_id 709543451 709543451
ci_pipeline_id 49193655 49193655
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
jdkVersion 11.0.21 11.0.21
jmhVersion 1.36 1.36
jvm /usr/lib/jvm/java-11-openjdk-amd64/bin/java /usr/lib/jvm/java-11-openjdk-amd64/bin/java
jvmArgs -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant
vmName OpenJDK 64-Bit Server VM OpenJDK 64-Bit Server VM
vmVersion 11.0.21+9-post-Ubuntu-0ubuntu122.04 11.0.21+9-post-Ubuntu-0ubuntu122.04

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.

See unchanged results
scenario Δ mean throughput
scenario:not-instrumented/KafkaProduceBenchmark.benchProduce same
scenario:only-tracing-dsm-disabled-benchmarks/KafkaProduceBenchmark.benchProduce same
scenario:only-tracing-dsm-enabled-benchmarks/KafkaProduceBenchmark.benchProduce same

@pr-commenter

pr-commenter bot commented Oct 17, 2024

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master kr-igor/dsm-service-name-override
git_commit_date 1731694945 1731711042
git_commit_sha 2f767ab 6856dc5
release_version 1.43.0-SNAPSHOT~2f767ab81d 1.43.0-SNAPSHOT~6856dc5a90
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1731713628 1731713628
ci_job_id 709543447 709543447
ci_pipeline_id 49193655 49193655
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
module Agent Agent
parent None None
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 53 metrics, 10 unstable metrics.

Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.078 s) : 0, 1078306
Total [baseline] (10.424 s) : 0, 10424468
Agent [candidate] (1.081 s) : 0, 1080627
Total [candidate] (10.433 s) : 0, 10432782
section appsec
Agent [baseline] (1.214 s) : 0, 1213661
Total [baseline] (10.66 s) : 0, 10660096
Agent [candidate] (1.222 s) : 0, 1222185
Total [candidate] (10.69 s) : 0, 10690301
section iast
Agent [baseline] (1.205 s) : 0, 1204762
Total [baseline] (10.85 s) : 0, 10849908
Agent [candidate] (1.206 s) : 0, 1205704
Total [candidate] (10.832 s) : 0, 10831551
section profiling
Agent [baseline] (1.274 s) : 0, 1274412
Total [baseline] (10.733 s) : 0, 10732795
Agent [candidate] (1.275 s) : 0, 1275480
Total [candidate] (10.724 s) : 0, 10724239
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.078 s -
Agent appsec 1.214 s 135.355 ms (12.6%)
Agent iast 1.205 s 126.456 ms (11.7%)
Agent profiling 1.274 s 196.106 ms (18.2%)
Total tracing 10.424 s -
Total appsec 10.66 s 235.629 ms (2.3%)
Total iast 10.85 s 425.441 ms (4.1%)
Total profiling 10.733 s 308.327 ms (3.0%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.081 s -
Agent appsec 1.222 s 141.557 ms (13.1%)
Agent iast 1.206 s 125.077 ms (11.6%)
Agent profiling 1.275 s 194.853 ms (18.0%)
Total tracing 10.433 s -
Total appsec 10.69 s 257.519 ms (2.5%)
Total iast 10.832 s 398.769 ms (3.8%)
Total profiling 10.724 s 291.457 ms (2.8%)
gantt
    title petclinic - break down per module: candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (685.024 ms) : 0, 685024
BytebuddyAgent [candidate] (686.286 ms) : 0, 686286
GlobalTracer [baseline] (314.364 ms) : 0, 314364
GlobalTracer [candidate] (315.291 ms) : 0, 315291
AppSec [baseline] (53.972 ms) : 0, 53972
AppSec [candidate] (54.121 ms) : 0, 54121
Remote Config [baseline] (678.013 µs) : 0, 678
Remote Config [candidate] (679.547 µs) : 0, 680
Telemetry [baseline] (10.663 ms) : 0, 10663
Telemetry [candidate] (10.595 ms) : 0, 10595
section appsec
BytebuddyAgent [baseline] (703.259 ms) : 0, 703259
BytebuddyAgent [candidate] (708.486 ms) : 0, 708486
GlobalTracer [baseline] (312.009 ms) : 0, 312009
GlobalTracer [candidate] (315.031 ms) : 0, 315031
AppSec [baseline] (166.298 ms) : 0, 166298
AppSec [candidate] (166.657 ms) : 0, 166657
IAST [baseline] (20.216 ms) : 0, 20216
IAST [candidate] (19.564 ms) : 0, 19564
Remote Config [baseline] (625.339 µs) : 0, 625
Remote Config [candidate] (648.237 µs) : 0, 648
Telemetry [baseline] (7.767 ms) : 0, 7767
Telemetry [candidate] (7.894 ms) : 0, 7894
section iast
BytebuddyAgent [baseline] (801.478 ms) : 0, 801478
BytebuddyAgent [candidate] (802.006 ms) : 0, 802006
GlobalTracer [baseline] (304.06 ms) : 0, 304060
GlobalTracer [candidate] (304.26 ms) : 0, 304260
AppSec [baseline] (55.466 ms) : 0, 55466
AppSec [candidate] (56.994 ms) : 0, 56994
IAST [baseline] (22.093 ms) : 0, 22093
IAST [candidate] (20.707 ms) : 0, 20707
Remote Config [baseline] (605.515 µs) : 0, 606
Remote Config [candidate] (612.18 µs) : 0, 612
Telemetry [baseline] (7.432 ms) : 0, 7432
Telemetry [candidate] (7.479 ms) : 0, 7479
section profiling
BytebuddyAgent [baseline] (680.13 ms) : 0, 680130
BytebuddyAgent [candidate] (679.904 ms) : 0, 679904
GlobalTracer [baseline] (397.177 ms) : 0, 397177
GlobalTracer [candidate] (396.676 ms) : 0, 396676
AppSec [baseline] (54.372 ms) : 0, 54372
AppSec [candidate] (54.624 ms) : 0, 54624
Remote Config [baseline] (664.52 µs) : 0, 665
Remote Config [candidate] (661.504 µs) : 0, 662
Telemetry [baseline] (12.656 ms) : 0, 12656
Telemetry [candidate] (11.355 ms) : 0, 11355
ProfilingAgent [baseline] (90.66 ms) : 0, 90660
ProfilingAgent [candidate] (93.528 ms) : 0, 93528
Profiling [baseline] (90.684 ms) : 0, 90684
Profiling [candidate] (93.552 ms) : 0, 93552
Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.09 s) : 0, 1089634
Total [baseline] (8.575 s) : 0, 8574684
Agent [candidate] (1.079 s) : 0, 1079497
Total [candidate] (8.541 s) : 0, 8540787
section iast
Agent [baseline] (1.205 s) : 0, 1205495
Total [baseline] (9.107 s) : 0, 9107250
Agent [candidate] (1.205 s) : 0, 1205215
Total [candidate] (9.097 s) : 0, 9096963
section iast_HARDCODED_SECRET_DISABLED
Agent [baseline] (1.208 s) : 0, 1207667
Total [baseline] (9.127 s) : 0, 9127055
Agent [candidate] (1.209 s) : 0, 1208986
Total [candidate] (9.122 s) : 0, 9122034
section iast_TELEMETRY_OFF
Agent [baseline] (1.21 s) : 0, 1209816
Total [baseline] (9.095 s) : 0, 9094555
Agent [candidate] (1.202 s) : 0, 1202332
Total [candidate] (9.138 s) : 0, 9137535
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.09 s -
Agent iast 1.205 s 115.86 ms (10.6%)
Agent iast_HARDCODED_SECRET_DISABLED 1.208 s 118.033 ms (10.8%)
Agent iast_TELEMETRY_OFF 1.21 s 120.182 ms (11.0%)
Total tracing 8.575 s -
Total iast 9.107 s 532.566 ms (6.2%)
Total iast_HARDCODED_SECRET_DISABLED 9.127 s 552.371 ms (6.4%)
Total iast_TELEMETRY_OFF 9.095 s 519.871 ms (6.1%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.079 s -
Agent iast 1.205 s 125.718 ms (11.6%)
Agent iast_HARDCODED_SECRET_DISABLED 1.209 s 129.489 ms (12.0%)
Agent iast_TELEMETRY_OFF 1.202 s 122.835 ms (11.4%)
Total tracing 8.541 s -
Total iast 9.097 s 556.175 ms (6.5%)
Total iast_HARDCODED_SECRET_DISABLED 9.122 s 581.247 ms (6.8%)
Total iast_TELEMETRY_OFF 9.138 s 596.748 ms (7.0%)
gantt
    title insecure-bank - break down per module: candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (692.591 ms) : 0, 692591
BytebuddyAgent [candidate] (685.267 ms) : 0, 685267
GlobalTracer [baseline] (317.565 ms) : 0, 317565
GlobalTracer [candidate] (314.435 ms) : 0, 314435
AppSec [baseline] (54.462 ms) : 0, 54462
AppSec [candidate] (54.055 ms) : 0, 54055
Remote Config [baseline] (680.376 µs) : 0, 680
Remote Config [candidate] (694.94 µs) : 0, 695
Telemetry [baseline] (10.561 ms) : 0, 10561
Telemetry [candidate] (11.398 ms) : 0, 11398
section iast
BytebuddyAgent [baseline] (801.955 ms) : 0, 801955
BytebuddyAgent [candidate] (801.34 ms) : 0, 801340
GlobalTracer [baseline] (303.942 ms) : 0, 303942
GlobalTracer [candidate] (304.42 ms) : 0, 304420
AppSec [baseline] (57.334 ms) : 0, 57334
AppSec [candidate] (55.937 ms) : 0, 55937
IAST [baseline] (20.545 ms) : 0, 20545
IAST [candidate] (21.927 ms) : 0, 21927
Remote Config [baseline] (618.431 µs) : 0, 618
Remote Config [candidate] (601.837 µs) : 0, 602
Telemetry [baseline] (7.442 ms) : 0, 7442
Telemetry [candidate] (7.36 ms) : 0, 7360
section iast_HARDCODED_SECRET_DISABLED
BytebuddyAgent [baseline] (802.893 ms) : 0, 802893
BytebuddyAgent [candidate] (804.503 ms) : 0, 804503
GlobalTracer [baseline] (304.745 ms) : 0, 304745
GlobalTracer [candidate] (304.718 ms) : 0, 304718
AppSec [baseline] (57.702 ms) : 0, 57702
AppSec [candidate] (57.356 ms) : 0, 57356
IAST [baseline] (20.586 ms) : 0, 20586
IAST [candidate] (20.498 ms) : 0, 20498
Remote Config [baseline] (604.12 µs) : 0, 604
Remote Config [candidate] (615.416 µs) : 0, 615
Telemetry [baseline] (7.47 ms) : 0, 7470
Telemetry [candidate] (7.52 ms) : 0, 7520
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (804.615 ms) : 0, 804615
BytebuddyAgent [candidate] (799.262 ms) : 0, 799262
GlobalTracer [baseline] (305.686 ms) : 0, 305686
GlobalTracer [candidate] (303.69 ms) : 0, 303690
AppSec [baseline] (57.684 ms) : 0, 57684
AppSec [candidate] (56.859 ms) : 0, 56859
IAST [baseline] (20.133 ms) : 0, 20133
IAST [candidate] (20.924 ms) : 0, 20924
Remote Config [baseline] (605.041 µs) : 0, 605
Remote Config [candidate] (613.766 µs) : 0, 614
Telemetry [baseline] (7.338 ms) : 0, 7338
Telemetry [candidate] (7.318 ms) : 0, 7318

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
end_time 2024-11-15T23:04:09 2024-11-15T23:11:03
git_branch master kr-igor/dsm-service-name-override
git_commit_date 1731694945 1731711042
git_commit_sha 2f767ab 6856dc5
release_version 1.43.0-SNAPSHOT~2f767ab81d 1.43.0-SNAPSHOT~6856dc5a90
start_time 2024-11-15T23:03:55 2024-11-15T23:10:50
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1731712612 1731712612
ci_job_id 709543448 709543448
ci_pipeline_id 49193655 49193655
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant iast iast

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 17 unstable metrics.

Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.333 ms) : 1313, 1353
.   : milestone, 1333,
appsec (1.746 ms) : 1722, 1771
.   : milestone, 1746,
appsec_no_iast (1.727 ms) : 1703, 1752
.   : milestone, 1727,
iast (1.466 ms) : 1444, 1489
.   : milestone, 1466,
profiling (1.511 ms) : 1487, 1535
.   : milestone, 1511,
tracing (1.467 ms) : 1442, 1492
.   : milestone, 1467,
section candidate
no_agent (1.331 ms) : 1312, 1350
.   : milestone, 1331,
appsec (1.741 ms) : 1717, 1765
.   : milestone, 1741,
appsec_no_iast (1.715 ms) : 1690, 1739
.   : milestone, 1715,
iast (1.502 ms) : 1480, 1524
.   : milestone, 1502,
profiling (1.478 ms) : 1454, 1501
.   : milestone, 1478,
tracing (1.468 ms) : 1444, 1492
.   : milestone, 1468,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.333 ms [1.313 ms, 1.353 ms] -
appsec 1.746 ms [1.722 ms, 1.771 ms] 413.22 µs (31.0%)
appsec_no_iast 1.727 ms [1.703 ms, 1.752 ms] 393.941 µs (29.5%)
iast 1.466 ms [1.444 ms, 1.489 ms] 133.129 µs (10.0%)
profiling 1.511 ms [1.487 ms, 1.535 ms] 178.105 µs (13.4%)
tracing 1.467 ms [1.442 ms, 1.492 ms] 133.615 µs (10.0%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.331 ms [1.312 ms, 1.35 ms] -
appsec 1.741 ms [1.717 ms, 1.765 ms] 410.019 µs (30.8%)
appsec_no_iast 1.715 ms [1.69 ms, 1.739 ms] 383.479 µs (28.8%)
iast 1.502 ms [1.48 ms, 1.524 ms] 170.454 µs (12.8%)
profiling 1.478 ms [1.454 ms, 1.501 ms] 146.429 µs (11.0%)
tracing 1.468 ms [1.444 ms, 1.492 ms] 136.832 µs (10.3%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d
    dateFormat X
    axisFormat %s
section baseline
no_agent (370.99 µs) : 350, 392
.   : milestone, 371,
iast (482.034 µs) : 461, 503
.   : milestone, 482,
iast_FULL (638.608 µs) : 617, 660
.   : milestone, 639,
iast_GLOBAL (513.851 µs) : 492, 536
.   : milestone, 514,
iast_HARDCODED_SECRET_DISABLED (486.71 µs) : 465, 508
.   : milestone, 487,
iast_INACTIVE (447.753 µs) : 426, 469
.   : milestone, 448,
iast_TELEMETRY_OFF (478.151 µs) : 456, 500
.   : milestone, 478,
tracing (442.097 µs) : 422, 463
.   : milestone, 442,
section candidate
no_agent (367.339 µs) : 348, 387
.   : milestone, 367,
iast (486.566 µs) : 465, 509
.   : milestone, 487,
iast_FULL (642.278 µs) : 621, 664
.   : milestone, 642,
iast_GLOBAL (514.931 µs) : 493, 537
.   : milestone, 515,
iast_HARDCODED_SECRET_DISABLED (484.927 µs) : 464, 506
.   : milestone, 485,
iast_INACTIVE (446.73 µs) : 426, 468
.   : milestone, 447,
iast_TELEMETRY_OFF (474.606 µs) : 453, 496
.   : milestone, 475,
tracing (445.852 µs) : 425, 467
.   : milestone, 446,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 370.99 µs [349.64 µs, 392.34 µs] -
iast 482.034 µs [460.812 µs, 503.256 µs] 111.044 µs (29.9%)
iast_FULL 638.608 µs [617.186 µs, 660.03 µs] 267.618 µs (72.1%)
iast_GLOBAL 513.851 µs [492.089 µs, 535.613 µs] 142.861 µs (38.5%)
iast_HARDCODED_SECRET_DISABLED 486.71 µs [464.978 µs, 508.442 µs] 115.72 µs (31.2%)
iast_INACTIVE 447.753 µs [426.496 µs, 469.01 µs] 76.763 µs (20.7%)
iast_TELEMETRY_OFF 478.151 µs [456.091 µs, 500.211 µs] 107.161 µs (28.9%)
tracing 442.097 µs [421.568 µs, 462.625 µs] 71.107 µs (19.2%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 367.339 µs [347.769 µs, 386.908 µs] -
iast 486.566 µs [464.563 µs, 508.568 µs] 119.227 µs (32.5%)
iast_FULL 642.278 µs [621.017 µs, 663.539 µs] 274.939 µs (74.8%)
iast_GLOBAL 514.931 µs [493.346 µs, 536.516 µs] 147.593 µs (40.2%)
iast_HARDCODED_SECRET_DISABLED 484.927 µs [463.538 µs, 506.316 µs] 117.588 µs (32.0%)
iast_INACTIVE 446.73 µs [425.806 µs, 467.653 µs] 79.391 µs (21.6%)
iast_TELEMETRY_OFF 474.606 µs [452.88 µs, 496.332 µs] 107.267 µs (29.2%)
tracing 445.852 µs [424.993 µs, 466.711 µs] 78.514 µs (21.4%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master kr-igor/dsm-service-name-override
git_commit_date 1731694945 1731711042
git_commit_sha 2f767ab 6856dc5
release_version 1.43.0-SNAPSHOT~2f767ab81d 1.43.0-SNAPSHOT~6856dc5a90
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1731713153 1731713153
ci_job_id 709543449 709543449
ci_pipeline_id 49193655 49193655
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
variant appsec appsec

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.547 s) : 15547000, 15547000
.   : milestone, 15547000,
appsec (15.232 s) : 15232000, 15232000
.   : milestone, 15232000,
iast (18.435 s) : 18435000, 18435000
.   : milestone, 18435000,
iast_GLOBAL (18.084 s) : 18084000, 18084000
.   : milestone, 18084000,
profiling (15.03 s) : 15030000, 15030000
.   : milestone, 15030000,
tracing (15.077 s) : 15077000, 15077000
.   : milestone, 15077000,
section candidate
no_agent (15.166 s) : 15166000, 15166000
.   : milestone, 15166000,
appsec (15.12 s) : 15120000, 15120000
.   : milestone, 15120000,
iast (18.66 s) : 18660000, 18660000
.   : milestone, 18660000,
iast_GLOBAL (18.309 s) : 18309000, 18309000
.   : milestone, 18309000,
profiling (14.834 s) : 14834000, 14834000
.   : milestone, 14834000,
tracing (15.238 s) : 15238000, 15238000
.   : milestone, 15238000,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.547 s [15.547 s, 15.547 s] -
appsec 15.232 s [15.232 s, 15.232 s] -315.0 ms (-2.0%)
iast 18.435 s [18.435 s, 18.435 s] 2.888 s (18.6%)
iast_GLOBAL 18.084 s [18.084 s, 18.084 s] 2.537 s (16.3%)
profiling 15.03 s [15.03 s, 15.03 s] -517.0 ms (-3.3%)
tracing 15.077 s [15.077 s, 15.077 s] -470.0 ms (-3.0%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.166 s [15.166 s, 15.166 s] -
appsec 15.12 s [15.12 s, 15.12 s] -46.0 ms (-0.3%)
iast 18.66 s [18.66 s, 18.66 s] 3.494 s (23.0%)
iast_GLOBAL 18.309 s [18.309 s, 18.309 s] 3.143 s (20.7%)
profiling 14.834 s [14.834 s, 14.834 s] -332.0 ms (-2.2%)
tracing 15.238 s [15.238 s, 15.238 s] 72.0 ms (0.5%)
Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.43.0-SNAPSHOT~6856dc5a90, baseline=1.43.0-SNAPSHOT~2f767ab81d
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.469 ms) : 1458, 1481
.   : milestone, 1469,
appsec (2.348 ms) : 2307, 2390
.   : milestone, 2348,
iast (2.079 ms) : 2027, 2132
.   : milestone, 2079,
iast_GLOBAL (2.129 ms) : 2076, 2181
.   : milestone, 2129,
profiling (1.937 ms) : 1896, 1978
.   : milestone, 1937,
tracing (1.919 ms) : 1880, 1958
.   : milestone, 1919,
section candidate
no_agent (1.472 ms) : 1461, 1484
.   : milestone, 1472,
appsec (2.34 ms) : 2299, 2381
.   : milestone, 2340,
iast (2.079 ms) : 2027, 2131
.   : milestone, 2079,
iast_GLOBAL (2.121 ms) : 2068, 2174
.   : milestone, 2121,
profiling (1.946 ms) : 1904, 1988
.   : milestone, 1946,
tracing (1.925 ms) : 1885, 1965
.   : milestone, 1925,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.469 ms [1.458 ms, 1.481 ms] -
appsec 2.348 ms [2.307 ms, 2.39 ms] 878.876 µs (59.8%)
iast 2.079 ms [2.027 ms, 2.132 ms] 609.631 µs (41.5%)
iast_GLOBAL 2.129 ms [2.076 ms, 2.181 ms] 659.151 µs (44.9%)
profiling 1.937 ms [1.896 ms, 1.978 ms] 467.39 µs (31.8%)
tracing 1.919 ms [1.88 ms, 1.958 ms] 449.609 µs (30.6%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.472 ms [1.461 ms, 1.484 ms] -
appsec 2.34 ms [2.299 ms, 2.381 ms] 867.66 µs (58.9%)
iast 2.079 ms [2.027 ms, 2.131 ms] 606.268 µs (41.2%)
iast_GLOBAL 2.121 ms [2.068 ms, 2.174 ms] 648.743 µs (44.1%)
profiling 1.946 ms [1.904 ms, 1.988 ms] 473.524 µs (32.2%)
tracing 1.925 ms [1.885 ms, 1.965 ms] 452.645 µs (30.7%)

@pr-commenter

pr-commenter bot commented Oct 17, 2024

Kafka / consumer-benchmark

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master kr-igor/dsm-service-name-override
git_commit_date 1731694945 1731711042
git_commit_sha 2f767ab 6856dc5
See matching parameters
Baseline Candidate
ci_job_date 1731712290 1731712290
ci_job_id 709543452 709543452
ci_pipeline_id 49193655 49193655
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
jdkVersion 11.0.21 11.0.21
jmhVersion 1.36 1.36
jvm /usr/lib/jvm/java-11-openjdk-amd64/bin/java /usr/lib/jvm/java-11-openjdk-amd64/bin/java
jvmArgs -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant
vmName OpenJDK 64-Bit Server VM OpenJDK 64-Bit Server VM
vmVersion 11.0.21+9-post-Ubuntu-0ubuntu122.04 11.0.21+9-post-Ubuntu-0ubuntu122.04

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.

See unchanged results
scenario Δ mean throughput
scenario:not-instrumented/KafkaConsumerBenchmark.benchConsume same
scenario:only-tracing-dsm-disabled-benchmarks/KafkaConsumerBenchmark.benchConsume unsure
[-17367.061op/s; -1389.292op/s] or [-5.463%; -0.437%]
scenario:only-tracing-dsm-enabled-benchmarks/KafkaConsumerBenchmark.benchConsume same


if (taskDescription != null) {
try {
Field prop = taskDescription.getClass().getDeclaredField("properties");
Collaborator

I think you should optimize this by unreflecting a method handle for this getter. It will save access-checking costs.
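
A minimal sketch of that suggestion, assuming the field is a `java.util.Properties`. `FakeTaskDescription` is a hypothetical stand-in for the real task description class, and the `app.name` key is made up for the example. The `Field` is resolved and unreflected once, so later reads go through a cached `MethodHandle` instead of repeating the reflective lookup.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Field;
import java.util.Properties;

/** Hypothetical sketch; FakeTaskDescription stands in for the real class. */
final class PropertiesAccessor {
  static final class FakeTaskDescription {
    private final Properties properties = new Properties();

    FakeTaskDescription(String appName) {
      properties.setProperty("app.name", appName); // made-up key
    }
  }

  // Resolved once; null if the field is missing or cannot be accessed.
  private static final MethodHandle PROPERTIES_GETTER = lookupGetter();

  private static MethodHandle lookupGetter() {
    try {
      Field field = FakeTaskDescription.class.getDeclaredField("properties");
      field.setAccessible(true);
      // Access is checked here, once, rather than on every read.
      return MethodHandles.lookup().unreflectGetter(field);
    } catch (ReflectiveOperationException | RuntimeException e) {
      return null;
    }
  }

  /** Returns the properties of the task description, or null on any failure. */
  static Properties propertiesOf(Object taskDescription) {
    if (PROPERTIES_GETTER == null || taskDescription == null) {
      return null;
    }
    try {
      return (Properties) PROPERTIES_GETTER.invoke(taskDescription);
    } catch (Throwable t) {
      // Includes ClassCastException when the argument has an unexpected type.
      return null;
    }
  }
}
```

Returning `null` on failure keeps the instrumentation path from throwing into application code; whether that is the right fallback here is a design choice for the PR author.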

if (appName != null) {
AgentTracer.get()
.getDataStreamsMonitoring()
.setThreadServiceName(taskRunner.getThreadId(), appName);
Collaborator

Is it coherent, product-wise, for DSM to report a different service name from the one reported by tracing?

Contributor Author

@kr-igor kr-igor Oct 18, 2024

It makes sense in the Spark / Flink case. In managed environments it may not be possible to pass the service name in from outside (for instance, spans generated within Databricks for DJM may have service name == "databricks.all-purpose-cluster.dlt-execution-123"). At the same time, customers expect to see the Spark job name, not the cluster node name.

There are also cases where a single running application (the Task Manager in Flink) runs multiple independent stream processing tasks. Each task has its own "service name", so using the node name is incorrect.

A better approach would be to set the global service name dynamically depending on the context. That seems like a much larger change. Happy to discuss it separately.

Collaborator

Actually, you can do it by calling updatePreferredServiceName. Is that what you were looking for?

Collaborator

OK, if it depends on the current context, then it does not apply globally to the JVM.

List<StatsBucket> includedBuckets = new ArrayList<>();
Iterator<Map.Entry<Long, StatsBucket>> mapIterator = timeToBucket.entrySet().iterator();
// stats are grouped by time buckets and service names
Map<String, List<StatsBucket>> includedBuckets = new HashMap<>();
Contributor

We could also put the service name in the statsPoint. What do you think? It would avoid doing this grouping logic here.

Contributor Author

The service name is reported per bucket. You could put it inside the stats point, but then you would have to modify the writer to split the bucket correctly, effectively doing a similar thing in a different place.
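
For readers following the thread, the grouping under discussion can be sketched roughly as below. This is an illustration under assumptions, not the PR's code. `Bucket` is a hypothetical stand-in for `StatsBucket`, and the nested `Map<Long, Map<String, Bucket>>` layout and the `drainOlderThan` name are guesses made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; Bucket stands in for the real StatsBucket. */
final class BucketGrouping {
  static final class Bucket {
    final long startNanos;
    final String serviceName;

    Bucket(long startNanos, String serviceName) {
      this.startNanos = startNanos;
      this.serviceName = serviceName;
    }
  }

  /**
   * Removes every time bucket that started before cutoffNanos and returns
   * the removed buckets grouped by service name, one list per payload.
   */
  static Map<String, List<Bucket>> drainOlderThan(
      Map<Long, Map<String, Bucket>> timeToBucket, long cutoffNanos) {
    Map<String, List<Bucket>> included = new HashMap<>();
    Iterator<Map.Entry<Long, Map<String, Bucket>>> it = timeToBucket.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<Long, Map<String, Bucket>> entry = it.next();
      if (entry.getKey() >= cutoffNanos) {
        continue; // interval is still open, keep it for a later flush
      }
      for (Bucket bucket : entry.getValue().values()) {
        included.computeIfAbsent(bucket.serviceName, k -> new ArrayList<>()).add(bucket);
      }
      it.remove();
    }
    return included;
  }
}
```

Each entry of the returned map could then be handed to the writer as one payload for that service name, which is the grouping step the alternative would move into the writer.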

@kr-igor kr-igor marked this pull request as ready for review October 29, 2024 16:12
@kr-igor kr-igor requested review from a team as code owners October 29, 2024 16:12
Collaborator

@amarziali amarziali left a comment

I think this addition is good to have and improves the experience of DSM consumers. I still have some concerns, especially around adding an unbounded map to store thread ID to name correspondences. It should be done with a ThreadLocal, and it might not actually work if the Kafka producer (in this example) is called from a thread other than the Spark task thread. It would also be nice to have more e2e testing with Spark and Kafka.

@@ -200,7 +200,8 @@ public static void onEnter(@Advice.Argument(value = 0) int estimatedPayloadSize)
 saved.getTimestampNanos(),
 saved.getPathwayLatencyNano(),
 saved.getEdgeLatencyNano(),
-estimatedPayloadSize);
+estimatedPayloadSize,
+saved.getServiceNameOverride());
Collaborator

Are messages produced in the same thread where the task is running? Also, are we testing it? I did not see Spark tests with Kafka, but perhaps I missed them.

@@ -74,6 +75,8 @@ public class DefaultDataStreamsMonitoring implements DataStreamsMonitoring, Even
private volatile boolean agentSupportsDataStreams = false;
private volatile boolean configSupportsDataStreams = false;
private final ConcurrentHashMap<String, SchemaSampler> schemaSamplers;
private static final ConcurrentHashMap<Long, String> threadServiceNames =
Collaborator

I have some concerns about adding unbounded maps here. It might be OK for the Spark instrumentation, but this has been added to a generic API on DataStreamsMonitoring. It could be misused in the future and start leaking. If service names have to be pinned to a thread, a ThreadLocal variable should possibly be used to ensure that the value goes away when the thread is collected.
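
A minimal sketch of the ThreadLocal alternative described above, with hypothetical names (`ThreadServiceName`, `set`, `get`, `clear`) that are not part of the DSM API. The value is stored on the thread itself, so there is no shared map to grow, and it becomes unreachable once the thread terminates. On pooled threads it still needs an explicit `clear()` when the task ends.

```java
/** Hypothetical sketch; these names are not part of the DSM API. */
final class ThreadServiceName {
  // One slot per thread; nothing is retained after the thread terminates.
  private static final ThreadLocal<String> OVERRIDE = new ThreadLocal<>();

  static void set(String serviceName) {
    OVERRIDE.set(serviceName);
  }

  /** Returns the override for the calling thread, or null if none is set. */
  static String get() {
    return OVERRIDE.get();
  }

  /** Should be called when a pooled thread finishes its task. */
  static void clear() {
    OVERRIDE.remove();
  }

  /** Reads the override from a freshly started thread, for demonstration. */
  static String readFromNewThread() {
    String[] seen = new String[1];
    Thread reader = new Thread(() -> seen[0] = get());
    reader.start();
    try {
      reader.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return seen[0];
  }
}
```

`readFromNewThread()` returns `null` even after `set(...)` has been called on the current thread. That is the same limitation raised elsewhere in this review: a producer invoked from a different thread than the task would not see the override.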

@kr-igor kr-igor merged commit 6181783 into master Nov 18, 2024
104 checks passed
@kr-igor kr-igor deleted the kr-igor/dsm-service-name-override branch November 18, 2024 15:26
@github-actions github-actions bot added this to the 1.43.0 milestone Nov 18, 2024