-
Notifications
You must be signed in to change notification settings - Fork 292
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Defer remote components to avoid OkHttp class-loading side-effects #8131
Conversation
…mote components when use of OkHttp should be delayed
or JMX builder was on the system classpath (because the main thread would find it there if OkHttp triggered initialization of JUL.). We now make OkHttp calls from our own background threads, which are isolated from the system classloader, not the main thread - so this exemption no longer makes sense.
…n SharedCommunicationObjects
BenchmarksStartupParameters
See matching parameters
SummaryFound 1 performance improvements and 1 performance regressions! Performance is the same for 54 metrics, 7 unstable metrics.
Startup time reports for insecure-bankgantt
title insecure-bank - global startup overhead: candidate=1.45.0-SNAPSHOT~c9cf18f149, baseline=1.45.0-SNAPSHOT~46b5986f6a
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.054 s) : 0, 1054030
Total [baseline] (8.616 s) : 0, 8616440
Agent [candidate] (1.051 s) : 0, 1051164
Total [candidate] (8.651 s) : 0, 8651367
section iast
Agent [baseline] (1.179 s) : 0, 1178984
Total [baseline] (9.239 s) : 0, 9239308
Agent [candidate] (1.185 s) : 0, 1185391
Total [candidate] (9.247 s) : 0, 9247323
section iast_HARDCODED_SECRET_DISABLED
Agent [baseline] (1.185 s) : 0, 1184923
Total [baseline] (9.251 s) : 0, 9250611
Agent [candidate] (1.176 s) : 0, 1176348
Total [candidate] (9.182 s) : 0, 9182369
section iast_TELEMETRY_OFF
Agent [baseline] (1.171 s) : 0, 1171436
Total [baseline] (9.212 s) : 0, 9211835
Agent [candidate] (1.175 s) : 0, 1174694
Total [candidate] (9.174 s) : 0, 9174424
gantt
title insecure-bank - break down per module: candidate=1.45.0-SNAPSHOT~c9cf18f149, baseline=1.45.0-SNAPSHOT~46b5986f6a
dateFormat X
axisFormat %s
section tracing
BytebuddyAgent [baseline] (711.252 ms) : 0, 711252
BytebuddyAgent [candidate] (711.825 ms) : 0, 711825
GlobalTracer [baseline] (256.947 ms) : 0, 256947
GlobalTracer [candidate] (256.073 ms) : 0, 256073
AppSec [baseline] (56.793 ms) : 0, 56793
AppSec [candidate] (54.845 ms) : 0, 54845
Remote Config [baseline] (697.454 µs) : 0, 697
Remote Config [candidate] (705.571 µs) : 0, 706
Telemetry [baseline] (13.506 ms) : 0, 13506
Telemetry [candidate] (12.73 ms) : 0, 12730
section iast
BytebuddyAgent [baseline] (828.783 ms) : 0, 828783
BytebuddyAgent [candidate] (833.53 ms) : 0, 833530
GlobalTracer [baseline] (247.197 ms) : 0, 247197
GlobalTracer [candidate] (247.646 ms) : 0, 247646
AppSec [baseline] (57.656 ms) : 0, 57656
AppSec [candidate] (58.125 ms) : 0, 58125
IAST [baseline] (21.316 ms) : 0, 21316
IAST [candidate] (21.614 ms) : 0, 21614
Remote Config [baseline] (644.524 µs) : 0, 645
Remote Config [candidate] (661.98 µs) : 0, 662
Telemetry [baseline] (8.464 ms) : 0, 8464
Telemetry [candidate] (8.699 ms) : 0, 8699
section iast_HARDCODED_SECRET_DISABLED
BytebuddyAgent [baseline] (833.622 ms) : 0, 833622
BytebuddyAgent [candidate] (826.363 ms) : 0, 826363
GlobalTracer [baseline] (247.596 ms) : 0, 247596
GlobalTracer [candidate] (246.385 ms) : 0, 246385
AppSec [baseline] (58.316 ms) : 0, 58316
AppSec [candidate] (58.249 ms) : 0, 58249
IAST [baseline] (21.256 ms) : 0, 21256
IAST [candidate] (21.158 ms) : 0, 21158
Remote Config [baseline] (655.442 µs) : 0, 655
Remote Config [candidate] (705.179 µs) : 0, 705
Telemetry [baseline] (8.524 ms) : 0, 8524
Telemetry [candidate] (8.522 ms) : 0, 8522
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (823.532 ms) : 0, 823532
BytebuddyAgent [candidate] (825.243 ms) : 0, 825243
GlobalTracer [baseline] (245.852 ms) : 0, 245852
GlobalTracer [candidate] (246.698 ms) : 0, 246698
AppSec [baseline] (57.557 ms) : 0, 57557
AppSec [candidate] (57.863 ms) : 0, 57863
IAST [baseline] (20.694 ms) : 0, 20694
IAST [candidate] (20.886 ms) : 0, 20886
Remote Config [baseline] (651.519 µs) : 0, 652
Remote Config [candidate] (656.967 µs) : 0, 657
Telemetry [baseline] (8.277 ms) : 0, 8277
Telemetry [candidate] (8.379 ms) : 0, 8379
Startup time reports for petclinicgantt
title petclinic - global startup overhead: candidate=1.45.0-SNAPSHOT~c9cf18f149, baseline=1.45.0-SNAPSHOT~46b5986f6a
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.054 s) : 0, 1054222
Total [baseline] (10.366 s) : 0, 10365926
Agent [candidate] (1.052 s) : 0, 1051540
Total [candidate] (10.434 s) : 0, 10434404
section appsec
Agent [baseline] (1.185 s) : 0, 1184964
Total [baseline] (10.716 s) : 0, 10716196
Agent [candidate] (1.188 s) : 0, 1188164
Total [candidate] (10.71 s) : 0, 10710184
section iast
Agent [baseline] (1.179 s) : 0, 1179028
Total [baseline] (10.894 s) : 0, 10893991
Agent [candidate] (1.179 s) : 0, 1178699
Total [candidate] (10.924 s) : 0, 10924085
section profiling
Agent [baseline] (1.272 s) : 0, 1271527
Total [baseline] (10.826 s) : 0, 10825952
Agent [candidate] (1.275 s) : 0, 1275136
Total [candidate] (10.795 s) : 0, 10795264
gantt
title petclinic - break down per module: candidate=1.45.0-SNAPSHOT~c9cf18f149, baseline=1.45.0-SNAPSHOT~46b5986f6a
dateFormat X
axisFormat %s
section tracing
BytebuddyAgent [baseline] (712.306 ms) : 0, 712306
BytebuddyAgent [candidate] (710.866 ms) : 0, 710866
GlobalTracer [baseline] (256.071 ms) : 0, 256071
GlobalTracer [candidate] (256.177 ms) : 0, 256177
AppSec [baseline] (59.182 ms) : 0, 59182
AppSec [candidate] (55.189 ms) : 0, 55189
Remote Config [baseline] (677.416 µs) : 0, 677
Remote Config [candidate] (705.789 µs) : 0, 706
Telemetry [baseline] (11.06 ms) : 0, 11060
Telemetry [candidate] (13.576 ms) : 0, 13576
section appsec
BytebuddyAgent [baseline] (728.429 ms) : 0, 728429
BytebuddyAgent [candidate] (730.547 ms) : 0, 730547
GlobalTracer [baseline] (252.843 ms) : 0, 252843
GlobalTracer [candidate] (253.2 ms) : 0, 253200
AppSec [baseline] (170.204 ms) : 0, 170204
AppSec [candidate] (170.659 ms) : 0, 170659
IAST [baseline] (19.335 ms) : 0, 19335
IAST [candidate] (20.125 ms) : 0, 20125
Remote Config [baseline] (649.577 µs) : 0, 650
Remote Config [candidate] (659.369 µs) : 0, 659
Telemetry [baseline] (8.299 ms) : 0, 8299
Telemetry [candidate] (7.998 ms) : 0, 7998
section iast
BytebuddyAgent [baseline] (829.303 ms) : 0, 829303
BytebuddyAgent [candidate] (828.821 ms) : 0, 828821
GlobalTracer [baseline] (246.734 ms) : 0, 246734
GlobalTracer [candidate] (246.648 ms) : 0, 246648
AppSec [baseline] (57.846 ms) : 0, 57846
AppSec [candidate] (57.862 ms) : 0, 57862
IAST [baseline] (21.092 ms) : 0, 21092
IAST [candidate] (21.155 ms) : 0, 21155
Remote Config [baseline] (644.318 µs) : 0, 644
Remote Config [candidate] (652.136 µs) : 0, 652
Telemetry [baseline] (8.5 ms) : 0, 8500
Telemetry [candidate] (8.515 ms) : 0, 8515
section profiling
BytebuddyAgent [baseline] (701.077 ms) : 0, 701077
BytebuddyAgent [candidate] (702.558 ms) : 0, 702558
GlobalTracer [baseline] (370.553 ms) : 0, 370553
GlobalTracer [candidate] (372.442 ms) : 0, 372442
AppSec [baseline] (54.027 ms) : 0, 54027
AppSec [candidate] (54.243 ms) : 0, 54243
Remote Config [baseline] (655.908 µs) : 0, 656
Remote Config [candidate] (671.064 µs) : 0, 671
Telemetry [baseline] (7.817 ms) : 0, 7817
Telemetry [candidate] (7.832 ms) : 0, 7832
ProfilingAgent [baseline] (95.731 ms) : 0, 95731
ProfilingAgent [candidate] (95.497 ms) : 0, 95497
Profiling [baseline] (95.758 ms) : 0, 95758
Profiling [candidate] (95.526 ms) : 0, 95526
LoadParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 16 unstable metrics. Request duration reports for petclinicgantt
title petclinic - request duration [CI 0.99] : candidate=1.45.0-SNAPSHOT~c9cf18f149, baseline=1.45.0-SNAPSHOT~46b5986f6a
dateFormat X
axisFormat %s
section baseline
no_agent (1.358 ms) : 1338, 1377
. : milestone, 1358,
appsec (1.765 ms) : 1741, 1789
. : milestone, 1765,
appsec_no_iast (1.752 ms) : 1726, 1778
. : milestone, 1752,
iast (1.505 ms) : 1482, 1528
. : milestone, 1505,
profiling (1.567 ms) : 1543, 1592
. : milestone, 1567,
tracing (1.476 ms) : 1451, 1501
. : milestone, 1476,
section candidate
no_agent (1.372 ms) : 1351, 1392
. : milestone, 1372,
appsec (1.769 ms) : 1746, 1793
. : milestone, 1769,
appsec_no_iast (1.776 ms) : 1752, 1801
. : milestone, 1776,
iast (1.512 ms) : 1489, 1534
. : milestone, 1512,
profiling (1.515 ms) : 1491, 1539
. : milestone, 1515,
tracing (1.499 ms) : 1475, 1523
. : milestone, 1499,
Request duration reports for insecure-bankgantt
title insecure-bank - request duration [CI 0.99] : candidate=1.45.0-SNAPSHOT~c9cf18f149, baseline=1.45.0-SNAPSHOT~46b5986f6a
dateFormat X
axisFormat %s
section baseline
no_agent (378.296 µs) : 357, 400
. : milestone, 378,
iast (495.821 µs) : 474, 518
. : milestone, 496,
iast_FULL (657.279 µs) : 636, 679
. : milestone, 657,
iast_GLOBAL (525.683 µs) : 504, 547
. : milestone, 526,
iast_HARDCODED_SECRET_DISABLED (493.715 µs) : 472, 515
. : milestone, 494,
iast_INACTIVE (452.718 µs) : 431, 474
. : milestone, 453,
iast_TELEMETRY_OFF (480.982 µs) : 460, 502
. : milestone, 481,
tracing (460.925 µs) : 440, 482
. : milestone, 461,
section candidate
no_agent (387.95 µs) : 368, 408
. : milestone, 388,
iast (494.661 µs) : 473, 516
. : milestone, 495,
iast_FULL (660.087 µs) : 638, 682
. : milestone, 660,
iast_GLOBAL (524.412 µs) : 502, 547
. : milestone, 524,
iast_HARDCODED_SECRET_DISABLED (496.012 µs) : 474, 518
. : milestone, 496,
iast_INACTIVE (452.058 µs) : 431, 473
. : milestone, 452,
iast_TELEMETRY_OFF (481.591 µs) : 460, 503
. : milestone, 482,
tracing (455.045 µs) : 434, 476
. : milestone, 455,
DacapoParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics. Execution time for tomcatgantt
title tomcat - execution time [CI 0.99] : candidate=1.45.0-SNAPSHOT~c9cf18f149, baseline=1.45.0-SNAPSHOT~46b5986f6a
dateFormat X
axisFormat %s
section baseline
no_agent (1.464 ms) : 1453, 1476
. : milestone, 1464,
appsec (2.344 ms) : 2302, 2386
. : milestone, 2344,
iast (2.084 ms) : 2032, 2137
. : milestone, 2084,
iast_GLOBAL (2.126 ms) : 2073, 2180
. : milestone, 2126,
profiling (1.93 ms) : 1889, 1972
. : milestone, 1930,
tracing (1.926 ms) : 1886, 1967
. : milestone, 1926,
section candidate
no_agent (1.471 ms) : 1460, 1483
. : milestone, 1471,
appsec (2.339 ms) : 2297, 2380
. : milestone, 2339,
iast (2.075 ms) : 2022, 2127
. : milestone, 2075,
iast_GLOBAL (2.135 ms) : 2081, 2188
. : milestone, 2135,
profiling (1.969 ms) : 1926, 2012
. : milestone, 1969,
tracing (1.923 ms) : 1883, 1964
. : milestone, 1923,
Execution time for biojavagantt
title biojava - execution time [CI 0.99] : candidate=1.45.0-SNAPSHOT~c9cf18f149, baseline=1.45.0-SNAPSHOT~46b5986f6a
dateFormat X
axisFormat %s
section baseline
no_agent (14.877 s) : 14877000, 14877000
. : milestone, 14877000,
appsec (14.838 s) : 14838000, 14838000
. : milestone, 14838000,
iast (18.884 s) : 18884000, 18884000
. : milestone, 18884000,
iast_GLOBAL (17.833 s) : 17833000, 17833000
. : milestone, 17833000,
profiling (14.808 s) : 14808000, 14808000
. : milestone, 14808000,
tracing (14.889 s) : 14889000, 14889000
. : milestone, 14889000,
section candidate
no_agent (14.694 s) : 14694000, 14694000
. : milestone, 14694000,
appsec (15.145 s) : 15145000, 15145000
. : milestone, 15145000,
iast (18.846 s) : 18846000, 18846000
. : milestone, 18846000,
iast_GLOBAL (17.795 s) : 17795000, 17795000
. : milestone, 17795000,
profiling (14.922 s) : 14922000, 14922000
. : milestone, 14922000,
tracing (14.955 s) : 14955000, 14955000
. : milestone, 14955000,
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. A bit sad to decrease the code coverage though 😞
This kind of PR reminds me we need to define general components and events as platform… (similar to the ones that would help with CRaC).
@@ -25,7 +25,7 @@ dependencies { | |||
} | |||
|
|||
ext { | |||
minimumBranchCoverage = 0.6 | |||
minimumBranchCoverage = 0.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the whole module not testable or it the cost too high?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the double-checked lock that tips things over - attempting to cover all the cases for this piece of code would be very intensive (you'd end up testing the narrow volatile visibility edge.) There are other branches in SharedCommunicationObjects
which are not currently tested which is why we're at this boundary to begin with, but adding tests for those unrelated pieces of code in this PR is IMHO confusing for future reviewers.
I could just exclude SharedCommunicationObjects
from branch coverage completely, but that feels worse - it's already excluded from instrumentation coverage - so reducing the coverage requirement here by a small amount is the least worst option.
The proper solution is to add tests to increase coverage in a separate PR, as a separate work item.
try { | ||
// remote components were paused for custom log-manager/jmx-builder | ||
// add small delay before resuming remote I/O to help stabilization | ||
Thread.sleep(1_000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dumb question: is this an empirical delay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - having no delay or a delay of a few 100ms caused issues because the custom log manager might not have finished its setup by the time the delay expires. The resulting JUL class-loading triggered from this thread via OkHttp could then lead to the logging setup not being complete, which can in turn lead to missing logging.
There's no good class-loader signal to tell us the custom log manager is ready, only when JUL has been touched (and if we try to be too smart then we might miss the signal, which would lead to us never resuming remote I/O)
So this is an empirical practical solution which stabilizes the test locally and on CI. That's also why it's important to install the tracer (and logs-intake collector) early so they can capture data while holding back the remote I/O. That way we avoid triggering JUL issues while not losing trace or log data.
Also note this code path is only used for very specific setups, specifically Java 8 with JFR or IBM JDKs.
What Does This Do
Add ability to pause remote components until SharedCommunicationObjects is ready for remote I/O
Use this to install tracer and logs-intake as soon as necessary, but still defer any remote components until when use of OkHttp is allowed
Remove exemption where we didn't defer if the custom logging manager or JMX builder was on the system classpath (because previously the main thread would find it there if OkHttp triggered initialization of JUL.)
We now make OkHttp calls from our own background threads to avoid blocking the main thread, but our threads are isolated from the system classloader - so this exemption no longer makes sense and should be removed
Motivation
Resolves some spurious failures in the log-injection smoke tests for the IBM Java 8 JDK.
Contributor Checklist
type:
and (comp:
orinst:
) labels in addition to any usefull labelsclose
,fix
or any linking keywords when referencing an issue.Use
solves
instead, and assign the PR milestone to the issueJira ticket: [PROJ-IDENT]