-
Notifications
You must be signed in to change notification settings - Fork 858
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ignore TLS components (SSLContext, TrustManager, KeyManager) if plain HTTP protocol is used for exporting #6329
Ignore TLS components (SSLContext, TrustManager, KeyManager) if plain HTTP protocol is used for exporting #6329
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #6329 +/- ##
============================================
+ Coverage 91.12% 91.14% +0.01%
- Complexity 5856 5863 +7
============================================
Files 636 636
Lines 17062 17071 +9
Branches 1733 1740 +7
============================================
+ Hits 15548 15559 +11
+ Misses 1019 1018 -1
+ Partials 495 494 -1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks reasonable to me. Definitely makes sense to avoid the extra work when not sending via SSL which is likely very common (eg local collector).
As far as I see, |
@@ -207,8 +208,8 @@ public GrpcExporter<T> build() { | |||
grpcChannel, | |||
grpcStubFactory, | |||
retryPolicy, | |||
tlsConfigHelper.getSslContext(), | |||
tlsConfigHelper.getTrustManager()); | |||
isPlainHttp ? null : tlsConfigHelper.getSslContext(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since artifact layout is customized for Spring's custom Launcher classloader as mentioned above, JDK level security components are initialized without custom KeyStoreSpi implementation, because custom KeyStoreSpi implementation is not visible to system classloader, but to Spring's custom Launcher classloader.
It looks like the call to SSLContext.getInstance("TLS")
is what is responsible for initializing this state? Seems brittle to need to rely on the otel java agent and otel java SDK to not call ordinary java network APIs like SSLContext.getInstance("TLS")
.
Why is it not ok to call SSLContext.getInstance("TLS")
when the scheme is http
, but acceptable to call it when scheme is https
? Wouldn't spring users that want to implement KeyStoreSpi
and use https
be subject to this problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jack-berg,
Actually, in my case, extra SSLContext.getInstance("TLS")
call is not a problem. But I have though that there might be similar problem for javax.net.ssl.SSLContextSpi
and java.security.Provider
as they are also loaded by Java ServiceLoader
. So I have prevented extra SSLContext.getInstance("TLS")
calls if they are not needed (scheme is http
).
If you want, I can revert those changes and only keep
builder.connectionSpecs(Collections.singletonList(ConnectionSpec.CLEARTEXT));
parts as those fix the my original issue.
Regarding to your this question
Wouldn't spring users that want to implement KeyStoreSpi and use https be subject to this problem?
Yes, you're right. They are still subject to this problem. In fact, this PR doesn't fix all the possible issues (when OTEL Java agent uses secure connection while exporting to collector), but most of the cases (I believe), because as @tylerbenson mentioned, many are exporting to local collector first without secure connection, and then local collector deals with exporting to remove vendor target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be evaluated as a performance optimization where we try to avoid doing unnecessary work in obvious cases rather than creating a guarantee that it never happens.
While this has a similar appearance as java util logging which we take great pains to deal with in the javaagent, doing so adds a significant amount of complexity in the javaagent.
For logging this complexity is worth it because logging initialization is so hard to avoid, while SSL initialization is only done in very specific locations and not difficult to avoid (with the added benefit of reducing performance overhead and not even requiring an additional setting).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The perf argument is quite different than how this is being framed. Some points to consider:
- On my machine,
SSLContext.getInstance("TLS")
takes ~50ms. Anyone know how long the agent typically takes to initialize? - Optimizing initialization for perf reasons seems reasonable, but we shouldn't be under the impression that this is a reliable fix for the reported issue. The same issue is likely to pop up again the future.
- If we want to optimize initialization for perf reasons, can we look at a profile of CPU time for the initialization of the agent / SDK? There's likely other low hanging fruit we can optimize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, for the performance optimization perspective, getting rid of redundant SSLContext.getInstance("TLS")
is not the only part optimized here. But also getting rid of redundant TrustManager and KeyStore initialization in the OkHttpClient
. In our case, initialization of the TrustManager is the real problem.
Here is the stacktrace to see the flow:
OpenTelemetry Javaagent failed to start
java.security.KeyStoreException: problem accessing trust store
at java.base/sun.security.ssl.TrustManagerFactoryImpl.engineInit(TrustManagerFactoryImpl.java:73)
at java.base/javax.net.ssl.TrustManagerFactory.init(TrustManagerFactory.java:282)
at okhttp3.internal.platform.Platform.platformTrustManager(Platform.kt:80)
at okhttp3.OkHttpClient.<init>(OkHttpClient.kt:237)
at okhttp3.OkHttpClient$Builder.build(OkHttpClient.kt:1069)
at io.opentelemetry.exporter.sender.okhttp.internal.OkHttpGrpcSender.<init>(OkHttpGrpcSender.java:97)
at io.opentelemetry.exporter.sender.okhttp.internal.OkHttpGrpcSenderProvider.createSender(OkHttpGrpcSenderProvider.java:44)
at io.opentelemetry.exporter.internal.grpc.GrpcExporterBuilder.build(GrpcExporterBuilder.java:189)
at io.opentelemetry.exporter.otlp.metrics.OtlpGrpcMetricExporterBuilder.build(OtlpGrpcMetricExporterBuilder.java:243)
at io.opentelemetry.exporter.otlp.internal.OtlpMetricExporterProvider.createExporter(OtlpMetricExporterProvider.java:71)
...
...
Caused by: java.io.IOException: Keystore was tampered with, or password was incorrect
at java.base/sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:813)
at java.base/sun.security.util.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:242)
at java.base/java.security.KeyStore.load(KeyStore.java:1473)
at java.base/sun.security.ssl.TrustStoreManager$TrustAnchorManager.loadKeyStore(TrustStoreManager.java:390)
at java.base/sun.security.ssl.TrustStoreManager$TrustAnchorManager.getTrustedCerts(TrustStoreManager.java:336)
at java.base/sun.security.ssl.TrustStoreManager.getTrustedCerts(TrustStoreManager.java:57)
at java.base/sun.security.ssl.TrustManagerFactoryImpl.engineInit(TrustManagerFactoryImpl.java:49)
... 43 more
Caused by: java.security.UnrecoverableKeyException: Password verification failed
at java.base/sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:811)
... 49 more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tylerbenson @jack-berg What you are thinking about this PR? Is there anything I can add/fix this PR? Or the Do changes make sense you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm approving this, but not for the reasons discussed in the PR description. As discussed in this comment chain, this isn't a reliable solution for the problem described. And folks shouldn't depend on or expect the SDK to jump through hoops to add a bandaid without addressing the underlying issue in a general way.
Accepting this because PR description aside, its a innocuous change with a minor improvement in perf for a narrow slice of users.
@serkan-ozal can you update with latest changes from main to fix the build? Thanks. |
… HTTP protocol is used for exporting
957b6d4
to
dacf5b9
Compare
@jack-berg Thanks. Just rebased from |
Even though plain HTTP protocol is used for exporting, still security components (SSLContext, TrustManager, KeyManager) are initialized. Most of the time, this is not a problem other than adding some minor startup overhead.
But there might be some cases that security components should not be initialized initially by OTEL Java agent. Here is a case which effects one of our customers:
KeyStoreSpi
implementation.KeyStoreSpi
implementation, because customKeyStoreSpi
implementation is not visible to system classloader, but to Spring's custom Launcher classloader.Also there is a similarity between the case I have mentioned above and this issue open-telemetry/opentelemetry-java-instrumentation#10921. They are not the same errors, but caused by the initialization of some JDK level SPIs at system classloader level instead of Launcher classloader for Spring applications.