Migrate from gRPC to armeria for testing agent in memory exporter #5314
...er/src/main/java/io/opentelemetry/javaagent/testing/exporter/AgentTestingLogsCustomizer.java
Arrays.asList(
    AgentTestingTracingCustomizer.spanProcessor.forceFlush(),
    AgentTestingLogsCustomizer.logProcessor.forceFlush());
CompletableResultCode.ofAll(results).join(10, TimeUnit.SECONDS);
WDYT about using OpenTelemetrySdkAccess instead? It flushes the meter provider too
Oops, realized that we don't want to flush metrics here anyway (there's really no such thing as a pending metric export; all exports happen at arbitrary times and are valid)
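For readers unfamiliar with the pattern in the snippet under review (collect several flush results, then wait on all of them with one timeout), here is a rough JDK-only analogue. It is a sketch, not the PR's code: `CompletableFuture` stands in for `CompletableResultCode`, and the `FlushAll` class and `flushAll` method are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: a JDK-only stand-in for
// CompletableResultCode.ofAll(results).join(10, TimeUnit.SECONDS).
final class FlushAll {
  private FlushAll() {}

  /** Returns true only if every flush completed normally within the timeout. */
  static boolean flushAll(List<CompletableFuture<Void>> flushes, long timeout, TimeUnit unit) {
    // allOf completes once every input future has completed.
    CompletableFuture<Void> all =
        CompletableFuture.allOf(flushes.toArray(new CompletableFuture<?>[0]));
    try {
      all.get(timeout, unit);
      return true;
    } catch (TimeoutException | ExecutionException e) {
      // At least one flush was too slow or failed.
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

Only spans and logs are joined in this spirit in the thread above; metrics are deliberately left out because their exports are not tied to a flush.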
grpc without grpc 🤯
This PR seems to reliably cause a core dump on Java 15 OpenJ9. Does anyone have an idea what could cause it? https://github.com/open-telemetry/opentelemetry-java-instrumentation/runs/5104586391?check_suite_focus=true @laurit Maybe you've seen it before?
ya this is weird. it seems to consistently segfault on the dubbo and camel tests, and none of the others. and there's no consistency in the "current thread" in the core dumps between runs, which is normally what i'd search on to see if it's a known bug, and there are a lot of openj9 segfault issues
even if we could run integration tests on Java 17 today, I don't think we could run them on openj9 17 due to #5051 (comment) and https://adoptopenjdk.net/archive.html?variant=openjdk8&jvmVariant=openj9
maybe it's best to go with the
i'm also ok with skipping those two tests on openj9 15 if we want to move forward with this PR
…nstrumentation into testing-agent-armeria
I played around with this a bit. Firstly, reporting this to openj9 is probably futile. Afaik they build all their runtimes from the same code base, and if the bug was still there then there is a good chance 8 & 11 would also fail similarly.
autoConfigurationCustomizer.addMeterProviderCustomizer(
    (meterProvider, config) ->
        meterProvider.registerMetricReader(
            PeriodicMetricReader.builder(AgentTestingExporterFactory.metricExporter)
                .setInterval(Duration.ofMillis(300))
                .newMetricReaderFactory()));
is commented out or when in
// If noop OpenTelemetry is enabled, autoConfiguredSdk will be null and AgentListeners are not
// called
AutoConfiguredOpenTelemetrySdk autoConfiguredSdk = null;
if (config.getBoolean(JAVAAGENT_NOOP_CONFIG, false)) {
  logger.info("Tracing and metrics are disabled because noop is enabled.");
  GlobalOpenTelemetry.set(NoopOpenTelemetry.getInstance());
} else {
  autoConfiguredSdk = installOpenTelemetrySdk(config);
}
if (autoConfiguredSdk != null) {
  runBeforeAgentListeners(agentListeners, config, autoConfiguredSdk);
}
is moved after byte-buddy instrumenter is set up or when in
I don't think we can do that either - that could cause some instrumentations to initialize with the no-op
Thanks @laurit for the great digging! Delayed the metric export until after the agent initializes
}

@SuppressWarnings("ImmutableEnumChecker")
private enum StartableMetricReader implements MetricReaderFactory, MetricReader {
can you add a brief comment explaining why this is needed (or just pointing to the PR discussion)?
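As background for the requested comment: the idea discussed in this thread is to register the metric reader during configuration but not begin periodic export until the agent has finished initializing. A minimal JDK-only sketch of that "register now, start later" shape follows. It is not the PR's `StartableMetricReader`; the class name `DeferredPeriodicTask` is hypothetical, and a plain `ScheduledExecutorService` is assumed in place of the SDK's `PeriodicMetricReader`.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a metric reader whose periodic export is deferred.
final class DeferredPeriodicTask {
  private final Runnable export;
  private final Duration interval;
  private ScheduledExecutorService scheduler; // stays null until start() is called

  DeferredPeriodicTask(Runnable export, Duration interval) {
    this.export = export;
    this.interval = interval;
  }

  /** Begins periodic export. Only the first call has an effect. */
  synchronized void start() {
    if (scheduler != null) {
      return;
    }
    // No thread exists before this point, so nothing can run during agent startup.
    scheduler =
        Executors.newSingleThreadScheduledExecutor(
            runnable -> {
              Thread thread = new Thread(runnable, "deferred-export");
              thread.setDaemon(true);
              return thread;
            });
    long millis = interval.toMillis();
    scheduler.scheduleAtFixedRate(export, millis, millis, TimeUnit.MILLISECONDS);
  }

  /** Stops periodic export if it was started. */
  synchronized void shutdown() {
    if (scheduler != null) {
      scheduler.shutdownNow();
    }
  }
}
```

In this sketch the caller would construct the task during configuration (for example with the 300 ms interval seen earlier in the thread) and call `start()` from whatever hook runs once agent initialization is complete.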
For reference, I was expecting test slowdown with this PR going from in-memory transport to using localhost network traffic. After getting some builds through, dunno if I like it. Sent #5332 instead
I didn't say it would be easy :) There wouldn't be too many instrumentations that are affected; these could use an extra agent-started check and bail out when the sdk isn't ready yet. Perhaps this agent-started check could even be baked into the instrumenter api somehow? For another case where running the initial retransformation concurrently with sdk code started on a background thread produces strangeness, see #4697. Basically it boils down to replacing one set of problems with another, hopefully more manageable, set of problems.
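To make the "agent started check" suggestion concrete, here is a small sketch of what such a guard could look like. Everything in it is hypothetical: `AgentStartedGate`, `markStarted`, and `ifStarted` are illustrative names and do not exist in the instrumenter API.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical gate; not part of the instrumenter API.
final class AgentStartedGate {
  // volatile so a flag set by the startup thread is visible to instrumented threads.
  private static volatile boolean started;

  private AgentStartedGate() {}

  /** Called once by agent startup code after the SDK has been installed. */
  static void markStarted() {
    started = true;
  }

  static boolean isStarted() {
    return started;
  }

  /** Runs the work only once the SDK is ready; otherwise bails out with an empty result. */
  static <T> Optional<T> ifStarted(Supplier<T> work) {
    if (!started) {
      return Optional.empty();
    }
    return Optional.ofNullable(work.get());
  }
}
```

An affected instrumentation would wrap its SDK-dependent work in `ifStarted(...)` and skip it when the result is empty, which trades missed early telemetry for not initializing against the no-op.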
There is currently a big gap between the published agent and our tests because for historical reasons we include gRPC in the testing exporter. This switches it to Armeria to avoid the gRPC dependency, which automatically switches the tests to using the okhttp export codepath.