Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly construct instance of LateBoundBatchSpanProcessor #34545

Merged
merged 3 commits into from
Jul 19, 2023

Conversation

geoand
Copy link
Contributor

@geoand geoand commented Jul 5, 2023

This is now done entirely in a CDI producer method which is then fully initialized when OpenTelemetry autoconfigures itself

Fixes: #29987 (which with newer Quarkus versions is even more problematic)

Detailed explanation:

  • The first commit fixes the case where the DataSource cannot properly be constructed because the OpenTelemetry has not yet been created by CDI. This could happen because there was not declaration of the need for that bean when constructing the DataSource bean, only programmatic lookup.
  • The second commits actually fixes the issue with the missing delegate in a pretty straightforward way: We move the creation of LateBoundBatchSpanProcessor to a synthetic bean - prior to this commit the LateBoundBatchSpanProcessor bean was created from a CDI producer but was then "updated" by a recorder. These two operations are obviously not "atomic", meaning that the OpenTelemetry bean that consumed LateBoundBatchSpanProcessor would see the bean in the intermediate state - where it was created but not updated.
    • This commit also includes the removal of various intermediate build items whose only purpose was to pass data between the various steps. These are no longer necessary, since everything can be accomplished with the combination of CDI and the BeanContainer in recorder methds.
    • Furthermore, this commit makes sure that we no longer use Arc.container()... in recorder methods as these are extremely sensitive to timing issues - by utilizing synthetic beans and the BeanContainer in recorder methods we overcome this problem.
  • The third commit builds on the second one to make the configuration of the SpanProcessors of OpenTelemetry even more reasonable. We essentially only include LateBoundBatchSpanProcessor in the OpenTelemetry bean if a delegate has been set.

Future work: The creation of the OpenTelemetry can be done via a synthetic bean as this allows us complete control over the injection points and does not suffer from any bean creating timing issues (assuming of course the creation is done properly).
This would essentially take what is not done in OpenTelemetryProducer and move it into a recorder method that includes all the creation logic (including the logic that currently resides in the synthetic bean for LateBoundBatchSpanProcessor)

@quarkus-bot
Copy link

quarkus-bot bot commented Jul 5, 2023

/cc @brunobat (opentelemetry), @radcortez (opentelemetry)

@radcortez
Copy link
Member

So what we are doing now is initializing everything at runtime, correct? (instead of the split we had before, where we created a few things at static init).

@geoand
Copy link
Contributor Author

geoand commented Jul 5, 2023

Correct, the bean initialization is not forced at build time anymore

@brunobat
Copy link
Contributor

brunobat commented Jul 5, 2023

This makes everything much simpler. Not sure about the increased startup time... Most likely not noticeable.
I need to check possible implications on the OpenTelemetryProducer, but I won't have time today.

@geoand
Copy link
Contributor Author

geoand commented Jul 5, 2023

No rush!

@quarkus-bot

This comment has been minimized.

@gsmet
Copy link
Member

gsmet commented Jul 5, 2023

If it gets merged, should it be backported?

@geoand
Copy link
Contributor Author

geoand commented Jul 5, 2023

I dunno, I'm on the fence...

@radcortez
Copy link
Member

I think we can hold this until 3.3.

Copy link
Contributor

@brunobat brunobat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to create a new test before approving this PR.

@brunobat
Copy link
Contributor

brunobat commented Jul 6, 2023

@radcortez @gsmet I'd like this back-ported to 3.2.* because these changes affect core parts of the extension and 3.2.* will run for a long time, potentially causing complex merges in the future.

@geoand
Copy link
Contributor Author

geoand commented Jul 10, 2023

If we want this backported, we probably should do it as soon as possible instead of waiting for late 3.2.z releases

Copy link
Contributor

@brunobat brunobat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's be pragmatic then.
I'll not be able to do this while at the RivieraDev conf.
Any iD generator issue can be dealt in this new issue: #34668

@geoand
Copy link
Contributor Author

geoand commented Jul 11, 2023

I'm going to have to redo this one unfortunately since there are plenty of conflicts...

Furthermore, this will conflict with #34647

@geoand
Copy link
Contributor Author

geoand commented Jul 11, 2023

Here is my proposal for this one:

I redo it after #34647 is in, in order to avoid additional conflicts.
For 3.2, I open a new PR specifically targeting the 3.2 branch (this exact change should apply cleanly there).

WDYT?

@radcortez
Copy link
Member

Sure.

@geoand
Copy link
Contributor Author

geoand commented Jul 18, 2023

This now turns out to be harder. I'm looking into how it can be done.

@geoand
Copy link
Contributor Author

geoand commented Jul 18, 2023

Should be working now

@geoand
Copy link
Contributor Author

geoand commented Jul 18, 2023

But there is another thing I want to improve if possible

@brunobat brunobat self-requested a review July 18, 2023 10:55
@geoand
Copy link
Contributor Author

geoand commented Jul 18, 2023

Should be good to go now

@geoand
Copy link
Contributor Author

geoand commented Jul 18, 2023

In future, I would like to move the creation of the OpenTelemetry bean out of OpenTelemetryProducer and into a synthetic bean, but I'll leave that for some other time because it's a lot of mechanical work that is not absolutely necessary at the moment

@quarkus-bot

This comment has been minimized.

@geoand
Copy link
Contributor Author

geoand commented Jul 18, 2023

Oh, I didn't know about those tests...

@geoand
Copy link
Contributor Author

geoand commented Jul 18, 2023

The JDBC telemetry stuff is pretty broken (it only worked by accident before) so I need to see what can be done...

@brunobat
Copy link
Contributor

Those tests are a bit brittle but beware that they end up finding a lot of problems.

@geoand
Copy link
Contributor Author

geoand commented Jul 18, 2023

The tests themselves are great and they indeed show a real problem here.

geoand added 3 commits July 18, 2023 17:27
This is now done entirely in a CDI producer method which is then fully initialized when OpenTelemetry autoconfigures itself

Fixes: quarkusio#29987 (which with newer Quarkus versions is even more problematic)
@geoand
Copy link
Contributor Author

geoand commented Jul 18, 2023

The JDBC failures are now fixed.

In general, there were (and probably still are) a lot of timing issues that relate to when the various CDI beans are constructed and obtained.
That's why I want to move the OpenTelemetry bean creation to a synthetic bean in the future.

@brunobat
Copy link
Contributor

Ok @geoand. The interdependencies with vert.x in InstrumentationRecorder.createTraces(..)
Will an OTel Producer as a synthetic bean work with dev mode reload?

@geoand
Copy link
Contributor Author

geoand commented Jul 18, 2023

Yes it would.

But it's not something we need to do right now - it can wait for the other stuff I want to do around telemetry

@quarkus-bot
Copy link

quarkus-bot bot commented Jul 18, 2023

✔️ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

@geoand
Copy link
Contributor Author

geoand commented Jul 18, 2023

All tests pass so this is good from my end.

Copy link
Contributor

@brunobat brunobat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @geoand !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OTel LateBoundSampler logs warning if using Liquibase and JDBC instrumentation
4 participants