
DefaultTracerProvider set permanently in global var, does not look up user-configurable provider #1159

Closed
jrozentur opened this issue Sep 24, 2020 · 19 comments · Fixed by #1726

@jrozentur commented on Sep 24, 2020:

Describe your environment
Python 3.7, SDK 0.12b0
Steps to reproduce
1. trace.get_tracer(name)
2. trace.get_tracer_provider().add_span_processor(OITracing._span_processor)
   => AttributeError: 'DefaultTracerProvider' object has no attribute 'add_span_processor'
3. OK, then:
   trace.set_tracer_provider(TracerProvider(sampler=trace_api.sampling.ALWAYS_ON))
   => WARNING: opentelemetry.trace:Overriding of current TracerProvider is not allowed
This situation is not recoverable.
What is the expected behavior?

What is the actual behavior?
trace.get_tracer() calls get_tracer_provider(), which sets the global _TRACER_PROVIDER to a DefaultTracerProvider; once set, it cannot be overwritten.
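
A minimal sketch of the ordering problem described above (module paths follow the 0.12b0-era packages and are assumptions on my part):

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

# Some library (or a framework import) grabs a tracer before the application
# has configured anything. Per this issue, the call goes through
# get_tracer_provider(), which caches a DefaultTracerProvider globally.
tracer = trace.get_tracer(__name__)

# Later, application startup tries to install the real SDK provider...
trace.set_tracer_provider(TracerProvider())
# ...but the global is already set, so this only logs:
#   WARNING opentelemetry.trace: Overriding of current TracerProvider is not allowed

# And the cached default provider has no add_span_processor():
trace.get_tracer_provider().add_span_processor(...)  # raises AttributeError
```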

Additional context
The getter should not set the global, especially when it is not possible to recover and provide the right setting afterwards. The order of code initialization can be tricky to control; for example, an external package calls get_tracer() before I have a chance to initialize things properly.
I looked at how _load_provider() works, but it is only used to load default_tracer_provider, which it naturally finds in opentelemetry-sdk. Since it takes the first entry (using next()) on the generator, there is no way to override that either, so what is the point of this configuration lookup if you know it will be DefaultTracerProvider? I thought it would look up a 'tracer_provider' that can be configured in the user project; naming the method load_provider() suggests that. https://github.com/open-telemetry/opentelemetry-python/blob/master/opentelemetry-api/src/opentelemetry/util/__init__.py#L51

jrozentur added the bug label on Sep 24, 2020
srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this issue Nov 1, 2020
@aabmass (Member) commented on Dec 3, 2020:

> The getter should not set the global, especially when it is not possible to recover and provide the right setting afterwards.

Just some background on why it works like this. Library authors should instrument their code with the opentelemetry-api package alone, without taking a dependency on opentelemetry-sdk. If a user of that library does not set the global tracer provider, the DefaultTracerProvider will provide no-op spans that don't do anything, since the user isn't collecting telemetry. This is core to the design of OpenTelemetry, not just the Python implementation.
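
As a small illustration of that split (a sketch; the function and span names are made up):

```python
# library code: depends only on opentelemetry-api, never on opentelemetry-sdk
from opentelemetry import trace


def handle_request(payload):
    # Resolve the tracer from the global provider at call time. If the
    # application never installs an SDK provider, this returns a no-op tracer
    # and the span below records nothing.
    tracer = trace.get_tracer(__name__)
    with tracer.start_as_current_span("handle_request"):
        return payload  # placeholder for the library's real work
```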

> The order of code initialization can be tricky to control; for example, an external package calls get_tracer() before I have a chance to initialize things properly.

Definitely, it can be tricky. That is the intention behind logging a warning (#781), so that telemetry isn't silently dropped. It should be possible to call set_tracer_provider() before any of your other imports, though?
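
The ordering that implies for the application entry point, sketched (the exporter/processor class names follow the SDK of this era and should be treated as assumptions):

```python
# main.py: configure the global provider *before* importing anything that
# might call trace.get_tracer() at import time.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
    ConsoleSpanExporter,
    SimpleExportSpanProcessor,
)

provider = TracerProvider()
provider.add_span_processor(SimpleExportSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

# Only now import the rest of the application and any instrumented libraries.
import myapp  # noqa: E402  (hypothetical application module)
```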

> Since it takes the first entry (using next()) on the generator, there is no way to override that either

That is a good point; I'm not too familiar with how this should work. Someone else could speak to this.

Also, I just wanted to say that you don't have to use the globals if you don't want to. You can pass the TracerProvider around or keep it as a constant somewhere. The downside is that any code instrumented to use the global one won't collect telemetry.
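
A minimal sketch of that non-global option (passing the provider around explicitly; class names assumed as above):

```python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleExportSpanProcessor

# Keep the provider as an ordinary object instead of installing it globally.
provider = TracerProvider()
provider.add_span_processor(SimpleExportSpanProcessor(ConsoleSpanExporter()))


def do_work(tracer_provider):
    # Callers receive the provider explicitly; nothing touches the global one.
    tracer = tracer_provider.get_tracer(__name__)
    with tracer.start_as_current_span("do-work"):
        pass


do_work(provider)
# Code instrumented against the global API (trace.get_tracer) will not see
# this provider and therefore won't record telemetry.
```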

@owais (Contributor) commented on Dec 3, 2020:

Dup: #1276

@lzchen (Contributor) commented on Dec 3, 2020:

@owais
Should we keep this one and delete the older one?

@owais (Contributor) commented on Dec 3, 2020:

Sure. I just wanted to share it because it showed a real-world example of where import order is not necessarily in the users' control.

@aabmass (Member) commented on Dec 3, 2020:

@owais, from the other issue, I had a question about the django app:

> Any idea why this happens before the setup in wsgi.py? I thought that would be the first thing to run. It would be good to understand the cases where folks are running into this.

aabmass added the discussion label on Dec 9, 2020
@aabmass (Member) commented on Dec 9, 2020:

We discussed this in the SIG, and I think we should continue the discussion before deciding how to move forward. Some options:

  1. Keep the current code, or take it further and throw an exception instead of a warning.
  2. Proxying solution, which @anton-ryzhov has prototyped in #1445 ("Allow to override global tracer_provider after tracers creation") 😃
    • Spans/measurements recorded before the exporter is added will be dropped.
  3. Use an environment variable to specify an entry-point TracerProvider.
    • We already have entry points to load the provider, but the lookup currently just takes the first one, which isn't too useful because it always gives the API's default provider (see the sketch after this list):
      Configuration().get(
          provider.upper(), "default_{}".format(provider),
      )
    • This helps when the user doesn't have control over the initialization of their program/imports.
    • When the user's code runs, they can add SpanProcessors (or start metric pipelines).
    • Spans/measurements recorded before the exporter is added will be dropped.

Option 1 is the only way to prevent spans/metrics from being dropped, because it forces the user to be explicit.
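
For reference, the entry-point lookup mentioned in option 3 looks roughly like this (paraphrased from memory rather than copied from the repo, so treat the details as approximate):

```python
from pkg_resources import iter_entry_points

from opentelemetry.configuration import Configuration


def _load_provider(provider: str):
    # Group name, e.g. "opentelemetry_tracer_provider"; the entry-point *name*
    # comes from Configuration (i.e. the OTEL_PYTHON_* environment variables),
    # falling back to the API's "default_tracer_provider".
    entry_point = next(
        iter_entry_points(
            "opentelemetry_{}".format(provider),
            name=Configuration().get(
                provider.upper(), "default_{}".format(provider)
            ),
        )
    )
    return entry_point.load()()
```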

@anton-ryzhov (Contributor) commented:

> We already have entry points to load the provider, but the lookup currently just takes the first one, which isn't too useful because it always gives the API's default provider.

All entry points [should] have different names, and the user may specify the desired one before imports.

But it's not possible to decide on components at a later step (e.g. after loading the configuration from an external source).

@aabmass (Member) commented on Dec 10, 2020:

Continuing the discussion with @anton-ryzhov from #1445 here:

> Anyway, the rule that some custom code must be executed at the very beginning, even before imports, to allow configuration looks like an unexpected, strange, and unobvious requirement.

I agree it's not obvious and should be better documented. We want instrumentation to be as easy as possible for users. To me, it's not that strange a requirement for an instrumentation library to do setup at the beginning, before other things happen; I would rather folks explicitly set what they want than lose traces/metrics because things ran in an unexpected order or an import took longer than expected. I'm worried the proxy approach might encourage this. Logging is also upcoming in OTel, and we don't want to lose startup logs.

@HeinrichHartmann commented:

> To me, it's not that strange a requirement for an instrumentation library to do setup at the beginning, before other things happen; I would rather folks explicitly set what they want than lose traces/metrics because things ran in an unexpected order or an import took longer than expected.

I see a trade-off here between:

  1. Ease of use (more magic)
  2. Principle of least surprise (less magic)

People may fall on different ends here, and I see a strong emphasis on (1) in the overall project (which makes sense).

For adoption of OTel at our company, I have to balance this against developers' expectation that instrumentation libraries are safe to use and predictable. For example, we have the following policies, which we enforce in the OTel "distribution" we release for internal use:

  • Auto-instrumentation MUST be explicitly applied to a module or an object (app = flask(...); otel.auto_instrument(app))
  • Auto-instrumentation MUST NOT hook into the GC system (saw this somewhere)

In my experience, the developer expectation is that initialization and object creation start at the main() call and are not a side effect of an import. I would not expect applications to emit spans before main is called. If spans were created before the instrumentation is initialized, I would expect those spans to be dropped.

So my current hunch is that we will need to defer initialization in some way (e.g. via option 2) for our internal use. Of course, we would appreciate the OTel project adopting a similar solution, and we are happy to discuss and contribute. However, we trust your judgement when it comes to how this project best serves the whole community, and we fully understand if you arrive at a different conclusion here.

@Oberon00 (Member) commented:

> The getter should not set the global, especially when it is not possible to recover and provide the right setting afterwards. The order of code initialization can be tricky to control; for example, an external package calls get_tracer() before I have a chance to initialize things properly.

Interestingly, while the global tracer handling code has been completely rewritten since then, this seems to be the same problem as #45.

@aabmass (Member) commented on Dec 21, 2020:

FYI, opentelemetry-js has a ProxyTracerProvider in their @opentelemetry/api package.

I wonder, if both Python and JS go ahead with this approach, whether it should be in the Tracing API spec (which is frozen)? In the last SIG, we discussed that opentelemetry.configuration.Configuration should be removed from our API package since it isn't "stable" with regard to the spec. Thoughts?

@srikanthccv (Member) commented:

> Since it takes the first entry (using next()) on the generator, there is no way to override that either, so what is the point of this configuration lookup if you know it will be DefaultTracerProvider? I thought it would look up a 'tracer_provider' that can be configured in the user project; naming the method load_provider() suggests that.

This indeed does what you expect it to do. It takes the first entry from the entry-point group matching name, but the name is configurable in the user project. If the name is not set, it defaults to default_tracer_provider. If you have a tricky code-initialization order, you can set the environment variable OTEL_PYTHON_TRACER_PROVIDER to sdk_tracer_provider. If you rely on some other implementation of the API instead of opentelemetry-sdk, set OTEL_PYTHON_TRACER_PROVIDER to whatever entry point that package provides.

While this issue is about the tracer, I think the same applies to the meter. The meter provider can be set in the same way: OTEL_PYTHON_METER_PROVIDER=sdk_meter_provider.
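
Sketched end to end, the environment-variable route looks like this (the exact processor/exporter class names for this SDK version are assumptions):

```python
import os

# Must be in the environment before the first get_tracer()/get_tracer_provider()
# call anywhere in the process, e.g. set by the launcher or at the very top of
# the entry module.
os.environ["OTEL_PYTHON_TRACER_PROVIDER"] = "sdk_tracer_provider"

from opentelemetry import trace
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleExportSpanProcessor

# Even if a library called trace.get_tracer() during import, the global
# provider is now the SDK TracerProvider, so a processor can be attached
# later, once the application's configuration has been loaded.
trace.get_tracer_provider().add_span_processor(
    SimpleExportSpanProcessor(ConsoleSpanExporter())
)
```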

cc // @aabmass

@aabmass (Member) commented on Jan 11, 2021:

@lonewolf3739 is right; I missed these env vars before, and this seems like a nice, simple solution.

@anton-ryzhov @HeinrichHartmann does setting that environment variable work for your use case? The correct SDK provider will be set and you can use trace.get_tracer_provider().add_span_processor() to add your exporter dynamically after the configuration is loaded.

@anton-ryzhov (Contributor) commented:

I doubt that setting an environment variable can be considered an easy-to-use and user-friendly (user = developer who imported this library) solution. For prod, in Docker for example, it's fine and usual to pass config as environment variables.

But that's a weird requirement for a developer's environment. "If you want to use this lib, add this to your bash_profile and reboot": is that how it is supposed to be?

In my opinion it's still better to hardcode the value (with an explanation and an apologetic comment) on the first line of the top-level __init__.py; that at least applies the same value in all environments.
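
For illustration, that hardcoded variant could be as small as this (the package is hypothetical; the entry-point name comes from the comment above):

```python
# mypackage/__init__.py
# Apologetic hack: this must run before anything imports opentelemetry,
# because the entry-point name is only read when the global provider is
# first resolved.
import os

os.environ.setdefault("OTEL_PYTHON_TRACER_PROVIDER", "sdk_tracer_provider")
```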

And is all of that only to select the one and only implementation? Could the SDK maybe be the default if it is installed, falling back to Default* if the SDK can't be imported?

We had a discussion today in another PR about what may surprise users. A set that may be called only once, and a get that internally calls that one-shot set: that is what surprised me a lot.

@anton-ryzhov (Contributor) commented:

Hold on…

I just tried your suggestion, but the TracerProvider constructor takes parameters such as sampler and resource that can't be changed later on; they are immutable and wired into all created Tracers.

I can add a span processor later at runtime, but what about these settings?

They can only be set before the first import → they can't be dynamically loaded from an arbitrary source → they have to be hardcoded?

@srikanthccv (Member) commented:

It is very under-documented, but just letting you know that the sampler and resource attributes can also be configured with environment variables. There is an open PR adding support for OTEL_TRACE_SAMPLER and OTEL_TRACE_SAMPLER_ARG. OTEL_RESOURCE_ATTRIBUTES is already supported and takes a key-value mapping string in the format key1=value1,key2=value2. Link to all SDK environment variables in the spec.
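
A small sketch of the OTEL_RESOURCE_ATTRIBUTES format (the attribute values are made up, and I'm assuming an SDK version whose Resource.create() reads this variable):

```python
import os

# key-value mapping string in the format key1=value1,key2=value2
os.environ["OTEL_RESOURCE_ATTRIBUTES"] = "service.name=checkout,deployment.environment=staging"

from opentelemetry.sdk.resources import Resource

# Resource.create() merges the attributes parsed from OTEL_RESOURCE_ATTRIBUTES
# (behaviour assumed for SDK versions of this era).
print(Resource.create({}).attributes)
```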

@anton-ryzhov (Contributor) commented:

So if I need to get these values from some config storage, I need to run a "pre-executable" process, fetch all these values, populate the environment, and then exec the "real" process? Is that the recommended way?

@aabmass (Member) commented on Jan 14, 2021:

We discussed this in the 2021-01-14 SIG meeting and agreed to go with the proxy approach in #1445. It will be a little more complicated for metrics (we must have ProxyCounter, ProxyValueObserver, etc.), but it should be doable. Thanks for the discussion, everyone!
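
For context, the core idea of that proxy approach, as a conceptual sketch (this is not the implementation eventually merged in #1726; class and method names here are illustrative):

```python
from opentelemetry.trace import DefaultTracerProvider, TracerProvider


class ProxyTracerProvider(TracerProvider):
    """Stand-in returned by get_tracer_provider() before the real provider is set."""

    def __init__(self):
        # No-op fallback until an SDK provider is installed.
        self._real = DefaultTracerProvider()

    def on_set_tracer_provider(self, provider):
        # Called when the real provider is installed; tracers handed out
        # earlier start delegating to it from this point on.
        self._real = provider

    def get_tracer(self, name, version=None):
        return _ProxyTracer(self, name, version)


class _ProxyTracer:
    def __init__(self, proxy, name, version):
        self._proxy = proxy
        self._name = name
        self._version = version

    def start_as_current_span(self, *args, **kwargs):
        # Re-resolve on every call, so spans started after the real provider
        # is installed are recorded; earlier ones are simply no-ops.
        real_tracer = self._proxy._real.get_tracer(self._name, self._version)
        return real_tracer.start_as_current_span(*args, **kwargs)
```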

@owais (Contributor) commented on Apr 3, 2021:

> Any idea why this happens before the setup in wsgi.py? I thought that would be the first thing to run. It would be good to understand the cases where folks are running into this.

@aabmass If I remember correctly, this happens because when we run the service with manage.py, Django imports the urls, views, models, etc. to inspect them internally before running the extra code we add.

Sorry for being a little late with the reply :D
