
DefaultTracerProvider set permanently in global var, does not look up user-configurable provider #1159

Closed
jrozentur opened this issue Sep 24, 2020 · 19 comments · Fixed by #1726

@jrozentur commented on Sep 24, 2020:

Describe your environment
Python 3.7, SDK 0.12b0
Steps to reproduce
1. trace.get_tracer(name)
2. trace.get_tracer_provider().add_span_processor(OITracing._span_processor)
   => AttributeError: 'DefaultTracerProvider' object has no attribute 'add_span_processor'
3. OK, then:
   trace.set_tracer_provider(TracerProvider(sampler=trace_api.sampling.ALWAYS_ON))
   => WARNING: opentelemetry.trace:Overriding of current TracerProvider is not allowed
This situation is not recoverable.
What is the expected behavior?

What is the actual behavior?
trace.get_tracer() calls get_tracer_provider(), which sets the global _TRACER_PROVIDER to a DefaultTracerProvider; once set, it cannot be overwritten.
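
A minimal sketch of the ordering problem described above (module paths follow the 0.12b0-era packages and are assumptions on my part):

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

# Some library (or a framework import) grabs a tracer before the application
# has configured anything. Per this issue, the call goes through
# get_tracer_provider(), which caches a DefaultTracerProvider globally.
tracer = trace.get_tracer(__name__)

# Later, application startup tries to install the real SDK provider...
trace.set_tracer_provider(TracerProvider())
# ...but the global is already set, so this only logs:
#   WARNING opentelemetry.trace: Overriding of current TracerProvider is not allowed

# And the cached default provider has no add_span_processor():
trace.get_tracer_provider().add_span_processor(...)  # raises AttributeError
```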

Additional context
The getter should not set the global, especially when it is not possible to recover and provide the right setting afterwards. The order of code initialization can be tricky to control; for example, an external package calls get_tracer() before I have a chance to initialize things properly.
I looked at how _load_provider() works, but it is only used to load default_tracer_provider, which it naturally finds in opentelemetry-sdk. Since it takes the first entry (using next()) on the generator, there is no way to override that either, so what is the point of this configuration lookup if you know it will be DefaultTracerProvider? I thought it would look up a 'tracer_provider' that can be configured in the user project; naming the method load_provider() suggests that. https://github.com/open-telemetry/opentelemetry-python/blob/master/opentelemetry-api/src/opentelemetry/util/__init__.py#L51

jrozentur added the bug label on Sep 24, 2020
srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this issue Nov 1, 2020
@aabmass (Member) commented on Dec 3, 2020:

> The getter should not set the global, especially when it is not possible to recover and provide the right setting afterwards.

Just some background on why it works like this. Library authors should instrument their code with the opentelemetry-api package alone, without taking a dependency on opentelemetry-sdk. If a user of that library does not set the global tracer provider, the DefaultTracerProvider will provide no-op spans that don't do anything, since the user isn't collecting telemetry. This is core to the design of OpenTelemetry, not just the Python implementation.
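
As a small illustration of that split (a sketch; the function and span names are made up):

```python
# library code: depends only on opentelemetry-api, never on opentelemetry-sdk
from opentelemetry import trace


def handle_request(payload):
    # Resolve the tracer from the global provider at call time. If the
    # application never installs an SDK provider, this returns a no-op tracer
    # and the span below records nothing.
    tracer = trace.get_tracer(__name__)
    with tracer.start_as_current_span("handle_request"):
        return payload  # placeholder for the library's real work
```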

> The order of code initialization can be tricky to control; for example, an external package calls get_tracer() before I have a chance to initialize things properly.

Definitely, it can be tricky. That is the intention behind logging a warning (#781), so that telemetry isn't silently dropped. It should be possible to call set_tracer_provider() before any of your other imports, though?
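
The ordering that implies for the application entry point, sketched (the exporter/processor class names follow the SDK of this era and should be treated as assumptions):

```python
# main.py: configure the global provider *before* importing anything that
# might call trace.get_tracer() at import time.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
    ConsoleSpanExporter,
    SimpleExportSpanProcessor,
)

provider = TracerProvider()
provider.add_span_processor(SimpleExportSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

# Only now import the rest of the application and any instrumented libraries.
import myapp  # noqa: E402  (hypothetical application module)
```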

> Since it takes the first entry (using next()) on the generator, there is no way to override that either

That is a good point; I'm not too familiar with how this should work. Someone else could speak to this.

Also, I just wanted to say that you don't have to use the globals if you don't want to. You can pass the TracerProvider around or keep it as a constant somewhere. The downside is that any code instrumented to use the global one won't collect telemetry.
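
A minimal sketch of that non-global option (passing the provider around explicitly; class names assumed as above):

```python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleExportSpanProcessor

# Keep the provider as an ordinary object instead of installing it globally.
provider = TracerProvider()
provider.add_span_processor(SimpleExportSpanProcessor(ConsoleSpanExporter()))


def do_work(tracer_provider):
    # Callers receive the provider explicitly; nothing touches the global one.
    tracer = tracer_provider.get_tracer(__name__)
    with tracer.start_as_current_span("do-work"):
        pass


do_work(provider)
# Code instrumented against the global API (trace.get_tracer) will not see
# this provider and therefore won't record telemetry.
```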

@owais (Contributor) commented on Dec 3, 2020:

Dup: #1276

@lzchen (Contributor) commented on Dec 3, 2020:

@owais
Should we keep this one and delete the older one?

@owais (Contributor) commented on Dec 3, 2020:

Sure. I just wanted to share it because it showed a real-world example of where import order is not necessarily in the users' control.

@aabmass (Member) commented on Dec 3, 2020:

@owais, from the other issue, I had a question about the django app:

> Any idea why this happens before the setup in wsgi.py? I thought that would be the first thing to run. It would be good to understand the cases where folks are running into this.

aabmass added the discussion label on Dec 9, 2020
@aabmass (Member) commented on Dec 9, 2020:

We discussed this in the SIG, and I think we should continue the discussion before deciding how to move forward. Some options:

  1. Keep the current code, or take it further and throw an exception instead of a warning.
  2. Proxying solution, which @anton-ryzhov has prototyped in #1445 ("Allow to override global tracer_provider after tracers creation") 😃
    • Spans/measurements recorded before the exporter is added will be dropped.
  3. Use an environment variable to specify an entry-point TracerProvider.
    • We already have entry points to load the provider, but the lookup currently just takes the first one, which isn't too useful because it always gives the API's default provider (see the sketch after this list):
      Configuration().get(
          provider.upper(), "default_{}".format(provider),
      )
    • This helps when the user doesn't have control over the initialization of their program/imports.
    • When the user's code runs, they can add SpanProcessors (or start metric pipelines).
    • Spans/measurements recorded before the exporter is added will be dropped.

Option 1 is the only way to prevent spans/metrics from being dropped, because it forces the user to be explicit.
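
For reference, the entry-point lookup mentioned in option 3 looks roughly like this (paraphrased from memory rather than copied from the repo, so treat the details as approximate):

```python
from pkg_resources import iter_entry_points

from opentelemetry.configuration import Configuration


def _load_provider(provider: str):
    # Group name, e.g. "opentelemetry_tracer_provider"; the entry-point *name*
    # comes from Configuration (i.e. the OTEL_PYTHON_* environment variables),
    # falling back to the API's "default_tracer_provider".
    entry_point = next(
        iter_entry_points(
            "opentelemetry_{}".format(provider),
            name=Configuration().get(
                provider.upper(), "default_{}".format(provider)
            ),
        )
    )
    return entry_point.load()()
```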

@anton-ryzhov (Contributor) commented:

> We already have entry points to load the provider, but the lookup currently just takes the first one, which isn't too useful because it always gives the API's default provider.

All entry points [should] have different names, and the user may specify the desired one before imports.

But it's not possible to decide on components at a later step (e.g. after loading the configuration from an external source).

@aabmass (Member) commented on Dec 10, 2020:

Continuing the discussion with @anton-ryzhov from #1445 here:

> Anyway, the rule that some custom code must be executed at the very beginning, even before imports, to allow configuration looks like an unexpected, strange, and unobvious requirement.

I agree it's not obvious and should be better documented. We want instrumentation to be as easy as possible for users. To me, it's not that strange a requirement for an instrumentation library to do setup at the beginning, before other things happen; I would rather folks explicitly set what they want than lose traces/metrics because things ran in an unexpected order or an import took longer than expected. I'm worried the proxy approach might encourage this. Logging is also upcoming in OTel, and we don't want to lose startup logs.

@HeinrichHartmann commented:

> To me, it's not that strange a requirement for an instrumentation library to do setup at the beginning, before other things happen; I would rather folks explicitly set what they want than lose traces/metrics because things ran in an unexpected order or an import took longer than expected.

I see a trade-off here between:

  1. Ease of use (more magic)
  2. Principle of least surprise (less magic)

People may fall on different ends here, and I see a strong emphasis on (1) in the overall project (which makes sense).

For adoption of OTel at our company, I have to balance this against developers' expectation that instrumentation libraries are safe to use and predictable. For example, we have the following policies, which we enforce in the OTel "distribution" we release for internal use:

  • Auto-instrumentation MUST be explicitly applied to a module or an object (app = flask(...); otel.auto_instrument(app))
  • Auto-instrumentation MUST NOT hook into the GC system (saw this somewhere)

In my experience, the developer expectation is that initialization and object creation start at the main() call and are not a side effect of an import. I would not expect applications to emit spans before main is called. If spans were created before the instrumentation is initialized, I would expect those spans to be dropped.

So my current hunch is that we will need to defer initialization in some way (e.g. via option 2) for our internal use. Of course, we would appreciate the OTel project adopting a similar solution, and we are happy to discuss and contribute. However, we trust your judgement when it comes to how this project best serves the whole community, and we fully understand if you arrive at a different conclusion here.

@Oberon00 (Member) commented:

> The getter should not set the global, especially when it is not possible to recover and provide the right setting afterwards. The order of code initialization can be tricky to control; for example, an external package calls get_tracer() before I have a chance to initialize things properly.

Interestingly, while the global tracer handling code has been completely rewritten since then, this seems to be the same problem as #45.

@aabmass (Member) commented on Dec 21, 2020:

FYI, opentelemetry-js has a ProxyTracerProvider in their @opentelemetry/api package.

I wonder, if both Python and JS go ahead with this approach, whether it should be in the Tracing API spec (which is frozen)? In the last SIG, we discussed that opentelemetry.configuration.Configuration should be removed from our API package since it isn't "stable" with regard to the spec. Thoughts?

@srikanthccv (Member) commented:

> Since it takes the first entry (using next()) on the generator, there is no way to override that either, so what is the point of this configuration lookup if you know it will be DefaultTracerProvider? I thought it would look up a 'tracer_provider' that can be configured in the user project; naming the method load_provider() suggests that.

This indeed does what you expect it to do. It takes the first entry from the entry-point group matching name, but the name is configurable in the user project. If the name is not set, it defaults to default_tracer_provider. If you have a tricky code-initialization order, you can set the environment variable OTEL_PYTHON_TRACER_PROVIDER to sdk_tracer_provider. If you rely on some other implementation of the API instead of opentelemetry-sdk, set OTEL_PYTHON_TRACER_PROVIDER to whatever entry point that package provides.

While this issue is about the tracer, I think the same applies to the meter. The meter provider can be set in the same way: OTEL_PYTHON_METER_PROVIDER=sdk_meter_provider.
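
Sketched end to end, the environment-variable route looks like this (the exact processor/exporter class names for this SDK version are assumptions):

```python
import os

# Must be in the environment before the first get_tracer()/get_tracer_provider()
# call anywhere in the process, e.g. set by the launcher or at the very top of
# the entry module.
os.environ["OTEL_PYTHON_TRACER_PROVIDER"] = "sdk_tracer_provider"

from opentelemetry import trace
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleExportSpanProcessor

# Even if a library called trace.get_tracer() during import, the global
# provider is now the SDK TracerProvider, so a processor can be attached
# later, once the application's configuration has been loaded.
trace.get_tracer_provider().add_span_processor(
    SimpleExportSpanProcessor(ConsoleSpanExporter())
)
```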

cc // @aabmass

@aabmass (Member) commented on Jan 11, 2021:

@lonewolf3739 is right; I missed these env vars before, and this seems like a nice, simple solution.

@anton-ryzhov @HeinrichHartmann does setting that environment variable work for your use case? The correct SDK provider will be set and you can use trace.get_tracer_provider().add_span_processor() to add your exporter dynamically after the configuration is loaded.

@anton-ryzhov (Contributor) commented:

I doubt that setting an environment variable can be considered an easy-to-use and user-friendly (user = developer who imported this library) solution. For prod, in Docker for example, it's fine and usual to pass config as environment variables.

But that's a weird requirement for a developer's environment. "If you want to use this lib, add this to your bash_profile and reboot": is that how it is supposed to be?

In my opinion it's still better to hardcode the value (with an explanation and an apologetic comment) on the first line of the top-level __init__.py; that at least applies the same value in all environments.
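
For illustration, that hardcoded variant could be as small as this (the package is hypothetical; the entry-point name comes from the comment above):

```python
# mypackage/__init__.py
# Apologetic hack: this must run before anything imports opentelemetry,
# because the entry-point name is only read when the global provider is
# first resolved.
import os

os.environ.setdefault("OTEL_PYTHON_TRACER_PROVIDER", "sdk_tracer_provider")
```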

And is all of that only to select the one and only implementation? Could the SDK maybe be the default if it is installed, falling back to Default* if the SDK can't be imported?

We had a discussion today in another PR about what may surprise users. A set that may be called only once, and a get that internally calls that one-shot set: that is what surprised me a lot.

@anton-ryzhov (Contributor) commented:

Hold on…

I just tried your suggestion, but the TracerProvider constructor takes parameters such as sampler and resource that can't be changed later on; they are immutable and wired into all created Tracers.

I can add a span processor later at runtime, but what about these settings?

They can only be set before the first import → they can't be dynamically loaded from an arbitrary source → they have to be hardcoded?

@srikanthccv (Member) commented:

It is very under-documented, but just letting you know that the sampler and resource attributes can also be configured with environment variables. There is an open PR adding support for OTEL_TRACE_SAMPLER and OTEL_TRACE_SAMPLER_ARG. OTEL_RESOURCE_ATTRIBUTES is already supported and takes a key-value mapping string in the format key1=value1,key2=value2. Link to all SDK environment variables in the spec.
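
A small sketch of the OTEL_RESOURCE_ATTRIBUTES format (the attribute values are made up, and I'm assuming an SDK version whose Resource.create() reads this variable):

```python
import os

# key-value mapping string in the format key1=value1,key2=value2
os.environ["OTEL_RESOURCE_ATTRIBUTES"] = "service.name=checkout,deployment.environment=staging"

from opentelemetry.sdk.resources import Resource

# Resource.create() merges the attributes parsed from OTEL_RESOURCE_ATTRIBUTES
# (behaviour assumed for SDK versions of this era).
print(Resource.create({}).attributes)
```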

@anton-ryzhov (Contributor) commented:

So if I need to get these values from some config storage, I need to run a "pre-executable" process, fetch all these values, populate the environment, and then exec the "real" process? Is that the recommended way?

@aabmass (Member) commented on Jan 14, 2021:

We discussed this in the 2021-01-14 SIG meeting and agreed to go with the proxy approach in #1445. It will be a little more complicated for metrics (we must have ProxyCounter, ProxyValueObserver, etc.), but it should be doable. Thanks for the discussion, everyone!
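
For context, the core idea of that proxy approach, as a conceptual sketch (this is not the implementation eventually merged in #1726; class and method names here are illustrative):

```python
from opentelemetry.trace import DefaultTracerProvider, TracerProvider


class ProxyTracerProvider(TracerProvider):
    """Stand-in returned by get_tracer_provider() before the real provider is set."""

    def __init__(self):
        # No-op fallback until an SDK provider is installed.
        self._real = DefaultTracerProvider()

    def on_set_tracer_provider(self, provider):
        # Called when the real provider is installed; tracers handed out
        # earlier start delegating to it from this point on.
        self._real = provider

    def get_tracer(self, name, version=None):
        return _ProxyTracer(self, name, version)


class _ProxyTracer:
    def __init__(self, proxy, name, version):
        self._proxy = proxy
        self._name = name
        self._version = version

    def start_as_current_span(self, *args, **kwargs):
        # Re-resolve on every call, so spans started after the real provider
        # is installed are recorded; earlier ones are simply no-ops.
        real_tracer = self._proxy._real.get_tracer(self._name, self._version)
        return real_tracer.start_as_current_span(*args, **kwargs)
```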

@owais (Contributor) commented on Apr 3, 2021:

> Any idea why this happens before the setup in wsgi.py? I thought that would be the first thing to run. It would be good to understand the cases where folks are running into this.

@aabmass If I remember correctly, this happens because when we run the service with manage.py, Django imports the urls, views, models, etc. to inspect them internally before running the extra code we add.

Sorry for being a little late with the reply :D
