Making it easier to create custom distros #551

Closed
NathanielRN opened this issue Jun 24, 2021 · 8 comments · Fixed by open-telemetry/opentelemetry-python#1937

Comments

@NathanielRN
Contributor

NathanielRN commented Jun 24, 2021

Description

In the June 24th, 2021 SIG meeting we mentioned that the way Distro and Configurator are set up right now is not friendly to creating new distros: the implementation takes the first distro and the first configurator it finds installed instead of the one the user might want.

See:

logger.warning(
    "Configuration of %s not loaded, %s already loaded",
    entry_point.name,
    configured,
)

Auto-instrumentation requires a distro to be installed. The opentelemetry-distro package provides a Configurator and an OpenTelemetryDistro class. Auto-instrumentation then hooks into the Configurator AND the Distro to configure instrumentation.
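The "first one found wins" behaviour described above can be illustrated with a small, self-contained sketch. This is not the actual loader code: `FakeEntryPoint` and `configure_first` are hypothetical names used only for illustration, and plain callables stand in for distro classes. Only the warning message is taken from the snippet quoted in this issue.

```python
import logging

logger = logging.getLogger(__name__)


class FakeEntryPoint:
    """Hypothetical stand-in for an installed entry point."""

    def __init__(self, name, target):
        self.name = name
        self._target = target

    def load(self):
        return self._target


def configure_first(entry_points, applied):
    """Configure only the first entry point found; warn about the rest."""
    configured = None
    for entry_point in entry_points:
        if configured is not None:
            logger.warning(
                "Configuration of %s not loaded, %s already loaded",
                entry_point.name,
                configured,
            )
            continue
        # Assumption: load() returns a callable that performs the configuration.
        entry_point.load()(applied)
        configured = entry_point.name
    return configured


applied = []
found = [
    FakeEntryPoint("distro", lambda log: log.append("distro")),
    FakeEntryPoint("aws_distro", lambda log: log.append("aws_distro")),
]
winner = configure_first(found, applied)
```

With this iteration order `winner` is `"distro"`; reversing the list makes `"aws_distro"` win instead, which is the lack of guarantee this issue is about.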

Currently the Configurator initializes components:

def _initialize_components():
    exporter_names = _get_exporter_names()
    trace_exporters = _import_exporters(exporter_names)
    id_generator_name = _get_id_generator()
    id_generator = _import_id_generator(id_generator_name)
    _init_tracing(trace_exporters, id_generator)


class Configurator(BaseConfigurator):

    # pylint: disable=no-self-use
    def _configure(self, **kwargs):
        _initialize_components()

And the OpenTelemetryDistro sets environment variables:

class OpenTelemetryDistro(BaseDistro):
    """
    The OpenTelemetry provided Distro configures a default set of
    configuration out of the box.
    """

    # pylint: disable=no-self-use
    def _configure(self, **kwargs):
        os.environ.setdefault(OTEL_TRACES_EXPORTER, "otlp_proto_grpc_span")

Problem

If I want to create a new opentelemetry-distro-aws distro, I really only want to set more environment variables like:

class AwsXrayOpenTelemetryDistro(OpenTelemetryDistro):
    def _configure(self, **kwargs):
        # Run the parent distro's defaults first, then add the AWS-specific one.
        super()._configure(**kwargs)
        os.environ.setdefault(OTEL_PROPAGATORS, "aws_xray")

but because the current implementation only takes the 1st distro, I have no guarantee my -aws distro will be used for configuration.

Furthermore, if I like the Configurator as is, I have to copy and paste the code in the current setup because again only 1 "distro" can be installed.

Potential Solutions

Some solutions we mentioned in the SIG meeting:

Option 1: Use an environment variable to select the distro you want out of all the ones installed

Use OTEL_PYTHON_DISTRO to select the one you want.

PROs:

  • It is similar to patterns we have elsewhere with environment variables?

CONs:

  • Encourages installing multiple distros, which is confusing
  • Would require a user to BOTH pip install opentelemetry-distro-aws and export OTEL_PYTHON_DISTRO=aws, which is confusing because a distro install should just work
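As a rough sketch of what Option 1 could look like, the lookup below picks a distro by name from an environment mapping. Everything here is an assumption for illustration: the variable name is taken from the `export` example above, and `select_distro`, `DefaultDistro`, `AwsDistro` and the `installed` mapping are hypothetical.

```python
import os

# Assumed variable name, taken from the `export` example in this thread.
DISTRO_ENV_VAR = "OTEL_PYTHON_DISTRO"


def select_distro(installed, environ=os.environ):
    """Return the installed distro named by the environment variable.

    `installed` maps a hypothetical distro name to its class.
    """
    wanted = environ.get(DISTRO_ENV_VAR)
    if wanted is None:
        raise LookupError(f"{DISTRO_ENV_VAR} is not set")
    if wanted not in installed:
        raise LookupError(f"distro {wanted!r} is not installed")
    return installed[wanted]


class DefaultDistro:
    """Hypothetical default distro."""


class AwsDistro:
    """Hypothetical vendor distro."""


installed = {"default": DefaultDistro, "aws": AwsDistro}
chosen = select_distro(installed, {DISTRO_ENV_VAR: "aws"})
```

The sketch makes the second CON visible: installing the package alone is not enough, because `select_distro` fails until the variable is exported as well.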

Option 2: Move the useful "initialization" methods of the Configurator and BaseDistro into the opentelemetry-sdk and use more virtual methods (i.e. with a default implementation) instead

This also means moving BaseDistro out of opentelemetry-instrumentation package and into opentelemetry-sdk.

PROs:

  • The Configurator _initialize_components() code will probably be helpful to many distros so we should pull it out
  • More virtual methods will truly allow distros to only set environment variables and make small adjustments to the useful default implementations
  • You only need to install 1 distro because we will have the right virtual methods to hook into so that downstream distros can use the useful default implementations and only make small changes. i.e.
    • set environment variable
    • customize Instrumentor "configuration options"

CONs:

  • The specification doesn't really give guidance on this. But it just uses environment variables to initialize components, as the spec requires, so it should be fine.
  • Making _initialize_components() the default means it assumes more about what we want. It is probably what we want, though, so as long as we expose the right methods to hook into, distros still get the chance to customize.
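A minimal sketch of the Option 2 idea, assuming a base class that owns the shared initialization and exposes one overridable hook. `SketchBaseDistro`, `SketchAwsDistro`, `configure` and `_set_defaults` are hypothetical names, not the real BaseDistro API, and the "components" are reduced to a dictionary so the example stays runnable without OpenTelemetry installed.

```python
import os


class SketchBaseDistro:
    """Hypothetical base class; not the real BaseDistro API."""

    def configure(self, environ=None):
        environ = os.environ if environ is None else environ
        self._set_defaults(environ)  # virtual hook with a default implementation
        return self._initialize_components(environ)

    def _set_defaults(self, environ):
        environ.setdefault("OTEL_TRACES_EXPORTER", "otlp_proto_grpc_span")

    def _initialize_components(self, environ):
        # Stand-in for the shared setup; it only reads the environment.
        return {
            "exporter": environ["OTEL_TRACES_EXPORTER"],
            "propagators": environ.get("OTEL_PROPAGATORS"),
        }


class SketchAwsDistro(SketchBaseDistro):
    """A downstream distro that only adds one environment default."""

    def _set_defaults(self, environ):
        super()._set_defaults(environ)
        environ.setdefault("OTEL_PROPAGATORS", "aws_xray")


settings = SketchAwsDistro().configure(environ={})
```

The downstream class is a few lines long and copies nothing from the default setup, which is the outcome the PROs list is aiming for.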

Summary

I think Option 2 is the clear preference. We just need to move BaseDistro and the _initialize_components() implementation to a common package (opentelemetry-sdk for BaseDistro? opentelemetry-instrumentation for _initialize_components?) and expose the right virtual/abstract methods in the BaseDistro class.

We might even remove Configurator altogether?

@owais
Contributor

owais commented Jul 1, 2021

👍 for Option 2.

In addition to the pros/cons listed, Option 1 would also require a user to run pip install opentelemetry-distro-aws and then also export OTEL_PYTHON_DISTRO=aws. I think if a user intentionally installs a specific distro, it should ideally work without any further configuration. Generally we should not make configuration more complex for users just to reuse some code internally.

@codeboten
Contributor

codeboten commented Jul 1, 2021

+1 option 2
Maybe we can vote with the following options:
Option 1 🚢
Option 2 🚀

@ocelotl
Contributor

ocelotl commented Jul 9, 2021

I think I am outvoted here, but I would prefer option 1 😅. A distro class may need to inherit from another distro present in a different distro package, thus making it necessary that more than one distro package is installed.

@ocelotl
Contributor

ocelotl commented Jul 9, 2021

> +1 option 2
> Maybe we can vote with the following options:
> Option 1
> Option 2

Hey, there is no 🚢 in the available reactions...

@NathanielRN
Contributor Author

@ocelotl I initially thought the same as you, but I think a good alternative solution is found in open-telemetry/opentelemetry-python#1937.

In this PR we are finding a way to add ALL the useful things all the distros might want to commonly do in a central accessible place. That way distros should really only be setting environment variables or doing very specific logic.

Hopefully this way no distro will want to copy "environment variable" setting and can confidently create a new distro with only a few lines of code without needing to inherit from other distros.

@ocelotl
Contributor

ocelotl commented Jul 9, 2021

How about not requiring an environment variable being set if there is only one distro installed?

@NathanielRN
Contributor Author

@ocelotl

> How about not requiring an environment variable being set if there is only one distro installed?

Hm that's interesting. I guess I agree that at the very least it doesn't seem right to error if someone has multiple distros installed, maybe in #571 we can just use a warning and by default use the first one.

I think it's more likely that a user accidentally has 2 distros installed rather than them wanting to use only 1 but install 2. I think an environment variable makes it easier to make that mistake? It sounds like you've found an Option 3 though which is a combination of 1&2 😝
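The "Option 3" combination could be sketched roughly as follows. `pick_distro` is a hypothetical helper, and `selected` stands for the value of the selection environment variable. What should happen when several distros are installed and nothing is selected is still open in this thread; the sketch raises in that case, which is an assumption and not an agreed behaviour.

```python
def pick_distro(installed, selected=None):
    """Pick a distro name from `installed`, a list of hypothetical names.

    `selected` stands for the value of the selection environment variable.
    """
    if selected is not None:
        if selected not in installed:
            raise LookupError(f"distro {selected!r} is not installed")
        return selected
    if len(installed) == 1:
        # Only one distro installed: no environment variable needed.
        return installed[0]
    # Assumption: refuse to guess rather than fall back to the first one found.
    raise LookupError("zero or several distros installed; a selection is needed")


only = pick_distro(["aws"])
explicit = pick_distro(["default", "aws"], selected="aws")
```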

@owais
Contributor

owais commented Jul 13, 2021

> I guess I agree that at the very least it doesn't seem right to error if someone has multiple distros installed, maybe in #571 we can just use a warning and by default use the first one.

This is very dangerous because the order in which distros are loaded is totally non-deterministic. This will trip up a lot of people. Imagine someone installing a distro that pulls another one behind the scenes. They test locally and it works as expected with a seemingly harmless warning (because everything works in practice) but once they deploy to production, their service exhibits completely different behaviour by using the other distro.

We are mixing up two issues here:

a. Should users be able to install multiple distros at the same time or not? If yes, why? What is the use case for this?
b. Can we somehow make it possible for distro authors to use other distros as libraries in order to re-use code without surprising the user? I.e., if a user installs distro X, then the instrument command always uses distro X no matter what other distros distro X might pull in as dependencies.

I think a. is not a real problem at least until someone comes up with a very valid use-case for this.

b. is worth solving IMO and that's what we should focus on here. It seems we are trying to solve b. by coming up with a solution for a., which might not result in the best possible outcome IMO.

I still think the simplest solution is to just take out actually useful code and publish it independently of the default distro somewhere. If we establish this pattern, all distros should only contain vendor-specific code/configuration and it wouldn't be useful to use any distro as a base/library anyway.

However, if we really want all distros to also be Python libraries, may I suggest a fourth option? :)

Priority levels: Each distro would ship with a priority number and the distro with the highest priority would always win. For example, the default otel distro would ship with priority level 0 while a vendor-specific distro would have priority level 10. Anyone building a distro on top of another one would have to bump the priority level of their distro. So if I wanted to build on top of the AWS distro (with priority 10), I'd add the AWS distro as a dependency, import and use it as a library, but set my distro's priority to 20. This would ensure that the opentelemetry-instrument command would always prefer my distro. This also gives authors the ability to reuse other distros as libraries while still shipping deterministic behaviour, without requiring end-users to have any knowledge about distros or make any decisions. This does not allow users to manually install multiple distros but allows distro authors to build on other distros, which is what I think @NathanielRN and @ocelotl both want to be able to do.

(quoted from open-telemetry/opentelemetry-python#1937 (comment))
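The priority scheme proposed above can be sketched in a few lines. The class names, the `priority` attribute and `highest_priority` are hypothetical; the numbers 0, 10 and 20 come from the example in the comment. How ties should be handled is not discussed in the thread, so raising on a tie is an assumption.

```python
class DefaultDistro:
    """Hypothetical default distro."""

    priority = 0


class AwsDistro(DefaultDistro):
    """Hypothetical vendor distro."""

    priority = 10


class MyDistro(AwsDistro):
    """Hypothetical distro that reuses the AWS distro as a library."""

    priority = 20


def highest_priority(distros):
    """Return the distro class with the highest priority."""
    ranked = sorted(distros, key=lambda distro: distro.priority, reverse=True)
    if not ranked:
        raise LookupError("no distro installed")
    if len(ranked) > 1 and ranked[0].priority == ranked[1].priority:
        # Assumption: a tie is an error, so the result never depends on load order.
        raise ValueError("two distros share the top priority")
    return ranked[0]


chosen = highest_priority([AwsDistro, MyDistro, DefaultDistro])
```

Because the choice depends only on the declared numbers, the result is the same whatever order the distros are discovered in.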
