WIP - Add remote sampling extension #432

Closed

annanay25
Contributor

@annanay25 annanay25 commented Nov 23, 2019

Adds an extension to proxy client requests for remote sampling configuration to a configured Jaeger collector.
Refer to open-telemetry/opentelemetry-go#327

Motivation

Remote sampling is an important feature that the Jaeger agent supports but the OpenTelemetry collector does not. To ensure that the OpenTelemetry collector can be used as a drop-in replacement for the Jaeger agent (discussed at KubeCon), we add an extension that proxies client requests for sampling configuration to a Jaeger collector.

cc @joe-elliott @bogdandrutu

@annanay25 annanay25 changed the title Add remote sampling extension WIP - Add remote sampling extension Nov 23, 2019
@bogdandrutu
Member

Consider a name like jaegerremotesampling for the package name?

Contributor

@joe-elliott joe-elliott left a comment

I think we should add details about the extension to the docs:
https://github.com/open-telemetry/opentelemetry-collector#extensions
https://github.com/open-telemetry/opentelemetry-collector/blob/master/extension/README.md

Also, I think this will currently conflict with the jaeger receiver as it opens port 5778 when it's starting up the jaeger agent. I am currently looking at #157 and will fix in a PR that adds these configuration options.


// Addr is the upstream Jaeger collector address that can be used to fetch
// sampling configurations. The default value is `:14250`.
Addr string `mapstructure:"addr"`
Contributor

Should this field be more descriptive, to help the operator know what it's for? CollectorAddress, perhaps?

NameVal: typeStr,
},
Port: 5778,
Addr: "0.0.0.0:14250",
Contributor

@joe-elliott joe-elliott Nov 23, 2019

Is there a meaningful default for this field? Maybe it should just default to empty string?

Member

@bogdandrutu bogdandrutu left a comment

After thinking more about this PR, and reading the code I think we can have this as part of the Jaeger receiver:

  • Extend the jaegerreceiver.Config to include a "RemoteSampling(port, address)" extra config.

This way we keep this specific to the Jaeger receiver. I know that this is not related to the processing pipeline, but it is still a property of the Jaeger receiver.

What do you think?

@joe-elliott
Contributor

That works for me. It was my initial thought, but there was some strangeness about it which is why we had questions.

Member

@tigrannajaryan tigrannajaryan left a comment

Please add a PR description to explain the motivation of this change and why it should be in the core Collector repository. Unless there is a specific justification for it by default all new extensions should be added to "contrib" repo: https://github.com/open-telemetry/opentelemetry-collector-contrib/

@annanay25
Contributor Author

@joe-elliott @tigrannajaryan - Thanks for the reviews.

@bogdandrutu - Hmm, I was thinking the same thing while writing the extension, but I thought that remote sampling is not something that should be enabled with the jaeger receiver. Functionally it makes sense to add it as an extension for users to explicitly configure and enable when they really need remote sampling.
I also agree with renaming the package to jaegerremotesampler.

rs.jProxy = jAgent.NewConfigManager(conn)

// Register the http handler at the sampling URI
rs.server.Handler = rs.Handler()
Contributor

It looks like we're only handling the remote sampling proxy, and doing so at every endpoint?

I think we should handle queries like the agent does here: https://github.com/jaegertracing/jaeger/blob/master/cmd/agent/app/httpserver/server.go#L39

Contributor Author

@annanay25 annanay25 Nov 25, 2019

Will add a mux handler for the endpoints.

@tigrannajaryan
Member

It is still unclear to me why this needs to be in the core. I am not even sure this functionality, the way it is implemented, belongs in the Collector at all. It looks like a standalone HTTP server proxying some information it receives, and it does not interact in any way with the rest of the Collector. Why does this functionality need to be in the OpenTelemetry Collector? It seems to me it can be a standalone executable.

@annanay25
Contributor Author

@tigrannajaryan - The assumption is that eventually, the Otel collector will host sampling configurations that the clients/agents can query remotely.

@joe-elliott
Contributor

I can't say I know what the defined goals of the otel collector are, but the sampling rules/baggage proxy is required to fully support the jaeger client. We need it in our environment to begin using the otel collector, which is why we are working to contribute it.

I've been through the docs (https://github.com/open-telemetry/opentelemetry-collector/tree/master/docs) and not found any good guiding principles for whether or not this should be included. If there's a good resource for this please let us know.

@tigrannajaryan
Member

the sampling rules/baggage proxy is required to fully support the jaeger client.

I would like to see a more detailed description of what is the proposed functionality to understand where is the best place for this. The information that I see in this PR is not enough to make a decision.

I can't say I know what the defined goals of the otel collector are, but the sampling rules/baggage proxy is required to fully support the jaeger client. We need it in our environment to begin using the otel collector, which is why we are working to contribute it.

I've been through the docs (https://github.com/open-telemetry/opentelemetry-collector/tree/master/docs) and not found any good guiding principles for whether or not this should be included. If there's a good resource for this please let us know.

There is none currently, and thanks for pointing it out. This is something I or one of the other maintainers will need to create.

For now, here are a few principles that I follow as a maintainer:

  1. If a functionality MUST live in the core then it goes to the core. An example is configuration reading or adding support for components to discover other components and interact with them. Obviously said functionality should benefit the community as a whole otherwise it is unlikely to be accepted.

  2. If a functionality can be implemented as a component (an extension, receiver, processor or exporter) it most likely belongs in the contrib repository. When bootstrapping this project we made an explicit decision to support a minimal set of open-source protocols in the core and decided that all other vendor-specific functionality will be in a separate contrib repository. This is also in line with the vendor-agnostic philosophy of OpenTelemetry. We do not want more vendor-specific code in the core.

  3. Generally the bar for accepting any new functionality in the core is high. The reason is that this increases maintenance burden and ultimately becomes a technical liability for me and other maintainers. I see maintainability (and not just maintenance) as one of the primary responsibilities of maintainers.

  4. The bar for accepting functionality into the "contrib" repo is lower but it still exists. We will be defining minimal quality requirements for the "contrib" soon.

  5. Things that can be independent projects by default should not be in the Collector. Yes, if we see that there is significant community need to have a functionality in the Collector we will consider it, but that is not the default way of doing things; we don't want to increase the size of the Collector just because we can. Again, the implementation that I see in this PR seems to be a standalone functionality that does not interact with the rest of the Collector in any way. If that is not the case and my impression is wrong, then I need more information and evidence of it belonging in the Collector.

@joe-elliott
Contributor

joe-elliott commented Nov 25, 2019

I would like to see a more detailed description of what is the proposed functionality to understand where is the best place for this.

The Jaeger client expects to be able to make an HTTP request for JSON documents that describe sampling rules and baggage restrictions. Currently, depending on configuration, the agent converts these requests to either tchannel or grpc and forwards them to the collector. Additionally, Jaeger is in the process of adding adaptive sampling rules, which will also rely on remote sampling functionality (https://www.jaegertracing.io/docs/1.15/sampling/#adaptive-sampler). The otel collector supports a variety of span ingestion protocols, both agent (thrift compact, thrift binary) and collector (tchannel, grpc, http), but does not support this feature.

We are proposing to add the following to the otel-collector:

The additional configuration would be roughly:

  jaegerproxy:
    listenAddr: 0.0.0.0:5778
    collectorType: grpc|tchannel
    collectorAddress: <collector ip>:14250

These additions can be made with existing Jaeger dependencies.

For Jaeger users this should lower the friction of adoption and increase Jaeger client compatibility.
If this is not appropriate for the otel-collector, internally our only option would be to run a sidecar to the otel-collector that did this proxying for us. Additionally, I'm unsure if the sidecar would be sufficient for the future adaptive sampling functionality as it does not yet exist.

Anyone using the existing Jaeger client across many services with the remote sampling feature and wanting to adopt the otel collector as part of their span pipeline would have similar needs.

@tigrannajaryan
Member

@joe-elliott thank you for the detailed description. I agree it makes sense to support this in the Collector, since we declared we have built-in Jaeger support.

These additions can be made with existing Jaeger dependencies.

Can this new functionality be part of the Jaeger receiver that already exists in this repo? Or is this functionality independently useful even if one is not using the Jaeger receiver? Is there a use case where a Jaeger client needs to retrieve the configuration from the Collector but does not send collected trace data to the Collector?

  jaegerproxy:
    listenAddr: 0.0.0.0:5778
    collectorType: grpc|tchannel
    collectorAddress: <collector ip>:14250

Does this mean the same Jaeger collectorAddress needs to be specified twice: once here and once in Jaeger exporter config? I assume these 2 endpoints are always the same - there is no use case for sending data to one Jaeger instance but receive config from another. Is this correct?

@annanay25
Contributor Author

Can this new functionality be part of the Jaeger receiver that already exists in this repo? Or is this functionality independently useful even if one is not using the Jaeger receiver? Is there a use case where a Jaeger client needs to retrieve the configuration from the Collector but does not send collected trace data to the Collector?

IIUC, I do not think there is a use-case where one would run a Jaeger collector just to serve sampling strategies but not to consume trace data. The agent is usually configured to receive sampling strategies from the same collector that it forwards spans to.

there is no use case for sending data to one Jaeger instance but receive config from another. Is this correct?

Correct.

IIUC, remote sampling configurations will soon be supported with Otel-collector as well, and then we would have to write a remote sampling extension. So my vote would be to keep this package under extensions and not make it specific to the jaegerreceiver, and possibly refactor it at a later time to include functionality to retrieve sampling configuration from the otel-collector.

Now:

extensions:
   remotesampling:
   remotesampling/1:
     port: 5779
     addr: "jaeger.collector:14251" 

Later:

extensions:
   remotesampling:
   remotesampling/1:
     type: jaeger
     port: 5779
     addr: "jaeger.collector:14251"
   remotesampling/2:
     type: otel
     port: 5678
     addr: "otel.collector:14251"

@tigrannajaryan
Member

IIUC, remote sampling configurations will soon be supported with Otel-collector as well,

I am not aware of any initiatives to do that. What do you refer to?

@bogdandrutu
Member

@tigrannajaryan there was a discussion about adding remote sampling configuration to the otel protocol during the opentelemetry maintainers track. We did not commit to doing it, but I think this is probably something that we will need and will end up implementing.

Also, as I mentioned in one of my previous comments, I think this functionality fits better in the Jaeger receiver for the moment.

@jmacd
Contributor

jmacd commented Nov 26, 2019

@bogdandrutu I think, along those lines, we'll need (in the end) a way to remotely configure metrics, which instruments, which dimensions, which aggregations, and so on. It would be very similar to getting a trace-sampling configuration--I'd expect it to be something included in an OTLP response.

@joe-elliott
Contributor

@tigrannajaryan @annanay25

Can this new functionality be part of Jaeger receiver that already exists in this repo?

I agree that you would almost certainly be using a receiver along with the proxy functionality. The agent protocols and the proxy are essentially part of the same api that the jaeger client consumes.

There is no use case for sending data to one Jaeger instance but receive config from another. Is this correct?

I could see a scenario in which you would want to configure the sampling proxy, but not even have a Jaeger exporter in a two tier otel collector setup. One tier would be configured to be close to the process in an "agent" style and a second tier configured for tail based sampling in a "collector" style. Then backend the whole thing with Jaeger. The "agent" tier would not even have a Jaeger exporter but would need to be configured to see the Jaeger backend to provide this functionality.

I am fine with coupling the proxy functionality with the receiver, but think it should be independent of an exporter. I am not opinionated on whether it should be an extension or part of the receiver.

@tigrannajaryan
Member

Thanks for the comments. Given the information I see in this thread I want to suggest that we do the following:

  1. Implement Jaeger sampling configuration proxy as part of Jaeger receiver that already exists in this repo in jaegerreceiver package.

  2. The implementation in jaegerreceiver can connect to Jaeger backend to fetch the sampling configuration. The endpoint to listen on and to fetch from will be a config setting of Jaeger receiver, for example:

  jaeger:
    # protocols is how we currently configure Jaeger receiver.
    protocols:
      grpc:
        endpoint: localhost:9876
    # remoteconfig is a new setting.
    remoteconfig:
        # Remote server to connect to and fetch Jaeger config
        fetch_endpoint: "jaeger.collector:14251"
        # Local address to serve config at
        listen_endpoint: "localhost:5779"

The actual settings that go under remoteconfig may need to be different, depending on what exactly needs to be configurable. I am assuming listen_endpoint needs to be separately configurable. If not then it may be defined by the endpoint settings under protocols section.
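
As a rough sketch of how the `remoteconfig` block above could map onto Go configuration, assuming the setting names from the example are kept as-is: `RemoteConfig`, `Enabled` and `Validate` are hypothetical names, not the receiver's actual API, and treating an empty `fetch_endpoint` as "feature disabled" is an assumption that follows the earlier review suggestion to default the address to an empty string.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// RemoteConfig is a hypothetical Go shape for the proposed `remoteconfig`
// block. The mapstructure tags mirror the YAML keys from the example.
type RemoteConfig struct {
	// FetchEndpoint is the remote server to fetch Jaeger config from.
	FetchEndpoint string `mapstructure:"fetch_endpoint"`
	// ListenEndpoint is the local address to serve the config at.
	ListenEndpoint string `mapstructure:"listen_endpoint"`
}

// Enabled reports whether the proxy should be started at all. An empty
// fetch endpoint is treated as "disabled" (an assumption of this sketch).
func (c RemoteConfig) Enabled() bool {
	return c.FetchEndpoint != ""
}

// Validate checks that both endpoints are host:port pairs when enabled.
func (c RemoteConfig) Validate() error {
	if !c.Enabled() {
		return nil
	}
	if c.ListenEndpoint == "" {
		return errors.New("listen_endpoint must be set when fetch_endpoint is set")
	}
	endpoints := []struct{ key, value string }{
		{"fetch_endpoint", c.FetchEndpoint},
		{"listen_endpoint", c.ListenEndpoint},
	}
	for _, ep := range endpoints {
		if _, _, err := net.SplitHostPort(ep.value); err != nil {
			return fmt.Errorf("invalid %s %q: %w", ep.key, ep.value, err)
		}
	}
	return nil
}

func main() {
	configs := []RemoteConfig{
		{}, // disabled: nothing to validate
		{FetchEndpoint: "jaeger.collector:14251", ListenEndpoint: "localhost:5779"},
		{FetchEndpoint: "jaeger.collector", ListenEndpoint: "localhost:5779"},
	}
	for _, cfg := range configs {
		fmt.Printf("enabled=%v err=%v\n", cfg.Enabled(), cfg.Validate())
	}
}
```

Keeping the block optional means users who only want the receiver's span protocols do not get a sampling proxy started implicitly, which addresses the concern raised earlier in the thread about remote sampling being enabled along with the receiver.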

As for generic remote configuration functionality please show me proposals with design docs attached, happy to review :-)

@joe-elliott
Contributor

joe-elliott commented Nov 26, 2019

This would definitely meet our needs. Could we review PR #434?

It lays the groundwork for independently configurable pieces of the Jaeger agent protocols. @annanay25 could then port his work to use this base for the proxy feature additions in the Jaeger receiver.

@annanay25 annanay25 closed this Dec 6, 2019
tigrannajaryan pushed a commit that referenced this pull request Dec 17, 2019
This PR takes over from #432 

- Added SamplingManager to the `jReceiver` struct
- Added new parameters to configure remote sampling endpoint (`remote_sampling:fetch_endpoint`)
- Added tests and updated README