Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial support for auto-instrumentation #464

Merged
merged 6 commits into from
Nov 2, 2021

Conversation

pavolloffay
Copy link
Member

@pavolloffay pavolloffay commented Oct 20, 2021

Updates #455

This PR adds functionality to inject OpenTelemetry auto-instrumentation (at this point only javaagent) into a pod via admission controller. Note that this is the smallest PR I could submit to support at least something end-to-end. There will be follow up PRs to provide additional features.

Notable changes

  • new CR to define SDK and instrumentation configuration
  • javaagent is injected and configured if java.inst.opentelemetry.io/inject: "true" is specified. The value van be the instrumentation CR name.
  • the javaagent injection uses init container to copy javaagent jar from init container image to a volume that is mounted to the application container
  • the configuration of the instrumentation and SDK is done via environment variables

CR

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryInst
metadata:
  name: java-instrumentation
spec:
  exporter:
    endpoint: http://otel-collector:1
  java:
    image: ghcr.io/pavolloffay/otel-javaagent:1.5.3

Follow up PRs

  • Support better service name
  • Inject k8s resources - namespace, cluster etc.
  • add sampling configuration to CR
  • add support for other languages
  • add status .e.g. show which workloads are instrumented and what is the sampling configuration...
  • add flag to set default javaagent image
  • add e2e test

@pavolloffay pavolloffay requested review from a team and dmitryax October 20, 2021 17:29
@pavolloffay
Copy link
Member Author

@jpkrohling @anuraaga could you please give it a quick look? If it looks fine I will continue and fix tests.

@pavolloffay pavolloffay force-pushed the auto-instrumentation branch 2 times, most recently from 6e512c4 to c4594e4 Compare October 20, 2021 17:54
Copy link
Contributor

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks great!

Java JavaSpec `json:"java,omitempty"`
// +optional
// +operator-sdk:gen-csv:customresourcedefinitions.specDescriptors=true
Reporter `json:"reporter,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does Reporter configure?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a structure. It configures everything related to reporting data. At the moment it is only the endpoint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea that it's configuration shared among languages rather than language-specific? Maybe it can be called Common or Shared

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the idea is to have a common configuration e.g. reporter, sampler at the root level. I am not sure if I like shared or common node in the CR for this, it seems unnecesarilly redundant. I would rather prefer the following (ignore the correctness of the sampler):

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryInst
metadata:
  name: java-instrumentation
spec:
  reporter:
    endpoint: http://otel-collector:1
  sampler:
    type: percentrage
    arg: 0.2
  java:
    image: ghcr.io/pavolloffay/otel-javaagent:1.5.3

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. In that case let's call it exporter, which is what otel calls this concept.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good.

Shall we think about splitting this into data types? e.g. exporter for traces, metrics? We can start with what we have right now and then add metrics exporter. The current config will server as default and traces one.

internal/podinjector/main.go Outdated Show resolved Hide resolved
pkg/instrumentation/javaagent.go Show resolved Hide resolved
README.md Outdated
image: ghcr.io/pavolloffay/otel-javaagent:1.5.3 # <1>
```

1. Docker image with [OpenTelemetry Java auto-instrumentation](https://github.com/open-telemetry/opentelemetry-java-instrumentation).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc) @anuraaga this is what the docker image has to execute at runtime

Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't finish reviewing yet but wanted to release my comments already.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
internal/webhookhandler/main.go Outdated Show resolved Hide resolved
internal/webhookhandler/main.go Outdated Show resolved Hide resolved

const (
// annotation contains the annotation name that pods contain, indicating whether a sidecar is desired.
annotation = "java.inst.opentelemetry.io/inject"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about opentelemetry.io/inject-java-autoinstrumentation ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what we have for the sidecar sidecar.opentelemetry.io/inject: "true". Are there any semantics we follow for the annotation name? Here is what istio does https://istio.io/latest/docs/reference/config/annotations/ https://istio.io/latest/docs/reference/config/labels/

I think we should follow something like this:
kind.api-version/functionality

so for the instrumentation it would be instrumentation.opentelemetry.io/inject. Or with the short prefix inst.

Also instead of annotation, I think it's more appropriate to use a label. Here is why:

from https://blog.getambassador.io/kubernetes-labels-vs-annotations-95fc47196b6d

Labels are key/value pairs that are attached to objects, such as pods. Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users, but do not directly imply semantics to the core system. Labels can be used to organize and to select subsets of objects. Labels can be attached to objects at creation time and subsequently added and modified at any time … Labels allow for efficient queries and watches and are ideal for use in UIs and CLIs. Non-identifying information should be recorded using annotations.

k8s docs.
https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/
https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For trivia, I was going to suggest just auto.opentelemetry.io = java or something like that. After looking more at the code I realized that a pod needs to be able to opt out if the namespace opted in, so such a pattern works poorly.

Whether java is in the subdomain or path doesn't have any concrete benefit for me. We do need to decide on one and stick with it going forward though. I don't have enough context on k8s operator naming to pick a side though somehow path does seem more familiar. I can't describe why sorry :(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About labels vs annotations: we use annotations for the user request (please, inject!), and labels to record the state (injected=my-otelcol-instance).

How about instrumentation.opentelemetry.io/inject-javaagent then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, instrumentation.opentelemetry.io/inject-java-agent :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instrumentation.opentelemetry.io/inject-java I think this will do the job. There is only single java instrumentation.

In the future there will be instrumentation.opentelemetry.io/inject-dotnet or instrumentation.opentelemetry.io/inject-sdk to just inject SDK configuration e.g. to serve as a control plane.

pkg/instrumentation/javaagent.go Show resolved Hide resolved
@pavolloffay
Copy link
Member Author

pavolloffay commented Oct 21, 2021

Also the new CRD type is defined in the same package as the collector. If we think they should be in separate packages here is PR #462. First I thought they should be in separate packages so they can evolve separately (e.g. from alpha to v1).

@jpkrohling
Copy link
Member

That PR has been merged. Click "re-request review" when it's ready.

@pavolloffay
Copy link
Member Author

@jpkrohling PR updated and review

@jpkrohling
Copy link
Member

There are some merge conflicts, possibly due to the rename of the package?

@pavolloffay
Copy link
Member Author

rebased again, it should be good now.

api/instrumentation/v1alpha1/opentelemetryinst_types.go Outdated Show resolved Hide resolved
api/instrumentation/v1alpha1/opentelemetryinst_types.go Outdated Show resolved Hide resolved
pkg/instrumentation/podmutator.go Outdated Show resolved Hide resolved
pkg/instrumentation/sdk_test.go Outdated Show resolved Hide resolved
pkg/instrumentation/sdk_test.go Outdated Show resolved Hide resolved
pkg/instrumentation/sdk_test.go Outdated Show resolved Hide resolved
// AnnotationValue returns the effective annotation value, based on the annotations from the pod and namespace.
func AnnotationValue(ns corev1.Namespace, pod corev1.Pod) string {
// annotationValue returns the effective annotation value, based on the annotations from the pod and namespace.
func annotationValue(ns corev1.Namespace, pod corev1.Pod) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should still be part of the public API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to expose this at this point. The API is not consumed from any other packages.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is already public. Do you need to make it private for this PR?

// Add a new sidecar container to the given pod, based on the given OpenTelemetryCollector.
func Add(cfg config.Config, logger logr.Logger, otelcol v1alpha1.OpenTelemetryCollector, pod corev1.Pod) (corev1.Pod, error) {
// add a new sidecar container to the given pod, based on the given OpenTelemetryCollector.
func add(cfg config.Config, logger logr.Logger, otelcol v1alpha1.OpenTelemetryCollector, pod corev1.Pod) (corev1.Pod, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for these, they should be part of the public API. Are these changes required for this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to expose this at this point. The API is not consumed from any other packages.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is called from the pod mutator which has a public API

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already part of the public API. Please revert this change, unless they are required by your PR.

Signed-off-by: Pavol Loffay <[email protected]>
@pavolloffay
Copy link
Member Author

@jpkrohling PR updated

Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. This is missing an e2e test and docs. Also, what was public API should still be public API, unless you have a great reason to make it private as part of this PR. If it's not required by this PR, I'm open to hearing your arguments in a separate issue.

pkg/instrumentation/sdk_test.go Show resolved Hide resolved
pkg/instrumentation/sdk_test.go Show resolved Hide resolved
pkg/instrumentation/sdk_test.go Show resolved Hide resolved
pkg/instrumentation/sdk_test.go Show resolved Hide resolved
pkg/instrumentation/sdk_test.go Show resolved Hide resolved
@pavolloffay
Copy link
Member Author

Somebody here is strongly opinionated about using example.com. I would like to know why, the tests do not execute any request.

@pavolloffay
Copy link
Member Author

. Also, what was public API should still be public API, unless you have a great reason to make it private as part of this PR. If it's not required by this PR, I'm open to hearing your arguments in a separate issue.

Before this PR those APIs were public bc they were used from the webhook package. This is not the case anymore because whole sidecar injection functionality is in a single well-encapsulated package sidecar. The only exposed API is the Annotation string and mutator implementation for the sidecar.

Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
@pavolloffay
Copy link
Member Author

Added some docs. The e2e test will come in the follow-up PR (See the first comment in this PR with a section for the follow-up work).

@jpkrohling
Copy link
Member

Somebody here is strongly opinionated about using example.com. I would like to know why, the tests do not execute any request.

When looking at a codebase I don't know and seeing references to endpoints, I'm never sure if it's being mocked somewhere and a call is being made, or if it's just a placeholder. Using example.com makes it clear that this is not a network connection in any way (local, mocked, ...).

Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this PR those APIs were public bc they were used from the webhook package. This is not the case anymore because whole sidecar injection functionality is in a single well-encapsulated package sidecar. The only exposed API is the Annotation string and mutator implementation for the sidecar.

There are forks of this repository that might be using those public functions (cc @owais). Please, revert those changes to get this merged soon and open an issue proposing making them private. This is a potentially breaking change.

@@ -0,0 +1,20 @@
# Instrumentation Custom Resource Specification
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the spirit, but this looks a bit misplaced: it's not linked from anywhere, it's not like anything we have so far for the other CR, and it's not a full example (like the e2e tests or the ones from config/samples. I'm OK in having it, but I'm not sure this is going to be that useful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have asked for docs and I thought you are asking for this bc it is in the ./docs directory and there is a similar file for the collector CRD

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as mentioned here a couple of times e2e tests will ne done in a follow-up PR. Also I don't want to be adding examples until we have official docker image with javaagent - see #475

@pavolloffay
Copy link
Member Author

pavolloffay commented Oct 27, 2021

There are forks of this repository that might be using those public functions (cc @owais). Please, revert those changes to get this merged soon and open an issue proposing making them private. This is a potentially breaking change.

It does not make any sense but let's move it back as you wish. All the functionality is still there just some code moved from two packages to a single package and not it does not have to expose all those APIs that were exposed before due to the previous split design.

Signed-off-by: Pavol Loffay <[email protected]>
Signed-off-by: Pavol Loffay <[email protected]>
@anuraaga
Copy link
Contributor

anuraaga commented Nov 1, 2021

@jpkrohling Just curious if this is ready to get in, with more followups for e.g. e2e testing. Looking forward to extending to add more languages :)

@jpkrohling
Copy link
Member

jpkrohling commented Nov 1, 2021

I'll try to get a final review later today. If @VineethReddy02 can do it before me and approve, I'll merge it without my final review, assuming you did review the latest version of the PR already, @anuraaga.

Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Some minor changes to the readme, but nothing serious. I have asked @owais to provide short feedback on the effects of this PR in downstream forks and will merge this later today based on his feedback.

Feel free to work on the readme changes in a follow-up PR.

@@ -146,6 +149,46 @@ spec:
EOF
```

### OpenTelemetry auto-instrumentation injection

The operator can inject and configure OpenTelemetry auto-instrumentation libraries. At this moment, the operator can inject only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not have forced line wraps in this file, as editors can do it as well. While I don't have a particular preference, I do prefer the same style to be used consistently.

README.md Show resolved Hide resolved
README.md Show resolved Hide resolved
README.md Show resolved Hide resolved
@jpkrohling
Copy link
Member

@owais just confirmed that this is great from his side as well. Merging!

@jpkrohling jpkrohling merged commit dd248a3 into open-telemetry:main Nov 2, 2021
shree007 pushed a commit to shree007/opentelemetry-operator that referenced this pull request Dec 12, 2021
* Initial support for auto-instrumentation

Signed-off-by: Pavol Loffay <[email protected]>

* fix

Signed-off-by: Pavol Loffay <[email protected]>

* Fix bundle

Signed-off-by: Pavol Loffay <[email protected]>

* Add docs

Signed-off-by: Pavol Loffay <[email protected]>

* make a nonsense change

Signed-off-by: Pavol Loffay <[email protected]>

* Add link

Signed-off-by: Pavol Loffay <[email protected]>
ItielOlenick pushed a commit to ItielOlenick/opentelemetry-operator that referenced this pull request May 1, 2024
* Initial support for auto-instrumentation

Signed-off-by: Pavol Loffay <[email protected]>

* fix

Signed-off-by: Pavol Loffay <[email protected]>

* Fix bundle

Signed-off-by: Pavol Loffay <[email protected]>

* Add docs

Signed-off-by: Pavol Loffay <[email protected]>

* make a nonsense change

Signed-off-by: Pavol Loffay <[email protected]>

* Add link

Signed-off-by: Pavol Loffay <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants