Python auto-instrumentation: handle musl based containers #3332

Merged 5 commits on Nov 4, 2024
16 changes: 16 additions & 0 deletions .chloggen/3332-musl-python-autoinstrumentation.yaml
@@ -0,0 +1,16 @@
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. collector, target allocator, auto-instrumentation, opamp, github action)
component: auto-instrumentation

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: add config for installing musl based auto-instrumentation for Python

# One or more tracking issues related to the change
issues: [2264]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:
3 changes: 3 additions & 0 deletions README.md
@@ -292,9 +292,12 @@ instrumentation.opentelemetry.io/inject-nodejs: "true"
```

Python:
Python auto-instrumentation also honors an annotation that lets it run on images that use a C library other than glibc.

```bash
instrumentation.opentelemetry.io/inject-python: "true"
instrumentation.opentelemetry.io/otel-python-platform: "glibc" # for Linux glibc based images, this is the default value and can be omitted
instrumentation.opentelemetry.io/otel-python-platform: "musl" # for Linux musl based images
```

.NET:
1 change: 1 addition & 0 deletions pkg/instrumentation/annotation.go
@@ -30,6 +30,7 @@ const (
annotationInjectNodeJSContainersName = "instrumentation.opentelemetry.io/nodejs-container-names"
annotationInjectPython = "instrumentation.opentelemetry.io/inject-python"
annotationInjectPythonContainersName = "instrumentation.opentelemetry.io/python-container-names"
annotationPythonPlatform = "instrumentation.opentelemetry.io/otel-python-platform"
annotationInjectDotNet = "instrumentation.opentelemetry.io/inject-dotnet"
annotationDotNetRuntime = "instrumentation.opentelemetry.io/otel-dotnet-auto-runtime"
annotationInjectDotnetContainersName = "instrumentation.opentelemetry.io/dotnet-container-names"
1 change: 1 addition & 0 deletions pkg/instrumentation/podmutator.go
@@ -321,6 +321,7 @@ func (pm *instPodMutator) Mutate(ctx context.Context, ns corev1.Namespace, pod c
}
if pm.config.EnablePythonAutoInstrumentation() || inst == nil {
insts.Python.Instrumentation = inst
insts.Python.AdditionalAnnotations = map[string]string{annotationPythonPlatform: annotationValue(ns.ObjectMeta, pod.ObjectMeta, annotationPythonPlatform)}
} else {
logger.Error(nil, "support for Python auto instrumentation is not enabled")
pm.Recorder.Event(pod.DeepCopy(), "Warning", "InstrumentationRequestRejected", "support for Python auto instrumentation is not enabled")
38 changes: 26 additions & 12 deletions pkg/instrumentation/python.go
@@ -23,19 +23,23 @@ import (
)

const (
- envPythonPath = "PYTHONPATH"
- envOtelTracesExporter = "OTEL_TRACES_EXPORTER"
- envOtelMetricsExporter = "OTEL_METRICS_EXPORTER"
- envOtelLogsExporter = "OTEL_LOGS_EXPORTER"
- envOtelExporterOTLPProtocol = "OTEL_EXPORTER_OTLP_PROTOCOL"
- pythonPathPrefix = "/otel-auto-instrumentation-python/opentelemetry/instrumentation/auto_instrumentation"
- pythonPathSuffix = "/otel-auto-instrumentation-python"
- pythonInstrMountPath = "/otel-auto-instrumentation-python"
- pythonVolumeName = volumeName + "-python"
- pythonInitContainerName = initContainerName + "-python"
+ envPythonPath = "PYTHONPATH"
+ envOtelTracesExporter = "OTEL_TRACES_EXPORTER"
+ envOtelMetricsExporter = "OTEL_METRICS_EXPORTER"
+ envOtelLogsExporter = "OTEL_LOGS_EXPORTER"
+ envOtelExporterOTLPProtocol = "OTEL_EXPORTER_OTLP_PROTOCOL"
+ glibcLinuxAutoInstrumentationSrc = "/autoinstrumentation/."
+ muslLinuxAutoInstrumentationSrc = "/autoinstrumentation-musl/."
+ pythonPathPrefix = "/otel-auto-instrumentation-python/opentelemetry/instrumentation/auto_instrumentation"
+ pythonPathSuffix = "/otel-auto-instrumentation-python"
+ pythonInstrMountPath = "/otel-auto-instrumentation-python"
+ pythonVolumeName = volumeName + "-python"
+ pythonInitContainerName = initContainerName + "-python"
+ glibcLinux = "glibc"
+ muslLinux = "musl"
)

- func injectPythonSDK(pythonSpec v1alpha1.Python, pod corev1.Pod, index int) (corev1.Pod, error) {
+ func injectPythonSDK(pythonSpec v1alpha1.Python, pod corev1.Pod, index int, platform string) (corev1.Pod, error) {
volume := instrVolume(pythonSpec.VolumeClaimTemplate, pythonVolumeName, pythonSpec.VolumeSizeLimit)

// caller checks if there is at least one container.
@@ -46,6 +50,16 @@ func injectPythonSDK(pythonSpec v1alpha1.Python, pod corev1.Pod, index int) (cor
return pod, err
}

autoInstrumentationSrc := ""
switch platform {
case "", glibcLinux:
autoInstrumentationSrc = glibcLinuxAutoInstrumentationSrc
case muslLinux:
autoInstrumentationSrc = muslLinuxAutoInstrumentationSrc
default:
return pod, fmt.Errorf("provided instrumentation.opentelemetry.io/otel-python-platform annotation value '%s' is not supported", platform)
}

// inject Python instrumentation spec env vars.
for _, env := range pythonSpec.Env {
idx := getIndexOfEnv(container.Env, env.Name)
@@ -111,7 +125,7 @@ func injectPythonSDK(pythonSpec v1alpha1.Python, pod corev1.Pod, index int) (cor
pod.Spec.InitContainers = append(pod.Spec.InitContainers, corev1.Container{
Name: pythonInitContainerName,
Image: pythonSpec.Image,
- Command: []string{"cp", "-r", "/autoinstrumentation/.", pythonInstrMountPath},
+ Command: []string{"cp", "-r", autoInstrumentationSrc, pythonInstrMountPath},
Resources: pythonSpec.Resources,
VolumeMounts: []corev1.VolumeMount{{
Name: volume.Name,
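The platform handling in this hunk can be read in isolation. The following is a minimal, standalone sketch of the same decision, not the operator's code: `autoInstrumentationSource` is a hypothetical helper name, and the two source paths and the error text are copied from the diff above. It shows that an empty value and `glibc` select the same directory, `musl` selects the musl build, and anything else is rejected.

```go
package main

import "fmt"

// Paths copied from the constants added in python.go.
const (
	glibcSrc = "/autoinstrumentation/."
	muslSrc  = "/autoinstrumentation-musl/."
)

// autoInstrumentationSource is a hypothetical helper that mirrors the switch
// added to injectPythonSDK: it maps the annotation value to the directory the
// init container copies from.
func autoInstrumentationSource(platform string) (string, error) {
	switch platform {
	case "", "glibc":
		// An unset annotation falls back to the glibc build.
		return glibcSrc, nil
	case "musl":
		return muslSrc, nil
	default:
		return "", fmt.Errorf("provided instrumentation.opentelemetry.io/otel-python-platform annotation value '%s' is not supported", platform)
	}
}

func main() {
	for _, p := range []string{"", "glibc", "musl", "not-supported"} {
		src, err := autoInstrumentationSource(p)
		if err != nil {
			fmt.Printf("%q: %v\n", p, err)
			continue
		}
		fmt.Printf("%q: cp -r %s /otel-auto-instrumentation-python\n", p, src)
	}
}
```

Rejecting unknown values instead of silently defaulting means a typo in the annotation surfaces as a skipped injection with a logged reason, not as a pod running binaries built for the wrong C library.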
169 changes: 168 additions & 1 deletion pkg/instrumentation/python_test.go
@@ -29,6 +29,7 @@ func TestInjectPythonSDK(t *testing.T) {
name string
v1alpha1.Python
pod corev1.Pod
platform string
expected corev1.Pod
err error
}{
@@ -42,6 +43,7 @@ func TestInjectPythonSDK(t *testing.T) {
},
},
},
platform: "glibc",
expected: corev1.Pod{
Spec: corev1.PodSpec{
Volumes: []corev1.Volume{
@@ -118,6 +120,7 @@ func TestInjectPythonSDK(t *testing.T) {
},
},
},
platform: "glibc",
expected: corev1.Pod{
Spec: corev1.PodSpec{
Volumes: []corev1.Volume{
@@ -195,6 +198,7 @@ func TestInjectPythonSDK(t *testing.T) {
},
},
},
platform: "glibc",
expected: corev1.Pod{
Spec: corev1.PodSpec{
Volumes: []corev1.Volume{
@@ -271,6 +275,7 @@ func TestInjectPythonSDK(t *testing.T) {
},
},
},
platform: "glibc",
expected: corev1.Pod{
Spec: corev1.PodSpec{
Volumes: []corev1.Volume{
@@ -423,6 +428,7 @@ func TestInjectPythonSDK(t *testing.T) {
},
},
},
platform: "glibc",
expected: corev1.Pod{
Spec: corev1.PodSpec{
Volumes: []corev1.Volume{
@@ -499,6 +505,7 @@ func TestInjectPythonSDK(t *testing.T) {
},
},
},
platform: "glibc",
expected: corev1.Pod{
Spec: corev1.PodSpec{
Containers: []corev1.Container{
@@ -515,11 +522,171 @@ func TestInjectPythonSDK(t *testing.T) {
},
err: fmt.Errorf("the container defines env var value via ValueFrom, envVar: %s", envPythonPath),
},
{
name: "musl platform defined",
Python: v1alpha1.Python{Image: "foo/bar:1"},
pod: corev1.Pod{
Spec: corev1.PodSpec{
Containers: []corev1.Container{
{},
},
},
},
platform: "musl",
expected: corev1.Pod{
Spec: corev1.PodSpec{
Volumes: []corev1.Volume{
{
Name: pythonVolumeName,
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{
SizeLimit: &defaultVolumeLimitSize,
},
},
},
},
InitContainers: []corev1.Container{
{
Name: "opentelemetry-auto-instrumentation-python",
Image: "foo/bar:1",
Command: []string{"cp", "-r", "/autoinstrumentation-musl/.", "/otel-auto-instrumentation-python"},
VolumeMounts: []corev1.VolumeMount{{
Name: "opentelemetry-auto-instrumentation-python",
MountPath: "/otel-auto-instrumentation-python",
}},
},
},
Containers: []corev1.Container{
{
VolumeMounts: []corev1.VolumeMount{
{
Name: "opentelemetry-auto-instrumentation-python",
MountPath: "/otel-auto-instrumentation-python",
},
},
Env: []corev1.EnvVar{
{
Name: "PYTHONPATH",
Value: fmt.Sprintf("%s:%s", "/otel-auto-instrumentation-python/opentelemetry/instrumentation/auto_instrumentation", "/otel-auto-instrumentation-python"),
},
{
Name: "OTEL_EXPORTER_OTLP_PROTOCOL",
Value: "http/protobuf",
},
{
Name: "OTEL_TRACES_EXPORTER",
Value: "otlp",
},
{
Name: "OTEL_METRICS_EXPORTER",
Value: "otlp",
},
{
Name: "OTEL_LOGS_EXPORTER",
Value: "otlp",
},
},
},
},
},
},
err: nil,
},
{
name: "platform not defined",
Python: v1alpha1.Python{Image: "foo/bar:1"},
pod: corev1.Pod{
Spec: corev1.PodSpec{
Containers: []corev1.Container{
{},
},
},
},
platform: "",
expected: corev1.Pod{
Spec: corev1.PodSpec{
Volumes: []corev1.Volume{
{
Name: pythonVolumeName,
VolumeSource: corev1.VolumeSource{
EmptyDir: &corev1.EmptyDirVolumeSource{
SizeLimit: &defaultVolumeLimitSize,
},
},
},
},
InitContainers: []corev1.Container{
{
Name: "opentelemetry-auto-instrumentation-python",
Image: "foo/bar:1",
Command: []string{"cp", "-r", "/autoinstrumentation/.", "/otel-auto-instrumentation-python"},
VolumeMounts: []corev1.VolumeMount{{
Name: "opentelemetry-auto-instrumentation-python",
MountPath: "/otel-auto-instrumentation-python",
}},
},
},
Containers: []corev1.Container{
{
VolumeMounts: []corev1.VolumeMount{
{
Name: "opentelemetry-auto-instrumentation-python",
MountPath: "/otel-auto-instrumentation-python",
},
},
Env: []corev1.EnvVar{
{
Name: "PYTHONPATH",
Value: fmt.Sprintf("%s:%s", "/otel-auto-instrumentation-python/opentelemetry/instrumentation/auto_instrumentation", "/otel-auto-instrumentation-python"),
},
{
Name: "OTEL_EXPORTER_OTLP_PROTOCOL",
Value: "http/protobuf",
},
{
Name: "OTEL_TRACES_EXPORTER",
Value: "otlp",
},
{
Name: "OTEL_METRICS_EXPORTER",
Value: "otlp",
},
{
Name: "OTEL_LOGS_EXPORTER",
Value: "otlp",
},
},
},
},
},
},
err: nil,
},
{
name: "platform not supported",
Python: v1alpha1.Python{Image: "foo/bar:1"},
pod: corev1.Pod{
Spec: corev1.PodSpec{
Containers: []corev1.Container{
{},
},
},
},
platform: "not-supported",
expected: corev1.Pod{
Spec: corev1.PodSpec{
Containers: []corev1.Container{
{},
},
},
},
err: fmt.Errorf("provided instrumentation.opentelemetry.io/otel-python-platform annotation value 'not-supported' is not supported"),
},
}

for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
- pod, err := injectPythonSDK(test.Python, test.pod, 0)
+ pod, err := injectPythonSDK(test.Python, test.pod, 0, test.platform)
assert.Equal(t, test.expected, pod)
assert.Equal(t, test.err, err)
})
2 changes: 1 addition & 1 deletion pkg/instrumentation/sdk.go
@@ -105,7 +105,7 @@ func (i *sdkInjector) inject(ctx context.Context, insts languageInstrumentations

for _, container := range insts.Python.Containers {
index := getContainerIndex(container, pod)
- pod, err = injectPythonSDK(otelinst.Spec.Python, pod, index)
+ pod, err = injectPythonSDK(otelinst.Spec.Python, pod, index, insts.Python.AdditionalAnnotations[annotationPythonPlatform])
if err != nil {
i.logger.Info("Skipping Python SDK injection", "reason", err.Error(), "container", pod.Spec.Containers[index].Name)
} else {
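The sdk.go change reads the platform by indexing `insts.Python.AdditionalAnnotations` directly. A short sketch of why that is safe in Go, even when the map is nil or the key is absent: the lookup yields the empty string, which the switch in `injectPythonSDK` treats as glibc. The helper `platformFromAnnotations` is hypothetical and exists only for illustration; the annotation key is the one added in annotation.go.

```go
package main

import "fmt"

// Annotation key copied from annotation.go in this PR.
const annotationPythonPlatform = "instrumentation.opentelemetry.io/otel-python-platform"

// platformFromAnnotations is a hypothetical helper showing the lookup done in
// sdk.go. Indexing a nil map, or a map without the key, returns the zero
// value for string, which is "".
func platformFromAnnotations(additional map[string]string) string {
	return additional[annotationPythonPlatform]
}

func main() {
	var unset map[string]string // nil map: no annotations were collected
	fmt.Printf("unset: %q\n", platformFromAnnotations(unset))

	musl := map[string]string{annotationPythonPlatform: "musl"}
	fmt.Printf("musl:  %q\n", platformFromAnnotations(musl))
}
```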
@@ -0,0 +1,22 @@
apiVersion: opentelemetry.io/v1alpha1
Contributor Author

@xrmx xrmx Oct 8, 2024

Adding specific e2e tests is not needed because the e2e-test-app-python Docker image is already based on Alpine, right? It looks like tests are failing on main.

Contributor

> looks like tests are failing on main

Can you elaborate on this a bit more?

> Adding specific e2e tests is not needed because the e2e-test-app-python docker image is already based on alpine right?

Maybe this is where I went wrong. Our idea is to add, at some point, verifications that check whether the libraries were injected properly and are emitting data.

Contributor Author

@xrmx xrmx Oct 10, 2024

At the moment the e2e-test-app-python image is based on Alpine, but the Python instrumentation image is glibc based. This is a problem because binary extensions are not portable between different C libraries (among other incompatibilities). So this PR builds the instrumentation for both C libraries and copies the musl or the glibc build depending on the configuration.

An example of a failure in CI is this:
https://github.com/open-telemetry/opentelemetry-operator/actions/runs/11237912151/job/31241432422?pr=3330#step:8:1330

There, I guess, the metrics thread kicks in and the system metrics package fails to load the psutil binary module because it was built against glibc and not musl.
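For readers unsure which value the platform annotation should take for a given image, here is a rough heuristic sketch. It is not part of this PR, and the operator does not auto-detect the C library. It assumes that musl based images such as Alpine ship their dynamic loader as `/lib/ld-musl-<arch>.so.1` and treats everything else as glibc, which is an assumption that will not hold for every image. `libcFromLoaderNames` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// libcFromLoaderNames guesses the C library from the file names found in /lib.
// Heuristic only: a file named ld-musl-<arch>.so.1 is taken as a sign of musl;
// anything else is assumed to be glibc.
func libcFromLoaderNames(names []string) string {
	for _, n := range names {
		if strings.HasPrefix(n, "ld-musl-") && strings.HasSuffix(n, ".so.1") {
			return "musl"
		}
	}
	return "glibc"
}

func main() {
	// Run this inside the application image to inspect its own /lib.
	entries, err := os.ReadDir("/lib")
	if err != nil {
		fmt.Println("cannot read /lib:", err)
		return
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	fmt.Println("suggested otel-python-platform:", libcFromLoaderNames(names))
}
```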

Contributor Author

@xrmx xrmx Oct 10, 2024

BTW, the other thing that should be kept in sync is the Python version of the two images, because the ABI changes between Python versions.
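A small sketch of the kind of check this implies, assuming the extension modules are not built against CPython's stable ABI and are therefore tied to a major.minor release. `sameMinorVersion` and the version strings are hypothetical and not part of the operator.

```go
package main

import (
	"fmt"
	"strings"
)

// sameMinorVersion reports whether two Python version strings such as
// "3.11.9" and "3.11.2" share the same major.minor pair. Strings without
// at least a major and a minor component never match.
func sameMinorVersion(a, b string) bool {
	pa := strings.SplitN(a, ".", 3)
	pb := strings.SplitN(b, ".", 3)
	if len(pa) < 2 || len(pb) < 2 {
		return false
	}
	return pa[0] == pb[0] && pa[1] == pb[1]
}

func main() {
	// Hypothetical versions of the application and instrumentation images.
	app, instr := "3.11.9", "3.12.1"
	if !sameMinorVersion(app, instr) {
		fmt.Printf("Python %s and %s differ in major.minor; binary extensions may fail to load\n", app, instr)
	}
}
```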

Contributor

> At the moment the e2e-test-app-python is based on alpine but the python instrumentation image is glibc based. This is a problem because binary extensions are not portable between different C libraries (among other incompatibilities). So this PR builds them and copies one for musl or glibc depending on the configuration.

I didn't notice this when I added the images. As mentioned in the previous comment, the idea is to add real E2E tests that check whether the instrumentation is generating real data. Since we are not checking this, issues like the one you saw can happen. This is something we need to fix with that image.

You can reuse that image (since it is musl based) for your E2E test. We need to add a new one for glibc.

Contributor Author

> You can reuse that image (since it is musl based) for your E2E test. We need to add a new one for glibc.

I'm already using that image in the musl e2e 👍

kind: OpenTelemetryCollector
metadata:
  name: sidecar
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:

    exporters:
      debug:

    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [debug]
  mode: sidecar
@@ -0,0 +1,30 @@
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: python-musl
spec:
  env:
    - name: OTEL_EXPORTER_OTLP_TIMEOUT
      value: "20"
    - name: OTEL_TRACES_SAMPLER
      value: parentbased_traceidratio
    - name: OTEL_TRACES_SAMPLER_ARG
      value: "0.85"
    - name: SPLUNK_TRACE_RESPONSE_HEADER_ENABLED
      value: "true"
  exporter:
    endpoint: http://localhost:4317
  propagators:
    - jaeger
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "0.25"
  python:
    env:
      - name: OTEL_LOG_LEVEL
        value: "debug"
      - name: OTEL_TRACES_EXPORTER
        value: otlp
      - name: OTEL_EXPORTER_OTLP_ENDPOINT
        value: http://localhost:4318