feat: OpenTelemetry module integration #9062

Merged Mar 22, 2023 (31 commits)

@esigo
Member

esigo commented Sep 18, 2022

What this PR does / why we need it:

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • CVE Report (Scanner found CVE and adding report)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation only

Which issue/s this PR fixes

#9016 step 2

How Has This Been Tested?

build controller image (in root folder of ingress-nginx repo):

  make build
  make image

use the new image in values.yaml:

  registry: gcr.io
  image: k8s-staging-ingress-nginx/controller

add the following to the controller-configmap.yaml:

  enable-opentelemetry: "true"
  opentelemetry-config: "/etc/nginx/opentelemetry.toml"
  opentelemetry-operation-name: "HTTP $request_method $service_name $uri"
  opentelemetry-trust-incoming-span: "true"
  otlp-collector-host: "otel-coll-collector.otel.svc"
  # otlp-collector-host: "tempo.observability.svc"
  otlp-collector-port: "4317"
  otel-max-queuesize: "2048"
  otel-schedule-delay-millis: "5000"
  otel-max-export-batch-size: "512"
  otel-service-name: "nginx-proxy" # Opentelemetry resource name
  otel-sampler: "AlwaysOn" # Also: AlwaysOff, TraceIdRatioBased
  otel-sampler-ratio: "1.0"
  otel-sampler-parent-based: "false"
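
As a rough sketch of where these keys live: they are flat string entries under `data` in the controller ConfigMap. The ConfigMap name and namespace below are assumptions (common defaults for a Helm install), and the sampler values are changed from the ones above purely to illustrate ratio-based sampling.

```yaml
# Sketch only: metadata.name and metadata.namespace are assumed, adjust to your install.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  enable-opentelemetry: "true"
  otlp-collector-host: "otel-coll-collector.otel.svc"
  otlp-collector-port: "4317"
  otel-service-name: "nginx-proxy"
  # Illustrative alternative to AlwaysOn: keep roughly 10% of traces.
  otel-sampler: "TraceIdRatioBased"
  otel-sampler-ratio: "0.1"
  # "true" would make the sampler follow the parent span's sampling decision.
  otel-sampler-parent-based: "true"
```

All values are quoted because ConfigMap data entries must be strings.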

and modify opentelemetry in values.yaml:

  opentelemetry:
    enabled: true
    image: registry.k8s.io/ingress-nginx/opentelemetry:v20230107-helm-chart-4.4.2-2-g96b3d2165@sha256:331b9bebd6acfcd2d3048abbdd86555f5be76b7e3d0b5af4300b04235c6056c9
    containerSecurityContext:
      allowPrivilegeEscalation: false
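
For orientation, a sketch of how the two values.yaml changes above might sit together. It assumes the ingress-nginx Helm chart layout, where both the image override and the opentelemetry block are nested under the top-level `controller` key. The tag is a placeholder, and clearing the digest is an assumption so that a locally built image is not rejected by a digest mismatch.

```yaml
# values.yaml (sketch; nesting under `controller` is an assumption about the chart layout)
controller:
  image:
    registry: gcr.io
    image: k8s-staging-ingress-nginx/controller
    tag: "REPLACE_WITH_LOCAL_TAG"  # placeholder: the tag produced by `make image`
    digest: ""                     # assumption: cleared so the local build is used
  opentelemetry:
    enabled: true
    image: registry.k8s.io/ingress-nginx/opentelemetry:v20230107-helm-chart-4.4.2-2-g96b3d2165@sha256:331b9bebd6acfcd2d3048abbdd86555f5be76b7e3d0b5af4300b04235c6056c9
    containerSecurityContext:
      allowPrivilegeEscalation: false
```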

follow the instructions in the example app:

deploy demo app:

make images
make deploy-app

deploy otel collector, Grafana, Tempo and Jaeger all-in-one:

make helm-repo
make observability

test:

kubectl port-forward --namespace=ingress-nginx service/ingress-nginx-controller 8090:80
bash test.sh

In the example, the telemetry data is sent to a collector first. The collector then exports the traces to a Jaeger backend and also to Tempo, which is used as a data source for Grafana:

graph TD
subgraph Node
   nginx
end
nginx["nginx module"] --> |otlp-gRPC| OTEL-collector["Otel Collector"] --> |jaeger| backend["Jaeger"]
OTEL-collector["Otel Collector"] --> |otlp-gRPC| tempo["Tempo"] --> grafana["grafana"]
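
The fan-out in the graph above could look roughly like this on the collector side. This is a minimal sketch, not the configuration shipped with the example: the Jaeger service name is hypothetical, the Tempo host is taken from the commented-out ConfigMap line, and the ports are the usual defaults (4317 for OTLP gRPC, 14250 for Jaeger gRPC). It also assumes a Collector build that still includes the dedicated `jaeger` exporter; newer builds may need an `otlp` exporter pointed at Jaeger instead.

```yaml
# OpenTelemetry Collector config (sketch; service names are assumptions)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # nginx module sends otlp-gRPC here

processors:
  batch: {}

exporters:
  jaeger:
    endpoint: jaeger-collector.observability.svc:14250  # hypothetical service name
    tls:
      insecure: true
  otlp:
    endpoint: tempo.observability.svc:4317              # port assumed
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger, otlp]  # same traces go to both backends
```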

Alternatively, one can deploy Tempo, which accepts otlp-gRPC, and send the traces directly to it:

graph TD
subgraph Node
   nginx
end
nginx["nginx module"] --> |otlp-gRPC| tempo["Tempo"] --> grafana["grafana"]
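
In ConfigMap terms, the direct variant amounts to pointing the collector host at Tempo, as hinted by the commented-out line in the configuration above. A sketch, assuming Tempo's OTLP gRPC receiver is enabled and listening on 4317:

```yaml
# controller ConfigMap `data` fragment (sketch)
otlp-collector-host: "tempo.observability.svc"  # replaces otel-coll-collector.otel.svc
otlp-collector-port: "4317"                     # assumed Tempo OTLP gRPC port
```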

Jaeger: [image]

Grafana: [image]

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I've read the CONTRIBUTION guide
  • I have added unit and/or e2e tests to cover my changes.
  • All new and existing tests passed.
  • Added Release Notes.

Does my pull request need a release note?

Any user-visible or operator-visible change qualifies for a release note. This could be a:

  • CLI change
  • API change
  • UI change
  • configuration schema change
  • behavioral change
  • change in non-functional attributes such as efficiency or availability, availability of a new platform
  • a warning about a deprecation
  • fix of a previous Known Issue
  • fix of a vulnerability (CVE)

No release notes are required for changes to the following:

  • Tests
  • Build infrastructure
  • Fixes for unreleased bugs

For more tips on writing good release notes, check out the Release Notes Handbook

feature: OpenTelemetry module e2e integration

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority labels Sep 18, 2022
@k8s-ci-robot
Contributor

Hi @esigo. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Sep 18, 2022
@esigo esigo marked this pull request as ready for review September 30, 2022 14:54
@k8s-ci-robot k8s-ci-robot requested a review from rikatz September 30, 2022 14:54
@esigo esigo changed the title [WIP] feat: OpenTelemetry module integration feat: OpenTelemetry module integration Sep 30, 2022
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 30, 2022
@esigo
Member Author

esigo commented Oct 1, 2022

/kind feature

@Dipenbhatt03

Dipenbhatt03 commented Feb 20, 2023

Hey @esigo, this is great work.
I was tracing ingress-nginx with OpenTracing and stumbled upon this guide: Instrument nginx with OpenTelemetry. Following it, I could see tracing done at the nginx module level, but following the steps you have mentioned, I get a single span at nginx (no internal nginx module tracing info).

Am I doing something wrong, or are the guide I mentioned and the work you are doing completely different? Or is there any way of enabling module-level tracing as well?

@esigo
Member Author

esigo commented Feb 22, 2023

Hi @Dipenbhatt03,
We are using the other implementation. You probably need to check the documentation in #9144. I didn't quite get the question; you would see spans for the others only if they are instrumented too.

@csepulveda

Hello @esigo, do you know when this could be merged and released?
Regards!!!

@esigo
Member Author

esigo commented Mar 7, 2023

> Hello @esigo, do you know when this could be merged and released? Regards!!!

Hi @csepulveda, I'll demo this in the next SIG meeting (next Thursday). It should be merged soon after that :)

@esigo
Member Author

esigo commented Mar 19, 2023

status:
I demoed this in the last SIG meeting (16.03.2023). Expected to be released with 1.7.0.

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 21, 2023
@strongjz
Member

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 21, 2023
@k8s-ci-robot k8s-ci-robot added area/docs needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 21, 2023
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 21, 2023
@esigo
Member Author

esigo commented Mar 21, 2023

@strongjz I merged the documentation from #9144 (after addressing the review comments there).

@strongjz
Member

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Mar 22, 2023
@strongjz
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 22, 2023
@strongjz
Member

/unhold
/lgtm
/approve

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 22, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: esigo, strongjz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/docs cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. triage/accepted Indicates an issue or PR is ready to be actively worked on.