OpenTelemetry Proposal: Introduce semantic conventions for CI/CD observability #223

horovits · 2023-01-04T09:52:16Z

This PR is a new OTEP for CI/CD Observability.

Initial OTEP definition for CI/CD Observability

some typos and formatting

…P.md per pull request id open-telemetry#223

horovits · 2023-02-16T13:44:00Z

I brought this up as a topic at the project discussion at KubeCon NA and got good feedback there. I have now formalized the proposal as an OTEP and am curious for your thoughts. @alolita @dyladan and all

secustor

What I'm missing from this proposal is the current state of other CI/CD integrations which already support OTEL.

secustor · 2023-03-09T12:45:47Z

text/0223-cicd-observability-OTEP.md

+- Which receivers are needed beyond OTLP to support the use cases and workflows?
+- Which exporters are needed to support common backends?
+- Which processors are needed to support the defined workflows?
+


Suggested change

- How should the trace context be propagated?

There are tools which already implement some basic context propagation, such as the mentioned Jenkins plugin or OTEL-CLI, which use environment variables for that use case.

I don't have the exhaustive list of tool integrations, and I'm glad to collect it with other contributors here.
However, I think the path here is analyzing the needed semantic conventions.
In this context, I should flag a subsequent PR opened to propose semantic conventions for deployments:
open-telemetry/opentelemetry-specification#3169
@thisthat how do you see the alignment of these proposals?

Sorry for the late answer @horovits
I'd like to align the two proposals 👍 My PR focuses only on what attributes we should emit on a trace/log that describes a CI/CD pipeline. I think as part of this OTEP we should address @secustor's point and try to agree on how the trace context can be propagated. This way, different tools can be used together, each one contributing spans to the trace, so there won't be blind spots.

@horovits I agree that shared semantics will help. For a tool to propagate context, it needs to know where to look for it in any incoming trigger or data/policy that is used to decide to perform an action. It also needs to know how to enrich the context and where to send it to propagate it further. Shared semantics definitely help here, which is very much aligned with the mission of CDEvents.

CDEvents has a simple model for deployment-related events today - it would be great to have shared semantics here as well. /cc @AloisReitbauer I think this would be of interest to the App Delivery Tag as well.

@thschue FYI ^^

I would like to retrigger this discussion as I'm facing some decisions which will have an effect on how we are tackling things.

I see 3 basic ways we should support to propagate context.

HTTP headers

Basically what is already in place: the W3C Trace Context standard.
The use case here would be systems that get triggered by a webhook, e.g. VCS to CI system.

Environment variables

I'm imagining here one process handing over context to another. e.g. a pipeline run handing over context to a tool used inside of the run.

Other

Other ways are possible too, though I would add them to the specification and make them optional, e.g. CLI parameters or files containing the necessary information.
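A minimal sketch of the environment-variable hand-over described above, using only the Python standard library. The `traceparent` value format follows the W3C Trace Context standard; the environment variable name `TRACEPARENT` and all function names here are assumptions for illustration and are not defined by this thread or by any specification referenced in it.

```python
import os
import re
import secrets

# W3C Trace Context "traceparent": version-traceid-parentid-flags (lowercase hex).
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)

# Assumption: the variable name is illustrative, not taken from a spec.
ENV_VAR = "TRACEPARENT"


def new_traceparent(sampled=True):
    """Create a fresh traceparent value for the root span of a pipeline run."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(value):
    """Return the parsed context as a dict, or None if the value is invalid."""
    match = _TRACEPARENT_RE.match(value.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version "ff" and all-zero ids are invalid per the W3C format.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_span_id": span_id,
        "sampled": bool(int(flags, 16) & 1),
    }


def child_env(traceparent, base_env=None):
    """Build the environment for a child process (e.g. a tool run by a step)."""
    env = dict(os.environ if base_env is None else base_env)
    env[ENV_VAR] = traceparent
    return env


def context_from_env(env=None):
    """Read the parent context inside the child process, if one was handed over."""
    env = os.environ if env is None else env
    raw = env.get(ENV_VAR)
    return parse_traceparent(raw) if raw else None
```

A pipeline runner could pass `child_env(...)` as the `env` argument of `subprocess.run`, and the tool inside the step would call `context_from_env()` to parent its own spans under the pipeline trace.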

@secustor - not sure if you've seen it but there's another OTEP that is specifically related to context propagation at the environment levels that's in progress. open-telemetry/opentelemetry-specification#740

@deejgregor has a working branch for that addition as well, which was brought up on today's SIG alongside this OTEP.

bdarfler · 2023-05-08T20:18:10Z

What I'm missing from this proposal is the current state of other CI/CD integrations which already support OTEL.

This is one player in the space https://github.com/inception-health/otel-export-trace-action

reyang · 2023-05-09T16:57:24Z

@horovits I have a clarification question regarding the goal here, is this about "OpenTelemetry should define some semantic convention for CI/CD observability" or "CI/CD should use OpenTelemetry"? The latter sounds like a good issue/proposal for CI/CD systems rather than OpenTelemetry.

horovits · 2023-05-10T17:48:38Z

in the context of an OpenTelemetry extension proposal OTEP, the point is of course to extend OTel to support CI/CD use cases. I believe that once there's an open specification in place, tool vendors/projects will follow and adopt it.
HTH clarifying.

reyang · 2023-05-10T17:51:56Z

Sorry, this is still not clear to me - what exactly does "extend OTel to support CI/CD use cases" mean? What is needed/missing (e.g. do we need extra API? do we need specific semantic conventions) from OTel?

horovits · 2023-05-10T18:04:40Z

Sorry, I might have missed your question, but I tried to elaborate on that in the 'Internal details' section of the proposal:
"OpenTelemetry specification should be enhanced to cover semantics relevant to pipelines, such as the branch, build, step (ID, duration, status), commit SHA (or other UUID), run (type, status, duration). In addition, distribution execution mechanism also introduces various entities, such as nodes, queues, jobs and executors (using the Jenkins terms, other tools having respective equivalents, which the specification should abstract with the semantic convention)."

Can you elaborate on what you find missing in the above, so I can try and answer?
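To make the quoted list of semantics more concrete, here is a rough sketch of how a run and its steps might be flattened into span attributes. Every attribute key below (`cicd.run.id`, `cicd.step.status`, and so on) and every class or function name is hypothetical and chosen only for illustration; none of them is proposed by the OTEP text or defined by OpenTelemetry in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PipelineStep:
    step_id: str
    status: str        # e.g. "success" or "failure"
    duration_s: float


@dataclass(frozen=True)
class PipelineRun:
    run_id: str
    run_type: str      # e.g. "push" or "scheduled"
    status: str
    duration_s: float
    branch: str
    commit_sha: str
    build_number: Optional[str] = None


def run_attributes(run):
    """Flatten a run into attribute key/value pairs (keys are hypothetical)."""
    attrs = {
        "cicd.run.id": run.run_id,
        "cicd.run.type": run.run_type,
        "cicd.run.status": run.status,
        "cicd.run.duration_s": run.duration_s,
        "cicd.vcs.branch": run.branch,
        "cicd.vcs.commit_sha": run.commit_sha,
    }
    if run.build_number is not None:
        attrs["cicd.build.number"] = run.build_number
    return attrs


def step_attributes(run, step):
    """Attributes for one step; the run id ties the step back to its run."""
    return {
        "cicd.run.id": run.run_id,
        "cicd.step.id": step.step_id,
        "cicd.step.status": step.status,
        "cicd.step.duration_s": step.duration_s,
    }
```

The point of the sketch is only that the same keys would be emitted regardless of which CI/CD tool produced the run, which is what a shared convention would buy.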

reyang · 2023-05-10T18:16:28Z

Now I understand, thanks! I was trying to see how the TC could help, since this PR seems to be stuck / not receiving much attention.

I personally would suggest:

Change the PR title to "Introduce semantic conventions for CI/CD observability".
Socialize with the semantic conventions working group https://github.com/open-telemetry/community#specification-sigs.
Raise awareness during the weekly spec SIG meeting.

kuisathaverat · 2023-05-17T10:40:36Z

text/0223-cicd-observability-OTEP.md

+
+OpenSearch dashboard for monitoring Jenkins pipelines:
+![OpenSearch dashboard for monitoring Jenkins pipelines](https://dytvr9ot2sszz.cloudfront.net/wp-content/uploads/2022/05/image7.png)
+


It is also worth mentioning the Elastic Stack as the primary integration of the Jenkins OpenTelemetry plugin.

Suggested change

Elastic Stack dashboard for monitoring Jenkins pipelines:

![Elastic Stack dashboard for monitoring Jenkins pipelines](https://raw.githubusercontent.com/jenkinsci/opentelemetry-plugin/master/docs/images/kibana_jenkins_overview_dashboard.png)

kuisathaverat · 2023-05-17T10:58:49Z

text/0223-cicd-observability-OTEP.md

+OpenSearch dashboard for monitoring Jenkins pipelines:
+![OpenSearch dashboard for monitoring Jenkins pipelines](https://dytvr9ot2sszz.cloudfront.net/wp-content/uploads/2022/05/image7.png)
+
+For more examples, see [this article](https://logz.io/learn/cicd-observability-jenkins/) on CI/CD observability using currently available open source tools.


Could we add more articles and presentations here? There have been a bunch in the last three years.

Improve your software delivery with CI/CD observability and OpenTelemetry

DevOpsWorld 2021 - Embracing Observability in Jenkins with OpenTelemetry

DevOpsWorld 2021 - Who Observes the Watchers? An Observability Journey

Embracing Observability in CI/CD with OpenTelemetry

FOSDEM 2022 - OpenTelemetry and CI/CD

cdCon Austin 2022 - Making your CI/CD Pipelines Speaking in Tongues with OpenTelemetry

Observability Guide - Elastic Stack 8.7

kuisathaverat · 2023-05-17T11:13:32Z

text/0223-cicd-observability-OTEP.md

+OpenTelemetry instrumentation should then support in collecting and emitting the new data. 
+
+OpenTelemetry Collector can then offer designated processors for these payloads, as well as new exporters for designated backend analytics tools, as such prove useful for release engineering needs beyond existing ecosystem.   
+


Not only can the CI/CD system send OpenTelemetry data; the tools used as part of the CI/CD pipeline can also send their own OpenTelemetry data. Distributed tracing allows combining all these spans in a single stream of spans. This gives you more fine-grained details about your pipeline and process.

These are some of the tools that, when integrated into your pipelines, give you more details:

Maven OpenTelemetry extension

Ansible - create distributed traces with OpenTelemetry

Python - pytest-otel plugin for reporting APM traces of tests executed

Otel-cli - a command-line tool for sending OpenTelemetry traces

junit2otlp - sends JUnit metrics to a back end using OpenTelemetry

kuisathaverat · 2023-05-17T11:28:09Z

text/0223-cicd-observability-OTEP.md

+- Which entity model should be supported to best represent CI/CD domain and pipelines?
+- What are the common CI/CD workflows we aim to support? 
+- What are the primary tools that should be supported with instrumentation in order to gain critical mass on CI/CD coverage?
+- Is CDEvents a good fit of a specification to integrate with? what is the aligmment, overlap and gaps? and if so, how to establish the cross-foundation and cross-group collaboration in an effective manner?


OpenTelemetry is more general purpose. It has spans, metrics, and logs. Overall you can replace CDEvents with OpenTelemetry but not the other way around.

Thanks @horovits for this proposal and for mentioning CDEvents here!

CDEvents aims to define shared semantics for interoperability in the CI/CD space. Interoperability includes the observability space too - common semantics in the events generated by the different tools enable things like visualization, metrics and more across tools. The transport layer we use today for such events is CloudEvents; the specification is, however, decoupled from the underlying transport by design, and we were planning indeed to reach out to this community to discuss collaboration.

So, I definitely agree we should collaborate, I think it would be really valuable for the ecosystem.
Many tools are adopting OpenTelemetry and many are adopting CDEvents (or both) and having common semantics would be very beneficial.

I'd be happy to join one of the open telemetry community meetings to present CDEvents, if that is helpful.

/cc @e-backmark-ericsson

CDEvents could be attached to spans as span events, thus giving the best of both worlds.
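A rough sketch of that idea, using stand-in classes rather than the OpenTelemetry SDK so it stays self-contained. The `Span` and `SpanEvent` classes, the `cdevents.` attribute prefix, and the event shape (a `context` with a `type` plus a `subject`) are assumptions for illustration; check the CDEvents specification for the actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class SpanEvent:
    name: str
    attributes: dict


@dataclass
class Span:
    """Stand-in for a tracing span; only the event list matters here."""
    name: str
    events: list = field(default_factory=list)

    def add_event(self, name, attributes=None):
        self.events.append(SpanEvent(name, dict(attributes or {})))


def flatten(prefix, value, out):
    """Flatten nested dicts into dotted keys with primitive values."""
    if isinstance(value, dict):
        for key, item in value.items():
            flatten(f"{prefix}.{key}", item, out)
    elif isinstance(value, (str, bool, int, float)):
        out[prefix] = value
    else:
        # Anything else (lists, None, ...) is stored as its string form.
        out[prefix] = str(value)
    return out


def attach_cdevent(span, cdevent):
    """Record a CDEvent-shaped dict as one span event named after its type."""
    attributes = flatten("cdevents", cdevent, {})
    span.add_event(cdevent["context"]["type"], attributes)
```

With this shape the span keeps its timing and parent/child structure, while the event payload carries the CDEvents semantics as attributes.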

kuisathaverat · 2023-05-17T11:32:01Z

text/0223-cicd-observability-OTEP.md

+## Open questions
+
+Open questions include:
+- Which entity model should be supported to best represent CI/CD domain and pipelines?


There is some work done in the Jenkins OpenTelemetry plugin to try to have a general model for naming conventions

Also in the junit2otlp for the testing naming conventions

kuisathaverat · 2023-05-17T11:37:19Z

text/0223-cicd-observability-OTEP.md

+
+CI/CD tools today emit various telemetry data, whether logs, metrics or trace data to report on the release pipeline state, to help pinpoint flakyness, and accelerate root cause analysis of failures, whether stemming from the application code, a configuration, or from the CI/CD environment. However, these tools do not follow any particular standard, specification, or semantic conventions. This makes it hard to use observability tools for monitoring these pipelines. Some of these tools provide some observability visualization and analytics capabilities out of the box, but in addition to the tight coupling the offered capabilities are oftentime not enough, especially when one wishes to monitor aggregated information across different tools and different stages of the release process.
+
+Some tools have started adopting OpenTelemetry, which is an important step in creating standardization. A good example is [Jenkins](https://github.com/jenkinsci/jenkins), a popular CI OSS project, which offers the [Jenkins OpenTelemetry plugin](https://plugins.jenkins.io/opentelemetry/) for emitting telemetry data in order to:


GitHub Actions also supports sending OpenTelemetry data using the action otel-export-trace-action.

kuisathaverat · 2023-05-17T11:46:40Z

text/0223-cicd-observability-OTEP.md

+
+Open questions include:
+- Which entity model should be supported to best represent CI/CD domain and pipelines?
+- What are the common CI/CD workflows we aim to support? 


The major CI/CD tools

Jenkins

GitLab

GitHub Actions

CircleCI

TeamCity

ArgoCD

TravisCI

Azure DevOps

...

And build systems/tools:

Maven (Java)

Gradle (Java)

Npm/yarn (Node.js)

Mage (Go)

CMake (C/C++)

Make

...

Test frameworks:

Junit (Java, C, C++, ...)

Pytest (Python)

Jest (Node.js)

...

Deploy/devOps tools

Ansible

Terraform

Puppet

Chef

...

While it isn't mentioned here, I am happy to advocate this internally to be added to Atlassian's Bitbucket Pipelines.

good listing, thanks for putting it up.
I would first focus on CI/CD tools. build and test frameworks etc. can come as a separate phase, as it brings in new domains.

MovieStoreGuy

From my understanding of this OTEP, it feels like there are two areas of concern:

Defining a semantic convention on how to represent build status
Understanding how long changes take from inception to production

Please let me know if that isn't correct.

The reason I think these should be decoupled is that the VCS system and the CI/CD system should be treated as separate entities: we could further expand the insights we gain from a VCS system without forcing the implementation details onto the CI/CD system. For example, you wouldn't want your CI/CD system to track PR size, contributors, and time to merge. While it can, it would mean each CI/CD system needs to support interactions with each VCS, which can be vastly different. The same can be said vice versa.

I think we should drop the references to the collector and instrumentation, since I would consider them out of scope: they sidestep the conversation about standardising the actual data being sent from these systems. At least, instrumentation should be moved to its own section.

MovieStoreGuy · 2023-05-23T07:23:34Z

text/0223-cicd-observability-OTEP.md

+
+Building CI/CD observability involves four stages: Collect → Store → Visualize → Alert. OpenTelemetry provides a unified way for the first step, namely collecting and ingesting the telemetry data in an open and uniform manner. 
+
+If you are a CI/CD tool builder, the specification and instrumentation will enable you to properly structure your telemetry, package and emit it over OTLP. OpenTelemetry specification will determine which data to collect, the semantic convention of the data, and how different signal types can be correlated based on that, to support downstream analytics of that data by various tools.


I think there is another potential spec here considering a lot of CI/CD is a combination of tooling in a repeatable fashion.

For example if you're able to generate a trace of your pipeline, the existing tooling is unaware of what is being used as your trace context. This means that your tools could be generating their own trace contexts that are disjointed to the CI/CD trace context, making it extremely difficult to combine those two together.

VCS and the use cases you mention such as "track PR size, contributors, and time to merge" are not in the scope of this OTEP. this proposal is about the release pipelines. it can carry metadata such as the build number being deployed and perhaps the commit SHA, but not getting into the internals of the build process. monitoring VCS may be a valid use case for a separate proposal.

MovieStoreGuy · 2023-05-23T07:42:02Z

text/0223-cicd-observability-OTEP.md

+## Internal details
+
+OpenTelemetry specification should be enhanced to cover semantics relevant to pipelines, such as the branch, build, step (ID, duration, status), commit SHA (or other UUID), run (type, status, duration). These should be geared for observability into issues in the released application code. 
+In addition, oftentimes release issues are not code-based but rather environmental, stemming from issues in the build machines, garbage collection issues of the tool or even a malstructured pipeline step. In order to provide observability into CI/CD environment, especially one with distributed execution mechanism, there's need to monitor various entities such as nodes, queues, jobs and executors (using the Jenkins terms, other tools having respective equivalents, which the specification should abstract with the semantic convention).


This feels like another area that is conflating what the goal of this OTEP is.

Internal operations of Jenkins (or any build system) should be separated out into their own section.

If it is within your control (ie self hosted runners), you can use the otel collector and auto instrumentation agents (where possible and not provided by the vendor) on build nodes to surface this information.

many times pipeline runs fail due to environmental issues rather than ones related to the deployed code.
I see it as a core value that CI/CD observability brings, to discern these two cases.
and while you can still use OTEL to monitor your individual tools, the purpose of this OTEP is to standardize on this.
same as we've done with client side instrumentation WG.

Indeed issues with the environment in a distributed build system can be very difficult to track down even if both CI and nodes are instrumented to emit OTEL data.

For example, CI could use https://plugins.jenkins.io/opentelemetry/ to send traces of the job executions and build agents could use https://github.com/prometheus/node_exporter and/or https://github.com/open-telemetry/opentelemetry-collector to export host metrics. Even in this scenario it can be difficult to correlate a failing pipeline run with the metrics of the host(s) that executed the build.

The plugin https://plugins.jenkins.io/opentelemetry-agent-metrics/ builds on https://plugins.jenkins.io/opentelemetry/ to solve this issue by running dedicated otel collectors on each build agent and adding attributes to the metrics identifying the run and main CI controller (https://github.com/jenkinsci/opentelemetry-agent-metrics-plugin/blob/main/src/main/resources/io/jenkins/plugins/onmonit/otel.yaml.tmpl)
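The correlation idea can be sketched in a few lines: stamp each host metric data point with attributes identifying the run and the controller, then filter by run when investigating a failure. This is not the plugin's implementation; the attribute keys `ci.run.id` and `ci.controller` and both function names are hypothetical.

```python
def tag_metrics(points, run_id, controller):
    """Return copies of the data points with run-identifying attributes added."""
    tagged = []
    for point in points:
        attributes = dict(point.get("attributes", {}))
        attributes["ci.run.id"] = run_id          # hypothetical key
        attributes["ci.controller"] = controller  # hypothetical key
        tagged.append({**point, "attributes": attributes})
    return tagged


def metrics_for_run(points, run_id):
    """Select the host metrics recorded while a given run was executing."""
    return [
        point
        for point in points
        if point.get("attributes", {}).get("ci.run.id") == run_id
    ]
```

Once every data point carries the run identifier, a failing run can be joined to the CPU, memory, or disk metrics of the agent that executed it without guessing from timestamps alone.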

MovieStoreGuy · 2023-05-23T07:48:37Z

text/0223-cicd-observability-OTEP.md

+
+The CDF (Continuous Delivery Foundation) has the Events Special Interest Group ([SIG Events](https://github.com/cdfoundation/sig-events)) which explores standardizing on CI/CD event to facilitate interoperability (it is a work-stream within the CDF SIG Interoperability.). The group is working on [CDEvents](https://cdevents.dev/), a standardized event protocol that caters for technology agnostic machine-to-machine communication in CI/CD systems. It makes sense to evaluate alignment between the standards.
+
+OpenTelemetry instrumentation should then support in collecting and emitting the new data. 


This is more a matter of wording, but it is not the instrumentation provided by OpenTelemetry that needs to be updated, but rather our definitions of the events that should be captured within this domain of interest.

Suggested change

OpenTelemetry instrumentation should then support in collecting and emitting the new data.

These defined events from the Continuous Delivery Foundation (CDF) should be merged into the semantic convention for these systems to implement.

MovieStoreGuy · 2023-05-23T07:50:36Z

text/0223-cicd-observability-OTEP.md

+
+OpenTelemetry instrumentation should then support in collecting and emitting the new data. 
+
+OpenTelemetry Collector can then offer designated processors for these payloads, as well as new exporters for designated backend analytics tools, as such prove useful for release engineering needs beyond existing ecosystem.   


To be honest, I would suggest moving this to a stretch goal, mostly because contributing components to the collector requires contributors (hopefully from that company) to help maintain that component. Gathering interest from vendors to develop and contribute components may take a bit of effort and I wouldn't want it to block this OTEP.

MovieStoreGuy · 2023-05-23T07:55:31Z

text/0223-cicd-observability-OTEP.md

+
+Open questions include:
+- Which entity model should be supported to best represent CI/CD domain and pipelines?
+- What are the common CI/CD workflows we aim to support? 


While it isn't mentioned here, I am happy to advocate this internally to be added to Atlassian's Bitbucket Pipelines.

MovieStoreGuy · 2023-05-23T07:56:30Z

text/0223-cicd-observability-OTEP.md

+Open questions include:
+- Which entity model should be supported to best represent CI/CD domain and pipelines?
+- What are the common CI/CD workflows we aim to support? 
+- What are the primary tools that should be supported with instrumentation in order to gain critical mass on CI/CD coverage?


This should be made its own area of interest, mostly because there is a lot of scope and it raises the question of how we would standardise tooling to work and link back to the build pipeline natively.

MovieStoreGuy · 2023-05-23T07:59:45Z

text/0223-cicd-observability-OTEP.md

+- Which receivers are needed beyond OTLP to support the use cases and workflows?
+- Which exporters are needed to support common backends?
+- Which processors are needed to support the defined workflows?


I would consider these out of scope, considering that we are not looking to bring in additional instrumentation from tooling and vendors, but rather to have an agreed-upon convention that each implements.

+1 to this one.

dsotirakis · 2023-05-29T09:45:19Z

👋 Hello all!
I am also putting this article How we reduced flaky tests using Grafana, Prometheus, Grafana Loki, and Drone CI
to show how we can avoid storing logs and metrics only to query them when the pipelines have finished using PromQL or LogQL, by using OTel.

joshgav · 2023-06-22T13:26:45Z

A similar effort focused on measuring deliveries in particular:

cc @thisthat @AloisReitbauer

adrielp · 2023-11-14T18:58:32Z

Looking to revive this OTEP. It looks to have been a while since there's been any traction (though I just found a new comment from a couple weeks ago). I'd like to know what needs to be done to get this moved forward and merged. I brought this up at the specification SIG meeting today, and the overall thought was to bring the discussion back here and potentially into the WG.

thisthat · 2023-11-23T11:47:17Z

Hey @adrielp, I am also interested in this OTEP and would like to help move this forward :)

adrielp · 2023-11-23T14:15:41Z

Awesome @thisthat! Per the last SIG WG meeting, I've started working on a project proposal to create a CI/CD Observability Sem conventions working group, focused on driving this OTEP as well as the Environment Variables as trace propagators OTEP (because it's necessary for distributed tracing in batch systems like CI/CD).

Part of the project proposal requires figuring out staffing needs and getting folks together for the working group, so definitely looking for folks there.

Also pulled up with @horovits yesterday and we'll be syncing up again on Monday about this OTEP in particular.

thisthat · 2023-11-23T15:35:55Z

Please, keep me posted @adrielp I am more than happy to join and help the WG :)

Elfo404 · 2023-11-23T17:26:45Z

@adrielp same here!

dsotirakis · 2023-11-24T08:18:14Z

@adrielp please include me as well, happy to be a part of it!

mhausenblas · 2023-11-24T08:48:24Z

Count me in!

afrittoli · 2023-11-24T09:56:34Z

Thanks Adriel, Count me in for the working group. I definitely hope we can collaborate with the CDEvents (https://cdevents.dev) project as well. Andrea

afrittoli · 2023-11-24T09:59:39Z

+## Open questions
+
+Open questions include:
+- Which entity model should be supported to best represent CI/CD domain and pipelines?
+- What are the common CI/CD workflows we aim to support?
+Tekton

+1 (I'm a bit biased as a Tekton maintainer :D)

On Tekton side we have a few relevant features:
- emit open telemetry metrics
- generate open telemetry traces for distributed tracing via annotations on Tekton resources
- emit CloudEvents as well as CDEvents (experimental)

So, we definitely care about observability in Tekton

Andrea

krzko · 2023-11-27T04:13:57Z

Would love to help with this, as I'm finalising the work on New component: Github Actions Receiver now and https://github.com/krzko/run-with-telemetry to provide telemetry for GitHub Actions.

horovits · 2023-11-27T08:20:28Z

@afrittoli yes we should carry on the discussion we started with the CDEvents as to potential collaboration between the projects. some concerns have been raised on your team last time, so we should map carefully if there's an overlap between OTel and CDEvents and the fit.

horovits · 2023-11-27T08:22:17Z

@krzko Congrats on completing the work on GitHub Actions! the insights of your experimentation will be valuable here.

horovits · 2023-11-28T12:33:43Z

we'd like to hold a call to share with everyone the work to formalize a working group, and to see who's interested in getting involved as we figure out staffing requirements.
I put a vote for the date on our CNCF slack channel cicd-o11y (if you're not yet there - do join)
https://cloud-native.slack.com/archives/C0598R66XAP/p1701100511547279
currently date is 7 Dec, 1pm CET / 7am ET
join us on slack and I'll update with details and link.

adrielp · 2023-11-29T04:20:01Z

The PR has been opened to create the CI/CD Observability Working Group

Without a doubt, it's rough, but it's ready to be read, commented on, and discussed in the upcoming meeting. Staffing of course is one of those hot topics. 😄 🚀

adrielp · 2024-01-10T21:23:43Z

@mhausenblas @afrittoli - I just realized coming back here that I missed your names in the working group PR. Sorry about that 😞

I've added you now, please feel free to check it out and make sure I got it right! Thanks! https://github.com/open-telemetry/community/pull/1822/files#diff-41c277076e06d5ea84d2e8bc9eded2bc97e7f0888502f4f8d691b6c5c3639e57

horovits · 2024-01-21T14:02:49Z

Important update: we got approval of the TC to establish the CI/CD Observability SIG.
The mandate of the new SIG will be to execute on the above OTEP.
See here on the SIG scope and approval:
open-telemetry/community#1822 (comment)

carlosalberto · 2024-02-12T21:30:24Z

@horovits Any chance to address/discuss/answer to the comments to the PR? I will do another full review once that is done.

adrielp · 2024-02-13T02:05:28Z

@carlosalberto @horovits - just to provide an update on this. We discussed this OTEP at the SemConv meeting last week. We actually might not need to proceed directly with this OTEP. Based on current direction, OTEPs are for SPEC changes, and right now our focus is the Semantic Conventions changes. We're reviewing the CDEvents work right now and coming up with a data model. Once we do that we'll be directly contributing to the Semantic Conventions through pull requests. If we need to make any future specification changes, we may leverage this OTEP or make new ones (that are smaller) to account for those changes.

But as of now, this OTEP isn't directly needed to be moved forward, just provide larger visibility to the efforts and context until there are spec changes. cc. @jsuereth

carlosalberto · 2024-02-16T01:44:03Z

Thanks for the follow up! Should we then close this OTEP? We can always find it later on if/as needed.

adrielp · 2024-02-22T17:16:13Z

@carlosalberto I'm fine with closing it if that's how y'all want to handle it. Based on the conversations, I think it makes logical sense given the direction we're headed. @horovits - any objections?

carlosalberto · 2024-03-03T17:18:35Z

Hey @horovits Any concern?

horovits · 2024-03-03T20:51:15Z

> Hey @horovits Any concern?

@carlosalberto @adrielp sure let's follow the process of the Semantic Conventions team, if this OTEP is no longer required then let's close it.

horovits · 2024-03-05T12:02:48Z

Closing the OTEP PR per the feedback from @carlosalberto @adrielp @jsuereth and the OpenTelemetry Semantic Conventions WG.
This doesn't mean we're backing off the proposal for CI/CD Observability conventions in OTel; it only means a different procedural path forward, full steam.
See below for more context, and join the cicd-o11y channel on the CNCF Slack workspace for more discourse.

> @carlosalberto @horovits - just to provide an update on this. We discussed this OTEP in the SemConv meeting last week. We actually might not need to proceed directly with this OTEP. Based on current direction, OTEPs are for SPEC changes, and right now our focus is the Semantic Conventions changes. We're reviewing the CDEvents work right now and coming up with a data model. Once we do that we'll be directly contributing to the Semantic Conventions through pull requests. If we need to make any future specification changes, we may leverage this OTEP or make new ones (that are smaller) to account for those changes.
>
> But as of now, this OTEP doesn't directly need to move forward; it just provides broader visibility into the efforts and context until there are spec changes. cc. @jsuereth

horovits added 4 commits January 3, 2023 15:24

- 7caf89f: Create 0000-cicd-observability-OTEP.md (Initial OTEP definition for CI/CD Observability)
- 86531a1: Update 0000-cicd-observability-OTEP.md (some typos and formatting)
- 7071a08: reference to DORA metrics
- b77547d: adding example dashboards

horovits requested a review from a team January 4, 2023 09:52

Rename 0000-cicd-observability-OTEP.md to 0223-cicd-observability-OTE…

95672d5

…P.md per pull request id open-telemetry#223

tedsuo added the triaged label Jan 30, 2023

secustor reviewed Mar 11, 2023

reyang mentioned this pull request May 16, 2023

feat: add attributes for cloud-native deployment open-telemetry/semantic-conventions#24

Closed

kuisathaverat reviewed May 17, 2023

jpkrohling mentioned this pull request May 17, 2023

New component: Git Provider Receiver open-telemetry/opentelemetry-collector-contrib#22028

Closed

2 tasks

MovieStoreGuy reviewed May 23, 2023

horovits changed the title ~~OpenTelemetry Proposal: CI/CD Observability Support by OpenTelemetry~~ OpenTelemetry Proposal: Introduce semantic conventions for CI/CD observability Jun 15, 2023

joshgav mentioned this pull request Jun 22, 2023

Semantic Convention for Deployments open-telemetry/opentelemetry-specification#3168

Closed

joshgav mentioned this pull request Jun 27, 2023

Discuss CDEvents support cncf/tag-app-delivery#397

Closed

arthurzenika mentioned this pull request Nov 22, 2023

Take inspiration from Grafana's OpenTelemetry OTEP on CI/CD observability ? mvisonneau/gitlab-ci-pipelines-exporter#753

Open

kuisathaverat mentioned this pull request Nov 30, 2023

Proposal for SIG CICD observability open-telemetry/community#1822

Merged

tylerbenson mentioned this pull request Dec 4, 2023

Replace AWS X-Ray Environment Span Link section open-telemetry/semantic-conventions#354

Merged

2 tasks

chewrocca mentioned this pull request Dec 6, 2023

OpenTelemetry Proposal: Introduce semantic conventions for CI/CD observability jenkinsci/opentelemetry-plugin#769

Closed

christophe-kamphaus-jemmic mentioned this pull request Dec 7, 2023

Use semantic conventions for CI/CD observability jenkinsci/opentelemetry-agent-metrics-plugin#21

Open

kuisathaverat mentioned this pull request Dec 29, 2023

Stable CI & Jenkins Semantic Conventions jenkinsci/opentelemetry-plugin#56

Open

horovits closed this Mar 5, 2024


		OpenSearch dashboard for monitoring Jenkins pipelines:
		![OpenSearch dashboard for monitoring Jenkins pipelines](https://dytvr9ot2sszz.cloudfront.net/wp-content/uploads/2022/05/image7.png)



	Elastic Stack dashboard for monitoring Jenkins pipelines:
	![Elastic Stack dashboard for monitoring Jenkins pipelines](https://raw.githubusercontent.com/jenkinsci/opentelemetry-plugin/master/docs/images/kibana_jenkins_overview_dashboard.png)

		OpenTelemetry instrumentation should then support collecting and emitting the new data.

		The OpenTelemetry Collector can then offer designated processors for these payloads, as well as new exporters for designated backend analytics tools, where these prove useful for release engineering needs beyond the existing ecosystem.


		CI/CD tools today emit various telemetry data, whether logs, metrics, or trace data, to report on the release pipeline state, to help pinpoint flakiness, and to accelerate root cause analysis of failures, whether stemming from the application code, a configuration, or the CI/CD environment. However, these tools do not follow any particular standard, specification, or semantic conventions. This makes it hard to use observability tools for monitoring these pipelines. Some of these tools provide observability visualization and analytics capabilities out of the box, but in addition to the tight coupling, the offered capabilities are often not enough, especially when one wishes to monitor aggregated information across different tools and different stages of the release process.

		Some tools have started adopting OpenTelemetry, which is an important step in creating standardization. A good example is [Jenkins](https://github.com/jenkinsci/jenkins), a popular CI OSS project, which offers the [Jenkins OpenTelemetry plugin](https://plugins.jenkins.io/opentelemetry/) for emitting telemetry data in order to:


		Building CI/CD observability involves four stages: Collect → Store → Visualize → Alert. OpenTelemetry provides a unified way for the first step, namely collecting and ingesting the telemetry data in an open and uniform manner.

		If you are a CI/CD tool builder, the specification and instrumentation will enable you to properly structure your telemetry, package and emit it over OTLP. OpenTelemetry specification will determine which data to collect, the semantic convention of the data, and how different signal types can be correlated based on that, to support downstream analytics of that data by various tools.


		The CDF (Continuous Delivery Foundation) has the Events Special Interest Group ([SIG Events](https://github.com/cdfoundation/sig-events)), which explores standardizing CI/CD events to facilitate interoperability (it is a work-stream within the CDF SIG Interoperability). The group is working on [CDEvents](https://cdevents.dev/), a standardized event protocol that caters for technology-agnostic machine-to-machine communication in CI/CD systems. It makes sense to evaluate alignment between the standards.

	OpenTelemetry instrumentation should then support in collecting and emitting the new data.
	These defined events from the Continuous Delivery Foundation (CDF) should be merged into the semantic convention for these systems to implement.

HTTP headers

Environment variables

Other
