This project delivers OpenTelemetry traces, metrics and logs from shell scripts (sh, ash, dash, bash, busybox, and many other POSIX compliant shells) as well as from GitHub actions (including node and docker actions). Compared to similar projects, it delivers not just a command-line SDK to create spans manually, but also provides automatic context propagation via HTTP (wget, wget2, curl, and netcat), auto-instrumentation of all available commands, auto-injection into child scripts and into executables using shebangs, as well as automatic log collection from stderr. Its installable via a debian or rpm package from the releases in this repository, or from the apt-repository below. This project is not officially affiliated with the CNCF project OpenTelemetry.
Check out our detailed Demos.
A simple command like curl http://www.google.at
on an AWS EC2 will produce a span like this:
{
"name": "curl http://www.google.at",
"context": {
"trace_id": "0xfecf2bf60c1433a33fc3b3715100f44a",
"span_id": "0x0441d068e0d7d83a",
"trace_state": "[]"
},
"kind": "SpanKind.CLIENT",
"parent_id": "0x31877863ebe897a6",
"start_time": "2024-04-07T15:27:35.230551Z",
"end_time": "2024-04-07T15:27:35.790162Z",
"status": {
"status_code": "UNSET"
},
"attributes": {
"shell.command_line": "curl http://www.google.at",
"shell.command": "curl",
"shell.command.type": "file",
"shell.command.name": "curl",
"subprocess.executable.path": "/usr/bin/curl",
"subprocess.executable.name": "curl",
"network.protocol.name": "http",
"network.transport": "tcp",
"http.request.method": "GET",
"server.address": "www.google.at",
"url.full": "http://www.google.at",
"url.path": "/",
"url.scheme": "http",
"network.peer.address": "172.217.18.3",
"network.peer.port": 80,
"http.request.header.host": [ "www.google.at" ],
"user_agent.original": "curl/8.2.1",
"http.request.header.user-agent": [ "curl/8.2.1" ],
"http.request.header.accept": [ "*/*" ],
"http.request.header.traceparent": [ "00-fecf2bf60c1433a33fc3b3715100f44a-0441d068e0d7d83a-01" ],
"http.response.status_code": 200,
"http.response.header.date": [ "Sun, 07 Apr 2024 15:27:35 GMT" ],
"http.response.header.expires": [ "-1" ],
"http.response.header.cache-control": [ "private, max-age=0" ],
"http.response.header.content-type": [ "text/html; charset=ISO-8859-1" ],
"http.response.header.content-security-policy-report-only": [ "object-src 'none';base-uri 'self';script-src 'nonce-Z06PcaKXtGYEozZ8RwkRlA' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/other-hp" ],
"http.response.header.server": [ "gws" ],
"http.response.header.x-xss-protection": [ "0" ],
"http.response.header.x-frame-options": [ "SAMEORIGIN" ],
"http.response.header.set-cookie": [ "AEC=AQTF6HxUMJNiGOP4fGxONroFJ61uidlU9uiGQsHcvRug4cz-d5iOLWCN7Q; expires=Fri, 04-Oct-2024 15:27:35 GMT; path=/; domain=.google.at; Secure; HttpOnly; SameSite=lax" ],
"http.response.header.accept-ranges": [ "none" ],
"http.response.header.vary": [ "Accept-Encoding" ],
"http.response.header.transfer-encoding": [ "chunked" ],
"http.response.body.size": 11984,
"pipe.stdin.bytes": 0,
"pipe.stdin.lines": 0,
"pipe.stdout.bytes": 18750,
"pipe.stdout.lines": 16,
"pipe.stderr.bytes": 391,
"pipe.stderr.lines": 4,
"shell.command.exit_code": 0,
"code.filepath": "./otel_demo_curl.sh",
"code.lineno": 2,
},
"events": [],
"links": [],
"resource": {
"attributes": {
"telemetry.sdk.language": "shell",
"telemetry.sdk.name": "opentelemetry",
"telemetry.sdk.version": "4.2.1",
"deployment.environment": "development",
"service.name": "DemoService",
"cloud.provider": "aws",
"cloud.platform": "aws_ec2",
"cloud.account.id": "785162395342",
"cloud.region": "eu-central-1",
"cloud.availability_zone": "eu-central-1b",
"host.id": "i-099942b18fe209756",
"host.type": "t3.small",
"host.name": "ip-172-31-41-64.eu-central-1.compute.internal",
"process.pid": 798293,
"process.parent_pid": 786902,
"process.executable.name": "bash",
"process.executable.path": "/usr/bin/bash",
"process.command_line": "bash ./otel_demo_curl.sh",
"process.command": "bash",
"process.owner": "ubuntu",
"process.runtime.name": "bash",
"process.runtime.description": "Bourne Again Shell",
"process.runtime.version": "5.2.15-2ubuntu1",
"process.runtime.options": "hB",
"service.version": "1.0.0",
"service.namespace": "Demo",
"service.instance.id": ""
},
"schema_url": ""
}
}
Install as described below. Put the following code at the start of an arbitrary script:
export OTEL_METRICS_EXPORTER=console
export OTEL_LOGS_EXPORTER=console
export OTEL_TRACES_EXPORTER=console
. otel.sh
Finally, run your script and see traces, metrics, and logs printed to stderr.
This project currently supports and is actively tested on debian-based (Debian and Ubuntu) and rpm-based (Fedora, OpenSuse, and Red Hat Enterprise Linux (RHEL)) operating systems as well as on the Windows Subsystem for Linux. The code also works on other Linux-based operating systems, however, there are no readily available installation packages for these systems. Mac-based operating systems are currently not supported.
Install either via
wget -O - https://raw.githubusercontent.com/plengauer/opentelemetry-bash/main/INSTALL.sh | sh
or, for debian-based systems, via
echo "deb [arch=all] http://3.73.14.87:8000/ stable main" | sudo tee /etc/apt/sources.list.d/otel.list
sudo apt-get update
sudo apt-get install opentelemetry-shell
Note: the apt repo only acts as a facade to offer a better debian-native installation option, internally it redirects the apt client to the releases of this repository.
You can either use the fully automatic instrumentation (recommended) or just import the API to do everything manually. In both cases, you can use the API to manually create customized spans and metrics. However, the automatic approach creates rich spans and logs fully automatically. We recommend to use the manual approach only to augment the automatic approach where necessary.
Import the OpenTelemetry auto instrumentation as well as the API by sourcing the otel.sh
file. This will both import the API described below (in case you need or want to extend manually) as well as initialize the SDK and the auto instrumentation. No explicit calls to otel_init
at the start or to otel_shutdown
at the end of the script are necessary. You can configure the SDK as described here. We recommend not just setting configuration variables, but also exporting them so that automatically injected children inherit the same configuration.
export OTEL_SERVICE_NAME=Test
export OTEL_RESOURCE_ATTRIBUTES=foo=bar,baz=foo
# TODO more config
. otel.sh
# ...
All commands are automatically instrumented to create spans. This includes commands on the PATH
, as well as all aliases and built-ins. Current limitations are shell functions as well as commands that are called via an absolute or relative path (/bin/cat
rather than cat
). For these cases, you can use otel_observe
described below.
For all commands, attributes of the shell.*
(describing the command being run), pipe.*
(describing behavior of stdin, stdout, and stderr), subprocess.executable.*
(describing the executable being run if any), and code.*
(describing the location in the script) families are recorded. Additionally, all lines written to stderr are recorded as logs.
If the command is a shell or an executable with a shebang (meaning it itself is script rather a than a binary), OpenTelemetry will be automatically injected into it to continue observation. No setup code or configuration is necessary in these children.
If the command is just a wrapper for another command (e.g., sudo
, find
, xargs
, parallel
, ...), OpenTelemetry will automatically be injected into the inner command to continue observation as well.
If the command represents communication to a third party service (like a HTTP request via curl
or wget
), relevant attributes from the http.*
, server.*
, url.*
, and network.*
families are added. Additionally, a W3C traceparent header is injected.
Finally, a single root span will be created and activated that represents the script. This span will automatically be deactivated and ended when the script ends.
To automatically monitor your Github Workflows on job level and to auto-inject into all individual steps, add the following step as first in every job you want to observe. You can configure the SDK as described here by adding environment variables to the setup step.
- uses: plengauer/opentelemetry-bash/actions/instrument/job@main
env:
OTEL_SERVICE_NAME: 'Test'
# ...
- run: ...
Optionally, setup a dedicted job that is used to collect all jobs under a single root span representing the entire workflow. The job has to have exactly the name below and must neither depend on any other job nor being depent on.
observe:
runs-on: ubuntu-latest
steps:
- uses: plengauer/opentelemetry-bash/actions/instrument/workflow@main
env:
OTEL_SERVICE_NAME: ${{ secrets.SERVICE_NAME }}
# ...
If you define that job to create a single root span for all other jobs, only the step in this root job has to be configured. The configuration will be propagated to the other jobs. The only exception to that are OpenTelemetry SDK Header configurations because they often contain API tokens and we cannot securely propagate these between jobs.
Import the API by referencing the otelapi.sh
file. This is only necessary if you do not choose a fully automatic approach described above. In case you use automatic instrumentation, the API will be imported automatically for you.
The SDK needs to be initialized and shut down manually at the start and at the end of your script respectively. All config must be set before the call to otel_init
. You can configure the underlying SDK with the same variables as any other OpenTelemetry SDK as described here. We recommend not just setting the environment variables, but also exporting them so that automatically injected children inherit the same configuration.
export OTEL_SERVICE_NAME=Test
export OTEL_RESOURCE_ATTRIBUTES=foo=bar,baz=foo
# TODO more config
. otelapi.sh
otel_init
# ...
otel_shutdown
You can configure the underlying SDK with the same variables as any other OpenTelemetry SDK as described here. In addition to that, use the following environment variables to further configure behavior of this project:
Variable | Values | Default | Description | State |
---|---|---|---|---|
OTEL_SHELL_CONFIG_OBSERVE_PIPES | TRUE , FALSE |
TRUE for GitHub Actions, FALSE otherwise |
Count bytes and lines on stdout, and stderr and add counts as attributes on spans. | stable |
OTEL_SHELL_CONFIG_OBSERVE_PIPES_STDIN | TRUE , FALSE |
FALSE |
Count bytes and lines on stdin and add counts as attributes on spans. | unsafe |
OTEL_SHELL_CONFIG_INSTRUMENT_MINIMALLY | TRUE , FALSE |
FALSE |
Only create SERVER , CONSUMER , CLIENT and PRODUCER spans, mute all INTERNAL spans. |
stable |
OTEL_SHELL_CONFIG_INJECT_DEEP | TRUE , FALSE |
TRUE for GitHub Actions, FALSE otherwise |
Inject native OpenTelemetry into scripting languages like node.js and python. | stable |
OTEL_SHELL_CONFIG_INSTRUMENT_ABSOLUTE_PATHS | TRUE , FALSE |
FALSE |
Create spans for commands with an absolute path to the executable. | experimental |
OTEL_SHELL_CONFIG_OBSERVE_SIGNALS | TRUE , FALSE |
TRUE for GitHub Actions, FALSE otherwise |
Create events for received signals. | stable |
OTEL_SHELL_CONFIG_OBSERVE_SUBPROCESSES | TRUE , FALSE |
TRUE for GitHub Actions, FALSE otherwise |
Create additional minimal spans for all indirect subprocesses. | stable |
Flags that are marked as stable
are tested and verified in tests and real-world scenarios. Flags marked as experimental
are new features that are tested but still lack long-term verification in real-world applications. They will eventually reach stable
or unsafe
. Flags marked as unsafe
have implicit assumptions about the nature of the instrumented scripts and will therefore never reach stable
.
Currently, we do not support restricted mode in shells (set -r
).
The API described below is for manually creating and customizing spans. We recommend to do this only if the automatic instrumentation is not sufficient.
Create spans with a span kind (SERVER
, CONSUMER
, CLIENT
, PRODUCER
, INTERNAL
) and an arbitrary name. The returned span handle is used to end the span and to customize it (see below). The span handle is similar to a file descriptor. It is only valid within this process and used for subsequent calls for customization. It is not the actual span ID.
span_handle="$(otel_span_start INTERNAL my span)"
# ...
otel_span_end "$span_handle"
Creating a span alone will not make new spans its children (see span activation below). Start and end time of the span as well as its parent (if another span is currently active) is automatically recorded.
Customize your spans by manually setting span attributes with the span handle. This is only possible between otel_span_start
and otel_span_end
for the respective span handle. Valid types are string
, int
, float
, and auto
. The auto
type will try to guess the type based on the value.
otel_span_attribute "$span_handle" key=value
otel_span_attribute_typed "$span_handle" string foo=bar
otel_span_attribute_typed "$span_handle" int numberkey=0
Customize your spans by manually adding span events with the span handle. This is only possible between otel_span_start
and otel_span_end
for the respective span handle. Valid types for event attributes string
, int
, float
, and auto
. The auto
type will try to guess the type based on the value.
event_handle="$(otel_event_create "my event")"
otel_event_attribute "$event_handle" key=value
otel_event_attribute_typed "$event_handle" string foo=bar
otel_event_add "$event_handle" "$span_handle"
Customize your spans by manually adding span links to other spans. This is only possible between otel_span_start
and otel_span_end
for the respective span handle. Valid types for link attributes string
, int
, float
, and auto
. The auto
type will try to guess the type based on the value.
link_handle="$(otel_link_create "$foreign_traceparent" "$foreign_tracestate")"
otel_link_attribute "$link_handle" key=value
otel_link_attribute_typed "$link_handle" string foo=bar
otel_link_add "$link_handle" "$span_handle"
Customize your spans by marking them as failed. This is only possible between otel_span_start
and otel_span_end
for the respective span handle.
otel_span_error "$span_handle"
Customize your spans by marking them as failed. This is only possible between otel_span_start
and otel_span_end
for the respective span handle. Please note, that span_start
does not automatically activate a span, nor does span_end
automatically deactivate it.
otel_span_activate "$span_handle"
# ... all spans created here will be children of the activated span
otel_span_deactivate
The easiest way to create a span manually and fill it with some simple default attributes is to use the command otel_observe
. This command acts as a wrapper for the given command. It first starts an INTERNAL
span with the command as name, records some default attributes, activates the span (in case the command in turn creates spans), and then runs the command. After the command terminates, it deactivates the span and ends it. The automatically recorded attributes are of the shell.*
, pipe.*
, and subprocess.executable.*
families. All lines, written to stderr will be recorded as logs. The command is fully transparent in its behavior, i.e., stdin, stdout, and stderr of the wrapped command are piped to the caller accordingly, and the command exits with the same exit code as the wrapped command does.
otel_observe cat file.txt
Please note, that this command will not perform injection or context propagation. This can only be done via the fully automatic approach described above.
To record metrics, first create a new metric with a key to get a metric handle. The returned metric handle is used to customize the metric data point and record its value. Valid types are string
, int
, float
, and auto
. The auto
type will try to guess the type based on the value.
metric_handle="$(otel_metric_create my.metric)"
otel_metric_attribute "$metric_handle" key=value
otel_metric_attribute_typed "$metric_handle" string foo=bar
otel_metric_add "$metric_handle" 42
As of now, there is no way to manually record logs. Logs, i.e., lines that are written to stderr, are recorded automatically when using auto-instrumentation and when using otel_observe
as described above.
Every command executed in an automatically instrumented script produces a small amount of constant overhead. This is usually not a problem, especially in traditional scripts where the heavy lifting is done within external processes and not in the script itself.
When a script is instrumented for the first time, instrumenting will cause some additional overhead. To speed up subsequent invocations of the same script, an instrumentation cache is created. The cache will be invalidated if the script changes or additional executables are installed on the system and will be automatically rebuilt on the next invocation of the script.
This projects adheres to the OpenTelemetry Semantic Conventions, but it also defines a number of shell-specifc extensions.
Attribute | Type | Description | Examples |
---|---|---|---|
shell.command_line | string | The full command line as it appears in the script, with all variables resolved to their values. | echo hello world , /bin/cat file.txt |
shell.command | string | The 0-th string of the command line. | echo , /bin/cat |
shell.command.name | string | The 0-th string of the command line resolved to its name only. | echo , cat |
shell.command.type | string | The type of the command. | file , builtin , function , alias , keyword |
shell.command.exit_code | int | The exit code of the command. | 0 , 1 , 127 |
Attribute | Type | Description | Examples |
---|---|---|---|
pipe.stdin.bytes | int | The number of bytes read from stdin (fd 0). | 0 |
pipe.stdin.lines | int | The number of lines read from stdin (fd 0). | 0 |
pipe.stdout.bytes | int | The number of bytes written to stdout (fd 1). | 0 |
pipe.stdout.lines | int | The number of lines written to stdout (fd 1). | 0 |
pipe.stderr.bytes | int | The number of bytes written to stderr (fd 2). | 0 |
pipe.stderr.lines | int | The number of lines written to stderr (fd 2). | 0 |
These attributes are set when the script is called via an SSH connection.
Attribute | Type | Description | Examples |
---|---|---|---|
ssh.ip | string | The IP address of the ssh deamon used for connecting. | 1.1.1.1 |
ssh.port | int | The port of the ssh deamon used for connecting. | 1.1.1.1 |
These attributes are set when the script is a debian package maintainer script (preinst
, postinst
, prerm
, postrm
)
Attribute | Type | Description | Examples |
---|---|---|---|
debian.package.name | string | The name of the package. | opentelemetry-shell |
debian.package.version | string | The version of the package. | 1.2.3 |