diff --git a/content/en/blog/2024/getting-started-with-otelsql/index.md b/content/en/blog/2024/getting-started-with-otelsql/index.md new file mode 100644 index 000000000000..addb5c6b726b --- /dev/null +++ b/content/en/blog/2024/getting-started-with-otelsql/index.md @@ -0,0 +1,429 @@ +--- +title: + Getting started with otelsql, the OpenTelemetry instrumentation for Go SQL +linkTitle: Getting started with otelsql +date: 2024-03-04 +author: '[Sam Xie](https://github.com/XSAM) (Cisco)' +cSpell:ignore: otelsql sqlcommenter +--- + +[otelsql](https://github.com/XSAM/otelsql) is an instrumentation library for the +[`database/sql`](https://pkg.go.dev/database/sql) library of the Go programming +language. It generates traces and metrics from the application when interacting +with databases. By doing that, the library allows you to identify errors or +slowdowns in your SQL queries that potentially impact the performance of your +application. + +Let's dive into how to use this library! + +## Getting Started + +otelsql is a wrapper layer for interfaces from `database/sql`. When users use +the wrapped database interfaces, otelsql generates telemetry data and +passes the operations to the underlying database. + +In the following example, you are going to use +[Docker Compose](https://docs.docker.com/compose/) to run the otel-collector +example from the otelsql repository. This example uses a MySQL client with the +otelsql instrumentation. The telemetry it generates will be pushed to the +OpenTelemetry Collector. Then, it shows the trace data on Jaeger and the metrics +data on a Prometheus server. + +Here is the data flow: + +```mermaid +flowchart LR; + A[MySQL client]-->B[OpenTelemetry Collector]; + B-->C["Jaeger (trace)"]; + B-->D["Prometheus (metrics)"]; +``` + +Let's clone the otelsql repository, run the example, and take a look at +the most important lines of code. 
+ +```sh +git clone https://github.com/XSAM/otelsql.git +``` + +In the otelsql folder, you can also check out the git tag to `v0.29.0` (the +latest tag while writing this post) to ensure the example is runnable, as the +steps to run the example might be changed in the future. + +```sh +git checkout tags/v0.29.0 +``` + +Let's go to the folder of the otel-collector example and bring up all services. + +```sh +cd example/otel-collector +docker compose up -d +``` + +After building images and running services, let's check the service logs to +ensure the SQL client is finished. + +```sh +docker compose logs client +``` + +Then, we can access the Jaeger UI at [localhost:16686](http://localhost:16686) +and the Prometheus UI at [localhost:9090](http://localhost:9090) to see the +results. + +Here we are viewing a trace graph on Jaeger. We can see the duration and +parameters of each operation with the database. + +![example of Jaeger UI](jaeger-example.png) + +Here we are viewing the metric `db_sql_latency_milliseconds_sum` on Prometheus. + +![example of Prometheus UI](prometheus-example.png) + +More otelsql generated metrics options can be found on the +[otelsql document](https://github.com/XSAM/otelsql/blob/main/README.md#metric-instruments). + +## Understand the example + +Let's look at the `docker-compose.yaml` file first. 
+ +```yaml +version: '3.9' +services: + mysql: + image: mysql:8.3 + environment: + - MYSQL_ROOT_PASSWORD=otel_password + - MYSQL_DATABASE=db + healthcheck: + test: + mysqladmin ping -h 127.0.0.1 -u root --password=$$MYSQL_ROOT_PASSWORD + start_period: 5s + interval: 5s + timeout: 5s + retries: 10 + + otel-collector: + image: otel/opentelemetry-collector-contrib:0.91.0 + command: ['--config=/etc/otel-collector.yaml'] + volumes: + - ./otel-collector.yaml:/etc/otel-collector.yaml + depends_on: + - jaeger + + prometheus: + image: prom/prometheus:v2.45.2 + volumes: + - ./prometheus.yaml:/etc/prometheus/prometheus.yml + ports: + - 9090:9090 + depends_on: + - otel-collector + + jaeger: + image: jaegertracing/all-in-one:1.52 + ports: + - 16686:16686 + + client: + build: + dockerfile: $PWD/Dockerfile + context: ../.. + depends_on: + mysql: + condition: service_healthy +``` + +This Docker Compose file contains five services. The `client` service is the +MySQL client built from the Dockerfile; its source code is `main.go` in the example +folder. The `client` service runs after the `mysql` service is up. Then, it +initializes the OpenTelemetry client and otelsql instrumentation, makes SQL +queries to the `mysql` service, and sends metrics and trace data to the +`otel-collector` service through the +[OpenTelemetry Protocol (OTLP)](/docs/specs/otel/protocol/). + +After receiving the data, the `otel-collector` service converts the data format, +then sends metrics data to the `prometheus` service and trace data to the +`jaeger` service. + +Let's check `main.go` to see what happens in the `client` service. Here is the +main function. 
+ +```go +func main() { + ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt) + defer cancel() + + conn, err := initConn(ctx) + if err != nil { + log.Fatal(err) + } + + shutdownTracerProvider, err := initTracerProvider(ctx, conn) + if err != nil { + log.Fatal(err) + } + defer func() { + if err := shutdownTracerProvider(ctx); err != nil { + log.Fatalf("failed to shutdown TracerProvider: %s", err) + } + }() + + shutdownMeterProvider, err := initMeterProvider(ctx, conn) + if err != nil { + log.Fatal(err) + } + defer func() { + if err := shutdownMeterProvider(ctx); err != nil { + log.Fatalf("failed to shutdown MeterProvider: %s", err) + } + }() + + db := connectDB() + defer db.Close() + + err = runSQLQuery(ctx, db) + if err != nil { + log.Fatal(err) + } + + fmt.Println("Example finished") +} +``` + +This `main` function is pretty straightforward. It initializes a connection with +the `otel-collector` service, which is used by the tracer provider and the meter +provider. Then, it configures the tracer provider and meter provider with the +`connection` and a shutdown method, which ensures the telemetry data can be +pushed to the `otel-collector` service correctly before exiting the application. +After finishing setting up the OpenTelemetry client, it invokes the `connectDB` +method to use the otelsql library to interact with the MySQL database. Let's +look at the details here. 
+ +```go +func connectDB() *sql.DB { + // Connect to database + db, err := otelsql.Open("mysql", mysqlDSN, otelsql.WithAttributes( + semconv.DBSystemMySQL, + )) + if err != nil { + log.Fatal(err) + } + + // Register DB stats to meter + err = otelsql.RegisterDBStatsMetrics(db, otelsql.WithAttributes( + semconv.DBSystemMySQL, + )) + if err != nil { + log.Fatal(err) + } + return db +} +``` + +Instead of using the [`sql.Open`](https://pkg.go.dev/database/sql#Open) method +that Go provides, we use +[`otelsql.Open`](https://pkg.go.dev/github.com/XSAM/otelsql#Open) to create an +[`sql.DB`](https://pkg.go.dev/database/sql#DB) instance. The `sql.DB` instance +returned by `otelsql.Open` is a wrapper that instruments all DB operations and +forwards them to the underlying `sql.DB` instance (created by `sql.Open`). When +users send SQL queries with this wrapper, `otelsql` can see the queries and use +the OpenTelemetry client to generate telemetry. + +Besides using `otelsql.Open`, `otelsql` provides three additional ways to +initialize instrumentation: `otelsql.OpenDB`, `otelsql.Register`, and +`otelsql.WrapDriver`. These additional methods cover different use cases, as +some database drivers or frameworks don't provide a direct way to create +`sql.DB`. Sometimes, you might need these additional methods to manually create +a `sql.DB` and pass it to those database drivers or frameworks. You can check +[examples on the otelsql document](https://pkg.go.dev/github.com/XSAM/otelsql#pkg-examples) +to learn how to use these methods. + +Moving on, we use `otelsql.RegisterDBStatsMetrics` to register metrics data from +`sql.DBStats`. After registration, the metrics recording process runs in the +background and updates the value of each metric when needed, so we don't need to +worry about creating a separate goroutine for this. + +After having an `sql.DB` wrapped by `otelsql`, we can use it to make queries. 
+ +```go +func runSQLQuery(ctx context.Context, db *sql.DB) error { + // Create a parent span (Optional) + tracer := otel.GetTracerProvider() + ctx, span := tracer.Tracer(instrumentationName).Start(ctx, "example") + defer span.End() + + err := query(ctx, db) + if err != nil { + span.RecordError(err) + return err + } + return nil +} + +func query(ctx context.Context, db *sql.DB) error { + // Make a query + rows, err := db.QueryContext(ctx, `SELECT CURRENT_TIMESTAMP`) + if err != nil { + return err + } + defer rows.Close() + + var currentTime time.Time + for rows.Next() { + err = rows.Scan(&currentTime) + if err != nil { + return err + } + } + fmt.Println(currentTime) + return nil +} +``` + +This `runSQLQuery` method creates a parent span first (an optional step that +gives the query spans a parent, which looks better on the trace graph), +then queries the current timestamp from the MySQL database. + +After this method, the `client` application finishes and exits. These are the +most important lines of code for understanding the example. + +## Use the example as a playground + +After understanding the example, we can use it as a playground, making it a bit +more complex to see how it would be used in a real-world scenario. + +Use the following code to replace the `runSQLQuery` method in the example. + +```go +func runSQLQuery(ctx context.Context, db *sql.DB) error { + // Create a parent span (Optional) + tracer := otel.GetTracerProvider() + ctx, span := tracer.Tracer(instrumentationName).Start(ctx, "example") + defer span.End() + + runSlowSQLQuery(ctx, db) + + err := query(ctx, db) + if err != nil { + span.RecordError(err) + return err + } + return nil +} + +func runSlowSQLQuery(ctx context.Context, db *sql.DB) { + db.QueryContext(ctx, `SELECT SLEEP(1)`) +} +``` + +This time, we added a new query to the example, which is a slow query that would +take 1 second to return. Let's see what could happen and how to identify this +slow query. 
+ +To make this change work, we need to rebuild the `client` service. + +```sh +docker compose build client +docker compose up client +``` + +After the client is finished, we can check the trace graph for the trace we just +generated on Jaeger. + +![example of real-world-like Jaeger UI](real-world-like-jaeger-example.png) + +From this graph, we know the entire example takes 1 second to complete. The root +cause of this slowness is not related to the network latency with the database +and the timestamp query. It is the `SELECT SLEEP(1)` query that leads to the +slowness. + +You can also learn about the slowness from the aggregated statistics of the +database by metrics. Such is the observability otelsql can provide so you can +learn what your application is doing with the database. + +## Compatibility + +You might worry about the compatibility issue with other databases and other +third-party database frameworks (like ORMs) and wonder how widely this +instrumentation can be used. + +From an implementation perspective, as long as the database drivers or the +database frameworks interact with the database (any database, not just an SQL +database) through `database/sql` with context, `otelsql` should work just fine. + +This is an +[example](https://github.com/ent/ent/issues/1232#issuecomment-1200405070) that +shows how otelsql works with Facebook's entity framework for Go. + +## Other cool features + +Now that you've experienced the main feature, let's take some time to explore +the other cool features `otelsql` provides. + +### Sqlcommenter support + +otelsql integrates [Sqlcommenter](https://google.github.io/sqlcommenter), an +open source ORM auto-instrumentation library that merges with OpenTelemetry by +injecting a comment into SQL statements to enable context propagation for the +database. + +Using the option `WithSQLCommenter`, otelsql injects a comment for every SQL +statement it instruments. 
+ +For instance, an SQL query sent to the database + +```sql +SELECT * from FOO +``` + +becomes + +```sql +SELECT * from FOO /*traceparent='00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01',tracestate='congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7'*/ +``` + +Then the database that supports `Sqlcommenter` can record its operation for this +query with a specified trace and publish its trace spans to a trace store, so +you can see your application trace spans correlated with query trace spans from +the database in one place. + +![example of Sqlcommenter from google cloud document](sqlcommenter-example.png) + +> Picture coming from +> [google cloud document](https://cloud.google.com/blog/products/databases/sqlcommenter-merges-with-opentelemetry). + +### Custom span name + +If you don't like the default span name, you can use +`otelsql.WithSpanNameFormatter` to customize the span name. + +Here is the example usage: + +```go +otelsql.WithSpanNameFormatter(func(ctx context.Context, method otelsql.Method, query string) string { + return string(method) + ": " + query +}) +``` + +Then, the span name could become `{method}: {query}`. Here is an example of the +span name: + +```text +sql.conn.query: select current_timestamp +``` + +### Filter spans + +You can use `otelsql.SpanFilter` from `otelsql.SpanOptions` to filter out spans +you don't want to generate. It is useful when you want to discard some spans. + +## What's next? + +You should now be able to apply what you have learned from this blog post to +your own installation of otelsql. + +I would love to hear about your experience! Star `otelsql` if you find it +helpful! If you run into any problems, don't hesitate to +[reach out](https://github.com/XSAM/otelsql?tab=readme-ov-file#communication) or +[create an issue](https://github.com/XSAM/otelsql/issues). 
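
To make the comment format from the "Sqlcommenter support" section more concrete, here is a minimal sketch of how such a comment could be assembled. It uses only the Go standard library and is not otelsql's actual implementation: `appendSQLComment` and `encode` are hypothetical helper names introduced for illustration, and the sketch assumes that keys are sorted, values are percent-encoded, and pairs are joined with commas, as the example comment above suggests.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
	"strings"
)

// appendSQLComment is a hypothetical helper (not part of otelsql). It appends
// key='value' pairs to a query as a trailing comment in the Sqlcommenter style.
func appendSQLComment(query string, tags map[string]string) string {
	if len(tags) == 0 {
		return query
	}

	// Sort the keys so the generated comment is deterministic.
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, fmt.Sprintf("%s='%s'", encode(k), encode(tags[k])))
	}
	return query + " /*" + strings.Join(pairs, ",") + "*/"
}

// encode percent-encodes s. url.QueryEscape writes spaces as "+", so those
// are rewritten to "%20" afterwards (a literal "+" is already "%2B" by then).
func encode(s string) string {
	return strings.ReplaceAll(url.QueryEscape(s), "+", "%20")
}

func main() {
	// Assumed inputs: the W3C trace context values from the example above.
	fmt.Println(appendSQLComment("SELECT * from FOO", map[string]string{
		"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
		"tracestate":  "congo=t61rcWkgMzE,rojo=00f067aa0ba902b7",
	}))
}
```

With otelsql itself you would not write this by hand; enabling the `WithSQLCommenter` option is enough, and the library injects the comment for you.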
diff --git a/content/en/blog/2024/getting-started-with-otelsql/jaeger-example.png b/content/en/blog/2024/getting-started-with-otelsql/jaeger-example.png new file mode 100644 index 000000000000..a034301b2ba5 Binary files /dev/null and b/content/en/blog/2024/getting-started-with-otelsql/jaeger-example.png differ diff --git a/content/en/blog/2024/getting-started-with-otelsql/prometheus-example.png b/content/en/blog/2024/getting-started-with-otelsql/prometheus-example.png new file mode 100644 index 000000000000..3dc9d6e3c423 Binary files /dev/null and b/content/en/blog/2024/getting-started-with-otelsql/prometheus-example.png differ diff --git a/content/en/blog/2024/getting-started-with-otelsql/real-world-like-jaeger-example.png b/content/en/blog/2024/getting-started-with-otelsql/real-world-like-jaeger-example.png new file mode 100644 index 000000000000..a034301b2ba5 Binary files /dev/null and b/content/en/blog/2024/getting-started-with-otelsql/real-world-like-jaeger-example.png differ diff --git a/content/en/blog/2024/getting-started-with-otelsql/sqlcommenter-example.png b/content/en/blog/2024/getting-started-with-otelsql/sqlcommenter-example.png new file mode 100644 index 000000000000..1fe5c5c321c5 Binary files /dev/null and b/content/en/blog/2024/getting-started-with-otelsql/sqlcommenter-example.png differ diff --git a/content/en/docs/concepts/instrumentation/_index.md b/content/en/docs/concepts/instrumentation/_index.md index cb11487a2902..39ae1d73d907 100644 --- a/content/en/docs/concepts/instrumentation/_index.md +++ b/content/en/docs/concepts/instrumentation/_index.md @@ -1,7 +1,6 @@ --- title: Instrumentation -description: >- - How OpenTelemetry instrumentations libraries and applications. 
+description: How OpenTelemetry facilitates instrumentation aliases: [instrumenting] weight: 15 --- @@ -11,25 +10,24 @@ from the system's components must emit [traces](/docs/concepts/signals/traces/), [metrics](/docs/concepts/signals/metrics/), and [logs](/docs/concepts/signals/logs/). -OpenTelemetry has two primary ways to instrument. +Using OpenTelemetry, you can instrument your code in two primary ways: 1. [Code-based solutions](/docs/concepts/instrumentation/code-based) via - official APIs and SDKs for eleven languages. -2. [Zero-code solutions](/docs/concepts/instrumentation/zero-code/) that, when - installed, instrument libraries you use. + official [APIs and SDKs for most languages](/docs/languages/) +2. [Zero-code solutions](/docs/concepts/instrumentation/zero-code/) -Code-based solutions allow you to get rich telemetry from your application -itself. They let you use the OpenTelemetry API to generate telemetry from your -application, which acts as an essential complement to the telemetry generated by -zero-code solutions. +**Code-based** solutions allow you to get deeper insight and rich telemetry from +your application itself. They let you use the OpenTelemetry API to generate +telemetry from your application, which acts as an essential complement to the +telemetry generated by zero-code solutions. -The Zero-code solutions are great for getting started, or when you can't modify +**Zero-code** solutions are great for getting started, or when you can't modify the application you need to get telemetry out of. They provide rich telemetry from libraries you use and/or the environment your application runs in. Another way to think of it is that they provide information about what's happening _at the edges_ of your application. -It's generally recommended that you use both solutions when you can. +You can use both solutions simultaneously. 
## Additional OpenTelemetry Benefits diff --git a/content/en/docs/concepts/instrumentation/code-based.md b/content/en/docs/concepts/instrumentation/code-based.md index 3a669088f781..90e86dcbca3b 100644 --- a/content/en/docs/concepts/instrumentation/code-based.md +++ b/content/en/docs/concepts/instrumentation/code-based.md @@ -1,10 +1,9 @@ --- title: Code-based -description: >- - Learn about the essential steps to instrument your code base. +description: Learn the essential steps in setting up code-based instrumentation weight: 20 aliases: [manual] -cSpell:ignore: legitimatebusiness proxying +cSpell:ignore: proxying --- ## Import the OpenTelemetry API and SDK @@ -24,10 +23,10 @@ single default provider for these objects. You'll then get a tracer or meter instance from that provider, and give it a name and version. The name you choose here should identify what exactly is being instrumented -- if you're writing a library, for example, then you should name it after your library (for example -`com.legitimatebusiness.myLibrary`) as this name will namespace all spans or -metric events produced. It is also recommended that you supply a version string -(i.e., `semver:1.0.0`) that corresponds to the current version of your library -or service. +`com.example.myLibrary`) as this name will namespace all spans or metric events +produced. It is also recommended that you supply a version string (i.e., +`semver:1.0.0`) that corresponds to the current version of your library or +service. 
## Configure the OpenTelemetry SDK diff --git a/content/en/docs/concepts/instrumentation/zero-code.md b/content/en/docs/concepts/instrumentation/zero-code.md index ca8787f06295..b72fcb828f03 100644 --- a/content/en/docs/concepts/instrumentation/zero-code.md +++ b/content/en/docs/concepts/instrumentation/zero-code.md @@ -2,7 +2,7 @@ title: Zero-code description: >- Learn how to add observability to an application without the need to write - more code + code weight: 10 aliases: [automatic] --- diff --git a/content/en/docs/kubernetes/operator/automatic.md b/content/en/docs/kubernetes/operator/automatic.md index 413b39d57785..95dba72f2129 100644 --- a/content/en/docs/kubernetes/operator/automatic.md +++ b/content/en/docs/kubernetes/operator/automatic.md @@ -317,9 +317,6 @@ spec: EOF ``` -> **Note**: OpenTelemetry Python automatic instrumentation does not support -> Flask or Werkzeug 3.0+ at this time. - By default, the `Instrumentation` resource that auto-instruments Python services uses `otlp` with the `http/protobuf` protocol (gRPC is not supported at this time). This means that the configured endpoint must be able to receive OTLP over diff --git a/content/en/docs/languages/java/automatic/configuration.md b/content/en/docs/languages/java/automatic/configuration.md index 6a6701c817e6..5b64b05eb06d 100644 --- a/content/en/docs/languages/java/automatic/configuration.md +++ b/content/en/docs/languages/java/automatic/configuration.md @@ -467,13 +467,13 @@ associated span name on the parent {{% config_option name="otel.instrumentation.common.experimental.controller-telemetry.enabled" - default=true -%}} Enables the controller telemetry. {{% /config_option %}} + default=false +%}} Set to `true` to enable controller telemetry. {{% /config_option %}} {{% config_option name="otel.instrumentation.common.experimental.view-telemetry.enabled" - default=true -%}} Enables the view telemetry. {{% /config_option %}} + default=false +%}} Set to `true` to enable view telemetry. 
{{% /config_option %}} ### Instrumentation span suppression behavior diff --git a/package.json b/package.json index 8512ec4b3e00..29a51eb663b6 100644 --- a/package.json +++ b/package.json @@ -113,8 +113,8 @@ "@opentelemetry/auto-instrumentations-web": "^0.36.0", "@opentelemetry/context-zone": "^1.8.0", "@opentelemetry/core": "^1.8.0", - "@opentelemetry/exporter-trace-otlp-http": "^0.48.0", - "@opentelemetry/instrumentation": "^0.48.0", + "@opentelemetry/exporter-trace-otlp-http": "^0.49.1", + "@opentelemetry/instrumentation": "^0.49.1", "@opentelemetry/resources": "^1.8.0", "@opentelemetry/sdk-trace-base": "^1.8.0", "@opentelemetry/sdk-trace-web": "^1.8.0", diff --git a/static/refcache.json b/static/refcache.json index 0297bea18a7b..e9267181f5de 100644 --- a/static/refcache.json +++ b/static/refcache.json @@ -383,6 +383,10 @@ "StatusCode": 200, "LastSeen": "2024-01-18T08:05:48.904048-05:00" }, + "https://cloud.google.com/blog/products/databases/sqlcommenter-merges-with-opentelemetry": { + "StatusCode": 200, + "LastSeen": "2024-02-24T14:33:07.232417-08:00" + }, "https://cloud.google.com/blog/products/serverless/cloud-functions-2nd-generation-now-generally-available": { "StatusCode": 200, "LastSeen": "2024-01-18T08:05:49.382943-05:00" @@ -2111,6 +2115,10 @@ "StatusCode": 200, "LastSeen": "2024-01-18T19:11:51.03094-05:00" }, + "https://github.com/XSAM/otelsql/issues": { + "StatusCode": 200, + "LastSeen": "2024-02-24T14:33:07.539727-08:00" + }, "https://github.com/aabmass": { "StatusCode": 200, "LastSeen": "2024-01-30T05:18:19.065836-05:00" @@ -2363,6 +2371,10 @@ "StatusCode": 200, "LastSeen": "2024-01-18T20:05:36.185289-05:00" }, + "https://github.com/ent/ent/issues/1232#issuecomment-1200405070": { + "StatusCode": 200, + "LastSeen": "2024-02-24T14:33:06.756997-08:00" + }, "https://github.com/equinix-labs/otel-cli": { "StatusCode": 200, "LastSeen": "2024-01-30T16:15:52.594088-05:00" @@ -4275,6 +4287,10 @@ "StatusCode": 200, "LastSeen": "2024-01-18T19:02:17.04578-05:00" 
}, + "https://google.github.io/sqlcommenter": { + "StatusCode": 206, + "LastSeen": "2024-02-24T14:33:07.021511-08:00" + }, "https://gorm.io/": { "StatusCode": 206, "LastSeen": "2024-01-30T16:14:47.910122-05:00" @@ -5451,6 +5467,10 @@ "StatusCode": 206, "LastSeen": "2024-02-23T22:55:04.014798-05:00" }, + "https://opentelemetry.io/docs/specs/otel/protocol": { + "StatusCode": 206, + "LastSeen": "2024-02-24T14:33:05.630341-08:00" + }, "https://opentracing.io": { "StatusCode": 206, "LastSeen": "2024-01-18T19:07:33.813401-05:00" @@ -5619,10 +5639,30 @@ "StatusCode": 200, "LastSeen": "2024-01-16T09:38:27.962889-05:00" }, + "https://pkg.go.dev/database/sql": { + "StatusCode": 200, + "LastSeen": "2024-02-24T14:33:03.76582-08:00" + }, + "https://pkg.go.dev/database/sql#DB": { + "StatusCode": 200, + "LastSeen": "2024-02-24T14:33:06.1255-08:00" + }, + "https://pkg.go.dev/database/sql#Open": { + "StatusCode": 200, + "LastSeen": "2024-02-24T14:33:05.81433-08:00" + }, "https://pkg.go.dev/github.com/XSAM/otelsql": { "StatusCode": 200, "LastSeen": "2024-01-08T12:17:16.696764+01:00" }, + "https://pkg.go.dev/github.com/XSAM/otelsql#Open": { + "StatusCode": 200, + "LastSeen": "2024-02-24T14:33:05.95303-08:00" + }, + "https://pkg.go.dev/github.com/XSAM/otelsql#pkg-examples": { + "StatusCode": 200, + "LastSeen": "2024-02-24T14:33:06.29193-08:00" + }, "https://pkg.go.dev/github.com/dnwe/otelsarama": { "StatusCode": 200, "LastSeen": "2024-01-25T12:26:12.14544959Z"