Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do we need a spec for console output format? #3565

Open
tigrannajaryan opened this issue Jun 23, 2023 · 8 comments
Open

Do we need a spec for console output format? #3565

tigrannajaryan opened this issue Jun 23, 2023 · 8 comments
Labels
area:sdk Related to the SDK spec:miscellaneous For issues that don't match any other spec label triage:deciding:community-feedback Open to community discussion. If the community can provide sufficient reasoning, it may be accepted

Comments

@tigrannajaryan
Copy link
Member

Many (all?) language SDKs have a "console" exporter that allows outputting the produced telemetry to standard output. This is very useful for getting started, debugging, etc.

The spec does not define a format for this exporter and I think language maintainers usually come up with the own formats.

Should we define a format that all SDKs can implement uniformly?

We do define a file exporter format, but console likely has requirements that will make it different (i.e. human readability is likely a top priority).

Related to discussion about this in PHP repo: open-telemetry/opentelemetry-php#1050

@open-telemetry/specs-approvers any thoughts?

@tigrannajaryan tigrannajaryan added the spec:miscellaneous For issues that don't match any other spec label label Jun 23, 2023
@yurishkuro
Copy link
Member

There two possible categories of output by console exporter, lossless data dump or a summary format like

exporting span id, trace id, name, attr count

@tigrannajaryan
Copy link
Member Author

There two possible categories of output by console exporter, lossless data dump or a summary format like

exporting span id, trace id, name, attr count

Yes.

In Collector we have a human-readable, detailed (but probably ambiguous) text format: https://github.com/open-telemetry/opentelemetry-collector/tree/main/exporter/loggingexporter

PHP SDK's console exporter which I was reviewing recently uses json, which appears to be non-ambiguous, but also is not OTLP/JSON.

It may be worth checking to see what other SDKs do.

@svrnm
Copy link
Member

svrnm commented Jun 26, 2023

big +1 to this, as it would also significantly improve the docs, right now the outputs per language look very different:

  • Node.JS has a JSON-like representation

    Node.JS output

    Spans:

    {
      "traceId": "3f1fe6256ea46d19ec3ca97b3409ad6d",
      "parentId": "f0b7b340dd6e08a7",
      "name": "middleware - query",
      "id": "41a27f331c7bfed3",
      "kind": 0,
      "timestamp": 1624982589722992,
      "duration": 417,
      "attributes": {
        "http.route": "/",
        "express.name": "query",
        "express.type": "middleware"
      },
      "status": { "code": 0 },
      "events": []
    }
    {
      "traceId": "3f1fe6256ea46d19ec3ca97b3409ad6d",
      "parentId": "f0b7b340dd6e08a7",
      "name": "middleware - expressInit",
      "id": "e0ed537a699f652a",
      "kind": 0,
      "timestamp": 1624982589725778,
      "duration": 673,
      "attributes": {
        "http.route": "/",
        "express.name": "expressInit",
        "express.type": "middleware"
      },
      "status": { code: 0 },
      "events": []
    }
    {
      "traceId": "3f1fe6256ea46d19ec3ca97b3409ad6d",
      "parentId": "f0b7b340dd6e08a7",
      "name": "request handler - /",
      "id": "8614a81e1847b7ef",
      "kind": 0,
      "timestamp": 1624982589726941,
      "duration": 21,
      "attributes": {
        "http.route": "/",
        "express.name": "/",
        "express.type": "request_handler"
      },
      "status": { code: 0 },
      "events": []
    }
    {
      "traceId": "3f1fe6256ea46d19ec3ca97b3409ad6d",
      "parentId": undefined,
      "name": "GET /",
      "id": "f0b7b340dd6e08a7",
      "kind": 1,
      "timestamp": 1624982589720260,
      "duration": 11380,
      "attributes": {
        "http.url": "http://localhost:8080/",
        "http.host": "localhost:8080",
        "net.host.name": "localhost",
        "http.method": "GET",
        "http.route": "",
        "http.target": "/",
        "http.user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36",
        "http.flavor": "1.1",
        "net.transport": "ip_tcp",
        "net.host.ip": "::1",
        "net.host.port": 8080,
        "net.peer.ip": "::1",
        "net.peer.port": 61520,
        "http.status_code": 304,
        "http.status_text": "NOT MODIFIED"
      },
      "status": { "code": 1 },
      "events": []
    }

    Metrics:

    {
      descriptor: {
        name: 'http.server.duration',
        type: 'HISTOGRAM',
        description: 'measures the duration of the inbound HTTP requests',
        unit: 'ms',
        valueType: 1
      },
      dataPointType: 0,
      dataPoints: [
        {
          attributes: [Object],
          startTime: [Array],
          endTime: [Array],
          value: [Object]
        }
      ]
    }
    {
      descriptor: {
        name: 'http.client.duration',
        type: 'HISTOGRAM',
        description: 'measures the duration of the outbound HTTP requests',
        unit: 'ms',
        valueType: 1
      },
      dataPointType: 0,
      dataPoints: []
    }
    {
      descriptor: {
        name: 'db.client.connections.usage',
        type: 'UP_DOWN_COUNTER',
        description: 'The number of connections that are currently in the state referenced by the attribute "state".',
        unit: '{connections}',
        valueType: 1
      },
      dataPointType: 3,
      dataPoints: []
    }
    {
      descriptor: {
        name: 'http.server.duration',
        type: 'HISTOGRAM',
        description: 'measures the duration of the inbound HTTP requests',
        unit: 'ms',
        valueType: 1
      },
      dataPointType: 0,
      dataPoints: [
        {
          attributes: [Object],
          startTime: [Array],
          endTime: [Array],
          value: [Object]
        }
      ]
    }
    {
      descriptor: {
        name: 'http.client.duration',
        type: 'HISTOGRAM',
        description: 'measures the duration of the outbound HTTP requests',
        unit: 'ms',
        valueType: 1
      },
      dataPointType: 0,
      dataPoints: []
    }
    {
      descriptor: {
        name: 'db.client.connections.usage',
        type: 'UP_DOWN_COUNTER',
        description: 'The number of connections that are currently in the state referenced by the attribute "state".',
        unit: '{connections}',
        valueType: 1
      },
      dataPointType: 3,
      dataPoints: []
    }
    {
      descriptor: {
        name: 'http.server.duration',
        type: 'HISTOGRAM',
        description: 'measures the duration of the inbound HTTP requests',
        unit: 'ms',
        valueType: 1
      },
      dataPointType: 0,
      dataPoints: [
        {
          attributes: [Object],
          startTime: [Array],
          endTime: [Array],
          value: [Object]
        }
      ]
    }
    {
      descriptor: {
        name: 'http.client.duration',
        type: 'HISTOGRAM',
        description: 'measures the duration of the outbound HTTP requests',
        unit: 'ms',
        valueType: 1
      },
      dataPointType: 0,
      dataPoints: []
    }
    {
      descriptor: {
        name: 'db.client.connections.usage',
        type: 'UP_DOWN_COUNTER',
        description: 'The number of connections that are currently in the state referenced by the attribute "state".',
        unit: '{connections}',
        valueType: 1
      },
      dataPointType: 3,
      dataPoints: []
    }
  • Erlang/Elixir has a record(?) data structure representation

    Erlang output
    *SPANS FOR DEBUG*
    {span,64480120921600870463539706779905870846,11592009751350035697,[],
        undefined,<<"/">>,server,-576460731933544855,-576460731890088522,
        {attributes,128,infinity,0,
                    #{'http.status_code' => 200,
                      'http.client_ip' => <<"127.0.0.1">>,
                      'http.flavor' => '1.1','http.method' => <<"GET">>,
                      'http.scheme' => <<"http">>,'http.target' => <<"/">>,
                      'http.user_agent' =>
                          <<"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36">>,
                      'net.transport' => 'IP.TCP',
                      'net.host.name' => <<"localhost">>,
                      'net.host.port' => 4000,'net.peer.port' => 62839,
                      'net.sock.host.addr' => <<"127.0.0.1">>,
                      'net.sock.peer.addr' => <<"127.0.0.1">>,
                      'http.route' => <<"/">>,'phoenix.action' => home,
                      'phoenix.plug' =>
                          'Elixir.DiceGameWeb.PageController'}},
        {events,128,128,infinity,0,[]},
        {links,128,128,infinity,0,[]},
        undefined,1,false,
        {instrumentation_scope,<<"opentelemetry_phoenix">>,<<"1.1.0">>,
                               undefined}}
    
  • Java has a log-line like representation

    Java output

    Span

    [otel.javaagent 2023-04-24 17:33:54:567 +0200] [http-nio-8080-exec-1] INFO
    io.opentelemetry.exporter.logging.LoggingSpanExporter - 'RollController.index' :
    70c2f04ec863a956e9af975ba0d983ee 7fd145f5cda13625 INTERNAL [tracer:
    io.opentelemetry.spring-webmvc-6.0:1.25.0-alpha] AttributesMap{data=
    {thread.id=39, thread.name=http-nio-8080-exec-1}, capacity=128,
    totalAddedValues=2}
    [otel.javaagent 2023-04-24 17:33:54:568 +0200] [http-nio-8080-exec-1] INFO
    io.opentelemetry.exporter.logging.LoggingSpanExporter - 'GET /rolldice' :
    70c2f04ec863a956e9af975ba0d983ee 647ad186ad53eccf SERVER [tracer:
    io.opentelemetry.tomcat-10.0:1.25.0-alpha] AttributesMap{
    data={user_agent.original=curl/7.87.0, net.host.name=localhost,
    net.transport=ip_tcp, http.target=/rolldice, net.sock.peer.addr=127.0.0.1,
    thread.name=http-nio-8080-exec-1, net.sock.peer.port=53422,
    http.route=/rolldice, net.sock.host.addr=127.0.0.1, thread.id=39,
    net.protocol.name=http, http.status_code=200, http.scheme=http,
    net.protocol.version=1.1, http.response_content_length=1,
    net.host.port=8080, http.method=GET}, capacity=128, totalAddedValues=17}
    

    Metric

      [otel.javaagent 2023-04-24 17:34:25:347 +0200] [PeriodicMetricReader-1] INFO
    io.opentelemetry.exporter.logging.LoggingMetricExporter - Received a collection
     of 19 metrics for export.
    [otel.javaagent 2023-04-24 17:34:25:347 +0200] [PeriodicMetricReader-1] INFO
    io.opentelemetry.exporter.logging.LoggingMetricExporter - metric:
    ImmutableMetricData{resource=Resource{schemaUrl=
    https://opentelemetry.io/schemas/1.19.0, attributes={host.arch="aarch64",
    host.name="OPENTELEMETRY", os.description="Mac OS X 13.3.1", os.type="darwin",
    process.command_args=[/bin/java, -jar, java-simple.jar],
    process.executable.path="/bin/java", process.pid=64497,
    process.runtime.description="Homebrew OpenJDK 64-Bit Server VM 20",
    process.runtime.name="OpenJDK Runtime Environment",
    process.runtime.version="20", service.name="java-simple",
    telemetry.auto.version="1.25.0", telemetry.sdk.language="java",
    telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.25.0"}},
    instrumentationScopeInfo=InstrumentationScopeInfo{name=io.opentelemetry.runtime-metrics,
    version=1.25.0, schemaUrl=null, attributes={}},
    name=process.runtime.jvm.buffer.limit, description=Total capacity of the buffers
    in this pool, unit=By, type=LONG_SUM, data=ImmutableSumData{points=
    [ImmutableLongPointData{startEpochNanos=1682350405319221000,
    epochNanos=1682350465326752000, attributes=
    {pool="mapped - 'non-volatile memory'"}, value=0, exemplars=[]},
    ImmutableLongPointData{startEpochNanos=1682350405319221000,
    epochNanos=1682350465326752000, attributes={pool="mapped"},
    value=0, exemplars=[]},
    ImmutableLongPointData{startEpochNanos=1682350405319221000,
    epochNanos=1682350465326752000, attributes={pool="direct"},
    value=8192, exemplars=[]}], monotonic=false, aggregationTemporality=CUMULATIVE}}
    ...
    
  • Python has a json like representation (which is different to the JS one)

    Python output

    Span

    {
      "name": "/rolldice",
      "context": {
        "trace_id": "0xdcd253b9501348b63369d83219da0b14",
        "span_id": "0x886c05bc23d2250e",
        "trace_state": "[]"
      },
      "kind": "SpanKind.SERVER",
      "parent_id": null,
      "start_time": "2022-04-27T23:53:11.533109Z",
      "end_time": "2022-04-27T23:53:11.534097Z",
      "status": {
        "status_code": "UNSET"
      },
      "attributes": {
        "http.method": "GET",
        "http.server_name": "127.0.0.1",
        "http.scheme": "http",
        "net.host.port": 5000,
        "http.host": "localhost:5000",
        "http.target": "/rolldice",
        "net.peer.ip": "127.0.0.1",
        "http.user_agent": "curl/7.68.0",
        "net.peer.port": 52538,
        "http.flavor": "1.1",
        "http.route": "/rolldice",
        "http.status_code": 200
      },
      "events": [],
      "links": [],
      "resource": {
        "attributes": {
          "telemetry.sdk.language": "python",
          "telemetry.sdk.name": "opentelemetry",
          "telemetry.sdk.version": "1.14.0",
          "telemetry.auto.version": "0.35b0",
          "service.name": "unknown_service"
        },
        "schema_url": ""
      }
    }

    Metric

    {
      "resource_metrics": [
        {
          "resource": {
            "attributes": {
              "service.name": "unknown_service",
              "telemetry.auto.version": "0.34b0",
              "telemetry.sdk.language": "python",
              "telemetry.sdk.name": "opentelemetry",
              "telemetry.sdk.version": "1.13.0"
            },
            "schema_url": ""
          },
          "schema_url": "",
          "scope_metrics": [
            {
              "metrics": [
                {
                  "data": {
                    "aggregation_temporality": 2,
                    "data_points": [
                      {
                        "attributes": {
                          "http.flavor": "1.1",
                          "http.host": "localhost:5000",
                          "http.method": "GET",
                          "http.scheme": "http",
                          "http.server_name": "127.0.0.1"
                        },
                        "start_time_unix_nano": 1666077040061693305,
                        "time_unix_nano": 1666077098181107419,
                        "value": 0
                      }
                    ],
                    "is_monotonic": false
                  },
                  "description": "measures the number of concurrent HTTP requests that are currently in-flight",
                  "name": "http.server.active_requests",
                  "unit": "requests"
                },
                {
                  "data": {
                    "aggregation_temporality": 2,
                    "data_points": [
                      {
                        "attributes": {
                          "http.flavor": "1.1",
                          "http.host": "localhost:5000",
                          "http.method": "GET",
                          "http.scheme": "http",
                          "http.server_name": "127.0.0.1",
                          "http.status_code": 200,
                          "net.host.port": 5000
                        },
                        "bucket_counts": [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                        "count": 1,
                        "explicit_bounds": [
                          0, 5, 10, 25, 50, 75, 100, 250, 500, 1000
                        ],
                        "max": 1,
                        "min": 1,
                        "start_time_unix_nano": 1666077040063027610,
                        "sum": 1,
                        "time_unix_nano": 1666077098181107419
                      }
                    ]
                  },
                  "description": "measures the duration of the inbound HTTP request",
                  "name": "http.server.duration",
                  "unit": "ms"
                }
              ],
              "schema_url": "",
              "scope": {
                "name": "opentelemetry.instrumentation.flask",
                "schema_url": "",
                "version": "0.34b0"
              }
            }
          ]
        }
      ]
    }
  • C++ has also a json-like representation

    C++ output

    Span

    {
      name          : /helloworld
      trace_id      : 05eec7a55d3544434265dad89d7fe96f
      span_id       : 45fb62c58c907f05
      tracestate    :
      parent_span_id: 0000000000000000
      start         : 1665577080650384378
      duration      : 1640298
      description   :
      span kind     : Client
      status        : Unset
      attributes    :
        http.header.Date: Wed, 12 Oct 2022 12:18:00 GMT
        http.header.Content-Length: 0
        http.status_code: 200
        http.method: GET
        .header.Host: localhost
        http.header.Content-Type: text/plain
        http.header.Connection: keep-alive
        .scheme: http
        http.url: http://localhost:8800/helloworld
      events        :
      links         :
      resources     :
        service.name: unknown_service
        telemetry.sdk.version: 1.6.1
        telemetry.sdk.name: opentelemetry
        telemetry.sdk.language: cpp
      instr-lib     : http-client
    }
    

Having a consistent output would make debugging easier and also allow us (@open-telemetry/docs-approvers) to create some more consistent documentation and with that a better experience for the end-users.

To add this as an additional note, there is also no consistency on the name of that exporter (I see console, stdout, logging, ostream), which I am less worried about, but this is also reflected in the value I need to set console output via an environment variable.

@carlosalberto carlosalberto added area:sdk Related to the SDK priority:p3 Lowest priority level labels Jun 26, 2023
@tigrannajaryan tigrannajaryan added needs discussion Need more information before all suitable labels can be applied question Question for discussion labels Jun 26, 2023
@arminru
Copy link
Member

arminru commented Jun 27, 2023

Personally, I think it'd be nice to have but not necessary to consolidate the human-readable log output across all SDKs. They all contain pretty much the same information and it should be relatively straightforward to orient oneself quickly when switching between them.

One thing that should be normalized, however, is the representation of the Trace IDs and Span IDs.
One might want to grep through their SDK and Collector logs based on a Trace or Span ID and would therefore need to know about the different representations being used and convert between all possible options to grep them in parallel, which would be very cumbersome.

Seems like the following representations are currently used as per the examples that @svrnm (thank you!) gathered above:

JS (lowercase hex string)

  "traceId": "3f1fe6256ea46d19ec3ca97b3409ad6d",
  "parentId": "f0b7b340dd6e08a7",
  "id": "41a27f331c7bfed3",

Erlang (decimal)

{span,64480120921600870463539706779905870846,11592009751350035697,[],
    undefined,<<"/">>,server,-576460731933544855,-576460731890088522,

Java (lowercase hex string)

70c2f04ec863a956e9af975ba0d983ee 7fd145f5cda13625 INTERNAL

Python (lowercase hex string with a 0x prefix)

    "trace_id": "0xdcd253b9501348b63369d83219da0b14",
    "span_id": "0x886c05bc23d2250e",
  "parent_id": null,

C++ (lowercase hex string)

  trace_id      : 05eec7a55d3544434265dad89d7fe96f
  span_id       : 45fb62c58c907f05
  parent_span_id: 0000000000000000

I think going with a lowercase hex string (either with or without the 0x prefix) would be most practical.

@dyladan
Copy link
Member

dyladan commented Jun 27, 2023

One big advantage is that the output for language SDKs is typically displayed in a format the language developers are already used to seeing. For example in Ruby the => syntax for maps is quite common but a JS developer would find it quite weird. Another advantage is that it discourages people from trying to programmatically consume the console output.

There are several things that could be specified and we can do all or none of them:

  1. What fields are exported. Do we export everything possible? Do we want to define some verbosity levels?
  2. What format are IDs exported in? In at least erlang/elixir trace and span IDs are exported as integers rather than hex making it difficult to quickly see if the same ID is appearing in your collector output for example.
  3. What is the overall structure of the data? JSON, YAML, tabular, et cetera. Is it meant to be human readable, machine readable, or both?

Personally most of the time I'm using the console logger as a debugging tool just to see that spans/metrics/logs are being collected and exported at all. Usually I don't care about most of the data there. Sometimes, when I am developing instrumentation I care about a specific set of attributes. There are many attributes like trace flags that I almost never care about.

Here's what I would suggest we specify:

  1. A minimum set of data that a console exporter should display. If an implementation wants to display more then fine, possibly configured with a verbosity flag. It MUST be possible to see all data attributes
  2. IDs are displayed as hex in order to make comparison between SDKs and collector more easily

I would not specify the structure of the overall object itself such as JSON/YAML/etc. I would leave a non-normative hint that the target audience is developers of the language debugging their telemetry and that the data should be in a format that users of that programming language are likely to be regularly familiar with.

@cartermp
Copy link
Contributor

Hmmm. I like the idea of language-specific idioms. I think the main challenge isn't one of uniformity, but of verbosity. Some languages (like .NET) use a nice and compact format. Others (like JS and Python) do not, so it's harder to scan for things.

@dyladan
Copy link
Member

dyladan commented Jun 27, 2023

Hmmm. I like the idea of language-specific idioms. I think the main challenge isn't one of uniformity, but of verbosity. Some languages (like .NET) use a nice and compact format. Others (like JS and Python) do not, so it's harder to scan for things.

I also prefer a compact format most of the time. In JS the reason we decided on the verbose format was mainly because we were using it to develop the SDK itself and were frequently looking at specific data. It was never really meant for end users in the early days. In a lot of ways it still isn't beyond day 0 demos.

@cartermp
Copy link
Contributor

Makes sense. I think I'd be unambiguously in favor of the spec offering general guidance on readability/compactness of outputs. If it's a blessed path + there's some general guidelines to follow, then that's a nice thing for outside contributors to pick up as a pure value-add.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:sdk Related to the SDK spec:miscellaneous For issues that don't match any other spec label triage:deciding:community-feedback Open to community discussion. If the community can provide sufficient reasoning, it may be accepted
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants