Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add telemetry.source attribute namespace #2192

Closed

Conversation

martinkuba
Copy link
Contributor

@martinkuba martinkuba commented Dec 3, 2021

Currently the service.name attribute is required for any telemetry data. There is a need to distinguish client-side telemetry from backend (service) telemetry. This related PR is proposing a new set of attribute to describe client-side applications. The implication is that service.* attributes would be used only for telemetry coming from backend services, and as a result service.name should not be required for client-side resources.

This has been discussed in the Specs SIG and Client-side SIG, and the outcome of the discussion has been to propose a new attribute that would be used for generic description of the source of the telemetry. In addition to that, service and/or app attributes could also be present depending on the type of the telemetry.

Requirements for this change

  • enable identifying the type of telemetry (backend, client-side, potentially others in the future)
  • use attribute names that are intuitive to users
    • e.g. service.bundle to describe client-side bundle name is not intuitive
  • define common attributes that are independent of the type (e.g. name)
    • the reason for this is to prevent backends from having to look at multiple different attributes to determine the name (for example)
  • provide backward compatibility with service.name
    • since service.name is currently required

@svrnm svrnm mentioned this pull request Dec 6, 2021
@svrnm
Copy link
Member

svrnm commented Dec 6, 2021

First of all, I think telemetry.source.name (and namespace) are a good neutral name for any kind of entity sending OTel data. I have a few questions on the PR:

  • This change will break the spec, since until now all implementations have service.name mandatory. What is an option to work around that? Fallback in the implementations or updating the spec to say that service.name can be used instead of telemetry.source.name but it is discouraged (some kind of soft migration)? As an alternative: one of service.name / app.name / (...) OR telemetry.source.name should always be set and if not set telemetry.source.name is populated by the others (that's a very complicated logic....)
  • If there is a telemetry.source.name and namespace would would be a reason for an enduser to have that set as well as service.name / app.name? This only would make sense if those values would be different? Since the purpose of this change is avoiding confusion with the end user, I argue it would be better to remove name and namespace from service/app and call out in the spec that name/namespace is taken from telemetry.source
  • Following the discussion at Guidance for filling (or consider removing?) service.instance.id #1034 there should also be an telemetry.source.id or similar for unique identification (although you might get in trouble with client side telemetry since this might be seen as a unique identifier for the end-user)

| `telemetry.source.name` | string | Logical name of the source of the telemetry data. [1] | `shoppingcart` | Yes |
| `telemetry.source.namespace` | string | A namespace for `telemetry.source.name`. [2] | `Shop` | No |

**[1]:** If the value was not specified, SDKs MUST fallback to `unknown_source`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we had proposed this would fallback to service.name here, for backwards compatibility with the previous specification.

@@ -65,7 +83,7 @@ as specified in the [Resource SDK specification](../sdk.md#sdk-provided-resource
<!-- semconv service -->
| Attribute | Type | Description | Examples | Required |
|---|---|---|---|---|
| `service.name` | string | Logical name of the service. [1] | `shoppingcart` | Yes |
| `service.name` | string | Logical name of the service. [1] | `shoppingcart` | No |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove this as required, we'll need to document what we migrated to.

Again, if we think of the specification as an API users consume, then we need to verify this isn't breaking.

We discussed using telemetry.source.* as the new service.name, but what does removing this as required due to expectations of users? I think in the meeting we had talked about some kind of backfill mechanic where telemetry.source.* would be inserting here on-demand to prevent breakages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsuereth I am not sure how to verify that it's not breaking in all SDKs. I am guessing that if implemented correctly, it will default to unknown_service when not provided. I think potentially the bigger challenge is that backends might be expecting service.name to be present. So, perhaps I can leave it as required for resources that describe an actual service, with the value being the same as telemetry.source.name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd suggest the later. One of my crazy ideas was to have something where when a user asks for service.name they automatically get telemetry.source.name at the API level, e.g. see: open-telemetry/oteps#161 which did not get enough approvals, but would have helped us in this scenario.

@tigrannajaryan
Copy link
Member

Currently the service.name attribute is required for any telemetry data.

Where in the spec does this say that it is required for any telemetry data? As far as I can see it is required for a Service. Is there a stricter requirement somewhere?

This has been discussed in the Specs SIG and Client-side SIG, and the outcome of the discussion has been to propose a new attribute that would be used for generic description of the source of the telemetry.

Please post the summary of the discussion and the rationale why service.name is not good enough. We need to capture it here and have broader discussion, online discussions are not sufficient for decisions of this sort that can have broad implications.

@svrnm
Copy link
Member

svrnm commented Dec 7, 2021

Where in the spec does this say that it is required for any telemetry data? As far as I can see it is required for a Service. Is there a stricter requirement somewhere?

From how I read the spec it is required for any telemetry source:

"The SDK MUST provide access to a Resource with at least the attributes listed at Semantic Attributes with SDK-provided Default Value." (source)

following that link:

*"These are the attributes which MUST be provided by the SDK as specified in the Resource SDK specification:

  • service.name"*

--

I read that in a way that every telemetry source is a service and the argument against that is that a browser/mobile app is not a service, but an app (and the same is true for other things that might send OTel data in the future)

Please post the summary of the discussion and the rationale why service.name is not good enough.

The initial discussion started in this PR #2111, which is conflicting with this PR #2115, coming from the Client Side Telemetry SIG:

So the discussion is spread out over those 3 (and maybe some meeting notes?)

We need to capture it here and have broader discussion, online discussions are not sufficient for decisions of this sort that can have broad implications.

Sounds like this requires an OTEP to consolidate?

@tigrannajaryan
Copy link
Member

Sounds like this requires an OTEP to consolidate?

I think this definitely warrants an OTEP. There are several elements in this discussion which are unclear to me and which I am not sure I agree with. These need to be discussed more thoroughly. An OTEP would help a lot.

At the moment I see 2 possibilities to move forward:

  1. Decide that service.* is good enough set of attributes for client-side applications. It may require to clarify what a "Service" means and make it is defined in a way that includes client-side applications.
  2. Decide that a different set of attributes is necessary for client-side applications. Here we have 2 sub-possibilities:
    a. Introduce a new set of attributes that are in addition to service.* attributes to help capture specifics of client-side applications. This is fine and I think additive attributes are a fairly common approach in Otel spec.
    b. Introduce a new set of attributes for client-side applications to be recorded instead of service.*. This is a bigger change than (a), particularly because the spec seems to say that SDKs must always emit the "service.name". At the minimum this approach requires deleting this particular existing spec requirement.

As a side note, I am not sure that the existing requirement that all telemetry produced by Otel SDK must have a service.name attribute is a correct requirement. Note that the way it is phrased in the spec applies only to telemetry produced by Otel SDK, not by any telemetry source. Otel Collector for example is a valid source of telemetry and it is not required to include service.name in all telemetry (which I believe is a correct behavior, because not everything is a Service).

@arminru arminru added area:semantic-conventions Related to semantic conventions spec:resource Related to the specification/resource directory labels Dec 7, 2021
@yurishkuro
Copy link
Member

clarify what a "Service" means and make it is defined in a way that includes client-side applications.

That was already done in #2111. The objections were not to the definition, but to the name "service".

My preference is option-1, for consistency.

@tigrannajaryan
Copy link
Member

clarify what a "Service" means and make it is defined in a way that includes client-side applications.

That was already done in #2111. The objections were not to the definition, but to the name "service".

My preference is option-1, for consistency.

That's my preference as well.

@svrnm
Copy link
Member

svrnm commented Dec 9, 2021

I think this definitely warrants an OTEP. There are several elements in this discussion which are unclear to me and which I am not sure I agree with. These need to be discussed more thoroughly. An OTEP would help a lot.

Done.

<!-- semconv service -->
| Attribute | Type | Description | Examples | Required |
|---|---|---|---|---|
| `telemetry.source.name` | string | Logical name of the source of the telemetry data. [1] | `shoppingcart` | Yes |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the telemetry.source.name for a metric collected from a Kubernetes cluster, e.g. CPU usage of a Kubernetes Node? We have k8s.node.* attributes for this. Will telemetry.source.name be a duplicate of k8s.node.name? Similarly, will telemetry.source.namespace be a duplicate of k8s.namespace?

Are we requiring that every specific technology for which we already have well-defined set of identifying attributes also duplicates the values of those attributes in telemetry.source.name and telemetry.source.namespace? We have such specific definitions for several Kubernetes entities, for OS processes, for FaaS functions, etc. I fail to see how this attempt to generalize works in the presence of those specific semantic conventions.

@jmacd
Copy link
Contributor

jmacd commented Dec 10, 2021

It looks to me like the present-day service.name attribute, being required, can't be un-done. I have been thinking of service.name and as being part of a schema for "Cloud service".

Can we refactor today's specification so that wherever we have service.name as a required attribute only for services? The goal I'm trying to outline is that there would be more than one schema URLs you might choose from when instrumenting, there's today's 1.x line of schemas aimed at servers, but ... the "client" schema can use different required attributes.

@tigrannajaryan
Copy link
Member

It looks to me like the present-day service.name attribute, being required, can't be un-done.

That requirement is defined in https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/resource/semantic_conventions which is marked Experimental, so presumably it can be undone or modified, right?

@jkwatson
Copy link
Contributor

It looks to me like the present-day service.name attribute, being required, can't be un-done. I have been thinking of service.name and as being part of a schema for "Cloud service".

Can we refactor today's specification so that wherever we have service.name as a required attribute only for services? The goal I'm trying to outline is that there would be more than one schema URLs you might choose from when instrumenting, there's today's 1.x line of schemas aimed at servers, but ... the "client" schema can use different required attributes.

I think the other way out would be to allow SDKs to provide a configuration option to ignore this requirement. Then, client-specific distributions (for swift, android, js) could be provided that have this set out of the box.

@tigrannajaryan
Copy link
Member

I think the other way out would be to allow SDKs to provide a configuration option to ignore this requirement.

I am in favour of removing the requirement altogether. What is preventing the removal? Exporters who send to backends that require it can deal with the problem.

@bogdandrutu
Copy link
Member

bogdandrutu commented Dec 14, 2021

I also prefer the option 1 from @tigrannajaryan 's options list, to generalize the meaning of "service" (OR to completely rename to "source" if we cannot generalize "service"). I personally think would be a mistake to have multiple entries "source", "app", "service" that will only confuse more the users and complicate things.

@martinkuba can you clarify what "client-side" means for you? In a microservice infrastructure a "client" is another service, in a web-application you can refer as "client-side" to the app/service/client that runs in the browser, in a mobile app you can refer to the app itself but also an intermediate service in a chain of service calls. Too many possibilities.

@martinkuba
Copy link
Contributor Author

I also prefer the option 1 from @tigrannajaryan 's options list, to generalize the meaning of "service" (OR to completely rename to "source" if we cannot generalize "service"). I personally think would be a mistake to have multiple entries "source", "app", "service" that will only confuse more the users and complicate things.

I also think that having multiple attributes that mean the same thing would be confusing (like service.name and app.name). However, one of our requirements is to be able to identify client-side telemetry, and originally the recommendation we received was that this should be accomplished by a presence of some attributes. As a result, we proposed adding app.* attributes in #2115. This led to a discussion about having a common attribute to describe the source of telemetry (currently service.name), this PR, and eventually to this OTEP.

@martinkuba can you clarify what "client-side" means for you? In a microservice infrastructure a "client" is another service, in a web-application you can refer as "client-side" to the app/service/client that runs in the browser, in a mobile app you can refer to the app itself but also an intermediate service in a chain of service calls. Too many possibilities.

By client-side, we mean applications that runs in client devices and that end-users interact with directly - web apps running in browsers, mobile apps on mobile devices etc.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link

github-actions bot commented Jan 2, 2022

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jan 2, 2022
@bogdandrutu bogdandrutu removed the Stale label Jan 4, 2022
@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jan 11, 2022
@github-actions
Copy link

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Jan 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions spec:resource Related to the specification/resource directory Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants