Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable rendering of "template" attributes in table generation #186

Merged
merged 5 commits into from
Aug 28, 2023

Conversation

AlexanderWert
Copy link
Member

@AlexanderWert AlexanderWert commented Jul 7, 2023

Problem

In the semantic conventions we have template-style attributes in some places.

Examples are:

This kind of attributes could not be included in the model .yml files yet, and thus could not be included in table generation.

Solution

This PR proposes a fix that enables definition of template-style attributes in the YAML files and corresponding table generation.

How it works

Template-style attributes (i.e. that do not result in a single attribute, but a group of attributes, for example http.request.header.<key>) can be defined in a separate attribute-templates sub-path in an attribute group.
Attribute templates extend the normal attributes by adding an additional property parameter_name that is the name of the parameter in the angle brackets (e.g. key in the example above).

Here is an example YAML file snippet:


groups:
  - id: http
    type: span
    prefix: http
    brief: 'This document defines semantic conventions for HTTP client and server Spans.'
    note: >
        These conventions can be used for http and https schemes
        and various HTTP versions like 1.1, 2 and SPDY.
    attribute_templates:                                                 <<---- THIS IS NEW
      - id: request.header
        type: string
        brief: >
          HTTP request headers, `<key>` being the normalized HTTP Header name
          (lowercase, with - characters replaced by _), the value being the header values.
        examples: '`http.request.header.content_type=["application/json"]`'
        parameter_name: key                                             <<---- THIS IS THE NEW PROPERTY
    attributes:
      - id: request.method
        type: string
        requirement_level: required
        sampling_relevant: false
        brief: 'HTTP request method.'
        examples: ["GET", "POST", "HEAD"]

This will result in the following markdown result through table generation:

<!-- semconv http -->
| Attribute  | Type | Description  | Examples  | Requirement Level |
|---|---|---|---|---|
| `http.request.method` | string | HTTP request method. | `GET`; `POST`; `HEAD` | Required |
| `http.request.header.<key>` | string | HTTP request headers, `<key>` being the normalized HTTP Header name (lowercase, with - characters replaced by _), the value being the header values. | ``http.request.header.content_type=["application/json"]`` | Recommended |
<!-- endsemconv -->

Defining such attributes templates in a separate path in the yaml (instead of listing them under the normal attributes) has the following advantages:

  1. These attribute templates have slightly different semantic (compared to normal attributes) as they represent a group of attributes (for which the actual name is runtime dynamic / unknown at specification time).

  2. It does not break code generation for attributes, because it's a separate path that is currently just not considered for code generation. Later, we could even build dedicated code generators for template-style attributes for the different languages. That's something that is currently defined 'manually' in all the SDKs.

  3. For markdown table generation, this PR already includes template-style attributes as an additional set of attributes. Hence, template-style attributes can be defined in YAML files and are included in table generation.

@AlexanderWert AlexanderWert requested review from a team July 7, 2023 13:23
@arminru arminru requested review from trask, lmolkova and Oberon00 July 7, 2023 13:28
Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM as a way forward

@lmolkova
Copy link
Contributor

lmolkova commented Jul 7, 2023

Nice!!
I wonder if we can achieve the same with fewer new concepts:

groups:
  - id: http.request.headers
    type: attribute_group
    prefix: http.request.header 
    brief: 'This group defines dynamic attributes in `http.request.headers` namespace'
    attributes:
      - id: key
        type:  string
        template: true # or 'dynamic' , etc  <<---- THIS IS NEW
        brief: >
          HTTP request headers, `<key>` being the normalized HTTP Header name
          (lowercase, with - characters replaced by _), the value being the header values.
        examples: '`http.request.header.content_type=["application/json"]`'

or even this:

groups:
  - id: http.request.headers
    type: wildcard_attribute_group # or namespace, etc <<---- THIS IS NEW
    prefix: http.request.header 
    brief: 'This group registers a namespace for HTTP request headers'
    note: >
      HTTP request headers, `<key>` being the normalized HTTP Header name
      (lowercase, with - characters replaced by _), the value being the header values.

I personally prefer the latter because we don't need to define any attributes (and specify levels, or type which make no sense in this case).
I also think that ability to register a namespace without any specific attributes is useful beyond wildcard attributes.

@AlexanderWert
Copy link
Member Author

AlexanderWert commented Jul 10, 2023

I wonder if we can achieve the same with fewer new concepts

@lmolkova
I agree with that! Maybe we can find a way to represent it with less new concepts in the yaml files. However, I think, in the implementation of the generator we still should separate those concepts. (i.e. template attributes should not be included when requesting the list of all attributes. Otherwise existing code generation logic could break.)

I personally prefer the latter because we don't need to define any attributes (and specify levels, or type which make no sense in this case).

  1. I think requirement level and type could still be reasonable for this kind of attributes. For example the http.request.header.<key> has a defined string type and the requirement-level opt-in.
    Or the db.elasticsearch.path_parts.<key> has a string type and requirement level conditionally required.
  2. The latter proposal also implies that these "dynamic" attributes will be always rendered into separate tables (because of a separate id). Even when desired to render it in the same table as other related attributes, it won't be easily possible.

So, how about we try to model it as a special kind of attribute (similar to your first proposal)?

Regarding the first proposal:
I like the idea. There's only one issue with that. It implies that the id of that attribute is the "dynamic" part of the attribute name (in the above example key) that would be put into angle brackets. While it works in the above example, it might be problematic in other places.
For example, if we would add http.request.header.<key> in this yaml file, based on the above proposal the id would be request.header.key, but only key within that id is the dynamic part.

So, what do you think about the following counter proposal?:

groups:
  - id: http.trace.http.common
    type: attribute_group
    prefix: http 
    brief: 'This group defines dynamic attributes in `http.request.headers` namespace'
    attributes:
      - id: request.header
        type:  string
        template_name: key # <<---- THIS IS NEW
        brief: >
          HTTP request headers, `<key>` being the normalized HTTP Header name
          (lowercase, with - characters replaced by _), the value being the header values.
        examples: '`http.request.header.content_type=["application/json"]`'

So, instead of a boolean template property, having a template_name property that specifies the name of the dynamic part. If template_name is specified it indicates it's a template attribute, if it's not specified (i.e. null) then it's a normal attribute. With that it would be only one additional field, while it still would be possible to address all of the points I described above.

WDYT?

@Oberon00
Copy link
Member

I think this YAML-proposal is very good @AlexanderWert. However, I have a different opinion on

However, I think, in the implementation of the generator we still should separate those concepts. (i.e. template attributes should not be included when requesting the list of all attributes. Otherwise existing code generation logic could break.)

I think it is OK to let unadjusted code templates break. The upside is that after adjustment, it should be easier to handle non-template and template attributes in the same loop and also there seems to be no logical reason to render them out-of-order. In the meantime, the breakage when name is None/not set should be rather obvious, either failing the generator with an exception or at least generating obviously unusable code.

We may want to provide a --suppress-template-attributes flag (or --exclude=templates?) as a quickfix to not force everyone to adjust the generators right away.

@AlexanderWert
Copy link
Member Author

AlexanderWert commented Jul 11, 2023

Thanks @Oberon00 !

there seems to be no logical reason to render them out-of-order

I agree for rendering in the markdown files: they should be treated very similar / same as normal attributes in the markdown files.

However, wouldn't we explicitly want to treat them separately for code generation because they will result in completely different constructs?

For example in Java, normal attributes result in constants in this file, however, template attributes have a different construct, which is a helper class, like done here manually.

Wouldn't that imply that we actually want the code rendering for template attributes to happen out-of-order (or at least from a different set as the normal attributes)?

Otherwise, we'd need to explicitly exclude template attributes from places like in this template.

Maybe I didn't fully get yet the benefit or cases for:

easier to handle non-template and template attributes in the same loop

@Oberon00
Copy link
Member

Oberon00 commented Jul 11, 2023

For the case of Java it might indeed be easier, but I think many languages just produce string constants.

Maybe I didn't fully get yet the benefit or cases for:

easier to handle non-template and template attributes in the same loop

No, I think you got it. It's not that big of a benefit (e.g. you can apparently use + in the template to combine lists, while on the other hand you can also use filters like reject/rejectattr/select/selectattr to separate them nicely), more a matter of taste what should be the default.

@AlexanderWert
Copy link
Member Author

AlexanderWert commented Jul 11, 2023

more a matter of taste what should be the default.

I'm fine with both. The main motivation to keep it separate was to not introduce breaking changes with that.

How about inverting your proposal? So, not separating the concepts, but instead of opting-out rather opt-in into rendering template-attributes with something like --include-template-attributes (instead of --suppress-template-attributes).

That would not separate those but also prevent breaking changes (or the need to change invocation parameters to keep the current behavior).

@Oberon00
Copy link
Member

Oberon00 commented Jul 11, 2023

I was thinking of that as well, but that would make the tool suppress information by default and you have to know about the flag, otherwise everything will silently work but be reduced. I'd prefer the separate lists over that I think, but it's a very fine line at this point

@AlexanderWert
Copy link
Member Author

@lmolkova , @trask
Do you have opinions on my proposal and the discussion above?

Otherwise, I'd adapt this PR according to that proposal, WDYT?

@lmolkova
Copy link
Contributor

lmolkova commented Jul 18, 2023

@AlexanderWert it looks great!

One thing I'm slightly concerned about is id field.

I wonder if it can match the full attribute name, i.e.

groups:
  - id: http.trace.http.common
    type: attribute_group
    prefix: http 
    brief: 'This group defines dynamic attributes in `http.request.headers` namespace'
    attributes:
      - id: request.header.* 
        type:  string
        template_name: key # <<---- THIS IS NEW
        brief: >
          HTTP request headers, `<key>` being the normalized HTTP Header name
          (lowercase, with - characters replaced by _), the value being the header values.
        examples: '`http.request.header.content_type=["application/json"]`'

otherwise, we're defining a namespace request.header under attributes section and it does not feel right.

an alternative:

groups:
    attributes:
      - id_template:  request.header.{key} // or request.header.[key], etc # <<---- THIS IS NEW
        type:  string
        brief: >
          HTTP request headers, `<key>` being the normalized HTTP Header name
          (lowercase, with - characters replaced by _), the value being the header values.
        examples: '`http.request.header.content_type=["application/json"]`'

@Oberon00
Copy link
Member

Oberon00 commented Jul 18, 2023

Agreed, it seems to be a good idea to mark the id already somehow, either with special syntax or a different YAML property name.

How about using prefix instead of id, not using any trailing * or similar and not using something like template_name? The name of the key will probably be "key" 99,9% of the time anyways

groups:
  - id: http.trace.http.common
    type: attribute_group
    prefix: http 
    brief: 'This group defines dynamic attributes in `http.request.headers` namespace'
    attributes:
      - namespace: request.header # <--- namespace instead of id, no need to add extra "template_name"
        type:  string
        brief: >
          Namespace for HTTP request headers, the id being the normalized HTTP Header name
          (lowercase, with - characters replaced by _), the value being the header values.
        examples: '`http.request.header.content_type=["application/json"]`'

Another idea would be to use type: namespace[string] or similar, but that would be difficult to implement I think.

@AlexanderWert
Copy link
Member Author

AlexanderWert commented Jul 19, 2023

The problem with omitting the id is that it's not possible to reference these attributes from other places / yaml files anymore.

So how about @lmolkova 's second proposal but keeping the id instead of id_template. That basically means that the format of the id implicitly indicates whether it's a template type or a normal attribute:

groups:
    attributes:
      - id:  request.header.<key> //  <<---- SPECIAL FORMAT that allows the tooling to differentiate
        type:  string
        brief: >
          HTTP request headers, `<key>` being the normalized HTTP Header name
          (lowercase, with - characters replaced by _), the value being the header values.
        examples: '`http.request.header.content_type=["application/json"]`'

@Oberon00
Copy link
Member

The problem with omitting the id is that it's not possible to reference these attributes from other places

It would be possible to adapt this so that for referencing (and also unique-ness checks!) it behaves just like "id". Might be too cumbersome / complicate the code too much to be worth it though.

@AlexanderWert AlexanderWert force-pushed the template-attributes branch 2 times, most recently from 704ffd7 to 0c6d950 Compare July 19, 2023 10:24
@AlexanderWert
Copy link
Member Author

AlexanderWert commented Jul 19, 2023

@lmolkova , @Oberon00

It would be possible to adapt this so that for referencing (and also unique-ness checks!) it behaves just like "id". Might be too cumbersome / complicate the code too much to be worth it though.

yeah, I fear it would complicate things a lot.

I adapted the PR to implement this approach:

groups:
    attributes:
      - id:  request.header.<key> //  <<---- SPECIAL FORMAT that allows the tooling to differentiate
        type:  string
        brief: >
          HTTP request headers, `<key>` being the normalized HTTP Header name
          (lowercase, with - characters replaced by _), the value being the header values.
        examples: '`http.request.header.content_type=["application/json"]`'

I think it's the least invasive still the id is unique and indicates that it's a special type of attribute.

  • Internally the list of attributes excludes the template attributes to not break existing code generation
  • Table generation works for the actual template attribute definitions as well as for references to template attributes

Please have another look and if you're fine with that let's get this merged so we can move those parts to YAML as well.

Thanks!

@Oberon00 Oberon00 requested a review from jsuereth July 19, 2023 11:09
@AlexanderWert
Copy link
Member Author

AlexanderWert commented Jul 19, 2023

otherwise, we're defining a namespace request.header under attributes section and it does not feel right.

@lmolkova Actually, we are not defining a namespace with that, but a template attribute (i.e. an attribute of a special type) with the unique id http.request.header. According to the attributes naming rules, this prohibits the usage of http.request.header as a namespace. Or am I missing something?

@AlexanderWert
Copy link
Member Author

AlexanderWert commented Jul 19, 2023

Based on the feedback, the newest update in this PR contains the following variant:

groups:
  - id: http.trace.http.common
    type: attribute_group
    prefix: http 
    brief: 'This group ...'
    attributes:
      - id:  request.header
        type:  template[string[]]   <--- THIS IS NEW
        brief: >
          HTTP request headers, `<key>` being the normalized HTTP Header name
          (lowercase, with - characters replaced by _), the value being the header values.
        examples: '`http.request.header.content_type=["application/json"]`'

@AlexanderWert AlexanderWert requested a review from Oberon00 July 25, 2023 07:11
Copy link
Contributor

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Could you please also update a few things?

I'm also curious how code generation will work on this - I'll try later today and share.

@AlexanderWert
Copy link
Member Author

@lmolkova

I added the requested changes:

  • changelog
  • docs
  • schema definition

In addition I also added the capability to render code (in a way that it should not affect existing code generation templates). Also added a test for the code generation of template attributes.

semantic-conventions/syntax.md Outdated Show resolved Hide resolved
semantic-conventions/syntax.md Outdated Show resolved Hide resolved
Signed-off-by: Alexander Wert <[email protected]>
@AlexanderWert
Copy link
Member Author

@lmolkova can this be merged? Or anything missing?

@AlexanderWert
Copy link
Member Author

@open-telemetry/specs-semconv-maintainers Can we merge this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants