Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cmd/mdatagen] Ability to disable/enable all metrics #10074

Open
Tracked by #10904
dmitryax opened this issue Feb 15, 2024 · 21 comments
Open
Tracked by #10904

[cmd/mdatagen] Ability to disable/enable all metrics #10074

dmitryax opened this issue Feb 15, 2024 · 21 comments
Labels
enhancement New feature or request

Comments

@dmitryax
Copy link
Member

dmitryax commented Feb 15, 2024

Problem

We have a few issues when it's not desirable to configure the emitted metrics and resources starting from the "default" set. Sometimes, it's needed to start with an empty set and enable just a few metrics or start with enabling all metrics/resources (default+optional) and explicitly disabling a few of them.

This is needed at least for:

Possible solutions

Given that the metrics and resource_attributes interfaces provide a map name: config, we need to introduce special metric names that can be applied to a group of metrics/resources.

Option 1

Have a predefined special metric name that cannot collide with existing names, e.g., _ALL.

metrics:
  _ALL:
    enabled: false
  metric1:
    enabled: true

Option 2

Support glob-style * in metric names. Using * in metric/attribute names isn't valid according to the OTel specification, so it cannot cause collisions. Providing this capability makes the user interface more powerful, as users will be able to disable/enable groups of metrics by their namespaces.

metrics:
  '*':
     enabled: false
  metric1:
    enabled: true

Providing this interface can cause collisions if the several globs match the same metric (e.g. * and system.* match any system.* metric). Once such collisions are detected, the config validation should fail because there is no way to clearly define precedence with a map.

This approach requires an evaluation from the implementation perspective. It might over-complicate the mdatagen logic.

@dmitryax dmitryax changed the title Ability to disable/enable all metrics [cmd/mdatagen] Ability to disable/enable all metrics Feb 15, 2024
@TylerHelmuth
Copy link
Member

I really want this feature.

The glob-style matching is really interesting. It would easily let you turn off a set of metrics you don't want, like *__replicas to turn off all replica metrics from the k8sclusterreceiver. As long as we can detect collisions during startup I really like that option.

I think we should be very strict about not introducing regex as any part of this solution now or in the future. I know it is common to take the step from glob to regex eventually, but I don't think that'd be a good idea.

@dmitryax
Copy link
Member Author

dmitryax commented Feb 15, 2024

I think we should be very strict about not introducing regex as any part of this solution now or in the future. I know it is common to take the step from glob to regex eventually, but I don't think that'd be a good idea.

Right, I was thinking we can support * characters only, nothing else. But since other glob characters like ? [] are also invalid in OTel naming conventions, maybe we can just adopt glob as is. Definitely never regex.

@TylerHelmuth
Copy link
Member

Supporting full glob is probably ok.

@TylerHelmuth
Copy link
Member

TylerHelmuth commented Feb 15, 2024

If we used globs, I think we need more rules for how clashes interact. It feels like wanting to do something like

metrics:
  '*':
     enabled: false
  "k8s.pod.*":
    enabled: true

could arise to turn off all metrics except pod metrics

@braydonk
Copy link
Contributor

I opened a related PR here which I was hoping I could leverage in the processesscraper deprecation work: open-telemetry/opentelemetry-collector-contrib#30995

I would be in favour of ditching this and going for the glob style instead. I think that would be very powerful. I'd be happy to take on the implementation.

Unless there exists a Go library for it that I'm not aware of, I'd implement glob matching with *, ?, and [] and then implement detecting globs and matching against metric names when accepting a metric builder config.

For clashes, I would need to look at this again but when the metrics builder config is passed to, is the order as written in the config file maintained? Because in that case I think it would make the most sense to make it so globs are matched in the order they appear in the config.

@dmitryax
Copy link
Member Author

For clashes, I would need to look at this again but when the metrics builder config is passed to, is the order as written in the config file maintained? Because in that case I think it would make the most sense to make it so globs are matched in the order they appear in the config.

@braydonk It's not maintained. It's map

@dmitryax
Copy link
Member Author

If we used globs, I think we need more rules for how clashes interact. It feels like wanting to do something like

Do you mean we define that enabled: true takes precedence over enabled: false? What about other configuration options that we can add in the future? Like applying some aggregation. It's problematic even for enabled: if we allow clashes, I'd see users trying something like

metrics:
  '*':
     enabled: true
  "k8s.pod.*":
    enabled: false

which won't work for them in that case.

Trying to be smart and set precedence based on the size of the matching metrics set seems to be an overkill. I think we should start with not allowing clashes.

@TylerHelmuth
Copy link
Member

Not allowing clashes at the start would be really nice, but I think will take away a lot of the use cases. The implementation will help users who want to turn on all their metrics, but not users who want to turn off most of the metrics. Maybe that is ok?

@dmitryax
Copy link
Member Author

The implementation will help users who want to turn on all their metrics, but not users who want to turn off most of the metrics.

Why not? You just disable all of them with * and enable some of them explicitly by names

@dmitryax
Copy link
Member Author

Another option is to allow clashes but specify priority explicitly as an additional field e.g.:

metrics:
  '*':
    matching_order: 1
    enabled: true
  "k8s.pod.*":
    matching_order: 2
    enabled: false

Seems a bit ugly to me

@TylerHelmuth
Copy link
Member

You just disable all of them with * and enable some of them explicitly by names

Oh I was under the impression that would be considered a clash. If that is allowed we're good - no clash support needed.

Copy link
Contributor

Pinging code owners for cmd/mdatagen: @dmitryax. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@dmitryax
Copy link
Member Author

@braydonk will you be able to take this?

@crobert-1 crobert-1 added the enhancement New feature or request label Feb 26, 2024
@braydonk
Copy link
Contributor

Yes I will take it.

@braydonk
Copy link
Contributor

braydonk commented Mar 19, 2024

I have the wildcard matching working. I implemented two types of matching:

* - 0 to n of any character
{a,b,...} - A collection of multiple match options

I can implement any other types of wildcards if it would make sense, but every use case I can think of only really benefits from these two options.

The matching also has a priority system where the earlier wildcards are applied first. This way the scenario where we see something like:

'*':
  enabled: false
'process.memory.*':
  enabled: true

This will work in a deterministic and (I think) sensible order; all metrics will be disabled, then all process.memory metrics will be enabled.
I'm working on writing a proper specification for the format and priority orders to iron out the edge cases.

The next step is to determine how this code is included in a generated metadata package. I see two reasonable possibilities:

  1. The wildcard matching code is included directly as another .go.tmpl file (or in an existing one) and it's used locally within the generated package.
  2. A go module is added somewhere under go.opentelemetry.io and the package is imported by generated code.

Option 1 is definitely easier to implement, but working with the code in template files might be frustrating for future maintenance. But this also probably doesn't deserve to be a new module because it seems pretty specific to mdatagen.

I'm going to start by implementing option 1 in a branch so I can put something up, but want to leave the possibility of option 2 open for now.

@braydonk
Copy link
Contributor

braydonk commented May 1, 2024

I have opened #10065

I ended up going with option 1 from my comment above for the PR.

P.S. This issue should probably be moved to opentelemetry-collector since this component has moved there now.

@TylerHelmuth TylerHelmuth transferred this issue from open-telemetry/opentelemetry-collector-contrib May 2, 2024
@dmitryax
Copy link
Member Author

dmitryax commented May 4, 2024

@braydonk, I don't think we can rely on the order of the map keys because it's not supposed to be preserved

@braydonk
Copy link
Contributor

braydonk commented May 4, 2024

The PR I posted takes that into account and works regardless of the order. There's a custom priority system. The priority of the pattern goes down for every . in the name, so * will be viewed as higher priority than x.* regardless of what order the patterns are read.

@dmitryax
Copy link
Member Author

dmitryax commented May 4, 2024

How does it work with the {...}? Which one wins compared to *? What if I have prefix.* and *.suffix?

@braydonk
Copy link
Contributor

I have written a full specification at https://gist.github.com/braydonk/ccb6775331fdd5dca91a650330b9839f.

@dmitryax
Copy link
Member Author

dmitryax commented Jun 13, 2024

Thank you for putting the doc! Once we have the PR merged, please submit a PR to add this in cmd/mdatagen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants