Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load balance metrics for cumulative to delta #25858

Closed
gouthamve opened this issue Aug 17, 2023 · 11 comments · Fixed by #26378
Closed

Load balance metrics for cumulative to delta #25858

gouthamve opened this issue Aug 17, 2023 · 11 comments · Fixed by #26378
Assignees
Labels
enhancement New feature or request exporter/loadbalancing good first issue Good for newcomers

Comments

@gouthamve
Copy link
Member

gouthamve commented Aug 17, 2023

Component(s)

exporter/loadbalancing

Is your feature request related to a problem? Please describe.

The cumulative to delta processor cannot be scaled right because the same metrics need to end up in the same Collector. https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/cumulativetodeltaprocessor#warnings

Describe the solution you'd like

Extend the load balancing exporter to also support load balancing metrics.

Describe alternatives you've considered

No response

Additional context

No response

@gouthamve gouthamve added enhancement New feature or request needs triage New item requiring triage exporter/loadbalancing labels Aug 17, 2023
@github-actions
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions
Copy link
Contributor

Pinging code owners for exporter/loadbalancing: @jpkrohling. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@jpkrohling jpkrohling removed the needs triage New item requiring triage label Aug 18, 2023
@jpkrohling
Copy link
Member

Do you think it's sufficient to take only the metric name into consideration, or should it take the resource attributes (names and values) as well?

@gouthamve
Copy link
Member Author

It should also have attributes. There could be one metric name which is 90% of all the metrics.

We've seen this in Cortex. Before we used to only shard on metric name and this caused extreme hot spotting. We then switched to include the labels too.

@jpkrohling
Copy link
Member

Given we have code for both logs and traces, I'm marking this as "good first issue" before working on that myself.

@jpkrohling jpkrohling added the good first issue Good for newcomers label Aug 23, 2023
@claudiobastos
Copy link
Contributor

claudiobastos commented Aug 25, 2023

@jpkrohling I would like to contribute on this one

WIP: https://github.com/claudiobastos/opentelemetry-collector-contrib/tree/claudiobastos/loadexporter-add-metrics

@claudiobastos
Copy link
Contributor

@jpkrohling

Does it make sense that routing based on Service and Routing based on Metric (+Resource attr/values) should be mutualy exclusive?

When balancing metrics, if service routing is defined, it should only balance a subset of them ?

@jpkrohling
Copy link
Member

@gouthamve, it would be nice to get your opinion, but I think the routing_key for metrics should be one of: service, metric, resource, attribute:

  • when service is provided, the service.name resource attribute alone is used for load-balancing
  • when metric is provided, the metric name alone should be used
  • when resource is used, the metric name plus all resource attribute names and their values should be used
  • when attribute is used, the metric name plus all resource attribute names and their values should be used, in addition to the data point's attribute names and values

@gouthamve
Copy link
Member Author

gouthamve commented Aug 31, 2023

While it makes sense, which would be the default?

I'd lean towards attribute, but only to provide the best load-balancing. But we might be missing use-cases, where resource might make more sense.

@claudiobastos
Copy link
Contributor

How about one routing key specific for metrics ? @jpkrohling @gouthamve

@jpkrohling
Copy link
Member

I think service could be a good default, as it probably has the best performance of them all while still being reasonably good for deployments with a good number of services. Otherwise, resource, which would provide the best balancing but with higher costs.

I think we should not implement attribute at the beginning, as it requires going too deep into the metric data points.

jpkrohling added a commit that referenced this issue Sep 21, 2023
Closes #25858

**Description:** Add metrics exporter that will balance its metrics
considering new routing choices

**Link to tracking Issue:** #25858

**Testing:** Add tests for new routing choices and metrics exporter

---------

Signed-off-by: Claudio B <[email protected]>
Co-authored-by: Juraci Paixão Kröhling <[email protected]>
jmsnll pushed a commit to jmsnll/opentelemetry-collector-contrib that referenced this issue Nov 12, 2023
Closes open-telemetry#25858

**Description:** Add metrics exporter that will balance its metrics
considering new routing choices

**Link to tracking Issue:** open-telemetry#25858

**Testing:** Add tests for new routing choices and metrics exporter

---------

Signed-off-by: Claudio B <[email protected]>
Co-authored-by: Juraci Paixão Kröhling <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request exporter/loadbalancing good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants