Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stat_sinks: add OpenTelemetry stats sink #26620

Merged
merged 43 commits into from
Apr 25, 2023

Conversation

ohadvano
Copy link
Contributor

@ohadvano ohadvano commented Apr 9, 2023

Commit Message: stat_sinks: OpenTelemetry stats sink
Additional Description: Adding OpenTelemetry stats sink, following the OTLP protocol and proto definitions.
Risk Level: Low
Testing: Unit tests, Integration tests
Docs Changes: Changelog, Stats sinks docs
Release Notes: None
Platform Specific Features: None

Resolves #26498

@repokitteh-read-only
Copy link

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #26620 was opened by ohadvano.

see: more, trace.

@repokitteh-read-only repokitteh-read-only bot added api deps Approval required for changes to Envoy's external dependencies labels Apr 9, 2023
@repokitteh-read-only
Copy link

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @mattklein123
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).
CC @envoyproxy/dependency-shepherds: Your approval is needed for changes made to (bazel/.*repos.*\.bzl)|(bazel/dependency_imports\.bzl)|(api/bazel/.*\.bzl)|(.*/requirements\.txt)|(.*\.patch).
envoyproxy/dependency-shepherds assignee is @RyanTheOptimist

🐱

Caused by: #26620 was opened by ohadvano.

see: more, trace.

@ohadvano ohadvano changed the title [WIP] stat_sinks: OpenTelemetry stats sink [WIP] stat_sinks: add OpenTelemetry stats sink Apr 9, 2023
ohadvano added 8 commits April 9, 2023 17:37
Signed-off-by: Ohad Vano <[email protected]>
Signed-off-by: Ohad Vano <[email protected]>
Signed-off-by: Ohad Vano <[email protected]>
Signed-off-by: Ohad Vano <[email protected]>
Signed-off-by: Ohad Vano <[email protected]>
Signed-off-by: Ohad Vano <[email protected]>
Signed-off-by: Ohad Vano <[email protected]>
Signed-off-by: Ohad Vano <[email protected]>
@ohadvano ohadvano marked this pull request as ready for review April 9, 2023 17:28
@ohadvano ohadvano changed the title [WIP] stat_sinks: add OpenTelemetry stats sink stat_sinks: add OpenTelemetry stats sink Apr 9, 2023
@ohadvano
Copy link
Contributor Author

ohadvano commented Apr 9, 2023

I am opening this PR for initial review, before proceeding to tests implementation. Will appreciate your help with initial review.
/assign-from @envoyproxy/senior-maintainers

@mattklein123 I have temporarily mentioned you in the CODEOWNERS for that new extension, just to unblock the CI (or not temporarily if requested of course).

@repokitteh-read-only
Copy link

@envoyproxy/senior-maintainers assignee is @mattklein123

🐱

Caused by: a #26620 (comment) was created by @ohadvano.

see: more, trace.

@mattklein123
Copy link
Member

Seems fine at a high level. Can you add tests, etc. and we can go from there?

/wait

@RyanTheOptimist
Copy link
Contributor

/lgtm deps

@repokitteh-read-only repokitteh-read-only bot removed the deps Approval required for changes to Envoy's external dependencies label Apr 10, 2023
@kyessenov
Copy link
Contributor

Very cool. Did you think of the semantic conversion support (https://opentelemetry.io/docs/reference/specification/compatibility/prometheus_and_openmetrics/)? For example, stripping _bucket and _count suffixes.

@ohadvano
Copy link
Contributor Author

Very cool. Did you think of the semantic conversion support (https://opentelemetry.io/docs/reference/specification/compatibility/prometheus_and_openmetrics/)? For example, stripping _bucket and _count suffixes.

Thanks for referring me to this doc, I haven't got to it yet. I'm curious however what is the motivation for supporting this? As I see it, Envoy is not implementing Prometheus semantics in first place? Also if someone chooses this sink then I would assume they would want the semantics to be with OTLP in first place, so what does Prometheus has to do with it? I guess we could add Prometheus sink dedicated for that protocol..

@kyessenov
Copy link
Contributor

@ohadvano You are right that Envoy stats pre-date conventions of prometheus. But looking forward, metrics are likely to be added that follow either prometheus or Otel conventions. For example, Istio added istio_requests_total in its custom extension, which is in "prometheus" style, so it'd fail "otel" style when converted verbatim. But if you write "otel" style metrics directly, there's no problem with "otel" sink. Perhaps, we can go the other way and instead assume all new metrics are "otel" style and convert to "prometheus" style in the "prometheus" sink. BTW I don't think this need to be solved now, I was curious to hear your thoughts.

@ohadvano
Copy link
Contributor Author

ohadvano commented Apr 10, 2023

For example, Istio added istio_requests_total in its custom extension, which is in "prometheus" style, so it'd fail "otel" style when converted verbatim

What would be the expected behavior in such case if OTEL sink is being used, or alternatively asking what do you mean by 'fail'?

@kyessenov
Copy link
Contributor

Per Otel semantic conventions: If the counter has a _total suffix, it MUST be removed.
The other one I know is: duration must be in seconds, and _milliseconds suffix must be stripped and unit converted.

@ohadvano
Copy link
Contributor Author

Per Otel semantic conventions: If the counter has a _total suffix, it MUST be removed. The other one I know is: duration must be in seconds, and _milliseconds suffix must be stripped and unit converted.

Yeah but I mean, what should happen in case you "violate" this convention and decide to name your counter with _total suffix anyways? I'm asking this because currently Envoy has some counter stats with this name, for example: downstream_cx_total. Is it expected that the collector will drop these metrics? I couldn't find any other doc that describes this, also couldn't find any other doc that specifies these conventions regardless of Prometheus, so will appreciate if you could refer me to any.

@ohadvano
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #26620 (comment) was created by @ohadvano.

see: more, trace.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some high level comments to start out with, thanks.

/wait

ohadvano and others added 5 commits April 24, 2023 21:28
Signed-off-by: ohadvano <[email protected]>
Signed-off-by: ohadvano <[email protected]>
Signed-off-by: ohadvano <[email protected]>
Signed-off-by: ohadvano <[email protected]>
@ohadvano
Copy link
Contributor Author

/coverage

@repokitteh-read-only
Copy link

Coverage for this Pull Request will be rendered here:

https://storage.googleapis.com/envoy-pr/26620/coverage/index.html

The coverage results are (re-)rendered each time the CI envoy-presubmit (check linux_x64 coverage) job completes.

🐱

Caused by: a #26620 (comment) was created by @ohadvano.

see: more, trace.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM at a high level. Some small comments, thanks.

/wait

Signed-off-by: ohadvano <[email protected]>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mattklein123 mattklein123 enabled auto-merge (squash) April 25, 2023 15:48
@ohadvano
Copy link
Contributor Author

@mattklein123 I think this won't merge because of the API approval check, can you please approve it as well?

@ohadvano
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #26620 (comment) was created by @ohadvano.

see: more, trace.

@mattklein123 mattklein123 merged commit a6d46b6 into envoyproxy:main Apr 25, 2023
@ohadvano ohadvano deleted the ohadvano/otel_stats_sink branch April 25, 2023 20:11
alyssawilk added a commit to alyssawilk/envoy that referenced this pull request Apr 25, 2023
alyssawilk added a commit to alyssawilk/envoy that referenced this pull request Apr 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Open Telemetry stats sink support
4 participants