
Feature: Generate function and extension logs via Telemetry API receiver #1347

Conversation

jerrytfleung (Contributor)

Motivation

The Telemetry API can provide function and extension logs. Forwarding these logs provides more observability.

Changes

  • Changed the Subscribe function in telemetryapi/client.go so it takes a slice of event types
  • Added logs support
  • Added sharedcomponent (copied from opentelemetry-collector-contrib) so the HTTP server can be shared among receivers
  • Added unit test cases for logs

@jerrytfleung jerrytfleung requested a review from a team May 25, 2024 02:37
@@ -64,7 +69,7 @@ func (r *telemetryAPIReceiver) Start(ctx context.Context, host component.Host) e
}()

telemetryClient := telemetryapi.NewClient(r.logger)
-_, err := telemetryClient.Subscribe(ctx, r.extensionID, fmt.Sprintf("http://%s/", address))
+_, err := telemetryClient.Subscribe(ctx, []telemetryapi.EventType{telemetryapi.Platform, telemetryapi.Function, telemetryapi.Extension}, r.extensionID, fmt.Sprintf("http://%s/", address))
jerrytfleung (Contributor, Author) commented May 25, 2024

I wonder if it would be a good idea to set up a config so users can choose which types of events to subscribe to.
Something like:

receivers:
  telemetryapi:
    types: [platform, function, extension]

But after taking a look at config.go, I see there is a field extensionID. Is it possible to let internal/lifecycle/manager set up the extensionID while users set up the types in the collector configuration?

jerrytfleung (Contributor, Author)

@tylerbenson @rapphil @nslaughter Can you please advise on the question above? Thanks!

nslaughter (Contributor) commented Jun 14, 2024

@jerrytfleung

  1. Sorry about the delay in getting you this review. I appreciate the work you're doing on this. Great stuff!
  2. I'm also not familiar enough with the extension API to say off the top of my head exactly how that will translate to the config.
  3. The config change you mention seems like it would benefit some users and could be implemented in a way that's unobjectionable to others. These are questions we'd normally be best to hash out with the maintainers in a GitHub issue, ideally accompanied by a very focused PR with the proposed implementation, but this comment thread is fine for that discussion.

Contributor

But after taking a look at config.go, I see there is a field extensionID. Is it possible to let internal/lifecycle/manager set up the extensionID while users set up the types in the collector configuration?

I think we will need to refactor the code a bit so that the extension id is passed to the factory functions createTracesReceiver and createLogsReceiver through the scope.

However, if we are adding configuration, we will also need to be able to support multiple instances of the receivers, which does not seem to be the case with the usage of this SharedComponents abstraction.

jerrytfleung (Contributor, Author)

After trying it out, I found that the extensionID is preserved if I add a field. I updated the code to support multiple instances of the telemetry API receiver.
I think SharedComponents could help reuse the HTTP server by using cfg as a key to select the corresponding receiver.
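The cfg-as-key idea behind the copied sharedcomponent package can be sketched like this. This is a simplified stand-in, not the real opentelemetry-collector-contrib code: receivers created from the same config key share one underlying instance (and therefore one HTTP server).

```go
package main

import "fmt"

// sharedComponents holds one component per config key, so two factory
// calls with the same *Config reuse the same receiver instance.
type sharedComponents struct {
	comps map[any]*sharedComponent
}

type sharedComponent struct {
	component any
	refCount  int
}

func newSharedComponents() *sharedComponents {
	return &sharedComponents{comps: map[any]*sharedComponent{}}
}

// GetOrAdd returns the existing component for key, or calls create
// exactly once to build it.
func (s *sharedComponents) GetOrAdd(key any, create func() any) *sharedComponent {
	if c, ok := s.comps[key]; ok {
		c.refCount++
		return c
	}
	c := &sharedComponent{component: create(), refCount: 1}
	s.comps[key] = c
	return c
}

// Unwrap exposes the underlying component, mirroring the contrib API.
func (c *sharedComponent) Unwrap() any { return c.component }

func main() {
	receivers := newSharedComponents()
	cfg := &struct{ Port int }{Port: 4325}

	a := receivers.GetOrAdd(cfg, func() any { return "receiver-instance" })
	b := receivers.GetOrAdd(cfg, func() any { return "never-called" })
	fmt.Println(a == b) // same cfg key, so the same shared instance
}
```

This is why createTracesReceiver and createLogsReceiver for the same config can end up registering their consumers on one receiver instead of starting two HTTP servers.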

tylerbenson (Member)

@rapphil @nslaughter PTAL.

@tylerbenson tylerbenson added the go Pull requests that update Go code label Jun 13, 2024
nslaughter (Contributor) left a comment

Let me know what you think of the changes suggested in factory.go. That's key to this PR, I think.

Comment on lines 52 to 55
r := receivers.GetOrAdd(cfg, func() component.Component {
	return newTelemetryAPIReceiver(cfg, params)
})
r.Unwrap().(*telemetryAPIReceiver).registerTracesConsumer(next)
nslaughter (Contributor) commented Jun 15, 2024

I would want more data protection in this file than I see with the top-level receivers variable, but it looks like the code doesn't read from receivers. As I traced through this code, I started thinking we wouldn't get any benefit from the GetOrAdd call on receivers. What if we changed it to the following?

Suggested change
- r := receivers.GetOrAdd(cfg, func() component.Component {
- 	return newTelemetryAPIReceiver(cfg, params)
- })
- r.Unwrap().(*telemetryAPIReceiver).registerTracesConsumer(next)
+ r := newTelemetryAPIReceiver(cfg, params)
+ r.registerTracesConsumer(next)

And if that works, we don't have to rely on copying internal/sharedcomponent.

Contributor

And actually this is a bit fragile: it means you cannot use this receiver in, let's say, more than one pipeline.

jerrytfleung (Contributor, Author)

I was thinking of reusing the HTTP server. internal/sharedcomponent can help reuse it by looking up the receiver using cfg as a key.
I was assuming we could have configuration for telemetryapi.

jerrytfleung (Contributor, Author)

I think sharedcomponent could play a role in reusing the HTTP server and supporting multiple instances of the telemetry API receiver.

Can you take a look at the updated change and let me know your feedback? Thanks!

}

func (r *telemetryAPIReceiver) registerTracesConsumer(next consumer.Traces) {
r.nextTraces = next
Contributor

Since we are trying to reuse the HTTP server, and it seems we want a single instance of the traces consumer and the logs consumer, would it make sense for nextTraces to be an array?

This would allow having multiple receivers in different pipelines.

jerrytfleung (Contributor, Author) commented Jun 20, 2024

My thinking is to reuse the HTTP server within a receiver instance, where each receiver instance has a single traces consumer and a single logs consumer.

I don't see the benefit of making nextTraces an array. Can you please elaborate on that?
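For context, the array idea under discussion amounts to a fan-out: the receiver keeps a slice of consumers and forwards each batch to all of them, so one shared HTTP server could feed the same receiver placed in several pipelines. A minimal sketch, with simplified stand-in types rather than the real consumer.Traces interface:

```go
package main

import "fmt"

// tracesConsumer is a simplified stand-in for consumer.Traces.
type tracesConsumer func(batch string)

// receiver holds a slice of downstream consumers instead of one,
// which is the change rapphil is suggesting.
type receiver struct {
	nextTraces []tracesConsumer
}

// registerTracesConsumer appends rather than overwrites, so each
// pipeline that includes this receiver gets its own registration.
func (r *receiver) registerTracesConsumer(next tracesConsumer) {
	r.nextTraces = append(r.nextTraces, next)
}

// consumeTraces fans one incoming batch out to every registered consumer.
func (r *receiver) consumeTraces(batch string) {
	for _, next := range r.nextTraces {
		next(batch)
	}
}

func main() {
	var delivered int
	r := &receiver{}
	r.registerTracesConsumer(func(string) { delivered++ }) // pipeline 1
	r.registerTracesConsumer(func(string) { delivered++ }) // pipeline 2
	r.consumeTraces("span-batch")
	fmt.Println(delivered) // both pipelines saw the batch
}
```

With a single nextTraces field, the second registration would silently replace the first, which is the fragility noted earlier in the thread.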

rapphil (Contributor) commented Jun 18, 2024

My overall comment is that I'm trying to understand why we need to have this:

Added sharedcomponent (copied from opentelemetry-collector-contrib) so as to share the http server among receivers

Can't we have multiple subscriptions using the telemetry client, each using a different port to subscribe to the events so that we don't have to have this singleton receiver? This will open the door for adding the configurations that you suggested here, which I think is a good idea.

jerrytfleung (Contributor, Author) commented Jun 20, 2024

My overall comment is that I'm trying to understand why we need to have this:

Added sharedcomponent (copied from opentelemetry-collector-contrib) so as to share the http server among receivers

Can't we have multiple subscriptions using the telemetry client, each using a different port to subscribe to the events so that we don't have to have this singleton receiver? This will open the door for adding the configurations that you suggested here, which I think is a good idea.

I managed to support multiple instances of the telemetry API receiver by allowing users to configure the port in the receiver configuration. Please take a look and let me know your feedback. Thanks!

Example config:

receivers:
    telemetryapi:
    telemetryapi/1:
      port: 4326
    telemetryapi/2:
      port: 4326
      types:
        - platform
        - function
    telemetryapi/3:
      port: 4326
      types: ["platform", "function"]

jerrytfleung (Contributor, Author)

@rapphil @nslaughter PTAL. Thanks!

"FATAL2": plog.SeverityNumberFatal2,
"FATAL3": plog.SeverityNumberFatal3,
"FATAL4": plog.SeverityNumberFatal4,
"CRITICAL": plog.SeverityNumberFatal,
jerrytfleung (Contributor, Author)

Adding CRITICAL here because the Python logging library can log CRITICAL (50), which is mapped to Fatal in opentelemetry-python: https://opentelemetry-python.readthedocs.io/en/latest/_modules/opentelemetry/_logs/severity.html#std_to_otel
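The mapping being discussed can be sketched as follows. The numeric constants are simplified local stand-ins for the plog.SeverityNumber values (the real ones live in go.opentelemetry.io/collector/pdata/plog), and the map shows only the entries relevant to this thread.

```go
package main

import "fmt"

// Simplified stand-ins for the OTel log severity numbers
// (Trace=1, Warn=13, Fatal=21 in the data model).
const (
	severityTrace = 1
	severityWarn  = 13
	severityFatal = 21
)

// severityTextToNumber sketches the conversion under review: besides
// the standard OTel names, "CRITICAL" (Python logging) maps to Fatal,
// "WARNING" maps to Warn, and "ALL" maps to Trace.
func severityTextToNumber(severityText string) int {
	m := map[string]int{
		"TRACE":    severityTrace,
		"WARN":     severityWarn,
		"WARNING":  severityWarn, // Python logging when the record is JSON
		"FATAL":    severityFatal,
		"CRITICAL": severityFatal, // Python logging.CRITICAL (50) maps to Fatal
		"ALL":      severityTrace,
	}
	return m[severityText]
}

func main() {
	fmt.Println(severityTextToNumber("CRITICAL"))
	fmt.Println(severityTextToNumber("ALL"))
}
```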

"FATAL3": plog.SeverityNumberFatal3,
"FATAL4": plog.SeverityNumberFatal4,
"CRITICAL": plog.SeverityNumberFatal,
"ALL": plog.SeverityNumberTrace,
jerrytfleung (Contributor, Author) left a comment

Added comments to explain the decision to add the CRITICAL and ALL severity text conversions.

jerrytfleung (Contributor, Author)

My overall comment is that I'm trying to understand why we need to have this:

Added sharedcomponent (copied from opentelemetry-collector-contrib) so as to share the http server among receivers

Can't we have multiple subscriptions using the telemetry client, each using a different port to subscribe to the events so that we don't have to have this singleton receiver? This will open the door for adding the configurations that you suggested here, which I think is a good idea.

I am thinking of the following configuration, so users can configure the subscription types for each receiver. They can also add the receivers to pipelines according to their requirements.

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "localhost:4317"
      http:
        endpoint: "localhost:4318"
  telemetryapi/1:
    port: 4325
    types: [platform]
  telemetryapi/2:
    port: 4326
    types: [function]
exporters:
  logging:
    loglevel: debug

service:
  pipelines:
    traces:
      receivers: [otlp,telemetryapi/1,telemetryapi/2]
      exporters: [logging]
    metrics:
      receivers: [otlp]
      exporters: [logging]
    logs/1:
      receivers: [telemetryapi/1]
      exporters: [logging]
    logs/2:
      receivers: [telemetryapi/2]
      exporters: [logging]
  telemetry:
    metrics:
      address: localhost:8888


bhaskarbanerjee

@jerrytfleung I see that the config you shared above supports metrics in the pipeline as well:

metrics:
      receivers: [otlp]
      exporters: [logging]

But as far as I can see, this PR only supports traces.

Do you mean that with this PR the collector will start receiving and exporting metrics, as well as logs, in addition to traces?

jerrytfleung (Contributor, Author)

@jerrytfleung I see that the config you shared above supports metrics in the pipeline as well:

metrics:
      receivers: [otlp]
      exporters: [logging]

But as far as I can see, this PR only supports traces.

Do you mean that with this PR the collector will start receiving and exporting metrics, as well as logs, in addition to traces?

@bhaskarbanerjee SolarWinds has a plan to extract information from platform events and convert it into metrics, e.g. when we see phase=init from initReport, we can increment the faas.coldstarts counter, which is a FaaS metric. Currently this is still WIP at SolarWinds due to some internal discussion.

This PR covers logs only.

tylerbenson (Member)

@rapphil @jerrytfleung What's the status of this PR? Are there more changes needed, or is resolving the merge conflict enough?

@jerrytfleung maybe you could come to the next SIG meeting and present these changes? I'm having a hard time understanding, and I think that would help.

jerrytfleung (Contributor, Author)

@rapphil @jerrytfleung What's the status of this PR? Are there more changes needed, or is resolving the merge conflict enough?

@jerrytfleung maybe you could come to the next SIG meeting and present these changes? I'm having a hard time understanding, and I think that would help.

I resolved the merge conflict. I don't have other changes to add. This PR aims to extract the log messages from the Telemetry API and export them.

I am happy to join the SIG meeting to present the idea and results and to answer questions. See you then.

nslaughter (Contributor) left a comment

Good test coverage. I'm going to approve now. I would prefer to see those changes implemented for maintainability, but I don't want to slow this feature down for that.

@@ -260,6 +260,7 @@ func severityTextToNumber(severityText string) plog.SeverityNumber {
"FATAL4": plog.SeverityNumberFatal4,
"CRITICAL": plog.SeverityNumberFatal,
"ALL": plog.SeverityNumberTrace,
"WARNING": plog.SeverityNumberWarn,
jerrytfleung (Contributor, Author) commented Aug 28, 2024

Adding WARNING here because the Python logger can log at the WARNING level when the function log record is in JSON format.

Example

{
    "timestamp": "2024-07-12T20:06:53Z",
    "level": "WARNING",
    "message": "30 warning {'filters': [], 'name': 'root', 'level': 0, 'parent': None, 'propagate': True, 'handlers': [<LambdaLoggerHandlerWithFrameType (NOTSET)>], 'disabled': False, '_cache': {10: True, 20: True, 30: True}}",
    "logger": "root",
    "requestId": "dcbdc55b-25bb-49dd-8273-64a96181ca2a",
    "warning": [
        "from-extra",
        30
    ]
}

jerrytfleung (Contributor, Author)

@tylerbenson @nslaughter I updated the test cases and polished the code again. Please help move this forward. Thanks!

@tylerbenson tylerbenson merged commit a64f3d3 into open-telemetry:main Sep 4, 2024
12 checks passed
Labels
go Pull requests that update Go code
5 participants