
Add support AWS Embedded Metric Format Version 0 #1

Merged

@khanhntd merged 2 commits into main from emf on Apr 4, 2023
Conversation

@khanhntd commented Apr 3, 2023

Same as this PR open-telemetry#20314

Description:
Currently, awsemfexporter groups all the metrics, translates them into fields, and exports them as EMF version 1:

```json
"_aws": {
    "CloudWatchMetrics": [
        {
            "Namespace": "ECS",
            "Dimensions": [ ["ClusterName"] ],
            "Metrics": [{"Name": "memcached_commands_total"}]
        }
    ],
    "Timestamp": 1668387032641
}
```

Therefore, this PR adds support for EMF version 0 and a flag that lets customers choose between version 0 and version 1. An EMF v0 payload looks like this:

```json
{
    "Version": 0,
    "CloudWatchMetrics": [
        {
            "Namespace": "ECS",
            "Dimensions": [ ["ClusterName"] ],
            "Metrics": [{"Name": "memcached_commands_total"}]
        }
    ],
    "Timestamp": 1668387032641
}
```
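For illustration, here is a minimal Go sketch of the branching this flag implies; the function and field names are hypothetical, not the PR's actual code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// applyEMFVersion sketches the choice between the two formats: v1 nests the
// CloudWatch metadata under "_aws", while v0 keeps it at the top level and
// adds an explicit "Version": 0.
func applyEMFVersion(fields map[string]any, cwMetrics []any, ts int64, enableV1 bool) {
	if enableV1 {
		fields["_aws"] = map[string]any{
			"CloudWatchMetrics": cwMetrics,
			"Timestamp":         ts,
		}
		return
	}
	fields["Version"] = 0
	fields["CloudWatchMetrics"] = cwMetrics
	fields["Timestamp"] = ts
}

func main() {
	cwMetrics := []any{map[string]any{
		"Namespace":  "ECS",
		"Dimensions": [][]string{{"ClusterName"}},
		"Metrics":    []any{map[string]any{"Name": "memcached_commands_total"}},
	}}
	// Print both variants to compare the envelopes.
	for _, v1 := range []bool{true, false} {
		fields := map[string]any{"ClusterName": "demo"}
		applyEMFVersion(fields, cwMetrics, 1668387032641, v1)
		out, _ := json.Marshal(fields)
		fmt.Println(string(out))
	}
}
```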

This can be validated by sending metrics with awsemfexporter.


Link to tracking Issue:
N/A

Testing:
There are two tests I have done:

  • By running `go test` with awsemfexporter, the result is:

```console
khanhntd@88665a0eaf54 awsemfexporter % go test ./...
ok      github.com/open-telemetry/opentelemetry-collector-contrib/exporter/awsemfexporter       2.409s
```

@khanhntd force-pushed the emf branch 4 times, most recently from c49a4eb to 140b262 on April 3, 2023 at 14:06
@khanhntd changed the title from "Emf" to "Add support AWS Embedded Metric Format Version 0" on Apr 3, 2023
@khanhntd marked this pull request as ready for review on April 3, 2023 at 14:07
```diff
@@ -47,18 +49,32 @@ func createDefaultConfig() component.Config {
 		LogStreamName:         "",
 		Namespace:             "",
 		DimensionRollupOption: "ZeroAndSingleDimensionRollup",
+		EnableEMFVersion1:     true,
```
@khanhntd (Author):
The default EMF version is v1, and the public documentation mentions v1 more often than v0. Therefore, `EnableEMFVersion1` defaults to true.

```diff
@@ -186,8 +186,8 @@ func (lm *LabelMatcher) init() (err error) {
 	if len(lm.Separator) == 0 {
 		lm.Separator = ";"
 	}
-	lm.compiledRegex = regexp.MustCompile(lm.Regex)
-	return
+	lm.compiledRegex, err = regexp.Compile(lm.Regex)
+	return
```
@khanhntd (Author):
MustCompile panics if the regexp cannot be compiled. Therefore, use Compile, which does the same thing as MustCompile but returns a readable error instead of panicking.
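For context, a small self-contained sketch of that difference, assuming only the standard library `regexp` package (the helper name is hypothetical):

```go
package main

import (
	"fmt"
	"regexp"
)

// compileMatcher surfaces a readable error for a bad pattern instead of
// panicking the way regexp.MustCompile would.
func compileMatcher(pattern string) (*regexp.Regexp, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("invalid label matcher regex %q: %w", pattern, err)
	}
	return re, nil
}

func main() {
	if _, err := compileMatcher("("); err != nil {
		fmt.Println(err) // invalid label matcher regex "(": error parsing regexp: ...
	}
}
```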

```diff
@@ -102,6 +108,8 @@ type MetricDescriptor struct {
 	Overwrite bool `mapstructure:"overwrite"`
 }
 
+var _ component.Config = (*Config)(nil)
```
@khanhntd (Author):
This compile-time assertion validates that the awsemfexporter `Config` implements every method declared by the `component.Config` interface.
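This is the standard Go compile-time assertion idiom. A self-contained sketch with a stand-in interface (the real `component.Config` lives in the collector module):

```go
package main

// Validator stands in for component.Config's contract in this sketch.
type Validator interface {
	Validate() error
}

type Config struct{}

// Validate must exist for the assertion below to compile.
func (c *Config) Validate() error { return nil }

// Compile-time check: *Config must satisfy Validator. The blank identifier
// discards the value, so the check costs nothing at runtime.
var _ Validator = (*Config)(nil)

func main() {}
```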

@chadpatel left a comment:

Added a few small comments. I am mostly wondering why we want to handle the uuid error, which is extremely unlikely to occur.


```go
	if err != nil {
		return nil, err
	}
```

why are we bothering to handle this error?

@williazz commented Apr 3, 2023:

why would we ignore unhandled errors just because they are unlikely?


I am more trying to get at what changed. This is one of those types of methods that will never fail unless something is completely broken.

@khanhntd (Author):

If it failed, it would fail in the unit tests. As Billy said, why would we ignore unhandled errors? If this were only a test environment, then yes, I agree it could be less robust, but I don't think we want to take that shortcut in a production environment.


```diff
 	// create AWS session
-	awsConfig, session, err := awsutil.GetAWSConfigSession(logger, &awsutil.Conn{}, &expConfig.AWSSessionSettings)
+	awsConfig, session, err := awsutil.GetAWSConfigSession(set.Logger, &awsutil.Conn{}, &config.AWSSessionSettings)
```


why is this changing? It seems like you are refactoring things that are unrelated to the specific change necessary for EMFv0

There are a few examples of that in this PR


Yeah, I agree. If you must include unrelated changes in a single PR, then I'd appreciate comments saying that they are unrelated.


IMO - better to refactor in a different PR

@khanhntd (Author):

I'm fine with creating a different PR. However, it's only a simple line change and there's hardly any complication in it. If we were refactoring more than 5 files with more than 50 lines, then yes, I would agree with you.

@khanhntd force-pushed the emf branch 3 times, most recently from f0327b9 to cb7b1ec on April 3, 2023 at 21:55
| `detailed_metrics` | Retain detailed datapoint values in exported metrics (e.g. instead of exporting a quantile as a statistical value, preserve the quantile's population) | `false` |
| `output_destination` | Specify the EMFExporter output. Currently, two options are available: "cloudwatch" or "stdout" | `cloudwatch` |
| `emf_version` | Send metrics to CloudWatch Logs with Embedded Metric Format in the selected version ([e.g. version 1 with `_aws`](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Embedded_Metric_Format_Specification.html#CloudWatch_Embedded_Metric_Format_Specification_structure), version 0 without `_aws`) | `1` |


The critical thing to communicate to our customers is what the default is (i.e., 1) and that it is optional. It is also redundant to call it `emf_version`, as we are already in the EMF exporter. Maybe `output_version`? I don't care as much about the naming as about the lack of documentation on how this should be used (i.e., don't use it, the default is fine :)).

@khanhntd (Author):

I disagree; `output_version` would be confusing alongside `output_destination`. Would `output_version=2` mean we are exporting EMF version 2 to CloudWatch, or exporting EMF to CloudWatch version 2?

Agree with you on this:

> The critical thing to communicate to our customers is what the default is (i.e., 1) and that it is optional.

That's why the default exists, right? Optional configuration, instead of forcing customers to have expertise in it.

@khanhntd (Author):

Changed to `version` to be less redundant.
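For illustration, a hypothetical sketch of how the renamed option could map onto the exporter's config struct (field and tag names are assumed from this thread, not copied from the PR):

```go
package awsemfexporter

// Config sketches only the slice of the exporter configuration discussed
// here; the real struct has many more options.
type Config struct {
	// Version selects the EMF format version: "1" (the default, metadata
	// nested under "_aws") or "0" (top-level "Version": 0).
	Version string `mapstructure:"version"`
}
```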

@khanhntd force-pushed the emf branch 2 times, most recently from 5c9b579 to 68b5a38 on April 4, 2023 at 07:51
@williazz commented Apr 4, 2023

Did you do any testing?

@khanhntd (Author) commented Apr 4, 2023

> Did you do any testing?

It's in the PR description.

@khanhntd merged commit 52ade39 into main on Apr 4, 2023
@khanhntd deleted the emf branch on April 4, 2023 at 18:13
sky333999 pushed a commit that referenced this pull request Aug 24, 2023
…emetry#24676)

**Description:** The metadata.yml for the SSH check receiver currently
documents a resource attribute containing the SSH endpoint, but it is not
emitted. This PR updates the receiver to include this resource attribute.

**Link to tracking Issue:** open-telemetry#24441 

**Testing:**

Example collector config:
```yaml
receivers:
  sshcheck:
    endpoint: 13.245.150.131:22
    username: ec2-user
    key_file: /Users/dewald.dejager/.ssh/sandbox.pem
    collection_interval: 15s
    known_hosts: /Users/dewald.dejager/.ssh/known_hosts
    ignore_host_key: false
    resource_attributes:
      "ssh.endpoint":
        enabled: true

exporters:
  logging:
    verbosity: detailed
  prometheus:
    endpoint: 0.0.0.0:8081
    resource_to_telemetry_conversion:
      enabled: true

service:
  pipelines:
    metrics:
      receivers: [sshcheck]
      exporters: [logging, prometheus]
```

The log output looks like this:
```
2023-07-30T16:52:38.724+0200    info    MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "resource metrics": 1, "metrics": 2, "data points": 2}
2023-07-30T16:52:38.724+0200    info    ResourceMetrics #0
Resource SchemaURL: 
Resource attributes:
     -> ssh.endpoint: Str(13.245.150.131:22)
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope otelcol/sshcheckreceiver 0.82.0-dev
Metric #0
Descriptor:
     -> Name: sshcheck.duration
     -> Description: Measures the duration of SSH connection.
     -> Unit: ms
     -> DataType: Gauge
NumberDataPoints #0
StartTimestamp: 2023-07-30 14:52:22.381672 +0000 UTC
Timestamp: 2023-07-30 14:52:38.404003 +0000 UTC
Value: 319
Metric #1
Descriptor:
     -> Name: sshcheck.status
     -> Description: 1 if the SSH client successfully connected, otherwise 0.
     -> Unit: 1
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
StartTimestamp: 2023-07-30 14:52:22.381672 +0000 UTC
Timestamp: 2023-07-30 14:52:38.404003 +0000 UTC
Value: 1
```

And the Prometheus metrics look like this:
```
# HELP sshcheck_duration Measures the duration of SSH connection.
# TYPE sshcheck_duration gauge
sshcheck_duration{ssh_endpoint="13.245.150.131:22"} 311
# HELP sshcheck_status 1 if the SSH client successfully connected, otherwise 0.
# TYPE sshcheck_status gauge
sshcheck_status{ssh_endpoint="13.245.150.131:22"} 1
```
sky333999 pushed a commit that referenced this pull request Aug 24, 2023
)

**Description:** 

Adding command line argument `--status-code` to `telemetrygen traces`,
which accepts `(Unset,Error,Ok)` (case sensitive) or the enum equivalent
of `(0,1,2)`.

Running 

```shell
telemetrygen traces --otlp-insecure --traces 1 --status-code 1
```

against a minimal local collector yields

```txt
2023-07-29T21:27:57.862+0100	info	ResourceSpans #0
Resource SchemaURL: https://opentelemetry.io/schemas/1.4.0
Resource attributes:
     -> service.name: Str(telemetrygen)
ScopeSpans #0
ScopeSpans SchemaURL:
InstrumentationScope telemetrygen
Span #0
    Trace ID       : f6dc4be32c78b9999c69d504a79e68c1
    Parent ID      : 4e2cd6e0e90cf2ea
    ID             : 20835413e32d26a5
    Name           : okey-dokey
    Kind           : Server
    Start time     : 2023-07-29 20:27:57.861602 +0000 UTC
    End time       : 2023-07-29 20:27:57.861726 +0000 UTC
    Status code    : Error
    Status message :
Attributes:
     -> net.peer.ip: Str(1.2.3.4)
     -> peer.service: Str(telemetrygen-client)
Span #1
    Trace ID       : f6dc4be32c78b9999c69d504a79e68c1
    Parent ID      :
    ID             : 4e2cd6e0e90cf2ea
    Name           : lets-go
    Kind           : Client
    Start time     : 2023-07-29 20:27:57.861584 +0000 UTC
    End time       : 2023-07-29 20:27:57.861726 +0000 UTC
    Status code    : Error
    Status message :
Attributes:
     -> net.peer.ip: Str(1.2.3.4)
     -> peer.service: Str(telemetrygen-server)
```

and similarly (the string version)

```shell
telemetrygen traces --otlp-insecure --traces 1 --status-code '"Ok"'
```

produces 

```txt
Resource SchemaURL: https://opentelemetry.io/schemas/1.4.0
Resource attributes:
     -> service.name: Str(telemetrygen)
ScopeSpans #0
ScopeSpans SchemaURL:
InstrumentationScope telemetrygen
Span #0
    Trace ID       : dfd830da170acfe567b12f87685d7917
    Parent ID      : 8e15b390dc6a1ccc
    ID             : 165c300130532072
    Name           : okey-dokey
    Kind           : Server
    Start time     : 2023-07-29 20:29:16.026965 +0000 UTC
    End time       : 2023-07-29 20:29:16.027089 +0000 UTC
    Status code    : Ok
    Status message :
Attributes:
     -> net.peer.ip: Str(1.2.3.4)
     -> peer.service: Str(telemetrygen-client)
Span #1
    Trace ID       : dfd830da170acfe567b12f87685d7917
    Parent ID      :
    ID             : 8e15b390dc6a1ccc
    Name           : lets-go
    Kind           : Client
    Start time     : 2023-07-29 20:29:16.026956 +0000 UTC
    End time       : 2023-07-29 20:29:16.027089 +0000 UTC
    Status code    : Ok
    Status message :
Attributes:
     -> net.peer.ip: Str(1.2.3.4)
     -> peer.service: Str(telemetrygen-server)
```

The default is `Unset`, which is the current behaviour.

**Link to tracking Issue:**

24286

**Testing:**

Added unit tests which cover both valid and invalid inputs.

**Documentation:**

Command line arguments are self-documenting via the usage info in the flag.

Co-authored-by: Pablo Baeyens <[email protected]>
jefchien pushed a commit that referenced this pull request Nov 27, 2023
open-telemetry#29116)

**Description:** 

As originally proposed in open-telemetry#26991 before I got distracted

Exposes the duration of generated spans as a command line parameter. It
uses a `DurationVar` flag so units can be easily provided and are
automatically applied.

Example usage:

```bash
telemetrygen traces --traces 100 --otlp-insecure --span-duration 10ns # nanoseconds
telemetrygen traces --traces 100 --otlp-insecure --span-duration 10us # microseconds
telemetrygen traces --traces 100 --otlp-insecure --span-duration 10ms # milliseconds
telemetrygen traces --traces 100 --otlp-insecure --span-duration 10s # seconds
```

**Testing:** 

Ran without the argument provided (`telemetrygen traces --traces 1
--otlp-insecure`) and saw spans published with the default value.

Ran again with the argument provided: `telemetrygen traces --traces 1
--otlp-insecure --span-duration 1s`

And observed the expected output:

```
Resource SchemaURL: https://opentelemetry.io/schemas/1.4.0
Resource attributes:
     -> service.name: Str(telemetrygen)
ScopeSpans #0
ScopeSpans SchemaURL: 
InstrumentationScope telemetrygen 
Span #0
    Trace ID       : 8b441587ffa5820688b87a6b511d634c
    Parent ID      : 39faad428638791b
    ID             : 88f0886894bd4ee2
    Name           : okey-dokey
    Kind           : Server
    Start time     : 2023-11-12 02:05:07.97443 +0000 UTC
    End time       : 2023-11-12 02:05:08.97443 +0000 UTC
    Status code    : Unset
    Status message : 
Attributes:
     -> net.peer.ip: Str(1.2.3.4)
     -> peer.service: Str(telemetrygen-client)
Span #1
    Trace ID       : 8b441587ffa5820688b87a6b511d634c
    Parent ID      : 
    ID             : 39faad428638791b
    Name           : lets-go
    Kind           : Client
    Start time     : 2023-11-12 02:05:07.97443 +0000 UTC
    End time       : 2023-11-12 02:05:08.97443 +0000 UTC
    Status code    : Unset
    Status message : 
Attributes:
     -> net.peer.ip: Str(1.2.3.4)
     -> peer.service: Str(telemetrygen-server)
	{"kind": "exporter", "data_type": "traces", "name": "debug"}
```

**Documentation:** No documentation added.

---------

Co-authored-by: Pablo Baeyens <[email protected]>
okankoAMZ pushed a commit that referenced this pull request Jun 27, 2024
**Description:**
This PR implements the new container logs parser as proposed in
open-telemetry#31959.

**Link to tracking Issue:** open-telemetry#31959

**Testing:**

Added unit tests. Providing manual testing steps as well:

### How to test this manually

1. Using the following config file:
```yaml
receivers:
  filelog:
    start_at: end
    include_file_name: false
    include_file_path: true
    include:
    - /var/log/pods/*/*/*.log
    operators:
      - id: container-parser
        type: container
        output: m1
      - type: move
        id: m1
        from: attributes.k8s.pod.name
        to: attributes.val
      - id: some
        type: add
        field: attributes.key2.key_in
        value: val2

exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]
      processors: []
```
2. Start the collector:
`./bin/otelcontribcol_linux_amd64 --config
~/otelcol/container_parser/config.yaml`
3. Use the following bash script to create some logs:
```bash
#! /bin/bash

echo '2024-04-13T07:59:37.505201169-05:00 stdout P This is a very very long crio line th' >> /var/log/pods/kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler43/1.log
echo '{"log":"INFO: log line here","stream":"stdout","time":"2029-03-30T08:31:20.545192187Z"}' >> /var/log/pods/kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log
echo '2024-04-13T07:59:37.505201169-05:00 stdout F at is awesome! crio is awesome!' >> /var/log/pods/kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler43/1.log
echo '2021-06-22T10:27:25.813799277Z stdout P some containerd log th' >> /var/log/pods/kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler44/1.log
echo '{"log":"INFO: another log line here","stream":"stdout","time":"2029-03-30T08:31:20.545192187Z"}' >> /var/log/pods/kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log
echo '2021-06-22T10:27:25.813799277Z stdout F at is super awesome! Containerd is awesome' >> /var/log/pods/kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler44/1.log



echo '2024-04-13T07:59:37.505201169-05:00 stdout F standalone crio line which is awesome!' >> /var/log/pods/kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler43/1.log
echo '2021-06-22T10:27:25.813799277Z stdout F standalone containerd line that is super awesome!' >> /var/log/pods/kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler44/1.log
```
4. Run the above as a bash script to verify any parallel processing.
Verify that the output is correct.


### Test manually on k8s

1. `make docker-otelcontribcol && docker tag otelcontribcol
otelcontribcol-dev:0.0.1 && kind load docker-image
otelcontribcol-dev:0.0.1`
2. Install using the following helm values file:
```yaml
mode: daemonset
presets:
  logsCollection:
    enabled: true

image:
  repository: otelcontribcol-dev
  tag: "0.0.1"
  pullPolicy: IfNotPresent

command:
  name: otelcontribcol

config:
  exporters:
    debug:
      verbosity: detailed
  receivers:
    filelog:
      start_at: end
      include_file_name: false
      include_file_path: true
      exclude:
        - /var/log/pods/default_daemonset-opentelemetry-collector*_*/opentelemetry-collector/*.log
      include:
        - /var/log/pods/*/*/*.log
      operators:
        - id: container-parser
          type: container
          output: some
        - id: some
          type: add
          field: attributes.key2.key_in
          value: val2


  service:
    pipelines:
      logs:
        receivers: [filelog]
        processors: [batch]
        exporters: [debug]
```
3. Check the collector's output to verify the logs are parsed properly:
```console
2024-05-10T07:52:02.307Z	info	LogsExporter	{"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 2}
2024-05-10T07:52:02.307Z	info	ResourceLog #0
Resource SchemaURL: 
ScopeLogs #0
ScopeLogs SchemaURL: 
InstrumentationScope  
LogRecord #0
ObservedTimestamp: 2024-05-10 07:52:02.046236071 +0000 UTC
Timestamp: 2024-05-10 07:52:01.92533954 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 07:52:01)
Attributes:
     -> log: Map({"iostream":"stdout"})
     -> time: Str(2024-05-10T07:52:01.92533954Z)
     -> k8s: Map({"container":{"name":"busybox","restart_count":"0"},"namespace":{"name":"default"},"pod":{"name":"daemonset-logs-6f6mn","uid":"1069e46b-03b2-4532-a71f-aaec06c0197b"}})
     -> logtag: Str(F)
     -> key2: Map({"key_in":"val2"})
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-6f6mn_1069e46b-03b2-4532-a71f-aaec06c0197b/busybox/0.log)
Trace ID: 
Span ID: 
Flags: 0
LogRecord #1
ObservedTimestamp: 2024-05-10 07:52:02.046411602 +0000 UTC
Timestamp: 2024-05-10 07:52:02.027386192 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 07:52:02)
Attributes:
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-6f6mn_1069e46b-03b2-4532-a71f-aaec06c0197b/busybox/0.log)
     -> time: Str(2024-05-10T07:52:02.027386192Z)
     -> log: Map({"iostream":"stdout"})
     -> logtag: Str(F)
     -> k8s: Map({"container":{"name":"busybox","restart_count":"0"},"namespace":{"name":"default"},"pod":{"name":"daemonset-logs-6f6mn","uid":"1069e46b-03b2-4532-a71f-aaec06c0197b"}})
     -> key2: Map({"key_in":"val2"})
Trace ID: 
Span ID: 
Flags: 0
...
```


**Documentation:** Added

Signed-off-by: ChrsMark <[email protected]>
okankoAMZ pushed a commit that referenced this pull request Jun 27, 2024
…try#33225)

**Description:** Using the DB span example below, the X-Ray exporter failed
to generate the expected DB call subsegment names because it could not parse
JDBC connection strings that start with the `jdbc:` prefix.
```
Span #1
    Trace ID       : 663a0b68a5e3849c09c07f914b3df738
    Parent ID      : 1052e2a4a2516884
    ID             : 374de78b552e23c2
    Name           : orders@no-appsignals-mysql-1.cnkqok6c8mo1.eu-west-1.rds.amazonaws.com
    Kind           : Client
    Start time     : 2024-05-07 11:07:20.62 +0000 UTC
    End time       : 2024-05-07 11:07:20.624 +0000 UTC
    Status code    : Unset
    Status message :
Attributes:
     -> db.connection_string: Str(jdbc:mysql://no-appsignals-mysql-1.cnkqok6c8mo1.eu-west-1.rds.amazonaws.com:3306)
     -> db.name: Str(orders)
     -> db.system: Str(MySQL)
     -> db.user: Str([email protected])
```
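A minimal sketch of the assumed fix (not necessarily the exporter's actual code): strip the `jdbc:` scheme prefix so the remainder parses as an ordinary URL, yielding the host for the subsegment name:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeJDBC drops a leading "jdbc:" so url.Parse can handle the rest.
func normalizeJDBC(connStr string) (string, error) {
	u, err := url.Parse(strings.TrimPrefix(connStr, "jdbc:"))
	if err != nil {
		return "", err
	}
	return u.Host, nil
}

func main() {
	host, err := normalizeJDBC("jdbc:mysql://no-appsignals-mysql-1.cnkqok6c8mo1.eu-west-1.rds.amazonaws.com:3306")
	if err != nil {
		panic(err)
	}
	fmt.Println(host) // no-appsignals-mysql-1.cnkqok6c8mo1.eu-west-1.rds.amazonaws.com:3306
}
```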

**Link to tracking Issue:**

**Testing:** local tests
okankoAMZ pushed a commit that referenced this pull request Jun 27, 2024
…pen-telemetry#33353)

**Description:** Container parser should add k8s metadata as resource
attributes and not as log record attributes.

**Link to tracking Issue:** Fixes open-telemetry#33341

**Testing:** Manual testing on local k8s cluster:

```console
2024-06-04T06:40:08.219Z	info	ResourceLog #0
Resource SchemaURL: 
Resource attributes:
     -> k8s.pod.uid: Str(d5ecc924-e255-4525-b5be-6437939b1e4d)
     -> k8s.container.name: Str(busybox)
     -> k8s.namespace.name: Str(default)
     -> k8s.pod.name: Str(daemonset-logs-dhzcq)
     -> k8s.container.restart_count: Str(0)
ScopeLogs #0
ScopeLogs SchemaURL: 
InstrumentationScope  
LogRecord #0
ObservedTimestamp: 2024-06-04 06:40:08.007370503 +0000 UTC
Timestamp: 2024-06-04 06:40:07.855932421 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 06:40:07)
Attributes:
     -> logtag: Str(F)
     -> key2: Map({"key_in":"val2"})
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-dhzcq_d5ecc924-e255-4525-b5be-6437939b1e4d/busybox/0.log)
     -> time: Str(2024-06-04T06:40:07.855932421Z)
     -> log.iostream: Str(stdout)
Trace ID: 
Span ID: 
Flags: 0
LogRecord #1
ObservedTimestamp: 2024-06-04 06:40:08.007451031 +0000 UTC
Timestamp: 2024-06-04 06:40:07.957875321 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 06:40:07)
Attributes:
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-dhzcq_d5ecc924-e255-4525-b5be-6437939b1e4d/busybox/0.log)
     -> log.iostream: Str(stdout)
     -> time: Str(2024-06-04T06:40:07.957875321Z)
     -> key2: Map({"key_in":"val2"})
     -> logtag: Str(F)
Trace ID: 
Span ID: 
Flags: 0
```

**Documentation:** ~

---------

Signed-off-by: ChrsMark <[email protected]>
sky333999 pushed a commit that referenced this pull request Oct 28, 2024
… Histo --> Histogram (open-telemetry#33824)

## Description

This PR adds a custom metric function to the transformprocessor to
convert exponential histograms to explicit histograms.

Link to tracking issue: Resolves open-telemetry#33827

**Function Name**
```
convert_exponential_histogram_to_explicit_histogram
```

**Arguments:**

- `distribution` (_upper, midpoint, uniform, random_)
- `ExplicitBoundaries: []float64`

**Usage example:**

```yaml
processors:
  transform:
    error_mode: propagate
    metric_statements:
    - context: metric
      statements:
        - convert_exponential_histogram_to_explicit_histogram("random", [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]) 
```

**Converts:**

```
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: response_time
     -> Description: 
     -> Unit: 
     -> DataType: ExponentialHistogram
     -> AggregationTemporality: Delta
ExponentialHistogramDataPoints #0
Data point attributes:
     -> metric_type: Str(timing)
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-07-31 09:35:25.212037 +0000 UTC
Count: 44
Sum: 999.000000
Min: 40.000000
Max: 245.000000
Bucket (32.000000, 64.000000], Count: 10
Bucket (64.000000, 128.000000], Count: 22
Bucket (128.000000, 256.000000], Count: 12
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
```

**To:**

```
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: response_time
     -> Description: 
     -> Unit: 
     -> DataType: Histogram
     -> AggregationTemporality: Delta
HistogramDataPoints #0
Data point attributes:
     -> metric_type: Str(timing)
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-07-30 21:37:07.830902 +0000 UTC
Count: 44
Sum: 999.000000
Min: 40.000000
Max: 245.000000
ExplicitBounds #0: 10.000000
ExplicitBounds #1: 20.000000
ExplicitBounds #2: 30.000000
ExplicitBounds #3: 40.000000
ExplicitBounds #4: 50.000000
ExplicitBounds #5: 60.000000
ExplicitBounds #6: 70.000000
ExplicitBounds #7: 80.000000
ExplicitBounds #8: 90.000000
ExplicitBounds #9: 100.000000
Buckets #0, Count: 0
Buckets #1, Count: 0
Buckets #2, Count: 0
Buckets #3, Count: 2
Buckets #4, Count: 5
Buckets #5, Count: 0
Buckets #6, Count: 3
Buckets #7, Count: 7
Buckets #8, Count: 2
Buckets #9, Count: 4
Buckets #10, Count: 21
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
```

### Testing

- Several unit tests have been created. We have also tested by ingesting
and converting exponential histograms from the `statsdreceiver`, as well
as directly via the `otlpreceiver` over gRPC, for several hours with a
large amount of data.

- We have clients that have been running this solution in production for
a number of weeks.

### Readme description:

### convert_exponential_hist_to_explicit_hist

`convert_exponential_hist_to_explicit_hist([ExplicitBounds])`

The `convert_exponential_hist_to_explicit_hist` function converts an
ExponentialHistogram to an Explicit (_normal_) Histogram.

`ExplicitBounds` represents the list of bucket boundaries for the new
histogram. This argument is __required__ and __cannot be empty__.

__WARNING:__

The process of converting an ExponentialHistogram to an Explicit
Histogram is not perfect and may result in a loss of precision. It is
important to define an appropriate set of bucket boundaries to minimize
this loss. For example, selecting boundaries that are too high or too
low may result in histogram buckets that are too wide or too narrow,
respectively.

---------

Co-authored-by: Kent Quirk <[email protected]>
Co-authored-by: Tyler Helmuth <[email protected]>
sky333999 pushed a commit that referenced this pull request Oct 28, 2024
…etry#35544)

**Description:**

As described in open-telemetry#35491, it is useful to give users the option
of defining `receiver_creator`'s templates per container.

In this regard, the current PR introduces a new type of Endpoint called
`PodContainer` that matches the rule type `pod.container`. This Endpoint
is emitted for each container of the Pod similarly to how the `Port`
Endpoints are emitted per container that defines a port.

A complete example of how to use this feature to apply different parsing
to each of the Pod's containers is provided in the `How to test this
manually` section.

**Link to tracking Issue:** Fixes open-telemetry#35491

**Testing:** TBA

**Documentation:** TBA


### How to test this manually

1. Use the following values file to deploy the Collector's Helm chart
```yaml
mode: daemonset

image:
  repository: otelcontribcol-dev
  tag: "latest"
  pullPolicy: IfNotPresent

command:
  name: otelcontribcol

clusterRole:
  create: true
  rules:
   - apiGroups:
     - ''
     resources:
     - 'pods'
     - 'nodes'
     verbs:
     - 'get'
     - 'list'
     - 'watch'
   - apiGroups: [ "" ]
     resources: [ "nodes/proxy"]
     verbs: [ "get" ]
   - apiGroups:
       - ""
     resources:
       - nodes/stats
     verbs:
       - get
   - nonResourceURLs:
       - "/metrics"
     verbs:
       - get

extraVolumeMounts:
 - name: varlogpods
   mountPath: /var/log/pods
   readOnly: true

extraVolumes:
  - name: varlogpods
    hostPath:
      path: /var/log/pods

config:
  extensions:
    k8s_observer:
      auth_type: serviceAccount
      node: ${env:K8S_NODE_NAME}
      observe_nodes: true
  exporters:
    debug:
      verbosity: basic
  receivers:
    receiver_creator/logs:
      watch_observers: [ k8s_observer ]
      receivers:
        filelog/busybox:
          rule: type == "pod.container" && pod.labels["otel.logs"] == "true" && container_name == "busybox"
          config:
            include:
              - /var/log/pods/`pod.namespace`_`pod.name`_`pod.uid`/`container_name`/*.log
            include_file_name: false
            include_file_path: true
            operators:
              - id: container-parser
                type: container
              - type: add
                field: attributes.log.template
                value: busybox
        filelog/lazybox:
          rule: type == "pod.container" && pod.labels["otel.logs"] == "true" && container_name == "lazybox"
          config:
            include:
              - /var/log/pods/`pod.namespace`_`pod.name`_`pod.uid`/`container_name`/*.log
            include_file_name: false
            include_file_path: true
            operators:
              - id: container-parser
                type: container
              - type: add
                field: attributes.log.template
                value: lazybox
  service:
    extensions: [health_check, k8s_observer]
    pipelines:
      logs:
        receivers: [receiver_creator/logs]
        processors: [batch]
        exporters: [debug]
```
2. Follow the logs of the Collector's Pod, i.e. `k logs -f
daemonset-opentelemetry-collector-agent-2hrg5`
3. Deploy a sample Pod which consists of 2 different containers:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: daemonset-logs
  labels:
    app: daemonset-logs
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: migration-logger
      otel.logs: "true"
  template:
    metadata:
      labels:
        app.kubernetes.io/component: migration-logger
        otel.logs: "true"
    spec:
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      containers:
        - name: lazybox
          image: busybox
          args:
            - /bin/sh
            - -c
            - while true; do echo "otel logs at $(date +%H:%M:%S)" && sleep 0.1s; done
        - name: busybox
          image: busybox
          args:
            - /bin/sh
            - -c
            - while true; do echo "otel logs at $(date +%H:%M:%S)" && sleep 0.1s; done
```

Verify in the logs that only 2 filelog receivers are started, one per
container:

```console
2024-10-02T12:05:17.506Z	info	[email protected]/observerhandler.go:96	starting receiver	{"kind": "receiver", "name": "receiver_creator/logs", "data_type": "logs", "name": "filelog/lazybox", "endpoint": "10.244.0.13", "endpoint_id": "k8s_observer/01543800-cfea-4c10-8220-387e60f65151/lazybox"}
2024-10-02T12:05:17.508Z	info	adapter/receiver.go:47	Starting stanza receiver	{"kind": "receiver", "name": "receiver_creator/logs", "data_type": "logs", "name": "filelog/lazybox/receiver_creator/logs{endpoint=\"10.244.0.13\"}/k8s_observer/01543800-cfea-4c10-8220-387e60f65151/lazybox"}
2024-10-02T12:05:17.508Z	info	[email protected]/observerhandler.go:96	starting receiver	{"kind": "receiver", "name": "receiver_creator/logs", "data_type": "logs", "name": "filelog/busybox", "endpoint": "10.244.0.13", "endpoint_id": "k8s_observer/01543800-cfea-4c10-8220-387e60f65151/busybox"}
2024-10-02T12:05:17.510Z	info	adapter/receiver.go:47	Starting stanza receiver	{"kind": "receiver", "name": "receiver_creator/logs", "data_type": "logs", "name": "filelog/busybox/receiver_creator/logs{endpoint=\"10.244.0.13\"}/k8s_observer/01543800-cfea-4c10-8220-387e60f65151/busybox"}
2024-10-02T12:05:17.709Z	info	fileconsumer/file.go:256	Started watching file	{"kind": "receiver", "name": "receiver_creator/logs", "data_type": "logs", "name": "filelog/lazybox/receiver_creator/logs{endpoint=\"10.244.0.13\"}/k8s_observer/01543800-cfea-4c10-8220-387e60f65151/lazybox", "component": "fileconsumer", "path": "/var/log/pods/default_daemonset-logs-sz4zk_01543800-cfea-4c10-8220-387e60f65151/lazybox/0.log"}
2024-10-02T12:05:17.712Z	info	fileconsumer/file.go:256	Started watching file	{"kind": "receiver", "name": "receiver_creator/logs", "data_type": "logs", "name": "filelog/busybox/receiver_creator/logs{endpoint=\"10.244.0.13\"}/k8s_observer/01543800-cfea-4c10-8220-387e60f65151/busybox", "component": "fileconsumer", "path": "/var/log/pods/default_daemonset-logs-sz4zk_01543800-cfea-4c10-8220-387e60f65151/busybox/0.log"}
```

In addition, verify that the proper attributes are added per container
according to the 2 different filelog receiver definitions:


```console
2024-10-02T12:23:55.117Z	info	ResourceLog #0
Resource SchemaURL: 
Resource attributes:
     -> k8s.pod.name: Str(daemonset-logs-sz4zk)
     -> k8s.container.restart_count: Str(0)
     -> k8s.pod.uid: Str(01543800-cfea-4c10-8220-387e60f65151)
     -> k8s.container.name: Str(lazybox)
     -> k8s.namespace.name: Str(default)
     -> container.id: Str(63a8e69bdc6ee95ee7918baf913a548190f32838adeb0e6189a8210e05157b40)
     -> container.image.name: Str(busybox)
ScopeLogs #0
ScopeLogs SchemaURL: 
InstrumentationScope  
LogRecord #0
ObservedTimestamp: 2024-10-02 12:23:54.896772888 +0000 UTC
Timestamp: 2024-10-02 12:23:54.750904381 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 12:23:54)
Attributes:
     -> log.iostream: Str(stdout)
     -> logtag: Str(F)
     -> log: Map({"template":"lazybox"})
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-sz4zk_01543800-cfea-4c10-8220-387e60f65151/lazybox/0.log)
Trace ID: 
Span ID: 
Flags: 0
ResourceLog #1
Resource SchemaURL: 
Resource attributes:
     -> k8s.container.restart_count: Str(0)
     -> k8s.pod.uid: Str(01543800-cfea-4c10-8220-387e60f65151)
     -> k8s.container.name: Str(busybox)
     -> k8s.namespace.name: Str(default)
     -> k8s.pod.name: Str(daemonset-logs-sz4zk)
     -> container.id: Str(47163758424f2bc5382b1e9702301be23cab368b590b5fbf0b30affa09b4a199)
     -> container.image.name: Str(busybox)
ScopeLogs #0
ScopeLogs SchemaURL: 
InstrumentationScope  
LogRecord #0
ObservedTimestamp: 2024-10-02 12:23:54.897788935 +0000 UTC
Timestamp: 2024-10-02 12:23:54.749885634 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 12:23:54)
Attributes:
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-sz4zk_01543800-cfea-4c10-8220-387e60f65151/busybox/0.log)
     -> logtag: Str(F)
     -> log.iostream: Str(stdout)
     -> log: Map({"template":"busybox"})
Trace ID: 
Span ID: 
Flags: 0
```

Signed-off-by: ChrsMark <[email protected]>
zzhlogin pushed a commit to zzhlogin/opentelemetry-collector-contrib-aws that referenced this pull request Dec 19, 2024
#### Description
Generates simple histograms using telemetrygen
#### Link to tracking issue
Fixes

#### Testing
Test with a local otel collector with debug output

```
bin/telemetrygen metrics --metrics 5  --otlp-http --otlp-endpoint "localhost:4318"    --metric-type Histogram --otlp-insecure 
```
Output from debug Exporter: 
```
Resource SchemaURL: https://opentelemetry.io/schemas/1.13.0
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope
Metric #0
Descriptor:
     -> Name: gen
     -> Description:
     -> Unit:
     -> DataType: Histogram
     -> AggregationTemporality: Cumulative
HistogramDataPoints #0
StartTimestamp: 2024-11-13 16:22:50.633365 +0000 UTC
Timestamp: 2024-11-13 16:22:51.633367 +0000 UTC
Count: 0
Sum: 0.000000
ExplicitBounds #0: 0.000000
ExplicitBounds #1: 1.000000
ExplicitBounds #2: 2.000000
ExplicitBounds #3: 3.000000
ExplicitBounds #4: 4.000000
Buckets #0, Count: 0
Buckets #1, Count: 0
Buckets #2, Count: 0
Buckets #3, Count: 0
Buckets #4, Count: 0
ResourceMetrics #1
Resource SchemaURL: https://opentelemetry.io/schemas/1.13.0
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope
Metric #0
Descriptor:
     -> Name: gen
     -> Description:
     -> Unit:
     -> DataType: Histogram
     -> AggregationTemporality: Cumulative
HistogramDataPoints #0
StartTimestamp: 2024-11-13 16:22:50.639942 +0000 UTC
Timestamp: 2024-11-13 16:22:51.639942 +0000 UTC
Count: 1
Sum: 1.000000
ExplicitBounds #0: 0.000000
ExplicitBounds #1: 1.000000
ExplicitBounds #2: 2.000000
ExplicitBounds #3: 3.000000
ExplicitBounds #4: 4.000000
Buckets #0, Count: 0
Buckets #1, Count: 1
Buckets #2, Count: 0
Buckets #3, Count: 0
Buckets #4, Count: 0
ResourceMetrics #2
Resource SchemaURL: https://opentelemetry.io/schemas/1.13.0
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope
Metric #0
Descriptor:
     -> Name: gen
     -> Description:
     -> Unit:
     -> DataType: Histogram
     -> AggregationTemporality: Cumulative
HistogramDataPoints #0
StartTimestamp: 2024-11-13 16:22:50.6404 +0000 UTC
Timestamp: 2024-11-13 16:22:51.640401 +0000 UTC
Count: 2
Sum: 4.000000
ExplicitBounds #0: 0.000000
ExplicitBounds #1: 1.000000
ExplicitBounds #2: 2.000000
ExplicitBounds #3: 3.000000
ExplicitBounds #4: 4.000000
Buckets #0, Count: 0
Buckets #1, Count: 1
Buckets #2, Count: 0
Buckets #3, Count: 1
Buckets #4, Count: 0
ResourceMetrics #3
Resource SchemaURL: https://opentelemetry.io/schemas/1.13.0
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope
Metric #0
Descriptor:
     -> Name: gen
     -> Description:
     -> Unit:
     -> DataType: Histogram
     -> AggregationTemporality: Cumulative
HistogramDataPoints #0
StartTimestamp: 2024-11-13 16:22:50.640729 +0000 UTC
Timestamp: 2024-11-13 16:22:51.640729 +0000 UTC
Count: 3
Sum: 3.000000
ExplicitBounds #0: 0.000000
ExplicitBounds #1: 1.000000
ExplicitBounds #2: 2.000000
ExplicitBounds #3: 3.000000
ExplicitBounds #4: 4.000000
Buckets #0, Count: 1
Buckets #1, Count: 1
Buckets #2, Count: 1
Buckets #3, Count: 0
Buckets #4, Count: 0
ResourceMetrics #4
Resource SchemaURL: https://opentelemetry.io/schemas/1.13.0
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope
Metric #0
Descriptor:
     -> Name: gen
     -> Description:
     -> Unit:
     -> DataType: Histogram
     -> AggregationTemporality: Cumulative
HistogramDataPoints #0
StartTimestamp: 2024-11-13 16:22:50.641073 +0000 UTC
Timestamp: 2024-11-13 16:22:51.641073 +0000 UTC
Count: 4
Sum: 12.000000
ExplicitBounds #0: 0.000000
ExplicitBounds #1: 1.000000
ExplicitBounds #2: 2.000000
ExplicitBounds #3: 3.000000
ExplicitBounds #4: 4.000000
Buckets #0, Count: 0
Buckets #1, Count: 0
Buckets #2, Count: 1
Buckets #3, Count: 2
Buckets #4, Count: 1
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
```
#### Documentation

---------

Co-authored-by: Pablo Baeyens <[email protected]>