
Duplicate Log Entries in CloudWatch Agent, using public.ecr.aws/aws-observability/adot-autoinstrumentation-java:v1.32.5 #947

Open
yumemi-mabe opened this issue Nov 7, 2024 · 4 comments

yumemi-mabe commented Nov 7, 2024

Describe the bug

Using public.ecr.aws/aws-observability/adot-autoinstrumentation-java:v1.32.5,
the CloudWatch Agent is sending duplicate log entries with the same content, like this:

2024-11-07T04:21:35Z I! {
    "caller": "[email protected]/emf_exporter.go:119",
    "msg": "Start processing resource metrics",
    "kind": "exporter",
    "data_type": "metrics",
    "name": "awsemf/application_signals",
    "labels": {
        "aws.ecs.cluster.arn": "arn:aws:ecs:ap-northeast-1:xxx:cluster/ecscluster-my-app",
        "aws.ecs.container.arn": "arn:aws:ecs:ap-northeast-1:xxx:container/ecscluster-my-app/xxx/xxx",
        "aws.ecs.container.image.id": "sha256:xxx",
        "aws.ecs.launchtype": "fargate",
        "aws.ecs.task.arn": "arn:aws:ecs:ap-northeast-1:xxx:task/ecscluster-my-app/xxx",
        "aws.ecs.task.family": "ecstask-my-app",
        "aws.ecs.task.revision": "3",
        "aws.local.service": "ecstask-my-app",
        "aws.log.group.arns": "",
        "aws.log.group.names": "/aws/ecs/task/ecstask-my-app",
        "aws.log.stream.arns": "",
        "aws.log.stream.names": "",
        "cloud.account.id": "xxx",
        "cloud.availability_zone": "ap-northeast-1a",
        "cloud.platform": "aws_ecs",
        "cloud.provider": "aws",
        "cloud.region": "ap-northeast-1",
        "container.id": "xxx",
        "container.image.name": "xxx.dkr.ecr.ap-northeast-1.amazonaws.com/xxx",
        "container.image.tag": "xxx",
        "container.name": "ecstask-my-app",
        "host.arch": "aarch64",
        "host.name": "ip-xxx.ap-northeast-1.compute.internal",
        "os.description": "Linux 5.10.226-214.880.amzn2.aarch64",
        "os.type": "linux",
        "process.command_args": "",
        "process.executable.path": "/usr/lib/jvm/java-17-amazon-corretto/bin/java",
        "process.pid": "",
        "process.runtime.description": "Amazon.com Inc. OpenJDK 64-Bit Server VM 17.0.9+8-LTS",
        "process.runtime.name": "OpenJDK Runtime Environment",
        "process.runtime.version": "17.0.9+8-LTS",
        "service.name": "ecstask-my-app",
        "telemetry.auto.version": "1.32.5-aws",
        "telemetry.sdk.language": "java",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "1.34.1"
    }
}

Note: I masked sensitive information.

This issue occurs in v1.32.5 but did not occur in v1.32.4.

Steps to reproduce

Following the Application Signals - Deploy using the Sidecar Strategy documentation, I configured my task definition as follows:

containerDefinitions in taskdef.json
"containerDefinitions": [
  {
    "name": "my-app",
    "logConfiguration": {
      "logDriver": "awslogs",
      "secretOptions": null,
      "options": {
        "awslogs-group": "${AWS_LOGS_GROUP}",
        "awslogs-region": "${AWS_REGION}",
        "awslogs-stream-prefix": "my-app",
        "awslogs-datetime-format": "%Y-%m-%d %H:%M:%S"
      }
    },
    "entryPoint": [],
    "portMappings": [
      {
        "hostPort": 8080,
        "protocol": "tcp",
        "containerPort": 8080
      }
    ],
    "environment": [
      {
        "name": "OTEL_RESOURCE_ATTRIBUTES",
        "value": "aws.log.group.names=${AWS_LOGS_GROUP},aws.hostedin.environment=$HOST_ENV,service.name=my-app"
      },
      {
        "name": "OTEL_LOGS_EXPORTER",
        "value": "none"
      },
      {
        "name": "OTEL_METRICS_EXPORTER",
        "value": "none"
      },
      {
        "name": "OTEL_EXPORTER_OTLP_PROTOCOL",
        "value": "http/protobuf"
      },
      {
        "name": "OTEL_AWS_APPLICATION_SIGNALS_ENABLED",
        "value": "true"
      },
      {
        "name": "JAVA_TOOL_OPTIONS",
        "value": " -javaagent:/otel-auto-instrumentation/javaagent.jar"
      },
      {
        "name": "OTEL_AWS_APPLICATION_SIGNALS_EXPORTER_ENDPOINT",
        "value": "http://localhost:4316/v1/metrics"
      },
      {
        "name": "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT",
        "value": "http://localhost:4316/v1/traces"
      },
      {
        "name": "OTEL_TRACES_SAMPLER",
        "value": "xray"
      },
      {
        "name": "OTEL_PROPAGATORS",
        "value": "tracecontext,baggage,b3,xray"
      }
    ],
    "mountPoints": [
      {
        "sourceVolume": "opentelemetry-auto-instrumentation",
        "containerPath": "/otel-auto-instrumentation",
        "readOnly": false
      }
    ],
    "image": "<MY_APP_IMAGE>",
    "essential": true,
    "dependsOn": [
      {
        "containerName": "ecs-cwagent",
        "condition": "START"
      },
      {
        "containerName": "init",
        "condition": "START"
      }
    ]
  },
  {
    "name": "ecs-cwagent",
    "image": "public.ecr.aws/cloudwatch-agent/cloudwatch-agent:latest",
    "essential": true,
    "secrets": [
      {
        "name": "CW_CONFIG_CONTENT",
        "valueFrom": "ecs-cwagent"
      }
    ],
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "${AWS_LOGS_GROUP}",
        "awslogs-region": "${AWS_REGION}",
        "awslogs-stream-prefix": "ecs-cwagent",
        "awslogs-datetime-format": "%Y-%m-%dT%H:%M:%SZ"
      }
    }
  },
  {
    "name": "init",
    "image": "public.ecr.aws/aws-observability/adot-autoinstrumentation-java:v1.32.5",
    "essential": false,
    "command": [
      "cp",
      "/javaagent.jar",
      "/otel-auto-instrumentation/javaagent.jar"
    ],
    "mountPoints": [
      {
        "sourceVolume": "opentelemetry-auto-instrumentation",
        "containerPath": "/otel-auto-instrumentation",
        "readOnly": false
      }
    ]
  }
]

What did you expect to see?

The CloudWatch Agent doesn't send duplicate log entries.

What did you see instead?

The CloudWatch Agent sends duplicate log entries with the same content.

Additional context

  • I’m running the task configuration described above on AWS Fargate.
@yumemi-mabe yumemi-mabe changed the title Duplicate Log Entries in CloudWatch Agent v1.32.5 Duplicate Log Entries in CloudWatch Agent, using public.ecr.aws/aws-observability/adot-autoinstrumentation-java:v1.32.5 Nov 7, 2024
pingleig (Member) commented Nov 7, 2024

I think this issue comes from the CloudWatch Agent; the log line is here:

https://github.com/amazon-contributing/opentelemetry-collector-contrib/blob/aws-cwa-dev/exporter/awsemfexporter/emf_exporter.go#L119

func (emf *emfExporter) pushMetricsData(_ context.Context, md pmetric.Metrics) error {
	rms := md.ResourceMetrics()
	labels := map[string]string{}
	for i := 0; i < rms.Len(); i++ {
		rm := rms.At(i)
		am := rm.Resource().Attributes()
		if am.Len() > 0 {
			am.Range(func(k string, v pcommon.Value) bool {
				labels[k] = v.Str()
				return true
			})
		}
	}
	emf.config.logger.Info("Start processing resource metrics", zap.Any("labels", labels))
	emf.processResourceLabels(labels)

I am not sure this needs to be info-level logging.

TBH I am not sure which repo is the right place to create the issue (cwagent or otel contrib); you can check with @bjrara.

bjrara (Member) commented Nov 15, 2024

I agree with this comment; the log should not be emitted at info level.

In v1.32.5, we introduced runtime metrics (to be released on 11/20), which emit JVM metrics every minute. I believe the so-called "duplicated" logs come from different batches of metrics: one from the runtime metrics, and one from the RED (latency, error, fault) metrics generated from incoming/outgoing calls.

It shouldn't cause any real problems, but I agree that such logs are misleading and noisy. We should downgrade the log level to avoid confusion.
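
For illustration, a minimal sketch of what such a downgrade could look like in the pushMetricsData excerpt quoted above; the actual upstream patch may differ in detail:

	// Sketch only: emit the per-batch resource-label log at debug level instead of info,
	// so routine batches (runtime metrics vs. RED metrics) no longer show up as
	// duplicate-looking entries in the agent log group.
	emf.config.logger.Debug("Start processing resource metrics", zap.Any("labels", labels))
	emf.processResourceLabels(labels)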

bjrara (Member) commented Nov 18, 2024

Actually, the log level has already been downgraded in the OTel Collector: open-telemetry/opentelemetry-collector-contrib#29337.

This seems to be a missing patch on the CloudWatch agent side.

bjrara (Member) commented Nov 18, 2024
