
CloudWatch Agent Fails on EKS when IMDS is Restricted According to Best Practices #517

Open
fitchtech opened this issue May 4, 2021 · 10 comments · May be fixed by #1171
Labels
bug Something isn't working

Comments

@fitchtech

When deploying the aws-cloudwatch-metrics chart version 0.0.4 with image.tag 1.247347.6b250880 and IRSA mapped to arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy, I get the following error in the DaemonSet logs. IMDS is disabled per best practices, since I'm using RBAC with IRSA. I'm not sure if this is an actual issue or can be safely ignored.

Logs:

2021/05/04 16:39:28 I! 2021/05/04 16:39:25 E! ec2metadata is not available
2021/05/04 16:39:25 I! attempt to access ECS task metadata to determine whether I'm running in ECS.
2021/05/04 16:39:26 W! retry [0/3], unable to get http response from http://169.254.170.2/v2/metadata, error: unable to get response from http://169.254.170.2/v2/metadata, error: Get "http://169.254.170.2/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2021/05/04 16:39:27 W! retry [1/3], unable to get http response from http://169.254.170.2/v2/metadata, error: unable to get response from http://169.254.170.2/v2/metadata, error: Get "http://169.254.170.2/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2021/05/04 16:39:28 W! retry [2/3], unable to get http response from http://169.254.170.2/v2/metadata, error: unable to get response from http://169.254.170.2/v2/metadata, error: Get "http://169.254.170.2/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2021/05/04 16:39:28 I! access ECS task metadata fail with response unable to get response from http://169.254.170.2/v2/metadata, error: Get "http://169.254.170.2/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers), assuming I'm not running in ECS.
I! Detected the instance is OnPrem
2021/05/04 16:39:28 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json ...
/opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json does not exist or cannot read. Skipping it.
2021/05/04 16:39:28 Reading json config file path: /etc/cwagentconfig/..2021_05_04_16_39_17.050069089/cwagentconfig.json ...
2021/05/04 16:39:28 Find symbolic link /etc/cwagentconfig/..data
2021/05/04 16:39:28 Find symbolic link /etc/cwagentconfig/cwagentconfig.json
2021/05/04 16:39:28 Reading json config file path: /etc/cwagentconfig/cwagentconfig.json ...
Valid Json input schema.
Got Home directory: /root
Got Home directory: /root
I! Set home dir Linux: /root
I! SDKRegionWithCredsMap region: us-west-2
No csm configuration found.
No metric configuration found.
Configuration validation first phase succeeded

2021/05/04 16:39:28 I! Config has been translated into TOML /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml
2021-05-04T16:39:28Z I! Starting AmazonCloudWatchAgent 1.247347.6
2021-05-04T16:39:28Z I! Loaded inputs: cadvisor k8sapiserver
2021-05-04T16:39:28Z I! Loaded aggregators:
2021-05-04T16:39:28Z I! Loaded processors: ec2tagger k8sdecorator
2021-05-04T16:39:28Z I! Loaded outputs: cloudwatchlogs
2021-05-04T16:39:28Z I! Tags enabled:
2021-05-04T16:39:28Z I! [agent] Config: Interval:1m0s, Quiet:false, Hostname:"ip-10-0-6-146.us-west-2.compute.internal", Flush Interval:1s
2021-05-04T16:39:28Z I! [logagent] starting
2021-05-04T16:39:28Z I! [logagent] found plugin cloudwatchlogs is a log backend

@fitchtech fitchtech added the bug Something isn't working label May 4, 2021
@fitchtech
Author

This is still a problem in the aws-cloudwatch-metrics 0.0.5 Helm chart when you restrict access to the instance metadata service (IMDS) per the EKS best practices documentation.

Setting hostNetwork: true on the CloudWatch agent DaemonSet lets it reach the metadata service, so the agent can start.

However, that shouldn't be necessary if the agent used the proper credential chain for assuming a role via a Kubernetes service account annotated for IAM Roles for Service Accounts (IRSA).
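As a sketch of the workaround, this is roughly what the DaemonSet pod spec would need (field names follow the standard Kubernetes DaemonSet schema; the chart's manifest names and whether its values file exposes this setting are assumptions):

```yaml
# Excerpt of a DaemonSet pod spec. hostNetwork places the pod in the
# node's network namespace, so it can reach IMDS at 169.254.169.254
# even when pod-level IMDS access is blocked per the best practices.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: aws-cloudwatch-metrics
spec:
  template:
    spec:
      hostNetwork: true   # workaround only; this defeats the IMDS restriction
      serviceAccountName: aws-cloudwatch-metrics
      containers:
        - name: aws-cloudwatch-metrics
          image: amazon/cloudwatch-agent:1.247347.6b250880
```

Note the trade-off: with hostNetwork the pod can again read the node's instance credentials, which is exactly what restricting IMDS was meant to prevent.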

@fitchtech fitchtech changed the title aws-cloudwatch-metrics 0.0.4 unable to access metadata service CloudWatch Agent Fails on EKS when IMDS is Restricted According to Best Practices Aug 20, 2021
@bcelenza

bcelenza commented Nov 7, 2021

I ran into this while trying to get the agent running on a pure fargate cluster w/ IRSA.

It looks like the agent is using its own credentials chain, which does not include support for IRSA: aws/amazon-cloudwatch-agent#308

@claudio-vellage

I'm not sure; for me it doesn't seem to pick up the service account at all. It might be a different issue, but for some reason it always tries to use the role from the node:

pod

   ...
   serviceAccount: aws-cloudwatch-metrics
   serviceAccountName: aws-cloudwatch-metrics
   ...

sa

 metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::*REDACTED*:role/AmazonEKSCloudWatchMetricsRole

[outputs.cloudwatchlogs] Aws error received when sending logs to /aws/containerinsights/*REDACTED*/performance/*REDACTED*: AccessDeniedException: User: arn:aws:sts::*REDACTED*:assumed-role/eksNodeRole/*REDACTED* is not authorized to perform: logs:PutLogEvents on resource: arn:aws:logs:us-east-1:*REDACTED*:log-gr status code: 400, request id: *REDACTED*

It's somehow using the eksNodeRole (the role assigned to the node), not the service account role; I'm not sure why. I'm using the same approach as for all other applications, where it works flawlessly.
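For comparison, when IRSA is wired up correctly the EKS pod identity webhook mutates the pod to inject a projected web-identity token plus two environment variables. A sketch of the injected fields (paths and names as documented for IRSA; the role ARN is a placeholder) looks like:

```yaml
# Fields the EKS pod identity webhook injects into the pod spec.
# If these are missing from a running pod, the webhook never mutated it;
# if present but the node role is still used, the agent's credential
# chain is skipping web-identity credentials (aws/amazon-cloudwatch-agent#308).
env:
  - name: AWS_ROLE_ARN
    value: arn:aws:iam::<account-id>:role/AmazonEKSCloudWatchMetricsRole
  - name: AWS_WEB_IDENTITY_TOKEN_FILE
    value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
volumeMounts:
  - name: aws-iam-token
    mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
    readOnly: true
volumes:
  - name: aws-iam-token
    projected:
      sources:
        - serviceAccountToken:
            audience: sts.amazonaws.com
            expirationSeconds: 86400
```

A quick check is to inspect the running pod's environment for AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE to tell the two failure modes apart.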

@itforgeuk

Any update on this? Should we downgrade the cwagent version?

@all4innov

Any update?

@hbouaziz

hbouaziz commented May 4, 2022

Happy 1-year bugversairy :)

@mkirlin

mkirlin commented Oct 10, 2022

Hi! We've run into this issue in our cluster, so I just wanted to bump this again. We're raising a ticket with our AWS rep as well.

@all4innov

Any new updates?

@ebadfd

ebadfd commented Aug 10, 2023

Any update?

@adrianmkng

adrianmkng commented Sep 4, 2023

By default, instance_metadata_tags is disabled on EC2 instances, so you can't query the instance's tags from within the instance itself.

To make things more interesting, there's a bug with EKS where you can't actually enable this for EKS nodes (see terraform-aws-modules/terraform-aws-eks#1785). I don't think there is a workaround for this at the moment.
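Related to the IMDS restriction itself: the EKS best practices lock down IMDS by requiring IMDSv2 and setting the PUT response hop limit to 1, which is what keeps pods (one extra network hop away) from reaching it. As a hedged sketch, relaxing just the hop limit re-enables pod access while keeping IMDSv2 enforced (the instance ID is a placeholder, and this partially undoes the lockdown):

```shell
# Allow containers one extra hop to IMDS while still requiring IMDSv2 tokens.
aws ec2 modify-instance-metadata-options \
  --instance-id <instance-id> \
  --http-tokens required \
  --http-put-response-hop-limit 2
```

This is a per-instance workaround, not a fix; the underlying ask in this issue is still that the agent honor IRSA credentials so IMDS isn't needed at all.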
