
ec2-metadata-service errors in up to date AWS EKS cluster using Pod Identity #6667

Open

shaftoe opened this issue Nov 15, 2024 · 10 comments

Labels
bug This issue is a bug. needs-review This issue/pr needs review from an internal developer. p2 This is a standard priority issue

@shaftoe
shaftoe commented Nov 15, 2024

Checkboxes for prior research

Describe the bug

The latest version of https://www.npmjs.com/package/@aws-sdk/ec2-metadata-service does not seem to work out of the box with Node.js v18 in an AWS EKS Kubernetes cluster, running in a pod with an associated service account and a valid policy attached.

Regression Issue

  • Select this option if this issue appears to be a regression.

SDK version number

@aws-sdk/[email protected]

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

node v18.20.4

Reproduction Steps

# testing the Pod with latest awscli
root@nodetest:/tmp# /usr/local/bin/aws --version
aws-cli/2.21.2 Python/3.12.6 Linux/6.1.112-124.190.amzn2023.x86_64 exe/x86_64.debian.12
root@nodetest:/tmp# /usr/local/bin/aws sts get-caller-identity
{
    "UserId": "xxxxx:eks-app-dev-nodetest-23833230-c77c-4398-95a9-c03cc43bf1a7",
    "Account": "xxxxx",
    "Arn": "arn:aws:sts::xxxxx:assumed-role/eks-app-dev-app-multimediaworker/eks-app-dev-nodetest-23833230-c77c-4398-95a9-c03cc43bf1a7"
}
root@nodetest:/tmp# /usr/local/bin/aws secretsmanager get-secret-value --secret-id xxxxx --output text > output # Works too

Trying to get metadata info via the JS module:

root@nodetest:/tmp# npm install @aws-sdk/ec2-metadata-service

added 19 packages, and audited 20 packages in 2s

found 0 vulnerabilities

root@nodetest:/tmp# cat test.js 
const main = async () => {
    const { MetadataService } = require("@aws-sdk/ec2-metadata-service");

    const metadataService = new MetadataService({});
    const metadata = await metadataService.request("/latest/meta-data/", {});

    console.log(metadata);
}

main();

root@nodetest:/tmp# node test.js 
/tmp/node_modules/@aws-sdk/ec2-metadata-service/dist-cjs/index.js:112
      throw new Error(`Error making request to the metadata service: ${error}`);
            ^

Error: Error making request to the metadata service: Error: Request failed with status code 401
    at _MetadataService.request (/tmp/node_modules/@aws-sdk/ec2-metadata-service/dist-cjs/index.js:112:13)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async main (/tmp/test.js:5:22)

Node.js v18.20.4

Observed Behavior

Error when trying to fetch metadata

Expected Behavior

Metadata fetched correctly

Possible Solution

No response

Additional Information/Context

No response

@shaftoe shaftoe added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Nov 15, 2024
@RanVaknin
Contributor

RanVaknin commented Nov 15, 2024

Hi @shaftoe ,

Your comparison between the CLI and the SDK is not an apples-to-apples comparison.

In the CLI you are not specifying any particular credential method; you are letting the CLI's default credential chain resolve your credentials for you.

In the SDK you are using a specific client that is EC2 IMDS-specific. I don't think that functionality extends to the container metadata service, which is different (the IMDS endpoint and the container metadata endpoint have different IP addresses).

This begs the question, what are you trying to do? If you are just trying to use the SDK on an EKS pod, you don't need any of this. The default credential chain will fetch credentials from the container metadata endpoint automatically if it is correctly configured.

If your pod gets injected with the relevant env variables at start time, the SDK will hook into those and make the request to the container metadata service on your behalf. See the SDK docs for more info.
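
For illustration, here is a minimal sketch of what I mean (it assumes @aws-sdk/client-sts is installed; any SDK v3 client behaves the same way): no credentials are configured explicitly, and the default credential chain resolves them from the container credential endpoint via the injected env variables.

const { STSClient, GetCallerIdentityCommand } = require("@aws-sdk/client-sts");

// No credentials or endpoint are configured on purpose: the default credential
// chain reads the AWS_CONTAINER_* env variables and fetches credentials from
// the container credential endpoint for us.
const sts = new STSClient({}); // region is taken from AWS_REGION

sts
  .send(new GetCallerIdentityCommand({}))
  .then((identity) => console.log(identity.Arn))
  .catch(console.error);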

Thanks,
Ran~

@shaftoe
Author

shaftoe commented Nov 15, 2024

Thanks for the detailed explanation @RanVaknin.

This begs the question, what are you trying to do?

I am (or rather, the application I'm trying to fix is) trying to "get AnnouncedIp from the EC2 metadata API" (quoting the inline comments), i.e. to retrieve the public-ipv4 address associated with the pod via the metadata API. It has been working fine so far using https://www.npmjs.com/package/node-ec2-metadata, but running the same application in a new cluster no longer works, and @aws-sdk/ec2-metadata-service seemed to be meant exactly for that.

I suppose at this point the question is: what's the right way to use @aws-sdk/ec2-metadata-service in an EKS environment? Or is there another recommended way to access such metadata?

PS: AWS env vars like AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE and AWS_CONTAINER_CREDENTIALS_FULL_URI are populated as expected.

@RanVaknin
Contributor

Hi @shaftoe ,

Thanks for the clarification. Can you ssh into your pod and log all the available env variables you have there, and do the same for the previous working cluster's pod to see if there are any discrepancies between the two?
If you can share those with us (redact any sensitive info), that might be helpful.

Also, I haven't tested this, but maybe this would work?

// Untested guess: point the client at the container credential endpoint and
// skip the IMDSv2 token fetch (the container endpoint does not use it).
const { MetadataService } = require("@aws-sdk/ec2-metadata-service");

const metadataService = new MetadataService({
  endpoint: "http://169.254.170.2",
  disableFetchToken: true,
});

Thanks,
Ran~

@RanVaknin RanVaknin self-assigned this Nov 15, 2024
@RanVaknin RanVaknin added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Nov 15, 2024
@shaftoe
Author

shaftoe commented Nov 15, 2024

Of course, thanks a ton for the quick help.

So, env vars (which don't seem to contain anything to be redacted):

root@nodetest:/# export
declare -x AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE="/var/run/secrets/pods.eks.amazonaws.com/serviceaccount/eks-pod-identity-token"
declare -x AWS_CONTAINER_CREDENTIALS_FULL_URI="http://169.254.170.23/v1/credentials"
declare -x AWS_DEFAULT_REGION="us-west-2"
declare -x AWS_REGION="us-west-2"
declare -x AWS_STS_REGIONAL_ENDPOINTS="regional"
declare -x HOME="/root"
declare -x HOSTNAME="nodetest"
declare -x KUBERNETES_PORT="tcp://10.31.0.1:443"
declare -x KUBERNETES_PORT_443_TCP="tcp://10.31.0.1:443"
declare -x KUBERNETES_PORT_443_TCP_ADDR="10.31.0.1"
declare -x KUBERNETES_PORT_443_TCP_PORT="443"
declare -x KUBERNETES_PORT_443_TCP_PROTO="tcp"
declare -x KUBERNETES_SERVICE_HOST="10.31.0.1"
declare -x KUBERNETES_SERVICE_PORT="443"
declare -x KUBERNETES_SERVICE_PORT_HTTPS="443"
declare -x NODE_VERSION="18.20.4"
declare -x OLDPWD="/"
declare -x PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
declare -x PWD="/tmp"
declare -x SHLVL="1"
declare -x TERM="xterm"
declare -x YARN_VERSION="1.22.19"

I've also tested the suggested code change, but it seems to hang (there's no short timeout; I waited for a minute or so...).

PS: out of curiosity I've tried with endpoint: http://169.254.170.23 and it fails with status code 301.

@shaftoe
Author

shaftoe commented Nov 15, 2024

The old pod's env vars differ slightly (redacted):

declare -x AWS_DEFAULT_REGION="us-west-2"
declare -x AWS_REGION="us-west-2"
declare -x AWS_ROLE_ARN="arn:aws:iam::xxxxxx:role/eks-main-dev-app-xxxxxxx"
declare -x AWS_STS_REGIONAL_ENDPOINTS="regional"
declare -x AWS_WEB_IDENTITY_TOKEN_FILE="/var/run/secrets/eks.amazonaws.com/serviceaccount/token"
declare -x HOME="/root"
declare -x HOSTNAME="ip-xxx.us-west-2.compute.internal"
declare -x HTTP_LISTEN_PORT="4443"
declare -x INTERACTIVE="0"
declare -x KUBERNETES_PORT="tcp://10.31.0.1:443"
declare -x KUBERNETES_PORT_443_TCP="tcp://10.31.0.1:443"
declare -x KUBERNETES_PORT_443_TCP_ADDR="10.31.0.1"
declare -x KUBERNETES_PORT_443_TCP_PORT="443"
declare -x KUBERNETES_PORT_443_TCP_PROTO="tcp"
declare -x KUBERNETES_SERVICE_HOST="10.31.0.1"
declare -x KUBERNETES_SERVICE_PORT="443"
declare -x KUBERNETES_SERVICE_PORT_HTTPS="443"
declare -x NODE_VERSION="18.20.4"
declare -x OLDPWD
declare -x PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
declare -x PWD="/app"
declare -x SHLVL="1"
declare -x TERM="xterm"
declare -x TLS_CERT_SECRET_ID="platocorp.com"
declare -x YARN_VERSION="1.22.19"

@shaftoe
Author

shaftoe commented Nov 15, 2024

The request to http://169.254.170.2 did time out eventually:

Error: Error making request to the metadata service: TimeoutError: connect ETIMEDOUT 169.254.170.2:80

@RanVaknin
Contributor

Hi @shaftoe ,

Thanks for the info. The main difference I see between the two clusters is that your older cluster is using IRSA, which is the newer, more secure way of authenticating with EKS.

Admittedly, I'm light-years away from being an EKS expert, and my knowledge is really based on debugging these types of issues with customers, so please bear with me while I try to understand your setup.

PS: out of curiosity I've tried with endpoint: http://169.254.170.23 and it fails with status code 301.

That is interesting. Based on this issue, it might resolve if you add a trailing slash - http://169.254.170.23/

If you are just trying to hit the container metadata endpoint, you shouldn't need to use the SDK. In theory you can just ssh into your pod and make a curl request to the endpoint to get that metadata.

If it were me debugging my own environment, I would just try all of the following and see if one of them sticks:

# Basic endpoint probing
curl http://169.254.170.2/
curl http://169.254.170.23/

# IMDS v2 endpoints (EC2 metadata service)
curl http://169.254.169.254/latest/meta-data/
curl http://169.254.169.254/latest/meta-data/public-ipv4
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/

# Pod Identity endpoints 
curl http://169.254.170.23/v1/credentials
TOKEN=$(cat $AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE)
curl -H "Authorization: $TOKEN" http://169.254.170.23/v1/credentials

Thanks again,
Ran~

@RanVaknin RanVaknin added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. labels Nov 15, 2024
@shaftoe
Author

shaftoe commented Nov 15, 2024

The main difference I see between the two clusters is that your older cluster is using IRSA, which is the newer, more secure way of authenticating with EKS.

This is actually funny: we're currently setting up a new EKS cluster following all the recommendations we could find, and EKS Pod Identity seems to be "the new way" (so supposedly "the correct way" too, right?) of interacting with IAM; see the announcement blog post if you're curious.

That is interesting. Based on awslabs/aws-sdk-rust#560, it might resolve if you add a trailing slash - http://169.254.170.23/

Fails with 301, but

TOKEN=$(cat $AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE)
curl -H "Authorization: $TOKEN" http://169.254.170.23/v1/credentials

works and shows the tokens as expected.

I agree that I don't need to use the SDK if all that's needed is to parse some HTTP response (probably in JSON format). The question is where to find documentation for the API exposed by http://169.254.170.23/; so far /v1/credentials is the only path I've been able to hit without getting a 404. I've been trying the various combinations of /meta-data/, /latest/meta-data/, and so on, with no luck so far. The initial idea, though, was that using the SDK would shield us from possible future changes in the APIs; frankly, all these different IP addresses look a lot like magic numbers...
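
For reference, the working curl call above translates to Node 18 roughly as follows (a minimal sketch using the built-in fetch and the env variables shown earlier; the only assumption is that the token file content goes into the Authorization header as-is, which is what the curl does):

const { readFileSync } = require("node:fs");

const main = async () => {
  // Same request as the curl above: read the Pod Identity token from the file
  // referenced by AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE and pass it in the
  // Authorization header to the credentials endpoint.
  const token = readFileSync(process.env.AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE, "utf8").trim();
  const response = await fetch(process.env.AWS_CONTAINER_CREDENTIALS_FULL_URI, {
    headers: { Authorization: token },
  });

  console.log(await response.json());
};

main();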

@RanVaknin
Contributor

Hey @shaftoe ,

Thanks for that! I guess I'm playing catch-up with EKS auth.

I read through the EKS Pod Identity docs and I'm not seeing anything about a metadata endpoint. All I see are clarifications about the credentials endpoint, which you've confirmed is working.

I'll have to look a little deeper into this, perhaps by setting up my own cluster with Pod Identity and reaching out to the EKS team internally for clarification. This might take some time, so please hang tight while I try to find out more.

Thanks again,
Ran~

@RanVaknin RanVaknin added needs-review This issue/pr needs review from an internal developer. and removed response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. labels Nov 15, 2024
@shaftoe
Author

shaftoe commented Nov 15, 2024

Hang on, it's me who has to thank you for the help so far! Take your time; I'll check email and answer ASAP if you have any related questions. Enjoy your weekend!
