AWS_EC2_ENDPOINT overrides the STS endpoint and breaks IRSA #1122

Closed
hakman opened this issue Nov 28, 2021 · 5 comments · Fixed by #1398
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@hakman
Member

hakman commented Nov 28, 2021

/kind bug

What happened?
kOps is adding support for IPv6 clusters and the EBS CSI driver requires access to the new EC2 dual-stack endpoints.
Setting AWS_EC2_ENDPOINT=https://api.ec2.us-east-1.aws overrides the EC2 endpoint but also affects the STS endpoint and breaks IRSA.

2021/11/28 02:57:37 DEBUG: Validate Response sts/AssumeRoleWithWebIdentity failed, attempt 0/8, error SerializationError: failed to unmarshal error message
	status code: 400, request id: e42074ab-cfa5-4fb5-b18f-5202d253f4a3
caused by: UnmarshalError: failed to unmarshal error message
	00000000  3c 3f 78 6d 6c 20 76 65  72 73 69 6f 6e 3d 22 31  |<?xml version="1|
00000010  2e 30 22 20 65 6e 63 6f  64 69 6e 67 3d 22 55 54  |.0" encoding="UT|
00000020  46 2d 38 22 3f 3e 0a 3c  52 65 73 70 6f 6e 73 65  |F-8"?>.<Response|
00000030  3e 3c 45 72 72 6f 72 73  3e 3c 45 72 72 6f 72 3e  |><Errors><Error>|
00000040  3c 43 6f 64 65 3e 4e 6f  53 75 63 68 56 65 72 73  |<Code>NoSuchVers|
00000050  69 6f 6e 3c 2f 43 6f 64  65 3e 3c 4d 65 73 73 61  |ion</Code><Messa|
00000060  67 65 3e 54 68 65 20 72  65 71 75 65 73 74 65 64  |ge>The requested|
00000070  20 76 65 72 73 69 6f 6e  20 28 32 30 31 31 2d 30  | version (2011-0|
00000080  36 2d 31 35 29 20 6f 66  20 73 65 72 76 69 63 65  |6-15) of service|
00000090  20 41 6d 61 7a 6f 6e 45  43 32 20 64 6f 65 73 20  | AmazonEC2 does |
000000a0  6e 6f 74 20 65 78 69 73  74 3c 2f 4d 65 73 73 61  |not exist</Messa|
000000b0  67 65 3e 3c 2f 45 72 72  6f 72 3e 3c 2f 45 72 72  |ge></Error></Err|
000000c0  6f 72 73 3e 3c 52 65 71  75 65 73 74 49 44 3e 65  |ors><RequestID>e|
000000d0  34 32 30 37 34 61 62 2d  63 66 61 35 2d 34 66 62  |42074ab-cfa5-4fb|
000000e0  35 2d 62 31 38 66 2d 35  32 30 32 64 32 35 33 66  |5-b18f-5202d253f|
000000f0  34 61 33 3c 2f 52 65 71  75 65 73 74 49 44 3e 3c  |4a3</RequestID><|
00000100  2f 52 65 73 70 6f 6e 73  65 3e                    |/Response>|

caused by: unknown error response tag, {{ Response} []}
2021/11/28 02:57:37 DEBUG: Request sts/AssumeRoleWithWebIdentity Details:
---[ REQUEST POST-SIGN ]-----------------------------
POST / HTTP/1.1
Host: api.ec2.us-east-1.aws
User-Agent: aws-sdk-go/1.40.4 (go1.17.3; linux; amd64) exec-env/aws-ebs-csi-driver-v1.5.0
Content-Length: 1291
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Accept-Encoding: gzip


-----------------------------------------------------
2021/11/28 02:57:37 DEBUG: Response sts/AssumeRoleWithWebIdentity Details:
---[ RESPONSE ]--------------------------------------
HTTP/1.1 400 Bad Request
Connection: close
Transfer-Encoding: chunked
Cache-Control: no-cache, no-store
Content-Type: text/xml;charset=UTF-8
Date: Sun, 28 Nov 2021 02:57:36 GMT
Server: AmazonEC2
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: accept-encoding
X-Amzn-Requestid: af2a0511-174d-4e80-8198-f83f2ca21a79

What you expected to happen?
I expect setting AWS_EC2_ENDPOINT=https://api.ec2.us-east-1.aws to not affect STS.

How to reproduce it (as minimally and precisely as possible)?

  1. Create a cluster with IRSA enabled and start the ebs-csi-controller with debugging enabled (--aws-sdk-debug-log).
  2. Add AWS_EC2_ENDPOINT=https://api.ec2.{{ Region }}.aws as an env var to the ebs-plugin container.
  3. Create a new volume and check the logs for mentions of STS.

Anything else we need to know?:
This should be partially addressed by #1120, which uses AWS_USE_DUALSTACK_ENDPOINT=true instead.
However, anyone using custom endpoints, which is the use case AWS_EC2_ENDPOINT was initially intended for in #369, would still be affected.

Environment

  • Kubernetes version (use kubectl version): 1.22.4
  • Driver version: 1.5.0
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Nov 28, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 26, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 28, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

@wongma7 wongma7 reopened this Sep 22, 2022
@wongma7
Contributor

wongma7 commented Sep 22, 2022

Looks like we should implement a resolver so that the endpoint is set to AWS_EC2_ENDPOINT iff the service is EC2, as documented under https://docs.aws.amazon.com/sdk-for-go/api/aws/endpoints/ "Using Custom Endpoints". In that example, the endpoint is set to s3.custom.endpoint.com iff the service is S3.
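
A minimal sketch of the per-service resolver idea, using only the standard library so it runs on its own. The types and function names here (ResolvedEndpoint, ResolverFunc, defaultResolver, ec2OnlyOverride) are hypothetical stand-ins that loosely mirror the shape of the SDK's endpoints package; they are not the SDK's API and not the driver's actual code. The default URL pattern is also a simplifying assumption.

```go
package main

import (
	"fmt"
	"os"
)

// ResolvedEndpoint is a local stand-in for the SDK's resolved endpoint
// value (hypothetical, not the real SDK type).
type ResolvedEndpoint struct {
	URL string
}

// ResolverFunc maps a service ID and region to an endpoint
// (hypothetical, modeled loosely on the SDK's resolver function type).
type ResolverFunc func(service, region string) (ResolvedEndpoint, error)

// defaultResolver approximates the standard regional endpoint pattern.
// Assumption: real resolution has more cases (partitions, FIPS, dual-stack).
func defaultResolver(service, region string) (ResolvedEndpoint, error) {
	if service == "" || region == "" {
		return ResolvedEndpoint{}, fmt.Errorf("service and region are required")
	}
	return ResolvedEndpoint{
		URL: fmt.Sprintf("https://%s.%s.amazonaws.com", service, region),
	}, nil
}

// ec2OnlyOverride returns a resolver that uses customEC2 only when the
// requested service is EC2. Every other service, including STS, falls
// through to the fallback resolver untouched.
func ec2OnlyOverride(customEC2 string, fallback ResolverFunc) ResolverFunc {
	return func(service, region string) (ResolvedEndpoint, error) {
		if service == "ec2" && customEC2 != "" {
			return ResolvedEndpoint{URL: customEC2}, nil
		}
		return fallback(service, region)
	}
}

func main() {
	// AWS_EC2_ENDPOINT is the env var from this issue; if it is unset,
	// EC2 also falls back to the default endpoint.
	resolve := ec2OnlyOverride(os.Getenv("AWS_EC2_ENDPOINT"), defaultResolver)

	for _, svc := range []string{"ec2", "sts"} {
		ep, err := resolve(svc, "us-east-1")
		if err != nil {
			fmt.Println("resolve failed:", err)
			continue
		}
		fmt.Println(svc, "->", ep.URL)
	}
}
```

With a resolver shaped like this, sts/AssumeRoleWithWebIdentity would no longer be sent to the EC2 host, so IRSA should keep working while EC2 calls go to the custom endpoint. In the real SDK the equivalent hook would presumably be a custom endpoint resolver set on the session config, as the linked documentation describes.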
