Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix authentication for ECR #159

Merged
merged 1 commit into from
Jun 7, 2023
Merged

fix authentication for ECR #159

merged 1 commit into from
Jun 7, 2023

Conversation

jlbutler
Copy link
Contributor

Fixes #143

It's a bit gross, but auth token needs to be retrieved using IAM credentials. Once retrieved, can be used with v2 OCI API.

This is the solution from the temp patch referred to in #59 but the final fix merged in looks like it tried to adopt using the get-login-password response.

Let me know what you think, sorry it took a while to come help out.

@3XX0
Copy link
Member

3XX0 commented May 22, 2023

I don't understand, did something change on the ECR side? It used to work fine.
I still see the following for the Docker documentation:

aws ecr get-login-password --region region | docker login --username AWS --password-stdin aws_account_id.dkr.ecr.region.amazonaws.com

@jlbutler
Copy link
Contributor Author

Sorry for latency, traveling at the moment. No, nothing changed. 'aws ecr get-login-password' is a base64-encoded password token, and 'aws ecr get-authorization-token' is a bas64-encoded user:pass basic auth token (where pass is the same as get-login-password output).

As enroot is using the v2 api and curl'ing, we need the auth token, vs just the login password. Docker login uses the password alone. Hope that makes sense, happy to discuss further or clarify.

@jlbutler
Copy link
Contributor Author

Oh actually, if curl is using --user that should work... maybe that's what changed at some point. I'll look at this in a bit.

@jlbutler
Copy link
Contributor Author

So the way the script works now, the change in this PR gets the right token. To use get-login-password, curl would have to change to using --basic and --user AWS:${pw}. So, I think this is correct, let me know what you think.

@3XX0
Copy link
Member

3XX0 commented May 26, 2023

The script is already using basic auth either with -u or through netrc:

req_params+=("-u" "${user}")

This used to work so it's probably a change in the URL(s) we need to use.
I think we need to look at what's happening through MITM proxy and compare it to what Docker does. This should be very similar.

@jlbutler
Copy link
Contributor Author

jlbutler commented Jun 1, 2023

Sorry for latency in responses, had some travel. I can say that the endpoints or auth model haven't changed in ECR, so I don't think we need to trace things. I think we have two options based around using either a login password, or an auth token. Also, I could be misunderstanding the scripts :)

In docker::_authenticate(), it looks like we either set --user, or --netrc-file. I might be misreading the script, but if we grep out the credentials invocation from the credentials file in config, then we set the --netrc-file arg, otherwise we set --user.

In our case, since we've set up the credentials file, we go down the netrc path, and our req_params end up being something like: --data-urlencode service=ecr.amazonaws.com --netrc-file /proc/self/fd/10

We eventually we put the response from netrc file (read as token) into the request header as basic auth. For this use case, we need to get an auth token (aws ecr get-authorization-token) vs a login password (aws ecr get-login-password). This is the change I made in this PR - get the auth token out of the creds file as-is, and change docs to using token vs password. This was inspired by your initial notes... and we agree kinda gross, but it does work.

We could alternatively change this to setting --user, in which case I think that needs to set something like:

req_params+=("-u" "${user}":"${token}")

Where tokenis what we get from aws ecr get-login-password. I'm not a curl expert, but from quick testing if I don't set user:pass, it blocks on the tty and asks for password.

I'm happy to go in either direction with this. Either way it seems to be a special case, because we don't expose a registry-wide auth endpoint (because access policies are repo-scoped in ECR).

Let me know what you think, or if I've misunderstood along the way. Thanks!

@3XX0
Copy link
Member

3XX0 commented Jun 1, 2023

docker::_authenticate() will always authenticate through basic auth (through -u or --netrc-file). The idea is to check that the user has indeed access to the given manifest and retrieve a bearer token if necessary for downloading the layers. This PR completely bypasses this check.

aws ecr get-login-password is the correct way to retrieve the password. aws ecr get-authorization-token will give us the base64 basic header and we don't want that since curl is generating it for us already.

In the case of ECR, IIRC there are no bearer tokens, every call requires basic auth.
It used to work so something definitely changed. Unfortunately, I can't test easily right now but if I had to guess, maybe ECR doesn't reply with the basic authorization header anymore (which is essentially the same as get-authorization-token) and we need to construct it ourselves:

| awk '/Authorization: Basic/ { sub(/\r/, "", $4); print $4 }' \

You should be able to check what's going on by adding -v to curl_opts at the beginning of the file.

With machine 12345.dkr.ecr.eu-west-2.amazonaws.com login AWS password $(aws ecr get-login-password --region eu-west-2) in the credential file, you should see the basic auth being done in the first GET as well as the response header which should contain the authorization header we store for subsequent calls.

@jlbutler
Copy link
Contributor Author

jlbutler commented Jun 3, 2023

I didn't intend to work around the existing flow, just to get the right sort of token in hand for injecting into a request header. However, given your response it made me think that it should work w/ login (as it'll have the user prepended, then encoded).

I traced the curl that was failing (line 76), and the response body was clobbering into the auth token string in the header:

> GET /v2/ecr-public/amazonlinux/amazonlinux/manifests/2023?service=ecr.amazonaws.com HTTP/1.1
> Host: <registry-id>.dkr.ecr.us-west-2.amazonaws.com
> Authorization: Basic QVdTOmV5SndZWGxzYjJGa0lqb2lVMGRMUzBJMVdFdFJTa1p3WTJSaVYyRkZOVGxu
VVhGV1QxcGhPR1V3Y0hBeWNIVmFabFpOWkU1Sk4ySnROak5qY1hWQlNqVkxkR0ZMVkVocFJrSjBNa0p2Tldobl
<< snip >>
GTVNHNVhXSGh0WWxST1ZqSk5PSFZ3Unk5a04zcEdlVTExUjJZclUyaDJkM3B0WlZkWU9WTjVXRk4xT1cxNWJ
RXdXVTlPZVRKVGQxSkRXV28yTkV4RVExRkZMM05PZUdsQ2NUTllValZYUVVkTE0wbEJTamHTTP/1.1 200 OK
Content-Length: 770
Content-Type: application/vnd.docker.distribution.manifest.list.v2+json
Docker-Distribution-Api-Version: registry/2.0
Sizes:
Date: Sat, 03 Jun 2023 20:47:52 GMT

{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 529,
         "digest": "sha256:e4d705bbc710fd025c445479b34108b236f761fc0fc3b9c75e19f5dcdb00286f",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 529,
         "digest": "sha256:079c63757ce340abe853ce58a0141175770508ae9baf6a52c7cb0fe9641d80da",
         "platform": {
            "architecture": "arm64",
            "os": "linux",
            "variant": "v8"
         }
      }
   ]
}RXVjNSak1rRjFhWGx2Ymtsa1NqRlpkekY0Y1VseUsxaERMM2RCVFdoc1QwaDRUMWt6VnpKbGVWcDZkRXBQUzNj
dmMwMWtNRU5WYWxwSU1EWTBlQzlQTlc1RVFXdG1SaXRrVDA4M1JHRnVSVmw0TDI5S1RYTTBSekV6TkhsT1Yyb
<<snip>>
1NWWk5QU0lzSW5abGNuTnBiMjRpT2lJeUlpd2lkSGx3WlNJNklrUkJWRUZmUzBWWklpd2laWGh3YVhKaGRHbHZ
iaUk2TVRZNE5UZzJPRFEzTVgwPQ==
> User-Agent: curl/7.81.0
> Accept: */*

Not sure why we're just seeing this recently, or maybe only with ECR? Regardless, I think adding --write-out is a reasonable fix for this. It outputs after the transfer, so it effectively separates the header and response body data.

Like I said, I'm not a curl expert. There might be a better way to do this, but it does the trick. enroot import is working for me now w/ aws ecr get-login-password in the credentials file.

@3XX0
Copy link
Member

3XX0 commented Jun 7, 2023

Ok yeah this makes more sense. Thanks for looking into it!

@3XX0 3XX0 closed this Jun 7, 2023
@3XX0 3XX0 reopened this Jun 7, 2023
@3XX0 3XX0 merged commit 9bc97b1 into NVIDIA:master Jun 7, 2023
@jlbutler
Copy link
Contributor Author

jlbutler commented Jun 8, 2023

Cool, thanks for going on the journey with me!

@krono
Copy link
Contributor

krono commented Jan 18, 2024

Hi, sorry to re-ping that.
adding just -w for --write-out misses the fact that --write-out takes an argument.
The patch as is makes curl always print -G (without newline) after everyting printet.
That's probably not indendent

@3XX0
Copy link
Member

3XX0 commented Jan 19, 2024

Um yeah, also after checking -w is already used in the common implementation of curl.
Maybe something like -D - instead?

@krono
Copy link
Contributor

krono commented Mar 1, 2024

Not a curl wizzard, tho…

3XX0 added a commit that referenced this pull request Apr 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error fetching image manifest list from private AWS ECR repo
3 participants