Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use registry host to get ACR access token and add new OIDCLogin function #560

Merged
merged 3 commits into from
May 25, 2023

Conversation

somtochiama
Copy link
Member

@somtochiama somtochiama commented May 16, 2023

Closes: fluxcd/source-controller#1071

This pull request updates the Azure login code to use the registry host to get the ACR token and adds new OIDCLogin function that doesn't accept an image.

Comment on lines 117 to 119
registryStr := oci.GetRegistryStr(image, ref)
fmt.Println(registryStr)
authConfig, err := c.getLoginAuth(ctx, registryStr)
Copy link
Contributor

@darkowlzz darkowlzz May 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
registryStr := oci.GetRegistryStr(image, ref)
fmt.Println(registryStr)
authConfig, err := c.getLoginAuth(ctx, registryStr)
registryStr := oci.GetRegistryStr(image, ref)
authConfig, err := c.getLoginAuth(ctx, registryStr)

But regardless, this doesn't seem to be the root cause of the issue. I dug deeper to find that the issue is with the arguments passed to the login method. This whole auth package originates from image-reflector-controller and is mostly based on the constructs of go-containerregistry. Revisiting the code, it feels like there's no reason it has to do that and can be made more generic, but this is what we have for now.

To verify that the login works if the name reference passed to the login method is constructed properly, I added some new tests in #561 which verified that it works for both azure and gcp.

In the code snippet above, instead of extracting the registry from ref, just passing the image would also make it work if the image is the URL of the registry. I think we can make this better in the future to make it more explicit what the URL and ref mean. Again, it is structured this way due to historic reasons, while trying to reorganize the code and not break any existing users of image-reflector-controller, even I had doubts around the old code that I was trying to put in a new structure. Due to lack of cloud integration tests before, it was hard to make any changes and verify them, but today, we can make such changes easily and verify if they work using our new CI setup.

The root cause of the issue originates in how Helm OCI and OCIRepository in source-controller are sharing the OIDC login code. Refer oidcAuth() in OCIRepository controller. In this context, the URL is a reference to an image and it can't be a repository root. Some time back, this method used to accept an instance of OCIRepository but was modified to work with Helm OCI and the argument name was picked to be a URL. URL in Helm's context can be a repository address or a registry address (repo root). Constructing a name reference that OCI auth package needs from just a registry URL results in a name that's translated as a repository name without any registry address (more details about this was documented in #434). This results in a bad name reference being passed to the login method and the login fails.

I think we need to fix this in source-controller. But we can also make the OCI auth package better separately. For now, maybe the Helm repository URL that's being passed to oidcAuth() in helmchart reconciler can be appended with the chart name to fix it. May not work exactly for the corresponding code in helmrepository reconciler as there's no chart name there, but any fake name would also work with it as helmrepository OCI is essentially noop. This is just based on my current understanding of the problem. May find some better way to fix it later.

Copy link
Member Author

@somtochiama somtochiama May 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May not work exactly for the corresponding code in helmrepository reconciler as there's no chart name there, but any fake name would also work with it as helmrepository OCI is essentially noop

So we append any random string to helmrepository URL?

Revisiting the code, it feels like there's no reason it has to do that and can be made more generic, but this is what we have for now.

Yes, it seems to me that the Login function doesn't really need the image but the URL of the registry. But right now it accepts an image and a ref (even the ref argument doesn't seem like it is needed). A URL of the registry is sometimes passed in and this becomes a problem since it is parsed as an image.

Instead of making users of the function compensate for this. I think it should be handled within the function. Maybe at the start of the function, we use the GetRegistryStr to get the endpoint URL correctly and then pass that to the login function of the different providers (and ImageRegistryProvider)

Copy link
Contributor

@darkowlzz darkowlzz May 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed more about it privately with @somtochiama and decided that we can start modifying the existing package itself to become more accepting of registry host based login. Once we do that, the oci/auth package will no longer be specific to OCI, it'll become a generic OIDC auth package which increases the scope of its use case.
Because all the APIs are public and to keep the migration work minimal, it'd be better to add new methods to the package and modify the implementation of the existing methods to allow the new methods to work. We can do the following:

  • Introduce a new login method in oci/auth/login Manager.OIDCLogin(). This will only take a URL. The URL will be used to extract the registry host and that'll be passed down to the other provider login methods.
func (m *Manager) OIDCLogin(ctx context.Context, url string, opts ProviderOptions) (authn.Authenticator, error)

Also, instead of ImageRegistryProvider() trying to figure out about the registry host, the registry host will be passed to it.

  • Introduce new login methods in the individual providers Client.OIDCLogin(). These will also only take a URL, which is the registry host URL. Since this is an attempt to make it generic, not specific to OCI registry, it can be any host endpoint.
func (c *Client) OIDCLogin(ctx context.Context, autoLogin bool, regHost string) (authn.Authenticator, error)

We need to make it clear that the URL is of the registry host in comments or my naming it accordingly.

  • Update the old Client.Login() methods that use name.reference to ignore it and use the provided image URL. This will allow the internal functions that Client.Login() calls to be reusable by Client.OIDCLogin() as well.

We can do further refinements and refactoring when we move to a new generic auth package.
I think we don't need to mark the old methods as deprecated as this whole package was meant to be for OCI. When we create a new generic auth package, we can copy the whole code, except for the old login methods and also rename the OIDCLogin() methods to just Login(). I think this will help us to do a smooth transition without breaking anything or requiring a lot of work to be done upfront to fix the main issue that we are trying to solve for now.

@somtochiama somtochiama marked this pull request as draft May 17, 2023 10:45
@somtochiama somtochiama changed the base branch from main to generic-oicd-auth May 18, 2023 19:29
oci/auth/azure/exchanger.go Outdated Show resolved Hide resolved
oci/auth/aws/auth.go Outdated Show resolved Hide resolved
oci/auth/aws/auth.go Show resolved Hide resolved
oci/auth/aws/auth_test.go Show resolved Hide resolved
oci/auth/aws/auth_test.go Show resolved Hide resolved
oci/auth/azure/auth.go Outdated Show resolved Hide resolved
oci/auth/azure/auth.go Outdated Show resolved Hide resolved
oci/auth/login/login.go Outdated Show resolved Hide resolved
oci/tests/integration/Makefile Show resolved Hide resolved
Copy link
Contributor

@darkowlzz darkowlzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding all the improvements and detailed explanation in the commits. I've left a few more suggestions.

The second commit description seems to imply that there's something wrong with the azure login code. As discussed above, there's no issue in the current login code if the inputs are crafted properly. We are only trying to allow login to consumers of this API that don't have a repository name and just a registry host. I think the following line can be corrected to reflect the same:

This prevents an issue with Azure login that occurs when the
registry host is parsed incorrectly and ref.Context().RegistryStr() returns
the index.docker.io is returned as the registry host.

oci/auth/azure/exchanger.go Outdated Show resolved Hide resolved
oci/auth/login/login.go Outdated Show resolved Hide resolved
oci/auth/aws/auth.go Outdated Show resolved Hide resolved
oci/tests/integration/testapp/main.go Outdated Show resolved Hide resolved
oci/auth/login/login.go Outdated Show resolved Hide resolved
oci/auth/login/login.go Outdated Show resolved Hide resolved
oci/tests/integration/testapp/main.go Outdated Show resolved Hide resolved
@darkowlzz
Copy link
Contributor

Also, can you please update the PR title and description to reflect the latest implementation?

@somtochiama somtochiama changed the title Use correct url to get ACR access token when root url is given Use registry host to get ACR access token and add new OIDCLogin function May 23, 2023
@somtochiama somtochiama marked this pull request as ready for review May 24, 2023 12:34
oci/auth/azure/auth.go Outdated Show resolved Hide resolved
Copy link
Contributor

@darkowlzz darkowlzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It LGTM!
Please squash the commits where applicable and keep the relevant details in the body where needed.

NOTE: This PR is against the branch generic-oicd-auth so that we can run the e2e tests in the CI once before merging it to the main branch.

… registry url.

This commits removes the code that parses the region from the ecr registry and uses it to set the config.
By default, LoadDefaultConfig will get the region of the instance from the metadata service which is the
correct region that should be used when requesting for a token.

Signed-off-by: Somtochi Onyekwere <[email protected]>
@darkowlzz darkowlzz added the area/oci OCI related issues and pull requests label May 25, 2023
… ref argument

The Login function in oci/auth no longer uses the name.Reference argument and instead
gets the registry host from the image argument and passes that to the Login method of
the different region. This is to allow login to consumers of this function that don't
have a repository name and just a registry host.

Signed-off-by: Somtochi Onyekwere <[email protected]>
It also updates the Makefile to build the testapp for linux/amd64
by default. This can be changed by setting GOARCH and GOOS variables.

Signed-off-by: Somtochi Onyekwere <[email protected]>
@stefanprodan
Copy link
Member

This PR is against the branch generic-oicd-auth so that we can run the e2e tests in the CI once before merging it to the main branch.

Hmm so we don't want this in main? what's the plan to get this into SC then?

@darkowlzz
Copy link
Contributor

Hmm so we don't want this in main? what's the plan to get this into SC then?

The e2e tests can't be run on fork branches. In order to test it once before merging to main, we follow this strategy. We have similar branches for running e2e test changes in flux2 repo as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/oci OCI related issues and pull requests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using kubelet identity to access ACR OCI charts
4 participants