Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/resource: support S3 access point URLs #1264

Merged
merged 1 commit into from
Apr 28, 2022

Conversation

zeleena
Copy link
Contributor

@zeleena zeleena commented Oct 1, 2021

Support S3 access point URLs in ARN format as a source.
This allows valid, opaque S3 URLs such as
s3:arn:aws:s3:us-west-2:123456789012:accesspoint/test/object
Being able to use this format will allow S3 URLs on different
partitions and lays the foundation to potentially support
multi-region access points in the future.

Fixes #1091
Signed-off-by: Zeleena Kearney [email protected]

@sohankunkerkar
Copy link
Member

sohankunkerkar commented Oct 2, 2021

Thanks for your contribution. I think you may need to add the following changes to this PR:

  1. dd9ca61 (configuration-v3_4_experimental.md)
  2. abfb728 (v_3_4_experimental)

Copy link
Member

@sohankunkerkar sohankunkerkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might need a kola test coverage, but overall changes look good to me.
@bgilbert could you take another stab at it?

sohankunkerkar
sohankunkerkar previously approved these changes Oct 11, 2021
@zeleena
Copy link
Contributor Author

zeleena commented Oct 26, 2021

Hey @sohankunkerkar could I have you take a look again? No new changes, but between the time you approved and the time I saw it, the default branch caused some merge conflicts that I needed to resolve.

@sohankunkerkar
Copy link
Member

@zeleena I think the change looks good to me.
@bgilbert might want to take another look before merging this PR.

Copy link
Contributor

@bgilbert bgilbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! And thanks for your patience; apologies for the slow review.

I think we should probably support this as a separate arn URL scheme, and be a little more rigorous about doing so, e.g. by checking the service name and by supporting non-accesspoint S3 ARNs. Feedback welcome, though.

I agree with @sohankunkerkar that we should have a kola test for this, probably as an external test. Feel free to add one if you'd like to dig into that infrastructure, but it probably makes more sense for us to handle that part ourselves.

config/v3_4_experimental/types/url_test.go Outdated Show resolved Hide resolved
config/v3_4_experimental/types/url.go Outdated Show resolved Hide resolved
config/v3_4_experimental/types/url.go Show resolved Hide resolved
docs/configuration-v3_4_experimental.md Outdated Show resolved Hide resolved
internal/resource/url.go Outdated Show resolved Hide resolved
@zeleena
Copy link
Contributor Author

zeleena commented Dec 2, 2021

Thanks for the review @bgilbert . I've addressed your comments and generally supported the opaque arn: URL scheme.

It uses the aws sdk to more rigorously check that the ARN is for S3, and supports non-accesspoint S3 ARNs as well as accesspoint S3 ARNs. I've also added comments about the expected format of the ARN resource and official AWS documentation that supports those expectations.

Please take a look and let me know what think.

@zeleena zeleena requested a review from bgilbert December 2, 2021 00:14
@zeleena
Copy link
Contributor Author

zeleena commented Jan 4, 2022

Happy new year. @bgilbert could you please take a look and let me know if this is okay to merge?

@ashleycutalo
Copy link

@bgilbert could you take another look at this please?

Copy link
Contributor

@bgilbert bgilbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your patience once again. This generally looks good; just one issue around handling of AWS partitions, and a question about whether we should support version IDs with arn.

You'll need to rebase this to current main, since our CI checks have changed. Also, in this case I think the whole PR counts as one logical change, so please squash the PR down to one commit.

I'll aim to get you a faster response in the next round. 🙂

internal/resource/url.go Outdated Show resolved Hide resolved
@@ -17,6 +17,7 @@ package types
import (
"net/url"

"github.com/aws/aws-sdk-go/aws/arn"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, this adds a dependency to our external API. However, the arn package only depends on the stdlib, so this shouldn't be too impactful.

internal/resource/url.go Outdated Show resolved Hide resolved
internal/resource/url.go Outdated Show resolved Hide resolved
internal/resource/url.go Outdated Show resolved Hide resolved
@zeleena zeleena force-pushed the aws-s3-access-points branch from 9990a3c to 828a4cd Compare February 16, 2022 01:19
@zeleena
Copy link
Contributor Author

zeleena commented Feb 16, 2022

Hi @bgilbert, I've addressed the comments, rebased, and squashed my PR down to one commit as requested. Please take a look.

@zeleena zeleena requested a review from bgilbert February 16, 2022 16:47
Copy link
Contributor

@bgilbert bgilbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good generally; just need to resolve issues with versionId and with non-aws partitions.

I'll be OOO for a bit, so @jlebon will handle final review and merge. Thanks again for the PR!

internal/resource/url.go Outdated Show resolved Hide resolved
key = strings.Join(urlSplit[1:], "/")
}

sess.Config.Region = aws.String(s3arn.Region)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S3 ARNs don't include the Region field so this won't be sufficient for non-aws partitions. We may need to add a hardcoded map from AWS partitions (s3arn.Partition) to a representative region for region hinting. (Or maybe the AWS SDK can do that for us?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AWS partitions are commonly aws or aws-cn from those docs, and there isn't a clear way to get region from those.

What I've done to address this is to use the current way of using GetBucketRegion for the S3 bucket and S3 bucket ARN cases since we have the bucket name available, and used the Region field for the S3 accesspoint case since the Region field will exist for accesspoints.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As is, the non-accesspoint case will fail to select the correct region if we're running in a different partition than the one specified in the ARN. I still think a lookup table for a representative regionHint, indexed by partition, is the right solution here.

internal/resource/url.go Outdated Show resolved Hide resolved
@bgilbert bgilbert mentioned this pull request Feb 27, 2022
2 tasks
@bgilbert
Copy link
Contributor

Filed #1322 to track followups to this PR (kola tests and docs).

@zeleena zeleena force-pushed the aws-s3-access-points branch from dff4066 to b5f1dde Compare February 28, 2022 22:52
@zeleena
Copy link
Contributor Author

zeleena commented Feb 28, 2022

I read those docs as describing the general use of that field in resource-specific ARN schema. The S3 schema in particular doesn't appear to support version IDs in that field, so AFAICT we'd be adding a custom extension. I do see your point about supporting the full set of functionality, so on balance I'm okay with leaving this in.

The code doesn't strip off the custom query string before handing it to the ARN parser, so unlike in the s3:// case, the query string ends up in input.Key. Have you tested that fetches work with versionId specified?

We should also add a TestURLValidate test for this case.

Not sure why I can't reply to this comment directly below it.

There is no custom query to strip off; this code uses the u.Scheme and u.Opaque to create the fullUrl variable, and u.Query is a separate field that contains the versionId. See how URLs are interpreted here. So the query string does not end up in input.Key because it is a separate field which is not included when handed to the ARN parser.

I added test cases to cover all three cases of S3 bucket, S3 bucket ARN and S3 accesspoint ARN and their corresponding URL with versioning.

@zeleena zeleena requested a review from bgilbert February 28, 2022 23:08
@zeleena
Copy link
Contributor Author

zeleena commented Feb 28, 2022

Addressed the comments and squashed commits. PTAL @jlebon or @bgilbert if you return from OOO.

Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One suggestion but LGTM as is too. Happy to leave that for a follow-up if you prefer.

internal/resource/url.go Outdated Show resolved Hide resolved
@jlebon
Copy link
Member

jlebon commented Mar 3, 2022

Also, can you confirm whether you were able to test this?

@zeleena zeleena force-pushed the aws-s3-access-points branch from b5f1dde to 961d70e Compare April 4, 2022 22:03
@zeleena
Copy link
Contributor Author

zeleena commented Apr 4, 2022

Also, can you confirm whether you were able to test this?

Yes, this can still retrieve configs from S3 given a bucket in s3 or arn format as before. Multi-region access points won't be enabled until this uses aws-sdk-go-v2, which I hope to do in a follow-up PR.

Also one thing is that I had to modify my setup to include a fix for golang/go#51706, but I imagine this was related to my local setup.

Please take a look and feel free to merge if this is good to go.

@zeleena zeleena requested a review from jlebon April 4, 2022 22:14
Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks much nicer now! Have one remaining comment, otherwise this LGTM. I'll let @bgilbert also have a look.

internal/resource/url.go Outdated Show resolved Hide resolved
@jlebon
Copy link
Member

jlebon commented Apr 7, 2022

Also, it looks like there is a merge commit in the PR. Could you try rebasing your commit onto the latest main?

@zeleena zeleena force-pushed the aws-s3-access-points branch from dc84fed to baa443f Compare April 8, 2022 23:07
Support S3 access point URLs in ARN format as a source.
This allows valid, opaque S3 URLs such as
`s3:arn:aws:s3:us-west-2:123456789012:accesspoint/test/object`
Being able to use this format will allow S3 URLs on different
partitions and lays the foundation to potentially support
multi-region access points in the future.

Fixes coreos#1091
Signed-off-by: Zeleena Kearney <[email protected]>
@zeleena zeleena force-pushed the aws-s3-access-points branch from baa443f to 4daa32d Compare April 8, 2022 23:50
@zeleena
Copy link
Contributor Author

zeleena commented Apr 8, 2022

Also, it looks like there is a merge commit in the PR. Could you try rebasing your commit onto the latest main?

Addressed comment and rebased. Mind taking a look now @jlebon @bgilbert ?

@zeleena
Copy link
Contributor Author

zeleena commented Apr 20, 2022

Hi all, it's been almost two weeks since the last reply. Does this look good and is this okay to merge? @bgilbert @jlebon

@jlebon jlebon mentioned this pull request Apr 27, 2022
Copy link
Contributor

@bgilbert bgilbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thanks so much for your extensive patience in pushing this to the finish line.

There are a couple tiny nits and I'm still not convinced about the partition selection for non-accesspoint ARNs, but let's merge this now and I'll see about making a followup PR.

internal/resource/url.go Show resolved Hide resolved
internal/resource/url_test.go Show resolved Hide resolved
@bgilbert bgilbert merged commit bd6aa78 into coreos:main Apr 28, 2022
@bgilbert
Copy link
Contributor

Nits addressed in #1355.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

S3 URLs on different partitions
5 participants