Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: migrate s3 ops to sdk v2, add workflow for loading sdk v2 config #376

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

lucix-aws
Copy link

@lucix-aws lucix-aws commented Oct 11, 2024

Part of #187

I identified this project as an advanced user of the Go SDK for the following reasons:

  • The nature of the project requires calling many AWS services
  • The way the project loads config is sufficiently more advanced than the more common / basic use cases

I did this migration as an exercise to get a better idea for what doing so is like in a large project and to see if there's anything we can improve in that area. This made me realize that an outstanding issue we have aws/aws-sdk-go-v2#2422 is more important of a parity gap than I thought, its priority has been increased accordingly.

What this PR does:

  • establishes the workflow for loading SDK v2 aws.Config, which is the v2 equivalent of session.Session.
  • migrates all S3 resources to using SDK v2
    • The actual call-to-call translation from v1 to v2 is pretty straightforward. Note that I switched use of ListObjects to ListObjectsV2 because the latter has a paginator available. It shouldn't affect anything behaviorally that I'm aware of.
  • The libnuke facade gives you a ctx everywhere for doing most/all service calls, but config.LoadDefaultConfig also takes one, in places where I didn't have access I just passed context.TODO()
  • I don't know what your stance is on stability of exported APIs in this project, some of them were probably broken by this change.

What I tested:

  • Ran the existing s3-bucket integration test. Note that I'm getting a failure from running this against both main and this branch, so I assume that's existing. The actual nuke command also worked so I didn't investigate it that much.
  • Actually ran nuke against my dev account to blow up a bunch of old testing buckets:
regions:
  - us-east-1

blocklist:
  - "999999999999" # production

accounts:
  "<...>": 
    filters:
      S3Bucket:
        # ... filtered a bunch of stuff I wanted to keep

resource-types:
  includes:
    - S3Bucket
  • The "custom service" / endpoint URL path probably warrants some additional e2e testing. I don't have a good setup for this.

You definitely can have both SDK v1 and v2 live alongside each other, but you may or may not wish to cut releases in that transition period. If that's the case I would just change this to merge to a feature branch.

pkg/awsutil/config.go Outdated Show resolved Hide resolved
pkg/awsutil/config.go Outdated Show resolved Hide resolved
@ekristen
Copy link
Owner

@lucix-aws This is amazing. It was on my list to look at doing this in earnest this winter, but this looks like a great start.

It has been a while since I've run the integration tests, so it's possible they are broken, I will take a look.

Generally not worried about exported APIs in this project, it's not meant to be directly used, and if it is it's usually at a higher level that making changes to the individual resources shouldn't be a problem.

I hadn't yet decided if I was going to run them in parallel or not. Ultimately I think I might have to as it's quite the lift to shift everything over at once especially with so many changes.

I will definitely take a closer look at this and this will help me form the strategy for migrating.

Copy link

@wty-Bryant wty-Bryant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice example to learn migration in detail, left some minor comments

@ekristen
Copy link
Owner

This fix fixes the integration tests, you'll likely need to rebase.

@lucix-aws
Copy link
Author

Patch addresses PR comments, lint, and rebases for the integration tests.

@lucix-aws
Copy link
Author

I PR'd the addition of config.WithBaseEndpoint SDK-side and will update once that releases. I'll also deal w/ the commit message linting then.

@mdgm88
Copy link

mdgm88 commented Oct 17, 2024

Upgrading s3 to using the sdk v2 could also be handy for using list buckets. It's now possible to filter list buckets on a region: https://github.com/aws/aws-sdk-go-v2/releases/tag/release-2024-10-16

@lucix-aws
Copy link
Author

We are mid-hackathon in SDKs and Tools right now, will get back to this next week.

@ekristen
Copy link
Owner

No problem @lucix-aws. Looking forward to when this can be merged and we can the ball rolling on moving to sdk v2 for all resources.

@lucix-aws
Copy link
Author

Fixed commits and pushed an update to use the new config.WithBaseEndpoint. This should be ready for review/merge.

Copy link
Owner

@ekristen ekristen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @lucix-aws I really appreciate this, going to save me a lot of time. Looks like we have a small merge conflict on go.mod though. I'll start testing it locally in the mean time.

@lucix-aws lucix-aws force-pushed the port-sdkv2-s3 branch 2 times, most recently from 9c45415 to 3e4d7d0 Compare November 1, 2024 15:15
@ekristen
Copy link
Owner

ekristen commented Nov 1, 2024

@lucix-aws I've been testing through this, I found one small problem with the integration tests and I fixed that.

However it does look like we are also missing parity from the session function to the config function around skipping the global handler https://github.com/ekristen/aws-nuke/blob/main/pkg/awsutil/session.go#L207C1-L210C3

The current result is that anything that's in us-east-1 is being picked up by our special "global" region, which in fact is us-east-1, but it's designed for things that only exist there like IAM, etc.

@ekristen
Copy link
Owner

ekristen commented Nov 1, 2024

Do you really want to nuke the account with the ID 637423565883 and the alias 'no-alias-637423565883'?
Waiting 3s before continuing.
global - S3Bucket - s3://fsdfhasdjfohsgdalfha-989908790-89080890890890898908 - [CreationDate: "2024-11-01T22:49:00Z", Name: "fsdfhasdjfohsgdalfha-989908790-89080890890890898908", ObjectLock: ""] - would remove
us-east-1 - S3Bucket - s3://fsdfhasdjfohsgdalfha-989908790-89080890890890898908 - [CreationDate: "2024-11-01T22:49:00Z", Name: "fsdfhasdjfohsgdalfha-989908790-89080890890890898908", ObjectLock: ""] - would remove

it should be

Do you really want to nuke the account with the ID 637423565883 and the alias 'no-alias-637423565883'?
Waiting 3s before continuing.
us-east-1 - S3Bucket - s3://fsdfhasdjfohsgdalfha-989908790-89080890890890898908 - [CreationDate: "2024-11-01T22:49:00Z", Name: "fsdfhasdjfohsgdalfha-989908790-89080890890890898908", ObjectLock: ""] - would remove

@ekristen
Copy link
Owner

ekristen commented Nov 4, 2024

aws/aws-sdk-go-v2#2886

@lucix-aws
Copy link
Author

lucix-aws commented Nov 5, 2024

Following up after my response to the issue you opened. We've got two handlers here that were using the old metadata:

  • skipMissingServiceInRegionHandler
    • I would just drop this entirely. Per my comment on the issue, the assertion this code makes isn't really accurate. Our stance is that regions should just be treated as opaque strings, there's no meaningful way to infer presence of a service in a given region in the SDK.
  • skipGlobalHandler
    • You'll have to confirm my understanding, but it seems like this basically means "if region = global, don't do any work unless the service is marked as global". The appropriate way to translate that to v2 would be to insert, by default, a middleware that returns ErrSkipRequest if the user region was global, and then remove that on a per-service basis in NewFromConfig for services that you want to operate in this manner. I can implement this in this PR if you elect to go this route.
    • You might just want to remove the notion of global in the long-term and have everything go by region, but that's up to you and almost certainly outside the scope of this PR/MV.

I'll defer to you on how you want to proceed here.

@ekristen
Copy link
Owner

ekristen commented Nov 5, 2024

I appreciate the feedback, global is essentially us-east-1 except that it's for the things are are truly global, for example IAM users, roles, policies, and a few other select things. I think that I might have to just add a metadata property for "global" resources.

The reason why this exists is because yes you can query IAM in europe or asia, but it's all based out of us-east-1 and we don't want to try and remove it from two locations when it's really just in us-east-1, so the global exists as a way of ensuring we only query IAM stuff once.

So the handler omitted anything that wasn't global, when global was queried.

@lucix-aws
Copy link
Author

I think that I might have to just add a metadata property for "global" resources.

This seems like the right way forward here - that seems like it's really the level you want to be selecting on here (resources), rather than just a particular service.

So, in terms of this PR, do we want to just put a blanket "skip when global" middleware or leave it for now?

@ekristen
Copy link
Owner

ekristen commented Nov 5, 2024

An example middleware would be great, I'll need to fiddle with adding the global metadata and figuring out how to tie to together to make it work.

@lucix-aws
Copy link
Author

done, see b7892be

@ekristen
Copy link
Owner

ekristen commented Nov 7, 2024

@lucix-aws thanks, this is very helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants