
cmd/devp2p: use AWS-SDK v2 #22360

Merged: 7 commits merged into ethereum:master on Mar 19, 2021

Conversation

qhenkart (Contributor)

AWS has released v2 of its SDK, with a number of performance and usability improvements.

The AWS SDK for Go v2 requires Go 1.15 or higher, and offers significant improvements in CPU and memory utilization over version 1.

I recommend perusing the announcement for details.

Migration is relatively seamless, although there were two locations that required a slightly different approach to achieve the same results. In both cases the behavior mimics the v1 implementation verbatim. However, these locations require a closer review, and I think they have also exposed an opportunity for further optimization. I will leave comments in those locations to make the review easier.
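
For context, a minimal sketch of how client construction changes in v2 (the static credentials, region, and helper name here are purely illustrative; the actual flag wiring in cmd/devp2p is omitted):

package main

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/credentials"
	"github.com/aws/aws-sdk-go-v2/service/route53"
)

// newRoute53Client shows the v2 pattern: configuration is resolved through
// config.LoadDefaultConfig (instead of session.NewSession in v1), and
// service clients are built from the resolved config with NewFromConfig.
func newRoute53Client(accessKey, secretKey string) (*route53.Client, error) {
	cfg, err := config.LoadDefaultConfig(context.TODO(),
		config.WithRegion("us-east-1"), // illustrative region
		config.WithCredentialsProvider(
			credentials.NewStaticCredentialsProvider(accessKey, secretKey, "")),
	)
	if err != nil {
		return nil, err
	}
	return route53.NewFromConfig(cfg), nil
}

func main() {
	client, err := newRoute53Client("ACCESS-KEY", "SECRET-KEY") // placeholder credentials
	if err != nil {
		log.Fatal(err)
	}
	_ = client
}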

existing := make(map[string]recordSet)
err := c.api.ListResourceRecordSetsPages(&req, func(resp *route53.ListResourceRecordSetsOutput, last bool) bool {
for {
qhenkart (Contributor, Author)

Callback paginators are no longer implemented in v2; it uses cursors instead. This refactor actually follows the same logical flow as an existing pagination implementation:

https://github.com/ethereum/go-ethereum/pull/22360/files#diff-2ca52123d993656e6b05df9dd124a36b2be796acfa612f371c3c90afc68b24b2L142
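
For illustration, a minimal sketch of the cursor-based loop this refers to, against the v2 route53 client (collectRecords and the zoneID parameter are illustrative names, not the exact code in this PR; field shapes match the SDK version in use here):

import (
	"context"

	"github.com/aws/aws-sdk-go-v2/service/route53"
	"github.com/aws/aws-sdk-go-v2/service/route53/types"
)

// collectRecords pages through every record set in a zone. In v2 the caller
// drives pagination: it re-issues the request with the Next* cursor fields
// from the previous response until IsTruncated is false.
func collectRecords(ctx context.Context, client *route53.Client, zoneID string) (map[string]types.ResourceRecordSet, error) {
	existing := make(map[string]types.ResourceRecordSet)
	req := route53.ListResourceRecordSetsInput{HostedZoneId: &zoneID}
	for {
		resp, err := client.ListResourceRecordSets(ctx, &req)
		if err != nil {
			return nil, err
		}
		for _, set := range resp.ResourceRecordSets {
			existing[*set.Name] = set
		}
		if !resp.IsTruncated {
			break
		}
		// Advance the cursor to the next page.
		req.StartRecordName = resp.NextRecordName
		req.StartRecordType = resp.NextRecordType
		req.StartRecordIdentifier = resp.NextRecordIdentifier
	}
	return existing, nil
}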

if err := c.api.WaitUntilResourceRecordSetsChanged(wreq); err != nil {
return err
var count int
for {
qhenkart (Contributor, Author)

Waiters have been refactored in v2: they are no longer local extensions of the request.Waiter type. v2 does support waiters, but only for certain packages and operations.

Please see how the previous waiter was implemented. Like my implementation here, it checks the status up to 60 times at 30-second intervals.

v2 waiter implementations are handled in a similar fashion.
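
A sketch of the resulting wait loop (the helper name is illustrative; polling GetChange up to 60 times at 30-second intervals mirrors what the v1 waiter did internally):

import (
	"context"
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go-v2/service/route53"
	"github.com/aws/aws-sdk-go-v2/service/route53/types"
)

// waitChangeInsync polls GetChange until the record change reaches INSYNC,
// checking up to 60 times at 30-second intervals, i.e. the same bounds the
// v1 WaitUntilResourceRecordSetsChanged waiter used.
func waitChangeInsync(ctx context.Context, client *route53.Client, changeID string) error {
	for count := 0; count < 60; count++ {
		resp, err := client.GetChange(ctx, &route53.GetChangeInput{Id: &changeID})
		if err != nil {
			return err
		}
		if resp.ChangeInfo.Status == types.ChangeStatusInsync {
			return nil
		}
		time.Sleep(30 * time.Second)
	}
	return fmt.Errorf("timed out waiting for change %s to propagate", changeID)
}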

@@ -109,21 +113,35 @@ func (c *route53Client) deploy(name string, t *dnsdisc.Tree) error {
batches := splitChanges(changes, route53ChangeSizeLimit, route53ChangeCountLimit)
for i, changes := range batches {
qhenkart (Contributor, Author) commented on Feb 21, 2021

This loop can be optimized, mostly because the waiter is blocking (this is the case in the current version as well as in this PR).

Currently, for every batch, it makes the change and then waits up to 30 minutes for the change to propagate before making the next change for the next batch. In a worst-case scenario, this loop could block a command for hours.

My proposal is that we either:

  1. loop through the batches and submit all of the changes, then block until all changes have propagated. This way the bottleneck only exists on the AWS side, in how fast changes are propagated; or

  2. create a goroutine for each batch, so that each routine is responsible for making its change and then waiting for the propagation. The caller would block until all of the routines have broken out of the wait loop. Like option 1, the bottleneck would only exist on the AWS side.

Option 2 would be more elegant, but both options would be a dramatic optimization. However, I am not sure whether this is necessary or desired. If it is, I can include those changes in this PR or in a separate one (I could also apply the optimization to the v1 SDK).

fjl (Contributor)

If you want to make this change, I prefer option 1. There can be a lot of changes, and starting a goroutine for each has its own problems.

qhenkart (Contributor, Author) commented on Feb 22, 2021

@fjl I have implemented option 1. According to AWS, propagation takes an average of 60 seconds, so instead of waiting ~60-90 seconds per change, ideally it now takes 60-90 seconds for the first change and the remaining waiters should succeed on their first request.
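
Roughly, the submit-then-wait structure of option 1 looks like this (deployBatches is an illustrative helper, waitChangeInsync is the sketch from above; the PR's actual deploy loop differs in its details):

import (
	"context"

	"github.com/aws/aws-sdk-go-v2/service/route53"
	"github.com/aws/aws-sdk-go-v2/service/route53/types"
)

// deployBatches submits every change batch first, collecting the change IDs,
// and only afterwards waits for propagation. The waits overlap: once the
// first change is INSYNC, the later ones have usually propagated too, so the
// remaining waiters return on their first poll.
func deployBatches(ctx context.Context, client *route53.Client, zoneID string, batches [][]types.Change) error {
	var changeIDs []string
	for _, batch := range batches {
		resp, err := client.ChangeResourceRecordSets(ctx, &route53.ChangeResourceRecordSetsInput{
			HostedZoneId: &zoneID,
			ChangeBatch:  &types.ChangeBatch{Changes: batch},
		})
		if err != nil {
			return err
		}
		changeIDs = append(changeIDs, *resp.ChangeInfo.Id)
	}
	for _, id := range changeIDs {
		if err := waitChangeInsync(ctx, client, id); err != nil {
			return err
		}
	}
	return nil
}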

qhenkart (Contributor, Author)

@fjl we probably have to hold off on this PR as a result of #22359. The AWS SDK v2 requires Go 1.15, and if builds on Go 1.14 are still supported, then this PR should probably wait until 1.15 is the minimum supported version. Should I close it and reopen once 1.14 is no longer supported?

I have opened another PR that recreates the same optimization we discussed and applied here, but for the v1 implementation.

fjl (Contributor) commented on Feb 26, 2021

I'll try this with Go 1.14, maybe it works anyway

fjl (Contributor) commented on Feb 26, 2021

Actually, it looks like it does build with Go 1.14 (we have a 1.14 builder on Travis)

fjl changed the title from "Updates AWS-SDK to v2" to "cmd/devp2p: use AWS-SDK v2" on Mar 19, 2021
fjl merged commit e3a3f7c into ethereum:master on Mar 19, 2021

holiman added a commit that referenced this pull request on Mar 20, 2021:
…for route53 (#22538)

This PR fixes a regression introduced in #22360, when we updated to v2 of the AWS SDK, which caused the crawler to get the same first 100 results over and over and get stuck in a loop.
atif-konasl pushed a commit to frozeman/pandora-execution-engine that referenced this pull request Oct 15, 2021
This updates the DNS deployer to use AWS SDK v2. Migration is relatively
seamless, although there were two locations that required a slightly
different approach to achieve the same results. In particular, waiting for
DNS change propagation is very different with SDK v2. 

This change also optimizes DNS updates by publishing all changes before
waiting for propagation.