Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force ENI/IP reconciliation to delete from the datastore #733

Closed
wants to merge 4 commits into from
Closed

Force ENI/IP reconciliation to delete from the datastore #733

wants to merge 4 commits into from

Conversation

tatatodd
Copy link
Contributor

Fixes #732

The ENI/IP reconciliation logic fails to delete from the datastore, if any IPs are already assigned to pods. This is wrong; the AWS local metadata is the source of truth for what ENIs/IPs are actually attached to the EC2 instance. By failing to delete from the datastore, ipamd will assign IPs from ENIs that aren't actually attached to the EC2 instance.

This PR fixes this by forcing the reconciliation logic to delete from the datastore. I've also added prometheus counters to track how often this force-deletion is occurring, to aid debugging.

The unittests pass, and I've been running an image built from this on my clusters, and have seen that it fixes the problem.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

* seems about right

* force removal tests
@tatatodd
Copy link
Contributor Author

Oops, this was a bad merge from my 1.5.5 branch. I'll fix that now.

@jaypipes
Copy link
Contributor

jaypipes commented Dec 6, 2019

@tatatodd this is showing merge conflicts. Mind having a whack at those?

Copy link
Contributor

@jaypipes jaypipes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few femto-nits inline. But, big picture, I'm concerned about the comment that @mogren had on an earlier revision about the ec2 instance metadata service often containing stale information. If that is the case, do we really want to use the IMDS as the source of truth instead of the internal ipamd datastore?

ipamd/ipamd.go Outdated Show resolved Hide resolved
ipamd/ipamd.go Outdated Show resolved Hide resolved
ipamd/ipamd.go Outdated Show resolved Hide resolved
ipamd/datastore/data_store.go Outdated Show resolved Hide resolved
ipamd/datastore/data_store.go Outdated Show resolved Hide resolved
ipamd/datastore/data_store.go Outdated Show resolved Hide resolved
Copy link
Contributor Author

@tatatodd tatatodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look @jaypipes
I've fixed the merge and addressed your comments.

WRT the higher-level question of whether the local metadata should be the source-of-truth, my 2 cents is that at the moment it's already treated as the source of truth. AFAICT that seems to be the point of the reconciliation process. And while it's unfortunate that the local metadata can be so stale, if it reports that an ENI is detached, it doesn't seem useful for ipamd to ignore the local metadata and to continue to assign IPs from that ENI.

Another way to think of it is that it's weird to treat the local metadata as the source of truth for ENI attachment, but to ignore it for ENI detachment, which is what HEAD currently does.

That said, I don't know what other architectural changes you folks might have in mind, so I'll leave this up to you.

ipamd/datastore/data_store.go Outdated Show resolved Hide resolved
ipamd/datastore/data_store.go Outdated Show resolved Hide resolved
ipamd/datastore/data_store.go Outdated Show resolved Hide resolved
ipamd/ipamd.go Outdated Show resolved Hide resolved
ipamd/ipamd.go Outdated Show resolved Hide resolved
ipamd/ipamd.go Outdated Show resolved Hide resolved
Copy link
Contributor

@jaypipes jaypipes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making those small changes, @tatatodd, appreciated.

I think your explanation/justification is on point, so I'm cool with pushing this through. I do wonder if we should attempt to reassign the active pod's IP addresses from the detached ENI to an attached ENI (or alternately bring up a new ENI automatically on the instance and assign the pods an IP from that newly-attached ENI). But, this patch doesn't need to address that; we can discuss in a future patch whether that's something we should tackle.

@jaypipes
Copy link
Contributor

@mogren if you're cool with this, feel free to send it on through.

Copy link
Contributor

@mogren mogren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this makes sense, thanks!

@jaypipes
Copy link
Contributor

@tatatodd I felt bad because I merged something ahead of you that cause merge conflicts with your PR, so I checked out your commits, rebased and fixed the conflicts and have pushed a PR here: #754. Closing this PR out (you still get credit for all your excellent contributions, just in the #754 instead of here).

@tatatodd
Copy link
Contributor Author

Thanks for the help @jaypipes and @mogren !

This was referenced Dec 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ipamd assigns IPs from detached ENIs
3 participants