-
Notifications
You must be signed in to change notification settings - Fork 742
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add shutdown listener #645
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code lgtm. Couple minor suggestions, nothing major.
ipamd/ipamd.go
Outdated
// reconcileCooldownCache keeps timestamps of the last time an IP address was unassigned from an ENI, | ||
// so that we don't reconcile and add it back too quickly if IMDS lags behind reality. | ||
reconcileCooldownCache ReconcileCooldownCache | ||
terminating *int32 // Flag to warn that the pod is about to shut down. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could just make this an int32 and then when doing the atomic.StoreInt32()
call below, just refer to the address of this field, like so:
atomic.StoreInt32(&c.terminating, 1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, makes it cleaner. (Was testing the concept first with a boolean, but that's not thread safe.)
ipamd/ipamd.go
Outdated
c.awsClient = client | ||
|
||
t := int32(0) // Initializing to 0, meaning 'false' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that way you wouldn't need the above line or temporary variable.
624eb22
to
c8c7ff8
Compare
c8c7ff8
to
9735bff
Compare
@mogren @jaypipes do you think this maybe needs to be extended to prevent ipamd making any other updates in addition to the ENI updates? I'm currently debugging some weirdness and spotted some updates being made after a SIGTERM which led me here. Are you sure that if we SIGKILL at any point things will always reconcile cleanly on the next start? |
Hi @mattmb - Can you please expand more on what other updates you are seeing? |
I don't have a full conclusion yet @jayanthvn but will keep you posted if I find something conclusive. This is more of a hunch right now so just interested if the design of the shutdown process has been thought about before. In our infra we issue certificates to auth IPAMd to k8s and when we renew them we have to restart IPAMd. In addition we had a config management bug where we were doing an unnecessary restart quite soon after boot. One symptom I've seen is stale IP rules that seem to be left behind when the Pod itself was terminated some time ago. Here's an example log from one host I was debugging that shows quite a bit of activity after the sigterm before the sigkill:
|
*Issue #608, #401
Description of changes:
Some changes to reduce the risk of leaking ENIs during CNI upgrades.
SIGTERM
andSIGINT
.exec /app/aws-k8s-agent
to start the agent in order to propagate signals.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.