-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
security: clear client cert expiration cache on SIGHUP #110726
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)
pkg/security/certificate_manager.go
line 179 at r2 (raw file):
log.StructuredEvent(ctx, &eventpb.CertsReload{Success: true}) } cm.clientCertExpirationCache.Clear()
I'd recommend pulling this up, above the call to LoadCertificates
, such that the structured logging message is still guaranteed to be emitted after the reload (and now, flush) completes.
this would also benefit from a test, you can see TestRotateCerts for an example. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the pointer to the test!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz)
pkg/security/certificate_manager.go
line 179 at r2 (raw file):
Previously, knz (Raphael 'kena' Poss) wrote…
I'd recommend pulling this up, above the call to
LoadCertificates
, such that the structured logging message is still guaranteed to be emitted after the reload (and now, flush) completes.
good call
b176611
to
17e8e3c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works for me. Let's however check with @lancel66 whether the customer is OK with a flush of all the metrics at once (an alternative design, not considered here, was to provide a mechanism to flush the metric for just 1 SQL user).
Reviewed 1 of 2 files at r3, 4 of 4 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rafiss)
pkg/security/cert_expiry_cache.go
line 91 at r4 (raw file):
OnEvictedEntry: func(entry *cache.Entry) { gauge := entry.Value.(*aggmetric.Gauge) gauge.Update(0)
Why was this needed?
This will help with making a change to clear the cache on the SIGHUP signal. Release note: None
17e8e3c
to
d07142b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @mgartner)
pkg/security/cert_expiry_cache.go
line 91 at r4 (raw file):
Previously, knz (Raphael 'kena' Poss) wrote…
Why was this needed?
I noticed this comment on Unlink
// Unlink unlinks this child from the parent, i.e. the parent will no longer// track this child (i.e. won't generate labels for it, etc). However, the child
// will continue to be functional and reference the parent, meaning updates to
// it will be reflected in the aggregate stored in the parent.
so it seems more correct to reset the child gauge when unlinking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rafiss)
pkg/security/cert_expiry_cache.go
line 91 at r4 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
I noticed this comment on
Unlink
// Unlink unlinks this child from the parent, i.e. the parent will no longer// track this child (i.e. won't generate labels for it, etc). However, the child // will continue to be functional and reference the parent, meaning updates to // it will be reflected in the aggregate stored in the parent.
so it seems more correct to reset the child gauge when unlinking.
probably worth a comment.
d07142b
to
f106183
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @mgartner)
pkg/security/cert_expiry_cache.go
line 91 at r4 (raw file):
Previously, knz (Raphael 'kena' Poss) wrote…
probably worth a comment.
i've added one now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @mgartner, and @rafiss)
pkg/security/cert_expiry_cache.go
line 91 at r4 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
i've added one now
the aggregate value stored in the parent is meaningless, so that's why it wasn't being reset when unlinked. but this change is harmless so up to you
pkg/security/certificate_manager.go
line 190 at r5 (raw file):
} // MaybeUpsertClientExpiration updates the updates or inserts the expiration
nit: "updates" instead of "updates the updates"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @cameronnunez, @knz, and @mgartner)
pkg/security/cert_expiry_cache.go
line 91 at r4 (raw file):
Previously, cameronnunez (Cameron Nunez) wrote…
the aggregate value stored in the parent is meaningless, so that's why it wasn't being reset when unlinked. but this change is harmless so up to you
i'll keep it since it's a good example to set for other usages
pkg/security/certificate_manager.go
line 190 at r5 (raw file):
Previously, cameronnunez (Cameron Nunez) wrote…
nit: "updates" instead of "updates the updates"
fixed, thanks!
Release note (security update): The SIGHUP signal will now clear the cached expiration times for client certificates that are reported by the security.certificate.expiration.client metric.
f106183
to
8c42665
Compare
tftrs! bors r+ |
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 40ca81d to blathers/backport-release-23.1-110726: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Fixes #110493.
Epic: CRDB-28893
Release note (security update): The SIGHUP signal will now clear the
cached expiration times for client certificates that are reported by
the security.certificate.expiration.client metric. (The SIGHUP signal also
will still cause the server to reload certificates from the filesystem, as
it did before.)