Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate CertFetcher with flag-protection into AMP Packager. #349

Merged
merged 23 commits into from
Nov 13, 2019

Conversation

banaag
Copy link
Collaborator

@banaag banaag commented Sep 18, 2019

Update certcache to use certfetcher if cert autorenew is turned on.
Update ACME config to include email adddress and acme challenge port.
Update certloader.PopulateCertCache to instantiate certfetcher and pass it on to certcache instance.
Update all relevant tests.

… Update certcache to use certfetcher if cert autorenew is turned on. Update certloader.PopulateCertCache to instantiate certfetcher and pass it on to certcache instance.
… Update certcache to use certfetcher if cert autorenew is turned on. Update certloader.PopulateCertCache to instantiate certfetcher and pass it on to certcache instance.
Copy link
Member

@Gregable Gregable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only minor comments, overall the structure looks good to me.

// How often to check if certs needs updating.
const CertCheckInterval = 24 * time.Hour

// Recommended renewal duration for certs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe clarify in the comments that this is the duration before the next expiry rather than the duration after the last renewal.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}
d, err := util.GetDurationToExpiry(this.certs[0], this.certs[0].NotAfter)
if err != nil {
// Current cert is already invalid. Check if renewal is available.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using TODO to mark places for the follow-on changes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already try to do this, what you pointed out is code I missed updating and now fixed. Thanks!

Copy link
Member

@twifkak twifkak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very very very very sorry for the delay. I did my best to understand the changes, but I don't think I fully succeeded at that, so comments are a bit more surface-level than I'd normally prefer to give.

go.sum Show resolved Hide resolved
packager/certcache/certcache.go Outdated Show resolved Hide resolved
packager/certcache/certcache.go Show resolved Hide resolved
packager/certcache/certcache.go Outdated Show resolved Hide resolved
packager/certcache/certcache.go Outdated Show resolved Hide resolved
// If renewedCerts is set, copy that over to certs
// and set renewedCerts to nil.
// TODO(banaag): do the same cert setting dance on disk.
// Purge OCSP cache? Make shouldUpdateOCSP() return true?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh, don't forget TODO this (future PR OK).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did the cert setting dance on disk. The Purge OCSP cache can be a future PR.

packager/certcache/certcache.go Outdated Show resolved Hide resolved
certs []*x509.Certificate
// If certFetcher is not set, that means cert auto-renewal is not available.
certFetcher *certfetcher.CertFetcher
renewedCertsMu sync.RWMutex
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The certName should change on a cert renewal (the sha256 computed by util.CertName is different), I think? So I think we need a CertCache.renewedCertName field also.
  2. I think we need to put any reads/writes of certName and renewedCertName under their respective RWMutex.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

log.Println("Error trying to fetch new certificates from CA: ", err)
return
}
this.setNewCerts(certs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic seems incorrect to me; sorry if I read it wrong. AFAICT:

  1. 8 days before expiry, this code runs and creates the new cert file. Nothing uses it yet.
  2. 1 day later, maintainCerts calls this function again, which will lead down here to another call to FetchNewCert. Ditto daily.
  3. Finally, maintainCerts calls this function and the old cert is expired, at which point the new cert is copied on top of it.

Step 2 seems like it'll make multiple redundant ACME requests for a cert.

Step 3 seems like it's happening at the wrong time (both too soon and too late). It seems like we should wait until we can get a GOOD OCSP response for it, and then immediately start serving it and using it to sign SXGs, so that we don't enter that 7-day period when we can't sign SXGs that last long enough.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me think about this and our offline discussion about OCSP checks and send a follow-up PR when it's done. Thanks!

packager/certloader/certloader.go Outdated Show resolved Hide resolved
@banaag banaag requested review from Gregable and twifkak October 31, 2019 17:02
@banaag
Copy link
Collaborator Author

banaag commented Nov 7, 2019

Fixed the comments I missed (thanks twifkak@). Please feel free to continue review. @twifkak @Gregable

Copy link
Member

@twifkak twifkak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I don't want to block you any more than I already have (sorry!) so I'll approve with a few remaining changes you can make before merging.

Before recommending this to users, though, I would like to see:

  • A lot more test coverage (future PR OK). This adds some new functions with a lot of branching paths, and also complex interactions between functions based on the timing of the various go routines, or when things happen on the ACME or OCSP server, or when things happen on another AMP Packager.
  • Go over your TODOs again, to see if there's anything we missed (future PR OK). For example, I recall we were supposed to present the TOS as part of the initial setup process (certfetcher.go).

@@ -234,6 +237,9 @@ func (this *CertCache) ocspMidpoint(bytes []byte, issuer *x509.Certificate) (tim

func (this *CertCache) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
params := mux.Params(req)

this.certsMu.Lock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be RLock/RUnlock? The only use of this.certs is a read, down on line 333.

Also, should it be closer to the read (i.e. inside the for loop)?

Copy link
Collaborator Author

@banaag banaag Nov 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already RLock and RUnlock in the current version (maybe you were looking at an older file?). Also, this is supposed to guard the read on this.certName. In the current version, I don't see anything on line 333 that needs to be guarded.

@@ -475,22 +481,50 @@ func (this *CertCache) findIssuer() *x509.Certificate {
return nil
}

// Finds the issuer of the specified cert (i.e. the second from the bottom of the
// chain).
func (this *CertCache) findIssuerUsingCerts(certs []*x509.Certificate) *x509.Certificate {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eliminate the duplication: implement findIssuer() using this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -669,30 +718,61 @@ func (this *CertCache) updateCertIfNecessary() {
this.setCerts(certs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this one have an OCSP cache purge, too? If so, maybe move the certloader.RemoveFile call inside the setCerts impl.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Done.

Comment on lines 754 to 758
d := time.Duration(0)
err := errors.New("")
if this.hasCert() {
d, err = util.GetDurationToExpiry(this.certs[0], time.Now())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/spaces/tabs/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran go fmt certcache.go

Comment on lines 754 to 755
d := time.Duration(0)
err := errors.New("")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of having these incorrect default values. Perhaps it's worth extracting the first half of this function as a:

func (this *CertCache) doesCertNeedReloading() bool {
  if !this.hasCert() { return true }
  d, err := ...
  return err != nil || d < certRenewalInterval
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

issuer := certToPEM(certs[1])
bundled := append(cert, issuer...)
certLen := len(certs)
bundled := make([]byte, 0, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just []byte{} perhaps?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

bundled := append(cert, issuer...)
certLen := len(certs)
bundled := make([]byte, 0, 0)
for i := 0; i < certLen - 1; i++ {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just change certLen - 1 to certLen and get rid of the special case issuer after the loop? Better still use for _, cert := range certs {.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

cert := certToPEM(certs[i])
bundled = append(bundled, cert...)
}
issuer := certToPEM(certs[certLen])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't certs[certLen] cause an out-of-range error?

Also, FYI, other than the leaf cert being at position 0, there's no ordering requirement for the rest of the certs array. The issuer cert can be any of the subsequent certs. (Hence why findIssuer() exists.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok fixed in previous comment. Good to know about issuer position.

} else {
var ocspUpdateAfter time.Time

ocsp, _, errorOcsp := this.readOCSP(true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's Go style to capitalize the initialisms inside camel-case, e.g. errorOCSP and newOCSP. (But keep it lowercase if it's the first one, like ocspUpdateAfter.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


ocsp, _, errorOcsp := this.readOCSP(true)
if errorOcsp != nil {
newOcsp := this.fetchOCSP(ocsp, this.renewedCerts, &ocspUpdateAfter, false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, this is going to try to fetch the OCSP response for the new cert once a day, rather than using the backoff retry logic you added to readOCSP. Is that desired? Maybe that's OK for now, but can you add a TODO to revisit?

I realize it's difficult to use readOCSP here, since it's hard-coded to use this.certs and friends, rather than this.renewedCerts and friends. That makes me think two things:

  1. The extraction of the retry logic from readOCSP would be useful.
  2. We should bundle certName, certs, certsMu, ocspFile, ocspFilePath, ocspUpdateAfter, and ocspUpdateAfterMu into a new struct type, and then have two copies of that in certcache - one for current certs and one for new certs.

OK for future PR... TODO? 🐑 😁

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for the certs is by design and is as follows:
updateCertIfNecessary() will be called:

  1. On demand, during initialization or when GetLatestCert() is called and we already have an expired cert.
  2. Every 'certCheckInterval' hours (currently every 24 hours).

The logic for retries in readOCSP is specific to just reading OCSP and retrying when we got a new cert because of Digicert's setup.

Inside updateCertIfNecessary, there's a case where we already got the renewalCert (waiting in this.renewedCerts) and in this case, we make sure we also get the OCSP by calling readOCSP with retries set to true.

I understand your suggestion of possibly reusing the readOCSP retry logic in this edge case, but given that the updateCertIfNecessary() calls are a day apart, chances are the fetchOCSP might fail the first time, but will succeed the next day. Given that, it will still be nice-to-have the refactor of readOCSP so I added this as a TODO.

// Callers can use the uninitialized CertCache for testing certificates (without doing OCSP or
// cert refreshes).
func New(certs []*x509.Certificate, certFetcher *certfetcher.CertFetcher, domains []string,
certFile string, newCertFile string, ocspCache string) *CertCache {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider this comment optional.

The long argument list makes the callsites tricky to read and easy to get wrong, especially if several of the arguments have the same type.

An alternative pattern would be to create an IsInitialized() bool or similarly named function that verifies all of the required fields have been set. Then callers can just set fields in the struct by name and assert IsInitialized before doing anything with it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added TODO.

continue
}
}
if len(ocsp) == 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have 3 places in this method that catch different error cases and trigger the numTries++;continue logic.

Consider extracting a method that does the ocsp read, checks all of the conditions (read error, empty file, unhealthy ocsp) and returns an error on any. This method and for loop then just implements the retry on error logic, but only needs a single condition/continue block.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

} else {
break
}
numTries++
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this reachable?

The previous condition block starting l.367 always exists I think.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed with refactor from comment above.

// Wait using exponential backoff.
log.Printf("Waiting for %d minute(s)\n", waitTimeInMinutes)
waitTimeDuration := time.Duration(waitTimeInMinutes) * time.Minute
// For exponential backoff.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no "right" read pattern, so feel free to ignore as bikeshedding.

I think that the cost of making an OCSP request to the CA is reasonably low enough that we could do it every minute or 10 until it succeeds or we time out.

With this backoff, the last lookup is 8hrs apart? Seems excessively long to wait. I'd consider something like every 30s for 5 minutes, and then every 5 minutes for the next 12 hrs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the logic so that it caps the wait time at 10 minutes, max.

domains []string, acmeChallengePort int, shouldRegister bool) (*CertFetcher, error) {
func New(email string, certSignRequest *x509.CertificateRequest, privateKey crypto.PrivateKey,
acmeDiscoURL string, httpChallengePort int, httpChallengeWebRoot string,
tlsChallengePort int, dnsProvider string, shouldRegister bool) (*CertFetcher, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional again. Same thought here about breaking this up. Callsite could have some structure like:

fetcher := CertFetcher()
fetcher.setUser(email, privateKey)
fetcher.bindToPort(port)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added TODO.

@banaag
Copy link
Collaborator Author

banaag commented Nov 13, 2019

Fixed @Gregable comments. PTAL.

@banaag banaag merged commit e416691 into ampproject:master Nov 13, 2019
@@ -107,6 +107,12 @@ type CertCache struct {
// Callers need to call Init() on the returned CertCache before the cache can auto-renew certs.
// Callers can use the uninitialized CertCache for testing certificates (without doing OCSP or
// cert refreshes).
//
// TODO(banaag): per greigable@ comments:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*gregable

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will address in follow-up PR #369


return ocsp, ocspUpdateAfter, nil
}

// Returns the OCSP response and expiry, refreshing if necessary.
// TODO(banaag): Per twifkak's suggestion, consider:
// It may also be interesting to try to separate the retry logic from the fetch logic. One approach comes to mind:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you just did this TODO, essentially?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will address in follow-up PR #369

@@ -396,6 +411,10 @@ func waitForSpecifiedTime(waitTimeInMinutes int, numRetries int) int {
waitTimeDuration := time.Duration(waitTimeInMinutes) * time.Minute
// For exponential backoff.
newWaitTimeInMinutes := 2 * waitTimeInMinutes
if newWaitTimeInMinutes > 10 {
// Cap the wait time at 10 minutes.
newWaitTimeInMinutes = 10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per this change, the comment on maxOCSPTries needs to be updated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will address in follow-up PR #369

banaag added a commit to banaag/amppackager that referenced this pull request Nov 15, 2019
banaag added a commit that referenced this pull request Nov 15, 2019
twifkak pushed a commit that referenced this pull request Dec 5, 2019
First pass at updating amppkg.example.toml to document new experimental ACME support, implemented in #349 and others.
@banaag banaag added the acme All ACME related PRs label Feb 24, 2020
@banaag banaag deleted the certfetcher-integrate branch March 5, 2020 22:06
@twifkak twifkak mentioned this pull request Jul 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acme All ACME related PRs cla: yes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants