-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make node TLS certificates totally ephemeral #2675
Conversation
go/common/identity/identity.go
Outdated
} | ||
nextCert, err := tlsCert.Generate(CommonName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also make sure to reduce certificate expiration once this works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sentry nodes and IAS proxies require non-rotating certificates, so I've left this as-is for now.
413c698
to
5e69dc3
Compare
Codecov Report
@@ Coverage Diff @@
## master #2675 +/- ##
==========================================
+ Coverage 65.67% 65.69% +0.02%
==========================================
Files 342 342
Lines 32579 32866 +287
==========================================
+ Hits 21395 21591 +196
- Misses 8483 8581 +98
+ Partials 2701 2694 -7
Continue to review full report at Codecov.
|
a0e883a
to
f9a5dbe
Compare
@@ -50,7 +50,7 @@ func doNodeInit(cmd *cobra.Command, args []string) { | |||
) | |||
os.Exit(1) | |||
} | |||
if _, err = identity.LoadOrGenerate(dataDir, nodeSignerFactory); err != nil { | |||
if _, err = identity.LoadOrGenerate(dataDir, nodeSignerFactory, true); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that nodes provisioned this way won't have ephemeral TLS certs unless they obliterate the persisted cert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. As rotation is off by default for now, I thought this would make most sense.
Once we enable rotation by default, we can also disable persistent TLS certs here or add a flag to control this behavior.
57581bb
to
6e591d1
Compare
@@ -0,0 +1,6 @@ | |||
node: Add automatic TLS certificate rotation support |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably open an issue in https://github.com/oasislabs/docs once this is merged, so we wont forget adding/updating docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
ch, sub, err := auth.client.WatchRuntimes(ctx) | ||
var redialAttempts uint | ||
|
||
Redial: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use the cenkalti/backoff package we use in other similar cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would be an overkill for this simple use case -- the only time this connection is dropped is if the upstream node rotates its TLS certificates, a constant 2s retry is good enough.
1216289
to
36e985d
Compare
Thanks to everyone for your previous review comments. This PR is now ready for re-review :) |
// TLSCertificate is a certificate that can be used for TLS. | ||
TLSCertificate *tls.Certificate | ||
// tlsSigner is a node TLS certificate signer. | ||
tlsSigner signature.Signer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a matter of style I prefer private struct members to be separated from public ones (and not interleaved like it is here).
36e985d
to
acdb3c9
Compare
go/worker/sentry/grpc/worker.go
Outdated
dialUpstream := func() error { | ||
_, err := g.upstreamDialer(g.ctx) | ||
if err != nil { | ||
if numRetries < 60 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could instead wrap the existing constant backoff with WithMaxRetries
upstreamConn, err := initConnection(identity) | ||
if err != nil { | ||
return nil, fmt.Errorf("gRPC sentry worker initializing upstream connection failure: %w", err) | ||
g.upstreamDialer = func(ctx context.Context) (*grpc.ClientConn, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could the mutex be removed if instead the dialer just did initConnection
and returned it (without storing it in the sentry worker struct)?
It might be also a bit cleaner, since in that case there will always be two connections, one the proxy will use and one the policy watcher? (and each would keep track of it's own connection like both already do, and if needed, initialize more connections). I think the only difference to current impl. would be that currently the connection is ?sometimes? shared (I think currently it depends if proxy
or sentry
first invoke the dialer, if the connection is shared or not - since upstreamDialer
updates the internal connection state of the sentry
, but doesn't the one of the proxy
?).
On the other hand if we want to keep using one connection i think it could be done cleaner if upstreamDialer
is it's own struct (and not part of sentry worker), that maintains a connection (or a pool of connections), and exposes a dial method that returns it (and redials it if necessary).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to replace the policy watcher with a new sentry command (e.g. UpdatePolicies
or something) that is called by the upstream node in a similar way to SetUpstreamTLSCertificates
added in this PR. This will allow further simplifications of the sentry code, but that should be done in a separate PR.
I'll make an issue for this after I merge the PR.
@@ -0,0 +1,5 @@ | |||
Sentry nodes no longer require TLS certificate file of the upstream node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 👍 🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left 2 minor comments, otherwise looks good! 👍
acdb3c9
to
ee4d9fc
Compare
Thanks! |
Closes #2098.
TODO:
storage client: failed to write to any storage node
(I usedGetCertificate
where I should have usedGetClientCertificate
🤦♂️ ).