-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Envoy to fetch secrets using SDS service. #4256
Conversation
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Load balancing senior maintainer load a bit. |
Hi @ggreenway, this PR is split from #4176. Please let me know if you have any questions. Thanks! |
@lizan Can you take a first pass on this one please? |
* secret provider. | ||
* @return TlsCertificateConfigProviderSharedPtr the dynamic TLS secret provider. | ||
*/ | ||
virtual TlsCertificateConfigProviderSharedPtr findOrCreateDynamicSecretProvider( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this work for other Secret types? name it findOrCreateTlsCertificateProvider
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method name is updated to findOrCreateTlsCertificateProvider(). Will add findOrCreateCertificateValidationContextProvider() once #4264 is in. Thanks.
source/common/ssl/ssl_socket.cc
Outdated
config_(std::move(config)), | ||
ssl_ctx_(manager_.createSslClientContext(stats_scope_, *config_)) { | ||
config_->setSecretUpdateCallback([this]() { onAddOrUpdateSecret(); }); | ||
} | ||
|
||
Network::TransportSocketPtr ClientSslSocketFactory::createTransportSocket() const { | ||
if (ssl_ctx_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while locking is not neccessary, you have to capture ssl_ctx_ into a local variable here, onAddOrUpdateSecret
may overwrite it since it's on main thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Thanks! Fixed this issue and added comment above.
Signed-off-by: JimmyCYJ <[email protected]>
@ggreenway Done, I was already did passes in #4176 and before @JimmyCYJ sending out it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good.
Is there any unit-test coverage for the SslSocketFactory behavior when the secrets aren't ready? If not, please add some.
void SecretManagerImpl::removeDynamicSecretProvider(const std::string& map_key) { | ||
ENVOY_LOG(debug, "Unregister secret provider. hash key: {}", map_key); | ||
|
||
RELEASE_ASSERT(dynamic_secret_providers_.erase(map_key) == 1, ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: I prefer not putting actions inside ASSERT, even if it's RELEASE_ASSERT.
auto num_deleted = dynamic_secret_providers_.erase(map_key);
RELEASE_ASSERT(num_deleted == 1, "");
Also, does this need to be RELEASE_ASSERT? ASSERT is preferable in most cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out. Moved erase() call outside of RELEASE_ASSERT.
RELEASE_ASSERT was recommended here comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be RELEASE_ASSERT; otherwise in opt builds, this entire line disappears.
With it split into two lines, the rational for making it RELEASE_ASSERT instead of ASSERT is no longer true, so this should now be ASSERT
TlsCertificateConfigProviderSharedPtr SecretManagerImpl::findOrCreateTlsCertificateProvider( | ||
const envoy::api::v2::core::ConfigSource& sds_config_source, const std::string& config_name, | ||
Server::Configuration::TransportSocketFactoryContext& secret_provider_context) { | ||
const std::string map_key = std::to_string(MessageUtil::hash(sds_config_source)) + config_name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use a hash as part of the key? Doesn't this leave the (rare) possibility of a hash collision? Can we flatten sds_config_source to string representation instead of taking a hash?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. Thanks! rds_impl.cc also use string representation as key. Changed map_key to make it consistent with rds.
Server::Configuration::TransportSocketFactoryContext& secret_provider_context) { | ||
const std::string map_key = std::to_string(MessageUtil::hash(sds_config_source)) + config_name; | ||
|
||
auto secret_provider = dynamic_secret_providers_[map_key].lock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be easier to read with a typename instead of auto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced with typename. Thanks.
@@ -51,9 +54,24 @@ class ContextConfigImpl : public virtual Ssl::ContextConfig { | |||
unsigned minProtocolVersion() const override { return min_protocol_version_; }; | |||
unsigned maxProtocolVersion() const override { return max_protocol_version_; }; | |||
|
|||
bool isReady() const override { | |||
// either tls_certficate_provider_ is nullptr or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: capitalize Either
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
bool isReady() const override { | ||
// either tls_certficate_provider_ is nullptr or | ||
// tls_certficate_provider_->secret() is NOT nullptr. | ||
return !tls_certficate_provider_ || tls_certficate_provider_->secret(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|| tls_certificate_provider_->secret() != nullptr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
source/common/ssl/ssl_socket.cc
Outdated
} else { | ||
ENVOY_LOG(debug, "Create NotReadySslSocket"); | ||
stats_.upstream_connection_reset_by_sds_.inc(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This stat name is confusing to me. Maybe rename to "upstream_context_secrets_not_ready" or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to upstream_context_secrets_not_ready
source/server/server.cc
Outdated
@@ -55,9 +55,9 @@ InstanceImpl::InstanceImpl(Options& options, TimeSource& time_source, | |||
dispatcher_(api_->allocateDispatcher(time_source)), | |||
singleton_manager_(new Singleton::ManagerImpl()), | |||
handler_(new ConnectionHandlerImpl(ENVOY_LOGGER(), *dispatcher_)), | |||
random_generator_(std::move(random_generator)), listener_component_factory_(*this), | |||
random_generator_(std::move(random_generator)), | |||
secret_manager_(new Secret::SecretManagerImpl()), listener_component_factory_(*this), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::make_unique
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Protobuf::RepeatedPtrField<envoy::api::v2::auth::Secret> secret_resources; | ||
auto secret_config = secret_resources.Add(); | ||
MessageUtil::loadFromYaml(TestEnvironment::substitute(yaml), *secret_config); | ||
std::dynamic_pointer_cast<SdsApi>(secret_provider)->onConfigUpdate(secret_resources, ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way to refactor this so we don't have to do this cast?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't find a way to refactor this. The onConfigUpdate() is inherited from SubscriptionCallbacks, not SecretProvider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think there is no way to refactor this without dynamic_cast, but this could be dynamic_cast<SdsApi&>(*secret_provider).onConfigUpdate
// Wait for ssl_context_updated_by_sds counter. | ||
if (version_ == Network::Address::IpVersion::v4) { | ||
test_server_->waitForCounterGe( | ||
"listener.127.0.0.1_0.server_ssl_socket_factory.ssl_context_update_by_sds", 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to be able to set the stat_prefix on listeners, so that you don't have to care which ip version the test is for. But that's a fix for another day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Thanks!
Signed-off-by: JimmyCYJ <[email protected]>
@ggreenway Thanks for the review! I have addressed some comments. Will work on the rest soon. |
Add certificate validation context config provider and refactor ContextConfigImpl. Make static and inline CertificateValidationContext utilize it. This is independent of PR #4256. Once #4256 is in, next step is to support fetching CertificateValidationContext via validation_context_sds_secret_config. Risk Level: Low Testing: unit test, integration test Docs Changes: N/A Release Notes: N/A Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
…der_context Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
Signed-off-by: JimmyCYJ <[email protected]>
@ggreenway Thanks for the review! I have added two unit tests into ssl_socket_test.cc for NotReadySslSocket. Please take a look. |
NiceMock<Server::Configuration::MockTransportSocketFactoryContext> secret_context; | ||
|
||
envoy::api::v2::core::ConfigSource config_source; | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: why do you need this block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I added that block to cover some some function call, but many changes have been made and this block is not needed.
@@ -132,6 +137,51 @@ name: "abc.com" | |||
"Secret type not implemented"); | |||
} | |||
|
|||
TEST_F(SecretManagerImplTest, SdsDynamicSecretUpdateSuccess) { | |||
Server::MockInstance server; | |||
std::unique_ptr<SecretManager> secret_manager(new SecretManagerImpl()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: std::make_unique
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Protobuf::RepeatedPtrField<envoy::api::v2::auth::Secret> secret_resources; | ||
auto secret_config = secret_resources.Add(); | ||
MessageUtil::loadFromYaml(TestEnvironment::substitute(yaml), *secret_config); | ||
std::dynamic_pointer_cast<SdsApi>(secret_provider)->onConfigUpdate(secret_resources, ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think there is no way to refactor this without dynamic_cast, but this could be dynamic_cast<SdsApi&>(*secret_provider).onConfigUpdate
|
||
auto secret_provider = secret_manager->findOrCreateTlsCertificateProvider( | ||
config_source, "abc.com", secret_context); | ||
std::string yaml = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to const std::string and replaced dynamic_pointer_cast with dynamic_cast. Thanks.
Signed-off-by: JimmyCYJ <[email protected]>
Waiting for review of most recent commit by @lizan |
Pulling the following changes from github.com/envoyproxy/envoy: f936fc6 ssl: serialize accesses to SSL socket factory contexts (envoyproxy#4345) e34dcd6 Fix crash in tcp_proxy (envoyproxy#4323) ae6a252 router: fix matching when all domains have wildcards (envoyproxy#4326) aa06142 test: Stop fake_upstream methods from accidentally succeeding (envoyproxy#4232) 5d73187 rbac: update the authenticated.user to a StringMatcher. (envoyproxy#4250) c6bfc7d time: Event::TimeSystem abstraction to make it feasible to inject time with simulated timers (envoyproxy#4257) 752483e Fixing the fix (envoyproxy#4333) 83487f6 tls: update BoringSSL to ab36a84b (3497). (envoyproxy#4338) 7bc210e test: fixing interactions between waitFor and ignore_spurious_events (envoyproxy#4309) 69474b3 admin: order stats in clusters json admin (envoyproxy#4306) 2d155f9 ppc64le build (envoyproxy#4183) 07efc6d fix static initialization fiasco problem (envoyproxy#4314) 0b7e3b5 test: Remove declared but undefined class methods (envoyproxy#4297) 1485a13 lua: make sure resetting dynamic metadata wrapper when request info is marked dead d243cd6 test: set to zero when start_time exceeds limit (envoyproxy#4328) 0a1e92a test: fix heap use-after-free in ~IntegrationTestServer. (envoyproxy#4319) cddc732 CONTRIBUTING: Document 'kick-ci' trick. (envoyproxy#4335) f13ef24 docs: remove reference to deprecated value field (envoyproxy#4322) e947a27 router: minor doc fixes in stream idle timeout (envoyproxy#4329) 0c2e998 tcp-proxy: fixing a TCP proxy bug where we attempted to readDisable a closed connection (envoyproxy#4296) 00ffe44 utility: fix strftime overflow handling. (envoyproxy#4321) af1183c Re-enable TcpProxySslIntegrationTest and make the tests pass again. (envoyproxy#4318) 3553461 fuzz: fix H2 codec fuzzer post envoyproxy#4262. (envoyproxy#4311) 42f6048 Proto string issue fix (envoyproxy#4320) 9c492a0 Support Envoy to fetch secrets using SDS service. (envoyproxy#4256) a857219 ratelimit: revert `revert rate limit failure mode config` and add tests (envoyproxy#4303) 1d34172 dns: fix exception unsafe behavior in c-ares callbacks. (envoyproxy#4307) 1212423 alts: add gRPC TSI socket (envoyproxy#4153) f0363ae fuzz: detect client-side resets in H2 codec fuzzer. (envoyproxy#4300) 01aa3f8 test: hopefully deflaking echo integration test (envoyproxy#4304) 1fc0f4b ratelimit: link legacy proto when message is being used (envoyproxy#4308) aa4481e fix rare List::remove(&target) segfault (envoyproxy#4244) 89e0f23 headers: fixing fast fail of size-validation (envoyproxy#4269) 97eba59 build: bump googletest version. (envoyproxy#4293) 0057e22 fuzz: avoid false positives in HCM fuzzer. (envoyproxy#4262) 9d094e5 Revert ac0bd74 (envoyproxy#4295) ddb28a4 Add validation context provider (envoyproxy#4264) 3b47cba added histogram latency information to Hystrix dashboard stream (envoyproxy#3986) cf87d50 docs: update SNI FAQ. (envoyproxy#4285) f952033 config: fix update empty stat for eds (envoyproxy#4276) 329e591 router: Add ability of custom headers to rely on per-request data (envoyproxy#4219) 68d20b4 thrift: refactor build files and imports (envoyproxy#4271) 5fa8192 access_log: log requested_server_name in tcp proxy (envoyproxy#4144) fa45bb4 fuzz: libc++ clocks don't like nanos. (envoyproxy#4282) 53f8944 stats: add symbol table for future stat name encoding (envoyproxy#3927) c987b42 test infra: Remove timeSource() from the ClusterManager api (envoyproxy#4247) cd171d9 websocket: tunneling websockets (and upgrades in general) over H2 (envoyproxy#4188) b9dc5d9 router: disallow :path/host rewriting in request_headers_to_add. (envoyproxy#4220) 0c91011 network: skip socket options and source address for UDS client connections (envoyproxy#4252) da1857d build: fixing a downstream compile error by noting explicit fallthrough (envoyproxy#4265) 9857cfe fuzz: cleanup per-test environment after each fuzz case. (envoyproxy#4253) 52beb06 test: Wrap proto string in std::string before comparison (envoyproxy#4238) f5e219e extensions/thrift_proxy: Add header matching to thrift router (envoyproxy#4239) c9ce5d2 fuzz: track read_disable_count bidirectionally in codec_impl_fuzz_test. (envoyproxy#4260) 35103b3 fuzz: use nanoseconds for SystemTime in RequestInfo. (envoyproxy#4255) ba6ba98 fuzz: make runtime root hermetic in server_fuzz_test. (envoyproxy#4258) b0a9014 time: Add 'format' test to ensure no one directly instantiates Prod*Time from source. (envoyproxy#4248) 8567460 access_log: support beginning of epoch in START_TIME. (envoyproxy#4254) 28d5f41 proto: unify envoy_proto_library/api_proto_library. (envoyproxy#4233) f7d3cb6 http: fix allocation bug introduced in envoyproxy#4211. (envoyproxy#4245) Fixes istio/istio#8310 (once pulled into istio/istio). Signed-off-by: Piotr Sikora <[email protected]>
Description: Use SDS api that fetches secrets from remote SDS server. Secrets are stored in Secret Provider. Listeners and Clusters are updated when secrets are received.
Risk Level: Low
Testing: Unit tests and integration tests
Fixes #1194
Signed-off-by: Jimmy Chen [email protected]