Implementation of SDS #4176
@@ -26,6 +26,7 @@
#include "envoy/upstream/upstream.h"

namespace Envoy {

nit: revert
Secret::TlsCertificateConfigProviderSharedPtr
getTlsCertificateConfigProvider(const envoy::api::v2::auth::CommonTlsContext& config,
Secret::SecretManager& secret_manager) {
Secret::TlsCertificateConfigProviderSharedPtr ContextConfigImpl::getTlsCertificateConfigProvider(
Why did this need to be a non-static private function? I don't think it uses any member variables.
include/envoy/ssl/context_config.h
Outdated
* Add secret callback into context config.
* @param callback callback that is executed by context config.
*/
virtual void setUpdateCallback(Secret::SecretCallbacks& callback) PURE;
setSecretUpdateCallback
Changed.
source/common/secret/sds_api.cc
Outdated
/* rest_legacy_constructor */ nullptr,
"envoy.service.discovery.v2.SecretDiscoveryService.FetchSecrets",
"envoy.service.discovery.v2.SecretDiscoveryService.StreamSecrets");
Config::Utility::checkLocalInfo("sds", local_info_);
You can move the creation of subscription_ into the constructor so we don't need to store so many object references.
Then only call start() in the initialize() function.
We cannot move the creation of subscription_ into the constructor. Envoy::Config::SubscriptionFactory::subscriptionFromConfigSource<envoy::api::v2::auth::Secret> accesses members of those objects, which are not ready when the SdsApi constructor is called. That causes a segmentation fault in test runs.
if (!secret_provider) {
ASSERT(secret_provider_context.initManager() != nullptr);

std::function<void()> unregister_secret_provider = [map_key, config_name, sds_config_source,
For readability, maybe create a member function removeDynamicSecretProvider() so that the lambda can just call it.
Done. Thanks.
@@ -18,6 +19,13 @@ static const std::string INLINE_STRING = "<inline>";

class ContextConfigImpl : public virtual Ssl::ContextConfig {
public:
~ContextConfigImpl() override {
Move it to the .cc file.
return !tls_certficate_provider_ || tls_certficate_provider_->secret();
}

void setUpdateCallback(Secret::SecretCallbacks& callback) override {
Move it to the .cc file.
tls_certficate_provider_->removeUpdateCallback(*secret_callback_);
}
secret_callback_ = &callback;
if (tls_certficate_provider_.get() != nullptr) {
Only save the callback when the provider is not null.
You can do:
if (tls_certficate_provider_) {
  if (secret_callback_) {
    secret_callback_->remove
  }
  secret_callback_ = &callback;
  secret_callback_->add
}
Done.
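For illustration, here is a minimal, self-contained sketch of the pattern suggested above: the callback is only stored when a provider exists, and any previously registered callback is removed first. `Provider`, `Config`, `NoopCallbacks`, `addUpdateCallback` and `callbackCount` are hypothetical stand-ins invented for this sketch, not the classes in this PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Secret::SecretCallbacks.
struct SecretCallbacks {
  virtual ~SecretCallbacks() = default;
  virtual void onAddOrUpdateSecret() = 0;
};

// Hypothetical stand-in for a secret provider that keeps a list of callbacks.
class Provider {
public:
  void addUpdateCallback(SecretCallbacks& cb) { callbacks_.push_back(&cb); }
  void removeUpdateCallback(SecretCallbacks& cb) {
    callbacks_.erase(std::remove(callbacks_.begin(), callbacks_.end(), &cb),
                     callbacks_.end());
  }
  std::size_t callbackCount() const { return callbacks_.size(); }

private:
  std::vector<SecretCallbacks*> callbacks_;
};

// Hypothetical stand-in for the context config; the provider may be null.
class Config {
public:
  explicit Config(Provider* provider) : provider_(provider) {}

  void setSecretUpdateCallback(SecretCallbacks& callback) {
    if (provider_ != nullptr) {
      if (secret_callback_ != nullptr) {
        // Drop the previous registration before replacing it.
        provider_->removeUpdateCallback(*secret_callback_);
      }
      secret_callback_ = &callback;
      provider_->addUpdateCallback(callback);
    }
  }

  bool hasCallback() const { return secret_callback_ != nullptr; }

private:
  Provider* provider_;
  SecretCallbacks* secret_callback_{nullptr};
};

// Callback implementation that does nothing, used only to exercise the sketch.
struct NoopCallbacks : SecretCallbacks {
  void onAddOrUpdateSecret() override {}
};
```

With this shape, setting a second callback leaves exactly one registration on the provider, and a config without a provider never stores a dangling callback pointer.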
source/common/ssl/ssl_socket.cc
Outdated
@@ -17,6 +17,38 @@ using Envoy::Network::PostIoAction;
namespace Envoy {
namespace Ssl {

namespace {

class NotReadySslSocket : public Network::TransportSocket, public Connection {
Please add a comment for this class, e.g.:
// This SslSocket will be used when SSL secret is not fetched from SDS server.
Done.
ENVOY_LOG(debug, "Unregister secret provider. hash key: {}", map_key);
auto secret_provider = dynamic_secret_providers_.find(map_key);
if (secret_provider != dynamic_secret_providers_.end()) {
dynamic_secret_providers_.erase(map_key);
You can call erase() directly, then check the return value. If 0, log an error.
Nice advice. Thanks!
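As a sketch of the suggestion above: `std::unordered_map::erase(key)` returns the number of elements removed, so the separate `find()` is not needed. `ProviderMap` and `removeProvider` are hypothetical names for this example, and the value type is simplified to `int`.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the dynamic_secret_providers_ map.
using ProviderMap = std::unordered_map<std::string, int>;

// Erases map_key and reports whether an entry was actually removed.
bool removeProvider(ProviderMap& providers, const std::string& map_key) {
  // erase(key) returns 0 or 1 for a map with unique keys.
  const std::size_t erased = providers.erase(map_key);
  if (erased == 0) {
    std::cerr << "secret provider not found: " << map_key << '\n';
    return false;
  }
  return true;
}
```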
sds_config_source.DebugString());
}
std::function<void()> unregister_secret_provider = [map_key, this]() {
this->removeDynamicSecretProvider(map_key);
You may not need this variable; just pass the lambda directly as the function argument, such as:
xxx(....,
[map_key, this] () { removeDynamicSecretProvider(map_key); }
);
Force-pushed from 1c67da7 to 5b9e45f.
Can you merge master and run clang-format 6.0 per #4168?
Force-pushed from 854b7af to b4a354d.
@lizan I have run clang-format 6.0.
@@ -0,0 +1,22 @@
#pragma once

#include <memory>
nit: unused include
@@ -0,0 +1,21 @@
#pragma once

#include <string>
string is not used either...
Force-pushed from f1c054b to 02e1c52.
Thanks for the hard work here and extra thanks for the integration tests. It gives me good confidence this change works. Nice! I have some high level comments to get started with. FYI, @htuch is going to take over senior maintainer review on this since I am out for a month starting tomorrow. Thank you!
namespace Server {
namespace Configuration {
class TransportSocketFactoryContext;
nit: It would probably be better to just move this interface into the secret or SSL/TLS namespace but not a big deal.
Thanks, I will keep this.
include/envoy/ssl/context_config.h
Outdated
namespace Envoy {
namespace Secret {
class SecretCallbacks;
Do we need the forward declare here?
I have removed this forward declaration and included secret_callbacks.h.
include/envoy/ssl/context_config.h
Outdated
@@ -95,6 +98,17 @@ class ContextConfig {
* @return The maximum TLS protocol version to negotiate.
*/
virtual unsigned maxProtocolVersion() const PURE;

/**
* @return true if the ssl config is ready.
Can we add more information about what ready means?
Comment is updated.
include/envoy/ssl/context_config.h
Outdated
/**
* Add secret callback into context config.
* @param callback callback that is executed by context config.
Can you add more information about when callbacks will be invoked? It's not clear at the interface level why/when this happens.
More information is added into comment. Thanks.
* Pass an init manager to register dynamic secret provider.
* @param init_manager instance of init manager.
*/
virtual void setInitManager(Init::Manager& init_manager) PURE;
It's not optimal that this interface has a setter and a getter for the init manager. Is there any way to simplify this so that the init manager is known ahead of time and we only need a getter?
Ideally the init manager should exist when we create TransportSocketFactoryContext, so that we only need a getter. However, we create TransportSocketFactoryContext in ClusterManagerFactory and only then create an init manager per cluster, so we have to add it to the factory context afterwards. @qiwzhang proposed #3831; once we work on that, I think we can get rid of this setter.
Shouldn't the init manager then be part of the factory context?
For now, the init manager is per cluster. When TransportSocketFactoryContext is constructed, the per-cluster init manager is not yet available, so we have to set it once it becomes available.
@@ -17,6 +17,38 @@ using Envoy::Network::PostIoAction;
namespace Envoy {
namespace Ssl {

namespace {

// This SslSocket will be used when SSL secret is not fetched from SDS server.
Do we really need this? What causes us to require an instantiated connection? Can't we cause the control flow to return in such a way that there is no socket and we just fail whatever we are doing?
Our first attempt was to return nullptr from socket creation when the ssl_context is not ready (the SDS client failed to get the secret). This worked for ListenerImpl but not for ClusterImpl. After looking at the ClusterImpl code, most of it expects neither a nullptr socket nor a nullptr connection, and it would require a lot of changes to make that code handle a null connection.
To keep the change small, we decided to return a dummy socket that just resets the connection.
I don't have an opinion on whether this is the best course of action or not. @htuch hopefully can have a look. In general I would prefer that we don't do this but if it's the best way I can accept that.
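To make the trade-off concrete, the dummy-socket approach discussed above can be sketched roughly as follows: the factory never returns null, and the placeholder socket tells the caller to close the connection on any I/O. The interface here is a heavily simplified, hypothetical stand-in (`IoResult`, `ReadySocket`, `NotReadySocket` and `createSocket` are invented for this sketch); the real Network::TransportSocket interface is much larger.

```cpp
#include <memory>

// Hypothetical, simplified result types for this sketch.
enum class PostIoAction { KeepOpen, Close };

struct IoResult {
  PostIoAction action;
  unsigned long bytes_processed;
};

// Hypothetical, minimal transport socket interface.
class TransportSocket {
public:
  virtual ~TransportSocket() = default;
  virtual IoResult doRead() = 0;
  virtual IoResult doWrite() = 0;
};

// Stand-in for a working socket; it keeps the connection open.
class ReadySocket : public TransportSocket {
public:
  IoResult doRead() override { return {PostIoAction::KeepOpen, 0}; }
  IoResult doWrite() override { return {PostIoAction::KeepOpen, 0}; }
};

// Placeholder socket: every I/O attempt asks the caller to close the connection.
class NotReadySocket : public TransportSocket {
public:
  IoResult doRead() override { return {PostIoAction::Close, 0}; }
  IoResult doWrite() override { return {PostIoAction::Close, 0}; }
};

// Callers always get a non-null socket, so they need no null checks.
std::unique_ptr<TransportSocket> createSocket(bool secret_ready) {
  if (!secret_ready) {
    return std::make_unique<NotReadySocket>();
  }
  return std::make_unique<ReadySocket>();
}
```

The design choice is the usual null-object trade: callers stay unchanged, at the cost of an extra class whose only job is to fail.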
}

bool ClientSslSocketFactory::implementsSecureTransport() const { return true; }

void ClientSslSocketFactory::onAddOrUpdateSecret() {
ENVOY_LOG(debug, "Secret is updated.");
ssl_ctx_ = manager_.createSslClientContext(stats_scope_, *config_);
Pretty positive this needs locking. I think you will be accessing ssl_ctx_ across threads to make transport sockets. Thus, you will need a R/W lock here and ultimately should move to using TLS but a R/W lock is OK for now? I think at a high level having a locking/threading analysis of this change would be useful. Can you add that to the description somewhere?
Good point! Thanks! Will add R/W lock.
I have added a lock to protect read/write to ssl_ctx_, and added comments.
I don't think you need a lock beyond what shared_ptr already provides; it seems unnecessary. You might, though, want to take a local copy of the shared_ptr when creating a socket, so that access to the member variable is always atomic.
@lizan Thanks for pointing this out. Yes, the lock is not necessary as we already use shared_ptr. The locks are removed.
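A minimal sketch of the local-copy idea, under the assumption that `ssl_ctx_` may be read on one thread while it is replaced on another. The reference count of a `shared_ptr` is thread-safe, but reading and reassigning the same `shared_ptr` object concurrently without synchronization is a data race, so this sketch uses the `std::atomic_load` / `std::atomic_store` overloads for `shared_ptr` from `<memory>` (available since C++11, deprecated in C++20). `Context`, `SocketFactory` and `certForNewSocket` are hypothetical names; if all reads and writes happen on a single thread, plain copies would be enough.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an SSL context.
struct Context {
  std::string cert;
};
using ContextSharedPtr = std::shared_ptr<const Context>;

// Hypothetical stand-in for a socket factory that owns the current context.
class SocketFactory {
public:
  explicit SocketFactory(std::string cert)
      : ssl_ctx_(std::make_shared<Context>(Context{std::move(cert)})) {}

  // Update path: build a fresh context and publish it atomically.
  void onAddOrUpdateSecret(std::string cert) {
    ContextSharedPtr fresh = std::make_shared<Context>(Context{std::move(cert)});
    std::atomic_store(&ssl_ctx_, std::move(fresh));
  }

  // Socket-creation path: take one local copy and use only that copy, so a
  // concurrent update cannot swap the context halfway through.
  std::string certForNewSocket() const {
    const ContextSharedPtr local = std::atomic_load(&ssl_ctx_);
    return local ? local->cert : std::string();
  }

private:
  ContextSharedPtr ssl_ctx_;
};
```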
}

bool ServerSslSocketFactory::implementsSecureTransport() const { return true; }

void ServerSslSocketFactory::onAddOrUpdateSecret() {
ENVOY_LOG(debug, "Secret is updated.");
ssl_ctx_ = manager_.createSslServerContext(stats_scope_, *config_, server_names_);
Locking. Is there a way to share this code in a base class somehow? Would prefer to not implement a bunch of this twice.
Lock is added, thanks! ServerSslSocketFactory::onAddOrUpdateSecret() and ClientSslSocketFactory::onAddOrUpdateSecret() are different: one owns a ServerContextSharedPtr ssl_ctx_ and the other owns a ClientContextSharedPtr ssl_ctx_, and they are created by calling different methods on the context manager. I would like to leave this method in the separate classes.
@@ -40,6 +40,8 @@
#include "common/upstream/outlier_detection_impl.h"
#include "common/upstream/resource_manager_impl.h"

#include "server/init_manager_impl.h"
Please move init_manager_impl out of server/ and into a new subdirectory in source/ called init/. Feel free to do this in a followup but please add a TODO.
Added TODO in init_manager_impl.h. Thanks.
*/
void onPreInitComplete();

/**
* Called by every concrete cluster after all Sds api targets registered at SDS init manager are
Remove references to SDS or make it clear that SDS is one of the things that init manager is used for. We are likely to add other things in the future.
Removed references to SDS from comment. Thanks.
Signed-off-by: JimmyCYJ <[email protected]>
Force-pushed from d7fe495 to d6eb302.
Some comments to get started. This is a pretty huge PR; FWIW, I strongly encourage shorter PRs for reviewability and velocity.
* Finds and returns a dynamic secret provider associated to SDS config. Create
* a new one if such provider does not exist.
*
* @param config_source a protobuf message object contains SDS config source.
Nit: s/contains/containing a/
Done. Thanks.
* a new one if such provider does not exist.
*
* @param config_source a protobuf message object contains SDS config source.
* @param config_name a name that uniquely refers to the SDS config source
Nit: full stop at end of sentence.
Done. Thanks.
* @param config_name a name that uniquely refers to the SDS config source
* @param secret_provider_context context that provides components for creating and initializing
* secret provider.
* @return the dynamic TLS secret provider.
Nit: @return TlsCertificateConfigProviderSharedPtr the dynamic TLS secret provider.
Done. Thanks.
@@ -18,7 +19,17 @@ template <class SecretType> class SecretProvider {
*/
virtual const SecretType* secret() const PURE;

// TODO(lizan): Add more methods for dynamic secret provider.
/**
* Add secret callback into secret provider.
Can you add some comments on thread safety? E.g. from which threads is it safe to call this, on which thread will the callback be invoked?
Comments are added. Thanks.
source/common/secret/sds_api.cc
Outdated
}
const auto& secret = resources[0];
MessageUtil::validate(secret);
if (!(secret.name() == sds_config_name_)) {
Nit: !=
Done.
void SecretManagerImpl::removeDynamicSecretProvider(const std::string& map_key) {
ENVOY_LOG(debug, "Unregister secret provider. hash key: {}", map_key);

ASSERT(dynamic_secret_providers_.erase(map_key) == 1);
This should be RELEASE_ASSERT; otherwise, in opt builds, this entire line disappears.
Thanks for pointing that out. Fixed.
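The pitfall is easy to reproduce in isolation. In the sketch below, `DEBUG_ONLY_ASSERT` and `ALWAYS_ASSERT` are hypothetical macros that only mimic the two behaviours (an assert compiled out in optimized builds versus one that is always evaluated); they are not Envoy's actual definitions.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <unordered_map>

// Hypothetical macro mimicking an assert that is compiled out: sizeof keeps
// its operand unevaluated, so any side effect in expr never runs.
#define DEBUG_ONLY_ASSERT(expr) ((void)sizeof(expr))

// Hypothetical macro mimicking an assert that is evaluated in every build.
#define ALWAYS_ASSERT(expr)                                                    \
  do {                                                                         \
    if (!(expr)) {                                                             \
      std::abort();                                                            \
    }                                                                          \
  } while (false)

using ProviderMap = std::unordered_map<std::string, int>;

// The erase lives inside the compiled-out assert, so nothing is erased.
std::size_t eraseInsideDebugAssert(ProviderMap& m, const std::string& key) {
  DEBUG_ONLY_ASSERT(m.erase(key) == 1);
  return m.size();
}

// The erase is always evaluated, so the entry is removed.
std::size_t eraseInsideAlwaysAssert(ProviderMap& m, const std::string& key) {
  ALWAYS_ASSERT(m.erase(key) == 1);
  return m.size();
}
```

An alternative that avoids the question entirely is to perform the `erase()` on its own line, store the result, and assert on the stored value.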
TlsCertificateConfigProviderSharedPtr SecretManagerImpl::findOrCreateDynamicSecretProvider(
const envoy::api::v2::core::ConfigSource& sds_config_source, const std::string& config_name,
Server::Configuration::TransportSocketFactoryContext& secret_provider_context) {
std::string map_key = std::to_string(MessageUtil::hash(sds_config_source)) + config_name;
Nit: const std::string
Done.
ASSERT(secret_provider_context.initManager() != nullptr);

std::function<void()> unregister_secret_provider = [map_key, this]() {
removeDynamicSecretProvider(map_key);
Maybe add some lifetime comments here. @ambuc recently hit issues where these kinds of callbacks were invoked and the equivalent of SdsApi outlived the equivalent of SecretManagerImpl (this was in ListenerManager).
The lifetime issue has been captured by integration tests. We have adjusted the order of secret manager and other components to make sure SdsApi objects are destroyed before SecretManagerImpl. Comments are added. Thanks.
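One way to picture the ordering fix is with plain member declaration order: C++ destroys members in reverse order of declaration, so the object holding the callback target must be declared first. This is only a sketch; `Manager`, `Api`, `Owner` and `destructionLog` are hypothetical stand-ins for SecretManagerImpl, SdsApi and whatever owns both, and the PR itself may arrange the ordering differently.

```cpp
#include <string>
#include <vector>

// Records destruction order so it can be inspected afterwards.
std::vector<std::string>& destructionLog() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-in for the secret manager.
struct Manager {
  ~Manager() { destructionLog().push_back("manager"); }
  void removeDynamicSecretProvider() { ++removed; }
  int removed{0};
};

// Hypothetical stand-in for the SDS API object: its destructor calls back
// into the manager, so the manager must still be alive at that point.
struct Api {
  explicit Api(Manager& m) : manager(m) {}
  ~Api() {
    manager.removeDynamicSecretProvider();
    destructionLog().push_back("api");
  }
  Manager& manager;
};

// Members are destroyed in reverse declaration order: api (declared last)
// goes first, while manager is still valid.
struct Owner {
  Manager manager;
  Api api{manager};
};
```

Swapping the two member declarations in `Owner` would make the `Api` destructor touch an already destroyed `Manager`, which is the kind of bug the reviewer describes.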
source/common/ssl/ssl_socket.cc
Outdated
return std::make_unique<Ssl::SslSocket>(ssl_ctx_, Ssl::InitialState::Client);
// SDS would update ssl_ctx_ when Envoy is running.
// Need a read lock to let multiple threads gain read access to ssl_ctx_.
std::shared_lock<std::shared_timed_mutex> lock(ssl_ctx_mutex_);
I haven't looked into this yet, but I'd like to better understand why this is necessary; usually in Envoy, shared-memory concurrency is not needed, and other mechanisms like TLS posting are the right solution.
@JimmyCYJ also, a personal plea to avoid force pushes; general GH etiquette avoids this to make reviewers' lives easier (so they can just look at the delta between each PR).
Implement SDS API and dummy socket, and they are not in use. This is split from PR #4176.
Risk Level: Low
Testing: Unit tests
Docs Changes: None
Fixes #1194
Signed-off-by: JimmyCYJ <[email protected]>
This PR didn't update correctly. I am going to close this one and create a new PR.
I have created PR #4256, please take a look. Thanks!
Description: Implement SDS api that fetches secrets from remote SDS server. Secrets are stored in Secret Provider. Listeners and Clusters are updated when secrets are received.
Risk Level: Low
Testing: Unit tests and integration tests
Fixes #1194
Signed-off-by: Jimmy Chen [email protected]