resource manager priorities (#46)
mattklein123 authored Aug 31, 2016
1 parent a9fbf89 commit 2bf238b
Showing 34 changed files with 356 additions and 140 deletions.
17 changes: 13 additions & 4 deletions configs/envoy_service_to_service.template.json
Original file line number Diff line number Diff line change
Expand Up @@ -334,9 +334,14 @@
"connect_timeout_ms": 250,
"type": "static",
"lb_type": "round_robin",
"max_pending_requests": 30, {# Apply back pressure quickly at the local host level. NOTE: This
only is applicable with the HTTP/1.1 connection pool. #}
"max_connections": 100,
"circuit_breakers": {
"default": {
"max_pending_requests": 30, {# Apply back pressure quickly at the local host level. NOTE:
This only is applicable with the HTTP/1.1 connection
pool. #}
"max_connections": 100
}
},
"hosts": [{"url": "tcp://127.0.0.1:8080"}]

},
Expand All @@ -346,7 +351,11 @@
"type": "static",
"lb_type": "round_robin",
"features": "http2",
"max_requests": 200,
"circuit_breakers": {
"default": {
"max_requests": 200
}
},
"hosts": [{"url": "tcp://127.0.0.1:8081"}]
},
{
Expand Down
6 changes: 5 additions & 1 deletion configs/routing_helper.template.json
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,11 @@
"http_codec_options": "no_compression",
"service_name": "{{ service }}",
{% if 'max_requests' in options %}
"max_requests": {{ options['max_requests'] }},
"circuit_breakers": {
"default": {
"max_requests": {{ options['max_requests'] }}
}
},
{% endif %}
"health_check": {
"type": "http",
Expand Down
28 changes: 5 additions & 23 deletions docs/configuration/cluster_manager/cluster.rst
Original file line number Diff line number Diff line change
Expand Up @@ -14,10 +14,7 @@ Cluster
"service_name": "...",
"health_check": "{...}",
"max_requests_per_connection": "...",
"max_connections": "...",
"max_pending_requests": "...",
"max_requests": "...",
"max_retries": "...",
"circuit_breakers": "{...}",
"ssl_context": "{...}",
"features": "...",
"http_codec_options": "...",
Expand Down Expand Up @@ -96,25 +93,9 @@ max_requests_per_connection
parameter is respected by both the HTTP/1.1 and HTTP/2 connection pool implementations. If not
specified, there is no limit. Setting this parameter to 1 will effectively disable keep alive.

max_connections
*(optional, integer)* The maximum number of connections that Envoy will make to the upstream
cluster. If not specified, the default is 1024. See the :ref:`circuit breaking overview
<arch_overview_circuit_break>` for more information.

max_pending_requests
*(optional, integer)* The maximum number of pending requests that Envoy will allow to the upstream
cluster. If not specified, the default is 1024. See the :ref:`circuit breaking overview
<arch_overview_circuit_break>` for more information.

max_requests
*(optional, integer)* The maximum number of parallel requests that Envoy will make to the upstream
cluster. If not specified, the default is 1024. See the :ref:`circuit breaking overview
<arch_overview_circuit_break>` for more information.

max_retries
*(optional, integer)* The maximum number of parallel retries that Envoy will allow to the upstream
cluster. If not specified, the default is 3. See the :ref:`circuit breaking overview
<arch_overview_circuit_break>` for more information.
:ref:`circuit_breakers <config_cluster_manager_cluster_circuit_breakers>`
*(optional, object)* Optional :ref:`circuit breaking <arch_overview_circuit_break>` settings
for the cluster.

:ref:`ssl_context <config_cluster_manager_cluster_ssl>`
*(optional, object)* The TLS configuration for connections to the upstream cluster. If no TLS
Expand Down Expand Up @@ -150,6 +131,7 @@ alt_stat_name
:hidden:

cluster_hc
cluster_circuit_breakers
cluster_ssl
cluster_stats
cluster_runtime
55 changes: 55 additions & 0 deletions docs/configuration/cluster_manager/cluster_circuit_breakers.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
.. _config_cluster_manager_cluster_circuit_breakers:

Circuit breakers
================

Circuit breaking :ref:`architecture overview <arch_overview_circuit_break>`.

Circuit breaking settings can be specified individually for each defined priority. How the
different priorities are used is documented in the sections of the configuration guide that use
them.

.. code-block:: json

  {
    "default": "{...}",
    "high": "{...}"
  }

default
*(optional, object)* Settings object for default priority.

high
*(optional, object)* Settings object for high priority.

Per priority settings
---------------------

.. code-block:: json

  {
    "max_connections": "...",
    "max_pending_requests": "...",
    "max_requests": "...",
    "max_retries": "..."
  }

max_connections
*(optional, integer)* The maximum number of connections that Envoy will make to the upstream
cluster. If not specified, the default is 1024. See the :ref:`circuit breaking overview
<arch_overview_circuit_break>` for more information.

max_pending_requests
*(optional, integer)* The maximum number of pending requests that Envoy will allow to the upstream
cluster. If not specified, the default is 1024. See the :ref:`circuit breaking overview
<arch_overview_circuit_break>` for more information.

max_requests
*(optional, integer)* The maximum number of parallel requests that Envoy will make to the upstream
cluster. If not specified, the default is 1024. See the :ref:`circuit breaking overview
<arch_overview_circuit_break>` for more information.

max_retries
*(optional, integer)* The maximum number of parallel retries that Envoy will allow to the upstream
cluster. If not specified, the default is 3. See the :ref:`circuit breaking overview
<arch_overview_circuit_break>` for more information.
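As a reading aid for the per priority settings documented above, here is a minimal standalone C++ sketch (not Envoy's implementation) of how limits with the documented defaults (1024, 1024, 1024 and 3) might be represented and looked up. `CircuitBreakerLimits`, `ClusterCircuitBreakers` and `limitsFor` are hypothetical names introduced only for illustration, and the fallback to defaults for an omitted priority is an assumption based on each setting being optional.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical container for the four documented per priority settings.
// The member defaults mirror the documentation: 1024 for connections,
// pending requests and requests, and 3 for retries.
struct CircuitBreakerLimits {
  std::uint64_t max_connections = 1024;
  std::uint64_t max_pending_requests = 1024;
  std::uint64_t max_requests = 1024;
  std::uint64_t max_retries = 3;
};

// Hypothetical per cluster view: only priorities that appear in the
// "circuit_breakers" object ("default" and/or "high") have an entry.
using ClusterCircuitBreakers = std::map<std::string, CircuitBreakerLimits>;

// Returns the configured limits for a priority, or the documented defaults
// when that priority was omitted (an assumption of this sketch).
inline CircuitBreakerLimits limitsFor(const ClusterCircuitBreakers& config,
                                      const std::string& priority) {
  assert(priority == "default" || priority == "high");
  auto it = config.find(priority);
  return it != config.end() ? it->second : CircuitBreakerLimits{};
}
```

With a configuration that sets only `default` (for example `max_pending_requests` of 30 and `max_connections` of 100, as in the service to service template in this commit), `limitsFor(config, "high")` would yield the documented defaults in this sketch.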
18 changes: 12 additions & 6 deletions docs/intro/arch_overview/circuit_breaking.rst
Original file line number Diff line number Diff line change
Expand Up @@ -15,14 +15,20 @@ configure and code each application independently. Envoy supports various types
* **Cluster maximum pending requests**: The maximum number of requests that will be queued while
waiting for a ready connection pool connection. In practice this is only applicable to HTTP/1.1
clusters since HTTP/2 connection pools never queue requests. HTTP/2 requests are multiplexed
immediately.
immediately. If this circuit breaker overflows, the :ref:`upstream_rq_pending_overflow
<config_cluster_manager_cluster_stats>` counter for the cluster will increment.
* **Cluster maximum requests**: The maximum number of requests that can be outstanding to all hosts
in a cluster at any given time.
in a cluster at any given time. In practice this is applicable to HTTP/2 clusters since HTTP/1.1
clusters are governed by the maximum connections circuit breaker. If this circuit breaker
overflows, the :ref:`upstream_rq_pending_overflow <config_cluster_manager_cluster_stats>` counter
for the cluster will increment.
* **Cluster maximum active retries**: The maximum number of retries that can be outstanding to all
hosts in a cluster at any given time. In general we recommend aggressively circuit breaking
retries so that retries for sporadic failures are allowed but the overall retry volume cannot
explode and cause large scale cascading failure.
explode and cause large scale cascading failure. If this circuit breaker overflows, the
:ref:`upstream_rq_retry_overflow <config_cluster_manager_cluster_stats>` counter for the cluster
will increment.

Each circuit breaking limit is :ref:`configurable <config_cluster_manager_cluster>` and tracked on a
per upstream cluster basis. This allows different components of the distributed system to be tuned
independently and have different limits.
Each circuit breaking limit is :ref:`configurable <config_cluster_manager_cluster_circuit_breakers>`
and tracked on a per upstream cluster and per priority basis. This allows different components of
the distributed system to be tuned independently and have different limits.
3 changes: 2 additions & 1 deletion include/envoy/upstream/cluster_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,8 @@ class ClusterManager {
*
* Can return nullptr if there is no host available in the cluster.
*/
virtual Http::ConnectionPool::Instance* httpConnPoolForCluster(const std::string& cluster) PURE;
virtual Http::ConnectionPool::Instance* httpConnPoolForCluster(const std::string& cluster,
ResourcePriority priority) PURE;

/**
* Allocate a load balanced TCP connection for a cluster. The created connection is already
Expand Down
7 changes: 7 additions & 0 deletions include/envoy/upstream/resource_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,13 @@

namespace Upstream {

/**
* Resource priority classes. The parallel NumResourcePriorities constant allows defining fixed
* arrays for each priority, but does not pollute the enum.
*/
enum class ResourcePriority { Default, High };
const size_t NumResourcePriorities = 2;

/**
* An individual resource tracked by the resource manager.
*/
Expand Down
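The comment on `ResourcePriority` says the parallel `NumResourcePriorities` constant allows fixed arrays per priority. A minimal standalone sketch of that idea follows. `SketchResource` and `SketchCluster` are hypothetical names, not Envoy classes, and the real `ResourceManager` interface is not shown in this diff; only the `canCreate()`/`inc()`/`dec()` call names are borrowed from the connection pool hunks.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class ResourcePriority { Default, High };
const std::size_t NumResourcePriorities = 2;

// Hypothetical counter with a fixed limit.
class SketchResource {
public:
  explicit SketchResource(std::uint64_t max) : max_(max) {}

  bool canCreate() const { return current_ < max_; }
  void inc() { ++current_; }
  void dec() {
    assert(current_ > 0); // dec() must pair with an earlier inc().
    --current_;
  }

private:
  std::uint64_t max_;
  std::uint64_t current_ = 0;
};

// Hypothetical cluster that keeps one request counter per priority in a
// fixed-size array indexed by the enum value.
class SketchCluster {
public:
  SketchCluster(std::uint64_t default_max, std::uint64_t high_max)
      : requests_{{SketchResource(default_max), SketchResource(high_max)}} {}

  SketchResource& requests(ResourcePriority priority) {
    return requests_[static_cast<std::size_t>(priority)];
  }

private:
  std::array<SketchResource, NumResourcePriorities> requests_;
};
```

Because each priority has its own counter, exhausting the `Default` limit in this sketch leaves `High` unaffected, which is the isolation the per priority settings are meant to provide.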
5 changes: 3 additions & 2 deletions include/envoy/upstream/upstream.h
Original file line number Diff line number Diff line change
Expand Up @@ -233,9 +233,10 @@ class Cluster : public virtual HostSet {
virtual const std::string& name() const PURE;

/**
* @return ResourceManager& the resource manager to use by proxy agents for this cluster.
* @return ResourceManager& the resource manager to use by proxy agents for this cluster (at
* a particular priority).
*/
virtual ResourceManager& resourceManager() const PURE;
virtual ResourceManager& resourceManager(ResourcePriority priority) const PURE;

/**
* Shutdown the cluster prior to destroying connection pools and other thread local data.
Expand Down
4 changes: 3 additions & 1 deletion source/common/http/async_client_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,9 @@ AsyncClientImpl::~AsyncClientImpl() { ASSERT(active_requests_.empty()); }

AsyncClient::Request* AsyncClientImpl::send(MessagePtr&& request, AsyncClient::Callbacks& callbacks,
const Optional<std::chrono::milliseconds>& timeout) {
ConnectionPool::Instance* conn_pool = factory_.connPool();
// For now we use default priority for all requests. We could eventually expose priority out of
// send if needed.
ConnectionPool::Instance* conn_pool = factory_.connPool(Upstream::ResourcePriority::Default);
if (!conn_pool) {
callbacks.onFailure(AsyncClient::FailureReason::Reset);
return nullptr;
Expand Down
2 changes: 1 addition & 1 deletion source/common/http/async_client_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ class AsyncClientConnPoolFactory {
/**
* Return a connection pool or nullptr if there is no healthy upstream host.
*/
virtual ConnectionPool::Instance* connPool() PURE;
virtual ConnectionPool::Instance* connPool(Upstream::ResourcePriority priority) PURE;
};

class AsyncRequestImpl;
Expand Down
12 changes: 6 additions & 6 deletions source/common/http/http1/conn_pool.cc
Original file line number Diff line number Diff line change
Expand Up @@ -71,10 +71,10 @@ ConnectionPool::Cancellable* ConnPoolImpl::newStream(StreamDecoder& response_dec
return nullptr;
}

if (host_->cluster().resourceManager().pendingRequests().canCreate()) {
if (host_->cluster().resourceManager(priority_).pendingRequests().canCreate()) {
// If we have no connections at all, make one no matter what so we don't starve.
if ((ready_clients_.size() == 0 && busy_clients_.size() == 0) ||
host_->cluster().resourceManager().connections().canCreate()) {
host_->cluster().resourceManager(priority_).connections().canCreate()) {
createNewConnection();
}

Expand Down Expand Up @@ -243,12 +243,12 @@ ConnPoolImpl::PendingRequest::PendingRequest(ConnPoolImpl& parent, StreamDecoder
: parent_(parent), decoder_(decoder), callbacks_(callbacks) {
parent_.host_->cluster().stats().upstream_rq_pending_total_.inc();
parent_.host_->cluster().stats().upstream_rq_pending_active_.inc();
parent_.host_->cluster().resourceManager().pendingRequests().inc();
parent_.host_->cluster().resourceManager(parent_.priority_).pendingRequests().inc();
}

ConnPoolImpl::PendingRequest::~PendingRequest() {
parent_.host_->cluster().stats().upstream_rq_pending_active_.dec();
parent_.host_->cluster().resourceManager().pendingRequests().dec();
parent_.host_->cluster().resourceManager(parent_.priority_).pendingRequests().dec();
}

ConnPoolImpl::ActiveClient::ActiveClient(ConnPoolImpl& parent)
Expand All @@ -268,14 +268,14 @@ ConnPoolImpl::ActiveClient::ActiveClient(ConnPoolImpl& parent)
parent_.host_->stats().cx_active_.inc();
conn_length_ = parent_.host_->cluster().stats().upstream_cx_length_ms_.allocateSpan();
connect_timer_->enableTimer(parent_.host_->cluster().connectTimeout());
parent_.host_->cluster().resourceManager().connections().inc();
parent_.host_->cluster().resourceManager(parent_.priority_).connections().inc();
}

ConnPoolImpl::ActiveClient::~ActiveClient() {
parent_.host_->cluster().stats().upstream_cx_active_.dec();
parent_.host_->stats().cx_active_.dec();
conn_length_->complete();
parent_.host_->cluster().resourceManager().connections().dec();
parent_.host_->cluster().resourceManager(parent_.priority_).connections().dec();
}

void ConnPoolImpl::ActiveClient::onBufferChange(Network::ConnectionBufferType type, uint64_t,
Expand Down
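The `PendingRequest` and `ActiveClient` hunks above pair an `inc()` in the constructor with a `dec()` in the destructor, both against the resource manager for the pool's priority. The following standalone sketch isolates that constructor/destructor pairing. `SketchCounter` and `SketchPendingRequest` are hypothetical stand-ins, not the Envoy types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one tracked resource (e.g. pending requests)
// at a single priority.
struct SketchCounter {
  std::uint64_t current = 0;

  void inc() { ++current; }
  void dec() {
    assert(current > 0); // dec() must pair with an earlier inc().
    --current;
  }
};

// Charges the counter it was constructed with for as long as the object
// lives, so the same counter is always incremented and decremented.
class SketchPendingRequest {
public:
  explicit SketchPendingRequest(SketchCounter& counter) : counter_(counter) { counter_.inc(); }
  ~SketchPendingRequest() { counter_.dec(); }

  // Copying would double-decrement the counter, so it is disabled.
  SketchPendingRequest(const SketchPendingRequest&) = delete;
  SketchPendingRequest& operator=(const SketchPendingRequest&) = delete;

private:
  SketchCounter& counter_;
};
```

Binding the counter once at construction mirrors why the pool stores `priority_` as a member: the increment and the matching decrement then cannot end up on different priorities.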
11 changes: 7 additions & 4 deletions source/common/http/http1/conn_pool.h
Original file line number Diff line number Diff line change
Expand Up @@ -22,8 +22,9 @@ namespace Http1 {
*/
class ConnPoolImpl : Logger::Loggable<Logger::Id::pool>, public ConnectionPool::Instance {
public:
ConnPoolImpl(Event::Dispatcher& dispatcher, Upstream::ConstHostPtr host, Stats::Store& store)
: dispatcher_(dispatcher), host_(host), store_(store) {}
ConnPoolImpl(Event::Dispatcher& dispatcher, Upstream::ConstHostPtr host, Stats::Store& store,
Upstream::ResourcePriority priority)
: dispatcher_(dispatcher), host_(host), store_(store), priority_(priority) {}

~ConnPoolImpl();

Expand Down Expand Up @@ -127,15 +128,17 @@ class ConnPoolImpl : Logger::Loggable<Logger::Id::pool>, public ConnectionPool::
std::list<PendingRequestPtr> pending_requests_;
Stats::Store& store_;
std::list<DrainedCb> drained_callbacks_;
Upstream::ResourcePriority priority_;
};

/**
* Production implementation of the ConnPoolImpl.
*/
class ConnPoolImplProd : public ConnPoolImpl {
public:
ConnPoolImplProd(Event::Dispatcher& dispatcher, Upstream::ConstHostPtr host, Stats::Store& store)
: ConnPoolImpl(dispatcher, host, store) {}
ConnPoolImplProd(Event::Dispatcher& dispatcher, Upstream::ConstHostPtr host, Stats::Store& store,
Upstream::ResourcePriority priority)
: ConnPoolImpl(dispatcher, host, store, priority) {}

// ConnPoolImpl
CodecClientPtr createCodecClient(Upstream::Host::CreateConnectionData& data) override;
Expand Down
10 changes: 5 additions & 5 deletions source/common/http/http2/conn_pool.cc
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,8 @@ namespace Http {
namespace Http2 {

ConnPoolImpl::ConnPoolImpl(Event::Dispatcher& dispatcher, Upstream::ConstHostPtr host,
Stats::Store& store)
: dispatcher_(dispatcher), host_(host), stats_store_(store) {}
Stats::Store& store, Upstream::ResourcePriority priority)
: dispatcher_(dispatcher), host_(host), stats_store_(store), priority_(priority) {}

ConnPoolImpl::~ConnPoolImpl() {
if (primary_client_) {
Expand Down Expand Up @@ -81,7 +81,7 @@ ConnectionPool::Cancellable* ConnPoolImpl::newStream(Http::StreamDecoder& respon
}

if (primary_client_->client_->numActiveRequests() >= maxConcurrentStreams() ||
!host_->cluster().resourceManager().requests().canCreate()) {
!host_->cluster().resourceManager(priority_).requests().canCreate()) {
log_debug("max requests overflow");
callbacks.onPoolFailure(ConnectionPool::PoolFailureReason::Overflow, nullptr);
host_->cluster().stats().upstream_rq_pending_overflow_.inc();
Expand All @@ -92,7 +92,7 @@ ConnectionPool::Cancellable* ConnPoolImpl::newStream(Http::StreamDecoder& respon
host_->stats().rq_active_.inc();
host_->cluster().stats().upstream_rq_total_.inc();
host_->cluster().stats().upstream_rq_active_.inc();
host_->cluster().resourceManager().requests().inc();
host_->cluster().resourceManager(priority_).requests().inc();
callbacks.onPoolReady(primary_client_->client_->newStream(response_decoder),
primary_client_->real_host_description_);
}
Expand Down Expand Up @@ -176,7 +176,7 @@ void ConnPoolImpl::onStreamDestroy(ActiveClient& client) {
client.client_->numActiveRequests());
host_->stats().rq_active_.dec();
host_->cluster().stats().upstream_rq_active_.dec();
host_->cluster().resourceManager().requests().dec();
host_->cluster().resourceManager(priority_).requests().dec();
if (&client == draining_client_.get() && client.client_->numActiveRequests() == 0) {
// Close out the draining client if we no longer have active requests.
client.client_->close();
Expand Down
4 changes: 3 additions & 1 deletion source/common/http/http2/conn_pool.h
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,8 @@ namespace Http2 {
*/
class ConnPoolImpl : Logger::Loggable<Logger::Id::pool>, public ConnectionPool::Instance {
public:
ConnPoolImpl(Event::Dispatcher& dispatcher, Upstream::ConstHostPtr host, Stats::Store& store);
ConnPoolImpl(Event::Dispatcher& dispatcher, Upstream::ConstHostPtr host, Stats::Store& store,
Upstream::ResourcePriority priority);
~ConnPoolImpl();

// Http::ConnectionPool::Instance
Expand Down Expand Up @@ -76,6 +77,7 @@ class ConnPoolImpl : Logger::Loggable<Logger::Id::pool>, public ConnectionPool::
ActiveClientPtr primary_client_;
ActiveClientPtr draining_client_;
std::list<DrainedCb> drained_callbacks_;
Upstream::ResourcePriority priority_;
};

/**
Expand Down
12 changes: 11 additions & 1 deletion source/common/json/json_loader.cc
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,12 @@ StringLoader::StringLoader(const std::string& json) {

StringLoader::~StringLoader() { json_decref(json_); }

Object::EmptyObject Object::empty_;

Object::EmptyObject::EmptyObject() : json_(json_object()) {}

Object::EmptyObject::~EmptyObject() { json_decref(json_); }

std::vector<Object> Object::asObjectArray() const {
if (!json_is_array(json_)) {
throw Exception(fmt::format("'{}' is not an array", name_));
Expand Down Expand Up @@ -73,8 +79,12 @@ int64_t Object::getInteger(const std::string& name, int64_t default_value) const
}
}

Object Object::getObject(const std::string& name) const {
Object Object::getObject(const std::string& name, bool allow_empty) const {
json_t* object = json_object_get(json_, name.c_str());
if (!object && allow_empty) {
object = empty_.json_;
}

if (!object) {
throw Exception(fmt::format("key '{}' missing in '{}'", name, name_));
}
Expand Down
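The `allow_empty` flag added to `getObject` lets a caller treat a missing key as an empty object backed by a single shared instance, so later integer lookups can fall through to their defaults. A simplified standalone sketch of that pattern, without the JSON library used by the real loader, is shown below. `SketchSettings`, `SketchConfig` and `emptyObject` are hypothetical names, and the `false` default for `allow_empty` is an assumption since the header is not part of this excerpt.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical flat object: integer values keyed by name.
using SketchSettings = std::map<std::string, long>;

// Returns the named integer, or default_value when the key is absent.
inline long getInteger(const SketchSettings& settings, const std::string& name,
                       long default_value) {
  auto it = settings.find(name);
  return it != settings.end() ? it->second : default_value;
}

// Hypothetical parent object holding named child objects.
class SketchConfig {
public:
  std::map<std::string, SketchSettings> objects;

  // Returns the named child. When it is missing, returns a shared empty
  // object if allow_empty is set, and throws otherwise.
  const SketchSettings& getObject(const std::string& name, bool allow_empty = false) const {
    assert(!name.empty());
    auto it = objects.find(name);
    if (it != objects.end()) {
      return it->second;
    }
    if (allow_empty) {
      return emptyObject();
    }
    throw std::runtime_error("key '" + name + "' missing");
  }

private:
  // One shared, immutable empty object used as the fallback.
  static const SketchSettings& emptyObject() {
    static const SketchSettings empty;
    return empty;
  }
};
```

In this sketch, a cluster that configures only `default` can still be asked for `high` with `allow_empty` set, and `getInteger` on the result returns whatever default the caller supplies.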