Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

outlier: tooling for success rate ejection #618

Merged
merged 17 commits into from
Mar 31, 2017
1 change: 1 addition & 0 deletions docs/operations/admin.rst
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ modify different aspects of the server.
weight, Integer, Load balancing weight (1-100)
zone, String, Service zone
canary, Boolean, Whether the host is a canary
success_rate, Double, Request success rate (0-100). -1 if there was not enough volume to calculate.

Host health status
A host is either healthy or unhealthy because of one or more different failing health states.
Expand Down
13 changes: 13 additions & 0 deletions include/envoy/upstream/upstream.h
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,19 @@ class Host : virtual public HostDescription {
* Set the current load balancing weight of the host, in the range 1-100.
*/
virtual void weight(uint32_t new_weight) PURE;

/**
* @return the success rate of the host in the last calculated interval, in the range -1-100.
* -1 means that the host did not have enough request volume to calculate success rate
* or the cluster did not have enough hosts to run through success rate outlier ejection.
*/
virtual double successRate() const PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both of these should come directly out of the outlier detector sink via here: https://github.com/lyft/envoy/blob/ce8eb8634f0c6aa24485e17fbd563ecc18d73982/include/envoy/upstream/host_description.h#L51

Don't add to host object.


/**
* Set the success rate of the host.
* @param new_success_rate the new success rate calculated for the host in the last interval.
*/
virtual void successRate(double new_success_rate) PURE;
};

typedef std::shared_ptr<const Host> HostConstSharedPtr;
Expand Down
8 changes: 7 additions & 1 deletion source/common/upstream/outlier_detection_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -286,7 +286,6 @@ void DetectorImpl::processSuccessRateEjections() {
valid_success_rate_hosts.reserve(host_sinks_.size());

for (const auto& host : host_sinks_) {
host.second->updateCurrentSuccessRateBucket();
// Don't do work if the host is already ejected.
if (!host.first->healthFlagGet(Host::HealthFlag::FAILED_OUTLIER_CHECK)) {
Optional<double> host_success_rate =
Expand All @@ -296,6 +295,7 @@ void DetectorImpl::processSuccessRateEjections() {
valid_success_rate_hosts.emplace_back(
HostSuccessRatePair(host.first, host_success_rate.value()));
success_rate_sum += host_success_rate.value();
host.first->successRate(host_success_rate.value());
}
}
}
Expand All @@ -317,6 +317,12 @@ void DetectorImpl::onIntervalTimer() {

for (auto host : host_sinks_) {
checkHostForUneject(host.first, host.second, now);

// Need to update the writer bucket to keep the data valid.
host.second->updateCurrentSuccessRateBucket();
// Refresh host success rate stat for the /clusters endpoint. If there is a new valid value it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comma after the first sentence in the if sentence.

// will get updated in processSuccessRateEjections().
host.first->successRate(-1);
}

processSuccessRateEjections();
Expand Down
3 changes: 3 additions & 0 deletions source/common/upstream/upstream_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,8 @@ class HostImpl : public HostDescriptionImpl,
bool healthy() const override { return !health_flags_; }
uint32_t weight() const override { return weight_; }
void weight(uint32_t new_weight) override;
double successRate() const override { return success_rate_; }
void successRate(double new_success_rate) override { success_rate_.store(new_success_rate); }

protected:
static Network::ClientConnectionPtr
Expand All @@ -96,6 +98,7 @@ class HostImpl : public HostDescriptionImpl,
private:
std::atomic<uint64_t> health_flags_{};
std::atomic<uint32_t> weight_;
std::atomic<double> success_rate_;
};

typedef std::shared_ptr<std::vector<HostSharedPtr>> HostVectorSharedPtr;
Expand Down
2 changes: 2 additions & 0 deletions source/server/http/admin.cc
Original file line number Diff line number Diff line change
Expand Up @@ -147,6 +147,8 @@ Http::Code AdminImpl::handlerClusters(const std::string&, Buffer::Instance& resp
host->address()->asString(), host->zone()));
response.add(fmt::format("{}::{}::canary::{}\n", cluster.second.get().info()->name(),
host->address()->asString(), host->canary()));
response.add(fmt::format("{}::{}::success_rate::{}\n", cluster.second.get().info()->name(),
host->address()->asString(), host->successRate()));
}
}

Expand Down
2 changes: 2 additions & 0 deletions test/mocks/upstream/host.h
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,8 @@ class MockHost : public Host {
MOCK_CONST_METHOD0(stats, HostStats&());
MOCK_CONST_METHOD0(weight, uint32_t());
MOCK_METHOD1(weight, void(uint32_t new_weight));
MOCK_CONST_METHOD0(successRate, double());
MOCK_METHOD1(successRate, void(double new_success_rate));
MOCK_CONST_METHOD0(zone, const std::string&());
};

Expand Down