Skip to content

Commit

Permalink
Load Shedding: TCP Vegas, request priority and request cohort
Browse files Browse the repository at this point in the history
The overload detector uses a TCP Vegas based algorithm,
as implemented by Netflix Concurrency Limiters.

Priority load shedding uses 5 priority levels and 128 cohorts.
A simple cubic function is used to determine whether current CPU
load has reached the threshold for current request.
  • Loading branch information
Ladicek committed May 22, 2024
1 parent 470fb26 commit 805d1f5
Show file tree
Hide file tree
Showing 21 changed files with 910 additions and 1 deletion.
10 changes: 10 additions & 0 deletions bom/application/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -1886,6 +1886,16 @@
<artifactId>quarkus-smallrye-openapi-common-deployment</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-load-shedding</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-load-shedding-deployment</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-vertx</artifactId>
Expand Down
13 changes: 13 additions & 0 deletions devtools/bom-descriptor-json/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -1305,6 +1305,19 @@
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-load-shedding</artifactId>
<version>${project.version}</version>
<type>pom</type>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>*</groupId>
<artifactId>*</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-logging-gelf</artifactId>
Expand Down
13 changes: 13 additions & 0 deletions docs/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -1321,6 +1321,19 @@
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-load-shedding-deployment</artifactId>
<version>${project.version}</version>
<type>pom</type>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>*</groupId>
<artifactId>*</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-logging-gelf-deployment</artifactId>
Expand Down
148 changes: 148 additions & 0 deletions docs/src/main/asciidoc/load-shedding-reference.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,148 @@
////
This guide is maintained in the main Quarkus repository
and pull requests should be submitted there:
https://github.com/quarkusio/quarkus/tree/main/docs/src/main/asciidoc
////
= Load Shedding reference guide

Check warning on line 6 in docs/src/main/asciidoc/load-shedding-reference.adoc

View workflow job for this annotation

GitHub Actions / Linting with Vale

[vale] reported by reviewdog 🐶 [Quarkus.Headings] Use sentence-style capitalization in 'Load Shedding reference guide'. Raw Output: {"message": "[Quarkus.Headings] Use sentence-style capitalization in 'Load Shedding reference guide'.", "location": {"path": "docs/src/main/asciidoc/load-shedding-reference.adoc", "range": {"start": {"line": 6, "column": 3}}}, "severity": "INFO"}
include::_attributes.adoc[]
:numbered:
:sectnums:
:categories: web
:topics: web,load-shedding
:extensions: io.quarkus:quarkus-load-shedding
:extension-status: experimental

include::{includes}/extension-status.adoc[]

Load shedding is the practice of detecting service overload and rejecting requests.

In Quarkus, the `quarkus-load-shedding` extension provides a load shedding mechanism.

== Use the Load Shedding extension

To use the load shedding extension, you need to add the `io.quarkus:quarkus-load-shedding` extension to your project:

Check warning on line 23 in docs/src/main/asciidoc/load-shedding-reference.adoc

View workflow job for this annotation

GitHub Actions / Linting with Vale

[vale] reported by reviewdog 🐶 [Quarkus.Fluff] Depending on the context, consider using 'Rewrite the sentence, or use 'must', instead of' rather than 'need to'. Raw Output: {"message": "[Quarkus.Fluff] Depending on the context, consider using 'Rewrite the sentence, or use 'must', instead of' rather than 'need to'.", "location": {"path": "docs/src/main/asciidoc/load-shedding-reference.adoc", "range": {"start": {"line": 23, "column": 41}}}, "severity": "INFO"}

[source,xml,role="primary asciidoc-tabs-target-sync-cli asciidoc-tabs-target-sync-maven"]
.pom.xml
----
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-load-shedding</artifactId>
</dependency>
----

[source,gradle,role="secondary asciidoc-tabs-target-sync-gradle"]
.build.gradle

Check warning on line 35 in docs/src/main/asciidoc/load-shedding-reference.adoc

View workflow job for this annotation

GitHub Actions / Linting with Vale

[vale] reported by reviewdog 🐶 [Quarkus.CaseSensitiveTerms] Use 'Gradle' rather than 'gradle'. Raw Output: {"message": "[Quarkus.CaseSensitiveTerms] Use 'Gradle' rather than 'gradle'.", "location": {"path": "docs/src/main/asciidoc/load-shedding-reference.adoc", "range": {"start": {"line": 35, "column": 8}}}, "severity": "INFO"}
----
implementation("io.quarkus:quarkus-load-shedding")
----

No configuration is required, though the possible configuration options are described below.

== The load shedding algorithm

The load shedding algorithm has 2 parts:

* overload detection
* priority load shedding (optional)

=== Overload detection

To detect whether the current service is overloaded, an adaptation of TCP Vegas is used.

The algorithm starts with 100 allowed concurrent requests.
For each request, it compares the number of current requests with the allowed limit and if the limit is exceeded, an overload situation is signalled.

If the limit is not exceeded, or if priority load shedding determines that the request should not be rejected (see below), the request is allowed.
When it finishes, its duration is compared with the lowest duration seen so far to estimate a queue size.
If the queue size is lower than _alpha_, the current limit is increased, but only up to a given maximum, by default 1000.
If the queue size is greater than _beta_, the current limit is decreased.
Otherwise, the current limit is kept intact.

Alpha and beta are computed by multiplying the configurable constants with a base 10 logarithm of the current limit.

After some number of requests, which can be modified by configuring the _probe_ factor, the lowest duration seen is reset to the last seen duration of a request.

=== Priority load shedding

If an overload situation is signalled, priority load shedding is invoked.

By default, priority load shedding is enabled, which means a request is only rejected if the current CPU load is high enough.
To determine whether a request should be rejected, 2 attributes are considered:

* request priority
* request cohort

There are 5 statically defined priorities and 128 cohorts, which amounts to 640 request groups in total.

After both priority and cohort are assigned to a request, a request group number is computed: `group = priority * max_cohorts + cohort`.
Then, the group number is compared to a simple cubic function of current CPU load, where `load` is a number between 0 and 1: `max_groups * (1 - load^3)`.
If the group number is higher, the request is rejected, otherwise it is allowed even in an overload situation.

If priority load shedding is disabled, all requests are rejected in an overload situation.

==== Customizing request priority

Priority is assigned by a `io.quarkus.load.shedding.RequestPrioritizer`.
There is 5 statically defined priorities in the `io.quarkus.load.shedding.RequestPriority` enum: `CRITICAL`, `IMPORTANT`, `NORMAL`, `BACKGROUND` and `DEGRADED`.
By default, if no request prioritizer applies, the priority is assumed to be `NORMAL`.

Check warning on line 88 in docs/src/main/asciidoc/load-shedding-reference.adoc

View workflow job for this annotation

GitHub Actions / Linting with Vale

[vale] reported by reviewdog 🐶 [Quarkus.Spelling] Use correct American English spelling. Did you really mean 'prioritizer'? Raw Output: {"message": "[Quarkus.Spelling] Use correct American English spelling. Did you really mean 'prioritizer'?", "location": {"path": "docs/src/main/asciidoc/load-shedding-reference.adoc", "range": {"start": {"line": 88, "column": 16}}}, "severity": "WARNING"}

There is one default prioritizer which assigns the priority of `CRITICAL` to requests to the non-application endpoints.

Check warning on line 90 in docs/src/main/asciidoc/load-shedding-reference.adoc

View workflow job for this annotation

GitHub Actions / Linting with Vale

[vale] reported by reviewdog 🐶 [Quarkus.Spelling] Use correct American English spelling. Did you really mean 'prioritizer'? Raw Output: {"message": "[Quarkus.Spelling] Use correct American English spelling. Did you really mean 'prioritizer'?", "location": {"path": "docs/src/main/asciidoc/load-shedding-reference.adoc", "range": {"start": {"line": 90, "column": 11}}}, "severity": "WARNING"}

Check warning on line 90 in docs/src/main/asciidoc/load-shedding-reference.adoc

View workflow job for this annotation

GitHub Actions / Linting with Vale

[vale] reported by reviewdog 🐶 [Quarkus.TermsSuggestions] Depending on the context, consider using ', which (non restrictive clause preceded by a comma)' or 'that (restrictive clause without a comma)' rather than 'which'. Raw Output: {"message": "[Quarkus.TermsSuggestions] Depending on the context, consider using ', which (non restrictive clause preceded by a comma)' or 'that (restrictive clause without a comma)' rather than 'which'.", "location": {"path": "docs/src/main/asciidoc/load-shedding-reference.adoc", "range": {"start": {"line": 90, "column": 22}}}, "severity": "INFO"}
It declares no `@Priority`.

It is possible to define custom implementations of the `RequestPrioritizer` interface.
The implementations must be CDI beans, otherwise they are ignored.
The CDI rules of typesafe resolution must be followed.

Check warning on line 95 in docs/src/main/asciidoc/load-shedding-reference.adoc

View workflow job for this annotation

GitHub Actions / Linting with Vale

[vale] reported by reviewdog 🐶 [Quarkus.Spelling] Use correct American English spelling. Did you really mean 'typesafe'? Raw Output: {"message": "[Quarkus.Spelling] Use correct American English spelling. Did you really mean 'typesafe'?", "location": {"path": "docs/src/main/asciidoc/load-shedding-reference.adoc", "range": {"start": {"line": 95, "column": 7}}}, "severity": "WARNING"}
That is, if multiple implementations exist with a different `@Priority` value, only the implementations with the highest priority are retained.

==== Customizing request cohort

Cohort is assigned by a `io.quarkus.load.shedding.RequestClassifier`.
There is 128 statically defined cohorts, with the lowest number being 1 and highest number being 128.
The classifier should return a number in this interval; if it does not, the number is adjusted automatically.

Check warning on line 102 in docs/src/main/asciidoc/load-shedding-reference.adoc

View workflow job for this annotation

GitHub Actions / Linting with Vale

[vale] reported by reviewdog 🐶 [Quarkus.SentenceLength] Try to keep sentences to an average of 32 words or fewer. Raw Output: {"message": "[Quarkus.SentenceLength] Try to keep sentences to an average of 32 words or fewer.", "location": {"path": "docs/src/main/asciidoc/load-shedding-reference.adoc", "range": {"start": {"line": 102, "column": 101}}}, "severity": "INFO"}

There is one default classifier which assigns a cohort based on a hash of the remote IP address and current time, such that an IP address changes its cohort roughly every 2.5 hours.

Check warning on line 104 in docs/src/main/asciidoc/load-shedding-reference.adoc

View workflow job for this annotation

GitHub Actions / Linting with Vale

[vale] reported by reviewdog 🐶 [Quarkus.TermsSuggestions] Depending on the context, consider using ', which (non restrictive clause preceded by a comma)' or 'that (restrictive clause without a comma)' rather than 'which'. Raw Output: {"message": "[Quarkus.TermsSuggestions] Depending on the context, consider using ', which (non restrictive clause preceded by a comma)' or 'that (restrictive clause without a comma)' rather than 'which'.", "location": {"path": "docs/src/main/asciidoc/load-shedding-reference.adoc", "range": {"start": {"line": 104, "column": 21}}}, "severity": "INFO"}
It declares no `@Priority`.

It is possible to define custom implementations of the `RequestClassifier` interface.
The implementations must be CDI beans, otherwise they are ignored.
The CDI rules of typesafe resolution must be followed.

Check warning on line 109 in docs/src/main/asciidoc/load-shedding-reference.adoc

View workflow job for this annotation

GitHub Actions / Linting with Vale

[vale] reported by reviewdog 🐶 [Quarkus.Spelling] Use correct American English spelling. Did you really mean 'typesafe'? Raw Output: {"message": "[Quarkus.Spelling] Use correct American English spelling. Did you really mean 'typesafe'?", "location": {"path": "docs/src/main/asciidoc/load-shedding-reference.adoc", "range": {"start": {"line": 109, "column": 7}}}, "severity": "WARNING"}
That is, if multiple implementations exist with a different `@Priority` value, only the implementations with the highest priority are retained.

== Limitations

The load shedding extension currently only applies to HTTP requests, and is heavily skewed towards request/response network interactions.
This means that gRPC, WebSocket and other kinds of streaming over HTTP are not supported.
Other "entrypoints" to Quarkus applications, such as messaging, are not supported either.

Further, the load shedding implementation is currently rather basic and not heavily tested in production.
Improvements may be necessary.

== Configuration reference

include::{generated-dir}/config/quarkus-load-shedding.adoc[opts=optional, leveloffset=+1]

== Further reading

Netflix Technology Blog:

* https://netflixtechblog.medium.com/performance-under-load-3e6fa9a60581[Performance Under Load]
* https://netflixtechblog.com/keeping-netflix-reliable-using-prioritized-load-shedding-6cc827b02f94[Keeping Netflix Reliable Using Prioritized Load Shedding]

Uber Engineering Blog:

* https://www.uber.com/blog/cinnamon-using-century-old-tech-to-build-a-mean-load-shedder/[Cinnamon: Using Century Old Tech to Build a Mean Load Shedder]
* https://www.uber.com/blog/pid-controller-for-cinnamon/[PID Controller for Cinnamon]
* https://www.uber.com/blog/cinnamon-auto-tuner-adaptive-concurrency-in-the-wild/[Cinnamon Auto-Tuner: Adaptive Concurrency in the Wild]

Amazon Builders' Library:

* https://aws.amazon.com/builders-library/using-load-shedding-to-avoid-overload/[Using load shedding to avoid overload]

Google Cloud Blog:

* https://cloud.google.com/blog/products/gcp/using-load-shedding-to-survive-a-success-disaster-cre-life-lessons[Using load shedding to survive a success disaster]

CodeReliant Blog:

* https://www.codereliant.io/load-shedding/[Load Shedding for High Traffic Systems]
68 changes: 68 additions & 0 deletions extensions/load-shedding/deployment/pom.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<parent>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-load-shedding-parent</artifactId>
<version>999-SNAPSHOT</version>
</parent>

<artifactId>quarkus-load-shedding-deployment</artifactId>

<name>Quarkus - Load Shedding - Deployment</name>

<dependencies>
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-load-shedding</artifactId>
</dependency>

<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-vertx-http-deployment</artifactId>
</dependency>

<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-rest-deployment</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-junit5-internal</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>io.rest-assured</groupId>
<artifactId>rest-assured</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.assertj</groupId>
<artifactId>assertj-core</artifactId>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<annotationProcessorPaths>
<path>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-extension-processor</artifactId>
<version>${project.version}</version>
</path>
</annotationProcessorPaths>
</configuration>
</plugin>
</plugins>
</build>
</project>
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
package io.quarkus.load.shedding.deployment;

import java.util.ArrayList;
import java.util.List;

import io.quarkus.arc.deployment.AdditionalBeanBuildItem;
import io.quarkus.deployment.annotations.BuildStep;
import io.quarkus.deployment.builditem.FeatureBuildItem;
import io.quarkus.load.shedding.runtime.HttpLoadShedding;
import io.quarkus.load.shedding.runtime.HttpRequestClassifier;
import io.quarkus.load.shedding.runtime.ManagementRequestPrioritizer;
import io.quarkus.load.shedding.runtime.OverloadDetector;
import io.quarkus.load.shedding.runtime.PriorityLoadShedding;

public class LoadSheddingProcessor {
private static final String FEATURE = "load-shedding";

@BuildStep
FeatureBuildItem feature() {
return new FeatureBuildItem(FEATURE);
}

@BuildStep
AdditionalBeanBuildItem beans() {
List<String> beans = new ArrayList<>();
beans.add(OverloadDetector.class.getName());
beans.add(HttpLoadShedding.class.getName());
beans.add(PriorityLoadShedding.class.getName());
beans.add(ManagementRequestPrioritizer.class.getName());
beans.add(HttpRequestClassifier.class.getName());

return AdditionalBeanBuildItem.builder().addBeanClasses(beans).build();
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
package io.quarkus.load.shedding;

import static io.restassured.RestAssured.when;
import static org.assertj.core.api.Assertions.assertThat;

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.RegisterExtension;

import io.quarkus.test.QuarkusUnitTest;

public class NaiveLoadSheddingTest {
private static final int NUM_THREADS = 20;
private static final int NUM_REQUESTS = 10;

@RegisterExtension
static final QuarkusUnitTest config = new QuarkusUnitTest()
.withApplicationRoot(jar -> jar.addClasses(MyResource.class))
.overrideConfigKey("quarkus.load-shedding.initial-limit", "5")
.overrideConfigKey("quarkus.load-shedding.max-limit", "10")
.overrideConfigKey("quarkus.load-shedding.priority.enabled", "false");

@Test
public void test() throws InterruptedException {
AtomicInteger numErrors = new AtomicInteger();
CountDownLatch begin = new CountDownLatch(1);
CountDownLatch end = new CountDownLatch(NUM_THREADS);
for (int i = 0; i < NUM_THREADS; i++) {
new Thread(() -> {
try {
begin.await();
for (int j = 0; j < NUM_REQUESTS; j++) {
int statusCode = when().get("/").then().extract().statusCode();
if (statusCode == 503) {
numErrors.incrementAndGet();
}
}
end.countDown();
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
}).start();
}

begin.countDown();
end.await();

assertThat(numErrors).hasValueGreaterThanOrEqualTo(100);
}

@Path("/")
public static class MyResource {
@GET
public String hello() throws InterruptedException {
Thread.sleep(100);
return "Hello, world!";
}
}
}
24 changes: 24 additions & 0 deletions extensions/load-shedding/pom.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<parent>
<artifactId>quarkus-extensions-parent</artifactId>
<groupId>io.quarkus</groupId>
<version>999-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

<artifactId>quarkus-load-shedding-parent</artifactId>
<packaging>pom</packaging>

<name>Quarkus - Load Shedding</name>

<modules>
<module>deployment</module>
<module>runtime</module>
</modules>

</project>
Loading

0 comments on commit 805d1f5

Please sign in to comment.