prioritized client updates #17354
Conversation
The allocrunner sends several updates to the server during the early lifecycle of an allocation and its tasks. Clients batch up allocation updates every 200ms, but experiments like the C2M challenge have shown that even with this batching, servers can be overwhelmed with client updates during high-volume deployments. Benchmarking done in #9451 has shown that client updates can easily represent ~70% of all Nomad Raft traffic. Each allocation sends many updates during its lifetime, but only those that change the `ClientStatus` field are critical for progressing a deployment or kicking off a reschedule to recover from failures.

Add a priority to the client allocation sync and update the `syncTicker` receiver so that we only send an update if there's a high-priority update waiting, or on every 5th tick. This means when there are no high-priority updates, the client will send updates at most every 1s instead of every 200ms. Benchmarks have shown this can reduce overall Raft traffic by 10%, as well as reduce client-to-server RPC traffic.

This changeset also switches from a channel-based collection of updates to a shared buffer, so as to split batching from sending and prevent backpressure onto the allocrunner when the RPC is slow. This doesn't have a major performance benefit in the benchmarks but makes the implementation of the prioritized update simpler.

Fixes: #9451
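For a concrete picture of the ticker change, here is a minimal, self-contained sketch of the throttled sync loop described above. The `pendingUpdates` type and its methods are illustrative stand-ins, not the actual Nomad identifiers.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// pendingUpdates is an illustrative shared buffer of queued alloc updates.
type pendingUpdates struct {
	mu     sync.Mutex
	count  int  // number of queued alloc updates
	urgent bool // set when a queued update changed ClientStatus
}

func (p *pendingUpdates) add(urgent bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.count++
	p.urgent = p.urgent || urgent
}

func (p *pendingUpdates) hasUrgent() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.urgent
}

// flush "sends" whatever is queued and clears the buffer.
func (p *pendingUpdates) flush() {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.count == 0 {
		return
	}
	fmt.Printf("sending %d update(s), urgent=%v\n", p.count, p.urgent)
	p.count, p.urgent = 0, false
}

func main() {
	updates := &pendingUpdates{}

	// Simulate one typical update now and an urgent one ~700ms later.
	go func() {
		updates.add(false)
		time.Sleep(700 * time.Millisecond)
		updates.add(true)
	}()

	syncTicker := time.NewTicker(200 * time.Millisecond)
	defer syncTicker.Stop()
	deadline := time.After(2 * time.Second)

	ticks := 0
	for {
		select {
		case <-syncTicker.C:
			ticks++
			// Send right away if an urgent update is waiting; otherwise only
			// flush on every 5th tick, i.e. at most once per second instead
			// of every 200ms.
			if updates.hasUrgent() || ticks%5 == 0 {
				updates.flush()
				ticks = 0
			}
		case <-deadline:
			return
		}
	}
}
```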
@@ -765,6 +765,24 @@ func TestClient_SaveRestoreState(t *testing.T) {
return fmt.Errorf("expected running client status, got %v",
ar.AllocState().ClientStatus)
}
Note for reviewers: this test was actually already racy but it's a very tight race before the changes in this PR. So these fixes for the test remove the race.
LGTM! I added a couple of very minor comments and a question, but nothing blocking.
client/client.go (outdated)
// filteredAcknowledgedUpdates returns a list of client alloc updates with the
// already-acknowledged updates removed, and the highest priority of any update.
If I understand correctly, the caller must hold at least a read lock on `c.allocUpdatesLock` when calling this function? If so, would it be worth adding a note to the function comment about this for future readers?
I'd go a step further and suggest extracting the `allocUpdates` map into its own little data structure and let it handle its own locking. There's quite a bit of implementation detail around what it's used for splattered all over these Client functions.
> If I understand correctly, the caller must hold at least a read lock on `c.allocUpdatesLock` when calling this function? If so, would it be worth adding a note to the function comment about this for future readers?

Well it's not a `RWMutex`, but yes. Really the only reason this is in its own function at all now is that we need to take the `c.allocLock` to read from the allocrunners, and having it in its own function lets us scope down that lock.
> I'd go a step further and suggest extracting the `allocUpdates` map into its own little data structure and let it handle its own locking. There's quite a bit of implementation detail around what it's used for splattered all over these Client functions.

So in other words, make `updatesToSync` and `filterAcknowledgedUpdates` methods on the `allocUpdates` data structure? We'd need to pass the `*Client` as a parameter so that we can access the allocrunners. But yeah that seems reasonable. Let me take a quick try at that.
@shoenig @jrasell I've refactored this by pulling the `allocUpdates` out into their own struct `pendingAllocUpdates` (so as not to conflict with the other struct named `allocUpdates`, which holds the updates received from the server!). I think this makes the whole locking situation a lot cleaner. Let me know what you think.
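As a rough sketch of what a self-locking container along those lines could look like (heavily simplified: a callback stands in for the real `*Client`/allocrunner access, and apart from `pendingAllocUpdates` and `updatesToSync` the names here are invented for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// allocState stands in for the allocation client state in this sketch.
type allocState struct {
	ID           string
	ClientStatus string
}

// pendingAllocUpdates owns both the buffered updates and their lock, so
// callers don't need to know which Client mutex guards the map.
type pendingAllocUpdates struct {
	mu      sync.Mutex
	updates map[string]*allocState // alloc ID -> most recent unsent state
}

func newPendingAllocUpdates() *pendingAllocUpdates {
	return &pendingAllocUpdates{updates: map[string]*allocState{}}
}

// put records the latest state for an allocation, overwriting any unsent one.
func (p *pendingAllocUpdates) put(a *allocState) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.updates[a.ID] = a
}

// updatesToSync drains the buffer, dropping updates the server has already
// acknowledged (as reported by the acked callback).
func (p *pendingAllocUpdates) updatesToSync(acked func(*allocState) bool) []*allocState {
	p.mu.Lock()
	defer p.mu.Unlock()

	var out []*allocState
	for id, a := range p.updates {
		if !acked(a) {
			out = append(out, a)
		}
		delete(p.updates, id)
	}
	return out
}

func main() {
	pending := newPendingAllocUpdates()
	pending.put(&allocState{ID: "alloc-1", ClientStatus: "running"})
	pending.put(&allocState{ID: "alloc-2", ClientStatus: "running"})

	// Pretend the server already acknowledged alloc-2's running state.
	toSend := pending.updatesToSync(func(a *allocState) bool {
		return a.ID == "alloc-2"
	})
	for _, a := range toSend {
		fmt.Println("would send:", a.ID, a.ClientStatus)
	}
}
```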
LGTM; just the one locking thing
case !last.DeploymentStatus.Equal(a.DeploymentStatus):
-	return false
+	return cstructs.AllocUpdatePriorityTypical
Technically deployments are gated by this field, so it could be considered critical since it can cause a scheduling decision...

...but nothing about deployments is concerned with sub-second latencies, so I think it's fine to leave this as `Typical`.

If you're in this code again, maybe add a comment pointing out that while deployment status changes are not urgent, they can affect scheduling, just not in a way where sub-second skew is significant.

If you really want to tidy things up, the PR description misses this too:

> Each allocation sends many updates during its lifetime, but only those that change the `ClientStatus` field are critical for progressing a deployment or kicking off a reschedule to recover from failures.

`DeploymentStatus` is critical for progressing a deployment as well.
> Technically deployments are gated by this field, so it could be considered critical since it can cause a scheduling decision...
>
> ...but nothing about deployments is concerned with sub-second latencies, so I think it's fine to leave this as `Typical`.

Somehow I missed that, so yeah, I would've set it to urgent based on the reasoning I had in the PR. I'll keep it (for now at least) and I'll add some commentary here around the reasoning for things.
ar.stateLock.RLock()
defer ar.stateLock.RUnlock()

last := ar.lastAcknowledgedState
if last == nil {
-	return false
+	return cstructs.AllocUpdatePriorityTypical
If we don't know what it was before, how can we assume the change is typical? Seems worth a comment, especially since all of the other code in this method must check from highest priority to lowest in order to ensure a change to a low-priority field doesn't demote an actually high-priority update.
You're right that we don't know for sure. In practice an allocation will never become healthy quickly enough that the first update we send is that update. That being said, we probably should account for allocations that quickly fail, because there are a bunch of things that can go unrecoverably wrong on the client before we ever hit the task runner, and it'd be nice to be able to send those failure states to the server more quickly.
In #17354 we made client updates prioritized to reduce client-to-server traffic. When the client has no previously-acknowledged update we assume that the update is of typical priority; although we don't know that for sure, in practice an allocation will never become healthy quickly enough that the first update we send is the one saying the alloc is healthy. But that doesn't account for allocations that quickly fail in an unrecoverable way because of allocrunner hook failures, and it'd be nice to be able to send those failure states to the server more quickly. This changeset does so and adds some extra comments on the reasoning behind priority.
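A hedged sketch of how that priority ordering (including the nil last-acknowledged-state case) could look. The struct fields and constants are simplified stand-ins for the real allocrunner code; only `AllocUpdatePriorityTypical` echoes a name from the diff above.

```go
package main

import "fmt"

type updatePriority int

const (
	priorityNone updatePriority = iota
	priorityTypical
	priorityUrgent
)

// allocClientState is a simplified stand-in for the alloc state fields that
// feed the priority decision.
type allocClientState struct {
	ClientStatus     string
	DeploymentStatus string
}

// clientUpdatePriority checks fields from highest to lowest priority so that a
// change to a low-priority field can never demote an urgent update.
func clientUpdatePriority(last, current *allocClientState) updatePriority {
	if last == nil {
		// Nothing acknowledged yet. A freshly failed allocation (e.g. an
		// allocrunner hook failing before the task ever starts) should reach
		// the server quickly so it can be rescheduled.
		if current.ClientStatus == "failed" {
			return priorityUrgent
		}
		return priorityTypical
	}

	switch {
	case last.ClientStatus != current.ClientStatus:
		// ClientStatus changes gate rescheduling and deployment progress.
		return priorityUrgent
	case last.DeploymentStatus != current.DeploymentStatus:
		// Deployment status changes affect scheduling decisions too, but
		// nothing about deployments cares about sub-second skew.
		return priorityTypical
	default:
		return priorityNone
	}
}

func main() {
	// First update ever seen, and the alloc already failed: send it urgently.
	fmt.Println(clientUpdatePriority(nil, &allocClientState{ClientStatus: "failed"}))

	// Only the deployment status changed: typical priority is fine.
	fmt.Println(clientUpdatePriority(
		&allocClientState{ClientStatus: "running"},
		&allocClientState{ClientStatus: "running", DeploymentStatus: "healthy"},
	))
}
```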