Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: use qps for hot ranges sorting #99716

Merged
merged 1 commit into from
Mar 30, 2023

Conversation

kvoli
Copy link
Collaborator

@kvoli kvoli commented Mar 27, 2023

We introduced CPU balancing by default in #97424. This had the side effect of changing the hot ranges api to return the hottest replicas by CPU, rather than QPS.

This patch updates the replica rankings struct to support tracking both by CPU and QPS simultaneously. The hot ranges API collects the top k by QPS and the store rebalancer collects depending on the setting of kv.allocator.load_based_rebalancing.objective, which is by default cpu.

Resolves: #99605

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@kvoli kvoli marked this pull request as ready for review March 27, 2023 20:35
@kvoli kvoli requested a review from a team as a code owner March 27, 2023 20:35
@kvoli kvoli requested a review from koorosh March 27, 2023 20:35
Copy link
Collaborator

@koorosh koorosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @kvoli)


pkg/kv/kvserver/store.go line 2692 at r1 (raw file):

	// and rebalancing. By default rebalancing uses CPU whilst the UI will use
	// QPS.
	rankingsAccumulator := NewReplicaAccumulator(load.CPU, load.Queries)

I'd suggest to rely on s.rebalanceObjManager.Objective().ToDimension() and provide it as an argument to NewReplicaAccumulator. It would guarantee that in case of new objectives added,
it doesn't need any changes.

NewReplicaAccumulator(s.rebalanceObjManager.Objective().ToDimension(), load.Queries)

Also it can introduce cases where s.rebalanceObjManager.Objective().ToDimension() is set to qps so NewReplicaAccumulator should eliminate provided objective's duplicates.
I'm not sure how critical so curious to know what do you think?


pkg/kv/kvserver/store_rebalancer_test.go line 505 at r1 (raw file):

func loadRanges(rr *ReplicaRankings, s *Store, ranges []testRange) {
	// Track both CPU and QPS by default, the ordering the consumer uses will
	// deepend on the current rebalance objective.

typo: "depend"

@kvoli kvoli requested a review from koorosh March 28, 2023 13:19
Copy link
Collaborator Author

@kvoli kvoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @koorosh)


pkg/kv/kvserver/store.go line 2692 at r1 (raw file):

Previously, koorosh (Andrii Vorobiov) wrote…

I'd suggest to rely on s.rebalanceObjManager.Objective().ToDimension() and provide it as an argument to NewReplicaAccumulator. It would guarantee that in case of new objectives added,
it doesn't need any changes.

NewReplicaAccumulator(s.rebalanceObjManager.Objective().ToDimension(), load.Queries)

Also it can introduce cases where s.rebalanceObjManager.Objective().ToDimension() is set to qps so NewReplicaAccumulator should eliminate provided objective's duplicates.
I'm not sure how critical so curious to know what do you think?

I'm not sure if that would work, this PR purposefully introduces tracking for multiple dimensions at once. I'd rather keep it explicit which dimensions we are tracking. Unless we want to move all tracking into the tenant struct for UI purposes.


pkg/kv/kvserver/store_rebalancer_test.go line 505 at r1 (raw file):

Previously, koorosh (Andrii Vorobiov) wrote…

typo: "depend"

Updated.

@kvoli kvoli force-pushed the 230327.heapk-hottest branch 3 times, most recently from e54df13 to d00eb94 Compare March 28, 2023 14:16
Copy link
Collaborator

@koorosh koorosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @kvoli)

@kvoli kvoli added the backport-23.1.x Flags PRs that need to be backported to 23.1 label Mar 29, 2023
@kvoli kvoli force-pushed the 230327.heapk-hottest branch 2 times, most recently from 38200dc to eb423f3 Compare March 29, 2023 14:17
We introduced CPU balancing by default in cockroachdb#97424. This had the side
effect of changing the hot ranges api to return the hottest replicas by
CPU, rather than QPS.

This patch updates the replica rankings struct to support tracking both
by CPU and QPS simultaneously. The hot ranges API collects the top k by
QPS and the store rebalancer collects depending on the setting of
`kv.allocator.load_based_rebalancing.objective`, which is by default
`cpu`.

Resolves: cockroachdb#99605

Release note (bug fix): The hot ranges UI page would show hot ranges by
CPU and not QPS, depending on the value of
`kv.allocator.load_based_rebalancing.objective` (default `cpu`). Now the
UI page will always collect based on QPS.
@kvoli kvoli force-pushed the 230327.heapk-hottest branch from eb423f3 to 41e3497 Compare March 30, 2023 13:14
@kvoli
Copy link
Collaborator Author

kvoli commented Mar 30, 2023

TYFTR

bors r=koorosh

@craig
Copy link
Contributor

craig bot commented Mar 30, 2023

Build succeeded:

@craig craig bot merged commit 7074da6 into cockroachdb:master Mar 30, 2023
craig bot pushed a commit that referenced this pull request Apr 20, 2023
99108: server: deflake test hot ranges response r=koorosh a=koorosh

Flaky tests for HotRanges api were caused after changing source
for getting stats per replica in `mapToHotReplicasInfo` func.
`mapToHotReplicasInfo` func calls `Repl().LoadStats()` that
internally relies on calculates average values per second
in current moment, but values could be recorded earlier.

Now, `mapToHotReplicasInfo` func relies on `RangeUsageInfo()` func
that keeps average stats at the time stats were recorded.

Related change to this fix: #99716

Release note: None

Fixes: #98619

Co-authored-by: Andrii Vorobiov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ui: Hot Ranges page doesn't show top ranges by QPS (instead by CPU usage)
3 participants