
kvserver: "auto create stats" job should use lower priority for IO #82508

Open
Tracked by #121779
tbg opened this issue Jun 7, 2022 · 5 comments
Labels
A-admission-control C-investigation Further steps needed to qualify. C-label will change. T-admission-control Admission Control

Comments

tbg (Member) commented on Jun 7, 2022

#81516 adds the admission/follower-overload/presplit-control roachtest. In this roachtest, a three-node cluster is set up so that two nodes hold all leases for a kv0 workload. At the time of writing, kv0 runs with 4 MB/s of goodput (a 400 writes/s rate limit at ~10 KB per write). On AWS (where this run took place), on a default EBS volume with a 125 MB/s throughput limit and 3000 IOPS (aggregate read+write), this is right at the limit. As a result, n1 and n2 get into mild IO overload territory.
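
To make "right at the limit" concrete, here is a small back-of-the-envelope sketch in Go. The rate limit, payload size, and EBS numbers come from the paragraph above; the combined amplification factor is an illustrative assumption, not a measured value from this run.

```go
package main

import "fmt"

func main() {
	const (
		writesPerSec  = 400.0    // kv0 rate limit from the issue
		bytesPerWrite = 10_000.0 // ~10 KB per write
		ebsLimitMBps  = 125.0    // default EBS volume throughput limit
		ebsIOPS       = 3000.0   // aggregate read+write IOPS limit
	)

	goodputMBps := writesPerSec * bytesPerWrite / 1e6 // 4 MB/s of user data

	// Hypothetical combined amplification factor (WAL writes, LSM write
	// amplification, compaction reads). The real factor depends on the LSM
	// shape and is not stated in the issue; 30x is illustrative only.
	const assumedAmplification = 30.0

	deviceMBps := goodputMBps * assumedAmplification
	fmt.Printf("goodput: %.1f MB/s\n", goodputMBps)
	fmt.Printf("device traffic at %.0fx amplification: ~%.0f MB/s (EBS limit %.0f MB/s, %.0f IOPS)\n",
		assumedAmplification, deviceMBps, ebsLimitMBps, ebsIOPS)
}
```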

It was observed that the nodes with leases consistently read more data from disk (green and orange are n1 and n2):

[image]

Read MB/s:

[image]

Zooming out, we see this pattern:

[image]

No splits are occurring at the time. However, the bumps match up well with these bumps in the raft log:

[image]

The raft log queue processes replicas at a fixed rate throughout these spikes, so it's unclear whether it is simply contending with other read activity or is itself the cause of the reads.

Overlaying `rate(rocksdb_compacted_bytes_read[$__rate_interval])` onto the bytes read shows that compactions are not the driver of the spiky reads on n1 and n2. Quite the opposite: whenever these spikes occur, compactions can't read as quickly as they would like to.
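
As a rough illustration of what that overlay computes, here is a minimal Go sketch that derives the non-compaction read rate from two counter samples. The struct, helper, and numbers are hypothetical; the actual analysis was done by overlaying the Prometheus rate query above onto the disk read graph.

```go
package main

import "fmt"

// sample holds two cumulative counters read at one instant: total bytes read
// from the device and bytes read by compactions (the latter is what the
// rocksdb_compacted_bytes_read counter above tracks).
type sample struct {
	unixSec             float64
	deviceReadBytes     float64
	compactionReadBytes float64
}

// nonCompactionReadRate mirrors the overlay described above: subtract the
// compaction read rate from the total device read rate, leaving the reads
// attributable to foreground work such as table scans.
func nonCompactionReadRate(a, b sample) float64 {
	dt := b.unixSec - a.unixSec
	totalRate := (b.deviceReadBytes - a.deviceReadBytes) / dt
	compactionRate := (b.compactionReadBytes - a.compactionReadBytes) / dt
	return totalRate - compactionRate
}

func main() {
	// Made-up samples, ten seconds apart, for illustration only.
	a := sample{unixSec: 0, deviceReadBytes: 0, compactionReadBytes: 0}
	b := sample{unixSec: 10, deviceReadBytes: 1.2e9, compactionReadBytes: 0.3e9}
	fmt.Printf("non-compaction reads: ~%.0f MB/s\n", nonCompactionReadRate(a, b)/1e6)
}
```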

Jira issue: CRDB-16492

Epic CRDB-37479

@tbg tbg added the C-investigation Further steps needed to qualify. C-label will change. label Jun 7, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Jun 7, 2022
nicktrav (Collaborator) commented on Jun 7, 2022

The spikes in read throughput correlate strongly with periods during which the auto create stats job is running. Note that we consume more read bandwidth, and since the device is maxed out, we start "stealing" throughput from writes. This is a look at the same time period you posted:

[screenshot]

Zooming out further, we see the same thing, though this time we have much more throughput to consume (we bumped the volume up to 250 MB/s). That said, we still see the increased reads stealing some write throughput (note the dips in the green line on the write throughput chart at the bottom when the read throughput increases):

[screenshot]

irfansharif (Contributor) commented:

What's left to do in this issue? Downgrade the admission priority level of requests originating from the "auto create stats" job?

@tbg tbg changed the title kvserver: investigate read rate discrepancy between leader and follower in kv0 kvserver: "auto create stats" job should use lower priority for IO Jun 27, 2022
nvanbenschoten (Member) commented:

This seems like a small, targeted change with minimal risk. Should we try to get it in for v22.2?

sumeerbhola (Collaborator) commented:

We should lower the priority and ensure that it is subject to elastic CPU AC. We don't currently have a way to share read bandwidth in AC.
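
For illustration only, here is a minimal Go sketch of the direction discussed above: tag the stats job's scans with a lower admission priority. The type and constant names loosely mirror CockroachDB's admissionpb package, but the values, the AdmissionHeader stand-in, and the headerForAutoStatsScan helper are hypothetical, not the actual implementation.

```go
package main

import "fmt"

// WorkPriority mirrors the idea of admission-control priority levels (compare
// the admissionpb package in the CockroachDB tree); the names and values here
// are illustrative stand-ins, not the real constants.
type WorkPriority int8

const (
	BulkNormalPri WorkPriority = -30 // background/bulk work, e.g. stats collection
	NormalPri     WorkPriority = 0   // regular foreground traffic
)

// AdmissionHeader is a hypothetical stand-in for the header that KV requests
// carry into admission control; only the field relevant here is shown.
type AdmissionHeader struct {
	Priority WorkPriority
}

// headerForAutoStatsScan sketches the proposal: scans issued by the
// "auto create stats" job are tagged with a lower priority so that, once
// admission control can account for reads/disk bandwidth, they queue behind
// foreground traffic instead of stealing throughput from it.
func headerForAutoStatsScan() AdmissionHeader {
	return AdmissionHeader{Priority: BulkNormalPri}
}

func main() {
	h := headerForAutoStatsScan()
	fmt.Printf("auto create stats scan priority: %d (foreground: %d)\n", h.Priority, NormalPri)
}
```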

@exalate-issue-sync exalate-issue-sync bot removed the E-quick-win Likely to be a quick win for someone experienced. label Dec 8, 2023
@aadityasondhi aadityasondhi added the T-admission-control Admission Control label Dec 13, 2023
@aadityasondhi aadityasondhi self-assigned this Jan 18, 2024
@exalate-issue-sync exalate-issue-sync bot removed the T-admission-control Admission Control label Jan 18, 2024
aadityasondhi added a commit to aadityasondhi/cockroach that referenced this issue Jan 19, 2024
aadityasondhi added a commit to aadityasondhi/cockroach that referenced this issue Jan 26, 2024
aadityasondhi added a commit to aadityasondhi/cockroach that referenced this issue Jan 29, 2024
This aims to simulate bandwidth induced overload by running a large kv0
workload on a 3 node cluster, while all the leases are owned by n1 and
n2.

Informs cockroachdb#82508.

Release note: None
aadityasondhi added a commit to aadityasondhi/cockroach that referenced this issue Feb 5, 2024
This aims to simulate read bandwidth induced overload by running a large
kv0 workload on a 3 node cluster, while all the leases are owned by n1 and
n2.

Informs cockroachdb#82508.

Release note: None
craig bot pushed a commit that referenced this issue Feb 6, 2024
117988: roachtest: admission/follower-overload test improvements r=sumeerbhola a=aadityasondhi

[roachtest: fix zone config syntax in ac follower overload test](c80491e) 

Informs #82508.

Release note: None

---

[roachtest: add bandwidth overload test in admission/follower-overload](d0dc296) 

This aims to simulate read bandwidth induced overload by running a large
kv0 workload on a 3 node cluster, while all the leases are owned by n1 and
n2.

Informs #82508.

Release note: None

Co-authored-by: Aaditya Sondhi <[email protected]>
wenyihu6 pushed a commit to wenyihu6/cockroach that referenced this issue Feb 21, 2024
wenyihu6 pushed a commit to wenyihu6/cockroach that referenced this issue Feb 21, 2024
This aims to simulate read bandwidth induced overload by running a large
kv0 workload on a 3 node cluster, while all the leases are owned by n1 and
n2.

Informs cockroachdb#82508.

Release note: None
aadityasondhi (Collaborator) commented:

Update here for posterity:

We ran a few internal experiments for this, and the reason for the overload is saturation of the provisioned disk bandwidth. Until we enable disk bandwidth AC (#86857), making changes here will not actually subject the AUTO CREATE STATS (background) job to throttling, since we do not throttle reads at the moment.

Full internal discussion is available here.

@exalate-issue-sync exalate-issue-sync bot added the T-storage Storage Team label May 1, 2024
@blathers-crl blathers-crl bot added the A-storage Relating to our storage engine (Pebble) on-disk storage. label May 1, 2024
@aadityasondhi aadityasondhi added T-admission-control Admission Control and removed A-storage Relating to our storage engine (Pebble) on-disk storage. T-storage Storage Team labels May 7, 2024
@aadityasondhi aadityasondhi removed their assignment May 7, 2024