
kvserver: account for AddSSTableRequests in QPS #76252

Merged

merged 2 commits into master from addsst-qps on Feb 16, 2022

Conversation

kvoli
Collaborator

@kvoli kvoli commented Feb 8, 2022

Previously, Queries-Per-Second (QPS) was calculated uniformly per
BatchRequest as 1. This patch introduces variable QPS calculation
for AddSSTableRequest, which uses an order of magnitude more
resources than other request types.

This patch introduces the
kv.replica_stats.addsst_request_size_factor cluster setting. This
setting is used to attribute QPS to AddSSTableRequest sizes. The
calculation is done as QPS = 1 + size(AddSSTableRequest) / factor.
When kv.replica_stats.addsst_request_size_factor is less than 1, or
no AddSSTableRequest exists within a BatchRequest, then QPS = 1,
which is the current behavior.
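
For illustration, a minimal Go sketch of that accounting (the types and field names here are stand-ins, not the actual kvserver definitions):

package main

import "fmt"

// Illustrative stand-ins for the kvserver request types.
type Request interface{}

type AddSSTableRequest struct{ Data []byte }

type BatchRequest struct{ Requests []Request }

// batchRequestQPS mirrors the accounting described above:
// QPS = 1 + size(AddSSTableRequest) / factor, summed over any
// AddSSTableRequests in the batch. A factor below 1 disables the
// size-based accounting, so every batch costs a flat 1 QPS.
func batchRequestQPS(factor int64, ba BatchRequest) float64 {
	qps := 1.0
	if factor < 1 {
		return qps
	}
	for _, req := range ba.Requests {
		if sst, ok := req.(*AddSSTableRequest); ok {
			qps += float64(len(sst.Data)) / float64(factor)
		}
	}
	return qps
}

func main() {
	ba := BatchRequest{Requests: []Request{
		&AddSSTableRequest{Data: make([]byte, 100000)}, // a 100 kB SST
	}}
	fmt.Println(batchRequestQPS(50000, ba)) // 1 + 100000/50000 = 3
}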

resolves #73731

Release note (performance improvement):
Introduced the kv.replica_stats.addsst_request_size_factor cluster
setting. This setting is used to tune Queries-Per-Second (QPS)
sensitivity to large imports. When set to 0, this setting is disabled
and every request is counted as 1 QPS. When enabled, the size of any
AddSSTableRequest contributes to QPS in inverse relation to this
setting's magnitude. By default this setting is configured to a
conservative 50,000; every 50 kilobytes will be accounted for as an
additional 1 QPS.
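
For example, the setting could be tuned or disabled at runtime like so (a sketch following the roachprod conventions used in the experiments below; the target node is illustrative):

roachprod sql ${cluster_user}-${cluster_suffix}:15 -- -e 'SET CLUSTER SETTING kv.replica_stats.addsst_request_size_factor = 50000;'

roachprod sql ${cluster_user}-${cluster_suffix}:15 -- -e 'SET CLUSTER SETTING kv.replica_stats.addsst_request_size_factor = 0;'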

@cockroach-teamcity
Member

This change is Reviewable

@kvoli kvoli added the do-not-merge label (bors won't merge a PR with this label) Feb 9, 2022
@kvoli kvoli force-pushed the addsst-qps branch 8 times, most recently from 8f0d5bc to 7a62a9e on February 10, 2022 16:34
@kvoli kvoli removed the do-not-merge label (bors won't merge a PR with this label) Feb 10, 2022
@kvoli kvoli marked this pull request as ready for review February 10, 2022 18:36
@kvoli kvoli requested a review from a team as a code owner February 10, 2022 18:36
@kvoli kvoli self-assigned this Feb 10, 2022
@kvoli
Collaborator Author

kvoli commented Feb 10, 2022

The experiments conducted on this change can be found here (internal).

The results show an improvement over per-batch-request accounting, the current method on master. The key metric of success here is the variation in write throughput between nodes in a cluster. QPS variation is also important, as it is the input to allocator decisions.

Write throughput variance between nodes is shown to be strictly lower when testing a large TPC-E (500k) import and index creation.
[Screenshot: write throughput variance between nodes, patched vs. master]

The summarized findings below compare the current accounting method against this patch. This experiment was conducted with background noise, alongside a large (500k) TPC-E import and index creation.

For posterity, the experiments conducted are as follows:

(1) TPC-C Large Import (30k Warehouses)

roachprod ssh ${cluster_user}-${cluster_suffix}:15      -- './cockroach workload fixtures import tpcc --warehouses=30000 {pgurl:1}' 

(2) TPC-E Large Import and Index Creation (500k)

roachprod sql ${cluster_user}-${cluster_suffix}:15  -- -e 'CREATE TABLE trade (t_id INT8 NOT NULL PRIMARY KEY, t_dts TIMESTAMP NOT NULL, t_st_id VARCHAR(4) NOT NULL, t_tt_id VARCHAR(3) NOT NULL, t_is_cash BOOL NOT NULL, t_s_symb VARCHAR(15) NOT NULL, t_qty INT4 NOT NULL CHECK (t_qty > 0), t_bid_price DECIMAL(8,2) NOT NULL CHECK (t_bid_price > 0), t_ca_id INT8 NOT NULL, t_exec_name VARCHAR(49) NOT NULL, t_trade_price DECIMAL(8,2), t_chrg DECIMAL(10,2) NOT NULL CHECK (t_chrg >= 0), t_comm DECIMAL(10,2) NOT NULL CHECK (t_comm >= 0), t_tax DECIMAL(10,2) NOT NULL CHECK (t_tax  >= 0), t_lifo BOOL NOT NULL, FAMILY (t_id), FAMILY (t_comm, t_dts, t_st_id, t_trade_price, t_exec_name, t_tt_id, t_is_cash, t_s_symb, t_qty, t_bid_price, t_ca_id, t_chrg, t_lifo), FAMILY (t_tax));'

roachprod sql ${cluster_user}-${cluster_suffix}:15  -- -e "IMPORT INTO trade CSV DATA ('gs://cockroach-fixtures/tpce-csv/customers=500000/*/Trade.txt?AUTH=implicit') WITH delimiter = '|'; CREATE INDEX ON trade (t_ca_id, t_dts DESC) STORING (t_st_id, t_tt_id, t_is_cash, t_s_symb, t_qty, t_bid_price, t_exec_name, t_trade_price, t_chrg);" 

(3) TPC-C (3k Warehouses), in parallel with TPC-E Large Import and Index Creation (500k)

A combination of the above two, with modifications and the inclusion of the TPC-C workload runner. The workload runner is present throughout the experiment's lifetime, to create a fixed level of background QPS.

roachprod ssh ${cluster_user}-${cluster_suffix}-master:16 -- './cockroach workload fixtures import tpcc --warehouses=3000 {pgurl:1};'

roachprod ssh ${cluster_user}-${cluster_suffix}-20k:16 -- './cockroach workload run tpcc --ramp=1m --active-warehouses=3000 --warehouses=3000 {pgurl:1-15}; '

The TPC-C workload in this test went to 0 QPS for both accounting methods during index creation. While the patched cluster performed marginally better, retaining some TPC-C QPS at stages, both were ultimately unable to process the TPC-C workload while the TPC-E index creation was running. The above evaluation held at 3000/2500/2000/1000 warehouses. The patched cluster could sustain TPC-C at 750 warehouses and the master branch cluster at 250. These results are shown below.

[Screenshot: TPC-C sustained warehouses, patched vs. master]

QPS shows a minor, though not significant, variance improvement on the patched cluster.
[Screenshot: QPS variance, patched vs. master]

Contributor

@aayushshah15 aayushshah15 left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @kvoli)


pkg/kv/kvserver/replica_metrics.go, line 253 at r2 (raw file):

// that weren't sent through a DistSender, which in practice should be
// practically none). Also return the amount of time over which the stat was
// accumulated. AddSSTableRequests can be treated differently, in which case

nit: This seems like an odd place to have this comment, could we just have it closer to where we're actually accounting for QPS and have this comment point to that location?


pkg/kv/kvserver/replica_send.go, line 974 at r2 (raw file):

// 1 QPS, unless an AddSSTableRequest exists, in which case the sum of all
// AddSSTableRequest's data size is divided by a factor and added to QPS. This
// specific treatment of QPS is a special case to accout for the difference

account*


pkg/kv/kvserver/replica_send.go, line 975 at r2 (raw file):

// AddSSTableRequest's data size is divided by a factor and added to QPS. This
// specific treatment of QPS is a special case to accout for the difference
// extreme between AddSSTableRequest and other requests in terms of resource

extreme difference?


pkg/kv/kvserver/replica_stats.go, line 46 at r2 (raw file):

	"kv.replica_stats.addsst_request_size_factor",
	"the divisor that is applied to addsstable request sizes, then recorded in a leaseholders QPS; 0 means all requests are treated as cost 1",
	0,

Did we decide that this factor was going to be zero by default? That's kind of unfortunate. Could we instead consider setting it to a large-ish value (say 1mb)?


pkg/kv/kvserver/replica_rankings_test.go, line 86 at r2 (raw file):

// TestAddSSTQPSStat verifies that AddSSTableRequests are accounted for
// differently, when present in a BatchRequest, with a divisor set.
func TestAddSSTQPSStat(t *testing.T) {

nice test!


pkg/kv/kvserver/replica_rankings_test.go, line 92 at r2 (raw file):

	tc := serverutils.StartNewTestCluster(t, 1, base.TestClusterArgs{
		ServerArgs: base.TestServerArgs{

I'd also recommend disabling the replicateQueue by using ReplicationMode: Manual here to avoid lease transfers away from the replica you're measuring the QPS off of.
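
For illustration, the suggested setup would look roughly like this (a sketch assuming the test file's existing imports; the assertions are elided):

func TestAddSSTQPSStat(t *testing.T) {
	defer leaktest.AfterTest(t)()
	ctx := context.Background()
	tc := serverutils.StartNewTestCluster(t, 1, base.TestClusterArgs{
		// ReplicationManual disables the replicate queue, so the lease
		// cannot be transferred away from the replica whose QPS the
		// test measures.
		ReplicationMode: base.ReplicationManual,
	})
	defer tc.Stopper().Stop(ctx)
	// ... issue AddSSTable requests and assert on the recorded QPS ...
}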


pkg/kv/kvserver/replica_rankings_test.go, line 93 at r2 (raw file):

	tc := serverutils.StartNewTestCluster(t, 1, base.TestClusterArgs{
		ServerArgs: base.TestServerArgs{
			DisableWebSessionAuthentication: true,

I'm surprised we need this. Let's remove it if we don't?


pkg/kv/kvserver/replica_rankings_test.go, line 167 at r2 (raw file):

		// If queries are correctly recorded, we should see increase in query count by
		// the expected QPS. However, it is possible to to get a slightly higher or

to to

also nit: the wrapping is a little off here.

Collaborator Author

@kvoli kvoli left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @kvoli)


pkg/kv/kvserver/replica_metrics.go, line 253 at r2 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

nit: This seems like an odd place to have this comment, could we just have it closer to where we're actually accounting for QPS and have this comment point to that location?

Removed. Where this is being accounted for, it is recorded - were you thinking of adding a comment above the method call r.getBatchRequestQPS()? Currently this is an abridged version of the function comment on getBatchRequestQPS().


pkg/kv/kvserver/replica_send.go, line 974 at r2 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

account*

Fixed.


pkg/kv/kvserver/replica_send.go, line 975 at r2 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

extreme difference?

Reworded.


pkg/kv/kvserver/replica_stats.go, line 46 at r2 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

Did we decide that this factor was going to be zero by default? That's kind of unfortunate. Could we instead consider setting it to a large-ish value (say 1mb)?

Updated following our discussion to a conservative 50k.


pkg/kv/kvserver/replica_rankings_test.go, line 86 at r2 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

nice test!

ty


pkg/kv/kvserver/replica_rankings_test.go, line 92 at r2 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

I'd also recommend disabling the replicateQueue by using ReplicationMode: Manual here to avoid lease transfers away from the replica you're measuring the QPS off of.

Done.


pkg/kv/kvserver/replica_rankings_test.go, line 93 at r2 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

I'm surprised we need this. Let's remove it if we don't?

Removed.


pkg/kv/kvserver/replica_rankings_test.go, line 167 at r2 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

to to

also nit: the wrapping is a little off here.

Formatted.

Contributor

@aayushshah15 aayushshah15 left a comment


:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15 and @kvoli)


pkg/kv/kvserver/replica_metrics.go, line 253 at r2 (raw file):

Previously, kvoli (Austen) wrote…

Removed. Where this is being accounted for, it is recorded - were you think of adding a comment above the method call r.getBatchRequestQPS()? Currently this is an abridged version of the function comment getBatchRequest.

Yeah, this looks good to me. I was thinking that we'd have a line here saying "See getBatchRequestQPS() for how this is accounted for". Hopefully, this makes it such that when the QPS accounting method is changed, these comments point to a single place that has an up-to-date comment.

Contributor

@aayushshah15 aayushshah15 left a comment


Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15 and @kvoli)


pkg/kv/kvserver/replica_stats.go, line 46 at r2 (raw file):

Previously, kvoli (Austen) wrote…

Updated following our discussion to a conservative 50k.

Let's update the release note with that.

@aayushshah15
Contributor

Added Nathan in case he wants to give this a pass as well.

@kvoli kvoli force-pushed the addsst-qps branch 2 times, most recently from f52c29c to e76876e on February 11, 2022 21:12
@kvoli kvoli force-pushed the addsst-qps branch 5 times, most recently from 182e2bf to 23ffb9c on February 15, 2022 19:06
This patch renames the method handler for batch requests containing at
least one write request to `executeWriteBatch`, from
`executeReadWriteBatch`, to reflect its current behavior.

Release note: None
@kvoli kvoli force-pushed the addsst-qps branch 2 times, most recently from 5d60fc1 to 2a304e3 on February 15, 2022 19:51
Member

@nvanbenschoten nvanbenschoten left a comment


:lgtm:

Reviewed 11 of 11 files at r5, 11 of 11 files at r6, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15 and @kvoli)


pkg/kv/kvserver/replica_stats.go, line 53 at r6 (raw file):

).WithPublic()

// AddSSTableRequestSizeFactor returns the size cost factor for a given replica.

nit: I think this method actually hurts readability because it forces readers to jump over here to confirm that this is pulling from a cluster setting and it's not a per-range property. We generally don't wrap cluster setting accesses like this in methods. What do you think about inlining it?
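
As a toy before/after illustration of that suggestion (all names here are hypothetical stand-ins, not the real settings API):

package main

import "fmt"

// intSetting stands in for a cluster setting handle; only the Get
// accessor matters for this sketch.
type intSetting struct{ v int64 }

func (s intSetting) Get() int64 { return s.v }

var addSSTRequestSizeFactor = intSetting{v: 50000}

type replica struct{}

// Before: a wrapper method. Readers must jump here to learn that the
// value comes from a cluster setting rather than a per-range property.
func (r *replica) addSSTableRequestSizeFactor() int64 {
	return addSSTRequestSizeFactor.Get()
}

func main() {
	r := &replica{}
	fmt.Println(r.addSSTableRequestSizeFactor())
	// After: inline the setting read at the point of use.
	fmt.Println(addSSTRequestSizeFactor.Get())
}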

@kvoli
Collaborator Author

kvoli commented Feb 15, 2022

bors r-

@craig
Contributor

craig bot commented Feb 15, 2022

Canceled.

@kvoli
Collaborator Author

kvoli commented Feb 15, 2022

pkg/kv/kvserver/replica_stats.go, line 53 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: I think this method actually hurts readability because it forces readers to jump over here to confirm that this is pulling from a cluster setting and it's not a per-range property. We generally don't wrap cluster setting accesses like this in methods. What do you think about inlining it?

Updated to be inline.

@irfansharif
Contributor

The commit message refers to #7371, I think we meant #73731?

Previously, Queries-Per-Second (QPS) was calculated uniformly per
`BatchRequest` as 1. This patch introduces variable QPS calculation
for `AddSSTableRequest`, which uses an order of magnitude more
resources than other request types.

This patch introduces the
`kv.replica_stats.addsst_request_size_factor` cluster setting. This
setting is used to attribute QPS to `AddSSTableRequest` sizes. The
calculation is done as QPS = 1 + size(AddSSTableRequest) / factor.
When `kv.replica_stats.addsst_request_size_factor` is less than 1, or
no `AddSSTableRequest` exists within a `BatchRequest`, then QPS = 1,
which is the current behavior.

resolves cockroachdb#73731

Release note (performance improvement):
Introduced the `kv.replica_stats.addsst_request_size_factor` cluster
setting. This setting is used to tune Queries-Per-Second (QPS)
sensitivity to large imports. When set to 0, this setting is disabled
and every request is counted as 1 QPS. When enabled, the size of any
AddSSTableRequest contributes to QPS in inverse relation to this
setting's magnitude. By default this setting is configured to a
conservative 50,000; every 50 kilobytes will be accounted for as an
additional 1 QPS.
@kvoli
Collaborator Author

kvoli commented Feb 16, 2022

The commit message refers to #7371, I think we meant #73731?

Good spot. Updated.

@kvoli
Collaborator Author

kvoli commented Feb 16, 2022

bors r+

@craig
Contributor

craig bot commented Feb 16, 2022

Build succeeded.
