Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest: add read committed variants of ycsb #107517

Merged
merged 2 commits into from
Jul 26, 2023

Conversation

nvanbenschoten
Copy link
Member

Closes #107112.

This PR adds the following six roachtest variants:

ycsb/A/nodes=3/cpu=32/isolation-level=read-committed
ycsb/B/nodes=3/cpu=32/isolation-level=read-committed
ycsb/C/nodes=3/cpu=32/isolation-level=read-committed
ycsb/D/nodes=3/cpu=32/isolation-level=read-committed
ycsb/E/nodes=3/cpu=32/isolation-level=read-committed
ycsb/F/nodes=3/cpu=32/isolation-level=read-committed

It does so after adding an --isolation-level flag to ycsb, which controls the isolation level to run the workload transactions under. If unset, the workload will run with the default isolation level of the database.

Release note: None

Informs cockroachdb#107112.

This commit adds an `--isolation-level` flag to ycsb, which controls the
isolation level to run the workload transactions under. If unset, the
workload will run with the default isolation level of the database.

Release note: None
@nvanbenschoten nvanbenschoten requested a review from michae2 July 25, 2023 04:43
@nvanbenschoten nvanbenschoten requested a review from a team as a code owner July 25, 2023 04:43
@nvanbenschoten nvanbenschoten requested review from herkolategan and smg260 and removed request for a team July 25, 2023 04:43
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@nvanbenschoten
Copy link
Member Author

I'll kick off some iterations of each workload to compare vs. serializable.

@nvanbenschoten
Copy link
Member Author

Here is the result of the comparison:

                      │ ycsb_ssi.txt │               ycsb_rc.txt               │
                      │    ops/s     │     ops/s       vs base                 │
ycsb/A/nodes=3/cpu=32   30.97k ± ∞ ¹    31.92k ± 6%          ~ (p=0.662 n=5+6)
ycsb/B/nodes=3/cpu=32   82.73k ± ∞ ¹    91.95k ±  ∞ ¹  +11.16% (p=0.016 n=5)
ycsb/C/nodes=3/cpu=32   99.24k ± ∞ ¹   103.40k ± 6%          ~ (p=0.931 n=5+6)
ycsb/D/nodes=3/cpu=32   85.23k ± ∞ ¹    96.24k ± 2%          ~ (p=0.126 n=5+6)
ycsb/E/nodes=3/cpu=32   6.386k ± ∞ ¹    7.080k ± 1%          ~ (p=0.126 n=5+6)
ycsb/F/nodes=3/cpu=32   17.17k ± ∞ ¹    30.24k ± 3%    +76.11% (p=0.004 n=5+6)

                      │ ycsb_ssi.txt │              ycsb_rc.txt              │
                      │   avg(ms)    │   avg(ms)     vs base                 │
ycsb/A/nodes=3/cpu=32    4.600 ± ∞ ¹   4.500 ± 4%          ~ (p=0.771 n=5+6)
ycsb/B/nodes=3/cpu=32    2.300 ± ∞ ¹   2.100 ±  ∞ ¹   -8.70% (p=0.048 n=5)
ycsb/C/nodes=3/cpu=32    1.900 ± ∞ ¹   1.900 ± 5%          ~ (p=0.732 n=5+6)
ycsb/D/nodes=3/cpu=32    1.700 ± ∞ ¹   1.500 ± 0%          ~ (p=0.065 n=5+6)
ycsb/E/nodes=3/cpu=32    22.60 ± ∞ ¹   20.35 ± 1%          ~ (p=0.113 n=5+6)
ycsb/F/nodes=3/cpu=32    8.400 ± ∞ ¹   4.800 ± 4%    -42.86% (n=5+6)

                      │ ycsb_ssi.txt │              ycsb_rc.txt               │
                      │   p99(ms)    │    p99(ms)     vs base                 │
ycsb/A/nodes=3/cpu=32    54.50 ± ∞ ¹   67.10 ±  9%    +23.12% (p=0.017 n=5+6)
ycsb/B/nodes=3/cpu=32    12.60 ± ∞ ¹   10.50 ±   ∞ ¹  -16.67% (p=0.008 n=5)
ycsb/C/nodes=3/cpu=32    8.100 ± ∞ ¹   7.900 ± 13%          ~ (p=0.905 n=5+6)
ycsb/D/nodes=3/cpu=32    6.800 ± ∞ ¹   6.150 ±  2%          ~ (p=0.082 n=5+6)
ycsb/E/nodes=3/cpu=32    67.10 ± ∞ ¹   62.90 ±  3%          ~ (p=0.182 n=5+6)
ycsb/F/nodes=3/cpu=32   130.00 ± ∞ ¹   54.50 ± 12%    -58.08% (p=0.004 n=5+6)

The largest difference is from ycsb F (50% read, 50% read-modify-write). That's likely due to a combination of two factors:

  1. this PR passes --read-modify-write-in-txn=false for read committed. I think I'm going to switch that to be the default even for serializable. We've seen external users of our YCSB benchmark get tripped up on this difference vs. the official benchmark.
  2. reads under RC don't block on exclusive locks. This eliminates blocking between SFU locks in the read-modify-write transaction and non-locking readers.

We also see a meaningful improvement in ycsb A (50% read, 50% update) and ycsb B (95% read, 5% update). I think this is also due to RC reads not blocking on exclusive locks. This eliminates blocking between the implicit SFU locks acquired by the updates and the non-locking readers.

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Jul 25, 2023
The `read-modify-write-in-txn` flag was added to ycsb in cockroachdb#103117. This
commit changes the default of the flag from true to false. This makes
the default configuration of ycsb more closely mirror the official ycsb
implementation, to avoid confusion when external users run the workload
and compare.

We saw in cockroachdb#103117 and in cockroachdb#107517 that this improves throughput
substantially. We will expect to see the same in roachperf once this
change is merged. After merging the PR, I will add a roachperf annotation.

Epic: None
Release note: None
@nvanbenschoten
Copy link
Member Author

I think I'm going to switch that to be the default even for serializable. We've seen external users of our YCSB benchmark get tripped up on this difference vs. the official benchmark.

This change is in #107537.

craig bot pushed a commit that referenced this pull request Jul 26, 2023
105441: server: notify tenant servers of metadata changes r=yuzefovich,stevendanna a=knz

Informs (towards fixing) #98431.
Fixes #105721.
Epic: CRDB-26691

See the individual commits for details.


106325: sql/parser: parse CREATE PROCEDURE r=mgartner a=mgartner

#### sql/parser: parse CREATE PROCEDURE

`CREATE PROCEDURE` statements can now be parsed by the SQL parser.
Currently, executing `CREATE PROCEDURE` will produce an unimplemented
error.

Epic: CRDB-25388

Release note: None

#### sem/tree: rename function AST nodes

This commit renames UDF-related AST nodes with "function" in the name to
use the more general term "routine". The term "routine" is inclusive of
both UDFs and procedures (e.g., Postgres implements a `DROP ROUTINE`
statement which drops both UDFs and procedures), which is fitting
because we'll be using the same AST nodes for both `CREATE FUNCTION` and
`CREATE PROCEDURE` statements.

Release note: None

#### sem/tree: rename udf.go to create_routine.go

Release note: None


107470: server, ccl: adjust more tests to work with test tenant r=yuzefovich a=yuzefovich

Informs: #76378.
Epic: CRDB-18499.


107507:  kvserver: transfer lease when acquiring lease outside preferences r=erikgrinaker,andrewbaptist a=kvoli

First 2 commits are from #107505.

When a leaseholder is lost, any surviving replica may acquire the lease, even if it violates lease preferences. There are two main reasons for this: we need to elect a new Raft leader who will acquire the lease, which is agnostic to lease preferences, and there may not even be any surviving replicas that satisfy the lease preferences at all, so we don't want to keep the range unavailable while we try to figure this out (considering e.g. network timeouts can delay this for many seconds).

However, after acquiring a lease, we rely on the replicate queue to transfer the lease back to a replica that conforms with the preferences, which can take several minutes. In multi-region clusters, this can cause severe latency degradation if the lease is acquired in a remote region.

This PR will detect lease preference violations when a replica acquires a new lease, and eagerly enqueue it in the replicate queue for transfer (if possible). If the first process fails, the replica will be sent to purgatory and retried soon after.

Additionally, two roachtests are added for lease preference conformance timing. The first test, `.../partial-first-preference-down`, takes down one of the preferred locality nodes (1/2). The second test, `.../full-first-preference-down`, takes down both the preferred locality nodes (2/2).

Resolves #106100.
Epic: none

Release note (bug fix): When losing a leaseholder and using lease preferences, the lease can be acquired by any other replica (regardless of lease preferences) in order to restore availability as soon as possible. The new leaseholder will now immediately check if it violates the lease preferences, and attempt to transfer the lease to a replica that satisfies the preferences if possible.


107512: sql: fix crdb_internal.encode_key in some contexts r=yuzefovich a=yuzefovich

Previously, we forgot to set `CatalogBuiltins` for the internal planner which is used by `crdb_internal.encode_key` (which, in turn, is used to power `SHOW RANGE ... FOR ROW`), so evaluating it would lead to an error. For example, the internal planner is used in the backfill. This is now fixed.

Fixes: #106397.

Release note (bug fix): CockroachDB would previously return an error when using `SHOW RANGE ... FOR ROW ...` in `CREATE TABLE ... AS ...` construct, and this is now fixed.

107537: workload/ycsb: default --read-modify-write-in-txn to false r=erikgrinaker a=nvanbenschoten

The `read-modify-write-in-txn` flag was added to ycsb in #103117. This commit changes the default of the flag from true to false. This makes the default configuration of ycsb more closely mirror the official ycsb implementation, to avoid confusion when external users run the workload and compare.

We saw in #103117 and in #107517 that this improves throughput substantially. We will expect to see the same in roachperf once this change is merged. After merging the PR, I will add a roachperf annotation.

Epic: None
Release note: None

107549: persistedsqlstats: add AOST clause to statement statistics size check r=maryliag a=THardy98

Epic: None

This change as an AOST clause to the statement statistics size check.
This helps mitigate contention on the `system.statement_statistics`
table which has caused the sql stats compaction job to fail.

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>
Closes cockroachdb#107112.

This commit adds the following six roachtest variants:
```
ycsb/A/nodes=3/cpu=32/isolation-level=read-committed
ycsb/B/nodes=3/cpu=32/isolation-level=read-committed
ycsb/C/nodes=3/cpu=32/isolation-level=read-committed
ycsb/D/nodes=3/cpu=32/isolation-level=read-committed
ycsb/E/nodes=3/cpu=32/isolation-level=read-committed
ycsb/F/nodes=3/cpu=32/isolation-level=read-committed
```

Release note: None
@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/ycsbIsoLevel branch from 5bea524 to 2a335a2 Compare July 26, 2023 15:00
@nvanbenschoten
Copy link
Member Author

I updated this PR to reflect the change to the default value of --read-modify-write-in-txn in #107537. @michae2 PTAL when you get a chance.

Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice!

Reviewed 2 of 2 files at r1, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @herkolategan, @nvanbenschoten, and @smg260)


pkg/workload/connection.go line 123 at r1 (raw file):

	// NOTE: validation of the isolation level value is done by the server during
	// connection establishment.
	return setUrlParam(urls, "default_transaction_isolation", isoLevel)

If isoLevel is snapshot, should this also set sql.txn.snapshot_isolation.enabled?

Copy link
Member Author

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

bors r=michae2

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @herkolategan, @michae2, and @smg260)


pkg/workload/connection.go line 123 at r1 (raw file):

Previously, michae2 (Michael Erickson) wrote…

If isoLevel is snapshot, should this also set sql.txn.snapshot_isolation.enabled?

If there was a session variable to enable snapshot isolation, I think it would be a good idea to set it here. As is, we only have the cluster setting, and we generally don't like workloads silently manipulating cluster settings, so I'll leave for now. Please let me know if you'd like me to revisit.

@craig
Copy link
Contributor

craig bot commented Jul 26, 2023

Build succeeded:

@craig craig bot merged commit 284b6c0 into cockroachdb:master Jul 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

roachtest: support ycsb under read committed
3 participants