workload/ycsb: add flag to use column families #32704

Merged 1 commit on Nov 29, 2018

nvanbenschoten

This change adds a new --families flag to the ycsb workload. Now
that #18168 is addressed, this significantly reduces the contention
present in the workload by avoiding conflicts between updates to
different columns in the same row.

I just confirmed that this still provides a huge speedup. On a 24-CPU machine:

workload run ycsb --init --workload='A' --concurrency=128 --duration=1m


--families=false gives:

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
   60.0s        0         103089         1718.0     13.9      0.6      1.8    604.0   1476.4  read
   60.0s        0         102947         1715.6     59.5      5.2     11.5   2281.7   8321.5  update

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
   60.0s        0         206036         3433.6     36.7      3.3      8.9   1342.2   8321.5


--families=true gives:

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
   60.0s        0         333477         5557.8      9.2      0.6      6.0    302.0   1275.1  read
   60.0s        0         332366         5539.3     13.7      6.8     17.8     54.5   4831.8  update

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
   60.0s        0         665843        11097.1     11.5      3.9     16.3    268.4   4831.8
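
For a quick read of the two runs above, the following Go sketch computes the ratios from the reported summary figures. The numbers are copied from the output above; the `run` type and the `ratio` helper are made up for this example. With these inputs the throughput gain comes out to roughly 3.2x and the update p99 latency drops by roughly 42x.

```go
package main

import "fmt"

// run holds figures copied from the workload output above.
type run struct {
	opsPerSec   float64 // cumulative ops/sec from the "result" row
	updateP99Ms float64 // p99 latency of the "update" row, in ms
}

// ratio returns how many times larger a is than b.
func ratio(a, b float64) float64 {
	return a / b
}

var (
	withoutFamilies = run{opsPerSec: 3433.6, updateP99Ms: 2281.7}
	withFamilies    = run{opsPerSec: 11097.1, updateP99Ms: 54.5}
)

func main() {
	fmt.Printf("throughput gain:      %.2fx\n",
		ratio(withFamilies.opsPerSec, withoutFamilies.opsPerSec))
	fmt.Printf("update p99 reduction: %.1fx\n",
		ratio(withoutFamilies.updateP99Ms, withFamilies.updateP99Ms))
}
```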

cc. @robert-s-lee @drewdeally

Release note: None

@petermattis petermattis left a comment

:lgtm: Nice!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

@nvanbenschoten

bors r+

craig bot pushed a commit that referenced this pull request Nov 29, 2018
32704: workload/ycsb: add flag to use column families r=nvanbenschoten a=nvanbenschoten

Co-authored-by: Nathan VanBenschoten <[email protected]>

craig bot commented Nov 29, 2018

Build succeeded

@craig craig bot merged commit 16ffee3 into cockroachdb:master Nov 29, 2018
@nvanbenschoten nvanbenschoten deleted the nvanbenschoten/ycsbFam branch November 29, 2018 18:54
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Mar 5, 2020
This change marks all columns in the YCSB "usertable" as NOT NULL. Doing
so allows the load generator to take advantage of cockroachdb#44239, which avoids
the KV lookup on the primary column family entirely when querying one of
the other column families. The query plans before and after demonstrate
this:

```
--- Before
root@:26257/ycsb> EXPLAIN SELECT field5 FROM usertable WHERE ycsb_key = 'key';
    tree    |    field    |               description
------------+-------------+------------------------------------------
            | distributed | false
            | vectorized  | false
  render    |             |
   └── scan |             |
            | table       | usertable@primary
            | spans       | /"key"/0-/"key"/1 /"key"/6/1-/"key"/6/2
            | parallel    |

--- After
root@:26257/ycsb> EXPLAIN SELECT field5 FROM usertable WHERE ycsb_key = 'key';
    tree    |    field    |      description
------------+-------------+------------------------
            | distributed | false
            | vectorized  | false
  render    |             |
   └── scan |             |
            | table       | usertable@primary
            | spans       | /"key"/6/1-/"key"/6/2
```

This becomes very important when running YCSB with a column family per
field and with implicit SELECT FOR UPDATE (see cockroachdb#45159). Now that (as
of cockroachdb#45701) UPDATE statements acquire upgrade locks during their initial row
fetch, we don't want them to lock the primary column family of a row when
they intend to update only a single column in it. Doing so would
re-introduce the contention between writes to different columns in the
same row that column families helped avoid (see cockroachdb#32704). By marking each
column as NOT NULL, we can continue to avoid this contention.
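
The difference between the two query plans can be captured in a small toy model. The Go sketch below is not the optimizer's logic. It only reproduces the span strings from the EXPLAIN output, under the assumption that a nullable column may have no KV entry in its own family, so the primary family (ID 0) has to be read as well to tell a NULL value apart from a missing row. The function `lookupSpans` and its parameters are hypothetical.

```go
package main

import "fmt"

// lookupSpans is a toy model of the spans scanned when reading a single
// column stored in column family familyID (> 0) for the given key.
// The strings follow the notation of the EXPLAIN output above.
// Assumption: a nullable column forces an extra read of family 0.
func lookupSpans(key string, familyID int, notNull bool) []string {
	target := fmt.Sprintf("/%q/%d/1-/%q/%d/2", key, familyID, key, familyID)
	if notNull {
		// The column's own family entry exists whenever the row exists.
		return []string{target}
	}
	primary := fmt.Sprintf("/%q/0-/%q/1", key, key)
	return []string{primary, target}
}

func main() {
	fmt.Println("nullable:", lookupSpans("key", 6, false))
	fmt.Println("NOT NULL:", lookupSpans("key", 6, true))
}
```

In this model, an UPDATE of `field5` on a NOT NULL schema only touches the span of family 6, so a lock taken during the initial row fetch stays off the primary family that every other single-column update would otherwise share.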