Queries take >10s when updating same row concurrently #17883

Closed
jethrogb opened this issue Aug 24, 2017 · 11 comments

Labels: A-kv-transactions (Relating to MVCC and the transactional model.) · C-performance (Perf of queries or internals. Solution not expected to change functional behavior.) · O-community (Originated from the community)
Milestone: 2.1

jethrogb commented Aug 24, 2017

I have a table like this:

CREATE TABLE testtable (
    lookup BYTES PRIMARY KEY,
    name STRING,
    field1 BYTES,
    field2 BYTES,
    blob BYTES
)

I have tens of concurrent processes running this in a loop:

UPDATE testtable SET blob = 'some random value' WHERE lookup = 'one value'

This starts out relatively fast, but after a while individual queries can take tens of seconds. After killing these processes, operations on this table (even one as simple as SELECT count(*) FROM testtable) remain slow for a very long time.

I'm running a single node, but I've also observed similar behavior on clusters. Config header:

I170823 15:46:49.746002 1 util/log/clog.go:1011  [config] file created at: 2017/08/23 15:46:49
I170823 15:46:49.746002 1 util/log/clog.go:1011  [config] running on machine: jethroft
I170823 15:46:49.746002 1 util/log/clog.go:1011  [config] binary: CockroachDB CCL v1.0.4 (linux amd64, built 2017/07/27 17:54:36, go1.8.3)
I170823 15:46:49.746002 1 util/log/clog.go:1011  [config] arguments: [./cockroach start --insecure]

Test program below. Run with:

./test-cockroach.py --server localhost init
./test-cockroach.py --server localhost run --rows 1 --max-processes 60

If it doesn't get slow enough, try increasing the num_updates variable in the code or the max-processes number on the command line. With num_updates=1 it usually finishes in a couple of minutes. With num_updates=10 it can take a very long time to finish.

#!/usr/bin/python

from __future__ import print_function

import argparse
from multiprocessing import Array, Process
import psycopg2
from psycopg2.extras import LoggingConnection, LoggingCursor
import random
import six
import string
import sys
import time
import uuid

letters = list(string.ascii_letters)
dbname = "ftxtest"
table_name = "testtable"

default_db_rows = 5000
# Number of different sizes to stop and gather performance data.
num_measurements = 50

name_len = 16
min_blob_len = 128
max_blob_len = 256
num_updates = 2

# Maximum number of processes for multi-process test. (We use Python
# multiprocessing instead of threading because Python threads can't
# run concurrently very much due to the global interpreter lock).
default_processes = 10

def random_string(length=name_len):
    # Build a random string of the requested length (the original
    # version ignored its parameter and shadowed the len builtin).
    output = ""
    for i in range(length):
        output += random.choice(letters)
    return output

def random_blob_len():
    return random.randrange(min_blob_len, max_blob_len)

def random_blob():
    return random_bytearray(random_blob_len())

def random_bytes(length):
    for _ in range(length):
        yield random.getrandbits(8)

def random_bytearray(length):
    return bytearray(random_bytes(length))

def create_database(conn):
    try:
        cur = conn.cursor()
        cur.execute("create database {}".format(dbname));
        conn.commit()
    except psycopg2.ProgrammingError as e:
        if e.pgcode == "42P04":
            print("Database already exists, continuing")
        else:
            raise

def create_table(conn, mysql=False):
    try:
        cur = conn.cursor()
        if mysql:
            binary = "bytea"
            text = "varchar"
        else:
            binary = "bytes"
            text = "string"

        schema = ("lookup {binary} primary key, name {text}, field1 {binary}, " +
                  "field2 {binary}, blob {binary}").format(binary=binary, text=text)
        cur.execute("create table {} ({})".format(table_name, schema))
        conn.commit()
    except psycopg2.ProgrammingError as e:
        if e.pgcode == "42P07":
            print("Table already exists, continuing")
        else:
            raise

def create_rows(conn, num_rows):
    cur = conn.cursor()
    rows = []
    for i in range(num_rows):
        lookup = str(uuid.uuid4())
        name = random_string()
        field1 = str(uuid.uuid4())
        field2 = str(uuid.uuid4())
        blob = psycopg2.Binary(random_blob())

        cur.execute("insert into {} ".format(table_name) +
                    "(lookup, name, field1, field2, blob) VALUES " +
                    "(%s, %s, %s, %s, %s);",
                    (lookup, name, field1, field2, blob))

        rows.append(lookup)

    return rows

def first_server(args):
    return args.server.split(",")[0]

def connect(server, args):
    # To have the client log what it's doing, run with the following code:
    #conn = psycopg2.connect(dbname=dbname, host=args.server, port=args.port,
    #                        user="root", connection_factory=LoggingConnection,
    #                        cursor_factory=LoggingCursor)
    #conn.initialize(open("/tmp/connection", "w"))
    #return conn
    return psycopg2.connect(dbname=dbname, host=server, port=args.port,
                            user=args.user, password=args.password)
        
def init(args):
    create_database(connect(first_server(args), args))
    create_table(connect(first_server(args), args), mysql=args.mysql)

def init_table(args):
    create_table(connect(first_server(args), args), mysql=args.mysql)

def time_updates(cur, conn, rows, num_rows, num_updates):
    total = 0.0
    for i in range(num_updates):
        row_idx = random.randrange(num_rows)

        lookup = rows[row_idx]
        # Leave everything else the same, but update the blob.
        blob = random_blob()

        start = time.time()
        cur.execute("update {} set blob = %s where lookup = %s;".format(table_name),
                    (blob, lookup))
        end = time.time()
        print(".", end="")
        sys.stdout.flush()
        total += end - start
    return total / num_updates

def update_rows_worker(args, server, rows, num_rows, num_updates, index, output):
    conn = connect(server, args)
    conn.autocommit = True
    cur = conn.cursor()

    output[index] = time_updates(cur, conn, rows, num_rows, num_updates)

def run_multiprocess_test(args, rows, num_rows, num_processes):
    servers = args.server.split(",")
    output = Array("f", num_processes)
    processes = []
    for i in range(num_processes):
        server = servers[i % len(servers)]
        processes.append(Process(target=update_rows_worker,
                                 args=(args, server, rows, num_rows,
                                       num_updates, i, output)))
    for process in processes:
        process.start()

    for process in processes:
        process.join()

    total = 0.0
    for average in output:
        total += average

    print("{},{}".format(num_processes, total / num_processes))

        
def run(args):
    conn = connect(first_server(args), args)
    conn.autocommit = True
    cur = conn.cursor()

    cur.execute("delete from {};".format(table_name))
    
    rows = []
    num_rows = len(rows)

    increment = max(1, args.rows // num_measurements)

    while True:
        if num_rows >= args.rows:
            break
        # Add the new rows to the end of the list.
        rows.extend(create_rows(conn, increment))
        num_rows = len(rows)

        if not args.skip_scaling:
            average = time_updates(cur, conn, rows,
                                   num_rows, num_updates)

            print("{},{}".format(num_rows, average))
        else:
            print("{} rows created".format(num_rows))

    print("Multiprocess latencies:")
    for i in [args.max_processes + 1]:
        run_multiprocess_test(args, rows, num_rows, i)
        
        
if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog="python-test-cockroach.py")
    parser.add_argument("--server", required=True,
                        help="Cockroach server to connect to")
    parser.add_argument("--port", type=int, default=26257,
                        help="Port that cockroach is listening on")
    parser.add_argument("--user", default="root",
                        help="User to use to log into postgresql")
    parser.add_argument("--password", default=None,
                        help="Password to use to log into postgresql")
    parser.add_argument("--mysql", action="store_true", default=False,
                        help="Run with mysql")
    subparsers = parser.add_subparsers()
    init_parser = subparsers.add_parser("init", help="initialize the database")
    init_parser.set_defaults(func=init)
    init_table_parser = subparsers.add_parser("inittable",
                                              help="initialize the db table")
    init_table_parser.set_defaults(func=init_table)
    run_parser = subparsers.add_parser("run", help="run test")
    run_parser.set_defaults(func=run)
    run_parser.add_argument("--rows", help="Maximum number of rows to test",
                            type=int, default=default_db_rows)
    run_parser.add_argument("--skip-scaling", action="store_true", default=False,
                            help="Skip scaling measurements, go straight to MP")
    run_parser.add_argument("--max-processes", default=default_processes,
                            type=int,
                            help="Maximum number of processes to test")

    args = parser.parse_args()
    args.func(args)

cuongdo (Contributor) commented Aug 24, 2017

@jordanlewis can you have a look?

jethrogb changed the title from "No forward progress when updating same row concurrently" to "Queries >10s take when updating same row concurrently" Aug 24, 2017
jethrogb (Author) commented Aug 24, 2017

My original report of no forward progress appears to be incorrect; I adjusted the reporting in the test program (also edited above) and I'm now seeing at least one query per minute.

Edit: well maybe I'm seeing no forward progress after all, just with more concurrent processes.

jethrogb changed the title from "Queries >10s take when updating same row concurrently" to "Queries take >10s when updating same row concurrently" Aug 24, 2017
jethrogb (Author) commented

Tried again with 1.0.5, hoping #17385 would help, but no improvement.

jordanlewis (Member) commented

I followed your instructions with the latest alpha and couldn't reproduce what you were seeing: running the program with concurrency 60 and num_updates 10 finished in under a minute.

Same with v1.0.5, actually. Did you run the program for a long time before being able to reproduce the behavior you saw?

jethrogb (Author) commented Aug 25, 2017

I've found it very easy to reproduce on different systems. Have you tried increasing the concurrency? On a 2-core system 60 is all that's needed, but on a 4-core system I doubled it to 120. I can probably set up an AMI that exhibits this behavior; would that help? If so, send me your AWS account ID.

jordanlewis (Member) commented

Doubling the concurrency to 120 does produce some interesting behavior.

It's pretty easy to reproduce this with our kv load generator as well (https://github.com/cockroachdb/loadgen/tree/master/kv):

[0:11:2]% ./kv -concurrency 120 --cycle-length 1 -read-percent 0
_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
      1s        0           33.9           33.9    121.6    939.5   1006.6   1006.6
      2s        0           14.0           23.9   1409.3   1879.0   1879.0   1879.0
      3s        0           17.9           21.9   2080.4   2952.8   2952.8   2952.8
      4s        0           17.0           20.7   3087.0   4026.5   4026.5   4026.5
      5s        0           11.0           18.8   4160.7   4831.8   5100.3   5100.3
      6s        0           18.1           18.7   3758.1   5905.6   6174.0   6174.0
      7s        0           21.0           19.0   3758.1   6979.3   7247.8   7247.8
      8s        0           20.0           19.1   3489.7   7784.6   8053.1   8053.1
      9s        0           18.0           19.0   3758.1   8589.9   9126.8   9126.8
     10s        0           15.0           18.6   3355.4   9663.7  10200.5  10200.5
     11s        0           18.0           18.5   2952.8  10200.5  10200.5  10200.5
     12s        0           19.0           18.6   2818.6  10200.5  10200.5  10200.5
     13s        0           15.0           18.3   2684.4  10200.5  10200.5  10200.5

Latencies creep up to 10 seconds and never improve.

A CPU profile of both workloads shows that the lion's share of time is spent in beginCmds.Unlock, which bottoms out in runtime.systemstack and eventually runtime.mach_semaphore_signal. I'm not entirely sure what this means. Here's a screenshot of the top bit:

[screenshot: top of the CPU profile, dominated by beginCmds.Unlock]

This workload is so heavily contended that it's not surprising that query latency is quite bad. It's possible that there are some improvements we can make, but since CockroachDB runs in serializable mode by default, such a contended workload inevitably means that all of the operations must run serially and will encounter a lot of waiting and/or retries.
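
A common way for clients to cope with this kind of contention is to run the statement in an explicit transaction and retry on serialization failures. Here's a minimal sketch with psycopg2, assuming the testtable schema above; update_with_retry and its backoff parameters are illustrative rather than part of the test program, and the 40001 SQLSTATE check mirrors the pgcode checks the script already uses:

import time

import psycopg2

SERIALIZATION_FAILURE = "40001"  # SQLSTATE asking the client to retry

def update_with_retry(conn, lookup, blob, max_retries=10):
    # Assumes conn.autocommit is False, so execute() opens an
    # implicit transaction that commit() finishes.
    for attempt in range(max_retries):
        try:
            cur = conn.cursor()
            cur.execute("update testtable set blob = %s where lookup = %s",
                        (psycopg2.Binary(blob), lookup))
            conn.commit()
            return
        except psycopg2.Error as e:
            conn.rollback()
            if e.pgcode != SERIALIZATION_FAILURE:
                raise
            time.sleep(0.01 * (2 ** attempt))  # back off before retrying
    raise RuntimeError("update did not commit after %d retries" % max_retries)

On a 40001 error the whole transaction is rolled back and re-run, which is what the database expects clients to do under serializable contention; any other error is re-raised.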

tbg (Member) commented Aug 25, 2017 via email

benesch (Contributor) commented Aug 25, 2017

Just as another data point, we accidentally caused exactly this performance degradation during a restore. A goroutine dump showed about 60 goroutines hammering on the same row in the system.jobs table. (Worked around by limiting concurrency in #17934; that doesn't fix the root cause.)
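
For illustration, the general shape of that workaround (not the actual #17934 change, which lives in CockroachDB's own Go code) is to put a semaphore in front of the hot row so that only a bounded number of workers update it at once. A hypothetical sketch in the style of the test program above, reusing its connect and time_updates helpers:

from multiprocessing import BoundedSemaphore

# Bound on concurrent updaters of the hot row; the value is
# illustrative, not taken from the actual fix.
MAX_CONCURRENT_UPDATERS = 4
update_slots = BoundedSemaphore(MAX_CONCURRENT_UPDATERS)

def throttled_update_worker(args, server, rows, num_rows, num_updates,
                            index, output):
    # Like update_rows_worker, but at most MAX_CONCURRENT_UPDATERS
    # processes hammer the row at any one time; the rest block on the
    # semaphore, which child processes inherit across fork on POSIX.
    conn = connect(server, args)
    conn.autocommit = True
    cur = conn.cursor()
    with update_slots:
        output[index] = time_updates(cur, conn, rows, num_rows, num_updates)

This trades throughput for tail latency: fewer writers contend on the row at once, so each transaction waits less, at the cost of queueing in the client.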

knz added the C-performance (Perf of queries or internals. Solution not expected to change functional behavior.) and O-community (Originated from the community) labels and removed the O-community-questions label Apr 24, 2018
jordanlewis added the A-kv-transactions (Relating to MVCC and the transactional model.) label Apr 25, 2018
jordanlewis (Member) commented

Reassigning to @nvanbenschoten as this is likely a core performance issue, but feel free to kick it back.

nvanbenschoten (Member) commented

Yeah, this looks very similar to #20448. Hopefully the changes we make there will translate cleanly to wins on this issue. If not, we'll have to investigate further.

nvanbenschoten added this to the 2.1 milestone Jul 22, 2018
nvanbenschoten (Member) commented

I was able to reproduce exactly what @jethrogb saw on 2.0.4: --max-processes 60 takes about 7 seconds on my laptop and --max-processes 120 takes about 75 seconds. Thanks for the easy-to-use load gen!

As expected, recent changes to better handle high contention (notably #25014) have greatly improved the situation. Now on master, --max-processes 60 takes about 0.6 seconds, --max-processes 120 takes about 1.2 seconds, --max-processes 600 takes 6 seconds, and things seem to continue to scale linearly from there. These changes will all be in the upcoming 2.1 release.
