Queries take >10s when updating same row concurrently #17883

Closed
jethrogb opened this issue Aug 24, 2017 · 11 comments

Labels: A-kv-transactions (Relating to MVCC and the transactional model.) · C-performance (Perf of queries or internals. Solution not expected to change functional behavior.) · O-community (Originated from the community)
Milestone: 2.1

jethrogb commented Aug 24, 2017

I have a table like this:

CREATE TABLE testtable (
    lookup BYTES PRIMARY KEY,
    name STRING,
    field1 BYTES,
    field2 BYTES,
    blob BYTES
)

I have tens of concurrent processes running this in a loop:

UPDATE testtable SET blob = 'some random value' WHERE lookup = 'one value'

This starts out relatively fast, but after a while individual queries can take tens of seconds. After killing these processes, operations on this table (even one as simple as SELECT count(*) FROM testtable) remain slow for a very long time.

I'm running a single node, but I've also observed similar behavior on clusters. Config header:

I170823 15:46:49.746002 1 util/log/clog.go:1011  [config] file created at: 2017/08/23 15:46:49
I170823 15:46:49.746002 1 util/log/clog.go:1011  [config] running on machine: jethroft
I170823 15:46:49.746002 1 util/log/clog.go:1011  [config] binary: CockroachDB CCL v1.0.4 (linux amd64, built 2017/07/27 17:54:36, go1.8.3)
I170823 15:46:49.746002 1 util/log/clog.go:1011  [config] arguments: [./cockroach start --insecure]

Test program below. Run with:

./test-cockroach.py --server localhost init
./test-cockroach.py --server localhost run --rows 1 --max-processes 60

If it doesn't get slow enough, try increasing the num_updates variable in the code or the max-processes number on the command line. With num_updates=1 it usually finishes in a couple of minutes. With num_updates=10 it can take a very long time to finish.

#!/usr/bin/python

from __future__ import print_function

import argparse
from multiprocessing import Array, Process
import psycopg2
from psycopg2.extras import LoggingConnection, LoggingCursor
import random
import six
import string
import sys
import time
import uuid

letters = list(string.ascii_letters)
dbname = "ftxtest"
table_name = "testtable"

default_db_rows = 5000
# Number of different sizes to stop and gather performance data.
num_measurements = 50

name_len = 16
min_blob_len = 128
max_blob_len = 256
num_updates = 2

# Maximum number of processes for multi-process test. (We use Python
# multiprocessing instead of threading because Python threads can't
# run concurrently very much due to the global interpreter lock).
default_processes = 10

def random_string(length=name_len):
    # Build a random string of the requested length (the original
    # version ignored its parameter and shadowed the len builtin).
    output = ""
    for i in range(length):
        output += random.choice(letters)
    return output

def random_blob_len():
    return random.randrange(min_blob_len, max_blob_len)

def random_blob():
    return random_bytearray(random_blob_len())

def random_bytes(length):
    for _ in range(length):
        yield random.getrandbits(8)

def random_bytearray(length):
    return bytearray(random_bytes(length))

def create_database(conn):
    try:
        cur = conn.cursor()
        cur.execute("create database {}".format(dbname));
        conn.commit()
    except psycopg2.ProgrammingError as e:
        if e.pgcode == "42P04":
            print("Database already exists, continuing")
        else:
            raise

def create_table(conn, mysql=False):
    try:
        cur = conn.cursor()
        if mysql:
            binary = "bytea"
            text = "varchar"
        else:
            binary = "bytes"
            text = "string"

        schema = ("lookup {binary} primary key, name {text}, field1 {binary}, " +
                  "field2 {binary}, blob {binary}").format(binary=binary, text=text)
        cur.execute("create table {} ({})".format(table_name, schema))
        conn.commit()
    except psycopg2.ProgrammingError as e:
        if e.pgcode == "42P07":
            print("Table already exists, continuing")
        else:
            raise

def create_rows(conn, num_rows):
    cur = conn.cursor()
    rows = []
    for i in range(num_rows):
        lookup = str(uuid.uuid4())
        name = random_string()
        field1 = str(uuid.uuid4())
        field2 = str(uuid.uuid4())
        blob = psycopg2.Binary(random_blob())

        cur.execute("insert into {} ".format(table_name) +
                    "(lookup, name, field1, field2, blob) VALUES " +
                    "(%s, %s, %s, %s, %s);",
                    (lookup, name, field1, field2, blob))

        rows.append(lookup)

    return rows

def first_server(args):
    return args.server.split(",")[0]

def connect(server, args):
    # To have the client log what it's doing, run with the following code:
    #conn = psycopg2.connect(dbname=dbname, host=args.server, port=args.port,
    #                        user="root", connection_factory=LoggingConnection,
    #                        cursor_factory=LoggingCursor)
    #conn.initialize(open("/tmp/connection", "w"))
    #return conn
    return psycopg2.connect(dbname=dbname, host=server, port=args.port,
                            user=args.user, password=args.password)
        
def init(args):
    create_database(connect(first_server(args), args))
    create_table(connect(first_server(args), args), mysql=args.mysql)

def init_table(args):
    create_table(connect(first_server(args), args), mysql=args.mysql)

def time_updates(cur, conn, rows, num_rows, num_updates):
    total = 0.0
    for i in range(num_updates):
        row_idx = random.randrange(num_rows)

        lookup = rows[row_idx]
        # Leave everything else the same, but update the blob.
        blob = random_blob()

        start = time.time()
        cur.execute("update {} set blob = %s where lookup = %s;".format(table_name),
                    (blob, lookup))
        end = time.time()
        print(".", end="")
        sys.stdout.flush()
        total += end - start
    return total / num_updates

def update_rows_worker(args, server, rows, num_rows, num_updates, index, output):
    conn = connect(server, args)
    conn.autocommit = True
    cur = conn.cursor()

    output[index] = time_updates(cur, conn, rows, num_rows, num_updates)

def run_multiprocess_test(args, rows, num_rows, num_processes):
    servers = args.server.split(",")
    output = Array("f", num_processes)
    processes = []
    for i in range(num_processes):
        server = servers[i % len(servers)]
        processes.append(Process(target=update_rows_worker,
                                 args=(args, server, rows, num_rows,
                                       num_updates, i, output)))
    for process in processes:
        process.start()

    for process in processes:
        process.join()

    total = 0.0
    for average in output:
        total += average

    print("{},{}".format(num_processes, total / num_processes))

        
def run(args):
    conn = connect(first_server(args), args)
    conn.autocommit = True
    cur = conn.cursor()

    cur.execute("delete from {};".format(table_name))
    
    rows = []
    num_rows = len(rows)

    increment = max(1, args.rows // num_measurements)

    while True:
        if num_rows >= args.rows:
            break
        # Add the new rows to the end of the list.
        rows.extend(create_rows(conn, increment))
        num_rows = len(rows)

        if not args.skip_scaling:
            average = time_updates(cur, conn, rows,
                                   num_rows, num_updates)

            print("{},{}".format(num_rows, average))
        else:
            print("{} rows created".format(num_rows))

    print("Multiprocess latencies:")
    for i in [args.max_processes + 1]:
        run_multiprocess_test(args, rows, num_rows, i)
        
        
if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog="python-test-cockroach.py")
    parser.add_argument("--server", required=True,
                        help="Cockroach server to connect to")
    parser.add_argument("--port", type=int, default=26257,
                        help="Port that cockroach is listening on")
    parser.add_argument("--user", default="root",
                        help="User to use to log into postgresql")
    parser.add_argument("--password", default=None,
                        help="Password to use to log into postgresql")
    parser.add_argument("--mysql", action="store_true", default=False,
                        help="Run with mysql")
    subparsers = parser.add_subparsers()
    init_parser = subparsers.add_parser("init", help="initialize the database")
    init_parser.set_defaults(func=init)
    init_table_parser = subparsers.add_parser("inittable",
                                              help="initialize the db table")
    init_table_parser.set_defaults(func=init_table)
    run_parser = subparsers.add_parser("run", help="run test")
    run_parser.set_defaults(func=run)
    run_parser.add_argument("--rows", help="Maximum number of rows to test",
                            type=int, default=default_db_rows)
    run_parser.add_argument("--skip-scaling", action="store_true", default=False,
                            help="Skip scaling measurements, go straight to MP")
    run_parser.add_argument("--max-processes", default=default_processes,
                            type=int,
                            help="Maximum number of processes to test")

    args = parser.parse_args()
    args.func(args)

cuongdo (Contributor) commented Aug 24, 2017

@jordanlewis can you have a look?

jethrogb changed the title from "No forward progress when updating same row concurrently" to "Queries >10s take when updating same row concurrently" Aug 24, 2017
jethrogb (Author) commented Aug 24, 2017

My original report of no forward progress appears to be incorrect; I adjusted the reporting in the test program (also edited above) and I'm now seeing at least one query per minute.

Edit: well maybe I'm seeing no forward progress after all, just with more concurrent processes.

jethrogb changed the title from "Queries >10s take when updating same row concurrently" to "Queries take >10s when updating same row concurrently" Aug 24, 2017
jethrogb (Author) commented

Tried again with 1.0.5, hoping #17385 would help, but no improvement.

jordanlewis (Member) commented

I followed your instructions with the latest alpha and couldn't reproduce what you were seeing: running the program with concurrency 60 and num_updates 10 finished in under a minute.

Same with v1.0.5, actually. Did you run the program for a long time before being able to reproduce the behavior you saw?

jethrogb (Author) commented Aug 25, 2017

I've found it very easy to reproduce on different systems. Have you tried increasing the concurrency? On a 2-core system 60 is all that's needed, but on a 4-core system I doubled it to 120. I can probably set up an AMI that exhibits this behavior; would that help? If so, send me your AWS account ID.

jordanlewis (Member) commented

Doubling the concurrency to 120 does produce some interesting behavior.

It's pretty easy to reproduce this with our kv load generator as well (https://github.com/cockroachdb/loadgen/tree/master/kv):

[0:11:2]% ./kv -concurrency 120 --cycle-length 1 -read-percent 0
_elapsed___errors__ops/sec(inst)___ops/sec(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
      1s        0           33.9           33.9    121.6    939.5   1006.6   1006.6
      2s        0           14.0           23.9   1409.3   1879.0   1879.0   1879.0
      3s        0           17.9           21.9   2080.4   2952.8   2952.8   2952.8
      4s        0           17.0           20.7   3087.0   4026.5   4026.5   4026.5
      5s        0           11.0           18.8   4160.7   4831.8   5100.3   5100.3
      6s        0           18.1           18.7   3758.1   5905.6   6174.0   6174.0
      7s        0           21.0           19.0   3758.1   6979.3   7247.8   7247.8
      8s        0           20.0           19.1   3489.7   7784.6   8053.1   8053.1
      9s        0           18.0           19.0   3758.1   8589.9   9126.8   9126.8
     10s        0           15.0           18.6   3355.4   9663.7  10200.5  10200.5
     11s        0           18.0           18.5   2952.8  10200.5  10200.5  10200.5
     12s        0           19.0           18.6   2818.6  10200.5  10200.5  10200.5
     13s        0           15.0           18.3   2684.4  10200.5  10200.5  10200.5

Latencies creep up to 10 seconds and never improve.

A CPU profile of both workloads shows that the lion's share of time is spent in beginCmds.Unlock, which bottoms out in runtime.systemstack and eventually runtime.mach_semaphore_signal. I'm not entirely sure what this means. Here's a screenshot of the top bit:

[screenshot: top of the CPU profile, dominated by beginCmds.Unlock]

This workload is so heavily contended that it's not surprising that query latency is quite bad. It's possible that there are some improvements we can make, but since CockroachDB runs in serializable mode by default, such a contended workload inevitably means that all of the operations must run serially and will encounter a lot of waiting and/or retries.
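
A common way for clients to cope with this kind of contention is to run the statement in an explicit transaction and retry on serialization failures. Here's a minimal sketch with psycopg2, assuming the testtable schema above; update_with_retry and its backoff parameters are illustrative rather than part of the test program, and the 40001 SQLSTATE check mirrors the pgcode checks the script already uses:

import time

import psycopg2

SERIALIZATION_FAILURE = "40001"  # SQLSTATE asking the client to retry

def update_with_retry(conn, lookup, blob, max_retries=10):
    # Assumes conn.autocommit is False, so execute() opens an
    # implicit transaction that commit() finishes.
    for attempt in range(max_retries):
        try:
            cur = conn.cursor()
            cur.execute("update testtable set blob = %s where lookup = %s",
                        (psycopg2.Binary(blob), lookup))
            conn.commit()
            return
        except psycopg2.Error as e:
            conn.rollback()
            if e.pgcode != SERIALIZATION_FAILURE:
                raise
            time.sleep(0.01 * (2 ** attempt))  # back off before retrying
    raise RuntimeError("update did not commit after %d retries" % max_retries)

On a 40001 error the whole transaction is rolled back and re-run, which is what the database expects clients to do under serializable contention; any other error is re-raised.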

tbg (Member) commented Aug 25, 2017 via email

benesch (Contributor) commented Aug 25, 2017

Just as another data point, we accidentally caused exactly this performance degradation during a restore. A goroutine dump showed about 60 goroutines hammering on the same row in the system.jobs table. (Worked around by limiting concurrency in #17934; that doesn't fix the root cause.)
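
For illustration, the general shape of that workaround (not the actual #17934 change, which lives in CockroachDB's own Go code) is to put a semaphore in front of the hot row so that only a bounded number of workers update it at once. A hypothetical sketch in the style of the test program above, reusing its connect and time_updates helpers:

from multiprocessing import BoundedSemaphore

# Bound on concurrent updaters of the hot row; the value is
# illustrative, not taken from the actual fix.
MAX_CONCURRENT_UPDATERS = 4
update_slots = BoundedSemaphore(MAX_CONCURRENT_UPDATERS)

def throttled_update_worker(args, server, rows, num_rows, num_updates,
                            index, output):
    # Like update_rows_worker, but at most MAX_CONCURRENT_UPDATERS
    # processes hammer the row at any one time; the rest block on the
    # semaphore, which child processes inherit across fork on POSIX.
    conn = connect(server, args)
    conn.autocommit = True
    cur = conn.cursor()
    with update_slots:
        output[index] = time_updates(cur, conn, rows, num_rows, num_updates)

This trades throughput for tail latency: fewer writers contend on the row at once, so each transaction waits less, at the cost of queueing in the client.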

knz added the C-performance (Perf of queries or internals. Solution not expected to change functional behavior.) and O-community (Originated from the community) labels and removed the O-community-questions label Apr 24, 2018
jordanlewis added the A-kv-transactions (Relating to MVCC and the transactional model.) label Apr 25, 2018
jordanlewis (Member) commented

Reassigning to @nvanbenschoten as this is likely a core performance issue, but feel free to kick it back.

nvanbenschoten (Member) commented

Yeah, this looks very similar to #20448. Hopefully the changes we make there will translate cleanly to wins on this issue. If not, we'll have to investigate further.

nvanbenschoten added this to the 2.1 milestone Jul 22, 2018
nvanbenschoten (Member) commented

I was able to reproduce exactly what @jethrogb saw on 2.0.4: --max-processes 60 takes about 7 seconds on my laptop and --max-processes 120 takes about 75 seconds. Thanks for the easy-to-use load gen!

As expected, recent changes to better handle high contention (notably #25014) have greatly improved the situation. Now on master, --max-processes 60 takes about 0.6 seconds, --max-processes 120 takes about 1.2 seconds, --max-processes 600 takes 6 seconds, and things seem to continue to scale linearly from there. These changes will all be in the upcoming 2.1 release.
