copy: optimize I/O for small data segments #93156
Labels: C-bug (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.), T-sql-queries (SQL Queries Team)

Comments
cucaroach added the C-bug label on Dec 6, 2022
@cucaroach are you still planning to do this for v23.1?
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 2, 2023
Refactor COPY so that all the buffer reading takes place in a separate implementation of the io.Reader interface. This does two things: it enables the COPY implementation to efficiently handle small CopyData frames by eliminating extra buffering, and it exposes the COPY bytes as a pure stream of bytes, which makes retry easier. It also cleans up the COPY code that handles CopyData segments straddling line boundaries; now we can just let the CSV reader do its thing.

The old implementation would read from a pgwire BufferedReader (copy 1) into a pgwire "ReadBuffer" (copy 2) and then push those segments into a bytes.Buffer "buf" (copy 3). The text and binary readers would read right from buf; the CSV reader has its own buffer, so we would read lines from buf and write them into the CSV reader's buffer (copy 4). The new approach does away with all this: the text and binary formats read directly from a bufio.Reader (copy 1) stacked on the copy.Reader (no buffering) stacked on the pgwire BufferedReader (copy 2). For CSV, the CSVReader reads directly from the copy.Reader since it has its own buffer, so again only two copies off the wire. This doesn't seem to affect performance much, but it gives the GC a nice break and sets up a clean solution for cockroachdb#99327.

When encountering a memory usage error we used to try to let the encoder finish the row, but with the more efficient buffering this started succeeding where it always failed before. Now we just don't do the hail mary: if we hit the limit we bail and return immediately, which is more OOM-safe and simpler.

Fixes: cockroachdb#93156
Informs: cockroachdb#99327
Release note: none
Epic: CRDB-25321
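As a rough illustration of the reader stacking described in this commit message, here is a minimal Go sketch. It is not the actual CockroachDB code; the names copyReader and newTextReader are hypothetical, and the real copy.Reader also strips pgwire frame headers rather than passing bytes straight through.

```go
package copyexample

import (
	"bufio"
	"io"
)

// copyReader is a hypothetical sketch of the "copy.Reader" idea described
// above: it implements io.Reader over the pgwire connection's buffered
// reader and hands back raw COPY payload bytes without any buffering of
// its own.
type copyReader struct {
	pgwire io.Reader // the pgwire BufferedReader (copy 2 in the text above)
}

func (r *copyReader) Read(p []byte) (int, error) {
	// In the real implementation this would also strip CopyData frame
	// headers; here we just pass the bytes through.
	return r.pgwire.Read(p)
}

// newTextReader stacks a bufio.Reader (copy 1) on top of the unbuffered
// copyReader, as the text/binary formats do in the description above. A CSV
// reader would instead consume the copyReader directly, since it buffers
// internally.
func newTextReader(pgwire io.Reader) *bufio.Reader {
	return bufio.NewReaderSize(&copyReader{pgwire: pgwire}, 64<<10)
}
```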
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 5, 2023
Refactor COPY so that all the buffer reading takes place in a separate implementation of the io.Reader interface. This does two things: it enables the COPY implementation to efficiently handle small CopyData frames by eliminating extra buffering, and it exposes the COPY bytes as a pure stream of bytes, which makes retry easier. It also cleans up the COPY code that handles CopyData segments straddling line boundaries; now we can just let the text/CSV readers do their thing and not have to do any writeback.

The old implementation would read from a pgwire BufferedReader (copy 1) into a pgwire "ReadBuffer" (copy 2) and then push those segments into a bytes.Buffer "buf" (copy 3). The text and binary readers would read right from buf; the CSV reader has its own buffer, so we would read lines from buf and write them into the CSV reader's buffer (copy 4). The new approach does away with all this: the text format reads directly from a bufio.Reader (copy 1) stacked on the copy.Reader (no buffering) stacked on the pgwire BufferedReader (copy 2). For CSV, the CSVReader reads directly from the copy.Reader since it has its own buffer, so again only two copies off the wire. Binary reads directly from the copy.Reader since it requires no ReadLine (but the I/O is still buffered at the pgwire level). This doesn't seem to affect performance much, but it gives the GC a nice break and sets up a clean solution for cockroachdb#99327.

When encountering a memory usage error we used to try to let the encoder finish the row, but with the more efficient buffering this started succeeding where it always failed before. Now we just don't do the hail mary: if we hit the limit we bail and return immediately, which is more OOM-safe and simpler.

Fixes: cockroachdb#93156
Informs: cockroachdb#99327
Release note: none
Epic: CRDB-25321
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 13, 2023
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 22, 2023
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 22, 2023
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jul 18, 2023
Refactor COPY so that all the buffer reading takes place in a separate implementation of the io.Reader interface. This does two things: it enables the COPY implementation to efficiently handle small CopyData frames by eliminating extra buffering, and it exposes the COPY bytes as a pure stream of bytes, which makes retry easier. It also cleans up the COPY code that handles CopyData segments straddling line boundaries; now we can just let the text/CSV readers do their thing and not have to do any writeback.

The old implementation would read from a pgwire BufferedReader (copy 1) into a pgwire "ReadBuffer" (copy 2) and then push those segments into a bytes.Buffer "buf" (copy 3). The text and binary readers would read right from buf; the CSV reader has its own buffer, so we would read lines from buf and write them into the CSV reader's buffer (copy 4). The new approach does away with all this: the text format reads directly from a bufio.Reader (copy 1) stacked on the copy.Reader (no buffering) stacked on the pgwire BufferedReader (copy 2). For CSV, the CSVReader reads directly from the copy.Reader since it has its own buffer, so again only two copies off the wire. Binary reads directly from the copy.Reader since it requires no ReadLine (but the I/O is still buffered at the pgwire level). This doesn't seem to affect performance much, but it gives the GC a nice break and sets up a clean solution for cockroachdb#99327.

When encountering a memory usage error we used to try to let the encoder finish the row, but with the more efficient buffering this started succeeding where it always failed before. Now we just don't do the hail mary: if we hit the limit we bail and return immediately, which is more OOM-safe and simpler.

```
$ BENCHES=BenchmarkCopyCSVEndToEnd PKG=./pkg/sql/copy scripts/bench master
Executed 1 out of 1 test: 1 test passes.
name                 old time/op    new time/op    delta
CopyCSVEndToEnd-10     3.94s ± 4%     3.76s ± 4%    -4.59%  (p=0.000 n=10+9)

name                 old alloc/op   new alloc/op   delta
CopyCSVEndToEnd-10    8.92GB ± 1%    8.61GB ± 2%    -3.46%  (p=0.000 n=10+9)

name                 old allocs/op  new allocs/op  delta
CopyCSVEndToEnd-10     13.8M ± 0%     11.6M ± 0%   -16.08%  (p=0.000 n=7+8)
```

Fixes: cockroachdb#93156
Informs: cockroachdb#99327
Release note: none
Epic: CRDB-25321
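The "bail and return immediately" behavior mentioned above could look roughly like the following sketch. The function ingestRows and its callback parameters are hypothetical, not CockroachDB's actual APIs.

```go
package copyexample

// ingestRows is a hypothetical sketch of the "bail on memory limit" policy
// described above: as soon as growing the memory account fails, return the
// error immediately instead of trying to finish encoding the current row.
func ingestRows(rows [][]byte, grow func(n int64) error, push func(row []byte) error) error {
	for _, row := range rows {
		if err := grow(int64(len(row))); err != nil {
			// No "hail mary" attempt to finish the row: propagate the
			// budget error right away, which is simpler and safer with
			// respect to OOMs.
			return err
		}
		if err := push(row); err != nil {
			return err
		}
	}
	return nil
}
```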
@cucaroach I'm going to backlog this because I think it's not a top priority, but please correct me if I'm wrong.
There's a PR out to fix it, #104150. Not sure why it's not linked to this issue. Oh wait, yeah, it says so at the top. This should probably be in 24.1.
Let's try to finish this up and merge the linked PR.
The pg protocol allows any sizing for the CopyData messages that make up a COPY. psql sends many small messages (one per line), while pgconn does something smarter and sends the data in 64 KB segments. Because of some inefficient buffering and some extraneous data copies ([]byte->string->[]byte conversions), psql is slower. We should optimize COPY for the small-buffer case and arrange for data to be copied straight from the underlying pgwire conn.rd (bufio.Reader) to the COPY machine's buffer. The pgwire conn.rd buffer defaults to the Go default of 4 KB; we should probably bump that up to something beefier like 64 KB.
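For illustration, a reader of the kind proposed here might strip CopyData frame headers and expose the payload as one continuous byte stream, roughly as in the sketch below. The names are hypothetical and handling of CopyFail and other protocol messages is omitted.

```go
package copyexample

import (
	"encoding/binary"
	"errors"
	"io"
)

// copyDataReader is a hypothetical sketch of exposing a COPY payload as a
// plain byte stream: it reads pgwire CopyData ('d') messages from the
// connection's buffered reader and returns only their payload bytes, so the
// consumer never sees frame boundaries.
type copyDataReader struct {
	conn      io.Reader // e.g. the pgwire connection's bufio.Reader
	remaining int       // payload bytes left in the current CopyData frame
}

func (r *copyDataReader) Read(p []byte) (int, error) {
	if r.remaining == 0 {
		var hdr [5]byte // 1-byte message type + 4-byte length (length includes itself)
		if _, err := io.ReadFull(r.conn, hdr[:]); err != nil {
			return 0, err
		}
		switch hdr[0] {
		case 'd': // CopyData
			r.remaining = int(binary.BigEndian.Uint32(hdr[1:])) - 4
		case 'c': // CopyDone
			return 0, io.EOF
		default:
			return 0, errors.New("unexpected message during COPY")
		}
	}
	if len(p) > r.remaining {
		p = p[:r.remaining]
	}
	n, err := r.conn.Read(p)
	r.remaining -= n
	return n, err
}
```

With a larger bufio buffer on conn.rd, many tiny CopyData frames can be served from a single network read, which is the effect the proposed 64 KB bump is after.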
Another data point is that apparently the AWS DMS service uses the tiny buffer approach:
If I change the pgconn buffer from 64 KB to 50 bytes I get this performance difference: 3.92 MB/s vs 2.56 MB/s, so we're definitely losing performance when dealing with little messages.
Jira issue: CRDB-22191