copy: implement buffered copy that can retry #99327
Labels
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-sql-queries: SQL Queries Team
Comments
cucaroach added the C-enhancement label on Mar 23, 2023
This was referenced Mar 23, 2023
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Mar 30, 2023
This turned out to be necessary to get customers over the line with DMS in 22.2. Now that it's had time to prove itself, enable it by default. Epic: none Informs: cockroachdb#99327 Fixes: cockroachdb#99464
craig bot pushed a commit that referenced this issue on Mar 30, 2023
99847: copy: enable copy_from_retries_enabled by default r=cucaroach a=cucaroach

This turned out to be necessary to get customers over the line with DMS in 22.2. Now that it's had time to prove itself, enable it by default. Epic: none Informs: #99327 Fixes: #99464

99918: server: ignore spanconfig limit for shared-process tenants r=knz a=stevendanna

This change installs a noop spanconfig limiter when starting shared-process tenants, since shared-process tenants are currently used for "unified architecture", in which the tenant is expected to behave as closely as possible to the system tenant. When we have a tenant-side capabilities reader, we should tie this to a capability instead. This has an unfortunate side effect of making any tenant-level override for the span limit a lie for shared-process tenants. Fixes #93562. Fixes #93561. Release note: None

Co-authored-by: Tommy Reilly <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 2, 2023
Previously, any errors encountered during atomic copies would just be passed back to the client, which could be very disruptive to large migrations. Now we buffer up to sql.copy.retry.max_size bytes and, if an error is encountered, we replay the COPY from the beginning in a new transaction. Release note (copy change): Atomic COPY commands can now be retried, increasing the success rate of DMS operations, especially when atomic COPY behavior is required. Fixes: cockroachdb#99327 Epic: CRDB-25321
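The approach described in this commit message (buffer the incoming COPY bytes up to sql.copy.retry.max_size and, on a retriable error, replay the whole statement in a fresh transaction) can be sketched roughly as follows. This is a minimal Go illustration, not the actual CockroachDB code path: the RunWithRetry helper, the runCopy and isRetriable callbacks, and the 128 MiB cap are all invented for the example.

```go
package copyretry

import (
	"bytes"
	"context"
	"database/sql"
	"io"
)

// maxRetrySize plays the role of the sql.copy.retry.max_size setting from the
// commit message: only COPY streams that fit entirely in memory are replayed.
// The 128 MiB value here is illustrative, not the real default.
const maxRetrySize = 128 << 20

// RunWithRetry is a minimal sketch, assuming a caller-supplied runCopy that
// pushes the COPY payload from data into the given transaction. It buffers the
// incoming bytes as they are consumed and, if the first attempt fails with a
// retriable error, replays the whole COPY inside a new transaction.
func RunWithRetry(
	ctx context.Context,
	db *sql.DB,
	src io.Reader,
	runCopy func(ctx context.Context, tx *sql.Tx, data io.Reader) error,
	isRetriable func(error) bool,
) error {
	var buf bytes.Buffer
	// Tee the incoming bytes so they are retained for a possible replay.
	tee := io.TeeReader(src, &buf)

	attempt := func(data io.Reader) error {
		tx, err := db.BeginTx(ctx, nil)
		if err != nil {
			return err
		}
		if err := runCopy(ctx, tx, data); err != nil {
			_ = tx.Rollback()
			return err
		}
		return tx.Commit()
	}

	err := attempt(tee)
	switch {
	case err == nil:
		return nil
	case !isRetriable(err):
		return err
	case buf.Len() > maxRetrySize:
		// The stream outgrew the retry buffer; a real implementation would stop
		// retaining bytes at this point, here we simply give up on retrying.
		return err
	}
	// Replay the buffered COPY from the very beginning in a fresh transaction,
	// which is what preserves atomic COPY semantics across the retry.
	return attempt(bytes.NewReader(buf.Bytes()))
}
```

The key property from the commit message is preserved in the sketch: the replay always starts from the first byte of the COPY and runs in a brand-new transaction, so the statement remains atomic even when it is retried.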
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 5, 2023
Refactor COPY so that all the buffer reading takes place in a separate implementation of the io.Reader interface. This does two things: it enables the COPY implementation to efficiently handle small CopyData frames by eliminating extra buffering, and it exposes the COPY bytes as a pure stream of bytes, which makes retry easier. It also cleans up the COPY code that handles CopyData segments straddling line boundaries: now we can just let the text/CSV readers do their thing and not have to do any writeback. The old implementation would read from a pgwire BufferedReader (copy 1) into a pgwire "ReadBuffer" (copy 2) and then push those segments into a bytes.Buffer "buf" (copy 3). The text and binary readers would read right from buf; the CSV reader has its own buffer, so we would read lines from buf and write them into the CSV reader's buffer (copy 4). The new approach does away with all this: the text format reads directly from a bufio.Reader (copy 1) stacked on the copy.Reader (no buffering) stacked on the pgwire BufferedReader (copy 2). For CSV, the CSVReader reads directly from the copy.Reader since it has its own buffer, so again there are only two copies off the wire. Binary reads directly from the copy.Reader since it requires no ReadLine (but the I/O is still buffered at the pgwire level). This doesn't seem to affect performance much, but it gives the GC a nice break and sets up a clean solution for cockroachdb#99327. When encountering a memory usage error, we used to try to let the encoder finish the row, but with the more efficient buffering this started succeeding where it had always failed before. Now we just don't do the hail mary: if we hit the limit, we bail and return immediately, which is more OOM-safe and simpler. Fixes: cockroachdb#93156 Informs: cockroachdb#99327 Release note: none Epic: CRDB-25321
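The copy.Reader layering described above (a frame-unwrapping io.Reader stacked between the pgwire buffered reader and the format decoders) can be approximated as below. The frameReader type is a hypothetical sketch based on the Postgres wire-protocol framing for CopyData ('d') and CopyDone ('c') messages; it is not the cockroachdb implementation, and it ignores messages such as CopyFail for brevity.

```go
package copyretry

import (
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// frameReader presents the payloads of CopyData messages as one continuous
// byte stream, so the text/CSV/binary decoders stacked on top never see frame
// boundaries. Each wire message is a 1-byte type followed by a 4-byte
// big-endian length that includes the length field itself.
type frameReader struct {
	wire      io.Reader // typically the pgwire buffered reader
	remaining int       // payload bytes left in the current CopyData frame
	done      bool      // true once CopyDone has been seen
}

func (r *frameReader) Read(p []byte) (int, error) {
	// Advance to the next non-empty CopyData payload, or report EOF after
	// CopyDone.
	for r.remaining == 0 {
		if r.done {
			return 0, io.EOF
		}
		var hdr [5]byte
		if _, err := io.ReadFull(r.wire, hdr[:]); err != nil {
			return 0, err
		}
		length := int(binary.BigEndian.Uint32(hdr[1:])) - 4
		if length < 0 {
			return 0, errors.New("malformed message length")
		}
		switch hdr[0] {
		case 'd': // CopyData: its payload is the next chunk of the stream.
			r.remaining = length
		case 'c': // CopyDone: the client has finished sending.
			r.done = true
			if length != 0 {
				return 0, errors.New("unexpected payload in CopyDone")
			}
		default:
			return 0, fmt.Errorf("unexpected message type %q during COPY", hdr[0])
		}
	}
	// Hand out at most the rest of the current frame without extra copying.
	if len(p) > r.remaining {
		p = p[:r.remaining]
	}
	n, err := r.wire.Read(p)
	r.remaining -= n
	return n, err
}
```

With something like this in place, the text decoder can wrap the stream in its own bufio.Reader for ReadLine, while the CSV and binary decoders can consume it directly because they buffer internally, which matches the two-copies-off-the-wire layering the commit describes.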
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 5, 2023
Previously, any errors encountered during atomic copies would just be passed back to the client, which could be very disruptive to large migrations. Now we buffer up to sql.copy.retry.max_size bytes and, if an error is encountered, we replay the COPY from the beginning in a new transaction. Release note (copy change): Atomic COPY commands can now be retried, increasing the success rate of DMS operations, especially when atomic COPY behavior is required. Fixes: cockroachdb#99327 Epic: CRDB-25321
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 13, 2023
Refactor COPY so that all the buffer reading takes place in a separate implementation of the io.Reader interface. This does two things: it enables the COPY implementation to efficiently handle small CopyData frames by eliminating extra buffering, and it exposes the COPY bytes as a pure stream of bytes, which makes retry easier. It also cleans up the COPY code that handles CopyData segments straddling line boundaries: now we can just let the text/CSV readers do their thing and not have to do any writeback. The old implementation would read from a pgwire BufferedReader (copy 1) into a pgwire "ReadBuffer" (copy 2) and then push those segments into a bytes.Buffer "buf" (copy 3). The text and binary readers would read right from buf; the CSV reader has its own buffer, so we would read lines from buf and write them into the CSV reader's buffer (copy 4). The new approach does away with all this: the text format reads directly from a bufio.Reader (copy 1) stacked on the copy.Reader (no buffering) stacked on the pgwire BufferedReader (copy 2). For CSV, the CSVReader reads directly from the copy.Reader since it has its own buffer, so again there are only two copies off the wire. Binary reads directly from the copy.Reader since it requires no ReadLine (but the I/O is still buffered at the pgwire level). This doesn't seem to affect performance much, but it gives the GC a nice break and sets up a clean solution for cockroachdb#99327. When encountering a memory usage error, we used to try to let the encoder finish the row, but with the more efficient buffering this started succeeding where it had always failed before. Now we just don't do the hail mary: if we hit the limit, we bail and return immediately, which is more OOM-safe and simpler. Fixes: cockroachdb#93156 Informs: cockroachdb#99327 Release note: none Epic: CRDB-25321
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 13, 2023
Previously, any errors encountered during atomic copies would just be passed back to the client, which could be very disruptive to large migrations. Now we buffer up to sql.copy.retry.max_size bytes and, if an error is encountered, we replay the COPY from the beginning in a new transaction. Release note (copy change): Atomic COPY commands can now be retried, increasing the success rate of DMS operations, especially when atomic COPY behavior is required. Fixes: cockroachdb#99327 Epic: CRDB-25321
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 14, 2023
Add a 3-node config for more retry coverage and turn off backups since it's a perf test. Release note: none Epic: CRDB-25321 Informs: cockroachdb#99327
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 22, 2023
Refactor COPY so that all the buffer reading takes place in a separate implementation of the io.Reader interface. This does two things: it enables the COPY implementation to efficiently handle small CopyData frames by eliminating extra buffering, and it exposes the COPY bytes as a pure stream of bytes, which makes retry easier. It also cleans up the COPY code that handles CopyData segments straddling line boundaries: now we can just let the text/CSV readers do their thing and not have to do any writeback. The old implementation would read from a pgwire BufferedReader (copy 1) into a pgwire "ReadBuffer" (copy 2) and then push those segments into a bytes.Buffer "buf" (copy 3). The text and binary readers would read right from buf; the CSV reader has its own buffer, so we would read lines from buf and write them into the CSV reader's buffer (copy 4). The new approach does away with all this: the text format reads directly from a bufio.Reader (copy 1) stacked on the copy.Reader (no buffering) stacked on the pgwire BufferedReader (copy 2). For CSV, the CSVReader reads directly from the copy.Reader since it has its own buffer, so again there are only two copies off the wire. Binary reads directly from the copy.Reader since it requires no ReadLine (but the I/O is still buffered at the pgwire level). This doesn't seem to affect performance much, but it gives the GC a nice break and sets up a clean solution for cockroachdb#99327. When encountering a memory usage error, we used to try to let the encoder finish the row, but with the more efficient buffering this started succeeding where it had always failed before. Now we just don't do the hail mary: if we hit the limit, we bail and return immediately, which is more OOM-safe and simpler. Fixes: cockroachdb#93156 Informs: cockroachdb#99327 Release note: none Epic: CRDB-25321
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 22, 2023
Previously, any errors encountered during atomic copies would just be passed back to the client, which could be very disruptive to large migrations. Now we buffer up to sql.copy.retry.max_size bytes and, if an error is encountered, we replay the COPY from the beginning in a new transaction. Release note (sql change): Atomic COPY commands can now be retried, increasing the success rate of DMS operations, especially when atomic COPY behavior is required. Fixes: cockroachdb#99327 Epic: CRDB-25321
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 22, 2023
Add a 3-node config for more retry coverage and turn off backups since it's a perf test. Release note: none Epic: CRDB-25321 Informs: cockroachdb#99327
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 22, 2023
Add a 3-node config for more retry coverage and turn off backups since it's a perf test. Release note: none Epic: CRDB-25321 Informs: cockroachdb#99327
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 22, 2023
Refactor COPY so that all the buffer reading takes place in a separate implementation of the io.Reader interface. This does two things: it enables the COPY implementation to efficiently handle small CopyData frames by eliminating extra buffering, and it exposes the COPY bytes as a pure stream of bytes, which makes retry easier. It also cleans up the COPY code that handles CopyData segments straddling line boundaries: now we can just let the text/CSV readers do their thing and not have to do any writeback. The old implementation would read from a pgwire BufferedReader (copy 1) into a pgwire "ReadBuffer" (copy 2) and then push those segments into a bytes.Buffer "buf" (copy 3). The text and binary readers would read right from buf; the CSV reader has its own buffer, so we would read lines from buf and write them into the CSV reader's buffer (copy 4). The new approach does away with all this: the text format reads directly from a bufio.Reader (copy 1) stacked on the copy.Reader (no buffering) stacked on the pgwire BufferedReader (copy 2). For CSV, the CSVReader reads directly from the copy.Reader since it has its own buffer, so again there are only two copies off the wire. Binary reads directly from the copy.Reader since it requires no ReadLine (but the I/O is still buffered at the pgwire level). This doesn't seem to affect performance much, but it gives the GC a nice break and sets up a clean solution for cockroachdb#99327. When encountering a memory usage error, we used to try to let the encoder finish the row, but with the more efficient buffering this started succeeding where it had always failed before. Now we just don't do the hail mary: if we hit the limit, we bail and return immediately, which is more OOM-safe and simpler. Fixes: cockroachdb#93156 Informs: cockroachdb#99327 Release note: none Epic: CRDB-25321
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jun 22, 2023
Previously, any errors encountered during atomic copies would just be passed back to the client, which could be very disruptive to large migrations. Now we buffer up to sql.copy.retry.max_size bytes and, if an error is encountered, we replay the COPY from the beginning in a new transaction. Release note (sql change): Atomic COPY commands can now be retried, increasing the success rate of DMS operations, especially when atomic COPY behavior is required. Fixes: cockroachdb#99327 Epic: CRDB-25321
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jul 15, 2023
Purely a test change to get 100% line coverage in readTextData. Informs: cockroachdb#99327 Epic: None Release note: None
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jul 18, 2023
Purely a test change to get 100% line coverage in readTextData. Informs: cockroachdb#99327 Epic: None Release note: None
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jul 18, 2023
Add a 3-node config for more retry coverage and turn off backups since it's a perf test. Release note: none Epic: CRDB-25321 Informs: cockroachdb#99327
cucaroach added a commit to cucaroach/cockroach that referenced this issue on Jul 18, 2023
Refactor COPY so that all the buffer reading takes place in a separate implementation of the io.Reader interface. This does two things: it enables the COPY implementation to efficiently handle small CopyData frames by eliminating extra buffering, and it exposes the COPY bytes as a pure stream of bytes, which makes retry easier. It also cleans up the COPY code that handles CopyData segments straddling line boundaries: now we can just let the text/CSV readers do their thing and not have to do any writeback. The old implementation would read from a pgwire BufferedReader (copy 1) into a pgwire "ReadBuffer" (copy 2) and then push those segments into a bytes.Buffer "buf" (copy 3). The text and binary readers would read right from buf; the CSV reader has its own buffer, so we would read lines from buf and write them into the CSV reader's buffer (copy 4). The new approach does away with all this: the text format reads directly from a bufio.Reader (copy 1) stacked on the copy.Reader (no buffering) stacked on the pgwire BufferedReader (copy 2). For CSV, the CSVReader reads directly from the copy.Reader since it has its own buffer, so again there are only two copies off the wire. Binary reads directly from the copy.Reader since it requires no ReadLine (but the I/O is still buffered at the pgwire level). This doesn't seem to affect performance much, but it gives the GC a nice break and sets up a clean solution for cockroachdb#99327. When encountering a memory usage error, we used to try to let the encoder finish the row, but with the more efficient buffering this started succeeding where it had always failed before. Now we just don't do the hail mary: if we hit the limit, we bail and return immediately, which is more OOM-safe and simpler.
```
$ BENCHES=BenchmarkCopyCSVEndToEnd PKG=./pkg/sql/copy scripts/bench master
Executed 1 out of 1 test: 1 test passes.
name                 old time/op    new time/op    delta
CopyCSVEndToEnd-10     3.94s ± 4%     3.76s ± 4%   -4.59%  (p=0.000 n=10+9)

name                 old alloc/op   new alloc/op   delta
CopyCSVEndToEnd-10    8.92GB ± 1%    8.61GB ± 2%   -3.46%  (p=0.000 n=10+9)

name                 old allocs/op  new allocs/op  delta
CopyCSVEndToEnd-10     13.8M ± 0%     11.6M ± 0%  -16.08%  (p=0.000 n=7+8)
```
Fixes: cockroachdb#93156 Informs: cockroachdb#99327 Release note: none Epic: CRDB-25321
We currently have limited retry support that is disabled and only works for nonatomic copies. We should enable it by default in 23.1 and implement a limited entire-copy buffer so we can extend our retry support to atomic copies.
Jira issue: CRDB-25818
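One shape the "limited entire-copy buffer" asked for here could take is a tee target that retains bytes only while the total stays under the retry cap and permanently gives up on replay once it does not. The limitedReplayBuffer type below is a hedged Go sketch with invented names, not a committed design:

```go
package copyretry

import "bytes"

// limitedReplayBuffer retains the COPY bytes only while they still fit under
// the configured cap, and permanently disables replay once they do not. This
// keeps memory bounded even for COPY streams far larger than the cap.
type limitedReplayBuffer struct {
	max      int
	buf      bytes.Buffer
	overflow bool
}

func newLimitedReplayBuffer(max int) *limitedReplayBuffer {
	return &limitedReplayBuffer{max: max}
}

// Write implements io.Writer so the buffer can serve as an io.TeeReader target.
func (b *limitedReplayBuffer) Write(p []byte) (int, error) {
	if !b.overflow {
		if b.buf.Len()+len(p) > b.max {
			// Past the cap: drop what was retained and stop buffering.
			b.overflow = true
			b.buf.Reset()
		} else {
			b.buf.Write(p)
		}
	}
	// Always report the full write so the tee'd COPY stream is unaffected.
	return len(p), nil
}

// Replayable reports whether the whole COPY is still available for retry.
func (b *limitedReplayBuffer) Replayable() bool { return !b.overflow }

// Bytes returns the buffered COPY payload for replay.
func (b *limitedReplayBuffer) Bytes() []byte { return b.buf.Bytes() }
```

A buffer like this would plug into the retry loop sketched earlier in the thread: wire it in as the tee target while the first attempt runs, and consult Replayable() before attempting the replay in a new transaction.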