
let dbreadtable use copy #9

Open
parisni opened this issue Jun 22, 2021 · 4 comments

Comments

parisni commented Jun 22, 2021

dbReadTable() could use COPY to export the table to CSV and then read the CSV into a data frame.

This needs a temporary folder, but it would improve performance in many cases.

Also, several R readers could be passed, such as data.table::fread(), to get different return objects: a data.frame, a data.table, or whatever else. A rough sketch of the idea follows.
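
A minimal sketch of what this could look like, assuming a psql client on the PATH (connection details taken from the usual PG* environment variables) and a hypothetical read_table_via_copy() helper; the reader is pluggable:

```r
# Sketch only: export with client-side \copy via psql, then read the CSV.
# Assumes a Unix-like shell and that `psql` can connect via PG* env vars.
read_table_via_copy <- function(table, reader = data.table::fread, ...) {
  csv <- tempfile(fileext = ".csv")
  on.exit(unlink(csv), add = TRUE)

  copy_cmd <- sprintf(
    "\\copy (SELECT * FROM %s) TO '%s' WITH (FORMAT csv, HEADER)",
    table, csv
  )
  status <- system2("psql", c("--no-psqlrc", "-c", shQuote(copy_cmd)))
  if (status != 0) stop("psql \\copy failed")

  # any CSV reader can be plugged in: data.table::fread, readr::read_csv, ...
  reader(csv, ...)
}

# Usage (hypothetical helper, placeholder table name):
# df <- read_table_via_copy("my_schema.my_table")
```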

krlmlr (Member) commented Sep 11, 2021

Thanks. In what situation would COPY be faster? Would you like to share an example and timings?

parisni (Author) commented Sep 11, 2021

Well, in my experience COPY is more effective in almost every case. You can notice the performance improvement on large enough tables (a million rows or more). Another advantage is that it has a lower impact on the Postgres database than the classic method, since it needs no cursor.

A simple way to prove this is to benchmark several table sizes, comparing (see the sketch below):

  • a simple bash script with psql + \copy (stmt) to 'path/to/csv', followed by an R script with a CSV reader
  • the way your package currently does it
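
For instance, a rough timing comparison along those lines (connection details and table name are placeholders):

```r
library(DBI)

con <- dbConnect(RPostgres::Postgres(), dbname = "testdb")  # placeholder connection

# 1) the package's current path
t_dbi <- system.time(df1 <- dbReadTable(con, "big_table"))

# 2) client-side \copy to a temporary CSV via psql, then fread
t_copy <- system.time({
  csv <- tempfile(fileext = ".csv")
  system2("psql", c("-X", "-d", "testdb", "-c",
    shQuote(sprintf("\\copy big_table TO '%s' WITH (FORMAT csv, HEADER)", csv))))
  df2 <- data.table::fread(csv)
  unlink(csv)
})

rbind(dbReadTable = t_dbi, copy_fread = t_copy)
```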

COPY also has a drawback: it loses column types in the process, so you have to fetch this information from the database and inject it into the CSV reader afterwards. Moreover, you cannot deal with bytea fields, nor with arrays (depending on the CSV reader's capabilities).
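
As an illustration of the type issue, the declared types could be fetched from the catalog and mapped to reader column classes before parsing the CSV. This is a rough, incomplete mapping (reusing the con and csv objects from the sketches above), not the package's actual behaviour:

```r
library(DBI)

# declared column types for the table being read (placeholder table name)
pg_types <- dbGetQuery(con, "
  SELECT column_name, data_type
  FROM information_schema.columns
  WHERE table_name = 'big_table'
  ORDER BY ordinal_position
")

# very rough Postgres -> R mapping; bytea and array types are left out,
# which is exactly the limitation mentioned above
to_r_class <- function(pg) {
  switch(pg,
    "smallint" = , "integer" = "integer",
    "bigint" = , "numeric" = , "real" = , "double precision" = "numeric",
    "boolean" = "logical",
    "date" = "Date",
    "character"  # default: read everything else as text
  )
}

col_classes <- setNames(
  vapply(pg_types$data_type, to_r_class, character(1)),
  pg_types$column_name
)

df <- data.table::fread(csv, colClasses = col_classes)
```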

For information, the psycopg3 Python library is a full rewrite of psycopg2 focused on COPY for any interaction with Postgres: https://www.psycopg.org/psycopg3/

krlmlr (Member) commented Sep 13, 2021

Good point about column types. I think we should stick with the current approach for dbReadTable(). Unfortunately, COPY table_name TO STDOUT currently hangs R; we could mitigate this with a new postgresCopyTo() function.
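
Purely to illustrate that proposal, such a helper might be used along these lines; postgresCopyTo() does not exist, and the signature shown here is made up:

```r
library(DBI)
con <- dbConnect(RPostgres::Postgres(), dbname = "testdb")  # placeholder connection

# hypothetical: stream the result of COPY ... TO STDOUT into a local file,
# bypassing dbReadTable()'s row-by-row fetch
postgresCopyTo(con, "COPY big_table TO STDOUT WITH (FORMAT csv, HEADER)",
               path = "/tmp/big_table.csv")

df <- data.table::fread("/tmp/big_table.csv")
```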

krlmlr (Member) commented Oct 29, 2021

An integration with Arrow might bring us much more bang for the buck, including type safety.

krlmlr closed this as completed Oct 29, 2021
krlmlr transferred this issue from r-dbi/RPostgres Oct 31, 2021
krlmlr reopened this Oct 31, 2021