cursor.copy_from table argument no longer accepting schema #1294
The change was needed to avoid the chance of SQL injection. Please use |
Ah, that makes sense. I couldn't find anything in the docs so I thought it was a bug. Thanks a lot! |
TBH, when making that change it didn't occur to me to think about passing a schema-qualified table. However, there is no way to accept them and remove the security concern at the same time: the only way would be to make the function accept a So, I sincerely apologise for the inconvenience, but I think it's for the best :) |
No need to apologize, psycopg2 is an amazing library! Thanks for all the hard work! I just made the change and everything is back to normal. |
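For readers making the same change, here is a minimal sketch of loading into a schema-qualified table with copy_expert and the psycopg2.sql module. The function name, its parameters and the CSV options are assumptions chosen for illustration; they are not taken from this thread or from the psycopg2 API.

```python
def copy_csv_into(cur, schema, table, columns, csv_file):
    """Load CSV rows from csv_file into schema.table via COPY ... FROM STDIN.

    cur is an open psycopg2 cursor and csv_file is any object with read().
    The name and signature are hypothetical, not part of psycopg2.
    """
    if not columns:
        raise ValueError("at least one column name is required")

    # Imported locally because only this function needs psycopg2.
    from psycopg2 import sql

    statement = sql.SQL(
        "COPY {} ({}) FROM STDIN WITH (FORMAT csv, HEADER)"
    ).format(
        # Schema and table are passed as two strings, so each one is quoted
        # as its own identifier: "schema"."table" (psycopg2 2.8 or later).
        sql.Identifier(schema, table),
        sql.SQL(", ").join(sql.Identifier(c) for c in columns),
    )
    cur.copy_expert(statement, csv_file)
```

A call might look like copy_csv_into(cur, "test_schema", "test_table", ["value"], open("tmp_df.csv")), followed by conn.commit(). Because every name goes through sql.Identifier, unchecked input ends up quoted rather than interpolated into the statement.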
Hi @dvarrazzo , I was able to adapt my existing function by adding this line just prior to calling "copy_from": I wonder why something like this could not be implemented within the psycopg2 library?
|
This affects the whole session. It might be a good solution for you, as well as for a program, but it's not a good solution with a driver because it has global (session-wide) repercussions. |
Thanks for the quick response!
That is a good point. The fact that I am opening and closing a new connection as part of the operation should help with that, but it makes it a much narrower solution... I suppose it might work to cache the existing search path ('SHOW search_path;') and try to restore it afterwards, but that may end up having its own complications?
Is there any further discussion you could link to, which talks about the SQL injection concern that led to this change?
I saw mention that the "copy_from" method is considered "legacy" at this point? Does that mean I should prefer to use copy_expert instead? I am using this in the context of trying to bulk upload medium-sized pandas dataframes (in the 10M's of rows), which can be held in memory (or processed by chunks).
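The "cache and restore" idea can be sketched as a context manager. This is only an illustration of the approach described in this comment, not something psycopg2 provides: the helper name is hypothetical, and it relies on PostgreSQL's current_setting() and set_config() functions rather than string-built SET statements.

```python
from contextlib import contextmanager


@contextmanager
def temporary_search_path(cur, search_path):
    """Set search_path for the duration of a with-block, then restore it.

    Hypothetical helper: cur is an open psycopg2 cursor. The setting is
    still session-wide while the block runs.
    """
    # Remember the current value so it can be put back afterwards.
    cur.execute("SELECT current_setting('search_path')")
    previous = cur.fetchone()[0]

    # The third argument (is_local = false) makes the change session-wide.
    cur.execute("SELECT set_config('search_path', %s, false)", (search_path,))
    try:
        yield
    finally:
        # Restore the saved value; this statement itself fails if the
        # transaction is already in an aborted state.
        cur.execute("SELECT set_config('search_path', %s, false)", (previous,))
```

Some of the complications are visible even in this sketch: if a statement inside the block fails, the restoring statement runs in an aborted transaction and raises too, and anything else sharing the connection sees the changed search_path while the block is active.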
|
I suppose you aren't considering multithreaded or otherwise concurrent programs. That solution is good for you: we are very happy about that, but it's just not a general solution.
No public discussion, no. But if someone exposes that function to unchecked input, it is open to SQL injection. It doesn't need further literature.
Look at the parameters supported by
This is a bug tracker for psycopg and this is a ticket about a psycopg issue. If you wish to discuss your code please write to the mailing list. |
Though this issue may not be an issue, the discussion helped me a lot to understand the reasons behind this change. |
Not to be too critical here, but this is a breaking change. Promise I'm not just here to complain, would be happy to help fix this. P.S. stack overflow google fu leads to a lot of references about putting schemas in double quotes due to casing issues and it takes a while to track down this issue |
@taylorsmithgg I know that the semver police is after us on this. It was a mistake to make this change: the original intention was to fix the column names only and the breakage was not intentional. I assume that, if you are using a shared library to do this operation, you can modify it, instead of changing all the thousands of scripts using it taking hundreds of hours, right? Using I see easy fixes to your problem: surely easier than making psycopg2 safe for all the possible use cases while keeping the track record of not letting people shoot themselves in the foot until they tick all the boxes declaring that they want to do so under their own responsibility. |
@dvarrazzo Unfortunately not. Our implementation was just built to orchestrate returning the |
There's a difference between My DB is in Azure PostgreSQL and it won't access my backend service inside a Kubernetes pod.
while, |
@anant-matelabs you must use psycopg2/psycopg/cursor_type.c Line 1407 in 3c58e96
|
I was missing the |
First just wanna say thank you for the amazing library and all your hard work <3 I found a very interesting edge case I want to place here, as I can't tell if it is a bug or not, but hopefully I can save someone else the many hours this just took me to figure out. The problem happened when I was still getting the "relation does not exist" error even when I switched to using the (My Here is my schema/table setup:

create schema if not exists test_schema;
create table if not exists test_schema.test_table (
    id bigserial not null primary key,
    value text
);

Then I used the

import pandas as pd
import psycopg2
from psycopg2 import sql

FILENAME = 'tmp_df.csv'
TABLE = 'test_schema.test_table'
CONNECTION_STRING = '<MY CONNECTION STRING>'
# Connect
conn = psycopg2.connect(CONNECTION_STRING)
cur = conn.cursor()
# Create dataframe of test data
df = pd.DataFrame(['joe', 'schmo'], columns=['value'])
df.to_csv(FILENAME, index=False)
# Build query string for copy
string = sql.SQL("""
copy {} (value)
from stdin (
format csv,
null "NULL",
delimiter ',',
header
);
""").format(sql.Identifier(TABLE))
# Open csv and copy_expert the data into table
with open(FILENAME) as csv_file:
    cur.copy_expert(string, csv_file)
# Commit
conn.commit()
# Close
cur.close()
conn.close()

This code was still giving me the "relation

# This is what I had, and it doesn't work
string = sql.SQL("""
copy {} (value)
from stdin (
format csv,
null "NULL",
delimiter ',',
header
);
""").format(sql.Identifier(TABLE))
# This is what it formats to, and it also doesn't work hardcoded like this
string = sql.SQL("""
copy "test_schema.test_table" (value)
from stdin (
format csv,
null "NULL",
delimiter ',',
header
);
""")
# If you remove the double quotes, it works fine
string = sql.SQL("""
copy test_schema.test_table (value)
from stdin (
format csv,
null "NULL",
delimiter ',',
header
);
""")

It would be great to get an explanation as to what's happening here / what the recommended method of dynamically adding table names to sql strings is? Also, if I'm a complete moron and using this package incorrectly, feel free to let me know of my stupidity. Thank you again! <3 |
Sorry, my question was answered by #1383 by this comment |
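For anyone landing here with the same symptom, the behaviour in the snippets above follows from PostgreSQL's quoting rules: a double-quoted identifier is one name, so "test_schema.test_table" is looked up as a single table whose name happens to contain a dot. The sketch below imitates that rule in plain Python purely to make the difference visible; quote_ident and qualified are hypothetical stand-ins, and real code should leave quoting to psycopg2.

```python
def quote_ident(name):
    """Quote one identifier: wrap it in double quotes, double embedded ones.

    Illustration only; with psycopg2, use sql.Identifier instead.
    """
    return '"' + name.replace('"', '""') + '"'


def qualified(*parts):
    """Quote each part separately and join the parts with dots."""
    return ".".join(quote_ident(part) for part in parts)


# One string in, one identifier out: the dot becomes part of the name.
one_name = qualified("test_schema.test_table")
# Two strings in, two identifiers out: schema and table are separate.
two_names = qualified("test_schema", "test_table")

print(one_name)   # "test_schema.test_table"
print(two_names)  # "test_schema"."test_table"

# The psycopg2 counterpart of the second form (2.8 or later) is
# sql.Identifier("test_schema", "test_table").
```

In the example above, that would mean keeping the schema and the table name in two separate variables and passing both to sql.Identifier, rather than passing the single dotted string in TABLE.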
* Update Makefile with Dev1 commands from ECR repo
* Create Dev1 build GitHub Action
* Add a PR template
* update the .gitignore to ignore .DS_Store files

This also includes a fix for a problem introduced by a newer version of the psycopg2-binary package. There was a change introduced after 2.8.6 that impacted how this app loaded data into tables in the PostGIS database. For now, instead of trying to fix the code, I just restricted the version of the psycopg2-binary to 2.8.6 or earlier. See psycopg/psycopg2#1294 and psycopg/psycopg2#1383 for more details.
copy_from does not accept 'schema.table' notation since psycopg2 2.9, so a bug arises in Luigi if the table name includes a schema. This PR modifies the behavior of the 'copy' method to use copy_expert instead of copy_from, as suggested in psycopg/psycopg2#1294. Fix spotify#3198
Hi,
In 2.8.6 and earlier, I was able to pass a schema.table argument to cursor.copy_from:
But this is no longer working on >2.9: