Problems with schemas under Redshift #18
Comments
Just to be sure, can you try to run the tests against your redshift instance? I want to see if something is different compared to the one I used for testing. You will need gradle 1.12 and you can run the redshift tests like so:
Thanks!
The unit test gives me lots of errors like this one:
Oops, pressed the "Close and comment" button by accident. I've reopened it.
I am unable to reproduce this. I started a new Redshift cluster and ran the tests without any problems. Log is here: https://gist.github.com/fs111/a953d7bd0c83aa85bece Something must be different about your Redshift cluster, but I can't tell what.
Well, the hypothesis that leads me to is that your freshly created Redshift cluster has different settings than mine, in such a way that an unqualified table reference ( I will have a look at Redshift documentation to see which settings might affect that, but I think it's fair to say that the following items are problems:
And note that these issues may well affect all the databases, not just Redshift. So I think this is morphing into a more general "support specifying a database schema in
All that being said, we are more than happy to take pull requests. I think extending it is a good idea, but we have a bit of a bandwidth problem. If you would like to give it a shot, let me know.
I'm still running into issues using JDBC with Redshift when trying to specify a schema before a table. It results in "schema does not exist". Any luck on a workaround?
@Sicarus This is still an open issue and requires internal changes in cascading-jdbc. I currently do not have the bandwidth to work on that, but if you are willing to give it a try, I can guide you through.
I'm using `cascading-jdbc-redshift:2.5.4-wip-83`, and I can't get it to sink into my Redshift instance at all.

The table I'm trying to sink into is called `etl_demo`. If I just use that as the table name for the `RedshiftTableDesc`, and run the job, I get this error:

So if I'm reading this right, Redshift doesn't like the CREATE TABLE statement because the user doesn't specify a schema to create the table in. If I try `"public.etl_demo"` as the table name for the `RedshiftTableDesc`, then I get this:

Here the CREATE TABLE succeeds, but apparently the `DatabaseMetaData` check fails because it looks for a table with name `public.etl_demo`, and no such table exists.

And if I create the table by hand in Redshift, and use `"etl_demo"` in the `RedshiftTableDesc`, I get this (`SinkMode.REPLACE`):

My take here is that it can't drop the table because it doesn't know which schema to look in, once more.

If I create the table by hand and use `SinkMode.UPDATE` instead, then I get past that, only to fail later on when it tries an INSERT statement into the unqualified table name:

(Note that it's trying to INSERT into the Redshift table instead of staging into S3 and then using a `COPY` command, as it's supposed to. I don't know if this is a problem on my end.)

I think it may be possible to work around all of these issues by configuring the Redshift server to add a default `search_path`, but I don't think this should be necessary to use the tap.

Looking at commit 37364da, perhaps the `TableDesc` should allow you to specify a schema name in addition to the table name? The `DatabaseMetaData` call is using `null` for the schema name, so the following logic (from `JDBCTap.java`) doesn't work if a database has more than one table with the same name in different schemas:
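To make the suggestion concrete, here is a rough sketch of what a schema-aware existence check could look like. This is not the code from `JDBCTap.java`; `QualifiedTableName`, `parse`, and `existsIn` are hypothetical names I made up for illustration. The only real API it relies on is the standard `java.sql.DatabaseMetaData.getTables(catalog, schemaPattern, tableNamePattern, types)`, and it assumes a simple `schema.table` name without quoted identifiers.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper for illustration only; not part of cascading-jdbc.
final class QualifiedTableName {
    final String schema; // null when the name carries no schema prefix
    final String table;

    QualifiedTableName(String schema, String table) {
        this.schema = schema;
        this.table = table;
    }

    // Splits "public.etl_demo" into schema "public" and table "etl_demo".
    // An unqualified name such as "etl_demo" yields a null schema.
    // Assumption: no quoted identifiers that themselves contain dots.
    static QualifiedTableName parse(String name) {
        int dot = name.indexOf('.');
        if (dot < 0) {
            return new QualifiedTableName(null, name);
        }
        return new QualifiedTableName(name.substring(0, dot), name.substring(dot + 1));
    }

    // Passes the schema through to the metadata lookup. A null schema keeps
    // the current behaviour: the search is not narrowed by schema at all.
    // Caveat: both arguments are LIKE-style patterns, so "_" and "%" in a
    // name act as wildcards unless escaped.
    boolean existsIn(DatabaseMetaData metaData) throws SQLException {
        try (ResultSet tables = metaData.getTables(null, schema, table, null)) {
            return tables.next();
        }
    }

    public static void main(String[] args) {
        QualifiedTableName name = QualifiedTableName.parse("public.etl_demo");
        System.out.println(name.schema + " / " + name.table); // prints "public / etl_demo"
    }
}
```

With something along these lines, the table description could keep the bare table name for the metadata check while still emitting the qualified `public.etl_demo` in the CREATE TABLE, DROP TABLE, and INSERT statements.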