FEAT: Added load_data to sqlalchemy's backend #1981

Merged
jreback merged 1 commit into ibis-project:master from fix-psql-create-table
May 26, 2020

Conversation

xmnlab
Contributor

@xmnlab xmnlab commented Sep 25, 2019

In this PR:

  • Added load_data
  • Allow the database parameter for load/create/drop when it is the same as the current database (fixes 'PostgreSQLClient' object has no attribute 'engine' #1979)
  • Added has_attachment to AlchemyClient, useful for some operations that need to use schema when database attachment is used, for example to_sql.

ibis/sql/alchemy.py (outdated review thread, resolved)
ibis/tests/all/test_client.py (outdated review thread, resolved)
@xmnlab xmnlab changed the title ENH: Added load_data to sqlalchemy's beckend ENH: Added load_data to sqlalchemy's backend Oct 2, 2019
Contributor Author

@xmnlab xmnlab left a comment

thanks for the review @ian-r-rose

@xmnlab xmnlab changed the title ENH: Added load_data to sqlalchemy's backend FEAT: Added load_data to sqlalchemy's backend Nov 20, 2019
@xmnlab xmnlab force-pushed the fix-psql-create-table branch from efb199d to 71c7b32 Compare November 20, 2019 15:59
@xmnlab xmnlab added ddl Issues related to creating or altering data definitions feature Features or general enhancements postgres The PostgreSQL backend sqlalchemy SQLAlchemy-based backends labels Mar 6, 2020
@xmnlab xmnlab force-pushed the fix-psql-create-table branch from 83ce743 to 70c3769 Compare March 24, 2020 17:01
@xmnlab xmnlab marked this pull request as ready for review March 24, 2020 19:42
@xmnlab
Contributor Author

xmnlab commented Mar 24, 2020

this PR is ready for review. thanks!

@xmnlab
Contributor Author

xmnlab commented Mar 25, 2020

CI is green now.

@jreback jreback added this to the Next Feature Release milestone Mar 27, 2020
@@ -15,6 +15,7 @@ Release Notes
* :feature:`2060` Add initial support for ibis.random function
* :support:`2107` Added fragment_size to table creation for OmniSciDB
* :feature:`2117` Add non-nullable info to schema output
* :feature:`1981` Add load_data to sqlalchemy's backends
Contributor

actually can you add an entry for the postgres bug fix, ping on green.

@xmnlab xmnlab force-pushed the fix-psql-create-table branch from f7a2e2a to e116d4a Compare March 29, 2020 23:52
@xmnlab
Contributor Author

xmnlab commented Mar 30, 2020

@jreback CI is green. thanks!

@xmnlab
Contributor Author

xmnlab commented Apr 1, 2020

@jreback a gentle reminder about this PR :)

Contributor

@jreback jreback left a comment

pls
really really try to do one change per PR
mixing things makes reviews 10x harder and will take much longer

@@ -10,7 +10,7 @@ services:
POSTGRES_PASSWORD: ''

mysql:
- image: mariadb:10.2
+ image: mariadb:10.4.12
Contributor

do we actually need this new? iow are we now not supporting an older version?

generally we should not touch anything in the CI except for a dedicated PR

this way we don’t regress

also we want to test oldest versions that are supported and not newest

Contributor Author

Sorry, I forgot to mention this change.

MariaDB had an issue, which you can check here: MariaDB/mariadb-docker#262 (comment)

While running the tests locally I hit an issue caused by MYSQL_INITDB_SKIP_TZINFO, so I needed to change the image version.

I can open a separate PR for that if you want.

@@ -516,7 +516,7 @@ def insert(
)
return self._execute(statement)

- def load_data(self, path, overwrite=False, partition=None):
+ def load_data(self, path, overwrite=False, partition=None, **kwargs):
Contributor

why do we need kwargs? generally we want to explicitly list all parameters if possible

Contributor Author

as I am adding a test for all backends .. it simplified the test. but I can add specific arguments for each backend.
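The trade-off discussed in this thread can be illustrated with a minimal sketch. The two functions below are hypothetical stand-ins, not the ibis methods; they only show that an explicit signature rejects a misspelled argument while a `**kwargs` signature silently accepts it.

```python
def load_data_explicit(path, overwrite=False, partition=None):
    # Hypothetical stand-in with every parameter listed explicitly.
    return {'path': path, 'overwrite': overwrite, 'partition': partition}


def load_data_kwargs(path, overwrite=False, partition=None, **kwargs):
    # Hypothetical stand-in that also swallows any extra keyword argument.
    options = {'path': path, 'overwrite': overwrite, 'partition': partition}
    options.update(kwargs)
    return options


# A misspelled keyword is accepted without complaint by the **kwargs version,
# so the intended overwrite=True never takes effect.
silent = load_data_kwargs('data.csv', overwrit=True)

# The explicit version fails fast with a TypeError on the same call.
try:
    load_data_explicit('data.csv', overwrit=True)
except TypeError as exc:
    error = exc
```

A shared test for all backends could instead keep a per-backend dictionary of extra arguments and pass only the ones each backend declares.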

@@ -1099,6 +1100,10 @@ def begin(self):

@invalidates_reflection_cache
def create_table(self, name, expr=None, schema=None, database=None):
# reset database if it is the same one used by the connection
if database == self.database_name:
Contributor

why is this needed?

Contributor Author

I hit #1979 while testing the load_data method locally. Just resetting the database variable when it is the same as self.database_name fixed the issue.

Another way to fix that might be to replace self.engine.url.database with self.database_name.
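A rough sketch of the reset described above, assuming a simplified client. `SketchClient` and `_normalize_database` are hypothetical names used only for illustration; they are not the ibis implementation, and the returned string stands in for the real DDL execution.

```python
class SketchClient:
    """Hypothetical stand-in for an SQLAlchemy-based client."""

    def __init__(self, database_name):
        self.database_name = database_name

    def _normalize_database(self, database):
        # Passing the database we are already connected to is treated
        # the same as passing no database at all.
        if database == self.database_name:
            return None
        return database

    def create_table(self, name, database=None):
        database = self._normalize_database(database)
        if database is not None:
            # Assumption: cross-database creation stays unsupported here.
            raise NotImplementedError(
                'Creating a table in a different database is not '
                'yet implemented'
            )
        return 'CREATE TABLE {}'.format(name)


client = SketchClient('ibis_testing')
same_db = client.create_table('t', database='ibis_testing')
no_db = client.create_table('t')
```

With this normalization, naming the current database explicitly behaves exactly like omitting the argument, which is the behavior the PR description asks for.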


Raises
------
NotImplementedError
Contributor

huh?

Contributor Author

it is related to this:

           raise NotImplementedError(
                'Loading data to a table from a different database is not '
                'yet implemented'
            )

probably the original message is much better. I will use that for the documentation.
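One way the docstring could carry that message, as a sketch only: `load_data_sketch` and its `current_database` parameter are hypothetical, and the function just counts rows instead of writing to a database.

```python
def load_data_sketch(table_name, data, database=None,
                     current_database='ibis_testing'):
    """Load rows into a table (illustrative sketch only).

    Raises
    ------
    NotImplementedError
        Loading data to a table from a different database is not
        yet implemented
    """
    if database is not None and database != current_database:
        raise NotImplementedError(
            'Loading data to a table from a different database is not '
            'yet implemented'
        )
    # Assumption: a real backend would insert the rows here.
    return len(data)


rows_loaded = load_data_sketch('t', [(1, 'a'), (2, 'b')])
```

Reusing the exception text in the Raises section keeps the documentation and the runtime error in sync.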

Contributor Author

@xmnlab xmnlab left a comment

thanks @jreback for the review. I will apply your suggestion.

Sorry if it sometimes looks like a big PR. Sometimes the local testing I am doing touches more than one issue, so it is hard to keep the changes in separate PRs.

@xmnlab xmnlab force-pushed the fix-psql-create-table branch 2 times, most recently from ba5f2a3 to 0f22a66 Compare April 8, 2020 15:12
@xmnlab xmnlab force-pushed the fix-psql-create-table branch from 2d4619d to 66b1f77 Compare May 26, 2020 19:26
@xmnlab
Contributor Author

xmnlab commented May 26, 2020

this PR is ready again for a new review. thanks!

@jreback jreback merged commit f99b642 into ibis-project:master May 26, 2020
@jreback
Contributor

jreback commented May 26, 2020

thanks @xmnlab

@xmnlab xmnlab deleted the fix-psql-create-table branch June 3, 2020 21:02
Labels
ddl Issues related to creating or altering data definitions feature Features or general enhancements postgres The PostgreSQL backend sqlalchemy SQLAlchemy-based backends
Development

Successfully merging this pull request may close these issues.

'PostgreSQLClient' object has no attribute 'engine'
3 participants