fewer adapters will need to re-implement basic_load_csv_rows #3623

dataders · 2021-07-24T00:28:56Z

resolves #3622

Description

Implements a get_binding_char macro, which makes the SQL parameter placeholder easy to substitute. This will allow dbt-sqlserver, dbt-oracle, and other adapters to implement only get_binding_char instead of overriding the entire basic_load_csv_rows macro.
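To illustrate the idea outside of Jinja, here is a minimal Python sketch of building a seed insert statement where only the placeholder character varies. The function names (`build_insert`, `default_binding_char`, `qmark_binding_char`) are hypothetical and exist only for this example; this is not the macro code in this PR.

```python
from typing import Callable, List, Sequence, Tuple


def default_binding_char() -> str:
    # Assumed default for this sketch: the "%s" placeholder style.
    return "%s"


def qmark_binding_char() -> str:
    # Hypothetical adapter override: the "?" placeholder style used by ODBC-style drivers.
    return "?"


def build_insert(
    table: str,
    columns: Sequence[str],
    rows: Sequence[Sequence[object]],
    binding_char: Callable[[], str] = default_binding_char,
) -> Tuple[str, List[object]]:
    """Return one parameterized INSERT statement plus its flattened bindings."""
    placeholder = binding_char()
    # One "(?,?,...)" group per row, one placeholder per column.
    row_sql = "(" + ",".join([placeholder] * len(columns)) + ")"
    values_sql = ",".join([row_sql] * len(rows))
    sql = f"insert into {table} ({', '.join(columns)}) values {values_sql}"
    bindings = [value for row in rows for value in row]
    return sql, bindings
```

With this shape, an adapter would swap only the small placeholder function and reuse the rest of the loading logic unchanged.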

Questions

Are we at the point yet of obviating basic_load_csv_rows in favor of default__load_csv_rows, where get_batch_size() is another dispatched macro?
Related to the above: are dispatched macros the best way to store the param string and default_batch_size? Perhaps it's cleaner to add this as a SQLAdapterClass property?
I thought it a bit strange that three different macros are defined, followed by their default__ implementations. I followed this pattern, but it feels weird to have get_binding_char so far from default__get_binding_char. Any strong opinions?

Checklist

I have signed the CLA
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

jtcohen6

I'm all about it—let's make these macros more modular, and make the lives of adapter plugin maintainers easier.

Answers to your questions:

Agreed, it's a bit silly that the only difference between default__load_csv_rows and basic_load_csv_rows is setting the batch size to 10000. I'd be open to new dispatched macros for get_csv_batch_size() and get_csv_insert_into_sql().
I gave this a bit of thought. Our general rule is: All explicit SQL should be wrapped in Jinja macros. All implicit or non-SQL behavior of databases should be set in python. There are adapter plugins that leverage multiple connection options—e.g. dbt-spark uses pyodbc for some, not others—which is an argument in favor of adding a SQLConnectionManager property/method. This one is tricky, but I think a SQL param is SQL, sort of, so I'm happy setting it in a Jinja macro.
No strong preference here. In other places, I think we tend to define a dispatched macro, immediately followed by its default implementation, and I agree that's a bit easier to read.
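For readers less familiar with the dispatch pattern discussed in these answers, here is a rough Python analogy: look for an adapter-prefixed implementation first, then fall back to the `default__` one. The registry, the `dispatch` helper, and the `sqlserver__get_binding_char` entry are assumptions made for this sketch and do not reflect dbt's actual dispatch code.

```python
from typing import Callable, Dict

# Hypothetical registry standing in for the macros visible to a project.
MACROS: Dict[str, Callable[[], object]] = {
    "default__get_binding_char": lambda: "%s",
    "default__get_batch_size": lambda: 10000,
    # Assumed adapter-specific override, for illustration only.
    "sqlserver__get_binding_char": lambda: "?",
}


def dispatch(macro_name: str, adapter_type: str) -> Callable[[], object]:
    """Pick the adapter-specific macro if one exists, else the default one."""
    for prefix in (adapter_type, "default"):
        candidate = f"{prefix}__{macro_name}"
        if candidate in MACROS:
            return MACROS[candidate]
    raise KeyError(f"no implementation found for {macro_name}")
```

In this analogy, an adapter that only needs a different placeholder registers one small override and inherits the default batch size.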

dataders · 2021-08-01T20:05:10Z

Took another stab at refactoring. The diff is ugly (the pedant in me is toying with two separate PRs to make this better).

I decided to keep the params for load_csv_rows() at two, so adapter maintainers don't have breaking changes. However, I'm not in love with the hacky workaround of introducing get_batch_size() with a var() call. On the same note, I'm not in love with the batch_size parameter name, as it isn't immediately clear whether it means the maximum number of rows to include in each query or the maximum number of SQL params to include in each query. The truth is the former, so we recently merged dbt-msft/dbt-sqlserver#151, which auto-calculates batch size.
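The rows-versus-params distinction can be made concrete with a small sketch. Assuming a database that caps the number of bound parameters per query, the number of rows per batch has to shrink as the column count grows. The helper names and the limit used in the usage note are hypothetical; this is not the calculation from dbt-msft/dbt-sqlserver#151.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def max_rows_per_batch(num_columns: int, max_params: int, max_rows: int) -> int:
    """Largest row count per query that respects both the row and param limits."""
    if num_columns <= 0:
        raise ValueError("num_columns must be positive")
    # Each row consumes one bound parameter per column.
    rows_by_params = max_params // num_columns
    # Always send at least one row, even if a single row exceeds the param limit.
    return max(1, min(max_rows, rows_by_params))


def chunk(rows: Sequence[T], size: int) -> List[Sequence[T]]:
    """Split rows into consecutive batches of at most `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]
```

For example, with an illustrative limit of 2100 params and a 10-column seed, `max_rows_per_batch(10, 2100, 10000)` gives 210 rows per batch, well below the row-based batch size of 10000.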

dataders · 2021-08-11T00:39:51Z

@jtcohen6, I responded to your feedback, but I'm not sure why some of these tests are failing

jtcohen6

@swanderz Looking good! So sorry for the delay getting back to you on this one.

When you get a chance, could you pull the latest changes from develop? Both in terms of the changelog placement, and also because we just overhauled our CI testing suite. I'd be curious to see if the same tests are failing as the ones you saw a few weeks ago.

jtcohen6 · 2021-08-31T11:52:53Z

Going to close and reopen just to trigger additional adapter tests

jtcohen6

Thanks for this @swanderz!

…#3623)

* fewer adapters will need to re-implemnt basic_load_csv_rows
* hack version
* reordering per convention
* make redundant basic_load_csv_rows
* for next version
* Update core/dbt/include/global_project/macros/materializations/seed/seed.sql
  Co-authored-by: Jeremy Cohen <[email protected]>
* Move up changelog entry

Co-authored-by: Jeremy Cohen <[email protected]>
Co-authored-by: Jeremy Cohen <[email protected]>

dbt 0.21.0 introduced changes which dropped basic_load_csv_rows dbt-labs/dbt-core#3623. Now it's able to use the default macro default__load_csv_rows

fewer adapters will need to re-implemnt basic_load_csv_rows

849f23f

dataders temporarily deployed to Redshift July 24, 2021 00:29 Inactive

cla-bot bot added the cla:yes label Jul 24, 2021

dataders had a problem deploying to Postgres July 24, 2021 00:29 Failure

dataders temporarily deployed to Bigquery July 24, 2021 00:29 Inactive

dataders temporarily deployed to Snowflake July 24, 2021 00:29 Inactive

hack version

ff0c3bd

dataders had a problem deploying to Postgres July 26, 2021 16:23 Failure

dataders temporarily deployed to Redshift July 26, 2021 16:24 Inactive

dataders temporarily deployed to Bigquery July 26, 2021 16:24 Inactive

dataders temporarily deployed to Snowflake July 26, 2021 16:24 Inactive

jtcohen6 reviewed Jul 27, 2021

setup.py (outdated)

CHANGELOG.md (outdated)

dataders added 4 commits August 1, 2021 12:36

reordering per convention

ae4c639

make redundant basic_load_csv_rows

b019a13

for next version

6b0101e

Merge branch 'develop' of https://github.com/fishtown-analytics/dbt into get_binding_char

4577375

dataders requested a review from jtcohen6 August 1, 2021 19:59

jtcohen6 reviewed Aug 30, 2021

core/dbt/include/global_project/macros/materializations/seed/seed.sql (outdated)

dataders and others added 3 commits August 30, 2021 15:36

Update core/dbt/include/global_project/macros/materializations/seed/seed.sql

6d7ca82

Co-authored-by: Jeremy Cohen <[email protected]>

Merge branch 'develop' of https://github.com/fishtown-analytics/dbt into get_binding_char

7dc06fb

Merge branch 'get_binding_char' of https://github.com/swanderz/dbt into get_binding_char

abdf322

dataders requested a review from jtcohen6 August 30, 2021 22:52

jtcohen6 added the ok_to_test label Aug 31, 2021

jtcohen6 closed this Aug 31, 2021

jtcohen6 reopened this Aug 31, 2021

jtcohen6 and others added 2 commits August 31, 2021 14:28

Move up changelog entry

10f8c3c

Merge branch 'develop' into get_binding_char

192d811

jtcohen6 approved these changes Aug 31, 2021

jtcohen6 merged commit 464beca into dbt-labs:develop Aug 31, 2021

dataders mentioned this pull request Sep 11, 2021

implement get_binding_char dbt-msft/dbt-sqlserver#161

Closed

jtcohen6 mentioned this pull request Sep 23, 2021

Seeding big files fail on 0.21.0-rc1 #3941

Closed

5 tasks

hovaesco added a commit to starburstdata/dbt-trino that referenced this pull request Oct 5, 2021

Drop trino__load_csv_rows macro

f0dab8b

dbt 0.21.0 introduced changes which dropped basic_load_csv_rows dbt-labs/dbt-core#3623. Now it's able to use the default macro default__load_csv_rows

hovaesco mentioned this pull request Oct 5, 2021

Bump version to 0.21.0 starburstdata/dbt-trino#11

Merged

hovaesco added a commit to starburstdata/dbt-trino that referenced this pull request Oct 7, 2021

Drop trino__load_csv_rows macro

ef05a4e

This was referenced Nov 11, 2021

make calc_batch_size compatible with default__load_csv_rows dbt-msft/dbt-sqlserver#179

Closed

v0.21.0 dbt-msft/dbt-sqlserver#173

Merged

dataders changed the title ~~fewer adapters will need to re-implemnt basic_load_csv_rows~~ fewer adapters will need to re-implement basic_load_csv_rows Nov 11, 2021

jtcohen6 mentioned this pull request Nov 25, 2021

dbt Seed, throwing a error dbt-labs/dbt-snowflake#58

Closed

EminUZUN pushed a commit to EminUZUN/dbt-trino that referenced this pull request Feb 14, 2023

Drop trino__load_csv_rows macro

774da0b

damian3031 pushed a commit to damian3031/dbt-trino that referenced this pull request Sep 9, 2024

Drop trino__load_csv_rows macro

21ae59f
