Seeding big files fail on 0.21.0-rc1 #3941

Closed
1 of 5 tasks
simon-robertsson opened this issue Sep 23, 2021 · 1 comment · Fixed by #3942
Labels
bug, snowflake

Comments

@simon-robertsson

simon-robertsson commented Sep 23, 2021

Describe the bug

Seeding large seed files fails in dbt 0.21.0-rc1 on Snowflake.

Steps To Reproduce

  • Run dbt seed -s s_seedfile --full-refresh --no-version-check with dbt 0.21.0-rc1 against Snowflake, where s_seedfile has more than 16,384 rows.

Expected behavior

The seed should load successfully, as it does with dbt 0.20.1. Instead, the command fails with a database error saying that the maximum number of expressions in a list was exceeded. The seed command on this file also takes roughly twice as long as with dbt 0.20.1.

Screenshots and log output

Database Error in seed s_seedfile (data/xx/s_seedfile.csv)
  001795 (42601): SQL compilation error: error line 1 at position 98
  maximum number of expressions in a list exceeded, expected at most 16,384, got 74,646

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake (selected)
  • other (specify: ____________)

The output of dbt --version:

installed version: 0.21.0-rc1
   latest version: 0.20.2

Your version of dbt is ahead of the latest release!

Plugins:
  - postgres: 0.21.0rc1
  - redshift: 0.21.0rc1
  - bigquery: 0.21.0rc1
  - snowflake: 0.21.0rc1

The operating system you're using:
Debian GNU/Linux 10 (buster) on Windows 10 x86_64
The output of python --version:

Python 3.8.2

Additional context

I double-checked by running the same command on the same file with the same setup using dbt 0.20.1, and it completed without issue.

@simon-robertsson simon-robertsson added the bug and triage labels Sep 23, 2021
@jtcohen6 jtcohen6 removed the triage label Sep 23, 2021
@jtcohen6
Contributor

@simon-robertsson Thanks for the catch, and for opening the issue!

We got our wires crossed between #3510 and #3623. The former reimplemented load_csv_rows on Snowflake to wrap the insert DML in explicit begin + commit statements. The latter subtly reworked the way that batch_size is defined, from being hard-coded in the macro to an independently called + dispatched macro.

The fix here is as simple as adding this line to snowflake__load_csv_rows:
https://github.com/dbt-labs/dbt/blob/74dda5aa19fae008569e8785b01da10ac83715d5/core/dbt/include/global_project/macros/materializations/seed/seed.sql#L78

I'll have a PR up for this shortly.
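
For anyone wondering why a missing batch size surfaces as this particular database error, here is a minimal Python sketch of the batching idea. It is not dbt's implementation (the real logic lives in the Jinja seed macros linked above). The names chunk_rows and build_insert are hypothetical, and BATCH_SIZE = 10_000 is an assumed illustrative value. Only the 16,384 limit and the 74,646 row count come from the error message above.

```python
from typing import Iterator, List, Sequence, Tuple

# Limit quoted in the Snowflake error message in this issue.
SNOWFLAKE_MAX_LIST_EXPRESSIONS = 16_384
# Assumed illustrative batch size; any value at or below the limit works here.
BATCH_SIZE = 10_000


def chunk_rows(rows: Sequence[tuple], batch_size: int) -> Iterator[Sequence[tuple]]:
    """Yield consecutive slices of `rows`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def build_insert(table: str, columns: List[str], batch: Sequence[tuple]) -> Tuple[str, list]:
    """Build one multi-row insert statement plus its flattened bindings."""
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    values = ", ".join([row_placeholder] * len(batch))
    sql = f"insert into {table} ({', '.join(columns)}) values {values}"
    bindings = [value for row in batch for value in row]
    return sql, bindings


# A seed with as many rows as the failing file reported in this issue.
rows = [(i, f"name_{i}") for i in range(74_646)]

# With batching, every statement stays under the limit.
batches = list(chunk_rows(rows, BATCH_SIZE))

# Without an effective batch size, all rows land in a single statement,
# which is the situation the error message describes.
unbatched = list(chunk_rows(rows, len(rows)))
```

With these assumptions the 74,646 rows split into eight statements, the largest holding 10,000 rows, whereas the unbatched variant produces one statement with 74,646 value rows, well over the 16,384 limit.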
