
[CT-736] [CT-735] varchar column reduces in size on an incremental run which leads to "Value too long for character type" error #5351

Closed
stnguyen opened this issue Jun 9, 2022 · 5 comments · Fixed by #5395
Assignees
Labels
bug (Something isn't working) · help_wanted (Trickier changes, with a clear starting point, good for previous/experienced contributors) · incremental (Incremental modeling with dbt) · Team:Adapters (Issues designated for the adapter area of the code)

Comments


stnguyen commented Jun 9, 2022

Describe the bug

I have an incremental model with the following properties:

materialized="incremental",
sort="date_col",
dist="date_col",
unique_key="user_id_date",
incremental_strategy="merge",
on_schema_change="sync_all_columns"

The first fresh run worked fine. One of the columns, platform (built from a case-when statement), was automatically detected by dbt as type varchar(20).

On the next incremental run, the platform column in the new portion of data contained strings of at most 17 characters. dbt therefore tried to change the column type on Redshift from varchar(20) to varchar(17), and got this error:

Value too long for character type

Steps To Reproduce

  • There's a platform column built from a case-when statement in the incremental model
  • First fresh run works fine
  • First incremental run hits the error

Expected behavior

The new column type should be wide enough to hold both new and old values.
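In other words, when both the current and proposed types are varchars, the target type should keep the larger of the two widths rather than shrink. A minimal sketch of that rule in Python (expanded_varchar_type is a hypothetical helper for illustration, not dbt's actual API):

```python
import re

VARCHAR = re.compile(r"character varying\((\d+)\)")

def expanded_varchar_type(current_type: str, new_type: str) -> str:
    """Pick a varchar type wide enough for both old and new values.

    If both types are character varying(N), keep the larger N so the
    column is never narrowed on an incremental run. Otherwise fall back
    to the proposed type unchanged.
    """
    current = VARCHAR.fullmatch(current_type)
    new = VARCHAR.fullmatch(new_type)
    if current and new:
        width = max(int(current.group(1)), int(new.group(1)))
        return f"character varying({width})"
    return new_type

# The scenario from this issue: the new data is narrower, so keep varchar(20).
print(expanded_varchar_type("character varying(20)", "character varying(17)"))
# → character varying(20)
```

With this rule, the alter in the log above would never be generated, because the computed target type equals the existing column type.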

Screenshots and log output

From dbt log file:

In "tbl":
  Schema changed: True
  Source columns not in target: []
  Target columns not in source: []
  New column types: [{'column_name': 'platform', 'new_type': 'character varying(17)'}]

alter table "tbl" add column "platform__dbt_alter" character varying(17);
update "tbl" set "platform__dbt_alter" = "platform";
alter table "tbl" drop column "platform" cascade;
alter table "tbl" rename column "platform__dbt_alter" to "platform"

15:43:07.968169 [debug] [Thread-1  ]: Postgres adapter: Postgres error: Value too long for character type
DETAIL:  
  -----------------------------------------------
  error:  Value too long for character type
  code:      8001
  context:   Value too long for type character varying(17)
  query:     13418344
  location:  string.cpp:247
  process:   query2_263_13418344 [pid=11812]
  -----------------------------------------------

System information

The output of dbt --version:

Core:
  - installed: 1.1.0
  - latest:    1.1.0 - Up to date!

Plugins:
  - bigquery:  1.1.0 - Up to date!
  - snowflake: 1.1.0 - Up to date!
  - redshift:  1.1.0 - Up to date!
  - postgres:  1.1.0 - Up to date!

The operating system you're using: MacOS 12.3.1

The output of python --version: Python 3.8.9

@stnguyen stnguyen added bug Something isn't working triage labels Jun 9, 2022
@github-actions github-actions bot changed the title varchar column reduces in size on an incremental run which leads to "Value too long for character type" error [CT-735] varchar column reduces in size on an incremental run which leads to "Value too long for character type" error Jun 9, 2022

jtcohen6 commented Jun 9, 2022

@stnguyen Thanks for opening!

It sounds like the on_schema_change logic here, or more likely the logic in diff_column_data_types here, is missing a check to the effect of: if the base type is the same and only the precision differs, don't treat that as a data type change. There are some Column methods we might be able to use to that effect.

If the precision has increased from the older data to the newer data, dbt should already be handling that during expand_target_column_types, which occurs right before schema changes are detected and processed.
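The missing check could look something like this hedged sketch (is_real_type_change is a hypothetical function for illustration, not the actual diff_column_data_types macro; dbt's Column class exposes similar helpers such as is_string):

```python
import re

VARCHAR = re.compile(r"character varying\((\d+)\)")

def is_real_type_change(source_type: str, target_type: str) -> bool:
    """Decide whether a column diff is a genuine data type change.

    If both columns are varchars, only the precision differs, so this is
    not a data type change: precision increases are already handled by
    expand_target_column_types, and precision decreases should be ignored
    (narrowing risks "Value too long for character type").
    """
    src = VARCHAR.fullmatch(source_type)
    tgt = VARCHAR.fullmatch(target_type)
    if src and tgt:
        return False
    return source_type != target_type

# The scenario from this issue: a precision-only difference is not a change.
print(is_real_type_change("character varying(17)", "character varying(20)"))
# → False
```

Filtering diffs through a check like this would keep 'platform' out of the "New column types" list in the log above, so no column-swap alter would be attempted.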

I'm going to transfer this issue to the dbt-core repo, since that's actually where the code change should happen, in the default/global project macros.

@jtcohen6 jtcohen6 removed the triage label Jun 9, 2022
@jtcohen6 jtcohen6 transferred this issue from dbt-labs/dbt-redshift Jun 9, 2022
@jtcohen6 jtcohen6 added incremental Incremental modeling with dbt Team:Adapters Issues designated for the adapter area of the code labels Jun 9, 2022
@github-actions github-actions bot changed the title [CT-735] varchar column reduces in size on an incremental run which leads to "Value too long for character type" error [CT-736] [CT-735] varchar column reduces in size on an incremental run which leads to "Value too long for character type" error Jun 9, 2022
@epapineau

I'd like to contribute to this issue if it's not already being worked on :)

@leahwicz

@epapineau I'm good with you contributing if you are up for this! @jtcohen6 is this good with you as well?

@jtcohen6

very much so! excited to have you back for round 2, @epapineau :)

@leahwicz

@epapineau I'm going to assign this to you just so we know you are working on it. If for any reason you have to drop it, just remove yourself so we can keep track. Thanks again for the help here!

@leahwicz leahwicz added help_wanted Trickier changes, with a clear starting point, good for previous/experienced contributors good_first_issue Straightforward + self-contained changes, good for new contributors! and removed good_first_issue Straightforward + self-contained changes, good for new contributors! labels Jun 14, 2022