Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Destination Redshift: Fixed maximum record size for SUPER type #12940

Merged
merged 9 commits into from
May 20, 2022

Conversation

adam-bloom
Copy link
Contributor

@adam-bloom adam-bloom commented May 17, 2022

What

See #12680: now that airbyte uses SUPER instead of VARCHAR, the max data size needs to be increased.

How

Changes the constant. #12064 changes the type for all operations (not configurable), so we don't need to be clever.

https://docs.aws.amazon.com/redshift/latest/dg/r_SUPER_type.html

Also see https://docs.aws.amazon.com/redshift/latest/dg/limitations-super.html - the VARCHAR check is retained, but only used for fields within the SUPER object instead of the object overall.

Recommended reading order

N/A

🚨 User Impact 🚨

N/A

Pre-merge Checklist

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

@CLAassistant
Copy link

CLAassistant commented May 17, 2022

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added the area/connectors Connector related issues label May 17, 2022
@marcosmarxm marcosmarxm requested a review from grubberr May 17, 2022 20:39
@marcosmarxm
Copy link
Member

@adam-bloom can you sign the CLA?
@grubberr can you review this contribution?

@marcosmarxm marcosmarxm self-assigned this May 17, 2022
@alexandertsukanov
Copy link
Contributor

@adam-bloom Thanks for the contribution, good catch. Please, follow the Pre-merge checklist. Nothing critical from my side.

@adam-bloom
Copy link
Contributor Author

@adam-bloom Thanks for the contribution, good catch. Please, follow the Pre-merge checklist. Nothing critical from my side.

@alexandertsukanov what parts of the checklist specifically? I attempted to run the integration tests, but they require AWS credentials for the redshift destination tests. Are these expected to be runnable in any AWS account (and I should just use my own creds) or do these need to be run by someone on the airbyte team? Otherwise, I think it's just adding a changelog entry, correct?

Thanks for the few syntax suggestions - I'll clean those up along with adding the changelog entry.

@adam-bloom adam-bloom changed the title fix #12680: use correct max size for SUPER type Fixed maximum record size for SUPER type May 18, 2022
@adam-bloom adam-bloom changed the title Fixed maximum record size for SUPER type Destination Redshift: Fixed maximum record size for SUPER type May 18, 2022
@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label May 18, 2022
@marcosmarxm
Copy link
Member

marcosmarxm commented May 18, 2022

/test connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2347901035
❌ connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2347901035
🐛 https://gradle.com/s/i3cy5a5myk2cw

@marcosmarxm
Copy link
Member

@adam-bloom please update the connector version in the Dockerfile please

@adam-bloom
Copy link
Contributor Author

@adam-bloom please update the connector version in the Dockerfile please

@marcosmarxm done

@adam-bloom
Copy link
Contributor Author

/test connector=connectors/destination-redshift

@marcosmarxm
Copy link
Member

marcosmarxm commented May 18, 2022

/test connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2348274051
❌ connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2348274051
🐛 https://gradle.com/s/hlukrpmqtdi5k

@alexandertsukanov
Copy link
Contributor

alexandertsukanov commented May 19, 2022

Looks like a lot of tests fails with https://github.com/airbytehq/airbyte/runs/6497341731?check_suite_focus=true
2022-05-18 21:34:36 INFO i.a.w.p.a.DefaultAirbyteStreamFactory(lambda$create$0):61 - Caused by: com.amazon.support.exceptions.ErrorException: [Amazon](500310) Invalid operation: password authentication failed for user "awsuser";
Could be forked repository the root cause of that?

Also, I see failed tests here
RedshiftInsertDestinationAcceptanceTest > testSyncVeryBigRecords() FAILED org.opentest4j.AssertionFailedError: expected: <6> but was: <7> at app//org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at app//org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62) at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150) at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:145) at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:527) at
Please, take a look @adam-bloom

@adam-bloom
Copy link
Contributor Author

Yup, the issue is with the testSyncVeryBigRecords test. I took a look yesterday and thought that the issue was that we were hitting an equality condition in a <=, so added an extra character to make the "too big" record fail again. That appears to have not done the trick.

I can take another look today, but I'm not sure how to best go about this considering:

  1. I cannot run this test locally
  2. I cannot kick off github actions myself (that's restricted to airbyte team members)

Those two factors combined are going to make for very poor developer experience as a try to close the loop. I can't use a debugger, nor can I quickly iterate on identifying the issue.

Let me see if unit tests run correctly in a local environment, and I can add a unit test to cover this case. Any other suggestions on how to best work towards a solution here?

@adam-bloom
Copy link
Contributor Author

@alexandertsukanov @marcosmarxm would one of you be able to trigger a re-test? I think I was able to figure out the issue (and find another unrelated one while I was at it!)

@alexandertsukanov
Copy link
Contributor

alexandertsukanov commented May 19, 2022

/test connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2352695793
✅ connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2352695793
Python tests coverage:

Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
normalization/transform_config/__init__.py                            2      0   100%
normalization/transform_catalog/reserved_keywords.py                 13      0   100%
normalization/transform_catalog/__init__.py                           2      0   100%
normalization/destination_type.py                                    13      0   100%
normalization/__init__.py                                             4      0   100%
normalization/transform_catalog/destination_name_transformer.py     155      8    95%
normalization/transform_config/transform.py                         159     31    81%
normalization/transform_catalog/table_name_registry.py              174     34    80%
normalization/transform_catalog/utils.py                             38      9    76%
normalization/transform_catalog/dbt_macro.py                         22      7    68%
normalization/transform_catalog/catalog_processor.py                147     80    46%
normalization/transform_catalog/transform.py                         61     38    38%
normalization/transform_catalog/stream_processor.py                 543    352    35%
-------------------------------------------------------------------------------------
TOTAL                                                              1333    559    58%

@alexandertsukanov
Copy link
Contributor

@alexandertsukanov @marcosmarxm would one of you be able to trigger a re-test? I think I was able to figure out the issue (and find another unrelated one while I was at it!)

I have run again

@alexandertsukanov
Copy link
Contributor

Looks good now.
CC: @marcosmarxm, @adam-bloom

@alexandertsukanov
Copy link
Contributor

@adam-bloom could you bump the version of the connector after I will approve. Thanks.

@adam-bloom
Copy link
Contributor Author

@adam-bloom could you bump the version of the connector after I will approve. Thanks.

@alexandertsukanov where does the version still need to be bumped? The changelog and dockerfile are already updated.

@alexandertsukanov
Copy link
Contributor

@adam-bloom could you bump the version of the connector after I will approve. Thanks.

@alexandertsukanov where does the version still need to be bumped? The changelog and dockerfile are already updated.

sorry, my bad, looks good so far

@marcosmarxm
Copy link
Member

marcosmarxm commented May 20, 2022

/publish connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2358552614
❌ Failed to publish connectors/destination-redshift
❌ Couldn't auto-bump version for connectors/destination-redshift

@marcosmarxm
Copy link
Member

@adam-bloom could you bump the version of the connector after I will approve. Thanks.

I'll test the auto-bumper if it's working

@adam-bloom
Copy link
Contributor Author

Looks like something failed - it thinks 0.3.35 is already used? Is that also on another branch somewhere and this needs to go up one more?

@alexandertsukanov
Copy link
Contributor

I see next
image
Looks like version was published by some PR, but I don't understand why Dockerfile has another

@marcosmarxm
Copy link
Member

marcosmarxm commented May 20, 2022

/publish connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2358673492
🚀 Successfully published connectors/destination-redshift
🚀 Auto-bumped version for connectors/destination-redshift
✅ connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2358673492

@alexandertsukanov
Copy link
Contributor

here is the PR where the connector was published #12820 with 35 version
So, I think I can publish this one after mentioned will be merged.

Copy link
Member

@marcosmarxm marcosmarxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @adam-bloom

@marcosmarxm marcosmarxm merged commit 9e31d02 into airbytehq:master May 20, 2022
@adam-bloom adam-bloom deleted the fix-12680 branch May 20, 2022 21:49
suhomud pushed a commit that referenced this pull request May 23, 2022
* fix #12680: use correct max size for SUPER type

* address review comments

* docs: add changelog entry

* bump version

* test: make large record actually too big

* fix: use a method instead of member variable for correct override

* fix: if overall object is invalid, don't check individual elements

* feat: add unit test for isValidData

* auto-bump connector version

Co-authored-by: Octavia Squidington III <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants