Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Destination Redshift: fixed array contents verification for SUPER #13069

Merged
merged 4 commits into from
Jun 3, 2022

Conversation

adam-bloom
Copy link
Contributor

@adam-bloom adam-bloom commented May 20, 2022

What

Follow up to #12940. That solution relies on Jsons.flatten, which previously did not actually flatten arrays, but rather just converted them to strings. Redshift can store arrays within a SUPER.

How

Added a case to flatten to handle arrays. This function is only used one other location:

As far as I can tell, that case (enum/const) should never encounter an array, so I do not believe adding this handling will result in any other negative consequences. Someone with more knowledge of the codebase than me, feel free to disagree! I can always add a switch to enable this behavior.

When flatten encounters an array, it will now flatten it into a map with keys [<index>] and the corresponding values. This should allow flatten to truly flatten arbitrary data.

As an example:

{"foo": [{"blue": "apple"}]}

would have previously been flattened to: {"foo": "[{\"blue\": \"apple\"}]"}
It will now flatten to: {"foo.[0].blue": "apple"}

Recommended reading order

  1. x.java
  2. y.python

🚨 User Impact 🚨

Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

@adam-bloom adam-bloom changed the title Destination Redshift: fixed array handling for SUPER Destination Redshift: fixed array contents verification for SUPER May 20, 2022
@github-actions github-actions bot added area/connectors Connector related issues area/documentation Improvements or additions to documentation labels May 20, 2022
@adam-bloom
Copy link
Contributor Author

@marcosmarxm @alexandertsukanov FYI - not a regression in the last PR, but one more edge case that I missed.

Comment on lines +224 to +243
} else if (node.isArray()) {
final Map<String, Object> output = new HashMap<>();
final int arrayLen = node.size();
for (int i = 0; i < arrayLen; i++) {
final String field = String.format("[%d]", i);
final JsonNode value = node.get(i);
mergeMaps(output, field, flatten(value));
}
return output;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @adam-bloom but can you add a unit test for this? I'll request the team to review it becsause the change can impact more than just Redshift.

@marcosmarxm marcosmarxm self-assigned this May 23, 2022
@adam-bloom
Copy link
Contributor Author

Sure @marcosmarxm. There weren't any unit tests for flatten, so I just added two. One covering arrays and one without any arrays.

> Task :airbyte-commons:test
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended

Deprecated Gradle features were used in this build, making it incompatible with Gradle 8.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

See https://docs.gradle.org/7.4/userguide/command_line_interface.html#sec:command_line_warnings

BUILD SUCCESSFUL in 15s
16 actionable tasks: 3 executed, 13 up-to-date

Again, let me know if anyone thinks there is a potential impact to behavior elsewhere, and I can make this optional behavior in flatten.

@adam-bloom adam-bloom force-pushed the redshift/fix-super-arrays branch 2 times, most recently from 4a0c874 to 5ec2047 Compare May 25, 2022 20:50
@alexandertsukanov
Copy link
Contributor

@adam-bloom looks like some conflicts with the master, could you fix them after I will run the tests? Thanks.

__
sema-logo  Summary: 🛠️ This code needs a fix

@adam-bloom
Copy link
Contributor Author

@alexandertsukanov resolved

@alexandertsukanov
Copy link
Contributor

alexandertsukanov commented May 27, 2022

/test connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2397080295
❌ connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2397080295
🐛 https://gradle.com/s/3hjibqwkf2s2e

Build Failed

Test summary info:

Could not find result summary

@adam-bloom
Copy link
Contributor Author

@alexandertsukanov looks like the failure was in dbt/normalization. Is that test currently passing on master? Or do you think that there's a side-effect of the other use of flatten showing up here?

@adam-bloom
Copy link
Contributor Author

Hi @alexandertsukanov @marcosmarxm - do either of you have suggestions for next steps? I don't see an obvious cause/relationship for the test that failed, and I cannot run the tests locally to verify if this test behavior changed with this change. I'm hoping to get this PR closed out soon, but don't know how to move it forward right now.

@alexandertsukanov
Copy link
Contributor

@adam-bloom I see normalization tests are failing, let me ask somebody from Python team to take a look

__
sema-logo  Summary: 🛠️ This code needs a fix

@grubberr
Copy link
Contributor

grubberr commented Jun 2, 2022

@adam-bloom please merge latest master branch to your branch and re-run
/test connector=connectors/destination-redshift
again

@adam-bloom
Copy link
Contributor Author

@grubberr easy enough, done! @alexandertsukanov @marcosmarxm can one of you kick of another test run for me, thanks.

@alexandertsukanov
Copy link
Contributor

alexandertsukanov commented Jun 2, 2022

/test connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2428849153
✅ connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2428849153
Python tests coverage:

Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
normalization/transform_config/__init__.py                            2      0   100%
normalization/transform_catalog/reserved_keywords.py                 13      0   100%
normalization/transform_catalog/__init__.py                           2      0   100%
normalization/destination_type.py                                    13      0   100%
normalization/__init__.py                                             4      0   100%
normalization/transform_catalog/destination_name_transformer.py     155      8    95%
normalization/transform_config/transform.py                         159     31    81%
normalization/transform_catalog/table_name_registry.py              174     34    80%
normalization/transform_catalog/utils.py                             38      9    76%
normalization/transform_catalog/dbt_macro.py                         22      7    68%
normalization/transform_catalog/catalog_processor.py                147     80    46%
normalization/transform_catalog/transform.py                         61     38    38%
normalization/transform_catalog/stream_processor.py                 543    352    35%
-------------------------------------------------------------------------------------
TOTAL                                                              1333    559    58%

Build Passed

Test summary info:

All Passed

@marcosmarxm
Copy link
Member

marcosmarxm commented Jun 3, 2022

/publish connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2436178651
🚀 Successfully published connectors/destination-redshift
🚀 Auto-bumped version for connectors/destination-redshift
✅ connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2436178651

@marcosmarxm marcosmarxm merged commit 071a8e9 into airbytehq:master Jun 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation bounty community connectors/destination/redshift
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

6 participants