fix: create schema and table on `add_sink` #1036

kgpayne · 2022-10-04T14:50:11Z

bug: Tables not created if no records arrive with SQLSink #1027
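For readers unfamiliar with the bug, here is a minimal sketch of the idea behind the fix: create the table when the sink is registered rather than when the first record arrives. `ToyTarget`, `ToySink` and `table_exists` are hypothetical stand-ins that use the standard-library `sqlite3` module. They are not the SDK's classes and only illustrate the ordering.

```python
import sqlite3


class ToySink:
    """Hypothetical stand-in for a SQL sink (not the SDK's SQLSink)."""

    def __init__(self, conn, table_name, columns):
        self.conn = conn
        self.table_name = table_name
        self.columns = columns  # mapping of column name -> SQL type

    def setup(self):
        # Create the (possibly empty) table up front.
        cols = ", ".join(f'"{name}" {sql_type}' for name, sql_type in self.columns.items())
        self.conn.execute(f'CREATE TABLE IF NOT EXISTS "{self.table_name}" ({cols})')


class ToyTarget:
    """Hypothetical stand-in for a target that registers sinks per stream."""

    def __init__(self, conn):
        self.conn = conn
        self.sinks = {}

    def add_sink(self, stream_name, columns):
        sink = ToySink(self.conn, stream_name, columns)
        sink.setup()  # eager creation: the table exists even if no records arrive
        self.sinks[stream_name] = sink
        return sink


def table_exists(conn, name):
    # sqlite_master lists every table in the SQLite database.
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None
```

With this ordering, a stream that sends a schema message but no records still leaves an empty table behind.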

📚 Documentation preview 📚: https://meltano-sdk--1036.org.readthedocs.build/en/1036/

codecov · 2022-10-04T16:55:19Z

Codecov Report

Attention: Patch coverage is 90.47619% with 4 lines in your changes missing coverage. Please review.

Project coverage is 83.39%. Comparing base (fab2cc3) to head (7139d08).
Report is 1069 commits behind head on main.

Files with missing lines	Patch %	Lines
singer_sdk/streams/sql.py	80.95%	1 Missing and 3 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1036      +/-   ##
==========================================
+ Coverage   83.13%   83.39%   +0.26%     
==========================================
  Files          39       39              
  Lines        3747     3758      +11     
  Branches      628      628              
==========================================
+ Hits         3115     3134      +19     
+ Misses        470      464       -6     
+ Partials      162      160       -2

kgpayne · 2022-10-05T17:25:04Z

I see codecov is failing. Wasn't exactly sure where best to add tests, but will look again in the morning 👍

aaronsteers · 2022-10-11T20:54:02Z

@kgpayne - Looks like we've all had a first pass at this, and I see you are making a couple iterations, so I'll mark as draft. Feel free to move back to ready status once you are ready for final review. 👍

BuzzCutNorman · 2022-10-18T17:46:11Z

@kgpayne I was wondering how a developer should set a target's default schema once this code is in place. Say, for argument's sake, I created a target-postgres and set the default schema on connect with some code like this:

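        from sqlalchemy import event

        # Runs inside a method that has just built `engine`; `self.config` is the target's config.
        @event.listens_for(engine, "connect", insert=True)
        def set_search_path(dbapi_connection, connection_record):
            # Issue SET outside a transaction, then restore the previous autocommit mode.
            existing_autocommit = dbapi_connection.autocommit
            dbapi_connection.autocommit = True
            cursor = dbapi_connection.cursor()
            cursor.execute(f"SET SESSION search_path='{self.config.get('schema') or 'public'}'")
            cursor.close()
            dbapi_connection.autocommit = existing_autocommit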

I believe that, with this PR in place, any blank table would be created in the tap's schema, not in the target's default schema from the meltano.yml.

aaronsteers · 2022-10-18T23:27:35Z

@BuzzCutNorman - Thanks for raising.

@kgpayne - There are probably several discussion points to unwind here, but what do you think of using a setting name of load_schema (aligned with internal Meltano naming) or default_target_schema (Pipelinewise convention)?

New discussion here where we can dive deep:

How should the SDK deal with settings for `load_schema` and `schema_mappings` for SQL-based targets #1084 (comment)

If we can remove the reference to config.get('schema') and/or config["schema"] from this PR and apply a calculated schema when available (and just don't set a load schema otherwise?), that would safely push the new scope to a subsequent PR.

What do you think?

kgpayne · 2022-10-19T15:25:33Z

@aaronsteers @BuzzCutNorman

> If we can remove the reference to config.get('schema') and/or config["schema"] from this PR...

We don't currently have any references to config.get('schema') in this PR or on main - that call is in BuzzCutNorman's example only. This PR already implements the behaviour you describe:

> apply a calculated schema when available (and just don't set a load schema otherwise?)

I.e. the incoming schema_name property will try to derive a schema name (by splitting the stream name on `-`) and return None if the name cannot be split 👍
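As a rough illustration of the derivation described above, the sketch below splits a stream name on `-` and returns the schema part, or `None` when there is nothing to split. The function name `derive_schema_name` and the handling of three-part names (`<database>-<schema>-<table>`) are assumptions made for this example, not a copy of the SDK's property.

```python
from typing import Optional


def derive_schema_name(stream_name: str) -> Optional[str]:
    """Hypothetical helper: derive a schema name from a Singer stream name."""
    parts = stream_name.split("-")
    # Assumed layouts: "<schema>-<table>" or "<database>-<schema>-<table>",
    # so the schema is the second-to-last part.
    if len(parts) in (2, 3):
        return parts[-2]
    # A bare table name (or an unexpected shape) yields no schema.
    return None
```

A caller that receives `None` would then fall back to whatever default schema the connection provides.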

I posted feedback on the discussion too, but the TL;DR is that we don't currently allow a schema/search path to be specified when the SQLAlchemy engine is constructed, though we could easily do so with a connect_args setting (here or in another PR). This would then provide a default schema in cases where the schema_name property returns None, without the need for a custom snippet (like the one BuzzCutNorman suggested) or overriding create_sqlalchemy_engine on the SQLConnector class 🙂 In future we could additionally support a load_schema setting to 'force' the schema to a user value irrespective of connection or derived schema names.
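A minimal sketch of what such a connect_args value could look like for PostgreSQL, assuming the psycopg2 driver, which accepts a libpq `options` string. The helper name `build_connect_args` and its `default_schema` parameter are hypothetical. The thread only proposes the setting and does not define its shape.

```python
from typing import Any, Dict, Optional


def build_connect_args(default_schema: Optional[str]) -> Dict[str, Any]:
    """Hypothetical helper: build connect_args that set a default search path."""
    if not default_schema:
        # No preference: leave the server's default search path untouched.
        return {}
    # libpq passes "-c name=value" options to the server at connection start.
    return {"options": f"-csearch_path={default_schema}"}


# Assumed usage (requires SQLAlchemy and a PostgreSQL driver, so not run here):
#   engine = sqlalchemy.create_engine(url, connect_args=build_connect_args("analytics"))
```

Because the search path is set at connection time, it would only take effect where the derived schema name is None and table names are left unqualified.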

WDYT?

aaronsteers · 2022-10-19T15:35:59Z

> We don't currently have any references to config.get('schema') in this PR or on main - that call is in BuzzCutNorman's example only.

This was my misreading then. Thanks very much for clarifying.

kgpayne · 2022-10-19T15:42:12Z

> > We don't currently have any references to config.get('schema') in this PR or on main - that call is in BuzzCutNorman's example only.
>
> This was my misreading then. Thanks very much for clarifying.

@aaronsteers I had to check, as I thought we used it somewhere too 🙂 Whilst it's common in SQL target implementations (as you point out in the discussion), it isn't (yet) in the SDK 👍

aaronsteers

@kgpayne - Looks good from my side.

@edgarrmondragon - Any other changes you'd want to see in this iteration?

edgarrmondragon

@kgpayne Looks good to me!

aaronsteers · 2022-10-19T20:31:41Z

@BuzzCutNorman re:

> @kgpayne I was wondering how a developer should set a target's default schema once this code is in place?

I think we should follow this PR up with this one:

For SQL-based targets, add built-in handling for schema_mapping #1086

That would give a standard way for users to control the target load schemas. And presumably those behaviors would be built into the SQLStream.schema_name property (or a similar built-in featureset), such that the schema_name returned there would honor the user's default schema preference and/or a preference for schema (re)mappings.
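To make the proposal concrete, here is one possible precedence order, sketched under stated assumptions: derive the schema from the stream name, apply a user remapping if one matches, and fall back to a user default when nothing can be derived. The names `resolve_schema_name`, `default_schema` and `schema_mapping`, and the precedence itself, are hypothetical. The follow-up issue, not this sketch, decides the real behaviour.

```python
from typing import Dict, Optional


def resolve_schema_name(
    stream_name: str,
    default_schema: Optional[str] = None,
    schema_mapping: Optional[Dict[str, str]] = None,
) -> Optional[str]:
    """Hypothetical resolution order: remapped > derived > user default."""
    parts = stream_name.split("-")
    # Assumed layouts: "<schema>-<table>" or "<database>-<schema>-<table>".
    derived = parts[-2] if len(parts) in (2, 3) else None
    if derived is None:
        # Nothing derivable from the stream name: use the user's default, if any.
        return default_schema
    # Apply a remapping when one exists, otherwise keep the derived name.
    return (schema_mapping or {}).get(derived, derived)
```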

start on schema and table creation on

c0917b0

kgpayne requested review from edgarrmondragon, cjohnhanson and aaronsteers as code owners October 4, 2022 14:50

kgpayne marked this pull request as draft October 4, 2022 14:52

Ken Payne added 4 commits October 4, 2022 15:54

linting

1af0a57

add default schema name

5e41500

add schema to table metadata

05ea897

Merge branch 'main' into kgpayne/issue1027

7bb0e70

kgpayne marked this pull request as ready for review October 4, 2022 16:39

Merge branch 'main' into kgpayne/issue1027

6281e7d

Ken Payne added 2 commits October 5, 2022 00:21

Merge branch 'main' into kgpayne/issue1027

19b3ccb

Merge branch 'main' into kgpayne/issue1027

6d7e156

Merge branch 'main' into kgpayne/issue1027

059301e

edgarrmondragon reviewed Oct 5, 2022

singer_sdk/streams/sql.py (outdated, resolved)

Add missing import for singer_sdk.helpers._catalog

e68c045

aaronsteers reviewed Oct 5, 2022

singer_sdk/sql/connector.py (outdated, resolved)

Ken Payne added 7 commits October 11, 2022 11:50

Merge branch 'main' into kgpayne/issue1027

0347950

undo connection module

c7abd72

fix copy-paste formatting

c59bd5e

fix test

7fd3bb1

more connector changes

615e5a6

fix docstring

4171a95

Merge branch 'main' into kgpayne/issue1027

5bf574a

aaronsteers marked this pull request as draft October 11, 2022 20:54

Ken Payne added 2 commits October 12, 2022 12:14

add schema creation test

b60ddca

Merge branch 'kgpayne/issue1027' of github.com:meltano/sdk into kgpayne/issue1027

1e28606

Merge branch 'main' into kgpayne/issue1027

b49ee49

kgpayne marked this pull request as ready for review October 12, 2022 12:34

Merge branch 'main' into kgpayne/issue1027

3e92f07

kgpayne requested review from edgarrmondragon and aaronsteers October 14, 2022 10:27

Merge branch 'main' into kgpayne/issue1027

9a94766

edgarrmondragon reviewed Oct 14, 2022

tests/core/test_sqlite.py (resolved)

singer_sdk/sinks/sql.py (resolved)

Merge branch 'main' into kgpayne/issue1027

6283762

kgpayne commented Oct 18, 2022

singer_sdk/sinks/sql.py (outdated, resolved)

Merge branch 'main' into kgpayne/issue1027

b64a7e3

Ken Payne added 2 commits October 19, 2022 16:44

remove create_table_with_records method

d33b822

Merge branch 'kgpayne/issue1027' of github.com:meltano/sdk into kgpayne/issue1027

5233657

aaronsteers reviewed Oct 19, 2022

singer_sdk/sinks/sql.py (outdated, resolved)

Ken Payne and others added 2 commits October 19, 2022 17:08

Update singer_sdk/sinks/sql.py

e3e3a30

Co-authored-by: Aaron ("AJ") Steers

Merge branch 'main' into kgpayne/issue1027

7139d08

kgpayne requested review from aaronsteers and edgarrmondragon October 19, 2022 17:19

aaronsteers approved these changes Oct 19, 2022

edgarrmondragon approved these changes Oct 19, 2022

kgpayne merged commit 2721dc5 into main Oct 19, 2022

kgpayne deleted the kgpayne/issue1027 branch October 19, 2022 22:28