Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Configurable object/column name formatting options for targets #2490

Open
dluo-sig opened this issue Jun 18, 2024 · 5 comments
Open

feat: Configurable object/column name formatting options for targets #2490

dluo-sig opened this issue Jun 18, 2024 · 5 comments
Labels
kind/Feature New feature or request SQL Support for SQL taps and targets valuestream/SDK

Comments

@dluo-sig
Copy link

dluo-sig commented Jun 18, 2024

Feature scope

Targets (data type handling, batching, SQL object generation, tests, etc.)

Description

It would be helpful if there is a way to configure the column and object names that get mapped from the extractor.

  1. direct mapping - names as-is
  2. snake_case
  3. PascalCase

Nice to have: some targets also have the ability to specify column names that are outside of the normal specs. For example, SQL Server can have column names with any character when qualified with square brackets, PostgreSQL/Snowflake can do the same with double quotes, etc. This would be an additional option to use the appropriate identifiers across the board when referencing objects/fields at the target. It would not be desirable to always force this option, as for example, Snowflake becomes case-sensitive when double quotes are provided.

@dluo-sig dluo-sig added kind/Feature New feature or request valuestream/SDK labels Jun 18, 2024
@visch
Copy link
Contributor

visch commented Jun 18, 2024

Yes, good part is there's already a function for this

def conform_name( # noqa: PLR6301

I overrode this as I wanted "direct mapping" in your list to be the default not snake_case see https://github.com/MeltanoLabs/target-postgres/blob/1e59be2750961876d52b8e69cf05c5eb06cb13b4/target_postgres/sinks.py#L280-L282

There was a longer discussion about this here #1205

I think the idea of a config option to choose between them is a good one!

@dluo-sig
Copy link
Author

@edgarrmondragon had proposed using the humps library for this.

edgarrmondragon added a commit that referenced this issue Jun 21, 2024
* [`inflection` last version](https://pypi.org/project/inflection/) is from 2020/08/20
* [`pyhumps` last version](https://pypi.org/project/pyhumps/) is from 2022/10/21

Related:

* #2490 (comment)
edgarrmondragon added a commit that referenced this issue Jun 24, 2024
* [`inflection` last version](https://pypi.org/project/inflection/) is from 2020/08/20
* [`pyhumps` last version](https://pypi.org/project/pyhumps/) is from 2022/10/21

Related:

* #2490 (comment)
edgarrmondragon added a commit that referenced this issue Jun 26, 2024
* [`inflection` last version](https://pypi.org/project/inflection/) is from 2020/08/20
* [`pyhumps` last version](https://pypi.org/project/pyhumps/) is from 2022/10/21

Related:

* #2490 (comment)
@edgarrmondragon edgarrmondragon added the SQL Support for SQL taps and targets label Jul 21, 2024
@dluo-sig
Copy link
Author

dluo-sig commented Aug 8, 2024

Related request #2545 (comment)

edgarrmondragon added a commit that referenced this issue Aug 9, 2024
* [`inflection` last version](https://pypi.org/project/inflection/) is from 2020/08/20
* [`pyhumps` last version](https://pypi.org/project/pyhumps/) is from 2022/10/21

Related:

* #2490 (comment)
@lumenn
Copy link

lumenn commented Aug 28, 2024

I'm currently looking for same use case - i'd like to convert all column names to lower case (source is tap-mssql, and loader is target-postgres). In my case tables have around 100 columns, so making it manually per column doesn't seem inviting :)

@dluo-sig
Copy link
Author

dluo-sig commented Sep 9, 2024

I think what could happen is that with this new configuration set, PluginMapper can take it into account in register_raw_stream_schema and insert key pairs into self.stream_maps_dict with the new mapping and a __NULL__ for the old column. We then need to handle __key_properties__ to take into account key columns that were renamed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/Feature New feature or request SQL Support for SQL taps and targets valuestream/SDK
Projects
None yet
Development

No branches or pull requests

4 participants