bug: Record flattening drops typeless properties from schema #1886

Open
1 task
andyoneal opened this issue Jul 29, 2023 · 3 comments
Labels
kind/Bug Something isn't working valuestream/SDK

Comments

@andyoneal
Contributor

Singer SDK Version

0.28.0

Is this a regression?

  • Yes

Python Version

3.8

Bug scope

Targets (data type handling, batching, SQL object generation, etc.)

Operating System

Linux

Description

When the target handles a schema message with flattening_enabled=true in the config, any property in that schema that does not declare a type (e.g. "PropertyName": {}) is dropped from the parsed schema. Validation then fails on the first record message received, because the record still contains that property.

See NewValue and OldValue in the sample. They are present in the schema and record messages, but fail validation.

I'm guessing this is related to #1204 and #1400.
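To illustrate the behaviour described above, here is a minimal sketch of a flattening pass that only keeps properties it recognises. It is not the SDK's actual code: `flatten_properties` and its `keep_typeless` flag are hypothetical names, and nested-object expansion is left out so that only the handling of typeless properties is shown.

```python
from typing import Any, Dict

Schema = Dict[str, Any]


def flatten_properties(
    properties: Dict[str, Schema], keep_typeless: bool
) -> Dict[str, Schema]:
    """Hypothetical, simplified stand-in for a schema-flattening pass.

    Not the SDK's implementation: nested-object expansion is omitted to
    isolate how properties without a "type" are treated.
    """
    flattened: Dict[str, Schema] = {}
    for name, subschema in properties.items():
        if "type" in subschema:
            flattened[name] = subschema
        elif subschema:
            # Non-empty but typeless (e.g. "anyOf"): kept unchanged in this sketch.
            flattened[name] = subschema
        elif keep_typeless:
            # {} accepts any value in JSON Schema, so keeping it loses nothing.
            flattened[name] = subschema
        # Otherwise the empty subschema matches no branch and is silently dropped.
    return flattened


# Subset of the properties from the schema message in this report.
properties = {
    "Id": {"type": "string"},
    "Field": {"type": ["null", "string"]},
    "OldValue": {},
    "NewValue": {},
}

dropped = flatten_properties(properties, keep_typeless=False)
kept = flatten_properties(properties, keep_typeless=True)
print(sorted(dropped))  # ['Field', 'Id']
print(sorted(kept))     # ['Field', 'Id', 'NewValue', 'OldValue']
```

With `keep_typeless=False` the two empty subschemas disappear, which matches the symptom reported here. Passing them through unchanged is only one possible way to handle them.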

Code

{"type": "SCHEMA", "stream": "Tenant__History", "schema": {"type": "object", "additionalProperties": false, "properties": {"Id": {"type": "string"}, "IsDeleted": {"type": ["null", "boolean"]}, "ParentId": {"type": ["null", "string"]}, "CreatedById": {"type": ["null", "string"]}, "CreatedDate": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": ["string", "null"]}]}, "Field": {"type": ["null", "string"]}, "DataType": {"type": ["null", "string"]}, "OldValue": {}, "NewValue": {}}}, "key_properties": ["Id"], "bookmark_properties": ["CreatedDate"]}
{"type": "RECORD", "stream": "Tenant__History", "record": {"Id": "0006000019000W00A0", "IsDeleted": false, "ParentId": "030000000010P000A0", "CreatedById": "0056O00000AxPa1111", "CreatedDate": "2021-01-15T13:00:00.000000Z", "Field": "created", "DataType": "Text", "OldValue": null, "NewValue": null}, "version": 1690589856957, "time_extracted": "2023-07-29T00:17:36.965777Z"}
Traceback (most recent call last):
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/bin/target-postgres", line 8, in <module>
    sys.exit(TargetPostgres.cli())
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)     
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/singer_sdk/target_base.py", line 546, in invoke
    target.listen(file_input) 
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/singer_sdk/io_base.py", line 33, in listen
    self._process_lines(file_input)
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/singer_sdk/target_base.py", line 291, in _process_lines
    counter = super()._process_lines(file_input)
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/singer_sdk/io_base.py", line 79, in _process_lines
    self._process_record_message(line_dict)
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/target.py", line 339, in _process_record_message
    raise e                   
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/target.py", line 334, in _process_record_message
    super()._process_record_message(message_dict)
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/singer_sdk/target_base.py", line 338, in _process_record_message
    sink._validate_and_parse(transformed_record)
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/singer_sdk/sinks/core.py", line 314, in _validate_and_parse
    self._validator.validate(record)
  File "/path/to/meltano/meltano/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/jsonschema/validators.py", line 430, in validate
    raise error               
jsonschema.exceptions.ValidationError: Additional properties are not allowed ('NewValue', 'OldValue' were unexpected)
Failed validating 'additionalProperties' in schema:
    {'additionalProperties': False,
     'properties': {'CreatedById': {'type': ['null', 'string']},
                    'CreatedDate': {'format': 'date-time',
                                    'type': ['null', 'string']},
                    'DataType': {'type': ['null', 'string']},
                    'Field': {'type': ['null', 'string']},
                    'Id': {'type': 'string'},
                    'IsDeleted': {'type': ['null', 'boolean']},
                    'ParentId': {'type': ['null', 'string']},
                    '_sdc_batched_at': {'format': 'date-time',
                                        'type': ['null', 'string']},
                    '_sdc_deleted_at': {'format': 'date-time',
                                        'type': ['null', 'string']},
                    '_sdc_extracted_at': {'format': 'date-time',
                                          'type': ['null', 'string']},
                    '_sdc_received_at': {'format': 'date-time',
                                         'type': ['null', 'string']},
                    '_sdc_sequence': {'type': ['null', 'integer']},
                    '_sdc_table_version': {'type': ['null', 'integer']}},
     'type': 'object'}        
                              
On instance:                  
    {'CreatedById': '0056O00000AxPa1111',
     'CreatedDate': '2021-01-15T13:00:00.000000Z',
     'DataType': 'Text',      
     'Field': 'created',      
     'Id': '0006000019000W00A0',
     'IsDeleted': False,      
     'NewValue': None,        
     'OldValue': None,        
     'ParentId': '030000000010P000A0',
     '_sdc_batched_at': '2023-07-29T00:17:58.741266+00:00',
     '_sdc_deleted_at': None, 
     '_sdc_extracted_at': '2023-07-29T00:17:36.965777Z',
     '_sdc_received_at': '2023-07-29T00:17:58.741283+00:00',
     '_sdc_sequence': 1690589878741,
     '_sdc_table_version': 1690589856957}
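The validation error in the traceback can be reasoned about without the full stack. The sketch below is a simplified, standard-library-only version of the `additionalProperties: false` check, not the `jsonschema` implementation; `unexpected_properties` is a hypothetical helper, and only a subset of the properties from this report is used.

```python
from typing import Any, Dict, List


def unexpected_properties(schema: Dict[str, Any], record: Dict[str, Any]) -> List[str]:
    """Return record keys rejected by "additionalProperties": false.

    Simplified: ignores "patternProperties" and schema-valued
    "additionalProperties".
    """
    if schema.get("additionalProperties", True) is not False:
        return []
    allowed = set(schema.get("properties", {}))
    return sorted(key for key in record if key not in allowed)


# Schema as it looks after flattening: "OldValue" and "NewValue" are
# absent, as in the "Failed validating" output of the traceback.
flattened_schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "Id": {"type": "string"},
        "Field": {"type": ["null", "string"]},
    },
}

# Schema as sent by the tap: the typeless properties are still declared.
original_schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "Id": {"type": "string"},
        "Field": {"type": ["null", "string"]},
        "OldValue": {},
        "NewValue": {},
    },
}

record = {
    "Id": "0006000019000W00A0",
    "Field": "created",
    "OldValue": None,
    "NewValue": None,
}

print(unexpected_properties(flattened_schema, record))  # ['NewValue', 'OldValue']
print(unexpected_properties(original_schema, record))   # []
```

The same record is accepted against the schema as sent and rejected against the flattened one, so the failure comes from the schema transformation rather than from the record itself.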
@andyoneal andyoneal added kind/Bug Something isn't working valuestream/SDK labels Jul 29, 2023
@tayloramurphy
Collaborator

@andyoneal have you confirmed this is still a bug on the latest version?

cc @edgarrmondragon

@edgarrmondragon
Collaborator

I can confirm that this is still the case in v0.30.0, but it's a bit unclear to me how this type of field should be handled.

stale bot commented Jul 30, 2024

This has been marked as stale because it is unassigned, and has not had recent activity. It will be closed after 21 days if no further activity occurs. If this should never go stale, please add the evergreen label, or request that it be added.

@stale stale bot added the stale label Jul 30, 2024
@edgarrmondragon edgarrmondragon self-assigned this Jul 31, 2024
@stale stale bot removed the stale label Jul 31, 2024