__filter__ of stream map breaks tap-postgres #1821

Closed
1 task
tharwan opened this issue Jul 8, 2023 · 4 comments · Fixed by #1835
Labels: kind/Bug (Something isn't working), valuestream/SDK

Comments

@tharwan

tharwan commented Jul 8, 2023

Singer SDK Version

2.19.1

Is this a regression?

  • Yes

Python Version

3.10

Bug scope

Mapping (stream maps, flattening, etc.)

Operating System

MacOS

Description

My setup looks like this:

  extractors:
  - name: tap-postgres
    pip_url: git+https://github.com/MeltanoLabs/tap-postgres.git@5320d1fa47442466785f6821af3ebfbe3ebff327
    config:
      stream_maps:
        public-time_series_meta:
          __filter__: ts_id in config['allowed_id']
      stream_map_config:
        allowed_id:
          - 2364
          - 2363
    select:
      - public-time_series_meta.*
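To make the intent of the `__filter__` expression concrete, here is a minimal Python sketch of how such an expression could be applied to each record. It assumes that record properties are exposed as bare names and that `stream_map_config` is exposed as `config`, which is what the expression above relies on. The helper `keep_record` and the sample records are hypothetical, and plain `eval` is used only for illustration; this is not the SDK's implementation.

```python
from typing import Any


def keep_record(record: dict[str, Any], filter_expr: str, config: dict[str, Any]) -> bool:
    """Evaluate a __filter__-style expression against a single record."""
    # Record properties become bare names; stream_map_config is visible as `config`.
    names = {**record, "config": config}
    # Builtins are disabled; eval() is used here purely for illustration.
    return bool(eval(filter_expr, {"__builtins__": {}}, names))


stream_map_config = {"allowed_id": [2364, 2363]}

# Hypothetical records from public-time_series_meta.
records = [
    {"ts_id": 2363, "name": "a"},
    {"ts_id": 1, "name": "b"},
    {"ts_id": 2364, "name": "c"},
]

kept = [
    r for r in records
    if keep_record(r, "ts_id in config['allowed_id']", stream_map_config)
]
```

With this config only the records whose `ts_id` is 2363 or 2364 should pass the filter.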

The catalog from the tap contains the following:

{
      "tap_stream_id": "public-time_series_meta",
      "table_name": "time_series_meta",
      "replication_method": "",
      "key_properties": [
        "ts_id"
      ],
      "schema": {
        "properties": {
          "ts_id": {
            "type": [
              "integer"
            ]
          },
          "name": {
            "type": [
              "string",
              "null"
            ]
          },
          "region": {
            "type": [
              "string",
              "null"
            ]
          },
          "source": {
            "type": [
              "string",
              "null"
            ]
          },
          "refers_to": {
            "type": [
              "integer",
              "null"
            ]
          },
          "default_source": {
            "type": [
              "boolean",
              "null"
            ]
          },
          "frequency": {
            "type": [
              "string",
              "null"
            ]
          },
          "kwargs": {
            "type": [
              "string",
              "null"
            ]
          },
          "description": {
            "type": [
              "string",
              "null"
            ]
          },
          "unit": {
            "type": [
              "string",
              "null"
            ]
          },
          "method": {
            "type": [
              "string",
              "null"
            ]
          },
          "load_interval": {
            "type": [
              "string",
              "null"
            ]
          },
          "access": {
            "type": [
              "string",
              "null"
            ]
          },
          "orig_timezone": {
            "type": [
              "string",
              "null"
            ]
          }
        },
        "type": "object",
        "required": [
          "ts_id"
        ]
      }
}

But when I run meltano invoke tap-postgres, I get:
singer_sdk.exceptions.StreamMapConfigError: Invalid key properties for 'public-time_series_meta': [ts_id]. Property 'ts_id' was not detected in schema.

Running the same config in meltano-map-transformer works fine.
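The error message suggests a check of roughly the following shape: every key property has to appear among the schema properties that the mapper sees. The sketch below is an assumption based only on the message text. `check_key_properties` is a hypothetical helper, the exception class is a local stand-in for `singer_sdk.exceptions.StreamMapConfigError`, and the empty schema is a made-up input chosen to trigger the error, not something shown in this report.

```python
class StreamMapConfigError(Exception):
    """Local stand-in for singer_sdk.exceptions.StreamMapConfigError."""


def check_key_properties(stream_name: str, key_properties: list[str], schema: dict) -> None:
    """Raise if any key property is missing from the schema's properties."""
    properties = schema.get("properties", {})
    for key in key_properties:
        if key not in properties:
            raise StreamMapConfigError(
                f"Invalid key properties for '{stream_name}': "
                f"[{', '.join(key_properties)}]. "
                f"Property '{key}' was not detected in schema."
            )


# A schema that contains ts_id passes silently.
full_schema = {"properties": {"ts_id": {"type": ["integer"]}}}
check_key_properties("public-time_series_meta", ["ts_id"], full_schema)

# Hypothetical: a schema whose properties were dropped before the check runs.
empty_schema = {"properties": {}}
try:
    check_key_properties("public-time_series_meta", ["ts_id"], empty_schema)
    message = None
except StreamMapConfigError as exc:
    message = str(exc)
```

Since the discovered catalog does list `ts_id`, a check like this would only fail if the schema handed to the mapper differs from the one in the catalog.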

Code

-
@tharwan tharwan added kind/Bug Something isn't working valuestream/SDK labels Jul 8, 2023
@tayloramurphy
Collaborator

@edgarrmondragon can you take a look at this?

@edgarrmondragon
Collaborator

edgarrmondragon commented Jul 12, 2023

I can't reproduce this with a different tap. This config works as expected with meltano invoke:

    config:
      stream_maps:
        incidents:
          __filter__: id in config["allowed_ids"]
      stream_map_config:
        allowed_ids:
          - "407956781"

I'm gonna try to reproduce this with meltanolabs/tap-postgres now.

EDIT: I can reproduce this with the postgres extractor.

@edgarrmondragon
Collaborator

@tharwan can you try selecting the top-level stream too? i.e.

  extractors:
  - name: tap-postgres
    pip_url: git+https://github.com/MeltanoLabs/tap-postgres.git@5320d1fa47442466785f6821af3ebfbe3ebff327
    config:
      stream_maps:
        public-time_series_meta:
          __filter__: ts_id in config['allowed_id']
      stream_map_config:
        allowed_id:
          - 2364
          - 2363
    select:
      - public-time_series_meta  # <-----
      - public-time_series_meta.*
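The difference between the two select entries can be shown with a literal glob comparison. This sketch only illustrates how the patterns differ as strings; it does not reproduce Meltano's selection logic, which may well treat an attribute pattern as implying the stream. The helper `matches` is hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Optional


def matches(select_patterns: list[str], stream: str, prop: Optional[str] = None) -> bool:
    """Return True if the stream (or stream.property) matches any pattern literally."""
    target = stream if prop is None else f"{stream}.{prop}"
    return any(fnmatchcase(target, pattern) for pattern in select_patterns)


original = ["public-time_series_meta.*"]
suggested = ["public-time_series_meta", "public-time_series_meta.*"]

# The attribute pattern covers properties but not the bare stream name.
attr_in_original = matches(original, "public-time_series_meta", "ts_id")
stream_in_original = matches(original, "public-time_series_meta")

# Adding the bare stream name covers both.
stream_in_suggested = matches(suggested, "public-time_series_meta")
```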

@edgarrmondragon
Collaborator

In the meantime I've been able to work around this problem by using a manual catalog file:

  extractors:
  - name: tap-postgres
    pip_url: git+https://github.com/MeltanoLabs/tap-postgres.git@5320d1fa47442466785f6821af3ebfbe3ebff327
    catalog: catalog.json # <---- handle selections in this file
    config:
      stream_maps:
        public-time_series_meta:
          __filter__: ts_id in config['allowed_id']
      stream_map_config:
        allowed_id:
          - 2364
          - 2363

For example, a stream-level metadata entry in that file:

        {
          "breadcrumb": [],
          "metadata": {
            "selected": true,
            "inclusion": "available",
            "table-key-properties": [
              "id"
            ],
            "forced-replication-method": "",
            "schema-name": "public"
          }
        }
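A catalog with explicit selections could be generated with a small script along these lines. This is a sketch under assumptions: `select_stream` is a hypothetical helper, the input stream is a trimmed-down version of the catalog entry from the report, and the function replaces any existing metadata instead of merging it, so keys such as `inclusion` or `table-key-properties` from a discovered catalog would have to be carried over separately.

```python
import json
from typing import Any


def select_stream(stream: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a catalog stream with the stream and all its properties selected."""
    # An empty breadcrumb addresses the stream itself.
    metadata = [{"breadcrumb": [], "metadata": {"selected": True}}]
    # Each property is addressed by a ["properties", <name>] breadcrumb.
    for name in stream["schema"]["properties"]:
        metadata.append(
            {"breadcrumb": ["properties", name], "metadata": {"selected": True}}
        )
    return {**stream, "metadata": metadata}


# Trimmed-down stream entry; a real one would come from the tap's discovery output.
stream = {
    "tap_stream_id": "public-time_series_meta",
    "key_properties": ["ts_id"],
    "schema": {
        "type": "object",
        "properties": {
            "ts_id": {"type": ["integer"]},
            "name": {"type": ["string", "null"]},
        },
    },
}

catalog = {"streams": [select_stream(stream)]}
catalog_json = json.dumps(catalog, indent=2)  # content to write to catalog.json
```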
