Documentation: detail docker-in-docker requirements #10
Comments
Hello! Just want to add here that I had no issues with But I had issues with the user that runs everything, due to the My workaround currently was to run
@sabino When I run meltano in docker, which then runs the tap-airbyte-wrapper, I specify the top-level Are you running meltano itself as a docker container, or running it directly? I haven't hit permissions issues yet, but that seems to be in line with what I'd expect in that instance, and it'll help me if I come to using the airbyte taps for postgres!
I'm running Here is our current (partial, because I've omitted some things) implementation using
version: 1
send_anonymous_usage_stats: false
env:
  MELTANO_SNOWPLOW_COLLECTOR_ENDPOINTS: []
include_paths:
- ./environments/*.meltano.yml
- ./extract/*.meltano.yml
- ./mappers/*.meltano.yml
- ./load/*.meltano.yml
- ./transform/*.meltano.yml
- ./orchestrate/*.meltano.yml
- ./utilities/*.meltano.yml
default_environment: dev

plugins:
  extractors:
  - name: tap-postgres
    variant: airbyte
    # Pointing to my fork temporarily (until PR gets merged)
    pip_url: git+https://github.com/sabino/tap-airbyte-wrapper.git
  # This is the LOG BASED Extractor
  # It requires Docker to work because it is
  # a wrapper around the airbyte CDC implementation
  - name: tap-postgres-log
    inherit_from: tap-postgres
    variant: airbyte
    pip_url: git+https://github.com/sabino/tap-airbyte-wrapper.git
    config:
      airbyte_spec:
        image: airbyte/source-postgres
        tag: 3.3.10
      airbyte_config:
        ssl_mode.mode: require
        replication_method.method: CDC
        replication_method.plugin: pgoutput
        replication_method.publication: ext__pub
        replication_method.replication_slot: ext__slot
      flattening_max_depth: 0
      docker_mounts:
      - type: bind
        source: /var/run/docker.sock
        target: /var/run/docker.sock
    select:
    - '*.*'
    metadata:
      '*':
        replication-method: LOG_BASED
        replication-key: _ab_cdc_lsn
  - name: tap__rds_db
    inherit_from: tap-postgres-log

plugins:
  extractors:
  # Database X
  - name: raw__database_x
    inherit_from: tap__rds_db
    config:
      airbyte_config:
        database: database_x

plugins:
  loaders:
  # Used to output things locally
  - name: target-local
    inherit_from: target-duckdb
    variant: jwills
    pip_url: target-duckdb~=0.4
    config:
      filepath: ${OUTPUT_DB_PATH}
      add_metadata_columns: true
  - name: target-bigquery
    variant: z3z1ma
    pip_url: git+https://github.com/z3z1ma/target-bigquery.git
    config:
      column_name_transforms:
        add_underscore_when_invalid: true
        lower: true
        quote: false
        snake_case: true
      cluster_on_key_properties: false
      credentials_path: ${GOOGLE_APPLICATION_CREDENTIALS}
      dataset: ${MELTANO_EXTRACT__LOAD_SCHEMA}
      project: ${BIGQUERY_PROJECT_ID}
      flattening_enabled: false
      flattening_max_depth: 0
      upsert: false
      overwrite: false
      denormalized: true
      method: storage_write_api
      schema_resolver_version: 2
      options:
        storage_write_batch_mode: true

With that I just set the Set And just run:
I imagine the difference is that I'm running meltano in AWS Batch (a layer over AWS ECS), which spins up the meltano/meltano:v3.2.0 docker image. In that, I need to mount The same fix was needed for local - If I just add in Because of the way docker-in-docker works, if that At the end of the day, my system is working well. This issue may act as enough documentation for others to follow if they hit this issue too.
I added this env var in main,
I'd recommend documenting the requirements for running this wrapper if you're already running meltano in docker. This is possible, but requires the `/tmp` directory to be mounted from the host to the meltano container.

This ensures that `/tmp/config.json` can be accessed, as the python script uses `mktemp` to create a directory in `/tmp` such as `/tmp/tmp.BNlf296WXX`, which can then be mapped down to the airbyte container. Due to the way docker-in-docker works, the volume mounts are mounts that exist on the host, not the meltano container, so it needs to be bind-mounted for it to show up in the airbyte image.
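To make the reasoning above concrete, here is a minimal Python sketch of the path resolution involved. It is an illustration only, not code from the wrapper. The names `MELTANO_MOUNTS`, `host_path_for` and `sibling_can_see` are hypothetical. It also assumes the wrapper hands the Docker daemon the temp path exactly as it appears inside the meltano container, which is what the description above implies.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional

# Hypothetical bind-mount table for the meltano container: host path -> container path.
# The two entries mirror the mounts discussed in this issue.
MELTANO_MOUNTS: Dict[str, str] = {
    "/tmp": "/tmp",
    "/var/run/docker.sock": "/var/run/docker.sock",
}


def host_path_for(container_path: str, mounts: Dict[str, str]) -> Optional[str]:
    """Return the host path backing container_path, or None if no bind mount covers it."""
    path = PurePosixPath(container_path)
    for host_dir, container_dir in mounts.items():
        try:
            relative = path.relative_to(container_dir)
        except ValueError:
            continue  # container_path is not under this mount
        return str(PurePosixPath(host_dir) / relative)
    return None


def sibling_can_see(container_path: str, mounts: Dict[str, str]) -> bool:
    """True if a container started through the host daemon can bind-mount this path.

    Assumption: the daemon resolves the mount source on the host, so the file
    must exist on the host at the very same path the meltano container uses.
    """
    return host_path_for(container_path, mounts) == container_path


if __name__ == "__main__":
    config = "/tmp/tmp.BNlf296WXX/config.json"
    # With /tmp bind-mounted at the same path, the airbyte container can reach the file.
    print(sibling_can_see(config, MELTANO_MOUNTS))  # True
    # With only the Docker socket mounted, the file lives in the container's own /tmp.
    socket_only = {"/var/run/docker.sock": "/var/run/docker.sock"}
    print(sibling_can_see(config, socket_only))  # False
```

Under these assumptions, mounting a different host directory onto `/tmp` (for example a hypothetical `/data/scratch`) would also fail the check, because the host path and the container path no longer match.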