-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing collecton ingest DAG #261
base: dev
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there additional context to this PR? I didn't know we had issues with collection ingests.
run_discover_build_and_push = TriggerMultiDagRunOperator( | ||
task_id="trigger_discover_items_dag", | ||
dag=dag, | ||
trigger_dag_id="veda_discover", | ||
python_callable=trigger_discover_and_build_task, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not use this operator, as it creates a disconnect between the original DAGRun event and subsequent DAG runs (ie. a failure in an instance veda-discover will not feed back to the original veda-dataset DAG using this operator). We could instead describe the discover pipeline in a TaskGroup and call expand()
on it, which will map task instances to the correct event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to re-use discover_build_ingest
DAG instead of redefining the same tasks. But having a task group used in both DAGs is not a bad idea
@@ -41,7 +41,9 @@ COPY --chown=airflow:airflow scripts "${AIRFLOW_HOME}/scripts" | |||
|
|||
RUN cp ${AIRFLOW_HOME}/configuration/airflow.cfg* ${AIRFLOW_HOME}/. | |||
|
|||
RUN pip install pypgstac==0.7.4 | |||
# Commited because it downgrade pydentics to v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an artifact and we should not need pypgstac in airflow--that should be entirely handled by the ingest api
Summary: Summary of changes
Fixing collection ingest
Changes