Using Airflow Dataset in inlets and outlets breaks Datahub Airflow plugin #7809
Comments
This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io
This still affects DataHub v0.10.2, the latest release at the time of writing.
This issue was closed because it has been inactive for 30 days since being marked as stale.
Describe the bug
Using object instances other than datahub_provider.entities.Dataset in inlets or outlets breaks lineage emission in the DataHub Airflow plugin and prevents the use of Airflow's Data-aware Scheduling.
To Reproduce
Steps to reproduce the behavior:
Add an airflow.datasets.Dataset instance to the inlets or outlets of a task.
Lineage emission will fail because airflow.datasets.Dataset instances have no urn attribute.
Expected behavior
Adding a Dataset of type airflow.datasets.Dataset should not affect the functionality of Airflow or the DataHub plugin. This error also makes it impossible to use Airflow's Data-aware Scheduling feature. Airflow Datasets should be ignored when the plugin processes inlets and outlets.
Additional context
Points of failure:
datahub/metadata-ingestion/src/datahub_provider/_plugin.py, line 159 in 864ac2d
datahub/metadata-ingestion/src/datahub_provider/_plugin.py, line 163 in 864ac2d
Solution
The plugin must not assume that every entry in inlets and outlets is of type datahub_provider.entities.Dataset.
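One possible shape for such a guard is sketched below. This is not the plugin's actual code: DatahubDataset and AirflowDataset are simplified stand-ins for datahub_provider.entities.Dataset and airflow.datasets.Dataset, and collect_urns is a hypothetical helper name. The only assumption taken from the report is that DataHub entities expose a urn attribute while Airflow Datasets do not.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass
class DatahubDataset:
    """Stand-in for datahub_provider.entities.Dataset (exposes a urn)."""

    platform: str
    name: str
    env: str = "PROD"

    @property
    def urn(self) -> str:
        return (
            f"urn:li:dataset:(urn:li:dataPlatform:{self.platform},"
            f"{self.name},{self.env})"
        )


@dataclass
class AirflowDataset:
    """Stand-in for airflow.datasets.Dataset (has a uri, but no urn)."""

    uri: str


def collect_urns(entities: Iterable[Any]) -> List[str]:
    """Hypothetical helper: keep only entries that expose a urn.

    Entries without a urn (for example Airflow Datasets used for
    Data-aware Scheduling) are skipped instead of raising AttributeError.
    """
    urns: List[str] = []
    for entity in entities:
        urn = getattr(entity, "urn", None)
        if urn is None:
            continue  # not a DataHub entity; ignore it for lineage
        urns.append(str(urn))
    return urns


# Mixed outlets, as in the scenario described in this issue.
outlets = [
    DatahubDataset("snowflake", "mydb.schema.tableA"),
    AirflowDataset("s3://bucket/key"),
]
lineage_urns = collect_urns(outlets)
```

Checking for the attribute rather than the concrete type keeps the sketch independent of either library's class hierarchy. An isinstance check against the DataHub entity base class would be a stricter alternative.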