openlineage, bigquery: add openlineage method support for BigQueryInsertJobOperator #31293
Added a few comments.
Also, please add the documentation changes around `BigQueryExecuteQueryOperator` lineage.
Documentation changes for OpenLineage support for this operator are needed.
@sunank200 what kind of documentation would you like to see? For user-facing docs, I believe we don't need anything beyond a doc somewhere listing which operators are supported. Users don't need to do or configure anything specific to a particular operator if they wish to use OpenLineage, so for them the actual integration is transparent. For developers, we might need to write something. For general use I've written some tips in a separate PR: #31817. Do you think we need something provider-specific beyond docstrings?
Since BigQueryExecuteQueryOperator is deprecated in favor of BigQueryInsertJobOperator, why add features to BigQueryExecuteQueryOperator?
@raphaelauv it's still widely used. But it's a valid point, so I extracted the implementation into a mixin and added it to
Please remove the feature from deprecated classes.
@@ -1001,7 +1063,7 @@ def execute_complete(self, context: Context, event: dict[str, Any]) -> Any:
         return event["records"]
 
 
-class BigQueryExecuteQueryOperator(GoogleCloudBaseOperator):
+class BigQueryExecuteQueryOperator(GoogleCloudBaseOperator, _BigQueryOpenLineageMixin):
I'm -1 for adding new features to deprecated classes.
@eladkal removed this from BigQueryExecuteQueryOperator and the associated tests.
…cuteQueryOperator Signed-off-by: Maciej Obuchowski <[email protected]>
…cuteQueryOperator (#31293) Signed-off-by: Maciej Obuchowski <[email protected]> (cherry picked from commit e10aa6a)
This PR adds OpenLineage support for BigQueryInsertJobOperator.
Although the operator is SQL-based, this does not use SQL parsing, because BigQuery has an API that lets us get the lineage data directly from it. The code that does this is implemented in the OpenLineage Common library: https://github.com/OpenLineage/OpenLineage/blob/main/integration/common/openlineage/common/provider/bigquery.py
The new method merely calls it.
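To illustrate the idea of reading lineage from job metadata rather than from the SQL text, here is a minimal, standard-library-only sketch. It is not the code from this PR or from the OpenLineage Common library: `Dataset`, `Lineage`, `lineage_from_job`, and the `"bigquery"` namespace string are hypothetical stand-ins. It assumes only that the job resource returned by the BigQuery API lists the tables it read under `statistics.query.referencedTables` and the table it wrote under `configuration.query.destinationTable`.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Dataset:
    """Hypothetical stand-in for an OpenLineage dataset (not the real class)."""

    namespace: str
    name: str


@dataclass
class Lineage:
    """Hypothetical container for the inputs and outputs of a single job."""

    inputs: list[Dataset] = field(default_factory=list)
    outputs: list[Dataset] = field(default_factory=list)


def _to_dataset(table_ref: dict[str, str]) -> Dataset:
    # A BigQuery table reference carries projectId, datasetId and tableId.
    name = f"{table_ref['projectId']}.{table_ref['datasetId']}.{table_ref['tableId']}"
    # The namespace value is an assumption made for this sketch.
    return Dataset(namespace="bigquery", name=name)


def lineage_from_job(job: dict[str, Any]) -> Lineage:
    """Build lineage from job metadata; the SQL text is never parsed."""
    query_stats = job.get("statistics", {}).get("query", {})
    query_config = job.get("configuration", {}).get("query", {})

    # Tables the finished job reports having read.
    inputs = [_to_dataset(ref) for ref in query_stats.get("referencedTables", [])]

    # Table the job was configured to write to, if any.
    destination = query_config.get("destinationTable")
    outputs = [_to_dataset(destination)] if destination else []
    return Lineage(inputs=inputs, outputs=outputs)


# A hand-written job resource, shaped like a finished query job.
job = {
    "configuration": {
        "query": {
            "query": "SELECT * FROM src.orders",
            "destinationTable": {
                "projectId": "proj",
                "datasetId": "dst",
                "tableId": "orders_copy",
            },
        }
    },
    "statistics": {
        "query": {
            "referencedTables": [
                {"projectId": "proj", "datasetId": "src", "tableId": "orders"}
            ]
        }
    },
}

lineage = lineage_from_job(job)
print(lineage.inputs)
print(lineage.outputs)
```

Because the inputs come from what the finished job reports, this approach only works once the job has completed, which is presumably why the lineage is collected after execution rather than before it.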