Column lineage graph endpoint #2124
Codecov Report
@@ Coverage Diff @@
## main #2124 +/- ##
============================================
+ Coverage 75.82% 76.33% +0.51%
- Complexity 1063 1099 +36
============================================
Files 209 214 +5
Lines 5013 5139 +126
Branches 403 407 +4
============================================
+ Hits 3801 3923 +122
+ Misses 763 762 -1
- Partials 449 454 +5
@@ -88,4 +95,59 @@ void doUpsertColumnLineageRow(
},
value = "values")
List<ColumnLineageRow> rows);

@SqlQuery(
Most important piece of the PR: a recursive query that extracts the column-lineage graph.
Only the column_lineage table is used and joined to obtain the graph nodes.
The other tables are used only to enrich the nodes that were found.
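To illustrate the idea behind the recursive query, here is a minimal in-memory sketch in Java. It is not the SQL from this PR: the class `ColumnLineageSketch`, the `Edge` type, the `upstream` method, and the field-naming scheme are all hypothetical, and the traversal direction (from an output field towards its input fields) and the depth limit are assumptions made for the example. Each `Edge` stands in for one row of column_lineage, and the loop plays the role of the recursive step that repeatedly joins the table to the nodes found so far.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ColumnLineageSketch {
  // Hypothetical stand-in for one column_lineage row:
  // outputField is derived from inputField.
  static final class Edge {
    final String outputField;
    final String inputField;

    Edge(String outputField, String inputField) {
      this.outputField = outputField;
      this.inputField = inputField;
    }
  }

  // Collects the start field and every field reachable through at most
  // maxDepth edges, following edges from output field to input field.
  static Set<String> upstream(List<Edge> edges, String start, int maxDepth) {
    Set<String> visited = new LinkedHashSet<>();
    visited.add(start);
    List<String> frontier = Collections.singletonList(start);
    for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
      List<String> next = new ArrayList<>();
      for (String field : frontier) {
        for (Edge edge : edges) {
          // visited.add returns false for fields already seen,
          // which also keeps cycles from looping forever.
          if (edge.outputField.equals(field) && visited.add(edge.inputField)) {
            next.add(edge.inputField);
          }
        }
      }
      frontier = next;
    }
    return visited;
  }

  public static void main(String[] args) {
    List<Edge> edges = new ArrayList<>();
    edges.add(new Edge("ns:orders_summary:total", "ns:orders:amount"));
    edges.add(new Edge("ns:orders:amount", "ns:raw_orders:amount_cents"));
    System.out.println(upstream(edges, "ns:orders_summary:total", 20));
  }
}
```

Enrichment with data from other tables would happen only after the set of nodes is known, which mirrors the split described in the comment above.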
@@ -533,8 +533,8 @@ public static class ColumnLineageOutputColumn extends BaseJsonModel {
@ToString
public static class ColumnLineageInputField extends BaseJsonModel {

@NotNull private String datasetNamespace;
Fix to the previous PR to align with the OpenLineage spec:
https://github.com/OpenLineage/OpenLineage/blob/main/spec/facets/ColumnLineageDatasetFacet.json
Signed-off-by: Pawel Leszczynski <[email protected]>
Problem
PR #2096 made it possible to store column-lineage information from events in the database. This PR exposes column lineage through a graph endpoint, as described in the proposal (https://github.com/MarquezProject/marquez/blob/main/proposals/2045-column-lineage-endpoint.md).
Closes: #2114
Solution
- NodeType DATASET_FIELD is added,
- the column-lineage endpoint returns serialized Lineage objects, similar to the currently existing lineage endpoint.
Checklist
- CHANGELOG.md with details about your change under the "Unreleased" section (if relevant, depending on the change, this may not be necessary)
- .sql database schema migration according to Flyway's naming convention (if relevant)