Adding indices on jobs table #2161

Merged: phixMe merged 4 commits into main from perf/lineage-indices on Oct 13, 2022

Conversation

@phixMe (Member) commented Oct 4, 2022

Problem

Our current /lineage query is a little slow when there are many jobs. The query plan shows that much of the time is spent on the join against the jobs table. The join goes through jobs_view, which merges the jobs table and the jobs_fqn table to keep our data model semantically simple, and it joins on fields that are not indexed.

image

Solution

Adds indices to the fields we join on inside the lineage query.
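
For readers who want to see the effect in isolation, here is a minimal sketch of how indexing a join column changes a query plan. It uses SQLite from Python's standard library as a stand-in for the real database, and the column names (`uuid`, `name`, `job_fqn`) and the index name `idx_jobs_fqn_uuid` are assumptions for illustration only; this PR page does not list the fields that were actually indexed.

```python
import sqlite3

# Hypothetical, simplified schema: the real tables have more columns, and the
# PR page does not list which fields were indexed.
SCHEMA = """
CREATE TABLE jobs (uuid TEXT, name TEXT);
CREATE TABLE jobs_fqn (uuid TEXT, job_fqn TEXT);
"""

# A join in the spirit of jobs_view: match each job to its fully qualified name.
JOIN_QUERY = """
SELECT j.name, f.job_fqn
FROM jobs AS j
JOIN jobs_fqn AS f ON f.uuid = j.uuid
"""


def query_plan(conn, sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def plans_before_and_after_index():
    """Capture the plan for JOIN_QUERY before and after indexing the join column."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    before = query_plan(conn, JOIN_QUERY)
    # Index the join column (hypothetical index name).
    conn.execute("CREATE INDEX idx_jobs_fqn_uuid ON jobs_fqn (uuid)")
    after = query_plan(conn, JOIN_QUERY)
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = plans_before_and_after_index()
    print("before:", before)
    print("after: ", after)
```

The same comparison can be made against the real database by running `EXPLAIN` on the lineage query before and after applying the migration; the exact plan text differs between database engines.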

Note: All database schema changes require discussion. Please link the issue for context.

Checklist

  • You've signed-off your work
  • Your changes are accompanied by tests (if relevant)
  • Your change contains a small diff and is self-contained
  • You've updated any relevant documentation (if relevant)
  • You've updated the CHANGELOG.md with details about your change under the "Unreleased" section (if relevant, depending on the change, this may not be necessary)
  • You've versioned your .sql database schema migration according to Flyway's naming convention (if relevant)
  • You've included a header in any source code files (if relevant)

boring-cyborg bot added the api (API layer changes) label Oct 4, 2022
codecov bot commented Oct 4, 2022

Codecov Report

Merging #2161 (8a35583) into main (5265c92) will decrease coverage by 0.81%.
The diff coverage is n/a.

❗ Current head 8a35583 differs from pull request most recent head 24f2ad1. Consider uploading reports for the commit 24f2ad1 to get more accurate results

@@             Coverage Diff              @@
##               main    #2161      +/-   ##
============================================
- Coverage     76.60%   75.78%   -0.82%     
+ Complexity     1136     1061      -75     
============================================
  Files           219      209      -10     
  Lines          5287     5006     -281     
  Branches        420      403      -17     
============================================
- Hits           4050     3794     -256     
+ Misses          765      763       -2     
+ Partials        472      449      -23     
Impacted Files Coverage Δ
api/src/main/java/marquez/service/models/Node.java 50.00% <0.00%> (-11.54%) ⬇️
.../java/src/main/java/marquez/client/MarquezUrl.java 60.31% <0.00%> (-5.44%) ⬇️
...ients/java/src/main/java/marquez/client/Utils.java 84.61% <0.00%> (-2.49%) ⬇️
...i/src/main/java/marquez/service/models/NodeId.java 62.10% <0.00%> (-2.39%) ⬇️
...a/src/main/java/marquez/client/models/Dataset.java 56.66% <0.00%> (-1.40%) ⬇️
...main/java/marquez/service/models/LineageEvent.java 85.07% <0.00%> (-1.23%) ⬇️
.../src/main/java/marquez/service/models/Dataset.java 72.41% <0.00%> (-0.92%) ⬇️
api/src/main/java/marquez/MarquezContext.java 84.93% <0.00%> (-0.79%) ⬇️
...va/src/main/java/marquez/client/MarquezPathV1.java 62.29% <0.00%> (-0.61%) ⬇️
...va/src/main/java/marquez/client/MarquezClient.java 59.90% <0.00%> (-0.45%) ⬇️
... and 21 more

@pawel-big-lebowski (Collaborator) commented

Having an index on reference columns seems to be a good idea most of the time 🥇 🚀 💯

@mobuchowski (Contributor) commented

@phixMe do we want to have this in the next release?

phixMe marked this pull request as ready for review October 13, 2022 15:00
@phixMe (Member, Author) commented Oct 13, 2022

@mobuchowski Yeah, let's get this in.

phixMe enabled auto-merge (squash) October 13, 2022 15:32
phixMe merged commit a458374 into main Oct 13, 2022
phixMe deleted the perf/lineage-indices branch October 13, 2022 15:32
Labels
api API layer changes
3 participants