Drop GitHub tables before each run #53

Merged: ghickman merged 2 commits into main from drop-table-each-run on Nov 28, 2023

@ghickman ghickman commented Nov 28, 2023

Now that we backfill all data on each run, we can simplify things by dropping the table each time the backfill runs. We still keep the upsert functionality, so this is not strictly necessary, but it also helps in local development.

The backfill now takes under 2 minutes locally, and we expect to run backfills in the middle of the night, so I don't expect this to cause problems for users in production.

While we currently have only one GitHub table, this looks for all `github_*` tables, on the basis that we're not ingesting all the data all at once, so future tables (e.g. issues) won't have to worry about doing this too.
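
For readers skimming the thread, here is a minimal sketch of the "drop every `github_*` table" idea. It is not the code from this PR: it uses the standard-library `sqlite3` module so it runs anywhere, and the function name `drop_prefixed_tables` and the example table names are hypothetical. The real project may use a different database and a different way of listing tables.

```python
import sqlite3


def drop_prefixed_tables(conn: sqlite3.Connection, prefix: str = "github_") -> list[str]:
    """Drop every table whose name starts with `prefix`; return the dropped names.

    Hypothetical helper for illustration only, not the PR's implementation.
    """
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    # Filter in Python rather than with LIKE, because "_" is a LIKE wildcard.
    names = sorted(name for (name,) in rows if name.startswith(prefix))
    for name in names:
        # Table names cannot be bound as parameters, so quote the identifier.
        quoted = '"' + name.replace('"', '""') + '"'
        conn.execute(f"DROP TABLE IF EXISTS {quoted}")
    conn.commit()
    return names


# Example setup with made-up table names (assumptions, not the real schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE github_pull_requests (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE github_issues (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE slack_messages (id INTEGER PRIMARY KEY)")

dropped = drop_prefixed_tables(conn)
```

Matching on the prefix rather than a fixed list of names is what lets a future table such as an issues table be picked up without any extra work, while unrelated tables are left alone.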

@benbc benbc left a comment

Doing a proper review later post-merge.

Base automatically changed from bulk-upsert to main November 28, 2023 12:25
@ghickman ghickman merged commit 30242d5 into main Nov 28, 2023
7 checks passed
@ghickman ghickman deleted the drop-table-each-run branch November 28, 2023 12:25