Add CLI command to delete runs older than date #10832

Open
wants to merge 2 commits into
base: master

joshuataylor
Contributor

Summary & Motivation

Deleting runs older than a certain date is super useful in Dagster. We don't really care about jobs older than a certain age (say 90 days); they just take up storage space.

What would be useful is a command to delete runs older than a given date.

See issue #4497.

I really like this idea from U02B400J44U: "Maybe dagster run wipe could take an argument like --older-than=2w", so I added it.

I really don't like adding this to wipe, though, as wipe is an incredibly destructive action and I foresee users wiping their instances by mistake. Instead, I've created a new command called delete-range. If someone has a better name for this, I'm open to suggestions, as I don't like this name :).

Now you can do the following:
dagster run delete-range 1h

It will then ask you to confirm that you want to delete runs older than the date it computed:

$ dagster run delete-range 1h
Are you sure you want to delete run history and event logs from 2022-12-01 18:58:57.162173? Type DELETE.

Afterwards, it will tell you how many deletions it made:

$ dagster run delete-range 1h
Are you sure you want to delete run history and event logs from 2022-12-01 18:58:57.162173? Type DELETE.: DELETE
Found 0 runs to delete.
0it [00:00, ?it/s]
Deleted run history and event logs older than 2022-12-01 18:58:57.162173.

It will also show you a progress bar, which updates after every deleted run.
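As a rough illustration of how an argument like 1h or 2w could be turned into the cutoff timestamp shown in the prompt, here is a minimal standard-library sketch. The names parse_age and cutoff_for and the set of supported units are assumptions made for this example; they are not taken from the PR's actual implementation.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical unit table; the PR text only shows "1h" and "2w" as examples.
_UNITS = {
    "m": "minutes",
    "h": "hours",
    "d": "days",
    "w": "weeks",
}

_AGE_RE = re.compile(r"^(\d+)([mhdw])$")


def parse_age(text: str) -> timedelta:
    """Turn a string such as '1h' or '2w' into a timedelta."""
    match = _AGE_RE.match(text.strip().lower())
    if match is None:
        raise ValueError(f"Unrecognised age {text!r}; expected e.g. '1h', '90d', '2w'")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def cutoff_for(text: str, now: Optional[datetime] = None) -> datetime:
    """Return the timestamp before which runs would be considered old."""
    # Accepting `now` as a parameter keeps the function deterministic in tests.
    current = now if now is not None else datetime.now()
    return current - parse_age(text)
```

Passing now explicitly means the cutoff calculation can be tested without depending on the wall clock, for example cutoff_for("1h", now=datetime(2022, 12, 1, 19, 58, 57)).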

Questions:

  1. Is there a better way to get Run IDs without having to load the entire run? I think this would be better for batching, as you don't have to load more information into memory than you need.
  2. How do I create runs with an arbitrary start time in tests?

Once I get question 2 answered I'll finish the tests.

🤔 Should there be a delete_many function added at some point to bulk delete runs? Right now you have to delete run-by-run.
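To make the batching idea concrete, here is a small sketch of an ID-only, batch-at-a-time deletion loop. RunStore, run_ids_before and InMemoryStore are hypothetical names invented for this example and are not Dagster's storage API; only the general shape (fetch a bounded batch of IDs, delete them, repeat until nothing is left) is the point.

```python
from datetime import datetime
from typing import Dict, List, Protocol


class RunStore(Protocol):
    """Hypothetical storage interface, NOT Dagster's actual API."""

    def run_ids_before(self, cutoff: datetime, limit: int) -> List[str]: ...

    def delete_run(self, run_id: str) -> None: ...


def delete_runs_before(store: RunStore, cutoff: datetime, batch_size: int = 100) -> int:
    """Delete runs older than `cutoff`, fetching only IDs, one batch at a time."""
    deleted = 0
    while True:
        # Deleted runs no longer match the query, so no offset is needed.
        batch = store.run_ids_before(cutoff, limit=batch_size)
        if not batch:
            return deleted
        for run_id in batch:
            store.delete_run(run_id)
            deleted += 1


class InMemoryStore:
    """Toy stand-in for a run storage, mapping run ID to start time."""

    def __init__(self, runs: Dict[str, datetime]) -> None:
        self.runs = dict(runs)

    def run_ids_before(self, cutoff: datetime, limit: int) -> List[str]:
        ids = sorted(rid for rid, started in self.runs.items() if started < cutoff)
        return ids[:limit]

    def delete_run(self, run_id: str) -> None:
        del self.runs[run_id]
```

Memory use stays bounded by batch_size rather than by the total number of old runs. A real delete_many could replace the inner loop with a single bulk call if the storage layer supported one.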

How I Tested These Changes

  1. Manually, on a local project where we need to delete old runs
  2. New tests

@vercel

vercel bot commented Dec 1, 2022

@joshuataylor is attempting to deploy a commit to the Elementl Team on Vercel.

A member of the Team first needs to authorize it.

@vercel

vercel bot commented Dec 1, 2022

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name | Status | Preview | Updated
dagit-storybook | ⬜️ Ignored (Inspect) | | Dec 1, 2022 at 0:04AM (UTC)

@dpeng817
Contributor

dpeng817 commented Dec 2, 2022

Buildkite build: https://buildkite.com/dagster/dagster/builds/40812. Will follow up with results.

@joshuataylor
Contributor Author

Thanks! If you also know the answers to my questions around setting the run start time, that'd be amazing, as the tests should be fairly straightforward :-)
