
Implementing storage cleaning system with options #311

Draft
wants to merge 1 commit into main from 305-worker-post-upload-cleanup-strategy

Conversation

Gregory-Pereira
Collaborator

Addresses: #305
/cc @vishnoianil @nerdalert Please take a look.

I am still struggling with the event-based strategy. I built a channel that watches for disk pressure, but I haven't figured out how to pass along the working directory of the job that pushed the usage past the limit and triggered the channel event.
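One way to carry that context (just a sketch; the struct, channel, and function names below are assumptions, not existing code) is to send a small struct over the channel instead of a bare event value, so the pressure level travels together with the offending job's working directory:

```go
// pressureSignal pairs a disk-pressure level with the working directory of the
// job that pushed usage over the limit. All names here are hypothetical.
type pressureSignal struct {
	level   string // whichever pressure-level value the watcher already computes
	workDir string // working directory of the triggering job
}

// notifyDiskPressure sketches how the watcher could publish both pieces of
// information on one channel for the cleanup goroutine to consume.
func notifyDiskPressure(ch chan<- pressureSignal, level, workDir string) {
	ch <- pressureSignal{level: level, workDir: workDir}
}
```

The cleanup side would then receive the signal and know exactly which directory to inspect, rather than having to rediscover it from disk usage alone.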

The .github/workflows/test-at.yaml workflow file will be dropped once I can verify that the at binary ships with Ubuntu 22.04.

Also looking for thoughts on the enum I tried to set up with CleanupStrategyType; I didn't see or know of any comparable enum examples in our codebase.
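On the enum question: the common Go pattern (a sketch only; the constant names are guesses and may not match what this PR currently defines) is a named type with typed constants and a small validity check:

```go
// CleanupStrategyType enumerates the supported cleanup strategies.
type CleanupStrategyType string

const (
	CleanupStrategyImmediate CleanupStrategyType = "immediate"
	CleanupStrategyLazy      CleanupStrategyType = "lazy"
	CleanupStrategyEvent     CleanupStrategyType = "event"
)

// Valid reports whether s is one of the known strategies.
func (s CleanupStrategyType) Valid() bool {
	switch s {
	case CleanupStrategyImmediate, CleanupStrategyLazy, CleanupStrategyEvent:
		return true
	}
	return false
}
```

A string-backed type keeps config files and CLI flags readable; an int-backed iota enum works just as well if the values never leave the process.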

@Gregory-Pereira force-pushed the 305-worker-post-upload-cleanup-strategy branch from 449407a to b48d701 on April 28, 2024 at 22:54
pressureEvent = mediumDiskPressureEvent
} else {
pressureEvent = highDiskPressureEvent

@Gregory-Pereira
Collaborator Author

Maybe I could set the strategy to immediate if it hits highDiskPressureEvent...?

@Gregory-Pereira
Collaborator Author

Confirmed `at` does not ship with Ubuntu 22.04 out of the box, so I will need to find another way. Considering cron, but I don't want jobs for specific file paths kicking around and running every 3 days or so; that could result in hundreds of cron jobs. Another potential solution is a separate, time-based queue for cleanup jobs ...

@vishnoianil
Member

@Gregory-Pereira

  • Immediate - It will delete the local data once all the files are uploaded to the S3 bucket.
  • Lazy - It should take a duration as input. For example, it will wake up every two weeks (set by the user) and clean up all the data that has been stored for more than two weeks.
  • Event - It should delete the oldest directories until the disk pressure goes down to 70% (see the sketch below).
    And I think it would be better if we can leverage the os/fs packages (or any other stable Go package) for finding the target directories and deleting them; that way we don't have to take care of OS-specific issues unless we hit a roadblock there.
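A rough shape for that Event pass, assuming a caller-supplied usage check so the sketch stays OS-agnostic (the function names and signatures are illustrative, not existing code):

```go
import (
	"os"
	"path/filepath"
	"sort"
	"time"
)

// cleanOldestUntil deletes the oldest top-level directories under root until
// usagePct reports a value at or below target (e.g. 70). usagePct would wrap
// whatever disk-usage check the worker already performs; all names here are
// placeholders.
func cleanOldestUntil(root string, target float64, usagePct func() (float64, error)) error {
	entries, err := os.ReadDir(root)
	if err != nil {
		return err
	}
	// Collect directories with their modification times.
	type dirInfo struct {
		path string
		mod  time.Time
	}
	var dirs []dirInfo
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		info, err := e.Info()
		if err != nil {
			return err
		}
		dirs = append(dirs, dirInfo{path: filepath.Join(root, e.Name()), mod: info.ModTime()})
	}
	// Oldest first.
	sort.Slice(dirs, func(i, j int) bool { return dirs[i].mod.Before(dirs[j].mod) })
	for _, d := range dirs {
		pct, err := usagePct()
		if err != nil {
			return err
		}
		if pct <= target {
			return nil
		}
		if err := os.RemoveAll(d.path); err != nil {
			return err
		}
	}
	return nil
}
```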

I think at this point we can just implement a simple Lazy policy (sketched below) and see how well it works. We can implement the rest of the policies in the future if needed.
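For what it's worth, a minimal sketch of the Lazy wake-up loop, assuming the user-set interval doubles as the age cutoff described above (all names are placeholders, not the PR's current API):

```go
import (
	"context"
	"os"
	"path/filepath"
	"time"
)

// runLazyCleanup wakes up every interval and removes top-level directories
// under root that have not been modified for longer than interval.
// Placeholder names; wiring into the worker's lifecycle is left open.
func runLazyCleanup(ctx context.Context, root string, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			cutoff := time.Now().Add(-interval)
			entries, err := os.ReadDir(root)
			if err != nil {
				continue // log and retry on the next tick
			}
			for _, e := range entries {
				if !e.IsDir() {
					continue
				}
				if info, err := e.Info(); err == nil && info.ModTime().Before(cutoff) {
					_ = os.RemoveAll(filepath.Join(root, e.Name()))
				}
			}
		}
	}
}
```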

@Gregory-Pereira
Collaborator Author

@vishnoianil I am confused about whether this cleanup will take place on a static directory. I was under the assumption that both of these cleaning operations would happen after pre-check / generate, in which case wouldn't the path be specific to the data for that call? It seems odd to use a lazy strategy for data tied to a specific run. Please let me know if I am misunderstanding something (I am almost certain I am).
