flux wreck purge: calling nil, doesn't purge #1356

Closed
trws opened this issue Mar 15, 2018 · 5 comments

@trws
Member

trws commented Mar 15, 2018

Apparently splash wants to run somewhere between 1 million and 2 million jobs in one instance... The purge command seemed like a good idea, but the unlinks never get committed. Specifically, f:commit on line 421 is a nil value when it is called. Does it possibly need to be called on a kvsdir object?

@grondo
Contributor

grondo commented Mar 15, 2018

flux wreck purge should be tested in the testsuite, but otherwise hasn't gotten much actual use. The commit method should work on a flux handle object or kvsdir object, iirc. Maybe the flux handle f was destroyed before the call?

I can look more tomorrow... Sorry

@grondo
Contributor

grondo commented Mar 15, 2018

Sorry, I should have checked code before responding. Thinking it over, I believe you may be correct that commit should be called on a kvsdir object. I'll verify in the morning, but didn't want to leave incorrect information here.

@grondo
Contributor

grondo commented Mar 15, 2018

Well, I was wrong, there is no test for flux wreck purge -- that's embarrassing.

Probably f:commit() needs to be changed to f:kvs_commit(), but I'll test functionality quickly before submitting a fix. Sorry about that!

@trws
Member Author

trws commented Mar 15, 2018

Thanks for checking it out so quickly @grondo! Much appreciated.

@grondo
Contributor

grondo commented Mar 15, 2018

No problem, not sure how that issue slipped through the cracks!

BTW, flux wreck purge was designed to be used in conjunction with flux cron to keep job entries low in the kvs for long-running instances. I can't find the exact combined usage we've used before, but the best way is to run flux wreck purge from a flux cron entry that triggers on every N job completions, e.g. something like

flux cron event --nth=1000 --min-interval=10s wreck.state.complete flux wreck purge -Rt 1000

(Every 1000 job completions, purge job entries from the kvs, keeping 1000 entries, and run at most once every 10 seconds.)
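
For anyone adapting the command above, here is a minimal sketch that breaks its three tunables out into shell variables. The variable names (NTH, MIN_INTERVAL, KEEP) are made up for illustration and are not part of flux. The flags are copied as-is from the example above, and the comments on them only restate the parenthetical explanation. The script just builds and prints the command; it does not contact a flux instance.

```shell
#!/bin/sh
# Hypothetical variable names; the values mirror the example command above.
NTH=1000            # trigger on every Nth wreck.state.complete event
MIN_INTERVAL=10s    # run the purge at most once per this interval
KEEP=1000           # argument to purge's -t option (entries to keep, per the note above)

# Assemble the same flux cron entry shown above. -R is copied unchanged from that example.
cron_cmd="flux cron event --nth=${NTH} --min-interval=${MIN_INTERVAL} wreck.state.complete flux wreck purge -Rt ${KEEP}"

# Print the command for review; run it by hand inside the instance to register the entry.
echo "$cron_cmd"
```

With the defaults this prints the same command as the example, so the values can be adjusted in one place before registering the cron entry in a long-running instance.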


2 participants