-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Priority plugin work-in-progress #290
Merged
Merged
+3,002
−54
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The new priority plugin can be loaded in addition to the existing fcfs or backfill plugins. With the introduction of the priority plugin, the existing "sched_plugin" had to be renamed to the "behavior_plugin". "sched_plugin" now refers to any plugin loaded by the sched module. The two categories of sched plugins, with different API's, are the behavior and priority plugins.
This is a continuation (but not a completion) of the multi-factor priority plugin that was initiated by Don Lipari. Don retired before there was time to complete the plugin. The code was uplifted from C to C++. The idea behind that change is that some of the core scheduler code is also C++, and a native C++ priority plugin would allow an API where the parts can directly exchange STL data structures such as vectors and remain RAII through out the process rather than dropping down to C and manual memory management. The parser for the account/user information from the SLURM sacct command had numerous bugs. Many are now fixed. Also, the parser and the rest of the code were improved to allow the sacct information to be reloaded and adjusted online (previously the data was completly static after initial read). Some of the sacctmgr parsing bugs were: Add better error handling for missing "|". A missing "|" would have resulted in the an entry containing information from the previous line. Now at least the bad line will be ignored with and error. The code failed to properly copy temporary strings. That is also resolved. prio_node_create() makes copies of its inputs. Proper error handling is added for memory allocation failures, and prio_node_destroy() is added to clean up the structure. Also we correct the code to allow the "if (user)" check to function. Previously "user" was a pointer to an empty string, meaning that the condition would never be false. Now the parser checks for an empty field and sets "user" to NULL rather than pointing to an empty string. If white space were ever to appear in the field, this check would not function correctly. Also, often in the code the term "child" was used when it really meant "sibling". Naming was changed to better reflect the behavior. A simple first test command is added. norm_usage/norm_shares were caluclated incorrectly. That has been fixed. The code comment describing the algorithm did not really match the algorithm. That has been somewhat improved. Under the "sched" directory make a new "priority" sub-directory. The "priority" direcotry will contain the various priority plugins. The first priority plugin is in the mod_fair_tree (Modified Fair Tree) subdirectory. The files in the plugin are first compiled into a libtool convenience library which is in turn used to both build the shared library and directly linked against by test applications. Test applications can include either of the headers, depending on testing needs. The shared library will only export the functions that appear in the "priority_external_api.hxx" file, so use of the internal header and the convenience library give additional symbol access for testing. The implementation of the fair share algorithm is improved to begin genericizing it from some of the odd sacctmgr designs (e.g. the "parent" account changes column from either "Parent" or "Account" based on the values of _other_ columns). The recursion is cleaned up considerably. Make it possible to add and remove accounts/users (WIP). While accounts can be removed (just load in a new set of data from sacctmgr that has the account missing), there is perhaps more work to be done here. For instance, if a user has a job running and meanwhile that user is removed from the job's current account, what needs to happen? Presumably we need to keep record of that user until all jobs from that user are complete so that resource usage can correctly be applied to the parent accounts up the tree. So the user perhaps the user needs to be marked deleted and no longer eligible for new jobs, but kept around until all current jobs are complete. Duplicate accounts are ignored. operator<< functions are provided for debugging purposes.
Codecov Report
@@ Coverage Diff @@
## master #290 +/- ##
=========================================
- Coverage 74.69% 74.19% -0.5%
=========================================
Files 49 49
Lines 9080 9176 +96
=========================================
+ Hits 6782 6808 +26
- Misses 2298 2368 +70
Continue to review full report at Codecov.
|
Closed
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This pull request preserves the priority plugin work-in-progress of Don and Chris for posterity.