Fix ml autoscaling for zero allocations #114982
Merged

Changes from 2 commits

Commits (4)
e5c8da6  Fix estimated memory usage for a model with zero allocations. (jan-elastic)
d4d069d  Ignore number of threads of models with zero allocations in autoscali… (jan-elastic)
2da5870  Add some long overdue comments. (jan-elastic)
aa12b95  Another estimateMemoryUsageBytes fix (jan-elastic)
@@ -242,72 +242,72 @@ static void getMemoryAndProcessors(

    final int numMissingProcessors = numMissingAllocations * numberOfThreadsPerAllocation;
    int numExistingProcessorsToBeUsed = Math.min(numMissingProcessors, numberOfAvailableProcessors);

    if (numberOfRequestedAllocations == 0) {
        continue;
    }
    if (assignment.getNodeRoutingTable().isEmpty() == false
        && assignment.getNodeRoutingTable().values().stream().allMatch(r -> r.getState().consumesMemory() == false)) {
        // Ignore states that don't consume memory, for example all allocations are failed or stopped
        // if the node routing table is empty, then it will match the above condition, but it needs to be handled in the next branch
        continue;
    }

    if (assignment.getNodeRoutingTable().isEmpty() == false) {
        // if the routing table is non-empty, this is an existing model
        existingModelMemoryBytes += estimatedMemoryUsage;
    } else {
        // only increase memory requirements for new models
        extraPerNodeModelMemoryBytes += Math.max(extraPerNodeModelMemoryBytes, estimatedMemoryUsage);
        extraModelMemoryInBytes += estimatedMemoryUsage;
    }

    // if not low priority, check processor requirements.
    if (Priority.LOW.equals(modelAssignment.getValue().getTaskParams().getPriority()) == false) {
        if (numMissingProcessors > numberOfAvailableProcessors) {
            // as assignments can be placed on different nodes, we only need numberOfThreadsPerAllocation here
            extraProcessors += numMissingProcessors - numExistingProcessorsToBeUsed;
            extraPerNodeProcessors = Math.max(extraPerNodeProcessors, 1); // if extra processors >0, we need at least 1
            // extraPerNodeProcessors
        }
        if (perNodeAvailableProcessors < numberOfThreadsPerAllocation) {
            extraPerNodeProcessors = Math.max(extraPerNodeProcessors, numberOfThreadsPerAllocation);
        }
        numberOfAvailableProcessors -= numExistingProcessorsToBeUsed;
    }

    if (extraProcessors > 0 || extraPerNodeProcessors > 0 || extraModelMemoryInBytes > 0 || extraPerNodeModelMemoryBytes > 0) {
        logger.info(
            () -> format(
                "trained model [%s] assigned to [%s], waiting for [%d] allocations to start due to missing hardware",
                modelAssignment.getKey(),
                Strings.arrayToCommaDelimitedString(modelAssignment.getValue().getStartedNodes()),
                numMissingAllocations
            )
        );
    }

    for (String node : modelAssignment.getValue().getNodeRoutingTable().keySet()) {
        sumOfCurrentlyExistingAndUsedProcessors += modelAssignment.getValue()
            .getNodeRoutingTable()
            .get(node)
            .getTargetAllocations() * numberOfThreadsPerAllocation;

        jobRequirementsByNode.computeIfAbsent(node, k -> new ArrayList<>())
            .add(
                MlJobRequirements.of(
                    estimatedMemoryUsage,
                    Priority.LOW.equals(modelAssignment.getValue().getTaskParams().getPriority())
                        ? 0
                        : modelAssignment.getValue().getNodeRoutingTable().get(node).getTargetAllocations()
                            * numberOfThreadsPerAllocation
                )
            );
    }

    // min(3, max(number of allocations over all deployed models)
    // the minimum number of nodes is equal to the number of allocations, up to 3
    // if the number of allocations is greater than 3, then wantedMinNodes is still 3
    // in theory this should help availability for 2-3 allocations
    // the planner should split over all available nodes
    minNodes = Math.min(3, Math.max(minNodes, numberOfRequestedAllocations));
}

// dummy autoscaling entity

Review comment: Everything down here in this file is just indentation. If you want to review it, I'd recommend settings -> hide whitespace.
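As a reading aid, the control flow this hunk introduces can be condensed into a small sketch. The Deployment record and its fields below are illustrative stand-ins, not the real Elasticsearch assignment types; the only point is that deployments with zero requested allocations are skipped before any memory or processor accounting, and that only models without a routing table grow the extra-memory requirement.

```java
import java.util.List;
import java.util.Map;

// Illustrative stand-in for a trained model assignment; not the real Elasticsearch types.
record Deployment(String id, int requestedAllocations, long estimatedMemoryUsageBytes, Map<String, Integer> routingTable) {}

final class ZeroAllocationAccountingSketch {

    // Mirrors the guard structure of the hunk above: zero-allocation deployments are
    // skipped entirely, existing models (non-empty routing table) are tallied separately,
    // and only new models add to the extra-memory requirement.
    static long extraModelMemoryBytes(List<Deployment> deployments) {
        long existingModelMemoryBytes = 0;
        long extraModelMemoryBytes = 0;
        for (Deployment d : deployments) {
            if (d.requestedAllocations() == 0) {
                continue; // a deployment scaled to zero needs no memory or processors
            }
            if (d.routingTable().isEmpty() == false) {
                existingModelMemoryBytes += d.estimatedMemoryUsageBytes();
            } else {
                extraModelMemoryBytes += d.estimatedMemoryUsageBytes();
            }
        }
        return extraModelMemoryBytes;
    }

    public static void main(String[] args) {
        List<Deployment> deployments = List.of(
            new Deployment("scaled-to-zero", 0, 500_000_000L, Map.of()),
            new Deployment("new-model", 2, 300_000_000L, Map.of())
        );
        // Prints 300000000: the zero-allocation deployment contributes nothing.
        System.out.println(extraModelMemoryBytes(deployments));
    }
}
```

For the minNodes line at the end of the hunk, Math.min(3, Math.max(minNodes, numberOfRequestedAllocations)) works out to 1 node for a single allocation, 2 for 2 allocations, and caps at 3 for anything larger.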
Review comment: This method is on TaskParams, i.e. StartTrainedModelDeploymentAction.TaskParams.estimateMemoryUsageBytes(). There is another public method, StartTrainedModelDeploymentAction.estimateMemoryUsageBytes(), on line 792 that needs this check. If StartTrainedModelDeploymentAction.estimateMemoryUsageBytes() can return 0, then line 635, "+ (cacheSize.getBytes() - modelBytes);", needs a Math.max(0, ...) to ensure the return is non-negative.

Reply: Fixed StartTrainedModelDeploymentAction.estimateMemoryUsageBytes. I don't think the Math.max is necessary: StartTrainedModelDeploymentAction.estimateMemoryUsageBytes returns 0 only if the number of allocations is 0, in which case TaskParams.estimateMemoryUsageBytes already returns 0.
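To make the reply concrete, here is a minimal sketch of the relationship being discussed. The memory formula and overhead constant are assumptions for illustration, not copied from StartTrainedModelDeploymentAction; the point is that the static estimate returns 0 only when the number of allocations is 0, and the TaskParams-level estimate returns 0 in that same case before the "(cacheSize.getBytes() - modelBytes)" adjustment is ever applied, so no Math.max(0, ...) clamp is needed.

```java
// Illustrative sketch only; the formula and the overhead constant are assumed,
// not copied from Elasticsearch.
final class MemoryEstimateSketch {

    private static final long NATIVE_EXECUTABLE_CODE_OVERHEAD = 240L * 1024 * 1024; // assumed constant

    // Analogue of the static StartTrainedModelDeploymentAction.estimateMemoryUsageBytes(...):
    // with zero allocations there is nothing to load, so the estimate is 0.
    static long estimateMemoryUsageBytes(long modelBytes, long perDeploymentBytes, long perAllocationBytes, int numberOfAllocations) {
        if (numberOfAllocations == 0) {
            return 0;
        }
        return NATIVE_EXECUTABLE_CODE_OVERHEAD + 2 * modelBytes + perDeploymentBytes + perAllocationBytes * numberOfAllocations;
    }

    // Analogue of TaskParams.estimateMemoryUsageBytes(): it also returns 0 for zero
    // allocations, so the cache-size adjustment is never added to a zero base.
    static long taskParamsEstimateMemoryUsageBytes(
        long modelBytes,
        long perDeploymentBytes,
        long perAllocationBytes,
        int numberOfAllocations,
        long cacheSizeBytes
    ) {
        if (numberOfAllocations == 0) {
            return 0;
        }
        long base = estimateMemoryUsageBytes(modelBytes, perDeploymentBytes, perAllocationBytes, numberOfAllocations);
        // under the assumed formula, base already contains 2 * modelBytes, so subtracting
        // modelBytes here cannot make the result negative
        return base + (cacheSizeBytes - modelBytes);
    }
}
```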