Prediction API #5
Merged
Conversation
cmelone force-pushed the add/predict branch 2 times, most recently from dcb99d4 to aab0a7d on January 31, 2024 07:28
alecbcs requested changes on Feb 5, 2024
This is looking really good! Just a few initial impressions and a small suggestion.
cmelone commented on Feb 9, 2024
alecbcs reviewed on Feb 16, 2024
cmelone force-pushed the add/predict branch 2 times, most recently from 790dbdf to 87eebb8 on March 7, 2024 21:08
alecbcs reviewed on Apr 25, 2024
Rather than grabbing the 4-5 rows that match on the other params (name and version for both pkg and compiler) and then filtering by variant, these queries are now split in two. The program first tries an exact match on variants; if that doesn't work, it matches on expensive variants only. This increases the number of queries being made, but it's a more robust system and will result in more matches.
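A rough sketch of that two-step lookup, assuming an aiosqlite-style connection; the table and column names, and the `extract_expensive()` helper, are hypothetical stand-ins, not Gantry's actual schema:

```python
# Hypothetical base query; schema names are assumptions for illustration.
BASE = (
    "SELECT cpu_mean, mem_mean FROM jobs "
    "WHERE pkg_name=? AND pkg_version=? AND compiler_name=? AND compiler_version=?"
)

async def match_samples(db, spec: dict) -> list:
    params = [spec["pkg_name"], spec["pkg_version"],
              spec["compiler_name"], spec["compiler_version"]]

    # Query 1: exact match on the full variant string.
    async with db.execute(BASE + " AND pkg_variants=?",
                          params + [spec["pkg_variants"]]) as cur:
        rows = await cur.fetchall()
    if rows:
        return rows

    # Query 2: fall back to matching only the expensive variants (e.g. +cuda),
    # which dominate resource usage. extract_expensive() is hypothetical.
    expensive = extract_expensive(spec["pkg_variants"])
    async with db.execute(BASE + " AND expensive_variants=?",
                          params + [expensive]) as cur:
        return await cur.fetchall()
```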
We have two options for accepting multiple predictions: 1) instructing the client to send a request for each prediction, or 2) allowing the client to pass a list of specs to predict. The second option presents a problem when we get into the weeds of GET request length limits. Since specs can vary widely in length when running json.dumps() on them, it would be difficult either to instruct the client to limit their length or to somehow validate it. When we were allowing bulk prediction in one GET request, it was handled through an asyncio.gather() call. That approach is approximately 2x faster than individual HTTP requests when testing 5000 consecutive calls to the API (4s vs. 8s). In this case, 8 seconds is not very long, so it's not a problem, but we can revisit this in the future if we run into performance issues.
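For reference, a minimal sketch of the asyncio.gather() bulk pattern described above; `predict()` here stands in for the per-spec prediction coroutine and is not Gantry's actual function name:

```python
import asyncio

async def predict(spec: dict) -> dict:
    ...  # look up samples and compute a suggested allocation for one spec

async def predict_bulk(specs: list[dict]) -> list[dict]:
    # One coroutine per spec, all run concurrently within a single request.
    return await asyncio.gather(*(predict(s) for s in specs))
```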
Co-authored-by: Alec Scott <[email protected]>
alecbcs approved these changes on May 1, 2024
Looks good to me. Thanks @cmelone!
This PR implements an endpoint that accepts a spec payload via a GET request to /v1/allocation. It will also accept an array of these dictionaries.
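The exact schema isn't captured in this extract; the following is a hypothetical example of a single payload dictionary, with field names inferred from the covariate list below and the variants discussion, and with made-up values:

```python
# Hypothetical payload; field names inferred, values invented for illustration.
payload = {
    "pkg_name": "gmake",
    "pkg_version": "4.4.1",
    "pkg_variants": "~guile",
    "compiler_name": "gcc",
    "compiler_version": "11.4.0",
}
```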
The program will then search for 4 ≤ x ≤ 5 samples with the following covariates (in priority order):
("pkg_name", "pkg_version", "compiler_name", "compiler_version")
("pkg_name", "compiler_name", "compiler_version")
("pkg_name", "pkg_version", "compiler_name")
("pkg_name", "compiler_name")
("pkg_name", "pkg_version")
("pkg_name",)
We've chosen this order after running simulations of this prediction algorithm and finding that (1) is the best predictor of resource usage.
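A sketch of that priority-ordered fallback: try each covariate combination in order until enough samples turn up. `query_samples()` is an assumed lookup helper, not part of the actual codebase:

```python
# Covariate combinations, mirroring the priority order listed above.
COMBOS = [
    ("pkg_name", "pkg_version", "compiler_name", "compiler_version"),
    ("pkg_name", "compiler_name", "compiler_version"),
    ("pkg_name", "pkg_version", "compiler_name"),
    ("pkg_name", "compiler_name"),
    ("pkg_name", "pkg_version"),
    ("pkg_name",),
]

async def collect_sample(db, spec: dict, min_rows: int = 4):
    for combo in COMBOS:
        rows = await query_samples(db, {k: spec[k] for k in combo})
        if len(rows) >= min_rows:
            return rows
    return None  # no usable history; leave the current allocation in place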
After collecting the sample, the program returns a mean of past CPU and memory usage to suggest an appropriate k8s allocation. For the moment, we are only targeting requests.
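A minimal sketch of that step; the row keys and units are assumptions for illustration:

```python
from statistics import mean

def suggest_requests(rows: list[dict]) -> dict:
    return {
        "cpu_request": mean(r["cpu_mean"] for r in rows),  # cores
        "mem_request": mean(r["mem_mean"] for r in rows),  # bytes
    }
```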
Additionally, we have implemented a safeguard ensuring that the requests will not be set below what is currently allocated for each package. spack/spack#42351, which was merged a few days ago, increases the allocation for many packages (based on max memory per package, rather than the mean as we do here). My hypothesis is that this will render many of the predictions futile, but it's still a good opportunity for a Gantry trial run.
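The safeguard amounts to a per-resource clamp, roughly like this sketch (the current allocation would come from the existing CI config; names are illustrative):

```python
def apply_floor(predicted: dict, current: dict) -> dict:
    # Never suggest less than what is already allocated for the package.
    return {k: max(v, current.get(k, 0)) for k, v in predicted.items()}
```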
I'm looking for any ways to improve the code or make it more concise!