personalizable resource limits #618

Open
2 of 4 tasks
Tracked by #951
esraneufeld opened this issue Apr 7, 2022 · 17 comments · Fixed by ITISFoundation/osparc-simcore#3983 or ITISFoundation/osparc-simcore#3989
Assignees
Labels
Epic Zenhub label (Pleas do not modify) PO issue Created by Product owners

Comments

@esraneufeld
Member

esraneufeld commented Apr 7, 2022

To avoid overscheduling of nodes, services should be scheduled with minimal resource requirements. So as not to limit "power users", those requirements can be dynamically overridden by users via the osparc user interface. The ability to override is tied to user/group permissions.
(computational backend for now, also keep in mind #576)

Tasks

  1. a:apiserver
    pcrespov
  2. a:apiserver
    bisgaard-itis
  3. a:webserver
    sanderegg

Sundae

  1. 2 of 4
    a:infra+ops
    mrnicegyu11 sanderegg

@sanderegg
Member

sanderegg commented Apr 29, 2022

Update on sprint Macarons

Done

Todo

  • allow specific users to change limits on computational resources
    • save resources per node in project
    • handle sharing of projects
  • if the resources are not available osparc should not run and show a clear message
  • extend to dynamic services

@sanderegg
Member

Update on sprint Croissant

Done

Ongoing

Open

  • Manual overriding services resources (reservations and limits)
  • Allow specific users to change limits on computational services in osparc GUI
    • save resources per node in project
    • handle sharing of projects
  • if resource not available osparc should not run and show a clear message
  • extend to dynamic services

@elisabettai
Contributor

@mguidon, could something be done this sprint to prevent the case MS was hitting (see this message in Mattermost)? For the POs this is still very important and needed for the upcoming s4l releases.
Could you please discuss with the backend team and see what can be done?

@mguidon
Member

mguidon commented Nov 11, 2022

Hi. Strictly speaking, personalized resources are not required for s4l:web:lite. Melanie's problem stems from her using a very outdated legacy service (with strict resource limits of an outrageous 96 GB RAM) and from her needing more RAM than the default of the newer version (16 GB).
If we want to work on that, I suggest doing a minimal thing that allows us, but not yet the users, to override the limits (e.g. enhancing the services_specifications table in the db).
@sanderegg @GitHK any thoughts on this?

@sanderegg
Member

sanderegg commented Nov 14, 2022

@sanderegg
Member

sanderegg commented Dec 1, 2022

@sanderegg sanderegg added this to the Mithril milestone Feb 28, 2023
@sanderegg
Member

sanderegg commented Mar 2, 2023

@sanderegg sanderegg added this to the Jelly Beans milestone Apr 5, 2023
@sanderegg sanderegg assigned mguidon and unassigned GitHK Apr 6, 2023
@sanderegg
Member

sanderegg commented Apr 6, 2023

Some notes

current implementation

  • modern dynamic services and all computational services can have personalized resource limits/reservations
  • these custom changes are set per user/group manually through the osparc database
  • making these changes requires access to that database (dev-team or app-team) and knowing what one is doing

If users need to dynamically change a service's resources, I see the following requirements:

  • UI to allow changing the resource requirements inside a study
  • The allowed resource requirements depend on the user/group affiliations and on access to actual computational resources (which might be known if running on an internal cluster, or unknown if on an external cluster)
  • the selected resource requirements shall be saved with the study for persistence
  • One might have different default resources based on user/group affiliations
  • These defaults will need to be set by some kind of administrator (the cluster's owner)
  • When sharing a study with someone else, what happens when these resource requirements are not available to that user/group?

@sanderegg
Member

sanderegg commented Apr 6, 2023

Goal for sprint Jelly Beans

Write down in detail how this will be surfaced to the users

  • Default resources
  • Max resources
  • How to link this to auto-scaling

Potential first use case:
Melanie can choose how much RAM the ti jupyter smash has available. Default will be what is on the label. If the admin allows it, she can individually change the settings upon starting of the service.

@sanderegg
Member

sanderegg commented Apr 25, 2023

Update on sprint Jelly Beans

Towards user-defined resources on services plan

1. Running a service with dynamically changed resources

  • a service's maximum authorized resources per user/group can be defined (database changes); if not defined, they default to the current defaults (defined by service labels or overridden defaults)
  • a service's defined resources are persisted with the project it is used in and are reloaded with the project
  • a service's defined resources are visible in the UI
  • a service's defined resources can be modified within a project up to the maximum authorized resources for the current user
  • a service's defined resources are modifiable through the UI
  • the backend starts/runs the service based on the resources defined for that service within the project
  • should be accessible from the Public API as well
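
As a rough illustration of the bullets above (not the actual osparc-simcore implementation), the "fall back to defaults, then check against the maximum" logic could be sketched as follows. All names here (`Resources`, `resolve_max_authorized`, `validate_request`) are hypothetical, and rejecting an oversized request rather than silently clamping it is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resources:
    """Hypothetical container for the resources a service asks for."""
    cpus: float
    ram_bytes: int


def resolve_max_authorized(defaults: Resources, group_max: Optional[Resources]) -> Resources:
    """Return the per-user/group maximum, or the service defaults if none is defined."""
    return group_max if group_max is not None else defaults


def validate_request(requested: Resources, max_authorized: Resources) -> Resources:
    """Accept the requested resources only if they are positive and within the maximum."""
    if requested.cpus <= 0 or requested.ram_bytes <= 0:
        raise ValueError("requested resources must be positive")
    if requested.cpus > max_authorized.cpus or requested.ram_bytes > max_authorized.ram_bytes:
        raise ValueError(
            f"requested {requested} exceeds maximum authorized {max_authorized}"
        )
    return requested
```

The validated value is what would be persisted with the project node and later handed to the backend when starting the service.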

2. administrator access to a service's default resources

  • administrator can override the default resources for a service (equivalent to the current manual change in the database)
  • administrator can set different default resources for different users/groups (equivalent to the current manual change in the database)
  • administrator can change the maximum authorized resources for a service for different users/groups
  • system shall prevent an administrator from claiming an entire shared machine by capping the maximum at some factor of its capacity (e.g. if the most powerful machine has 32 CPUs, then the maximum usable could be 8 CPUs)
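
The factor idea in the last bullet can be sketched as below. This is only an illustration: the function names are hypothetical, and the default factor of 0.25 is simply what the 32 CPUs to 8 CPUs example implies, not an agreed value.

```python
def max_usable(largest_machine_capacity: float, factor: float = 0.25) -> float:
    """Upper bound an administrator may authorize, as a fraction of the largest machine."""
    if not 0 < factor <= 1:
        raise ValueError("factor must be in (0, 1]")
    return largest_machine_capacity * factor


def cap_admin_limit(requested_max: float, largest_machine_capacity: float, factor: float = 0.25) -> float:
    """Clamp an administrator-chosen maximum so a shared machine is never fully claimed."""
    return min(requested_max, max_usable(largest_machine_capacity, factor))
```

With a 32-CPU machine and the assumed factor, `max_usable(32)` gives 8.0, so an administrator asking for 16 CPUs would be capped at 8.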

3. sharing a study

  • pre-check whether the sharee has the necessary resources available (similar to user rights)
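
One possible shape for such a pre-check, purely as a sketch: compare the resources saved per node in the study with what the sharee is allowed to use, and report the nodes that would not be startable. The function name, the dictionary layout and the rule that a missing entry counts as "not available" are all assumptions.

```python
from typing import Dict, List

# Hypothetical layout: resource name -> amount, e.g. {"CPU": 4, "RAM": 16}
ResourceMap = Dict[str, float]


def nodes_exceeding_limits(
    node_requirements: Dict[str, ResourceMap],
    sharee_max: Dict[str, ResourceMap],
) -> List[str]:
    """Return the ids of nodes whose saved resources exceed the sharee's maximum."""
    blocked = []
    for node_id, required in node_requirements.items():
        allowed = sharee_max.get(node_id, {})
        # Assumption: a resource with no defined maximum is treated as unavailable (0).
        if any(amount > allowed.get(name, 0) for name, amount in required.items()):
            blocked.append(node_id)
    return blocked
```

An empty result would mean the study can be shared as-is; otherwise the UI could list the offending nodes before the share is confirmed.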

Use-case:

  • Melanie can choose how much RAM the ti jupyter smash has available. Default will be what is on the label. If the admin allows it, she can individually change the settings upon starting of the service.

After discussion with @mguidon, the idea is to go with 1. first and wait with 2. until the capabilities of dynamic clusters are better defined. 3. could be implemented once 1. is done.

@mguidon
Member

mguidon commented Apr 27, 2023

  • Is the administrator one of us or a user? For the latter, how does a user become administrator?
  • I think we should default to a selection of CPU/RAM and GPU configs to choose from (e.g. high CPU, high RAM or similar)
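
To make the second point concrete, a fixed menu of configurations could look like the sketch below. The preset names and every number in it are placeholders invented for illustration, not proposed values.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Preset:
    """Hypothetical named resource configuration."""
    cpus: float
    ram_gib: int
    gpus: int = 0


# Placeholder values only; real presets would be decided by the team.
PRESETS: Dict[str, Preset] = {
    "default": Preset(cpus=2, ram_gib=8),
    "high-cpu": Preset(cpus=8, ram_gib=8),
    "high-ram": Preset(cpus=2, ram_gib=64),
    "gpu": Preset(cpus=4, ram_gib=16, gpus=1),
}


def select_preset(name: str) -> Preset:
    """Look up a preset by name, with a readable error for unknown names."""
    try:
        return PRESETS[name]
    except KeyError:
        raise ValueError(f"unknown preset {name!r}; choose from {sorted(PRESETS)}") from None
```

Offering a small set of named choices would avoid exposing free-form CPU/RAM fields while still covering the "power user" cases.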

@sanderegg
Member

  • Should also be accessible from the PublicAPI

@sanderegg sanderegg modified the milestones: Jelly Beans, Pastel de Nata May 15, 2023
@sanderegg
Member

@sanderegg
Member

sanderegg commented Jul 7, 2023

Update Watermelon

Done:

Summary:
The user can now change a service's required resources (CPUs, RAM) through the oSparc GUI. These changes are persisted within the project. There is currently no upper bound for resources. The platform will try to start the service using the defined required resources.

Open:

  • PublicAPI changes to present the same options as the GUI
  • Allow selecting pre-defined EC2 instances to run services

@sanderegg
Member

@matusdrobuliak66 matusdrobuliak66 modified the milestones: Sundae, Baklava Aug 22, 2023
@pcrespov pcrespov removed this from the Baklava milestone Apr 8, 2024
8 participants