Computational clusters: when the EC2 vCPU quota is reached the cluster does not start processing #1277

Closed
Tracked by #950
sanderegg opened this issue Feb 27, 2024 · 1 comment · Fixed by ITISFoundation/osparc-simcore#5408
Labels
Feedback Feedback through frontend type:bug Issue that prevents to perform a certain task, features that don't work as t

Comments

@sanderegg
Member

Long Story Short

A simulation that requires 10 or more c6a.24xlarge machines is never processed.

Expected Behavior

The simulation should complete.
When the number of requested machines exceeds the current quota, the autoscaling service should reduce the number until it manages to create at least one machine.
The user should get feedback about this (that is a separate feature).
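
The expected retry behaviour could look roughly like the sketch below. It is not the actual autoscaling code: `QuotaExceededError`, `start_as_many_as_possible` and the fake launcher are hypothetical names invented for illustration, and the 500 free vCPUs are a made-up number. Only the 96 vCPUs per c6a.24xlarge come from the published instance size.

```python
from typing import Callable, List


class QuotaExceededError(Exception):
    """Hypothetical stand-in for the quota error returned by the cloud provider."""


def start_as_many_as_possible(launch: Callable[[int], List[str]], wanted: int) -> List[str]:
    """Request `wanted` instances, retrying with one fewer on each quota error."""
    for count in range(wanted, 0, -1):
        try:
            return launch(count)  # all-or-nothing request for `count` instances
        except QuotaExceededError:
            continue  # over quota: try again with a smaller request
    return []  # not even a single instance fits


def make_fake_launcher(vcpus_per_instance: int, free_vcpus: int) -> Callable[[int], List[str]]:
    """Build a fake launcher that enforces a vCPU budget (illustration only)."""

    def launch(count: int) -> List[str]:
        if count * vcpus_per_instance > free_vcpus:
            raise QuotaExceededError(f"{count} instances exceed {free_vcpus} free vCPUs")
        return [f"i-fake-{n}" for n in range(count)]

    return launch


# 96 vCPUs is the size of one c6a.24xlarge; 500 free vCPUs is an assumed value.
launcher = make_fake_launcher(vcpus_per_instance=96, free_vcpus=500)
started = start_as_many_as_possible(launcher, wanted=10)
print(len(started))
```

With these assumed numbers the request for 10 machines shrinks to 5, since 5 * 96 = 480 fits in 500 vCPUs while 6 * 96 = 576 does not. Decreasing by one per attempt keeps the sketch simple; halving the count would need fewer requests at the cost of possibly starting fewer machines than the quota allows.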

Actual behaviour

  • the computational cluster autoscaling service requests X EC2 instances, with 1 <= X <= EC2_INSTANCES_MAX_INSTANCES, to run the computational jobs
  • if X * vCPUs per instance exceeds the assigned AWS EC2 quota, an error is returned and the autoscaling service does not retry with a smaller number of machines.
    --> the user gets no feedback
    --> the computation does not run at all
    --> a developer has to intervene at a relatively deep level (access the cluster, lower the EC2_INSTANCES_MAX_INSTANCES variable to unblock the computation)
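
The quota arithmetic behind this failure can be sketched as follows. The function names and the 900 vCPU quota are assumptions made for illustration, not values from the deployment; 96 vCPUs is the published size of one c6a.24xlarge.

```python
def fits_quota(instances: int, vcpus_per_instance: int, quota_vcpus: int, used_vcpus: int = 0) -> bool:
    """Return True if starting `instances` machines stays within the vCPU quota."""
    return used_vcpus + instances * vcpus_per_instance <= quota_vcpus


def max_instances_within_quota(
    requested: int, vcpus_per_instance: int, quota_vcpus: int, used_vcpus: int = 0
) -> int:
    """Largest instance count, capped at `requested`, that still fits the quota."""
    free_vcpus = max(quota_vcpus - used_vcpus, 0)
    return min(requested, free_vcpus // vcpus_per_instance)


# Assumed quota of 900 vCPUs: 10 machines need 960 vCPUs, so the request fails as a whole.
print(fits_quota(10, 96, 900))
print(max_instances_within_quota(10, 96, 900))
```

Under that assumed quota, 10 machines do not fit but 9 would (9 * 96 = 864), which is the kind of smaller request the autoscaling service currently never attempts.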

Steps to reproduce

Run multiple computational jobs that require an EC2 instance type whose vCPU quota is almost fully used.
Environment

Additional context

@sanderegg sanderegg added type:bug Issue that prevents to perform a certain task, features that don't work as t Feedback Feedback through frontend labels Feb 27, 2024
@sanderegg sanderegg added this to the Schoggilebe milestone Feb 27, 2024
@sanderegg sanderegg self-assigned this Feb 27, 2024
@sanderegg
Member Author

Requested 5000 vCPUs for C/R/T on-demand instances
Requested 5000 vCPUs for G instances
