Computational clusters: "erred" jobs prevent workers from being shut down #1218

Closed
Tracked by #950
sanderegg opened this issue Dec 22, 2023 · 1 comment
Assignees
Labels
type:bug Issue that prevents to perform a certain task, features that don't work as t

Comments

@sanderegg
Member

sanderegg commented Dec 22, 2023

Some computational jobs are stuck in the "erred" status.
When the workers are terminated from the AWS console, the autoscaling service re-creates them.

I guess the dask backend still counts these erred jobs as unrunnable tasks, so the autoscaling service assumes workers are still needed.

--> Filter erred jobs out of the unrunnable tasks, so that autoscaling does not re-create unnecessary worker machines.
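
A minimal sketch of the proposed filtering, not the actual osparc-simcore implementation. The `Task` record, the function names, and the `tasks_per_worker` parameter are hypothetical and only illustrate the idea; the sketch assumes the backend reports each task with a state string such as "erred".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical stand-in for a task record reported by the dask backend.
    key: str
    state: str  # assumed state strings, e.g. "no-worker", "processing", "erred"


def tasks_needing_workers(candidates: list[Task]) -> list[Task]:
    """Drop tasks in the "erred" state from the tasks counted as unrunnable."""
    return [task for task in candidates if task.state != "erred"]


def workers_to_keep(candidates: list[Task], tasks_per_worker: int = 1) -> int:
    """Workers still needed: ceiling division over the filtered tasks, 0 if none remain."""
    pending = tasks_needing_workers(candidates)
    return -(-len(pending) // tasks_per_worker)
```

With this filter, a cluster whose only remaining tasks are erred reports zero required workers, so terminated machines would not be started again.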

@sanderegg sanderegg self-assigned this Dec 22, 2023
@sanderegg sanderegg added the type:bug Issue that prevents to perform a certain task, features that don't work as t label Dec 22, 2023
@sanderegg
Member Author

This should be fixed by ITISFoundation/osparc-simcore#5203.
Closing for now.

@sanderegg sanderegg added this to the This is Sparta! milestone Jan 23, 2024