[ECS/Fargate] [request]: Allow stopTimeout to be configured for ondemand tasks #1020

ghost · 2020-08-07T16:51:47Z

Based on this comment: spring-projects/spring-boot#4657 (comment), if we were to implement graceful termination on the application side, it would really help if stopTimeout were configurable, at least for on-demand Fargate tasks.

Currently the maximum value is 120s, which is not sufficient for all use cases.

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment



matteomazza91 · 2021-03-24T10:56:45Z

We are facing the same issue while implementing graceful termination on the application side.
We have an ECS service with an ALB, Fargate tasks, and autoscaling that increases and decreases the desiredCount.

When an ECS task receives SIGTERM (docker stop), it should get a chance to complete the ongoing work before being forcibly killed.
Unfortunately, for our use case a maximum value of 120s for stopTimeout is not enough.
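
To illustrate the pattern described above (this is not code from the thread): a minimal Python sketch, assuming a worker that pulls jobs from an in-memory list. `GracefulWorker`, `request_stop`, and `install_handler` are hypothetical names. On SIGTERM the worker only sets a flag, so the in-flight job runs to completion and no new job is started.

```python
import signal
import threading


class GracefulWorker:
    """Runs jobs one at a time; after a stop request it finishes the
    current job and does not start another one."""

    def __init__(self, jobs):
        # Hypothetical stand-in for a real queue: a list of callables.
        self._jobs = list(jobs)
        self._stopping = threading.Event()
        self.completed = []

    def request_stop(self, signum=None, frame=None):
        # Usable as a signal handler: it only sets a flag and never
        # interrupts the job that is currently running.
        self._stopping.set()

    def run(self):
        while self._jobs and not self._stopping.is_set():
            job = self._jobs.pop(0)
            self.completed.append(job())
        return self.completed


def install_handler(worker):
    # Must be called from the main thread of the process.
    signal.signal(signal.SIGTERM, worker.request_stop)
```

Calling `install_handler(worker)` before `worker.run()` wires the flag to SIGTERM. The limitation discussed in this issue still applies: whatever job is in flight has to finish within stopTimeout (at most 120s on Fargate), or the container is killed anyway.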

bryanculbertson · 2021-04-05T21:56:43Z

I run long-running tasks on ECS and would like to run them on Fargate. I would like them to have a chance to finish, which would mean a multi-hour stop timeout.

vineetraja · 2021-04-09T08:21:59Z

The 2-minute limit is too short for many use cases.
With this limit in place, AWS Fargate auto scaling is rendered useless for any system that cares about graceful shutdown.

RicePatrick · 2021-05-08T01:01:39Z

Agreed with other posters. We have long-running Fargate tasks that listen to a queue to pick up jobs, and those jobs have a potentially unbounded processing time. Yes, they'll pick up the message again from the queue after the timeout if the task dies (once visibility expires), but I'd prefer that the task be allowed to finish and gracefully shut down.

With a 2-minute limit, we've been forced to abandon AWS's default auto scaling and create a Lambda that checks for scaling every minute and sends an HTTP request to containers to shut down instead of using SIGTERM.
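
A rough sketch of what the container side of such a workaround could look like, using only the Python standard library. This is an illustration under assumptions, not the setup described above: the `/shutdown` path, port 8080, and the names `drain_requested`, `DrainHandler`, and `start_drain_server` are all hypothetical.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# The worker loop would check this flag before taking a new message.
drain_requested = threading.Event()


class DrainHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path == "/shutdown":  # hypothetical path
            drain_requested.set()
            self.send_response(202)   # accepted: draining has started
        else:
            self.send_response(404)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_drain_server(port=8080):
    # Serve in a daemon thread so the worker loop keeps the main thread.
    server = HTTPServer(("0.0.0.0", port), DrainHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The external scheduler (the Lambda in the comment above) would POST to this endpoint for each task it wants to retire, and the task would exit on its own once its current job is done, so the SIGTERM/SIGKILL timer is never involved.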

marc-guenther · 2021-08-06T19:05:47Z

Same problem here. This renders Fargate useless for our application. Seems we need to use EC2 instances, where this limit does not exist?

We are paying for running containers even when they are shutting down, so why can't we set the stopTimeout to whatever value we like?

GytisZ · 2021-08-19T10:39:53Z

Encountering a similar issue. We have tasks that we'd like to drain and shut down over more than two minutes. Currently that means we're migrating away from ECS Services and Autoscaling and will need to manually manage the tasks during both deployments and autoscaling, using RunTask and some scripts to hold it all together.

Not being able to extend the StopTimeout to 1-2 hours is the only reason why the current setup doesn't work for us.

keirw2022 · 2021-10-12T09:49:14Z

Bumping this! - Such a needed feature.

satya-500 · 2021-11-08T08:34:31Z

Users should be able to set StopTimeout as high as they want.

maddipati-srinivas · 2021-11-23T11:23:04Z

StopTimeout: the maximum after SIGTERM can only be 2 minutes. For more details, go through this link. But this solution will not work for stateful operations, as it depends on our business logic.

May I know when we can expect a full-fledged solution from AWS?

mdomsch-seczetta · 2022-07-19T20:35:02Z

We moved our (long-running) batch processing application from ECS on Fargate to ECS on EC2 so that we could manage the termination behavior and extend it as long as necessary to let our batch jobs complete and drain safely without loss of work. 2 minutes is woefully insufficient. However, this has led to significantly increased DataDog monitoring costs (from ~$1.40/task to $56/task), which cannot be borne in our budget. We'd be happy to keep the tasks on Fargate if the StopTimeout could be extended as long as necessary.

craigify · 2023-01-26T19:10:33Z

Yes, we have the exact same problem. I am not able to use ECS for one of our major applications because I need to allow a Fargate instance much more time than 2 minutes to shut down.

matt-domsch-sp · 2023-05-16T12:38:36Z

#256 (comment) notes that ECS Task Scale-in Protection can now be set. However, that does not solve the problem. This prevents SIGTERM from reaching a running task, so the 2-minute SIGKILL timer never starts. But it also removes the signal (SIGTERM) that the task should stop picking up new jobs to run. Many task servers, such as sidekiq, can work on multiple jobs simultaneously. If one job is running (and scale-in protection is therefore set) and another job is waiting in a queue, the task could pick up that job too, when we only want to wait for the first job to complete and not start any new jobs on this task. Now that we're allowed to use ECS Task Scale-in Protection to keep a task alive indefinitely, we should similarly be allowed to prevent ECS from sending SIGKILL after 2 minutes.
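
For reference, a hedged sketch of how a task might toggle scale-in protection from inside the container. It assumes the ECS agent's task protection endpoint is reachable at `$ECS_AGENT_URI/task-protection/v1/state` and accepts a JSON body with `ProtectionEnabled` and an optional `ExpiresInMinutes`; that is my understanding of the AWS documentation and should be checked against it. `build_protection_request` and `set_protection` are hypothetical helper names.

```python
import json
import os
import urllib.request


def build_protection_request(enabled, expires_in_minutes=None, agent_uri=None):
    """Build (but do not send) the PUT request that sets task protection."""
    # Assumption: the ECS agent injects ECS_AGENT_URI into the container.
    agent_uri = agent_uri or os.environ["ECS_AGENT_URI"]
    body = {"ProtectionEnabled": enabled}
    if enabled and expires_in_minutes is not None:
        body["ExpiresInMinutes"] = expires_in_minutes
    return urllib.request.Request(
        url=f"{agent_uri}/task-protection/v1/state",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def set_protection(enabled, expires_in_minutes=None):
    """Send the request; only works inside a running ECS task."""
    req = build_protection_request(enabled, expires_in_minutes)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read())
```

A worker would call `set_protection(True, ...)` before starting a job and `set_protection(False)` after it. As the comment above points out, nothing in this flow tells the worker to stop taking new jobs, which is the part SIGTERM normally provides.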

ADrejta · 2024-08-21T10:05:08Z

+1 on this. The 2-minute maximum timeout is quite low for graceful shutdowns of longer-running jobs. We either have to implement a retry mechanism for jobs that might be lost or switch completely to running ECS in EC2 mode, where that number can be set much higher.

ghost added the Proposed Community submitted issue label Aug 7, 2020

akshayram-wolverine added the Fargate AWS Fargate label Aug 11, 2020

vineetraja mentioned this issue Apr 9, 2021

[ECS] [request]: Control which containers are terminated on scale in #125

Closed

rgoltz mentioned this issue Sep 15, 2022

[ECS] [request]: Extend max container definition stopTimeout #1808

Open

omieomye added the ECS Amazon Elastic Container Service label Sep 23, 2022

vibhav-ag assigned vibhav-ag, AbhishekNautiyal and herrhound and unassigned vibhav-ag Oct 23, 2024
