
Policies for ecs_service, ecs_cluster, ecs_taskdefinition, ecs_task #210

Closed · wants to merge 15 commits

Conversation

@alinabuzachis (Collaborator) commented May 13, 2022

Policies for ecs_service, ecs_cluster, ecs_taskdefinition, ecs_task

Needed for ansible-collections/community.aws#1145

@alinabuzachis force-pushed the ecs_policies branch 6 times, most recently from 888e6ac to def7593 on May 13, 2022
@alinabuzachis changed the title from "[WIP] Policies for ecs_service, ecs_cluster, ecs_taskdefinition, ecs_task" to "Policies for ecs_service, ecs_cluster, ecs_taskdefinition, ecs_task" on May 16, 2022
Comment on lines 697 to 699
self.client.delete_service(cluster=self.name, service=name['serviceName'], force=True)

self.client.delete_cluster(cluster=self.name)
Contributor:

Do we need a waiter here? https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ecs.html#ECS.Waiter.ServicesInactive
Otherwise delete_cluster might fail. Or does it not matter, because the next run of the terminator will delete the cluster once the service is gone?
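
For illustration, a minimal standalone sketch of how that boto3 waiter could be used; the cluster and service names here are hypothetical, and the PR's actual code runs inside a terminator class via self.client:

    import boto3

    client = boto3.client('ecs')
    cluster_name = 'example-cluster'    # hypothetical names for illustration
    service_name = 'example-service'

    client.delete_service(cluster=cluster_name, service=service_name, force=True)

    # Block until the service is fully gone; otherwise delete_cluster can
    # fail because the cluster still contains active services.
    waiter = client.get_waiter('services_inactive')
    waiter.wait(cluster=cluster_name, services=[service_name])

    client.delete_cluster(cluster=cluster_name)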

Collaborator:

We would typically avoid using waiters on the lambda to prevent timeouts, especially for anything that might take more than a minute or two. The ideal way to handle a scenario like this would be to create terminators for the cluster and any dependent resources, and increase the age limit on the cluster terminator. This is complicated by the fact that doing anything with services or container instances also requires the cluster name.

Comment on lines 688 to 690
tasks = _paginate_task_results(name['containerInstanceArn'])
for task in tasks:
    self.client.stop_task(cluster=self.name, task=task['taskArn'])
Contributor:

Unless the desired count of the owning service is set to 0, the service might spawn new tasks to replace the stopped ones.

Contributor:

Maybe flipping the order solves this issue in a lazy way (see the sketch after this list):

  1. delete the service
  2. delete all left running tasks (like killall -9)
  3. delete the cluster
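
A rough standalone sketch of that order, using hypothetical names (the real terminator code paginates through its own helpers):

    import boto3

    client = boto3.client('ecs')
    cluster = 'example-cluster'    # hypothetical name for illustration

    # 1. Delete the services first so they stop replacing stopped tasks.
    for page in client.get_paginator('list_services').paginate(cluster=cluster):
        for service_arn in page['serviceArns']:
            client.delete_service(cluster=cluster, service=service_arn, force=True)

    # 2. Stop whatever tasks are still running (the "killall -9" step).
    for page in client.get_paginator('list_tasks').paginate(cluster=cluster):
        for task_arn in page['taskArns']:
            client.stop_task(cluster=cluster, task=task_arn)

    # 3. Delete the now-empty cluster.
    client.delete_cluster(cluster=cluster)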

Collaborator:

Do the tasks need to be individually deleted at all? Won't deregistering the container instance also remove the tasks?

Contributor:

If the ECS tasks run on EC2, yes: deleting the instance also kills the ECS tasks. But if it's serverless (Fargate), there is no other way to stop/kill the tasks.

But basically you did it that way, if you won't wait for the service to become inactive.

IMO we've got two choices:
a) delete the service and wait until it's gone, then delete the cluster
b) delete the service, kill the pending tasks, then delete the cluster

Collaborator:

I feel like the first option seems like less work.

Collaborator:

We wouldn't want to ignore the exception forever though. So perhaps, one class with the default age_limit that catches and ignores ClusterContainsServicesException and ClusterContainsTasksException on delete_cluster(). Then a second class, which has an age_limit 10 minutes longer than the default, that only tries to delete clusters (and that one doesn't ignore those exceptions, so that errors still surface to us in the monitoring).

I'm assuming 10 minutes should be more than long enough, even on a bad day, for all tasks and services to be 100% gone.
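
A hypothetical sketch of that two-class split, assuming a DbTerminator-style base class like those used elsewhere in this repository (an overridable age_limit property, a terminate() method, and a boto3 client on self.client); the class names are illustrative, not the PR's actual code:

    import datetime

    class EcsCluster(DbTerminator):
        # Default age limit: try to delete, but tolerate clusters that still
        # contain draining services or tasks; a later run will retry.
        def terminate(self):
            try:
                self.client.delete_cluster(cluster=self.name)
            except (self.client.exceptions.ClusterContainsServicesException,
                    self.client.exceptions.ClusterContainsTasksException):
                pass    # assumed safe to leave for the next terminator run

    class EcsClusterFallback(DbTerminator):
        # Runs 10 minutes later and does NOT swallow those exceptions, so a
        # cluster that still cannot be deleted surfaces in the monitoring.
        @property
        def age_limit(self):
            return super().age_limit + datetime.timedelta(minutes=10)

        def terminate(self):
            self.client.delete_cluster(cluster=self.name)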

Contributor:

That sounds good!

Collaborator (Author):

@markuman I tried to create the second class. Can you please check it and let me know if this is what you meant? Thanks.

Contributor:

LGTM.
What about the failing CI? I cannot find any useful error message.

Collaborator (Author):

@markuman it needs to be refactored in some way because the file size exceeds the limit. I'll see what I can do about it.

@gravesm (Collaborator) commented Jun 7, 2022

Since we've exceeded the policy size on compute, I would suggest moving the ECS policies to the paas policy.

@alinabuzachis force-pushed the ecs_policies branch 2 times, most recently from 951959f to ee500ed on June 7, 2022
@alinabuzachis (Collaborator, Author) replied:

> Since we've exceeded the policy size on compute, I would suggest moving the ECS policies to the paas policy.

Done, thank you @gravesm!

self.client.stop_task(cluster=self.name, task=task['taskArn'])

# If there are running services, delete them first
services = _paginate_service_results()
Collaborator:

Services should be the first things that are deleted; otherwise, any deleted tasks may just be restarted again.

# Deregister container instances
for name in container_instances:
    self.client.deregister_container_instance(containerInstance=name['containerInstanceArn'])

Collaborator:

Before deleting the cluster, you will also need to delete all the tasks. In the case of fargate, there are no container instances and you may have tasks running that aren't managed by a service. I'm also not clear on whether using force=True when deleting a service also deletes the tasks or just orphans them. Either way, there should be a step that stops all tasks.
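
For illustration, a minimal sketch of such a stop-all-tasks step with hypothetical names; it also catches standalone Fargate tasks that have no container instance and no managing service:

    import boto3

    client = boto3.client('ecs')
    cluster = 'example-cluster'    # hypothetical name for illustration

    # Stop every running task in the cluster, whether it is service-managed,
    # standalone, EC2-backed, or Fargate.
    for page in client.get_paginator('list_tasks').paginate(cluster=cluster,
                                                            desiredStatus='RUNNING'):
        for task_arn in page['taskArns']:
            client.stop_task(cluster=cluster, task=task_arn,
                             reason='terminator cleanup')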

@gravesm (Collaborator) left a review:

A couple small fixes, but otherwise this LGTM.

if not names:
    return []

return client.describe_clusters(name=names)['clusters']
Collaborator:

Suggested change:
- return client.describe_clusters(name=names)['clusters']
+ return client.describe_clusters(clusters=names)['clusters']

Collaborator (Author):

Done! Thanks.


@tremble (Contributor) commented Feb 3, 2023

@alinabuzachis @gravesm Where do we stand on getting this merged?

@gravesm (Collaborator) commented Feb 3, 2023

Hey, I'll try to get to this early next week. There are a lot of parts covered by this PR and I'd like to set aside some time to test it.

@markuman (Contributor) commented Feb 5, 2023

Just a note, the ecs_cluster integration test is currently broken.

@gravesm (Collaborator) commented Feb 6, 2023

@tremble I'm running into a lot of problems with the ecs_cluster tests. There may be other things that are broken, as @markuman referred to, but the biggest problem I have so far is that the tests seem to be intended to provision containers on EC2 instances, but then the service definitions are using Fargate provisioning.

What I need in order to get this merged is a working test suite. I'm willing to work on getting the permissions correct, but I need a PR I can test against that is otherwise working.

@markuman (Contributor) commented Feb 6, 2023

> What I need in order to get this merged is a working test suite. I'm willing to work on getting the permissions correct, but I need a PR I can test against that is otherwise working.

I can try to tackle it this week.

@gravesm (Collaborator) commented Feb 23, 2023

Closing this as it was superseded by #264

@gravesm gravesm closed this Feb 23, 2023
softwarefactory-project-zuul bot pushed a commit to ansible-collections/community.aws that referenced this pull request on Mar 1, 2023:

ecs: integration test and new purge parameters

SUMMARY
- Make the ecs_cluster integration test work again.
- ecs_service: new parameters purge_placement_constraints and purge_placement_strategy; otherwise it is impossible to remove those placements without breaking backwards compatibility.
- Cover purge_placement_constraints in the integration test.
- Cover purge_placement_strategy in the integration test.

Required by mattclay/aws-terminator#210 (comment)

ISSUE TYPE
- Bugfix Pull Request
- Docs Pull Request
- Feature Pull Request

COMPONENT NAME
ecs_service

ADDITIONAL INFORMATION
Works for me again:

ansible-test integration --python 3.10 ecs_cluster --docker --allow-unsupported
...
PLAY RECAP *********************************************************************
testhost                   : ok=143  changed=69   unreachable=0    failed=0    skipped=1    rescued=0    ignored=6

Reviewed-by: Mark Chappell
Reviewed-by: Markus Bergholz <[email protected]>
Reviewed-by: Alina Buzachis
Reviewed-by: Mike Graves <[email protected]>
patchback bot pushed a commit to ansible-collections/community.aws that referenced this pull request on Mar 1, 2023: the same "ecs: integration test and new purge parameters" commit, cherry-picked from commit 86c60b4.
softwarefactory-project-zuul bot pushed a commit to ansible-collections/community.aws that referenced this pull request on Mar 1, 2023:

[PR #1716/86c60b49 backport][stable-5] ecs: integration test and new purge parameters

This is a backport of PR #1716 as merged into main (86c60b4); the commit message is the same as above.
abikouo pushed the same "ecs: integration test and new purge parameters" commit to abikouo/amazon.aws twice on Sep 18, 2023, referencing this pull request; the commit message is the same as above.