Stop swarm services by scaling down and up #328

m90 · 2024-01-13T11:30:38Z

Closes #327

pixxon · 2024-01-24T11:53:02Z

cmd/backup/script.go

+		return noop, fmt.Errorf("stopContainers: error getting docker info: %w", err)
+	}
+	isDockerSwarm := dockerInfo.Swarm.LocalNodeState != "inactive"
+	if isDockerSwarm {


I would recommend making it stop both the services and the containers.
It might be a bit of an odd approach, but it is possible to create containers in a docker swarm that are not bound to services.

> I would recommend making it stop both the services and the containers.

I'm not entirely sure I understand. If I stop containers in Swarm mode, Swarm will try to restart them unless the service has an on-failure restart policy. This is what currently happens when you use this image with Swarm. From what I understand, you cannot really "stop" a service; you can either (a) temporarily scale it down to 0 or (b) delete and redeploy it.

This is currently not reflected in the code, but I was thinking the best (also least breaking) approach here would be to do the following:

- If the label is on the container(s) (i.e. not in deploy), the container will be stopped and restarted, keeping the existing behavior.

- If the label is put on the service (i.e. in deploy), the service will be scaled down and up again.
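A minimal sketch of that rule, assuming a simplified model of containers and services. Only the label name comes from this thread; the `workload` type, the `action` values, and `decide` are hypothetical names for illustration, and the check only looks at whether the label is present, not at its value.

```go
package main

import "fmt"

// stopLabel is the label named in this thread. Every other name in this
// sketch is hypothetical and not taken from the PR.
const stopLabel = "docker-volume-backup.stop-during-backup"

// action describes what the backup would do with a workload.
type action string

const (
	actionNone        action = "leave running"
	actionStopRestart action = "stop and restart container"
	actionScale       action = "scale service down and up"
)

// workload is a simplified stand-in for either a container or a service.
type workload struct {
	name string
	// isService is true when the labels were read from the service
	// (the deploy section), false when they were read from a container.
	isService bool
	labels    map[string]string
}

// decide picks the action based on where the label was found.
// Simplification: only the presence of the label is checked.
func decide(w workload) action {
	if _, ok := w.labels[stopLabel]; !ok {
		return actionNone
	}
	if w.isService {
		return actionScale
	}
	return actionStopRestart
}

func main() {
	workloads := []workload{
		{name: "standalone-db", labels: map[string]string{stopLabel: "true"}},
		{name: "stack_web", isService: true, labels: map[string]string{stopLabel: "true"}},
		{name: "stack_cache", isService: true, labels: map[string]string{}},
	}
	for _, w := range workloads {
		fmt.Printf("%s: %s\n", w.name, decide(w))
	}
}
```

Keeping the decision in one pure function like this would make the "existing behavior is unchanged" property easy to test in isolation.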

If you want to help out here, that's more than welcome, although I would still like to understand the pros and cons of all approaches (stopping containers, removing services, scaling services) better so we can pick the best one.

Sorry, I was not clear about the exact issue that arises when the containers are not stopped as well.

Even if a node is part of a Docker Swarm, it is still possible (if unorthodox) to create standalone containers on it by calling docker run.
Consider the following scenario:

1. The node is part of a Docker Swarm.

2. The user, for some reason, calls docker run on that node.

3. The user applies docker-volume-backup.stop-during-backup labels to the container.

4. The user starts the backup process.

At this point, the backup will skip stopping containers, since it believes the containers are all part of a service, which is a false assumption.

To avoid this, I would recommend stopping both services and containers. The only problematic case would be a container that carries the label while the service it belongs to carries it too, so the container would be stopped twice. (That is a configuration error, but the backup can ignore it if needed.)
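The double-stop case could be handled by planning both sets up front. The following is a sketch under stated assumptions: it relies on the `com.docker.swarm.service.name` label that Swarm sets on the containers it creates for service tasks, while the `container` type and the `plan` function are hypothetical and not part of the PR.

```go
package main

import (
	"fmt"
	"sort"
)

const (
	// stopLabel is the label named in this thread.
	stopLabel = "docker-volume-backup.stop-during-backup"
	// swarmServiceLabel is set by Swarm on containers that back a service task.
	swarmServiceLabel = "com.docker.swarm.service.name"
)

// container is a hypothetical, simplified view of a running container.
type container struct {
	name   string
	labels map[string]string
}

// plan returns the containers to stop directly and the services to scale.
// A labeled container is skipped when its owning service is already in the
// list of labeled services, so it is not stopped twice.
func plan(containers []container, labeledServices []string) (stop, scale []string) {
	scaleSet := make(map[string]bool, len(labeledServices))
	for _, s := range labeledServices {
		scaleSet[s] = true
	}
	for _, c := range containers {
		if _, ok := c.labels[stopLabel]; !ok {
			continue // not labeled, leave it running
		}
		if svc, ok := c.labels[swarmServiceLabel]; ok && scaleSet[svc] {
			continue // covered by scaling the service down
		}
		stop = append(stop, c.name)
	}
	for s := range scaleSet {
		scale = append(scale, s)
	}
	// Sort for deterministic results, since map iteration order is random.
	sort.Strings(stop)
	sort.Strings(scale)
	return stop, scale
}

func main() {
	containers := []container{
		{name: "standalone", labels: map[string]string{stopLabel: "true"}},
		{name: "web.1", labels: map[string]string{stopLabel: "true", swarmServiceLabel: "web"}},
		{name: "db.1", labels: map[string]string{stopLabel: "true", swarmServiceLabel: "db"}},
	}
	stop, scale := plan(containers, []string{"web"})
	fmt.Println("stop containers:", stop)
	fmt.Println("scale services:", scale)
}
```

In this sketch a labeled container whose service is not labeled is still stopped directly, which matches the existing container-level behavior.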

Regarding your understanding of the services, that is correct. You cannot stop a service, only scale it down to 0.

The main difference between stopping all containers and scaling down is where the containers will be scheduled after the backup. In the first case, they are guaranteed to stay on the same node. When scaling, it depends on how the service is configured: if the placement requirements allow it, a task may be scheduled on another node. This can cause problems if the service's configuration is faulty and the volumes are local.

The advantage of configuring this through service labels, at least for me, is that it reduces the chances of messing up the configuration. For containers, one has to apply the on-failure restart policy and put the labels on the containers, not the services. The latter is a bit counterintuitive, since labels usually go on the services, e.g. for Traefik.

Removing and recreating the services sounds a bit too intrusive for my taste and I am not totally sure if you can save the state of the service/stack for redeployment. As a user of this product, I would prefer the scaling alternative instead.

I like the idea you have explained; it is also what I would implement. It would not break existing configurations upon upgrade, but it allows the tool to be configured in a more Swarm-friendly way. Having both options would also let users pick based on their exact scenario.

Ah, now I get it. It also makes sense seeing that you commented on this particular line. I think that gives us a pretty straightforward plan for what to add (scale services to zero and up again when they are labeled). I'll see if I can try this out in the next couple of days and will keep you posted here. Thanks for your input.
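The "scale to zero and up again" plan could look roughly like the sketch below. It is not the code from this PR or from #333: the `scaler` interface and the in-memory `fakeSwarm` are hypothetical stand-ins for whatever client actually updates the services, and it assumes replicated services, since global services have no replica count to restore. Returning a restore function mirrors the way the diff above returns `noop` alongside an error.

```go
package main

import "fmt"

// scaler is a hypothetical abstraction over the API that reads and changes
// a service's replica count. It is not a Docker SDK interface.
type scaler interface {
	Replicas(service string) (uint64, error)
	Scale(service string, replicas uint64) error
}

// fakeSwarm is an in-memory stand-in used to make the sketch runnable.
type fakeSwarm map[string]uint64

func (f fakeSwarm) Replicas(service string) (uint64, error) {
	r, ok := f[service]
	if !ok {
		return 0, fmt.Errorf("service %q not found", service)
	}
	return r, nil
}

func (f fakeSwarm) Scale(service string, replicas uint64) error {
	if _, ok := f[service]; !ok {
		return fmt.Errorf("service %q not found", service)
	}
	f[service] = replicas
	return nil
}

// scaleDown scales every given service to zero and returns a function that
// restores the replica counts recorded beforehand. On failure it still
// returns the restore function so the caller can roll back the services
// that were already scaled down.
func scaleDown(s scaler, services []string) (func() error, error) {
	previous := map[string]uint64{}
	restore := func() error {
		var firstErr error
		for name, replicas := range previous {
			// Keep going on errors so one failure does not block the rest.
			if err := s.Scale(name, replicas); err != nil && firstErr == nil {
				firstErr = fmt.Errorf("restore: error scaling %s up: %w", name, err)
			}
		}
		return firstErr
	}
	for _, name := range services {
		replicas, err := s.Replicas(name)
		if err != nil {
			return restore, fmt.Errorf("scaleDown: error reading replicas of %s: %w", name, err)
		}
		if err := s.Scale(name, 0); err != nil {
			return restore, fmt.Errorf("scaleDown: error scaling %s down: %w", name, err)
		}
		previous[name] = replicas
	}
	return restore, nil
}

func main() {
	swarm := fakeSwarm{"web": 3, "db": 1}
	restore, err := scaleDown(swarm, []string{"web", "db"})
	if err != nil {
		fmt.Println("scale down failed:", err)
	}
	fmt.Println("during backup:", swarm)
	if err := restore(); err != nil {
		fmt.Println("restore failed:", err)
	}
	fmt.Println("after backup:", swarm)
}
```

A real implementation would also need to wait until the tasks have actually stopped before taking the backup, which this sketch leaves out.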

pixxon · 2024-01-24T11:54:14Z

Can I offer any kind of help to get this PR resolved faster?

m90 · 2024-01-25T18:48:06Z

I'm continuing work on this in #333 which currently is still very much a work in progress. I'll let you know how it goes.

m90 force-pushed the swarm-scale-down branch 3 times, most recently from a031896 to e22d616 Compare January 13, 2024 16:13

Stop swarm services by scaling down and up

55e2904

m90 force-pushed the swarm-scale-down branch from e22d616 to 55e2904 Compare January 13, 2024 20:25

pixxon reviewed Jan 24, 2024


m90 mentioned this pull request Jan 25, 2024

Improve Swarm support #333

Merged

m90 closed this Jan 25, 2024

m90 deleted the swarm-scale-down branch January 31, 2024 18:21
