
Stop swarm services by scaling down and up #328

Closed · wants to merge 1 commit from the swarm-scale-down branch
Conversation

@m90 (Member) commented on Jan 13, 2024

Closes #327

@m90 force-pushed the swarm-scale-down branch 3 times, most recently from a031896 to e22d616 on January 13, 2024 at 16:13
return noop, fmt.Errorf("stopContainers: error getting docker info: %w", err)
}
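// any state other than "inactive" (e.g. "pending", "active", "locked")
// means this node participates in a swarm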
isDockerSwarm := dockerInfo.Swarm.LocalNodeState != "inactive"
if isDockerSwarm {
@pixxon (Contributor) commented:
I would recommend making it stop both the services and the containers.
It might be a bit of an odd approach, but it is possible to create containers in a Docker Swarm that are not bound to services.

@m90 (Member, Author) commented on Jan 24, 2024:

> I would recommend making it stop both the services and the containers.

I'm not entirely sure I understand. If I stop containers in Swarm Mode, Swarm will try to restart them unless the service has an on-failure restart policy. This is what currently happens when you use this image with Swarm. From what I understand, you cannot really "stop" a service; you can either (a) temporarily scale it down to 0 or (b) delete and redeploy it.
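For reference, "temporarily scale it down to 0" maps to a single ServiceUpdate call in Docker's Go SDK. Here is a minimal sketch of that round trip, assuming the github.com/docker/docker client; the function name and error wrapping are illustrative, not this PR's actual code:

```go
package main

import (
	"context"
	"fmt"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

// scaleService sets a replicated service to the given replica count and
// returns the previous count so the caller can restore it after the backup.
func scaleService(ctx context.Context, cli *client.Client, serviceID string, replicas uint64) (uint64, error) {
	service, _, err := cli.ServiceInspectWithRaw(ctx, serviceID, types.ServiceInspectOptions{})
	if err != nil {
		return 0, fmt.Errorf("scaleService: error inspecting service: %w", err)
	}
	if service.Spec.Mode.Replicated == nil || service.Spec.Mode.Replicated.Replicas == nil {
		return 0, fmt.Errorf("scaleService: service %s is not in replicated mode", serviceID)
	}
	previous := *service.Spec.Mode.Replicated.Replicas
	service.Spec.Mode.Replicated.Replicas = &replicas
	// ServiceUpdate requires the current version so concurrent updates are detected.
	if _, err := cli.ServiceUpdate(ctx, service.ID, service.Version, service.Spec, types.ServiceUpdateOptions{}); err != nil {
		return 0, fmt.Errorf("scaleService: error updating service: %w", err)
	}
	return previous, nil
}
```

Scaling down before the backup and restoring afterwards would then be scaleService(ctx, cli, id, 0) followed by scaleService(ctx, cli, id, previous).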

This is currently not reflected in the code, but I was thinking the best (and also least breaking) approach here would be to do the following (see the sketch below):

  • if the label is on the container(s) (i.e. not in deploy), the container will be stopped and restarted, keeping the existing behavior
  • if the label is put on the service (i.e. in deploy), the service will be scaled down and back up again

If you want to help out here, that's more than welcome, although I would still like to understand the pros and cons of all approaches (stopping containers, removing services, scaling services) better so we can pick the best one.
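A rough sketch of what that label dispatch could look like against the Docker Go SDK (the stopLabel constant and the findStopTargets function are assumptions for illustration, not code from this PR). Compose puts top-level service labels on the containers and labels under deploy: on the swarm service, which is why listing each with the same filter separates the two cases:

```go
package main

import (
	"context"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/api/types/filters"
	"github.com/docker/docker/api/types/swarm"
	"github.com/docker/docker/client"
)

// stopLabel is the label users set to opt a workload into stop-during-backup.
const stopLabel = "docker-volume-backup.stop-during-backup=true"

// findStopTargets returns the labeled containers (to be stopped and restarted)
// and, on swarm nodes, the labeled services (to be scaled down and back up).
func findStopTargets(ctx context.Context, cli *client.Client, isDockerSwarm bool) ([]types.Container, []swarm.Service, error) {
	labelFilter := filters.NewArgs(filters.Arg("label", stopLabel))
	containers, err := cli.ContainerList(ctx, types.ContainerListOptions{Filters: labelFilter})
	if err != nil {
		return nil, nil, err
	}
	var services []swarm.Service
	if isDockerSwarm {
		if services, err = cli.ServiceList(ctx, types.ServiceListOptions{Filters: labelFilter}); err != nil {
			return nil, nil, err
		}
	}
	return containers, services, nil
}
```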

@pixxon (Contributor) commented:

Sorry, I was not clear about the exact issue that arises from not stopping the containers too.

Even if a node is part of a Docker Swarm, it is still possible, while unorthodox, to create standalone containers on it by calling docker run.
Given the following scenario:

  1. A node is part of a Docker Swarm
  2. A user for some reason calls docker run on that specific node
  3. They apply the docker-volume-backup.stop-during-backup label to the container
  4. They start the backup process
  5. At this point, the backup will skip stopping the container, since it believes all containers are part of a service, which is a false assumption

To avoid this happening, I would recommend stopping both services and containers. The only error case here would be a container that has the label while the service it belongs to also has it, so the container would be stopped twice. (Which is a configuration error, but could be ignored by the backup if needed.)
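One way to guard against that double-stop case: the swarm scheduler sets labels such as com.docker.swarm.service.name on every task container, while standalone docker run containers never carry them, so swarm-managed containers can be skipped on the container path and handled via their service instead. A small sketch, reusing the imports from the previous snippet (the helper name is mine):

```go
// swarmServiceLabel is set by the swarm scheduler on every task container;
// containers created directly via docker run never carry it.
const swarmServiceLabel = "com.docker.swarm.service.name"

// isStandalone reports whether a labeled container was started outside of a
// swarm service and therefore must be stopped directly, even on a swarm node.
func isStandalone(c types.Container) bool {
	_, managedBySwarm := c.Labels[swarmServiceLabel]
	return !managedBySwarm
}
```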

Regarding your understanding of the services, that is correct: you cannot stop a service, only scale it down to 0.

The main difference between stopping all containers and scaling down is where the containers will be scheduled after the backup. In the first case they stay on the same node; there is a guarantee that a container will not move to a new one. With scaling, it depends on how the service is configured: if the placement requirements allow it, a container may be scheduled on another node. This might cause issues if the service's configuration is faulty (and the volumes are local).

The advantage of configuring this via service labels, at least for me, is that it reduces the chances of messing up the configuration. With containers, one has to apply the on-failure restart policy and put the labels on the containers, not the services. The latter is a bit counterintuitive, since such labels are usually on the services, e.g. with Traefik.

Removing and recreating the services sounds a bit too intrusive for my taste, and I am not totally sure you can save the state of the service/stack for redeployment. As a user of this product, I would prefer the scaling alternative.

I like the idea you have explained; it is also something that I would implement. It would not break previous configurations upon upgrade, but would allow the tool to be configured in a more Swarm-friendly way. Having both options would also let users pick based on their exact scenario.

@m90 (Member, Author) commented:

Ah, now I get it; it also makes sense seeing that you commented on this particular line. I think that gives us a pretty straightforward plan of what to add (scale services to zero and back up again when they are labeled). I'll see if I can try this out over the next couple of days and will keep you posted here. Thanks for your input.

@pixxon (Contributor) commented on Jan 24, 2024:

Can I offer any kind of help to get this PR resolved faster?

@m90 mentioned this pull request on Jan 25, 2024
@m90 (Member, Author) commented on Jan 25, 2024:

I'm continuing work on this in #333 which currently is still very much a work in progress. I'll let you know how it goes.

@m90 closed this on Jan 25, 2024
@m90 deleted the swarm-scale-down branch on January 31, 2024 at 18:21