Proposal: Restore running container network settings for containerd integration to support hot upgrade #975

coolljt0725 · 2016-02-26T10:02:26Z

Docker engine PR moby/moby#20662 integrates containerd for container supervision, which is awesome. It makes it possible to upgrade the daemon without shutting down running containers: stopping the docker daemon no longer affects them, and restarting the daemon restores all previously running containers. For this to work, libnetwork also needs to restore the containers' network settings (endpoints, sandboxes, networks, port mappings). Currently, the daemon cleans up network state (networks, endpoints, sandboxes) on startup, so the new daemon is not aware of the ports, IP addresses, and sandboxes of the containers that are still running, and those IPs and ports can still be allocated to new containers.

I made some progress on supporting this (see https://github.com/coolljt0725/libnetwork/tree/restore_network). Here is an example, using a docker binary built from the branch https://github.com/coolljt0725/docker/tree/containerd-integration-network, which is based on PR moby/moby#20662:

1. Run an nginx container that publishes port 80:

$ docker run -d -ti -p 80:80 nginx
fbc3c1025f63c5429c7feae208b4794672d2c44ab5e0b638e0abfcc1d03c7451
$ docker inspect -f {{.NetworkSettings.Networks.bridge.IPAddress}} fbc3c1025f63c5429c7feae208b4794672d2c44ab5e0b638e0abfcc1d03c7451
172.17.0.2

I can access the nginx server from Chrome.
2. Kill the docker daemon and restart it:
$ sudo kill -9 $(cat /var/run/docker.pid)
3. After the restart, the container is still running:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                         NAMES
fbc3c1025f63        nginx               "nginx -g 'daemon off"   5 minutes ago       Up 5 minutes        0.0.0.0:80->80/tcp, 443/tcp   jolly_albattani

We can still access the nginx server from Chrome.
Starting a container that tries to publish port 80 fails, because the daemon knows the port is already allocated to nginx.
Starting any other container does not reuse the nginx container's IP 172.17.0.2, because the daemon knows it is already allocated.
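
The behaviour in the last two checks can be illustrated with a minimal sketch. This is not libnetwork code (libnetwork is written in Go); the `Allocator` class and its `restore` method are hypothetical names, used only to show why re-registering the addresses and ports of still-running containers on startup keeps them from being handed out again.

```python
import ipaddress


class Allocator:
    """Toy IP/port allocator; a hypothetical stand-in for the daemon's IPAM and port mapper."""

    def __init__(self, subnet):
        self.subnet = ipaddress.ip_network(subnet)
        self.used_ips = set()
        self.used_ports = set()

    def restore(self, ip, ports):
        # Called on daemon start for every container that is still running.
        self.used_ips.add(ipaddress.ip_address(ip))
        self.used_ports.update(ports)

    def allocate_ip(self):
        hosts = iter(self.subnet.hosts())
        next(hosts)  # skip the first host address, assumed here to be the bridge gateway
        for ip in hosts:
            if ip not in self.used_ips:
                self.used_ips.add(ip)
                return str(ip)
        raise RuntimeError("no free address in subnet")

    def publish_port(self, port):
        if port in self.used_ports:
            raise RuntimeError(f"port {port} is already allocated")
        self.used_ports.add(port)


# A freshly started daemon that restores nothing hands out 172.17.0.2 again.
fresh = Allocator("172.17.0.0/16")
first_without_restore = fresh.allocate_ip()

# A daemon that restores the running nginx container's settings skips them.
restored = Allocator("172.17.0.0/16")
restored.restore("172.17.0.2", [80])
first_with_restore = restored.allocate_ip()
```

With the restore step in place, a later `restored.publish_port(80)` raises instead of silently double-booking the port, which mirrors the failure described above.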

I don't know whether this is the right approach, but I'm happy to open a PR to work on it.


chenchun · 2016-02-26T13:28:31Z

@coolljt0725 Thanks for working on this. I disagree with some of your design. My main point is that libnetwork should persist its state into the local/global KV store and restore it on restart. It should not depend on docker to replay things back.

I've done some work on this on top of docker-1.9.1. The following is what I have done:

Persist the default bridge network, since we have to persist connected endpoints and need a way to deal with the default network config changing on restart;
Persist bridge endpoints into the local KV and populate them back after populating bridge networks;
Populate sandboxes from the local KV on restart;
Persist sandbox.config along with sbState. Legacy container links depend on this state;
Persist PortMapper state using the local KV;
sandbox.isStub should be deleted;
The userland proxy process should be considered;
The bridge driver should not delete existing chains on init;
...
There may be more to consider.
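
A minimal sketch of the persist-and-restore idea in this list, assuming a JSON file as the local KV store. The `LocalKV` class, the key layout, and the `restore` function are hypothetical and are not libnetwork's datastore API; the sketch only shows networks being populated before the endpoints that reference them.

```python
import json
import os
import tempfile


class LocalKV:
    """Hypothetical file-backed key/value store standing in for the local KV."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            with open(path) as f:
                self.data = json.load(f)

    def put(self, key, value):
        self.data[key] = value
        # Write to a temporary file and rename it, so an interrupted write
        # does not corrupt the existing store.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.data, f)
        os.replace(tmp, self.path)

    def list(self, prefix):
        return {k: v for k, v in self.data.items() if k.startswith(prefix)}


def restore(kv):
    """Rebuild in-memory state on restart: networks first, then their endpoints."""
    networks = {
        key.split("/", 1)[1]: dict(value, endpoints={})
        for key, value in kv.list("network/").items()
    }
    for key, ep in kv.list("endpoint/").items():
        networks[ep["network"]]["endpoints"][key.split("/", 1)[1]] = ep
    return networks


path = os.path.join(tempfile.mkdtemp(), "local-kv.json")

# First daemon lifetime: state is persisted as it is created.
kv = LocalKV(path)
kv.put("network/bridge", {"subnet": "172.17.0.0/16"})
kv.put("endpoint/fbc3c1025f63", {"network": "bridge", "ip": "172.17.0.2", "ports": [80]})

# After a restart, a new process reads the same file instead of starting clean.
state = restore(LocalKV(path))
```

The point of the design is that the networking layer can rebuild its own view from what it stored, without the caller replaying container state back to it.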

coolljt0725 · 2016-02-27T01:00:52Z

@chenchun 👍 Good job, thank you.
I hadn't thought about this deeply. I saw the containerd integration PR yesterday, tested it, and found that the container network settings were not restored, so I changed some code as a workaround. I'm not very familiar with libnetwork yet, and my branch is obviously too simple. There is much to learn :-)

calavera · 2016-03-01T21:24:35Z

> I'm happy to open a PR to work on this

go for it!

andyxning · 2016-12-19T04:06:16Z

@coolljt0725 @calavera Any progress on this?

andyxning · 2016-12-20T10:15:08Z

For those who come across this issue:

libnetwork has fixed this in "Add network restore to support docker live restore container" #1244
docker has fixed this in #23524

coolljt0725 mentioned this issue Feb 26, 2016

Containerd integration moby/moby#20662

Merged

11 tasks

anusha-ragunathan mentioned this issue Mar 29, 2016

Update mount state of live containers after a daemon crash. moby/moby#21372

Merged

coolljt0725 mentioned this issue Apr 22, 2016

Add network restore to support docker live restore container #1135

Closed

coolljt0725 closed this as completed Jun 29, 2016
