Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid errors: ip conflict and ip address already in use #8445

Merged
merged 1 commit into from
Feb 15, 2019

Conversation

yuyangbj
Copy link
Contributor

@yuyangbj yuyangbj commented Jan 8, 2019

If the container VM is operated from vSphere client, in some cases
VIC will allocate duplicate IP address for cVMs. For example: when
the first cVM is poweroff from VC, the assigned IP address will be
released, but the guestinfo is still saving the assigned IP address
for VC HA or migration functionality; the second cVM will be assigned
the first cVM ip address, when the first cVM is powered on from VC,
these two cVMs are using same IP address.

Another related error scenario is the docker network inspect will not
show container endpoint if cVM is powered on from vSphere.

[full ci]
Fixes #8405 #7128 #8052

@yuyangbj yuyangbj requested a review from a team as a code owner January 8, 2019 10:12
@yuyangbj yuyangbj requested review from hickeng and wjun January 8, 2019 10:12
@yuyangbj
Copy link
Contributor Author

yuyangbj commented Jan 8, 2019

@hickeng @wjun I tested the scenario below and there is no problem on my side.

  1. Test docker kill
  2. Test docker restart/start/stop from VIC and check docker inspect network/container
  3. Test IP address exhausting scenario
  4. Power off/ Power on container VM from vSphere client

@renmaosheng
Copy link
Contributor

we should run a full CI see if this breaks some tests, thanks.

@@ -161,6 +161,7 @@ func (s *Scope) reserveEndpointIP(e *Endpoint) error {
e.ip = eip
return nil
}
err = fmt.Errorf("No available IP address in current network %s", s.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason to throw a new error here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More friendly :)

@@ -1096,13 +1096,14 @@ func (c *ContainerBackend) containerStart(op trace.Operation, name string, hostC
handle = bindRes.Payload.Handle
endpoints = bindRes.Payload.Endpoints

// due to issue 8405, 7128 and 8052, we will not clear out assigned IP address besides container is removed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unbind() happens during error, so we should keep the undo here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume here we have assigned an IP address here, so we do not release the IP address even though it is not started.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since each exited container would keep an IP, should we inform users to reap exited containers to avoid ip exhausting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We inform users when there is no available ip, see line 164

@@ -1263,7 +1264,7 @@ func (c *ContainerBackend) containerStop(op trace.Operation, name string, second
}

operation := func() error {
return c.containerProxy.Stop(op, vc, name, seconds, true)
return c.containerProxy.Stop(op, vc, name, seconds, false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is logic from true -> false here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is using to determine if we need to call unbindContainer.

@@ -1412,6 +1412,71 @@ func (c *Context) deleteScope(s *Scope) {
delete(c.scopes, s.Name())
}

func (c *Context) UpdateContainerNameInScope(h *exec.Handle) error {
Copy link
Contributor

@wjun wjun Feb 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some comments about this api

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. thanks

// name
c.containers[h.ExecConfig.Name] = con
// aliases
for k, v := range newAliases {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since c.removeAliases is invoked in line 1455, there is no need to check if con exists or not in the for loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -1412,6 +1412,71 @@ func (c *Context) deleteScope(s *Scope) {
delete(c.scopes, s.Name())
}

func (c *Context) UpdateContainerNameInScope(h *exec.Handle) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per discussion, please remove stop/start containers in the rename test case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will do.

@@ -513,6 +516,7 @@ func (handler *ContainersHandlersImpl) RenameContainerHandler(params containers.
}

h = h.Rename(op, params.Name)
handler.netCtx.UpdateContainerNameInScope(h)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any other cases to call this UpdateContainerNameInScope function besides h.Rename?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is a new method for rename ops.

If the container VM is operated from vSphere client, in some cases
VIC will allocate duplicate IP address for cVMs. For example: when
the first cVM is poweroff from VC, the assigned IP address will be
released, but the guestinfo is still saving the assigned IP address
for VC HA or migration functionality; the second cVM will be assigned
the first cVM ip address, when the first cVM is powered on from VC,
these two cVMs are using same IP address.

Another related error scenario is the docker network inspect will not
show container endpoint if cVM is powered on from vSphere.
@yuyangbj yuyangbj merged commit f016267 into vmware:master Feb 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants