Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore previous serf user events to avoid wrong fdb programming #1639

Merged
merged 1 commit into from
Feb 7, 2017

Conversation

sanimej
Copy link

@sanimej sanimej commented Feb 7, 2017

When deploying classic swarm with overlay network and using on-node-failure reschedule policy randomly some containers might have connectivity issues with other containers in the same network.

To illustrate: One container C1 with three worker nodes W1, W2 and W3.

Container interface's mac is built from the IP address.
Let say C1 has 10.0.0.2 and mac 02:42:0a:00:00:02. If its initially scheduled on W1 the reachability info will be distributed to other nodes will be
<join 10.0.0.2, 02:42:0a:00:00:02, VTEP IP of W1>

If W1 goes through a failure and the container gets rescheduled to W2 the results events will be
<leave 10.0.0.2, 02:42:0a:00:00:02, VTEP IP of W1>
<join 10.0.0.2, 02:42:0a:00:00:02, VTEP IP of W2>

At this point if W3's daemon is restarted serf replays all the previous events. But they can be delivered out of order to the client. So libnetwork could end up getting the following..

<join 10.0.0.2, 02:42:0a:00:00:02, VTEP IP of W2>
<leave 10.0.0.2, 02:42:0a:00:00:02, VTEP IP of W1>
<join 10.0.0.2, 02:42:0a:00:00:02, VTEP IP of W1>

W3 ends up with incorrect forwarding info.

The change is to avoid getting the older events from serf when joining the gossip cluster. Hence for the containers that are already running in the network the miss notification and query mechanism will be used to program the neighbor entry/FDB.

Also added some debugs to help in debugging.

Signed-off-by: Santhosh Manohar [email protected]

@@ -146,6 +146,7 @@ func (d *driver) processQuery(q *serf.Query) {
return
}

logrus.Debugf("Sending peer query resp %v", fmt.Sprintf("%s %s %s", peerMac.String(), net.IP(peerIPMask).String(), vtep.String()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why %v is you are actually formatting a string as parameter, should be %s

peerIPMask is already of type net.IP no need to cast

net.HardwareAddr, net.IPMask and net.IP implement String() string, you do not need to call it.

Copy link
Author

@sanimej sanimej Feb 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am taking the same fmt.Sprintf we have in the next line and throwing it in a debug. Are you suggesting that its better to print the fields individually in the debug ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, unnecessary casting make the code bloated.
It is a minor comment, you don't need to change it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I will change it to avoid the cast and explicit string conversion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aboch Made the change.

peerIPMask is ok type net.IPMask and the default Stringer for net.IPMask is different from net.IP. That is why it was casted explicitly.

@sanimej
Copy link
Author

sanimej commented Feb 7, 2017

This change can be reverted (if we can avoid querying the cluster its better) if the serf ordering can be fixed.

hashicorp/serf#448

@@ -94,8 +94,8 @@ func (d *driver) notifyEvent(event ovNotify) {
}

func (d *driver) processEvent(u serf.UserEvent) {
logrus.Debugf("Received user event name:%s, payload:%s\n", u.Name,
string(u.Payload))
logrus.Debugf("Received user event name:%s, payload:%s LTime:%v \n", u.Name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%v -> %d since you know it is a uint64

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aboch %v does exactly that. Its just easier to use the default specifier.

From https://golang.org/pkg/fmt/

The default format for %v is:

bool:                    %t
int, int8 etc.:          %d
uint, uint8 etc.:        %d, %x if printed with %#v
float32, complex64, etc: %g
string:                  %s
chan:                    %p
pointer:                 %p

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we know that. It just looks weird in the same Debufg call you mix the dedicated verb %s with the generic one %v when you exactly know the type of each argument.

As I said it is minor comment. You don't have to change it.

@aboch
Copy link
Contributor

aboch commented Feb 7, 2017

LGTM besides few minor comments

@aboch
Copy link
Contributor

aboch commented Feb 7, 2017

LGTM again

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants