-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better interface selection heuristic #3546
Conversation
This PR introduces a better interface selection heuristic such that we select interfaces with globally routable unicast addresses over link local addresses. Fixes #3487
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this heuristic will work for the linked issue, but it will break on machine which worries me about its portability. When I go run ...
this script I get:
1 lo
0 127.0.0.1/8 (*net.IPNet)
IsGlobalUnicast = false
IsInterfaceLocalMulticast= false
IsLinkLocalMulticast = false
IsLinkLocalUnicast = false
IsLoopback = true
IsMulticast = false
1 ::1/128 (*net.IPNet)
IsGlobalUnicast = false
IsInterfaceLocalMulticast= false
IsLinkLocalMulticast = false
IsLinkLocalUnicast = false
IsLoopback = true
IsMulticast = false
2 enp0s31f6
3 wlp4s0
0 192.168.86.26/24 (*net.IPNet)
IsGlobalUnicast = true
IsInterfaceLocalMulticast= false
IsLinkLocalMulticast = false
IsLinkLocalUnicast = false
IsLoopback = false
IsMulticast = false
1 fe80::d896:c428:a4f2:e480/64 (*net.IPNet)
IsGlobalUnicast = false
IsInterfaceLocalMulticast= false
IsLinkLocalMulticast = false
IsLinkLocalUnicast = true
IsLoopback = false
IsMulticast = false
4 lxcbr0
0 10.0.3.1/24 (*net.IPNet)
IsGlobalUnicast = true
IsInterfaceLocalMulticast= false
IsLinkLocalMulticast = false
IsLinkLocalUnicast = false
IsLoopback = false
IsMulticast = false
5 docker0
0 172.18.0.1/16 (*net.IPNet)
IsGlobalUnicast = true
IsInterfaceLocalMulticast= false
IsLinkLocalMulticast = false
IsLinkLocalUnicast = false
IsLoopback = false
IsMulticast = false
1 fe80::42:a4ff:fed1:1f7d/64 (*net.IPNet)
IsGlobalUnicast = false
IsInterfaceLocalMulticast= false
IsLinkLocalMulticast = false
IsLinkLocalUnicast = true
IsLoopback = false
IsMulticast = false
6 vmnet1
0 172.16.203.1/24 (*net.IPNet)
IsGlobalUnicast = true
IsInterfaceLocalMulticast= false
IsLinkLocalMulticast = false
IsLinkLocalUnicast = false
IsLoopback = false
IsMulticast = false
1 fe80::250:56ff:fec0:1/64 (*net.IPNet)
IsGlobalUnicast = false
IsInterfaceLocalMulticast= false
IsLinkLocalMulticast = false
IsLinkLocalUnicast = true
IsLoopback = false
IsMulticast = false
...two more vmnet's...
wlp4s0
is my wifi which I would expect to be my "best" interface. However because it has a linklocal unicast address (due to ipv6 being enabled locally but not available on my LAN) it gets a score of 1
while lxc's bridge (lxcbr0
) gets a score of 2 for having a global unicast address but no link local.
I'm not sure what we can do other than blacklist or score interfaces based on well known names or attempt to use cloud-specific metadata before scraping the system's interfaces...
client/fingerprint/network.go
Outdated
|
||
addrs, err := f.interfaceDetector.Addrs(&intf) | ||
if err != nil { | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 We could WARN
and continue
here, but after peeking at the Addrs(...)
code it seems like all of the errors it returns indicate pretty bad syscall failures. Seems like just bailing is probably safest as users can manually set this anyway.
client/fingerprint/network_test.go
Outdated
func TestNetworkFingerPrint_excludelo_down_interfaces(t *testing.T) { | ||
// This tests that we prefer skipping link local and down interfaces as well as | ||
// those without global unicast addresses | ||
func TestNetworkFingerPrint_preferential_interfaces(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how any of these tests are exercising the new code path with the scoring introduced above.
I realize its not possible to test in a portable way across machines, but I would try to refactor the code so the scoring/selection logic is testable
client/fingerprint/network.go
Outdated
} | ||
} | ||
|
||
if intfScore > bestScore { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If infscore stays at zero and bestScore at -1000 this will pick the first intf
from the intfs
list, is that acceptable?
client/fingerprint/network.go
Outdated
@@ -200,14 +199,57 @@ func (f *NetworkFingerprint) findInterface(deviceName string) (*net.Interface, e | |||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the comments on the PR indicated that the returned list here is not stable (can return things out of order). What happens if there is more than one interface with a global unicast address, thus resulting in them getting the same score?
Reading the code below, it seems like it will choose the first one, but if the order changes when calling interfaceDetector.Interfaces()
that will change the results.
@preetapan @schmichael Would appreciate a new review. Code has changed. |
to force network fingerprinting on. When run in dev mode, this is defaulted | ||
to the machine's loopback interface. If not in dev mode, the interface attached | ||
to the default route is used. All addresses on the interface are fingerprinted | ||
except those that are scoped local for IPv6. When allocating ports for tasks, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggesting a small rewrite for brevity and continuity.
When run in dev mode, this defaults to the loopback interface.
When not in dev mode, the interface attached to the default route is used. All IP addresses except those scoped local for IPV6 on the chosen interface are fingerprinted. The scheduler chooses from those IP addresses when allocating ports for tasks.
if defaultIfName == "" { | ||
return nil, fmt.Errorf("no network_interface given and failed to determine interface attached to default route") | ||
} | ||
deviceName = defaultIfName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much nicer now 👍
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
This PR introduces a better interface selection heuristic such that we
select interfaces with globally routable unicast addresses over link
local addresses.
Fixes #3487