-
-
Notifications
You must be signed in to change notification settings - Fork 525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve traffic-managers podCIDR discovery #2093
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good but i'm somewhat confused about the watch
function --just wanna make sure that it actually is being invoked, otherwise none of this will work 🙃
charts/telepresence/values.yaml
Outdated
# nodePodCIDRs extract CIDRs from the podCIDR and podCIDRs field of the Node Spec. | ||
# coverPodIPs extract IPs from the podIP and podIPs field of the Pod Status and compute the CIDRs needed to cover those IPs. | ||
# environment use CIDRs listed in the space separated POD_CIDRS environment variable verbatim. | ||
# auto first tru nodePodCIDRs and if that fails, tru coverPodIPs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo -- it says tru
instead of try
informer cache.SharedIndexInformer | ||
subnets []*net.IPNet | ||
changed time.Time | ||
lock sync.Mutex // Protects all access to ipsMap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this read Protects all access to subnets
?
return changed | ||
} | ||
|
||
func (w *nodeWatcher) watch(ctx context.Context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe i'm just confused but it seems like watch is never invoked?
func (w *nodeWatcher) viable(ctx context.Context) bool { | ||
nodes, err := w.lister.List(labels.Everything()) | ||
if err != nil { | ||
dlog.Errorf(ctx, "unable to list nodes: %v", err) | ||
return false | ||
} | ||
|
||
// Create the initial snapshot | ||
w.lock.Lock() | ||
defer w.lock.Unlock() | ||
changed := false | ||
dlog.Infof(ctx, "Scanning %d nodes", len(nodes)) | ||
for _, node := range nodes { | ||
if w.addLocked(nodeSubnets(ctx, node)) { | ||
changed = true | ||
} | ||
} | ||
if changed { | ||
dlog.Infof(ctx, "Found %d subnets", len(w.subnets)) | ||
w.changed = time.Now() | ||
} else { | ||
dlog.Info(ctx, "No subnets found") | ||
} | ||
return changed | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems like this function combines initialization with a precondition check (i.e. "Are we actually able to watch subnets from nodes?") -- that's okay, but calling it twice in a row will make the second call return false
, no? Maybe we should short-circuit return if w.subnets
is already populated -- after all, if they're there then getting them is definitely viable
@@ -0,0 +1,247 @@ | |||
package cluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comments apply here as for the nodewatcher :)
b7b0415
to
497ec40
Compare
The kates package provides nice and simple to use generalisations, but it also introduces overhead and removes the ability to control some aspects of watching. This commit replaces the traffic-manager's use of the kates package with more standard k8s APIs. Signed-off-by: Thomas Hallgren <[email protected]>
Fixes the ``` goroutine "/daemon/watch-cluster-info" exited with error: PANIC: close of closed channel ``` that happens the second time the traffic-manager reports status info. Signed-off-by: Thomas Hallgren <[email protected]>
Adds two environment variables to the traffic manager that controls how it discovers what subnets the cluster is using for the pods. The environment variables are configurable in the Helm chart. Signed-off-by: Thomas Hallgren <[email protected]>
Signed-off-by: Thomas Hallgren <[email protected]>
497ec40
to
3c13df6
Compare
The |
Creates a new `subnet.Set` datatype with methods for adding, deleting, and appending `net.IPNet` instances. Signed-off-by: Thomas Hallgren <[email protected]>
Signed-off-by: Thomas Hallgren <[email protected]>
Signed-off-by: Thomas Hallgren <[email protected]>
Signed-off-by: Thomas Hallgren <[email protected]>
3c13df6
to
07706f3
Compare
@@ -79,7 +79,7 @@ podCIDRs: [] | |||
# nodePodCIDRs extract CIDRs from the podCIDR and podCIDRs field of the Node Spec. | |||
# coverPodIPs extract IPs from the podIP and podIPs field of the Pod Status and compute the CIDRs needed to cover those IPs. | |||
# environment use CIDRs listed in the space separated POD_CIDRS environment variable verbatim. | |||
# auto first tru nodePodCIDRs and if that fails, tru coverPodIPs | |||
# auto first try nodePodCIDRs and if that fails, tru coverPodIPs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still says tru coverPodIPs
:^)
0045fb6
to
07706f3
Compare
The 2 minutes timeout would sometimes fail. Signed-off-by: Thomas Hallgren <[email protected]>
If the read at line 901 returned an error other than `io.EOF`, the surrounding loop would continue in infinity. This commit ensures that the loop brakes on any error. Signed-off-by: Thomas Hallgren <[email protected]>
f2326be
to
b592579
Compare
Signed-off-by: Thomas Hallgren <[email protected]>
b592579
to
3fd1746
Compare
Description
This PR contains two improvements to the pod CIDR discovery mechanism.
Checklist
./CHANGELOG.md
.