Allow user to replace ingress network #31714
Conversation
Is the custom ingress network still treated as "special", i.e. is "icc" communication disabled by default?
That has not changed; even now icc is not disabled on the ingress network. What a container cannot do today (and this PR does not change that) is reach services via the ingress network on L4 ports other than the ones the service exposes. That is achieved internally, without any user configuration needed at ingress network creation.
@aboch but hopefully Service Discovery, VIP and LB are disabled on this ingress network?
Yes, I forgot that, thanks. That is all disabled. It's all done internally; the user does not and will not need to control it via network creation options.
Hmm, some mess happening with the vendoring; let me look into that. Looks like I need to update my vndr tool. Edit: ran the updated vndr, should be fine now.
Hmm, a docker-py integration test timed out and failed on janky (https://jenkins.dockerproject.org/job/Docker-PRs/40326/console):
But when I run it locally, the docker-py integration test passes:
Could this be just a glitch? Has anybody seen it before?
All green.
@@ -82,6 +82,7 @@ type NetworkSpec struct {
	IPv6Enabled bool `json:",omitempty"`
	Internal    bool `json:",omitempty"`
	Attachable  bool `json:",omitempty"`
	Ingress     bool `json:",omitempty"`
Add to swagger.yaml
@@ -400,6 +400,7 @@ type NetworkResource struct {
	IPAM       network.IPAM // IPAM is the network's IP Address Management
	Internal   bool         // Internal represents if the network is used internal only
	Attachable bool         // Attachable represents if the global scope is manually attachable by regular containers from workers in swarm mode.
	Ingress    bool         // Ingress indicates the network is providing the routing-mesh for the swarm cluster.
Add to swagger.yaml
Thanks, I will, along with the addition to the command reference (forgot to add that diff to the commit).
done
@@ -431,6 +432,7 @@ type NetworkCreate struct {
	IPAM       *network.IPAM
	Internal   bool
	Attachable bool
	Ingress    bool
Add to swagger.yaml
done
cli/command/network/create.go
Outdated
@@ -59,6 +60,8 @@ func newCreateCommand(dockerCli *command.DockerCli) *cobra.Command {
	flags.BoolVar(&opts.ipv6, "ipv6", false, "Enable IPv6 networking")
	flags.BoolVar(&opts.attachable, "attachable", false, "Enable manual container attachment")
	flags.SetAnnotation("attachable", "version", []string{"1.25"})
	flags.BoolVar(&opts.ingress, "ingress", false, "Swarm routing-mesh network")
"Create a swarm routing-mesh network"?
👍
daemon/network.go
Outdated
if ingressWorkerStop != nil {
	close(ingressWorkerStop)
	ingressWorkerOnce = sync.Once{}
	ingressID = ""
Without locking, this can race with the goroutine in setupIngressWorker.
daemon/network.go
Outdated
func (daemon *Daemon) stopIngressWorker() {
	if ingressWorkerStop != nil {
		close(ingressWorkerStop)
		ingressWorkerOnce = sync.Once{}
What's the purpose of resetting ingressWorkerOnce?
When the node leaves the swarm and then joins a swarm again.
Is there an external guarantee that stopIngressWorker can't be called concurrently with SetupIngress or ReleaseIngress?
I think I'd feel more comfortable having a mutex and a boolean instead of a sync.Once. Then the mutex could also protect ingressID, and we wouldn't have to make assumptions about which functions can be called at the same time.
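For reference, a minimal sketch of the mutex-and-boolean pattern being suggested here (names are illustrative, not the PR's actual code):

```go
package ingress

import "sync"

var (
	mu            sync.Mutex
	workerStarted bool
	ingressID     string
)

// startWorkerIfNeeded plays the role of the sync.Once: the same mutex
// also guards ingressID, so no assumption is needed about which
// functions may be called at the same time.
func startWorkerIfNeeded() {
	mu.Lock()
	defer mu.Unlock()
	if workerStarted {
		return
	}
	workerStarted = true
	go workerLoop() // hypothetical worker loop draining setup/release jobs
}

func workerLoop() { /* drain setup/release jobs */ }
```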
Is there an external guarantee that stopIngressWorker can't be called concurrently with SetupIngress or ReleaseIngress?

Yes, stopIngressWorker is called when docker leaves the swarm. SetupIngress and ReleaseIngress cannot be called concurrently, as they are all queued; the worker logic is in fact there to serialize the setup/release events.
I was under the impression that setup or release, which in turn are called by executor.Configure(), cannot end up being called after DaemonLeavesCluster().
Given the worker drains ingress network creation/deletion events, I did not expect a long list of events left in the queue to process.
Ok, let me think about it
Actually this can be resolved by moving the ingressID reset into the setupIngressWorker() goroutine, when reacting to the stop event: L122.
Because only one select case at a time will run, and the one that executes the setup/release function will read the ingressID value and pass it to the functions.
What confuses me about ingressWorkerOnce is that the use of sync.Once implies there's synchronization involved, but the fact that we're resetting it without a lock implies synchronization isn't an issue. Reviewing the places where ReleaseIngress and SetupIngress are called, I don't think they can be called simultaneously. If this is the case, I think it would be clearer to just use a boolean and an if statement. Otherwise, the code looks suspiciously like incorrect concurrent code.
The reason for using the once here is not that we are in a concurrency-prone code path. It is simply to ensure we initialize the worker once, given we do lazy initialization of the worker. Therefore only the first call to SetupIngress from the executor will result in starting the worker.
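Roughly, the lazy-initialization pattern being described, as a simplified sketch (not the exact daemon code):

```go
package ingress

import "sync"

var (
	ingressWorkerOnce sync.Once
	ingressJobs       = make(chan func())
)

// Only the first caller actually starts the worker goroutine; every
// later call finds the Once already fired and just enqueues its job.
func enqueueIngressJob(job func()) {
	ingressWorkerOnce.Do(startIngressWorker)
	ingressJobs <- job
}

func startIngressWorker() {
	go func() {
		for job := range ingressJobs {
			job() // serialize the setup/release events
		}
	}()
}
```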
@aaronlehmann Addressed your comments. PTAL.
daemon/network.go
Outdated
)

func (daemon *Daemon) setupIngressWorker() {
	ingressWorkerStop = make(chan struct{}, 1)
This channel doesn't need to be buffered
Will change
daemon/network.go
Outdated
ingressChan <- struct{}{}
return func() { <-ingressChan }
func (daemon *Daemon) stopIngressWorker() {
	if ingressWorkerStop != nil {
This if statement also strikes me as suspicious. It implies that stopIngressWorker can be called before setupIngressWorker. But if that's the case, it suggests that after leaving a swarm and rejoining, we could close an already-closed channel (ingressWorkerStop never gets reset to nil).
Again, I think adding locking around this stuff would make it a lot easier to be confident that there aren't concurrency bugs. If nothing else, it would make my review a lot easier.
It implies that stopIngressWorker can be called before setupIngressWorker

I see your point. The fact is that SetupIngressNetwork being called from the executor and the call to DaemonLeavesCluster happen from two different threads. Their execution order is not guaranteed, and I saw it was not deterministic during the testing of the initial diffs. In fact, in the setup ingress call we need to wait for agentInitWait() for this reason. So it is not excluded that a swarm join quickly followed by a swarm leave would flip the order of events.
So, on DaemonLeavesCluster, I cannot assume the worker was started.
If locking this section eases the code reading, I agree we should do it. I will update shortly. But this should not be confused with the use of the once, which is there for lazy initialization of the worker, as I explained in the other comment.
Updated.
daemon/network.go
Outdated
ingressJobsChannel chan *ingressJob
ingressWorkerStop  chan struct{}
ingressID          string
ingress            sync.Mutex
Thanks for adding the lock. But startIngressWorker is using ingressWorkerStop without holding the lock, and SetupIngress / ReleaseIngress are using ingressWorkerOnce without holding the lock, so unfortunately my concerns about concurrency are not fully addressed.
Earlier in the thread you said:

The fact is the SetupIngressNetwork being called from the executor and the call to DaemonLeavesCluster happen from two different threads. Their execution order is not guaranteed, and I saw it not deterministic during the testing of initial diffs. In fact in the setup ingress call, we need to wait for agentInitWait() for this reason. So it is not excluded a swarm join quickly followed by a swarm leave would flip the order of events.

...so I still don't see how it's safe to use these variables from different goroutines without a lock.
SetupIngress / ReleaseIngress are using ingressWorkerOnce

Not sure I follow. Once.Do(f) is designed to not allow more than one execution of f, even in a concurrent environment. I do not think there is any need to hold a lock during a call to Once.Do.
Being called in a once-do construct, startIngressWorker is guaranteed to be called once, by one thread.
Still, as you said, startIngressWorker initializes the two channels, and stopIngressNetwork, which also accesses the channels, could be called concurrently. To protect against this, both startIngressWorker and stopIngressNetwork now lock around the access.
Yes, then it spawns the goroutine which listens on those channels, with no protection.
The correct thing here is probably to define a type ingressWorker struct with start, stop, and enqueue methods, so that it can control the access to and the lifecycle of its channels; a sketch of that idea follows.
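A rough sketch of that encapsulation (hypothetical, not code from this PR):

```go
package ingress

import "sync"

// ingressWorker owns its channels, so their lifecycle is controlled in
// one place and always behind the same lock.
type ingressWorker struct {
	mu      sync.Mutex
	started bool
	jobs    chan func()
	quit    chan struct{}
}

func (w *ingressWorker) start() {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.started {
		return
	}
	w.started = true
	w.jobs = make(chan func())
	w.quit = make(chan struct{})
	// Pass the channels by value: the goroutine never re-reads the
	// struct fields, so later resets cannot race with it.
	go w.loop(w.jobs, w.quit)
}

func (w *ingressWorker) loop(jobs chan func(), quit chan struct{}) {
	for {
		select {
		case job := <-jobs:
			job()
		case <-quit:
			return
		}
	}
}

func (w *ingressWorker) enqueue(job func()) {
	w.mu.Lock()
	jobs := w.jobs
	w.mu.Unlock()
	jobs <- job // simplified: assumes the worker was started and not stopped
}

func (w *ingressWorker) stop() {
	w.mu.Lock()
	defer w.mu.Unlock()
	if !w.started {
		return
	}
	w.started = false
	close(w.quit)
}
```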
I do not think there is any need to hold a lock during a call to Once.Do.

There is if you may overwrite the sync.Once from another goroutine.
Oh ok, you are referring to the reset of the once variable.
Yes, that is a problem.
The more I look at this, the more it seems to me the best tradeoff for now is not to stop the goroutine once it is started.
@aaronlehmann I decided to fall back to not stopping the goroutine when the node leaves the swarm.
IMO it is fine for now, as we are adding new functionality; we can later improve the logic to get to an ideal stop of the goroutine, with calm.
I updated the diffs and tested adding/removing the ingress network, before and after leaving and rejoining the cluster.
daemon/network.go
Outdated
ingress.Lock()
if ingressWorkerStop != nil {
	close(ingressWorkerStop)
	ingressWorkerStop = nil
This pattern may cause the goroutine in startIngressWorker to miss the channel close, because it might see the nil channel instead of the one that was closed. Also, the race detector will complain (and this might cause problems, depending on the internal implementation of select).
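To illustrate the hazard, a contrived sketch reusing the variable names from this file (not the actual code): a receive from a nil channel blocks forever, so a loop iteration that reads the shared variable after it was set to nil can never observe the close.

```go
package ingress

type ingressJob struct{ run func() }

var (
	ingressJobsChannel = make(chan *ingressJob)
	ingressWorkerStop  = make(chan struct{})
)

func workerLoop() {
	for {
		// If stopIngressWorker nils out ingressWorkerStop after closing
		// it, this unsynchronized read may see nil, and a receive on a
		// nil channel blocks forever: the stop signal is lost. The
		// unguarded write/read pair is also a data race on its own.
		select {
		case <-ingressWorkerStop:
			return
		case job := <-ingressJobsChannel:
			job.run()
		}
	}
}
```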
That's right. I added this in the last diff round; it is wrong and not needed.
The existing check if ingressWorkerStop != nil is just to tell us whether a worker goroutine is actually running and, if so, to signal the routine to stop.
Signed-off-by: Alessandro Boch <[email protected]>
Signed-off-by: Alessandro Boch <[email protected]>
@aaronlehmann I understand that this PR is blocking other swarmkit vendoring PRs. I am waiting for @aboch to be back so that he can respond to the compatibility question.
@mavenugo Although I think it is very unlikely an admin would remove the routing-mesh network on a half-upgraded prod environment, it is true a casual user may hit the problem, so I am proposing we follow the same approach used in
Higher-level management tools can already reject the request, given they likely have a notion of the engine version running on each node.
@aboch I agree. If we can do this change, I think we will have the necessary UX covered, and when the proper capability-exchange mechanism arrives it can be automated as well. Can you please take care of this change so we can get this merged?
@mavenugo Updated.
@aboch a build failure that seems legit:
Thanks @vdemeester
Signed-off-by: Alessandro Boch <[email protected]>
@aboch I think it is better to introduce the
@mavenugo we can add the
LGTM 🐯
Thanks @aboch @vdemeester. LGTM.
Allow user to replace ingress network
I am using 17.05 for my manager and worker nodes and am trying to create a new ingress network with a custom subnet, then create a swarm service. But the service doesn't seem to spawn any containers. I see some errors in the docker logs; it seems to keep looping over "Requesting Address" from the new subnet. Can someone help out here? Logs as below:

Jun 28 10:58:24 dockerd[19967]: time="2017-06-28T10:58:24.952553135-07:00" level=debug msg="Service xa3gd5xvpgatbwwhis1of9lp2 was scaled up from 0 to 1 instances" module=node node.id=b9zadh96hjod5zrx55lvsv19m

My steps of reproducing are:
Observations: docker service ls, docker network ls, docker network inspect ikhelr9jnjll
@edesai please open an issue instead of commenting on a closed pull request. Also keep in mind that the GitHub issue tracker is not intended as a general support forum,
@thaJeztah Can the driver of the ingress network be customized, or can the overlay driver's UDP port 4789 be changed? In my scenario, UDP 4789 is already used by the underlying network of the VMs, which cannot be modified. When I use the weave network plugin v2 for swarm, communication between containers on different hosts works well, but it seems the network is still blocked when the routing mesh (LB) comes into play.
Vendoring of swarmkit carries a change to support ingress network replacement and two important scheduler fixes:
Vendoring of libnetwork carries:
This PR allows the user to remove and (re)create the ingress network:
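For illustration, a sketch of the new flow through the Go API client, using the Ingress field added in this PR (the network name and driver settings are illustrative; error handling is minimal). On the CLI this corresponds to docker network rm ingress followed by docker network create --driver overlay --ingress <name>, using the new --ingress flag from the diff above.

```go
package main

import (
	"context"
	"fmt"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

func main() {
	cli, err := client.NewEnvClient()
	if err != nil {
		panic(err)
	}
	ctx := context.Background()

	// Remove the default routing-mesh network...
	if err := cli.NetworkRemove(ctx, "ingress"); err != nil {
		panic(err)
	}

	// ...and recreate it with custom settings, marked as the swarm
	// routing-mesh network via the new field.
	resp, err := cli.NetworkCreate(ctx, "my-ingress", types.NetworkCreate{
		Driver:  "overlay",
		Ingress: true,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("new ingress network:", resp.ID)
}
```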
Fixes #24220
Depends on moby/swarmkit/pull/2028 and moby/libnetwork/pull/1678