-
Notifications
You must be signed in to change notification settings - Fork 18.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow swarm join with --availability=drain
#24993
Conversation
I'm +1 on design, I think it makes sense to haven an option to set the initial availability of a node |
Tentatively adding this to the milestone (because the related issue is), but it's no show-stopper |
@@ -43,6 +43,9 @@ message IssueNodeCertificateRequest { | |||
// Token represents a user-provided string that is necessary for new | |||
// nodes to join the cluster | |||
string token = 3; | |||
|
|||
// Availability allows a user to control the current scheduling status of a node. | |||
string availability = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be ignored for a renewal. I think that's okay, but we need to make it clear in the comment.
I'm not sure we want to add availability
to the top level of IssueNodeCertificateRequest
. Either there should be a nested messages with parameters for the new node object, or we should pass a NodeSpec
and only honor certain fields from it (because not everything in a NodeSpec
is under the node's control).
@aluzzardi @stevvooe any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like much availability
leaking to the NodeCA
.
Also - if we go down this path, we should probably take an initial NodeSpec rather than a bunch of fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This really isn't the right place for it.
Agree with the concept of providing a NodeSpec
at join time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yongtang: It sounds like the consensus is that we should provide a NodeSpec
instead of just an availability setting. But as I mentioned earlier, the node CA should only use certain fields from this spec when it creates the node. There are things like node labels that are controlled from elsewhere.
In general, this is a good feature, but the implementation needs some design work. Do we have something open in swarmkit? |
@stevvooe @aaronlehmann @aluzzardi @vdemeester @thaJeztah Thanks a lot for the review and help! I was looking for feedbacks on this PR before, so I haven't created the pull request on swarmkit side yet. Let me rework on this pull request based on the comments. Will work on |
Thanks @yongtang! In the meantime, I'll remove the milestone as it won't make it in time for 1.12.0 (which is ok). |
a48810c
to
303fb5d
Compare
ping @stevvooe |
@cpuguy83 I am not necessarily on board with using |
303fb5d
to
efc32f8
Compare
@@ -235,7 +235,7 @@ func (c *Cluster) reconnectOnFailure(n *node) { | |||
} | |||
} | |||
|
|||
func (c *Cluster) startNewNode(forceNewCluster bool, localAddr, remoteAddr, listenAddr, advertiseAddr, joinAddr, joinToken string) (*node, error) { | |||
func (c *Cluster) startNewNode(forceNewCluster bool, localAddr, remoteAddr, listenAddr, advertiseAddr, joinAddr, joinToken string, availability types.NodeAvailability) (*node, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is way too many string arguments in a row.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @stevvooe. The pull request has been updated with localAddr
, listenAddr
, and advertiseAddr
into one consolidated nodeAddr
. Please let me know if there are any issues.
ping @yongtang |
82cac7d
to
8f71074
Compare
Thanks @cpuguy83. The pull request has been rebased and updated. Please let me know if there are any issues. |
This was my last comment here:
If we can reconcile this, then we can probably move along. |
To unblock this, maybe we can go back to the approach of not using Using |
@aaronlehmann I have no problem using Let's just make this a field value for "request availability", ultimately leaving the policy to the orchestrator as to whether it is honored or not. Do we also incorporate this into the join token? |
5f626fa
to
009f3e3
Compare
Thanks all for the review. The PR has been updated with vendoring of swarmkit done. Now all tests passes. Please take a look and let me know if there are additional issues. |
LGTM after rebase |
009f3e3
to
1c13b15
Compare
@aaronlehmann Rebased. Thanks. |
What's the status here? |
The swarmkit-side PR was merged, so this is most of the way there. Sadly, it needs yet another rebase. It still looks good to me. Needs a second LGTM. |
1c13b15
to
663e1d1
Compare
@cpuguy83 @aaronlehmann The PR has been rebased. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@thaJeztah: What are the next steps to get this merged? It's not an urgent PR, but I feel bad about it being open so long, and asking for rebases every few weeks. |
oh, my bad, lost track of this one 😞 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left some suggestions
@@ -133,6 +134,16 @@ Snapshots compact the Raft log and allow for more efficient transfer of the | |||
state to new managers. However, there is a performance cost to taking snapshots | |||
frequently. | |||
|
|||
### `--availability string` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove string
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
### `--availability string` | ||
|
||
This flag specifies the availability of the node at the time the node joins a master. | ||
Possiblle availability value includes `active`, `pause`, or `drain`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo Possiblle
-> Possible
Possible availability values are `active`, `pause`, or `drain`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
This flag is useful in certain situations. For example, a cluster may want to have | ||
dedicated manager nodes that are not served as worker nodes. This could be achieved | ||
by passing `--availability=drain` to `docker swarm init`. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifying the availability when initialising a swarm allows you to
initialize a dedicated manager node that do not get tasks scheduled.
The following example initializes a swarm, and sets the availability
to `drain`;
$ docker swarm init --availability=drain
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not 100% satisfied with that one, so open to suggestions
@@ -94,6 +95,15 @@ This flag is generally not necessary when joining an existing swarm. | |||
|
|||
Secret value required for nodes to join the swarm | |||
|
|||
### `--availability string` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove string
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Done.
### `--availability string` | ||
|
||
This flag specifies the availability of the node at the time the node joins a master. | ||
Possiblle availability value includes `active`, `pause`, or `drain`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible availability values are `active`, `pause`, or `drain`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Done.
663e1d1
to
02b9334
Compare
Thanks @aaronlehmann @thaJeztah the PR has been updated. |
This fix tries to address the issue raised in 24596 where it was not possible to join as manager only (`--availability=drain`). This fix adds a new flag `--availability` to `swarm join`. Related documentation has been updated. An integration test has been added. NOTE: Additional pull request for swarmkit and engine-api will be created separately. This fix fixes 24596. Signed-off-by: Yong Tang <[email protected]>
This fix adds a new flag `--availability` to `swarm join`. Related documentation has been updated. An integration test has been added. Signed-off-by: Yong Tang <[email protected]>
02b9334
to
0f30c64
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
- What I did
This fix tries to address the issue raised in #24596 where it was not possible to join as manager only (
--availability=drain
).The related PR in swarmkit is:
moby/swarmkit#1271
- How I did it
This fix adds a new flag
--availability
toswarm join
.- How to verify it
An integration test has been added.
- Description for the changelog
Add a new flag
--availability
toswarm join
.- A picture of a cute animal (not mandatory but encouraged)
This fix fixes #24596.
The related PR in swarmkit is:
moby/swarmkit#1271