Windows Support: NetBIOS and Active Directory LDAP SAMAccountName restrictions on Hostname #2217

Closed
rhockenbury opened this issue Jan 30, 2020 · 55 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. triage/accepted Indicates an issue or PR is ready to be actively worked on. triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@rhockenbury

rhockenbury commented Jan 30, 2020

User Story

As an operator, I would like to manage Windows Server worker nodes with the Cluster API. Hostnames on Windows are limited to 15 characters, and the hostnames set by the Cluster API (by default, in the cloud-init metadata) exceed this limit. The Cluster API should support a more flexible mechanism for setting hostnames so that shorter hostnames can be set for VMs.

Detailed Description

NetBIOS requires Windows computer names to be 15 characters or fewer (https://support.microsoft.com/en-us/help/909264/naming-conventions-in-active-directory-for-computers-domains-sites-and). Attempting to set a hostname with more than 15 characters on a Windows machine results in only the first 15 being used.

When using the machine deployment api object, the machine api object names are derived from the machineset controller:

machine.ObjectMeta.GenerateName = fmt.Sprintf("%s-", machineSet.Name)

This name is later used to set the vm name (for example in CAPV - https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/895539d004ea33299435a2c739791e9800d0c2ae/controllers/vspheremachine_controller.go#L320), and then also as the local-hostname in the cloud-init metadata (https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/390c49a23e2b535a27b330e4983c59eb0b42f476/pkg/services/govmomi/service.go#L203).

The machine api object names are prefixed by the name of the machine deployment api object. These names, for example, will be in the form:

workload-cluster-2-md-0-5f77f47487-2c4sq 
workload-cluster-2-md-0-5f77f47487-25xhg

where workload-cluster-2-md-0 is the name of the machine deployment api object. 17 extra characters are appended to the prefix (-5f77f47487-2c4sq, -5f77f47487-25xhg), which brings the total character count above 15. Notice that setting the deployment api object name to 3 or more characters guarantees the same first 15 characters, because the name, a hyphen, the 10-character middle segment, and the next hyphen are shared by every machine in the set and already fill them (3 + 1 + 10 + 1 = 15). The result is hostname collisions for the nodes. Being able to set the deployment api object name to something more meaningful than what could be expressed in 3 characters would be useful.
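
To make the arithmetic above concrete, here is a minimal Go sketch. It assumes that truncation simply keeps the first 15 characters, as described in this issue; the `truncate` helper and the `netbiosLimit` constant are names invented for illustration and are not part of the Cluster API.

```go
package main

import "fmt"

// netbiosLimit is the 15-character computer-name limit discussed in this issue.
const netbiosLimit = 15

// truncate is a hypothetical helper that keeps at most the first
// netbiosLimit bytes of name. Machine names are ASCII DNS labels,
// so slicing by byte is safe here.
func truncate(name string) string {
	if len(name) <= netbiosLimit {
		return name
	}
	return name[:netbiosLimit]
}

func main() {
	names := []string{
		"workload-cluster-2-md-0-5f77f47487-2c4sq",
		"workload-cluster-2-md-0-5f77f47487-25xhg",
	}
	// Map each truncated name to the first machine that produced it.
	seen := map[string]string{}
	for _, n := range names {
		short := truncate(n)
		if prev, ok := seen[short]; ok {
			fmt.Printf("collision: %s and %s both truncate to %q\n", prev, n, short)
			continue
		}
		seen[short] = n
	}
}
```

With the two example names, both truncate to "workload-cluste", so the second machine would collide with the first.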

My current workaround is to have cloudbase-init invoke an additional script before the join command that rewrites the hostname and sets it for the vm. This is somewhat undesirable, as the hostname and node api object name are then no longer the same as the vm name. For consistency, it's desired (but not required) that the vm name (as shown by the cloud provider), the machine api object name, and the node api object name are the same as the hostname of the vm.

Anything else you would like to add:

I realize that windows worker nodes are not officially supported by the cluster api, but I'm mentioning it since it's something that's up for discussion for the cluster-api roadmap (https://github.com/kubernetes-sigs/cluster-api/pull/2148/files#diff-767f66541aad47089dd5ded720dede6bR32).

Another workaround could be to use the machine api object directly instead of the machine deployment api object, which would directly set the vm name based on the name of the machine api object. However, the benefits of using the machine deployment are lost.

/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 30, 2020
@rhockenbury rhockenbury mentioned this issue Jan 30, 2020
4 tasks
@detiber
Member

detiber commented Jan 31, 2020

The main reason the hostname currently matches the Machine name is the initial implementation of the vSphere infrastructure provider. In the case of AWS and Linux hosts, the AWS cloud provider integration requires that the hostname match the internal DNS name of the host, so we override the hostname setting via the cloud-init config for each Machine we provision.

Outside of limitations mentioned above, there should be no requirements that the hostname of an individual instance match the Machine name in any way.

@rhockenbury
Author

Agreed - that's certainly not a requirement.

The cloud-init metadata local-hostname is set to the Machine name (at least on CAPV) - what I would propose is flexibility with how local-hostname metadata gets set, so that it's not necessarily set by default to the Machine name.

@ncdc ncdc added this to the Next milestone Feb 5, 2020
@benmoss

benmoss commented Feb 18, 2020

I don't think this is a CAPI issue, I think this is just with CAPV. On AWS the hostnames are not specified in the cloud-init metadata

@rhockenbury rhockenbury changed the title from "Windows Support: Hostnames" to "Windows Support: NetBios Restrictions on Hostname" Mar 8, 2020
@rhockenbury
Author

@akutz @yastij Would you mind taking a look at this?

@randomvariable
Member

randomvariable commented Mar 13, 2020

Is this definitely an issue in a Kubernetes context? The linked page looks like it was written for Windows XP and 2003 when NetBIOS was still a thing.
AD DNS names shouldn't be restricted in the same way, and they do say for FQDNs, it's 63 chars per component, 255 total.

Is the issue that a machine configured with NetBIOS will register a Kerberos principal with the truncated name? If so, is there a case to be made that NetBIOS should be disabled in Windows images?

@rhockenbury
Author

AFAIK, NetBios is still required to domain join a windows machine. Looping in @ksubrmnn and @JocelynBerrendonner.

@randomvariable
Member

randomvariable commented Mar 13, 2020

It might depend on how credentials are provided and how the domain is specified. If the FQDN is used and credentials are provided as [email protected], it should default to the DNS SRV records? I admit it's been a decade since I touched Windows, but my memory was that this was possible in at least Win2K8/Vista.

@JocelynBerrendonner

AFAIK, NetBios is still required to domain join a windows machine. Looping in @ksubrmnn and @JocelynBerrendonner.

Thanks for reaching out! I don't know the answer to the Netbios/domain join question off the top of my head, but I'll find the experts and pull them in shortly.

@JocelynBerrendonner

JocelynBerrendonner commented Mar 13, 2020

@rhockenbury : As per my investigation, NetBIOS is not required to join a domain on a Windows machine (that's been the case since around Windows 2000). The page you mentioned only provides naming conventions for when NetBIOS is actually used. Also, as other folks mentioned, the machine name is only truncated in NetBIOS. When setting a long host name (let's say "MyComputerWithALongName") in a domain (let's say "contoso.com"), the machine is still reachable through its FQDN "MyComputerWithALongName.contoso.com". However, through NetBIOS, it will indeed only be reachable through the truncated NetBIOS name "MyComputerWithA".

Is using FQDN an option here?

@rhockenbury
Author

Thanks for the additional insight. It seems it would be best to disable NetBios, since using the machine api object name as the hostname would result in NetBios name collisions. I'll need to follow up internally to see if we can do this.

@JocelynBerrendonner

@rhockenbury : after further discussions with the experts, NETBIOS name resolution is mostly unused today. Though the first step in name resolution is usually going through NETBIOS, if the NETBIOS name is not found, Windows will fall back to resolving the machine name using DNS. For example, if you try to reach a machine through "MyComputerWithALongName", Windows will be able to find that name in DNS provided that the DNS suffix search order is properly populated in the network interface TCP/IP settings (this last point is important). If you try to ping "MyComputerWithALongName" and the suffix is properly populated (to, let's say, contoso.com), then Windows will behave similarly to Linux and try "MyComputerWithALongName.contoso.com".

The bottom line is, I previously suggested using the FQDN, but as per my discussion with the expert, there is actually no need for it. If the DNS suffix search order is properly populated on Windows nodes, the long host names Cluster-API generates should be directly usable. And whether NETBIOS is enabled or not shouldn't matter. If a long name doesn't work with NETBIOS enabled, it will likely not work with NETBIOS disabled either.

FWIW, you can check the DNS suffix list using Get-DnsClientGlobalSetting in PowerShell:

PS C:\hns> Get-DnsClientGlobalSetting

UseSuffixSearchList : True
SuffixSearchList    : {contoso.com}
UseDevolution       : True
DevolutionLevel     : 0
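
The suffix-search behaviour described in this comment can be illustrated with a small Go sketch. It is a deliberately simplified model, not the actual Windows resolver: it only shows how a single-label name is combined with each entry of a suffix search list. The `candidates` function is a hypothetical name used for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// candidates lists the fully qualified names a resolver could try for
// host, given a DNS suffix search list. A name that already contains a
// dot is treated as qualified and returned unchanged; this is a
// simplification of real resolver behaviour.
func candidates(host string, suffixes []string) []string {
	if strings.Contains(host, ".") {
		return []string{host}
	}
	out := make([]string, 0, len(suffixes))
	for _, s := range suffixes {
		// Tolerate suffixes written with a leading dot.
		out = append(out, host+"."+strings.TrimPrefix(s, "."))
	}
	return out
}

func main() {
	// The suffix list mirrors the SuffixSearchList shown in the output above.
	fmt.Println(candidates("MyComputerWithALongName", []string{"contoso.com"}))
}
```

Running this prints `[MyComputerWithALongName.contoso.com]`, the name the comment says Windows would try. With an empty suffix list the function returns no candidates, which matches the point that the search order must be populated.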

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 12, 2020
@rhockenbury
Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 12, 2020
@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 10, 2020
@randomvariable
Member

I think we concluded that this isn't an issue? @jsturtevant has also stated as much in the Windows proposal.

/close
for now, and we can revisit if it turns out to be a problem?

@k8s-ci-robot
Contributor

@randomvariable: Closing this issue.

@rhockenbury
Author

@randomvariable
Member

/reopen

This question was re-raised in SIG Windows around app support. However, we were wondering whether, since pod names and DNS names are synonymous, pod names longer than the NETBIOS limit would also break applications that don't support longer names. If that's the case, it still doesn't make sense to make this a cluster api concern.

I think @JocelynBerrendonner was going to get a definitive answer.

@k8s-ci-robot
Contributor

@randomvariable: Reopened this issue.

@k8s-ci-robot k8s-ci-robot reopened this Nov 30, 2020
@randomvariable
Member

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added this to the v1.1 milestone Oct 22, 2021
@vincepri
Member

cc @jayunit100

@jayunit100
Contributor

Yup! So, we'd love to propose a fix for this, or go through other folks' proposed fixes, in an upcoming CAPI meeting?

@randomvariable
Member

@jayunit100 feel free to reach out if you have a solution in mind. We'll then request the change as required, whether that's some provider contract (which I suspect it might be) or otherwise

@randomvariable
Member

some of the stuff @jayunit100 and @weiwenli97 have been looking at is in https://docs.google.com/document/d/1C7PxLukDUyGxhgPxHRpGYPbROlZarak0QdE7grUPReQ/edit#

@sbueringer
Member

/help

@k8s-ci-robot
Contributor

@sbueringer:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Feb 18, 2022
@sbueringer
Member

/unassign @randomvariable

@fabriziopandini fabriziopandini added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Jul 29, 2022
@fabriziopandini fabriziopandini removed this from the v1.2 milestone Jul 29, 2022
@fabriziopandini fabriziopandini removed the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Jul 29, 2022
@fabriziopandini
Member

/triage needs-information
@CecileRobertMichon is this still a problem?

@k8s-ci-robot k8s-ci-robot added the triage/needs-information Indicates an issue needs more information in order to work on it. label Sep 30, 2022
@CecileRobertMichon
Contributor

if you're asking if

Hostnames on windows are limited to 15 characters

is still true, then yes. I know some providers, including CAPZ, have implemented a workaround that trims the AzureMachineName for use as the hostname (https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/main/azure/scope/machine.go#L399). Not sure if this is something that can be fixed at the CAPI level. @marosset do you have any thoughts?
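
For readers wondering what such trimming can look like, here is a hedged Go sketch. It is not the CAPZ implementation: the `windowsHostname` function and the choice to keep the first 9 and last 5 characters are assumptions made for illustration. The idea is that keeping the tail of the machine name preserves its random suffix, so names from the same machine set stay distinct.

```go
package main

import (
	"fmt"
	"strings"
)

// maxWindowsHostname is the 15-character limit discussed in this issue.
const maxWindowsHostname = 15

// windowsHostname is a hypothetical helper. Names within the limit are
// returned unchanged; longer names are shortened to the first 9 characters
// (minus one trailing hyphen, if present), a hyphen, and the last 5
// characters. The 9/5 split is an assumption for this sketch.
func windowsHostname(machineName string) string {
	if len(machineName) <= maxWindowsHostname {
		return machineName
	}
	head := strings.TrimSuffix(machineName[:9], "-")
	tail := machineName[len(machineName)-5:]
	return head + "-" + tail
}

func main() {
	for _, n := range []string{
		"workload-cluster-2-md-0-5f77f47487-2c4sq",
		"workload-cluster-2-md-0-5f77f47487-25xhg",
	} {
		fmt.Println(windowsHostname(n))
	}
}
```

For the two machine names from the issue description this yields "workload-2c4sq" and "workload-25xhg". Uniqueness then rests entirely on the 5-character suffix, so machines from different deployments that share their first 9 characters could still collide.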

@fabriziopandini
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Nov 30, 2022
@dimatha

dimatha commented Jan 9, 2023

Hey guys, I just wanted to ask if there are any plans to work on this?

@jsturtevant
Contributor

I don't think anyone is working on this. Most providers that support Windows have used the trimming of the hostname (#2217 (comment)) at this point, I think.

We haven't had many complaints about this as an approach. Have you run into any issues or have other requirements?

@dimatha

dimatha commented Jan 9, 2023

Thanks for coming back to me @jsturtevant !
Yeah, just tested CAPV 1.5.1 provider and the "fix" is there. It does truncate the hostname! I should have checked it first :(
Thanks a lot for your quick update!

@vincepri
Member

vincepri commented Nov 8, 2023

Closing due to inactivity, feel free to reopen if you're ready to pick up the work.

/close

@k8s-ci-robot
Contributor

@vincepri: Closing this issue.
