tracking issue for Windows support #1393

Open
14 of 17 tasks
neolit123 opened this issue Feb 6, 2019 · 32 comments
Labels: area/ecosystem, area/windows, kind/feature, kind/tracking-issue, lifecycle/frozen, priority/important-longterm, sig/windows

@neolit123
Member

neolit123 commented Feb 6, 2019

kubernetes/enhancements tracking issue:

KEP was added here:


GA graduation:


beta graduation:

- [ ] upgrades
upgrades were delegated to documentation and having scripts for the process is not really needed.


alpha graduation:

a list of cleanup changes that we can do regardless:


side work:


/kind feature
/area ecosystem
/priority important-longterm
/assign

cc @michmike @PatrickLang

@neolit123
Member Author

neolit123 commented Apr 3, 2019

@neolit123 neolit123 added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/windows Categorizes an issue or PR as relevant to SIG Windows. labels Apr 9, 2019
@neolit123
Member Author

kubernetes/enhancements tracking issue:
kubernetes/enhancements#995

KEP was added here:
kubernetes/enhancements#994

@neolit123
Member Author

update the OP with:

a list of cleanup changes that we can do regardless:

@benmoss
Member

benmoss commented May 8, 2019

I don't see any preflight checks that are failing or not appropriate for Windows:

PS C:\> .\winsw\join.ps1
I0508 14:27:14.679350    1320 join.go:364] [preflight] found NodeName empty; using OS hostname as NodeName
I0508 14:27:14.681294    1320 initconfiguration.go:105] detected and using CRI socket: tcp://localhost:2375
[preflight] Running pre-flight checks
I0508 14:27:14.690354    1320 preflight.go:90] [preflight] Running general checks
I0508 14:27:14.952085    1320 checks.go:254] validating the existence and emptiness of directory \etc\kubernetes\manifests
I0508 14:27:14.953111    1320 checks.go:292] validating the existence of file \etc\kubernetes\kubelet.conf
I0508 14:27:14.961114    1320 checks.go:292] validating the existence of file \etc\kubernetes\bootstrap-kubelet.conf
I0508 14:27:14.963083    1320 checks.go:105] validating the container runtime
I0508 14:27:15.124971    1320 checks.go:131] validating if the service is enabled and active
I0508 14:27:16.139511    1320 checks.go:524] running all checks
I0508 14:27:16.655173    1320 checks.go:412] checking whether the given node name is reachable using net.LookupHost
I0508 14:27:16.671956    1320 checks.go:622] validating kubelet version
I0508 14:27:16.834297    1320 checks.go:131] validating if the service is enabled and active
I0508 14:27:17.475948    1320 checks.go:209] validating availability of port 10250
I0508 14:27:17.476979    1320 checks.go:292] validating the existence of file C:/etc/kubernetes/pki/ca.crt
I0508 14:27:17.485021    1320 checks.go:439] validating if the connectivity type is via proxy or direct
I0508 14:27:17.487030    1320 join.go:426] [preflight] Discovering cluster-info
I0508 14:27:17.488914    1320 token.go:199] [discovery] Trying to connect to API Server "192.168.79.131:6443"
I0508 14:27:17.491183    1320 token.go:74] [discovery] Created cluster-info discovery client, requesting info from "https://192.168.79.131:6443"
I0508 14:27:17.512331    1320 token.go:140] [discovery] Requesting info from "https://192.168.79.131:6443" again to validate TLS against the pinned public key
I0508 14:27:17.529788    1320 token.go:163] [discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.79.131:6443"
I0508 14:27:17.532935    1320 token.go:205] [discovery] Successfully established connection with API Server "192.168.79.131:6443"
I0508 14:27:17.534895    1320 join.go:440] [preflight] Fetching init configuration
I0508 14:27:17.535888    1320 join.go:473] [preflight] Retrieving KubeConfig objects
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
I0508 14:27:17.571275    1320 interface.go:278] Looking for system interface with a global IPv4 address
I0508 14:27:17.572239    1320 interface.go:196] Interface Ethernet0 is up
I0508 14:27:17.585881    1320 interface.go:302] Skipping: no address family match for "fe80::a977:1755:66ff:8b87" on interface "Ethernet0".
I0508 14:27:17.586472    1320 interface.go:310] Found global unicast address "192.168.79.128" on interface "Ethernet0".
I0508 14:27:17.587191    1320 preflight.go:101] [preflight] Running configuration dependant checks
I0508 14:27:17.594267    1320 controlplaneprepare.go:207] [download-certs] Skipping certs download
I0508 14:27:17.595830    1320 kubelet.go:105] [kubelet-start] writing bootstrap kubelet config file at \etc\kubernetes\bootstrap-kubelet.conf
I0508 14:27:17.604244    1320 kubelet.go:113] [kubelet-start] writing CA certificate at C:/etc/kubernetes/pki/ca.crt
I0508 14:27:17.766276    1320 kubelet.go:131] [kubelet-start] Stopping the kubelet
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "\\var\\lib\\kubelet\\config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "\\var\\lib\\kubelet\\kubeadm-flags.env"
I0508 14:27:18.466047    1320 kubelet.go:148] [kubelet-start] Starting the kubelet
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connectex: No connection could be made because the target machine actively refused it..
I0508 14:28:43.240082    1320 kubelet.go:166] [kubelet-start] preserving the crisocket information for the node
I0508 14:28:43.241154    1320 patchnode.go:30] [patchnode] Uploading the CRI Socket information "tcp://localhost:2375" to the Node API object "win-vb8d2n40slh" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
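
The [kubelet-check] lines in the log above show a probe of http://localhost:10248/healthz being refused before the join eventually succeeds. A minimal Python sketch of that kind of retry loop follows. It is not kubeadm's implementation: the function names, attempt count, and delay are assumptions made for illustration, and only the URL is taken from the log.

```python
import time
import urllib.request


def kubelet_healthy(url="http://localhost:10248/healthz", timeout=2.0):
    """Return True if the endpoint answers HTTP 200, False on a connection-level error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, refused connections and socket timeouts
        # are all OSError subclasses.
        return False


def wait_until_healthy(probe, attempts=5, delay=1.0, sleep=time.sleep):
    """Call probe() up to `attempts` times (hypothetical helper).

    Returns the 1-based attempt number that succeeded, or None if none did.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)  # back off before the next probe
    return None
```

Calling `wait_until_healthy(kubelet_healthy, attempts=40, delay=1.0)` on the node would poll the endpoint for roughly 40 seconds; the probe is passed in as a callable so the loop can be exercised without a running kubelet.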

@neolit123
Member Author

neolit123 commented May 8, 2019

@benmoss
did you start the kubelet service using the Start-Service frontend instead of sc?

@neolit123
Member Author

also, did you have to apply the \ -> c:\ fix i did in \etc\kubernetes\kubelet.conf?

@ksubrmnn

ksubrmnn commented May 8, 2019

@benmoss Can you share .\winsw\join.ps1?

@benmoss
Member

benmoss commented May 8, 2019

I am using WinSW to wrap kubelet.exe as a Service. I really like WinSW as a service wrapper; it would get my vote over using the --windows-service flag.

https://github.com/benmoss/kubeadm-windows/blob/master/join.ps1
https://github.com/benmoss/kubeadm-windows/blob/master/kubelet.xml

To install the service you just need to run kubelet.exe install from that directory. The way WinSW works is you download the WinSW binary, rename it to the name of the service, and put it in the same directory as the corresponding xml config file. kubelet.exe install then registers it as a Windows service.
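
For readers unfamiliar with WinSW, the XML file is a small service description. The Python sketch below generates one using the `id`, `name`, `description`, `executable` and `arguments` elements of WinSW's configuration format. The executable path and kubelet arguments are hypothetical placeholders, not the contents of the kubelet.xml linked above.

```python
import xml.etree.ElementTree as ET


def winsw_config(service_id, executable, arguments, description=""):
    """Build a minimal WinSW-style service definition and return it as a string."""
    service = ET.Element("service")
    ET.SubElement(service, "id").text = service_id
    ET.SubElement(service, "name").text = service_id
    ET.SubElement(service, "description").text = description
    # The wrapped binary and its flags; both values are supplied by the caller.
    ET.SubElement(service, "executable").text = executable
    ET.SubElement(service, "arguments").text = arguments
    return ET.tostring(service, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(winsw_config(
        "kubelet",
        r"C:\k\kubelet.exe",
        r"--config=C:\var\lib\kubelet\config.yaml",
        description="kubelet wrapped by WinSW",
    ))
```

The output would be saved as kubelet.xml next to the renamed WinSW binary, matching the layout described above.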

@neolit123
Member Author

i think it might be a case where sc does something differently.
i will try the different options.

@benmoss
Member

benmoss commented May 8, 2019

And no, I didn't have to fix the paths in /etc/kubernetes/kubelet.conf. The only path problem I'm running into is that kubelet is joining paths to /etc/kubernetes/pki/ca.crt incorrectly. It errors with

F0508 14:27:19.857413    4916 server.go:251] unable to load client CA file C:\var\lib\kubelet\etc\kubernetes\pki\ca.crt: open C:\var\lib\kubelet\etc\kubernetes\pki\ca.crt: The system cannot find the path specified.

I have been working around that by just copying /etc into /var/lib/kubelet/ but that's obviously not right.
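
One plausible way to end up with the path in that error is code that treats any path without a drive letter as relative and appends it to the kubelet's directory. The Python sketch below reproduces the string from the error with that naive rule and contrasts it with a drive-aware join. Both helpers are hypothetical and written for illustration; they are not taken from the kubelet source, and the actual cause may differ.

```python
import ntpath  # Windows path semantics, usable on any OS


def resolve_naive(base_dir, path):
    """Hypothetical rule: no drive letter means 'relative to base_dir'."""
    drive, rest = ntpath.splitdrive(path)
    if drive:
        return ntpath.normpath(path)
    # A rooted path such as /etc/... loses its leading separator here,
    # so it ends up nested under base_dir.
    return ntpath.normpath(base_dir + "\\" + rest.lstrip("\\/"))


def resolve_rooted(base_dir, path):
    """Drive-aware join: a rooted path stays at the root of base_dir's drive."""
    return ntpath.normpath(ntpath.join(base_dir, path))


if __name__ == "__main__":
    base = r"C:\var\lib\kubelet"
    ca = "/etc/kubernetes/pki/ca.crt"
    print(resolve_naive(base, ca))   # C:\var\lib\kubelet\etc\kubernetes\pki\ca.crt
    print(resolve_rooted(base, ca))  # C:\etc\kubernetes\pki\ca.crt
```

The second result matches the C:/etc/kubernetes/pki/ca.crt location that kubeadm wrote to in the join log.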

@neolit123
Member Author

neolit123 commented Jun 10, 2019

updated OP with latest PRs merged.
for 1.15 (alpha) remaining items are install script and docs.

EDIT: looks like the docs and script will miss the 1.15 release deadlines.

@neolit123 neolit123 removed this from the v1.15 milestone Jun 11, 2019
@neolit123 neolit123 removed the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label Sep 3, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 24, 2021
@neolit123
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 24, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 23, 2021
@k8s-triage-robot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 26, 2021
@neolit123
Member Author

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jul 26, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 24, 2021
@neolit123
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 24, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 22, 2022
@neolit123
Member Author

neolit123 commented Jan 24, 2022 via email

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 24, 2022
@pacoxu
Member

pacoxu commented Sep 27, 2022

/cc
TODO: once the Windows unit tests run regularly, we need a testgrid board to track their code coverage, like https://testgrid.k8s.io/sig-testing-canaries#ci-kubernetes-coverage-unit.

@neolit123
Member Author

neolit123 commented Sep 10, 2024

recently had to do some fixes in the system validators library related to parsing the OS name on Windows after @jsturtevant reported the issue:

this is another thing we fixed recently, which took a while:

also checked what we have in the KEP for GA graduation:

The feature is well tested and adapted by the community. e2e tests are stable and consistent with other SIG-Windows CI signals. Documentation is complete.

i think we are pretty much done with this thanks to CAPZ signal, but there is this missing action item that has not been addressed for ~2 years. it's the result of a refactor that happened at some point in the page for adding Windows nodes.

cc @jsturtevant @marosset @aravindhp @knabben

(see my latest comment/proposal there kubernetes/website#34476 (comment))
can we just xref the sig-windows-tools guides from the windows guide and close/repurpose that website ticket?

i think after that we could just say that kubeadm support is GA.
the kube-proxy / CNI story is still not so simple for Windows users, but that seems out of band.
same for other documentation such as kubernetes/website#31428

@neolit123
Member Author

i think after that we could just say that kubeadm support is GA.

if we agree on that i can close:

and PR the KEP with a GA status.

@sftim

sftim commented Sep 10, 2024

GA

For GA features, we (very much) like to have docs.

@neolit123
Member Author

joined the sig windows meeting today, and we discussed the docs part. sig windows agreed with my proposal here:

on the technical side there seem to be a couple of GA blockers around kube-proxy:


9 participants