
determine and document resource requirements #485

Open
BenTheElder opened this issue May 5, 2019 · 21 comments
Labels
good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/documentation Categorizes issue or PR as related to documentation. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@BenTheElder
Member

What would you like to be documented: A more accurate lower bound on resources when using kind with Docker Desktop. Currently we suggest 4 GB / 4 CPU, which, while probably accurate for building Kubernetes, should be more than we need to run a node.

https://kind.sigs.k8s.io/docs/user/quick-start/#creating-a-cluster

Why is this needed: We don't want to overstate requirements dramatically and scare off potential users :-)

We'll need to do some testing to determine the threshold. It should be lower at HEAD; it might also be interesting to check if that is true 🙃

@BenTheElder BenTheElder added the kind/documentation Categorizes issue or PR as related to documentation. label May 5, 2019
@BenTheElder
Member Author

/assign

@nickolaev

I am trying kind in the CircleCI machine executor which seems to be perfectly fine with its 2 CPU cores. Hope this helps.

@BenTheElder
Member Author

thanks!

can confirm that at HEAD, even the smallest settings Docker Desktop currently offers work fine for kind create cluster:
[Screenshot: Docker Desktop resource settings at their minimum values]

$ kind create cluster
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.14.1) 🖼
 ✓ Preparing nodes 📦 
 ✓ Creating kubeadm config 📜 
 ✓ Starting control-plane 🕹️ 
 ✓ Installing CNI 🔌 
 ✓ Installing StorageClass 💾 
Cluster creation complete. You can now use the cluster with:

export KUBECONFIG="$(kind get kubeconfig-path --name="kind")"
kubectl cluster-info

$ kubectl get no
NAME                 STATUS   ROLES    AGE   VERSION
kind-control-plane   Ready    master   20s   v1.14.1

$ kubectl get po --all-namespaces
NAMESPACE     NAME                      READY   STATUS    RESTARTS   AGE
kube-system   coredns-fb8b8dccf-qhm47   1/1     Running   0          23s
kube-system   coredns-fb8b8dccf-rc56v   1/1     Running   0          23s
kube-system   kube-proxy-ksw7d          1/1     Running   0          23s
kube-system   weave-net-dqz4p           2/2     Running   0          23s

@BenTheElder
Member Author

Will actually have to follow up and sample the usage over time; the lowest settings on Docker Desktop / macOS appear to be above our lower bound 🙃
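A minimal way to do that sampling, as a sketch (assumptions: POSIX shell with awk, node containers carrying kind's default "kind" name prefix, and the current `docker stats` `--format` placeholders):

```shell
# Hedged sketch: sample kind node container resource usage with `docker stats`.
# Assumption: node containers are named with the default "kind" prefix.

# Convert a docker-stats memory figure (e.g. "512MiB", "1.5GiB") to whole MiB.
to_mib() {
  case "$1" in
    *GiB) printf '%s\n' "${1%GiB}" | awk '{printf "%.0f\n", $1 * 1024}' ;;
    *MiB) printf '%s\n' "${1%MiB}" | awk '{printf "%.0f\n", $1}' ;;
    *KiB) printf '%s\n' "${1%KiB}" | awk '{printf "%.0f\n", $1 / 1024}' ;;
    *)    echo 0 ;;
  esac
}

# One-shot sample of CPU% and memory for each kind node container;
# loop this (e.g. every 10 seconds) to sample usage over time.
sample_kind_usage() {
  docker stats --no-stream --format '{{.Name}} {{.CPUPerc}} {{.MemUsage}}' \
    | grep '^kind'
}
```

`sample_kind_usage` is just a one-shot reading; wrapping it in a loop (or `watch`) gives the over-time view mentioned above.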

@costinm

costinm commented May 9, 2019

We are using both CircleCI machine and remote_docker environments, which are 2 CPU, to test Istio - it seems to be working quite well. Since Circle doesn't allow higher CPU, it would be good to keep this as a baseline.

@costinm

costinm commented May 9, 2019

Besides - if the ARM64 bugs are fixed, I would hope Kind will run on Raspberry Pi - k8s can run just fine.

@BenTheElder
Member Author

We are using both CircleCI machine and remote_docker environments, which are 2 CPU, to test Istio - it seems to be working quite well. Since Circle doesn't allow higher CPU, it would be good to keep this as a baseline.

A single node should work with considerably less than this; however, the rest of what we can do performance-wise is mostly bound by Kubernetes / CRI / ... CNI is probably the last place where we have room to squeeze this lower, and we're working on that.

It may regress some in the future due to the components we don't control, but keeping everything as low as we can is a high priority 👍

Besides - if the ARM64 bugs are fixed, I would hope Kind will run on Raspberry Pi - k8s can run just fine.

As far as I know, ARM64 works but requires building images yourself. Currently it will be painful to cross-build those because of getting Kubernetes loaded, but that is being worked on at low priority.

There is some limited ARM64 CI working now from the openlab folks.

@BenTheElder
Member Author

These will shift a bit with the updated CNI configuration (they should be lower), but I will remeasure.

Still need to document as well.

@BenTheElder
Member Author

since we've approached the limits of what we can reduce from kind's end alone, some experimentation with making upstream Kubernetes lighter: https://github.com/BenTheElder/kubernetes/tree/experiment

if we go forward with this change upstream then kind will support leveraging it immediately.

@BenTheElder
Member Author

I've improved that prototype with the goals of:

  • being able to test / use out of tree cloud providers without in-tree code well ahead of the in-tree removal
  • being able to ship kind with lighter node images

So far that more or less works and I've created a provisional PR upstream.

At this point I think the next step is a KEP, expect more on this in the near future :-)

We may need to slightly adjust what else we ship, though (e.g. currently we are missing the metrics APIs), but we will continue to push for lightweight clusters overall. I think we can lighten some other things at the same time to make room without adding much overhead.

@BenTheElder
Member Author

#932 + recent containerd build infra and upgrades should reduce the memory overhead per pod.

@BenTheElder
Member Author

/help

@k8s-ci-robot
Contributor

@BenTheElder:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Nov 7, 2019
@BenTheElder BenTheElder added the good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. label Nov 19, 2019
@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 17, 2020
@kubernetes-sigs kubernetes-sigs deleted a comment from fejta-bot Feb 18, 2020
@BenTheElder BenTheElder added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 18, 2020
@shreyasbapat

Hi @BenTheElder,
I wish to work on this issue and get started. Can you confirm whether only the image of the Docker Desktop interface needs to be replaced with the one you posted above?

@BenTheElder
Member Author

The image can actually be left as-is, since it refers to building the Kubernetes image, which takes more resources.

it would be helpful to determine exactly how much resources a typical kind cluster uses in a repeatable fashion and keep this documented.
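One possible repeatable procedure, as a hedged sketch (assumes a default single-node cluster, so the node container is named `kind-control-plane`; the 10-second interval and 30-sample window are arbitrary choices, not a prescribed method):

```shell
# Hedged sketch: record peak control-plane memory over a fixed window so the
# procedure (and its result) can be written down and re-run later.
# Assumption: cluster created with defaults, so the node container is
# named "kind-control-plane".

# Return the largest of a list of integer samples (in MiB).
max_of() { printf '%s\n' "$@" | sort -n | tail -n 1; }

measure_peak_mib() {
  samples=""
  for _ in $(seq 1 30); do              # ~5 minutes at 10-second intervals
    mem=$(docker stats --no-stream --format '{{.MemUsage}}' kind-control-plane |
      awk '{ v = $1
             if (v ~ /GiB/) { sub(/GiB/, "", v); v *= 1024 }
             else           { sub(/MiB/, "", v) }
             printf "%.0f\n", v }')     # normalize the reading to MiB
    samples="$samples $mem"
    sleep 10
  done
  max_of $samples
}
```

Writing the interval, window, and kind/Kubernetes versions down alongside the resulting peak is what makes the number verifiable later.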

@shreyasbapat

it would be helpful to determine exactly how much resources a typical kind cluster uses in a repeatable fashion and keep this documented.

And for that, I will have to perform some experiments and report back, right? Do you suggest I check multiple times?

@BenTheElder
Member Author

that's a good idea, I think the most important thing is that we write down how we determined this somewhere so we can come back and verify what it's currently at :-)

@shreyasbapat

that's a good idea, I think the most important thing is that we write down how we determined this somewhere so we can come back and verify what it's currently at :-)

On it. Will notify once I am done

@BenTheElder BenTheElder removed their assignment Jun 23, 2020
@shekhar-rajak

it would be helpful to determine exactly how much resources a typical kind cluster uses in a repeatable fashion and keep this documented.

Different hardware and system configs can behave differently; I'm not sure where to benchmark memory & time.

@jayunit100
Contributor

FWIW I ran an experiment on my kid's 2-core i5 with 8 GB of RAM dedicated to Docker and was able to run a four-node kind cluster with no issues, including scheduling of 10+ pods (14 if you include CNI) and sonobuoy.

Meanwhile, pushing to ten nodes on a massive server with 48 cores failed because of etcd.

So it sounds like the most important tweak for kind may be running etcd in memory when running a large number of nodes.
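A hedged sketch of what "etcd in memory" could look like using kind's existing `extraMounts` cluster config: the hostPath below is hypothetical, it assumes `/dev/shm` is tmpfs-backed on the host, and whether this actually fixes the ten-node case is exactly what would need testing.

```yaml
# kind-etcd-tmpfs.yaml -- a sketch, not a verified recipe.
# Mount a tmpfs-backed host directory over etcd's data dir so its
# writes never hit disk.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /dev/shm/kind-etcd      # hypothetical path; must exist beforehand
    containerPath: /var/lib/etcd
```

Usage would be roughly: `mkdir -p /dev/shm/kind-etcd && kind create cluster --config kind-etcd-tmpfs.yaml`. The trade-off is that etcd state is lost on host reboot, which is usually fine for throwaway test clusters.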

stg-0 pushed a commit to stg-0/kind that referenced this issue Mar 4, 2024
@williscool

Just wanted to share that AWS publishes the maximum number of pods you can run on an EC2 instance for EKS, which you can roughly map to an amount of CPU and memory per pod:

https://github.com/awslabs/amazon-eks-ami/blob/main/templates/shared/runtime/eni-max-pods.txt

Found that here:

https://stackoverflow.com/questions/57970896/pod-limit-on-node-aws-eks

Not sure if it's helpful for finding the right numbers for kind, but figured why not share.
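For anyone who wants to poke at those numbers, a small sketch for looking up an instance type in a downloaded copy of that file (format assumption: one `<instance-type> <max-pods>` pair per line, with `#` comment lines):

```shell
# Hedged sketch: look up the EKS max-pods figure for an instance type in a
# local copy of eni-max-pods.txt. The file path argument is optional and
# defaults to a hypothetical ./eni-max-pods.txt.
max_pods() {
  awk -v t="$1" '!/^#/ && $1 == t { print $2 }' "${2:-eni-max-pods.txt}"
}
```

E.g. `max_pods m5.large` would print that instance type's published pod limit, which could then be divided into its vCPU/memory figures to estimate per-pod overhead.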
