Add a security checklist for clusters #33992
/sig security |
✅ Pull request preview available for checking. Built without sensitive environment variables.
Great security checklist already! But it could make sense to add the following additional information:
- Network security
- Memory and CPU limits
- Pod security
Also in regard to the Secrets section, I think secrets should not be mounted as environment variables, because in case of an error, those could be written into a log and therefore be exposed. |
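To make the concern about Secrets in environment variables concrete, here is a minimal Python sketch that scans a Pod spec (as a plain dict, for example loaded from a manifest) and reports containers that receive Secrets through the environment. The function name `find_secret_env_usage` and the sample Pod are hypothetical and only for illustration; the field names `env[].valueFrom.secretKeyRef` and `envFrom[].secretRef` are the ones used by the Kubernetes Pod API.

```python
def find_secret_env_usage(pod_spec):
    """Return (container name, env var or Secret name) pairs where a Secret is exposed via env."""
    findings = []
    containers = pod_spec.get("containers", []) + pod_spec.get("initContainers", [])
    for container in containers:
        name = container.get("name", "<unnamed>")
        # Single keys pulled from a Secret into one environment variable.
        for env in container.get("env", []):
            if "secretKeyRef" in (env.get("valueFrom") or {}):
                findings.append((name, env.get("name")))
        # Whole Secrets imported as environment variables.
        for source in container.get("envFrom", []):
            if "secretRef" in source:
                findings.append((name, source["secretRef"].get("name")))
    return findings


# Hypothetical Pod spec: "app" reads a Secret through env, "sidecar" mounts it as a file.
example_pod_spec = {
    "containers": [
        {
            "name": "app",
            "env": [
                {
                    "name": "DB_PASSWORD",
                    "valueFrom": {"secretKeyRef": {"name": "db", "key": "password"}},
                }
            ],
        },
        {
            "name": "sidecar",
            "volumeMounts": [
                {"name": "db-secret", "mountPath": "/etc/db", "readOnly": True}
            ],
        },
    ],
    "volumes": [{"name": "db-secret", "secret": {"secretName": "db"}}],
}

print(find_secret_env_usage(example_pod_spec))  # [('app', 'DB_PASSWORD')]
```

Only the `app` container is reported; the `sidecar` container reads the same Secret from a read-only volume mount, which is the pattern suggested above.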
Thanks a lot for the great inputs! Could you consider suggesting changes directly in the checklist? Otherwise, I'll add your ideas if you prefer 😊 |
I added my feedback directly. Please feel free to change my suggestions in any way you want. |
Should we also add a section about certain admission controllers that can enhance security? |
You can totally propose another section if you have any ideas and it's relevant; we just have to keep third-party links out of this checklist. So you have to keep it generic, or find resources or lists already published by the CNCF, for example. |
I have written my suggestion as a comment. It's a short section 😃 |
This looks really good, and is quite informative. One minor thought is that it doesn't read so much like a checklist to me, more as a set of recommendations/considerations. What are folks' thoughts on trying to add a condensed list of actionable items? Would it be valuable? |
Provided folks are happy to maintain it, we can also publish a PDF version of same. Or YAML, etc. |
I know that the issue says that we shouldn't use third-party links, but something about the CIS Benchmark is probably a nice thing to add.
I don't think I'm familiar enough with bootstrapping clusters to answer that question, unfortunately :(! |
Hi Folks, can we merge this PR if there are no big concerns from the reviewers and capture the nice to haves or minor changes in another issue/PR? |
Changes since #33992 (comment) are minor. https://deploy-preview-33992--kubernetes-io-main-staging.netlify.app/docs/concepts/security/security-checklist/ looks good.
/lgtm
/approve
Many thanks to the folks who have moved this PR forward.
I've added some comments. None of this feedback in any way blocks us merging this PR.
- [ ] Intermediate and leaf certificates have an expiry date no more than 3 | ||
years in the future. | ||
- [ ] A process exists for periodic access review, and reviews occur no more | ||
than 24 months apart. |
I think it's OK for the checklist to recommend that you've thought about those expiry dates - there's no specific right answer for how long those intervals might be.
After bootstrapping, neither users nor components should authenticate to the | ||
Kubernetes API as `system:masters`. Similarly, running all of | ||
kube-controller-manager as `system:masters` should be avoided. In fact, | ||
`system:masters` should only be used as a break-glass mechanism, as opposed to | ||
an admin user. |
I think we could make a similar comment about /etc/kubernetes/admin.conf
|
||
## Network security | ||
|
||
- [ ] CNI plugins in-use supports network policies. |
- [ ] CNI plugins in-use supports network policies. | |
- [ ] CNI plugins in-use supports [network policies](https://kubernetes.io/docs/concepts/services-networking/network-policies/). |
feature, an alternative solution could be to use a service mesh to provide that | ||
functionality. |
Optionally:
feature, an alternative solution could be to use a service mesh to provide that | |
functionality. | |
feature, an alternative solution could be to use a [service mesh](https://glossary.cncf.io/service-mesh/) | |
to provide that functionality. |
If a cloud provider is used for hosting Kubernetes, the access from pods to the cloud | ||
metadata API `169.254.169.254` should also be restricted or blocked if not needed | ||
because it may leak information. |
Consider blocking egress to 169.254.0.0/16
and making exceptions based on business need.
(For IPv6, link-local addresses have an attached interface scope which means that even node-local routes don't get set up unless somebody explicitly makes this happen, so I think we don't need to mention IPv6).
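As a rough illustration of this suggestion, the Python sketch below builds a NetworkPolicy manifest (as a dict) that allows IPv4 egress everywhere except `169.254.0.0/16` and then adds explicit exceptions. The helper name `build_egress_policy`, the policy name, and the example exception address are hypothetical; the manifest fields (`ipBlock.cidr`, `ipBlock.except`, `policyTypes`) are from the `networking.k8s.io/v1` NetworkPolicy API. Whether such a policy actually intercepts traffic to the metadata endpoint depends on the CNI plugin in use, so treat this as a starting point rather than a guarantee.

```python
import ipaddress

LINK_LOCAL_V4 = ipaddress.ip_network("169.254.0.0/16")


def build_egress_policy(name, namespace, exceptions=()):
    """Build a NetworkPolicy dict allowing IPv4 egress except link-local, plus listed exceptions."""
    # First rule: everything except the link-local range.
    rules = [
        {"to": [{"ipBlock": {"cidr": "0.0.0.0/0", "except": [str(LINK_LOCAL_V4)]}}]}
    ]
    # NetworkPolicy rules are additive, so each exception becomes its own allow rule.
    for cidr in exceptions:
        network = ipaddress.ip_network(cidr)
        if not network.subnet_of(LINK_LOCAL_V4):
            raise ValueError(f"{cidr} is not inside {LINK_LOCAL_V4}")
        rules.append({"to": [{"ipBlock": {"cidr": str(network)}}]})
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Egress"], "egress": rules},
    }


# Hypothetical business-need exception: a single link-local address.
policy = build_egress_policy(
    "restrict-link-local", "default", exceptions=["169.254.20.10/32"]
)
print(policy["spec"]["egress"])
```

The empty `podSelector` applies the policy to every Pod in the namespace; narrowing it with labels is a reasonable alternative when only some workloads need the restriction.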
|
||
- [ ] RBAC rights to `create`, `update`, `patch`, `delete` workloads is only granted if necessary. | ||
- [ ] Appropriate Pod Security Standards policy is applied for all namespaces and enforced. | ||
- [ ] Memory limit is set for the workloads with a limit equal or inferior to the request. |
You can't set a resource limit that is less than the request.
- [ ] Memory limit is set for the workloads with a limit equal or inferior to the request. | |
- [ ] Memory limit is set for the workloads with a request equal to or less | |
than the limit. |
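The corrected wording can be checked mechanically. The sketch below is a simplified Python check, not the way Kubernetes itself validates manifests: `parse_memory` and `memory_request_within_limit` are hypothetical helper names, and the parser only handles plain integers with common binary and decimal suffixes (no fractions or exponent notation).

```python
# Simplified suffix table; real Kubernetes quantities support more forms.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity):
    """Convert a simple quantity such as '512Mi' or '1G' to a number of bytes."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def memory_request_within_limit(container):
    """Return True if a memory limit is set and the request does not exceed it."""
    resources = container.get("resources", {})
    request = resources.get("requests", {}).get("memory")
    limit = resources.get("limits", {}).get("memory")
    if limit is None:
        return False  # the checklist item requires a memory limit
    if request is None:
        return True  # with only a limit set, the request defaults to the limit
    return parse_memory(request) <= parse_memory(limit)


# Hypothetical container spec for illustration.
container = {
    "name": "app",
    "resources": {"requests": {"memory": "256Mi"}, "limits": {"memory": "512Mi"}},
}
print(memory_request_within_limit(container))  # True
```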
For historical context, please note that Docker has been using | ||
[a default seccomp profile](https://docs.docker.com/engine/security/seccomp/) |
For historical context, please note that Docker has been using | |
[a default seccomp profile](https://docs.docker.com/engine/security/seccomp/) | |
For historical context, note that Docker Engine has been using | |
[default seccomp profile](https://docs.docker.com/engine/security/seccomp/) |
[a default seccomp profile](https://docs.docker.com/engine/security/seccomp/) | ||
to only allow a restricted set of syscalls since 2016 from | ||
[Docker Engine 1.10](https://www.docker.com/blog/docker-engine-1-10-security/), | ||
but Kubernetes is still not confining workloads by default. The default seccomp |
but Kubernetes is still not confining workloads by default. The default seccomp | |
but Kubernetes does not confine workloads by default. The default seccomp |
as well. Fortunately, [Seccomp Default](/blog/2021/08/25/seccomp-default/), a | ||
new alpha feature to use a default seccomp profile for all workloads can now be | ||
enabled and tested. |
as well. Fortunately, [Seccomp Default](/blog/2021/08/25/seccomp-default/), a | |
new alpha feature to use a default seccomp profile for all workloads can now be | |
enabled and tested. | |
as well. |
and then in the What's Next section, link to https://k8s.io/docs/tutorials/security/seccomp/#enable-the-use-of-runtimedefault-as-the-default-seccomp-profile-for-all-workloads
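Since Kubernetes does not confine workloads by default, one way to audit for this is to check whether each container ends up with the `RuntimeDefault` seccomp profile. The Python sketch below assumes a Pod spec given as a dict; the function name `uses_runtime_default_seccomp` is hypothetical, while `securityContext.seccompProfile.type` is the Pod API field. It does not account for a cluster-wide default configured on the kubelet.

```python
def uses_runtime_default_seccomp(pod_spec):
    """Return True if every container resolves to the RuntimeDefault seccomp profile."""
    pod_profile = (pod_spec.get("securityContext") or {}).get("seccompProfile") or {}
    for container in pod_spec.get("containers", []):
        # A container-level profile takes precedence over the Pod-level one.
        container_context = container.get("securityContext") or {}
        profile = container_context.get("seccompProfile") or pod_profile
        if profile.get("type") != "RuntimeDefault":
            return False
    return True


# Hypothetical Pod spec that sets the profile once at Pod level.
confined_pod = {
    "securityContext": {"seccompProfile": {"type": "RuntimeDefault"}},
    "containers": [{"name": "app"}],
}
print(uses_runtime_default_seccomp(confined_pod))  # True
```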
Container image should contain the bare minimum to run the program they | ||
package. Preferably, only the program and its dependencies, building the image | ||
from the minimal possible base. In particular, image used in production should not | ||
contain shells or debugging utilities, as an | ||
[ephemeral debug container](/docs/tasks/debug/debug-application/debug-running-pod/#ephemeral-container) | ||
can be used for troubleshooting. |
We should explain and recommend multistage builds as one technique that helps minimize what gets included.
LGTM label has been added. Git tree hash: e60390e52d7e7f5218be49460fd8d7bf739c8f63
|
/label tide/merge-method-squash |
Actually, I think a normal merge commit will be OK. |
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: sftim. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hehe, it feels almost a bit weird that this is finally merged! 😵💫 Thanks so much everybody, we were talking about that with sig security doc just a few minutes before!
Can I ask why you did not squash-merge in the end, @sftim? Your comments were pretty nice btw, should we create a PR for those? |
When GitHub does a squash merge it produces a single commit into the target branch, and doesn't let you track what the merge base of the work was. This way, we're tracking where the branches diverged before merging back in. |
How about creating some issues? Where appropriate, mark those as good-first-issue so that folks with less experience can pick these up. |
Addresses issue "Create a security checklist for deploying a cluster" #28 in the k/sig-security repository. [preview]
Draft for the security checklist guide from the collaborative document with @savitharaghunathan, @Skybound1, and @p4ck3t0. Many people participated in this list via this PR, thanks everyone ☺️!
This is a document in the form of a checklist, with paragraphs that further describe some items. The goal is to gather many security concerns in one central place that redirects to other documentation or guides.
This checklist is meant to evolve in the future. As the participation on the PR showed, people contributed very diverse ideas, and this is essential to cover as much as possible.