-
Notifications
You must be signed in to change notification settings - Fork 900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move networkpolicies out of /contrib into /common (#2385) #2457
Move networkpolicies out of /contrib into /common (#2385) #2457
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
can you add @TobiasGoerke as additional reviewer (not approver) in the owners file? |
/LGTM |
@@ -49,6 +50,8 @@ resources: | |||
- ../common/istio-1-16/cluster-local-gateway/base | |||
# Kubeflow namespace | |||
- ../common/kubeflow-namespace/base | |||
# NetworkPolicies | |||
- ../common/networkpolicies/base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kimwnasptd @juliusvonkohout Are there any concerns with enabling network policies by default? Is there any concern around how impacts any additional components users deploy either from /contrib
or their own?
This also would mean that any changes made by WG that do not align with current network policies would need to be addressed during the manifest testing time frame of the release. Is that something that Manifest WG wants to take on for all future releases as part of the release cycle?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a lot of discussion in #2385. It is more about how we want to do project governance and enforce security. I can do testing for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should include automated tests with the PR if the netpol will be enabled by default, to test the current policies and future changes to the policies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What exactly would you like to test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we can't automatically trigger e2e right now, we can at least trigger all the component tests to ensure that it passes with netpol enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you not think that it is a bit overkill to add the networkpolicies folder as path to 10 different GH actions? I would either use the e2e pipeline or make a simple test that just validates the networkpolicies with kustomize
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that just capturing kustomize build
output is not enough, as Kustomize allows you to build invalid resources not following the OpenAPI spec definition.
Example:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: minio
namespace: kubeflow
spec:
podSelector:
matchExpressions:
- sexy_key: app
operator: In
values:
- minio # artifact storage
With kustomize build common/networkpolicies/base
there is no error captured, but when doing static validation against OpenAPI-based validation tools kubeval
or kubectl-validate, an error is raised.
Kubeval:
$ kustomize build common/networkpolicies/base | kubeval (⎈ |kind-kind-dex-auth:default)
PASS - stdin contains a valid NetworkPolicy (kubeflow.cache-server)
WARN - stdin contains an invalid NetworkPolicy (kubeflow.minio) - key: key is required
....
Kubectl Validate:
$ kustomize build common/networkpolicies/base > netpol.yaml
$ kubectl-validate netpol.yaml
netpol.yaml...ERROR
spec.podSelector.matchExpressions[0].sexy_key: Invalid value: value provided for unknown field
Error: validation failed
@annajung my proposal is to consider using OpenAPI validation tools for testing manifest generation, aside from any other e2e and runtime checks.
- Kubectl Validate is really just emerging, but seems to be under the right umbrella (Kubernetes SIGs)
- Kubeval seems to be discontinued
- any other promissing tools?
I can try to do a bit of hacking around with the GH Action workflows to test using the kubectl-validate
route... let me know what's your thoughts on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After checking the entire kustomize build example/
there are more errors surfaced:
$ kubectl-validate example.yaml
example.yaml...ERROR
Role.rbac.authorization.k8s.io "cert-manager-webhook:dynamic-serving" is invalid: metadata.name: Invalid value: "cert-manager-webhook:dynamic-serving": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
Role.rbac.authorization.k8s.io "cert-manager-cainjector:leaderelection" is invalid: metadata.name: Invalid value: "cert-manager-cainjector:leaderelection": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
Role.rbac.authorization.k8s.io "cert-manager:leaderelection" is invalid: metadata.name: Invalid value: "cert-manager:leaderelection": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
ClusterRole.rbac.authorization.k8s.io "cert-manager-controller-approve:cert-manager-io" is invalid: metadata.name: Invalid value: "cert-manager-controller-approve:cert-manager-io": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
ClusterRole.rbac.authorization.k8s.io "cert-manager-webhook:subjectaccessreviews" is invalid: metadata.name: Invalid value: "cert-manager-webhook:subjectaccessreviews": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
RoleBinding.rbac.authorization.k8s.io "cert-manager-webhook:dynamic-serving" is invalid: metadata.name: Invalid value: "cert-manager-webhook:dynamic-serving": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
RoleBinding.rbac.authorization.k8s.io "cert-manager-cainjector:leaderelection" is invalid: metadata.name: Invalid value: "cert-manager-cainjector:leaderelection": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
RoleBinding.rbac.authorization.k8s.io "cert-manager:leaderelection" is invalid: metadata.name: Invalid value: "cert-manager:leaderelection": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
ClusterRoleBinding.rbac.authorization.k8s.io "cert-manager-controller-approve:cert-manager-io" is invalid: metadata.name: Invalid value: "cert-manager-controller-approve:cert-manager-io": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
ClusterRoleBinding.rbac.authorization.k8s.io "cert-manager-webhook:subjectaccessreviews" is invalid: metadata.name: Invalid value: "cert-manager-webhook:subjectaccessreviews": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
Maybe deserves separate topic to cover checking multiple components. Adding the name to the file/resource where the errors are found would be useful as well, but I think also just the name of the Kubernetes resources is good for starters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unit test could look something like the following:
But the problem is that for custom resources, operators etc, the CRDs needs to be passed as local folders for openAPI schemas extraction, so more work is needed for this.
failed to retrieve validator: failed to locate OpenAPI spec for GV: cert-manager.io/v1
failed to retrieve validator: failed to locate OpenAPI spec for GV: cert-manager.io/v1
failed to retrieve validator: failed to locate OpenAPI spec for GV: cert-manager.io/v1
failed to retrieve validator: failed to locate OpenAPI spec for GV: cert-manager.io/v1
failed to retrieve validator: failed to locate OpenAPI spec for GV: cert-manager.io/v1
failed to retrieve validator: failed to locate OpenAPI spec for GV: cert-manager.io/v1
failed to retrieve validator: failed to locate OpenAPI spec for GV: cert-manager.io/v1
failed to retrieve validator: failed to locate OpenAPI spec for GV: cert-manager.io/v1
failed to retrieve validator: failed to locate OpenAPI spec for GV: kubeflow.org/v1beta1
failed to retrieve validator: failed to locate OpenAPI spec for GV: metacontroller.k8s.io/v1alpha1
failed to retrieve validator: failed to locate OpenAPI spec for GV: networking.istio.io/v1alpha3
failed to retrieve validator: failed to locate OpenAPI spec for GV: networking.istio.io/v1alpha3
apiVersion: networking.k8s.io/v1 | ||
kind: NetworkPolicy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@juliusvonkohout Seldon isn't installed by default, so does it make sense to add this policy by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That depends. We can also think about moving it to /contrib/seldon
@rawc0der why did you disable seldon? Seldon will still be used in Kubeflow 1.9. |
I was actually thinking that I can enable it back as maybe there is no problem by running this netpol by default. What's you view on this? |
Kserve is aslo in /contrib but installed by default. Seldon is also quite popular. So having them in one place for the time being is what i prefer. @kimwnasptd what do you think? |
TO conclude, one example of where Istio AuthPolicies lacks some support is with services using protocol specific implementation, i.e. the databases: mysql, rds. AuthorizationPolicy can only cover HTTP/gRPC based protocols at the moment. Advantages of securing with Network policies:
Without netpols:
Attack vector in Layer 7:
Last take:
From this blog post. |
"rawc0der I now have the power to merge this kubeflow/internal-acls#584 (comment) , but i need to discuss this with Kimonas first. |
@rawcoder please fix the merge conflict and i think you have to remove yourself from the owners list or become approved by google. Otherwise i have to fork this before merging. |
/hold |
@andreyvelich can you take this up with the KSC ? |
@terrytangyuan can you take this up with the KSC ? |
I think this is in our agenda but we haven't got to it yet. cc @kubeflow/kubeflow-steering-committee |
@juliusvonkohout Do we have KSC approval for merging this? |
I think the KSC still has to decide, but i provided most arguments in the community call, so i hope it is just a formal approval. |
I am +1 considering @juliusvonkohout says this is not breaking change. It is in line with a security profile strategy goals and helps to unblock a CNCF requirement for graduation. By default it will be turned on with this change. It can be disabled by commenting out one line. Can we get feedback from distributions ? @kimwnasptd @james-jwu @johnugeorge others? |
As discussed in the community call with three KSC members we will merge it now. /lgtm Thank you for moving enabling it by default @rawcoder. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: juliusvonkohout, rawc0der The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/unhold |
…beflow#2457) * Move networkpolicies out of /contrib into /common (kubeflow#2385) * extend OWNERS file * disable seldon NP * enable seldon NP * add rawc0der as Reviewer * Remove rawc0der reviewer * remove the duplicate resource
Which issue is resolved by this Pull Request:
Resolves #2385
Description of your changes:
Moved
networkpolicies
from /contrib to /common and added the base overlay to the example kustomization.Checklist:
Make sure you have installed kustomize == ^5.0.0
make generate-changed-only
make test