Address more comments and add logic in stale controller
Signed-off-by: Yang Ding <[email protected]>
Dyanngg committed Mar 11, 2022
1 parent 570f9c0 commit f7e29a8
Showing 9 changed files with 304 additions and 144 deletions.
98 changes: 13 additions & 85 deletions docs/multicluster/architecture.md
@@ -90,88 +90,16 @@ IPs from all member clusters. The new created Antrea Multi-cluster Service is ju
Kubernetes Service, so Pods in a member cluster can access the multi-cluster Service as usual without
any extra setting.

## Multi-cluster ClusterNetworkPolicy Replication (ACNP Copy-span)

Antrea Multi-cluster admins can specify certain ClusterNetworkPolicies to be replicated across
the entire ClusterSet. This is especially useful for ClusterSet admins who want all clusters in the
ClusterSet to be applied with a consistent security posture (for example, all namespaces in all
clusters can only communicate with Pods in their own namespaces). For more information regarding
Antrea ClusterNetworkPolicy (ACNP), refer to [this document](../antrea-network-policy.md).

To achieve such ACNP copy-span, admins can, in the acting leader cluster of a Multi-cluster deployment,
create a ResourceExport of kind `AntreaClusterNetworkPolicy` which contains the ClusterNetworkPolicy spec
they wish to be replicated. The ResourceExport should be created in the Namespace which implements the
Common Area of the ClusterSet. In future releases, some additional tooling may become available to
automate the creation of such ResourceExport and make ACNP replication across clusters easier.

```yaml
apiVersion: multicluster.crd.antrea.io/v1alpha1
kind: ResourceExport
metadata:
  name: strict-namespace-isolation-for-test-clusterset
  namespace: antrea-mcs-ns # Namespace that implements Common Area of test-clusterset
spec:
  kind: AntreaClusterNetworkPolicy
  name: strict-namespace-isolation # In each importing cluster, an ACNP of name antrea-mc-strict-namespace-isolation will be created with the spec below
  clusternetworkpolicy:
    priority: 1
    tier: securityops
    appliedTo:
      - namespaceSelector: {} # Selects all Namespaces in the member cluster
    ingress:
      - action: Pass
        from:
          - namespaces:
              match: Self # Skip drop rule for traffic from Pods in the same Namespace
          - podSelector:
              matchLabels:
                k8s-app: kube-dns # Skip drop rule for traffic from the core-dns components
      - action: Drop
        from:
          - namespaceSelector: {} # Drop from Pods from all other Namespaces
```
The above sample spec will create an ACNP in each member cluster which implements strict namespace
isolation for that cluster.
Note that because the Tier that an ACNP refers to must exist before the ACNP is applied, an importing
cluster may fail to create the ACNP to be replicated, if the tier in the ResourceExport spec cannot be
found in that particular cluster. The ACNP creation status of each member cluster will be reported back
to the Common Area as K8s Events, and can be checked by describing the ResourceImport of the original
ResourceExport:
```text
kubectl describe resourceimport -A
---
Name:         strict-namespace-isolation-antreaclusternetworkpolicy
Namespace:    antrea-mcs-ns
API Version:  multicluster.crd.antrea.io/v1alpha1
Kind:         ResourceImport
Spec:
  Clusternetworkpolicy:
    Applied To:
      Namespace Selector:
    Ingress:
      Action:          Pass
      Enable Logging:  false
      From:
        Namespaces:
          Match:  Self
        Pod Selector:
          Match Labels:
            k8s-app:  kube-dns
      Action:          Drop
      Enable Logging:  false
      From:
        Namespace Selector:
    Priority:  1
    Tier:      random
  Kind:  AntreaClusterNetworkPolicy
  Name:  strict-namespace-isolation
...
Events:
  Type     Reason               Age    From                       Message
  ----     ------               ----   ----                       -------
  Normal   ACNPImportSucceeded  2m11s  resourceimport-controller  ACNP successfully created in the importing cluster test-cluster-east
  Warning  ACNPImportFailed     2m11s  resourceimport-controller  ACNP Tier does not exist in the importing cluster test-cluster-west
```
## Antrea Multi-cluster policy enforcement

At this moment, Antrea does not support Pod-level policy enforcement for cross-cluster traffic. Access
towards Multi-cluster Services can be regulated with Antrea ClusterNetworkPolicy `toService` rules. In
each member cluster, users can create an Antrea ClusterNetworkPolicy selecting Pods in that cluster, with
the imported Multi-cluster Service name and Namespace in an egress `toService` rule, and the Action to
take for traffic matching this rule. For more information regarding Antrea ClusterNetworkPolicy (ACNP),
refer to [this document](../antrea-network-policy.md).

Multi-cluster admins can also specify certain ClusterNetworkPolicies to be replicated across the entire
ClusterSet. The ACNP to be replicated should be created as a ResourceExport in the leader cluster, and
the resource export/import pipeline will ensure member clusters receive this ACNP spec to be replicated.
Each member cluster's Multi-cluster Controller will then create the ACNP in its own cluster.
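
The member-side handling described above can be sketched as a small decision function. This is a simplified reading of the reconcile flow in this commit, not the controller's actual code: the `importState` type, the `decideAction` name and the returned action strings are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// importState is a hypothetical summary of a member cluster at the time a
// ResourceImport of kind AntreaClusterNetworkPolicy is reconciled.
type importState struct {
	tierExists  bool // the Tier named in the imported spec exists locally
	acnpExists  bool // an ACNP with the imported name already exists locally
	acnpManaged bool // the existing ACNP carries the Multi-cluster annotation
	specChanged bool // the existing ACNP spec differs from the imported spec
}

// decideAction returns the step a member cluster would take for the given
// state. The action names are illustrative assumptions.
func decideAction(s importState) string {
	switch {
	case s.acnpExists && !s.acnpManaged:
		// A local ACNP with the same name was not created by import.
		return "conflict"
	case s.tierExists && !s.acnpExists:
		return "create"
	case s.tierExists && s.specChanged:
		return "update"
	case s.tierExists:
		return "noop"
	case s.acnpExists:
		// The Tier is gone, so a previously imported ACNP is cleaned up.
		return "delete"
	default:
		// Nothing to clean up; the missing Tier is reported as an Event.
		return "report-missing-tier"
	}
}

func main() {
	fmt.Println(decideAction(importState{tierExists: true}))
	fmt.Println(decideAction(importState{acnpExists: true, acnpManaged: true}))
	fmt.Println(decideAction(importState{}))
}
```

The conflict check comes first so that a policy an admin created by hand is never overwritten or deleted by the import pipeline.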
85 changes: 85 additions & 0 deletions docs/multicluster/getting-started.md
@@ -412,6 +412,91 @@ ResourceExport into the corresponding ResourceImport until users correct it.
due to the aforementioned mismatch issue, Antrea Multi-cluster Controller will also skip converging
the corresponding Endpoints ResourceExport until users correct it.

## Multi-cluster ClusterNetworkPolicy Replication

Since Antrea v1.6.0, Multi-cluster admins can specify certain ClusterNetworkPolicies to be replicated
across the entire ClusterSet. This is especially useful for ClusterSet admins who want a consistent
security posture applied to all clusters in the ClusterSet (for example, all Namespaces in all
clusters can only communicate with Pods in their own Namespaces). For more information regarding
Antrea ClusterNetworkPolicy (ACNP), refer to [this document](../antrea-network-policy.md).

To achieve such ACNP copy-span replication across clusters, admins can, in the acting leader cluster of
a Multi-cluster deployment, create a ResourceExport of kind `AntreaClusterNetworkPolicy` which contains
the ClusterNetworkPolicy spec they wish to be replicated. The ResourceExport should be created in the
Namespace which implements the Common Area of the ClusterSet. In future releases, some additional tooling
may become available to automate the creation of such ResourceExport and make ACNP replication easier.

```yaml
apiVersion: multicluster.crd.antrea.io/v1alpha1
kind: ResourceExport
metadata:
  name: strict-namespace-isolation-for-test-clusterset
  namespace: antrea-mcs-ns # Namespace that implements Common Area of test-clusterset
spec:
  kind: AntreaClusterNetworkPolicy
  name: strict-namespace-isolation # In each importing cluster, an ACNP of name antrea-mc-strict-namespace-isolation will be created with the spec below
  clusternetworkpolicy:
    priority: 1
    tier: securityops
    appliedTo:
      - namespaceSelector: {} # Selects all Namespaces in the member cluster
    ingress:
      - action: Pass
        from:
          - namespaces:
              match: Self # Skip drop rule for traffic from Pods in the same Namespace
          - podSelector:
              matchLabels:
                k8s-app: kube-dns # Skip drop rule for traffic from the core-dns components
      - action: Drop
        from:
          - namespaceSelector: {} # Drop from Pods from all other Namespaces
```

The above sample spec will create an ACNP in each member cluster which implements strict namespace
isolation for that cluster.

Note that because the Tier that an ACNP refers to must exist before the ACNP is applied, an importing
cluster may fail to create the replicated ACNP if the Tier in the ResourceExport spec cannot be
found in that particular cluster. If there are such failures, the ACNP creation status of the failed
member clusters will be reported back to the Common Area as K8s Events, and can be checked by
describing the ResourceImport of the original ResourceExport:

```text
kubectl describe resourceimport -A
---
Name:         strict-namespace-isolation-antreaclusternetworkpolicy
Namespace:    antrea-mcs-ns
API Version:  multicluster.crd.antrea.io/v1alpha1
Kind:         ResourceImport
Spec:
  Clusternetworkpolicy:
    Applied To:
      Namespace Selector:
    Ingress:
      Action:          Pass
      Enable Logging:  false
      From:
        Namespaces:
          Match:  Self
        Pod Selector:
          Match Labels:
            k8s-app:  kube-dns
      Action:          Drop
      Enable Logging:  false
      From:
        Namespace Selector:
    Priority:  1
    Tier:      random
  Kind:  AntreaClusterNetworkPolicy
  Name:  strict-namespace-isolation
...
Events:
  Type     Reason            Age    From                       Message
  ----     ------            ----   ----                       -------
  Warning  ACNPImportFailed  2m11s  resourceimport-controller  ACNP Tier random does not exist in the importing cluster test-cluster-west
```
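
For tooling that inspects these Events instead of reading `kubectl describe` output, the filtering step might look like the sketch below. The `event` struct is a hypothetical, trimmed-down stand-in for a Kubernetes Event, and `failedImports` is an illustrative helper, not part of Antrea. The `Warning` type and `ACNPImportFailed` reason are the values shown in the output above.

```go
package main

import "fmt"

// event is a hypothetical stand-in that keeps only the Event fields
// needed to spot failed ACNP imports.
type event struct {
	Type    string
	Reason  string
	Message string
}

// failedImports returns the messages of all Warning Events whose reason
// is ACNPImportFailed, in their original order.
func failedImports(events []event) []string {
	var msgs []string
	for _, e := range events {
		if e.Type == "Warning" && e.Reason == "ACNPImportFailed" {
			msgs = append(msgs, e.Message)
		}
	}
	return msgs
}

func main() {
	events := []event{
		{Type: "Normal", Reason: "Scheduled", Message: "unrelated event"},
		{Type: "Warning", Reason: "ACNPImportFailed", Message: "ACNP Tier random does not exist in the importing cluster test-cluster-west"},
	}
	for _, msg := range failedImports(events) {
		fmt.Println(msg)
	}
}
```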

## Known Issue

We recommend that users reinstall or update Antrea Multi-cluster controllers through `kubectl apply`.
2 changes: 1 addition & 1 deletion multicluster/Makefile
@@ -4,7 +4,7 @@ IMG ?= antrea/antrea-mc-controller:latest
# Produce CRDs that work back to Kubernetes 1.11 (no version conversion)
# For controller-gen, float value is not allowed by default as it is considered dangerous
# See https://github.com/kubernetes-sigs/controller-tools/issues/245
# However the ResourceExport/Import refers to ACNP type definition and the priority field in ACNP spec is of type float64.
# However the ResourceExport/Import refers to ACNP type definition and the priority field in ACNP spec is type float64.
# Hence, before any ACNP spec bump that changes the priority field to a different type,
# the allowDangerousTypes flag is needed for CRD manifests to generate correctly.
CRD_OPTIONS ?= "crd:trivialVersions=true,allowDangerousTypes=true,preserveUnknownFields=false"
@@ -16,6 +16,7 @@ package commonarea
import (
"context"
"errors"
"fmt"
"math/rand"

corev1 "k8s.io/api/core/v1"
@@ -35,7 +36,6 @@ import (
const (
nameSuffixLength int = 5
acnpImportStatusPrefix string = "acnp-import-status-"
acnpImportSucceeded string = "ACNPImportSucceeded"
acnpImportFailed string = "ACNPImportFailed"
)

@@ -64,9 +64,10 @@ func (r *ResourceImportReconciler) handleResImpUpdateForClusterNetworkPolicy(ctx
}
if !acnpNotFound {
if _, ok := acnp.Annotations[common.AntreaMCACNPAnnotation]; !ok {
err := errors.New("unable to import Antrea ClusterNetworkPolicy which conflicts with existing one")
msg := "Unable to import Antrea ClusterNetworkPolicy which conflicts with existing one in cluster " + r.localClusterID
err := errors.New(msg)
klog.ErrorS(err, "", "acnp", klog.KObj(acnp))
return ctrl.Result{}, err
return ctrl.Result{}, r.reportStatusEvent(msg, ctx, resImp)
}
}
acnpObj := getMCAntreaClusterPolicy(resImp)
@@ -78,55 +79,30 @@ func (r *ResourceImportReconciler) handleResImpUpdateForClusterNetworkPolicy(ctx
// Create or update the ACNP if necessary.
if acnpNotFound {
if err = r.localClusterClient.Create(ctx, acnpObj, &client.CreateOptions{}); err != nil {
klog.ErrorS(err, "failed to create imported Antrea ClusterNetworkPolicy", "acnp", klog.KObj(acnpObj))
return ctrl.Result{}, err
msg := "Failed to create imported Antrea ClusterNetworkPolicy in cluster " + r.localClusterID
klog.ErrorS(err, msg, "acnp", klog.KObj(acnpObj))
return ctrl.Result{}, r.reportStatusEvent(msg, ctx, resImp)
}
r.installedResImports.Add(*resImp)
} else if !apiequality.Semantic.DeepEqual(acnp.Spec, acnpObj.Spec) {
acnp.Spec = acnpObj.Spec
if err = r.localClusterClient.Update(ctx, acnp, &client.UpdateOptions{}); err != nil {
klog.ErrorS(err, "failed to update imported Antrea ClusterNetworkPolicy", "acnp", klog.KObj(acnpObj))
return ctrl.Result{}, err
msg := "Failed to update imported Antrea ClusterNetworkPolicy in cluster " + r.localClusterID
klog.ErrorS(err, msg, "acnp", klog.KObj(acnpObj))
return ctrl.Result{}, r.reportStatusEvent(msg, ctx, resImp)
}
}
} else if tierNotFound && !acnpNotFound {
// The ACNP Tier does not exist, and the policy cannot be realized in this particular importing member cluster.
// If there is an ACNP previously created via import (which has a valid Tier by then), it should be cleaned up.
if err = r.localClusterClient.Delete(ctx, acnpObj, &client.DeleteOptions{}); err != nil {
klog.ErrorS(err, "failed to delete imported Antrea ClusterNetworkPolicy that no longer has a valid Tier for the current cluster", "acnp", klog.KObj(acnpObj))
return ctrl.Result{}, err
msg := "Failed to delete imported Antrea ClusterNetworkPolicy that no longer has a valid Tier for cluster " + r.localClusterID
klog.ErrorS(err, msg, "acnp", klog.KObj(acnpObj))
return ctrl.Result{}, r.reportStatusEvent(msg, ctx, resImp)
}
}

statusEvent := &corev1.Event{
ObjectMeta: metav1.ObjectMeta{
Name: randName(acnpImportStatusPrefix + r.localClusterID + "-"),
Namespace: resImp.Namespace,
},
InvolvedObject: corev1.ObjectReference{
APIVersion: resourceImportAPIVersion,
Kind: resourceImportKind,
Name: resImp.Name,
Namespace: resImp.Namespace,
UID: resImp.GetUID(),
},
FirstTimestamp: metav1.Now(),
LastTimestamp: metav1.Now(),
ReportingController: acnpEventReportingController,
ReportingInstance: acnpEventReportingInstance,
Action: "reconciled",
}
if tierNotFound {
statusEvent.Type = corev1.EventTypeWarning
statusEvent.Reason = acnpImportFailed
statusEvent.Message = "ACNP Tier does not exist in the importing cluster " + r.localClusterID
} else {
statusEvent.Type = corev1.EventTypeNormal
statusEvent.Reason = acnpImportSucceeded
statusEvent.Message = "ACNP successfully created in the importing cluster " + r.localClusterID
}
if err = r.remoteCommonArea.Create(ctx, statusEvent, &client.CreateOptions{}); err != nil {
klog.ErrorS(err, "failed to create acnp import event for resourceimport", "resImp", klog.KObj(resImp))
return ctrl.Result{}, err
} else if tierNotFound {
msg := fmt.Sprintf("ACNP Tier %s does not exist in importing cluster %s", tierName, r.localClusterID)
return ctrl.Result{}, r.reportStatusEvent(msg, ctx, resImp)
}
return ctrl.Result{}, nil
}
@@ -170,6 +146,38 @@ func getMCAntreaClusterPolicy(resImp *multiclusterv1alpha1.ResourceImport) *v1al
}
}

func (r *ResourceImportReconciler) reportStatusEvent(errMsg string, ctx context.Context, resImp *multiclusterv1alpha1.ResourceImport) error {
if errMsg == "" {
return nil
}
statusEvent := &corev1.Event{
ObjectMeta: metav1.ObjectMeta{
Name: randName(acnpImportStatusPrefix + r.localClusterID + "-"),
Namespace: resImp.Namespace,
},
Type: corev1.EventTypeWarning,
Reason: acnpImportFailed,
Message: errMsg,
InvolvedObject: corev1.ObjectReference{
APIVersion: resourceImportAPIVersion,
Kind: resourceImportKind,
Name: resImp.Name,
Namespace: resImp.Namespace,
UID: resImp.GetUID(),
},
FirstTimestamp: metav1.Now(),
LastTimestamp: metav1.Now(),
ReportingController: acnpEventReportingController,
ReportingInstance: acnpEventReportingInstance,
Action: "synced",
}
if err := r.remoteCommonArea.Create(ctx, statusEvent, &client.CreateOptions{}); err != nil {
klog.ErrorS(err, "Failed to create ACNP import event for ResourceImport", "resImp", klog.KObj(resImp))
return err
}
return nil
}

func randSeq(n int) string {
b := make([]rune, n)
for i := range b {
@@ -21,10 +21,12 @@ import (
"testing"

"github.com/stretchr/testify/assert"
corev1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/client/fake"

mcsv1alpha1 "antrea.io/antrea/multicluster/apis/multicluster/v1alpha1"
@@ -147,7 +149,15 @@ func TestResourceImportReconciler_handleCopySpanACNPCreateEvent(t *testing.T) {
} else if !tt.expectedSuccess && (err == nil || !apierrors.IsNotFound(err)) {
t.Errorf("ResourceImport Reconciler should not import an ACNP whose Tier does not exist in current cluster. Expected NotFound error. Actual err = %v", err)
}
//TODO(yang): add Event creation tests
if !tt.expectedSuccess {
errorList := &corev1.EventList{}
if err := fakeRemoteClient.List(ctx, errorList, &client.ListOptions{}); err != nil {
t.Errorf("Failed to list Events in remote Common Area")
}
if len(errorList.Items) == 0 {
t.Errorf("An event should be created for failed ACNP imports")
}
}
}
})
}
@@ -372,7 +372,7 @@ func (r *ResourceExportReconciler) refreshEndpointsResourceImport(
var newSubsets []corev1.EndpointSubset
undeleteItems, err := r.getNotDeletedResourceExports(resExport)
if err != nil {
klog.ErrorS(err, "failed to list ResourceExports, retry later")
klog.ErrorS(err, "Failed to list ResourceExports, retry later")
return newResImport, false, err
}
for _, re := range undeleteItems {
@@ -400,7 +400,7 @@ func (r *ResourceExportReconciler) refreshACNPResourceImport(
if !apiequality.Semantic.DeepEqual(resExport.Spec.ClusterNetworkPolicy, resImport.Spec.ClusterNetworkPolicy) {
undeletedItems, err := r.getNotDeletedResourceExports(resExport)
if err != nil {
klog.ErrorS(err, "failed to list ResourceExports for ACNP, retry later")
klog.ErrorS(err, "Failed to list ResourceExports for ACNP, retry later")
return newResImport, false, err
}
if len(undeletedItems) == 1 && undeletedItems[0].Name == resExport.Name && undeletedItems[0].Namespace == resExport.Namespace {