Add support for Namespaced Group CRD #2438

Merged
merged 3 commits into from
Aug 10, 2022

Contributor

@abhiraut abhiraut commented Jul 21, 2021

Add a Group CRD, which is responsible for collecting Pods and Namespaces in the Group's Namespace based on
the label selectors defined in the Group definition. It also allows setting an IPBlock and ChildGroups (cannot be set with other
selectors) in the Group. The purpose of a Group is to allow resources to be grouped and then referenced in
AntreaNetworkPolicies, without having to add the same selectors to every ANP when the group of resources is meant to be
shared. This allows for greater sharing and decouples the job of reconciling effective group members from that of enforcing
security policies.

This PR adds the following:

  • Group API types
  • Group CRD YAML
  • Controller changes to reconcile effective members of a Group
  • Controller changes to trigger ANP update introduced by a Group
  • Validation webhook to validate a GroupSpec
  • NetworkPolicyStatus refactored with Conditions
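
As a rough illustration of the "cannot be set with other selectors" rule in the description, here is a minimal Go sketch. The `GroupSpec` struct and `ValidateGroupSpec` function are hypothetical simplifications (plain maps and strings instead of the real Kubernetes selector types), and the sketch reads the restriction as applying to ChildGroups only; whether it also covers IPBlock is not settled by the text above.

```go
package main

import (
	"errors"
	"fmt"
)

// GroupSpec is a hypothetical, simplified stand-in for the Group spec
// described in this PR. The real API types are not reproduced here.
type GroupSpec struct {
	PodSelector       map[string]string
	NamespaceSelector map[string]string
	IPBlocks          []string
	ChildGroups       []string
}

// ValidateGroupSpec rejects a spec that sets ChildGroups together with
// any other selector (assumed reading of the rule in the PR description).
func ValidateGroupSpec(s GroupSpec) error {
	if len(s.ChildGroups) == 0 {
		return nil
	}
	if len(s.PodSelector) > 0 || len(s.NamespaceSelector) > 0 || len(s.IPBlocks) > 0 {
		return errors.New("childGroups cannot be set with other selectors")
	}
	return nil
}

func main() {
	ok := GroupSpec{ChildGroups: []string{"child-0"}}
	bad := GroupSpec{
		ChildGroups: []string{"child-0"},
		PodSelector: map[string]string{"app": "db"},
	}
	fmt.Println("childGroups only:", ValidateGroupSpec(ok))
	fmt.Println("childGroups + podSelector:", ValidateGroupSpec(bad))
}
```

A check of this kind would naturally live in the validation webhook listed above, so that an invalid GroupSpec is rejected at admission time rather than during reconciliation.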

@abhiraut abhiraut force-pushed the ns-group branch 2 times, most recently from 4077845 to 351f7b0 Compare July 21, 2021 01:44
@codecov-commenter

codecov-commenter commented Jul 21, 2021

Codecov Report

Merging #2438 (b5f390f) into main (8446156) will increase coverage by 2.45%.
The diff coverage is 35.09%.

❗ Current head b5f390f differs from pull request most recent head fc772a5. Consider uploading reports for the commit fc772a5 to get more accurate results

@@            Coverage Diff             @@
##             main    #2438      +/-   ##
==========================================
+ Coverage   64.72%   67.17%   +2.45%     
==========================================
  Files         297      298       +1     
  Lines       44988    45940     +952     
==========================================
+ Hits        29119    30862    +1743     
+ Misses      13536    12690     -846     
- Partials     2333     2388      +55     
Flag                Coverage Δ                     *Carryforward flag
e2e-tests           40.44% <10.76%> (?)
integration-tests   35.36% <100.00%> (-0.02%) ⬇️   Carriedforward from 024254d
kind-e2e-tests      50.76% <43.04%> (+6.56%) ⬆️    Carriedforward from 024254d
unit-tests          44.31% <77.09%> (-0.13%) ⬇️    Carriedforward from 024254d

*This pull request uses carry forward flags.

Impacted Files                                          Coverage Δ
pkg/apis/controlplane/helper.go                         13.33% <0.00%> (-26.67%) ⬇️
pkg/controller/networkpolicy/group.go                   0.00% <0.00%> (ø)
pkg/controller/networkpolicy/store/group.go             8.33% <0.00%> (+0.92%) ⬆️
pkg/controller/types/group.go                           55.00% <ø> (ø)
pkg/controller/types/networkpolicy.go                   100.00% <ø> (ø)
pkg/controller/networkpolicy/validate.go                45.94% <9.80%> (-6.02%) ⬇️
...kg/controller/networkpolicy/antreanetworkpolicy.go   79.63% <55.07%> (-7.55%) ⬇️
pkg/controller/networkpolicy/status_controller.go       70.26% <64.10%> (-1.52%) ⬇️
pkg/controller/networkpolicy/crd_utils.go               67.31% <65.38%> (-24.40%) ⬇️
pkg/controller/networkpolicy/clustergroup.go            70.55% <78.57%> (-6.52%) ⬇️
... and 79 more

@abhiraut abhiraut force-pushed the ns-group branch 4 times, most recently from c506c6c to 6bceb6b Compare July 28, 2021 23:00
@abhiraut abhiraut changed the title WIP: Add support for Namespaced Group CRD Add support for Namespaced Group CRD Aug 5, 2021
@abhiraut abhiraut requested a review from Dyanngg August 5, 2021 22:07
@abhiraut abhiraut added this to the Antrea v1.3 release milestone Aug 6, 2021
Namespace: "",
Name: "acnpA",
},
out: "AntreaClusterNetworkPolicy:acnpA",
Contributor

Just curious: why do we have a space between resource type and name for cg/group ToString(), but not for ANP?

Contributor Author

You mean the difference between NP.ToString() and Group/CG.ToTypedString()?
The ToTypedString for groups was introduced for logging and other purposes. However, ToString is the same for both objects (no whitespace).

Contributor

I asked the same question as @Dyanngg above

@abhiraut abhiraut force-pushed the ns-group branch 2 times, most recently from 4eee880 to 5002f59 Compare August 9, 2021 23:35
@antoninbas
Contributor

Switching milestone to v1.4 as I see this PR is not up-to-date and still needs to go through more reviews

@abhiraut abhiraut force-pushed the ns-group branch 3 times, most recently from 16d4cb7 to aa49d6e Compare August 27, 2021 23:14
@abhiraut abhiraut added the area/network-policy/api Issues or PRs related to the network policy API. label Sep 9, 2021
@abhiraut abhiraut requested review from Dyanngg and tnqn September 23, 2021 21:36
@abhiraut abhiraut added the area/grouping Issues or PRs related to ClusterGroup, Group API. label Sep 23, 2021
@abhiraut abhiraut requested a review from antoninbas September 29, 2021 23:11
Contributor

@antoninbas antoninbas left a comment

2 high level questions:

  • Does it make sense to allow IPBlocks for (namespaced) Groups?
  • A similar question for child Groups: what if I select child Groups defined in a different Namespace?

@abhiraut
Contributor Author

abhiraut commented Oct 5, 2021

2 high level questions:

  • Does it make sense to allow IPBlocks for (namespaced) Groups?

I don't want to, but K8s NetworkPolicies allow IPBlocks, so I kept it as is for parity.

  • A similar question for child Groups: what if I select child Groups defined in a different Namespace?

The childGroup field only accepts a string, so the value is treated as the name of a Group and looked up in the parent's own Namespace. So even if childGroup child-0 exists in ns2, it should not be added as part of ns1/parentGroup-0, which references childGroup ns1/child-0.
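
The lookup rule described in this reply can be sketched in Go as follows. This is only an illustration under stated assumptions: `groupKey`, `resolveChildGroups`, and the `existing` map are hypothetical names, not the controller's actual code or store.

```go
package main

import "fmt"

// groupKey builds a "namespace/name" key. Hypothetical helper for this sketch.
func groupKey(namespace, name string) string {
	return namespace + "/" + name
}

// resolveChildGroups treats every child entry as a bare name and looks it
// up only in the parent's Namespace. Groups with the same name in other
// Namespaces are never matched.
func resolveChildGroups(parentNamespace string, childNames []string, existing map[string]bool) []string {
	resolved := []string{}
	for _, name := range childNames {
		key := groupKey(parentNamespace, name)
		if existing[key] {
			resolved = append(resolved, key)
		}
	}
	return resolved
}

func main() {
	// Stand-in for the Group store: child-0 exists in both ns1 and ns2,
	// child-1 exists only in ns2.
	existing := map[string]bool{
		"ns1/child-0": true,
		"ns2/child-0": true,
		"ns2/child-1": true,
	}
	fmt.Println(resolveChildGroups("ns1", []string{"child-0", "child-1"}, existing))
}
```

Because the key is always built from the parent's Namespace, ns2/child-0 and ns2/child-1 can never become members of a parent Group in ns1.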

@tnqn
Member

tnqn commented Jul 21, 2022

If an Antrea NetworkPolicy sets AppliedTo to a Group which selects cluster scoped Pods, we apply the policy to Pods selected by the Group but in this namespace only.

You meant ANP can use ClusterGroup in appliedTo, or Group in ANP appliedTo can have Namespace selector, or both?

I meant a Group in ANP appliedTo that has a Namespace selector (but an empty Namespace selector means all Namespaces according to the PR). I think the PR doesn't support ANPs using a ClusterGroup.

@jianjuns
Contributor

I meant a Group in ANP appliedTo that has a Namespace selector (but an empty Namespace selector means all Namespaces according to the PR). I think the PR doesn't support ANPs using a ClusterGroup.

Ok. The strategy works for me too. But I feel sharing Group for address and appliedTo is not very useful, so we may decide based on implementation complexity.

@qiyueyao
Contributor

If an Antrea NetworkPolicy sets AppliedTo to a Group which selects cluster scoped Pods, we apply the policy to Pods selected by the Group but in this namespace only.

This makes sense to me, and it is possible to implement it this way. I do have some questions:

  1. Do we inform users that the Group has only been applied to Pods in this namespace? Or just note it in the documentation as a convention?
  2. Will we need the Unrealizable status response in the future? I remember that when we introduced it, we scoped it so that it could be useful elsewhere, not only for this PR.

@tnqn
Member

tnqn commented Jul 22, 2022

  1. Do we inform users that the Group has only been applied to Pods in this namespace? Or just note it in the documentation as a convention?

I think documenting it is good enough. It should be common sense that a namespace-scoped policy cannot apply to other namespaces.

  2. Will we need the Unrealizable status response in the future? I remember that when we introduced it, we scoped it so that it could be useful elsewhere, not only for this PR.

I can't think of a use case yet, given that the validating webhook should already reject "unrealizable" policies. We can discuss it separately when it's needed.

@Dyanngg
Contributor

Dyanngg commented Jul 22, 2022

@tnqn @qiyueyao sorry I'm late to the party. I'm a bit worried about the semantics of a Group with namespaceSelector in appliedTo based on your suggestion (or at least want some clarification). The ClusterRole + RoleBinding example from K8s makes perfect sense because ClusterRole does not have a namespaceSelector: when scoped down to a specific Namespace, its subjects are still unambiguous. A Group with namespaceSelector in appliedTo, on the other hand, is a bit more complicated. Do we first check whether the Group's selection actually intersects with the Namespace in which the policy is created? In other words, whether the policy's Namespace labels match the namespaceSelector of the Group? If they do not match, then essentially we should have an empty appliedTo, instead of simply disregarding the namespaceSelector in that case. Also, this would mean that any Namespace label change event leads to ANP reprocessing.

@tnqn
Member

tnqn commented Jul 26, 2022

@Dyanngg thanks for your comment.

Do we first check whether the Group's selection actually intersects with the Namespace in which the policy is created? In other words, whether the policy's Namespace labels match the namespaceSelector of the Group? If they do not match, then essentially we should have an empty appliedTo, instead of simply disregarding the namespaceSelector in that case. Also, this would mean that any Namespace label change event leads to ANP reprocessing.

Yes, I think it should check that the Group's NamespaceSelector selects this Namespace, because it should be common sense that using a Group in an AddressGroup or AppliedToGroup should not expand its selection. The change I had in mind was that a Namespace label change triggers Group reprocessing; if the Group changes, it triggers reprocessing of the corresponding AppliedToGroup; if the AppliedToGroup's span changes, it triggers ANP reprocessing. When calculating AppliedToGroups inherited from Groups, it first gets the members of the Group from groupingInterface, then keeps only the ones in this Namespace.
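
The "get the members, then filter the ones in this Namespace" step might look roughly like the Go sketch below. The `groupMember` type and `filterByNamespace` function are hypothetical; the real members would come from groupingInterface, which is not modeled here.

```go
package main

import "fmt"

// groupMember is a hypothetical, simplified reference to a selected Pod.
type groupMember struct {
	Namespace string
	Pod       string
}

// filterByNamespace keeps only the members that live in the policy's
// Namespace. It always returns a non-nil slice, so a Group that selects
// nothing in this Namespace yields an empty appliedTo.
func filterByNamespace(members []groupMember, namespace string) []groupMember {
	filtered := []groupMember{}
	for _, m := range members {
		if m.Namespace == namespace {
			filtered = append(filtered, m)
		}
	}
	return filtered
}

func main() {
	// Members as they might be returned for a Group whose namespaceSelector
	// matches several Namespaces.
	members := []groupMember{
		{Namespace: "ns1", Pod: "web-0"},
		{Namespace: "ns2", Pod: "web-1"},
		{Namespace: "ns1", Pod: "db-0"},
	}
	fmt.Println(filterByNamespace(members, "ns1"))
	fmt.Println(len(filterByNamespace(members, "ns3")))
}
```

With this shape, a Group whose namespaceSelector does not select the policy's Namespace contributes no members, so the result is an empty appliedTo rather than a widened selection.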

But now I don't have a strong preference for the above approach. I happened to have a discussion with @reachjainrahul and @wenyingd today about a case in cloud where a policy may be unrealizable if it is set with an action that the cloud does not support, like "Reject", while it's impossible to detect this when creating the NetworkPolicy because we don't know at that time whether it applies to cloud instances. So for @qiyueyao's question, "Will we in the future need Unrealizable status response", I think the answer would be yes.
We also discussed how to expose the reason why a policy is unrealizable, which I think the code in this PR doesn't provide to users directly yet, right? I was thinking maybe it makes sense to have a K8s-style "Condition" for NetworkPolicy:

// NetworkPolicyConditionType describes the condition types of NetworkPolicies.
type NetworkPolicyConditionType string

const NetworkPolicyRealizable NetworkPolicyConditionType = "Realizable"

// NetworkPolicyCondition describes the state of a NetworkPolicy at a certain point.
type NetworkPolicyCondition struct {
	// Type of NetworkPolicy condition.
	Type NetworkPolicyConditionType
	// Status of the condition, one of True, False, Unknown.
	Status api.ConditionStatus
	// The last time this condition was updated.
	LastTransitionTime metav1.Time
	// The reason for the condition's last transition.
	Reason string
	// A human readable message indicating details about the transition.
	Message string
}

// NetworkPolicyStatus represents information about the status of a NetworkPolicy.
type NetworkPolicyStatus struct {
	// The phase of a NetworkPolicy is a simple, high-level summary of the NetworkPolicy's status.
	Phase NetworkPolicyPhase `json:"phase"`
	// The generation observed by Antrea.
	ObservedGeneration int64 `json:"observedGeneration"`
	// The number of nodes that have realized the NetworkPolicy.
	CurrentNodesRealized int32 `json:"currentNodesRealized"`
	// The total number of nodes that should realize the NetworkPolicy.
	DesiredNodesRealized int32 `json:"desiredNodesRealized"`

	// Represents the latest available observations of a NetworkPolicy current state.
	Conditions []NetworkPolicyCondition
}

Then we could set the NetworkPolicyRealizable condition to False and report a custom reason and message in this condition. What do you think? With this approach, I think we don't differentiate whether it's partially or fully unrealizable.
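
To make the proposal concrete, here is a self-contained Go sketch of setting such a condition. It redeclares simplified versions of the proposed types (with `time.Time` in place of `metav1.Time`), and `setCondition`, the `UnsupportedGroup` reason, and the message text are hypothetical. Keeping the old transition time when the status is unchanged follows the usual Kubernetes convention and is an assumption here, not something this PR specifies.

```go
package main

import (
	"fmt"
	"time"
)

type ConditionStatus string

const (
	ConditionTrue  ConditionStatus = "True"
	ConditionFalse ConditionStatus = "False"
)

type NetworkPolicyConditionType string

const NetworkPolicyRealizable NetworkPolicyConditionType = "Realizable"

// NetworkPolicyCondition mirrors the proposed struct in simplified form.
type NetworkPolicyCondition struct {
	Type               NetworkPolicyConditionType
	Status             ConditionStatus
	LastTransitionTime time.Time
	Reason             string
	Message            string
}

// setCondition replaces the condition of the same type, or appends it if
// none exists. The previous transition time is kept when the status does
// not change.
func setCondition(conds []NetworkPolicyCondition, c NetworkPolicyCondition) []NetworkPolicyCondition {
	for i := range conds {
		if conds[i].Type != c.Type {
			continue
		}
		if conds[i].Status == c.Status {
			c.LastTransitionTime = conds[i].LastTransitionTime
		}
		conds[i] = c
		return conds
	}
	return append(conds, c)
}

func main() {
	var conds []NetworkPolicyCondition
	conds = setCondition(conds, NetworkPolicyCondition{
		Type:               NetworkPolicyRealizable,
		Status:             ConditionTrue,
		LastTransitionTime: time.Now(),
	})
	// Hypothetical reason and message for an unrealizable policy.
	conds = setCondition(conds, NetworkPolicyCondition{
		Type:               NetworkPolicyRealizable,
		Status:             ConditionFalse,
		LastTransitionTime: time.Now(),
		Reason:             "UnsupportedGroup",
		Message:            "appliedTo references a Group that cannot be used here",
	})
	fmt.Printf("%d condition(s): %s=%s (%s)\n", len(conds), conds[0].Type, conds[0].Status, conds[0].Reason)
}
```

A single Realizable condition with a free-form Reason and Message matches the remark above that partial and full unrealizability are not distinguished.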

@github-actions github-actions bot requested a review from tnqn July 29, 2022 05:10
@tnqn tnqn added the kind/feature Categorizes issue or PR as related to a new feature. label Aug 2, 2022
@qiyueyao qiyueyao force-pushed the ns-group branch 2 times, most recently from f83436b to f1cad44 Compare August 9, 2022 10:30
@github-actions github-actions bot requested a review from tnqn August 9, 2022 10:43
@github-actions github-actions bot requested review from jianjuns and tnqn August 10, 2022 04:51
ObservedGeneration: internalNP.Generation,
Conditions: GenerateNetworkPolicyCondition(internalNP.RealizableMessage),
}
internalNP.SpanMeta.NodeNames = nil
Member

I didn't mean here. StatusController should not become another writer of internalNP, which makes the process more complicated and introduces a race condition. And setting it to nil will cause the policy to be updated to Pending without the Unrealizable condition in the next round of updates, see L273.

I meant when NetworkPolicyController calculates internalNP's span in syncInternalNetworkPolicy, the span should be empty (not nil) if the policy is unrealizable. But actually we don't even need to make any change to syncInternalNetworkPolicy because we already set its AppliedToGroups to empty slices, so it spans no node anyway.

nodeNames := sets.String{}
// Lock the internal NetworkPolicy store, as there may be a case wherein the
// same internal NetworkPolicy is being updated in the NetworkPolicy UPDATE
// handler.
n.internalNetworkPolicyMutex.Lock()
internalNPObj, found, _ := n.internalNetworkPolicyStore.Get(key)
if !found {
	// Make sure to unlock the store before returning.
	n.internalNetworkPolicyMutex.Unlock()
	return fmt.Errorf("internal NetworkPolicy %s not found", key)
}
internalNP := internalNPObj.(*antreatypes.NetworkPolicy)
// Maintain a copy of old SpanMeta Nodenames so we can later enqueue Groups
// only if it is updated.
oldNodeNames := internalNP.SpanMeta.NodeNames

// Do not send a NetworkPolicy with a realization error to any Node.
if internalNP.RealizationError == nil {
	// Calculate the set of Node names based on the span of the
	// AppliedToGroups referenced by this NetworkPolicy.
	for _, appliedToGroupName := range internalNP.AppliedToGroups {
		appGroupObj, found, _ := n.appliedToGroupStore.Get(appliedToGroupName)
		if !found {
			continue
		}
		appGroup := appGroupObj.(*antreatypes.AppliedToGroup)
		utilsets.MergeString(nodeNames, appGroup.SpanMeta.NodeNames)
	}
}
...

Contributor

Understood, thanks! So no change in syncInternalNetworkPolicy for this commit.
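
For readers following this thread, the span rule in the snippet above (an empty, non-nil span when the policy has a realization error) can be sketched in isolation. Everything here is a simplified, hypothetical stand-in: `internalPolicy`, `appliedToGroup`, and `computeSpan` are not the controller's real types, and plain maps replace the stores and `sets.String`.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Simplified, hypothetical stand-ins for the controller's internal types.
type appliedToGroup struct {
	nodeNames []string
}

type internalPolicy struct {
	appliedToGroups  []string
	realizationError error
}

var errUnsupportedGroup = errors.New("appliedTo references an unsupported group")

// computeSpan returns the union of the Node names of all AppliedToGroups
// referenced by the policy. The result is never nil: a policy with a
// realization error gets an empty span and is sent to no Node.
func computeSpan(p internalPolicy, groups map[string]appliedToGroup) map[string]struct{} {
	span := map[string]struct{}{}
	if p.realizationError != nil {
		return span
	}
	for _, name := range p.appliedToGroups {
		g, found := groups[name]
		if !found {
			continue
		}
		for _, node := range g.nodeNames {
			span[node] = struct{}{}
		}
	}
	return span
}

// sortedNames returns the span as a sorted slice for stable printing.
func sortedNames(span map[string]struct{}) []string {
	names := make([]string, 0, len(span))
	for n := range span {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	groups := map[string]appliedToGroup{
		"atg-1": {nodeNames: []string{"node-a", "node-b"}},
		"atg-2": {nodeNames: []string{"node-b", "node-c"}},
	}
	ok := internalPolicy{appliedToGroups: []string{"atg-1", "atg-2", "missing"}}
	bad := internalPolicy{appliedToGroups: []string{"atg-1"}, realizationError: errUnsupportedGroup}
	fmt.Println(sortedNames(computeSpan(ok, groups)))
	fmt.Println(sortedNames(computeSpan(bad, groups)))
}
```

As noted in the review comment, the explicit error check is redundant if the AppliedToGroups of an unrealizable policy are already set to an empty slice, because the loop then contributes no Nodes either way.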

Status: v1.ConditionTrue,
LastTransitionTime: v1.Now(),
})
case ErrNetworkPolicyAppliedToUnsupportedGroup.Error():
Member

I think it's more natural to check the type of the error. Since we only need to add the field to "types.NetworkPolicy", it could be an error field such as "RealizationError" rather than a string?

Contributor

Created an ErrNetworkPolicyAppliedToUnsupportedGroup struct that implements error.
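
A minimal Go sketch of checking the error by type rather than by message string follows. The struct name comes from the comment above, but its fields, the message format, and the `isUnsupportedGroup` helper are assumptions made for illustration; the merged code may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNetworkPolicyAppliedToUnsupportedGroup implements error. The fields
// are hypothetical; only the type name is taken from the review thread.
type ErrNetworkPolicyAppliedToUnsupportedGroup struct {
	namespace string
	groupName string
}

func (e *ErrNetworkPolicyAppliedToUnsupportedGroup) Error() string {
	return fmt.Sprintf("group %s/%s cannot be used in appliedTo", e.namespace, e.groupName)
}

// isUnsupportedGroup reports whether err is, or wraps, the typed error.
// errors.As matches on the type, so the check does not depend on the
// exact wording of the message.
func isUnsupportedGroup(err error) bool {
	var target *ErrNetworkPolicyAppliedToUnsupportedGroup
	return errors.As(err, &target)
}

func main() {
	var err error = &ErrNetworkPolicyAppliedToUnsupportedGroup{namespace: "ns1", groupName: "g1"}
	wrapped := fmt.Errorf("sync failed: %w", err)
	fmt.Println(isUnsupportedGroup(wrapped))
	fmt.Println(isUnsupportedGroup(errors.New("some other error")))
}
```

Compared with a `case ...Error():` string comparison, a typed check keeps working if the message text changes or if the error is wrapped with additional context.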

@github-actions github-actions bot requested a review from tnqn August 10, 2022 10:17
abhiraut and others added 3 commits August 10, 2022 03:18
Signed-off-by: Qiyue Yao <[email protected]>
Co-authored-by: abhiraut <[email protected]>
NetworkPolicyStatus refactor

Signed-off-by: Qiyue Yao <[email protected]>
Signed-off-by: Qiyue Yao <[email protected]>
@tnqn
Member

tnqn commented Aug 10, 2022

/test-all

Member

@tnqn tnqn left a comment

LGTM

@tnqn
Member

tnqn commented Aug 10, 2022

/skip-conformance test succeeded but failed to collect codecov.

@tnqn
Member

tnqn commented Aug 10, 2022

@qiyueyao all tests succeeded, I'm going to merge this PR. Thanks for your hard work!

@tnqn tnqn merged commit 576b080 into antrea-io:main Aug 10, 2022
@qiyueyao
Contributor

Thanks for all the insightful and prompt reviews! 🚀

Labels
area/grouping Issues or PRs related to ClusterGroup, Group API. area/network-policy/api Issues or PRs related to the network policy API. kind/feature Categorizes issue or PR as related to a new feature. review-manager-test