Skip to content
This repository has been archived by the owner on Mar 31, 2023. It is now read-only.

[Documentation] Security Group Host Design #390

Merged
merged 18 commits into from
Nov 3, 2020

Conversation

er1cthe0ne
Copy link
Contributor

@er1cthe0ne er1cthe0ne commented Sep 23, 2020

This is the first draft of Security Group Host Design. It contains the high-level approach and the advantage compared to OpenStack Neutron. Diagrams and workflow pictures will be added next.

@er1cthe0ne er1cthe0ne added the documentation Improvements or additions to documentation label Sep 23, 2020
@er1cthe0ne er1cthe0ne self-assigned this Sep 23, 2020
@codecov-commenter
Copy link

codecov-commenter commented Sep 23, 2020

Codecov Report

Merging #390 into master will increase coverage by 2.57%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #390      +/-   ##
============================================
+ Coverage     34.24%   36.82%   +2.57%     
- Complexity      959     1151     +192     
============================================
  Files           413      447      +34     
  Lines         10086    10703     +617     
  Branches       1228     1365     +137     
============================================
+ Hits           3454     3941     +487     
- Misses         6189     6230      +41     
- Partials        443      532      +89     
Impacted Files Coverage Δ Complexity Δ
...rewei/alcor/route/controller/RouterController.java 61.86% <0.00%> (-38.14%) 11.00% <0.00%> (+10.00%) ⬇️
.../alcor/portmanager/processor/ProcessorManager.java 74.35% <0.00%> (-18.98%) 11.00% <0.00%> (+1.00%) ⬇️
.../futurewei/alcor/route/utils/RouteManagerUtil.java 55.55% <0.00%> (-6.67%) 9.00% <0.00%> (+3.00%) ⬇️
...alcor/portmanager/util/RestParameterValidator.java 44.61% <0.00%> (-5.39%) 14.00% <0.00%> (+1.00%) ⬇️
...lcor/portmanager/processor/DataPlaneProcessor.java 75.93% <0.00%> (-4.39%) 29.00% <0.00%> (+12.00%) ⬇️
...rewei/alcor/portmanager/processor/PortContext.java 67.27% <0.00%> (-1.62%) 22.00% <0.00%> (+4.00%) ⬇️
...r/route/service/Impl/NeutronRouterServiceImpl.java 59.71% <0.00%> (-0.44%) 25.00% <0.00%> (ø%)
...alcor/elasticipmanager/dao/ElasticIpAllocator.java 63.53% <0.00%> (-0.28%) 48.00% <0.00%> (ø%)
...ewei/alcor/portmanager/processor/MacProcessor.java 93.65% <0.00%> (ø) 13.00% <0.00%> (ø%)
...ewei/alcor/portmanager/processor/VpcProcessor.java 90.00% <0.00%> (ø) 7.00% <0.00%> (ø%)
... and 71 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e30926c...8ebba63. Read the comment docs.

@xieus xieus added this to the Version 0.9.2020.09.30 milestone Oct 3, 2020
@codecov-io
Copy link

codecov-io commented Oct 9, 2020

Codecov Report

Merging #390 into master will increase coverage by 0.00%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #390   +/-   ##
=========================================
  Coverage     36.79%   36.80%           
- Complexity     1149     1150    +1     
=========================================
  Files           447      447           
  Lines         10703    10703           
  Branches       1365     1366    +1     
=========================================
+ Hits           3938     3939    +1     
+ Misses         6231     6230    -1     
  Partials        534      534           
Impacted Files Coverage Δ Complexity Δ
...alcor/portmanager/processor/DatabaseProcessor.java 85.36% <0.00%> (-2.44%) 9.00% <0.00%> (ø%)
...alcor/portmanager/util/RestParameterValidator.java 44.61% <0.00%> (+3.07%) 14.00% <0.00%> (+1.00%)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 83f950b...a73ca7b. Read the comment docs.


. On-Host direct connection from VM port to OVS: it doesn't need extra Linux bridge, Linux devices, Linux kernel or IP Tables;
. Direct and simplified communication between Alcor controller to Alcor control agent;
. Delta update: Update of security group rule only require the changed rule to be push down to compute hosts;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we only consider delta update between DPM and ACA? Dose delta update for ovs flows are considered here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we will need delta update for ovs flows also. In order to support that, security group implementation in ACA will need to keep an internal map of port to security group (and its rules) mapping. Keep it in runtime memory would work, but it will not survive reboot/crash and hard to support older ACA to new ACA update (note that this problem is also applicable to ther ACA modules which is runtime memory as storage).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the situation where the security rules of two security groups of a port overlap. If we delete the security group rules of the overlapping part of one of the security groups, it will be troublesome to deal with them by delta update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's step back to the advantage of ACA vs neutron OVS agent:

  1. plan to survive ACA crash or migration from older ACA to new ACA
  2. delta update when only one security group rule is added/deleted/modified

Both of them requires ACA to keep an internal map of port to security group (and its rules) mapping. With this internal map, we can accept delta update and figure out the minimal set of openflow rule to modify.

. On-Host direct connection from VM port to OVS: it doesn't need extra Linux bridge, Linux devices, Linux kernel or IP Tables;
. Direct and simplified communication between Alcor controller to Alcor control agent;
. Delta update: Update of security group rule only require the changed rule to be push down to compute hosts;
. Reduced OVS SG rules: Only one set of security group rules is configured in OVS to be shared with all ports using it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update the wordings to clarify. Please use the below as a reference: Uses of conjunctive flows

https://docs.openstack.org/neutron/train/contributor/internals/openvswitch_firewall.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've read this link. I mean, how do multiple port share a flow?

Copy link
Contributor

@chenpiaoping chenpiaoping Oct 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's take the egress flows as an example:
1.table=TRANSIENT_TABLE,priority=100,in_port=ofport,actions=set_field:{ofport}->reg{REG_PORT},set_field:{local_vlan}->reg{REG_NET},resubmit(,BASE_EGRESS_TABLE)
2.table=BASE_EGRESS_TABLE,priority=65,reg_port=ofport,dl_type=ETHERTYPE_IP,in_port=ofport,dl_src=mac_addr,nw_src=ip_addr,actions=ct(table={RULES_EGRESS_TABLE},zone=NXM_NX_REG{REG_NET}[0..15])
3.table=RULES_EGRESS_TABLE,priority=70+,dl_type=ethertype,reg_port=ofport,nw_dst=dst_ip_prefix,nw_proto=protocol,tcp_src=sport,tcp_dst=dport,ct_state=+new-est,actions=resubmit(,ACCEPT_OR_INGRESS_TABLE)
4.table=ACCEPT_OR_INGRESS_TABLE,priority=90,dl_type=ETHERTYPE_IP,reg_port=ofport,ct_state=+new-est,actions=ct(commit,zone=NXM_NX_REG{REG_NET}[0..15]),resubmit(,ACCEPTED_EGRESS_TRAFFIC_TABLE)
If multiple ports share flow#3, what should flows look like?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current design can reuse rule for conjunctive flows only, that's only used for security group rule with remote_group.

. Direct and simplified communication between Alcor controller to Alcor control agent;
. Delta update: Update of security group rule only require the changed rule to be push down to compute hosts;
. Reduced OVS SG rules: Only one set of security group rules is configured in OVS to be shared with all ports using it
. Enable scale and performance from the ground up, addressing the biggest pain point on Neutron
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the pain point of neutron remote_group or is it that multiple vm on the same host share a security group rule?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe one of the pain point is having too many openflow rules installed on a compute node. Both neutron remote_group and mutiple VMs (on same host) assoicated to a lot of security group rules would contribute to the pain point.

@chenpiaoping
Copy link
Contributor

What is the difference between the implementation of the agent side and the neutron agent?

@er1cthe0ne
Copy link
Contributor Author

What is the difference between the implementation of the agent side and the neutron agent?

  1. Direct goal state communication between Alcor server and ACA, support batching and parallel processing in ACA.
  2. Delta update, greatly reduced the message size between Alcor server and ACA
  3. On-demand programming, similar to L3 routing design, we can install SG rules only when it is needed by traffic flow. And those on-demand openflow rules can expire with an idle timeout

@chenpiaoping
Copy link
Contributor

chenpiaoping commented Oct 14, 2020

@er1cthe0ne Whether PortState is carried in the Networkconfiguration when updating the security group, if so, will go into Aca_Goal_State_Handler::update_port_states () according to the current code logic for processing, which is not expected to happen. In addition, what fields should be included in the PortState when updating the security group?

@er1cthe0ne
Copy link
Contributor Author

@er1cthe0ne Whether PortState is carried in the Networkconfiguration when updating the security group, if so, will go into Aca_Goal_State_Handler::update_port_states () according to the current code logic for processing, which is not expected to happen.

If there is a new port created with security group enable: PortState::operation_type = CREATE

If there is no new port created when security group is updated: PortState::operation_type = INFO

We will add the ignore logic when PortState::operation_type = INFO inside ACA_Dataplane_OVS::update_port_state_workitem similar to update_subnet_state_workitem

In addition, what fields should be included in the PortState when updating the security group?

I think we will need request_id, id, fixed_ips, allow_address_pairs (enforced when security group is enabled) and security_group_ids at the minimal.

Copy link
Contributor

@xieus xieus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us try to close this document soon. Left a few more comments.

The default security group has rules that allow associated ports of the default rule to talk to each other.

To support this using the minimal set of OpenFlow rules, we will mark remote ports with its associated security group using conjunctive flows discussed in the next session.
We will update our neighbor configuration schema to include associated security group IDs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@er1cthe0ne, is this supported in our schema? I think it would be good to include it in this design.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the design but not in our schema yet. @chenpiaoping we are going to need this neighbor SG ID in our neighbor configuration for remote SG rules right?

Copy link
Contributor

@xieus xieus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments.


OVS (controlled by ACA) will need to stamp all packets coming out for a port with the assoicated SG IDs (zero to five).

Each SG ID is 36 characters string (36 bytes) or we can represent it using 16 bytes encoding. 5 of them will be 80 bytes plus adding 8 bytes NSH header will total 88 bytes increase of packet header. While VxLAN-GPE + NSH can embed all those information, but 88 bytes is too much data to put into header of every single packet.
Copy link
Contributor

@xieus xieus Oct 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that most majority of customer scenarios only use one SG, can we cut the header overhead to 16+8 = 24 bytes or 88 is not negotiable :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would cut it down to 16+8 bytes NSH header = 24 bytes. That's more negotiable :) @chenpiaoping @Gzure - can we assume most customer scenarios only use one SG?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us bring this question up in the weekly open-source meeting. If the assumption is confirmed, we should take the proposal more serious :-)

3. security - put control packet outside tunnel has lower risk, but inside tenent networks tunnels has high risk because tenent can mess with it
4. acknowledgement - what happen if control packet are missed? There is not reply mechanism


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember we had discussion about storing SG information using local registry alike staff. Didn't find it in the documents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's the discussion mentioned on line 300 where we use control packet to learn about the remote port IP to its assoicated SG, and the store that information into ACA memory (or "registry" to survive reboot?).

@xieus xieus self-requested a review November 3, 2020 05:02
Copy link
Contributor

@xieus xieus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We will merge the design and keep iterating if we find interesting ideas along the way.

@xieus xieus changed the title [Document] Security Group Host Design [Documentation] Security Group Host Design Nov 3, 2020
@xieus xieus merged commit 97a7c17 into futurewei-cloud:master Nov 3, 2020
@er1cthe0ne er1cthe0ne deleted the doc/security_group branch November 3, 2020 21:49
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Research] VxLan-GPE for remote SG ingress [Documentation] Security Group Host Design
5 participants