- Revision
- About this manual
- Scope
- Abbreviations
- 1 Introduction
- 2 Design
- 3 Test plan
Rev | Date | Author | Description |
---|---|---|---|
0.1 | 15/03/2021 | Nazarii Hnydyn | Initial version |
This document provides general information about the PBH implementation in SONiC.
This document describes the high-level design of the PBH feature in SONiC.
In scope:
- PBH for NVGRE/VxLAN packets based on inner 5-tuple (IP proto, L4 dst/src port, IPv4/IPv6 dst/src)
Out of scope:
- CRM support for PBH FG hash resources
Term | Meaning |
---|---|
SONiC | Software for Open Networking in the Cloud |
PBH | Policy Based Hashing |
CRM | Critical Resource Monitoring |
ACL | Access Control List |
SAI | Switch Abstraction Interface |
FG | Fine-Grained |
API | Application Programming Interface |
CRC | Cyclic Redundancy Check |
ID | Identifier |
ECMP | Equal-Cost Multi-Path |
LAG | Link Aggregation Group |
NVGRE | Network Virtualization Using Generic Routing Encapsulation |
VxLAN | Virtual eXtensible Local Area Network |
GRE | Generic Routing Encapsulation |
OA | Orchestration agent |
DB | Database |
CLI | Command-line Interface |
DPB | Dynamic Port Breakout |
YANG | Yet Another Next Generation |
Figure 1: PBH design
Figure 2: PBH OA design
Figure 3: PBH add flow
Figure 4: PBH remove flow
PBH is a feature that allows the user to configure custom hashing for different packet types.
Under the hood, it uses ACL rules to match specific types of frames and calculates the hash
based on user-defined rules.
For flexible hash calculation, a new SAI FG Hash API is used.
It allows the user not only to configure which fields are used,
but also to specify a mask for IPv4/IPv6 addresses and a sequence ID.
The sequence ID defines the order in which the fields are hashed,
and fields that share a sequence ID are treated as associative for the CRC calculation.
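To illustrate how a mask and a sequence ID could shape the hash input, the following minimal sketch groups fields by sequence ID and merges the fields of one group in an order-independent way. The `HashField` struct, the `buildHashInput` function, and the use of XOR as the merge operation are assumptions made for this example only; the actual behavior is defined by the SAI implementation and the ASIC.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical model of one configured hash field (not a SAI type).
struct HashField
{
    uint32_t sequenceId; // position of the field in the hash order
    uint32_t value;      // field value extracted from the packet
    uint32_t mask;       // bits of the value that take part in hashing
};

// Builds one word per sequence ID, ordered by ascending sequence ID.
// ASSUMPTION: fields sharing a sequence ID are merged with XOR, which is
// commutative, so swapping their values does not change the result.
std::vector<uint32_t> buildHashInput(const std::vector<HashField> &fields)
{
    std::map<uint32_t, uint32_t> groups; // std::map iterates in key order
    for (const auto &field : fields)
    {
        groups[field.sequenceId] ^= (field.value & field.mask);
    }

    std::vector<uint32_t> input;
    for (const auto &group : groups)
    {
        input.push_back(group.second);
    }
    return input;
}
```

With this model, a flow and its reverse flow (source and destination ports swapped under the same sequence ID) produce the same hash input, which is one way to read the "associative" property described above.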
PBH supports hash configuration for ECMP and LAG.
Both Regular ECMP and FG ECMP are eligible.
This feature will support the following functionality:
- NVGRE and VxLAN packets match with inner/outer IPv4/IPv6 frames
- Custom hashing based on inner 5-tuple: IP proto, L4 dst/src port, IPv4/IPv6 dst/src
- Hash configuration for Regular/FG ECMP and LAG
- Warm/Fast reboot
This feature will support the following commands:
- config: add/remove PBH table/rule/hash configuration
- show: display PBH table/rule/hash configuration
This feature will provide error handling for the following situations:
- Invalid object reference
- Incompatible options/parameters
This feature will provide event logging for the following situations:
- PBH table/rule/hash add/remove
Event | Severity |
---|---|
PBH table/rule/hash add/remove: success | NOTICE |
PBH table/rule/hash add/remove: error | ERROR |
PBH uses the ACL engine to match NVGRE/VxLAN packets and calculates the hash based on user-defined rules.
Hashing is configured based on the inner 5-tuple: IP proto, L4 dst/src port, IPv4/IPv6 dst/src.
Custom hashing can be configured for Regular/FG ECMP and LAG.
SAI attributes which shall be used for PBH:
API | Function | Attribute | Comment |
---|---|---|---|
ACL | create_acl_table | SAI_ACL_TABLE_ATTR_FIELD_GRE_KEY | |
ACL | create_acl_table | SAI_ACL_TABLE_ATTR_FIELD_IP_PROTOCOL | |
ACL | create_acl_table | SAI_ACL_TABLE_ATTR_FIELD_IPV6_NEXT_HEADER | |
ACL | create_acl_table | SAI_ACL_TABLE_ATTR_FIELD_L4_DST_PORT | |
ACL | create_acl_table | SAI_ACL_TABLE_ATTR_FIELD_INNER_ETHER_TYPE | |
ACL | create_acl_entry | SAI_ACL_ENTRY_ATTR_PRIORITY | PBH_RULE\|priority |
ACL | create_acl_entry | SAI_ACL_ENTRY_ATTR_FIELD_GRE_KEY | PBH_RULE\|gre_key |
ACL | create_acl_entry | SAI_ACL_ENTRY_ATTR_FIELD_IP_PROTOCOL | PBH_RULE\|ip_protocol |
ACL | create_acl_entry | SAI_ACL_ENTRY_ATTR_FIELD_IPV6_NEXT_HEADER | PBH_RULE\|ipv6_next_header |
ACL | create_acl_entry | SAI_ACL_ENTRY_ATTR_FIELD_L4_DST_PORT | PBH_RULE\|l4_dst_port |
ACL | create_acl_entry | SAI_ACL_ENTRY_ATTR_FIELD_INNER_ETHER_TYPE | PBH_RULE\|inner_ether_type |
ACL | create_acl_entry | SAI_ACL_ENTRY_ATTR_ACTION_SET_LAG_HASH_ID | PBH_RULE\|packet_action |
ACL | create_acl_entry | SAI_ACL_ENTRY_ATTR_ACTION_SET_ECMP_HASH_ID | PBH_RULE\|packet_action |
HASH | create_hash | SAI_HASH_ATTR_FINE_GRAINED_HASH_FIELD_LIST | |
HASH | create_fine_grained_hash_field | SAI_FINE_GRAINED_HASH_FIELD_ATTR_NATIVE_HASH_FIELD | PBH_HASH\|hash_field |
HASH | create_fine_grained_hash_field | SAI_FINE_GRAINED_HASH_FIELD_ATTR_IPV4_MASK | PBH_HASH\|ipv4_mask |
HASH | create_fine_grained_hash_field | SAI_FINE_GRAINED_HASH_FIELD_ATTR_IPV6_MASK | PBH_HASH\|ipv6_mask |
HASH | create_fine_grained_hash_field | SAI_FINE_GRAINED_HASH_FIELD_ATTR_SEQUENCE_ID | PBH_HASH\|sequence_id |
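The table lists two ACL entry actions that are both driven by PBH_RULE|packet_action. A minimal sketch of that mapping could look as follows. The function name `pbhActionToSaiAttrName` is hypothetical, and attribute names are kept as strings so the example compiles without the SAI headers; a real orchestrator would presumably use the SAI enum values directly.

```cpp
#include <map>
#include <string>

// Maps a PBH_RULE|packet_action value to the name of the SAI ACL entry
// attribute listed in the table. Returns an empty string for an unknown
// action. Hypothetical helper, for illustration only.
std::string pbhActionToSaiAttrName(const std::string &action)
{
    static const std::map<std::string, std::string> actionMap = {
        { "SET_ECMP_HASH", "SAI_ACL_ENTRY_ATTR_ACTION_SET_ECMP_HASH_ID" },
        { "SET_LAG_HASH",  "SAI_ACL_ENTRY_ATTR_ACTION_SET_LAG_HASH_ID" }
    };

    const auto it = actionMap.find(action);
    return it != actionMap.end() ? it->second : std::string();
}
```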
OA needs to be updated to support PBH in Config DB and the SAI FG Hash API.
A new class, PbhOrch, and a set of data structures will be implemented to handle the PBH feature.
OA will process table/rule/hash updates based on Config DB changes.
Some object updates will be handled and some will be considered invalid.
Class PbhOrch will hold a set of methods matching the generic Orch class pattern to handle Config DB updates.
For that purpose, the producer-consumer mechanism implemented in sonic-swss-common will be used.
Method PbhOrch::doTask() will be called on a PBH table/rule/hash update. It will distribute handling of DB updates
between other handlers based on the table key that was updated (Redis Keyspace Notifications).
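The dispatch described above can be sketched as a simple switch on the table name. This is a simplified illustration, not the actual PbhOrch code: the `Consumer` struct is a stand-in for the swss class of the same name, and `PbhOrchSketch`, its boolean return value, and the `lastHandler` member are hypothetical.

```cpp
#include <string>

// Stand-in for the swss Consumer class; only the table name is modelled.
struct Consumer
{
    std::string tableName;
};

// Hypothetical, simplified version of PbhOrch used only for this sketch.
class PbhOrchSketch
{
public:
    // Routes the update to the handler that owns the table.
    // Returns false when the table is not a PBH table.
    bool doTask(Consumer &consumer)
    {
        if (consumer.tableName == "PBH_TABLE")
        {
            doPbhTableTask(consumer);
        }
        else if (consumer.tableName == "PBH_RULE")
        {
            doPbhRuleTask(consumer);
        }
        else if (consumer.tableName == "PBH_HASH")
        {
            doPbhHashTask(consumer);
        }
        else
        {
            return false;
        }
        return true;
    }

    std::string lastHandler; // records which handler ran, for illustration

private:
    void doPbhTableTask(Consumer &) { lastHandler = "table"; }
    void doPbhRuleTask(Consumer &) { lastHandler = "rule"; }
    void doPbhHashTask(Consumer &) { lastHandler = "hash"; }
};
```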
This class will be responsible for:
- Processing updates of the PBH table/rule/hash (add/remove)
- Partial input PBH data validation (including cross-table validation)
- Replicating PBH data from the Config DB to the SAI DB via SAI Redis
- Caching PBH objects in order to detect object updates and perform state dumps
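As an example of cross-table validation, a rule's hash_list can be checked against the set of known PBH_HASH objects before the rule is programmed. The sketch below assumes the known hash names are available in a cache; `findMissingHashes` and its parameters are hypothetical names introduced for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns the names from a rule's hash_list that have no matching PBH_HASH
// object. An empty result means every reference is valid.
// Hypothetical helper, not part of the actual orchestrator.
std::vector<std::string> findMissingHashes(
    const std::vector<std::string> &hashList,
    const std::set<std::string> &knownHashes)
{
    std::vector<std::string> missing;
    for (const auto &name : hashList)
    {
        if (knownHashes.count(name) == 0)
        {
            missing.push_back(name);
        }
    }
    return missing;
}
```

A non-empty result would correspond to the "Invalid object reference" error case listed in the requirements.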
PBH table objects are stored under PBH_TABLE:* keys in Config DB. On a PBH_TABLE update,
method PbhOrch::doPbhTableTask() will be called to process the change.
On table create, PbhOrch will verify whether the table already exists. Creating a table that already
exists will be treated as an update. A regular table add/remove will update the internal class structures,
and the appropriate SAI objects will be created/deleted.
PBH rule objects are stored under PBH_RULE:* keys in Config DB. On a PBH_RULE update,
method PbhOrch::doPbhRuleTask() will be called to process the change.
On rule create, PbhOrch will verify whether the rule already exists. Creating a rule that already
exists will be treated as an update. A regular rule add/remove will update the internal class structures,
and the appropriate SAI objects will be created/deleted.
PBH hash objects are stored under PBH_HASH:* keys in Config DB. On a PBH_HASH update,
method PbhOrch::doPbhHashTask() will be called to process the change.
On hash create, PbhOrch will verify whether the hash already exists. Creating a hash that already
exists will be treated as an update. A regular hash add/remove will update the internal class structures,
and the appropriate SAI objects will be created/deleted.
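The "create of an existing object is treated as an update" rule applies equally to tables, rules, and hashes, and could be backed by a small object cache. The following is a sketch under assumptions: `PbhCache`, `PbhChange`, and `FieldMap` are hypothetical names, and the separate `Unchanged` result is an addition of this example that the design does not mention.

```cpp
#include <map>
#include <string>

// Outcome of writing an object into the cache (hypothetical).
enum class PbhChange { Added, Updated, Unchanged };

// Field/value pairs of one Config DB object.
using FieldMap = std::map<std::string, std::string>;

// Hypothetical cache keyed by the Config DB object name.
class PbhCache
{
public:
    // Stores the object and reports whether it was new, modified, or identical.
    PbhChange set(const std::string &key, const FieldMap &fields)
    {
        auto it = m_objects.find(key);
        if (it == m_objects.end())
        {
            m_objects.emplace(key, fields);
            return PbhChange::Added;
        }
        if (it->second == fields)
        {
            return PbhChange::Unchanged;
        }
        it->second = fields; // create of an existing object acts as an update
        return PbhChange::Updated;
    }

    // Removes the object; returns false when it was not cached.
    bool remove(const std::string &key)
    {
        return m_objects.erase(key) > 0;
    }

private:
    std::map<std::string, FieldMap> m_objects;
};
```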
Skeleton code:
class PbhOrch : public Orch
{
public:
PbhOrch(
vector<TableConnector> &connectorList,
SwitchOrch *switchOrch,
PortsOrch *portOrch
);
~PbhOrch();
using Orch::doTask; // Allow access to the basic doTask
private:
void doPbhTableTask(Consumer &consumer);
void doPbhRuleTask(Consumer &consumer);
void doPbhHashTask(Consumer &consumer);
void doTask(Consumer &consumer);
SwitchOrch *m_switchOrch;
PortsOrch *m_portOrch;
};
This orchestrator provides an API for ACL table/rule configuration.
It already exists in SONiC.
The ACL orchestrator will be extended to support the PBH table/rule concept.
The PBH table will use a dedicated set of keys to match NVGRE and VxLAN packets.
Skeleton code:
bool AclTable::create()
{
...
if (type == ACL_TABLE_PBH)
{
attr.id = SAI_ACL_TABLE_ATTR_FIELD_GRE_KEY;
attr.value.booldata = true;
table_attrs.push_back(attr);
attr.id = SAI_ACL_TABLE_ATTR_FIELD_IP_PROTOCOL;
attr.value.booldata = true;
table_attrs.push_back(attr);
attr.id = SAI_ACL_TABLE_ATTR_FIELD_L4_DST_PORT;
attr.value.booldata = true;
table_attrs.push_back(attr);
attr.id = SAI_ACL_TABLE_ATTR_FIELD_INNER_ETHER_TYPE;
attr.value.booldata = true;
table_attrs.push_back(attr);
attr.id = SAI_ACL_TABLE_ATTR_ACL_STAGE;
attr.value.s32 = SAI_ACL_STAGE_INGRESS;
table_attrs.push_back(attr);
sai_status_t status = sai_acl_api->create_acl_table(&m_oid, gSwitchId, (uint32_t)table_attrs.size(), table_attrs.data());
if (status == SAI_STATUS_SUCCESS)
{
gCrmOrch->incCrmAclUsedCounter(CrmResourceType::CRM_ACL_TABLE, acl_stage, SAI_ACL_BIND_POINT_TYPE_PORT);
gCrmOrch->incCrmAclUsedCounter(CrmResourceType::CRM_ACL_TABLE, acl_stage, SAI_ACL_BIND_POINT_TYPE_LAG);
}
return status == SAI_STATUS_SUCCESS;
}
...
}
class AclRulePbh: public AclRule
{
public:
AclRulePbh(AclOrch *m_pAclOrch, string rule, string table, acl_table_type_t type, bool createCounter = false);
bool validateAddAction(string attr_name, string attr_value);
bool validateAddMatch(string attr_name, string attr_value);
bool validate();
void update(SubjectType, void *);
};
; defines schema for PBH table configuration attributes
key = PBH_TABLE|table_name ; table name. Must be unique
; field = value
port_list = port-list ; ports to which this table is applied. Can be empty
lag_list = lag-list ; portchannels to which this table is applied. Can be empty
description = *255VCHAR ; table description. Can be empty
; value annotations
port-name = 1*64VCHAR ; name of the port
port-list = port-name [ 1*( "," port-name ) ] ; list of the ports. Valid values range is platform dependent
lag-name = "PortChannel" 1*4DIGIT ; name of the portchannel
lag-list = lag-name [ 1*( "," lag-name ) ] ; list of the portchannels. Valid values range is platform dependent
Note: at least one member of port_list or lag_list is required
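The PBH_TABLE constraints above can be checked with a few string helpers. This is a minimal sketch, assuming lists arrive as comma-separated strings as in the CLI examples; `splitList`, `isValidLagName`, and `isValidPbhTable` are hypothetical names. Port names are not checked here because their valid range is platform dependent.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Splits a comma-separated list such as "Ethernet0,Ethernet4".
// Empty items are skipped.
std::vector<std::string> splitList(const std::string &list)
{
    std::vector<std::string> items;
    std::string::size_type start = 0;
    while (start <= list.size())
    {
        std::string::size_type end = list.find(',', start);
        if (end == std::string::npos)
        {
            end = list.size();
        }
        if (end > start)
        {
            items.push_back(list.substr(start, end - start));
        }
        start = end + 1;
    }
    return items;
}

// Checks the rule: lag-name = "PortChannel" 1*4DIGIT
bool isValidLagName(const std::string &name)
{
    const std::string prefix = "PortChannel";
    if (name.compare(0, prefix.size(), prefix) != 0)
    {
        return false;
    }
    const std::string digits = name.substr(prefix.size());
    if (digits.empty() || digits.size() > 4)
    {
        return false;
    }
    for (unsigned char c : digits)
    {
        if (!std::isdigit(c))
        {
            return false;
        }
    }
    return true;
}

// Requires at least one port or portchannel, and well-formed portchannel names.
bool isValidPbhTable(const std::string &portList, const std::string &lagList)
{
    const auto ports = splitList(portList);
    const auto lags = splitList(lagList);
    if (ports.empty() && lags.empty())
    {
        return false;
    }
    for (const auto &lag : lags)
    {
        if (!isValidLagName(lag))
        {
            return false;
        }
    }
    return true;
}
```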
; defines schema for PBH rule configuration attributes
key = PBH_RULE|table_name|rule_name ; rule name. Must be unique across the table
; field = value
priority = 1*5DIGIT ; rule priority. Valid values range is platform dependent
gre_key = h32 "/" h32 ; GRE key (32 bits)
ip_protocol = h8 "/" h8 ; IP protocol (8 bits)
ipv6_next_header = h8 "/" h8 ; IPv6 Next Header (8 bits)
l4_dst_port = h16 "/" h16 ; L4 destination port (16 bits)
inner_ether_type = h16 "/" h16 ; Inner EtherType (16 bits)
hash_list = hash-list ; Hash list (PBH_HASH|hash_name)
packet_action = packet-action ; Packet action
counter = flow-counter ; Packet/Byte counter
; value annotations
h8 = 1*2HEXDIG
h16 = 1*4HEXDIG
h32 = 1*8HEXDIG
hash-name = 1*64VCHAR
hash-list = hash-name [ 1*( "," hash-name ) ]
packet-action = "SET_ECMP_HASH" / "SET_LAG_HASH"
flow-counter = "enabled" / "disabled"
Note: at least one match field (gre_key/ip_protocol/l4_dst_port/inner_ether_type) is required
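Match fields in PBH_RULE use a value/mask notation. The sketch below parses such a string into two integers. It accepts an optional "0x" prefix because the configuration examples in this document use one, although the grammar above defines plain hex digits only; that leniency, and the names `MaskedValue`, `parseHex`, and `parseMaskedValue`, are assumptions of this example.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Parsed form of a "value/mask" match field (hypothetical type).
struct MaskedValue
{
    uint32_t value;
    uint32_t mask;
};

// Parses 1..maxDigits hex digits (maxDigits <= 8), with an optional "0x" prefix.
static bool parseHex(const std::string &text, std::size_t maxDigits, uint32_t &out)
{
    std::string digits = text;
    if (digits.size() >= 2 && digits[0] == '0' && (digits[1] == 'x' || digits[1] == 'X'))
    {
        digits = digits.substr(2);
    }
    if (digits.empty() || digits.size() > maxDigits)
    {
        return false;
    }

    uint32_t result = 0;
    for (unsigned char c : digits)
    {
        if (!std::isxdigit(c))
        {
            return false;
        }
        const uint32_t digit = std::isdigit(c)
            ? static_cast<uint32_t>(c - '0')
            : static_cast<uint32_t>(std::tolower(c) - 'a' + 10);
        result = result * 16 + digit;
    }
    out = result;
    return true;
}

// Parses "value/mask"; maxDigits is 2 for h8, 4 for h16, and 8 for h32.
// The output is written only when both parts are valid.
bool parseMaskedValue(const std::string &text, std::size_t maxDigits, MaskedValue &out)
{
    const std::string::size_type slash = text.find('/');
    if (slash == std::string::npos)
    {
        return false;
    }

    MaskedValue parsed{};
    if (!parseHex(text.substr(0, slash), maxDigits, parsed.value) ||
        !parseHex(text.substr(slash + 1), maxDigits, parsed.mask))
    {
        return false;
    }
    out = parsed;
    return true;
}
```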
; defines schema for PBH hash configuration attributes
key = PBH_HASH|hash_name ; hash name. Must be unique
; field = value
hash_field = hash-field ; Hash native field
ipv4_mask = ipv4-prefix ; Mask for an IPv4 address.
; Valid only when hash_field is one of:
; INNER_SRC_IPV4, INNER_DST_IPV4
ipv6_mask = ipv6-prefix ; Mask for an IPv6 address.
; Valid only when hash_field is one of:
; INNER_SRC_IPV6, INNER_DST_IPV6
sequence_id = 1*5DIGIT ; Specifies in which order the fields are hashed,
; and defines which fields should be associative
; for CRC with the same sequence ID
; value annotations
hash-field = "INNER_IP_PROTOCOL"
/ "INNER_L4_DST_PORT"
/ "INNER_L4_SRC_PORT"
/ "INNER_DST_IPV4"
/ "INNER_SRC_IPV4"
/ "INNER_DST_IPV6"
/ "INNER_SRC_IPV6"
h16 = 1*4HEXDIG
ls32 = h16 ":" h16
dec-octet = DIGIT ; 0-9
/ %x31-39 DIGIT ; 10-99
/ %x31 2DIGIT ; 100-199
/ %x32 %x30-35 %x30-35 ; 200-255
ipv4-prefix = dec-octet "." dec-octet "."
dec-octet "." dec-octet
ipv6-prefix = 6( h16 ":" ) ls32
/ "::" 5( h16 ":" ) ls32
/ [ h16 ] "::" 4( h16 ":" ) ls32
/ [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
/ [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
/ [ *3( h16 ":" ) h16 ] "::" h16 ":" ls32
/ [ *4( h16 ":" ) h16 ] "::" ls32
/ [ *5( h16 ":" ) h16 ] "::" h16
/ [ *6( h16 ":" ) h16 ] "::"
Note: field ipv4_mask/ipv6_mask is only valid when hash_field equals INNER_DST/SRC_IPV4/IPV6
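The note above describes an "incompatible options" check that is easy to express in code. In this sketch an empty string stands for a field that is not configured; the function names are hypothetical, and the mask syntax itself is not validated.

```cpp
#include <string>

// True for the hash fields that may carry an IPv4 mask.
bool isIpv4HashField(const std::string &hashField)
{
    return hashField == "INNER_DST_IPV4" || hashField == "INNER_SRC_IPV4";
}

// True for the hash fields that may carry an IPv6 mask.
bool isIpv6HashField(const std::string &hashField)
{
    return hashField == "INNER_DST_IPV6" || hashField == "INNER_SRC_IPV6";
}

// Returns true when the configured masks are compatible with hash_field.
// An empty mask string means the mask is not configured.
bool isMaskAllowed(const std::string &hashField,
                   const std::string &ipv4Mask,
                   const std::string &ipv6Mask)
{
    if (!ipv4Mask.empty() && !isIpv4HashField(hashField))
    {
        return false;
    }
    if (!ipv6Mask.empty() && !isIpv6HashField(hashField))
    {
        return false;
    }
    return true;
}
```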
Inner 5-tuple hashing:
{
"PBH_HASH": {
"inner_ip_proto": {
"hash_field": "INNER_IP_PROTOCOL",
"sequence_id": "1"
},
"inner_l4_dst_port": {
"hash_field": "INNER_L4_DST_PORT",
"sequence_id": "2"
},
"inner_l4_src_port": {
"hash_field": "INNER_L4_SRC_PORT",
"sequence_id": "2"
},
"inner_dst_ipv4": {
"hash_field": "INNER_DST_IPV4",
"ipv4_mask": "255.0.0.0",
"sequence_id": "3"
},
"inner_src_ipv4": {
"hash_field": "INNER_SRC_IPV4",
"ipv4_mask": "0.0.0.255",
"sequence_id": "3"
},
"inner_dst_ipv6": {
"hash_field": "INNER_DST_IPV6",
"ipv6_mask": "FFFF::",
"sequence_id": "4"
},
"inner_src_ipv6": {
"hash_field": "INNER_SRC_IPV6",
"ipv6_mask": "::FFFF",
"sequence_id": "4"
}
},
"PBH_RULE": {
"pbh_table|nvgre": {
"priority": "1",
"gre_key": "0x2500/0xffffff00",
"inner_ether_type": "0x86dd/0xffff",
"hash_list": [
"inner_ip_proto",
"inner_l4_dst_port",
"inner_l4_src_port",
"inner_dst_ipv6",
"inner_src_ipv6"
],
"packet_action": "SET_ECMP_HASH"
},
"pbh_table|vxlan": {
"priority": "2",
"ip_protocol": "0x11/0xff",
"l4_dst_port": "0x12b5/0xffff",
"inner_ether_type": "0x0800/0xffff",
"hash_list": [
"inner_ip_proto",
"inner_l4_dst_port",
"inner_l4_src_port",
"inner_dst_ipv4",
"inner_src_ipv4"
],
"packet_action": "SET_LAG_HASH"
}
},
"PBH_TABLE": {
"pbh_table": {
"port_list": [
"Ethernet0",
"Ethernet4"
],
"lag_list": [
"PortChannel0001",
"PortChannel0002"
],
"description": "NVGRE and VxLAN"
}
}
}
User interface:
config
|--- pbh
|--- table
| |--- add <table_name> OPTIONS
| |--- update <table_name> OPTIONS
| |--- remove <table_name>
|
|--- rule
| |--- add <rule_name> <table_name> OPTIONS
| |--- update <rule_name> <table_name> OPTIONS
| |--- remove <rule_name> <table_name>
|
|--- hash
|--- add <hash_name> OPTIONS
|--- update <hash_name> OPTIONS
|--- remove <hash_name>
show
|--- pbh
|--- table
|--- rule
|--- hash
|--- statistics
Options:
config pbh table add
- -p|--port_list - port list
- -l|--lag_list - portchannel list
- -d|--description - table description
config pbh rule add
- -p|--priority - rule priority
- -m|--match - match field
- -h|--hash_list - hash field list
- -a|--action=<set_ecmp_hash|set_lag_hash> - packet action
- -c|--counter=<enabled|disabled> - packet/byte counter
config pbh hash add
- -f|--field - hash field
- -m|--mask - ip mask
- -s|--sequence - sequence id
The following commands add/update/remove a table:
config pbh table add 'pbh_table' --port_list 'Ethernet0,Ethernet4' --lag_list 'PortChannel0001,PortChannel0002' \
--description 'NVGRE and VxLAN'
config pbh table update 'pbh_table' --port_list 'Ethernet0'
config pbh table remove 'pbh_table'
The following commands add/update/remove a rule:
config pbh rule add 'nvgre' 'pbh_table' --priority 1 \
--match gre_key 0x2500/0xffffff00 --match inner_ether_type 0x86dd/0xffff \
--hash_list 'inner_ip_proto,inner_l4_dst_port,inner_l4_src_port,inner_dst_ipv6,inner_src_ipv6' \
--action set_ecmp_hash --counter enabled
config pbh rule update 'nvgre' 'pbh_table' --counter disabled
config pbh rule remove 'nvgre' 'pbh_table'
The following commands add/update/remove a hash:
config pbh hash add 'inner_dst_ipv6' --field 'INNER_DST_IPV6' --mask 'FFFF::' --sequence 4
config pbh hash update 'inner_dst_ipv6' --mask 'FFFF:FFFF::'
config pbh hash remove 'inner_dst_ipv6'
The following command shows table configuration:
root@sonic:/home/admin# show pbh table
Name Interface Description
--------- --------------- ---------------
pbh_table Ethernet0 NVGRE and VxLAN
Ethernet4
PortChannel0001
PortChannel0002
The following command shows rule configuration:
root@sonic:/home/admin# show pbh rule
Table Rule Priority Match Hash Action
--------- ------ ---------- ------------------------------- ----------------- -------------
pbh_table nvgre 1 GRE_KEY: 0x2500/0xffffff00 inner_ip_proto SET_ECMP_HASH
INNER_ETHER_TYPE: 0x86dd/0xffff inner_l4_dst_port
inner_l4_src_port
inner_dst_ipv6
inner_src_ipv6
vxlan 2 IP_PROTOCOL: 0x11/0xff inner_ip_proto SET_LAG_HASH
L4_DST_PORT: 0x12b5/0xffff inner_l4_dst_port
INNER_ETHER_TYPE: 0x0800/0xffff inner_l4_src_port
inner_dst_ipv4
inner_src_ipv4
The following command shows hash configuration:
root@sonic:/home/admin# show pbh hash
Name Field Mask Sequence Symmetric
----------------- ----------------- --------- ---------- -----------
inner_ip_proto INNER_IP_PROTOCOL 1 No
inner_l4_dst_port INNER_L4_DST_PORT 2 Yes
inner_l4_src_port INNER_L4_SRC_PORT 2 Yes
inner_dst_ipv4 INNER_DST_IPV4 255.0.0.0 3 Yes
inner_src_ipv4 INNER_SRC_IPV4 0.0.0.255 3 Yes
inner_dst_ipv6 INNER_DST_IPV6 FFFF:: 4 Yes
inner_src_ipv6 INNER_SRC_IPV6 ::FFFF 4 Yes
The following command shows statistics:
root@sonic:/home/admin# show pbh statistics
Table Rule Packets Count Bytes Count
--------- ------ --------------- -------------
pbh_table nvgre 0 0
vxlan 0 0
A new YANG model, sonic-pbh.yang, will be added to sonic-buildimage/src/sonic-yang-models/yang-models
in order to provide support for DPB and the management framework.
Skeleton code:
module sonic-pbh {
yang-version 1.1;
namespace "http://github.com/Azure/sonic-pbh";
prefix pbh;
import ietf-inet-types {
prefix inet;
}
import sonic-types {
prefix stypes;
revision-date 2019-07-01;
}
import sonic-extension {
prefix ext;
revision-date 2019-07-01;
}
import sonic-port {
prefix port;
revision-date 2019-07-01;
}
import sonic-portchannel {
prefix lag;
revision-date 2019-07-01;
}
description "PBH YANG Module for SONiC OS";
revision 2021-03-15 {
description "First Revision";
}
container sonic-pbh {
container PBH_HASH {
description "PBH_HASH part of config_db.json";
list PBH_HASH_LIST {
key "PBH_HASH_NAME";
...
}
/* end of PBH_HASH_LIST */
}
/* end of container PBH_HASH */
container PBH_RULE {
description "PBH_RULE part of config_db.json";
list PBH_RULE_LIST {
key "PBH_TABLE_NAME PBH_RULE_NAME";
leaf PBH_TABLE_NAME {
type leafref {
path "/pbh:sonic-pbh/pbh:PBH_TABLE/pbh:PBH_TABLE_LIST/pbh:PBH_TABLE_NAME";
}
description "PBH table reference";
}
leaf PBH_RULE_NAME {
type string {
length 1..255;
}
description "PBH rule";
}
...
}
/* end of PBH_RULE_LIST */
}
/* end of container PBH_RULE */
container PBH_TABLE {
description "PBH_TABLE part of config_db.json";
list PBH_TABLE_LIST {
key "PBH_TABLE_NAME";
leaf PBH_TABLE_NAME {
type string;
description "PBH table";
}
...
}
/* end of PBH_TABLE_LIST */
}
/* end of container PBH_TABLE */
}
/* end of container sonic-pbh */
}
/* end of module sonic-pbh */
No special handling is required
TBD
PBH will reuse and extend the existing test plan:
Inner packet hashing test plan #759