Layer 3

OpenSwitch OPX supports unicast routing over Linux interfaces using routes in the Linux kernel routing table. Applications can also use the CPS API to configure routes. This information describes how to configure Layer 3 unicast routing to provision the NPU.

The routing subsystem manages the forwarding information base (FIB), and programs routes with resolved next-hops using ARP/Neighbor table entries received from the Linux kernel.

Virtual routing and forwarding

Virtual routing and forwarding (VRF) allows multiple instances of a routing table to coexist on the same router at the same time. VRF segments network paths without requiring multiple devices. The control and data planes are isolated in each VRF, giving each VRF its own routing and forwarding intelligence.

In OPX, VRF is based on the network namespace support in the Linux 4.9 (Debian Stretch) kernel, together with slave MAC-VLAN interfaces.
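
A rough idea of what this looks like with plain iproute2 (an illustrative sketch only, not the exact commands that opx-config-vrf runs):

$ ip netns add blue                                           # namespace that backs the VRF
$ ip link add link e101-001-0 name v-e101-001-0 type macvlan  # MAC-VLAN slave on the physical port
$ ip link set v-e101-001-0 netns blue                         # move the slave into the VRF namespace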

Create VRF

This command attaches a new MAC-VLAN link to the physical interface, then moves the MAC-VLAN interface into the VRF network namespace.

$ opx-config-vrf --create --vrf blue --port e101-001-0
Configuration successful...

root@OPX:~# ip netns exec blue ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
39: v-e101-001-0@if12: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:0c:29:a2:7e:87 brd ff:ff:ff:ff:ff:ff link-netnsid 0

View VRFs

If you use FRR, run zebra with the -n option to set the VRF backend to Linux network namespaces.
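
If FRR is started through its packaging scripts, one common place to add the option is the zebra options line in the FRR configuration under /etc/frr. This is an assumption; the variable name and file location differ between FRR versions, so check your install:

zebra_options="  -A 127.0.0.1 -s 90000000 -n"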

$ opx-config-vrf --show

Key: 1.292.19136521.
ni/network-instances/network-instance/name = blue
Key: 1.292.19136521.
ni/network-instances/network-instance/name = default

root@OPX:~# ip netns list

default
blue (id: 1)

root@OPX:~# vtysh

Hello, this is FRRouting (version 5.0.1).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

OPX# sh vrf

vrf blue id 1 netns /run/netns/blue

Delete VRF

First remove ports from the VRF, then delete the VRF.

$ opx-config-vrf --removeport --vrf blue --port e101-001-0

Configuration Successful...

root@OPX:~# opx-config-vrf --delete --vrf blue

Configuration Successful...

root@OPX:~# ip netns

root@OPX:~#                                                                           

IPv4 routing

A routing table entry consists of a destination IP address prefix and at least one next-hop address or a Linux interface.

Configure static route

$ ip route show

default dev eth0  scope link
3.3.3.0/24 dev e101-003-0  proto kernel  scope link  src 3.3.3.1

$ ip route add 11.10.10.0/24  dev e101-003-0

$ ip route show

default dev eth0  scope link
3.3.3.0/24 dev e101-003-0  proto kernel  scope link  src 3.3.3.1
11.10.10.0/24 dev e101-003-0  scope link

Configure static routing with next-hop

$ ip route add 30.30.30.0/24 via 3.3.3.3

$ ip route show

default dev eth0  scope link
3.3.3.0/24 dev e101-003-0   proto kernel  scope link  src 3.3.3.1
30.30.30.0/24 via 3.3.3.3 dev e101-003-0

Delete static route

$ ip route delete 11.10.10.0/24

$ ip route show

default dev eth0  scope link
3.3.3.0/24 dev e101-003-0  proto kernel  scope link  src 3.3.3.1

To add a persistent static route that is saved after a reboot, configure the route in the /etc/network/interfaces file.
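
For example, a minimal /etc/network/interfaces stanza (illustrative only; adjust the interface name, address, and prefix to your network) that re-installs the route each time the interface comes up:

auto e101-003-0
iface e101-003-0 inet static
    address 3.3.3.1
    netmask 255.255.255.0
    post-up ip route add 11.10.10.0/24 dev e101-003-0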

IPv6 routing

You can add, delete, or modify the IPv6 routes and next-hops in the IPv6 routing table.

Add IPv6 route

$ ip -6 route  add 5::5/64 via 3::3

View IPv6 routes

$ ip -6 route show

3::/64 dev e101-003-0  proto kernel  metric 256
5::/64 via 3::3 dev e101-003-0  metric 1024

Monitor routing updates

$ ip monitor

30.30.30.0/24 via 3.3.3.3 dev e101-003-0
3::/64 via 3::3 dev e101-003-0  metric 1024
5::/64 via 3::3 dev e101-003-0  metric 1024

ARP and neighbor table entries

OPX resolves adjacencies using ARP and neighbor table entries, which bind a host IP address to a MAC address. In Linux, the ARP table is used for IPv4 routing, and the neighbor table is used for IPv6 routing.

View kernel ARP table entries

$ arp -n

Address      HWtype  HWaddress           Flags Mask      Iface
3.3.3.4      ether   90:b1:1c:f4:9d:44   C               e101-003-0

View IPv6 neighbor table

$ ip -6 neighbor

Configure IPv6 address

$ ifconfig e101-003-0 inet6 add 3::1/64

$ ifconfig e101-003-0

e101-003-0 Link encap:Ethernet  HWaddr 90:b1:1c:f4:a8:ea
inet addr:3.3.3.1  Bcast:3.3.3.255  Mask:255.255.255.0
inet6 addr: 3::1/64 Scope:Global
inet6 addr: fe80::92b1:1cff:fef4:a8ea/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:532 errors:0 dropped:0 overruns:0 frame:0
TX packets:173 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:46451 (45.3 KiB)  TX bytes:25650 (25.0 KiB)

View IPv6 neighbor table

$ ip -6 neighbor show

3::3 dev e101-003-0  lladdr 90:b1:1c:f4:9d:44 router REACHABLE

Check connectivity to IPv6 neighbor

$ ping6 3::3 

PING 3::3(3::3) 56 data bytes 
64 bytes from 3::3: icmp_seq=1 ttl=64 time=1.74 ms

$ tcpdump -i e101-003-0 

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode 
listening on e101-003-0, link-type EN10MB (Ethernet), capture size 262144 bytes 
04:30:17.053115 IP6 3::1 > 3::3: ICMP6, echo request, seq 8, length 64

Equal cost multi-path (ECMP)

The Linux networking stack supports ECMP by adding multiple next-hops to the route.

Configure next-hop routing

$ ip route add 40.40.40.0/24 nexthop via 3.3.3.6 nexthop via 4.4.4.7

$ ip route show

default dev eth0  scope link
3.3.3.0/24 dev e101-003-0  proto kernel  scope link  src 3.3.3.1
40.40.40.0/24
       nexthop via 3.3.3.6  dev e101-003-0 weight 1
       nexthop via 4.4.4.7  dev e101-004-0 weight 1

NOTE: The Linux kernel provides limited support for IPv6 multi-path routing.
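
As a rough sketch, an IPv6 multi-path route uses the same nexthop keyword (the 50::/64 prefix and the 4::4 next-hop here are illustrative); verify what the kernel actually installed with ip -6 route show:

$ ip -6 route add 50::/64 nexthop via 3::3 nexthop via 4::4

$ ip -6 route show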

Layer 3 routing topology example

Use any standard Linux utility, such as ip addr add or ifconfig, to configure an IP address on an interface.

Configure IP address on R1

$ ip addr add 10.1.1.1/24 dev e101-007-0

$ ip addr add 11.1.1.1/24 dev e101-001-0

Configure IP address on R2

$ ip addr add 10.1.1.2/24 dev e101-007-0

$ ip addr add 12.1.1.1/24 dev e101-001-0

Verify IP address configuration on R1

$ ip addr show e101-007-0

16: e101-007-0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group
default qlen 500
   link/ether 74:e6:e2:f6:af:87 brd ff:ff:ff:ff:ff:ff
   inet 10.1.1.1/24 scope global e101-007-0
      valid_lft forever preferred_lft forever
   inet6 fe80::76e6:e2ff:fef6:af87/64 scope link
      valid_lft forever preferred_lft forever

$ ip addr show e101-001-0

10: e101-001-0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group
default qlen 500
   link/ether 74:e6:e2:f6:af:81 brd ff:ff:ff:ff:ff:ff
   inet 11.1.1.1/24 scope global e101-001-0
      valid_lft forever preferred_lft forever
   inet6 fe80::76e6:e2ff:fef6:af81/64 scope link
      valid_lft forever preferred_lft forever

Verify IP address configuration on R2

$ ip addr show e101-007-0

16: e101-007-0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group
default qlen 500
   link/ether 74:e6:e2:f6:ba:87 brd ff:ff:ff:ff:ff:ff
   inet 10.1.1.2/24 scope global e101-007-0
      valid_lft forever preferred_lft forever
   inet6 fe80::76e6:e2ff:fef6:ba87/64 scope link
      valid_lft forever preferred_lft forever

$ ip addr show e101-001-0

10: e101-001-0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group
default qlen 500
   link/ether 74:e6:e2:f6:ba:81 brd ff:ff:ff:ff:ff:ff
   inet 12.1.1.1/24 scope global e101-001-0
      valid_lft forever preferred_lft forever
   inet6 fe80::76e6:e2ff:fef6:ba81/64 scope link
      valid_lft forever preferred_lft forever

Enable interfaces on R1 and R2

$ ip link set dev e101-007-0 up

$ ip link set dev e101-001-0 up

Configure static route on R1

$ ip route add 12.1.1.0/24 via 10.1.1.2

Configure static route on R2

$ ip route add 11.1.1.0/24 via 10.1.1.1

Ping the server (Server 2) from R1

$ ping 11.1.1.2

View ARP table on R1

$ arp -n

Address      HWtype  HWaddress           Flags Mask      Iface
11.1.1.2     ether   00:00:00:1d:9a:bd   C               e101-001-0
10.1.1.2     ether   74:e6:e2:f6:ba:87   C               e101-007-0

View ARP table on R2

$ arp -n

Address      HWtype  HWaddress           Flags Mask      Iface
10.1.1.1     ether   74:e6:e2:f6:af:87   C               e101-007-0
12.1.1.2     ether   00:00:00:1d:9a:be   C               e101-001-0

See Programming examples for information on how to program routes using the CPS API.

Dynamic routing

To enable dynamic routing, configure BGP and OSPF using an open-source routing stack, such as FRRouting, Quagga, or BIRD, or other third-party applications.

FRRouting

Free Range Routing (FRR) is an open-source routing application that provides OSPFv2, OSPFv3, RIPv1, RIPv2, RIPng, and BGP-4 functionality. FRR is automatically installed with release 2.3.1 and later.

The FRR architecture consists of a core daemon zebra, which acts as an abstraction layer to the underlying Linux kernel and presents a Zserv API over a Unix or TCP socket to FRR clients. The Zserv clients implement a routing protocol and communicate routing updates to the zebra daemon. See FRR for complete information.

FRR configuration

The daemons file is stored in the /etc/frr directory. All routing protocol daemons installed with FRR are disabled by default. You must enable the zebra daemon to install the routes in the kernel routing table.

1. Open the daemons file for editing and change the daemon status to yes.

$ vim /etc/frr/daemons

zebra=yes
bgpd=yes
ospfd=no
ospf6d=no
ripd=no
ripngd=no
isisd=no
babeld=no

2. Create the frr.conf configuration file.

$ touch /etc/frr/frr.conf

3. Restart the FRR service.

$ service frr restart

4. View the status of the FRR protocol daemons.

root@OPX:/etc/frr# service frr status
● frr.service - FRRouting
   Loaded: loaded (/lib/systemd/system/frr.service; enabled)
   Active: active (running) since Wed 2018-11-25 21:17:14 UTC; 28s ago
  Process: 10771 ExecStop=/usr/lib/frr/frr stop (code=exited, status=0/SUCCESS)
  Process: 10843 ExecStart=/usr/lib/frr/frr start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/frr.service
           ├─10859 /usr/lib/frr/zebra -s 90000000 --daemon -A 127.0.0.1
           ├─10866 /usr/lib/frr/bgpd --daemon -A 127.0.0.1
           └─10873 /usr/lib/frr/watchfrr -adz -r /usr/sbin/service frr rest...

Nov 25 21:17:13 OPX frr[10843]: Loading capability module if not yet done.
Nov 25 21:17:13 OPX frr[10843]: Starting Frr daemons (prio:10):. zebra. bgpd.
Nov 25 21:17:13 OPX watchfrr[10873]: watchfrr 3.0.3 watching [zebra bgpd], ...t]
Nov 25 21:17:14 OPX watchfrr[10873]: zebra state -> up : connect succeeded
Nov 25 21:17:14 OPX watchfrr[10873]: bgpd state -> up : connect succeeded
Nov 25 21:17:14 OPX watchfrr[10873]: Watchfrr: Notifying Systemd we are up ...ng
Nov 25 21:17:14 OPX frr[10843]: Starting Frr monitor daemon: watchfrr.
Nov 25 21:17:14 OPX frr[10843]: Exiting from the script
Nov 25 21:17:14 OPX systemd[1]: Started FRRouting.
Hint: Some lines were ellipsized, use -l to show in full.

5. Access the FRR shell.

$ vtysh
Hello, this is FRR (version 3.0.3)
Copyright 1996-2017 Kunihiro Ishiguro, et al.
OPX#

6. Save the configuration changes.

OPX# write memory
Building Configuration...
  Integrated configuration saved to /etc/frr/frr.conf
  [OK]

FRR persistent configuration

The FRR service does not start automatically when the system reboots. To start it at boot, add network-online.target to the graphical systemd target.

$ vim.tiny /lib/systemd/system/graphical.target

Add network-online.target to the Requires and After lines:

#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.
[Unit]
Description=Graphical Interface
Documentation=man:systemd.special(7)
Requires=multi-user.target network-online.target
After=multi-user.target network-online.target
Conflicts=rescue.target
Wants=display-manager.service
AllowIsolate=yes

See FRR for complete information.

Routing use case using FRR

This use case describes how to configure BGP using FRR in a spine/leaf network.

Link              Network       Link nodes  BGP AS number
Leaf1-to-Spine1   10.1.1.0/24   Leaf1       64501
                                Spine1      64555
Leaf1-to-Spine2   20.1.1.0/24   Leaf1       64501
                                Spine2      64555
Leaf2-to-Spine1   40.1.1.0/24   Leaf2       64502
                                Spine1      64555
Leaf2-to-Spine2   30.1.1.0/24   Leaf2       64502
                                Spine2      64555
Leaf1-to-Server1  11.1.1.0/24   Leaf1       64501
Leaf2-to-Server2  12.1.1.0/24   Leaf2       64502

1. Configure the IP addresses to Spine1, Spine2, and Server1 from Leaf1.

leaf1(config)# interface e101-049-0
leaf1(conf-if-e101-049-0)# ip address 10.1.1.1/24
leaf1(conf-if-e101-049-0)# no shutdown
leaf1(conf-if-e101-049-0)# exit

leaf1(config)# interface e101-051-0
leaf1(conf-if-e101-051-0)# ip address 20.1.1.1/24
leaf1(conf-if-e101-051-0)# no shutdown
leaf1(conf-if-e101-051-0)# exit

leaf1(config)# interface e101-001-0
leaf1(conf-if-e101-001-0)# ip address 11.1.1.1/24
leaf1(conf-if-e101-001-0)# no shutdown

2. Configure the IP addresses to Spine1, Spine2, and Server2 from Leaf2.

leaf2(config)# interface e101-032-0
leaf2(conf-if-e101-032-0)# ip address 30.1.1.1/24
leaf2(conf-if-e101-032-0)# no shutdown
leaf2(conf-if-e101-032-0)# exit

leaf2(config)# interface e101-020-0
leaf2(conf-if-e101-020-0)# ip address 40.1.1.1/24
leaf2(conf-if-e101-020-0)# no shutdown
leaf2(conf-if-e101-020-0)# exit

leaf2(config)# interface e101-001-0
leaf2(conf-if-e101-001-0)# ip address 12.1.1.1/24
leaf2(conf-if-e101-001-0)# no shutdown

3. Configure the IP addresses to Leaf1 and Leaf2 from Spine1.

spine1(config)# interface e101-027-1
spine1(conf-if-e101-027-1)# ip address 10.1.1.2/24
spine1(conf-if-e101-027-1)# no shutdown
spine1(conf-if-e101-027-1)# exit

spine1(config)# interface e101-010-1
spine1(conf-if-e101-010-1)# ip address 40.1.1.2/24
spine1(conf-if-e101-010-1)# no shutdown 

4. Configure the IP addresses to Leaf1 and Leaf2 from Spine2.

spine2(config)# interface e101-027-1
spine2(conf-if-e101-027-1)# ip address 20.1.1.2/24
spine2(conf-if-e101-027-1)# no shutdown
spine2(conf-if-e101-027-1)# exit

spine2(config)# interface e101-018-1
spine2(conf-if-e101-018-1)# ip address 30.1.1.2/24
spine2(conf-if-e101-018-1)# no shutdown
spine2(conf-if-e101-018-1)# exit

5. Configure BGP to Spine1 and Spine2 from Leaf 1.

leaf1(config)# router bgp 64501
leaf1(conf-router-bgp-64501)# neighbor 10.1.1.2 remote-as 64555
leaf1(conf-router-bgp-64501)# neighbor 20.1.1.2 remote-as 64555
leaf1(conf-router-bgp-64501)# network 10.1.1.0/24
leaf1(conf-router-bgp-64501)# network 20.1.1.0/24
leaf1(conf-router-bgp-64501)# network 11.1.1.0/24

6. Configure BGP to Spine1 and Spine2 from Leaf 2.

leaf2(config)# router bgp 64502
leaf2(conf-router-bgp-64502)# neighbor 30.1.1.2 remote-as 64555
leaf2(conf-router-bgp-64502)# neighbor 40.1.1.2 remote-as 64555
leaf2(conf-router-bgp-64502)# network 12.1.1.0/24
leaf2(conf-router-bgp-64502)# network 30.1.1.0/24
leaf2(conf-router-bgp-64502)# network 40.1.1.0/24

7. Configure BGP to Leaf1 and Leaf2 from Spine1.

spine1(config)# router bgp 64555
spine1(conf-router-bgp-64555)# neighbor 10.1.1.1 remote-as 64501
spine1(conf-router-bgp-64555)# neighbor 40.1.1.1 remote-as 64502
spine1(conf-router-bgp-64555)# network 10.1.1.0/24
spine1(conf-router-bgp-64555)# network 40.1.1.0/24

8. Configure BGP to Leaf1 and Leaf2 from Spine 2.

spine2(config)# router bgp 64555
spine2(conf-router-bgp-64555)# neighbor 30.1.1.1 remote-as 64502
spine2(conf-router-bgp-64555)# neighbor 20.1.1.1 remote-as 64501
spine2(conf-router-bgp-64555)# network 30.1.1.0/24
spine2(conf-router-bgp-64555)# network 20.1.1.0/24

9. Configure ECMP from Leaf1 to Leaf2.

leaf1(config)# router bgp 64501
leaf1(conf-router-bgp-64501)# maximum-paths 16

leaf2(config)# router bgp 64502
leaf2(conf-router-bgp-64502)# maximum-paths 16

Verify spine/leaf configuration

1. Verify BGP neighbors from Leaf1 and Leaf2.

leaf1# show ip bgp sum

BGP router identifier 20.20.20.20, local AS number 64501 vrf-id 0
RIB entries 11, using 1232 bytes of memory
Peers 2, using 9136 bytes of memory

Neighbor  V AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.1.1.1  4 64501  196  201  1 0 0 02:39:02  4
20.1.1.1  4 64501  195  206  1 0 0 02:38:57  4

Total number of neighbors 2

leaf2# show ip bgp sum
BGP router identifier 30.20.20.20, local AS number 64501 vrf-id 0
RIB entries 11, using 1232 bytes of memory
Peers 2, using 9136 bytes of memory

Neighbor  V AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
30.1.1.1  4 64501  196  197  1 0 0 02:39:45  4
40.1.1.1  4 64501  192  204  1 0 0 02:39:42  4

Total number of neighbors 2