In the previous tutorial, we discovered how to let OSPF dynamically configure routing inside a network. This tutorial provides an introduction to another routing protocol, which is BGP, the Border Gateway Protocol. As the name implies, this protocol acts on the border of a network. Where OSPF is well suited to keep track of all tiny details of what's happening in our internal network, BGP will be talking to the outside world to interconnect our network with other networks, managed by someone else.
When routers talk BGP to each other, they essentially just claim that network ranges are reachable via them:
Let's look at the same picture again, hiding less information:
The picture shows two networks, which are interconnected through routers R3 and R10.
- A complete network under control of somebody has an AS (Autonomous System) number. This number will be used later in the BIRD BGP configuration.
- The routes that are published to another network are aggregated as much as possible, to minimize their number. While the internal routing table in, for example, AS64080 might contain dozens of prefixes, one for each little vlan, and probably a number of single host routes (IPv4 /32 and IPv6 /128), they're advertised to the outside as just three routes in total.
- If neighbouring routers between different networks are directly connected, they often interconnect using a minimal sized network range. For IPv4, this is usually a /30 and for IPv6 a /120 or a /126 prefix, containing only the two routers. In the example above, the small network ranges are taken from the network of AS64080.
While the title of this section might seem logical, since we're considering BGP after just having spent quite some time on OSPF, it's actually a non-issue. OSPF and BGP are two very different routing protocols, which are used to get different things done. Nonetheless, let's look at some differences:
OSPF:
- Routes in the network are originated by just putting ip addresses on a network interface of a router, and letting the routing protocol pick them up automatically.
- The routes in OSPF are addresses and subnets that are actually in use.
- Every router that participates in the OSPF protocol has a full, detailed view of the network, built from link state updates that are flooded through the network. This knowledge is used to calculate the shortest path to every part of the network.
BGP:
- Routes that are published to other networks are "umbrella ranges", which are as big as possible and are defined manually.
- There is no actual proof that the addresses which are advertised are actually in use inside the network.
- A neighbour BGP router knows that some prefix is reachable via another network. Where the OSPF shortest path calculation uses knowledge about all separate routers, paths and weights, BGP looks at a higher level, treating a complete network (AS) as one step. By default, BGP forwards traffic in the direction with the smallest number of AS hops to a destination (the shortest AS path), but it provides a fair number of configurable options to influence the routing decisions.
So, OSPF is an IGP (Interior Gateway Protocol) and BGP is an EGP (Exterior Gateway Protocol). BGP can connect OSPF networks to each other, hiding a lot of detail inside them.
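To make the "shortest AS path" default from the BGP list above a bit more tangible, here is a minimal Python sketch that picks between two advertisements for the same prefix by AS-path length alone. The neighbour names, the extra AS number 64999 and the function name shortest_as_path are made up for illustration, and a real BGP implementation compares more attributes than just the path length.

```python
# Hypothetical advertisements for one prefix, as seen by a router in AS64080.
# Each candidate carries the AS path the announcement travelled through.
candidates = [
    {"prefix": "10.40.32.0/19", "via": "neighbour A", "as_path": [64999, 65033]},
    {"prefix": "10.40.32.0/19", "via": "neighbour B", "as_path": [65033]},
]


def shortest_as_path(routes):
    """Return the route whose AS path has the fewest hops."""
    return min(routes, key=lambda route: len(route["as_path"]))


best = shortest_as_path(candidates)
print(best["via"], best["as_path"])
```

Note that nothing about routers, links or weights inside an AS appears here: each AS counts as exactly one step.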
In the second half of this tutorial we'll configure a network, using OSPF, BGP and the BIRD routing software. BGP wise, it's kept simple, using just a single connection between two networks.
Our networks start to look serious now! It might be handy to print this image so you don't have to scroll back up all the time, comparing all the routes in the output of commands with the network topology.
Thankfully, most of the configuration is provided already, so we can quickly set up this whole network using our LXC environment. Just like in the previous tutorial, the birdbase container can be cloned, after which the lxc network information and configuration inside the containers can be copied into them.
- Clone this git repository somewhere to be able to use some files from the bgp-intro/lxc/ directory inside.
- lxc-clone the birdbase container several times:
    lxc-clone -s birdbase R0
    lxc-clone -s birdbase R1
    lxc-clone -s birdbase R3
    lxc-clone -s birdbase R10
    lxc-clone -s birdbase R11
    lxc-clone -s birdbase R12
    lxc-clone -s birdbase H6
    lxc-clone -s birdbase H7
    lxc-clone -s birdbase H19
    lxc-clone -s birdbase H34
- Set up the network interfaces in the lxc configuration. This can be done by removing all network related configuration that remains from the cloned birdbase container, and then appending all needed interface configuration by running the fixnetwork.sh script that can be found in bgp-intro/lxc/ in this git repository. Of course, have a look at the contents of the script first, before executing it.
    . ./fixnetwork.sh
- Copy extra configuration into the containers. The bgp-intro/lxc/ directory inside this git repository contains a little file hierarchy that can just be copied over the configuration of the containers. For each router, it's a network/interfaces configuration file which adds an IP address that corresponds with the Router ID to the loopback interface, and a simple BIRD configuration file that serves as a starting point for our next steps.
- Start all containers:
    for router in 0 1 3 10 11 12; do lxc-start -d -n R$router; sleep 2; done
    for host in 6 7 19 34; do lxc-start -d -n H$host; sleep 2; done
- Verify connectivity and look around a bit. Here's an example for R1:
lxc-attach -n R1
root@R1:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 10.40.217.1/32 scope global lo
       valid_lft forever preferred_lft forever
    inet6 2001:db8:40::1/128 scope global
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
109: vlan216: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:00:0a:28:d8:03 brd ff:ff:ff:ff:ff:ff
    inet 10.40.216.3/28 brd 10.40.216.15 scope global vlan216
       valid_lft forever preferred_lft forever
    inet6 2001:db8:40:d8::3/120 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::aff:fe28:d803/64 scope link
       valid_lft forever preferred_lft forever
111: vlan3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:00:0a:28:03:01 brd ff:ff:ff:ff:ff:ff
    inet 10.40.3.1/24 brd 10.40.3.255 scope global vlan3
       valid_lft forever preferred_lft forever
    inet6 2001:db8:40:3::1/120 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::aff:fe28:301/64 scope link
       valid_lft forever preferred_lft forever
root@R1:/# ip r
10.40.2.0/24 via 10.40.216.2 dev vlan216 proto bird
10.40.3.0/24 dev vlan3 proto kernel scope link src 10.40.3.1
10.40.216.0/28 dev vlan216 proto kernel scope link src 10.40.216.3
10.40.217.0 via 10.40.216.2 dev vlan216 proto bird
10.40.217.3 via 10.40.216.1 dev vlan216 proto bird
10.40.217.16/30 via 10.40.216.1 dev vlan216 proto bird
root@R1:/# birdc show route
BIRD 1.4.5 ready.
10.40.217.16/30 via 10.40.216.1 on vlan216 [ospf1 22:58:02] * I (150/20) [10.40.217.3]
10.40.216.0/28 dev vlan216 [ospf1 22:58:02] * I (150/10) [10.40.217.3]
10.40.217.0/32 via 10.40.216.2 on vlan216 [ospf1 22:58:02] * I (150/10) [10.40.217.0]
10.40.217.1/32 dev lo [ospf1 22:57:42] * I (150/0) [10.40.217.1]
10.40.217.3/32 via 10.40.216.1 on vlan216 [ospf1 22:58:02] * I (150/10) [10.40.217.3]
10.40.2.0/24 via 10.40.216.2 on vlan216 [ospf1 22:58:02] * I (150/20) [10.40.217.0]
10.40.3.0/24 dev vlan3 [ospf1 22:57:42] * I (150/10) [10.40.217.1]
root@R1:/# ip -6 r
2001:db8:40:: via fe80::aff:fe28:d802 dev vlan216 proto bird metric 1024
unreachable 2001:db8:40::1 dev lo proto kernel metric 256 error -101
2001:db8:40::3 via fe80::aff:fe28:d801 dev vlan216 proto bird metric 1024
2001:db8:40:2::/120 via fe80::aff:fe28:d802 dev vlan216 proto bird metric 1024
2001:db8:40:3::/120 dev vlan3 proto kernel metric 256
2001:db8:40:d8::/120 dev vlan216 proto kernel metric 256
2001:db8:40:d910::/120 via fe80::aff:fe28:d801 dev vlan216 proto bird metric 1024
fe80::/64 dev vlan216 proto kernel metric 256
fe80::/64 dev vlan3 proto kernel metric 256
root@R1:/# birdc6 show route
BIRD 1.4.5 ready.
2001:db8:40:d8::/120 dev vlan216 [ospf1 22:58:08] * I (150/10) [10.40.217.3]
2001:db8:40::/128 via fe80::aff:fe28:d802 on vlan216 [ospf1 22:58:08] * I (150/20) [10.40.217.0]
2001:db8:40:2::/120 via fe80::aff:fe28:d802 on vlan216 [ospf1 22:58:08] * I (150/20) [10.40.217.0]
2001:db8:40:3::/120 dev vlan3 [ospf1 22:57:41] * I (150/10) [10.40.217.1]
2001:db8:40::3/128 via fe80::aff:fe28:d801 on vlan216 [ospf1 22:58:08] * I (150/20) [10.40.217.3]
2001:db8:40:d910::/120 via fe80::aff:fe28:d801 on vlan216 [ospf1 22:58:08] * I (150/20) [10.40.217.3]
As you can see, OSPF is running for IPv4 and IPv6, and has discovered the complete internal network of AS64080.
Now make sure you can do the following, and answer the following questions:
- From H6, traceroute -n and traceroute6 -n to a few destinations in AS64080 to get acquainted with the network topology.
- Look at the BIRD logging. A fun way to follow the logging is to do tail -F R*/rootfs/var/log/bird/*.log from outside the containers, and then start all of them.
- Find out why 10.40.217.18 or 2001:db8:40:d910::2 on R10 cannot be pinged from R1, while the routes to 10.40.217.16/30 and 2001:db8:40:d910::/120 are actually present in the routing tables of R1 and R3.
Let's zoom in a bit first, and focus on the connection between R3 and R10. This section will show how to configure the actual BGP connection between those two routers, so they will learn about each other's networks.
The routing table of R3 contains information about the inside of its own network, AS64080. As you can see, routes to the ranges in AS65033 are missing.
root@R3:/# ip r
10.40.2.0/24 via 10.40.216.2 dev vlan216 proto bird
10.40.3.0/24 via 10.40.216.3 dev vlan216 proto bird
10.40.216.0/28 dev vlan216 proto kernel scope link src 10.40.216.1
10.40.217.0 via 10.40.216.2 dev vlan216 proto bird
10.40.217.1 via 10.40.216.3 dev vlan216 proto bird
10.40.217.16/30 dev vlan217 proto kernel scope link src 10.40.217.17
root@R3:/# ip -6 r
2001:db8:40:: via fe80::aff:fe28:d802 dev vlan216 proto bird metric 1024
2001:db8:40::1 via fe80::aff:fe28:d803 dev vlan216 proto bird metric 1024
unreachable 2001:db8:40::3 dev lo proto kernel metric 256 error -101
2001:db8:40:2::/120 via fe80::aff:fe28:d802 dev vlan216 proto bird metric 1024
2001:db8:40:3::/120 via fe80::aff:fe28:d803 dev vlan216 proto bird metric 1024
2001:db8:40:d8::/120 dev vlan216 proto kernel metric 256
2001:db8:40:d910::/120 dev vlan217 proto kernel metric 256
fe80::/64 dev vlan216 proto kernel metric 256
fe80::/64 dev vlan217 proto kernel metric 256
Now, add the following configuration to bird.conf of R3:
##############################################################################
# eBGP R10
#
table t_r10;
protocol static originate_to_r10 {
    table t_r10;
    import all;  # originate here
    route 10.40.0.0/22 blackhole;
    route 10.40.216.0/21 blackhole;
}
protocol bgp ebgp_r10 {
    table t_r10;
    local 10.40.217.17 as 64080;
    neighbor 10.40.217.18 as 65033;
    import filter {
        if net ~ [ 10.0.0.0/8{19,24} ] then accept;
        reject;
    };
    import keep filtered on;
    export where source = RTS_STATIC;
}
protocol pipe p_master_to_r10 {
    table master;
    peer table t_r10;
    import where source = RTS_BGP;
    export none;
}
Let me explain a bit about what's going on here. So far, we've used the BIRD protocol types kernel, device and ospf. This configuration snippet introduces three other ones: static, bgp and pipe. Besides that, there's also a table definition on top.
table t_r10;
By issuing table t_r10, we tell BIRD that we'd like to use an extra internal routing table with the name t_r10. By default, BIRD always has a routing table named master, and now we added a second one. Routing tables in BIRD are just a collection of routes, having some attributes.
protocol static originate_to_r10 {
    table t_r10;
    import all;  # originate here
    route 10.40.0.0/22 blackhole;
    route 10.40.216.0/21 blackhole;
}
The static protocol is used to generate a collection of static routes. In this case, we define a protocol static with name originate_to_r10, and connect it to table t_r10. The import statement causes the routes that are generated by this static route protocol to be imported into the t_r10 table. Static routes usually have a target of a neighbor router, using a via statement, but in this case, we don't care about a next hop, since it's just a collection of some prefixes that will be exported via BGP. The blackhole won't be actually used for anything here.
protocol bgp ebgp_r10 {
    table t_r10;
    local 10.40.217.17 as 64080;
    neighbor 10.40.217.18 as 65033;
    import filter {
        if net ~ [ 10.0.0.0/8{19,24} ] then accept;
        reject;
    };
    import keep filtered on;
    export where source = RTS_STATIC;
}
The bgp protocol is named after the router which it's talking to, R10, and is also connected to the t_r10 routing table inside BIRD. It has a local and remote IP address and AS number. The import rules are a bit more complex than a simple import all, which also would have been sufficient here to get it working. The filter shown here just makes sure only RFC1918 prefixes from 10/8 are accepted, which are allowed to be from a /19 to /24 in size each. The export rule contains a simple filter that tells BIRD to push all routes from table t_r10 that originate from a static protocol to the outside, to R10.
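To see what the prefix set in the import filter accepts and rejects, here is a small Python approximation of the same test. It is only a sketch of the matching rule (inside 10.0.0.0/8, prefix length between 19 and 24), not BIRD's own implementation, and the function name import_filter is invented for this example.

```python
import ipaddress

ALLOWED_RANGE = ipaddress.ip_network("10.0.0.0/8")
MIN_LEN, MAX_LEN = 19, 24


def import_filter(prefix):
    """Mimic: if net ~ [ 10.0.0.0/8{19,24} ] then accept; reject;"""
    net = ipaddress.ip_network(prefix)
    # The version check avoids comparing an IPv6 prefix with the IPv4 range.
    inside_range = net.version == 4 and net.subnet_of(ALLOWED_RANGE)
    return inside_range and MIN_LEN <= net.prefixlen <= MAX_LEN


for prefix in ("10.40.216.0/21", "10.40.0.0/22", "10.40.217.16/30", "192.168.1.0/24"):
    print(prefix, "accept" if import_filter(prefix) else "reject")
```

The two aggregates pass, while the small /30 interconnect range and anything outside 10/8 would be rejected.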
protocol pipe p_master_to_r10 {
    table master;
    peer table t_r10;
    import where source = RTS_BGP;
    export none;
}
The pipe protocol is a simple protocol that is able to move around routes between internal BIRD routing tables. In this case, the pipe protocol p_master_to_r10 is connected to the central master routing table and is looking at table t_r10. From table t_r10, all routes that originate from an external BGP peer are imported into the master table. Doing so will cause the routes that will be learned from the remote network to end up in the routing table of the Linux kernel (via the kernel protocol that exports them from the BIRD master table outside BIRD), while the routes that only were meant to be used to export to the BGP peer (generated by the static protocol) stay in t_r10.
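The way the two source filters split the contents of t_r10 can be modelled in a few lines of Python. This is a toy model only: it assumes the BGP session is up and that R10 announces 10.40.32.0/19, and the select helper is made up for the illustration.

```python
# Routes as they could sit in table t_r10 on R3 once the session is up.
# "source" mirrors BIRD's RTS_STATIC / RTS_BGP route attribute.
t_r10 = [
    {"net": "10.40.0.0/22", "source": "RTS_STATIC"},
    {"net": "10.40.216.0/21", "source": "RTS_STATIC"},
    {"net": "10.40.32.0/19", "source": "RTS_BGP"},  # assumed to come from R10
]


def select(table, source):
    """Return the prefixes in table whose source attribute matches."""
    return [route["net"] for route in table if route["source"] == source]


# protocol bgp: export where source = RTS_STATIC  (t_r10 -> R10)
sent_to_r10 = select(t_r10, "RTS_STATIC")
# protocol pipe: import where source = RTS_BGP    (t_r10 -> master)
piped_to_master = select(t_r10, "RTS_BGP")

print("announced to R10:", sent_to_r10)
print("piped into master:", piped_to_master)
```

The static umbrella routes only ever leave towards R10, and only the BGP-learned route travels on to the master table and from there to the kernel.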
Don't worry if the whole construction with tables, protocols and pipes is still a bit confusing. The first goal is to see the BGP routing in action, and afterwards I'll explain more about those BIRD internals.
Also, remember that the internal BIRD routing tables are not used to actually do packet forwarding. During the OSPF tutorial, we already discussed this difference between the "Control Plane" and "Forwarding Plane". Actually, the routing table inside the control plane is usually called the "RIB" (Routing Information Base), while the routing table that is used in the forwarding plane is called the "FIB" (Forwarding Information Base). Just look up all those terms on the internet to see what everyone is saying about them.
After adding the configuration on R3, fire up the interactive BIRD console, using birdc:
root@R3:/# birdc
BIRD 1.4.5 ready.
bird>
Don't forget to tell BIRD to read and apply the changed configuration:
bird> con
Reading configuration from /etc/bird/bird.conf
Reconfigured
Now, the three new protocols should be shown:
bird> show protocols
name              proto    table    state  since       info
kernel1           Kernel   master   up     2015-06-14
device1           Device   master   up     2015-06-14
ospf1             OSPF     master   up     2015-06-14  Running
originate_to_r10  Static   t_r10    up     23:54:16
p_master_to_r10   Pipe     master   up     23:54:16    => t_r10
ebgp_r10          BGP      t_r10    start  00:34:16    Active        Socket: Connection refused
bird> show route table t_r10
10.40.216.0/21 blackhole [originate_to_r10 23:54:16] * (200)
10.40.0.0/22 blackhole [originate_to_r10 23:54:16] * (200)
Well, the routes are waiting to be pushed to R10 in the t_r10 table, and no routes from AS65033 are visible yet. There's only an ugly "Connection refused"... reminding you that the other end of the BGP connection needs to be configured. Now it's up to you to configure R10 with the opposite part of the configuration, and make it talk to R3!
When successful, the output of the commands above should show the BGP session to R3 as Established now:
bird> show protocols
name             proto    table    state  since       info
kernel1          Kernel   master   up     2015-06-14
device1          Device   master   up     2015-06-14
ospf1            OSPF     master   up     2015-06-14  Running
originate_to_r3  Static   t_r3     up     00:48:27
ebgp_r3          BGP      t_r3     up     00:48:32    Established
p_master_to_r3   Pipe     master   up     00:48:27    => t_r3
Table t_r3 now also contains the routes that are learned from AS64080:
bird> show route table t_r3
10.40.216.0/21 via 10.40.217.17 on vlan217 [ebgp_r3 00:48:32] * (100) [AS64080i]
10.40.32.0/19 blackhole [originate_to_r3 00:48:27] * (200)
10.40.0.0/22 via 10.40.217.17 on vlan217 [ebgp_r3 00:48:32] * (100) [AS64080i]
The above shows for example that prefix 10.40.216.0/21 was learned via the protocol ebgp_r3 at 00:48:32, and that the range is originating from AS64080. The via 10.40.217.17 is the BGP next-hop, which is the first router outside our own network.
The BIRD master routing table also contains the routes learned over BGP, thanks to the p_master_to_r3 protocol:
bird> show route
10.40.217.16/30 dev vlan217 [ospf1 2015-06-14] * I (150/10) [10.40.32.10]
10.40.216.0/21 via 10.40.217.17 on vlan217 [ebgp_r3 00:48:32] * (100) [AS64080i]
10.40.33.0/26 dev vlan33 [ospf1 2015-06-14] * I (150/10) [10.40.32.12]
10.40.36.0/24 via 10.40.33.3 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.12]
10.40.48.0/21 via 10.40.33.2 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.11]
10.40.32.10/32 dev lo [ospf1 2015-06-14] * I (150/0) [10.40.32.10]
10.40.32.11/32 via 10.40.33.2 on vlan33 [ospf1 2015-06-14] * I (150/10) [10.40.32.11]
10.40.0.0/22 via 10.40.217.17 on vlan217 [ebgp_r3 00:48:32] * (100) [AS64080i]
10.40.32.12/32 via 10.40.33.3 on vlan33 [ospf1 2015-06-14] * I (150/10) [10.40.32.12]
The last step to get the routes into the actual forwarding table inside the Linux kernel is done by the kernel protocol. Since there is no explicit name given for the kernel protocol in the configuration, BIRD just names it kernel1.
bird> show route export kernel1
10.40.216.0/21 via 10.40.217.17 on vlan217 [ebgp_r3 00:48:32] * (100) [AS64080i]
10.40.36.0/24 via 10.40.33.3 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.12]
10.40.48.0/21 via 10.40.33.2 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.11]
10.40.32.11/32 via 10.40.33.2 on vlan33 [ospf1 2015-06-14] * I (150/10) [10.40.32.11]
10.40.0.0/22 via 10.40.217.17 on vlan217 [ebgp_r3 00:48:32] * (100) [AS64080i]
10.40.32.12/32 via 10.40.33.3 on vlan33 [ospf1 2015-06-14] * I (150/10) [10.40.32.12]
Now the routes show up in the output of ip route, labeled with proto bird:
root@R10:/# ip r
10.40.0.0/22 via 10.40.217.17 dev vlan217 proto bird
10.40.32.11 via 10.40.33.2 dev vlan33 proto bird
10.40.32.12 via 10.40.33.3 dev vlan33 proto bird
10.40.33.0/26 dev vlan33 proto kernel scope link src 10.40.33.1
10.40.36.0/24 via 10.40.33.3 dev vlan33 proto bird
10.40.48.0/21 via 10.40.33.2 dev vlan33 proto bird
10.40.216.0/21 via 10.40.217.17 dev vlan217 proto bird
10.40.217.16/30 dev vlan217 proto kernel scope link src 10.40.217.18
Well, let's have a look at what we can do with this result. Since both networks are now aware of each other's routes, I'd expect I can do some tracerouting into a remote network now!
root@R10:/# traceroute -n 10.40.2.6
traceroute to 10.40.2.6 (10.40.2.6), 30 hops max, 60 byte packets
1 10.40.217.17 0.356 ms 0.319 ms 0.324 ms
2 10.40.216.2 0.430 ms 0.427 ms 0.378 ms
3 10.40.2.6 0.781 ms 0.724 ms 0.716 ms
R10 now knows the route to IPv4 ranges used in AS64080, and it seems H6 also knows a route back to R10.
Let's try it from H34!
root@H34:/# traceroute -n 10.40.2.6
traceroute to 10.40.2.6 (10.40.2.6), 30 hops max, 60 byte packets
1 10.40.36.1 0.296 ms !N 0.091 ms !N *
Meh, that doesn't look too good. Apparently there's more work to do.
Now make sure you can do the following, and answer the following questions:
- Configure the IPv6 BGP connection between R3 and R10. IPv4 and IPv6 are handled separately by BIRD now, but the configuration for IPv6 is very similar to the configuration I showed here. Just use import all for bgp if you don't want to learn more about filtering now.
- Explain why 10.40.217.18 or 2001:db8:40:d910::2 on R10 can be pinged from R1 now, while this was not the case before:
    root@R1:/# ping6 2001:db8:40:d910::2
    PING 2001:db8:40:d910::2(2001:db8:40:d910::2) 56 data bytes
    64 bytes from 2001:db8:40:d910::2: icmp_seq=1 ttl=63 time=0.399 ms
    64 bytes from 2001:db8:40:d910::2: icmp_seq=2 ttl=63 time=0.099 ms
    ^C
    --- 2001:db8:40:d910::2 ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1000ms
    rtt min/avg/max/mdev = 0.099/0.249/0.399/0.150 ms
- Try to export a route outside of 10.0.0.0/8 over BGP, from R3 to R10, and notice that the filter will actually stop that route from being propagated, while accepting the other routes. Using the show route filtered protocol ebgp_r3 command the route should be visible, thanks to the import keep filtered on option that is set.
- Figure out why, despite the fact that the two networks learned each other's prefixes, you still cannot reach any router or host in the neighbor network that lies behind the border router. Try the following ICMP echo commands and explain why they do or don't succeed. Hint: use tcpdump -ni vlanXYZ on the right vlan interface to see the actual traffic, with source and destination addresses.
    - R3 -> R10: root@R3:/# ping 10.40.32.10
    - R3 -> R11: root@R3:/# ping 10.40.32.11
    - R11 -> R3: root@R11:/# ping 10.40.217.3
    - R12 -> R1: root@R12:/# ping 10.40.217.1
After explaining a bit more about the BIRD tables and protocols, we'll fix all these reachability issues.
The usage of import, export, different protocols and routing tables can be a bit confusing at first. Well, at least it was very frustrating for me, until I found out how to use it.
The main gotcha here is that the import and export statements are to be considered from the point of view of the BIRD routing table that is connected to the protocol (either by specifying the table option, or omitting it, using the default master table).
What I found out is that the easiest way to prevent confusion is to take the BIRD 'master' table as central point of reasoning, and then configure everything so that 'import' points closer to the master table, importing routes closer to the heart of BIRD, and 'export' points away from it, pushing routes to the outside world.
Here's a diagram of the BIRD configuration that we just used:
And here's how you should read the configuration that is in your routers right now:
- table master is the central routing table of BIRD
- kernel protocol kernel1 exports routes from BIRD to Linux
- ospf protocol ospf1 imports routes from other OSPF routers in the network into BIRD
- pipe protocol p_master_to_r10 imports routes from its peer table t_r10 into table master
- table t_r10 is another BIRD table that contains a collection of routes with attributes
- static protocol originate_to_r10 imports static routes into table t_r10
- bgp protocol ebgp_r10 exports routes from table t_r10 to R10
Note that the OSPF protocol itself also generates routes for connected subnets that are stub or non-stub networks. These routes are not imported via the kernel protocol.
The output of show protocols should also totally make sense now (table column width adjusted):
root@R3:/# birdc show protocols
BIRD 1.4.5 ready.
name              proto    table    state  since       info
kernel1           Kernel   master   up     2015-06-14
device1           Device   master   up     2015-06-14
ospf1             OSPF     master   up     2015-06-14  Running
originate_to_r10  Static   t_r10    up     2015-06-18
p_master_to_r10   Pipe     master   up     2015-06-18  => t_r10
ebgp_r10          BGP      t_r10    up     2015-06-19  Established
Assignments:
- The OSPF protocol configuration that we are using does not contain any table, import or export. This means it's using the defaults, which are table master, import all, export none. Add a line specifying import none; to the OSPF protocol configuration, and look at the effect on the BIRD master table, and the Linux routing table.
- Change the BIRD configuration to use only the master table, eliminating the extra t_r10 routing table, without changing the set of routes that are actually exported to the Linux kernel. Doing so should show that it's entirely possible, but that decreasing complexity by removing the extra table will increase complexity in the filters needed.
There's a last task that needs to be completed before every host and router in the two networks can see each other. As you just found out, only the border routers that actually speak BGP have learned the routes to the other network, and the internal routers still have no idea about them.
So, how should R0 and R1 be told about the routes from AS65033 that are already known to R3?
BGP is not only meant to be used to connect to a router in an external network. It can also be used to connect back to routers in our own AS, to provide them with the learned information about externally reachable networks. A connection to a router in a different AS is called an eBGP connection, and a connection to a router inside the same AS is called an iBGP connection.
In the inside network, iBGP can run alongside OSPF on the routers, the difference between them being that OSPF carries the internal routes, and BGP the external ones:
- OSPF, the IGP, contains all information about routes inside our network.
- BGP, the EGP, contains all information about external connectivity.
Here's an example for the IPv6 iBGP connection between R3 and R1:
In the IPv6 BIRD configuration of R3, add:
protocol bgp ibgp_r1 {
    import none;
    export where source = RTS_BGP;
    local 2001:db8:40::3 as 64080;
    neighbor 2001:db8:40::1 as 64080;
}
In the IPv6 BIRD configuration of R1, add:
protocol bgp ibgp_r3 {
    local 2001:db8:40::1 as 64080;
    neighbor 2001:db8:40::3 as 64080;
}
Using the same AS number for the local and neighbor address simply tells BIRD that we're dealing with an iBGP connection.
Do a birdc6 configure in R1 and R3, and look at the result on R1:
root@R1:/# birdc6 show route protocol ibgp_r3
BIRD 1.4.5 ready.
2001:db8:10::/48 via fe80::aff:fe28:d801 on vlan216 [ibgp_r3 23:26:12 from 2001:db8:40::3] * (100/20) [AS65033i]
BIRD just learned a route to the remote AS! And, because of this, H7 in AS64080 and R10 in AS65033 can now find each other:
root@H7:/# traceroute6 -n 2001:db8:10:6::a
traceroute to 2001:db8:10:6::a (2001:db8:10:6::a), 30 hops max, 80 byte packets
1 2001:db8:40:3::1 0.556 ms 0.501 ms 0.501 ms
2 2001:db8:40:d8::1 1.059 ms 1.074 ms 1.078 ms
3 2001:db8:10:6::a 1.281 ms 1.274 ms 1.268 ms
Since BGP only handles external connectivity, the protocol does not try to be clever about routes inside the local network. When taking a closer look at the BGP route that is received by R1, it shows that the BGP information attached to the route only contains information about the first hop outside the network, which is called the BGP next hop:
root@R1:/# birdc6
BIRD 1.4.5 ready.
bird> show route all 2001:db8:10::/48
2001:db8:10::/48 via fe80::aff:fe28:d801 on vlan216 [ibgp_r3 23:26:11 from 2001:db8:40::3] * (100/20) [AS65033i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65033
        BGP.next_hop: 2001:db8:40:d910::2
        BGP.local_pref: 100
Since R1 has only got this information, BIRD has to find out what the actual next hop to a router in a directly connected subnet has to be before a route can be exported to the Linux kernel. Luckily this is where the cooperation of the IGP comes into play. Since OSPF knows a route to 2001:db8:40:d910::2, it can tell us where to forward the traffic in the local network to push it closer to that external BGP next hop. This is exactly the reason why the subnets that connect to routers just outside our own network are also included in OSPF as stub networks!
bird> show route for 2001:db8:40:d910::2
2001:db8:40:d910::/120 via fe80::aff:fe28:d801 on vlan216 [ospf1 2015-06-14] * I (150/20) [10.40.217.3]
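The lookup that turns the BGP next hop into a directly connected gateway can be sketched in Python as a longest-prefix match against the OSPF routes that R1 showed earlier. This is a simplified model of the idea, not BIRD's actual resolution code, and the names igp_routes and resolve_next_hop are invented for the example.

```python
import ipaddress

# IPv6 routes known to R1 through OSPF: (prefix, gateway, interface).
# A gateway of None means the prefix is directly connected.
igp_routes = [
    ("2001:db8:40:d8::/120", None, "vlan216"),
    ("2001:db8:40::/128", "fe80::aff:fe28:d802", "vlan216"),
    ("2001:db8:40:2::/120", "fe80::aff:fe28:d802", "vlan216"),
    ("2001:db8:40:3::/120", None, "vlan3"),
    ("2001:db8:40::3/128", "fe80::aff:fe28:d801", "vlan216"),
    ("2001:db8:40:d910::/120", "fe80::aff:fe28:d801", "vlan216"),
]


def resolve_next_hop(bgp_next_hop):
    """Longest-prefix match of the BGP next hop against the IGP routes."""
    address = ipaddress.ip_address(bgp_next_hop)
    matches = [
        (ipaddress.ip_network(prefix), gateway, interface)
        for prefix, gateway, interface in igp_routes
        if address in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None  # next hop unreachable, the BGP route cannot be used
    _, gateway, interface = max(matches, key=lambda match: match[0].prefixlen)
    return gateway, interface


print(resolve_next_hop("2001:db8:40:d910::2"))
```

The result is the link-local address of R3 on vlan216, which matches the via fe80::aff:fe28:d801 on vlan216 that BIRD attached to the BGP route for 2001:db8:10::/48.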
Remember the section about next-hops in the OSPF tutorial? If not, go back and re-read it ("Step three: figuring out shortest paths and determining next-hops"). The same logic applies here. While this router already has a strong opinion about the path that traffic to 2001:db8:10:6::a has to take to reach the remote network, all this knowledge gets thrown away even before the actual IP packet leaves this router... While BIRD knows the entry point in the remote network, as well as the path through the internal network to reach it, it can only install a route to the locally connected next hop into the actual forwarding routing table of the Linux kernel. The next router which receives the packet has to apply all routing logic again itself to get it forwarded into the right direction. Luckily, protocols like OSPF and BGP are designed in a way that enables us to trust that all routers that cooperate in the routing protocols have the same mindset and will perfectly work together to get the traffic to its destination without endlessly forwarding it in loops between them.
The only thing that the routers in AS64080 know is that R10 is the entry point for AS65033, and how to get there. They do not have the slightest knowledge about how the internal network of AS65033 is organized, and there is no way for them to learn about this. When the traffic enters the remote network, that network will take care of delivering it to the actual router or host in that network.
After getting to know iBGP, you might still wonder: "If the routes are in the BIRD master table, and we already have the routers inside the AS talking to each other, why not just export the BGP routes into OSPF?". Well, actually, that can be done, and we can try it for fun. In order to redistribute the BGP routes into OSPF, just shut down the iBGP connections again, add the line export where source = RTS_BGP; to the OSPF section of both R3 and R10, and run birdc configure.
For example, R11 now shows:
root@R11:/# birdc6 show r
BIRD 1.4.5 ready.
2001:db8:10:24::/120 via fe80::aff:fe28:2103 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.12]
2001:db8:10:21::/120 dev vlan33 [ospf1 2015-06-14] * I (150/10) [10.40.32.12]
2001:db8:10:30::/117 dev vlan48 [ospf1 2015-06-14] * I (150/10) [10.40.32.11]
2001:db8:10:6::a/128 via fe80::aff:fe28:2101 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.10]
2001:db8:10:6::c/128 via fe80::aff:fe28:2103 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.12]
2001:db8:40::/48 via fe80::aff:fe28:2101 on vlan33 [ospf1 21:00:55] * E2 (150/20/10000) [10.40.32.10]
2001:db8:40:d910::/120 via fe80::aff:fe28:2101 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.10]
You can see that the route to the neighbor AS is present, but it's tagged as an 'E2' route in OSPF, instead of the usual 'I', meaning it was imported from a different routing protocol on the router that originates this prefix, 10.40.32.10.
While using OSPF to transport the routes to the other internal routers might work in our little example network in this tutorial, it introduces a number of limitations, one of them being that all extra BGP specific information attached to a route is lost when converting it from a BGP to an OSPF route. This limits the amount of control that can be exercised on the selection of the exit point for traffic from a network to external networks. Another reason to refrain from doing this is that the full BGP table of the Internet contains more than half a million network prefixes. So if you would run a router in a location where you have all those routes in a BGP table, redistributing them to OSPF, pretending that the entire Internet is part of your local network, will probably blow up your OSPF process. It's not designed to handle that. ;-)
It might have occurred to you that the iBGP BIRD configuration specifies the local and remote address using loopback addresses instead of interface addresses from an actual connected subnet. Think back to the "The loopback address" section of the OSPF tutorial! The BGP router on the edge of the network, and the internal router which wants to learn about external connectivity using iBGP, can be anywhere in the internal network. There may even exist multiple possible paths between them. By using a loopback address as source and target of the iBGP connection, the connection will keep functioning as long as there is any possible path between the two routers. The flow of traffic to the external network will follow the same directions as the iBGP control connection, since both of them use the IGP to reach each other.
- Well, this one is obvious... Practice some more by finishing setting up all connectivity by configuring the iBGP sessions for IPv4 and IPv6 between R0 and R3, between R10 and R11, and between R10 and R12. Confirm by tracerouting from H34 and H19 in AS65033 to H6 and H7 in AS64080.
- If there's any part of this first BGP tutorial that you do not understand yet, make sure you will. The following tutorials will be building upon the knowledge gathered here. Don't get depressed if you don't get all of it the first time. Just go back to the top and read the page again; there's an awful lot of information compacted in this page. If you're brave, make up your own example network and try to build it from scratch. It will take some time, but as soon as you are able to traceroute from one far end to another, you've likely run into and solved all aspects you missed before.
- Look around on the internet and read other blogs and tutorials about OSPF and BGP, and see if they're much easier to understand now that you have the frame of reference set by following this tutorial.
In the next tutorial, BGP Part II, I'll show more interesting topologies of different networks connecting together using BGP than just two networks with one eBGP connection. By doing so, we'll quickly discover and understand how the actual huge Internet is organized.