routes of the internet

2020-08-12

This post is a draft.


The Internet is made up of a number of independent Autonomous Systems (ASes), operated by various organizations. Such an organization can be an ISP, but also a company or a university. Intradomain routing, that is, routing inside an AS, follows that AS's own rules and protocols. Interdomain routing, that is, routing between independent ASes, requires everyone to use the same protocol. That protocol is the Border Gateway Protocol (BGP), an exterior gateway protocol.

While routing inside an AS depends on that AS only, routing between ASes involves politics.

Consider an example such as the one below, with five ASes: A, B, and C, all geographically close to each other, and K and Z, a bit distant from A, B, and C, but close to each other.

In an ideal world, we would have these five ASes all connected to each other.

But B, for several reasons (including, but not limited to, security, politics, economics), might not want to carry traffic from AS A to AS C (and vice versa), or from K to C, or from Z to B, let alone communicate with A at all. The only AS that B is willing to exchange traffic with is C.

Through BGP, paths and policies such as the ones just described can be established. Suppose the realistic situation is the following.

The most common reasons for establishing BGP paths are economics (“Use Cogent for traffic to Germany instead of Telia because it is cheaper”) or performance (“Switch to Telia for traffic to Germany because Cogent is congested”).

Suppose, as described elsewhere, that the connections in the example above are of the peering type. The various ASes put peering agreements in place. Nothing has changed, but we add an element. Hosts in the various ASes (1 to 5) can communicate with each other following the paths in blue. The green lines are peering agreements at the AS level. The blue lines are connection paths, which are established through routing policies.

A routing policy decides what traffic can flow over existing links between ASes. An ISP A can pay a provider ISP C to deliver its packets to, and receive packets from, any destination on the Internet. This is transit service, and it works much like your agreement with O2 for Internet access at home. It follows this principle:

ISP C advertises routes to all destinations on the Internet to ISP A (basically shouting “Everything goes through me, C!”) over the link between A and C. At the same time, ISP A should advertise to ISP C only the routes to destinations on its own network. ISP A does not want to handle traffic for other ISPs.
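The advertisement rule above can be sketched in a few lines of Python. This is only an illustration of the principle, not how a BGP implementation is structured: the `Route` type, the two export functions, and the prefixes (taken from the documentation address ranges) are all made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str  # destination prefix (hypothetical documentation ranges)
    origin: str  # AS that owns the prefix


def export_to_customer(known_routes):
    """A transit provider advertises every route it knows to a paying customer."""
    return set(known_routes)


def export_to_provider(known_routes, own_as):
    """A customer advertises only the prefixes that sit on its own network."""
    return {route for route in known_routes if route.origin == own_as}


# Routes known to provider C (made-up values).
c_routes = {Route("192.0.2.0/24", "B"), Route("198.51.100.0/24", "K")}

# Routes known to customer A: its own prefix plus one learned from a peer.
a_routes = {Route("203.0.113.0/24", "A"), Route("198.51.100.0/24", "K")}

a_learns = export_to_customer(c_routes)              # everything C knows
c_learns = export_to_provider(a_routes, own_as="A")  # only A's own prefix

print(sorted(route.prefix for route in a_learns))
print(sorted(route.prefix for route in c_learns))
```

Because A filters on `origin`, the route to K that it learned elsewhere never reaches C, so C will not send K-bound traffic through A.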

We can see a simpler example of transit service below.

Recall that AS B refuses to peer with AS A. But AS C peers with B. AS A pays AS C to act as a transit provider through a transit agreement (magenta line). Now host 1 can communicate with host 2 through AS C, which acts as intermediary.

Connections between ASes (peering, transit) are often made over a link at an IXP (Internet eXchange Point). Routing advertisements travel in the opposite direction to the packets that later follow them. AS C advertises B as a routing destination to A, who pays for transit, and to Z, who gets it for free in exchange for announcing K as a destination to C. Interestingly, transit agreements and the resulting packet flows are not necessarily bidirectional.
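A minimal sketch of why advertisements and packets move in opposite directions, using the A, C, B example. It relies on one real BGP behaviour, namely that each AS prepends itself to the AS path before passing an advertisement on; the `advertise` helper and the variable names are hypothetical.

```python
def advertise(as_path, via):
    """Return the AS path as re-advertised by `via`, which prepends itself."""
    return [via] + as_path


# B announces its own network to C with an AS path of just [B].
path_seen_by_c = ["B"]

# C passes the advertisement on to its transit customer A.
path_seen_by_a = advertise(path_seen_by_c, via="C")  # ["C", "B"]

# The advertisement travelled B -> C -> A ...
advert_direction = ["B", "C", "A"]

# ... while packets from A follow the learned AS path: A -> C -> B.
traffic_direction = ["A"] + path_seen_by_a

print(" -> ".join(advert_direction))
print(" -> ".join(traffic_direction))
```

The second list is the first one reversed: A can only send packets toward B because B's advertisement reached A first.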

Try it out. To maximize the chances of seeing this asymmetry, let’s pick a provider that is far away from me here in Stuttgart. Joe’s Datacenter is a known low-end provider for servers. They are in Kansas City, MO. Providers (or better, data centers) often offer a Looking Glass, which is a small website that lets visitors run ping, traceroute, and other network utilities against a given IP address. Joe’s Datacenter’s looking glass is available here.

Input your IP address and run a traceroute against it. I cannot influence the options of their traceroute, so their output for my IP address is:

traceroute to MY_IP_HERE, 30 hops max, 60 byte packets
    1 * * *
    2 EDGE2.joesdatacenter.com (10.0.1.77) 0.248 ms 0.342 ms 0.417 ms
    3 38.140.136.233 (38.140.136.233) 0.965 ms 1.048 ms 1.140 ms
    4 te0-0-1-0.agr12.mci01.atlas.cogentco.com (154.24.21.89) 1.135 ms 1.200 ms 
    te0-0-1-0.agr11.mci01.atlas.cogentco.com (154.24.21.85) 0.904 ms
    5 te0-1-1-17.ccr21.mci01.atlas.cogentco.com (154.54.5.229) 1.100 ms 
    6 te0-6-0-6-0.ccr21.mci01.atlas.cogentco.com (154.54.1.181) 1.009 ms 
    7 te0-1-0-12.ccr22.mci01.atlas.cogentco.com (154.54.5.225) 0.589 ms
    8 be2831.ccr41.ord01.atlas.cogentco.com (154.54.42.166) 12.831 ms 
    9 be2832.ccr42.ord01.atlas.cogentco.com (154.54.44.170) 12.652 ms 12.567 ms
    10 be2718.ccr22.cle04.atlas.cogentco.com (154.54.7.130) 20.085 ms 
    11 be2717.ccr21.cle04.atlas.cogentco.com (154.54.6.222) 20.144 ms 20.753 ms
    12 be2889.ccr41.jfk02.atlas.cogentco.com (154.54.47.50) 32.541 ms 32.628 ms 31.998 ms
    13 be3362.ccr31.jfk04.atlas.cogentco.com (154.54.3.10) 31.827 ms 
    14 be3363.ccr31.jfk04.atlas.cogentco.com (154.54.3.126) 31.458 ms 31.430 ms
    15 62.157.249.201 (62.157.249.201) 35.211 ms 34.735 ms 34.727 ms
    16 87.137.238.65 (87.137.238.65) 121.765 ms 121.221 ms 121.135 ms
    17 MY_HOST_HERE.dip0.t-ipconnect.de (MY_IP_HERE) 125.130 ms !X 125.412 ms !X 125.413 ms !X

Traffic starts from Joe’s Datacenter at hops 1 and 2 (hop 1 is not shown here, but they provide the test IP 208.94.245.2) and then moves to the router at hop 3, which belongs to AS174/PSINet (PSINet is now part of Cogent). From there, traffic flows through various Cogent routers up to hop 14. Hop 15 is the first router that belongs to AS3320/Deutsche Telekom. At hop 17, traffic reaches my flat.

Let’s do the opposite. From my computer, I traceroute Joe’s Datacenter test IP.

traceroute -a -q 1 -I -m 20 -w 10 208.94.245.2
    traceroute to 208.94.245.2 (208.94.245.2), 20 hops max, 72 byte packets
    1 [AS0] 192.168.178.1 (192.168.178.1) 1.632 ms
    2 [AS3320] p3e9bf570.dip0.t-ipconnect.de (62.155.245.112) 13.399 ms
    3 [AS3320] pd900c966.dip0.t-ipconnect.de (217.0.201.102) 482.923 ms
    4 [AS3320] 80.157.204.62 (80.157.204.62) 8.878 ms
    5 [AS3257] ae8.cr2-kan1.ip4.gtt.net (89.149.128.49) 117.238 ms
    6 [AS25973] ip4.gtt.net (69.174.12.26) 144.582 ms
    7 [AS0] 10.0.1.137 (10.0.1.137) 130.237 ms
    8 *
    9 [AS19969] 208.94.245.2 (208.94.245.2) 214.478 ms

The options I used show the AS number. Traffic starts from my LAN (192.168.178.0/24), flows through 3 hops belonging to AS3320/Deutsche Telekom, then at hops 5 and 6 transits through AS3257/GTT and AS25973/PacketExchange, and finally arrives at AS19969/Joe’s Datacenter. AS25973 peers exclusively with AS3257/GTT, so we found a good example of an AS that pays a single ISP to carry its traffic and announce the entire Internet to it.
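To read the AS-level path off such an output without squinting, a small sketch like the following can help. It parses the traceroute lines shown above, skips hops without an answer, and collapses consecutive hops in the same AS. Treating AS0 as "no AS mapping found" is an assumption based on the private addresses in the output; the function name `as_path` is made up.

```python
import re

# The traceroute output from above, pasted verbatim.
TRACE = """\
1 [AS0] 192.168.178.1 (192.168.178.1) 1.632 ms
2 [AS3320] p3e9bf570.dip0.t-ipconnect.de (62.155.245.112) 13.399 ms
3 [AS3320] pd900c966.dip0.t-ipconnect.de (217.0.201.102) 482.923 ms
4 [AS3320] 80.157.204.62 (80.157.204.62) 8.878 ms
5 [AS3257] ae8.cr2-kan1.ip4.gtt.net (89.149.128.49) 117.238 ms
6 [AS25973] ip4.gtt.net (69.174.12.26) 144.582 ms
7 [AS0] 10.0.1.137 (10.0.1.137) 130.237 ms
8 *
9 [AS19969] 208.94.245.2 (208.94.245.2) 214.478 ms
"""


def as_path(trace_output):
    """Return the list of distinct consecutive AS numbers along the trace."""
    path = []
    for line in trace_output.splitlines():
        match = re.search(r"\[AS(\d+)\]", line)
        if match is None:
            continue  # hop did not answer, e.g. "8 *"
        asn = int(match.group(1))
        if asn == 0:
            continue  # assumption: AS0 means no mapping (private address)
        if not path or path[-1] != asn:
            path.append(asn)
    return path


print(as_path(TRACE))  # [3320, 3257, 25973, 19969]
```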

Ingress and egress traffic between AS3320 and AS19969 therefore take different paths.

Something similar goes for peering, but between more than two ASes. To implement peering, two ASes send routing advertisements to each other for the addresses that reside in their networks. This makes it possible, in our example, for host 1 in AS A to contact host 4 in AS K and vice versa, and for host 4 in AS K to contact host 5 in AS Z and vice versa. What happens if host 1 attempts to connect to host 5? That is, a transit A-K-Z? Even if physical links exist between the three ASes, traffic between A and Z only works if K announces routes for A to Z, and routes for Z to A.
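The A-K-Z question can be played through with a toy model. Each announcement is a tuple of (who speaks, which destination AS is announced, to whom); this representation and the `learned_routes` helper are invented for the sketch and ignore everything else BGP does.

```python
def learned_routes(announcements, receiver):
    """Return the destination ASes that have been announced to `receiver`."""
    return {dest for (_speaker, dest, to) in announcements if to == receiver}


# Plain peering: every AS announces only its own addresses to its peer.
peering = [
    ("A", "A", "K"), ("K", "K", "A"),  # A <-> K
    ("K", "K", "Z"), ("Z", "Z", "K"),  # K <-> Z
]

print(learned_routes(peering, "A"))  # {'K'}: A has no route to Z
print(learned_routes(peering, "Z"))  # {'K'}: Z has no route to A

# K decides to carry traffic between A and Z and announces each to the other.
with_transit = peering + [("K", "Z", "A"), ("K", "A", "Z")]

print(sorted(learned_routes(with_transit, "A")))  # ['K', 'Z']
print(sorted(learned_routes(with_transit, "Z")))  # ['A', 'K']
```

Both extra announcements are needed: with only one of them, packets would find their way in one direction but the replies would have no route back.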

