BGP: The Internet's Routing Protocol


Border Gateway Protocol (BGP) is the standardized routing protocol that holds the internet's independent networks together. Its primary function is to exchange network reachability information between Autonomous Systems (ASes). This information dictates which path data packets will take to travel from a source network to a destination network.

BGP is a path-vector protocol. This means that when it advertises a route to an IP prefix, it includes the full path of ASNs the route has traversed. This path information, called the AS_PATH, is fundamental for loop prevention and is a key tool for enforcing routing policies.
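The loop-prevention role of the AS_PATH can be illustrated with a minimal sketch. The function name and the ASNs (taken from the range reserved for documentation) are hypothetical placeholders, not part of any real BGP implementation.

```python
def accepts_route(local_asn: int, as_path: list[int]) -> bool:
    """Reject a route whose AS_PATH already contains our own ASN.

    Seeing our own ASN means the advertisement has already passed
    through our network once, so accepting it would create a loop.
    """
    return local_asn not in as_path


# Placeholder ASNs from the documentation range (64496-64511).
path = [64500, 64501, 64502]

print(accepts_route(64510, path))  # AS64510 is not in the path: accepted
print(accepts_route(64501, path))  # AS64501 is already in the path: rejected
```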


How BGP Sessions Are Established

Before two routers can exchange routes, they must first establish a BGP session, also known as a peering relationship. This process is a structured dialogue that occurs over TCP port 179.

The session progresses through a series of states (Idle, Connect, Active, OpenSent, OpenConfirm, and Established) and uses four message types:

  • OPEN: The first message sent by each router to initiate a connection. It contains the sender's ASN, a hold timer, and other optional parameters. If the receiving router agrees with the parameters, the session can proceed.

  • KEEPALIVE: Once the session is established, routers send periodic KEEPALIVE messages (typically every 60 seconds) to confirm that the connection is still active. If a router receives nothing from its peer before the hold timer negotiated in the OPEN messages expires (commonly three times the keepalive interval), it assumes the connection has failed and terminates the session.

  • UPDATE: This is the core message of BGP. It is used to advertise new routes, withdraw previously advertised routes, or update the attributes of existing routes. The routing information exchanged between peers is contained within these messages.

  • NOTIFICATION: If a router detects an error, it sends a NOTIFICATION message detailing the problem before immediately closing the BGP session.

Once this initial handshake is complete and the routers reach an "Established" state, they are considered BGP peers and can begin exchanging routes via UPDATE messages.
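To make the message types above concrete, here is a rough sketch of how the fixed BGP message header and a bare OPEN message are laid out on the wire, following the field order in RFC 4271. The helper names, the ASN, and the router ID are illustrative assumptions, and the sketch leaves out optional parameters such as capabilities.

```python
import ipaddress
import struct

MARKER = b"\xff" * 16  # every BGP message starts with 16 bytes of all ones
TYPE_OPEN, TYPE_UPDATE, TYPE_NOTIFICATION, TYPE_KEEPALIVE = 1, 2, 3, 4


def build_message(msg_type: int, body: bytes = b"") -> bytes:
    """Prefix a body with the 19-byte header: marker, length, type."""
    length = 19 + len(body)
    return MARKER + struct.pack("!HB", length, msg_type) + body


def build_open(asn: int, hold_time: int, bgp_id: str) -> bytes:
    """Build an OPEN with no optional parameters (2-byte ASN only)."""
    body = struct.pack(
        "!BHH4sB",
        4,                                      # BGP version
        asn,                                    # sender's ASN
        hold_time,                              # proposed hold time in seconds
        ipaddress.IPv4Address(bgp_id).packed,   # BGP identifier (router ID)
        0,                                      # optional parameters length
    )
    return build_message(TYPE_OPEN, body)


def parse_header(data: bytes) -> tuple[int, int]:
    """Return (length, type) from the first 19 bytes of a message."""
    if data[:16] != MARKER:
        raise ValueError("invalid BGP marker")
    length, msg_type = struct.unpack("!HB", data[16:19])
    return length, msg_type


# Hypothetical values: a documentation ASN and a documentation address.
open_msg = build_open(asn=64500, hold_time=180, bgp_id="192.0.2.1")
keepalive = build_message(TYPE_KEEPALIVE)
print(parse_header(open_msg), parse_header(keepalive))
```

A KEEPALIVE is nothing but the header, which is why it is the smallest message BGP can send.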


External vs. Internal BGP (eBGP vs. iBGP)

BGP operates in two distinct modes, depending on where the peering session is established.

External BGP (eBGP)

  • Definition: eBGP is used for peering sessions between routers in different Autonomous Systems.

  • Purpose: This is the primary function of BGP—to interconnect different ISPs, hyperscalers, and other networks to form the global internet. When WorldLink connects to Tata Communications, they use an eBGP session.

  • Mechanism: When a router advertises a route to an eBGP peer, it prepends its own ASN to the AS_PATH.

Internal BGP (iBGP)

  • Definition: iBGP is used for peering sessions between routers within the same Autonomous System.

  • Purpose: When a route is learned from an external network via eBGP, iBGP is used to distribute that routing information to all other routers inside the local AS. This ensures all routers within the same network have a consistent view of external routes and can make uniform forwarding decisions.

  • Mechanism: The AS_PATH and other attributes are not modified when routes are propagated between iBGP peers.
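The difference between the two mechanisms can be sketched in a few lines. The Route class and advertise function are hypothetical names for illustration, and the ASNs and prefix come from the ranges reserved for documentation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple[int, ...]


def advertise(route: Route, local_asn: int, peer_asn: int) -> Route:
    """Return the route as it would be sent to a peer."""
    if peer_asn != local_asn:
        # eBGP: the peer is in another AS, so prepend our own ASN.
        return replace(route, as_path=(local_asn,) + route.as_path)
    # iBGP: the peer is in our AS, so the AS_PATH is passed on unchanged.
    return route


learned = Route(prefix="203.0.113.0/24", as_path=(64501,))

print(advertise(learned, local_asn=64500, peer_asn=64500).as_path)  # iBGP
print(advertise(learned, local_asn=64500, peer_asn=64502).as_path)  # eBGP
```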


Visualization

Traceroute

We can see the hops between Autonomous Systems by running traceroute with the -A flag, which looks up the AS number of each hop:

$ traceroute -A youtube.com
traceroute to youtube.com (2404:6800:4002:80b::200e), 30 hops max, 80 byte packets
 1  2400-1A00-B060.ip6.wlink.com.np (2400:1a00:b060:7c92::1) [AS17501]  7.884 ms  7.827 ms  7.803 ms
 2  2400-1a00-b1a6.ip6.wlink.com.np (2400:1a00:b1a6::1) [AS17501]  12.391 ms  12.368 ms  12.346 ms
 3  2400:1a00:0:1::234 (2400:1a00:0:1::234) [AS17501]  12.229 ms  12.209 ms  12.188 ms
 4  2400:1a00:0:40::50 (2400:1a00:0:40::50) [AS17501]  12.149 ms  12.128 ms  12.407 ms
 5  2400:1a00:0:42::121 (2400:1a00:0:42::121) [AS17501]  12.107 ms * *
 6  2400:1a00:0:41::170 (2400:1a00:0:41::170) [AS17501]  12.362 ms  19.897 ms  20.197 ms
 7  2400:1a00:0:41::128 (2400:1a00:0:41::128) [AS17501]  7.472 ms  8.792 ms  9.459 ms
 8  2400:1a00:dccc:1:72:9:128:67 (2400:1a00:dccc:1:72:9:128:67) [AS17501]  28.865 ms  28.836 ms  28.810 ms
 9  2404:d180:1a::15 (2404:d180:1a::15) [AS133372]  38.504 ms  38.479 ms  38.454 ms
10  2001:4860:1:1::2b5a (2001:4860:1:1::2b5a) [AS15169]  28.655 ms  28.630 ms  28.601 ms
11  2001:4860:0:1::78ab (2001:4860:0:1::78ab) [AS15169]  27.431 ms  27.721 ms  23.665 ms
12  2001:4860:0:1::340b (2001:4860:0:1::340b) [AS15169]  24.926 ms 2001:4860:0:1::2b51 (2001:4860:0:1::2b51) [AS15169]  24.868 ms 2001:4860:0:1::340b (2001:4860:0:1::340b) [AS15169]  24.958 ms
13  tzdela-bb-in-x0e.1e100.net (2404:6800:4002:80b::200e) [AS15169]  24.836 ms  24.850 ms  24.820 ms

The output above shows the packets crossing three Autonomous Systems:

  • If consecutive hops stay in the same ASN → the packet is still inside one network, which forwards it using its internal routing and the external routes it distributes over iBGP.

  • If the ASN changes between hops → the packet has crossed an eBGP boundary between two networks.

    • Hops 1–8 → inside WorldLink (AS17501).

    • Hop 9 → first eBGP handoff (AS17501 → AS133372) (interaptusltd.com, Hong Kong).

    • Hops 10–13 → second eBGP handoff (AS133372 → AS15169), then inside Google (AS15169) until the YouTube server.
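Spotting the AS changes by eye works for one trace, but it can also be automated. The sketch below is a rough example that assumes the `[AS…]` annotation format seen in the output above and only reads the first AS listed on each line. The function name is made up for illustration, and the sample lines are shortened copies of hops 8 to 11.

```python
import re

# Hop number at the start of the line, then the first "[AS<number>]" tag.
HOP_RE = re.compile(r"\s*(\d+)\s.*?\[AS(\d+)\]")


def as_transitions(output: str) -> list[tuple[int, int, int]]:
    """Return (hop, previous ASN, new ASN) for every hop where the AS changes."""
    transitions = []
    previous_asn = None
    for line in output.splitlines():
        match = HOP_RE.match(line)
        if match is None:
            continue  # header line, or a hop without an AS annotation
        hop, asn = int(match.group(1)), int(match.group(2))
        if previous_asn is not None and asn != previous_asn:
            transitions.append((hop, previous_asn, asn))
        previous_asn = asn
    return transitions


SAMPLE = "\n".join([
    " 8  2400:1a00:dccc:1:72:9:128:67 (2400:1a00:dccc:1:72:9:128:67) [AS17501]  28.865 ms",
    " 9  2404:d180:1a::15 (2404:d180:1a::15) [AS133372]  38.504 ms",
    "10  2001:4860:1:1::2b5a (2001:4860:1:1::2b5a) [AS15169]  28.655 ms",
    "11  2001:4860:0:1::78ab (2001:4860:0:1::78ab) [AS15169]  27.431 ms",
])

for hop, old, new in as_transitions(SAMPLE):
    print(f"hop {hop}: AS{old} -> AS{new}")
```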

BGP visualizer

My new BGP visualizer, available at https://bgp.buddhag.com.np/, was designed to show how IP prefixes are shared between Autonomous Systems and how the shortest path is chosen to build a Routing Information Base (RIB).

An unintentional bug in the code perfectly illustrates one of BGP's most critical flaws. The visualizer allows different, unrelated Autonomous Systems to advertise the same IP prefix, and the RIB still accepts the route with the shortest path, regardless of who the true owner is.

This behavior, while accidental in my tool, mirrors the reality of the BGP protocol. It operates on a principle of trust, which makes it highly vulnerable to malicious attacks and accidental misconfigurations. It's the very reason BGP is often referred to as the "duct tape of the internet," a fragile yet essential system holding global routing together.
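The behavior described above can be reproduced with a deliberately naive RIB. This is a sketch of the idea, not the visualizer's actual code: it compares only AS_PATH length (real BGP implementations check attributes such as local preference first), and the prefix and ASNs are documentation placeholders.

```python
def receive(rib: dict[str, tuple[int, ...]], prefix: str, as_path: tuple[int, ...]) -> None:
    """Install the route if the prefix is new or the AS_PATH is shorter.

    Nothing here checks whether the origin AS (the last ASN in the path)
    is actually entitled to announce the prefix.
    """
    current = rib.get(prefix)
    if current is None or len(as_path) < len(current):
        rib[prefix] = as_path


rib: dict[str, tuple[int, ...]] = {}

# Legitimate announcement: AS64496 originates the prefix, seen three hops away.
receive(rib, "203.0.113.0/24", (64501, 64502, 64496))

# An unrelated AS announces the same prefix over a shorter path.
receive(rib, "203.0.113.0/24", (64510,))

print(rib["203.0.113.0/24"])  # the shorter, unauthorized path has replaced the original
```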


Notable BGP Outages: Real-World Case Studies

The Facebook Disappearance (October 2021)

This outage was a catastrophic, self-inflicted wound that made one of the world's largest networks vanish from the internet for nearly six hours.

  • The Cause: A faulty command was issued during routine maintenance on Facebook's global backbone network. This command was intended to assess network capacity but instead inadvertently triggered a system designed to take all of their connections offline.

  • The BGP Mechanism: The command resulted in a complete BGP route withdrawal. Facebook's routers stopped advertising the paths to their IP prefixes. To the rest of the world, their ASN (AS32934) simply disappeared, and with it, the routes to Facebook, Instagram, WhatsApp, and their DNS servers became unreachable.

  • The Impact: The outage was so profound that it also knocked Facebook's internal systems offline, preventing engineers from remotely fixing the problem. It even locked employees out of buildings and data centers because the physical access systems were connected to the same network. The issue had to be resolved by physically sending teams to data centers to manually reset the routers.
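A minimal sketch of what a withdrawal means for a remote network: once the last route covering an address is removed, there is simply nowhere to send the packet. The PeerTable class is hypothetical, and the prefix and address are documentation placeholders rather than Facebook's real address space.

```python
import ipaddress


class PeerTable:
    """A remote router's view of the routes it has learned."""

    def __init__(self) -> None:
        self.routes: dict = {}

    def announce(self, prefix: str, as_path: tuple[int, ...]) -> None:
        self.routes[ipaddress.ip_network(prefix)] = as_path

    def withdraw(self, prefix: str) -> None:
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def reachable(self, address: str) -> bool:
        addr = ipaddress.ip_address(address)
        return any(addr in network for network in self.routes)


table = PeerTable()
table.announce("192.0.2.0/24", (64500, 32934))  # placeholder prefix, origin AS32934
print(table.reachable("192.0.2.53"))  # a route exists, so the address is reachable

table.withdraw("192.0.2.0/24")
print(table.reachable("192.0.2.53"))  # no route is left, so packets have nowhere to go
```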

The Accidental YouTube Hijack (February 2008)

This is the textbook example of an unintentional BGP hijack with global consequences.

  • The Cause: The Pakistani government ordered the country's main ISP, Pakistan Telecom (AS17557), to block access to YouTube nationwide.

  • The BGP Mechanism:

    • YouTube legitimately owned a prefix like 208.65.152.0/22 (a block of IPs).

    • To block access, Pakistan Telecom engineers created a more specific route (e.g., 208.65.153.0/24) and directed it to a “black hole” (discarding traffic).

    • By mistake, this bogus route was advertised upstream to their provider, and then leaked into the global BGP system.

    • Because routers always forward traffic using the longest (most specific) matching prefix, networks worldwide preferred the bogus /24 over the legitimate /22, regardless of AS_PATH length.

  • The Impact: For about two hours, a significant portion of the world's internet traffic destined for YouTube was redirected to Pakistan Telecom's network, where it was discarded. This effectively took YouTube offline for a majority of global users until the faulty route was filtered.
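The longest-prefix-match behavior behind this hijack can be sketched with Python's standard ipaddress module. The two prefixes are the ones from the incident described above. The lookup function and the text labels are illustrative, not a real router's forwarding code.

```python
import ipaddress


def lookup(table: dict, destination: str):
    """Return the entry for the most specific prefix containing the address."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, target) for net, target in table.items() if addr in net]
    if not matches:
        return None
    # The longest prefix length wins, whatever the AS_PATH looks like.
    return max(matches, key=lambda item: item[0].prefixlen)[1]


table = {
    ipaddress.ip_network("208.65.152.0/22"): "YouTube",
    ipaddress.ip_network("208.65.153.0/24"): "Pakistan Telecom (AS17557)",
}

print(lookup(table, "208.65.153.10"))  # covered by both prefixes, so the /24 wins
print(lookup(table, "208.65.152.10"))  # covered only by the /22
```

Only addresses inside the more specific /24 are captured. The rest of the /22 still follows the legitimate route.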

The Rogers Canada National Outage (July 2022)

This incident demonstrated the fragility of a modern nation's infrastructure when a major ISP fails.

  • The Cause: A faulty maintenance update was pushed to the core network of Rogers Communications (AS812), one of Canada's largest telecommunications providers.

  • The BGP Mechanism: The flawed update caused a cascade failure in their routers, which led to a complete BGP route withdrawal. Rogers' network disappeared from the global internet, similar to the Facebook outage but with a different internal cause.

  • The Impact: The outage lasted more than a day for many customers and had a devastating effect on the country. It didn't just cut off internet and mobile services for millions; it also took down the Interac debit payment network, crippled 911 emergency services, and disrupted businesses and government services nationwide. The event highlighted the critical dependency of national infrastructure on the stability of a single network's BGP presence.

  • Detailed video: How One Mistake Broke Canada’s Internet For an Entire Day

The 2018 Google & China Telecom Rerouting (Traffic Interception)

This event highlighted how BGP incidents can have serious security and data privacy implications.

  • The Cause: A peering misconfiguration by Nigeria’s MainOne (AS37282) accidentally leaked Google-learned routes to China Telecom. This traffic was then further propagated, causing global rerouting of some Google-bound traffic through networks in China and Russia for about 74 minutes.

  • The BGP Mechanism: China Telecom, in turn, announced these routes to the global internet. Because of BGP path selection rules, this made China Telecom appear to be the best and most direct path to reach major Google services (including Google Search and Google Cloud).

  • The Impact: For over an hour, traffic from networks across the world destined for Google was rerouted through China's state-owned network backbone before eventually reaching its destination. While the cause was likely accidental, the incident proved that BGP weaknesses could be exploited to redirect sensitive international data through a specific country, raising significant concerns about potential surveillance and traffic interception.

Russian Hijacking of Ukrainian IP Space (Geopolitical Weapon) 🇺🇦

This example shows BGP being used as a deliberate tool in modern conflict.

  • The Cause: Following the 2022 invasion of Ukraine, Russian network operators began a systematic and malicious campaign to take over Ukrainian IP address blocks.

  • The BGP Mechanism: Russian providers, including the state-owned Rostelecom, began making BGP announcements for IP prefixes belonging to Ukrainian networks, particularly in occupied territories. This is a direct, malicious BGP hijack.

  • The Impact: Internet traffic for users in those regions was forcibly rerouted through Russian infrastructure. This allowed Russian authorities to apply their own censorship, surveillance, and filtering, effectively cutting off those users from the global internet and placing them behind Russia's "digital iron curtain." This demonstrates BGP's modern use as a tool of information control and cyber warfare.


Conclusion: A Resilient but Fragile System

The Border Gateway Protocol is the silent, tireless engine of the internet. It operates on a simple foundation of trust, piecing together thousands of independent networks into a single, global communications fabric. As we've seen, this trust is both BGP's greatest strength and its most critical vulnerability. The massive outages caused by a single misconfigured router or a malicious hijack demonstrate that the internet's stability relies on the careful, collective stewardship of all its network operators.

To strengthen this fragile trust, the internet community is adopting Resource Public Key Infrastructure (RPKI) — a cryptographic system that allows network operators to verify whether a particular Autonomous System (AS) is authorized to announce a given IP prefix. With RPKI in place, many hijacks (accidental or malicious) can be automatically rejected at the routing level, making the global internet more resilient. You can check if your ISP is implementing this at https://isbgpsafeyet.com/.
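The origin check that RPKI makes possible can be sketched as follows, using the valid / invalid / not-found outcomes defined for route origin validation in RFC 6811. The Roa type and validate function are hypothetical names. The ROA itself is invented for illustration, and AS36561 as YouTube's origin AS is an assumption not stated in this article.

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    prefix: str      # address block the authorization covers
    max_length: int  # longest prefix length the AS may announce
    asn: int         # AS authorized to originate the prefix


def validate(roas: list[Roa], prefix: str, origin_asn: int) -> str:
    """Classify an announcement as 'valid', 'invalid', or 'not-found'."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue  # this ROA says nothing about the announced prefix
        covered = True
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered by a ROA but matched by none: wrong origin or too specific.
    return "invalid" if covered else "not-found"


# Hypothetical ROA for the prefix from the 2008 incident.
roas = [Roa(prefix="208.65.152.0/22", max_length=22, asn=36561)]

print(validate(roas, "208.65.152.0/22", 36561))  # matches the ROA
print(validate(roas, "208.65.153.0/24", 17557))  # covered, but wrong origin and too specific
print(validate(roas, "198.51.100.0/24", 64500))  # no ROA covers this prefix
```

A router that drops "invalid" routes would have ignored the more specific /24 from the 2008 hijack, provided such a ROA had existed.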

🔗 Learn more about RPKI here: https://rpki.readthedocs.io

Series Wrap-Up

Across this series, we have journeyed from the inside out. We began with a single packet leaving your computer, learning the local language of ARP to find its way to your router. We then zoomed out to see the global map of the internet—a world of Autonomous Systems, connected by the commercial highways of IP Transit and the handshake agreements of Peering, all meeting at crucial hubs called IXPs. Finally, we explored BGP, the master protocol that navigates this complex world.

The internet is not a cloud; it is a human achievement. It's a physical and logical system of breathtaking scale, built on layers of ingenuity. Hopefully, you now see the intricate dance that happens with every click, a global cooperation that makes our connected world possible.

The Internet

Part 1 of 3

A three-part series exploring how the internet works — from packet flow in local networks to ISP operations and the global routing system (BGP). Real-world cases reveal both the resilience and fragility of the internet.
