Appalachia Technologies Blog
The Great Facebook Outage of October 2021
The Netflix docudrama The Social Dilemma describes Facebook as “The problem beneath all other problems.” As this is a security/technical blog, we are not in the business of bashing or praising social media, but it goes without saying that Facebook has become omnipresent in the daily lives of literally billions of people. The same is true of Instagram and WhatsApp, two other massive social media properties that were also unavailable for about six hours on October 4th. When something that big falls that hard, there are always unforeseen and unintended consequences.
In the case of the Facebook outage, these unforeseen consequences showed up mainly as the unavailability of Single Sign-On (SSO), which prevented users from accessing numerous other Internet services that rely on Facebook for authentication, although I have not seen any extensive analysis of how widespread that “collateral damage” was. The unavailability of WhatsApp had a major impact on people in the Eastern Hemisphere, where it is used very widely for day-to-day communications.
From a security and technical perspective, the lesson of the Facebook outage for most people is to be careful about relying on a single property like Facebook to log in to other Internet services that you might really need in your daily life.
So how and why did this happen? Facebook has published a technical explanation which is probably a bit too hairy for most people to want to read, so I’ll offer a non-technical analogy instead. Imagine that all of Facebook’s data centers are buildings in a large city, say, Chicago. You have a list of Facebook’s buildings, which would be like URLs. You use a paper map to find your way on Chicago’s roads, which would be like Facebook’s internal company-wide network. Now, imagine that someone took your map, which also happened to have the street addresses of the Facebook buildings in Chicago. You are now unable to find or get to anything Facebook at all. Six hours later someone gave you the map back, and off you went.
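For readers who prefer code to maps, the same analogy can be sketched as a toy model in Python. This is only an illustration, loosely based on Facebook's published account in which the routes to its DNS servers disappeared from the Internet. The domain names, the network labels, and the address blocks (reserved documentation ranges, not Facebook's real ones) are all assumptions made up for the example.

```python
import ipaddress

# Hypothetical "map": address blocks the rest of the Internet knows how to reach.
# These are reserved documentation ranges, not real Facebook prefixes.
ROUTES = {
    ipaddress.ip_network("192.0.2.0/24"): "facebook-edge",
    ipaddress.ip_network("198.51.100.0/24"): "some-other-network",
}

# Hypothetical "street addresses": the DNS server that answers for each name.
NAME_SERVERS = {
    "facebook.example": ipaddress.ip_address("192.0.2.53"),
    "other.example": ipaddress.ip_address("198.51.100.53"),
}


def is_reachable(routes, address):
    """True if at least one known route covers the address."""
    return any(address in network for network in routes)


def can_resolve(routes, name):
    """A name can only be looked up if its DNS server is reachable."""
    server = NAME_SERVERS.get(name)
    return server is not None and is_reachable(routes, server)


def withdraw(routes, owner):
    """Return a copy of the map with every route belonging to owner removed."""
    return {net: who for net, who in routes.items() if who != owner}


before = can_resolve(ROUTES, "facebook.example")
routes_after = withdraw(ROUTES, "facebook-edge")
after = can_resolve(routes_after, "facebook.example")
print(before, after)
```

The servers in this sketch never stop running. Once the route is withdrawn, nobody can find them, which is the "someone took your map" part of the analogy.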
On a more big-picture note, let’s talk about BGP (Border Gateway Protocol), which was the root cause of the Facebook outage. BGP makes the news pretty often – see this very recent article titled “Major BGP leak disrupts thousands of networks globally.” In 2008, Pakistan accidentally knocked YouTube offline through a configuration error. From a “black hat” (bad guy) hacker perspective, BGP makes a superb attack vector because of the massive, extended outages it can cause. While there is no publicly available evidence that the Facebook outage was anything but an error, it still makes my antennas go up. If hostilities were to break out between nation-states with offensive cyber capabilities, BGP would definitely get attacked.
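One reason BGP is attractive to attackers is that routers generally prefer the most specific prefix that matches a destination, so a bogus announcement for a narrower block can pull traffic away from the legitimate origin. The 2008 YouTube incident is widely reported to have worked this way. The Python sketch below illustrates only that selection rule, not a BGP implementation. The prefixes are reserved documentation ranges and the origin names are hypothetical.

```python
import ipaddress


def best_route(table, address):
    """Pick the origin whose prefix is the most specific match for address."""
    addr = ipaddress.ip_address(address)
    candidates = [(net, origin) for net, origin in table if addr in net]
    if not candidates:
        return None  # no route at all: the destination is unreachable
    # Longest prefix wins, regardless of who announced it.
    return max(candidates, key=lambda item: item[0].prefixlen)[1]


# Hypothetical routing table with one legitimate announcement.
legit = [(ipaddress.ip_network("203.0.113.0/24"), "legitimate-origin")]

# The same table after someone announces a narrower slice of that block.
leaked = legit + [(ipaddress.ip_network("203.0.113.0/25"), "bogus-origin")]

print(best_route(legit, "203.0.113.10"))
print(best_route(leaked, "203.0.113.10"))
print(best_route(leaked, "203.0.113.200"))
```

Addresses inside the narrower block follow the bogus announcement, while the rest of the original block still reaches the legitimate origin. That is why such leaks can be both damaging and hard to notice at first.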
Senior Engineer, Cybersecurity Risk and Compliance, Appalachia Technologies
Jason McNew is a CISSP and a CMMC RP (Registered Practitioner). Jason, a United States Air Force veteran, holds a Master’s degree from Penn State University in Information Sciences, Cyber Security and Information Assurance, in addition to a Bachelor of Science and two Associate of Science degrees. Penn State’s Cyber Security program has been reviewed and endorsed by the National Security Agency (NSA) and the Department of Homeland Security (DHS). He also worked for the White House Communications Agency from 2003 until 2015. In 2017 he founded Stronghold Cyber Security, which was acquired by Appalachia Technologies in 2020.