How a Glitched CrowdStrike Update Caused the Blue Screen Friday
Questions loom about CrowdStrike's controls and next steps after losses over $5B; this report examines the events, response, and rival opportunities.
EDGE100 Report, 2024
A faulty update from cybersecurity giant CrowdStrike brought much of the digital world to a halt on July 19, 2024. Within hours, an estimated 8.5 million Windows devices worldwide succumbed to the dreaded “blue screen of death” (BSOD), paralyzing critical industries and sending shockwaves through the global economy.
From government agencies and stock exchanges to hospitals and airlines, the CrowdStrike Falcon platform's widespread adoption turned a routine update into a catastrophic event, leaving IT professionals scrambling to orchestrate what is likely to be a largely hands-on recovery process.
As the dust settles, questions loom about the controls (or lack thereof) at CrowdStrike that led to this slip-up and what’s next for the company as it emerges from corporate losses that could exceed USD 5 billion. This report aims to answer those questions by examining what happened, the timeline of key events, CrowdStrike’s response, and potential opportunities for its rivals.
What exactly happened?
At the core of the “largest IT outage in history” was Falcon, CrowdStrike’s cybersecurity platform. A faulty update issued by the vendor in the early hours of Friday triggered a domino effect of crashing PCs and servers, eventually reaching an estimated 8.5 million Microsoft Windows systems worldwide. While this was less than 1% of all Windows machines in operation, the broader economic and societal impacts were significant, disrupting the operations of major organizations, including those in critical industries such as government agencies, stock exchanges, hospitals, and airlines.
The disruption followed a content configuration update for the Falcon sensor, a lightweight agent that protects and monitors devices using the Falcon platform. This sensor works at the kernel level (the core part of an operating system that manages basic computer functions), allowing it to watch over and safeguard important parts of the system and its processes.
CrowdStrike updates its security systems using two methods: built-in Sensor Content and adaptable Rapid Response Content (RRC). The company explained that the crash was caused by a bug in its content validator, which failed to catch an error in one of two deployed Template Instances. These instances guide the Falcon sensor on threat detection and response. The error led to an out-of-bounds memory read, in which the sensor tried to access memory outside the area it was permitted to read, causing the system to crash.
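To make the failure mode concrete, here is a minimal Python sketch of the kind of check a content validator might perform. It is not CrowdStrike's code: `TemplateInstance`, `field_indexes`, and `validate_instance` are hypothetical names, and the field counts are invented for illustration. The idea is simply that content referencing an input field the sensor does not supply should be rejected before deployment, because a kernel-level component that reads past the end of its input has no safe way to fail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateInstance:
    """Hypothetical stand-in for a piece of detection content."""
    name: str
    field_indexes: tuple[int, ...]  # positions of the input fields this instance reads


def validate_instance(instance: TemplateInstance, fields_supplied: int) -> list[str]:
    """Return a list of problems; an empty list means no out-of-range reads were found."""
    problems = []
    for index in instance.field_indexes:
        if index < 0 or index >= fields_supplied:
            problems.append(
                f"{instance.name}: reads field {index}, "
                f"but only fields 0..{fields_supplied - 1} are supplied"
            )
    return problems


# Illustrative data only: the sensor is assumed to supply 8 input fields.
good = TemplateInstance("instance_a", (0, 3, 7))
bad = TemplateInstance("instance_b", (0, 3, 8))

good_problems = validate_instance(good, fields_supplied=8)
bad_problems = validate_instance(bad, fields_supplied=8)
```

In this sketch `good_problems` is empty, while `bad_problems` flags the read of field 8. A validator that skips such a check would let the second instance through to production.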
Timeline of events
February 28, 2024
○ CrowdStrike releases sensor 7.11 to customers with new InterProcessCommunication (IPC) Template Type to detect novel attack techniques that abuse Named Pipes
March 5, 2024
○ The IPC Template Type passes the stress test and is validated for use
○ The IPC Template Instance is released to production via Channel File 291
April 8–24, 2024
○ Three additional IPC Template Instances are deployed
July 19, 2024
○ 04:09 UTC: Two additional IPC Template Instances are deployed, one of which passes validation despite containing problematic content data
○ 04:09–05:27 UTC: Online Windows devices running sensor version 7.11 and above download the faulty update, resulting in crashes on an estimated 8.5 million devices
○ 05:27 UTC: CrowdStrike addresses the issue by rolling back the problematic update and deploying a corrected version
○ 15:30 UTC: The Cybersecurity and Infrastructure Security Agency (CISA) issues an alert recognizing the extensive IT outage caused by the CrowdStrike update
July 20, 2024
○ 21:59 UTC: CrowdStrike updates its remediation and guidance hub to provide resources to affected customers
July 21, 2024
○ 21:22 UTC: Microsoft releases a custom Windows Preinstallation Environment recovery tool to find and remove the faulty CrowdStrike update
Businesses estimated to lose billions due to disruption
According to a report from risk management software provider Interos, the outage directly impacted nearly 675,000 enterprise customers and indirectly affected more than 49 million customer relationships. For example, UK broadcaster Sky News briefly went off air, major retailers had to resort to cash-only transactions, and customers at major US banks, such as Bank of America and Wells Fargo, reported issues with logins and online transfers, while other banking customers reported declined card transactions.
Interos highlighted that the US was the hardest hit, accounting for 41% of the affected entities, while European countries such as Britain, France, and Spain collectively accounted for nearly a third. It also noted that the Falcon cybersecurity platform is used by nearly half of the largest US cities and 82% of US state governments, as well as by the Department of Defense and intelligence agencies. An analysis by Parametrix found that 25% of Fortune 500 companies experienced disruptions from the CrowdStrike outage, with estimated direct losses of approximately USD 5.4 billion, excluding Microsoft.
[Figure: Impact on Fortune 500 companies by industry]
[Figure: Estimated financial loss on Fortune 500 companies by industry]
The airline industry was hit the hardest, with over 5,100 flight cancellations and 32,000 delays reported on Friday. The repercussions continued over the following days, with further cancellations and delays. Although airlines have restored their systems, the recovery process has been slow. According to FlightAware data, Delta Air Lines was the most severely impacted, canceling more than 6,500 flights between Friday (July 19) and Wednesday (July 24). In addition to the logistical challenges, airlines are facing regulatory scrutiny of their compliance with federal refund requirements for significant flight cancellations or delays. Specifically, airlines are required to issue refunds to the original payment method, typically a credit card, rather than as vouchers. Parametrix has estimated that Fortune 500 companies in this industry incurred direct losses of USD 860 million due to business interruptions.
Insurers are also expected to feel the effects of this event through an increase in claims from businesses and individuals. According to Fitch Ratings, business interruption, contingent business interruption, and cyber insurance are expected to see the most significant claims. Smaller lines such as travel insurance, event cancellation, and technology E&O will also be affected. Preliminary market estimates cited by Fitch put global insured losses in the mid-to-high single-digit billion USD range, a level that may not significantly affect (re)insurers, although Fitch warns of ongoing claims and litigation. Meanwhile, Parametrix expects insured losses to cover only 10%–20% of the total financial loss faced by Fortune 500 companies, which translates to between USD 0.5 billion and USD 1.1 billion.
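The Parametrix range can be reproduced from the figures already quoted in this report. The short Python sketch below is only a back-of-the-envelope check, not Parametrix's methodology; the variable names are ours, and the "uninsured" figures are simply the remainder of the USD 5.4 billion estimate.

```python
total_loss_usd_bn = 5.4               # Parametrix estimate of direct Fortune 500 losses
insured_share_range = (0.10, 0.20)    # share of those losses expected to be insured

# Apply the low and high insured shares to the total loss estimate.
insured_low, insured_high = (total_loss_usd_bn * share for share in insured_share_range)

# Whatever is not insured stays with the affected companies.
uninsured_low = total_loss_usd_bn - insured_high
uninsured_high = total_loss_usd_bn - insured_low

print(f"Insured:   USD {insured_low:.2f}bn to USD {insured_high:.2f}bn")
print(f"Uninsured: USD {uninsured_low:.2f}bn to USD {uninsured_high:.2f}bn")
```

The insured range works out to about USD 0.54 billion to USD 1.08 billion, which rounds to the USD 0.5 billion and USD 1.1 billion cited above. On the same figures, more than USD 4 billion of the estimated loss would go uninsured.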
Recovery involves more than just an over-the-air update
CrowdStrike quickly identified the issue and released a corrected version of the update after approximately 78 minutes. However, applying the fix to already affected devices was not straightforward, because the faulty update sent Windows into a BSOD loop that left the system inoperable during a normal boot. This required IT administrators and users to manually boot their devices into Windows Safe Mode or the Windows Recovery Environment, navigate the system directory, and delete the problematic file, known as Channel File 291.
Given the large number of affected devices, this process is labor-intensive and time-consuming, with total recovery time estimated to span months for certain organizations. Additionally, requiring physical access to the devices, combined with any drive encryption technologies such as BitLocker, further complicates and prolongs recovery efforts.
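The manual step itself is small, which is what makes the scale of the recovery effort notable. The Python sketch below illustrates the logic only. The file pattern and directory are assumptions taken from CrowdStrike's public remediation guidance rather than from this report, the function names are ours, and in practice the deletion was typically done by hand from a Safe Mode or Recovery Environment command prompt, not with a script like this.

```python
from pathlib import Path

# Assumption: pattern and directory follow CrowdStrike's public remediation
# guidance; neither is stated in this report.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"
DEFAULT_DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")


def find_channel_files(driver_dir: Path = DEFAULT_DRIVER_DIR) -> list[Path]:
    """List files in driver_dir that match the Channel File 291 pattern."""
    if not driver_dir.is_dir():
        return []
    return sorted(driver_dir.glob(CHANNEL_FILE_PATTERN))


def remove_channel_files(
    driver_dir: Path = DEFAULT_DRIVER_DIR, dry_run: bool = True
) -> list[Path]:
    """Return the matching files; delete them only when dry_run is False."""
    matches = find_channel_files(driver_dir)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

The default `dry_run=True` only reports what would be deleted. On a BitLocker-encrypted drive, none of this is possible until the volume has been unlocked with its recovery key, which is one reason the process was slow at scale.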
What are the remedial actions CrowdStrike is taking?
The July 19 update raised two common questions: How could a faulty update to critical software go undetected during internal testing? And should updates be deployed instantly to all devices globally?
The problem was not limited to Windows; Linux users had also reported kernel panics and crashes related to the Falcon platform since at least April 2024.
In an incident review posted on July 24, the company blamed a bug in its testing software for failing to detect the problematic update. Measures to prevent a similar occurrence include improving RRC testing with a wider range of test types, adding more validation checks, implementing a staggered deployment strategy for RRC, enhancing sensor and system performance monitoring, and giving customers greater control over when RRC updates are applied. In addition, CrowdStrike aims to release a full Root Cause Analysis report of the event following an investigation.
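A staggered deployment can be pictured as releasing an update to progressively larger groups of devices and stopping as soon as one group shows problems. The Python sketch below illustrates that general idea only; it does not describe CrowdStrike's planned implementation. The ring sizes, host names, `deploy` callback, and 1% failure threshold are all hypothetical.

```python
from typing import Callable, Sequence


def staged_rollout(
    rings: Sequence[Sequence[str]],
    deploy: Callable[[str], bool],
    max_failure_rate: float = 0.01,
) -> dict:
    """Deploy ring by ring; stop before the next ring if too many hosts fail.

    deploy(host) returns True when the host stays healthy after the update.
    """
    updated, failed = [], []
    for ring_number, ring in enumerate(rings):
        ring_failures = [host for host in ring if not deploy(host)]
        updated.extend(ring)
        failed.extend(ring_failures)
        if ring and len(ring_failures) / len(ring) > max_failure_rate:
            return {"halted_at_ring": ring_number, "updated": updated, "failed": failed}
    return {"halted_at_ring": None, "updated": updated, "failed": failed}


# Hypothetical fleet: a small canary ring, then progressively larger rings.
rings = [
    ["canary-1", "canary-2"],
    [f"early-{i}" for i in range(20)],
    [f"broad-{i}" for i in range(200)],
]

# Simulate an update that crashes every host it reaches.
result = staged_rollout(rings, deploy=lambda host: False)
```

With an update that crashes every host, the rollout halts at the first ring, so only the two canary hosts are affected rather than all 222. The trade-off is speed: protective content reaches the last ring later than it would with an instant global push.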
How will this impact the company?
CrowdStrike is likely to see considerable financial and reputational fallout from the content update error. The company’s stock has already declined notably since the incident. On Friday, July 19, the stock fell by just over 11% to USD 304.96 per share, and by Wednesday, July 24, it had fallen to USD 258.14 per share, a loss of nearly 25% compared with its closing price of USD 343.05 per share on July 18, just before the incident.
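The percentage declines follow directly from the three share prices quoted above. The Python sketch below is a simple check of that arithmetic; the function and variable names are ours.

```python
def pct_decline(start: float, end: float) -> float:
    """Percentage drop from start to end (positive means the price fell)."""
    return (start - end) / start * 100


price_jul_18 = 343.05  # closing price before the incident
price_jul_19 = 304.96  # price after Friday's fall
price_jul_24 = 258.14  # price by the following Wednesday

friday_drop = pct_decline(price_jul_18, price_jul_19)
cumulative_drop = pct_decline(price_jul_18, price_jul_24)

print(f"July 19: {friday_drop:.1f}% down; July 24: {cumulative_drop:.1f}% down")
```

Both results are measured against the July 18 close, so the second figure is the cumulative decline over the period, not an additional drop on top of Friday's.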
[Figure: Share price movement of CrowdStrike]
Software companies like CrowdStrike usually include protective clauses in their contracts to limit their liability for software issues, often capping potential payouts at the cost of the service (i.e., a simple refund). However, they may still face civil claims for intentional or negligent actions that cause harm. The faulty update could be covered by errors and omissions (E&O) insurance, which protects businesses against claims of negligence or mistakes. As a result, the full impact of potential lawsuits on CrowdStrike is uncertain at this time.
The company is also facing congressional scrutiny, with the House Committee on Homeland Security's Subcommittee on Cybersecurity and Infrastructure Protection asking CEO George Kurtz to testify on the global IT outage.
Could opportunities lie for CrowdStrike’s rivals?
CrowdStrike's rivals may find both opportunities and challenges after the recent event. While it’s too early to identify “winners” in the ongoing CrowdStrike situation, its competitors are fortunate to have avoided a similar fate, which gives them time to evaluate their own systems, including assessing the depth of their integration with operating systems, improving methods of air-gapping their updates, and refining deployment processes.
As of 2023, CrowdStrike was the second-largest security software company by global revenue, with a 14.7% market share, just behind Microsoft, according to Gartner. Other major competitors include Trellix, SentinelOne, and Palo Alto Networks. The recent IT outage has damaged CrowdStrike's reputation, which may lead organizations to rethink their IT and cybersecurity needs. This is especially true for companies with zero-tolerance policies, which might now look for other solutions to reduce their reliance on a single provider and prevent future business disruptions.
However, changing security solutions isn't easy or cheap. New systems often face challenges integrating with existing infrastructure and migrating data, and they can take a long time to set up. As of April 2024, 65% of CrowdStrike's customers were using more than five of its modules. This deep integration with CrowdStrike's ecosystem can make it difficult for customers to switch to another provider.
Comparisons are being drawn to Okta's 2023 data breach, in which hackers accessed data on all customers, but CrowdStrike's situation may prove more challenging. Although Okta experienced longer sales cycles due to increased scrutiny of its security protocols, it largely retained its customer base. CrowdStrike's incident was an outage rather than a breach, yet it caused significant operational disruption and required substantial recovery efforts from its clients. This difference in impact suggests that CrowdStrike might face more severe consequences than Okta did, though the full extent remains uncertain.