Watts S. Humphrey penned the simple yet elegant phrase “Every company is a software company” in his 2001 book, “Winning With Software: An Executive Strategy.”
During the early days of the catastrophic CrowdStrike outage, I found myself repeating, “Every company is a software company.” The global outage started on July 19, 2024, when a faulty software update unexpectedly crashed approximately 8.5 million computers and tablets running Microsoft Windows. The affected organizations included countless airlines, government agencies, financial institutions, hospitals, and businesses using Windows-based devices.
This incident, described as “the largest IT outage in history,” led me to ponder the cost of IT outages.
Of course, IT outages incur different types of costs. A brand’s reputation suffers when a customer can’t access their favorite retail website. When a worker’s computer goes all wonky, and IT can’t solve the problem quickly, productivity (and an employee’s feelings about their employer) suffers. But the cost that everyone can understand is the one that’s measured in dollars and cents.
According to a global survey of 400 IT pros and executives by Enterprise Management Associates, the average cost of an unplanned outage is $14,056 per minute.
Naturally, the cost of an unplanned outage varies depending on the size of an organization.
The cost is $12,500 per minute for an enterprise with 5,000 to 10,000 employees. For an enterprise with more than 10,000 employees, the cost skyrockets to $23,750 per minute.
To view unplanned outages in a bigger context, IT pros and executives at companies with 5,000 to 10,000 employees estimate their “most recent significant outage” cost $1.6 million. For companies with more than 10,000 employees, the price tag for an unplanned outage jumps to $3.2 million.
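To put the per-minute figures in perspective, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that an outage's total cost is simply the per-minute rate multiplied by its duration; the survey does not say the two sets of numbers relate this way, and the function names are my own.

```python
# Back-of-the-envelope outage math using the survey figures quoted above.
# Assumption (not from the survey): total cost = per-minute rate x duration.

AVERAGE_RATE = 14_056   # dollars per minute, all respondents
MID_RATE = 12_500       # dollars per minute, 5,000 to 10,000 employees
LARGE_RATE = 23_750     # dollars per minute, more than 10,000 employees


def outage_cost(rate_per_minute, minutes):
    """Estimated total cost of an outage lasting `minutes`."""
    return rate_per_minute * minutes


def implied_minutes(total_cost, rate_per_minute):
    """Duration implied by a total cost at a given per-minute rate."""
    return total_cost / rate_per_minute


print(outage_cost(AVERAGE_RATE, 60))                      # 843360
print(implied_minutes(1_600_000, MID_RATE))               # 128.0
print(round(implied_minutes(3_200_000, LARGE_RATE), 1))   # 134.7
```

Under that simplifying assumption, one hour of downtime at the average rate costs about $843,000, and the “most recent significant outage” figures work out to a little over two hours of downtime in both size brackets.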
Which brings us to the price of the CrowdStrike outage.
Parametrix, a cloud monitoring and insurance company, estimates that the CrowdStrike outage will cost Fortune 500 companies more than $5 billion in direct costs. And that is just the Fortune 500.
5 Key Takeaways
Now that the CrowdStrike outage has been resolved, it’s worthwhile to discuss what business and IT leaders have learned from this calamity. Here are five key takeaways:
Quality control is vital
Companies must conduct more stringent internal reviews of software releases or system updates before deploying them. This means rigorous functional testing that prevents faulty software from being pushed into the real world.
Value software stability, not just speed
Regarding software releases, companies need to re-examine the speed vs. stability equation. Understandably, a business wants to release software quickly to be first to market with a new product or game-changing feature. However, it’s equally important that the software is stable (i.e., it won’t cause burdensome, time-consuming glitches or outright system failures).
Take a staggered approach to software releases
One lesson CrowdStrike learned is to rethink its approach to software releases. It plans to move to a staggered approach so everyone doesn’t receive the same update at once. That way, if a new patch or update is problematic, the damage is easier to contain and fix.
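For readers who want to see what a staggered release can look like in practice, here is a minimal sketch of one common technique: hashing each device ID into a stable bucket and releasing an update to progressively larger "rings." This is not CrowdStrike's actual mechanism; the ring names, percentages, and device IDs are hypothetical.

```python
import hashlib

# Hypothetical rings: cumulative percentage of the fleet that has the
# update once each ring is released. The values are illustrative only.
RINGS = [("canary", 1), ("early", 10), ("broad", 50), ("everyone", 100)]


def bucket(device_id: str) -> int:
    """Map a device ID to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def ring_for(device_id: str) -> str:
    """Return the first ring in which this device receives the update."""
    b = bucket(device_id)
    for name, cumulative_pct in RINGS:
        if b < cumulative_pct:
            return name
    raise AssertionError("unreachable: the last ring covers 100%")


def devices_in_release(device_ids, released_ring: str):
    """Devices that should have the update once `released_ring` is live."""
    limit = dict(RINGS)[released_ring]
    return [d for d in device_ids if bucket(d) < limit]


fleet = [f"device-{i}" for i in range(1000)]  # made-up device IDs
canary = devices_in_release(fleet, "canary")
print(f"{len(canary)} of {len(fleet)} devices get the update first")
```

Because the bucket is derived from the device ID, the same machines always land in the same ring, and each later ring is a superset of the earlier ones. If the canary group starts crashing, the release can be halted before the rest of the fleet ever sees the update.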
Revisit your emergency plan
When the CrowdStrike outage struck, the Metropolitan Atlanta Rapid Transit Authority in Georgia was well prepared. Why? The IT department had a pre-existing emergency plan featuring a phone tree and dedicated communication channels.
Tyson Morris, the Metropolitan Atlanta Rapid Transit Authority CIO, was woken up at 3 a.m. on Friday and informed that all the agency’s Windows-based systems were down. The IT department’s emergency plan empowered its 130 members to handle the crisis superbly. Atlanta’s buses and trains were operational by 9 a.m., and every Windows laptop was fixed by Monday morning.
Ensure you are well-staffed
Given that the average cost of an unplanned outage is $14,056 per minute and that outages, planned or unplanned, occur regularly in a world where every company is a software company, organizations must ensure their IT department is well staffed and well taken care of. After all, IT staff are the ones who will save your business during an outage.
Now is the time to evaluate whether your company is adequately staffed for emergencies and whether it makes sense to have outsourced IT workers on standby.
Although the CrowdStrike outage was an outlier in terms of its global impact, unplanned outages occur daily. Given this fact, business and IT leaders must ensure they are ready for the next Big One.