Some "Oops" are Bigger than Others


It’s been about a week since an errant patching process appears to have brought down many high-profile businesses. These businesses run critical processes on servers that were never designed to be servers, let alone to execute critical processes. So says the person who favors servers that are designed with no single points of failure and have eight 9’s of availability (about 0.32 seconds of downtime per year). So much for my soapbox.
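For readers who want to check the "eight 9’s" arithmetic, here is a minimal sketch. It assumes a 365-day year and treats availability as a simple fraction of total time; the function name is illustrative, not taken from any product or standard.

```python
# Assumption: a non-leap year of 365 days (31,536,000 seconds).
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_seconds_per_year(nines: int) -> float:
    """Return the yearly downtime allowed by an availability of N nines.

    For example, nines=3 means 99.9% availability, so the system may be
    unavailable for a fraction of 10**-3 of the year.
    """
    unavailability = 10.0 ** -nines
    return SECONDS_PER_YEAR * unavailability


if __name__ == "__main__":
    # Compare a few common availability targets.
    for n in (3, 5, 8):
        print(f"{n} nines: {downtime_seconds_per_year(n):.5f} seconds per year")
```

Three 9’s allows nearly nine hours of downtime per year, while eight 9’s allows roughly a third of a second.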

What can be learned from this worldwide example of choosing low cost over high availability? The main lesson to take to heart is the real cost of a business outage: not just lost opportunity cost, but the actual revenue and productivity lost by not having highly available systems. If you were affected, this is the perfect time to validate your theoretical or projected outage costs against actual data from the outage itself.

Please note that “systems” doesn’t just mean hardware: a system is composed of hardware, software, process, and governance. Lack of resiliency in any one of these pillars means that you do not have a resilient system.

So, how far do you go with this? Take these steps:

  1. Really understand the cost of an outage, mapping actual cost, opportunity cost, and goodwill.
  2. Based on this realistic cost of an outage, determine your resiliency budget. This includes hardware costs, software costs, application modifications (if any) and business process updates.
  3. Do a true analysis of your options. Don’t be held hostage to “that’s the way we’ve always done it” or “I only want a certain set of hardware/software/applications.” Otherwise, your cost of being more resilient may become higher as your infrastructure becomes more complex.
  4. Document your workload relocation strategies and procedures, and practice them.
  5. Execute a simulated “disaster” on a regular basis with those most affected by that event. Learn from it, then update your practices and documentation as necessary.
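Steps 1 and 2 can be roughed out in a few lines of code. The sketch below is only an illustration under stated assumptions: the cost categories follow the list above, but the class, the field names, and every dollar figure are hypothetical placeholders to be replaced with your own measured data.

```python
from dataclasses import dataclass

# Assumption: a non-leap year of 365 days.
HOURS_PER_YEAR = 365 * 24


@dataclass
class OutageCost:
    """Hypothetical cost model for a single outage (all fields are assumptions)."""

    lost_revenue_per_hour: float       # actual revenue not earned
    lost_productivity_per_hour: float  # staff unable to work
    opportunity_cost_per_hour: float   # business that went elsewhere
    goodwill_cost_per_incident: float  # one-time reputational damage

    def cost_of_outage(self, hours: float) -> float:
        """Total cost of one outage lasting the given number of hours."""
        hourly = (
            self.lost_revenue_per_hour
            + self.lost_productivity_per_hour
            + self.opportunity_cost_per_hour
        )
        return hourly * hours + self.goodwill_cost_per_incident


def expected_annual_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year for an availability such as 0.999."""
    return HOURS_PER_YEAR * (1.0 - availability)


if __name__ == "__main__":
    # Placeholder figures only; substitute numbers from a real outage.
    cost = OutageCost(100_000, 40_000, 10_000, 250_000)
    four_hour_outage = cost.cost_of_outage(4)
    print(f"Cost of a 4-hour outage: ${four_hour_outage:,.0f}")
    print(f"Downtime at 99.9%: {expected_annual_downtime_hours(0.999):.2f} hours/year")
```

Comparing the cost of a realistic outage against what a more resilient option would cost gives a defensible starting point for the resiliency budget in step 2.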

LRS can assist your business in strategizing for higher availability. Feel free to contact LRS for more information.