(Title Compliments of Garbage)
We have been dealing with the remnants of Fay. When I walked into the office around 7:00 this morning, there were members of my group who had been at work since 1:00 AM. We lost building power, one of the Monster UPS systems blew, and our generator failed to start. The batteries were so hot that they were swollen. We are back up, with a few issues related to the servers coming down hard. Friday will be the never-ending day as we clean up the mess and deal with the normal end-of-month stress.
Everything that could have gone wrong did early this morning. Two separate recovery systems failed at the same time on something that we normally drill weekly. We pulled out drop cords and moved users to alternate power sources. We brought in a fan to clear the air around the batteries as they cooled down.
And maybe that is how it is supposed to work. A crash so hard that you need to stop and evaluate your recovery systems. A crash so complete that your first priority is to get functional, and then figure out how to stop this from happening next time. You never really know how your backup systems will support the load until you make them bear it. And if they can't bear the load, you plan differently for the next crisis.
1 comment:
Brilliantly put! I know that with my two major crashes in the last 3 years, Hurley turned out to be the white knight. He was an amazing safety net.