As America and much of the world watched the Apollo 11 crew approach the first landing on the moon, the tension was great. When we heard, “Houston, Tranquility Base here”, there was a collective sigh of relief around the world.

It wasn’t until later that we discovered the challenge the astronauts faced as they approached the landing. Minutes before touchdown, the guidance computer began issuing program alarms. A radar switch had been left on, flooding the computer with extra work and overloading the system.

Margaret Hamilton, director of the Software Engineering Division at the MIT Instrumentation Laboratory, led the team that developed the flight software for the Apollo 11 program. She is credited with coining the term “software engineering.” The software Ms. Hamilton’s team developed provided vital onboard guidance, and it had a built-in recovery system should the computer become overloaded: it shed non-essential tasks so that the essential guidance computations could continue and the landing could proceed safely. This was the start of high-reliability software design.
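The idea behind that recovery system can be illustrated with a minimal sketch (this is not the actual Apollo Guidance Computer code; the task names and numbers are invented for illustration): when demand exceeds capacity, drop the lowest-priority work so the essential work always runs.

```python
def shed_load(tasks, capacity):
    """Priority-based load shedding.

    tasks: list of (name, priority, cost) tuples, where a lower
    priority number means more essential.
    Returns the names of tasks kept within the cycle budget.
    """
    kept, used = [], 0
    # Consider the most essential tasks first.
    for name, priority, cost in sorted(tasks, key=lambda t: t[1]):
        if used + cost <= capacity:
            kept.append(name)
            used += cost
        # Tasks that don't fit are simply shed, not queued.
    return kept

# Hypothetical workload: under overload, guidance survives, radar is shed.
tasks = [
    ("guidance", 1, 40),       # essential: landing guidance
    ("display", 2, 20),        # important: crew displays
    ("radar_polling", 3, 50),  # non-essential under overload
]
print(shed_load(tasks, capacity=70))  # → ['guidance', 'display']
```

The design choice worth noticing is that recovery is part of the normal scheduling loop, not a separate error handler: the system degrades gracefully by doing less, rather than failing outright.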

We often think of recovery as an after-the-fact action, but what if we approached all designs with built-in self-recovery systems? With the recent advances in artificial intelligence, it should be possible to have self-correcting features in many of our designs.

Recovery has another dimension. When we as individuals take on new activities or encounter other challenges, do we think of having built-in recovery approaches should things not go as expected? Again, we tend to treat recovery as an after-the-fact activity. Could we not think in advance about what we might do should our original plan not work?

We need to think of recovery as something natural to be part of our design or decision-making, not as a reaction to a failure. For every design or decision, we should add the following statement: If this doesn’t work out as we hope, then we will…

* * *

“Everyone has a plan ‘till they get punched in the mouth.” – Mike Tyson (boxer)

How To Use

Useful guides for incorporating these messages into discussion.