“Fail open”

Story time. Several years ago, I was taking a deep diving class in February in a glacial lake in Washington called Crescent Lake. Being February, the water was bitterly cold. When we finally reached 129 feet, the unthinkable happened: my regulator failed. It was a simple sport diver model, not designed for extreme conditions. The failure caused my air supply to rapidly bubble to the surface. I was able to assess things well enough to know that the regulator was done for the dive, and that the only option was to ascend to the surface with my buddy.

The point of the story is this: my regulator “failed open”. Regulators are designed to do this. In the event of a failure, the armatures lock in the “air open” position, allowing enough time to assess the situation, locate your buddy or redundant systems, and ultimately end the dive alive.

The “Fail open” mentality can be applied to many other situations.

Some software examples:

  • When parsing a filename, look for characters to include, not characters to exclude
  • When looking for an executable in a path, work backwards, not forwards (preventing the infamous c:\Program.exe problem on Windows)
  • When looking for a feature on a version of the OS, assume it isn’t there: fail false
  • When you have asserts in code, put the retail check in too. Far too often I’ve seen an application crash on exactly the condition an assert was checking for.
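
The four bullets above can be sketched roughly in Python. This is only an illustration under assumptions: every name here (is_safe_filename, resolve_executable, has_feature, average), the allowed character set, and the feature table are hypothetical, and "work backwards" is read as "try the longest candidate path first".

```python
import re
from typing import Callable, Optional

# 1. Include, don't exclude: accept only characters explicitly allowed.
#    The allowed set below is an assumption chosen for illustration.
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9_-]+)*")

def is_safe_filename(name: str) -> bool:
    """Return True only if the whole name matches the allowlist."""
    return _ALLOWED_NAME.fullmatch(name) is not None

# 2. Work backwards: try the longest space-delimited prefix first, so
#    'C:\Program Files\App\app.exe' is found before 'C:\Program.exe'.
#    `exists` is a caller-supplied predicate (for example os.path.isfile).
def resolve_executable(command_line: str,
                       exists: Callable[[str], bool]) -> Optional[str]:
    parts = command_line.split(" ")
    for end in range(len(parts), 0, -1):
        candidate = " ".join(parts[:end])
        if exists(candidate):
            return candidate
    return None

# 3. Fail false: a feature is present only if positively confirmed.
#    Hypothetical table; unknown versions and features are simply absent.
_KNOWN_FEATURES = {
    (10, 0): {"long_paths"},
}

def has_feature(os_version: tuple, feature: str) -> bool:
    return feature in _KNOWN_FEATURES.get(os_version, set())

# 4. Assert plus retail check: asserts are removed under `python -O`,
#    so the shipped path needs its own guard for the same condition.
def average(values: list) -> float:
    assert len(values) > 0, "average() called with no values"
    if not values:  # retail check: still runs when asserts are stripped
        return 0.0
    return sum(values) / len(values)
```

In the second function, passing a set's membership test as `exists` makes it easy to try out without touching the disk. With both `C:\Program.exe` and `C:\Program Files\App\app.exe` "present", the longer path is the one returned.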