The main concern in every business is customer
satisfaction. When a product does not live up to its design expectations, i.e.,
when it fails gradually, suddenly, or catastrophically, a method of
evaluation must be available to understand why the failure occurred. Root-cause
failure analysis provides this understanding. Fully implemented, it seeks not
only to solve the immediate problem, but to provide valuable guidance to avoid
the problem in the future.
The primary cause is the set of conditions or parameters from which the failure began. The old saying, “For want of a nail the shoe was lost, for want of a shoe the horse was lost, for want of a horse the battle was lost, for want of the battle the kingdom was lost,” summarizes a classic primary cause determination. The analyst must discover what about the incident was fundamentally responsible for the failure in performance and determine the sequence of events that led to the final failure.
By contrast, the root cause of a failure is a process or procedure that “went wrong.” For example, the finish on a machine part was not as specified, the heat treatment on a rail was not uniform, or the angle on screw threads was too steep. Identifying the process that went wrong is the key to creating a procedure by which future failures can be avoided.
Most failure analysis stops short of this final step. Instead, what is presented to the client is the primary cause of failure. The poor finish, the incorrect heat treatment, and the shape of the screw threads in the paragraph above are the “primary causes” of those failures, not the root causes. The root causes would be:
the failure to check the finish after the part was machined;
the failure to ensure that the heat treatment furnace had sufficient control of changes in temperature to produce the desired microstructure in the rails;
the failure to enter the proper information into the thread-cutting process; or
the failure of the horse’s groom to check that the horse’s shoes were properly nailed on before sending him into battle.
All four of these were “process” or “procedure” failures. In the galvanized steel failure example, the primary cause of cracking of galvanized steel in bending may be the lack of an aluminum-iron-zinc intermetallic layer at the steel surface. But the root cause is the failure to maintain the aluminum level in the galvanizing bath.
To determine the root cause of a failure, and so avoid the same failure in the future, the primary cause must be supplemented by an intimate understanding of the entire history of the failed system or part, including both its manufacture and its use. This information is usually obtained most effectively by visiting the site where the failed part was manufactured. From this information, a new procedure can be crafted that will prevent repetition of the original failure.