These categories were defined as the causes that were apparent from the component inspection, the field analysis, and discussions with plant personnel: hourly and salaried, operating and maintenance. Based on discussions with others and the experience of the author's company, six error categories were selected: operational, design, maintenance practice, supervisory, manufacturing, and original equipment installation. In categorizing the causes, we used the following definitions:
Operational errors - These are situations where the machine or process was operated beyond the normal or accepted design boundaries. Two plant examples are:
The paper machine frame was not cleaned properly, and the resulting microbiological corrosion perforated the structure.
The specified operating practice called for a maximum operating temperature of 10,500°C, but the operator ran the vessel at 11,500°C because the higher temperature would give a greater production rate. As a result, corrosion perforated the vessel in less than a year.
Design errors - There were two categories of design errors. The most common involved a machine or system design that did not meet the needs of the operation; however, there were also situations where the machine performance requirements were changed without a sound design review. (Before citing the examples, we should note that these were not necessarily errors committed by engineers. There were several cases where the "design" of a piece of equipment was the work of a maintenance planner or a vendor sales representative, and the equipment was installed without competent oversight review. There were also situations where plant personnel, with tacit engineering approval but no realistic design analysis, changed the machine operating rate.) Examples include:
The dryer felt roll failed from fatigue that originated where a stiffener was welded into the roll. The original design produced a high stress concentration at a site of significant residual stresses. The design did not specify stress relieving, and the roll developed a fatigue crack and failed catastrophically.
The pump impeller failed from cavitation, a corrosion mechanism. The cavitation resulted from poor piping design, and the rapid failure resulted from the use of a material that was not cavitation resistant. The system design had the pump operating well below the minimum NPSH.
The paper machine operating speed was increased by 5% without a serious engineering review. Consequently, some components were operating at resonant frequencies and failed repeatedly, with the result that the increase in speed actually reduced the machine's production capability.
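The speed-increase example can be illustrated with a rough screening check of the kind a design review might include. The sketch below is not taken from the paper: the function name, the natural frequencies, the running speed, the excitation orders, and the 10% separation margin are all hypothetical values chosen only to show how a 5% speed change can move a forcing frequency into a resonance band.

```python
def resonance_conflicts(running_speed_hz, natural_freqs_hz, orders=(1, 2), margin=0.10):
    """Return (order, natural_frequency) pairs where an excitation order
    falls within +/- margin (as a fraction) of a natural frequency.

    The 10% default margin is an assumed screening value, not a standard.
    """
    conflicts = []
    for order in orders:
        forcing_hz = order * running_speed_hz
        for fn in natural_freqs_hz:
            if abs(forcing_hz - fn) / fn < margin:
                conflicts.append((order, fn))
    return conflicts


# Hypothetical component natural frequencies and running speed (illustration only).
natural_freqs_hz = [11.5, 23.0]
old_speed_hz = 10.0
new_speed_hz = old_speed_hz * 1.05  # the 5% speed increase

before = resonance_conflicts(old_speed_hz, natural_freqs_hz)
after = resonance_conflicts(new_speed_hz, natural_freqs_hz)

print("conflicts before speed change:", before)
print("conflicts after speed change:", after)
```

With these assumed numbers, the original speed clears both natural frequencies, while the faster speed puts the 1x and 2x excitations inside the margin. A real review would rely on measured or calculated natural frequencies and the excitation sources actually present on the machine.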
Maintenance errors - The maintenance mechanics did not properly repair a machine, or did not properly install the machine or component after a repair. Examples:
The pump shaft had loose bearings resulting from poor fitting practice. The resultant fretting corrosion reduced the fatigue strength of the shaft and the shaft fractured from corrosion fatigue.
A crankshaft on a large reciprocating compressor failed from bending fatigue. The crankshaft was supported by three plain bearings, and the bearing alignment was not checked prior to crankshaft installation. Several months later the shaft failed as a result of the resulting bending loads.
Manufacturing errors - The components were improperly manufactured and as a result failed prematurely. Examples include:
Large vertical shaft pumps used in waste treatment plants were built with torsional resonance in both the drive shafts and the pump bases. As a result, the drive shafts would fracture from torsional fatigue after 4000 to 8000 operating hours.
The manufacturer of an expansion joint specified for use at 180 psi actually supplied one designed for use at 120 psi. (There was an error in their internal procedures.) The joint failed during operation shortly after startup.
Original installation errors - At the time of the installation a properly designed and manufactured piece of equipment was installed incorrectly and, as a result, failed prematurely. Two examples are:
The vertical pump motor that was misaligned, causing stress on the shaft and directly contributing to the shaft failure.
The copper water line that was installed without the specified dielectric union, resulting in corrosion that caused a leak approximately 15 months later.
Supervisory errors - A situation where there is general recognition that a potentially serious problem exists but no action is taken and the result is a significant failure. Two examples are:
A supercalender drive failure that occurred when the reducer ran out of oil. This 2000 hp reducer had been leaking for over a year but no corrective action had been taken.
A critical bearing that failed when the lubricant supply system failed. There was a monitoring system on the lubricant line and, even though the machine diagnostic report noted that the monitoring system had failed several months earlier, it had not been repaired. As a result, when the lubricant supply failed the bearing was destroyed and substantial downtime resulted.
From "Understanding Why It Failed" by Neville W. Sachs, P.E., Sachs, Salvaterra & Associates, Syracuse, NY