Abstract

This paper presents the results of a generic reliability analysis of fault-tolerant digital control systems (F-T DCSs). The analysis differs from previous efforts to estimate the reliability performance of F-T DCSs in that it relies extensively on actual experience with redundant computer systems rather than on theoretical evaluations. The dominant contributors to the frequency of failure of F-T DCSs are (1) failures within common or shared equipment, (2) software failures, and (3) inadvertent operator actions. Other contributors include loss of electric power, spurious signals that originate within the DCS, lack of coverage, common cause failure (CCF) of redundant hardware, CCF of instrument channels, and physical damage from externally initiated events (e.g., high temperature). Considerable variation is expected in the reliability performance of F-T DCSs. Although some systems may operate for 10 or 15 years without experiencing a system failure, others may fail several times during the same interval. This variation is expected among systems of different architectures as well as among systems of the same architecture. Because most failures of DCSs can be traced to some kind of CCF, particularly software failures and inadvertent operator actions, CCFs should probably receive more attention than they presently do when an F-T DCS is selected.