System unavailability analysis based on window-observed recurrent event data



Many service industries require a high level of system availability to remain competitive. An appropriate system unavailability metric is important for making business decisions and minimizing operational risk. In practice, a system can be unavailable for service because of multiple types of events, and the durations of these events vary. In addition, the data that record the system operating history often have a complicated structure. In this paper, we develop a framework for estimating a system unavailability metric based on historical data from a fleet of heavy-duty industrial equipment, which we call System A. During the useful life of System A, repairs and maintenance actions are performed; however, not all of these actions are recorded. Specifically, information on event times, types, and durations is available only for certain time intervals (i.e., observation windows) rather than for the entire useful life of the system. Thus, the data are window-observed recurrent event data with multiple event types. We use a nonhomogeneous Poisson process model with a bathtub intensity function to describe the recurrent events and a truncated lognormal distribution to describe the event durations. We then define a conservative metric for system unavailability, obtain an estimate of this metric, and quantify its statistical uncertainty. Copyright © 2013 John Wiley & Sons, Ltd.
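
To make the window-observed structure concrete, the following is a minimal sketch of a nonhomogeneous Poisson process log-likelihood in which only events falling inside the observation windows contribute. The bathtub shape used here (the sum of a decreasing and an increasing power-law term), the parameter values, and the function names are illustrative assumptions, not the parametrization used in the paper. The sketch covers a single unit and a single event type and does not model event durations.

```python
import math

# Hypothetical bathtub intensity: a decreasing power-law term (b < 1)
# plus an increasing power-law term (d > 1). Illustrative assumption only.
def intensity(t, a=0.5, b=0.8, c=0.01, d=1.5):
    return a * b * t ** (b - 1) + c * d * t ** (d - 1)

def cumulative_intensity(t, a=0.5, b=0.8, c=0.01, d=1.5):
    # Closed-form integral of intensity() from 0 to t.
    return a * t ** b + c * t ** d

def window_loglik(event_times, windows, **params):
    """NHPP log-likelihood for one unit observed only inside `windows`.

    `windows` is a list of (start, end) pairs; events outside every
    window are treated as unrecorded and are ignored.
    """
    observed = [t for t in event_times
                if any(lo < t <= hi for lo, hi in windows)]
    # Sum of log-intensities at the recorded event times.
    log_term = sum(math.log(intensity(t, **params)) for t in observed)
    # Expected number of events accumulated over the windows only.
    exposure = sum(cumulative_intensity(hi, **params)
                   - cumulative_intensity(lo, **params)
                   for lo, hi in windows)
    return log_term - exposure
```

For example, `window_loglik([0.4, 3.2], [(0.0, 1.0), (3.0, 4.0)])` evaluates the log-likelihood for a unit with two recorded events and two observation windows; maximizing such a function over the intensity parameters, summed across the fleet, would give parameter estimates under these assumptions.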