A reliable stable storage system for UNIX



This paper describes the implementation of a stable storage system that converts several fallible disk stores into reliable devices for storing data. The system provides reliable reading and writing of data in a distributed UNIX environment despite transient I/O faults, decay of physical storage devices, and processor crashes. The implementation gives UNIX users a convenient way to use stable storage by providing the abstraction of stable files and by maintaining the standard UNIX system call interface. It handles abnormal situations systematically by separating normal from exceptional processing in both the system description and the implementation. This separation is achieved by describing the system in a fault tolerance design notation and implementing that notation with an exception handling package.