Robust data storage in a network of computer systems
Robustness of data in this thesis is taken to mean reliable storage of data and also high availability of data .objects in spite of the occurrence of faults. Algorithms and data structures which can be used to provide such robustness in the presence of various disk, processor and communication network failures are described. Reliable storage of data at individual nodes in a network of computer systems is based on the use of a stable storage mechanism combined with strategies which are used to help ensure crash resis- tance of file operations in spite of the use of buffering mechan- isms by operating systems. High availability of data in the net- work is maintained by replicating data on different computers and mutual consistency between replicas is ensured in spite of network partitioning. A stable storage system which provides atomicity for more complex data structures instead of the usual fixed size page has been designed and implemented and its performance evaluated. A crash resistant file system has also been implemented and evaluated. Many of the techniques presented here are used in the design of what we call CRES (Crash-resistant, Replicated and Stable) storage. CRES storage provides fault tolerance facilities for various disk and processor faults. It also provides fault tolerance facilities for network partitioning through the provision of an algorithm for the update and merge of a partitioned data storage system.