A structural method for the design of fault tolerant distributed control systems
Distributed digital control systems provide alternatives to conventional, centralised digital control systems. Typically, a modern distributed control system will comprise a multi-processor or network of processors, a communications network, an associated set of sensors and actuators, and the systems and applications software. This thesis addresses the problem of how to design robust decentralised control systems, such as those used to control event-driven, real-time processes in time-critical environments. Emphasis is placed on studying the dynamical behaviour of a system and identifying ways of partitioning the system so that it may be controlled in a distributed manner. A structural partitioning technique is adopted which makes use of natural physical sub-processes in the system, which are then mapped into the software processes to control the system. However, communications are required between the processes because of the disjoint nature of the distributed (i.e. partitioned) state of the physical system. The structural partitioning technique, and recent developments in the theory of potential controllability and observability of a system, are the basis for the design of controllers. In particular, the method is used to derive a decentralised estimate of the state vector for a continuous-time system. The work is also extended to derive a distributed estimate for a discrete-time system. Emphasis is also given to the role of communications in the distributed control of processes and to the partitioning technique necessary to design distributed and decentralised systems with resilient structures. A method is presented for the systematic identification of necessary communications for distributed control. It is also shwon that the structural partitions can be used directly in the design of software fault tolerant concurrent controllers. In particular, the structural partition can be used to identify the boundary of the conversation which can be used to protect a specific part of the system. In addition, for certain classes of system, the partitions can be used to identify processes which may be dynamically reconfigured in the event of a fault. These methods should be of use in the design of robust distributed systems.