An evaluation of load balancing algorithms for distributed systems
Distributed systems are gradually being accepted as the dominant computing paradigm of the future. However, due to the diversity and multiplicity of resources, and the need for transparency to users, global resource management raises many questions. At the performance level, the potential benefits of load balancing in resolving the occasional congestion experienced by some nodes while others are idle or lightly loaded are commonly accepted. It is also acknowledged that no single load balancing algorithm deals satisfactorily with changing system characteristics and a dynamic workload environment. In modelling distributed systems for load balancing, optimistic assumptions about system characteristics are commonly made, with no evaluation of alternative system design options such as communication protocols. When realistic assumptions are made about system attributes such as communication bandwidth, load balancing overheads, and the workload model, doubts are cast on the capability of load balancing to improve the performance of distributed systems significantly. A taxonomy is developed for both the components and the attributes of load balancing algorithms, to provide a common terminology and a comprehensive view of load balancing in distributed systems. For adaptive algorithms the taxonomy is extended to identify the issues involved and the ways of adding adaptability along different dimensions. A design methodology is also outlined. A review of related work is used to identify the most promising load balancing strategies and the modelling assumptions made in previous load balancing studies. Subsequently, the research problems addressed in this thesis and the design of new algorithms are detailed. A simulated system, developed to allow experimentation with various load balancing algorithms under different workload models and system attributes, is described.
Based on the nature of the file system structure and the classes of node processing speed involved, different models of loosely-coupled distributed systems can be defined. Four models are developed: disk-based homogeneous nodes, diskless homogeneous nodes, diskless heterogeneous nodes, and disk-based heterogeneous nodes. The nodes are connected through a broadcast transfer device. A set of representative load balancing algorithms covering a range of strategies is evaluated and compared for the four models of distributed systems. The algorithms developed include a new algorithm called Diffuse, based on explicit adaptability, for the homogeneous systems. In the case of heterogeneous systems, novel modifications are made to a number of algorithms to take into account the heterogeneity of node speeds. The evaluation on homogeneous systems is two-fold: an assessment of the effect of system attributes on the performance of the distributed system subject to these algorithms, and a comparison of the relative merits of the algorithms using different performance metrics, in particular a classification of the performance of the Diffuse algorithm relative to others in the literature. For the heterogeneous systems, the performance of the adapted algorithms is compared to that of the standard versions and to the no load balancing case. As a result of this evaluation, for a set of combinations of performance objectives, distributed system attributes, and workload environment, we identify the most appropriate load balancing algorithm and optimal values for the adjustable parameters of the algorithm.
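
As an illustration only, the four system models named above can be read as the cross product of two design dimensions: file system structure (disk-based or diskless) and node speed class (homogeneous or heterogeneous). The Python sketch below enumerates them under that reading. The names FileSystem, NodeSpeeds, SystemModel, and MODELS are hypothetical, chosen for this example, and are not taken from the thesis or its simulator.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class FileSystem(Enum):
    """File system structure of the nodes (hypothetical name)."""
    DISK_BASED = "disk-based"
    DISKLESS = "diskless"


class NodeSpeeds(Enum):
    """Class of node processing speeds (hypothetical name)."""
    HOMOGENEOUS = "homogeneous"
    HETEROGENEOUS = "heterogeneous"


@dataclass(frozen=True)
class SystemModel:
    """One model of a loosely-coupled distributed system."""
    file_system: FileSystem
    node_speeds: NodeSpeeds

    @property
    def label(self) -> str:
        # Matches the wording used in the text, e.g. "diskless homogeneous nodes".
        return f"{self.file_system.value} {self.node_speeds.value} nodes"


# The four models are every combination of the two dimensions.
MODELS = [SystemModel(fs, sp) for fs, sp in product(FileSystem, NodeSpeeds)]

if __name__ == "__main__":
    for model in MODELS:
        print(model.label)
```

Treating the models as combinations of two independent attributes would let a simulation run the same set of algorithms over every configuration in a single loop; whether the thesis simulator is organised this way is an assumption.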