Improved learning automata applied to routing in multi-service networks
Multi-service communications networks are generally designed, provisioned and configured, based on source-destination user demands expected to occur over a recurring time period. However due to network users' actions being non-deterministic, actual user demands will vary from those expected, potentially causing some network resources to be under- provisioned, with others possibly over-provisioned. As actual user demands vary over the recurring time period from those expected, so the status of the various shared network resources may also vary. This high degree of uncertainty necessitates using adaptive resource allocation mechanisms to share the finite network resources more efficiently so that more of actual user demands may be accommodated onto the network. The overhead for these adaptive resource allocation mechanisms must be low in order to scale for use in large networks carrying many source-destination user demands. This thesis examines the use of stochastic learning automata for the adaptive routing problem (these being adaptive, distributed and simple in implementation and operation) and seeks to improve their weakness of slow convergence whilst maintaining their strength of subsequent near optimal performance. Firstly, current reinforcement algorithms (the part causing the automaton to learn) are examined for applicability, and contrary to the literature the discretised schemes are found in general to be unsuitable. Two algorithms are chosen (one with fast convergence, the other with good subsequent performance) and are improved through automatically adapting the learning rates and automatically switching between the two algorithms. Both novel methods use local entropy of action probabilities for determining convergence state. However when the convergence speed and blocking probability is compared to a bandwidth-based dynamic link-state shortest-path algorithm, the latter is found to be superior. A novel re-application of learning automata to the routing problem is therefore proposed: using link utilisation levels instead of call acceptance or packet delay. Learning automata now return a lower blocking probability than the dynamic shortest-path based scheme under realistic loading levels, but still suffer from a significant number of convergence iterations. Therefore the final improvement is to combine both learning automata and shortest-path concepts to form a hybrid algorithm. The resulting blocking probability of this novel routing algorithm is superior to either algorithm, even when using trend user demands.