Investigations into controllers for adaptive autonomous agents based on artificial neural networks
This thesis reports the development and study of novel architectures for the simulation of adaptive behaviour based on artificial neural networks. There are two distinct levels of enquiry. At the primary level, the initial aim was to design and implement a unified architecture integrating sensorimotor learning and overall control. This was intended to overcome shortcomings of typical behaviour-based approaches in reactive control settings. It was achieved in two stages. Initially, feedforward neural networks were used at the sensorimotor level of a modular architecture and overall control was provided by an algorithm. The algorithm was then replaced by a recurrent neural network. For training, a form of reinforcement learning was used. This posed an intriguing composite of the well-known action selection and credit assignment problems. The solution was demonstrated in two sets of simulation studies involving variants of each architecture. These studies also showed, firstly, that the expected advantages over the standard behaviour-based approach were realised and, secondly, that the new integrated architecture preserved these advantages, with the added value of a unified control approach. The secondary level of enquiry addressed the more foundational question of whether the choice of processing mechanism is critical if the simulation of adaptive behaviour is to progress much beyond the reactive stage in more than a trivial sense. It proceeded by way of a critique of the standard behaviour-based approach to make a positive assessment of the potential for recurrent neural networks to fill such a role. The findings were used to inform further investigations at the primary level of enquiry. These were based on a framework for the simulation of delayed response learning using supervised learning techniques. A further new architecture, based on a second-order recurrent neural network, was designed for this set of studies. It was then compared with existing architectures.
Results are presented that indicate the appropriateness of the design and the potential of the approach, though long-term limitations are not discounted.