Title:

Asking intelligent questions: the statistical mechanics of query learning

This thesis analyses the capabilities and limitations of query learning by using the tools of statistical mechanics to study learning in feedforward neural networks. In supervised learning, one of the central questions is the issue of generalization: given a set of training examples in the form of input-output pairs generated by an unknown teacher rule, how can one generate a student which generalizes, i.e., which correctly predicts the outputs corresponding to inputs not contained in the training set? The traditional paradigm has been to study learning from random examples, where training inputs are sampled randomly from some given distribution. However, random examples contain redundant information, and generalization performance can thus be improved by query learning, where training inputs are chosen such that each new training example is maximally "useful" as measured by a given objective function. We examine two common kinds of queries, chosen to optimize two objective functions: generalization error and entropy (or information), respectively. Within an extended Bayesian framework, we use the techniques of statistical mechanics to analyse the average-case generalization performance achieved by such queries in a range of learning scenarios, in which the functional forms of student and teacher are inspired by models of neural networks. In particular, we study how the efficacy of query learning depends on the form of teacher and student, on the training algorithm used to generate students, and on the objective function used to select queries.
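To make the contrast with random examples concrete, the sketch below illustrates one common realization of entropy-based query selection in a perceptron teacher-student scenario: a committee of students consistent with the data so far is sampled, and the next training input is chosen where the committee's vote is maximally uncertain. All names and parameters here (the rejection-sampled committee, the candidate pool, the dimensions) are illustrative assumptions for a minimal demonstration, not the specific formulation analysed in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10  # input dimension (illustrative choice)

# Hypothetical teacher rule: a fixed perceptron, y = sign(w_teacher . x)
w_teacher = rng.standard_normal(D)

def committee_entropy(committee, x):
    """Binary entropy of the committee's vote on input x (0 = unanimous)."""
    p = np.mean(np.sign(committee @ x) > 0)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def select_query(committee, n_candidates=200):
    """Maximum-entropy query: among random candidate inputs, pick the one
    the committee disagrees on most, instead of sampling at random."""
    candidates = rng.standard_normal((n_candidates, D))
    scores = [committee_entropy(committee, x) for x in candidates]
    return candidates[int(np.argmax(scores))]

def sample_committee(X, y, size=12):
    """Crude stand-in for Gibbs students: rejection-sample random perceptrons
    and keep those consistent with the training set seen so far."""
    members = []
    while len(members) < size:
        w = rng.standard_normal(D)
        if len(X) == 0 or np.all(np.sign(X @ w) == y):
            members.append(w)
    return np.array(members)

# Query-learning loop: each round, the student committee chooses the next
# input and the teacher supplies its label.
X, y = np.empty((0, D)), np.empty(0)
for t in range(8):
    committee = sample_committee(X, y)
    x_new = select_query(committee)
    y_new = np.sign(w_teacher @ x_new)  # teacher labels the chosen query
    X = np.vstack([X, x_new])
    y = np.append(y, y_new)
```

Because each maximum-entropy query lands near the committee's current decision boundary, it roughly bisects the version space of consistent students, which is the intuition behind the faster-than-random decay of the generalization error studied in the thesis.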
