Title:

PAC-learning geometrical figures

The thesis studies the following problem: given a set of geometrical figures (such as planar polygons), each one labelled according to whether or not it resembles some 'ideal' figure, find a good approximation to that ideal figure which can be used to classify other figures in the same way. We work within the PAC learning model introduced by Valiant in 1984. Informally, the concepts under consideration are sets of polygons which resemble each other visually. A learning algorithm is given collections of members and nonmembers of a concept, and its task is to infer a criterion for membership which is consistent with the given examples and which can be used as an accurate classifier of further example polygons. In order to formalise the notion of a concept, we use metrics which measure the extent to which two polygons differ. A concept is assumed to be the set of polygons which are within some distance of some fixed central polygon. In the thesis we work most extensively with the Hausdorff metric. Using the Hausdorff metric we obtain NP-completeness results for several variants of the learning problem. In particular we show that it is hard to find a single geometrical figure which is close to the positive examples but not to the negative examples. This result holds under various assumptions about the specific geometrical figures under consideration. It also holds for several metrics other than the Hausdorff metric. Despite the NP-completeness results mentioned above, we have found some encouraging positive results. In particular, we have discovered a general technique for prediction. (Prediction is a less demanding learning model than PAC learning. The goal is to find a polynomial-time algorithm which takes as input a sample of labelled examples and is then able to predict the status of further unlabelled examples in polynomial time.)
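
To make the notion of a concept concrete, the following is a minimal Python sketch, not the construction used in the thesis. It assumes, as a simplification, that each figure is represented by a finite set of points (for example a polygon's vertices); the Hausdorff distance between two polygons regarded as full point sets can differ from the distance between their vertex sets. The names `hausdorff`, `in_concept` and `is_consistent` are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Figure = Sequence[Point]  # assumption: a figure is a finite, non-empty point set


def directed_hausdorff(a: Figure, b: Figure) -> float:
    """Largest distance from a point of `a` to its nearest point of `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a: Figure, b: Figure) -> float:
    """Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def in_concept(figure: Figure, centre: Figure, radius: float) -> bool:
    """A concept is the ball of the given radius around a central figure."""
    return hausdorff(figure, centre) <= radius


def is_consistent(centre: Figure, radius: float,
                  positives: Sequence[Figure],
                  negatives: Sequence[Figure]) -> bool:
    """Check that a candidate (centre, radius) classifies every labelled example correctly."""
    return (all(in_concept(f, centre, radius) for f in positives)
            and not any(in_concept(f, centre, radius) for f in negatives))


# Illustrative data: a unit square, a slightly shifted copy, and a distant copy.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
near = [(x + 0.1, y) for x, y in square]
far = [(x + 5.0, y) for x, y in square]
```

Checking that a given candidate is consistent with a sample is straightforward, as `is_consistent` shows. The difficulty described in the abstract lies in finding a central figure that is close to all positive examples and to none of the negative ones.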
