Automatic methods of inductive inference
This thesis is concerned with algorithms for generating generalisations-from experience. These algorithms are viewed as examples of the general concept of a hypothesis discovery system which, in its turn, is placed in a framework in which it is seen as one component in a multi-stage process which includes stages of hypothesis criticism or justification, data gathering and analysis and prediction. Formal and informal criteria, which should be satisfied by the discovered hypotheses are given. In particular, they should explain experience and be simple. The formal work uses the first-order predicate calculus. These criteria are applied to the case of hypotheses which are generalisations from experience. A formal definition of generalisation from experience, relative to a body of knowledge is developed and several syntactical simplicity measures are defined. This work uses many concepts taken from resolution theory (Robinson, 1965). We develop a set of formal criteria that must be satisfied by any hypothesis generated by an algorithm for producing generalisation from experience. The mathematics of generalisation is developed. In particular, in the case when there is no body of knowledge, it is shown that there is always a least general generalisation of any two clauses, in the generalisation ordering. (In resolution theory, a clause is an abbreviation for a disjunction of literals.) This least general generalisation is effectively obtainable. Some lattices induced by the generalisation ordering, in the case where there is no body of knowledge, are investigated. The formal set of criteria is investigated. It is shown that for a certain simplicity measure, and under the assumption that there is no body of knowledge, there always exist hypotheses which satisfy them. Generally, however, there is no algorithm which, given the sentences describing experience, will produce as output a hypothesis satisfying the formal criteria. These results persist for a wide range of other simplicity measures. However several useful cases for which algorithms are available are described, as are some general properties of the set of hypotheses which satisfy the criteria. Some connections with philosophy are discussed. It is shown that, with sufficiently large experience, in some cases, any hypothesis which satisfies the formal criteria is acceptable in the sense of Hintikka and Hilpinen (1966). The role of simplicity is further discussed. Some practical difficulties which arise because of Goodman's (1965) "grue" paradox of confirmation theory are presented. A variant of the formal criteria suggested by the work of Meltzer (1970) is discussed. This allows an effective method to be developed when this was not possible before. However, the possibility is countenanced that inconsistent hypotheses might be proposed by the discovery algorithm. The positive results on the existence of hypotheses satisfying the formal criteria are extended to include some simple types of knowledge. It is shown that they cannot be extended much further without changing the underlying simplicity ordering. A program which implements one of the decidable cases is described. It is used to find definitions in the game of noughts and crosses and in family relationships. An abstract study is made of the progression of hypothesis discovery methods through time. Some possible and some impossible behaviours of such methods are demonstrated. This work is an extension of that of Gold (1967) and Feldman (1970). The results are applied to the case of machines that discover generalisations. They are found to be markedly sensitive to the underlying simplicity ordering employed.