Improving the effectiveness and the efficiency of Knowledge Base Refinement
Knowledge Base Refinement is an area of Machine Learning whose primary goal is the automatic detection and correction of errors in the faulty knowledge bases of expert systems. A key feature of a refinement system is the mechanism used to select the refinements to be implemented. Since there are usually different ways to fix a fault, most current Knowledge Base Refinement systems use extensive heuristics to choose one or a few alternative refinements from a set of possible corrections. This approach is justified by the intention of avoiding the computational problems inherent in the generation and testing of multiple refinements. On the other hand, such systems are liable to miss solutions. The opposite approach was adopted by the Knowledge Base Refinement system KRUST, which proposed many alternative corrections to refine each wrongly solved example. Although KRUST demonstrated the feasibility of this approach, the potential of multiple refinement generation could not be fully exploited: to contain the number of alternative fixes generated for each fault, the system used a limited set of refinement operators, and hence was unable to rectify certain kinds of errors. Additionally, the time taken to produce and test a set of refined knowledge bases was considerable for any non-trivial knowledge base. This thesis presents a major revision of the KRUST system. Like its predecessor, the resulting system, STALKER, proposes many alternative refinements to correct each wrongly classified example in the training set. Two enhancements have been made. First, the class of errors handled by KRUST has been augmented through the introduction of inductive refinement operators. Second, the testing phase of Knowledge Base Refinement has been sped up considerably by means of a technique based on a Truth Maintenance System (TMS). The resulting system is more effective than other refinement systems because it generates many alternative refinements. At the same time, STALKER is very efficient, since KRUST's computationally expensive implementation and testing of refined knowledge bases has been replaced by a TMS-based simulator.