A semantic modelling approach to knowledge based statistical software
The topic of this thesis is the development of knowledge based statistical software. The shortcomings of conventional statistical packages are discussed to illustrate the need to develop software which is able to exhibit a greater degree of statistical expertise, thereby reducing the misuse of statistical methods by those not well versed in the art of statistical analysis. Some of the issues involved in the development of knowledge based software are presented and a review is given of some of the systems that have been developed so far. The majority of these have moved away from conventional architectures by adopting what can be termed an expert systems approach. The thesis then proposes an approach which is based upon the concept of semantic modelling. By representing some of the semantic meaning of data, it is conceived that a system could examine a request to apply a statistical technique and check if the use of the chosen technique was semantically sound, i.e. will the results obtained be meaningful. Current systems, in contrast, can only perform what can be considered as syntactic checks. The prototype system that has been implemented to explore the feasibility of such an approach is presented, the system has been designed as an enhanced variant of a conventional style statistical package. This involved developing a semantic data model to represent some of the statistically relevant knowledge about data and identifying sets of requirements that should be met for the application of the statistical techniques to be valid. Those areas of statistics covered in the prototype are measures of association and tests of location.