Title: Human information processing based information retrieval
Author: Graf, Erik
Awarding Body: University of Glasgow
Current Institution: University of Glasgow
Date of Award: 2011
This work investigated the question of how the concept of relevance in Information Retrieval (IR) can be validated. It is motivated by the persistent difficulty of defining the concept, and by advances in the field of cognitive science. Analytical and empirical investigations are carried out with the aim of devising a principled approach to the validation of the concept. The foundation for this work was laid by interpreting relevance as a phenomenon occurring within the context of two systems: an IR system and the cognitive processing system of the user. In light of this cognitive interpretation of relevance, the lessons learnt in cognitive science with regard to the validation of cognitive phenomena were analysed. The analysis identified construct validity as the dominant approach to the validation of constructs in cognitive science. Construct validity is a proposal for conducting validation in scenarios where no direct observation of a phenomenon is possible. Given the limitations on direct observation of a construct (i.e. a postulated theoretical concept), it bases validation on the evaluation of the construct's relations to other constructs. Because relevance is interpreted as a product of cognitive processing, it was concluded that these limitations on direct observation also apply to its investigation. The evaluation of the applicability of construct validity to an IR context focused on the exploration of the nomological network methodology. A nomological network is an analytically constructed set of constructs and their relations. The construction of such a network forms the basis for establishing construct validity through investigation of the relations between constructs. An analysis of contemporary insights into the nomological network methodology identified two important aspects with regard to its application in IR.
The first aspect is the choice of context and the identification of a pool of candidate constructs for inclusion in the network. The second is the identification of criteria for selecting a set of constructs from the candidate pool. The identification of the pertinent constructs for the network was based on a review of the principles of cognitive exploration and an analysis of the state of the art in text-based discourse processing and reasoning. On that basis, a list of known sub-processes contributing to the pertinent cognitive processing was presented. Given the large number of potential candidates identified, the next step was to derive criteria for selecting an initial set of constructs for the network. The investigation of these criteria considered pragmatic and meta-theoretical aspects. Based on a survey of experimental means in cognitive science and IR, five pragmatic criteria for the selection of constructs were presented. Consideration of meta-theoretically motivated criteria required investigating the specific challenges involved in validating highly abstract constructs. This question was explored based on the underlying considerations of the Information Processing paradigm and Newell’s (1994) cognitive bands. This led to the identification of three meta-theoretical criteria for the selection of constructs. Based on the criteria and the demarcated candidate pool, an IR-focused nomological network was defined. The network consists of the constructs of relevance and of the type and grade of word relatedness. A necessary prerequisite for making inferences based on a nomological network is the availability of validated measurement instruments for the constructs. To that end, two validation studies targeting the measurement of the type and grade of relations between words were conducted.
Establishing the validity of the measurement instruments enabled the application of the nomological network. The first step of the application was to test whether the constructs in the network are related to each other. Based on the alignment of measurements of relevance and the word-related constructs, it was concluded that they are. The relation between the constructs was characterized by varying the word-related constructs over a large parameter space and observing the effect of this variation on relevance. Three hypotheses relating to different aspects of the relations between the word-related constructs and relevance were investigated. It was concluded that conclusive confirmation of the hypotheses requires an extension of the experimental means underlying the study. Based on converging observations from the empirical investigation of the three hypotheses, it was concluded that semantic and associative relations differ distinctly with regard to their impact on relevance estimation.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
Keywords: B Philosophy (General) ; BF Psychology ; QA75 Electronic computers. Computer science ; ZA4050 Electronic information resources