Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.330536
Title: The effects of size on the function of an information retrieval document collection
Author: Mushens, Brian G.
Awarding Body: Newcastle University
Current Institution: University of Newcastle upon Tyne
Date of Award: 1982
Abstract:
A feature of research into Information Retrieval has been the continued use of small test collections in experiments. The assumption that any results will remain valid when the system is used to interrogate a large operational database is examined critically, particularly with regard to the difference in size of the collections involved and the reasons for this. Experiments investigating the MEDLARS database with reference to several sub-collections containing varying numbers of documents are described. These include analyses of single-term and two-term combination behaviour and actual retrieval searches. The effect on the clustering structure of different small sub-collections is also studied. The results obtained for MEDLARS are examined in the context of some well-known test collections, namely Cranfield 2 and INSPEC. Results for MEDLARS data indicate that very large collections (> 20,000 documents) may be necessary in order to ensure that the experimental data is indeed representative and may therefore be used to accurately predict the performance of a particular system in the operational environment.
Supervisor: Not available Sponsor: D.E.S./British Library Research Studentship in Information Science
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.330536  DOI: Not available
Keywords: Information science & librarianship Information science