Use this URL to cite or link to this record in EThOS: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.432819
Title: Artificial immune systems for Web content mining : focusing on the discovery of interesting information
Author: Secker, Andrew D.
ISNI: 0000 0001 3593 5525
Awarding Body: University of Kent at Canterbury
Current Institution: University of Kent
Date of Award: 2006
Abstract:
This thesis explores how biological metaphors can be applied to web content mining and, more specifically, to the identification of interesting information in web documents. Web content mining is the use of content found on the web, most usually the text of web pages, for data mining tasks such as classification. Because web content is noisy and constantly changing, an adaptive system is required. The discovery of interesting information is an advance on basic text mining in that it aims to identify text that is novel, unexpected or surprising to a user, whilst still being relevant. This thesis investigates the use of Artificial Immune Systems (AIS) for the discovery of interesting information, as AIS are thought to confer the adaptability and learning required for this task. Two novel Artificial Immune Systems are described and tested. AISEC (Artificial Immune System for Interesting E-mail Classification) is a novel, immune-inspired system for the classification of e-mail. It is shown that AISEC performs with a predictive accuracy comparable to a naïve Bayesian algorithm when continually classifying e-mail collected from a real user. This part of the thesis contributes to the understanding of how AIS react in a continuous learning scenario. Following from the knowledge gained by testing AISEC, AISIID (Artificial Immune System for Interesting Information Discovery) is then described. A study involving the subjective evaluation of the results by users is undertaken, and AISIID is seen to discover pages rated more interesting by users than those found by a comparative system. The results of this study also reveal that AISIID performs with subjective quality similar to that of the well-known search engine, Google. This leads to a contribution regarding a better understanding of the user's perception of interestingness and of possible inadequacies in the current understanding of interestingness regarding text documents.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.432819
DOI: Not available
Keywords: QA 76 Software, computer programming