Title: User profiling using machine learning
Author: Barnard, Thomas Charles
ISNI:       0000 0004 2730 9374
Awarding Body: University of Southampton
Current Institution: University of Southampton
Date of Award: 2012
The goal of the Instant Knowledge (IK) project was to design a system to facilitate the sharing of knowledge and expertise within a distributed mobile environment. This system automatically builds profiles of experts' interests and recommends these experts based on context and social networking information. This thesis describes my contributions to the IK project, which involve profiling users and making recommendations using machine learning techniques. Recommender systems are information filtering systems which recommend items to users based on a model of their preferences. Recommenders suffer from a number of problems: they do not make use of contextual information, so recommendations may be untimely or inappropriate; they often use a centralised architecture, which makes it difficult to react to the changing needs of users; and they are often implemented in an ad hoc fashion, making it difficult to make principled improvements or add extra information. In this thesis I present a probabilistic recommender based on Bayes' theorem. Rating behaviour is modelled using a Bayesian prior to improve performance in conditions of data sparsity. The best results are obtained using a Gaussian model for user ratings and a Gaussian-gamma model for co-rating behaviour. The use of a probabilistic framework should make it easier to add context information to the recommendation process. Generating profiles automatically carries a privacy risk: private information may be accidentally incorporated into experts' profiles and then discovered by querying the Instant Knowledge system. I present a framework for evaluating the effect of such contamination on performance, and the ability of filtering techniques to preserve privacy. Several filtering techniques are tested, and I show that supervised and semi-supervised naive Bayes classifiers can help to preserve privacy.
Supervisor: Prugel-Bennett, Adam
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: QA75 Electronic computers. Computer science ; TK Electrical engineering. Electronics. Nuclear engineering