Title: Optimisation classification on the web of data using linked data : a study case : movie popularity classification
Author: Budiprasetyo, Gunawan
ISNI:       0000 0004 7972 1475
Awarding Body: University of Southampton
Current Institution: University of Southampton
Date of Award: 2019
Availability of Full Text: Full text unavailable from EThOS.
Data mining algorithms have been widely used to build prediction models in the movie domain. Classification problems, especially predicting the future success of movies, have attracted many researchers seeking efficient ways to address them. However, movie popularity classification is complicated because it involves many parameters with differing degrees of influence. In this thesis, we review a broad range of literature on (1) the movie prediction domain, to identify related data, a main data source, and additional data sources for addressing these problems; and (2) data mining algorithms, to build more robust classification models for predicting movie popularity. To obtain a robust movie popularity classification model, three experiments were conducted. The first examined five single classifiers (Artificial Neural Network (ANN), Decision Tree (DT), k-NN, Rule Induction (RI) and SVM Polynomial) for developing classification models that predict the future success of movies. The second assessed wrapper-type feature selection algorithms for developing classification models of movie popularity. The third examined two ensemble methods, bagging and boosting, for classifying movie popularity.
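As a rough illustration of the bagging idea named in the abstract, the following minimal sketch trains several weak classifiers on bootstrap resamples and combines them by majority vote. It is not the thesis's implementation: the decision stump stands in for the classifiers the thesis actually studied, and the rows in `ROWS` are invented toy values (two hypothetical normalised features and a 0/1 "popular" label), not movie data.

```python
import random
from collections import Counter

# Toy rows: ((feature_a, feature_b), label). Hypothetical values, not thesis data.
ROWS = [
    ((0.10, 0.20), 0), ((0.20, 0.10), 0), ((0.30, 0.30), 0),
    ((0.15, 0.35), 0), ((0.40, 0.25), 0),
    ((0.60, 0.70), 1), ((0.70, 0.60), 1), ((0.80, 0.90), 1),
    ((0.90, 0.80), 1), ((0.65, 0.95), 1),
]


def train_stump(rows):
    """Fit a decision stump: one feature, one threshold, one label per side."""
    best = None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]  # candidate threshold taken from the data
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if xi[f] <= t else right) != yi for xi, yi in rows
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    _, f, t, left, right = best
    return lambda x: left if x[f] <= t else right


def bagging(rows, n_models, rng):
    """Train n_models stumps on bootstrap samples; predict by majority vote."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]  # sample with replacement
        models.append(train_stump(sample))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# An odd number of models avoids tied votes for a two-class problem.
predict = bagging(ROWS, n_models=15, rng=random.Random(0))
accuracy = sum(predict(x) == y for x, y in ROWS) / len(ROWS)
```

Boosting differs in that each new model is trained with more weight on the rows the previous models misclassified, rather than on independent resamples.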
Based on the findings and analysis, this thesis contributes in four areas: (1) it demonstrates the capability of linked data to retrieve external movie-related data sources and shows how additional attributes from those sources can improve the performance of a classification model based on a single data source; (2) it presents the use of Grid Search to obtain a set of optimal hyper-parameters for the Artificial Neural Network (ANN), Decision Tree (DT), Rule Induction (RI) and SVM Polynomial classifiers, yielding a more robust classification model; (3) it shows that wrapper-type feature selection using a Genetic Algorithm suits those classifiers, with either default or optimized parameters, for obtaining a robust classification model; and (4) it establishes the use of ensemble methods (bagging and boosting) with those classifiers, with either default or optimized parameters, to obtain such a model.
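The Grid Search contribution can be pictured with a minimal sketch: every combination of candidate hyper-parameter values is scored by cross-validation and the best-scoring combination is kept. This is an assumption-laden illustration rather than the thesis's setup. A small k-NN classifier is swapped in only because it fits in a few lines (the abstract lists Grid Search for ANN, DT, RI and SVM Polynomial), and `DATA`, `GRID` and the fold scheme are hypothetical.

```python
import itertools
import math
from collections import Counter

# Toy rows: ((feature_a, feature_b), label). Hypothetical values, not thesis data.
DATA = [
    ((0.10, 0.20), 0), ((0.80, 0.90), 1), ((0.15, 0.10), 0), ((0.90, 0.80), 1),
    ((0.20, 0.25), 0), ((0.85, 0.75), 1), ((0.05, 0.15), 0), ((0.75, 0.85), 1),
    ((0.25, 0.05), 0), ((0.95, 0.70), 1), ((0.30, 0.20), 0), ((0.70, 0.95), 1),
]

# Candidate hyper-parameter values to search over (illustrative choices).
GRID = {"k": [1, 3, 5], "metric": ["euclidean", "manhattan"]}


def distance(a, b, metric):
    if metric == "manhattan":
        return sum(abs(p - q) for p, q in zip(a, b))
    return math.dist(a, b)  # Euclidean distance (Python 3.8+)


def knn_predict(train, x, k, metric):
    """Predict the majority label among the k nearest training rows."""
    nearest = sorted(train, key=lambda row: distance(row[0], x, metric))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_accuracy(rows, n_folds, **params):
    """Mean accuracy over n_folds folds; row i belongs to fold i % n_folds."""
    scores = []
    for fold in range(n_folds):
        test = [r for i, r in enumerate(rows) if i % n_folds == fold]
        train = [r for i, r in enumerate(rows) if i % n_folds != fold]
        hits = sum(knn_predict(train, x, **params) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / n_folds


def grid_search(rows, grid, n_folds=3):
    """Score every combination in the grid; keep the first best-scoring one."""
    best_params, best_score = None, -1.0
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = cv_accuracy(rows, n_folds, **params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = grid_search(DATA, GRID)
```

The cost grows with the product of the candidate list lengths, so in practice the grid is kept small or coarse for classifiers that are expensive to train.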
Supervisor: Hall, Wendy
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available