Use this URL to cite or link to this record in EThOS:
Title: Combinatorial data analysis for data ordering
Author: Kostopoulos, A.
ISNI:       0000 0004 6059 8905
Awarding Body: University of Liverpool
Current Institution: University of Liverpool
Date of Award: 2016
Availability of Full Text:
Access from EThOS:
Access from Institution:
Seriation is a combinatorial optimisation problem that aims to sequence a set of objects such that a natural ordering is created. A large variety of applications exist ranging from archaeology to bioinformatics and text mining. Initially, a thorough and useful quantitative analysis compares different seriation algorithms using the positional proximity coefficient (PPC). This analysis helps the practitioner to understand how similar two algorithms are for a given set of datasets. The first contribution is consensus seriation. This method uses the principles of other consensus based methods to combine different seriation solutions according to the PPC. As it creates a solution that no individual algorithm can create, the usefulness comes in the form of combining different structural elements from each original algorithms. In particular, it is possible to create a solution that combines the local characteristics of one algorithm together with the global characteristics of another. Experimental results show that compared to consensus ranking based methods, using the Hamming, Spearman and Kendall coefficients, the consensus seriation using the PPC gives generally superior results according to the independent accumulated relative generalised anti-Robinson events measure. The second contribution is a metaheuristic for creating good approximation solutions very large seriation problems. This adapted harmony search algorithm makes use of modified crossover operators taken from genetic algorithm literature to optimise the least-squares criterion commonly used in seriation. As for all combinatorial optimisation problems, there is a need for metaheuristics that can produce better solutions quicker. Results show that that algorithm consistently outperforms existing metaheuristic algorithms such as genetic algorithm, particle swarm optimisation, simulated annealing and tabu search as well as the genetic algorithm using the modified crossover operators, with the main advantage of creating a much superior result in a very short iteration frame. These two major contributions offer practitioners and academics with new tools to tackle seriation related problems and a suggested direction for future work is concluded.
Supervisor: Goulermas, Y. Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral