Title: The exploitation of provenance and versioning in the reproduction of e-experiments
Author: Abang Ibrahim, Dayang Hanani
ISNI: 0000 0004 6060 3989
Awarding Body: Newcastle University
Current Institution: University of Newcastle upon Tyne
Date of Award: 2016
Reproducibility has long been a cornerstone of science, and is now becoming a key research area for e-Science. This is because it provides a way to validate, and build on, previous results. Underpinning reproducibility in e-Science is provenance, which has the potential to provide scientists with a complete understanding of data generated in e-experiments, including the services that produced and consumed it. This thesis explores the issues in exploiting provenance for reproducibility. Based on this, a reproducibility framework is designed and implemented to allow past experiments to be reproduced. Seven aspects of reproducibility are considered: 1) experiments, 2) reproducibility, 3) provenance, 4) provenance models, 5) provenance and versioning, 6) automatic transformation of provenance to support reproduction, and 7) a reproducibility taxonomy. A key to reproducibility is the provenance model: a data model that structures information about an e-experiment. A review of existing provenance systems shows that the problem caused by services being updated has been neglected. This can have a severe impact on the ability to reproduce experiments and it is therefore argued that the issue of service versioning must be addressed. Even after information on the provenance of an execution, and versioning of services, is captured there is the need for a method to transform this knowledge into a form that allows past experiments to be reproduced: that is another output of this thesis. The thesis focuses on the use of workflow as a means to represent the composition, and to execute experiments. This work explores how workflows can be automatically generated to re-execute past experiments. In order to do this, a transformation algorithm is described that maps a past experiment's execution log data into a workflow format that can be read and processed by the workflow system.
The thesis also introduces a Reproducibility Taxonomy that captures and structures the information required for reproducibility in the presence of versions and provenance.
Supervisor: Not available
Sponsor: Universiti Malaysia Sarawak (UNIMAS); Ministry of Higher Education Malaysia
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available