Title: Targeted feedback collection for data source selection with uncertainty
Author: Cortés Ríos, Julio César
ISNI: 0000 0004 7226 3395
Awarding Body: University of Manchester
Current Institution: University of Manchester
Date of Award: 2018
This dissertation contributes to research on pay-as-you-go data integration by proposing an approach for targeted feedback collection (TFC) that aims to improve the cost-effectiveness of feedback collection, especially when the characteristics of the integration artefacts are uncertain. In particular, the dissertation focuses on the data source selection task in data integration. It shows how the impact of uncertainty in the evaluation of the characteristics of the candidate data sources, also known as data criteria, can be reduced cost-effectively, thereby improving the solutions to the data source selection problem. It also shows that alternative approaches, such as active learning and simple heuristics, have drawbacks, and that these drawbacks shed light on how better solutions to the problem can be pursued. The dissertation describes the resulting TFC strategy and reports on its evaluation against alternative techniques. The evaluation scenarios range from synthetic data sources with a single criterion and reliable feedback to real data sources with multiple criteria and unreliable feedback (such as that obtained through crowdsourcing). The results confirm that the proposed TFC approach is cost-effective and leads to improved solutions for data source selection by seeking feedback that reduces uncertainty about the data criteria of the candidate data sources.
Supervisor: Paton, Norman ; Fernandes, Alvaro
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Pay-as-you-go ; Optimisation ; Feedback collection ; Uncertainty handling ; Data source selection ; Schema mapping selection ; Crowd-sourcing ; Data integration