Title: Inferring causation from big data in the social sciences
Author: Ghiara, Virginia
ISNI: 0000 0004 7969 9431
Awarding Body: University of Kent
Current Institution: University of Kent
Date of Award: 2019
The emergence of big data has become a central theme in scientific and philosophical discussions. A prominent claim in the literature is that big data can drastically change the way in which causal studies are conducted. My thesis explores how big data can be used to establish causal relationships in the social sciences. The beginning of the thesis focuses on data-driven studies and investigates some of the limitations that characterise this type of study. This analysis leads me to identify three key challenges that big data poses for causal studies in the social sciences. The first challenge is how to overcome the limitations of data-driven causal studies. This challenge is motivated by the observation that data-driven causal methods, however sophisticated, can suffer from bias. The second challenge is how to understand the role of ethnographic, qualitative data in causal studies based on big data. This challenge is vital in the social sciences, where some researchers remain hesitant about the use of data-driven methods and defend the importance of qualitative, 'thick' data. The third challenge is how to use big data, in the social sciences, to obtain evidence of causality that goes beyond correlations. This challenge is strongly associated with the idea that, in order to establish causation, both a correlation between the cause and the effect and a mechanism linking the cause and the effect need to be established. This idea, originally proposed by Russo and Williamson (2007) and known as the Russo-Williamson thesis, will be discussed in detail to provide a solution to the first challenge. I will argue that researchers should comply with this thesis to overcome the limitations of data-driven causal studies in the social sciences.
Next, I shall examine the discussions on mixed methods research to claim that qualitative ethnographic data can be used both to collect evidence of social mechanisms, and to help researchers to obtain a comprehensive understanding of the phenomenon under study. Finally, I shall argue that big data can be used, in specific circumstances, to collect evidence of entities and activities constituting causal mechanisms, and that big data might be used to identify sociomarkers, the social version of biomarkers, to trace causal processes that evolve over time.
Supervisor: Williamson, Jon; Corfield, David
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:  DOI: Not available
Keywords: HA Statistics