Title: Statistical approaches to the analysis of hierarchical data using simulations and real data from a study of musculoskeletal symptoms
Author: Ntani, Georgia
ISNI: 0000 0004 6348 9632
Awarding Body: University of Southampton
Current Institution: University of Southampton
Date of Award: 2017
Availability of Full Text: Full text unavailable from EThOS.
Clustering of observations is a common phenomenon in epidemiological research. A first objective of this thesis was to explore the situations in which failure to account for clustering in statistical analysis could lead to erroneous conclusions. Using simulated data, I showed that effects estimated from a naïve regression model that ignored clustering were on average unbiased when the outcome was continuous, but were biased towards the null when the outcome was binary. The precision of effect estimates was overestimated when the outcome was binary, and also when both the outcome and explanatory variable were continuous. However, in linear regression with a binary explanatory variable, the precision of effect estimates was somewhat underestimated by the naïve model. The magnitude of bias, both in point estimates and in their precision, increased with greater clustering of the outcome variable, and was also influenced by clustering in the explanatory variable. A second aim was to compare analytical approaches to clustering when synthesising results from multiple studies. Using real data from a large multicentre study, I showed that odds ratios (ORs) estimated from meta-analysis of summary results from component sub-studies were generally similar to those from multi-level modelling of pooled individual data. However, the precision of point estimates from meta-analysis was lower than that from multi-level analysis. Discrepancies between the two methods (including differences of up to 27% in ORs and up to 46% in precision) were demonstrated when the outcome of interest was rare. A third aim was to compare different methods for estimation of relative risks (RRs) when data are clustered. The random-intercept complementary log-log model produced estimates of effect and precision similar to those from the random-intercept log-binomial model (considered to be the best approach, but not always practical). Other models gave effect estimates close to those from the log-binomial model, but with less comparable precision. Contrary to the situation when RRs are being estimated in a set of independent (i.e. unclustered) observations, the random-intercept Poisson model with robust variance produced less precise point estimates than those from the random-intercept log-binomial model. Priorities for future work include exploration of: the consequences of ignoring clustering in the presence of effect modification and when marginal methods of analysis are used; situations in which meta-analytical estimates differ from those derived by pooled analysis; and specific situations in which the random-intercept Poisson model with robust variance is less likely to produce results similar to those from the random-intercept log-binomial model.
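The first finding for continuous data (unbiased point estimates but overstated precision when clustering is ignored) can be illustrated with a small simulation. The sketch below is not the simulation design used in the thesis: the function names, the number and size of clusters, the variance components and the true slope are all assumptions chosen for illustration. It generates a continuous outcome with a cluster-level random intercept and a clustered continuous explanatory variable, fits a naïve ordinary least squares line, and compares the average model-based standard error with the empirical spread of the slope across replicates.

```python
import random
import statistics


def simulate_slope(rng, n_clusters=30, cluster_size=20, beta=0.5,
                   sd_cluster=1.0, sd_resid=1.0):
    """Simulate one clustered data set and fit a naive OLS line to it.

    All parameter values are illustrative assumptions.
    Returns (slope estimate, naive model-based standard error).
    """
    xs, ys = [], []
    for _ in range(n_clusters):
        u = rng.gauss(0.0, sd_cluster)    # random intercept shared within the cluster
        x_cluster = rng.gauss(0.0, 1.0)   # cluster-level part of the explanatory variable
        for _ in range(cluster_size):
            x = x_cluster + rng.gauss(0.0, 0.5)
            y = beta * x + u + rng.gauss(0.0, sd_resid)
            xs.append(x)
            ys.append(y)

    # Naive simple linear regression that treats all observations as independent.
    n = len(xs)
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    naive_se = (rss / (n - 2) / sxx) ** 0.5
    return slope, naive_se


def run(n_reps=300, seed=1):
    """Repeat the simulation and summarise bias and precision of the naive fit."""
    rng = random.Random(seed)
    results = [simulate_slope(rng) for _ in range(n_reps)]
    slopes = [slope for slope, _ in results]
    naive_ses = [se for _, se in results]
    return {
        "mean_slope": statistics.fmean(slopes),       # compare with the true beta
        "empirical_sd": statistics.stdev(slopes),     # actual sampling variability
        "mean_naive_se": statistics.fmean(naive_ses)  # variability the naive model reports
    }


summary = run()
print(summary)
```

Under these assumptions the mean slope should sit close to the true value, while the empirical standard deviation of the slope should be clearly larger than the average naïve standard error, which is the sense in which the naïve model overstates precision.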
Supervisor: Coggon, David ; Inskip, Hazel
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available