Title: Analysis of clustered data when the cluster size is informative
Author: Pavlou, M.
ISNI: 0000 0004 2738 8009
Awarding Body: University College London (University of London)
Current Institution: University College London (University of London)
Date of Award: 2012
Clustered data arise in many scenarios. We may wish to fit a marginal regression model relating outcome measurements to covariates for cluster members. Often the cluster size, that is, the number of members, varies. Informative cluster size (ICS) has been defined to arise when the outcome depends on the cluster size conditional on covariates. If the clusters are considered complete, then the population of all cluster members and the population of typical cluster members have been proposed as suitable targets for inference; under ICS, inference will differ between these populations. However, if the variation in cluster size arises from missing data, then the clusters are considered incomplete and we seek inference for the population of all members of all complete clusters. We define informative covariate structure to arise when, for a particular member, the outcome is related to the covariates for other members in the cluster, conditional on the covariates for that member and the cluster size. In this case the proposed populations for inference may be inappropriate and, just as under ICS, standard estimation methods are unsuitable. We propose two further populations and weighted independence estimating equations (WIEE) for estimation. An adaptation of generalised estimating equations (GEE) was previously proposed to provide inference for the population of typical cluster members and to increase efficiency, relative to WIEE, by incorporating the intra-cluster correlation. We propose an alternative adaptation which can provide superior efficiency. For each adaptation we explain how bias can arise. This bias was not clearly described when the first adaptation was originally proposed. Several authors have vaguely related ICS to the violation of the ‘missing completely at random’ assumption. We investigate which missing data mechanisms can cause ICS, which might lead to similar inference for the populations of typical cluster members and all members of all complete clusters, and we discuss implications for estimation.
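As an illustration of why the two complete-cluster populations named in the abstract can give different answers, the following minimal Python sketch solves an intercept-only independence estimating equation, sum_i w_i sum_j (y_ij - mu) = 0, under two weightings. It assumes the standard choices from the wider ICS literature: w_i = 1 targets all cluster members, and w_i = 1/n_i (inverse cluster size) targets the typical cluster member. The function names and the toy data are hypothetical, and the sketch does not reproduce the weights the thesis proposes for its two further populations.

```python
from typing import Callable, Sequence


def weighted_mean(clusters: Sequence[Sequence[float]],
                  weight_fn: Callable[[int], float]) -> float:
    """Solve sum_i w_i * sum_j (y_ij - mu) = 0 for mu (intercept-only model)."""
    numerator = 0.0
    denominator = 0.0
    for outcomes in clusters:
        n_i = len(outcomes)
        w_i = weight_fn(n_i)          # cluster-level weight, depends on size only
        numerator += w_i * sum(outcomes)
        denominator += w_i * n_i
    return numerator / denominator


def all_members_mean(clusters: Sequence[Sequence[float]]) -> float:
    """Unit weights: every member counts equally (population of all members)."""
    return weighted_mean(clusters, lambda n: 1.0)


def typical_member_mean(clusters: Sequence[Sequence[float]]) -> float:
    """Inverse cluster size weights: every cluster counts equally."""
    return weighted_mean(clusters, lambda n: 1.0 / n)


# Hypothetical toy data in which larger clusters tend to have higher outcomes,
# so cluster size is informative about the outcome.
clusters = [[1, 1, 1, 1], [0, 0], [1, 1, 1, 0], [0]]

print("all members:    ", all_members_mean(clusters))
print("typical member: ", typical_member_mean(clusters))
```

In this toy data the two estimates differ because the large clusters dominate the unweighted estimate. When every cluster has the same size, the two weightings coincide.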
Supervisor: Copas, A. J.; Seaman, R. S.
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available