Title:

Nonignorable nonresponse adjustment using fully nonparametric approach

Nonresponse is an increasingly common problem in surveys. It causes missing data and, more importantly, such missing data are a potential source of bias in estimates. Most methods for dealing with nonresponse assume, either explicitly or implicitly, that the missing values are missing at random (MAR). We consider situations where the probability of responding may depend on the outcome value even after conditioning on the covariates. Under this kind of response mechanism, the missing outcomes are not missing at random (NMAR). The problem of missing data is usually handled using either fully parametric or semiparametric approaches. These approaches have potential drawbacks, for example strict distributional assumptions and heavy computation. We propose a fully nonparametric approach. First, we postulate informative individual response probabilities, i.e. the response probability may depend on the values of interest and may be specific to each individual. We treat the outcome variable as a fixed constant, just as in the design-based approach to survey sampling. We then use an estimating equations approach to define the finite population parameters. Hence the approach is fully nonparametric, provided the individual-specific response probabilities can be estimated nonparametrically. For longitudinal data, an individual historic response rate may be available, and it can serve as an empirical estimator of the individual-specific response probability. We use this individual historic response rate as an estimator of the unknown response probability. If the unknown response probability were consistently estimated, the proof of consistency of the estimators would be much easier and more standard. In our case, however, the historic response rate is unbiased but not consistent, because in practice we cannot have infinitely many historic time points, although we can have many units.
We attempt to prove the asymptotic unbiasedness of the estimating equations and, further, the consistency of the estimates; we were not able to complete the proof, and the reason is discussed in Section 2.4. This provides an interesting investigation into pursuing consistency. We develop the associated variance estimator. Being fully nonparametric and computationally simple, the method can be used as a widely applicable exploratory data analysis technique for NMAR mechanisms, as long as a response history exists, in advance of more sophisticated and possibly more efficient modelling methods. The approach is extended to a longitudinal setting, and two types of estimating equations (EEs) are defined to estimate parameters that are defined over time, such as the change between two successive time points or the regression coefficients involving outcomes over time. The associated variance estimators using both EEs are also developed. The nonparametric estimating equations (NEE) approach for the cross-sectional and longitudinal settings is not unbiased. We therefore develop a bias-adjusting NEE approach to adjust the bias in the cross-sectional and longitudinal parameter estimates. Another advantage of the bias-adjusting NEE approach is that the variance estimator based on it is expected to be less biased than that of the unadjusted approach. Moreover, Taylor expansion is used to adjust the bias in the variance estimates obtained from the simple and bias-adjusted NEE approaches. A comprehensive simulation study is conducted using real and simulated data to assess the performance of the NEE and bias-adjusted NEE approaches under various settings for cross-sectional as well as longitudinal data.
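
As a rough illustration of the idea summarised above, the sketch below solves one possible estimating equation for a finite population mean, weighting each respondent by the inverse of their historic response rate. The abstract does not state the exact form of the estimating equations, so the ratio-type equation used here, the function name `nee_mean`, the `floor` safeguard for zero historic rates, and the simulated response mechanism are all assumptions made for illustration only. Because the inverse of an unbiased rate estimate is not itself unbiased for the inverse probability, some bias is expected to remain, in line with the remark that the unadjusted approach is not unbiased.

```python
import numpy as np


def nee_mean(y, responded, hist_rate, floor=None):
    """Solve sum_i r_i / p_hat_i * (y_i - theta) = 0 for theta.

    y         : outcomes (only the values of respondents are used)
    responded : current-wave response indicators r_i
    hist_rate : historic response rates, used as p_hat_i
    floor     : optional lower bound on p_hat_i (ad hoc assumption,
                not taken from the source)
    """
    y = np.asarray(y, dtype=float)
    r = np.asarray(responded, dtype=bool)
    p = np.asarray(hist_rate, dtype=float)
    if floor is not None:
        p = np.maximum(p, floor)
    if np.any(p[r] <= 0):
        raise ValueError("respondents need a positive historic response rate")
    w = 1.0 / p[r]  # inverse estimated response probabilities
    return float(np.sum(w * y[r]) / np.sum(w))


if __name__ == "__main__":
    # Hypothetical simulation: the response probability depends on y (NMAR).
    rng = np.random.default_rng(0)
    n_units, n_waves = 5000, 12
    y = rng.normal(50.0, 10.0, n_units)
    p_true = 1.0 / (1.0 + np.exp(-(y - 50.0) / 10.0))
    history = rng.random((n_units, n_waves)) < p_true[:, None]
    responded = rng.random(n_units) < p_true
    est = nee_mean(y, responded, history.mean(axis=1), floor=1 / (2 * n_waves))
    print("population mean  :", y.mean())
    print("respondent mean  :", y[responded].mean())
    print("weighted (sketch):", est)
```

Comparing the three printed values for different numbers of waves gives a feel for how the length of the response history affects the remaining bias, which is the quantity the bias-adjusting approach targets.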
