Title:

Weak factor model in large dimension

This thesis presents some extensions to the current literature on high-dimensional static factor models. When the cross-section dimension (denoted by N henceforth) is very large, the standard assumption is that, for each common factor, the number of nonzero loadings grows linearly with N. On the other hand, the idiosyncratic error of each component can be correlated with only a finite number of other components in the cross-section. These two assumptions are crucial in standard high-dimensional factor analysis, as they allow us to obtain consistent estimators of the factors, the loadings and the number of factors. Together, however, they rule out the possibility of factors whose number of nonzero loadings is of strictly smaller order than N but still non-negligible, e.g. N^α for some 0 < α < 1. The existence of such weak factors decreases the signal-to-noise ratio, because the gap between the systematic and idiosyncratic eigenvalues becomes narrower. As a consequence, in such a model it is harder to establish the consistency of the factors estimated by sample principal components. Furthermore, the number of factors is even more challenging to identify, because most existing methods rely on a large signal-to-noise ratio. In this thesis, I consider a factor model that allows a general strength for each factor, i.e. both strong and weak factors can exist. Chapter 1 discusses the current literature on this topic and the motivation for my contribution. In Chapter 2, I show that the sample principal components are still consistent estimators of the factors (up to the space they span), provided that the factors are not too weak. In addition, I derive the lower bound that the strength of the weakest factor must attain for the factors to be consistently estimated. More precisely, by the strength of a factor I mean the order of the number of its nonzero loadings.
Chapter 3 presents a novel method to determine the number of factors, which is consistent even when the factors are weak. I run extensive Monte Carlo simulations to compare the performance of this method with that of two well-known alternatives, namely the class of criteria proposed in Bai and Ng (2002) and the eigenvalue ratio method of Ahn and Horenstein (2013). In Chapters 4 and 5, I present some applications based on the work of this thesis. I focus mainly on two issues: selecting factor models in practice and using factor analysis to estimate large static covariance matrices.
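
As an illustration of the setting described above (and not of the method proposed in this thesis), the following Python sketch simulates a static factor model in which each factor has roughly N^alpha nonzero loadings, then applies an eigenvalue ratio rule of the kind attributed to Ahn and Horenstein (2013). Everything specific here is an assumption made for the example: the function names, the Gaussian factors, loadings and errors, the placement of the nonzero loadings on the first units, and the values of N, T, alpha and kmax.

```python
import numpy as np


def simulate_panel(N, T, alphas, rng):
    """Simulate X = F @ Lam.T + E (T x N).

    Factor j loads on the first floor(N ** alphas[j]) units only, so
    alphas[j] = 1 is a strong factor and alphas[j] < 1 a weak one.
    All distributions are standard normal (an assumption of this sketch).
    """
    r = len(alphas)
    F = rng.standard_normal((T, r))        # common factors
    Lam = np.zeros((N, r))                 # loadings, mostly zero if weak
    for j, alpha in enumerate(alphas):
        n_j = int(np.floor(N ** alpha))    # number of nonzero loadings
        Lam[:n_j, j] = rng.standard_normal(n_j)
    E = rng.standard_normal((T, N))        # idiosyncratic errors
    return F @ Lam.T + E


def scaled_eigenvalues(X):
    """Eigenvalues of X'X / (N T), sorted in descending order."""
    T, N = X.shape
    return np.linalg.eigvalsh(X.T @ X / (N * T))[::-1]


def eigenvalue_ratio_estimate(eigvals, kmax):
    """Return the k in 1..kmax that maximises eigvals[k-1] / eigvals[k]."""
    ratios = eigvals[:kmax] / eigvals[1:kmax + 1]
    return int(np.argmax(ratios)) + 1


rng = np.random.default_rng(0)
N, T, kmax = 200, 200, 8

for label, alphas in [("two strong factors", [1.0, 1.0]),
                      ("one strong, one weak", [1.0, 0.5])]:
    ev = scaled_eigenvalues(simulate_panel(N, T, alphas, rng))
    k_hat = eigenvalue_ratio_estimate(ev, kmax)
    print(label, "| leading eigenvalues:", np.round(ev[:4], 3),
          "| estimated number of factors:", k_hat)
```

In a design like this, the eigenvalue associated with the weak factor lies much closer to the idiosyncratic eigenvalues than that of a strong factor, which is the narrowing gap that makes the number of factors harder to identify.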
