Estimation in causal graphical models
Pearl (2000), Spirtes et al. (1993) and Lauritzen (2001) set up a new framework that encodes the causal relationships between random variables by a causal Bayesian network. The estimation of the conditional probabilities in a Bayesian network has received considerable attention from several investigators (e.g., Jordan (1998), Geiger and Heckerman (1997), Heckerman et al. (1995)), but this issue has not been studied in a causal Bayesian network. In this thesis, we define the multicausal essential graph on the equivalence class of Bayesian networks in which each member of the class manifests a strong type of invariance under (causal) manipulation called hypercausality. We then characterise the families of prior distributions on the parameters of the Bayesian networks that are consistent with hypercausality and show that their unmanipulated uncertain Bayesian networks must satisfy the independence assumptions. As a result, such prior distributions satisfy a generalisation of the Geiger and Heckerman condition. In particular, when the corresponding essential graph is undirected, this class of prior distributions reduces to the hyper-Dirichlet family (see Chapter 6).

In the second part of this thesis, we calculate certain local sensitivity measures, and through them we are able to answer the following questions. Is the network structure learned from data robust with respect to changes in the directionality of some specific arrows? Are the local conditional distributions associated with a specified node robust with respect to changes in its prior distribution, or with respect to changes in the local conditional distribution of another node? Most importantly, is the posterior distribution associated with the parameters of any node robust with respect to changes in the prior distribution associated with the parameters of one specific node?
Finally, are the quantities mentioned above robust with respect to changes in the independence assumptions described in Chapter 3? Most of the local sensitivity measures developed in the last decade (in particular, local measures of overall posterior sensitivity) tend to diverge to infinity as the sample size becomes very large (Gustafson (1994) and Gustafson et al. (1996)). This is in contrast to our knowledge that, starting from different priors, posteriors tend to agree as the data accumulate. Here we define a new class of metrics with more satisfactory asymptotic behaviour: the corresponding local sensitivity measures remain bounded for large sample sizes.
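To make the estimation setting concrete, the following is a minimal sketch (not the thesis's construction) of conjugate Dirichlet updating of one conditional probability table in a discrete Bayesian network. Under the parameter-independence assumptions alluded to above, each parent configuration carries its own Dirichlet prior and is updated independently from the observed counts; all names and the toy data here are hypothetical.

```python
# Hypothetical sketch: Dirichlet-conjugate updating of a single conditional
# probability table (CPT). Under parameter independence, each parent
# configuration has an independent Dirichlet prior, so each row of the CPT
# is updated separately by adding observed counts to the pseudo-counts.

def posterior_cpt(prior_pseudocounts, counts):
    """Posterior-mean CPT: one Dirichlet update per parent configuration.

    prior_pseudocounts, counts: dicts mapping a parent configuration
    (a tuple) to a list of per-state values.
    """
    cpt = {}
    for cfg, alpha in prior_pseudocounts.items():
        n = counts.get(cfg, [0] * len(alpha))
        post = [a + c for a, c in zip(alpha, n)]   # Dirichlet(alpha + counts)
        total = sum(post)
        cpt[cfg] = [p / total for p in post]       # posterior mean of each row
    return cpt

# Toy example: binary child, one binary parent, uniform Dirichlet(1, 1) priors.
prior = {(0,): [1.0, 1.0], (1,): [1.0, 1.0]}
data = {(0,): [8, 2], (1,): [3, 9]}
print(posterior_cpt(prior, data))
```

A hyper-Dirichlet prior extends this idea coherently across all CPTs of a decomposable model; the sketch shows only the single-table mechanics.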
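The asymptotic point about posteriors agreeing can be illustrated with a minimal sketch (again, not the thesis's metrics): in a Beta-Bernoulli model, a naive sensitivity gauge such as the gap between posterior means under two different priors shrinks roughly like 1/n, whereas the local measures cited above blow up with n. The priors Beta(1, 1) and Beta(5, 1) and the 30% success rate are arbitrary choices for the demonstration.

```python
# Illustrative sketch: posteriors from two different Beta priors agree as
# Bernoulli data accumulate. The gap between the two posterior means is a
# naive (bounded) gauge of prior sensitivity, shrinking roughly like 1/n.

def posterior_mean(a, b, successes, n):
    """Posterior mean of a Beta(a, b) prior after n Bernoulli trials."""
    return (a + successes) / (a + b + n)

def prior_gap(n):
    """Disagreement between posteriors under Beta(1,1) and Beta(5,1) priors."""
    s = (3 * n) // 10                      # synthetic data: 30% successes
    return abs(posterior_mean(1, 1, s, n) - posterior_mean(5, 1, s, n))

for n in (10, 100, 1000, 10000):
    print(n, prior_gap(n))                 # gap decreases as n grows
```

A local sensitivity measure built from a metric that, like this gap, contracts as data accumulate stays bounded for large samples, which is the behaviour motivated in the text.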