Moneta Chlass Entner Hoyercausal models, <strong>in</strong>dependent component analysis, and tests of conditional <strong>in</strong>dependenceare the tools we propose for structural identification <strong>in</strong> vector autoregressive models.We conclude with an overview of some important issues which are left open <strong>in</strong> thisdoma<strong>in</strong>.1. Specification of the statistical model. Data driven procedures for SVAR identificationdepend upon the specification of the (reduced form) VAR model. Therefore, itis important to make sure that the estimated VAR model is an accurate description ofthe dynamics of the <strong>in</strong>cluded variables (whereas the contemporaneous structure is <strong>in</strong>tentionallyleft out, as seen <strong>in</strong> section 1.2). The usual criterion for accuracy is to checkthat the model estimates residuals conform to white noise processes (although serial<strong>in</strong>dependence of residuals is not a sufficient criterion for model validation). This impliesstable dependencies captured by the relationships among the modeled variables,and an unsystematic noise. It may be the case, as <strong>in</strong> many empirical applications, thatdifferent VAR specifications pass the model check<strong>in</strong>g tests equally well. For example,a VAR with Gaussian errors and p lags may fit the data equally well as a VAR withnon-Gaussian errors and q lags and these two specifications justify two different causalsearch procedures. So far, we do not know how to adjudicate among alternative andseem<strong>in</strong>gly equally accurate specifications.2. Background knowledge and assumptions. Search algorithms are based on differentassumptions, such as, for example, causal sufficiency, acyclicity, the Causal MarkovCondition, Faithfulness, and/or the existence of <strong>in</strong>dependent components. Maybe,background knowledge could justify some of these assumptions and reject others. Forexample, <strong>in</strong>stitutional or theoretical knowledge about an economic process might <strong>in</strong>formus that Faithfulness is a plausible assumption <strong>in</strong> some contexts rather than <strong>in</strong>others, or <strong>in</strong>stead, that one should expect feedback loops if data are collected at certa<strong>in</strong>levels of temporal aggregation. Yet, if background <strong>in</strong>formation could <strong>in</strong>form us here,this might aga<strong>in</strong> provoke a problem of circularity mentioned at the outset of the paper.3. Search algorithms <strong>in</strong> nonparametric sett<strong>in</strong>gs. We have provided some <strong>in</strong>formationon which nonparametric test procedures might be more appropriate <strong>in</strong> certa<strong>in</strong> circumstances.However, it is not clear which causal search algorithms are most efficient<strong>in</strong> exploit<strong>in</strong>g the nonparametric conditional <strong>in</strong>dependence tests proposed <strong>in</strong> Section 4.The more variables the search algorithm needs to be <strong>in</strong>formed about at the same po<strong>in</strong>tof the search, the higher the number of conditioned variables, and hence, the slower, orthe more <strong>in</strong>accurate, the test.4. Number of shocks and number of variables. To conserve degrees of freedom,SVARs rarely model more than six to eight time series variables (Bernanke et al., 2005,p.388). It is an open question how the procedures for causal <strong>in</strong>ference we reviewed canbe applied to large scale systems such as dynamic factor models. (Forni et al., 2000)5. Simulations and empirical applications. Graphical causal models for identify<strong>in</strong>gSVARs, equivalent or similar to the search procedures described <strong>in</strong> section 2, have beenapplied to several sets of macroeconomic data (Swanson and Granger, 1997; Besslerand Lee, 2002; Demiralp and Hoover, 2003; Moneta, 2004; Demiralp et al., 2008;Moneta, 2008; Hoover et al., 2009). Demiralp and Hoover (2003) present Monte Carlo124
Causal Search <strong>in</strong> SVARsimulations to evaluate the performance of the PC algorithm for such an identification.There are no simulation results so far about the performance of the alternative testson residual partial correlations presented <strong>in</strong> section 2.2. Moneta et al. (2010) appliedan <strong>in</strong>dependent component analysis as described <strong>in</strong> section 3, to microeconomic USdata about firms’ expenditures on R&D and performance, as well as to macroeconomicUS data about monetary policy and its effects on the aggregate economy. Hyvär<strong>in</strong>enet al. (2010) assess the performance of <strong>in</strong>dependent component analysis for identify<strong>in</strong>gSVAR models. It is yet to be established how <strong>in</strong>dependent component analysis appliedto SVARs fares compared to graphical causal models (based on the appropriate conditional<strong>in</strong>dependence tests) <strong>in</strong> non-Gaussian sett<strong>in</strong>gs. Nonparametric tests of conditional<strong>in</strong>dependence, as those proposed <strong>in</strong> section 4, have been applied to test for Grangernon-causality (Su and White, 2008), but there are not yet any applications where thesetest results <strong>in</strong>form a graphical causal search algorithm. Overall, there is a need for moreempirical applications of the procedures described <strong>in</strong> this paper. Such applications willbe useful to test, compare, and improve different search procedures, to suggest newproblems, and obta<strong>in</strong> new causal knowledge.6. Appendix6.1. Appendix 1 - Details of the bootstrap procedure from 4.1.(1) Draw a bootstrap sampl<strong>in</strong>g Z * t (for t = 1,...,n) from the estimated kernel densityˆ f (z) = n −1 b −d ∑︀ nt=1K p ((Z t − z)/b).(2) For t = 1,...,n, given Z t *, draw X* t and Y t * <strong>in</strong>dependently from the estimated kerneldensity f ˆ(x|Z t *) and f ˆ(y|Z t *) respectively.(3) Us<strong>in</strong>g X * t , Y* t , and Z* t , compute the bootstrap statistic S * n us<strong>in</strong>g one of the distancesdef<strong>in</strong>ed above.(4) Repeat steps (1) and (2) I times to obta<strong>in</strong> I statistics {S * ni }I i=1 .(5) The p-value is then obta<strong>in</strong>ed by:∑︀ Ii=11{Sni * p ≡> S n},Iwhere S n is the statistic obta<strong>in</strong>ed from the orig<strong>in</strong>al data us<strong>in</strong>g one of the distancesdef<strong>in</strong>ed above, and 1{·} denotes an <strong>in</strong>dicator function tak<strong>in</strong>g value one ifthe expression between brackets is true and zero otherwise.ReferencesE. Baek and W. Brock. A general test for nonl<strong>in</strong>ear Granger causality: Bivariate model.Discuss<strong>in</strong> paper, Iowa State University and University of Wiscons<strong>in</strong>, Madison, 1992.125