Moneta Chlass Entner Hoyer

the structural shocks $\varepsilon_t$ are connected by

$$u_t = \Gamma_0^{-1} \varepsilon_t = P D^{-1} \varepsilon_t \qquad (14)$$

with square matrices $\Gamma_0$ and $P D^{-1}$, respectively. Equation (14) has two important properties: First, the vectors $u_t$ and $\varepsilon_t$ are of the same length, meaning that there are as many residuals as structural shocks. Second, the residuals $u_t$ are linear mixtures of the shocks $\varepsilon_t$, connected by the 'mixing matrix' $\Gamma_0^{-1}$. This resembles the ICA model, when placing certain assumptions on the shocks $\varepsilon_t$.

In short, the ICA model is given by $x = As$, where $x$ are the mixed components, $s$ the independent, non-Gaussian sources, and $A$ a square invertible mixing matrix (meaning that there are as many mixtures as independent components). Given samples from the mixtures $x$, ICA estimates the mixing matrix $A$ and the independent components $s$, by linearly transforming $x$ in such a way that the dependencies among the independent components $s$ are minimized. The solution is unique up to ordering, sign and scaling (Comon, 1994; Hyvärinen et al., 2001).

By comparing the ICA model $x = As$ and equation (14), we see a one-to-one correspondence of the mixtures $x$ to the residuals $u_t$ and the independent components $s$ to the shocks $\varepsilon_t$. Thus, to be able to apply ICA, we need to assume that the shocks are non-Gaussian and mutually independent. We want to emphasize that no specific non-Gaussian distribution is assumed for the shocks, but only that they cannot be Gaussian.
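The ordering, sign and scaling indeterminacy of ICA can be made explicit with a short worked identity. The symbols $Q$ (an arbitrary permutation matrix) and $\Lambda$ (an arbitrary invertible diagonal matrix) are introduced here for illustration only and do not appear in the original derivation.

```latex
% Q: any permutation matrix, \Lambda: any invertible diagonal matrix
% (illustrative symbols, assumed here; Q'Q = I)
\begin{align*}
x   &= A s
     = \bigl(A \Lambda^{-1} Q'\bigr)\bigl(Q \Lambda s\bigr)
     = \tilde{A}\,\tilde{s},
     \qquad \tilde{A} = A \Lambda^{-1} Q', \quad \tilde{s} = Q \Lambda s, \\
u_t &= \Gamma_0^{-1} \varepsilon_t
     = \bigl(\Gamma_0^{-1} \Lambda^{-1} Q'\bigr)\bigl(Q \Lambda \varepsilon_t\bigr).
\end{align*}
```

The components of $\tilde{s} = Q \Lambda s$ are reordered and rescaled (possibly sign-flipped) copies of the components of $s$, so they are still mutually independent and non-Gaussian. The pair $(\tilde{A}, \tilde{s})$ therefore fits the data as well as $(A, s)$, and without further restrictions ICA can recover $\Gamma_0$ only in the form $Q \Lambda \Gamma_0$.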
(Footnote 1, on the non-Gaussianity assumption: Actually, the requirement is that at most one of the residuals can be Gaussian.)

For the shocks to be mutually independent their joint distribution has to factorize into the product of the marginal distributions. In the non-Gaussian setting, this implies zero partial correlation, but the converse is not true (as opposed to the Gaussian case where the two statements are equivalent). Thus, for non-Gaussian distributions conditional independence is a much stronger requirement than uncorrelatedness.

Under the assumption that the shocks $\varepsilon_t$ are non-Gaussian and independent, equation (14) follows exactly the ICA model and applying ICA to the VAR residuals $u_t$ yields a unique solution (up to ordering, sign, and scaling) for the mixing matrix $\Gamma_0^{-1}$ and the independent components $\varepsilon_t$ (i.e. the structural shocks in our case). However, the ambiguities of ICA make it hard to directly interpret the shocks found by ICA, since without further analysis we cannot relate the shocks directly to the measured variables. Hence, we assume that the residuals $u_t$ follow a linear non-Gaussian acyclic model (Shimizu et al., 2006), which means that the contemporaneous structure is represented by a DAG (directed acyclic graph). In particular, the model is given by

$$u_t = B_0 u_t + \varepsilon_t \qquad (15)$$

with a matrix $B_0$ whose diagonal elements are all zero and which, if permuted according to the causal order, is strictly lower triangular. By rewriting equation (15) as $(I - B_0) u_t = \varepsilon_t$ and comparing with equation (14), we see that

$$\Gamma_0 = I - B_0. \qquad (16)$$

From this equation it follows that the matrix $B_0$ describes the contemporaneous structure of the variables $Y_t$ in the SVAR model as shown in equation (6). Thus, if we can identify the matrix $\Gamma_0$, we also obtain the matrix $B_0$ for the contemporaneous effects. As pointed out above, the matrix $\Gamma_0^{-1}$ (and hence $\Gamma_0$) can be estimated using ICA up to ordering, scaling, and sign. With the restriction of $B_0$ representing an acyclic system, we can resolve these ambiguities and are able to fully identify the model. For simplicity, let us assume that the variables are arranged according to a causal ordering, so that the matrix $B_0$ is strictly lower triangular. From equation (16) it then follows that the matrix $\Gamma_0$ is lower triangular with all ones on the diagonal. Using this information, the ambiguities of ICA can be resolved in the following way.

The lower triangularity of $B_0$ allows us to find the unique permutation of the rows of $\Gamma_0$ which yields all non-zero elements on the diagonal of $\Gamma_0$, meaning that we replace the matrix $\Gamma_0$ with $Q_1 \Gamma_0$, where $Q_1$ is the uniquely determined permutation matrix. Finding this permutation resolves the ordering ambiguity of ICA and links the shocks $\varepsilon_t$ to the components of the residuals $u_t$ in a one-to-one manner. The sign and scaling ambiguity is now easy to fix by simply dividing each row of $\Gamma_0$ (the row-permuted version from above) by the corresponding diagonal element, yielding all ones on the diagonal, as implied by Equation (16).
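The row permutation and row rescaling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure's reference implementation: the function name `resolve_ica_ambiguities`, the tolerance `tol` and the brute-force search over all row orderings are choices made here for clarity, and the input `W` is assumed to be an ICA estimate of $\Gamma_0$ whose rows are reordered and rescaled but otherwise exact. With noisy estimates no entry is exactly zero, so a softer criterion than the exact zero test used here would be needed.

```python
from itertools import permutations


def resolve_ica_ambiguities(W, tol=1e-9):
    """Undo the row order, sign and scale ambiguity of an ICA estimate of Gamma_0.

    W is assumed to equal Gamma_0 up to a row permutation and a rescaling of
    each row. Returns (gamma0, b0) with ones on the diagonal of gamma0 and
    b0 = I - gamma0. Names and tolerance are illustrative assumptions.
    """
    n = len(W)
    # Collect every row ordering that puts a non-zero entry on each diagonal
    # position. Brute force over n! orderings, so only suitable for small n.
    candidates = [
        p for p in permutations(range(n))
        if all(abs(W[p[i]][i]) > tol for i in range(n))
    ]
    if len(candidates) != 1:
        raise ValueError("no unique row permutation with a zero-free diagonal")
    permuted = [W[r] for r in candidates[0]]
    # Divide each row by its diagonal element, which fixes sign and scale.
    gamma0 = [[v / row[i] for v in row] for i, row in enumerate(permuted)]
    # Equation (16): Gamma_0 = I - B_0, hence B_0 = I - Gamma_0.
    b0 = [
        [(1.0 if i == j else 0.0) - gamma0[i][j] for j in range(n)]
        for i in range(n)
    ]
    return gamma0, b0


# Hypothetical example: the rows of Gamma_0 = I - B_0 were shuffled and rescaled.
W = [[-0.6, 1.6, -2.0],
     [0.5, 0.0, 0.0],
     [-1.5, 3.0, 0.0]]
gamma0, b0 = resolve_ica_ambiguities(W)
for row in b0:
    print(row)
```

In this example the recovered `b0` is strictly lower triangular, which is what the acyclicity assumption predicts when the variables are already listed in a causal order.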
Dividing each row by its diagonal element ensures that the connection strength of each shock in $\varepsilon_t$ on its corresponding residual in $u_t$ is fixed to one in our model (Equation (15)).

For the general case where $B_0$ is not arranged in the causal order, the above arguments for solving the ambiguities still apply. Furthermore, we can find the causal order of the contemporaneous variables by performing simultaneous row and column permutations on $\Gamma_0$, yielding the matrix closest to lower triangular, in particular $\tilde{\Gamma}_0 = Q_2 \Gamma_0 Q_2'$ with an appropriate permutation matrix $Q_2$. In case none of these permutations leads to a matrix that is close to lower triangular, a warning is issued.

Essentially, the assumption of acyclicity allows us to uniquely connect the structural shocks $\varepsilon_t$ to the components of $u_t$ and fully identify the contemporaneous structure. Details of the procedure can be found in (Shimizu et al., 2006; Hyvärinen et al., 2010). In the sense of the Cholesky factorization of the covariance matrix explained in Section 1 (with $P D^{-1} = \Gamma_0^{-1}$), full identifiability means that a causal order among the contemporaneous variables can be determined.

In addition to yielding full identification, a further benefit of using the ICA-based procedure when shocks are non-Gaussian is that it does not rely on the faithfulness assumption, which was necessary in the Gaussian case.

We note that there are many ways of exploiting non-Gaussian shocks for model identification as alternatives to directly using ICA. One such approach was introduced by Shimizu et al. (2009).
Their method relies on iteratively finding an exogenous variable and regressing out its influence on the remaining variables. An exogenous variable is characterized by being independent of the residuals when regressing any other variable in the model on it. Starting from the model in equation (15), this procedure returns a causal ordering of the variables $u_t$, and then the matrix $B_0$ can be estimated using the Cholesky approach.

One relatively strong assumption of the above methods is the acyclicity of the contemporaneous structure. In (Lacerda et al., 2008) an extension was proposed where feedback loops were allowed. In terms of the matrix $B_0$ this means that it is not re-