GLS estimation of dynamic factor models

1 Introduction

Since the influential work of Forni, Hallin, Lippi, and Reichlin (2000), Stock and Watson (2002a, 2002b), Bai and Ng (2002), and Bai (2003), dynamic factor models have become an important tool in macroeconomic forecasting (e.g. Watson 2003; Eickmeier and Ziegler 2007) and structural analysis (e.g. Giannone, Reichlin, and Sala 2002; Bernanke, Boivin, and Eliasz 2004; Eickmeier 2007). Under the weak assumptions of an approximate factor model (Chamberlain and Rothschild 1983), the parameters of the models can be consistently estimated by applying the traditional principal component (PC) estimator (Stock and Watson 2002a; Bai 2003) or, in the frequency domain, by using the dynamic principal component estimator (Forni et al. 2000). Assuming Gaussian i.i.d. errors, the PC estimator is equivalent to the ML estimator and, therefore, the PC estimator is expected to share the asymptotic properties of the ML estimator. It is well known that a Generalized Least Squares (GLS) type criterion function yields a more efficient estimator than the OLS-based PC estimator if the errors are heteroskedastic (e.g. Boivin and Ng 2006; Doz, Giannone, and Reichlin 2006a; Choi 2007). However, it is less clear how the estimator can be improved in the case of serially correlated errors. Stock and Watson (2005) suggest a GLS transformation similar to the one that is used to correct for autocorrelation in the linear regression model. As we will argue below, this approach has the disadvantage that such a transformation may inflate the number of factors.

In this paper we consider the Gaussian ML estimator in models where the errors are heteroskedastic and autocorrelated. We derive the first order conditions for a maximum of the (approximate) log-likelihood function and show that the resulting system of equations can be solved by running a sequence of GLS regressions. Specifically, the factors can be estimated by taking into account possible heteroskedasticity of the errors, whereas the factor loadings are estimated by using the usual GLS transformation for autocorrelated errors. We show that asymptotically the estimated covariance parameters do not affect the limiting distribution of the PC-GLS estimator. Therefore, the feasible two-step GLS estimation procedure is asymptotically as efficient as the estimator that maximizes the approximate likelihood function. In small samples, however, our Monte Carlo simulations suggest that the iterated PC-GLS estimator can be substantially more efficient than the simpler two-step estimator. In a related paper, Jungbacker and Koopman (2008) consider the state space representation of the factor model,


3 The PC-GLS estimator

In this section we follow Stock and Watson (2005) and assume that the idiosyncratic components have a stationary heterogenous autoregressive representation of the form

$$e_{it} = \rho_{1,i} e_{i,t-1} + \cdots + \rho_{p_i,i} e_{i,t-p_i} + \varepsilon_{it} \qquad (4)$$
$$\rho_i(L) e_{it} = \varepsilon_{it}, \qquad (5)$$

where $\rho_i(L)$ is defined above. The autoregressive structure of the idiosyncratic component can be represented in matrix format by defining the $(T - p_i) \times T$ matrix

$$R(\rho^{(i)}) = \begin{bmatrix} -\rho_{p_i,i} & -\rho_{p_i-1,i} & -\rho_{p_i-2,i} & \cdots & 1 & 0 & 0 & \cdots \\ 0 & -\rho_{p_i,i} & -\rho_{p_i-1,i} & \cdots & -\rho_{1,i} & 1 & 0 & \cdots \\ \vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots & \cdots \end{bmatrix}.$$

Thus, the autoregressive representation (4) is written in matrix form as

$$R(\rho^{(i)}) e_i = \varepsilon_i, \qquad (6)$$

where $\varepsilon_i = [\varepsilon_{i,p_i+1}, \ldots, \varepsilon_{iT}]'$ and $e_i = [e_{i1}, \ldots, e_{iT}]'$. Furthermore, we do not impose the assumption that the idiosyncratic errors have the same variances across $i$ and $t$, but assume that $\sigma_i^2 = E(\varepsilon_{it}^2)$ may be different across $i$.

We do not need to make specific assumptions about the dynamic properties of the vector of common factors, $F_t$. Apart from some minor regularity conditions, the only consequential assumption that we have to impose on the factors is that they are weakly serially correlated (cf. Assumption 1 in section 4).

The PC-GLS estimator maximizes the approximate Gaussian log-likelihood function

$$S(F, \Lambda, \rho, \Sigma) = -\frac{T}{2}\sum_{i=1}^N \log\sigma_i^2 - \sum_{i=1}^N\sum_{t=p_i+1}^T \frac{(e_{it} - \rho_{1,i}e_{i,t-1} - \cdots - \rho_{p_i,i}e_{i,t-p_i})^2}{2\sigma_i^2}, \qquad (7)$$

where $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_N^2)$. If $x_{it}$ is normally distributed and $N \to \infty$, then the PC-GLS estimator is asymptotically equivalent to the ML estimator. This can be seen by writing the log-likelihood function as $L(X) = L(e|F) + L(F)$, where $L(e|F)$ denotes the logarithm of the conditional density function of $e_{11}, \ldots, e_{NT}$ conditional on the factors $F$ and $L(F)$ is the log-density of $(F_1', \ldots, F_T')$. Since $L(e|F)$ is $O_p(NT)$ and $L(F)$ is $O_p(T)$, it follows that as $N \to \infty$ maximizing $L(e|F)$ is equivalent to maximizing the full likelihood function $L(X)$.
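To make the quasi-differencing matrix concrete, the following minimal numpy sketch builds $R(\rho^{(i)})$ row by row; the language choice, the function name ar_filter_matrix, and the zero-based indexing are our own illustration, not part of the paper.

```python
import numpy as np

def ar_filter_matrix(rho, T):
    """(T - p) x T matrix R(rho) of eq. (6): R(rho) @ e applies the AR
    filter rho(L), i.e. row t yields e_{t+p} - rho_1 e_{t+p-1} - ... - rho_p e_t."""
    p = len(rho)
    R = np.zeros((T - p, T))
    for t in range(T - p):
        R[t, t + p] = 1.0                  # coefficient on e_{t+p}
        for k in range(1, p + 1):
            R[t, t + p - k] = -rho[k - 1]  # coefficient -rho_k on e_{t+p-k}
    return R

# example: for an AR(1) error with rho = 0.5 and T = 5,
# (R @ e)[t] equals e[t+1] - 0.5 * e[t]
R = ar_filter_matrix([0.5], 5)
```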


The gradients of the criterion function stated above are obtained as

$$g_{\lambda_i}(\cdot) = \frac{\partial S(\cdot)}{\partial\lambda_i} = \frac{1}{\sigma_i^2}\sum_{t=p_i+1}^T \varepsilon_{it}[\rho_i(L)F_t] \qquad (8)$$

$$g_{F_t}(\cdot) = \frac{\partial S(\cdot)}{\partial F_t} = \sum_{i=1}^N \frac{1}{\sigma_i^2}\left(\varepsilon_{it}\lambda_i - \rho_{1,i}\varepsilon_{i,t+1}\lambda_i - \cdots - \rho_{p_i,i}\varepsilon_{i,t+p_i}\lambda_i\right) = \sum_{i=1}^N \frac{1}{\sigma_i^2}[\rho_i(L^{-1})\varepsilon_{it}]\lambda_i \qquad (9)$$

$$g_{\rho_{k,i}}(\cdot) = \frac{\partial S(\cdot)}{\partial\rho_{k,i}} = \frac{1}{\sigma_i^2}\sum_{t=p_i+1}^T \varepsilon_{it}(x_{i,t-k} - \lambda_i'F_{t-k}) \qquad (10)$$

$$g_{\sigma_i^2}(\cdot) = \frac{\partial S(\cdot)}{\partial\sigma_i^2} = \frac{\sum_{t=p_i+1}^T \varepsilon_{it}^2}{2\sigma_i^4} - \frac{T}{2\sigma_i^2}, \qquad (11)$$

where $\varepsilon_{is} = 0$ for $s > T$. The PC-GLS estimator is obtained by setting these gradients equal to zero and solving the resulting system iteratively. A practical problem is the large dimension of the system consisting of $2Nr + N + \sum p_i$ equations. Accordingly, in many practical situations it is very demanding to compute the inverse of the Hessian matrix that is required to construct an iterative minimization algorithm. We therefore suggest a simple two-step estimator that will be shown to be asymptotically equivalent to the PC-GLS estimator.

Let us first assume that the covariance parameters $\rho$ and $\Sigma$ are known. The (infeasible) two-step estimators $\tilde F_t$ ($t = 1, \ldots, T$) and $\tilde\lambda_i$ ($i = 1, \ldots, N$) that result from using the PC estimators as first stage estimators are obtained by solving the following sets of equations:

$$g_{F_t}(\hat\Lambda, \tilde F_t, \rho, \Sigma) = 0 \qquad (12)$$
$$g_{\lambda_i}(\tilde\lambda_i, \hat F, \rho, \Sigma) = 0, \qquad (13)$$

where $\hat F = [\hat F_1, \ldots, \hat F_T]'$ and $\hat\Lambda = [\hat\lambda_1, \ldots, \hat\lambda_N]'$ are the ordinary PC-OLS estimators of $F$ and $\Lambda$.

It is not difficult to see that the two-step estimator of $\lambda_i$ is equivalent to the least-squares estimator of $\lambda_i$ in the regression

$$\rho_i(L)x_{it} = [\rho_i(L)\hat F_t]'\lambda_i + \varepsilon_{it}^* \qquad (t = p_i+1, \ldots, T), \qquad (14)$$

where $\varepsilon_{it}^* = \varepsilon_{it} + \rho_i(L)(F_t - \hat F_t)'\lambda_i$.
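Regression (14) is then just OLS after quasi-differencing both the series and the estimated factors. A minimal sketch, reusing the ar_filter_matrix helper from above (our naming; a balanced panel is assumed):

```python
def loadings_gls(x_i, F_hat, rho_i):
    """Two-step PC-GLS estimator of lambda_i, eq. (14): OLS of
    rho_i(L) x_it on rho_i(L) F_hat_t for t = p_i+1, ..., T."""
    R = ar_filter_matrix(rho_i, len(x_i))
    y = R @ x_i      # quasi-differenced series rho_i(L) x_it
    Z = R @ F_hat    # quasi-differenced factor estimates, (T - p_i) x r
    lam_i, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return lam_i
```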


The two-step estimator of $F_t$ (given $\hat\Lambda$) is more difficult to understand. Consider the two-way GLS transformation that accounts for both serial correlation and heteroskedasticity:

$$\frac{1}{\sigma_i}\rho_i(L)x_{it} = \frac{1}{\sigma_i}\hat\lambda_i'[\rho_i(L)F_t] + \frac{1}{\sigma_i}\varepsilon_{it}, \qquad (15)$$

where for notational convenience we assume $p_i = p$ for all $i$. Furthermore, our notation ignores the estimation error that results from replacing $\lambda_i$ by $\hat\lambda_i$.⁴

⁴ The complete error term is given by $\sigma_i^{-1}[\varepsilon_{it} + (\lambda_i - \hat\lambda_i)'\rho_i(L)F_t]$. However, as we will show in section 4, the estimation error in $\hat\lambda_i$ does not affect the asymptotic properties of the estimator.

We will argue below that in order to estimate $F_t$ we can ignore the GLS transformation that is due to serial correlation. But let us first consider the full two-step GLS estimator of $F_t$ that corresponds to condition (9). Collecting the equations for $t = p+1, \ldots, T$ the model can be re-written in matrix notation as

$$\tilde X_i = \tilde Z_i f + \tilde\varepsilon_i, \qquad (16)$$

where $\tilde X_i = \sigma_i^{-1}[\rho_i(L)x_{i,p+1}, \ldots, \rho_i(L)x_{iT}]'$, $\tilde\varepsilon_i = \sigma_i^{-1}[\varepsilon_{i,p+1}, \ldots, \varepsilon_{iT}]'$, $\tilde Z_i = \sigma_i^{-1}[\hat\lambda_i' \otimes R(\rho^{(i)})]$, and $f = \mathrm{vec}(F)$. The complete system can be written as

$$\tilde x = \tilde Z f + \tilde\varepsilon, \qquad (17)$$

where $\tilde x = [\tilde X_1', \ldots, \tilde X_N']'$, $\tilde Z = [\tilde Z_1', \ldots, \tilde Z_N']'$, and $\tilde\varepsilon = [\tilde\varepsilon_1', \ldots, \tilde\varepsilon_N']'$. To see that the least-squares estimator of $f$ is equivalent to a two-step estimator setting the gradient (9) equal to zero (given some initial estimator of $\lambda_i$), consider the model with only one factor (i.e., $f = F$) and $\rho_i(L) = 1 - \rho_iL$. Since

$$\sum_{i=1}^N \tilde Z_i'\tilde\varepsilon_i = \sum_{i=1}^N \frac{\hat\lambda_i}{\sigma_i^2}\begin{bmatrix} -\rho_i & 0 & 0 & \cdots & 0 \\ 1 & -\rho_i & 0 & \cdots & 0 \\ 0 & 1 & -\rho_i & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}\begin{bmatrix} \varepsilon_{i2} \\ \varepsilon_{i3} \\ \varepsilon_{i4} \\ \vdots \\ \varepsilon_{iT} \end{bmatrix} = \sum_{i=1}^N \frac{\hat\lambda_i}{\sigma_i^2}\begin{bmatrix} -\rho_i\varepsilon_{i2} \\ \varepsilon_{i2} - \rho_i\varepsilon_{i3} \\ \varepsilon_{i3} - \rho_i\varepsilon_{i4} \\ \vdots \\ \varepsilon_{iT} \end{bmatrix},$$

it follows that the system estimator based on (17) solves the first order condition (9). Note that the resulting estimator involves the inversion of the $T \times T$ matrix $\tilde Z'\tilde Z$, which is computationally demanding if $T$ is large.

Fortunately, this estimator can be simplified, since the GLS transformation due to the serial correlation of the errors is irrelevant. The GLS transformation resulting from heteroskedastic errors yields $X_t^* = \Lambda^*F_t + u_t$, where $X_t^* = \Sigma^{-1/2}X_t$,


$\Lambda^* = \Sigma^{-1/2}\Lambda$, and $u_t = \Sigma^{-1/2}e_t$. Replacing $\Lambda^*$ by $\hat\Lambda^* = \Sigma^{-1/2}\hat\Lambda$, two-step estimation implies estimating $F_1, \ldots, F_T$ from the system

$$X_1^* = \hat\Lambda^*F_1 + u_1^*$$
$$\vdots$$
$$X_T^* = \hat\Lambda^*F_T + u_T^*,$$

where $u_t^* = u_t + (\Lambda^* - \hat\Lambda^*)F_t$. Note that the vectors $u_t^*$ and $u_s^*$ are correlated, which suggests estimating the system by using a GLS (SUR) estimator. However, it is well known that the GLS estimator is identical to (equation-wise) OLS estimation if the regressor matrix is identical in all equations. Indeed, since in the present setup the regressor matrix is $\hat\Lambda^*$ for all equations, it follows that single-equation OLS estimation is as efficient as estimating the whole system by using a GLS approach. Thus, the estimation procedure for $F_t$ can be simplified by ignoring the serial correlation of the errors. This suggests estimating $F_t$ from the cross-section regression

$$\frac{1}{\omega_i}x_{it} = \left(\frac{1}{\omega_i}\hat\lambda_i'\right)F_t + u_{it}^* \qquad (i = 1, \ldots, N), \qquad (18)$$

where $\omega_i^2 = E(e_{it}^2)$. In what follows we focus on this simplified version of the two-step estimation approach as its properties are equivalent to those of the full GLS estimation procedure.
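In code, the simplified second step is a single weighted least-squares fit per period, which can be vectorized over all $t$. A minimal sketch of (18) under the same assumptions as before (our helper name):

```python
def factors_gls(X, Lam_hat, omega2):
    """Two-step PC-GLS estimator of the factors, eq. (18):
    F_t = (Lam' Omega^{-1} Lam)^{-1} Lam' Omega^{-1} X_t, i.e. a
    cross-section WLS with weights 1/omega_i^2, computed for every t."""
    W = Lam_hat / omega2[:, None]            # rows lambda_i / omega_i^2
    A = Lam_hat.T @ W                        # Lam' Omega^{-1} Lam, r x r
    return np.linalg.solve(A, W.T @ X.T).T   # T x r matrix with rows F_t'
```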


4 Asymptotic distribution of the two-step PC-GLS estimator

In this section we first study the asymptotic properties of the infeasible two-step estimator, that is, we assume that the covariance parameters $\rho_{k,i}$ and $\omega_i^2$ are known. In the following section we study what happens if the unknown covariance parameters are replaced by estimates. As we will see, the results derived in this section carry over to the case of estimated covariance parameters.

Our analysis is based on the same set of assumptions as in Bai (2003):

Assumption 1: (i) $E\|F_t\|^4 < \infty$ for all $t$ and $T^{-1}\sum_{t=1}^T F_tF_t' \xrightarrow{p} \Psi_F$ (p.d.). (ii) $\|\lambda_i\| < \infty$ for all $i$ and $N^{-1}\Lambda'\Lambda \to \Psi_\Lambda$ (p.d.). (iii) For the idiosyncratic components it is assumed that $E(e_{it}) = 0$, $E|e_{it}|^8 < \infty$, $0 < |\gamma_N(s,s)| < \infty$,

$F_t'$ as the $t$-th row of $F^*$) with the understanding that $F$ obeys the normalization $T^{-1}F'F \xrightarrow{p} I_r$.

As we do not impose the assumptions of a strict factor model with stationary idiosyncratic errors, we define the following "pseudo-true" values of the autoregressive and variance parameters:

$$[\rho_{1,i}, \ldots, \rho_{p_i,i}]' = \Gamma_{i,11}^{-1}\Gamma_{i,10}, \qquad \omega_i^2 = \lim_{T\to\infty} T^{-1}\sum_{t=1}^T E(e_{it}^2),$$

where

$$\Gamma_i = \lim_{T\to\infty} E\left(\frac{1}{T}\sum_{t=p_i+1}^T \begin{bmatrix} e_{i,t-1} \\ \vdots \\ e_{i,t-p_i} \end{bmatrix}\begin{bmatrix} e_{it} & \cdots & e_{i,t-p_i} \end{bmatrix}\right) = \begin{bmatrix} \Gamma_{i,10} & \Gamma_{i,11} \end{bmatrix},$$

$\Gamma_{i,10}$ is a $p_i \times 1$ vector, and $\Gamma_{i,11}$ is a $p_i \times p_i$ matrix.

For the asymptotic analysis we need to impose the following assumption.

Assumption 2: (i) There exists a positive constant $C < \infty$ such that $1/C < \omega_i^2$


(ii) If $(N, T \to \infty)$ and $\sqrt{N}/T \to 0$, then for each $t$,

$$\sqrt{N}(\tilde F_t - F_t) \xrightarrow{d} N\left(0, \tilde\Psi_\Lambda^{-1}\tilde V_{\lambda e}^{(t)}\tilde\Psi_\Lambda^{-1}\right),$$

where

$$\tilde V_{\lambda e}^{(t)} = \lim_{N\to\infty} N^{-1}\sum_{i=1}^N\sum_{j=1}^N \frac{1}{\omega_i^2\omega_j^2}\lambda_i\lambda_j'E(e_{it}e_{jt})$$

and $\tilde\Psi_\Lambda = \lim_{N\to\infty} N^{-1}\Lambda'\Omega^{-1}\Lambda$ and $\Omega = \mathrm{diag}(\omega_1^2, \ldots, \omega_N^2)$.

Remark 1: From (i) it follows that the asymptotic distribution remains the same if the estimate $\rho_i(L)\hat F_t$ in (14) is replaced by $\rho_i(L)F_t$. This suggests that the estimation error in $\hat F_t$ does not affect the asymptotic properties of the estimator $\tilde\lambda_i$. A similar result holds for the regressor $\omega_i^{-1}\hat\lambda_i$. In other words, the additional assumptions on the relative rates of $N$ and $T$ ensure that the estimates of the regressors in equations (14) and (18) can be treated as "super-consistent". The following section sheds more light on this important property.

Remark 2: The assumptions on the relative rates of $N$ and $T$ may appear to be in conflict with each other. However, the two parts of Theorem 1 are fulfilled if $N = cT^\delta$ where $1/2 < \delta < 2$. Therefore, the limiting distribution should give reliable guidance if both dimensions $N$ and $T$ are of comparable magnitude.

Remark 3: It is interesting to compare the result of Theorem 1 with the asymptotic distribution obtained by Choi (2007). In the latter paper it is assumed that $E(e_te_t') = \Omega$ for all $t$, where $e_t = [e_{1t}, \ldots, e_{Nt}]'$, i.e. the idiosyncratic components are assumed to be stationary. In this case the model can be transformed as $X^* = F^*\Lambda^{*\prime} + e^*$, where $X^* = X\Omega^{-1/2}$, $F^* = FJ$, $\Lambda^* = \Omega^{-1/2}\Lambda(J')^{-1}$, $e^* = e\Omega^{-1/2}$, and

$$J = T\Lambda'\Omega^{-1}\Lambda F'\tilde F(\tilde F'X\Omega^{-1}X'\tilde F)^{-1},$$

such that the covariance matrix of $e^*$ is identical to the identity matrix. Note that the normalization of the factor space is different from the normalization of the PC-OLS estimator, whereas our PC-GLS estimator adopts the original normalization. Imposing the former normalization, the asymptotic covariance matrix of the GLS estimator $\tilde F$ reduces to a diagonal matrix (cf. Choi 2007).

Remark 4: The two-step approach can also be employed to an unbalanced data set with different numbers of time periods for the variables. Stock and Watson (2002b) suggest an EM algorithm, where the missing values are replaced


by an estimate of the common component. Let $\hat x_{it} = \hat\lambda_i'\hat F_t$ denote the estimated observation based on the balanced data set ignoring all time periods with missing observations. The updated estimates of the common factors and factor loadings are obtained by applying the PC-OLS estimator to the enhanced data set, where the missing values are replaced by the estimates $\hat x_{it}$. Employing the updated estimates of $F_t$ and $\lambda_i$ results in improved estimates of the missing values that can in turn be used to yield new PC-OLS estimates of the common factors and factor loadings. This estimation procedure can be iterated until convergence. Using the two-step procedure, the estimation procedure can be initialized by using the reduced (balanced) data set to obtain the PC-OLS estimates $\hat F_t$ and $\hat\lambda_i$. In the second step the vector of common factors is estimated from regression (18). As the $T$ cross-section regressions may employ different numbers of observations, missing values do not raise any problems. Similarly, the $N$ time-series regressions (14) may be based on different sample sizes. As in the EM algorithm, this estimation procedure can be iterated until convergence.

5 The feasible GLS estimator

In practice the covariance parameters are unknown and must be replaced by consistent estimates. The feasible two-step PC-GLS estimators $\tilde\lambda_{i,\hat\rho}$ and $\tilde F_{t,\hat\omega}$ solve the first order conditions

$$\tilde g_{\lambda_i}(\tilde\lambda_{i,\hat\rho}, \hat F, \hat\rho^{(i)}) = \sum_{t=p_i+1}^T [\hat\rho_i(L)(x_{it} - \tilde\lambda_{i,\hat\rho}'\hat F_t)][\hat\rho_i(L)\hat F_t] = 0 \qquad (19)$$

$$\tilde g_{F_t}(\hat\Lambda, \tilde F_{t,\hat\omega}, \hat\Omega) = \sum_{i=1}^N \frac{1}{\hat\omega_i^2}(x_{it} - \hat\lambda_i'\tilde F_{t,\hat\omega})\hat\lambda_i = 0, \qquad (20)$$

where

$$\hat\omega_i^2 = \frac{1}{T}\sum_{t=1}^T \hat e_{it}^2 \qquad (21)$$

and $\hat e_{it} = x_{it} - \hat\lambda_i'\hat F_t$. Furthermore, $\hat\rho^{(i)} = [\hat\rho_{1,i}, \ldots, \hat\rho_{p_i,i}]'$ is the least-squares estimator from the regression

$$\hat e_{it} = \hat\rho_{1,i}\hat e_{i,t-1} + \cdots + \hat\rho_{p_i,i}\hat e_{i,t-p_i} + \hat\varepsilon_{it}. \qquad (22)$$

These estimators can be iterated using the resulting estimates $\tilde\lambda_{i,\hat\rho}$ and $\tilde F_{t,\hat\omega}$ instead of the estimates $\hat F_t$ and $\hat\lambda_i$ in regressions (14) and (18).
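The feasible version only needs the residual-based estimates (21) and (22). A minimal sketch (our helper name; a common AR order p for all i is assumed):

```python
def covariance_params(X, F_hat, Lam_hat, p):
    """Feasible covariance parameters: omega_i^2 from eq. (21) and the
    AR coefficients rho_(i) from the residual regression (22)."""
    E = X - F_hat @ Lam_hat.T              # PC residuals e_hat_it, T x N
    omega2 = (E ** 2).mean(axis=0)         # eq. (21)
    T, N = X.shape
    rho = np.zeros((N, p))
    for i in range(N):
        e = E[:, i]
        # lagged regressors [e_{t-1}, ..., e_{t-p}] for t = p+1, ..., T
        Z = np.column_stack([e[p - k:T - k] for k in range(1, p + 1)])
        rho[i], *_ = np.linalg.lstsq(Z, e[p:], rcond=None)  # eq. (22)
    return rho, omega2
```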


Similarly, improved estimators of the covariance parameters can be obtained from the second step residuals $\tilde e_{it} = x_{it} - \tilde\lambda_{i,\hat\rho}'\tilde F_{t,\hat\omega}$. This iterative estimation scheme converges to the estimators that maximize the criterion function given by equation (7). To study the limiting distribution of the feasible two-step PC-GLS estimator, the following lemma is used.

Lemma 1: Let $\hat\rho^{(i)} = [\hat\rho_{1,i}, \ldots, \hat\rho_{p_i,i}]'$ denote the least-squares estimates from (22) and let $\hat\omega_i^2$ be the estimator defined in (21). Under Assumption 1 we have as $(N, T \to \infty)$

$$\hat\rho^{(i)} = \rho^{(i)} + O_p(T^{-1/2}) + O_p(\delta_{NT}^{-2}) \qquad \text{and} \qquad \hat\omega_i^2 = \omega_i^2 + O_p(T^{-1/2}) + O_p(\delta_{NT}^{-2}),$$

where $\delta_{NT} = \min(\sqrt{N}, \sqrt{T})$.

The following theorem shows that the asymptotic distributions of the feasible two-step PC-GLS estimators of $\lambda_i$ and $F_t$ are identical to the ones stated in Theorem 1.

Theorem 2: Let $\tilde\lambda_{i,\hat\rho}$ and $\tilde F_{t,\hat\omega}$ denote the feasible two-step PC-GLS estimators based on (19) and (20). Under Assumptions 1–3 and if $(N, T \to \infty)$ and $\sqrt{T}/N \to 0$:

$$\sqrt{T}(\tilde\lambda_{i,\hat\rho} - \tilde\lambda_i) \xrightarrow{p} 0.$$

If $(N, T \to \infty)$ and $\sqrt{N}/T \to 0$:

$$\sqrt{N}(\tilde F_{t,\hat\omega} - \tilde F_t) \xrightarrow{p} 0$$

and, therefore, the estimators $\tilde\lambda_{i,\hat\rho}$ and $\tilde F_{t,\hat\omega}$ solving $\tilde g_{\lambda_i}(\tilde\lambda_{i,\hat\rho}, \hat F, \hat\rho^{(i)}) = 0$ and $\tilde g_{F_t}(\hat\Lambda, \tilde F_{t,\hat\omega}, \hat\Omega) = 0$, respectively, have the same limiting distribution as the (infeasible) GLS estimators of Theorem 1.

The proof of this theorem is based on the fact that additional terms in the Taylor series expansion of the first order conditions that depend on the derivatives with respect to the remaining parameters converge to zero in probability under the given conditions on $N$ and $T$. Another important consequence of Theorem 2 is that the iterated PC-GLS estimators of $\lambda_i$ and $F_t$ have the same asymptotic properties as the feasible two-step estimators. Thus, additional steps do not improve the asymptotic distribution of the estimators. However, further iterations may improve the small sample properties.
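Combining the sketches above gives an iterated PC-GLS routine; the cap of five iterations mirrors the choice made in the Monte Carlo section below. This composition is our own illustration and ignores the re-normalization of the factor space across iterations:

```python
def iterated_pc_gls(X, F_hat, Lam_hat, p, n_iter=5):
    """Iterated PC-GLS: alternate the covariance-parameter update with
    the loadings regressions (14) and the factor regressions (18),
    starting from the PC-OLS estimates F_hat, Lam_hat."""
    F, Lam = F_hat, Lam_hat
    for _ in range(n_iter):
        rho, omega2 = covariance_params(X, F, Lam, p)
        Lam = np.vstack([loadings_gls(X[:, i], F, rho[i])
                         for i in range(X.shape[1])])
        F = factors_gls(X, Lam, omega2)
    return F, Lam
```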


6 Small sample properties

In order to investigate the small sample properties of the proposed estimators, we perform a Monte Carlo study. In particular, we calculate the relative efficiency of the respective estimator compared to the infeasible estimator that solves the first order conditions $\tilde g_{\lambda_i}(\tilde\lambda_i, F, \rho^{(i)})$ and $\tilde g_{F_t}(\Lambda, \tilde F_t, \Omega)$. The latter is optimal in the sense that the respective estimates are based on the true parameter values describing heteroskedasticity and autocorrelation. Furthermore, the infeasible estimator employs the true factor loadings when estimating the factors, and when estimating the loadings it uses the true common factors. Thus, inefficiencies which could arise from employing estimated regressors and estimated autocorrelation and heteroskedasticity parameters are absent.

This infeasible estimator is the reference point for four different approaches considered in this investigation: the standard PC estimator, the two-step and iterated PC-GLS estimators as described above, and the quasi maximum likelihood (QML) estimator of Doz et al. (2006a). The latter authors suggest maximum likelihood estimation of the approximate dynamic factor model via the Kalman filter employing the EM algorithm. In order to make the standard estimation approach of traditional factor analysis feasible in the large approximate dynamic factor environment, Doz et al. (2006a, 2006b) approximate the probabilistic model, where their approximation does not allow for cross-sectional correlation of the idiosyncratic component. However, their estimation procedure does take into account factor dynamics as well as heteroskedasticity of the idiosyncratic error.⁵

The data-generating process of our Monte Carlo study is the following:

$$x_{it} = \lambda_iF_t + e_{it},$$
$$F_t = \gamma F_{t-1} + u_t, \qquad u_t \stackrel{iid}{\sim} N(0, 1-\gamma^2),$$
$$e_{it} = \rho_ie_{i,t-1} + \varepsilon_{it}, \qquad \varepsilon_{it} \stackrel{iid}{\sim} N(0, \sigma_i^2(1-\rho_i^2)),$$

where $\rho_i \stackrel{iid}{\sim} U[a, b]$ and $\lambda_i \stackrel{iid}{\sim} U(0, 1)$.

⁵ Even though their actual implementation of the estimator does not allow for serial correlation of the idiosyncratic component, Doz et al. (2006a) point out that, in principle, it is possible to take into account this feature in the estimation approach. However, the resulting estimator is computationally demanding as it implies $N$ additional transition equations for the idiosyncratic components.
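A sketch of this data-generating process for the one-factor case (our function; both processes are started at zero rather than drawn from their stationary distributions, which is immaterial for moderate T):

```python
def simulate_dgp(N, T, gamma, a, b, sigma2, seed=None):
    """One-factor Monte Carlo DGP: AR(1) factor with unit variance and
    AR(1) idiosyncratic errors with variances sigma2 (length-N array)."""
    rng = np.random.default_rng(seed)
    lam = rng.uniform(0.0, 1.0, N)        # lambda_i ~ U(0, 1)
    rho = rng.uniform(a, b, N)            # rho_i ~ U[a, b]
    F = np.zeros(T)
    e = np.zeros((T, N))
    for t in range(1, T):
        F[t] = gamma * F[t - 1] + rng.normal(0.0, np.sqrt(1 - gamma**2))
        e[t] = rho * e[t - 1] + rng.normal(0.0, np.sqrt(sigma2 * (1 - rho**2)), N)
    X = np.outer(F, lam) + e
    return X, F, lam
```

For instance, the autocorrelation scenario described below would correspond to a call such as simulate_dgp(N, T, gamma=0.7, a=0.5, b=0.9, sigma2=np.full(N, 2.0)).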


The interval from which the $\rho_i$'s are drawn, i.e. the choice of $a$ and $b$, varies across simulation setups and is specified below. Furthermore, in our baseline simulations we set the number of static and dynamic factors equal to one and, therefore, $F_t$ is a scalar. In order to check the robustness of our results, however, we also consider setups where the number of factors is increased to five. Two different scenarios are considered. First, we concentrate on the dynamic aspects and set $\gamma = 0.7$, $\rho_i \stackrel{iid}{\sim} U[0.5, 0.9]$, as well as $\sigma_i^2 = 2$ for all $i$. In the second case, the focus is on heteroskedasticity, where $\gamma = 0$, $\rho_i = 0$ for all $i$, and $\sigma_i \stackrel{iid}{\sim} |N(\sqrt{2}, 0.25)|$.

We generate 1000 replications for different $(N, T)$-specifications. In particular, we set $N = 50, 100, 200, 300$ and $T = 50, 100, 200$. To construct a performance measure that is invariant to the normalization of the factors (or loadings), our measure of relative efficiency is based on the ratio

$$\mathrm{eff}(\hat F, \hat F_0) = \frac{1 - R^2(F, \hat F_0)}{1 - R^2(F, \hat F)}, \qquad (23)$$

where $R^2(F, \hat F)$ is the coefficient of determination of a regression of $F$ (the true factor) on $\hat F$ (the estimator under consideration) and a constant, and $R^2(F, \hat F_0)$ denotes the $R^2$ of a similar regression where $\hat F$ is replaced by the benchmark estimator $\hat F_0$ (the infeasible GLS estimator). Consequently, numbers close to one indicate high accuracy of the estimator compared to the infeasible GLS estimator, whereas numbers close to zero imply low efficiency.⁶

⁶ In the multiple factor case, our measure of relative efficiency is generalized by considering the trace $R^2$ of a regression of the true factors on the respective estimated factors and a constant.
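The measure (23) can be computed directly from two regression R²'s; a minimal sketch (our function names):

```python
def relative_efficiency(F, F_est, F_bench):
    """Measure (23): (1 - R^2(F, F_bench)) / (1 - R^2(F, F_est)).
    Values near one mean F_est is almost as accurate as the
    infeasible benchmark F_bench; values near zero mean low efficiency."""
    def r2(y, f):
        # R^2 of a regression of y on f and a constant
        Z = np.column_stack([np.ones(len(f)), f])
        resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
        return 1.0 - resid.var() / y.var()
    return (1.0 - r2(F, F_bench)) / (1.0 - r2(F, F_est))
```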


We begin with a setup featuring only one factor. Table 1 reports the results for the case of autocorrelated errors. Apparently, the PC and QML estimators perform relatively poorly, where the efficiency measures of the two estimators are of comparable magnitude. The low accuracy can be explained by the fact that both estimators fail to take into account serial correlation of the idiosyncratic component. However, the QML procedure takes into account the dynamics of the common factors. As has been argued in section 3, the dynamic properties of the factors are irrelevant for the asymptotic properties as $N \to \infty$. In contrast, for the factor loadings both the two-step and the iterated PC-GLS estimator achieve a considerable efficiency gain compared to the PC and QML estimators. In particular, with larger $T$ the two PC-GLS estimators become increasingly accurate and show a similar performance, as expected from Theorem 2. This picture changes somewhat when examining the results for the factors. Using the two-step estimator does not lead to an efficiency gain relative to PC. In this respect, note that the two-step regression for the common factors is not affected by possible autocorrelation of the errors but exploits possible heteroskedasticity of these terms. Interestingly, iterating the PC-GLS estimator until convergence leads to a dramatic increase in relative accuracy.⁷ This is due to the fact that the loadings are estimated more precisely by taking into account the autocorrelation of the errors. Thus, in the second step, the regressors have a smaller error and this improves the efficiency of the factor estimates.

⁷ The number of iterations is limited to a maximum of 5. First, this reduces the computational burden; second, we find no further improvement if the number of iterations is increased.

The results for heteroskedastic errors are presented in Table 2. Again, the PC estimator performs relatively poorly when considering the common factors, which comes as no surprise, since PC does not take into account possible heteroskedasticity of the idiosyncratic component. This contrasts with the high accuracy of the estimated factor loadings. This is due to the fact that, to a first degree, as explained in section 3, what is important for the efficient estimation of the factor loadings is to allow for autocorrelation of the errors. Not surprisingly, since there is no serial correlation in this scenario, the accuracy of PC is relatively high. Furthermore, the two-step PC-GLS estimator of the factor loadings has the same asymptotic properties as the ordinary PC estimator if the errors are serially uncorrelated. In fact, PC exhibits a similar performance as the two-step PC-GLS estimator. A slight efficiency improvement with respect to the loadings is attainable by employing the iterated PC-GLS estimator, in particular if $N$ is small compared to $T$. Analogous to the case with autocorrelated errors, the efficiency gain is due to the fact that by estimating the factors more precisely via incorporating heteroskedasticity, in the second step, the regressors have a smaller error, thus improving the accuracy of the estimated factor loadings. However, in line with Theorem 2, for larger samples the two PC-GLS estimators perform similarly. The same is true for the factor estimates, even though, in accordance with Theorem 2, there are slight efficiency improvements when iterating the PC-GLS estimator in cases with a large $N$ compared to $T$. Finally, the QML estimates of the factors as well as the factor loadings show a strong performance, even slightly better than the iterated PC-GLS estimator. This is due to the fact that in this scenario the approximating model coincides with the true model and the QML estimator is equivalent to the exact ML estimator.

This broad picture also emerges in the case of multiple factors. In particular,


we consider a setup with five factors, where the corresponding results are presented in Tables 3 and 4. The main difference to the case considered above is that, in general, larger sample sizes are needed to achieve the same relative performance. This is particularly true for the two-step estimator and, to a smaller extent, also for the iterated PC-GLS estimator. Overall, this feature seems to be more important in the scenario with autocorrelation than in the setup with heteroskedasticity.

For example, under autocorrelation the two-step estimates of the factor loadings do not achieve as considerable a gain in efficiency compared to PC as in the one-factor case. For the advantage with respect to accuracy of this estimator to become apparent, larger sample sizes are needed than in the preceding scenario. Increasing the cross-section and time-series dimension up to 500, for example, leads to relative efficiency measures for the PC, two-step, and iterated PC-GLS estimators of 0.296, 0.927, and 0.946, respectively. Similarly, the factor estimates of the iterated PC-GLS estimator are a little less precise for the sample sizes considered than in the one-factor case. This is due to the fact that the loadings are estimated a little less accurately than before, thus leading to less precise estimates of the regressors, which in turn negatively affects the efficiency of the factor estimates. Nevertheless, in particular compared to the other estimators under consideration, the iterated PC-GLS estimator shows a quite good performance.

On the other hand, as indicated above, the qualitative findings in the case of heteroskedasticity are quite close to the one-factor case. The most visible difference is the steep increase in efficiency with the sample size. For example, in accordance with Theorem 2, the estimates of the loadings of the two-step estimator gain considerably in accuracy when increasing $N$. The respective measure increases for $T = 200$ from a value of 0.324 for $N = 50$ to 0.843 for $N = 300$, whereas in the one-factor case the corresponding increase is only from 0.663 to 0.950. Analogous results are found for the same estimator with respect to the factor estimates. In this case, however, as expected from the aforementioned theorem, the efficiency gain results when increasing $T$. For $N = 300$ the measure of relative efficiency increases from 0.368 to 0.759 when $T$ rises from 50 to 200. The corresponding numbers in the scenario with only one factor are 0.705 and 0.811. Moreover, also in this setup, iterating the PC-GLS estimator until convergence increases the accuracy of the estimates, sometimes considerably.⁸

⁸ Unfortunately, in very small sample sizes, convergence of the iterated PC-GLS estimator is not always assured. In our setup this only happens for a couple of simulation runs in the case of heteroskedasticity where N = 50 and T = 200. That is why we choose not to report results for this specification.

7 Conclusion

In this paper we propose a GLS-type estimation procedure that allows for heteroskedastic and autocorrelated errors. Since the estimation of the covariance parameters does not affect the limiting distribution of the estimators, the feasible two-step PC-GLS estimator is asymptotically as efficient as the infeasible GLS estimator (assuming that the covariance parameters are known) and the iterated version that solves the first order condition of the (approximate) ML estimator. Notwithstanding these asymptotic results, the results of our Monte Carlo experiments suggest that the gain in efficiency from iterating the sequential GLS estimator may be substantial.

If one is willing to accept the framework of a strict factor model (that is, a model with cross-sectionally uncorrelated factors and idiosyncratic errors), then our approach can also be employed for inference. For example, recent work by Breitung and Eickmeier (2008) shows that a Chow-type test for structural breaks can be derived using the iterated PC-GLS estimator. Other possible applications are LR tests for the number of common factors or tests of hypotheses on the factor space.


Appendix

The following lemma plays a central role in the proofs of the following theorems:

Lemma A.1: It holds for all $k \le p_i$ that

(i) $T^{-1}\sum_{t=p_i+1}^T (\hat F_t - F_t)F_{t-k}' = O_p(\delta_{NT}^{-2})$, $\quad T^{-1}\sum_{t=p_i+1}^T (\hat F_t - F_t)\hat F_{t-k}' = O_p(\delta_{NT}^{-2})$

(ii) $T^{-1}\sum_{t=p_i+1}^T \hat F_t\hat F_{t-k}' = T^{-1}\sum_{t=p_i+1}^T F_tF_{t-k}' + O_p(\delta_{NT}^{-2})$

(iii) $T^{-1}\sum_{t=p_i+1}^T (\hat F_t - F_t)e_{i,t-k} = O_p(\delta_{NT}^{-2})$

(iv) $N^{-1}\sum_{i=1}^N \frac{1}{\omega_i^2}(\hat\lambda_i - \lambda_i)\lambda_i' = O_p(\delta_{NT}^{-2})$, $\quad N^{-1}\sum_{i=1}^N \frac{1}{\omega_i^2}(\hat\lambda_i - \lambda_i)\hat\lambda_i' = O_p(\delta_{NT}^{-2})$

(v) $N^{-1}\sum_{i=1}^N \frac{1}{\omega_i^2}(\hat\lambda_i - \lambda_i)e_{it} = O_p(\delta_{NT}^{-2})$.

Proof: (i) The proof follows closely the proof for $k = 0$ provided by Bai (2003, Lemmas B.2 and B.3). We therefore present only the main steps.

We start from the representation

$$\hat F_t - F_t = \frac{1}{NT}V_{NT}^{-1}\left(\hat F'F\Lambda'e_t + \hat F'e\Lambda F_t + \hat F'ee_t\right),$$

where $e_t = [e_{1t}, \ldots, e_{Nt}]'$, $e = [e_1, \ldots, e_T]'$, and $V_{NT}$ is a $r \times r$ diagonal matrix of the $r$ largest eigenvalues of $(NT)^{-1}XX'$ (cf. Bai, 2003, Theorem 1). Consider

$$T^{-1}\sum_{t=p_i+1}^T (\hat F_t - F_t)F_{t-k}' = \frac{1}{NT^2}V_{NT}^{-1}\left(\hat F'F\Lambda'\sum_{t=p_i+1}^T e_tF_{t-k}' + \hat F'e\Lambda\sum_{t=p_i+1}^T F_tF_{t-k}' + \hat F'e\sum_{t=p_i+1}^T e_tF_{t-k}'\right) = I + II + III.$$

From Assumption 1 (v) it follows that

$$\Lambda'\sum_{t=p_i+1}^T e_tF_{t-k}' = \sum_{i=1}^N\sum_{t=p_i+1}^T e_{it}\lambda_iF_{t-k}' = O_p(\sqrt{NT}).$$


and using Lemma B.2 of Bai (2003) it follows that $T^{-1}\hat F'F = T^{-1}F'F + T^{-1}(\hat F - F)'F = T^{-1}F'F + O_p(\delta_{NT}^{-2})$. Thus, we obtain

$$I = V_{NT}^{-1}\left(T^{-1}\hat F'F\right)\left(\frac{1}{\sqrt{NT}}\Lambda'\sum_{t=p_i+1}^T e_tF_{t-k}'\right)\frac{1}{\sqrt{NT}} = O_p\left(\frac{1}{\sqrt{NT}}\right).$$

Next, we consider

$$\Lambda'e'\hat F = \Lambda'\sum_{t=1}^T e_tF_t' + \Lambda'\sum_{t=1}^T e_t(\hat F_t - F_t)'.$$

Following Bai (2003, p. 160), we have

$$\frac{1}{NT}\Lambda'\sum_{t=1}^T e_tF_t' = O_p\left(\frac{1}{\sqrt{NT}}\right)$$
$$\frac{1}{NT}\Lambda'\sum_{t=1}^T e_t(\hat F_t - F_t)' = O_p\left(\frac{1}{\delta_{NT}\sqrt{N}}\right).$$

Using $T^{-1}\sum_{t=p_i+1}^T F_t'F_{t-k} = O_p(1)$, we obtain

$$II = V_{NT}^{-1}\left(\frac{1}{NT}\hat F'e\Lambda\right)\left(\frac{1}{T}\sum_{t=p_i+1}^T F_tF_{t-k}'\right) = \left[O_p\left(\frac{1}{\sqrt{NT}}\right) + O_p\left(\frac{1}{\delta_{NT}\sqrt{N}}\right)\right]O_p(1).$$

For the remaining term, we obtain

$$\frac{1}{NT^2}\hat F'e\sum_{t=p_i+1}^T e_tF_{t-k}' = \frac{1}{T^2}\sum_{s=1}^T\sum_{t=p_i+1}^T \hat F_sF_{t-k}'\zeta_{NT}(s,t) + \frac{1}{T^2}\sum_{s=1}^T\sum_{t=p_i+1}^T \hat F_sF_{t-k}'\gamma_N(s,t),$$

where

$$\zeta_{NT}(s,t) = e_s'e_t/N - \gamma_N(s,t), \qquad \gamma_N(s,t) = E(e_s'e_t/N).$$

As in Bai (2003, p. 164f), we obtain

$$III = V_{NT}^{-1}\left[O_p\left(\frac{1}{\delta_{NT}\sqrt{T}}\right) + O_p\left(\frac{1}{\delta_{NT}\sqrt{N}}\right)\right].$$


Collecting these results, we obtain

$$I + II + III = O_p\left(\frac{1}{\sqrt{NT}}\right) + O_p\left(\frac{1}{\sqrt{T}\delta_{NT}}\right) + O_p\left(\frac{1}{\sqrt{N}\delta_{NT}}\right) = O_p\left(\frac{1}{\delta_{NT}^2}\right).$$

The proof of the second result in (i) is a similar modification of Lemma A.1 in Bai (2003) and is therefore omitted.

(ii) Consider

$$T^{-1}\sum_{t=p_i+1}^T \hat F_t\hat F_{t-k}' = T^{-1}\sum_{t=p_i+1}^T [F_t + (\hat F_t - F_t)][F_{t-k} + (\hat F_{t-k} - F_{t-k})]'$$
$$= T^{-1}\sum_{t=p_i+1}^T \Big(F_tF_{t-k}' + \underbrace{F_t(\hat F_{t-k}' - F_{t-k}')}_{a} + \underbrace{(\hat F_t - F_t)\hat F_{t-k}'}_{b}\Big) = T^{-1}\sum_{t=p_i+1}^T F_tF_{t-k}' + a + b.$$

Using (i), the terms $a$ and $b$ can be shown to be $O_p(\delta_{NT}^{-2})$.

(iii) The proof for $k = 0$ is given in Bai (2003, Lemma B.1). It is not difficult to see that the result remains unchanged if $k \ne 0$.

(iv) Following Bai (2003, p. 165) we have

$$\hat\lambda_i - \lambda_i = T^{-1}F'e_i + T^{-1}\hat F'(F - \hat F)\lambda_i + T^{-1}(\hat F - F)'e_i, \qquad (24)$$

where $e_i = [e_{i1}, \ldots, e_{iT}]'$. Post-multiplying by $\omega_i^{-2}\lambda_i'$ and averaging yields

$$N^{-1}\sum_{i=1}^N \frac{1}{\omega_i^2}(\hat\lambda_i - \lambda_i)\lambda_i' = T^{-1}F'\left(N^{-1}\sum_{i=1}^N \frac{1}{\omega_i^2}e_i\lambda_i'\right) + T^{-1}\hat F'(F - \hat F)\left(N^{-1}\sum_{i=1}^N \frac{1}{\omega_i^2}\lambda_i\lambda_i'\right) + T^{-1}(\hat F - F)'\left(N^{-1}\sum_{i=1}^N \frac{1}{\omega_i^2}e_i\lambda_i'\right).$$

From Bai (2003, p. 165) it follows that the last two terms are $O_p(\delta_{NT}^{-2})$. From Assumption 1 (v) and Assumption 2 (i) it follows that

$$\left\|\frac{1}{T}\sum_{t=1}^T \frac{1}{\omega_i^2}F_t\lambda_i'e_{it}\right\| \le \frac{1}{\omega_{\min}^2}\left\|\frac{1}{T}\sum_{t=1}^T F_t\lambda_i'e_{it}\right\| = O_p(1/\sqrt{T}),$$


where $\omega_{\min} = \min(\omega_1, \ldots, \omega_N)$. Thus, the first part of (iv) is $O_p(\delta_{NT}^{-2})$. The second equation can be shown by using the first part and Lemma A.1 (v).

(v) From (24) it follows that

$$N^{-1}\sum_{i=1}^N (\hat\lambda_i - \lambda_i)e_{it} = N^{-1}T^{-1}\sum_{s=1}^T\sum_{i=1}^N \hat F_se_{is}e_{it} + N^{-1}T^{-1}\sum_{s=1}^T\sum_{i=1}^N \hat F_s(F_s - \hat F_s)'\lambda_ie_{it} = a + b.$$

For expression $a$ we write

$$N^{-1}T^{-1}\sum_{s=1}^T\sum_{i=1}^N \hat F_se_{is}e_{it} = T^{-1}\sum_{s=1}^T \hat F_s\left[N^{-1}\sum_{i=1}^N \big(e_{is}e_{it} - E(e_{is}e_{it})\big)\right] + T^{-1}\sum_{s=1}^T \hat F_s\gamma_N(s,t).$$

From Lemma A.2 (a) and (b) of Bai (2003) it follows that the first term on the r.h.s. is $O_p(N^{-1/2}\delta_{NT}^{-1})$, whereas the second term is $O_p(T^{-1})$.

To analyze $b$ we note that by Lemma A.1 (i) and Assumption 1 (v)

$$\left[T^{-1}\sum_{s=1}^T \hat F_s(F_s - \hat F_s)'\right]\left[N^{-1}\sum_{i=1}^N \lambda_ie_{it}\right] = O_p(\delta_{NT}^{-2})O_p(N^{-1/2}).$$

Collecting these results, it follows that

$$\left\|N^{-1}\sum_{i=1}^N \frac{1}{\omega_i^2}(\hat\lambda_i - \lambda_i)e_{it}\right\| \le \frac{1}{\omega_{\min}^2}\left\|N^{-1}\sum_{i=1}^N (\hat\lambda_i - \lambda_i)e_{it}\right\| = O_p(T^{-1}) + O_p(N^{-1/2}\delta_{NT}^{-1}) + O_p(N^{-1/2}\delta_{NT}^{-2}) = O_p(\delta_{NT}^{-2}).$$

Proof of Theorem 1: The two-step estimator of $\lambda_i$ is obtained as

$$\tilde\lambda_i = [\hat F'R(\rho^{(i)})'R(\rho^{(i)})\hat F]^{-1}\hat F'R(\rho^{(i)})'R(\rho^{(i)})X_i$$
$$= [\hat F'R(\rho^{(i)})'R(\rho^{(i)})\hat F]^{-1}\hat F'R(\rho^{(i)})'R(\rho^{(i)})(F\lambda_i + e_i)$$
$$= [\hat F'R(\rho^{(i)})'R(\rho^{(i)})\hat F]^{-1}\hat F'R(\rho^{(i)})'R(\rho^{(i)})\{[\hat F + (F - \hat F)]\lambda_i + e_i\}$$
$$\tilde\lambda_i - \lambda_i = [\hat F'R(\rho^{(i)})'R(\rho^{(i)})\hat F]^{-1}\hat F'R(\rho^{(i)})'R(\rho^{(i)})[(F - \hat F)\lambda_i + e_i],$$

where $e_i = [e_{i1}, \ldots, e_{iT}]'$. Using Lemma A.1 (ii) it follows that

$$T^{-1}\hat F'R(\rho^{(i)})'R(\rho^{(i)})\hat F = T^{-1}F'R(\rho^{(i)})'R(\rho^{(i)})F + O_p(\delta_{NT}^{-2}) \xrightarrow{p} \tilde\Psi_F^{(i)}.$$


Using Lemma A.1 (i) we obtain that $T^{-1}\sum_{t=p_i+1}^T \hat F_{t-k}(\hat F_{t-k}' - F_{t-k}')$ is $O_p(\delta_{NT}^{-2})$ and, therefore,

$$T^{-1}\hat F'R(\rho^{(i)})'R(\rho^{(i)})(\hat F - F)\lambda_i = O_p(\delta_{NT}^{-2}).$$

Finally we consider

$$T^{-1/2}\sum_{t=p_i+1}^T [\rho_i(L)\hat F_t][\rho_i(L)e_{it}] = T^{-1/2}\sum_{t=p_i+1}^T \rho_i(L)[F_t + (\hat F_t - F_t)]\rho_i(L)e_{it} = T^{-1/2}\sum_{t=p_i+1}^T \rho_i(L)F_t[\rho_i(L)e_{it}] + O_p(\sqrt{T}/\delta_{NT}^2),$$

where Lemma A.1 (iii) is used. Thus we find

$$\sqrt{T}(\tilde\lambda_i - \lambda_i) = [T^{-1}F'R(\rho^{(i)})'R(\rho^{(i)})F]^{-1}T^{-1/2}F'R(\rho^{(i)})'R(\rho^{(i)})e_i + O_p(\sqrt{T}/\delta_{NT}^2),$$

where $\sqrt{T}/\delta_{NT}^2 \to 0$ if $\sqrt{T}/N \to 0$. Finally, Assumption 1 (v) implies

$$T^{-1/2}F'R(\rho^{(i)})'R(\rho^{(i)})e_i \xrightarrow{d} N(0, \tilde V_{Fe}^{(i)}),$$

where $\tilde V_{Fe}^{(i)}$ is defined in Theorem 1. With these results, part (i) of the theorem follows.

The proof of part (ii) is similar. We therefore present the main steps only. The two-step estimator of $F_t$ is given by

$$\tilde F_t = (\hat\Lambda'\Omega^{-1}\hat\Lambda)^{-1}\hat\Lambda'\Omega^{-1}X_t = (\hat\Lambda'\Omega^{-1}\hat\Lambda)^{-1}\hat\Lambda'\Omega^{-1}[(\hat\Lambda - \hat\Lambda + \Lambda)F_t + e_t]$$
$$\tilde F_t - F_t = (\hat\Lambda'\Omega^{-1}\hat\Lambda)^{-1}\hat\Lambda'\Omega^{-1}[(\Lambda - \hat\Lambda)F_t + e_t],$$

where $e_t = [e_{1t}, \ldots, e_{Nt}]'$. Following Bai (2003) and using Lemma A.1 (iv) and (v) it follows that

$$N^{-1}\hat\Lambda'\Omega^{-1}\hat\Lambda = N^{-1}\Lambda'\Omega^{-1}\Lambda + O_p(\delta_{NT}^{-2}) \xrightarrow{p} \tilde\Psi_\Lambda$$
$$N^{-1}\hat\Lambda'\Omega^{-1}(\hat\Lambda - \Lambda) = O_p(\delta_{NT}^{-2})$$
$$N^{-1}(\hat\Lambda - \Lambda)'\Omega^{-1}e_t = O_p(\delta_{NT}^{-2})$$
$$N^{-1/2}\hat\Lambda'\Omega^{-1}e_t = N^{-1/2}\Lambda'\Omega^{-1}e_t + O_p(\sqrt{N}/\delta_{NT}^2)$$
$$N^{-1/2}\Lambda'\Omega^{-1}e_t \xrightarrow{d} N(0, \tilde V_{\lambda e}^{(t)})$$

with

$$\tilde V_{\lambda e}^{(t)} = E\left(\lim_{N\to\infty} N^{-1}\Lambda'\Omega^{-1}e_te_t'\Omega^{-1}\Lambda\right) = \lim_{N\to\infty} N^{-1}\sum_{i=1}^N\sum_{j=1}^N \frac{1}{\omega_i^2\omega_j^2}\lambda_i\lambda_j'E(e_{it}e_{jt}).$$

From these results the limit distribution stated in Theorem 1 (ii) follows.


Proof of Lemma 1: Let

$$z_t = \begin{bmatrix} e_{it} \\ \vdots \\ e_{i,t-p_i+1} \end{bmatrix} \qquad \text{and} \qquad \hat z_t = \begin{bmatrix} x_{it} - \hat\lambda_i'\hat F_t \\ \vdots \\ x_{i,t-p_i+1} - \hat\lambda_i'\hat F_{t-p_i+1} \end{bmatrix}.$$

Using the same arguments as in Lemma 4 of Bai and Ng (2002) it can be shown that

$$T^{-1}\sum_{t=p_i+1}^T \hat e_{it}\hat z_{t-1} - T^{-1}\sum_{t=p_i+1}^T e_{it}z_{t-1} = O_p(\delta_{NT}^{-2})$$

and $T^{-1}\sum_{t=p_i+1}^T (\hat z_{t-1}\hat z_{t-1}' - z_{t-1}z_{t-1}') = O_p(\delta_{NT}^{-2})$. Therefore, we obtain for the least-squares estimator of $\rho^{(i)}$

$$\hat\rho^{(i)} = \rho^{(i)} + \left(\sum_{t=p_i+1}^T z_{t-1}z_{t-1}'\right)^{-1}\sum_{t=p_i+1}^T z_{t-1}\varepsilon_{it} + O_p(\delta_{NT}^{-2}) = \rho^{(i)} + O_p(T^{-1/2}) + O_p(\delta_{NT}^{-2})$$

and, similarly, for the least-squares estimator of $\omega_i^2$:

$$\hat\omega_i^2 = \omega_i^2 + \left(T^{-1}\sum_{t=p_i+1}^T e_{it}^2 - \omega_i^2\right) + T^{-1}\sum_{t=p_i+1}^T (\hat e_{it}^2 - e_{it}^2) = \omega_i^2 + O_p(T^{-1/2}) + O_p(\delta_{NT}^{-2}).$$

Proof of Theorem 2: The estimation error of the first-step estimators does not affect the second-step estimators if the first derivative of the first order condition is of smaller stochastic order (e.g. Newey and McFadden 1994). A first order Taylor expansion of the first order condition around the true parameter values yields

$$\tilde g_{\lambda_i}(\lambda_i, \hat F, \hat\rho^{(i)}) \simeq \tilde g_{\lambda_i}(\lambda_i, F, \rho^{(i)}) + \sum_{t=p_i+1}^T D_{1F_t}(\hat F_t - F_t) + \sum_{k=1}^{p_i} D_{1\rho_{k,i}}(\hat\rho_{k,i} - \rho_{k,i})$$
$$\tilde g_{F_t}(\hat\Lambda, F_t, \hat\Omega) \simeq \tilde g_{F_t}(\Lambda, F_t, \Omega) + \sum_{i=1}^N D_{2\lambda_i}(\hat\lambda_i - \lambda_i) + \sum_{i=1}^N D_{2\omega_i^2}(\hat\omega_i^2 - \omega_i^2),$$


where

$$D_{1F_t} = \partial\tilde g_{\lambda_i}(\lambda_i, F, \rho^{(i)})/\partial F_t' = \rho_i(L^{-1})\varepsilon_{it} - \lambda_i[\rho_i(L^{-1})\rho_i(L)F_t]'$$
$$D_{2\lambda_i} = \partial\tilde g_{F_t}(\Lambda, F_t, \Omega)/\partial\lambda_i' = -\frac{1}{\omega_i^2}[F_t\lambda_i' - e_{it}I_r]$$
$$D_{1\rho_{k,i}} = \partial\tilde g_{\lambda_i}(\lambda_i, F, \rho^{(i)})/\partial\rho_{k,i} = -\left(\sum_{t=p_i+1}^T \varepsilon_{it}F_{t-k} + e_{i,t-k}[\rho_i(L)F_t]\right)$$
$$D_{2\omega_i^2} = \partial\tilde g_{F_t}(\Lambda, F_t, \Omega)/\partial\omega_i^2 = -\frac{1}{\omega_i^4}e_{it}\lambda_i.$$

Using the results of Lemma A.1 we obtain

$$T^{-1}\sum_{t=p_i+1}^T D_{1F_t}(\hat F_t - F_t) = O_p(\delta_{NT}^{-2})$$
$$N^{-1}\sum_{i=1}^N D_{2\lambda_i}(\hat\lambda_i - \lambda_i) = O_p(\delta_{NT}^{-2}).$$

From Assumption 1 (v) it follows that $D_{1\rho_{k,i}}$ is $O_p(T^{1/2})$ and by using Lemma 1 we obtain

$$D_{1\rho_{k,i}}(\hat\rho_{k,i} - \rho_{k,i}) = O_p(1) + O_p(\sqrt{T}/\delta_{NT}^2)$$
$$D_{2\omega_i^2}(\hat\omega_i^2 - \omega_i^2) = -\frac{e_{it}(e_i'e_i/T - \omega_i^2)}{\omega_i^4}\lambda_i + O_p(\delta_{NT}^{-2}) = O_p(T^{-1/2}) + O_p(\delta_{NT}^{-2}).$$

Under Assumption 1 (iii) it follows that

$$N^{-1}\sum_{i=1}^N D_{2\omega_i^2}(\hat\omega_i^2 - \omega_i^2) = O_p(N^{-1/2}T^{-1/2}) + O_p(\delta_{NT}^{-2}).$$

With these results we obtain

$$\frac{1}{\sqrt{T}}\tilde g_{\lambda_i}(\lambda_i, \hat F, \hat\rho^{(i)}) = \frac{1}{\sqrt{T}}\tilde g_{\lambda_i}(\lambda_i, F, \rho^{(i)}) + O_p(\sqrt{T}/\delta_{NT}^2)$$
$$\frac{1}{\sqrt{N}}\tilde g_{F_t}(\hat\Lambda, F_t, \hat\Omega) = \frac{1}{\sqrt{N}}\tilde g_{F_t}(\Lambda, F_t, \Omega) + O_p(\sqrt{N}/\delta_{NT}^2) + O_p(T^{-1/2}).$$

If $(N, T \to \infty)$ and $\sqrt{T}/N \to 0$, then $T^{-1/2}\tilde g_{\lambda_i}(\lambda_i, \hat F, \hat\rho^{(i)})$ converges to $T^{-1/2}\tilde g_{\lambda_i}(\lambda_i, F, \rho^{(i)})$. If $\sqrt{N}/T \to 0$, then $N^{-1/2}\tilde g_{F_t}(\hat\Lambda, F_t, \hat\omega_i^2)$ converges to $N^{-1/2}\tilde g_{F_t}(\Lambda, F_t, \omega_i^2)$.


The first order conditions for $\rho_{k,i}$ and $\omega_i^2$ result as

$$\tilde g_{\rho_{k,i}}(\lambda_i, F, \rho_{k,i}) = -\sum_{t=p_i+1}^T [\rho_i(L)x_{it} - \lambda_i'\rho_i(L)F_t](x_{i,t-k} - \lambda_i'F_{t-k})$$
$$\tilde g_{\omega_i^2}(\lambda_i, F, \omega_i^2) = -\sum_{t=p_i+1}^T \left[(x_{it} - \lambda_i'F_t)^2 - \omega_i^2\right].$$

The derivatives with respect to $\lambda_i$ are given by

$$\frac{\partial\tilde g_{\rho_{k,i}}(\cdot)}{\partial\lambda_i'} = -\sum_{t=p_i+1}^T \left(e_{i,t-k}[\rho_i(L)F_t] + \varepsilon_{it}F_{t-k}\right) = O_p(T^{1/2})$$
$$\frac{\partial\tilde g_{\omega_i^2}(\cdot)}{\partial\lambda_i'} = 2\sum_{t=p_i+1}^T e_{it}F_t = O_p(T^{1/2}).$$

If $e_{it}$ is weakly cross-correlated as assumed in Assumption 1 and $\sqrt{T}/N \to 0$, then

$$\frac{\partial\tilde g_{\rho_{k,i}}(\cdot)}{\partial\lambda_i'}\underbrace{(\hat\lambda_i - \lambda_i)}_{O_p(T^{-1/2})} = O_p(1).$$

Similarly,

$$\frac{\partial\tilde g_{\omega_i^2}(\cdot)}{\partial\lambda_i'}(\hat\lambda_i - \lambda_i) = O_p(1).$$

Since $\tilde g_{\rho_{k,i}}(\cdot)$ and $\tilde g_{\omega_i^2}(\cdot)$ are $O_p(T^{1/2})$, it follows that the estimation error of the estimate of $\lambda_i$ does not affect the asymptotic properties if $\sqrt{T}/N \to 0$.

The derivatives with respect to $F_t$ are obtained as

$$\frac{\partial\tilde g_{\rho_{k,i}}(\cdot)}{\partial F_t} = \rho_i(L^{-1})e_{i,t-k}\lambda_i + \varepsilon_{i,t+k}\lambda_i$$
$$\frac{\partial\tilde g_{\omega_i^2}(\cdot)}{\partial F_t} = 2e_{it}\lambda_i.$$

It follows from Lemma A.1 that

$$T^{-1}\sum_{t=p_i+1}^T \frac{\partial\tilde g_{\rho_{k,i}}(\cdot)}{\partial F_t'}(\hat F_t - F_t) = O_p(\delta_{NT}^{-2})$$
$$T^{-1}\sum_{t=1}^T \frac{\partial\tilde g_{\omega_i^2}(\cdot)}{\partial F_t'}(\hat F_t - F_t) = O_p(\delta_{NT}^{-2}).$$

Therefore, the first derivatives with respect to $F_t$ vanish if $\sqrt{T}/\delta_{NT}^2 \to 0$ or $\sqrt{T}/N \to 0$, and the asymptotic properties of the estimators $\tilde\lambda_{i,\hat\rho} - \tilde\lambda_i$ and $\tilde F_{t,\hat\omega} - \tilde F_t$ are the same as if they were computed by using the true errors $e_{it}$.


References

Anderson, T. W. (1984), Introduction to Multivariate Statistical Analysis, 2nd ed., John Wiley: New York.

Bai, J. (2003), Inferential Theory for Factor Models of Large Dimensions, Econometrica, 71, 135–172.

Bai, J. (2004), Estimating Cross-section Common Stochastic Trends in Nonstationary Panel Data, Journal of Econometrics, 122, 137–183.

Bai, J. (2005), Panel Data Models with Interactive Fixed Effects, New York University, mimeo.

Bai, J., and S. Ng (2002), Determining the Number of Factors in Approximate Factor Models, Econometrica, 70, 191–221.

Bai, J., and S. Ng (2004), A PANIC Attack on Unit Roots and Cointegration, Econometrica, 72, 1127–1177.

Bernanke, B. S., J. Boivin, and P. Eliasz (2004), Measuring the Effects of Monetary Policy: A Factor-Augmented Vector Autoregressive (FAVAR) Approach, NBER Working Paper 10220.

Boivin, J., and S. Ng (2006), Are More Data Always Better for Factor Analysis?, Journal of Econometrics, 132, 169–194.

Breitung, J., and S. Eickmeier (2008), Testing for Structural Breaks in Dynamic Factor Models, working paper, University of Bonn.

Chamberlain, G., and M. Rothschild (1983), Arbitrage, Factor Structure and Mean-Variance Analysis in Large Asset Markets, Econometrica, 51, 1305–1324.

Choi, I. (2007), Efficient Estimation of Factor Models, Working Paper, http://ihome.ust.hk/~inchoi.

Doz, C., D. Giannone, and L. Reichlin (2006a), A Quasi Maximum Likelihood Approach for Large Approximate Dynamic Factor Models, Working Paper Series 674, European Central Bank.

Doz, C., D. Giannone, and L. Reichlin (2006b), A Two-step Estimator for Large Approximate Dynamic Factor Models Based on Kalman Filtering, Working Paper, ECARES, Université Libre de Bruxelles.

Eickmeier, S. (2007), Business Cycle Transmission from the US to Germany: A Structural Factor Approach, European Economic Review, 51, 521–551.

Eickmeier, S., and C. Ziegler (2007), How Successful are Dynamic Factor Models at Forecasting Output and Inflation? A Meta-Analytic Approach, forthcoming in: Journal of Forecasting.

Forni, M., M. Hallin, F. Lippi, and L. Reichlin (2000), The Generalized Dynamic Factor Model: Identification and Estimation, The Review of Economics and Statistics, 82, 540–554.

Forni, M., M. Hallin, M. Lippi, and L. Reichlin (2005), The Generalized Dynamic Factor Model: One-sided Estimation and Forecasting, Journal of the American Statistical Association, 100, 830–840.

Giannone, D., L. Reichlin, and L. Sala (2002), Tracking Greenspan: Systematic and Unsystematic Monetary Policy Revisited, Working Paper, ECARES, Université Libre de Bruxelles.

Jungbacker, B., and S. J. Koopman (2008), Likelihood-based Analysis for Dynamic Factor Models, Free University of Amsterdam, mimeo.

Newey, W., and D. McFadden (1994), Large Sample Estimation and Hypothesis Testing, in: Handbook of Econometrics, Vol. 4, eds. R. Engle and D. McFadden, Amsterdam: Elsevier Science.

Phillips, P. C. B. (1986), Understanding Spurious Regressions in Econometrics, Journal of Econometrics, 33, 311–340.

Stock, J. H., and M. W. Watson (2002a), Macroeconomic Forecasting Using Diffusion Indexes, Journal of Business & Economic Statistics, 20, 147–162.

Stock, J. H., and M. W. Watson (2002b), Forecasting Using Principal Components From a Large Number of Predictors, Journal of the American Statistical Association, 97, 1167–1179.

Stock, J. H., and M. W. Watson (2005), Implications of Dynamic Factor Models for VAR Analysis, NBER Working Paper 11467.

Watson, M. W. (2003), Macroeconomic Forecasting Using Many Predictors, in: Dewatripont, M., L. Hansen, and S. Turnovsky (eds.), Advances in Econometrics, Theory and Applications, Eighth World Congress of the Econometric Society, Vol. III, 87–115.


Figure 1: Histogram of the sample variances $\sigma_i^2$ (mean = 0.55, std = 0.29); frequency plotted against $\sigma_i^2$.


Figure 2: Histogram of the sample autocorrelations $\rho_i$ (mean = 0.01, std = 0.41); frequency plotted against $\rho_i$.


Table 1: Relative efficiency: one factor, autocorrelation ($\gamma = 0.7$, $\rho_i \stackrel{iid}{\sim} U[0.5, 0.9]$, $\sigma_i^2 = 2$)

                 loadings ($\lambda_i$)              factors ($F_t$)
           PC   two-step iterated   QML     PC   two-step iterated   QML
T = 50
 N = 50   0.446   0.663   0.850   0.431   0.409   0.403   0.735   0.300
 N = 100  0.457   0.735   0.914   0.457   0.317   0.314   0.763   0.242
 N = 200  0.467   0.778   0.943   0.472   0.222   0.219   0.780   0.166
 N = 300  0.465   0.787   0.951   0.475   0.165   0.163   0.765   0.133
T = 100
 N = 50   0.376   0.787   0.845   0.341   0.692   0.674   0.871   0.469
 N = 100  0.386   0.865   0.916   0.377   0.621   0.606   0.878   0.464
 N = 200  0.397   0.908   0.950   0.392   0.530   0.519   0.892   0.411
 N = 300  0.401   0.922   0.962   0.396   0.452   0.444   0.887   0.361
T = 200
 N = 50   0.328   0.823   0.829   0.294   0.868   0.851   0.931   0.651
 N = 100  0.346   0.907   0.914   0.327   0.840   0.825   0.941   0.668
 N = 200  0.356   0.946   0.955   0.344   0.789   0.774   0.944   0.661
 N = 300  0.357   0.959   0.967   0.350   0.749   0.736   0.946   0.634

Notes: Entries are the performance measure defined in (23). PC is the ordinary principal component estimator; two-step and iterated denote the two-step PC-GLS and iterated PC-GLS estimators, respectively, introduced in section 3; QML is the quasi maximum likelihood estimator of Doz et al. (2006b).


Table 2: Relative efficiency: one factor, heteroskedasticity ($\gamma = 0$, $\rho_i = 0$ for all $i$, $\sigma_i \stackrel{iid}{\sim} |N(\sqrt{2}, 0.25)|$)

                 loadings ($\lambda_i$)              factors ($F_t$)
           PC   two-step iterated   QML     PC   two-step iterated   QML
T = 50
 N = 50   0.818   0.804   0.932   0.953   0.344   0.692   0.821   0.860
 N = 100  0.914   0.890   0.956   0.978   0.305   0.722   0.833   0.843
 N = 200  0.966   0.940   0.969   0.991   0.259   0.718   0.839   0.864
 N = 300  0.983   0.953   0.970   0.994   0.239   0.705   0.849   0.769
T = 100
 N = 50   0.748   0.740   0.936   0.951   0.376   0.794   0.872   0.927
 N = 100  0.887   0.873   0.963   0.977   0.337   0.810   0.877   0.929
 N = 200  0.951   0.933   0.974   0.989   0.289   0.792   0.874   0.921
 N = 300  0.971   0.952   0.979   0.995   0.257   0.780   0.876   0.922
T = 200
 N = 50   0.667   0.663   0.938   0.944   0.401   0.852   0.897   0.958
 N = 100  0.855   0.846   0.967   0.975   0.344   0.843   0.895   0.959
 N = 200  0.936   0.926   0.982   0.990   0.302   0.831   0.895   0.951
 N = 300  0.962   0.950   0.985   0.989   0.268   0.811   0.891   0.830

Notes: See Table 1.


Table 3: Relative efficiency: five factors, autocorrelation ($\gamma = 0.7$, $\rho_i \stackrel{iid}{\sim} U[0.5, 0.9]$, $\sigma_i^2 = 2$)

                 loadings ($\lambda_i$)              factors ($F_t$)
           PC   two-step iterated   QML     PC   two-step iterated   QML
T = 50
 N = 50   0.469   0.521   0.584   0.460   0.607   0.607   0.612   0.571
 N = 100  0.464   0.520   0.618   0.459   0.407   0.405   0.473   0.370
 N = 200  0.463   0.520   0.645   0.458   0.239   0.237   0.326   0.219
 N = 300  0.459   0.518   0.655   0.460   0.167   0.165   0.250   0.155
T = 100
 N = 50   0.320   0.414   0.508   0.297   0.610   0.607   0.653   0.530
 N = 100  0.319   0.442   0.662   0.301   0.428   0.423   0.619   0.357
 N = 200  0.318   0.462   0.805   0.304   0.271   0.267   0.618   0.223
 N = 300  0.318   0.475   0.861   0.307   0.202   0.199   0.615   0.166
T = 200
 N = 50   0.235   0.365   0.452   0.194   0.663   0.657   0.736   0.530
 N = 100  0.246   0.468   0.717   0.212   0.532   0.525   0.807   0.395
 N = 200  0.265   0.613   0.864   0.233   0.424   0.418   0.849   0.301
 N = 300  0.275   0.694   0.908   0.248   0.364   0.358   0.858   0.260

Notes: See Table 1.


Table 4: Relative efficiency: five factors, heteroskedasticity ($\gamma = 0$, $\rho_i = 0$ for all $i$, $\sigma_i \stackrel{iid}{\sim} |N(\sqrt{2}, 0.25)|$)

                 loadings ($\lambda_i$)              factors ($F_t$)
           PC   two-step iterated   QML     PC   two-step iterated   QML
T = 50
 N = 50   0.636   0.631   0.697   0.770   0.389   0.435   0.497   0.590
 N = 100  0.696   0.689   0.811   0.877   0.265   0.342   0.492   0.595
 N = 200  0.804   0.790   0.913   0.952   0.206   0.335   0.597   0.665
 N = 300  0.869   0.851   0.940   0.973   0.196   0.368   0.656   0.689
T = 100
 N = 50   0.465   0.463   0.559   0.738   0.393   0.449   0.522   0.729
 N = 100  0.588   0.583   0.818   0.892   0.300   0.438   0.680   0.793
 N = 200  0.783   0.773   0.937   0.954   0.283   0.579   0.821   0.822
 N = 300  0.864   0.851   0.956   0.970   0.268   0.615   0.828   0.825
T = 200
 N = 50   0.325   0.324    n/a    0.749   0.404   0.472    n/a    0.855
 N = 100  0.508   0.505   0.816   0.895   0.346   0.571   0.786   0.892
 N = 200  0.765   0.759   0.943   0.952   0.340   0.743   0.878   0.900
 N = 300  0.851   0.843   0.961   0.969   0.322   0.759   0.876   0.898

Notes: See Table 1.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!