
Non-parametric estimation of a time varying GARCH model

by Neelabh Rohan and T. V. Ramanathan

Technical Report 3/2011
Department of Statistics and Centre for Advanced Studies
University of Pune, 411 007, INDIA
May, 2012 (Revised)


Neelabh Rohan¹ and T. V. Ramanathan²

Abstract

In this paper, a non-stationary time-varying GARCH (tvGARCH) model is introduced by allowing the parameters of a stationary GARCH model to vary as functions of time. It is shown that the tvGARCH process is locally stationary in the sense that it can be locally approximated by stationary GARCH processes at fixed time points. We develop a two-step local polynomial procedure for the estimation of the parameter functions of the proposed model. Several asymptotic properties of the estimators are established, including asymptotic optimality. The tvGARCH model is found to perform better than many of the standard GARCH models on various real data sets.

Mathematics Subject Classification: 62M10, 62G05
Keywords: Local polynomial estimation, time-varying GARCH, volatility modelling.

¹ Corresponding author. Email: neelabh.stats@yahoo.co.in
² Email: ram@stats.unipune.ac.in


1 Introduction

The first decade of the 21st century left the global economies grappling with the consequences of the financial crisis, followed by an uninvited rash of currency wars. Many of the emerging economies started receiving large capital inflows that have the potential to destabilize the economy. Perhaps the most deleterious consequence of capital inflows has been the strengthening of domestic currencies, which can lead to a loss in export competitiveness. This, in turn, led to currency wars: the phenomenon of several emerging and developed countries intervening in currency markets simultaneously in order to ensure that their currency will not be the only one that appreciates. Such a phenomenon may induce instability, and hence non-stationarity, in the bilateral exchange rate volatility process, implying the failure of standard stationary volatility models. In this paper, we address this problem by considering a GARCH model with time varying parameters.

Non-stationary volatility models have received considerable attention recently; see, for example, Mercurio and Spokoiny (2004), Mikosch and Starica (2004), Starica and Granger (2005), Dahlhaus and Subba Rao (2006), Amado and Terasvirta (2008), Fryzlewicz, Sapatinas and Subba Rao (2008) and Chen and Hong (2009), among others. Dahlhaus and Subba Rao (2006) proposed a time-varying ARCH (tvARCH) model for the volatility process by allowing the parameters of a stationary ARCH model to change slowly through time. Fryzlewicz et al. (2008) developed a least-squares estimation procedure for such a tvARCH model. We generalize the tvARCH model introduced by Dahlhaus and Subba Rao (2006) to a time varying GARCH (tvGARCH) model by allowing the parameters of a stationary GARCH model to vary as functions of time.

Dahlhaus and Subba Rao (2006) showed that the tvARCH model can be approximated by stationary ARCH processes locally. We extend their results to the tvGARCH model and show that a non-stationary tvGARCH process can be locally approximated by stationary processes at specific time points. Therefore, the tvGARCH model is asymptotically locally stationary at every point of observation, but it is globally non-stationary because of the time-varying parameters. Such an approximation further helps us in deriving the asymptotic distribution of the estimators.

An alternative approach to incorporating non-stationarity in the volatility process is the varying coefficient GARCH model (see Čížek and Spokoiny (2009) and references therein). The estimation of a varying coefficient GARCH model requires the search for local time intervals of homogeneity over the entire period, such that the parameters of the process remain nearly constant over each interval. The estimation is carried out using the quasi-maximum likelihood (QML) approach. However, the QML procedure is not very reliable when the sample size is small, since the quasi-likelihood tends to be shallow about the minimum for small sample sizes; see Shephard (1996), Bose and Mukherjee (2003) and Fryzlewicz et al. (2008). In addition, the QML estimator does not admit a closed form solution. The model and estimation procedure of Amado and Terasvirta (2008) also suffer from similar drawbacks.

We develop a two-step local polynomial estimation procedure for the proposed tvGARCH model. One can refer to Wand and Jones (1995), Fan and Gijbels (1996) and Fan and Zhang (1999), among others, for the application of local polynomial techniques in various regression models. The proposed two-step procedure requires the estimation of a tvARCH model in the first step. In the second step, we obtain the estimator of the tvGARCH model using this initial estimator. Expressions for the asymptotic bias and variance of the estimators in both steps are derived, and asymptotic normality is established. It is found that the asymptotic MSE of the estimators of the parameter functions of the tvGARCH model remains invariant over a wide range of initial step bandwidths, making the procedure computationally friendly. Moreover, our estimator achieves the optimal rate of convergence under a higher order differentiability assumption on the parameter functions.

Even though this paper deals with the tvGARCH(1,1) process only, the results presented here can be extended to a general tvGARCH(p,q) with appropriate modifications. In the empirical analysis of financial data, the lower order GARCH(1,1) model has often been found adequate to account for the conditional heteroscedasticity; it usually describes the dynamics of the conditional variance of many economic time series quite well, see for example Palm (1996). Therefore, in this paper we concentrate on the tvGARCH(1,1) model.

We illustrate the performance of the tvGARCH model using various bilateral exchange rate and stock index data sets from the past decade. The tvGARCH model is shown to outperform several stationary GARCH as well as tvARCH models in terms of both in-sample and out-of-sample prediction. The model is also found to perform better than a long memory model in predicting the volatility.

The rest of the paper is organized as follows. The tvGARCH model and its properties are discussed in Section 2. Section 3 develops a two step local polynomial estimation procedure for the model. We establish the asymptotic properties of the estimators in Section 4. Several applications of the tvGARCH model are given in Section 5. All the proofs are deferred to the Appendix.

2 A time varying GARCH model

Let {ε_t} be a process such that E(ε_t | F_{t−1}) = 0 and E(ε_t^2 | F_{t−1}) = σ_t^2, where F_{t−1} = σ(ε_{t−1}, ε_{t−2}, ...). Suppose {v_t} is a sequence, independent of {ε_t}, of real valued independent and identically distributed random variables having mean 0 and variance 1. Then a GARCH model with time varying parameters is defined as

    ε_t = σ_t v_t,
    σ_t^2 = ω(t) + α(t) ε_{t−1}^2 + β(t) σ_{t−1}^2,    (1)

where ω(·), α(·) and β(·) are certain non-negative functions of time.

In order to obtain a meaningful asymptotic theory, we rescale the domain of the parameter functions of (1) to the unit interval. That is, we study the following process:

    ε_t = σ_t v_t,
    σ_t^2 = ω(t/n) + α(t/n) ε_{t−1}^2 + β(t/n) σ_{t−1}^2,    t = 1, 2, ..., n.    (2)

The sequence of stochastic processes {ε_t, t = 1, 2, ..., n} is said to follow a tvGARCH process if it satisfies (2). Here ω(u), α(u), β(u) ≥ 0 for all u ∈ (0, 1] ensure the non-negativity of σ_t^2. We define ω(u), α(u), β(u) = 0 for u < 0. Such a rescaling is a common technique in non-parametric regression and it does not affect the estimation procedure; see Dahlhaus and Subba Rao (2006).
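As a concrete illustration, the rescaled model (2) can be simulated directly from its defining recursion. A minimal sketch in Python; the parameter functions below are illustrative choices satisfying Assumption 1(i), not functions from the paper:

```python
import numpy as np

# Illustrative (hypothetical) smooth parameter functions on u = t/n,
# chosen so that alpha(u) + beta(u) <= 1 - delta for all u.
def omega(u):
    return 0.1 + 0.05 * np.sin(2 * np.pi * u)

def alpha(u):
    return 0.10 + 0.05 * u

def beta(u):
    return 0.80 - 0.10 * u

def simulate_tvgarch(n, seed=0):
    """Simulate one path of the rescaled tvGARCH(1,1) process (2)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)          # i.i.d. innovations, mean 0, variance 1
    eps = np.zeros(n)
    sig2 = np.zeros(n)
    sig2_prev, eps_prev = omega(0.0), 0.0   # arbitrary finite starting point
    for t in range(n):
        u = (t + 1) / n                 # rescaled time u_t = t/n
        sig2[t] = omega(u) + alpha(u) * eps_prev ** 2 + beta(u) * sig2_prev
        eps[t] = np.sqrt(sig2[t]) * v[t]
        sig2_prev, eps_prev = sig2[t], eps[t]
    return eps, sig2

eps, sig2 = simulate_tvgarch(2000)
```

The positivity of the simulated conditional variance follows from the non-negativity of ω, α and β, mirroring the remark after (2).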

Now we show that the tvGARCH process can be locally approximated by stationary GARCH processes at specific time points. This allows us to refer to the tvGARCH as a locally stationary process. Towards this, we first state the following technical assumptions.

Assumption 1. (i) There exists δ > 0 such that

    0 < α(u) + β(u) ≤ 1 − δ  for all 0 < u ≤ 1,  and  sup_u ω(u) < ∞.

(ii) There exist finite constants M_1, M_2 and M_3 such that for all u_1, u_2 ∈ (0, 1],

    |ω(u_1) − ω(u_2)| ≤ M_1 |u_1 − u_2|,
    |α(u_1) − α(u_2)| ≤ M_2 |u_1 − u_2|,
    |β(u_1) − β(u_2)| ≤ M_3 |u_1 − u_2|.


Assumption 1(i) here is similar in spirit to the stationarity condition for the GARCH(1,1) model discussed by Nelson (1991). This condition is required for the existence of a well defined unique solution to the tvGARCH process. It is also sufficient for the tvGARCH to be a short memory process. The Lipschitz continuity condition for the parameters in Assumption 1(ii) is required for the local stationarity of the tvGARCH process. A similar condition is assumed by Dahlhaus and Subba Rao (2006) for the parameters of the tvARCH process. Notice that we do not make any assumption on the density function of ε_t. Therefore, the methodology introduced in this paper will be useful for analyzing data with heavy tailed distributions, a common phenomenon in financial time series.

Before proceeding further, we show in Proposition 2.1 that the tvGARCH process possesses a well defined unique solution. In Proposition 2.2, we derive the covariance structure of the tvGARCH process and show that the tvGARCH is a short memory process.

Proposition 2.1. Let Assumption 1(i) hold. Then the variance process (2) has a well defined unique solution given by

    σ̄_t^2 = ω(t/n) + Σ_{i=1}^∞ Π_{j=1}^i [ α((t−j+1)/n) v_{t−j}^2 + β((t−j+1)/n) ] ω((t−i)/n),

such that |σ_t^2 − σ̄_t^2| → 0 a.s., if σ_0^2 (the starting point) is finite with probability one. Also,

    inf_u ω(u) / (1 − inf_u β(u)) ≤ σ̄_t^2 < ∞  for all t, a.s.
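The forgetting of the starting point asserted in Proposition 2.1 can be checked numerically: two variance recursions driven by the same shocks but started at different finite σ_0^2 merge. A sketch with illustrative parameter functions (not from the paper) satisfying Assumption 1(i):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3000
v2 = rng.standard_normal(n) ** 2        # squared i.i.d. innovations v_t^2

# illustrative parameter functions with alpha(u) + beta(u) <= 1 - delta
omega = lambda u: 0.1
alpha = lambda u: 0.1 + 0.05 * u
beta = lambda u: 0.8 - 0.1 * u

def variance_path(sig2_0):
    """Run the variance recursion of (2) from starting point sig2_0."""
    sig2 = sig2_0
    eps2_prev = 0.0
    for t in range(n):
        u = (t + 1) / n
        sig2 = omega(u) + alpha(u) * eps2_prev + beta(u) * sig2
        eps2_prev = sig2 * v2[t]        # eps_t^2 = sigma_t^2 v_t^2
    return sig2

# same shocks, very different starting points: the paths coincide at time n
gap = abs(variance_path(0.5) - variance_path(50.0))
```

The contraction driving `gap` towards zero is exactly the condition α(u) + β(u) ≤ 1 − δ of Assumption 1(i).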

Proposition 2.2. Suppose that Assumption 1(i) is satisfied for the tvGARCH process. Further assume that E|v_t|^4 < ∞. Then for a fixed k ≥ 0 and 0 < δ < 1,

    Cov(ε_t^2, ε_{t+k}^2) = O((1 − δ)^k).

Now we define a stationary GARCH(1,1) process which locally approximates the original process (2) in the neighborhood of a fixed point (see Proposition 2.3). Let {ε̃_t(u_0)}, u_0 ∈ (0, 1], be a process with E(ε̃_t(u_0) | F̃_{t−1}) = 0 and E(ε̃_t^2(u_0) | F̃_{t−1}) = σ̃_t^2(u_0), where F̃_{t−1} = σ(ε̃_{t−1}, ε̃_{t−2}, ...). Then {ε̃_t(u_0)} is said to follow a stationary GARCH process associated with (2) at time point u_0 if it satisfies

    ε̃_t(u_0) = σ̃_t(u_0) v_t,
    σ̃_t^2(u_0) = ω(u_0) + α(u_0) ε̃_{t−1}^2(u_0) + β(u_0) σ̃_{t−1}^2(u_0).    (3)


Under Assumption 1(i), (3) is a stationary ergodic process; the condition is also sufficient for ε̃_t(u_0) to be weakly stationary. A unique stationary ergodic solution to (3) is

    σ̄_t^2(u_0) = ω(u_0) + Σ_{i=1}^∞ Π_{j=1}^i [ α(u_0) v_{t−j}^2 + β(u_0) ] ω(u_0).    (4)

Here |σ̄_t^2(u_0) − σ̃_t^2(u_0)| → 0 a.s. (see Nelson (1991)). In the following proposition, we show that if the time point t/n is close to u_0, then (3) can be locally considered an approximation to (2).

Proposition 2.3. Suppose that Assumptions 1(i) and (ii) are satisfied. Then the process {ε_t^2} can be approximated locally by a stationary ergodic process {ε̃_t^2(u_0)}. That is, there exists a well defined stationary ergodic process V_t independent of u_0 and a constant Q < ∞ such that

    |ε_t^2 − ε̃_t^2(u_0)| ≤ Q ( |t/n − u_0| + 1/n ) V_t  a.s.,

or equivalently

    ε_t^2 = ε̃_t^2(u_0) + O_P( |t/n − u_0| + 1/n ).

We can also write (2) by recursive substitution as

    σ_t^2 = α_0(t/n) + Σ_{k=1}^{t−1} α_k(t/n) ε_{t−k}^2 + σ_0^2 Π_{i=1}^{t} β((t−i+1)/n),    (5)

where

    α_0(t/n) = ω(t/n) + Σ_{k=1}^{t−1} ω((t−k)/n) Π_{i=1}^{k} β((t−i+1)/n),
    α_k(t/n) = α((t−k+1)/n) Π_{i=1}^{k−1} β((t−i+1)/n),    k = 1, 2, ..., t−1.

Here we take Π_{i=1}^{0} β((t−i+1)/n) = 1. Notice that the functions α_k(·) here are geometrically decaying as k → ∞ under Assumption 1(i). Also, if σ_0^2 is finite with probability one, then σ_0^2 Π_{i=1}^{t} β((t−i+1)/n) →P 0 as t → ∞, n → ∞, where →P denotes convergence in probability.
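The recursive-substitution representation (5) can be checked numerically against the direct recursion (2). A small sketch, using the convention ε_t^2 = 0 for t ≤ 0 and illustrative parameter functions not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
v = rng.standard_normal(n + 1)
omega = lambda u: 0.1 + 0.05 * u
alpha = lambda u: 0.1
beta = lambda u: 0.8 - 0.1 * u

# direct recursion (2), t = 1, ..., n, from starting point sigma_0^2
sig2 = np.zeros(n + 1)
eps2 = np.zeros(n + 1)
sig2[0] = 0.7
eps2[0] = 0.0                           # convention: eps_t^2 = 0 for t <= 0
for t in range(1, n + 1):
    u = t / n
    sig2[t] = omega(u) + alpha(u) * eps2[t - 1] + beta(u) * sig2[t - 1]
    eps2[t] = sig2[t] * v[t] ** 2

# representation (5) evaluated at t = n
t = n
bprod = lambda k: np.prod([beta((t - i + 1) / n) for i in range(1, k + 1)])
a0 = omega(t / n) + sum(omega((t - k) / n) * bprod(k) for k in range(1, t))
ak = lambda k: alpha((t - k + 1) / n) * bprod(k - 1)
sig2_rep = a0 + sum(ak(k) * eps2[t - k] for k in range(1, t)) + sig2[0] * bprod(t)
```

Up to floating-point error, `sig2_rep` agrees with `sig2[n]`, with the σ_0^2 term carrying the geometrically vanishing product of β's noted above.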

3 Local polynomial estimation

The local polynomial estimation of the tvGARCH model (2) can be carried out in two steps. In Step 1, we obtain a preliminary estimate of σ_t^2 using a time varying ARCH(p) model, exploiting the representation (5) of the tvGARCH. In the second step, we reach the estimators of the parameter functions of the tvGARCH. It is shown that with an appropriately chosen bandwidth, the rate of convergence of the MSE of the final estimates becomes independent of the initial step estimates.

Step 1. First, we obtain a preliminary estimate of σ_t^2 using the following tvARCH(p) model:

    σ_t^2 = α_0(t/n) + α_1(t/n) ε_{t−1}^2 + ... + α_p(t/n) ε_{t−p}^2,

which can also be written as

    ε_t^2 = α_0(t/n) + α_1(t/n) ε_{t−1}^2 + ... + α_p(t/n) ε_{t−p}^2 + σ_t^2 (v_t^2 − 1).

Here, p is such that p = p_n → ∞ as n → ∞; among several choices of such a p, one specific choice is log n. The asymptotic results derived in Section 4 for the tvGARCH model hold for p_n → ∞; however, we drop the suffix n for notational simplicity. We use the local polynomial technique to estimate the functions α_i(u), i = 0, 1, ..., p, treating σ_t^2 (v_t^2 − 1) as the error. From now on, we denote t/n by u_t. We assume that the function α_i(·) possesses bounded continuous derivatives up to order d + 1 (d ≥ 1) (see Section 4). Using a Taylor series expansion, the function α_i(u) can be locally approximated in the neighborhood of a point u_0 by

    α_i(u_t) ≈ α_{i0} + α_{i1}(u_t − u_0) + ... + α_{id}(u_t − u_0)^d,    i = 0, 1, ..., p,
where α_{ij}, j = 0, 1, ..., d, are constants. Therefore, given a kernel function K(·), we obtain the estimator by minimizing

    L = Σ_{i=p+1}^{n} [ ε_i^2 − Σ_{k=0}^{d} ( α_{0k} + Σ_{j=1}^{p} α_{jk} ε_{i−j}^2 ) (u_i − u_0)^k ]^2 K_{h_1}(u_i − u_0),    (6)

where K_{h_1}(·) = (1/h_1) K(·/h_1) and h_1 denotes the bandwidth. Define

    U_t = [1, (u_t − u_0), ..., (u_t − u_0)^d]_{1×(d+1)},    t = 1, 2, ..., n,

    X_1 = [ U_{p+1}    ε_p^2 U_{p+1}        ...    ε_1^2 U_{p+1}
            U_{p+2}    ε_{p+1}^2 U_{p+2}    ...    ε_2^2 U_{p+2}
            ...
            U_n        ε_{n−1}^2 U_n        ...    ε_{n−p}^2 U_n ],

    W_1 = diag(K_{h_1}(u_{p+1} − u_0), ..., K_{h_1}(u_n − u_0))  and  Y_1 = [ε_{p+1}^2, ..., ε_n^2]^⊤.


The estimator of α_i(u_0), as the solution to the least-squares problem (6), can be expressed as

    α̂_i(u_0) = e_{i(d+1)+1, (p+1)(d+1)}^⊤ (X_1^⊤ W_1 X_1)^{−1} X_1^⊤ W_1 Y_1,    i = 0, 1, ..., p.    (7)

Here and throughout the paper, we use the notation e_{k,m} for a column vector of length m with 1 at the k-th position and 0 elsewhere. Therefore, an initial estimate of σ_t^2 is obtained by

    σ̂_t^2 = α̂_0(u_t) + Σ_{k=1}^{p} α̂_k(u_t) ε_{t−k}^2,

where α̂_0(u_t) and α̂_k(u_t) represent the estimators of α_0(u_t) and α_k(u_t) respectively, calculated using (7) at u_t. We set ε_t^2 = 0 for all t ≤ 0 in the practical implementation. This method can also be used for the estimation of the tvARCH(p) model of Dahlhaus and Subba Rao (2006).
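A minimal sketch of the Step 1 fit, assuming the squared observations are available in an array `eps2`: it solves the weighted least-squares problem (6) for the local linear case d = 1. The function name, the coefficient ordering (all level terms first, then all slope terms) and the Epanechnikov kernel are implementation choices for illustration, not prescriptions of the paper:

```python
import numpy as np

def tvarch_local_fit(eps2, u0, p=2, h1=0.1):
    """Local linear (d = 1) WLS fit of the tvARCH(p) model at rescaled time u0.

    Returns [alpha_0-hat(u0), ..., alpha_p-hat(u0)]; the slope coefficients of
    the Taylor expansion are estimated jointly but discarded here.
    """
    n = len(eps2)
    u = np.arange(1, n + 1) / n
    rows, y, sw = [], [], []
    for t in range(p, n):
        du = u[t] - u0
        # regressors [1, eps^2_{t-1}, ..., eps^2_{t-p}]
        regress = np.concatenate(([1.0], eps2[t - p:t][::-1]))
        rows.append(np.concatenate([regress, regress * du]))  # d = 1 expansion
        y.append(eps2[t])
        k = max(0.0, 0.75 * (1.0 - (du / h1) ** 2)) / h1      # Epanechnikov K_h1
        sw.append(np.sqrt(k))                                 # sqrt weights for WLS
    X, y, sw = np.array(rows), np.array(y), np.array(sw)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[: p + 1]
```

On data from a stationary ARCH model, the fit at any interior u0 should recover the constant coefficients up to sampling noise, which is the local stationarity idea of Section 2 in action.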

Step 2. In this step, we use the conditional variance estimated in Step 1 to obtain the estimates of the parameter functions of the tvGARCH process. The parameter functions ω(·), α(·) and β(·) are assumed to be continuously differentiable up to order d + 1. Using a Taylor series expansion, we can write

    ω(u_t) ≈ ω_{02} + ω_{12}(u_t − u_0) + ... + ω_{d2}(u_t − u_0)^d,
    α(u_t) ≈ a_{02} + a_{12}(u_t − u_0) + ... + a_{d2}(u_t − u_0)^d,
    β(u_t) ≈ b_{02} + b_{12}(u_t − u_0) + ... + b_{d2}(u_t − u_0)^d,

where ω_{i2}, a_{i2} and b_{i2}, i = 0, 1, ..., d, are constants. We can write (2) as

    ε_t^2 = ω(t/n) + α(t/n) ε_{t−1}^2 + β(t/n) σ̂_{t−1}^2 − β(t/n)(σ̂_{t−1}^2 − σ_{t−1}^2) + σ_t^2 (v_t^2 − 1).    (8)

Corollary 4.2 (in Section 4) shows that for the choice of Step 1 bandwidth h_1 = o(h_2), E(σ̂_{t−1}^2 − σ_{t−1}^2) is asymptotically negligible; here h_2 denotes the bandwidth in Step 2. The estimates are obtained by minimizing

    L = Σ_{i=2}^{n} [ ε_i^2 − Σ_{k=0}^{d} ( ω_{k2} + a_{k2} ε_{i−1}^2 + b_{k2} σ̂_{i−1}^2 ) (u_i − u_0)^k ]^2 K_{h_2}(u_i − u_0).

Define

    X_2 = [ U_2    ε_1^2 U_2        σ̂_1^2 U_2
            U_3    ε_2^2 U_3        σ̂_2^2 U_3
            ...
            U_n    ε_{n−1}^2 U_n    σ̂_{n−1}^2 U_n ],

    W_2 = diag(K_{h_2}(u_2 − u_0), ..., K_{h_2}(u_n − u_0))  and  Y_2 = [ε_2^2, ..., ε_n^2]^⊤.


Then, the exact expressions for the estimators are given by

    ω̂(u_0) = e_{1, 3(d+1)}^⊤ (X_2^⊤ W_2 X_2)^{−1} X_2^⊤ W_2 Y_2,
    α̂(u_0) = e_{d+2, 3(d+1)}^⊤ (X_2^⊤ W_2 X_2)^{−1} X_2^⊤ W_2 Y_2  and
    β̂(u_0) = e_{2d+3, 3(d+1)}^⊤ (X_2^⊤ W_2 X_2)^{−1} X_2^⊤ W_2 Y_2.

The final estimates of σ_t^2 in the tvGARCH model can be obtained using these estimators. These estimators achieve the optimal rate of convergence when an optimal bandwidth is used (see Section 4).
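A sketch of the Step 2 regression for the local linear case d = 1: ε_t^2 is regressed on [1, ε_{t−1}^2, σ̂_{t−1}^2] with kernel weights centered at u_0. For self-containedness, a crude exponentially smoothed proxy of ε_t^2 stands in for the Step 1 tvARCH pilot; the proxy, the function name and the kernel are illustrative choices, not the paper's procedure:

```python
import numpy as np

def tvgarch_step2(eps2, u0, h2=0.15, lam=0.9):
    """Local linear (d = 1) second-step fit of (omega, alpha, beta) at u0.

    A crude EWMA of eps^2 (smoothing constant lam) stands in here for the
    Step 1 tvARCH estimate of the conditional variance.
    """
    n = len(eps2)
    u = np.arange(1, n + 1) / n
    sig2_hat = np.zeros(n)              # crude pilot conditional variance
    sig2_hat[0] = eps2[0]
    for t in range(1, n):
        sig2_hat[t] = lam * sig2_hat[t - 1] + (1.0 - lam) * eps2[t]
    rows, y, sw = [], [], []
    for t in range(1, n):
        du = u[t] - u0
        regress = np.array([1.0, eps2[t - 1], sig2_hat[t - 1]])
        rows.append(np.concatenate([regress, regress * du]))  # d = 1 expansion
        y.append(eps2[t])
        k = max(0.0, 0.75 * (1.0 - (du / h2) ** 2)) / h2      # Epanechnikov K_h2
        sw.append(np.sqrt(k))
    X, y, sw = np.array(rows), np.array(y), np.array(sw)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[:3]                     # omega-hat, alpha-hat, beta-hat at u0
```

With the paper's Step 1 pilot substituted for the EWMA, this is the computation behind the closed-form expressions for ω̂(u_0), α̂(u_0) and β̂(u_0) above.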

3.1 Bandwidth selection

As will be discussed in the next section, the two step estimator is not very sensitive to the choice of the initial bandwidth h_1 as long as it is small enough that the bias in the first step is asymptotically negligible. Therefore, one can simply apply standard univariate bandwidth selection procedures to select the smoothing parameter for Step 2; the initial smoothing parameter can then be chosen according to the second step bandwidth. For the practical implementation, we select the optimal bandwidth h_2 using the cross validation method based on the best linear predictor of ε_t^2 given the past (see Hart (1994)), which is ω(t/n) + α(t/n) ε_{t−1}^2 + β(t/n) σ_{t−1}^2. That is, we choose the bandwidth h_2 for which

    CV(h_2) = (1/(n−1)) Σ_{t=2}^{n} [ ε_t^2 − ω̂^{−t}(u_t) − α̂^{−t}(u_t) ε_{t−1}^2 − β̂^{−t}(u_t) σ_{t−1}^2 ]^2    (9)

is minimum, where ω̂^{−t}(u_t), α̂^{−t}(u_t) and β̂^{−t}(u_t) denote the local polynomial estimators of ω(t/n), α(t/n) and β(t/n) obtained by leaving out the t-th observation. A pilot bandwidth is chosen initially to get the initial estimate of σ_{t−1}^2 using the full data. Using arguments similar to those in Hart (1994), it can be shown that asymptotically such a bandwidth minimizes the mean squared prediction error of ε_t^2. The pilot bandwidth should be small enough to be of order o(h_2) and, at the same time, should satisfy n h_1 → ∞. In case h_2 comes out such that the pilot bandwidth is not of order o(h_2), the above cross validation procedure can be repeated with an even smaller initial bandwidth.

However, it is not feasible to compute (9) in practice, as it requires repeated refitting of the model after deletion of a data point each time. The bandwidth selection procedure is computationally too cumbersome, especially when n is large. Therefore, we provide a simplified version of (9) to reduce the computational complexity and make the bandwidth selection easy and doable; this is described in Appendix B.
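The expensive, unsimplified criterion (9) can be sketched directly for small n, using a local-constant (d = 0) fit and a crude EWMA pilot variance as stand-ins for the paper's local polynomial fit and Step 1 pilot; the tiny ridge term is an implementation safeguard, not part of the paper's procedure:

```python
import numpy as np

def cv_score(eps2, sig2_hat, h2):
    """Leave-one-out criterion in the spirit of (9), local-constant fit."""
    n = len(eps2)
    u = np.arange(1, n + 1) / n
    X = np.column_stack([np.ones(n - 1), eps2[:-1], sig2_hat[:-1]])
    y = eps2[1:]
    err = 0.0
    for t in range(n - 1):
        # Epanechnikov weights centered at the left-out time point
        w = np.maximum(0.0, 0.75 * (1.0 - ((u[1:] - u[t + 1]) / h2) ** 2)) / h2
        w[t] = 0.0                      # leave out the t-th observation
        A = (X * w[:, None]).T @ X
        b = (X * w[:, None]).T @ y
        coef = np.linalg.solve(A + 1e-8 * np.eye(3), b)  # tiny ridge for stability
        err += (y[t] - X[t] @ coef) ** 2
    return err / (n - 1)

def select_h2(eps2, sig2_hat, candidates=(0.1, 0.2, 0.4)):
    """Pick the candidate bandwidth minimizing the leave-one-out criterion."""
    scores = [cv_score(eps2, sig2_hat, h) for h in candidates]
    return candidates[int(np.argmin(scores))]
```

The O(n) refits per candidate bandwidth make the cost of this direct version concrete, which is what motivates the simplified criterion of Appendix B.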


4 Asymptotic results

Towards proving the asymptotic results corresponding to the estimators in Steps 1 and 2, we first state the following standard technical assumptions and then introduce some notation.

Assumption 2. (i) The functions ω(·), α(·) and β(·) (and hence α_j(·)) have bounded and continuous derivatives up to order d + 1 (d ≥ 1) in a neighborhood of u_0, u_0 ∈ (0, 1].

(ii) K(u) is a symmetric density function of bounded variation with compact support.

(iii) The bandwidths h_1 and h_2 are such that h_1 → 0, h_2 → 0 and n h_1 → ∞, n h_2 → ∞ as n → ∞.

(iv) E|v_t|^4 < ∞.

Notation.

    μ_i = ∫ u^i K(u) du,    ν_i = ∫ u^i K^2(u) du,

    S = S(u_0) = E( [1, ε̃_{t−1}^2(u_0), ..., ε̃_{t−p}^2(u_0)]^⊤ [1, ε̃_{t−1}^2(u_0), ..., ε̃_{t−p}^2(u_0)] ),

    C_j = C_j(u_0) = E( ε̃_t^2(u_0) ε̃_{t−j}^2(u_0) ),

    Ω = Ω(u_0) = E( σ̃_t^4(u_0) [1, ε̃_{t−1}^2(u_0), ..., ε̃_{t−p}^2(u_0)]^⊤ [1, ε̃_{t−1}^2(u_0), ..., ε̃_{t−p}^2(u_0)] ),

    w_j = E(ε̃_t^j(u_0)),    α_{tvARCH}(u_0) = [α_0(u_0), α_1(u_0), ..., α_p(u_0)]^⊤,

    D_i = [μ_{d+1}, h_i μ_{d+2}, ..., h_i^d μ_{2d+1}]^⊤,    i = 1, 2,

    e_m = a column vector of length m with 1 everywhere,

    A_i = [ 1            h_i μ_1              ...    h_i^d μ_d
            h_i μ_1      h_i^2 μ_2            ...    h_i^{d+1} μ_{d+1}
            ...
            h_i^d μ_d    h_i^{d+1} μ_{d+1}    ...    h_i^{2d} μ_{2d} ],

    B_i = [ ν_0          h_i ν_1              ...    h_i^d ν_d
            h_i ν_1      h_i^2 ν_2            ...    h_i^{d+1} ν_{d+1}
            ...
            h_i^d ν_d    h_i^{d+1} ν_{d+1}    ...    h_i^{2d} ν_{2d} ],    i = 1, 2.

In the following theorem, we obtain the exact expressions for the biases of the estimators of the tvARCH(p) model of Step 1.

Theorem 4.1. Let Assumptions 1 and 2 be satisfied. Then the asymptotic bias of α̂_j(u_0), j = 0, 1, ..., p, is given by

    Bias(α̂_j(u_0)) = h_1^{d+1} [ (1/(d+1)!) α_j^{(d+1)}(u_0) ] e_{1,d+1}^⊤ A_1^{−1} D_1 + o_P(h_1^{d+1}).

Further, if E|v_t|^8 < ∞, then the asymptotic variance of the estimator is

    Var(α̂_0(u_0), ..., α̂_p(u_0)) = (1/(n h_1)) e_{1,d+1}^⊤ A_1^{−1} B_1 A_1^{−1} e_{1,d+1} Var(v_t^2) S^{−1} Ω S^{−1} (1 + o_P(1)).

Interestingly, the bias expression for α̂_j(u_0) depends only on the (d+1)-th derivative of α_j(u_0), owing to the structure of the model. The procedure introduced in Step 1 can be used for the estimation of a time varying ARCH(p) model. It is now clear that the MSE of the estimator α̂_j(u_0) is O_P(h_1^{2d+2} + (n h_1)^{−1}). Also, when the optimal bandwidth h_1 = O(n^{−1/(2d+3)}) is used, the local polynomial estimator achieves the optimal rate of convergence O_P(n^{−(2d+2)/(2d+3)}) for estimating α_j(u_0). Notice that for d = 3, the optimal convergence rate is O_P(n^{−8/9}). In the following corollary, we show the asymptotic normality of the estimator as a simple application of the martingale central limit theorem.
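The bandwidth-rate arithmetic above can be verified mechanically: with h_1 of order n^{−1/(2d+3)}, the bias term h_1^{2d+2} and the variance term (n h_1)^{−1} of the MSE carry the same exponent of n. A small sketch:

```python
from fractions import Fraction

def optimal_rate_exponent(d):
    """Exponent of n in the optimal MSE rate, for h1 of order n^{-1/(2d+3)}."""
    h_exp = Fraction(-1, 2 * d + 3)     # h1 ~ n^{h_exp}
    bias_term = (2 * d + 2) * h_exp     # h1^{2d+2} ~ n^{bias_term}
    var_term = -(1 + h_exp)             # (n h1)^{-1} ~ n^{var_term}
    assert bias_term == var_term        # the two MSE terms balance exactly
    return bias_term

rate_d3 = optimal_rate_exponent(3)      # Fraction(-8, 9), matching n^{-8/9}
```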

Corollary 4.1. Under the same assumptions as in Theorem 4.1,

    √(n h_1) ( α̂_{tvARCH}(u_0) − α_{tvARCH}(u_0) − b(u_0) )
        →D  N_{p+1}( 0, e_{1,d+1}^⊤ A_1^{−1} B_1 A_1^{−1} e_{1,d+1} Var(v_t^2) S^{−1} Ω S^{−1} ),

where b(u_0) = Bias(α̂_{tvARCH}(u_0)) and →D denotes convergence in distribution.

Corollary 4.2. Let σ̂_t^2 = α̂_{tvARCH}(u_t)^⊤ [1, ε_{t−1}^2, ..., ε_{t−p}^2]^⊤_{(p+1)×1}. Then under Assumptions 1 and 2,

    Bias(σ̂_t^2) = E(σ̂_t^2 − σ_t^2) = O_P(h_1^{d+1}) + O(ρ^{p_n}),

where 0 < ρ < 1 and p_n → ∞ as n → ∞.

Corollary 4.2 can be proved using Proposition 2.2, equation (5) and Theorem 4.1. It shows that the choice of p_n contributes to the bias of the conditional variance in the initial step by a term which decays geometrically; therefore, this term has a negligible effect on the final estimators as p_n → ∞. In Theorem 4.2, we derive the asymptotic bias and variance of the estimators of the tvGARCH parameter functions obtained in Step 2. Towards this, we first introduce a few more notations.


Notation.

    b_j = b_j(u_0) = Bias(α̂_j(u_0)),    δ_j = δ_j(u_0) = α_j(u_0) + b_j(u_0),    j = 0, 1, ..., p,

    λ_1 = δ_0 + Σ_{j=1}^p δ_j w_2,    λ_2 = δ_0 w_2 + Σ_{j=1}^p δ_j C_j,

    λ_3 = δ_0^2 + 2 δ_0 w_2 Σ_{j=1}^p δ_j + Σ_{j=1}^p δ_j^2 w_4 + 2 Σ_{i,j=1 (i<j)}^p δ_i δ_j C_{j−i},

    λ_{1b} = b_0 + Σ_{j=1}^p b_j w_2,    λ_{2b} = b_0 w_2 + Σ_{j=1}^p b_j C_j,

    λ_{3b} = δ_0 b_0 + ( b_0 Σ_{j=1}^p δ_j + δ_0 Σ_{j=1}^p b_j ) ...
i,j=1(i


It is interesting to note that the bias expressions are free of the derivatives of the other parameter functions. Also, if h_1 = o(h_2), then δ_j = α_j(u_0) + o_P(h_2^{d+1}) and the variance of the estimator does not depend on the first step bandwidth. This means that when the optimal bandwidth is used, the estimation remains unaffected over a large range of initial step bandwidths, which makes the estimation procedure relatively easy to implement. The MSE of the final estimator is O_P(h_2^{2d+2} + (n h_2)^{−1}), which is independent of the initial step bandwidth. Notice that this MSE achieves the optimal rate of convergence of order n^{−(2d+2)/(2d+3)} for an optimal bandwidth h_2 of order n^{−1/(2d+3)} and h_1 = o(h_2). In the following corollary, we prove the asymptotic normality of the estimator using the martingale central limit theorem.

Corollary 4.3. Under the same assumptions as those of Theorem 4.2,
\[
\sqrt{nh_2}\left(\hat\beta_{\mathrm{tvGARCH}}(u_0) - \beta_{\mathrm{tvGARCH}}(u_0) - b_{\mathrm{tvGARCH}}(u_0)\right)
\xrightarrow{D}
N_3\!\left(0,\; \mathrm{Var}(v_t^2)\, \big(e_{1,d+1}^{\top} A_2^{-1} B_2 A_2^{-1} e_{1,d+1}\big)\, S_2^{-1} \Omega_2 S_2^{-1}\right),
\]
where $\beta_{\mathrm{tvGARCH}}(u_0) = [\omega(u_0), \alpha(u_0), \beta(u_0)]^{\top}$ and $b_{\mathrm{tvGARCH}}(u_0) = [\mathrm{Bias}(\hat\omega(u_0)), \mathrm{Bias}(\hat\alpha(u_0)), \mathrm{Bias}(\hat\beta(u_0))]^{\top}$.

Remark 4.1. The above results lead to the following two important issues, which need further investigation.

1. The asymptotic distributions of the estimators of the parameter functions depend on the parameters of the stationary approximation to the tvGARCH defined in (3), which is unobservable. Therefore, to derive a confidence band (or point-wise confidence intervals), one can use bootstrap methods. Fryzlewicz, Sapatinas and Subba Rao (2008) used the residual bootstrap method of Franke and Kreiss (1992) to construct point-wise confidence intervals for the least-squares estimator of the tvARCH model. To avoid instability of the generated process, they modified their estimator so that the sum of all the estimated coefficients remains less than one. However, their method does not guarantee non-negative estimators, which results in some of the bootstrapped squared residuals being negative. To tackle this problem, one needs to carefully formulate a bootstrap procedure and establish its validity. Another approach would be to modify the estimation procedure itself to satisfy these constraints; see, for example, Bose and Mukherjee (2009). This problem is under investigation.

2. Our method assumes that all three tvGARCH parameter functions have the same degree of smoothness, and hence that they can be approximated equally well in the same interval. If the functions possess different degrees of smoothness, the proposed method may not give optimal estimators (see Fan and Zhang (1999)). Therefore, one has to construct an estimator that is adaptive to different degrees of smoothness in the different parameter functions.

5 Modelling and forecasting volatility using tvGARCH

We analyze the currency exchange rates between five major developing economies at the forefront of the global economic recovery, viz. Brazil (BRL), Russia (RUB), India (INR), China (CNY) and South Africa (RND) (the so-called ‘BRICS’), and the developed economies, viz. the United States (USD) and Europe (EURO). The last decade saw the ‘BRICS’ making their mark on the global economic landscape. In recent times, these economies have been severely affected by the global financial crisis and currency wars, which motivated us to analyze these exchange rate data using the tvGARCH model. Applications of the tvGARCH model are also discussed for four stock indices: S&P 500, Dow Jones, Bombay Stock Exchange (BSE, India) and National Stock Exchange (NSE, India). All the data sets consist of daily percent log returns ranging from the beginning of 2000 (starting dates varying) to December 31, 2010, except the NSE data, which start from January 2002. The data are available from the websites of the US Federal Reserve, the European Central Bank and www.finance.yahoo.com. Figures 1 and 2 depict the plots of the return data and the autocorrelation functions of the squared returns. In Table 1, we provide the summary statistics of the data.

To compare the in-sample prediction performance of the tvGARCH model with several other well-known existing models, we compute the aggregated mean squared error (AMSE) (see Fryzlewicz, Sapatinas and Subba Rao (2008)):
\[
\mathrm{AMSE} = \sum_{t=1}^{n} \left(\epsilon_t^2 - \hat\sigma_t^2\right)^2,
\]
where $\hat\sigma_t^2$ and $\epsilon_t^2$ are the predicted volatility and the squared return at time $t$, and $n$ denotes the sample size. These are reported in Table 2, with the lowest AMSEs presented in bold. Here, the GARCH(1,1), EGARCH(1,1) and GJR(1,1) models (see Engle and Ng (1993) and the references therein) are estimated using SAS, while MATLAB is used for the estimation of the FIGARCH($1, d_0, 1$) model, where $d_0$ is the fractional differencing parameter to be estimated from the data (Baillie (1996)). The definitions of these models are provided in Appendix C. R codes have been written for the estimation of the tvGARCH (with $d = 3, 1$ and $p = \log n$) and tvARCH models using the Epanechnikov kernel. All the codes are available from the authors on request. The choices $d = 3$ and $d = 1$ yield the optimal rates of convergence of order $n^{-8/9}$ and $n^{-4/5}$, respectively, and $p = \log n$ requires fewer parameters to be estimated in Step 1 than other choices of $p$ such as $\sqrt{n}$. The bandwidth is selected using the cross-validation method described in Section 3.1. Estimation of the tvARCH model has been carried out using the Step 1 methodology of Section 3, with the bandwidth chosen by cross validation, minimizing the mean-squared prediction error for the tvARCH model (Hart (1994)). The EGARCH model could not be estimated for the CNY/USD data due to convergence problems.
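The Step 1 fit described above is a kernel-weighted least-squares regression of the squared returns on their lags. The following is a minimal illustrative sketch (local constant, i.e. $d = 0$, rather than the degree-$d$ local polynomial of Section 3; the function names and the simulated data are ours, not the authors' R code):

```python
import numpy as np

def epanechnikov(x):
    """Epanechnikov kernel K(x) = 0.75 (1 - x^2) on [-1, 1]."""
    return np.where(np.abs(x) <= 1.0, 0.75 * (1.0 - x**2), 0.0)

def tvarch_local_fit(eps2, u0, h, p):
    """Kernel-weighted least squares of eps_t^2 on [1, eps_{t-1}^2, ..., eps_{t-p}^2]
    at rescaled time u0 with bandwidth h (local-constant sketch, d = 0)."""
    n = len(eps2)
    u = np.arange(1, n + 1) / n                       # rescaled times t/n
    # lagged design matrix for t = p+1, ..., n
    X = np.column_stack([np.ones(n - p)] +
                        [eps2[p - j:n - j] for j in range(1, p + 1)])
    y = eps2[p:]
    w = epanechnikov((u[p:] - u0) / h)                # kernel weights K_h(u_t - u0)
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef                                       # estimates of alpha_0(u0), ..., alpha_p(u0)

# sanity check on simulated ARCH(1) data with constant parameters (0.1, 0.3)
rng = np.random.default_rng(0)
n, omega, a1 = 4000, 0.1, 0.3
eps = np.zeros(n)
for t in range(1, n):
    sig2 = omega + a1 * eps[t - 1] ** 2               # conditional variance recursion
    eps[t] = np.sqrt(sig2) * rng.standard_normal()
coef = tvarch_local_fit(eps**2, u0=0.5, h=0.25, p=1)  # should be near (0.1, 0.3)
```

Since the simulated parameters are constant, the local fit at any interior $u_0$ should recover them approximately; in the time-varying case one would evaluate the fit on a grid of $u_0$ values.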

The superiority of the tvGARCH model is evident from Table 2. The non-stationary models have clearly outperformed the stationary as well as the long-memory models. The AMSEs of the tvGARCH with $d = 3$ are smaller than those with $d = 1$ in most of the cases, although the difference between the two is not large. An illustrative comparison of the tvGARCH ($d = 3$) model is also shown in Figure 3 for the BRL/EURO data. The faint plot depicts the squared returns and the dark plot the predicted volatility under the corresponding model. Clearly, the tvGARCH model has captured the ups and downs in the volatility more accurately.
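For a given set of fitted volatilities, the AMSE comparison underlying Tables 2 and 3 is a one-line computation; a minimal sketch (the names and toy numbers are illustrative):

```python
import numpy as np

def amse(eps2, sigma2_hat):
    """Aggregated mean squared error between squared returns and predicted
    volatility: AMSE = sum_t (eps_t^2 - sigma_hat_t^2)^2."""
    eps2 = np.asarray(eps2, dtype=float)
    sigma2_hat = np.asarray(sigma2_hat, dtype=float)
    return float(np.sum((eps2 - sigma2_hat) ** 2))

# toy comparison: the forecast closer to the realized squared returns wins
eps2 = np.array([1.0, 4.0, 0.25, 2.25])
good = np.array([1.1, 3.8, 0.3, 2.0])
bad = np.array([2.0, 2.0, 2.0, 2.0])
assert amse(eps2, good) < amse(eps2, bad)
```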

In Figure 4, we plot the estimators $\hat\omega(u)$, $\hat\alpha(u)$, $\hat\beta(u)$ and $\hat\alpha(u) + \hat\beta(u)$ against $u \in (0, 1]$ for the BSE data. Notice that, similar to the least-squares estimators of Fryzlewicz, Sapatinas and Subba Rao (2008), the local polynomial estimators are not guaranteed to be non-negative. Although the estimators satisfy $\hat\alpha(u) + \hat\beta(u) < 1$ for these data, this may not be the case in general, depending on the behaviour of the data.

To compare the performance of the tvGARCH model further, in Table 3 we report the AMSEs for the in-sample monthly volatility (of 22 trading days) forecasts for the same data sets, based on the monthly returns. The monthly returns are calculated as $r_{mt} = \log(P_t/P_{t-1})$, $t = 1, 2, \ldots, T$, where $P_t$ denotes the closing price on the last day of the $t$-th month and $T$ is the total number of complete months in the data. All the datasets are of size around 125, except the NSE dataset, which has size 95. This analysis provides insight into the behaviour of the tvGARCH model for small data sets. Our numerical evidence indicates that the asymptotic properties derived in Section 4 regarding bandwidth selection also hold for these moderately sized monthly datasets. We did not multiply the returns by 100, to avoid large values; this, together with the small data size, has resulted in very small AMSEs, but for comparative purposes it makes no difference. Clearly, the tvGARCH performs better than the other models even for small sample sizes.
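The monthly return construction above can be sketched as follows (the prices shown are hypothetical; the paper's own implementation is in R):

```python
import numpy as np

def monthly_log_returns(month_end_prices):
    """r_mt = log(P_t / P_{t-1}) from the closing prices on the last
    trading day of each complete month."""
    p = np.asarray(month_end_prices, dtype=float)
    return np.log(p[1:] / p[:-1])

prices = [100.0, 104.0, 102.0, 108.0]      # hypothetical month-end closes
r = monthly_log_returns(prices)             # three monthly log returns
```

Note that the log returns telescope: their sum equals the log return over the whole period, $\log(P_T/P_0)$.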

One interesting conclusion that can be drawn from the above analyses is that the global crisis, and especially the currency wars, have pushed exchange rate volatility towards non-stationarity and short memory. This is quite plausible, as frequent manipulation of the currencies may cause the exchange rates to lose their widely held long-memory behaviour.

The ‘out of sample forecasting’ performance of the tvGARCH model has been judged using 50 daily forecasts computed by a rolling-window scheme. The out-of-sample forecasts of the tvGARCH model are computed as follows. Use the $n_1 = n - 50$ observations for the in-sample estimation. Then forecast into the future using the ‘last’ estimated coefficient values, that is, the estimates of the coefficient functions at $t = n_1$. Forecasts into the future are computed in the same way as for a stationary GARCH model using these last coefficient estimates. A similar method has been used by Fryzlewicz et al. (2008) for future forecasts with the tvARCH model. Let $\sigma^2_{t+1|t}$, $t = n_1, n_1+1, \ldots, n-1$, denote the one-step-ahead out-of-sample forecasts using the previous $n_1$ observations. We compare $\sigma^2_{t+1|t}$ with $\epsilon^2_{t+1}$, $t = n-50, n-49, \ldots, n-1$, to obtain the AMSEs, which are reported in Table 4.
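The rolling one-step-ahead scheme described above can be sketched using the standard GARCH(1,1) forecast recursion $\sigma^2_{t+1|t} = \hat\omega + \hat\alpha\,\epsilon^2_t + \hat\beta\,\sigma^2_{t|t-1}$ with the ‘last’ coefficient estimates held fixed (the coefficient values and names below are illustrative, not estimates from the paper's data):

```python
import numpy as np

def one_step_forecasts(eps, omega, alpha, beta, n1):
    """One-step-ahead volatility forecasts for the last n - n1 observations,
    computed as in a stationary GARCH(1,1) with fixed ('last') coefficients:
    sigma^2_{t+1|t} = omega + alpha * eps_t^2 + beta * sigma^2_{t|t-1}."""
    n = len(eps)
    sig2 = np.empty(n + 1)
    sig2[0] = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    for t in range(n):                        # filter the volatility forward
        sig2[t + 1] = omega + alpha * eps[t] ** 2 + beta * sig2[t]
    return sig2[n1:n]                         # forecasts of eps[n1]**2, ..., eps[n-1]**2

rng = np.random.default_rng(1)
eps = rng.standard_normal(300)                # placeholder return series
f = one_step_forecasts(eps, omega=0.05, alpha=0.1, beta=0.8, n1=250)
# f holds the 50 one-step-ahead forecasts, compared with eps[250:] ** 2
```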

The out-of-sample forecasts using the tvGARCH model are better than those of the other models. The tvGARCH attains the lowest AMSE for 7 data sets, while the tvARCH(2) is better in 1 case. The FIGARCH and EGARCH models give good forecasts for two data sets each, while the GARCH and GJR models perform poorly.

It is noticeable that the tvGARCH model with $d = 1$ performs better than the tvGARCH with $d = 3$ in out-of-sample forecasting, although there is not much of a difference between their AMSEs. The better in-sample performance of the tvGARCH ($d = 3$) over the tvGARCH ($d = 1$) can be explained to some extent by the fact that a larger $d$ yields a faster convergence rate of the MSE; this need not carry over to out-of-sample forecasting. Since the difference between the tvGARCH models with $d = 3$ and $d = 1$ is not large, it seems better and more practical to use $d = 1$. A further advantage of $d = 1$ is that it reduces the number of parameters to be estimated.
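The parameter count behind this remark is simple arithmetic: in Step 2, each of the three parameter functions $\omega(\cdot)$, $\alpha(\cdot)$, $\beta(\cdot)$ is approximated by a degree-$d$ local polynomial, so the local coefficient vector $\beta_2$ has dimension $3(d+1)$ (consistent with the selection vectors $e_{\,\cdot\,,3(d+1)}$ used in Section 4):

```latex
\dim(\beta_2) = 3(d+1): \qquad
d = 1 \;\Rightarrow\; 6 \text{ local coefficients}, \qquad
d = 3 \;\Rightarrow\; 12 \text{ local coefficients}.
```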

Acknowledgments

The first author would like to acknowledge the Council of Scientific and Industrial Research (CSIR), India, for the award of a junior research fellowship. The second author's research is supported by a research grant from CSIR under the head 25(0175)/09/EMR-II.

Appendix A: Proofs

In this Appendix, we provide the proofs of the results discussed in Sections 2 and 4, along with some auxiliary lemmas.

Proof of Proposition 2.1. By recursive substitution in (2), we obtain
\[
\sigma_t^2 = \omega\!\left(\tfrac{t}{n}\right)
+ \sum_{i=1}^{t-1} \omega\!\left(\tfrac{t-i}{n}\right) \prod_{j=1}^{i} \left[\alpha\!\left(\tfrac{t-j+1}{n}\right) v_{t-j}^2 + \beta\!\left(\tfrac{t-j+1}{n}\right)\right]
+ \prod_{i=1}^{t} \left[\alpha\!\left(\tfrac{i}{n}\right) v_{i-1}^2 + \beta\!\left(\tfrac{i}{n}\right)\right] \sigma_0^2. \tag{10}
\]
Suppose $u_1 = \arg\max_u (\alpha(u) + \beta(u))$; then, using the strong law of large numbers, as $t \to \infty$,
\[
\prod_{i=1}^{t} \left[\alpha\!\left(\tfrac{i}{n}\right) v_{i-1}^2 + \beta\!\left(\tfrac{i}{n}\right)\right] \sigma_0^2
\le \prod_{i=1}^{t} \left[\alpha(u_1) v_{i-1}^2 + \beta(u_1)\right] \sigma_0^2
\to \sigma_0^2 \exp(t\gamma^*) \to 0,
\]
as $\gamma^* = E[\log(\alpha(u_1) v_t^2 + \beta(u_1))] < 0$ by Assumption 1(i). The proof of the uniqueness of the solution is similar to the proof of Proposition 1 of Dahlhaus and Subba Rao (2006). The lower limit for $\bar\sigma_t^2$ is easy to obtain from the series.

Proof of Proposition 2.2. Notice that
\[
\mathrm{Cov}(\epsilon_t^2, \epsilon_{t+h}^2) = \mathrm{Cov}(\sigma_t^2 v_t^2,\, \sigma_{t+h}^2 v_{t+h}^2).
\]
Now the result can be proved using the expansion for $\sigma_t^2$ in (10) above and Assumption 1(i). We omit the details.


Proof of Proposition 2.3. We can write
\[
\left|\epsilon_t^2 - \tilde\epsilon_t^2(u_0)\right| \le \left|\epsilon_t^2 - \tilde\epsilon_t^2\!\left(\tfrac{t}{n}\right)\right| + \left|\tilde\epsilon_t^2\!\left(\tfrac{t}{n}\right) - \tilde\epsilon_t^2(u_0)\right|.
\]
Now, using Proposition 2.1 and equation (4),
\[
\left|\epsilon_t^2 - \tilde\epsilon_t^2\!\left(\tfrac{t}{n}\right)\right|
= \left|\sigma_t^2 - \tilde\sigma_t^2\!\left(\tfrac{t}{n}\right)\right| v_t^2
= \left|\bar\sigma_t^2 - \bar{\tilde\sigma}_t^2\!\left(\tfrac{t}{n}\right)\right| v_t^2 \quad \text{a.s., but}
\]
\[
\left|\bar\sigma_t^2 - \bar{\tilde\sigma}_t^2\!\left(\tfrac{t}{n}\right)\right|
\le \left[\alpha\!\left(\tfrac{t}{n}\right) v_{t-1}^2 + \beta\!\left(\tfrac{t}{n}\right)\right]
\sum_{i=1}^{\infty} \left| \left\{\left[\alpha\!\left(\tfrac{t}{n}\right) v_{t-2}^2 + \beta\!\left(\tfrac{t}{n}\right)\right] + M\left(1 + v_{t-2}^2\right)\right\}
\prod_{j=3}^{i} \left[\alpha\!\left(\tfrac{t-j+1}{n}\right) v_{t-j}^2 + \beta\!\left(\tfrac{t-j+1}{n}\right)\right] \omega\!\left(\tfrac{t-i}{n}\right)
- \prod_{j=2}^{i} \left[\alpha\!\left(\tfrac{t}{n}\right) v_{t-j}^2 + \beta\!\left(\tfrac{t}{n}\right)\right] \omega\!\left(\tfrac{t}{n}\right) \right|,
\]
using Assumption 1(ii) (Lipschitz continuity of the parameters). Here we take $M = \max(M_1, M_2, M_3)$ and set $\prod_{j=i}^{i-k} \left[\alpha\!\left(\tfrac{t}{n}\right) v_{t-j}^2 + \beta\!\left(\tfrac{t}{n}\right)\right] = 1$ for all $k > 0$ (empty products equal one). Proceeding in a similar way, that is, replacing $\alpha\!\left(\tfrac{t-j+1}{n}\right)$ and $\beta\!\left(\tfrac{t-j+1}{n}\right)$ for each $j$ with $\alpha\!\left(\tfrac{t}{n}\right)$ and $\beta\!\left(\tfrac{t}{n}\right)$ successively using the Lipschitz continuity, after some algebra we arrive at
\[
\left|\epsilon_t^2 - \tilde\epsilon_t^2\!\left(\tfrac{t}{n}\right)\right| \le \frac{M v_t^2}{n}
\Bigg\{ \sum_{i=1}^{\infty} \prod_{j=1}^{i-1} \left[\alpha\!\left(\tfrac{t}{n}\right) v_{t-j}^2 + \beta\!\left(\tfrac{t}{n}\right)\right]
\left( \left[\alpha\!\left(\tfrac{t-i+1}{n}\right) i + \omega\!\left(\tfrac{t}{n}\right)(i-1)\right] v_{t-i}^2
+ \left[\beta\!\left(\tfrac{t-i+1}{n}\right) i + \omega\!\left(\tfrac{t}{n}\right)(i-1)\right] \right)
+ \sum_{i=3}^{\infty} \sum_{k=3}^{i} \prod_{l=1}^{k-2} \left[\alpha\!\left(\tfrac{t}{n}\right) v_{t-l}^2 + \beta\!\left(\tfrac{t}{n}\right)\right]
\left(1 + v_{t-k+1}^2\right) \omega\!\left(\tfrac{t-i}{n}\right)(k-2)
\prod_{j=k}^{i} \left[\alpha\!\left(\tfrac{t-j+1}{n}\right) v_{t-j}^2 + \beta\!\left(\tfrac{t-j+1}{n}\right)\right] \Bigg\}.
\]
Now suppose $Q^* = \max\left(\sup_u \omega(u), \sup_u \alpha(u), \sup_u \beta(u)\right) < \infty$ and $u_1 = \arg\max_u (\alpha(u) + \beta(u))$. Then
\[
\left|\epsilon_t^2 - \tilde\epsilon_t^2\!\left(\tfrac{t}{n}\right)\right| \le \frac{Q}{n} V_t, \qquad \text{where } Q = M Q^* \text{ and}
\]
\[
V_t = v_t^2 \sum_{i=1}^{\infty} \prod_{j=1}^{i-1} \left(\alpha(u_1) v_{t-j}^2 + \beta(u_1)\right)\left(1 + v_{t-i}^2\right)(2i-1)
+ v_t^2 \sum_{i=3}^{\infty} \sum_{k=3}^{i} \prod_{l=1}^{k-2} \left(\alpha(u_1) v_{t-l}^2 + \beta(u_1)\right)\left(1 + v_{t-k+1}^2\right)(k-2) \prod_{j=k}^{i} \left(\alpha(u_1) v_{t-j}^2 + \beta(u_1)\right).
\]
It can be shown that $V_t$ is a stationary ergodic process (Stout (1996), Theorem 3.5.8) with
\[
E|V_t| \le \sum_{i=1}^{\infty} 2(1-\delta)^{i-1}(2i-1) + \sum_{i=3}^{\infty} \sum_{k=3}^{i} 2(k-2)(1-\delta)^{i-1} < \infty,
\]
using Assumption 1(i). In a similar way, we can show that
\[
\left|\tilde\epsilon_t^2\!\left(\tfrac{t}{n}\right) - \tilde\epsilon_t^2(u_0)\right| \le Q \left|\tfrac{t}{n} - u_0\right| V_t.
\]
Hence the proposition follows.

In the following lemmas, we prove the results for a general bandwidth $h$, so that the results are applicable for both $h_1$ and $h_2$.

Lemma A.1. Let $\{Z_t\}$ be a sequence of ergodic random variables with $E|Z_t| < \infty$. Suppose that Assumption 2(ii) is satisfied. Then
\[
\text{(i)} \quad \frac{1}{nh} \sum_{k=p+1}^{n} (u_k - u_0)^i K\!\left(\tfrac{u_k - u_0}{h}\right) Z_k \xrightarrow{P} h^i \mu_i E(Z_t),
\]
\[
\text{(ii)} \quad \frac{1}{nh} \sum_{k=p+1}^{n} (u_k - u_0)^i K^2\!\left(\tfrac{u_k - u_0}{h}\right) Z_k \xrightarrow{P} h^i \nu_i E(Z_t), \qquad i = 1, 2, \ldots, 2d,
\]
where $h$ is a bandwidth such that $h \to 0$ and $nh \to \infty$ as $n \to \infty$.

Proof. The lemma can be proved using techniques similar to those in Dahlhaus and Subba Rao (2006, Lemmas A.1 and A.2). We omit the details.

Lemma A.2. Let Assumptions 1 and 2 be satisfied. Then
\[
\text{(i)} \quad \sum_{k=p+1}^{n} \frac{1}{nh} (u_k - u_0)^i K\!\left(\tfrac{u_k - u_0}{h}\right) \epsilon_{k-j_1}^{2l} \epsilon_{k-j_2}^{2m}
\xrightarrow{P} h^i \mu_i E\!\left(\tilde\epsilon_{k-j_1}^{2l}(u_0)\, \tilde\epsilon_{k-j_2}^{2m}(u_0)\right),
\]
for all $l, m \in \{0, 1, 2\}$ and $j_1, j_2 \in \{1, 2, \ldots, p\}$, $j_1 \neq j_2$;
\[
\text{(ii)} \quad \sum_{k=p+1}^{n} \frac{1}{nh} (u_k - u_0)^i K^2\!\left(\tfrac{u_k - u_0}{h}\right) \sigma_k^4\, \epsilon_{k-j_1}^{2l} \epsilon_{k-j_2}^{2m}
\xrightarrow{P} h^i \nu_i E\!\left(\tilde\sigma_k^4(u_0)\, \tilde\epsilon_{k-j_1}^{2l}(u_0)\, \tilde\epsilon_{k-j_2}^{2m}(u_0)\right),
\]
for all $l, m \in \{0, 1\}$ and $j_1, j_2 \in \{1, 2, \ldots, p\}$, where (ii) is true for $l, m > 0$ only if $E|v_t|^8 < \infty$.

Proof. (i) We prove the case $l = m = 2$; the other cases can be shown similarly. Using Lemma A.1, it is clear that
\[
\sum_{k=p+1}^{n} \frac{1}{nh} (u_k - u_0)^i K\!\left(\tfrac{u_k - u_0}{h}\right) \tilde\epsilon_{k-j_1}^{4}(u_0)\, \tilde\epsilon_{k-j_2}^{4}(u_0)
\xrightarrow{P} h^i \mu_i E\!\left(\tilde\epsilon_{k-j_1}^{4}(u_0)\, \tilde\epsilon_{k-j_2}^{4}(u_0)\right). \tag{11}
\]
Now consider
\[
\sum_{k=p+1}^{n} \frac{1}{nh} (u_k - u_0)^i K\!\left(\tfrac{u_k - u_0}{h}\right)
\left| \tilde\epsilon_{k-j_1}^{4}(u_0)\, \tilde\epsilon_{k-j_2}^{4}(u_0) - \epsilon_{k-j_1}^{4} \epsilon_{k-j_2}^{4} \right|
\]
\[
\le \sum_{k=p+1}^{n} \frac{1}{nh} (u_k - u_0)^i K\!\left(\tfrac{u_k - u_0}{h}\right)
\left[ \tilde\epsilon_{k-j_2}^{4}(u_0) \left(\tilde\epsilon_{k-j_1}^{2}(u_0) + \epsilon_{k-j_1}^{2}\right)
\left| \tilde\epsilon_{k-j_1}^{2}(u_0) - \epsilon_{k-j_1}^{2} \right|
+ \epsilon_{k-j_1}^{4} \left(\tilde\epsilon_{k-j_2}^{2}(u_0) + \epsilon_{k-j_2}^{2}\right)
\left| \tilde\epsilon_{k-j_2}^{2}(u_0) - \epsilon_{k-j_2}^{2} \right| \right]
\le Q h^{i+1} R = O_P(h^{i+1}),
\]
where
\[
R = \sum_{k=p+1}^{n} \frac{1}{nh} \left(\tfrac{u_k - u_0}{h}\right)^i K\!\left(\tfrac{u_k - u_0}{h}\right)
\left[ \tilde\epsilon_{k-j_2}^{4}(u_0) \left(\tilde\epsilon_{k-j_1}^{2}(u_0) + \epsilon_{k-j_1}^{2}\right)
\left( \frac{|u_{k-j_1} - u_0|}{h} + \frac{1}{nh} \right) V_{k-j_1}
+ \epsilon_{k-j_1}^{4} \left(\tilde\epsilon_{k-j_2}^{2}(u_0) + \epsilon_{k-j_2}^{2}\right)
\left( \frac{|u_{k-j_2} - u_0|}{h} + \frac{1}{nh} \right) V_{k-j_2} \right]
\]
(using Proposition 2.3). Now, using Proposition 2.3 for $\epsilon_{k-j_1}^{2}$ and $\epsilon_{k-j_2}^{2}$ in the expression for $R$ together with Lemma A.1, it can be shown that $E|R| < \infty$. Hence, using (11), the lemma holds as $n \to \infty$.

(ii) Using the form (5) of the tvGARCH model, we can write
\[
\sigma_t^2 = \alpha_0\!\left(\tfrac{t}{n}\right) + \alpha_1\!\left(\tfrac{t}{n}\right) \epsilon_{t-1}^2 + \cdots + \alpha_{p_n}\!\left(\tfrac{t}{n}\right) \epsilon_{t-p_n}^2 + O_P(\rho^{p_n}),
\]
where $0 < \rho < 1$ and $p_n \to \infty$ as $n \to \infty$. The parameter functions $\alpha_j(u)$, $j = 0, 1, \ldots, p_n$, are bounded and continuous under Assumption 2(i). The result can be proved using this form of $\sigma_t^2$ in a similar way as in (i) above. We omit the details.

Lemma A.3. Under Assumptions 1 and 2,
\[
\frac{1}{n} X_1^{\top} W_1 X_1 \xrightarrow{P} S \otimes A_1,
\]
where $\otimes$ denotes the Kronecker product.

Proof. The proof follows using the expansion of $X_1^{\top} W_1 X_1$ and Lemma A.2(i).

Lemma A.4. Suppose that Assumptions 1 and 2 are satisfied. In addition, assume that $E|v_t|^8 < \infty$. Then
\[
\mathrm{Var}\left( \sum_{k=p+1}^{n} (u_k - u_0)^i K_h(u_k - u_0)\, (v_k^2 - 1)\, \sigma_k^2 \left[1, \epsilon_{k-1}^2, \ldots, \epsilon_{k-p}^2\right]^{\top} \right)
= n h^{2i-1} \nu_{2i}\, \mathrm{Var}(v_t^2)\, \Omega\, (1 + o_P(1)), \qquad i = 1, 2, \ldots, d.
\]


Proof. Let $\mathcal{F}_{t-1} = \sigma(\epsilon_{t-1}^2, \epsilon_{t-2}^2, \ldots)$. Then
\[
\mathrm{Var}\left( \sum_{k=p+1}^{n} (u_k - u_0)^i K_h(u_k - u_0)(v_k^2 - 1)\sigma_k^2 \left[1, \epsilon_{k-1}^2, \ldots, \epsilon_{k-p}^2\right]^{\top} \right)
\]
\[
= E\left( \sum_{k=p+1}^{n} (u_k - u_0)^{2i} K_h^2(u_k - u_0)\, \mathrm{Var}\!\left( (v_k^2 - 1)\sigma_k^2 \left[1, \epsilon_{k-1}^2, \ldots, \epsilon_{k-p}^2\right]^{\top} \,\middle|\, \mathcal{F}_{k-1} \right) \right)
\]
\[
= E\left( \sum_{k=p+1}^{n} (u_k - u_0)^{2i} K_h^2(u_k - u_0)\, \mathrm{Var}(v_k^2)\, \sigma_k^4 \left[1, \epsilon_{k-1}^2, \ldots, \epsilon_{k-p}^2\right]^{\top} \left[1, \epsilon_{k-1}^2, \ldots, \epsilon_{k-p}^2\right] \right)
\]
\[
= n h^{2i-1} \nu_{2i}\, \mathrm{Var}(v_t^2)\, \Omega\, (1 + o_P(1)) \qquad \text{(using Lemma A.2(ii))}.
\]

Proof of Theorem 4.1. Let us denote $\beta_1 = [\alpha_{00}, \alpha_{01}, \ldots, \alpha_{0d}, \ldots, \alpha_{p0}, \ldots, \alpha_{pd}]^{\top}$. Using a Taylor series expansion, we can write
\[
Y_1 = X_1 \left[ \alpha_0(u_0), \alpha_0^{(1)}(u_0), \ldots, \frac{\alpha_0^{(d)}(u_0)}{d!}, \alpha_1(u_0), \ldots, \alpha_p(u_0), \ldots, \frac{\alpha_p^{(d)}(u_0)}{d!} \right]^{\top}
+ \frac{1}{(d+1)!} \begin{bmatrix} \alpha_0^{(d+1)}(\zeta_{0(p+1)})(u_{p+1} - u_0)^{d+1} \\ \vdots \\ \alpha_0^{(d+1)}(\zeta_{0(n)})(u_n - u_0)^{d+1} \end{bmatrix}
+ \frac{1}{(d+1)!} \sum_{j=1}^{p} \begin{bmatrix} \alpha_j^{(d+1)}(\zeta_{j(p+1)})(u_{p+1} - u_0)^{d+1} \epsilon_{p+1-j}^2 \\ \vdots \\ \alpha_j^{(d+1)}(\zeta_{j(n)})(u_n - u_0)^{d+1} \epsilon_{n-j}^2 \end{bmatrix}
+ \sigma^2 * (v^2 - e_{n-p}),
\]
where $\sigma^2 = [\sigma_{p+1}^2, \ldots, \sigma_n^2]^{\top}$, $v^2 = [v_{p+1}^2, \ldots, v_n^2]^{\top}$, $*$ denotes the componentwise product of vectors (for $x = [x_1, \ldots, x_p]^{\top}$ and $y = [y_1, \ldots, y_p]^{\top}$, $x * y = [x_1 y_1, x_2 y_2, \ldots, x_p y_p]^{\top}$), and $\zeta_{jk}$, $j = 0, 1, \ldots, p$, $k = p+1, \ldots, n$, lie between $u_k$ and $u_0$. Multiplying both sides by $(X_1^{\top} W_1 X_1)^{-1} X_1^{\top} W_1$,
\[
\hat\beta_1(u_0) = \beta_1(u_0)
+ \frac{1}{(d+1)!} (X_1^{\top} W_1 X_1)^{-1} X_1^{\top} W_1 \begin{bmatrix} \alpha_0^{(d+1)}(\zeta_{0(p+1)})(u_{p+1} - u_0)^{d+1} \\ \vdots \\ \alpha_0^{(d+1)}(\zeta_{0(n)})(u_n - u_0)^{d+1} \end{bmatrix}
+ \frac{1}{(d+1)!} \sum_{j=1}^{p} (X_1^{\top} W_1 X_1)^{-1} X_1^{\top} W_1 \begin{bmatrix} \alpha_j^{(d+1)}(\zeta_{j(p+1)})(u_{p+1} - u_0)^{d+1} \epsilon_{p+1-j}^2 \\ \vdots \\ \alpha_j^{(d+1)}(\zeta_{j(n)})(u_n - u_0)^{d+1} \epsilon_{n-j}^2 \end{bmatrix}
+ (X_1^{\top} W_1 X_1)^{-1} X_1^{\top} W_1 \left(\sigma^2 * (v^2 - e_{n-p})\right). \tag{12}
\]
Now it is not difficult to show, using Lemma A.2(i), that
\[
X_1^{\top} W_1 \begin{bmatrix} \alpha_0^{(d+1)}(\zeta_{0(p+1)})(u_{p+1} - u_0)^{d+1} \\ \vdots \\ \alpha_0^{(d+1)}(\zeta_{0(n)})(u_n - u_0)^{d+1} \end{bmatrix}
= n h_1^{d+1} \alpha_0^{(d+1)}(u_0)\, [1, e_p^{\top} w_2]^{\top} (1 + o_P(1)) \otimes D_1,
\]
\[
X_1^{\top} W_1 \begin{bmatrix} \alpha_j^{(d+1)}(\zeta_{j(p+1)})(u_{p+1} - u_0)^{d+1} \epsilon_{p+1-j}^2 \\ \vdots \\ \alpha_j^{(d+1)}(\zeta_{j(n)})(u_n - u_0)^{d+1} \epsilon_{n-j}^2 \end{bmatrix}
= n h_1^{d+1} \alpha_j^{(d+1)}(u_0)\, [w_2, C_{j-1}, \ldots, C_{j-p}]^{\top} (1 + o_P(1)) \otimes D_1,
\]
and, using Lemma A.3,
\[
(X_1^{\top} W_1 X_1)^{-1} = (1/n)\, S^{-1} (1 + o_P(1)) \otimes A_1^{-1}.
\]
Hence, the asymptotic bias is given by
\[
E(\hat\beta_1(u_0) - \beta_1(u_0))
= \frac{h_1^{d+1}}{(d+1)!} \Big\{ \alpha_0^{(d+1)}(u_0)\, (S^{-1} \otimes A_1^{-1}) \left( [1, w_2 e_p^{\top}]^{\top} \otimes D_1 \right)
+ \sum_{j=1}^{p} \alpha_j^{(d+1)}(u_0)\, (S^{-1} \otimes A_1^{-1}) \left( [w_2, C_{j-1}, \ldots, C_{j-p}]^{\top} \otimes D_1 \right) \Big\} + o_P(h_1^{d+1}).
\]
Notice that $C_0 = w_4$. Now
\[
E(\hat\beta_1(u_0) - \beta_1(u_0))
= \frac{h_1^{d+1}}{(d+1)!} (S^{-1} \otimes A_1^{-1}) \left\{ \left( \alpha_0^{(d+1)}(u_0) [1, w_2 e_p^{\top}]^{\top}
+ \sum_{j=1}^{p} \alpha_j^{(d+1)}(u_0) [w_2, C_{j-1}, \ldots, C_{j-p}]^{\top} \right) \otimes D_1 \right\} + o_P(h_1^{d+1})
\]
\[
= \frac{h_1^{d+1}}{(d+1)!} (S^{-1} \otimes A_1^{-1}) \left\{ \left( S\, [\alpha_0^{(d+1)}(u_0), \alpha_1^{(d+1)}(u_0), \ldots, \alpha_p^{(d+1)}(u_0)]^{\top} \right) \otimes D_1 \right\} + o_P(h_1^{d+1})
\]
\[
= \frac{h_1^{d+1}}{(d+1)!} [\alpha_0^{(d+1)}(u_0), \alpha_1^{(d+1)}(u_0), \ldots, \alpha_p^{(d+1)}(u_0)]^{\top} \otimes A_1^{-1} D_1 + o_P(h_1^{d+1}).
\]
Notice that $\mathrm{Bias}(\hat\alpha_j(u_0)) = e_{j(d+1)+1,\,(p+1)(d+1)}^{\top}\, \mathrm{Bias}(\hat\beta_1(u_0))$. Hence the bias expression is obtained.

Now the asymptotic variance is
\[
\mathrm{Var}(\hat\beta_1(u_0))
= (1/n)\left(S^{-1}(1 + o_P(1)) \otimes A_1^{-1}\right) \mathrm{Var}\!\left(X_1^{\top} W_1 (\sigma^2 * (v^2 - e_{n-p}))\right) (1/n)\left(S^{-1}(1 + o_P(1)) \otimes A_1^{-1}\right)
\]
\[
= (1/n)\left(S^{-1}(1 + o_P(1)) \otimes A_1^{-1}\right) \left( (n/h_1)\, \mathrm{Var}(v_t^2)\, \Omega (1 + o_P(1)) \otimes B_1 \right) (1/n)\left(S^{-1}(1 + o_P(1)) \otimes A_1^{-1}\right),
\]
using Lemma A.4. The desired expression can be obtained after some simplification using the properties of the Kronecker product.


Lemma A.5. Suppose that Assumptions 1 and 2 are satisfied. Then
\[
\text{(i)} \quad \frac{1}{nh_2} \sum_{t=2}^{n} (u_t - u_0)^i K\!\left(\tfrac{u_t - u_0}{h_2}\right) \hat\sigma_{t-1}^2 \xrightarrow{P} h_2^i \mu_i \lambda_1,
\]
\[
\text{(ii)} \quad \frac{1}{nh_2} \sum_{t=2}^{n} (u_t - u_0)^i K\!\left(\tfrac{u_t - u_0}{h_2}\right) \hat\sigma_{t-1}^2\, \epsilon_{t-1}^2 \xrightarrow{P} h_2^i \mu_i \lambda_2,
\]
\[
\text{(iii)} \quad \frac{1}{nh_2} \sum_{t=2}^{n} (u_t - u_0)^i K\!\left(\tfrac{u_t - u_0}{h_2}\right) \hat\sigma_{t-1}^4 \xrightarrow{P} h_2^i \mu_i \lambda_3.
\]

Proof. (i) It is evident from (12) (in the proof of Theorem 4.1) that, for $j = 0, 1, \ldots, p$,
\[
\hat\alpha_j(u_0) = \delta_j(u_0) + e_{j(d+1)+1,\,(p+1)(d+1)}^{\top} (X_1^{\top} W_1 X_1)^{-1} X_1^{\top} W_1 \left(\sigma^2 * (v^2 - e_{n-p})\right).
\]
Therefore
\[
\hat\sigma_{t-1}^2 = \delta_0(u_{t-1}) + \sum_{j=1}^{p} \delta_j(u_{t-1})\, \epsilon_{t-j-1}^2 + R_1^*, \tag{13}
\]
where
\[
R_1^* = \left( e_{1,(p+1)(d+1)}^{\top} + \sum_{j=1}^{p} e_{j(d+1)+1,\,(p+1)(d+1)}^{\top}\, \epsilon_{t-j-1}^2 \right) (X_1^{\top} W_1 X_1)^{-1} X_1^{\top} W_1 \left(\sigma^2 * (v^2 - e_{n-p})\right).
\]
Clearly, $E(R_1^*) = 0$. Here the $\delta_j(\cdot)$'s are continuous functions. Substituting the expression (13) for $\hat\sigma_{t-1}^2$ in (i) and using Lemma A.2, the result can be proved. Here,
\[
\frac{1}{nh_2} \sum_{t=2}^{n} (u_t - u_0)^i K\!\left(\tfrac{u_t - u_0}{h_2}\right) \hat\sigma_{t-1}^2
= \frac{1}{nh_2} \sum_{t=2}^{n} (u_t - u_0)^i K\!\left(\tfrac{u_t - u_0}{h_2}\right) \left( \delta_0(u_{t-1}) + \sum_{j=1}^{p} \delta_j(u_{t-1})\, \epsilon_{t-j-1}^2 \right)
+ \frac{1}{nh_2} \sum_{t=2}^{n} (u_t - u_0)^i K\!\left(\tfrac{u_t - u_0}{h_2}\right) R_1^*.
\]
The first term of the above expression converges in probability to $h_2^i \mu_i\, E\!\left(\delta_0(u_{t-1}) + \sum_{j=1}^{p} \delta_j(u_{t-1})\, \tilde\epsilon_{t-j-1}^2(u_0)\right) = h_2^i \mu_i \lambda_1$. Using a methodology similar to that of Lemma A.2, it can be shown that
\[
\frac{1}{nh_2} \sum_{t=2}^{n} (u_t - u_0)^i K\!\left(\tfrac{u_t - u_0}{h_2}\right) \epsilon_{t-j}^{2l}\, \sigma_t^2 (v_t^2 - 1)
\xrightarrow{P} h_2^i \mu_i\, E\!\left(\tilde\epsilon_{t-j}^{2l}(u_0)\, \tilde\sigma_t^2(u_0)(v_t^2 - 1)\right) = 0, \qquad l \in \{0, 1\},\; j = 1, 2, \ldots, p.
\]
This implies that $X_1^{\top} W_1 \left(\sigma^2 * (v^2 - e_{n-p})\right) \xrightarrow{P} 0$. Therefore, using Lemma A.3, $R_1^* \xrightarrow{P} 0$, and hence the proof follows. The other parts of the lemma can be proved similarly.

Lemma A.6. Suppose that Assumptions 1 and 2 are satisfied. Then
\[
\frac{1}{n} X_2^{\top} W_2 X_2 \xrightarrow{P} S_2 \otimes A_2.
\]

Proof. Notice that
\[
X_2^{\top} W_2 X_2 = \sum_{t=2}^{n} K_{h_2}(u_t - u_0) \left( [1, \epsilon_{t-1}^2, \hat\sigma_{t-1}^2]^{\top} [1, \epsilon_{t-1}^2, \hat\sigma_{t-1}^2] \otimes U_t^{\top} U_t \right).
\]
Hence the result can be easily proved using Lemma A.5.

Lemma A.7. Under the same assumptions as in Lemma A.4,
\[
\mathrm{Var}\left( \sum_{k=p+1}^{n} (u_k - u_0)^i K_{h_2}(u_k - u_0)\, (v_k^2 - 1)\, \sigma_k^2 \left[1, \epsilon_{k-1}^2, \hat\sigma_{k-1}^2\right]^{\top} \right)
= n h_2^{2i-1} \nu_{2i}\, \mathrm{Var}(v_t^2)\, \Omega_2\, (1 + o_P(1)), \qquad i = 1, 2, \ldots, d.
\]

Proof. This can be proved in a similar way as Lemma A.4, using (13). We omit the details.
details.<br />

Pro<strong>of</strong> <strong>of</strong> Theorem 4.2. Denote<br />

β2 = (ω02,ω12,...,ωd2,a02,...,ad0, b02,...,bd2). Using Taylor’s series expansion in (8),<br />

ˆβ2(u0) = β2(u0) +<br />

1<br />

(d + 1)! (X⊤ 2 W2X2) −1 X ⊤ ⎢<br />

2 W2<br />

⎢<br />

⎣<br />

⎡<br />

ω (d+1) (ξ02)(u2 − u0) d+1<br />

.<br />

ω (d+1) (ξ0n)(un − u0) d+1<br />

+ 1<br />

(d + 1)! (X⊤ 2 W2X2) −1 X ⊤ ⎡<br />

α<br />

⎢<br />

2 W2<br />

⎢<br />

⎣<br />

(d+1) (ξ12)(u2 − u0) d+1ǫ2 1<br />

.<br />

α (d+1) (ξ1n)(un − u0) d+1ǫ2 ⎤<br />

⎥<br />

⎦<br />

n−1<br />

+ 1<br />

(d + 1)! (X⊤ 2 W2X2) −1 X ⊤ ⎡<br />

β<br />

⎢<br />

2 W2<br />

⎢<br />

⎣<br />

(d+1) (ξ22))(u2 − u0) d+1ˆσ 2 1<br />

.<br />

β (d+1) (ξ2n)(un − u0) d+1ˆσ 2 ⎤<br />

⎥<br />

⎦<br />

n−1<br />

−(X ⊤ 2 W2X2) −1 X ⊤ ⎡<br />

⎢ β(u2)(b0(u1) +<br />

⎢<br />

2 W2<br />

⎢<br />

⎣<br />

p �<br />

bj(u1)ǫ<br />

j=1<br />

2 1−j)<br />

.<br />

β(un)(b0(un−1) + p �<br />

bj(un−1)ǫ<br />

j=1<br />

2 ⎤<br />

⎥<br />

⎦<br />

n−1−j)<br />

+(X ⊤ 2 W2X2) −1 X ⊤ 2 W2(σ 2 ∗ (v 2 2 − en−1)),<br />

where ξ0t,ξ1t and ξ2t are between ut and u0. Here v 2 2 = [v 2 2,...,v 2 n] ⊤ and σ 2 2 = [σ 2 2,...,σ 2 n] ⊤ .<br />

We ignore the term $O(\rho^{p_n})$ (see Corollary 4.2), as it is negligible asymptotically. Now, using Lemmas 6.2 and 6.5, it can be shown that
\[
X_2^\top W_2 \begin{bmatrix} \omega^{(d+1)}(\xi_{02})(u_2-u_0)^{d+1} \\ \vdots \\ \omega^{(d+1)}(\xi_{0n})(u_n-u_0)^{d+1} \end{bmatrix}
= n h_2^{d+1}\, \omega^{(d+1)}(u_0)\,[1, w_2, \lambda_1]^\top (1+o_P(1)) \otimes D_2,
\]
and
\[
X_2^\top W_2 \begin{bmatrix} \alpha^{(d+1)}(\xi_{12})(u_2-u_0)^{d+1}\epsilon_1^2 \\ \vdots \\ \alpha^{(d+1)}(\xi_{1n})(u_n-u_0)^{d+1}\epsilon_{n-1}^2 \end{bmatrix}
= n h_2^{d+1}\, \alpha^{(d+1)}(u_0)\,[w_2, w_4, \lambda_2]^\top (1+o_P(1)) \otimes D_2,
\]
\[
X_2^\top W_2 \begin{bmatrix} \beta^{(d+1)}(\xi_{22})(u_2-u_0)^{d+1}\hat\sigma_1^2 \\ \vdots \\ \beta^{(d+1)}(\xi_{2n})(u_n-u_0)^{d+1}\hat\sigma_{n-1}^2 \end{bmatrix}
= n h_2^{d+1}\, \beta^{(d+1)}(u_0)\,[\lambda_1, \lambda_2, \lambda_3]^\top (1+o_P(1)) \otimes D_2.
\]
Using Lemma A.6,
\[
X_2^\top W_2 \begin{bmatrix} \beta(u_2)\bigl(b_0(u_1) + \sum_{j=1}^{p} b_j(u_1)\epsilon_{1-j}^2\bigr) \\ \vdots \\ \beta(u_n)\bigl(b_0(u_{n-1}) + \sum_{j=1}^{p} b_j(u_{n-1})\epsilon_{n-1-j}^2\bigr) \end{bmatrix}
= \beta(u_0)\,[\lambda_{1b}, \lambda_{2b}, \lambda_{3b}]^\top (1+o_P(1)) \otimes D^{*}.
\]
Moreover,
\[
(X_2^\top W_2 X_2)^{-1} = (1/n)\, S_2^{-1}(1+o_P(1)) \otimes A_2^{-1}.
\]
Therefore,
\begin{align*}
\mathrm{Bias}(\hat\beta_2(u_0))
&= \frac{h_2^{d+1}}{(d+1)!}\,\bigl(S_2^{-1}(1+o_P(1)) \otimes A_2^{-1}\bigr)
\Bigl[\omega^{(d+1)}(u_0)[1, w_2, \lambda_1]^\top
+ \alpha^{(d+1)}(u_0)[w_2, w_4, \lambda_2]^\top \\
&\qquad + \beta^{(d+1)}(u_0)[\lambda_1, \lambda_2, \lambda_3]^\top\Bigr] \otimes D_2
- \beta(u_0)\, S_2^{-1}[\lambda_{1b}, \lambda_{2b}, \lambda_{3b}]^\top \otimes A_2^{-1} D^{*} + o_P(h_2^{d+1}) \\
&= \frac{h_2^{d+1}}{(d+1)!}\,(S_2^{-1} \otimes A_2^{-1})
\bigl(S_2[\omega^{(d+1)}(u_0), \alpha^{(d+1)}(u_0), \beta^{(d+1)}(u_0)]^\top \otimes D_2\bigr)(1+o_P(1)) \\
&\qquad - \beta(u_0)\, S_2^{-1}[\lambda_{1b}, \lambda_{2b}, \lambda_{3b}]^\top \otimes A_2^{-1} D^{*} + o_P(h_2^{d+1}) \\
&= \frac{h_2^{d+1}}{(d+1)!}\,[\omega^{(d+1)}(u_0), \alpha^{(d+1)}(u_0), \beta^{(d+1)}(u_0)]^\top \otimes A_2^{-1} D_2 \\
&\qquad - \beta(u_0)\, S_2^{-1}[\lambda_{1b}, \lambda_{2b}, \lambda_{3b}]^\top \otimes A_2^{-1} D^{*} + o_P(h_2^{d+1}).
\end{align*}
The bias expressions can be obtained after some simplification by using
\[
\mathrm{Bias}(\hat\omega(u_0)) = e_{1,3(d+1)}^\top \mathrm{Bias}(\hat\beta_2(u_0)), \qquad
\mathrm{Bias}(\hat\alpha(u_0)) = e_{d+1,3(d+1)}^\top \mathrm{Bias}(\hat\beta_2(u_0))
\]
and $\mathrm{Bias}(\hat\beta(u_0)) = e_{2d+3,3(d+1)}^\top \mathrm{Bias}(\hat\beta_2(u_0))$.

Now, using Lemma A.7,
\begin{align*}
\mathrm{Var}(\hat\beta_2(u_0))
&= \bigl((1/n)\, S_2^{-1}(1+o_P(1)) \otimes A_2^{-1}\bigr)\,
\mathrm{Var}\bigl(X_2^\top W_2 (\sigma_{*}^2(v^2 - e_{n-p}))\bigr)\,
\bigl((1/n)\, S_2^{-1}(1+o_P(1)) \otimes A_2^{-1}\bigr) \\
&= \frac{1}{n h_2}\, \mathrm{Var}(v_t^2)\, (S_2^{-1} \otimes A_2^{-1})(\Omega_2 \otimes B_2)(S_2^{-1} \otimes A_2^{-1})(1+o_P(1)).
\end{align*}
The variance expression given in Theorem 4.2 can be arrived at after some simplification.
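The derivations above all revolve around local polynomial smoothing, i.e. kernel-weighted least squares in a shrinking window around $u_0$. As a minimal illustration of this building block (a generic one-dimensional smoother, not the paper's two-step estimator; the curve `m` and the bandwidth below are hypothetical choices), a smooth coefficient function can be recovered pointwise:

```python
import numpy as np

def local_poly_fit(u, y, u0, h, d=1):
    """Degree-d local polynomial estimate of m(u0) via kernel-weighted least squares."""
    z = (u - u0) / h
    w = np.where(np.abs(z) <= 1, 0.75 * (1 - z**2), 0.0)   # Epanechnikov kernel weights
    X = np.vander(u - u0, N=d + 1, increasing=True)        # columns: 1, (u-u0), ..., (u-u0)^d
    # weighted normal equations: (X'WX) beta = X'Wy
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta[0]                                         # intercept estimates m(u0)

rng = np.random.default_rng(0)
n = 2000
u = np.linspace(0, 1, n)
m = lambda t: 0.1 + 0.25 * np.sin(2 * np.pi * t)           # hypothetical smooth parameter curve
y = m(u) + 0.05 * rng.standard_normal(n)
est = local_poly_fit(u, y, u0=0.5, h=0.1, d=1)
print(abs(est - m(0.5)))                                   # small pointwise estimation error
```

Repeating the fit over a grid of `u0` values traces out the whole parameter function, which is how the curves in Figure 4 are produced in spirit.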

Appendix B

To make the cross-validation bandwidth selection computationally feasible, we derive a relation between $(\hat\omega, \hat\alpha, \hat\beta)$ and $(\hat\omega^{-t}, \hat\alpha^{-t}, \hat\beta^{-t})$ in Proposition B.1. The idea is similar to that of generalized cross-validation, which simplifies the intensive computation involved in the original cross-validation (see Wahba (1977), Li and Palta (2009)).

Proposition B.1. Let $\hat\beta_2(u_0)$ be the local polynomial estimator of $\beta_2(u_0)$, where $\beta_2 = (\omega_{02}, \omega_{12}, \ldots, \omega_{d2}, a_{02}, \ldots, a_{d2}, b_{02}, \ldots, b_{d2})$. Suppose that $\hat\beta_2^{-i}(u_0)$ denotes the leave-one-out estimator of $\beta_2(u_0)$, obtained by eliminating the $i$th observation. Then,
\[
\hat\beta_2^{-i}(u_0) = \bigl[\hat\beta_2(u_0) - (X_2^\top W_2 X_2)^{-1} X_2^\top W_2 I_i^{*} Y_2\bigr]
+ Z_i \bigl[\hat\beta_2(u_0) - (X_2^\top W_2 X_2)^{-1} X_2^\top W_2 I_i^{*} Y_2\bigr], \tag{14}
\]
where $Z_i = (X_2^\top W_2 X_2)^{-1} X_2^\top W_2 \bigl[I_{n-1} + I_i^{*} X_2 (X_2^\top W_2 X_2)^{-1} X_2^\top W_2\bigr]^{-1} I_i^{*} X_2$, and $I_i^{*}$ denotes a matrix of order $(n-1)\times(n-1)$ whose $(i,i)$th element is one and whose remaining elements are zero. Then $\hat\omega^{-i}(u_0) = e_{1,3(d+1)}^\top \hat\beta_2^{-i}(u_0)$, $\hat\alpha^{-i}(u_0) = e_{d+1,3(d+1)}^\top \hat\beta_2^{-i}(u_0)$ and $\hat\beta^{-i}(u_0) = e_{2d+3,3(d+1)}^\top \hat\beta_2^{-i}(u_0)$.

Notice that, to compute (9), we need to fit the model just once, based on the original sample (to obtain $\hat\beta_2(u_0)$). The estimators $(\hat\omega^{-i}(u_0), \hat\alpha^{-i}(u_0), \hat\beta^{-i}(u_0))$ can then be computed easily using relation (14). This computation is straightforward, as we do not need to delete data points from the original sample and refit the model: all we need is to change $I_i^{*}$ for each $i$, which can be done with a simple program. Thus, relation (14) facilitates the bandwidth selection and saves an enormous amount of computing time.
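The computational point can be illustrated on a generic weighted least-squares fit: the leave-one-out estimator equals a rank-one (Sherman-Morrison, i.e. the $q=1$ case of the Woodbury formula) update of the full-sample fit, so no refitting is needed. This is a sketch of the idea behind relation (14), not the paper's exact expressions; `X`, `w` and `y` below are hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 3
X = rng.standard_normal((n, p))
y = X @ np.array([1.0, -0.5, 2.0]) + 0.1 * rng.standard_normal(n)
w = rng.uniform(0.5, 1.5, n)               # stand-ins for kernel weights

A = X.T @ (w[:, None] * X)                 # X'WX, computed once
b = X.T @ (w * y)                          # X'Wy, computed once
beta_full = np.linalg.solve(A, b)

def loo_refit(i):
    """Naive leave-one-out: delete observation i and refit from scratch."""
    m = np.ones(n, bool)
    m[i] = False
    return np.linalg.solve(X[m].T @ (w[m][:, None] * X[m]), X[m].T @ (w[m] * y[m]))

def loo_update(i):
    """Leave-one-out via a rank-one downdate of the full fit (no refitting)."""
    Ainv = np.linalg.inv(A)
    xi = X[i]
    # Sherman-Morrison: (A - w_i x_i x_i')^{-1} = A^{-1} + w_i A^{-1}x_i x_i'A^{-1} / (1 - w_i x_i'A^{-1}x_i)
    Ainv_i = Ainv + (w[i] * np.outer(Ainv @ xi, xi @ Ainv)) / (1 - w[i] * xi @ Ainv @ xi)
    return Ainv_i @ (b - w[i] * y[i] * xi)  # drop observation i from X'Wy as well

print(np.allclose(loo_refit(7), loo_update(7)))
```

As in relation (14), only the observation-specific pieces change across $i$; the expensive factorization of the full design is done once.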

Proof of Proposition B.1. Let $I_p$ denote the identity matrix of order $p$. Define the matrices
\[
J_i = \begin{bmatrix} I_{i-1} & 0_{(i-1)\times(n-i-1)} \\ 0_{1\times(i-1)} & 0_{1\times(n-i-1)} \\ 0_{(n-i-1)\times(i-1)} & I_{n-i-1} \end{bmatrix}_{(n-1)\times(n-2)}, \qquad i = 2, \ldots, n-1,
\]
\[
J_1 = \begin{bmatrix} 0_{1\times(n-2)} \\ I_{n-2} \end{bmatrix}_{(n-1)\times(n-2)}, \qquad
J_n = \begin{bmatrix} I_{n-2} \\ 0_{1\times(n-2)} \end{bmatrix}_{(n-1)\times(n-2)}.
\]
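These deletion matrices can be checked numerically. In the sketch below (with hypothetical dimensions), $J_i$ is built as the identity of order $n-1$ with its $i$th column removed, which matches the block definition above; the identities $J_i^\top J_i = I_{n-2}$ and $J_i J_i^\top = I_{n-1} - I_i^{*}$ used in the proof then hold by construction:

```python
import numpy as np

def J(i, n):
    """Deletion matrix: identity of order n-1 with its i-th column (1-based) removed."""
    return np.delete(np.eye(n - 1), i - 1, axis=1)

n, i = 10, 4                                   # hypothetical sizes
Ji = J(i, n)                                   # shape (n-1) x (n-2)
Istar = np.zeros((n - 1, n - 1))
Istar[i - 1, i - 1] = 1.0                      # the matrix I*_i of the proposition

print(np.allclose(Ji.T @ Ji, np.eye(n - 2)))           # J_i' J_i = I_{n-2}
print(np.allclose(Ji @ Ji.T, np.eye(n - 1) - Istar))   # J_i J_i' = I_{n-1} - I*_i
# premultiplying by J_i' deletes the i-th entry, as used for X2, W2 and Y2 below:
v = np.arange(float(n - 1))
print(np.allclose(Ji.T @ v, np.delete(v, i - 1)))
```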


Let W −i<br />

2 denote the matrix W2 with i th row and i th column deleted. Similarly, suppose<br />

X −i<br />

2 and Y −i<br />

2 denote the X2 and Y2 with i th row omitted. It is obvious that<br />

X −i<br />

2 = J ⊤ i X2, W −i<br />

2 = J ⊤ i W2Ji and Y −i<br />

2 = J ⊤ i Y2.<br />

Now, notice that J ⊤ i Ji = In−2 and JiJ ⊤ i = In−1 −I ∗ i . Using these relations and after some<br />

algebra, it can be shown that,<br />

\[
X_2^{-i\top} W_2^{-i} X_2^{-i} = X_2^\top W_2 X_2 - X_2^\top W_2 I_i^{*} X_2
\]
and
\[
X_2^{-i\top} W_2^{-i} Y_2^{-i} = X_2^\top W_2 Y_2 - X_2^\top W_2 I_i^{*} Y_2.
\]
Therefore, using the Woodbury formula$^{4}$,
\[
(X_2^{-i\top} W_2^{-i} X_2^{-i})^{-1} = (X_2^\top W_2 X_2)^{-1} + Z_i (X_2^\top W_2 X_2)^{-1},
\]
where $Z_i$ is as defined in Proposition B.1. After some algebraic simplification, this leads to
\begin{align*}
\hat\beta_2^{-i}(u_0) &= (X_2^{-i\top} W_2^{-i} X_2^{-i})^{-1} X_2^{-i\top} W_2^{-i} Y_2^{-i} \\
&= \bigl[\hat\beta_2(u_0) - (X_2^\top W_2 X_2)^{-1} X_2^\top W_2 I_i^{*} Y_2\bigr]
+ Z_i \bigl[\hat\beta_2(u_0) - (X_2^\top W_2 X_2)^{-1} X_2^\top W_2 I_i^{*} Y_2\bigr].
\end{align*}

Appendix C

In this appendix, we provide the definitions of the GARCH models used in Section 5. The return process $\{\epsilon_t\}$, with $E(\epsilon_t \mid \mathcal{F}_{t-1}) = 0$ and $E(\epsilon_t^2 \mid \mathcal{F}_{t-1}) = \sigma_t^2$, is said to follow

(i) a GARCH process if
\[
\sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2,
\]
where $\omega, \alpha, \beta > 0$;

(ii) an EGARCH process if
\[
\log \sigma_t^2 = \omega + \alpha \left[ \left| \frac{\epsilon_{t-1}}{\sigma_{t-1}} \right| - \sqrt{\frac{2}{\pi}} \right] + \gamma\, \frac{\epsilon_{t-1}}{\sigma_{t-1}} + \beta \log \sigma_{t-1}^2;
\]

(iii) a GJR process if
\[
\sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma\, I[\epsilon_{t-1} < 0]\, \epsilon_{t-1}^2,
\]
where $\omega, \alpha, \beta, \gamma > 0$;

(iv) a FIGARCH$(1, d_0, 1)$ process if
\[
\sigma_t^2 = \omega + \bigl[1 - \beta L - (1 - \phi L)(1 - L)^{d_0}\bigr] \epsilon_t^2 + \beta \sigma_{t-1}^2,
\]
where $L$ denotes the lag operator, and $\omega, \phi, \beta > 0$, $0 < d_0 < 1$.

$^{4}$ Let $A_{p\times p}$, $B_{p\times q}$ and $C_{q\times p}$ denote matrices. Then, according to the Woodbury formula,
\[
(A + BC)^{-1} = A^{-1} - A^{-1} B (I_q + C A^{-1} B)^{-1} C A^{-1},
\]
where $I_q$ denotes the identity matrix of order $q$.
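For reference, the baseline GARCH(1,1) recursion in (i) can be simulated directly. The sketch below (hypothetical parameter values, Gaussian innovations) checks that the sample variance of the simulated returns is close to the unconditional variance $\omega/(1-\alpha-\beta)$:

```python
import numpy as np

def simulate_garch(n, omega, alpha, beta, rng):
    """Simulate GARCH(1,1): sigma2_t = omega + alpha*eps_{t-1}^2 + beta*sigma2_{t-1}."""
    eps = np.empty(n)
    sigma2 = np.empty(n)
    sigma2[0] = omega / (1 - alpha - beta)        # start at the unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, n):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return eps, sigma2

rng = np.random.default_rng(42)
eps, sigma2 = simulate_garch(100_000, omega=0.1, alpha=0.1, beta=0.8, rng=rng)
# with alpha + beta = 0.9, the unconditional variance is 0.1 / (1 - 0.9) = 1.0
print(eps.var())
```

A tvGARCH path would be obtained the same way, with `omega`, `alpha` and `beta` replaced by functions of rescaled time $t/n$.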

References


Journal of Finance 48, 1749-1778.

Fan, J. and Gijbels, I. (1996). Local Polynomial Modeling and Its Applications. Chapman and Hall, London.

Fan, J. and Zhang, W. (1999). Statistical estimation in varying coefficient models. Ann. Statist. 27, 1491-1518.

Franke, J. and Kreiss, J. P. (1992). Bootstrapping stationary autoregressive moving average models. J. Time Series Anal. 13, 297-317.

Fryzlewicz, P., Sapatinas, T. and Subba Rao, S. (2008). Normalized least-squares estimation in time-varying ARCH models. Ann. Statist. 36, 742-786.

Hart, J. D. (1994). Automated kernel smoothing of dependent data by using time series cross-validation. J. R. Stat. Soc. Ser. B Stat. Methodol. 56, 529-542.

Li, J. and Palta, M. (2009). Bandwidth selection through cross-validation for semi-parametric varying-coefficient partially linear models. J. Stat. Comput. Simul. 79, 1277-1286.

Mercurio, D. and Spokoiny, V. (2004). Statistical inference for time-inhomogeneous volatility models. Ann. Statist. 32, 577-602.

Mikosch, T. and Starica, C. (2004). Nonstationarities in financial time series, the long-range dependence and the IGARCH effects. Rev. Econ. Statist. 86, 378-390.

Nelson, D. B. (1990). Stationarity and persistence in the GARCH(1,1) model. Econometric Theory 6, 318-334.

Palm, F. C. (1996). GARCH models for volatility. In Handbook of Statistics (Edited by G. S. Maddala and C. R. Rao) 14, 209-240. Elsevier Science, North-Holland.

Shephard, N. (1996). Statistical aspects of ARCH and stochastic volatility. In Time Series Models in Econometrics, Finance and Other Fields (Edited by D. R. Cox, D. V. Hinkley and O. E. Barndorff-Nielsen). Chapman and Hall, London.

Starica, C. and Granger, C. W. J. (2005). Non-stationarities in stock returns. Rev. Econ. Statist. 87, 503-522.

Stout, W. (1996). Almost Sure Convergence. Academic Press, New York.

Wahba, G. (1977). A survey of some smoothing problems and the method of generalized cross-validation for solving them. In Applications of Statistics (Edited by P. R. Krishnaiah). North-Holland, Amsterdam.

Wand, M. P. and Jones, M. C. (1995). Kernel Smoothing. Chapman and Hall, London.


Table 1: Summary statistics of the datasets

Series      Sample size  Mean      Std. dev.  Minimum   Quantile 1  Median    Quantile 3  Maximum
INR/USD     2765         0.0004    0.1754     −1.6127   −0.0553      0.0000   0.0498      1.7104
INR/EURO    2805         0.0103    0.6638     −4.1176   −0.3804      0.0000   0.3987      4.5343
CNY/USD     2767         −0.0036   0.0381     −0.8767   −0.0032      0.0000   0.0010      0.4283
CNY/EURO    2806         0.0005    0.2886     −1.9487   −0.1616      0.0004   0.1641      1.8262
BRL/USD     2766         −0.0013   0.4728     −4.1989   −0.2240      −0.0105  0.2059      3.2796
BRL/EURO    2806         0.0026    0.5668     −6.0820   −0.2764      −0.0090  0.2683      6.1018
RUB/EURO    2815         0.0059    0.2612     −1.2866   −0.1277      0.0000   0.1255      1.8967
RND/EURO    2815         0.0054    0.4612     −3.6862   −0.2579      −0.0164  0.2372      3.3931
S & P 500   2766         −0.0014   0.5969     −4.1126   −0.2703      0.0230   0.2695      4.7586
Dow Jones   2767         0.0004    0.5609     −3.5614   −0.2507      0.0182   0.2559      4.5637
BSE         2724         0.0213    0.7527     −5.1287   −0.3301      0.0588   0.4134      6.9444
NSE         2101         0.0381    0.7545     −5.6692   −0.2866      0.0513   0.4157      7.0939


Table 2: Aggregated mean squared errors of the 'in-sample' forecasts

Series     tvGARCH (d=3)  tvGARCH (d=1)  tvARCH(1)  tvARCH(2)  GARCH    EGARCH   GJR      FIGARCH
INR/USD    35.59          35.12          36.37      33.27      40.23    38.03    40.26    38.68
INR/EURO   2119.02        2162.22        2158.96    2137.97    2234.45  2524.83  2234.46  2249.93
CNY/USD    0.72           0.72           0.74       0.71       1.03     −        1.22     0.96
CNY/EURO   76.64          77.58          80.12      79.75      84.02    84.17    84.55    85.73
BRL/USD    1174.72        1197.15        1276.72    1117.56    1249.60  1163.88  1312.59  1221.22
BRL/EURO   3563.09        3603.22        4295.43    3844.53    4942.11  4402.06  5320.83  4861.54
RUB/EURO   65.27          65.36          68.77      68.34      73.98    72.81    74.04    69.34
RND/EURO   935.43         940.02         977.31     966.16     993.15   981.79   1016.55  989.07
S & P 500  2154.41        2620.90        2979.67    2652.07    2614.41  2476.76  2679.29  2572.90
Dow Jones  1715.59        2063.03        2330.29    2067.89    2075.91  1951.45  2125.98  2025.00
BSE        5688.85        5702.34        6170.73    6026.22    6358.63  6095.25  6539.42  6381.01
NSE        6205.13        6244.79        6764.44    6556.36    7134.17  6765.58  7398.32  7112.78
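One plausible reading of the aggregated mean squared error criterion is the squared distance between a volatility proxy and the fitted conditional variances, summed over the sample; the paper's exact aggregation may differ. A sketch using squared returns as the proxy and a hypothetical volatility path:

```python
import numpy as np

def aggregated_mse(eps2, sigma2_hat):
    """Aggregated squared distance between squared returns (a noisy volatility
    proxy) and fitted conditional variances."""
    return float(np.sum((eps2 - sigma2_hat) ** 2))

# toy check: the true conditional variance path scores better than a constant fit
rng = np.random.default_rng(3)
n = 5000
sigma2 = 0.5 + 0.4 * np.sin(np.linspace(0, 6, n)) ** 2   # hypothetical volatility path
eps2 = sigma2 * rng.standard_normal(n) ** 2              # squared returns given sigma2
print(aggregated_mse(eps2, sigma2) < aggregated_mse(eps2, np.full(n, eps2.mean())))
```

Even though $\epsilon_t^2$ is a noisy proxy, a forecast that tracks the true conditional variance achieves a smaller aggregated MSE than an unconditional benchmark, which is the sense in which the table ranks the models.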


Table 3: Aggregated mean squared errors of the monthly volatility forecasts

Series              tvGARCH (d=3)  tvGARCH (d=1)  tvARCH(1)  tvARCH(2)  GARCH    EGARCH   GJR      FIGARCH
INR/USD (×10⁻⁵)     5.9571         6.2781         6.3978     6.0262     7.4865   7.5525   7.6340   7.5385
INR/EURO (×10⁻⁴)    1.4162         1.5060         1.6407     1.5872     1.8460   1.8557   1.8557   1.8806
CNY/USD (×10⁻⁷)     2.5545         3.0306         3.0607     3.0216     4.3104   3.3115   4.2701   4.8166
CNY/EURO (×10⁻⁴)    1.4054         1.5678         1.6280     1.5514     1.9860   1.6661   1.9820   1.9929
BRL/USD             0.0029         0.0031         0.0031     0.0030     0.0040   0.0037   0.0052   0.0048
BRL/EURO            0.0108         0.0120         0.0119     0.0118     0.0136   0.0136   0.0133   0.0135
RUB/EURO (×10⁻⁴)    4.0295         4.3653         4.3969     4.4216     5.8115   6.5392   5.8298   5.5266
RND/EURO            0.0121         0.0131         0.0130     0.0128     0.0149   0.0147   0.0149   0.0149
S & P 500           0.0079         0.0085         0.0085     0.0079     0.0125   0.0151   0.0180   0.0137
Dow Jones           0.0047         0.0051         0.0052     0.0047     0.0065   0.0065   0.0148   0.0085
BSE                 0.0205         0.0216         0.0217     0.0204     0.0245   0.0244   0.0245   0.0245
NSE                 0.0147         0.0161         0.0161     0.0149     0.0187   0.0173   0.0187   0.0187


Table 4: Aggregated mean squared errors of the out-of-sample volatility forecasts

Series     tvGARCH (d=3)  tvGARCH (d=1)  tvARCH(1)  tvARCH(2)  GARCH    EGARCH   GJR      FIGARCH
INR/USD    0.1975         0.2093         0.2148     0.2159     0.2104   0.2132   0.2105   0.2060
INR/EURO   12.7829        12.0691        12.7828    12.7831    12.2632  12.5052  12.3108  12.1515
CNY/USD    0.0053         0.0056         0.0054     0.0050     0.0051   −        0.0052   0.0051
CNY/EURO   0.4956         0.4827         0.4733     0.5365     0.4609   0.4825   0.4649   0.4525
BRL/USD    0.5638         0.5235         0.5505     0.5769     0.5225   0.5804   0.5469   0.5208
BRL/EURO   0.6297         0.5962         0.6325     0.6290     0.6796   0.6889   0.6312   0.6610
RUB/EURO   0.2928         0.2835         0.2994     0.3245     0.3002   0.3049   0.2992   0.3004
RND/EURO   0.3176         0.2579         0.2664     0.3097     0.3470   0.2883   0.3253   0.3036
S & P 500  1.5806         1.4883         1.6848     1.6141     1.4323   1.2191   1.2502   1.4648
Dow Jones  2.4202         2.0835         2.2603     2.0234     1.8905   1.6448   1.6792   1.9229
BSE        3.9336         3.7315         3.9902     4.1654     3.9710   4.6607   4.0103   3.9286
NSE        4.0292         3.8433         3.9642     4.1683     3.9634   5.1846   4.0816   3.9559


[Figure 1: Plot of the percentage log returns. Panels (sample index on the horizontal axis): INR/USD, INR/EURO, CNY/USD, CNY/EURO, BRL/USD, BRL/EURO, RUB/EURO, RND/EURO, S & P 500, Dow Jones, BSE and NSE.]


[Figure 2: Autocorrelation functions of the squared returns (lags 0-30). Panels: INR/USD, INR/EURO, CNY/USD, CNY/EURO, BRL/USD, BRL/EURO, RUB/EURO, RND/EURO, S & P 500, Dow Jones, BSE and NSE.]


[Figure 3: In-sample volatility forecasts for the BRL/EURO data. Panels: tvGARCH, FIGARCH, EGARCH and GARCH.]


[Figure 4: Plot of the estimators of the parameter functions for the BSE data. Panels: omega, alpha, beta and alpha+beta, each plotted against u ∈ [0, 1].]
