
Time Series

Anthony Davison

© 2010

http://stat.epfl.ch

Second-order theory of stationary random processes   126
    Reminder: Basic definitions   127
    Spectrum   128
    Examples   129
    Normalized spectra for AR(1) models   130
    Comments   131
    Spectral distribution   132
    Example   133
    Linear filters   134
    Effect of filtering   135
    General linear process   136

ARMA models   137
    Autoregressive process   138
    Moving average process   139
    Invertibility   140
    ARMA models   141
    Causality, invertibility   142
    Example   143
    Comments   144


Second-order theory of stationary random processes slide 126

Reminder: Basic definitions

Definition 28 (a) A random function is a set of random variables {Y(t)} such that the time index t can take any real value.

(b) The trend of {Y(t)} is the non-random function µ(t) = E{Y(t)}, and its autocovariance function is

γ(t,s) = cov{Y(t), Y(s)} = E[{Y(t) − µ(t)}{Y(s) − µ(s)}],   t, s ∈ R.

The mean and covariance functions constitute the second-order properties of {Y(t)}. They determine the entire distribution of the random function if the joint distribution of any finite collection of random variables {Y(t_1), ..., Y(t_k)} is multivariate normal.

(c) The random function is stationary if µ(t) = µ and γ(t,s) = γ(|t − s|): the trend is constant and the covariance between Y(t) and Y(s) depends only on their time separation t − s. In this case we can define the autocorrelation function ρ(t) = γ(t,0)/γ(0,0).

(d) A random sequence is a collection of random variables {Y_t} in which the time index takes only integer values: t ∈ Z. We use a subscript notation to make this clear.

(e) A white noise sequence is a random sequence consisting of mutually independent random variables, each with mean zero and variance σ².

Time Series Spring 2010 – slide 127

Spectrum

Definition 29 (a) The autocovariance generating function of a stationary random sequence {Y_t} with autocovariances γ_k = cov(Y_t, Y_{t+k}) is

G(z) = ∑_{k=−∞}^{∞} γ_k z^k,   z ∈ C.

(b) The spectrum of {Y_t} is

f(ω) = G(e^{−2πiω}) = ∑_{k=−∞}^{∞} γ_k e^{−2πikω},   ω ∈ R,

where i² = −1; this may also be written as the real-valued function

f(ω) = γ_0 + 2 ∑_{k=1}^{∞} γ_k cos(2πkω),   ω ∈ R.   (5)

The normalised spectrum is f*(ω) = f(ω)/γ_0.

The spectrum provides a convenient summary of the second-order properties of the process in a single function, and also shows the effect of linear operations on the series very simply.

Time Series Spring 2010 – slide 128
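Definition 29 can be explored numerically. The following is a minimal sketch (not part of the notes; Python with numpy is assumed): it evaluates the cosine series (5) truncated at a finite lag, and confirms that a white noise sequence, whose only non-zero autocovariance is γ_0 = σ², has the flat spectrum f(ω) = σ².

```python
# A minimal sketch (not from the notes): evaluate (5) by truncating the
# cosine series at a finite lag K, assuming the gamma_k decay quickly.
import numpy as np

def spectrum(gamma, omega):
    """Truncated version of (5); gamma holds gamma_0, gamma_1, ..., gamma_K."""
    k = np.arange(1, len(gamma))
    return gamma[0] + 2 * np.sum(
        gamma[1:, None] * np.cos(2 * np.pi * np.outer(k, omega)), axis=0)

# White noise: gamma_0 = sigma^2 and gamma_k = 0 for k >= 1, so the
# spectrum is flat, f(omega) = sigma^2, and f*(omega) = 1.
sigma2 = 2.0
omega = np.linspace(0, 0.5, 6)
print(spectrum(np.array([sigma2, 0.0, 0.0]), omega))   # 2.0 at every frequency
```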



Examples

Example 30 Find the spectrum and normalised spectrum for white noise.

Example 31 Show that the spectrum of the AR(1) process Y_t = αY_{t−1} + ε_t with |α| < 1 is

f(ω) = σ² / {1 − 2α cos(2πω) + α²},

and that the normalised spectrum is

f*(ω) = (1 − α²) / {1 − 2α cos(2πω) + α²}.

Time Series Spring 2010 – slide 129
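A numerical check of Example 31 (a sketch, not from the notes): it uses the standard AR(1) autocovariances γ_k = σ² α^k / (1 − α²), k ≥ 0, which are assumed here rather than derived on this slide, and compares the closed form with the truncated series (5).

```python
# Sketch: closed-form AR(1) spectrum of Example 31 versus the truncated sum (5).
import numpy as np

alpha, sigma2 = 0.5, 1.0
omega = np.linspace(0, 0.5, 5)

f_closed = sigma2 / (1 - 2 * alpha * np.cos(2 * np.pi * omega) + alpha**2)

K = 200                                    # truncation lag; alpha^K is negligible
gamma = sigma2 * alpha**np.arange(K + 1) / (1 - alpha**2)
k = np.arange(1, K + 1)
f_series = gamma[0] + 2 * (gamma[1:, None]
                           * np.cos(2 * np.pi * np.outer(k, omega))).sum(axis=0)

print(np.allclose(f_closed, f_series))     # True
print(f_closed * (1 - alpha**2) / sigma2)  # the normalised spectrum f*(omega)
```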

Normalized spectra for AR(1) models

[Figure: three pairs of panels, one pair for each of α = −0.5, 0.5, 0.9. The left panels show simulated AR(1) series Y of length 200 against time; the right panels show the normalised spectra f*(ω) for 0 ≤ ω ≤ 1/2. Positive α concentrates the spectrum at low frequencies, with α = 0.9 giving a sharp peak near ω = 0; negative α concentrates it at high frequencies.]

Time Series Spring 2010 – slide 130



Comments

□ The form of the spectrum suggests that

f(ω) = f(−ω),   f(ω) = f(ω + k),   k ∈ Z,

so f need only be defined on 0 < ω < 1/2; recall the definition of the periodogram.

□ The variance of the average Ȳ of data Y_1, ..., Y_n from a stationary process satisfies

lim_{n→∞} n var(Ȳ) = lim_{n→∞} {γ_0 + 2 ∑_{h=1}^{n−1} (1 − h/n) γ_h} = γ_0 + 2 ∑_{h=1}^{∞} γ_h = f(0),

so n var(Ȳ) ≈ f(0) for large n. Thus the spectral density at the origin, if finite, equals the (rescaled) limiting variance of Ȳ.

□ We easily see that ∫_{−1/2}^{1/2} f(ω) dω = γ_0.

□ The spectrum is the discrete Fourier transform of the autocovariance sequence.

□ The inverse Fourier transform gives the autocovariance function by

γ_k = ∫_{−1/2}^{1/2} e^{2πikω} f(ω) dω = 2 ∫_0^{1/2} cos(2πkω) f(ω) dω,   k ∈ Z.   (6)

□ The spectral density need not always exist; cf. Example 33.

Time Series Spring 2010 – slide 131
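Formula (6) can be checked numerically; the sketch below (not from the notes) recovers the AR(1) autocovariances from the spectrum of Example 31 by a plain Riemann sum.

```python
# Sketch: recover gamma_k = sigma^2 alpha^k / (1 - alpha^2) from the AR(1)
# spectrum via (6), using a Riemann sum over a fine grid on [0, 1/2].
import numpy as np

alpha, sigma2 = 0.6, 1.0
omega = np.linspace(0, 0.5, 100_001)
d = omega[1] - omega[0]
f = sigma2 / (1 - 2 * alpha * np.cos(2 * np.pi * omega) + alpha**2)

for k in range(4):
    gamma_k = 2 * np.sum(np.cos(2 * np.pi * k * omega) * f) * d
    exact = sigma2 * alpha**k / (1 - alpha**2)
    print(k, round(gamma_k, 4), round(exact, 4))   # the two columns agree
```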

Spectral distribution

Theorem 32 (a) A set of numbers {γ_h}_{h∈Z} is the autocovariance function of a stationary random sequence iff there exists a unique bounded function F defined on [−1/2, 1/2] such that F(−1/2) = 0, F is right-continuous and non-decreasing, with symmetric increments about zero, and

γ_h = ∫_{(−1/2, 1/2]} e^{2πihu} dF(u),   h ∈ Z.

The function F is called the spectral distribution function of {γ_h}, and its derivative, if it exists, is called the spectral density function, or spectrum.

(b) If ∑_h |γ_h| < ∞, then f exists.

(c) A function f(ω) defined on [−1/2, 1/2] is the spectrum of a stationary process if and only if f(ω) = f(−ω), f(ω) ≥ 0, and ∫_{−1/2}^{1/2} f(ω) dω < ∞.

□ The symmetric increments property means that if 0 ≤ a < b ≤ 1/2, then F(b) − F(a) = F(−a) − F(−b).

□ The interpretation of F is that F(ω_2) − F(ω_1) measures the variation accounted for by fluctuations at frequencies in the interval (ω_1, ω_2).

□ Part (c) of the theorem suggests how we may construct covariance functions with desired properties, by choosing an appropriate spectrum; note that f may be any scaled symmetric density function.

Time Series Spring 2010 – slide 132
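Part (c) invites construction. A minimal sketch (not from the notes; numpy assumed): choose the even, non-negative, integrable spectrum f(ω) = 1 + cos(2πω), for which (6) gives γ_0 = 1, γ_1 = 1/2 and γ_h = 0 for |h| ≥ 2, and verify that the resulting covariance matrix is valid.

```python
# Sketch: build an autocovariance sequence from a chosen spectrum, per
# Theorem 32(c), and confirm it yields a positive definite covariance matrix.
import numpy as np

n = 10
gamma = np.zeros(n)
gamma[0], gamma[1] = 1.0, 0.5      # from f(w) = 1 + cos(2 pi w) via (6)

# Covariance matrix of (Y_1, ..., Y_n): Toeplitz in |i - j|.
Sigma = np.array([[gamma[abs(i - j)] for j in range(n)] for i in range(n)])
print(np.linalg.eigvalsh(Sigma).min() > 0)   # True: a valid covariance matrix
```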



Example

Example 33 Show that the covariance function of the stationary random sequence given by

Y_t = U_1 cos(2πω_0 t) + U_2 sin(2πω_0 t),   U_1, U_2 ~ iid N(0, σ²),

may be written as

γ_h = ∫_{−1/2}^{1/2} e^{2πiωh} dF(ω),

where

F(ω) = 0 for ω < −ω_0;   F(ω) = σ²/2 for −ω_0 ≤ ω < ω_0;   F(ω) = σ² for ω_0 ≤ ω.

Time Series Spring 2010 – slide 133
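A Monte Carlo sketch of Example 33 (not from the notes; ω_0, σ and the simulation sizes are illustrative). Integrating e^{2πiωh} against the two jumps of F, each of size σ²/2 at ±ω_0, gives γ_h = σ² cos(2πω_0 h), which the sample covariances approximately reproduce:

```python
# Sketch: sample covariances across many realisations of Y_t versus the
# theoretical gamma_h = sigma^2 cos(2 pi omega_0 h) implied by F's jumps.
import numpy as np

rng = np.random.default_rng(1)
omega0, sigma, R, n = 0.2, 1.0, 200_000, 8
t = np.arange(n)
U = rng.normal(0.0, sigma, size=(R, 2))
Y = (U[:, :1] * np.cos(2 * np.pi * omega0 * t)
     + U[:, 1:] * np.sin(2 * np.pi * omega0 * t))

for h in range(4):
    emp = np.mean(Y[:, 0] * Y[:, h])    # sample version of gamma_h (mean zero)
    print(h, round(emp, 2), round(sigma**2 * np.cos(2 * np.pi * omega0 * h), 2))
```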

Linear filters

Definition 34 A linear filter is a transformation of the random sequence {U_t} of the form

Y_t = ∑_{j=−∞}^{∞} a_j U_{t−j}.   (7)

□ If {U_t} is stationary and
– only a finite number of the a_j are non-zero, then {Y_t} is stationary;
– infinitely many of the a_j are non-zero, then the properties of {Y_t} depend on their values.

□ The relation between the spectra of the sequences is given by the following theorem:

Theorem 35 The spectra of two stationary random sequences {U_t} and {Y_t} satisfying (7) are related by

f_Y(ω) = |a(ω)|² f_U(ω),

where a(ω) = ∑_{j=−∞}^{∞} a_j e^{−2πijω} is the transfer function of the linear filter.

Time Series Spring 2010 – slide 134

Effect of filtering

Example 36 Find the spectrum of a three-point moving average of an AR(1) process.

[Figure: three panels on 0 ≤ ω ≤ 1/2 showing the AR(1) spectrum f(ω), the squared modulus |a(ω)|² of the transfer function, and the filtered spectrum f(ω)|a(ω)|².]

Time Series Spring 2010 – slide 135
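A numerical companion to Example 36 (a sketch, not from the notes; the value of α is illustrative). The three-point moving average a_j = 1/3 for j = −1, 0, 1 has the real transfer function a(ω) = {1 + 2cos(2πω)}/3, so Theorem 35 gives the filtered spectrum directly.

```python
# Sketch: filtered spectrum f_Y(w) = |a(w)|^2 f_U(w) for a three-point
# moving average applied to an AR(1) process.
import numpy as np

alpha, sigma2 = 0.5, 1.0
w = np.linspace(0, 0.5, 6)

f_U = sigma2 / (1 - 2 * alpha * np.cos(2 * np.pi * w) + alpha**2)
a = (1 + 2 * np.cos(2 * np.pi * w)) / 3   # real here, so |a(w)|^2 = a**2
f_Y = a**2 * f_U

print(np.round(f_Y, 3))   # high frequencies damped; a(1/3) = 0 kills w = 1/3
```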



General linear process

Theorem 37 The spectrum of the general linear process

Y_t = ∑_{j=0}^{∞} a_j ε_{t−j},   (8)

where {ε_t} is white noise, may be written as

f(ω) = b_0 + ∑_{m=1}^{∞} b_m cos(2πmω),   −1/2 ≤ ω ≤ 1/2.   (9)

□ Any real-valued even continuous function that satisfies f(ω) = f(ω + k) for integer k can be expressed as a Fourier series of the form (9), so any stationary random sequence with a continuous spectrum can be represented as a general linear process, at least so far as second-order properties are concerned.

□ The general linear representation (8) is useful only if it involves few parameters; hence the use of ARMA models, which are flexible linear models with finitely many parameters.

□ The computations leading to (9) implicitly presuppose that ∑ a_j² < ∞, which then implies that {Y_t} is stationary with finite variance and covariance function.

Time Series Spring 2010 – slide 136
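A sketch (not from the notes) of how (8) determines second-order properties: it uses the autocovariance formula γ_m = σ² ∑_j a_j a_{j+m}, a standard consequence of (8) that is assumed here rather than derived above, with the square-summable choice a_j = α^j.

```python
# Sketch: autocovariances and spectrum of a (truncated) general linear
# process with a_j = alpha^j; the cosine series then matches the AR(1)
# spectrum, since a_j = alpha^j is the causal AR(1) representation.
import numpy as np

alpha, sigma2 = 0.8, 1.0
a = alpha ** np.arange(200)                # square-summable coefficients
gamma = sigma2 * np.array([np.dot(a[:len(a) - m], a[m:]) for m in range(100)])

w = np.linspace(0, 0.5, 5)
m = np.arange(1, len(gamma))
f = gamma[0] + 2 * (gamma[1:, None]
                    * np.cos(2 * np.pi * np.outer(m, w))).sum(axis=0)

f_ar1 = sigma2 / (1 - 2 * alpha * np.cos(2 * np.pi * w) + alpha**2)
print(np.allclose(f, f_ar1, rtol=1e-3))    # True: the AR(1) spectrum
```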

ARMA models slide 137

Autoregressive process

Definition 38 An autoregressive model of order p, AR(p), is of the form

Y_t = φ_1 Y_{t−1} + φ_2 Y_{t−2} + ··· + φ_p Y_{t−p} + ε_t,   (10)

where {Y_t} is stationary, φ_1, ..., φ_p are constants, and φ_p ≠ 0. Unless otherwise mentioned, we assume here that ε_t ~ iid N(0, σ²). A process with non-zero mean is obtained by replacing Y_t in (10) by Y_t − µ, etc.

The backshift operator B can be used to write (10) in the form

(1 − φ_1 B − ··· − φ_p B^p) Y_t = φ(B) Y_t = ε_t,

where φ(B) is the autoregressive operator, and this suggests writing

Y_t = φ(B)^{−1} ε_t

to get the causal representation Y_t = ∑_{j=0}^{∞} ψ_j ε_{t−j} = ψ(B) ε_t, if it exists. To find the coefficients of ψ(B), we suppose that such a representation exists, and then match terms on the left and right of the equation φ(B)Y_t = φ(B)ψ(B)ε_t = ε_t.

Time Series Spring 2010 – slide 138
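The coefficient-matching step can be automated. A sketch (not from the notes): equating powers of B in φ(B)ψ(B) = θ(B) gives the recursion ψ_j = θ_j + ∑_{i=1}^{min(j,p)} φ_i ψ_{j−i}, with ψ_0 = 1, θ_0 = 1 and θ_j = 0 for j > q; for a pure AR(p), all θ_j vanish.

```python
# Sketch: compute psi_0, psi_1, ... by matching powers of B in
# phi(B) psi(B) = theta(B). For a pure AR(p), theta is the empty list.
import numpy as np

def arma_to_psi(phi, theta, n):
    """phi = [phi_1, ..., phi_p], theta = [theta_1, ..., theta_q]."""
    psi = np.zeros(n)
    psi[0] = 1.0
    for j in range(1, n):
        acc = theta[j - 1] if j <= len(theta) else 0.0
        for i in range(1, min(j, len(phi)) + 1):
            acc += phi[i - 1] * psi[j - i]
        psi[j] = acc
    return psi

print(arma_to_psi([0.5], [], 5))   # AR(1): psi_j = 0.5^j -> [1. 0.5 0.25 ...]
```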



Moving average process

Definition 39 A moving average model of order q, MA(q), is of the form

Y_t = ε_t + θ_1 ε_{t−1} + ··· + θ_q ε_{t−q},   (11)

where θ_1, ..., θ_q are constants, θ_q ≠ 0, and ε_t ~ iid N(0, σ²). A process with non-zero mean is obtained by replacing Y_t in (11) by Y_t − µ.

The backshift operator B can be used to write (11) in the form

Y_t = (1 + θ_1 B + ··· + θ_q B^q) ε_t = θ(B) ε_t,

where θ(B) is the moving average operator. The process (11) is stationary for any values of the θ_r.

Example 40 Show that the MA(1) processes with parameters θ_1 and 1/θ_1 are statistically indistinguishable.

Time Series Spring 2010 – slide 139
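A quick numerical check on Example 40 (a sketch, not from the notes), using the standard MA(1) autocovariances γ_0 = σ²(1 + θ_1²) and γ_1 = σ²θ_1, which are assumed here rather than derived on this slide: replacing θ_1 by 1/θ_1 while rescaling σ² to σ²θ_1² leaves the second-order properties, and hence the Gaussian process, unchanged.

```python
# Sketch: the two MA(1) parametrisations give identical autocovariances.
theta1, sigma2 = 2.0, 1.0

for th, s2 in [(theta1, sigma2), (1 / theta1, sigma2 * theta1**2)]:
    gamma0 = s2 * (1 + th**2)          # var(Y_t)
    gamma1 = s2 * th                   # cov(Y_t, Y_{t+1})
    print(gamma0, gamma1)              # both rows: 5.0 2.0
```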

Invertibility

Definition 41 A moving average process {Y_t} is called invertible if it has an infinite autoregressive representation

ε_t = ∑_{j=0}^{∞} a_j Y_{t−j}.

□ This definition is needed simply in order to ensure the identifiability of MA processes.

□ In Example 40 it is easy to check which version is invertible: we write

ε_t = (1 + θ_1 B)^{−1} Y_t = ∑_{j=0}^{∞} (−θ_1 B)^j Y_t = ∑_{j=0}^{∞} (−θ_1)^j Y_{t−j},

which is convergent iff |θ_1| < 1.

Time Series Spring 2010 – slide 140


ARMA models

Definition 42 A time series {Y_t} is an autoregressive-moving average process of order (p, q), ARMA(p, q), if it is stationary and of the form

Y_t = φ_1 Y_{t−1} + φ_2 Y_{t−2} + ··· + φ_p Y_{t−p} + ε_t + θ_1 ε_{t−1} + ··· + θ_q ε_{t−q},   (12)

where φ_1, ..., φ_p, θ_1, ..., θ_q are constants with φ_p, θ_q ≠ 0. Unless otherwise mentioned, we assume that ε_t ~ iid N(0, σ²). A process with non-zero mean is obtained by replacing each Y_t in (12) by Y_t − µ, etc.

□ We use the autoregressive and moving average operators to write (12) as

φ(B) Y_t = θ(B) ε_t.

The properties of the process are intimately tied to the polynomials φ(z), θ(z), where we take z ∈ C. Let D = {z ∈ C : |z| ≤ 1} denote the unit disk in the complex plane.

□ We remove common factors from φ(B) and θ(B) to eliminate overparametrisation, also called parameter redundancy.

Time Series Spring 2010 – slide 141

Causality, invertibility

Definition 43 An ARMA(p, q) process φ(B)Y_t = θ(B)ε_t is causal if it can be written as a linear process

Y_t = ∑_{j=0}^{∞} ψ_j ε_{t−j} = ψ(B) ε_t,

where ∑ |ψ_j| < ∞, and we set ψ_0 = 1. It is invertible if it can be written as

ε_t = ∑_{j=0}^{∞} π_j Y_{t−j} = π(B) Y_t,

where ∑ |π_j| < ∞, and we set π_0 = 1.

Theorem 44 (a) An ARMA(p, q) process φ(B)Y_t = θ(B)ε_t is causal iff φ(z) ≠ 0 within D. If so, then the coefficients of ψ(z) satisfy ψ(z) = θ(z)/φ(z) for z ∈ D.

(b) The process is invertible iff θ(z) ≠ 0 for z ∈ D. If so, then the coefficients of π(z) satisfy π(z) = φ(z)/θ(z) for z ∈ D.

Thus the process is causal iff the roots of φ(z) lie outside D, and invertible iff the roots of θ(z) lie outside D.

Time Series Spring 2010 – slide 142

Example

Example 45 Investigate the properties of the process

Y_t = 0.4 Y_{t−1} + 0.45 Y_{t−2} + ε_t + ε_{t−1} + 0.25 ε_{t−2}.

Time Series Spring 2010 – slide 143
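A sketch of the root analysis for Example 45 (not part of the notes; numpy assumed). Here φ(z) = 1 − 0.4z − 0.45z² and θ(z) = 1 + z + 0.25z²:

```python
# Sketch: locate the roots of phi(z) and theta(z) for Example 45.
# np.roots expects coefficients in decreasing order of the power of z.
import numpy as np

print(np.roots([-0.45, -0.4, 1.0]))   # phi:   roots 10/9 and -2
print(np.roots([0.25, 1.0, 1.0]))     # theta: a double root at -2

# The shared root z = -2 means phi and theta have the common factor
# (1 + 0.5z); cancelling it leaves the ARMA(1,1) model
# (1 - 0.9B) Y_t = (1 + 0.5B) eps_t. The remaining roots 10/9 and -2 lie
# outside the unit disk D, so by Theorem 44 the reduced process is both
# causal and invertible.
```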



Comments

□ Why use ARMA processes?
– They are usually empirical models, using φ_1, ..., φ_p, θ_1, ..., θ_q as summary statistics, with no implication that the model has a 'scientific', explanatory basis in terms of the underlying data-generating mechanism.
– The spectrum of an ARMA process can take many forms without p or q being very large, so ARMA models provide a flexible and parsimonious way to approximate a wide range of second-order properties.
– They are useful for forecasting, and for other settings where the autocorrelation structure of the data is not of primary interest.

□ ARMA models are not usually useful when the focus is on understanding the underlying mechanism that generates the data.

□ AR and MA models separately may provide more interpretable models in such cases:
– AR models have Markov structure, which may be interpretable;
– MA models stem from weighted moving averages, which may be interpretable.

Time Series Spring 2010 – slide 144

