25.08.2015 Views

Is TRAMO-SEATS automatic identification of Reg-ARIMA ... - Cemfi


April 2012
Is TRAMO-SEATS automatic identification of Reg-ARIMA models reliable?
Some large-scale evidence.
Agustín Maravall
Bank of Spain
with the collaboration of Roberto López Pavón and Domingo Pérez Cañete
Indra


ABSTRACT
Insofar as, as Hawking and Mlodinow state, “there can be no model-independent test of reality,” time series analysis applied to large sets of series (perhaps several thousands) needs an automatic model identification (AMI) procedure. The presentation will discuss the results of the automatic identification of models of the regression-ARIMA (reg-ARIMA) type in the programs TRAMO-SEATS. The procedure identifies several types of outliers and calendar effects through regression, and identifies the ARIMA model for the stochastic series. Interpolators of missing values (if any) and forecasts of the series are obtained, as well as MMSE estimators and forecasts of the unobserved components contained in the series (in particular, of the seasonally adjusted series). Two sets of series are treated: one with 50000 simulated series, generated by 50 different ARIMA models; the second with 13691 real economic series of different lengths. The first set shows the accuracy of the procedure in identifying the correct model; the second set shows the ability of the procedure to provide a satisfactory model for actual series, as evidenced by the validity of the n.i.i.d. assumption for the residuals and by the performance of the out-of-sample forecasts.
A comparison is made with the AMI procedure in the present X12-ARIMA and DEMETRA+ programs (based on older versions of TRAMO's AMI). For monthly series with lengths not exceeding 30 years the TRAMO+ AMI procedure is found remarkably reliable. (For very long series, kurtosis of the residuals becomes the biggest problem.) Still, the procedure certainly provides an excellent benchmark and a good starting point when a careful manual identification is intended.


TRAMO, SEATS and TSW
vs
TRAMO+, SEATS+ and TSW+
(New features: next seminar May 10)
Everything is done on the basis of a:
REGRESSION-ARIMA MODEL
Need for AMI
Long history and frustrating results (ex.: Autobox, SCA, …)
(Ex.: standard unit root tests; not used by TRAMO+)
We consider: AMI in TSW+.
Incorporated to (with a lag…)
- X12-ARIMA (TRAMO in REGARIMA)
- X13-ARIMA-SEATS
- DEMETRA and DEMETRA+
- ESS GUIDELINES
- (UN recommendation)
- (INE)
Future maintenance?
Special attention paid to large-scale application (as in SA).


One thing that is new in SA over the last 15+ years:
THE INCORPORATION OF STATISTICAL TIME SERIES MODELS INTO THE SA OF ECONOMIC AND SOCIAL TIME SERIES
(more generally, unobserved components decomposition).
Two basic approaches:
- ARIMA model based (AMB) approach,
- structural time series model (STSM) approach.
STSM is restricted to the standard specifications that are available to users by default or in large-scale applications (e.g., program STAMP).
The two approaches share many things, but there are some important differences.
(1) STS models always accept an ARIMA representation.
But not all ARIMA models can be expressed as STSM.
(Simple example: … when … 0.)


(2) Components in STSM approach: all contaminated by noise.
Ex. Trend-cycle in STSM: …
In AMB … is decomposed into … canonical (spectral zero for …) … : white noise.
Same feature affects seasonal component.
In general, STSM for component (not …) = AMB component + orthogonal white noise.
(3) No Automatic Outlier Detection and Correction procedure is available in the STSM approach.
(4) No proper Automatic Model Identification procedure.
(Similar to using Airline model always…)


Thus I consider the AMB approach as implemented in TRAMO-SEATS and TSW.
DATA
AMI
MODEL
MODEL IS DECOMPOSED
MMSE FILTERS
JOINT DISTRIBUTION OF COMPONENTS AND ESTIMATORS
ADJUSTED DATA
INFERENCE AND DIAGNOSIS
Everything depends on finding an appropriate model.
How to judge whether a model is appropriate?
$H_0: a_t \sim \text{n.i.i.d.}(0, V_a)$
Validity of estimators, some diagnostics, and most inference depend on not rejecting $H_0$.


Because all users need not be model identification experts, and because, even if they are, the number of series to treat may be too big, AMI is a crucial feature.
We are going to look at summary results of TRAMO’s AMI default procedure.


REG-ARIMA model:
Out (outliers): AO, LS, TC
Cal (calendar effects): TD, EE, LY
x_t: ARIMA, with seasonal polynomials Φ and Θ (perhaps there are M.O.)
Orders: p, q = 0, 1, 2, 3; d = 0, 1, 2; bp, bd, bq = 0, 1
384 combinations.
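The count of 384 candidate orders can be checked by enumeration. The sketch below assumes one reading of the order ranges on this slide (regular AR and MA orders 0 to 3, regular differencing 0 to 2, and each seasonal order 0 or 1); the names `p, d, q, bp, bd, bq` follow the P, D, Q, BP, BD, BQ labels used in the tables of this deck, and the function name is purely illustrative.

```python
from itertools import product

# Assumed order ranges (one reading of the slide that reproduces 384).
REGULAR_AR = range(4)     # p  = 0, 1, 2, 3
REGULAR_DIFF = range(3)   # d  = 0, 1, 2
REGULAR_MA = range(4)     # q  = 0, 1, 2, 3
SEASONAL = range(2)       # bp, bd, bq = 0, 1

def candidate_orders():
    """Enumerate every (p, d, q)(bp, bd, bq) combination."""
    return [
        ((p, d, q), (bp, bd, bq))
        for p, d, q, bp, bd, bq in product(
            REGULAR_AR, REGULAR_DIFF, REGULAR_MA, SEASONAL, SEASONAL, SEASONAL
        )
    ]

orders = candidate_orders()
print(len(orders))  # 384
```

The Airline model corresponds to the single entry `((0, 1, 1), (0, 1, 1))` in this list.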


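Important models:
Seasonal series: Airline model (Default)
$\nabla \nabla_{12} x_t = (1 + \theta_1 B)(1 + \theta_{12} B^{12}) a_t$
Non-seasonal series: IMA(1,1) + mean model (Default)
$\nabla x_t = (1 + \theta B) a_t + \mu$
- They approximate well many other models
- Useful in AMI as
  - Starting point
  - Benchmark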


AUTOMATIC MODEL IDENTIFICATION (TRAMO)
1. Log/level
2. Seasonal/non-seasonal
3. Calendar effects
4. Outliers
5. ARIMA: Differencing, ARMA
A. PERFORMANCE OF AMI ON A SET OF SIMULATED SERIES
AUTOMATIC MODE. ONLY PARAMETER 3
(No pre-test for calendar effects).
B. PERFORMANCE OF AMI ON A SET OF REAL SERIES
AUTOMATIC MODE. 4
(Default, pre-test for TD with working/non-working day specification).


Series simulated from n.i.i.d. (0,1), series a d bd starting conditions.
50 ARIMA models. For each:
- 500 series with 120 observations
- 500 series with 240 observations.
Total:
- 25000 “short” series
- 25000 “long” series.
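As an illustration of how one such series could be generated, here is a minimal sketch for an Airline-type model driven by n.i.i.d.(0,1) innovations. It is not the generator used for the study: the sign convention $(1 + \theta_1 B)(1 + \theta_{12} B^{12})$, the zero starting values, the seed handling, and the function name `simulate_airline` are all assumptions made for the example.

```python
import random

def simulate_airline(n, theta1, theta12, seed=0, period=12):
    """Simulate n observations of an Airline-type model (assumed convention):
    (1 - B)(1 - B^period) x_t = (1 + theta1 B)(1 + theta12 B^period) a_t."""
    rng = random.Random(seed)
    lag = period + 1
    # Extra draws supply the pre-sample innovations a_{t-1}, ..., a_{t-13}.
    a = [rng.gauss(0.0, 1.0) for _ in range(n + lag)]
    # Stationary part: w_t = (1 + theta1 B)(1 + theta12 B^period) a_t.
    w = [
        a[t] + theta1 * a[t - 1] + theta12 * a[t - period]
        + theta1 * theta12 * a[t - lag]
        for t in range(lag, n + lag)
    ]
    # Integrate: x_t = x_{t-1} + x_{t-period} - x_{t-period-1} + w_t,
    # starting from zero initial values (an assumption of this sketch).
    x = [0.0] * lag
    for wt in w:
        x.append(x[-1] + x[-period] - x[-lag] + wt)
    return x[lag:], w

x, w = simulate_airline(120, theta1=-0.6, theta12=-0.4)
print(len(x))  # 120
```

Applying the regular and seasonal differences to `x` recovers `w`, which is a quick way to confirm that the integration step matches the model.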


Models:
A. AIRLINE-TYPE MODELS (8500 series)
Orders: (0 1 1) (0 1 1), (0 1 0) (0 1 1), (0 1 1) (0 1 0)
θ1      θ12
-0.9    -0.7
-0.8    -0.4
-0.7    -0.3
-0.6    -0.4
-0.6    0
-0.5    -0.95
-0.5    -0.5
-0.4    -0.6
-0.4    0
-0.3    -0.7
0       -0.7
0       -0.5
0.3     -0.6
0.3     0
0.4     -0.8
0.5     -0.6


B. NON-SEASONAL MODELS (8000 series)
a. Stationary (regular orders)
x a (0 0 0)1 0.7 (1 0 0) 1 0.6 (0 0 2)1 0.8 1 0.5 (1 0 1)1 0.6 (2 0 0)complex root: 0.78, 7.231 0.41 0.37 1 0.30 0.4 (2 0 1)real roots: 0.85, 0.441 0.3 0.5 . (3 0 0)real root: 0.67complex root: 0.86, 3.19


b. Non-stationary (orders)
1 0.7 (0 1 1) 1 0.5 (0 1 1) 1 0.3 (0 1 1) (0 1 0)1 0.7 (1 1 0)1 0.6 1 0.5 0.7 (1 1 2)1 0.40 0.42 (2 1 0)complex root: 0.65, 5.01 0.7 (0 1 0) (1 0 0) 1 0.8 (0 2 1) 1 0.31 0.36 (0 2 2)complex root: 0.6, 4.8


C. OTHER SEASONAL MODELS (8500 series)
a. Stationary seasonality (orders)
1 0.6 1 0.6 . (1 0 0) (1 0 0)1 0.8 1 0.4 . (0 0 0) (1 0 1)1 0.7 1 0.85 1 0.3 (1 0 1) (1 0 0)1 0.7 1 0.4 0.7 . (0 1 2) (1 0 0)
b. Non-stationary seasonality (orders)
1 0.5 . (0 0 0) (0 1 1)1 1.4 B 0.7 B x 1 0.5 B a . (2 0 0) (0 1 1)complex root: 0.5, 4.801 0.4 1 0.5 . (0 0 0) (1 1 1) x 1 0.23 B 0.19 B 1 0.56 B a (0 1 2) (0 1 1)1 0.7 B x 1 0.50 B a (1 1 0) (0 1 1)1 0.5 B x 1 0.4 B a . (0 1 1) (1 1 0)1 0.4 B x 1 0.4 B 0.4 B 1 0.4 B a .(1 1 2) (0 1 1)1 0.3 1 0.6 . (1 1 0) (0 1 1)


1 0.3 B x 1 0.6 B 1 0.3 B a . (1 1 1) (0 1 1)1 0.4 1 0.5 1 0.5 .(0 1 1) (1 1 1)1 0.6 B 0.5 B x 1 0.8 B a . (2 1 0) (0 1 1)complex root: 0.71, 5.551 0.5 B 0.3 B x 1 0.4 B a . (3 1 0) (0 1 1)real root: 0.54complex root: 0.75, 2.61 0.1 B 0.17 B 0.34 B x 1 0.48 B a .(3 1 0) (0 1 1)real root: 0.74complex root: 0.68, 2.81 0.4 1 0.4 . (0 2 1) (1 1 0)


The complete set contains many models often found in practice and also models with awkward structures.
An example of the latter: the seasonal AR polynomial $(1 + \Phi B^{12})$ with $\Phi > 0$. The spectral peaks of the model occur at intraseasonal frequencies, near the middle of the intervals between seasonal harmonics. These peaks will generate a transitory component.
Such an AR structure may appear when modeling SA series, and is accompanied by negative seasonal autocorrelation, typically present in the adjusted series.
Approx.:
- 34% are “Airline” type models.
- 34% are “Other Seasonal” models.
- 32% are Non-seasonal models.
Non-seasonal series possibly are overrepresented.
Yet it is important to identify well which series have seasonality and which ones don’t.
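The location of those intraseasonal peaks can be verified numerically. The sketch below assumes the polynomial in question is $(1 + \phi B^{12})$ with $\phi > 0$, which is consistent with the negative seasonal autocorrelation mentioned on the slide; the spectrum is then proportional to $1 / (1 + \phi^2 + 2\phi\cos 12\omega)$. Function names and the grid size are illustrative choices.

```python
import math

def seasonal_ar_spectrum(omega, phi, period=12):
    """Spectrum (up to a constant) of (1 + phi B^period) x_t = a_t."""
    return 1.0 / (1.0 + phi * phi + 2.0 * phi * math.cos(period * omega))

def peak_frequencies(phi, period=12, grid=12000):
    """Interior local maxima of the spectrum on (0, pi), located on a grid."""
    omegas = [math.pi * k / grid for k in range(grid + 1)]
    s = [seasonal_ar_spectrum(w, phi, period) for w in omegas]
    return [
        omegas[k] for k in range(1, grid)
        if s[k] > s[k - 1] and s[k] > s[k + 1]
    ]

peaks = peak_frequencies(phi=0.5)
# Express each peak in units of pi/12: odd values lie midway between
# the seasonal harmonics, which sit at even multiples of pi/12.
print([round(p * 12 / math.pi, 3) for p in peaks])
```

With $\phi < 0$ instead, the maxima move onto the seasonal harmonics themselves, which is the usual stationary-seasonality pattern.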


16% are stationary (40% of them are seasonal).
84% are non-stationary.
Models have d = 0, 1, 2; bd = 0, 1. p = 0, 1, 2, 3; bp = 0, 1. q = 0, 1, 2; bq = 0, 1.
Maximum order of differencing: …
The 50000 simulated series were exponentiated.
Then, the log/level test (likelihood ratio) was applied to the 100000 series set.


Table 1: Errors in Log/Level test (in % of series)
                  Series is in levels    Series is in logs
                  120      240           120      240
Airline model     0.1      0.0           0.2      0.0
Other seasonal    0.4      0.1           1.1      0.1
Non-seasonal      0.0      0.0           1.6      1.0
Total             0.2      0.0           1.0      0.4
- Accurate
- Slight bias that favors levels
- Most errors occur for … 2
Next: Pre-test. Does the series contain seasonality? Several tests.


Table 2: Errors in the detection-of-seasonality-in-series tests (in % of series in group)
                        # obs.  Non-param.  Autocorr.  Spectral  F-test  Overall  Model produced
                                test        test       test              test     by AMI
Airline model           120     0.0         0.0        0.0       0.2     0.0      0.0
                        240     0.0         0.0        0.0       0.1     0.0      0.0
Other seasonal models   120     2.9         0.2        6.1       1.9     0.1      0.3
                        240     1.9         0.0        4.9       1.4     0.0      0.0
Non-seasonal models     120     1.5         1.6        2.4       0.8     2.2      1.2
                        240     1.8         1.8        3.0       0.7     2.5      0.7
Total                   120     1.5         0.6        2.9       1.0     0.7      0.5
                        240     1.2         0.6        2.6       0.7     0.8      0.2


Pre-test and final detection of seasonality
- Ranking: 1) AC, 2) F, 3) NP, 4) Spectral.
- Errors concentrate on failure to detect stationary seasonality. Also, slight overdetection of seasonality.
- The four tests are integrated in the Overall pre-test. Weights for AC and F: largest; Spectral test: small weight.
- Final decision is the model provided by AMI:
  1 error in 200 for 120 observations.
  1 error in 500 for 240 observations.
Excellent behavior.


Next: AODC
Table 3: Average number of outliers per series
                  120      240
Airline model     0.18     0.11
Other seasonal    0.16     0.09
Non-seasonal      0.17     0.10
Total             0.17     0.10
Spurious detection of outliers:
- 1 spurious outlier every 6 series for 120 observations
- 1 spurious outlier every 10 series for 240 observations
(not far from proportions implied by CV for outlier detection).
Next: IDENTIFICATION OF ARIMA MODEL ORDERS


Table 4: ARIMA model correct identification (in % of series in group)
                  Complete model orders    Differencing
                  120      240             120      240
Airline model     78.1     85.7            95.9     99.3
Other seasonal    46.7     71.6            92.3     97.4
Non-seasonal      67.6     79.5            93.5     95.7
Total             64.4     78.9            93.9     97.5
- 384 possible model orders
- Some models are close. Ex.:
  - ARI(1,1) and IMA(1,1) when parameter small in modulus.
  - ARMA(1,1) and IMA(1,1) when … 0.8
  - (1, 1, 1) and (2, 1, 0)
  Etc…
- Algorithm favors NON-STATIONARITY
  (More regular seasonal or trend-cycle components, at the cost of some small autocorrelation in irregular component.)
- First step in AMI: if AR(1) x AR12(1) has a real root between 0.88 and 1, it is made 1
  (simulated models have real AR roots 0.80, 0.85).


Results:
- For the 25000 series with 120 observations, 2/3 of the time correct identification of full model.
- Idem with 240 observations: 4/5 of the time.
- Correct identification of the differencing:
  - 120: 94%
  - 240: 97%
Most failures due to:
- Detecting stationary seasonality.
- Large real AR roots.
Table 5: Errors in Airline model detection (in % of series in group)
                  120      240
Airline model     21.9     14.3
Other seasonal    15.6     6.3
Non-seasonal      0.2      0.1
Total             12.8     7.0
- No over-detection of Airline models. (Rather the opposite!)


Table 6: Errors in differencing polynomials (in % of series in group)
Simulated model: D = regular differences, BD = seasonal differences
D   BD   # of obs.   # of series   Errors in D               Errors in BD              Total errors in
         in series   in group      Under-diff.  Over-diff.   Under-diff.  Over-diff.   diff. polynomial
0   0    120         4500          ---          8.3          ---          4.0          10.7
         240         4500          ---          7.1          ---          1.4          7.8
1   0    120         4250          0.0          1.6          ---          0.8          2.4
         240         4250          0.0          0.7          ---          0.2          0.8
2   0    120         1000          12.5         ---          ---          0.3          12.6
         240         1000          1.8          ---          ---          0.4          2.2
0   1    120         1500          ---          7.0          2.0          ---          9.0
         240         1500          ---          3.7          0.3          ---          3.9
1   1    120         13500         3.7          0.4          1.0          ---          4.8
         240         13500         0.7          0.3          0.1          ---          1.2
2   1    120         250           9.2          ---          1.6          ---          10.8
         240         250           1.2          ---          0.0          ---          1.2
Total (*) 120        25000         2.6          2.4          0.6          0.9          6.1
          240        25000         0.5          1.8          0.1          0.3          2.5
(*) in % of relevant groups


Errors in orders of differencing
- Grouped by (D, BD)
- Emphasis put on seasonal differencing.
- More problematic group: stationary series.
Major problems (% errors > 4%):
- Some regular over-diff. for series with D = 0;
- Some regular under-diff. for short series with D = 2;
- For the 50000 series (long and short):
  • Errors in BD < 1%, both under and over;
  • Errors in D ≤ 2.6%, both under and over, for 120 observations; ≤ 1.8%, both under and over, for 240.
ARMA parameters
Table 7: Average number of stationary parameters per series
                  120      240      In simulation model
Airline model     1.9      1.8      1.7
Other seasonal    2.4      2.6      2.6
Non-seasonal      1.5      1.5      1.5
Total             1.93     1.97     1.94


Table 8: Simulated series: model diagnostics; % of series in group that fail the test
(n.i.i.d. assumption on residuals: first six test columns; out-of-sample forecast: F-test over the 18 final periods, t-test 1-period-ahead)
                        # obs. per  Constant mean  Autocorr.  Random  Normality  Skewness  Kurtosis  F-test  t-test
                        series      and variance              signs
Airline model (8500)    120         0.8            0.3        0.2     1.0        0.8       0.6       5.3     1.2
                        240         0.7            0.3        0.2     1.3        0.8       0.8       3.9     1.1
Other seasonal (8500)   120         0.6            0.4        0.3     1.2        0.8       0.7       4.9     1.1
                        240         0.8            0.4        0.2     1.3        0.9       0.9       2.7     0.9
Non-seasonal (8000)     120         0.7            0.6        0.3     0.7        0.6       0.5       2.0     1.2
                        240         0.8            0.5        0.2     0.7        0.7       0.5       1.1     0.8
Total (25000)           120         0.7            0.4        0.3     1.0        0.7       0.7       4.1     1.2
                        240         0.8            0.4        0.2     1.1        0.8       0.7       2.6     0.9


- All tests are carried out at the 1% (approximate) level.
- Residual diagnostics
  - In all cases, except for Normality, % failure < 1%;
  - For N: ≈ 1%
- Out-of-sample forecast
  - F-test failures decrease with NZ, from 5.3% to 1.1%;
  - t-test ≈ 1%
Table 9: Seasonality and Calendar residual effects (% of residual series in group that show evidence)
(Evidence of seasonality in residuals: first four test columns)
                        # obs. per  Seasonal   Nonparametric  Spectral  Overall  Spectral evidence of
                        series      autocorr.  test           evidence  test     TD effect in residuals
Airline model (8500)    120         0.0        0.1            0.1       0.0      0.1
                        240         0.0        0.3            0.1       0.0      0.2
Other seasonal (8500)   120         0.0        0.1            0.2       0.0      0.1
                        240         0.1        0.4            0.1       0.1      0.2
Non-seasonal (8000)     120         0.1        0.4            0.2       0.1      0.1
                        240         0.2        0.6            0.2       0.2      0.1
Total (25000)           120         0.0        0.2            0.2       0.0      0.1
                        240         0.1        0.4            0.1       0.1      0.2
Obviously, seasonality is captured by the model.


ONE WORD OF CAUTION:
TRAMO-SEATS and TSW up to 2011: moderate revisions to the basic programs of 1996 and 2004.
In 2001: start work on new versions (considerably corrected, more complete, and extended).
The results I have presented correspond to the first release of these new versions (TSW+ 555).
My intention is to refer to them as TRAMO+, SEATS+, and TSW+.
The TRAMO and SEATS programs made available for the routine RegARIMA in X12-ARIMA, and for X13-ARIMA-SEATS, as well as for DEMETRA+, were older versions of the new programs.
(Hopefully, they will eventually be updated, but it will take time.)
Over the last two years, work on the AMI procedure.
To get a feeling for the differences between the different versions, X12-ARIMA (release version 0.3, build 188) and DEMETRA+ (version 1.0.2.2228) were applied to the set of 50000 series and compared to the results of TSW (version 555):


Table 10: Correct identification of the ARIMA model
                        # obs. in  Complete model orders       D and BD
                        series     TS      X12A    Demetra+    TS      X12A    Demetra+
Airline-type models     120        78.2    68.0    71.6        96.1    94.8    96.4
                        240        85.7    80.1    79.9        99.3    98.9    99.3
Other seasonal models   120        46.9    36.4    43.3        92.4    86.6    85.9
                        240        71.6    48.9    66.3        97.4    87.1    88.5
Non-seasonal models     120        68.7    36.4    54.0        93.5    71.5    76.6
                        240        79.5    32.6    64.5        95.7    71.6    80.4
Total                   120        64.9    47.2    56.4        94.0    84.6    86.8
                        240        78.9    54.5    70.3        97.5    86.3    89.6
- Improvement in 3 groups.
- Very large improvement for Other Seasonal and Non-seasonal.


SIMULATED SERIES
In summary, if series can be seen as generated by a (relatively parsimonious) ARIMA model, AMI in TRAMO yields the following results (% of series with correct result):
Identification of the exact model: 65 – 79%
Identification of the exact differencing needed for stationarity: 94 – 98%
Model diagnostics: 99%
Forecasting (one-period-ahead)
- all series (TERROR): 99%
- 18 one-period-ahead forecasts (single series): 96 – 97.5%
SEASONALITY:
Correct detection of seasonality (or lack thereof): 99.5%
No seasonality in residuals: 99.5%
(No missed seasonality; no spurious seasonality)
Automatic procedure seems to work well (at least, as a starting point or as benchmark).


But what about REAL SERIES?
Generating process is unknown. Is a reg-ARIMA model a reasonable approximation?
(Real world is always harder than simulated one.)
Mixed set of monthly series            Approx. # of series
FTI European countries (Eurostat)      6500
US series (BLS, USBC)                  3300
Spanish series (INE)                   1500
Other (multiple sources)               2000
Dominant groups: FTI, employment US (1800). Rest: smaller groups.
- All categories: besides FTI and unemployment, national accounts, manufacturing, housing and construction, monetary series, industrial production, price indices, sales and inventories, public finance, tourism and travel, agriculture and food, energy, health, survey data, among others. Also: SA, MO.
- Many countries (Africa, Asia, Australia, Europe, North and South America)
- Length: [60 – 600] observations.


Table 11: General
Group            # obs. per  Average  # of     % logs  Average #          Average #        % with calendar
                 series      length   series           parameters/series  outliers/series  effect
1                60 – 110    92       3479     82.5    2.1                0.8              23.2
2                111 – 160   126      3885     91.1    2.2                1.3              70.8
3                161 – 210   173      2945     91.3    2.3                1.6              72.6
4                211 – 260   229      1817     80.7    2.2                2.5              28.7
5                261 – 360   290      1038     84.8    2.4                2.8              60.7
TOTAL (1 to 5)               155      13164    87.0    2.2                1.5              52.0
6                361 – 600   496      527      71.4    3.0                5.9              42.1
TOTAL (1 to 6): 13691
- Longer series are highly infrequent. Only 527. Besides, results deteriorate considerably. Included mostly to illustrate which problems arise with longer series.
- TOTALS and AVERAGES do not include Group 6. They refer to the 13164 series that are not longer than 30 years.
- FTI in groups 1, 2 and 3. US series in all, mostly 1 and 4 (also 3). Spanish series in all, mostly 4 and 5. Other series in 2, 5, 1.
- Mixed set, with much variation in series behavior. Not an easy set!
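The TOTAL row of Table 11 can be cross-checked as a series-weighted average of the group rows. The short sketch below uses only the figures printed in the table (groups 1 to 5, plus the 527 series of group 6); the variable names are illustrative.

```python
# (length range, average length, number of series, avg. outliers per series)
groups = [
    ("60-110", 92, 3479, 0.8),
    ("111-160", 126, 3885, 1.3),
    ("161-210", 173, 2945, 1.6),
    ("211-260", 229, 1817, 2.5),
    ("261-360", 290, 1038, 2.8),
]
LONG_SERIES = 527  # group 6 (361-600 observations)

total = sum(n for _, _, n, _ in groups)
avg_length = sum(length * n for _, length, n, _ in groups) / total
avg_outliers = sum(out * n for _, _, n, out in groups) / total

print(total, total + LONG_SERIES)   # 13164 13691
print(round(avg_length))            # 155
print(round(avg_outliers, 1))       # 1.5
```

The weighted averages reproduce the reported totals (155 observations and 1.5 outliers per series), which confirms that the TOTAL row weights each group by its number of series.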


Next: Results from TSW+ purely automatic use (with parsimonious specification for TD pre-testing).
I.e., … 4, default option.
Table 12: General
Group (by NZ)   # series  Average  % logs  Average #          Average #        % with calendar
                in group  length           parameters/series  outliers/series  effect
60 – 110        3479      92       82.5    2.1                0.8              23.2
111 – 160       3885      126      91.1    2.2                1.3              70.8
161 – 210       2945      173      91.3    2.3                1.6              72.6
211 – 260       1817      229      80.7    2.2                2.5              28.7
261 – 360       1038      290      84.8    2.4                2.8              60.7
TOTAL           13164     155      87.0    2.2                1.5              52.0
361 – 600       211       504      74.9    2.9                6.4              42.2
- More levels in short and long series.
- Average # of parameters not much affected.
- Outliers: about 1 per 100 observations (all groups).
- Calendar effect: groups 1 and 4 mostly unaffected; possibly due to group composition (e.g., FTI prone to exhibit calendar effect).


Table 12: Outliers
Group (by NZ)    % of series   Average # per series
                 with out.     AO     TC     LS     Tot.
60 – 110         42.2          0.4    0.2    0.2    0.8
111 – 160        59.5          0.6    0.3    0.4    1.3
161 – 210        67.1          0.8    0.4    0.5    1.6
211 – 260        82.3          1.1    0.5    0.9    2.5
261 – 360        85.0          1.6    0.6    0.6    2.8
TOTAL AVERAGE    61.8          0.7    0.3    0.4    1.5
361 – 600        93.0          2.5    1.4    2.0    5.9
- % of series with outliers increases with length.
- About 1 outlier / 100 observations.
- # of AO ≈ (# of TC + # of LS).
Table 13: Calendar effects
Group            % of series with
                 TD      EE      STOCH. TD   Calendar effect
60 – 110         19.5    2.6     2.3         23.2
111 – 160        68.7    17.5    3.1         70.8
161 – 210        69.3    25.4    5.0         72.6
211 – 260        23.9    11.9    3.1         28.7
261 – 360        56.2    27.6    2.6         60.7
TOTAL AVERAGE    48.7    15.4    3.3         52.0
361 – 600        37.4    13.1    8.3         42.1


Table 14: % Reduction in residual SE due to outliers and Calendar effect
Group        Due to automatic outlier      Due to Calendar
             detection and correction      effect pretesting
60 – 110     12.5                          3.0
111 – 160    12.9                          9.3
161 – 210    16.4                          7.6
211 – 260    15.3                          1.8
261 – 360    13.1                          3.9
% is computed with respect to the complete group (including series with no outlier / no calendar effect, for which the reduction is 0).


Table 15: Detection of seasonality
Group (by NZ)    Pre-tests: % of series with seasonality       Seasonal component  (not including non-
                 NP     QS     Spectrum  F      Overall test   in AMI model        significant seasonality)
60 – 110         81.1   83.3   50.9      84.6   86.0           85.1                (83.9)
111 – 160        81.7   78.7   79.3      84.1   85.2           81.9                (79.7)
161 – 210        87.2   85.5   83.1      89.5   90.7           87.4                (84.4)
211 – 260        84.5   82.0   83.1      87.0   88.0           84.6                (80.7)
261 – 360        83.2   78.2   80.0      87.8   88.8           84.4                (78.0)
TOTAL AVERAGE    83.3   81.9   73.2      86.1   87.3           84.5                (81.8)
361 – 600        62.8   61.1   62.1      61.9   64.9           63.4                (60.2)
- Short series: Spectrum of little use. Likely to fail when seasonality is stationary.
- Other groups: all are roughly close (range: 78 – 89%). F detects the most (average: 86%), followed by NP (average: 83%), QS (average: 82%), Spectrum (73%).
- Most reliable with simulated series: AMI result (range: 82 – 87%). If that is still true, the overall test over-detects.
  - AMI in TRAMO will reduce it.


  - SEATS will reduce it further (through quality control).
    (In particular, the seasonal component estimator may not be significantly different from zero.)
Final % of seasonal series to be adjusted is given in the last column of Table 15.


ARIMA MODEL IDENTIFICATION
Table 16: Differences
Group            % of series with                                                              Airline
                 Stationary   D=1,    D=2,    D=0,    D=1,    D=2,                             model
                 (no diff.)   BD=0    BD=0    BD=1    BD=1    BD=1
60 – 110         31.6         21.5    0.3     4.5     40.2    1.9                              31.8
111 – 160        7.2          22.9    0.3     5.3     64.0    0.4                              47.4
161 – 210        4.0          18.5    0.2     3.1     73.9    0.4                              49.6
211 – 260        2.5          22.4    0.2     3.2     69.6    2.0                              50.1
261 – 360        2.5          21.6    0.6     1.3     72.4    1.6                              47.2
TOTAL AVERAGE    11.9         21.4    0.3     4.0     61.4    1.2                              44.1
361 – 600        0.8          35.5    7.4     1.1     45.7    13.9                             14.6
- % of stationary series goes (drastically) down with length.
- Majority of series requires D = 1, BD = 1.
- Only D = 1: % is stable for all groups.
- Only D = 2: few (more for large).
- Only BD = 1: few.
- Only D = 2, BD = 1: 1.2% (more for large).


Table 17: ARMA parameters
Group (by NZ)    Average # per series
                 P      Q      BP     BQ     Total
60 – 110         0.7    0.6    0.3    0.5    2.0
111 – 160        0.5    0.8    0.2    0.7    2.2
161 – 210        0.5    0.9    0.1    0.8    2.3
211 – 260        0.5    0.7    0.1    0.8    2.2
261 – 360        0.5    1.0    0.1    0.8    2.5
TOTAL AVERAGE    0.5    0.8    0.2    0.7    2.2
361 – 600        1.2    0.9    0.2    0.8    3.0
- Very slight increase with length. Average # for set: 2.15/series.
- Mostly MA. (An exception: P for large series.) # of MA parameters increases with length.
- Most frequent seasonal specification: IMA(1,1). Also: (1 0 0), (1 0 1), (1 1 0), (1 1 1).


Table 18: Real series: model diagnostics; % of series in group that fail the test
(n.i.i.d. assumption on residuals: first seven test columns; out-of-sample forecast: F-test over the 18 final periods, t-test 1 period ahead)
Group            Mean = 0  Constant mean  Autocorr.  Random  Normality  Skewness  Kurtosis  F-test  t-test
                           and variance              signs
60 – 110         1.3       1.9            0.5        0.5     3.8        2.6       2.4       8.4     9.4
111 – 160        0.5       3.9            1.1        0.4     5.5        3.0       4.3       5.9     7.7
161 – 210        0.3       6.0            2.3        0.5     8.0        3.8       6.6       7.0     10.9
211 – 260        0.3       7.9            1.8        0.6     13.6       4.2       12.4      13.3    8.4
261 – 360        0.1       4.9            3.4        0.9     17.5       4.9       16.8      6.3     3.1
TOTAL AVERAGE    0.6       4.5            1.5        0.5     7.7        3.4       6.4       8.0     8.3
361 – 600        0.6       20.9           20.7       4.7     66.2       15.6      66.8      12.9    3.4


Table 19: Real series: model diagnostics; % of series in group that pass the test
(n.i.i.d. assumption on residuals: first seven test columns; out-of-sample forecast: F-test over the 18 final periods, t-test 1 period ahead)
Group            Mean = 0  Constant mean  Autocorr.  Random  Normality  Skewness  Kurtosis  F-test  t-test
                           and variance              signs
60 – 110         98.7      98.1           99.5       99.5    96.2       97.4      97.6      91.6    90.6
111 – 160        99.5      96.1           98.9       99.6    94.5       97.0      95.7      94.1    92.3
161 – 210        99.7      94.0           97.7       99.5    92.0       96.2      93.4      93.0    89.1
211 – 260        99.7      92.1           98.2       99.4    86.4       95.8      87.6      86.7    91.6
261 – 360        99.9      95.1           96.6       99.1    82.5       95.1      83.2      93.7    96.7
TOTAL AVERAGE    99.4      95.5           98.5       99.5    92.3       96.6      93.6      92.0    92.7
361 – 600        99.4      79.1           79.3       95.3    33.8       84.4      33.2      87.1    96.6


Model diagnostics
- All tests carried out at the (approx.) 99% level.
- Two types of tests:
  - Residuals are n.i.i.d.
    • Normally distributed: N, Sk, Kur;
    • Identically: mean = 0; constant mean and variance;
    • Independently: Q; runs.
  - Out-of-sample forecasts:
    • Model fit to first (NZ – 18) obs. and 1-period-ahead forecast errors recursively computed for last 18 obs.: F-test.
    • Compute standardized 1-period-ahead forecast errors for the 13691 series and compare size (option TERROR: “TRAMO for errors”).
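For the independence check labelled Q, a portmanteau statistic on the residual autocorrelations is the usual tool. The sketch below computes the Ljung-Box form, $Q = n(n+2)\sum_{k=1}^{m} r_k^2/(n-k)$; whether this is the exact variant used in the programs, and the choice of 24 lags, are assumptions of this example rather than something stated on the slides.

```python
import random

def ljung_box_q(resid, max_lag=24):
    """Ljung-Box Q statistic on the first max_lag residual autocorrelations."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [e - mean for e in resid]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Lag-k sample autocorrelation.
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total

rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(120)]
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(120)]

q_white = ljung_box_q(white)
q_alt = ljung_box_q(alternating)
print(q_white < q_alt)  # True: strongly autocorrelated residuals inflate Q
```

Under the null, Q is compared with a chi-squared distribution whose degrees of freedom are the number of lags minus the number of estimated ARMA parameters.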


- Mean = 0:
  - For all groups ≥ 99%.
- Constant mean and variance:
  - Success decreases with NZ;
  - For all groups, success > 92% (for three > 95%);
  - Decreases substantially for NZ > 360 (79%);
- Autocorrelation (Q):
  - Success decreases with NZ (moderately);
  - For all groups, success ≥ 97%;
  - Decreases substantially for NZ > 360 (79%);
- Runs:
  - All groups ≥ 99% (96% when NZ > 360).
- Normality:
  - Success decreases rapidly with NZ. The decrease is spectacular for NZ > 360 (34% successes);
  - Kurtosis is the main problem; for all groups, success ≥ 83%; for 3 groups ≥ 93%.
  - Skewness performs better; for all groups, success ≥ 95%.


Out-of-sample forecast tests:
- F-test is performed on linearized series.
- t-test is performed on original series.
- NZ has little effect on the tests
  • F-test: % of success: 87 – 94% (Average: 92%)
  • t-test: % of success: 89 – 97% (Average: 92%)
Notice good forecasting performance of long series.
(In fact, for NZ > 360 the t-test performs best.)
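The out-of-sample F-test described in these diagnostics slides (fit on the first NZ – 18 observations, then one-period-ahead errors over the last 18) can be sketched as a ratio of the mean squared forecast error to the in-sample residual variance. The example below is only an illustration under strong assumptions: it uses a plain AR(1) fitted by least squares instead of a reg-ARIMA model, and the function names are hypothetical.

```python
import random

def ar1_fit(x):
    """Least-squares AR(1) fit without constant: (phi, residual variance)."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    phi = num / den
    resid = [x[t] - phi * x[t - 1] for t in range(1, len(x))]
    dof = len(resid) - 1  # one estimated parameter
    return phi, sum(e * e for e in resid) / dof

def out_of_sample_f(x, holdout=18):
    """Fit on all but the last `holdout` points, then compare the mean squared
    one-period-ahead error over the holdout with the in-sample variance."""
    phi, sigma2 = ar1_fit(x[:-holdout])
    start = len(x) - holdout
    # Parameters stay fixed; each forecast uses the actual previous value.
    errors = [x[t] - phi * x[t - 1] for t in range(start, len(x))]
    f_stat = (sum(e * e for e in errors) / holdout) / sigma2
    return f_stat, errors

rng = random.Random(1)
series = [0.0]
for _ in range(119):
    series.append(0.7 * series[-1] + rng.gauss(0.0, 1.0))

f_stat, errors = out_of_sample_f(series)
print(len(errors), f_stat > 0)  # 18 True
```

Under a correct model the statistic is referred to an F distribution with 18 and (in-sample residual count minus parameters) degrees of freedom; large values signal forecast errors that are too big relative to the fit.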


[Figure: four histograms. Outliers per series (mean 0.81); Residual autocorrelation, Q-statistic (mean Df 21.68); Randomness in sign of residuals, t-value (mean 0.04); Skewness of residuals, t-value (mean 0.24).]


[Figure: four histograms. Outliers per series (mean 1.33); Residual autocorrelation, Q-statistic (mean Df 22.97); Randomness in sign of residuals, t-value (mean -0.02); Skewness of residuals, t-value (mean 0.13).]


[Figure: four histograms. Outliers per series (mean 1.64); Residual autocorrelation, Q-statistic (mean Df 24.57); Randomness in sign of residuals, t-value (mean 0.00); Skewness of residuals, t-value (mean 0.03).]


[Figure: four histograms. Outliers per series (mean 2.47); Residual autocorrelation, Q-statistic (mean Df 24.89); Randomness in sign of residuals, t-value (mean -0.05); Skewness of residuals, t-value (mean 0.11).]


[Figure: four histograms. Outliers per series (mean 2.83); Residual autocorrelation, Q-statistic (mean Df 25.99); Randomness in sign of residuals, t-value (mean -0.14); Skewness of residuals, t-value (mean 0.39).]


Comparing the results for the simulated and real series, as could be expected, AMI applied to the latter yields worse results. Still, the results are quite encouraging.
N.i.i.d. assumption
- All tests for all groups indicate a proportion of success > 82%.
- Out of 35 aggregate test results, 31 yield % success > 90%; 26 yield % success > 95%.
- Only serious problem: kurtosis (which accounts for the failure of Normality).
  But kurtosis is of secondary importance. If the distribution is symmetric, point estimators are close to “optimal”.
- If N judged by skewness only:
  - all tests > 92% success.
  - 23 out of 25 tests > 95% success.
  - 10 out of 25 tests > 99% success.
  - Long series: all tests about 80% or more.
Out-of-sample forecasts
- Lowest % of success: 87%.
- 8 out of 10 tests > 90% success.
- Notice that tests do not deteriorate as NZ increases.
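Since skewness and kurtosis are judged separately here, it may help to see how the two t-values can be formed. The sketch below uses the standard large-sample approximations under normality (standard errors $\sqrt{6/n}$ for skewness and $\sqrt{24/n}$ for excess kurtosis); that the programs use exactly these expressions is an assumption, and the function name is illustrative.

```python
import math

def moment_t_values(resid):
    """Approximate t-values for skewness and excess kurtosis of residuals."""
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((e - mean) ** 2 for e in resid) / n
    m3 = sum((e - mean) ** 3 for e in resid) / n
    m4 = sum((e - mean) ** 4 for e in resid) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    t_skew = skewness / math.sqrt(6.0 / n)
    t_kurt = (kurtosis - 3.0) / math.sqrt(24.0 / n)
    return t_skew, t_kurt

# A symmetric but heavy-tailed sample: skewness is fine, kurtosis is not.
sample = [0.0] * 96 + [5.0, -5.0, 5.0, -5.0]
t_skew, t_kurt = moment_t_values(sample)
print(abs(t_skew) < 1e-9, t_kurt > 4.0)  # True True
```

This mirrors the situation described on the slide: a few large symmetric residuals reject normality through kurtosis while leaving skewness, and hence the point estimators, largely unaffected.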


SEASONALITY AND SEASONAL ADJUSTMENT
Table 20: Seasonality and Calendar residual effects (% of residual series in group that show evidence)
Group (by NZ)    Evidence of seasonality                              Spectral evidence of
                 Seasonal   Non-parametric  Spectral  Overall         TD effect in residuals
                 autocorr.  test            evidence  test
60 – 110         0.1        0.4             0.1       0.1             0.0
111 – 160        0.1        2.9             0.4       0.1             0.4
161 – 210        0.2        3.9             0.3       0.2             3.5
211 – 260        0.1        3.9             0.2       0.2             0.8
261 – 360        0.3        7.1             0.4       0.4             1.3
TOTAL AVERAGE    0.1        2.9             0.3       0.2             1.1
361 – 600        0.8        7.0             0.2       0.8             5.3
- Detection-of-seasonality tests applied to residuals.
  - AC: 1 or 2 in 1000 series.
  - Spectral: slightly above AC test; still, at least 99.5% of residual series show no evidence.
  - NP, however, detects seasonality in the range 0.4 – 7% of series; % increases with NZ.
  - Overall test: at least 99.5% of residuals show no evidence.
- TD: spectral evidence in residuals. A peak for the intermediate length group (3.5%). Rest: about 1% or less.


Table 21: Features of decomposition

                 MODEL CHANGED BY SEATS
Group (by NZ)    Non-admissible   Other    With seasonal   With transitory   With stochastic
                 decomp.                   component       component         TD
60 – 110         3.7              1.2      85.1            26.3              0.2
111 – 160        2.5              1.4      81.9            23.4              2.2
161 – 210        3.7              1.3      87.4            24.4              3.3
211 – 260        4.3              1.3      84.6            24.2              1.3
261 – 360        3.9              0.7      84.4            28.8              1.5
TOTAL            3.4              1.3      84.5            24.9              1.7
361 – 600        17.5             11.0     63.4            41.4              4.9

What SEATS does with these models:
• TRAMO model has no admissible decomposition (i.e., model is changed):
- Around 3.4% of series (not much affected by NZ).
• TRAMO model changed to improve decomposition features:
- Around 1.25% of series (not much affected by NZ).


SEATS decomposes the series into components. The trend-cycle and irregular components are present 99% of the time.

Seasonal component: around 85% of the series (not much affected by NZ).
Transitory component: around 25% of the series (not much affected by NZ).
Stochastic TD: around 2% of the series (only when a deterministic TD has been detected).

SEATS will point to problems that may arise in the series decomposition.


Table 22

                 Decomposition failures (1)                  Seasonal component: Quality diagnostics (2)
Group (by NZ)    Error in        ACF-CCF   Seasonality       Unstable      Unreliable    Revision   Seasonality   Unacceptable
                 spectral                  in SA series      seasonal      estimator     is too     not           seasonal
                 factorization             (AC, NP,          (large        (large        large      significant   (Total)
                                           or SPEC)          innovation    estimation
                                                             variance)     error)
60 – 110         0               1.1       0                 0.8           0.6           0.1        1.1           2.7
111 – 160        0               0.8       0                 0.2           0.1           0.1        2.2           2.6
161 – 210        0               1.4       0                 0.1           0.1           0.1        3.0           3.2
211 – 260        0               1.3       0                 0             0.2           0.1        3.9           4.1
261 – 360        0               2.6       0                 0             0.3           0          6.5           6.8
TOTAL            0               1.2       0                 0.3           0.3           0.1        2.7           3.3
361 – 600        0               3.2       0                 0             0.8           0          7.8           11.8

(1) Percentage of total number of series (13164)
(2) Percentage of total SA series (11127)


Computational failure: 0%.
Misspecification of component models: 1.2% (mostly detected when the model is an approximation of a non-admissible decomposition).
Seasonality in SA series: 0%.

Quality diagnostics (if failed, the series should not be SA):
• Unstable seasonal: 0.3% (i.e., large innovation variance)
• Unreliable estimator: 0.3% (i.e., large estimation error variance)
• Revisions are too large: 0.1% (i.e., large variance of revision error in concurrent estimator)
• Seasonality is not significant: 2.7% (seasonal component estimator not significant)

Hence 3.3% of the SA series are not worth adjusting, even though a seasonal component was detected (some 370 series).
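As a quick arithmetic check of the counts quoted above, the following sketch uses only the totals given in the footnotes of Table 22 (13164 series, of which 11127 are seasonally adjusted) and the 3.3% figure. The variable names are illustrative.

```python
# Totals taken from the footnotes of Table 22.
total_series = 13164   # all series in the groups with NZ up to 360
sa_series = 11127      # series for which a seasonal component was detected

# Share of series that are seasonally adjusted, in percent.
share_sa = 100.0 * sa_series / total_series

# Number of SA series flagged as "unacceptable seasonal" (3.3% of SA series).
unacceptable = 0.033 * sa_series

print(f"SA share: {share_sa:.1f}%")
print(f"Unacceptable seasonal: about {unacceptable:.0f} series")
```

The first figure agrees with the share of series with a seasonal component in Table 21, and the second with the "some 370 series" quoted above.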


Back to the original question: for series with NZ not longer than 30 years (monthly data), AMI in TSW+ seems quite reliable.

One last comment:
In SA, there is a long tradition of ambiguity:
• Lack of a precise definition of seasonality;
• “Wishful thinking”: how a decent seasonal component should be.

Important example: should a decent S(t) be NON-STATIONARY?
Stationary seasonal: each month's component has 0 mean.
Problem: if the series is stationary, S(t) is stationary. Say, a (1 - 0.7 B^12) AR polynomial.
• If adjusted by replacing it with (1 - B^12): overadjustment (misspecification);
• If not adjusted: diagnostics (AC, spectrum) will say there is seasonality in the SA series.

Better, stick to the model.
As Hawking says: “there is no model-independent test of reality.”
(Then, one can add “judgment.”)
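The overadjustment point can be illustrated with a small simulation. This is only a sketch under stated assumptions: the stationary seasonal series is taken to follow x(t) = 0.7 x(t-12) + a(t) with Gaussian white-noise a(t), and "overadjustment" is mimicked by applying the seasonal difference (1 - B^12). The function names, sample size and seed are arbitrary choices, and this is not the SEATS procedure.

```python
import numpy as np

def simulate_seasonal_ar(n, phi=0.7, period=12, seed=0, burn=600):
    """Simulate x(t) = phi * x(t - period) + a(t), with a(t) ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    for t in range(n + burn):
        past = x[t - period] if t >= period else 0.0
        x[t] = phi * past + a[t]
    return x[burn:]  # drop the burn-in so the start-up values do not matter

def acf_at(x, lag):
    """Sample autocorrelation of x at the given lag."""
    x = x - x.mean()
    return float(np.dot(x[lag:], x[:-lag]) / np.dot(x, x))

x = simulate_seasonal_ar(24000)
over = x[12:] - x[:-12]   # seasonal difference applied to a stationary series

print("lag-12 autocorrelation, original series:   ", round(acf_at(x, 12), 3))
print("lag-12 autocorrelation, seasonal difference:", round(acf_at(over, 12), 3))
```

Under these assumptions the original series has a clearly positive lag-12 autocorrelation, while the differenced series shows a negative one (about -0.15 in theory), which is the signature of having removed more seasonality than the model contains.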
