12.08.2013 Views

final_program_abstracts[1]


11 IMSC Session Program

A functional data approach for climate zones identification

Tuesday - Parallel Session 5

Edmondo Di Giuseppe 1, Giovanna Jona Lasinio 2, Stanislao Esposito 1 and Massimiliano Pasqui 3

1 CRA-Cma, Roma, Italy

2 Uniroma1, Roma, Italy

3 CNR-Ibimet, Roma, Italy

In order to correctly describe atmospheric variability and to detect clear trends, homogeneous climate regions should be identified. A combination of Functional Data Analysis (FDA) and the Partitioning Around Medoids (PAM) clustering technique is applied to surface temperature and precipitation fields over Italy. The analysed dataset is composed of daily precipitation and daily minimum and maximum temperature data collected for the period 1961-2007 from 96 Italian stations. First, minimum and maximum temperatures were averaged to obtain the medium temperature. Then the Monthly Mean of Medium Temperature (Tmed-MM) and the Monthly Cumulated Rainfall (Prec-MC) were calculated. Thus, 96 time series of 564 monthly values (47 years of 12 months) for each of the 2 climatic variables form the basis for the classification.
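
The aggregation step described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `monthly_series`, the record layout and the toy values are assumptions made for the example, not the authors' code or data.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_series(daily):
    """Aggregate daily records into monthly Tmed-MM and Prec-MC values.

    `daily` is an iterable of (day, tmin, tmax, precip) tuples
    (hypothetical layout). Returns {(year, month): (tmed_mm, prec_mc)}.
    """
    temps = defaultdict(list)
    rain = defaultdict(float)
    for day, tmin, tmax, precip in daily:
        key = (day.year, day.month)
        # Daily medium temperature: average of minimum and maximum.
        temps[key].append((tmin + tmax) / 2.0)
        # Monthly cumulated rainfall: running sum of daily precipitation.
        rain[key] += precip
    return {
        key: (sum(values) / len(values), rain[key])
        for key, values in sorted(temps.items())
    }


# Toy record: 59 identical days covering January and February 1961.
start = date(1961, 1, 1)
daily = [(start + timedelta(days=i), 2.0, 10.0, 1.5) for i in range(59)]
monthly = monthly_series(daily)

# Length of one station series for 1961-2007.
n_months = (2007 - 1961 + 1) * 12
```

Applied to a full station record, one such call per station would produce the series of length `n_months` for each of the two variables.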

FDA is a collection of techniques for modelling data from dynamic systems as functions, each expressed as a linear combination of a set of known basis functions. It consists of converting observations gathered at discrete times into functional data. The Tmed-MM and Prec-MC time series can be considered realizations of continuous processes recorded at discrete times. As each time series is representative of the climate variability at its station location, the series are converted into functional data through the estimation of spline coefficients. The main advantage of functional data is the reduction of many observations to a few coefficients while preserving the information about the temporal pattern of the time series. A B-spline basis system with a fixed number of knots is adopted for the functional data conversion, which guarantees that the responses from the 96 time series are comparable. There are 45 fixed interior knots plus 2 knots at the edges of the observation interval, and the piecewise polynomials have degree 3, for a total of 51 estimated coefficients. A Generalized Cross Validation (GCV) procedure is applied to determine the weight λ of the penalty matrix. Finally, the number of estimated coefficients is reduced by means of Principal Component Analysis (PCA). The principal components (PCs) of the estimated coefficients are then partitioned by the PAM classification technique to obtain climate zones.
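
The conversion of one monthly series into B-spline coefficients might look like the sketch below. It rests on several assumptions: the names `bspline_basis`, `solve`, `fit_spline` and `evaluate` are hypothetical, the series is synthetic, and the roughness penalty is a second-difference penalty on the coefficients (a P-spline-style stand-in, not necessarily the penalty matrix used by the authors). The GCV choice of λ and the PCA step are not shown. With the clamped knot vector used here the number of coefficients is interior knots + degree + 1, which depends on the knot convention and need not match the count quoted in the abstract.

```python
import math


def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function at x (Cox-de Boor recursion)."""
    n0 = len(knots) - 1
    # Degree 0: indicator of the half-open knot span containing x.
    b = [1.0 if knots[i] <= x < knots[i + 1] else 0.0 for i in range(n0)]
    if x == knots[-1]:
        # Close the last non-empty span on the right.
        last = max(i for i in range(n0) if knots[i] < knots[i + 1])
        b[last] = 1.0
    for d in range(1, degree + 1):
        nb = []
        for i in range(len(knots) - d - 1):
            left = right = 0.0
            if knots[i + d] > knots[i]:
                left = (x - knots[i]) / (knots[i + d] - knots[i]) * b[i]
            if knots[i + d + 1] > knots[i + 1]:
                right = ((knots[i + d + 1] - x)
                         / (knots[i + d + 1] - knots[i + 1]) * b[i + 1])
            nb.append(left + right)
        b = nb
    return b


def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_spline(times, values, interior, degree=3, lam=0.0):
    """Penalized least-squares B-spline fit; returns (knots, coefficients)."""
    lo, hi = times[0], times[-1]
    # Clamped knot vector: boundary knots repeated degree + 1 times.
    knots = [lo] * (degree + 1) + list(interior) + [hi] * (degree + 1)
    design = [bspline_basis(t, knots, degree) for t in times]
    p = len(design[0])
    # Normal equations (B^T B + lam * D^T D) c = B^T y.
    gram = [[sum(row[i] * row[j] for row in design) for j in range(p)]
            for i in range(p)]
    rhs = [sum(row[i] * y for row, y in zip(design, values)) for i in range(p)]
    for i in range(p - 2):
        # One row of the second-difference operator D: (1, -2, 1).
        d = {i: 1.0, i + 1: -2.0, i + 2: 1.0}
        for r, vr in d.items():
            for c, vc in d.items():
                gram[r][c] += lam * vr * vc
    return knots, solve(gram, rhs)


def evaluate(x, knots, coef, degree=3):
    """Value of the fitted spline at x."""
    return sum(c * b for c, b in zip(coef, bspline_basis(x, knots, degree)))


# Synthetic series of 564 monthly values: slow trend plus a decadal cycle.
times = [float(t) for t in range(564)]
values = [12.0 + 0.002 * t + math.sin(2 * math.pi * t / 120.0) for t in times]
# 45 equally spaced interior knots, as in the abstract.
interior = [times[-1] * (j + 1) / 46.0 for j in range(45)]
knots, coef = fit_spline(times, values, interior)
fitted = [evaluate(t, knots, coef) for t in times]
```

Because every station shares the same knots, the coefficient vectors of different stations are directly comparable, which is what allows them to be stacked into a matrix for the subsequent PCA.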

The PAM algorithm clusters objects around k medoids, where k is specified in advance. At each iteration, medoids are chosen as representative items of the clusters, rather than computing the mean of the items in each cluster. Alternatively, the coordinates of k medoids can be supplied as initial medoids. For each object i, a comparison between its average distance from all other objects in the same cluster and the minimum of its average distances from the objects of each other cluster defines an index called the silhouette. A plot of the silhouettes of all objects serves both to determine the number of clusters and to identify which objects lie well within their clusters and which do not.
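
A compact sketch of the clustering and of the silhouette index is given below. It is illustrative only: the function names, the first-improvement swap rule, the default choice of initial medoids and the toy points are assumptions, and a production analysis would more likely rely on an established implementation.

```python
import math


def total_cost(dist, medoids):
    """Sum over objects of the distance to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def pam(dist, k, initial=None):
    """Swap-based k-medoids on a distance matrix (first-improvement variant).

    `initial` optionally supplies the starting medoid indices;
    by default the first k objects are used.
    """
    n = len(dist)
    medoids = list(initial) if initial is not None else list(range(k))
    cost = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                # Try replacing one medoid by a non-medoid object.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                trial_cost = total_cost(dist, trial)
                if trial_cost < cost - 1e-12:
                    medoids, cost, improved = trial, trial_cost, True
    labels = [min(range(k), key=lambda c: dist[i][medoids[c]])
              for i in range(n)]
    return medoids, labels


def silhouette(dist, labels):
    """Silhouette width s(i) = (b - a) / max(a, b) for each object (k >= 2)."""
    n = len(dist)
    clusters = sorted(set(labels))
    widths = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            widths.append(0.0)  # Convention for singleton clusters.
            continue
        # a: average distance to the other members of the same cluster.
        a = sum(dist[i][j] for j in own) / len(own)
        # b: smallest average distance to the members of another cluster.
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        widths.append((b - a) / max(a, b))
    return widths


# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
dist = [[math.dist(p, q) for q in points] for p in points]
medoids, labels = pam(dist, k=2)
widths = silhouette(dist, labels)
```

Silhouette widths close to 1 indicate objects lying well within their cluster, while values near 0 or below flag doubtful assignments; repeating the run for several values of k and comparing the average width is one common way to choose the number of clusters.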

Once a final grouping of the stations is established, a cross validation procedure is applied in order to quantify how many stations are correctly classified. An assessment based

Abstracts 140
