
12th International Conference on Urban Drainage, Porto Alegre/Brazil, 10-15 September 2011

Identification of dry and rainy periods using telecommunication microwave links

M. Kaufmann 1 and J. Rieckermann 1

1 Eawag - Swiss Federal Institute of Aquatic Science and Technology, Dübendorf, Switzerland

*Corresponding author, e-mail joerg.rieckermann@eawag.ch

Abstract

Microwave links from telecommunication networks (MWL) provide rainfall data at high spatial

(



Figure 1 The pre-processing of the received signal level (RSL) from microwave links (MWL) decomposes the RSL into the baseline (B) and the rain-induced attenuation (Atot). Pre-processing with an online algorithm (left) only uses past information, but can be applied in real-time data analysis and operation. The corresponding offline algorithm uses both past and future observations.

In this study, we therefore developed three novel methods to classify every measurement of the signal strength as belonging to either wet or dry periods: i) a moving window algorithm, ii) a statistical classification algorithm using random forests and iii) an algorithm based on a Gaussian factor graph, which is a rather novel signal-processing method. Our main innovations are that we suggest algorithms for offline and online data processing, compare their performance to that of other algorithms published in the literature and investigate how the different methods cope with different data qualities, i.e. temporal and power resolution. Our results show the lowest classification errors for precipitation for the random forest, although each algorithm has its pros and cons. Interestingly, the major influence factor is the quantization of the MWL signal. As expected, the results depend strongly on the training procedure and the training data of the algorithms.

CASE STUDY AREA IN ZURICH, CH<br />

MWL network: For our study, 14 links from a telecommunication network in the region of Zurich were available (Figure 2). For details see Rieckermann et al. (2009). The RSL was recorded as the instantaneous power from ten MWL with a “fine” quantization of 0.1 [dBm] and four with a “coarse” one (1 [dBm]: No. 2, 3, 6, 11), at a temporal resolution of about 2.5 minutes. The RSL was recorded from April 2009 until May 2010.

Rain gauges: Eleven rain gauges are present in the study area; the data were obtained from the local sewer operator, ERZ, and MeteoSwiss. We aggregated the tipping bucket recordings (volume: 0.1 mm) over ten minutes to account for the spatial and temporal variability of rainfall with regard to the path-averaged values of the MWL. This way, the measurable intensities were multiples of 0.6 mm/h (one tip of 0.1 mm per ten minutes). To evaluate the performance of the algorithms, we used the rain gauge closest to an individual MWL as the ground truth. This is naturally not exact, because, in addition to measurement errors, the point rainfall measured by a gauge does not correspond to the path-averaged rain intensities from MWLs. However, absolute values are not our primary interest in this study. Also, one rain gauge represents from less than 1 kilometre up to 8 kilometres of MWL length, which is in the lower range of the values found in the literature (Berne and Uijlenhoet, 2007; Leijnse et al., 2007; Zinevich et al., 2008).
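The aggregation of tipping-bucket recordings can be illustrated with a minimal Python sketch. It assumes that tips are given as timestamps in minutes and uses the bucket volume of 0.1 mm and the ten-minute interval stated above; the function name `tips_to_intensity` and the sample timestamps are made up for illustration and are not taken from the study.

```python
from collections import Counter

TIP_VOLUME_MM = 0.1  # rain depth per bucket tip, as stated in the text
WINDOW_MIN = 10      # aggregation interval in minutes


def tips_to_intensity(tip_times_min, n_windows):
    """Return the rain intensity [mm/h] for consecutive 10-minute windows.

    tip_times_min: tip times in minutes since the start of the record
    (hypothetical input format).
    """
    tips_per_window = Counter(int(t // WINDOW_MIN) for t in tip_times_min)
    per_hour = 60 / WINDOW_MIN  # converts mm per window into mm/h
    return [tips_per_window.get(i, 0) * TIP_VOLUME_MM * per_hour
            for i in range(n_windows)]


# Illustrative tips: one in the first window, three in the second, none in the third.
intensities = tips_to_intensity([4.0, 11.5, 13.0, 19.9], n_windows=3)
print([round(x, 1) for x in intensities])
```

Each window can only contain a whole number of tips, which is why the resulting intensities are multiples of 0.6 mm/h.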

Weather radar: Radar data are provided by MeteoSwiss (SMA) as a composite of the three weather radars in Switzerland (Germann et al., 2006). We deliberately chose not to use the radar data as ground truth in our study, because data from weather radars are rather uncertain (Krämer et al., 2005; Upton et al., 2005). However, to test their usefulness to inform the online and offline classification algorithms (see below), we filtered the data and set all radar measurements below 0.6 mm/h to zero.

Figure 2 Location of the selected ORANGE microwave links and the location of the available rain gauges.

CLASSIFYING RECEIVED SIGNAL LEVELS INTO WET AND DRY PERIODS

The moving window algorithm and its modification

Schleiss and Berne (2010) proposed a method to separate dry from rainy periods based on the assumption that, during dry periods, the standard deviation of the received signal level is bounded by some constant value that does not change with time and only depends on the physical characteristics (i.e., frequency, length, type of antenna) of the MWL. This leads to a simple decision rule: IF σ(t) > σ0 THEN “Wet” (W) ELSE “Dry” (D), where σ(t) is the standard deviation of the received signal level RSL(t) in the moving window Wt = [t-w, t], w > 0, and σ0 is a threshold parameter that has to be estimated from the data. For our investigation, we chose window lengths of ΔWt = {20, 30 min}. The threshold σ0 was computed from previously recorded data. Typically, several months of previously recorded data are required, because the fraction of rainy periods is small: σ0 = q1-r{σ̂(t)}, where q is the quantile and r is the fraction of rainy periods, which we determined to be 7.2% based on SMA rain data from May to December 2009. The attenuation baseline B(t) is then estimated in real time using the following algorithm:

(1): For each time index t ∈ D, set B(t) = RSLWt

(2): For each time index t ∈ W, set B(t) = B(t-k), where k is the smallest value such that t-k ∈ D.
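The decision rule and the two baseline steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the window length `n` is given as a number of samples rather than minutes, the dry-weather baseline in step (1) is assumed to be the window mean of the RSL, the quantile is a simple empirical one, and all function names and sample values are hypothetical.

```python
import statistics


def rolling_std(rsl, n):
    """Standard deviation over the last n samples (backward-looking window)."""
    return [statistics.pstdev(rsl[max(0, i - n + 1): i + 1])
            for i in range(len(rsl))]


def threshold_from_history(sigma_hist, rain_fraction):
    """sigma0 as the (1 - r) empirical quantile of historical window std values."""
    ordered = sorted(sigma_hist)
    idx = min(len(ordered) - 1, int((1 - rain_fraction) * len(ordered)))
    return ordered[idx]


def classify_and_baseline(rsl, n, sigma0):
    """Label every sample 'W' or 'D' and estimate the baseline B(t)."""
    labels, baseline = [], []
    for i, s in enumerate(rolling_std(rsl, n)):
        if s > sigma0:
            labels.append("W")
            # Step (2): hold the baseline of the most recent dry sample.
            # Fallback to the raw RSL if no dry sample exists yet (assumption).
            baseline.append(baseline[-1] if baseline else rsl[i])
        else:
            labels.append("D")
            # Step (1): window mean of the RSL (assumed form of the dry baseline).
            window = rsl[max(0, i - n + 1): i + 1]
            baseline.append(sum(window) / len(window))
    return labels, baseline


# Made-up RSL series [dBm] with an attenuation event in the middle.
rsl = [-40.0, -40.1, -40.0, -40.1, -43.0, -46.0, -44.0, -40.1, -40.0, -40.1, -40.0]
labels, baseline = classify_and_baseline(rsl, n=3, sigma0=0.5)
print("".join(labels))
```

Because the window only looks backward, samples keep being flagged as W until the attenuated values have left the window, which is one reason why the window length matters at coarse temporal resolution.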

We label this original algorithm by Schleiss and Berne (2010) “S1a” for ΔWt = 20 min and “S1b” for ΔWt = 30 min. We found that S1a and S1b show quite high class errors for the W category, probably because the temporal resolution of the RSL used in the original method was 6 seconds, whereas our data have a typical resolution of two to three minutes. To improve the detection of rain, we therefore modified the above decision rule by introducing an additional threshold for the mean of the moving window, mean(RSLWt), with ΔWt = 20 min (S2).

Figure 3 (Left) A simple classification tree with categories D and W based on the attributes RSL(t), σ(t), min(RSLWt), max(RSLWt) for a backward-looking moving window of length ΔWt = 20 min. (Right) Illustration of different moving windows to construct the attributes for a case of RSL(t). Online versions of algorithms only use past data, i.e. backward-looking windows. An observation is classified as D, if i) the attribute RSL(t)� 39.81 [dBm] (�), ii) RSL(t)� 39.81 AND σ(t)� 0.3202 AND max(RSL(t))< -40.81(�), etc.

Regressive classification trees and random forests

The aim of a regressive classification tree is to predict a classification for a “case” based on its attributes (also: “variables”) (Breiman et al., 1984) (Figure 3, left). Here, we classify an observation RSL(t) based on attributes computed from past (online) or both past and future (offline) data (e.g., σ(t), min(RSLWt), etc.) (Figure 3, right). We used the rpart package in the statistical computing language R, which maximizes the decrease in the Gini index (GI) (Gini, 1921) at every split. When the splitting stops, a classification (W or D) is assigned to every end node. By default, rpart does not attempt to split a node that contains fewer than 20 observations (Atkinson, 2000).
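To make the splitting criterion concrete, the following Python sketch computes the decrease in Gini impurity for one candidate split of a single attribute. It is a simplified illustration of the criterion and not the rpart implementation; the function names and the sample values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def gini_decrease(values, labels, threshold):
    """Decrease in Gini impurity when splitting the cases at value < threshold."""
    left = [lab for v, lab in zip(values, labels) if v < threshold]
    right = [lab for v, lab in zip(values, labels) if v >= threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Made-up window standard deviations with their dry/wet labels.
sigma = [0.05, 0.1, 0.2, 0.9, 1.4, 2.0]
label = ["D", "D", "D", "W", "W", "W"]
for thr in (0.15, 0.5):
    print(thr, round(gini_decrease(sigma, label, thr), 3))
```

A tree-growing routine would evaluate such decreases for all attributes and candidate thresholds and pick the split with the largest value.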

A random forest consists of many such trees and again relies on a training set with a total of m classified cases. Each tree of the forest is constructed using a different bootstrap sample, which is drawn with replacement from the original data (Breiman, 2001). The criteria for building the classification forest are the same as for the single classification tree. After the construction of the random forest, each new observation is classified by all trees of the forest: each tree gives a vote and the final category is assigned based on the majority of the votes. As the sum of all decreases in GI over the whole tree for an attribute (e.g., min(RSLWt)) gives an estimate of its importance, the mean decrease in GI (ΔGini) is a measure of the sensitivity of the classification to this attribute (Breiman, 2001). While we used rpart for the classification tree, we used the R package randomForest to construct the random forests for analyzing the MWL data.

For the offline random-forest algorithm, we calculated the following attributes from forward- and backward-looking windows (Figure 3, right) (the online versions relied only on past data) with ΔWt = {15, 30, 120 min}: i) RSL, ii) standard deviation (std), iii) slope of a fitted regression line (slope) and autocorrelation (lag=1) (autocor) (all relative), iv) min(RSL) (min), v) max(RSL) (max), vi and vii) the 10% and the 90% quantile (q10, q90; only for ΔWt = 120 min) (all absolute), viii) the last rain intensity obtained by the weather radar (RTS_past) and a binary variable (0 or 1) indicating whether it had rained in the previous 10-40 min (rain_10_40_past) (information on past rainfall). Our expectation was that information on past precipitation could improve the classification. Intrinsic MWL variables, such as the cross-correlation between the two communication channels, were also calculated but ultimately not used in the classification, because the intervals often had zero standard deviation, for which the correlation coefficient is not defined. We constructed various random forests based on different attributes [in brackets] as online (ON) or offline versions: RF1 [all attributes], RF2 [same as RF1, ON], RF3 [all-{RTS_past, rain_10_40_past}], RF4 [same as RF3, ON], RF5 [{RSL, min}] and RF6 [same as RF5, ON]. RF5 and RF6 consider only the most informative attributes, which we selected based on a high ΔGini value. All variables used for RF1-RF6 are listed in Table A2.

As the moving window algorithm repeatedly misclassified small rain amounts (see below), we introduced the additional category “light rain” (L) (D: 0, L: 0.6, W: > 0.6 [mm/h]). For training, we compiled sets of 1000 dry and 1000 rainy cases, randomly sampled from the desired analysis period. Although this does not correspond to the observed distribution of dry and wet periods, it improves the detection of wet periods, which are more relevant in our application.
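The window attributes listed above (std, slope, autocor, min, max) could be computed along the following lines. This Python sketch is an illustration under assumptions: it takes the RSL samples of one window as a plain list, expresses the slope per sample rather than per minute, and uses hypothetical names; it is not the code used in the study.

```python
import statistics


def window_attributes(rsl):
    """Compute simple attributes for one window of RSL samples (len >= 2)."""
    n = len(rsl)
    x_mean = (n - 1) / 2
    y_mean = sum(rsl) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(rsl))
    syy = sum((y - y_mean) ** 2 for y in rsl)
    lag1 = sum((rsl[i] - y_mean) * (rsl[i + 1] - y_mean) for i in range(n - 1))
    return {
        "std": statistics.pstdev(rsl),
        "slope": sxy / sxx,  # least-squares slope per sample
        # The lag-1 autocorrelation is undefined for a constant window.
        "autocor": lag1 / syy if syy > 0 else None,
        "min": min(rsl),
        "max": max(rsl),
    }


# Made-up window with a steadily falling signal level [dBm].
attrs = window_attributes([-40.0, -41.0, -42.0, -43.0])
print(attrs)
```

For an online version the window would contain only past samples, whereas an offline version could compute the same attributes on windows that extend into the future.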

Stochastic state space model of the baseline using a Gaussian factor graph

Here, we used a rather new model-based signal processing methodology based on factor graphs and message-passing algorithms, which include an element of forgetting to ensure a decreasing confidence in remote parts of the baseline model. For an introduction to the use of factor graphs in model-based signal processing see Loeliger et al. (2007), and for the idea of forgetting see Loeliger et al. (2009). The underlying idea of the algorithm developed by Reller et al. (2011) is to reconstruct the baseline and to classify the RSL(t) either as belonging to the baseline (ck=1, i.e., dry weather) or not (ck=0) (see Eq. 2). Important assumptions are that the data belonging to the baseline are locally smooth and that there is some periodicity in the observed signal over consecutive days (Figure A1). Therefore, the baseline is composed of two linear state space models on different time scales, which are referred to as the “fast” and the “slow” model. As an example, we briefly describe the fast model, which is formulated as a second-order linear state space model of a straight line (Eqs. 1 and 2), where the time interval is defined as Δk = tk - tk-1, with timestamp t and observations indexed by k = 1, ..., K. The state vector of the straight line, Xk, contains the RSL(tk) and the line slope at time tk (Reller et al., 2011).

$$X_k \overset{\gamma_k}{\approx} A_k X_{k-1} \qquad (1)$$

$$Y_k = C X_k + Z_k \qquad (2)$$

where

$$A_k = \begin{pmatrix} 1 & \Delta_k \\ 0 & 1 \end{pmatrix}, \qquad C = (1, 0),$$

$Z_k$ is Gaussian noise and $\gamma_k$ is a forgetting factor on the state $X_k$, which has a similar, but not the same, effect as state noise. $Y_k$ are the observed RSL. Details of the algorithm, such as the slow model, the message passing and the choice of threshold parameters for classification, are described in Reller et al. (2011). Due to limited time, we were only able to adjust the threshold parameter, but not to optimize the algorithm extensively.
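The deterministic part of the fast model can be illustrated with a short Python sketch. It only propagates the straight-line state (level and slope) with the matrix A_k and reads out the level with C; the forgetting factor, the Gaussian noise, the slow model and the message passing of Reller et al. (2011) are deliberately left out, and all names and numbers are hypothetical.

```python
def propagate(state, dt):
    """Apply X_k = A_k X_{k-1} with A_k = [[1, dt], [0, 1]]."""
    level, slope = state
    return (level + dt * slope, slope)


def observe(state):
    """Noise-free part of Eq. 2: C X_k with C = (1, 0)."""
    return state[0]


def predict_levels(initial_state, timestamps):
    """Predict the baseline level at each timestamp from an initial state."""
    state = initial_state
    levels = [observe(state)]
    for t_prev, t_next in zip(timestamps, timestamps[1:]):
        state = propagate(state, t_next - t_prev)
        levels.append(observe(state))
    return levels


# Made-up example: level -40 dBm, slope 0.5 dBm/min, irregular sampling times [min].
print(predict_levels((-40.0, 0.5), [0, 2, 4, 7]))
```

Because the time step enters A_k explicitly, the model copes naturally with irregular sampling intervals and gaps in the record.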

Figure 4 shows the results for a single rain event on 17.07.2009 and MWL 8. In the lower part of the figure, the observed RSL and the modeled baseline are plotted together with the timestamps where the fast and slow baseline models are connected (Δt = 6 hours). In the upper part, the baseline components are displayed and the residual signal is compared to the rain intensity measured by the gauge; the two are in good agreement (not scaled).

Performance evaluation and comparison of the algorithms

Figure 4 Signal decomposition of MWL No. 8 with a Gaussian factor graph. (Bottom) RSL and baseline model, (Top) fast and slow baseline components, residual attenuation and measured rain intensities. Plotted quantities: measured rain intensity R [mm/h]; residual [dBm] of only the fast model (upper, dark grey line) and of the combination of fast and slow model (lower, black line); B ± threshold [dBm] for only the fast model (grey lines) or the combination of the fast and slow model (black lines); RSL [dBm]; timestamp of the connection between the two sub-models. Time axis: 17.07. 12:00 to 18.07. 04:00.

The classification of every single RSL measurement was compared to the rain intensity that was measured at that time at the corresponding rain gauge of the MWL. The methods were applied to two different analysis periods: Period A (01.07.2009-31.10.2009) contains only liquid precipitation events. Period B (01.07.2009-31.12.2009) additionally contains snowfall and sleet. To evaluate the above algorithms, we computed the class errors for W and D based on the confusion matrix, which contains True Positives, False Positives, False Negatives and True Negatives:

Type I errors: dry class error (false rainfall alert) = False Positives / n_dry (3)

Type II errors: wet class error (missed rain) = False Negatives / n_wet (4)
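The class errors of Eqs. 3 and 4 can be computed from paired label sequences as in the following Python sketch. The function name and the sample labels are hypothetical; "W" is treated as the positive class, as in the definitions above.

```python
def class_errors(predicted, observed):
    """Return (dry class error, wet class error) as defined in Eqs. 3 and 4."""
    n_dry = observed.count("D")
    n_wet = observed.count("W")
    if n_dry == 0 or n_wet == 0:
        raise ValueError("both dry and wet observations are required")
    # False positive: classified wet although the gauge observed dry.
    false_pos = sum(p == "W" and o == "D" for p, o in zip(predicted, observed))
    # False negative: classified dry although the gauge observed rain.
    false_neg = sum(p == "D" and o == "W" for p, o in zip(predicted, observed))
    return false_pos / n_dry, false_neg / n_wet


# Made-up labels: gauge observations versus classifier output.
observed = ["D", "D", "D", "D", "W", "W"]
predicted = ["D", "W", "D", "D", "W", "D"]
dry_error, wet_error = class_errors(predicted, observed)
print(dry_error, wet_error)
```

Normalizing by the number of observed dry and wet cases separately keeps the rare wet class from being swamped by the much more frequent dry class.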

RESULTS AND DISCUSSION<br />

Moving window algorithms<br />

The results for period A demonstrate that the original algorithm (S1a and S1b) is prone to 25.4 - 64.9% of type II errors (Figure 5, left), which is unsatisfactory. It is remarkable that the type II errors of the MWL with a 0.1 dBm quantization were about half of those with a coarse quantization. For the coarse quantization MWL, we found that the wet class errors decreased with increasing ΔWt. In contrast, our modified algorithm, S2, presented lower wet class errors and higher dry class errors. Furthermore, S2 showed a superior performance regarding the true detections of dry and wet periods for all the investigated MWL in our study area (Figure 5, right). Period B included events with snowfall and ice particles, which have a different influence on millimeter microwave attenuation (Brussaard and Watson, 1994). Therefore, the wet class error for S1a and S1b increased to approximately 45% (coarse quantization: 75%). In contrast, the wet class error for S2 only increased to 20% (coarse quantization: 30%). As expected, the dry class error did not change substantially.

Figure 5 (Left) Class errors for S1a, S1b and S2 for ten MWL with a fine (0.1 dBm) and four with a coarse (1.0 dBm) quantization for period A. (Right) Proportion of the RSL measurements that were classified as wet by a rain gauge for the methods S1a, S1b and S2 for period A (mean over all 14 links). The class errors were calculated without offset.

Regressive classification trees and random forests

Similar to the above algorithms, we developed individual classifiers for the 14 MWL based on random forests and the various attributes computed from period A data. As described above, we classified dry (D) and rainy periods, the latter split into light (L) and strong (W). The direct comparison with S1-S2 shows rather favourable results for the random forest algorithms RF1-RF6, because fewer than 5% of all rainy cases {L,W} were misclassified as dry (results not shown). We found that the mean class errors for RF1-RF6 (averaged over all 14 links) for class W were about 20%, with 30-50% errors for L (light rain), which means that many light rains are wrongly classified as strong and vice versa. Also, the online algorithms perform competitively against the offline algorithms, because the class error for W is very similar and the errors for the two other classes only increase slightly (e.g., D: RF1/RF2 = 19.9%/22.2%). For period B, the class errors (averaged over all links and over RF1-RF6) increased by 9.9% (D), 3.9% (L) and 8.5% (W) (absolute values).

Figure 6 shows the average ΔGini for the offline algorithm RF1, which indicates the potential of the different attributes to identify dry or wet periods. First, we found that the important attributes depend on the quantization of the MWL. Second, min(RSL) (for ΔWt = {15, 30}) and the RSL itself are very informative for the MWL with 0.1 dBm resolution. Third, for the coarse MWL, by far the most informative attribute is the radar rainfall (RTS_past), and the attributes calculated from the RSL unfortunately do not contain much information. Interestingly, attributes computed from ΔWt = 120 min are very informative for the coarse MWL, but not for the fine ones. The online algorithms (RF2, RF4 and RF6), which only use past information, show similar patterns.

Gaussian factor graphs<br />

For the data from period A, we found that, for the fine (coarse) MWL, the factor graph produces 2% (7%) of type I errors and misclassifies about 35% (32%) of the W class (Figure 7). For period B, this is about 1% (5%) higher (absolute values). Nevertheless, the factor graph is a very elegant algorithm, because it can cope very well with pronounced signal fluctuations and even large gaps of missing data (Reller et al., 2011). In addition, the available MATLAB implementation is very fast compared to current implementations of the other algorithms. The classification performance could be further optimized by tuning, possibly accompanied by a post-processing step to filter out unrealistically frequent and small rain events.

Figure 6 Mean Gini decrease for algorithm RF1. ΔGini is the average decrease over ten 0.1 dBm and four 1.0 dBm links. Vertical axis: ΔGini (0 to 180). Series: 0.1 dB links and 1 dB links. Attributes on the horizontal axis: min15_fut, min30_past, min15_past, min30_fut, RSL, RTS_past, min120_past, std120_past, std30_past, min120_fut, std30_fut, std120_fut, max15_fut, q10_120_fut, q10_120_past, std15_past, std15_fut, max15_past, autocor120_past, slope30_fut, slope120_fut, slope30_past, max30_fut, autocor120_fut, slope120_past, max30_past, slope15_past, slope15_fut, q90_120_past, rain_10_40_past, autocor30_past, autocor30_fut, max120_fut, max120_past, autocor15_past, autocor15_fut.


Figure 7 Class errors for the two categories Dry and Wet. For the random forest, the two wet classes were merged into one.

DISCUSSION<br />

In general, our results indicate that all algorithms are able to successfully identify dry and wet periods based on MWL signals. In Figure 7 we compare the online algorithms S1a, S1b and S2 to RF4, which is based on similar attributes. For the factor graphs, only an offline algorithm was available. We found that the quantization had a great influence and therefore provide separate bars for the MWL with fine (bold) and coarse (shaded) quantization (Figure 7). In general, it can be seen that S2 shows smaller type II errors than S1a or S1b and that, of these type II errors, fewer correspond to heavy rains (Figure 5, right). However, we also notice that the cost of fewer type II errors lies in higher misclassification rates for dry periods.

Based on our results, we find that the choice of one specific algorithm strongly depends on the user’s preferences and the available resources. First, the absolute performance can be adjusted according to the user’s preference by varying the respective tuning parameters. For the original (S1) and extended (S2) moving window algorithms, the fraction of wet periods is a physically based parameter and should therefore not necessarily be changed. For the random forests, the composition of the training set and the weights assigned to each class can be tuned. For the factor graph, the threshold parameter could be further adjusted to lower the misclassification rate for the wet classes, which would automatically lead to an increased D misclassification. Second, such parameters could be optimized by introducing a cost or preference function. Ideally, this should be defined by the user of the algorithms and will be different for online and offline algorithms, as the user is probably interested in different kinds of rainfall data (e.g., real-time information to control waste water infrastructure vs. generated time series for the dimensioning of waste water infrastructure). This should also consider the absolute deviation from observed rainfall volumes.
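One possible form of such a cost function is a weighted sum of the two class errors, as in the following Python sketch. This is only an illustration of the idea: the linear form, the weights, the setting names and all error values are assumptions made up for the example and are not results of this study.

```python
def expected_cost(dry_error, wet_error, cost_false_alarm, cost_missed_rain):
    """Weighted sum of dry and wet class errors (assumed linear cost)."""
    return cost_false_alarm * dry_error + cost_missed_rain * wet_error


def pick_setting(candidates, cost_false_alarm, cost_missed_rain):
    """Return the name of the candidate setting with the lowest cost."""
    return min(
        candidates,
        key=lambda name: expected_cost(*candidates[name],
                                       cost_false_alarm, cost_missed_rain),
    )


# Hypothetical (dry class error, wet class error) pairs for two threshold settings.
candidates = {"strict": (0.02, 0.35), "lenient": (0.20, 0.10)}

print(pick_setting(candidates, cost_false_alarm=1, cost_missed_rain=1))
print(pick_setting(candidates, cost_false_alarm=5, cost_missed_rain=1))
```

The preferred setting changes with the weights, which reflects the point that an operator concerned about false alerts would tune the algorithms differently from one concerned about missed rain.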

CONCLUSIONS<br />

In this study, our goal was to develop appropriate offline and online algorithms for the pre-processing of MWL data to support urban drainage applications. We compared novel algorithms based on moving windows, random forests and factor graphs, a rather novel technique from signal processing, to those reported in the literature. Our results show that the appropriate online and offline algorithm strongly depends on the quantization and temporal resolution of the MWL data, which in turn depend on the deployed instruments. With regard to the individual algorithms we conclude that:

i) The improved moving window algorithm is easy to implement and provides an instant prediction of the baseline and the rain-induced attenuation. The method performs unsatisfactorily with a low temporal or power resolution.

ii) The power of the random forest algorithm is its flexibility, because it can readily consider various types of information. However, this supervised learning method requires a comprehensive ground truth and, to cope with seasonal variations, ideally a whole year of data should already be available. In our case study, offline and online algorithms performed similarly and the results were more robust to a coarse quantization than those of the moving window algorithm.

iii) The factor graph is a very elegant solution, as it decomposes the observed signals by means of stochastic state-space estimation. Although it performed well, even without intensive tuning, a post-processing step is suggested for MWL with a coarse quantization to avoid the prediction of too many small rain rates. This is the preferred method for low temporal resolution data, because of its underlying dynamic system model.

ACKNOWLEDGEMENTS<br />

We thank Michal Piotrowski and Johannes Graf, ORANGE SA, and Maciej Rudnicki from Alcatel-Lucent for kind provision of the MWL data and ERZ for rainfall information. Dr. Marcel Dettling provided valuable advice regarding statistical classification using random forests. We are most grateful to Prof. Loeliger, Christoph Reller and Juan Pablo Marín Díaz from the Signal and Information Processing Laboratory (ISI) of ETH for developing and implementing the novel factor graph algorithm.

REFERENCES<br />

Atkinson, E. J., et al. (2000) An Introduction to Recursive Partitioning Using the RPART Routines. Mayo Clinic Division for Biostatistics.
Berne, A. and Uijlenhoet, R. (2007) Path-averaged rainfall estimation using microwave links: Uncertainty due to spatial rainfall variability. Geophysical Research Letters, 34 (7).
Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J. (1984) Classification and Regression Trees. Chapman and Hall/CRC. ISBN 978-0412048418.
Breiman, L. (2001) Random forests. Machine Learning, 45 (1), 5-32.
Brussaard, G. and Watson, P. A. (1994) Atmospheric Modelling and Millimetre Wave Propagation. Springer-Verlag, New York.
Germann, U., Galli, G., Boscacci, M. and Bolliger, M. (2006) Radar precipitation measurement in a mountainous region. Q. J. Roy. Meteor. Soc., 132 (618), 1669-1692.
Gini, C. (1921) Measurement of inequality of incomes. Economic Journal, 31, 124-126.
Krämer, S., Verworn, H. R. and Redder, A. (2005) Improvement of X-band radar rainfall estimates using a microwave link. Atmospheric Research, 77 (1-4 SPEC. ISS.), 278-299.
Leijnse, H., Uijlenhoet, R. and Stricker, J. N. M. (2007) Hydrometeorological application of a microwave link. Part II: precipitation. Water Resour. Res., 43 (4), W04417.
Loeliger, H. A., Bolliger, L., Reller, C. and Korl, S. (2009) Localizing, forgetting, and likelihood filtering in state-space models. In “Information Theory and Applications Workshop, ITA 2009”, pp. 184-186.
Loeliger, H. A., Dauwels, J., Hu, J., Korl, S., Ping, L. and Kschischang, F. R. (2007) The factor graph approach to model-based signal processing. Proceedings of the IEEE, 95 (6), 1295-1322.
Messer, H., Zinevich, A. and Alpert, P. (2006) Environmental monitoring by wireless communication networks. Science, 312, 713.
Reller, C., Marín Díaz, J. P. and Loeliger, H. A. (2011) A model for quasi-periodic signals with application to rain estimation from microwave link gain. In “19th European Signal Processing Conference (EUSIPCO 2011), August 29-September 2, 2011”, Barcelona, Spain.
Rieckermann, J., Lüscher, R. and Krämer, S. (2009) Assessing urban precipitation using radio signals from a commercial communication network. In “8th International Workshop on Precipitation in Urban Areas, 10-13 December, 2009, St. Moritz, Switzerland”.
Schleiss, M. and Berne, A. (2010) Identification of dry and rainy periods using telecommunication microwave links. IEEE Geosci. Remote Sens. Lett.
Upton, G. J. G., Holt, A. R., Cummings, R. J., Rahimi, A. R. and Goddard, J. W. F. (2005) Microwave links: the future for urban rainfall measurements? Atmos. Res., 77 (1-4), 300-312.
Zinevich, A., Alpert, P. and Messer, H. (2008) Estimation of rainfall fields using commercial microwave communication networks of variable density. Adv. Water Resour., 31 (11), 1470-1480.


Appendix<br />

Table A1 Coordinates (in CH1903+ datum) and operating information on the MWL used in our study. The operation of the shaded link finished early (07.10.2009).

ID  Band   A Easting  A Northing  B Easting  B Northing  Length  Precision  Quantization  No. of channels
    [GHz]  [m]        [m]         [m]        [m]         [m]     [dBm/km]   [dBm]         [-]
1   23     686'678    245'929     682'375    242'720     5'368   0.019      0.1           1
2   23     681'022    249'720     679'900    246'950     2'989   0.335      1             1
3   23     681'022    249'720     684'005    248'145     3'373   0.296      1             1
4   23     679'900    246'950     687'883    244'409     8'378   0.012      0.1           1
5   23     687'292    251'528     682'330    256'182     6'803   0.015      0.1           1
6   38     682'580    248'400     682'853    247'657     792     1.263      1             1
7   38     681'552    248'965     681'342    248'164     828     0.121      0.1           1
8   38     679'455    245'037     681'291    247'127     2'782   0.036      0.1           2
9   38     685'101    248'105     684'538    245'285     2'876   0.035      0.1           1
10  38     686'678    245'929     684'004    246'127     2'681   0.037      0.1           1
11  38     686'549    244'845     685'868    245'322     831     1.203      1             2
12  38     681'690    249'449     682'580    248'400     1'376   0.073      0.1           1
13  58     682'853    247'657     682'870    247'348     309     0.388      0.1           1
14  58     687'292    251'528     687'457    251'109     450     0.266      0.1           1
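As a rough illustration of how the columns of Table A1 relate, the sketch below recomputes the link length as the planar distance between the two end points. It also divides the quantization step by the length in km, which reproduces the Precision column for the three links shown here. That interpretation of "Precision" is an inference from the numbers, not something stated in the text, and the function names are hypothetical.

```python
import math

# (ID, A easting, A northing, B easting, B northing, quantization in dBm),
# copied from Table A1 for links 1, 2 and 6.
LINKS = [
    (1, 686678, 245929, 682375, 242720, 0.1),
    (2, 681022, 249720, 679900, 246950, 1.0),
    (6, 682580, 248400, 682853, 247657, 1.0),
]


def link_length_m(a_east, a_north, b_east, b_north):
    """Planar distance in metres between the two ends of a link."""
    return math.hypot(b_east - a_east, b_north - a_north)


def quantization_per_km(quantization_dbm, length_m):
    """Quantization step divided by path length in km.

    Assumed meaning of the 'Precision' column; not confirmed by the source.
    """
    return quantization_dbm / (length_m / 1000.0)


for link_id, ae, an, be, bn, q in LINKS:
    length = link_length_m(ae, an, be, bn)
    print(link_id, round(length), round(quantization_per_km(q, length), 3))
# 1 5368 0.019
# 2 2989 0.335
# 6 792 1.263
```

The example illustrates why short links with a 1 dBm quantization are the hardest case: a single quantization step then corresponds to more than 1 dBm/km of path-averaged attenuation.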

[Figure A1: plot of RSL (dBm, axis range -70 to -40) against date, 01.06.2009 to 29.06.2009]

Figure A1 Observed variations of the RSL for MWL 1 during June 2009. The cyclic pattern represents normal variations during dry weather. In contrast, precipitation causes strong attenuation, with the RSL dropping as low as -80.8 dBm (19.06.2009).


Table A2 Attributes that were used for the different random forests (RF1-RF6). ON = online algorithm, using only past information. X = attribute used, - = attribute not used.

RF1, RF2: all attributes. RF3, RF4: all attributes w/o past rainfall information. RF5, RF6: the most informative attributes.

Attribute         RF1  RF2 (ON)  RF3  RF4 (ON)  RF5  RF6 (ON)
RSL               X    X         X    X         X    X
std15_past        X    X         X    X         -    -
min15_past        X    X         X    X         X    X
max15_past        X    X         X    X         -    -
slope15_past      X    X         X    X         -    -
autocor15_past    X    X         X    X         -    -
std15_fut         X    -         X    -         -    -
min15_fut         X    -         X    -         X    -
max15_fut         X    -         X    -         -    -
slope15_fut       X    -         X    -         -    -
autocor15_fut     X    -         X    -         -    -
std30_past        X    X         X    X         -    -
min30_past        X    X         X    X         X    X
max30_past        X    X         X    X         -    -
slope30_past      X    X         X    X         -    -
autocor30_past    X    X         X    X         -    -
std30_fut         X    -         X    -         -    -
min30_fut         X    -         X    -         X    -
max30_fut         X    -         X    -         -    -
slope30_fut       X    -         X    -         -    -
autocor30_fut     X    -         X    -         -    -
std120_past       X    X         X    X         -    -
min120_past       X    X         X    X         -    -
max120_past       X    X         X    X         -    -
slope120_past     X    X         X    X         -    -
autocor120_past   X    X         X    X         -    -
q10_120_past      X    X         X    X         -    -
q90_120_past      X    X         X    X         -    -
std120_fut        X    -         X    -         -    -
min120_fut        X    -         X    -         -    -
max120_fut        X    -         X    -         -    -
slope120_fut      X    -         X    -         -    -
autocor120_fut    X    -         X    -         -    -
q10_120_fut       X    -         X    -         -    -
q90_120_fut       X    -         X    -         -    -
rain_10_40_past   X    X         -    -         -    -
RTS_past          X    X         -    -         -    -
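The attribute names in Table A2 suggest window statistics (standard deviation, minimum, maximum, slope, autocorrelation) computed over past and future parts of the RSL series. The following is a minimal sketch of how such attributes could be derived, under several assumptions that the source does not confirm: the window width is counted in samples, the slope is a least-squares trend against the sample index, and the autocorrelation is taken at lag 1. The function names are hypothetical.

```python
import statistics


def window_features(values):
    """Return std, min, max, linear-trend slope and lag-1 autocorrelation."""
    n = len(values)
    mean = statistics.fmean(values)
    # Least-squares slope of the values against their sample index 0..n-1.
    t_mean = (n - 1) / 2
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - mean) for t, v in enumerate(values))
    # Lag-1 autocorrelation; defined here as 0.0 for a constant window.
    var = sum((v - mean) ** 2 for v in values)
    cov1 = sum((values[k] - mean) * (values[k + 1] - mean) for k in range(n - 1))
    return {
        "std": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
        "slope": sxy / sxx,
        "autocor": cov1 / var if var > 0 else 0.0,
    }


def past_and_future_features(rsl, i, width):
    """Attributes for sample i from the `width` samples before and after it.

    Requires width >= 2 and width <= i < len(rsl) - width.
    """
    past = window_features(rsl[i - width:i])
    fut = window_features(rsl[i + 1:i + 1 + width])
    feats = {"RSL": rsl[i]}
    feats.update({f"{k}{width}_past": v for k, v in past.items()})
    feats.update({f"{k}{width}_fut": v for k, v in fut.items()})
    return feats


# Made-up RSL values (dBm) with a drop that mimics the onset of rain.
rsl = [-50.0, -50.1, -49.9, -50.0, -52.0, -55.0, -58.0, -57.0, -53.0]
feats = past_and_future_features(rsl, i=4, width=4)
print(feats["min4_past"], feats["min4_fut"])  # -50.1 -58.0
```

An online variant in the sense of the ON columns would simply drop the `_fut` entries, since those require observations that are not yet available in real time.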



Figure A2 Close-up of the factor graph (adapted from Reller et al., 2011). The grey boxes depict noise terms.
