19.11.2014 Views

Localized Supervised Metric Learning on ... - Researcher - IBM

Localized Supervised Metric Learning on ... - Researcher - IBM

Localized Supervised Metric Learning on ... - Researcher - IBM

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Since P is a low-rank positive semi-definite matrix,<br />

we can decompose the precisi<strong>on</strong> matrix as P = WW T ,<br />

where W ∈ R N×d and d ≤ N. The distance metric<br />

can be rewritten as d m (x i , x) = ‖W T x i − W T x j ‖.<br />

Therefore, the distance metric is equivalent to euclidean<br />

distance over the low-dimensi<strong>on</strong>al projecti<strong>on</strong> W T x.<br />

4. Data Descripti<strong>on</strong> and Feature Extracti<strong>on</strong><br />

We have used the physiological data for 74 patients<br />

obtained from the MMIC II database [1] in our experiments.<br />

Each patient is represented with 5 streams<br />

of sensor readings, sampled at 1 minute intervals: 1)<br />

Sp02, 2) heart rate (HR), 3) mean ABP (ABPmean),<br />

(4) systolic ABP (ABPSys), and diastolic ABP (ABP-<br />

Dias). All patients bel<strong>on</strong>g to <strong>on</strong>e of two groups H or C.<br />

Those in group H (36 patients) had experienced Arterial<br />

Hypotensive Episode (AHE) events during the forecast<br />

window, whereas those in group C (38 patients) did not<br />

experience any AHE within the forecast window. The<br />

start of the forecast window is timestamped in the data<br />

set (T 0 ) and its durati<strong>on</strong> is 1 hour, in which an episode<br />

of AHE can occur. For this study, we focus <strong>on</strong> a 2-<br />

hour window around T 0 for each patient. Figure 2 illustrates<br />

the data from two patients, in which samples in<br />

H group show higher variability than those in C group.<br />

Physicians actually use the variability level of ABP to<br />

diagnose AHE [2].<br />

We have used two different schemes to represent<br />

the 2-hour temporal data for each patient: a statistical<br />

time domain method and a wavelet domain method. In<br />

the former, we compute the mean and variance of data<br />

from each sensor for each patient. Thus, each patient is<br />

represented in the time domain with a 10-dimensi<strong>on</strong>al<br />

vector. In the latter, the wavelet coefficients of the 2-<br />

hour window from each sensor are computed. We use<br />

Daubechies-4 Wavelet [3] and keep the top-10 coefficients.<br />

Finally, the coefficients from all 5 sensors are<br />

vectorized into a 50-dimensi<strong>on</strong>al feature vector for each<br />

patient.<br />

5. Experiments<br />

From the feature extracti<strong>on</strong> step described in secti<strong>on</strong><br />

4, we obtain 74 N-dimensi<strong>on</strong>al feature vectors<br />

where N = 10 for the statistic method and N = 50<br />

for the Wavelet method. We then compare the following<br />

three distance metrics using the leave-<strong>on</strong>e-out<br />

paradigm:<br />

• Expert uses Euclidean distance of the variance of<br />

the mean ABP as suggested in [2];<br />

• PCA uses Euclidean distance over lowdimensi<strong>on</strong>al<br />

points after PCA (an unsupervised<br />

metric learning algorithm);<br />

(a) Samples in H group<br />

(b) Samples in C group<br />

Figure 2. Examples of multivariate time<br />

series data for H and C groups. H<br />

group patients show higher variability<br />

than those in C group.<br />

• LSML using the localized supervised metric learning<br />

method described in secti<strong>on</strong> 3.<br />

Note that we do not make comparis<strong>on</strong>s with global supervised<br />

metric learning methods like LDA [4] because<br />

as shown in [5, 8], localized metric usually performs<br />

better. The performance metrics include k-NN classificati<strong>on</strong><br />

error rate and precisi<strong>on</strong>@10 retrieval results.<br />

The precisi<strong>on</strong>@10 of a query point is computed by retrieving<br />

10-nearest points with a specific distance metric<br />

and then computing the percentage of those retrieved<br />

points having the same label as the query point.<br />

Performance Comparis<strong>on</strong> To have a fair comparis<strong>on</strong>,<br />

both PCA and LSML project data into 1-<br />

dimensi<strong>on</strong>al space since the Expert method <strong>on</strong>ly uses<br />

<strong>on</strong>e feature, i.e., the variance of mean ABP. Table 1<br />

shows the classificati<strong>on</strong> results using 3-NN classifier,<br />

and Table 2 shows the retrieval results. As can be<br />

observed in both tables, LSML out-performs both expert<br />

and PCA <strong>on</strong> both statistical and Wavelet features,

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!