

Figure 3. Data set with 120 samples after 3-dimensional PCA projection (91% of the data was retained). The dots mark the 60 samples representing cells, the x's mark the 60 non-cell data points. The two circles indicate the clusters found by a k-means application with a search for two clusters. Obviously, k-means nicely differentiates between the cell and the non-cell components.

A perceptron with output dimension 1 consists of a single neuron only, so the output function y can be written as

y(x) = θ(w⊤x + w0)

with weight w ∈ R^n (n the input dimension), bias w0 ∈ R, and as activation function θ the Heaviside function (θ(x) = 0 for x < 0 and θ(x) = 1 for x ≥ 0). Often, the bias w0 is added to w as an additional weight with fixed input 1.
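As a minimal sketch of this definition (the function names and the NumPy dependency are our own choices, not part of the paper):

```python
import numpy as np

def heaviside(z):
    """Heaviside step function: 0 for z < 0, 1 for z >= 0."""
    return np.where(z >= 0.0, 1.0, 0.0)

def perceptron_output(x, w, w0):
    """Single-neuron perceptron output y(x) = theta(w'x + w0)."""
    return heaviside(w @ x + w0)
```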

Learning in a perceptron means minimizing the error energy function shown above. This can be performed, for example, by gradient descent with respect to w and w0, which induces the well-known delta rule for the weight update

Δw = η (y(x) − t) x,

where η denotes a chosen learning-rate parameter, y(x) the output of the neural network at sample x, and t the target observed for input x. It is easy to see that a perceptron separates the data linearly, with the boundary hyperplane given by

{x ∈ R^n | w⊤x + w0 = 0}.
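The delta rule itself amounts to only a few lines of code. The sketch below assumes per-sample (online) updates and applies the update with the sign that decreases the error, i.e. w ← w − η (y(x) − t) x; all names are illustrative:

```python
import numpy as np

def delta_rule_epoch(X, t, w, w0, eta=0.1):
    """One epoch of per-sample delta-rule learning.

    X: (N, n) array of samples, t: (N,) targets in {0, 1},
    w: (n,) weight vector, w0: scalar bias.
    """
    for x, target in zip(X, t):
        y = 1.0 if w @ x + w0 >= 0.0 else 0.0   # Heaviside output
        err = y - target
        w = w - eta * err * x    # descent step, equivalently Dw = eta (t - y) x
        w0 = w0 - eta * err
    return w, w0
```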

For the cell classifier, we use a single-unit perceptron with linear activation function in order to obtain a measure of the certainty of the cell/non-cell classification. Applying delta learning to the 5-dimensional data set from above already gives excellent performance after 4 epochs of batch learning. The final performance error (the variance of the perceptron's estimation error on the training set) after 55 epochs was 0.0038, which confirms the good performance as well as the linearity of the classification problem. This was further confirmed when we used a two-layer network with 5 hidden neurons in order to test for nonlinearities in the data set: after only 10 epochs the error was already very small, and it could finally be diminished to 3 · 10^−19. Still, the performance of the single perceptron is more than sufficient for classification.
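A sketch of the batch variant with linear activation, as used for the cell classifier; only the epoch count and the reported error metric come from the paper, while the learning rate and all names are assumptions:

```python
import numpy as np

def train_linear_perceptron(X, t, epochs=55, eta=0.01):
    """Batch delta-rule learning with linear activation y(x) = w'x + w0.

    Returns the weights, the bias, and the performance error, i.e. the
    variance of the estimation error y(x) - t over the training set.
    """
    N, n = X.shape
    w, w0 = np.zeros(n), 0.0
    for _ in range(epochs):
        err = X @ w + w0 - t            # per-sample estimation error
        w -= eta * (X.T @ err) / N      # batch gradient step on the MSE
        w0 -= eta * err.mean()
    return w, w0, np.var(X @ w + w0 - t)
```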

4 Confidence map

4.1 Generation

The cell classifier from above only has to be trained once. Given such a cell classifier, section pictures can now be analyzed as follows: a pixelwise scan of the image yields an image patch centered at each scan point; the cell classifier is then applied to this patch to give a probability of whether or not a cell is located at the given position. Altogether (after image extension at the boundaries), this yields a probability distribution over the whole image, called the confidence map. Each point of the confidence map is a value in [0, 1] stating how probable it is that a cell is depicted at the specified location.

In practice, a pixelwise scan is too expensive in terms of computation time, so instead a grid spacing is introduced, say 5 for 20 × 20 patches, and the picture is scanned only at every 5th pixel. This yields a rasterization of the original confidence map, which is still good enough to detect cells. Figure 4 shows the rasterized confidence map of a section part. The maxima of the confidence map correspond to the cell locations; small but non-zero values in the confidence map typically indicate misclassifications, which can be avoided by thresholding.
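The scanning procedure might look as follows; this is a sketch under our own assumptions (a grayscale image as a 2-D array, a `classifier` callable returning a value in [0, 1], edge padding as the boundary extension), not the authors' implementation:

```python
import numpy as np

def confidence_map(image, classifier, patch=20, grid=5):
    """Rasterized confidence map: visit every `grid`-th pixel, cut out
    the patch centred there, and store the classifier's cell probability."""
    h, w = image.shape
    r = patch // 2
    padded = np.pad(image, r, mode="edge")   # image extension at the boundaries
    rows, cols = range(0, h, grid), range(0, w, grid)
    cmap = np.zeros((len(rows), len(cols)))
    for i, y in enumerate(rows):
        for j, x in enumerate(cols):
            cmap[i, j] = classifier(padded[y:y + patch, x:x + patch])
    return cmap
```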

4.2 Evaluation

After the confidence map has been generated, it can be evaluated by simple maxima analysis. However, as seen in figure 4, maxima do not always correspond to cell positions, so thresholding of the confidence map is applied first; values of 0.5 to 0.8 yielded good results in experiments. Furthermore, the cell classifier may respond to the same cell more than once when applied to image patches with large overlap. Therefore, after a maximum has been detected, adjacent points in the confidence map within a given radius are also set to zero (15 to 18 were good values for 20 × 20 image patches). Iterative application of this algorithm then gives the final cell positions and hence the image segmentation.
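The evaluation step can be sketched as iterative peak picking with suppression. Whether the radius is measured in pixels or in confidence-map cells is our reading of the text, and all names and defaults are illustrative:

```python
import numpy as np

def extract_cells(cmap, threshold=0.8, radius=16, grid=5):
    """Iteratively take the global maximum of the thresholded confidence
    map as a cell position and zero out all entries within `radius`
    pixels of it, until nothing above the threshold remains."""
    cmap = np.where(cmap >= threshold, cmap, 0.0)   # thresholding first
    ii, jj = np.indices(cmap.shape)
    cells = []
    while cmap.max() > 0.0:
        i, j = np.unravel_index(np.argmax(cmap), cmap.shape)
        cells.append((i * grid, j * grid))          # back to pixel coordinates
        # suppress adjacent points of the confidence map within the radius
        near = (ii - i) ** 2 + (jj - j) ** 2 <= (radius / grid) ** 2
        cmap[near] = 0.0
    return cells
```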

5 Results

In practice, we used perceptron learning after preprocessing with PCA and also ICA [5], [6] in order to provide the learning algorithm with linearly separable data.

The patch size was chosen to be 20 × 20, a thresholding of 0.8 was applied in the confidence map, and the
