18.07.2014 Views

Semantic Interpretation of Digital Aerial Images Utilizing ...


Chapter 3. From Appearance and 3D to Interpreted Image Pixels

interpretation of images. On the one hand, the aerial images provide a nearly constant stuff object scale that is mainly defined by the GSD of the aerial project. The local interpretation can thus be performed at a single scale, which is mainly defined by the patch size. On the other hand, the huge variability of the mapped data and the undefined object shapes make a top-down recognition strategy intractable. In addition, the generation of training samples does not need entirely annotated objects, but rather an efficient assignment of object classes by drawing strokes. We therefore concentrate on a local explanation of the images and introduce the exact object extraction in a later step (see Chapter 5).

We extensively exploit statistical Sigma Points features [Julier and Uhlmann, 1996], directly derived from the well-established covariance descriptors [Tuzel et al., 2006], in combination with RF classifiers [Breiman, 2001] to compactly describe and classify color, basic texture and elevation measurements within local image regions. The combination of the derived statistical features and RF classifiers provides several advantages for large-scale computations in aerial imagery. Since the aerial imagery consists of multiple information sources, these low-level information cues need to be combined in a reasonable way. We therefore apply a Sigma Points feature representation to compactly describe different channels within a small local neighborhood. Compared to histograms computed over multi-spectral data, these descriptors are low-dimensional and enable a simple integration of appearance and height information, which is then represented in a Euclidean vector space. Moreover, they can be computed quickly for each pixel using integral image structures and also support parallel computation techniques.
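To make the feature construction more concrete, the following is a minimal sketch, not the implementation used in this work. It assumes NumPy, a Cholesky factor as the matrix square root of the covariance, and a scaling factor `alpha = sqrt(2)`; these choices, the small regularizer `eps`, and all function names are illustrative assumptions. The sketch computes the mean and covariance of a rectangular region from integral images of the channel values and of their pairwise products, in the spirit of [Tuzel et al., 2006], and then stacks the mean with the symmetric sigma points into one Euclidean feature vector.

```python
import numpy as np

def integral_tables(img):
    """Zero-padded integral images of the channel values and of their pairwise products."""
    img = np.asarray(img, dtype=np.float64)
    h, w, d = img.shape
    P = np.zeros((h + 1, w + 1, d))
    Q = np.zeros((h + 1, w + 1, d, d))
    P[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    products = img[..., :, None] * img[..., None, :]  # per-pixel outer products
    Q[1:, 1:] = products.cumsum(axis=0).cumsum(axis=1)
    return P, Q

def box_sum(T, y0, x0, y1, x1):
    """Sum over rows y0..y1-1 and columns x0..x1-1 using four table lookups."""
    return T[y1, x1] - T[y0, x1] - T[y1, x0] + T[y0, x0]

def region_mean_cov(P, Q, y0, x0, y1, x1):
    """Sample mean and unbiased sample covariance of all pixels in the region."""
    n = (y1 - y0) * (x1 - x0)
    s = box_sum(P, y0, x0, y1, x1)
    ss = box_sum(Q, y0, x0, y1, x1)
    mu = s / n
    cov = (ss - np.outer(s, s) / n) / (n - 1)
    return mu, cov

def sigma_points_feature(mu, cov, alpha=np.sqrt(2.0), eps=1e-8):
    """Stack the mean and the 2d sigma points mu +/- alpha * L_i into one vector.

    L is the Cholesky factor of the (slightly regularized) covariance and L_i
    its i-th column. alpha and eps are assumed values, not taken from the text.
    """
    d = mu.shape[0]
    L = np.linalg.cholesky(cov + eps * np.eye(d))
    offsets = alpha * L.T  # row i is alpha times column i of L
    return np.concatenate([mu, (mu + offsets).ravel(), (mu - offsets).ravel()])

# Synthetic example: d = 4 channels, e.g. three color channels plus height.
rng = np.random.default_rng(0)
image = rng.random((16, 16, 4))
P, Q = integral_tables(image)
mu, cov = region_mean_cov(P, Q, 4, 4, 11, 11)  # 7 x 7 patch
feature = sigma_points_feature(mu, cov)        # length d * (2d + 1) = 36
```

Once the two tables are built, the statistics of any rectangular patch cost a constant number of lookups regardless of the patch size, which is what makes a per-pixel evaluation feasible.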

Randomized forests have proven to give robust and accurate results for challenging multi-class classification tasks [Lepetit and Fua, 2006, Shotton et al., 2008]. RFs are very efficient at runtime since the final decision is based on fast binary tests on a small number of selected feature attributes. In addition, the classifier can be trained efficiently on a large amount of data and tolerates some errors in the training data.
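The runtime behavior described above can be illustrated with a small sketch of forest inference. It is a simplified illustration under stated assumptions, not the classifier used in this work: the `Node` structure, the function names, and the averaging of leaf posteriors over all trees are assumptions chosen for clarity, and training is omitted. Each tree compares one feature attribute with a threshold per node until it reaches a leaf, and the forest averages the class distributions stored in the reached leaves.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

@dataclass
class Node:
    """Split node (feature, threshold, children) or leaf (posterior only)."""
    feature: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    posterior: Optional[List[float]] = None  # class distribution, set at leaves only

def tree_posterior(node: Node, x: Sequence[float]) -> List[float]:
    """Follow one binary test per level until a leaf is reached."""
    while node.posterior is None:
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.posterior

def forest_predict(trees: Sequence[Node], x: Sequence[float]) -> Tuple[int, List[float]]:
    """Average the leaf posteriors of all trees and return (label, distribution)."""
    n_classes = len(tree_posterior(trees[0], x))
    avg = [0.0] * n_classes
    for tree in trees:
        for c, p in enumerate(tree_posterior(tree, x)):
            avg[c] += p / len(trees)
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg
```

The cost of classifying one feature vector is the sum of the tree depths, and each step reads a single attribute, so only a small subset of the feature vector is touched per decision.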

In the following sections we first review related work in the context of semantic image interpretation and classification. Then, we outline the core parts of our semantic interpretation, consisting of a powerful feature representation (Section 3.4), the classifier (Section 3.5) and refinement steps to obtain an improved final class labeling (see Sections 3.6 and 3.7).

3.3 Related Work<br />

While recently proposed approaches aim at extracting coarse scene geometry directly from interpretation results [Hoiem et al., 2007, Saxena et al., 2008, Gould et al., 2009, Liu et al., 2010] or try to jointly estimate classification and dense reconstruction [Ladicky et al., 2010b], we rather focus on directly integrating available 3D data in our interpre-
