
PhD Thesis Semi-Supervised Ensemble Methods for Computer Vision


Chapter 5

On-line Semi-Supervised Boosting

Although a large number of approaches to the SSL task have been proposed, most of them operate in off-line or batch mode¹. Recall from Chapter 2 that off-line methods assume access to the entire training data at any time, which eases optimization and typically yields good classifiers. In practice, however, on-line learning capability is often demanded because learners frequently have only limited access to the problem domain due to dynamic environments or streaming data sources. Additionally, on-line learning methods usually require less memory to operate, which makes them well suited for incremental learning problems, interactive learning, or cases where limited hardware resources do not allow for full data storage, for instance on mobile devices. Finally, in supervised learning, on-line methods are frequently applied to large-scale data problems due to their inherently good scaling characteristics.
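The memory argument can be made concrete with a toy example: an on-line estimator processes a stream one sample at a time with constant memory, whereas its batch counterpart must hold the full dataset. The running-mean sketch below is purely illustrative and not one of the methods discussed in this thesis:

```python
# Batch mode: requires the entire dataset in memory at once.
def batch_mean(xs):
    return sum(xs) / len(xs)

# On-line mode: consumes one sample at a time with O(1) memory,
# using the standard incremental (Welford-style) mean update.
class RunningMean:
    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n
```

Both compute the same quantity; only the access pattern to the data differs.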

These benefits also make on-line approaches interesting for semi-supervised learning, and they enable the use of SSL in scenarios that require sequential data analysis. Furthermore, the goal of SSL is to benefit from large amounts of unlabeled data. Yet, the poor scaling behavior of current state-of-the-art SSL methods, which can be up to O(n³) [Mann and McCallum, 2007], where n is the number of unlabeled samples, often limits their applicability in practice. This is particularly paradoxical in a field that claims to benefit from the huge amounts of unlabeled data that are often available for free. Hence, incorporating on-line learning techniques into SSL also has the potential to make such methods applicable to very large data sets. Although on-line learning can bring several advantages to semi-supervised learning, only a few approaches dealing with on-line SSL exist in the literature, e.g., methods based on deep neural networks [Weston et al., 2008] or convex programming in kernel space [Goldberg et al., 2008].

In this chapter, we discuss on-line semi-supervised learning algorithms based on on-line boosting [Oza, 2001]. The reasons for developing such algorithms are that boosting

¹ Note that throughout this text we use the two terms interchangeably.

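To make the referenced on-line boosting scheme concrete, the following is a minimal sketch of the Poisson-weighted update of [Oza, 2001] for binary classification. The weak learner (a per-feature nearest-class-mean rule) and all names are illustrative assumptions, not the weak learners used in this thesis; the clamping constants are likewise only numerical safeguards for the sketch:

```python
import math
import random

rng = random.Random(0)

def poisson(lam):
    # Knuth's inversion method for sampling k ~ Poisson(lam).
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= threshold:
            return k - 1

class MeanStump:
    """Illustrative weak learner: predicts the class whose running
    feature mean is closer to the sample's feature value."""
    def __init__(self, feature):
        self.f = feature
        self.sum = [0.0, 0.0]
        self.cnt = [0, 0]

    def update(self, x, y):
        self.sum[y] += x[self.f]
        self.cnt[y] += 1

    def predict(self, x):
        mean = [self.sum[c] / self.cnt[c] if self.cnt[c] else 0.0
                for c in (0, 1)]
        return 0 if abs(x[self.f] - mean[0]) <= abs(x[self.f] - mean[1]) else 1

class OzaBoost:
    def __init__(self, n_features, n_models=10):
        self.models = [MeanStump(m % n_features) for m in range(n_models)]
        self.lam_c = [1e-10] * n_models  # weight mass classified correctly
        self.lam_w = [1e-10] * n_models  # weight mass misclassified

    def update(self, x, y):
        lam = 1.0
        for m, h in enumerate(self.models):
            # Present the sample k ~ Poisson(lam) times to the weak learner.
            for _ in range(poisson(lam)):
                h.update(x, y)
            correct = h.predict(x) == y
            if correct:
                self.lam_c[m] += lam
            else:
                self.lam_w[m] += lam
            eps = self.lam_w[m] / (self.lam_c[m] + self.lam_w[m])
            eps = min(max(eps, 0.01), 0.99)  # clamp to keep lam finite
            lam *= 1.0 / (2.0 * (1.0 - eps)) if correct else 1.0 / (2.0 * eps)
            lam = min(lam, 10.0)  # cap to keep Poisson sampling cheap

    def predict(self, x):
        score = 0.0
        for m, h in enumerate(self.models):
            eps = self.lam_w[m] / (self.lam_c[m] + self.lam_w[m])
            eps = min(max(eps, 0.01), 0.99)
            vote = math.log((1.0 - eps) / eps)
            score += vote if h.predict(x) == 1 else -vote
        return 1 if score > 0 else 0
```

As in AdaBoost, samples that the current weak learners misclassify receive a larger weight (here realized by a larger Poisson parameter for subsequent learners), and the final prediction is a weighted majority vote of the weak learners.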
