Object Detection On Cell Broadband Engine - ISR-Coimbra

Object Detection On Cell Broadband Engine

Luciano Oliveira and Urbano Nunes
Institute of Systems and Robotics
Department of Electrical and Computer Engineering
Coimbra, Portugal
{lreboucas,urbano}@isr.uc.pt

Abstract

Many theoretically well-founded, efficient computer vision applications cannot run on-the-fly because of hardware limitations. To recognize objects in images and extract object information, a system usually comprises object searching, object classification and object tracking modules. Each of these modules may be highly time demanding, so they often have to be deployed on different computers to achieve an efficient runtime. On the other hand, a growing number of multi-core processors provide powerful processing for fully parallel implementations inside cost-effective systems. The main contribution of this paper is therefore to point out some directions for implementing such systems on the Cell Broadband Engine.

1 Introduction

A typical system for object recognition in still images is usually composed of three main modules: object searching, object recognition and object tracking. These modules are primarily conceived to be integrated, and the information transferred among them should provide more confidence in the decision of what or where the objects in the images are. The first two modules are probably the most computationally expensive since, once the objects are hypothesized, a Kalman-filter-based tracking system is almost costless. To build a complete object detection system for real-life applications, some design issues should be considered. The main one resides in obtaining a trade-off between high-performance object detection and on-the-fly implementation. The achievable frame rate usually decreases as the complexity of the applied algorithms increases.

2 Object Detection Pipeline

A typical object detection pipeline is illustrated in Fig. 1. This pipeline is commonly used to improve the confidence in the objects hypothesized in still images. Among the modules, the first two are usually built with highly time-demanding algorithms. The tracking module is generally the least computationally demanding.

Figure 1. Object detection pipeline

2.1 Object Searching

Finding objects in still images is a complex task, generally accomplished by one of two main methods: brute force and moving object segmentation. In the first category, a normalized image window runs through various scales and positions over the input frame. From the perspective of parallelization, brute force methods rely on different parallelization strategies (on the PS3 platform): i) instruction level parallelization in the Synergistic Processing Element (SPE); ii) image scale level parallelization, where each image scale is sent to an SPE to be processed. The parallelization of the latter strategy happens at the moment each scale of the image pyramid is sent to an SPE, so it is accomplished with a higher-level programming technique. Yet, if a scaled image is larger than the Local Store (LS) present in each SPE, one should think about other strategies to break the processing into multiple steps [1].

In moving object segmentation methods, the image Region of Interest (ROI) is usually found with faster methods, since the main objective is to segment foreground from background based on hints of movement. The success of this type of method resides in a good hypothesis technique based on various clues. Generally, optical flow and particle filters are used to estimate the difference between foreground and background. In this case, an instruction level parallelization technique should present better results, since a scale level approach is not usually feasible.

2.2 Object Classification

Object classification can be accomplished by supervised or unsupervised approaches. Several works use deterministic approaches to classify objects in images; some of them can be found in [2]-[4]. This type of classification system fits the acceleration approach implemented by the Cell, since that architecture is composed of inherently vector processors. Vectorization takes place by means of SIMD instructions, which are able to process multiple data in the same clock cycle. The main purpose of our work is to parallelize our detection algorithm [5]. The main idea of the approach is to use a couple of feature extractors and classifiers, combining them into a more expert system that achieves high performance on many datasets. The goal is thus to parallelize all aspects of the algorithms. To achieve this, a cluster of PS3s may provide a promising solution for implementing a brute force approach to our pedestrian detection system.

3 Cell Architecture

Cell is a heterogeneous multi-core architecture containing a dual-threaded processor unit, called the Power Processing Element (PPE), and eight co-processors called Synergistic Processing Elements (SPEs) [6]. To use the full power of the Cell processor inside the PS3, one must split programs into threads, assigning each one to an SPE.

3.1 Optimizing Object Detection Algorithms

Object detection algorithms appear to be well suited to Cell-optimized execution.
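
As a rough illustration of the scale level strategy from Section 2.1, the following Python sketch lists the pyramid levels of a frame, splits any level that exceeds a local-store budget into row strips, and deals the resulting work items round-robin to six SPEs. It only plans the work and is not Cell SDK code. The 256 KB local store is a property of the SPE; the half-store budget, the 1.2 scale step, the 64x128 minimum window, the one-byte-per-pixel grayscale format and all function names are assumptions made for this example only.

```python
LOCAL_STORE_BYTES = 256 * 1024   # size of one SPE local store
NUM_SPES = 6                     # number of SPEs the text sends stages to


def pyramid_sizes(width, height, scale_step=1.2, min_size=(64, 128)):
    """Return the (w, h) of each pyramid level, largest first.

    scale_step and min_size are illustrative values, not taken from the paper.
    """
    sizes = []
    w, h = width, height
    while w >= min_size[0] and h >= min_size[1]:
        sizes.append((w, h))
        w, h = int(w / scale_step), int(h / scale_step)
    return sizes


def split_into_strips(w, h, budget_bytes, bytes_per_pixel=1):
    """Split an image of w x h pixels into row ranges that fit the budget.

    Simplification: strips do not overlap. A real sliding-window detector
    would need neighbouring strips to overlap by (window height - 1) rows.
    """
    rows_per_strip = budget_bytes // (w * bytes_per_pixel)
    if rows_per_strip < 1:
        raise ValueError("a single image row does not fit in the budget")
    return [(y, min(y + rows_per_strip, h)) for y in range(0, h, rows_per_strip)]


def schedule(width, height, budget_bytes=LOCAL_STORE_BYTES // 2):
    """Assign (level, row_start, row_end) work items to SPEs round-robin.

    Half of the local store is reserved for pixels here (an assumption),
    leaving the rest for code, stack and result buffers.
    """
    queues = [[] for _ in range(NUM_SPES)]
    item = 0
    for level, (w, h) in enumerate(pyramid_sizes(width, height)):
        for y0, y1 in split_into_strips(w, h, budget_bytes):
            queues[item % NUM_SPES].append((level, y0, y1))
            item += 1
    return queues


if __name__ == "__main__":
    for spe, queue in enumerate(schedule(640, 480)):
        print(f"SPE {spe}: {queue}")
```

Round-robin assignment keeps the number of items per SPE balanced but ignores that coarser levels are cheaper to process; a work queue from which idle SPEs pull the next item would be one alternative.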
In the current state of our work, we have identified one main approach for using the whole power of the Cell processor to classify objects in images: pyramidal searching. This approach uses the multiresolution technique to create a pyramid of N stages, where each stage is the original image at a different resolution, and the stages are continuously sent to six SPEs at a time.

4 Conclusion

The most powerful units in the Cell architecture are undoubtedly the SPEs. These processors are mainly vector-driven, and SIMD instructions are pervasively used for performance gain. Within the object detection pipeline, in turn, many steps can be vectorized and are thus suitable for implementation either on a single Cell or on a cluster of Cells. Nowadays, the PS3 game console is a high-performance computational system with a great trade-off between cost and benefits. The next step is to build a complete object detection system using a cluster of PS3s.

References

[1] A. Felch, J. Nageswaran, A. Chandrashekar, J. Furlong, N. Dutt, R. Granger, A. Nicolau and A. Veidenbaum, “Accelerating Brain Circuit Simulations of Object Recognition with CELL Processors”, in International Workshop on Innovative Architecture for Future Generation Processors and Systems, 2007, pp. 33–42.
[2] C. Papageorgiou and T. Poggio, “A Trainable System for Object Detection”, International Journal of Computer Vision, vol. 38, 2005, pp. 15–33.
[3] N. Dalal and B. Triggs, “Histograms of Oriented Gradients for Human Detection”, IEEE International Conference on Computer Vision and Pattern Recognition, 2005, pp. 886–893.
[4] L. Oliveira, G. Monteiro, P. Peixoto and U. Nunes, “Towards a Robust Vision-based Obstacle Perception with Classifier Fusion in Cybercars”, Computer Aided System Theory (Eurocast 2007), Lecture Notes in Computer Science (LNCS), Springer-Verlag, vol. 4739, 2007, pp. 1089–1096.
[5] L. Oliveira and U. Nunes, “On Integration of Features and Classifiers for Robust Vehicle Detection”, IEEE Conference on Intelligent Transportation Systems, 2008 (submitted).
[6] D. Pham, S. Asano, M. Bolliger, M.N. Day, H.P. Hofstee, C. Johns, J. Kahle, A. Kameyama, J. Keaty, Y. Masubuchi, M. Riley, D. Shippy, D. Stasiak, M. Suzuoki, M. Wang, J. Warnock, S. Weitzel, D. Wendel, T. Yamazaki and K. Yazawa, “The Design and Implementation of a First-generation CELL Processor”, IEEE International Conference on Solid-State Circuits, 2005, pp. 184-592.
