
Robotics 2
AdaBoost for People and Place Detection

Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Wolfram Burgard
v.1.0, Kai Arras, Oct 09, including material by Luciano Spinello and Oscar Martinez Mozos


Contents
- Motivation
- AdaBoost
- Boosting Features for People Detection
- Boosting Features for Place Labeling


Motivation: People Detection
People tracking is a key technology for many problems and robotics-related fields:
- Human-Robot Interaction (HRI)
- Social Robotics: social learning, learning by imitation and observation
- Motion planning in populated environments
- Human activity and intent recognition
- Abnormal behavior detection
- Crowd behavior analysis and control
People detection is a key component therein.


Motivation: People Detection
Where are the people?


Motivation: People Detection
Where are the people? Why is it hard?
- Range data contain little information on people
- Hard in cluttered environments


Motivation: People Detection
Appearance of people in range data changes drastically with:
- Body pose
- Distance to sensor
- Partial occlusion and self-occlusion
(2D data from a SICK laser scanner)


Motivation: People Detection
Appearance of people in range data: 3D data from a Velodyne sensor


Motivation: People Detection


Motivation: People Detection
Freiburg main station


Motivation: People Detection
Freiburg Main Station data set: raw data


Motivation: People Detection
Freiburg Main Station data set: labeled data




Contents
- Motivation
- AdaBoost
- Boosting Features for People Detection
- Boosting Features for Place Labeling


Boosting
- Supervised learning technique: the user provides labeled training data
- Learns an accurate classifier by combining moderately inaccurate "rules of thumb"
- Inaccurate rules: weak classifiers
- Accurate rule: strong classifier
- Most popular algorithm: AdaBoost [Freund et al. 95], [Schapire et al. 99]
AdaBoost in Robotics: [Viola et al. 01], [Treptow et al. 04], [Martínez-Mozos et al. 05], [Rottmann et al. 05], [Monteiro et al. 06], [Arras et al. 07]


AdaBoost
What AdaBoost can do for you:
1. It tells you what the best "features" are,
2. what the best thresholds are, and
3. how to combine them into a classifier.
- It is a non-linear classifier
- Robust to overfitting
- Very simple to implement
- Classifier design becomes science, not art


AdaBoost
- A machine learning ensemble technique (also called committee methods)
- There are other ensemble techniques such as bagging, voting, etc.
- Combines an ensemble of weak classifiers (weak learners) to create a strong classifier
- Prerequisite: each weak classifier is better than chance, i.e., error < 0.5


Classification
Linear vs. non-linear classifiers (figure: a linear (L) and a non-linear (NL) decision boundary)


Classification
Formal problem statement: given a training set {(x_1, y_1), ..., (x_n, y_n)} of values x_i and labels y_i, produce a function that maps values to labels.
What is overfitting? Fitting a statistical model with too many parameters. Overfitted models explain the training data perfectly, BUT they cannot generalize!


Classification
Error types (T' = detected, N' = not detected):

                      Predicted T'      Predicted N'
    True value T      True Positive     False Negative
    True value N      False Positive    True Negative

- Sensitivity (True Positive Rate) = TP / T
- Specificity (True Negative Rate) = TN / N
- ...and many more measures
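
Both rates are easy to compute from predictions. A minimal sketch in Python (illustrative code, not from the slides), assuming labels in {-1, +1}:

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / T and specificity = TN / N for labels in {-1, +1}."""
    tp = np.sum((y_true == 1) & (y_pred == 1))    # true positives
    tn = np.sum((y_true == -1) & (y_pred == -1))  # true negatives
    t = np.sum(y_true == 1)                       # all actual positives
    n = np.sum(y_true == -1)                      # all actual negatives
    return tp / t, tn / n
```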


AdaBoost: Weak Classifier
Alternatives:
- Decision stump: single axis-parallel partition of the space
- Decision tree: hierarchical partition of the space
- Multi-layer perceptron: general non-linear function approximator
- Radial basis functions: non-linear expansions based on kernels


AdaBoost: Weak Classifier
Decision stump:
- Simplest type of decision tree
- Equivalent to a linear classifier defined by an affine hyperplane
- The hyperplane is orthogonal to the axis it intersects at threshold θ
- Commonly not used on its own
Formally,

    h(x; j, θ) = +1 if x_j > θ, −1 otherwise

where x is a training sample and j is the dimension.
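
A minimal sketch of such a stump in Python (function and argument names are my own, not from the slides):

```python
import numpy as np

def stump_predict(X, j, theta):
    """h(x; j, theta) = +1 if x_j > theta, else -1, applied to each row of X."""
    return np.where(X[:, j] > theta, 1, -1)
```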


AdaBoost: Algorithm
Given the training data (x_1, y_1), ..., (x_n, y_n):

1. Initialize the weights w_i = 1/n
2. For m = 1, ..., M:
   - Train a weak classifier h_m(x) on the weighted training data, minimizing Σ_i w_i I(y_i ≠ h_m(x_i))
   - Compute the error of h_m(x):  e_m = Σ_{i=1}^n w_i I(y_i ≠ h_m(x_i)) / Σ_{i=1}^n w_i
   - Compute the voting weight of h_m(x):  α_m = log((1 − e_m) / e_m)
   - Recompute the weights:  w_i = w_i · exp{α_m · I(y_i ≠ h_m(x_i))}
3. Make predictions using the final strong classifier
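
This loop translates almost line for line into code. A minimal sketch assuming brute-force decision stumps as the weak learners (all names are illustrative; a faster stump search via cumulative sums appears later in these slides):

```python
import numpy as np

def train_weak(X, y, w):
    """Brute-force decision stump minimizing the weighted error."""
    best = (0, 0.0, 1, np.inf)                       # (j, theta, p, error)
    for j in range(X.shape[1]):
        for theta in np.unique(X[:, j]):
            for p in (1, -1):                        # direction of the inequality
                pred = np.where(p * X[:, j] > p * theta, 1, -1)
                err = np.sum(w * (pred != y)) / np.sum(w)
                if err < best[3]:
                    best = (j, theta, p, err)
    return best

def adaboost_train(X, y, M):
    n = X.shape[0]
    w = np.full(n, 1.0 / n)                          # 1. initialize weights
    stumps, alphas = [], []
    for m in range(M):                               # 2. boosting rounds
        j, theta, p, e_m = train_weak(X, y, w)       # train on weighted data
        e_m = np.clip(e_m, 1e-10, 1 - 1e-10)         # guard against e_m = 0
        alpha = np.log((1 - e_m) / e_m)              # voting weight
        pred = np.where(p * X[:, j] > p * theta, 1, -1)
        w = w * np.exp(alpha * (pred != y))          # upweight the mistakes
        stumps.append((j, theta, p))
        alphas.append(alpha)
    return stumps, alphas                            # 3. used by H(x) below
```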


AdaBoost: Strong Classifier
Computing the voting weight α_m of a weak classifier:

    α_m = log((1 − e_m) / e_m)

(Figure: α_m as a function of the error e_m; at chance level e_m = 0.5 the voting weight drops to α_m = 0.)


AdaBoost: Strong Classifier
Training is completed... the weak classifiers h_{1...M}(x) and their voting weights α_{1...M} are now fixed. The resulting strong classifier is

    H(x_i) = sgn( Σ_{m=1}^M α_m h_m(x_i) )

which maps the input data to a class result in {+1, −1}: a weighted majority voting scheme.
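
Prediction is then a weighted majority vote. A sketch continuing the illustrative names of the training code above:

```python
import numpy as np

def adaboost_predict(X, stumps, alphas):
    """Strong classifier H(x) = sgn(sum_m alpha_m * h_m(x))."""
    votes = np.zeros(X.shape[0])
    for (j, theta, p), alpha in zip(stumps, alphas):
        votes += alpha * np.where(p * X[:, j] > p * theta, 1, -1)
    return np.sign(votes)                 # class result in {+1, -1}
```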


AdaBoost: Weak Classifier
Train a decision stump on the weighted data. This consists in finding an optimal parameter θ* for each dimension j = 1...d and then selecting the j* for which the cost is minimal:

    (j*, θ*) = argmin_{j,θ} [ Σ_{i=1}^n w_i I(y_i ≠ h(x_i; j, θ)) / Σ_{i=1}^n w_i ]


AdaBoost: Weak Classifier
A simple training algorithm for stumps:

    for j = 1...d
        sort the samples x_i in ascending order along dimension j
        for i = 1...n
            compute the cumulative sums w_cum^j(i) = Σ_{k=1}^i w_k y_k
        the threshold θ_j is at the extremum of w_cum^j
        the sign of the extremum gives the direction p_j of the inequality
    the global extremum over all d sums gives the threshold θ* and dimension j*
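
A sketch of this faster O(d · n log n) procedure (my own code; the direction convention assumes h(x) = +1 iff p · x_j > p · θ):

```python
import numpy as np

def train_stump_fast(X, y, w):
    """Stump training via sorted cumulative sums of w_k * y_k."""
    best = (0, 0.0, 1, 0.0)                   # (j*, theta*, p, |extremum|)
    for j in range(X.shape[1]):
        order = np.argsort(X[:, j])           # sort along dimension j
        w_cum = np.cumsum(w[order] * y[order])
        i = np.argmax(np.abs(w_cum))          # extremum of the cumulative sum
        if np.abs(w_cum[i]) > best[3]:
            theta = X[order[i], j]            # threshold at the extremum
            # positive extremum: the +1 labels concentrate below theta,
            # so the inequality points downward (p = -1), and vice versa
            p = -1 if w_cum[i] > 0 else 1
            best = (j, theta, p, np.abs(w_cum[i]))
    return best[:3]
```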


AdaBoost: Weak Classifier
Training algorithm for stumps, intuition: with labels y (red: +1, blue: −1) and all weights w_i = 1, the cumulative sum w_cum^j(i) = Σ_{k=1}^i w_k y_k reaches its extremum at the best split, here θ*, j* = 1.




AdaBoost: Step-By-Step
Training data


AdaBoost: Step-By-Step
Iteration 1, train weak classifier 1. Threshold θ* = 0.37, dimension j* = 1, weighted error e_m = 0.2, voting weight α_m = 1.39. Total error = 4.


AdaBoost: Step-By-Step
Iteration 1, recompute weights. Threshold θ* = 0.37, dimension j* = 1, weighted error e_m = 0.2, voting weight α_m = 1.39. Total error = 4.


AdaBoost: Step-By-Step
Iteration 2, train weak classifier 2. Threshold θ* = 0.47, dimension j* = 2, weighted error e_m = 0.16, voting weight α_m = 1.69. Total error = 5.


AdaBoost: Step-By-Step
Iteration 2, recompute weights. Threshold θ* = 0.47, dimension j* = 2, weighted error e_m = 0.16, voting weight α_m = 1.69. Total error = 5.


AdaBoost: Step-By-Step
Iteration 3, train weak classifier 3. Threshold θ* = 0.14, dimension and sign j* = 2, neg; weighted error e_m = 0.25, voting weight α_m = 1.11. Total error = 1.


AdaBoost: Step-By-Step
Iteration 3, recompute weights. Threshold θ* = 0.14, dimension and sign j* = 2, neg; weighted error e_m = 0.25, voting weight α_m = 1.11. Total error = 1.


AdaBoost: Step-By-Step
Iteration 4, train weak classifier 4. Threshold θ* = 0.37, dimension j* = 1, weighted error e_m = 0.20, voting weight α_m = 1.40. Total error = 1.


AdaBoost: Step-By-Step
Iteration 4, recompute weights. Threshold θ* = 0.37, dimension j* = 1, weighted error e_m = 0.20, voting weight α_m = 1.40. Total error = 1.


AdaBoost: Step-By-Step
Iteration 5, train weak classifier 5. Threshold θ* = 0.81, dimension j* = 1, weighted error e_m = 0.28, voting weight α_m = 0.96. Total error = 1.


AdaBoost: Step-By-Step
Iteration 5, recompute weights. Threshold θ* = 0.81, dimension j* = 1, weighted error e_m = 0.28, voting weight α_m = 0.96. Total error = 1.


AdaBoost: Step-By-Step
Iteration 6, train weak classifier 6. Threshold θ* = 0.47, dimension j* = 2, weighted error e_m = 0.29, voting weight α_m = 0.88. Total error = 1.


AdaBoost: Step-By-Step
Iteration 6, recompute weights. Threshold θ* = 0.47, dimension j* = 2, weighted error e_m = 0.29, voting weight α_m = 0.88. Total error = 1.


AdaBoost: Step-By-Step
Iteration 7, train weak classifier 7. Threshold θ* = 0.14, dimension and sign j* = 2, neg; weighted error e_m = 0.29, voting weight α_m = 0.88. Total error = 1.


AdaBoost: Step-By-Step
Iteration 7, recompute weights. Threshold θ* = 0.14, dimension and sign j* = 2, neg; weighted error e_m = 0.29, voting weight α_m = 0.88. Total error = 1.


AdaBoost: Step-By-Step
Iteration 8, train weak classifier 8. Threshold θ* = 0.93, dimension and sign j* = 1, neg; weighted error e_m = 0.25, voting weight α_m = 1.12. Total error = 0.


AdaBoost: Step-By-Step
Iteration 8, recompute weights. Threshold θ* = 0.93, dimension and sign j* = 1, neg; weighted error e_m = 0.25, voting weight α_m = 1.12. Total error = 0.


AdaBoost: Step-By-Step
Final strong classifier. Total error = 0! ...which is rarely met in practice. (The colored background dots in the figure are the test set.)


AdaBoost: Wrap Up
- Misclassified samples receive higher weights
- The higher the weight, the "more attention" a training sample gets
- The algorithm generates weak classifiers by training the next learner on the mistakes of the previous one
- Now we also understand the name: Adaptive Boosting → AdaBoost
- Once training is done, AdaBoost is a voting method
- Large impact on the ML community and beyond


Contents
- Motivation
- AdaBoost
- Boosting Features for People Detection
- Boosting Features for Place Labeling


Problem and Approach
- Can we find robust features for people, legs, and groups of people in 2D range data?
- What are the best features for people detection?
- Can we find people that do not move?
Approach:
- Classifying groups of adjacent beams
- Computing a set of simple (scalar) features on these groups
- Boosting the features


Related Work
People tracking: [Fod et al. 2002], [Kleinhagenbrock et al. 2002], [Schulz et al. 2003], [Scheutz et al. 2004], [Topp et al. 2005], [Cui et al. 2005], [Schulz 2006], [Mucientes et al. 2006]
SLAM in dynamic environments: [Montemerlo et al. 2002], [Hähnel et al. 2003], [Wang et al. 2003], ...
- People detection done with very simple classifiers: manual feature selection, heuristic thresholds
- Typically: narrow local-minima blobs that move


Segmentation
Divide the scan into segments ("range image segmentation")


Segmentation
(Figure: raw scan, segmented scan, feature profiles)
- Method: jump distance condition (see [Premebida et al. 2005] for a survey; a code sketch follows below)
- Size filter: rejection of too-small segments


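
A jump-distance segmenter fits in a few lines. A minimal sketch (the threshold and minimum size are placeholder values; see [Premebida et al. 2005] for more robust conditions):

```python
def segment_scan(ranges, jump=0.3, min_points=3):
    """Split consecutive beam indices into segments at range jumps > `jump` (m)."""
    segments, current = [], [0]
    for i in range(1, len(ranges)):
        if abs(ranges[i] - ranges[i - 1]) > jump:   # jump distance condition
            segments.append(current)
            current = []
        current.append(i)
    segments.append(current)
    return [s for s in segments if len(s) >= min_points]  # size filter
```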


Features (of a segment)
1. Number of points
2. Standard deviation
3. Mean average deviation from median
4. Jump distance to preceding segment
5. Jump distance to succeeding segment
6. Width


Features (of a segment; the figure shows a fitted circle with radius r and center c)
7. Linearity
8. Circularity
9. Radius


Features (of a segment)
10. Boundary length
11. Boundary regularity
12. Mean curvature
13. Mean angular difference
14. Mean speed


Features
Resulting feature signature for each segment
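
Each entry of the signature is one scalar per segment. A sketch of features 1-3 and 6 (my own implementation, assuming the segment is given as an (n, 2) array of Cartesian points):

```python
import numpy as np

def segment_features(points):
    """Some of the 14 scalar features for one segment (points: n x 2 array)."""
    n = len(points)                                         # 1. number of points
    centroid = points.mean(axis=0)
    std = np.sqrt(np.sum((points - centroid) ** 2) / n)     # 2. standard deviation
    median = np.median(points, axis=0)
    mad = np.mean(np.linalg.norm(points - median, axis=1))  # 3. mean avg. deviation from median
    width = np.linalg.norm(points[0] - points[-1])          # 6. width (first to last point)
    return np.array([n, std, mad, width])
```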


Training: Data Labeling
Mark segments that correspond to people, either manually or automatically


Training: Data Labeling
Automatic labeling: the obvious approach is to define an area of interest (figure: 3 m × 4 m region).
Here: discrimination from clutter is interesting, and the features include the spatial relation between foreground and background. Thus: labeling is done by hand.


Training
Resulting training set:
+1: segments corresponding to people (foreground segments)
−1: segments corresponding to other objects (background segments)


AdaBoost: Weak Classifiers
Each binary weak classifier h_j(x) is created using a single-valued feature f_j in the form:

    h_j(x) = +1 if p_j · f_j(x) < p_j · θ_j, −1 otherwise

where θ_j is a threshold and p_j ∈ {−1, 1} indicates the direction of the inequality. Each weak classifier must be better than random.


AdaBoost: Final Strong Classifier
(Figure: strong binary classifier. Examples 1...N are described by the vocabulary of features f#1...f#14; boosting combines the weighted weak classifiers w_1 h_1, ..., w_T h_T into a sum Σ with output in {−1, 1}: a weighted majority vote classifier.)


Experiments
Env. 1: Corridor, no clutter
Env. 2: Office, very cluttered


Experiments
Env. 1 & 2: Corridor and Office
Env. 1→2: Cross-evaluation, trained in the corridor, applied in the office


Experiments
Adding the motion feature (mean speed, f#14) → the motion feature has no contribution.
Experimental setup: robot Herbert, SICK LMS 200 laser rangefinder, 1 deg angular resolution.


Experiments
Comparison with a hand-tuned classifier:
- Jump distance θ_δ = 30 cm
- Width θ_w,m = 5 cm, θ_w,M = 50 cm
- Number of points θ_n = 4
- Standard deviation θ_σ = 50 cm
- Motion of points θ_v = 2 cm
People are often not detected.
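
Such a hand-tuned baseline amounts to a conjunction of fixed thresholds. A sketch using the values above (how the rules combine is my assumption; the dictionary keys are illustrative, units in meters and m/s):

```python
def hand_tuned_is_person(seg):
    """Heuristic baseline: every threshold rule must hold for a detection."""
    return (seg["jump_dist"] > 0.30 and       # jump distance
            0.05 < seg["width"] < 0.50 and    # min/max width
            seg["num_points"] >= 4 and        # number of points
            seg["std"] < 0.50 and             # standard deviation
            seg["speed"] > 0.02)              # motion of points
```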


Experiments
The five best features:
1. Radius of the LSQ-fitted circle, a robust size measure (#9)
2. Mean angular difference, a convexity measure (#13)
3/4. Jump distances, local-minima measures (#4 and #5)
5. MAD from median, a robust compactness measure (#3)


Result: Classification
(Figure: classification results; T marks true and F marks false classifications.)


Summary
- People detection phrased as a classification problem over groups of neighboring beams
- AdaBoost allows for a systematic approach to perform this task
- One-shot/single-frame people detection with over 90% accuracy
- The learned classifier is clearly superior to the hand-tuned classifier
- No background knowledge such as an a priori map is needed (e.g., to perform background subtraction)


Contents
- Motivation
- AdaBoost
- Boosting Features for People Detection
- Boosting Features for Place Labeling


Place Labeling: Motivation
A map is a metric and topological model of the environment


Place Labeling: Motivation
Wanted: semantic information about places: Room, Corridor, Doorway


Scenario Example
"Albert, where are you?" "I am in the corridor."


Scenario Example 2
Semantic mapping: Corridor, Room, Doorway
Human-robot interaction of the type: "Robot, get out of my room, go into the corridor!"


Problem Statement
Classification of the position of the robot using a single observation: a 360° laser range scan


Observations


Observations: Room




Observations: Room, Doorway




Observations: Corridor, Room, Doorway


Similar Observations


Similar Observations: Corridor, Doorway


Classification Problem




Classification Problem
? → Corridor, Room, or Doorway


Representing the Observations
How do we represent the 360 laser beams for our classification task? As a list of beams?
Problem: which beam is the first beam? Not invariant to rotation! (Figure: two rotated scans of the same place, ≠)


Representing the Observations
As a list of scalar geometrical features of the scan: the features are all invariant to rotation. (Figure: two rotated scans of the same place, =)


Simple Features
- Beam lengths: f = Σ d_i
- Gap: a gap is where d > θ; f = number of gaps
- Minimum: f = d, the length of the shortest beam
- Area: f = area enclosed by the scan
- Perimeter: f = perimeter of the scan


Simple Features
Features of the raw beams


Simple Features
Features of the closed polygon P(z) made up by the beams
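
A sketch of a few of these rotation-invariant features, computed from the raw beams and from the closed polygon of beam endpoints (the gap threshold is a placeholder value):

```python
import numpy as np

def scan_features(ranges, gap_threshold=0.5):
    """Rotation-invariant scalars of one 360-degree scan (ranges: np.ndarray, meters)."""
    diffs = np.abs(np.diff(ranges, append=ranges[0]))       # wrap-around differences
    n_gaps = int(np.sum(diffs > gap_threshold))             # f = number of gaps
    angles = np.linspace(0.0, 2 * np.pi, len(ranges), endpoint=False)
    xs, ys = ranges * np.cos(angles), ranges * np.sin(angles)
    # area (shoelace formula) and perimeter of the closed polygon P(z)
    area = 0.5 * abs(np.sum(xs * np.roll(ys, -1) - np.roll(xs, -1) * ys))
    perimeter = np.sum(np.hypot(np.roll(xs, -1) - xs, np.roll(ys, -1) - ys))
    return np.array([ranges.min(), ranges.mean(), n_gaps, area, perimeter])
```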


Multiple Classes
Corridor (1), Room (2), Doorway (3)




Multiple Classes
Sequence of binary classifiers in a decision list (sketched in code below):
Corridor classifier: H(x) = 1 → Corridor; H(x) = −1 → Room classifier: H(x) = 1 → Room; H(x) = −1 → Doorway
- Order matters, as accuracy differs
- Order according to error rate
- Generalizes to sequential AdaBoost for K classes
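
A decision list is a simple fall-through chain. A sketch with illustrative names, following the Corridor → Room → Doorway order from this slide:

```python
def decision_list_predict(x, classifiers, labels):
    """Sequential binary classifiers: the first one that answers +1 wins;
    if none fires, fall through to the last label."""
    for clf, label in zip(classifiers, labels[:-1]):
        if clf(x) == 1:
            return label
    return labels[-1]

# e.g. decision_list_predict(scan, [corridor_clf, room_clf],
#                            ["Corridor", "Room", "Doorway"])
```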


Experiments
Training (top): 16045 examples. Test (bottom): 18726 examples. Classification: 93.94%.
Building 079, Uni. Freiburg. Classes: Corridor, Room, Doorway.


Experiments
Training (left): 13906 examples. Test (right): 10445 examples. Classification: 89.52%.
Building 101, Uni. Freiburg. Classes: Corridor, Room, Doorway, Hallway.


Application to New Environment
Training map: Intel Research Lab in Seattle


Application to New Environment
Training map: Intel Research Lab in Seattle. Classes: Corridor, Room, Doorway.


Training
Learn a strong classifier from a set of previously labeled observations:
Room features + Corridor features + Doorway features → AdaBoost → multi-class classifier
