TRACKING POORLY MODELLED MOTION USING PARTICLE ...

5. ITERATED LIKELIHOOD WEIGHTING

Great care is usually taken to ensure that an unbiased estimate of the posterior is obtained when applying particle filtering to tracking. The importance sampling steps of (4), (6) and (7) are bias-correcting schemes used to obtain such an unbiased estimate. However, it is well known in statistical inference that approximation error depends not only on the bias but also on the variance. If the importance density is reasonably accurate, the correction step may in fact increase approximation error for all but very large particle sets [18]. Furthermore, the prior density is often poor and noisy, and it therefore makes little sense to attempt to obtain a computationally expensive, high-accuracy approximation to the posterior. This is particularly true in many human tracking applications where inter-frame motion is often poorly modelled by the dynamic model (transition density).

A scheme is proposed here in which only a subset of the particles at each time step is sampled from the ‘posterior’. The remainder of the particles are used to increase sampling in regions of high likelihood via a simple iterative search using the most recent observation. This is useful when the prior (dynamic model) is poor. It can prevent tracking failure in the case of unexpected motion, for example. Rather than attempt to perform a (potentially expensive) bias-correction step for those particles used to search high-likelihood regions, they are weighted at each iteration based on their likelihood. The resulting algorithm (Table 2) is not an unbiased, Bayesian particle filter within the usual Markov framework. After an initial iteration of SIR, the sample set is split uniformly at random into two sets of equal size. One of these sets is propagated to the next time step unaltered while the samples in the other set are subjected to further iterations of diffusion, likelihood weighting and resampling.
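
The scheme just described (one SIR pass, a random split, then further rounds of diffusion, likelihood weighting and resampling on one half) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `ilw_step`, `weight_and_resample`, `diffuse` and `log_likelihood` are hypothetical, the state is a toy one-dimensional value rather than a five-parameter ellipse, and weights are computed from log-likelihoods purely as a numerical-stability assumption.

```python
import numpy as np

def weight_and_resample(particles, log_likelihood, rng):
    """Weight by likelihood, normalise, and resample with replacement."""
    logw = log_likelihood(particles)
    w = np.exp(logw - logw.max())   # assumption: log-domain weights for stability
    w /= w.sum()                    # normalise so the weights sum to 1
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]           # equally weighted samples

def ilw_step(particles, diffuse, log_likelihood, K, rng):
    """One time step of an ILW-style filter (hypothetical helper, sketch only)."""
    n = len(particles)
    if n % 2:
        raise ValueError("this sketch assumes an even number of particles")
    m = n // 2
    # Initial SIR iteration: sample the transition density, weight, resample.
    particles = weight_and_resample(diffuse(particles, rng), log_likelihood, rng)
    # Split uniformly at random into two sets of equal size.
    perm = rng.permutation(n)
    kept, search = particles[perm[:m]], particles[perm[m:]]
    # K further iterations of diffusion, likelihood weighting and resampling.
    for _ in range(K):
        search = weight_and_resample(diffuse(search, rng), log_likelihood, rng)
    # The first half is passed on unaltered, the second half is the searched set.
    return np.concatenate([kept, search])

# Toy usage: particles start at 0, the observation sits at z = 10, and the
# random-walk dynamics (unit variance) model this jump poorly.
rng = np.random.default_rng(0)
z = 10.0
diffuse = lambda x, rng: x + rng.normal(0.0, 1.0, size=x.shape)
log_lik = lambda x: -0.5 * (x - z) ** 2
out = ilw_step(np.zeros(400), diffuse, log_lik, K=8, rng=rng)
kept, searched = out[:200], out[200:]
print(kept.mean(), searched.mean())
```

In this toy setting the searched half should end up much closer to the observation than the half that is passed on unaltered, which is the behaviour the scheme relies on when motion is unexpected.
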
This has the effect of migrating half of the particles to regions of high likelihood while the other half are sampled using the prior as the importance function. The effectiveness of ILW is demonstrated empirically in Section 7 where it is compared to SIR and APF over multiple runs.

1. Draw $N$ samples $x_{t+1}^n \sim p(x_{t+1} \mid x_t^n)$
2. Assign weights $w_{t+1}^n = p(z_{t+1} \mid x_{t+1}^n)$
3. Normalise weights so that $\sum_{n=1}^{N} w_{t+1}^n = 1$
4. Resample with replacement to obtain samples $x_{t+1}^n$ with equal weights
5. Split the sample set at random into two sets of size $M = N/2$: $\{x_{t+1,1}^m\}_{m=1}^{M}$ and $\{x_{t+1,*}^m\}_{m=1}^{M}$
6. For $k = 1 \ldots K$
   Draw $M$ samples $x_{t+1,k+1}^m \sim p(x_{t+1,k+1} \mid x_{t+1,k}^m)$
   Assign weights $w_{t+1,k+1}^m = p(z_{t+1} \mid x_{t+1,k+1}^m)$
   Normalise weights so that $\sum_{m=1}^{M} w_{t+1,k+1}^m = 1$
   Resample with replacement to obtain $M$ samples $x_{t+1,k+1}^m$ with equal weights
7. For $m = 1 \ldots M$
   $x_{t+1}^m = x_{t+1,*}^m$
   $x_{t+1}^{M+m} = x_{t+1,K+1}^m$

Table 2. The ILW filter. Here $p(x_{t+1,k+1} \mid x_{t+1,k}^m)$ is a transition density with expected value $x_{t+1,k}^m$.

6. LIKELIHOOD MODEL

In order to apply the above filtering schemes to tracking, a state, $x$, and a likelihood model, $p(z \mid x)$, must be defined. Head shape is reasonably well approximated as an ellipse in the image irrespective of pose. Previous authors have used constrained ellipses to track frontal-profile views of the head [16, 19, 20]. As orientation and elongation vary with pose and position, particularly in an overhead view, all five ellipse parameters were estimated here.

The likelihood model combined intensity gradient information along the head boundary with a colour model of the ellipse’s interior region as described in an earlier paper [5]. The region likelihood $p(r_t \mid x_t^n)$ was based on divergence of a colour histogram of the ellipse’s interior from a model histogram. The boundary likelihood $p(b_t \mid x_t^n)$ was computed by searching for maximal gradient magnitude points near the ellipse boundary. Assuming conditional independence, the likelihood was obtained using Equation (8).
Figure 2 illustrates the characteristics of the likelihood and compares it to the use of boundary cues and region cues alone. It varies in a well-behaved manner under translation and scaling.

$p(z_t \mid x_t^n) = p(b_t \mid x_t^n)\, p(r_t \mid x_t^n)$    (8)

7. EXPERIMENTS

Results are reported here for scenarios in which a person is tracked moving around a home environment using a wide-angle, ceiling-mounted camera. Whilst the articulated structure of the body will not always be readily apparent, it can be assumed that the head will nearly always be visible. The target application is a monitoring system to help extend independent living for older people in their own homes. Here we give indicative results on typical sequences of interest.

Likelihood computation is the main computational expense during tracking and the different filters require different numbers of likelihood evaluations per frame. In order to obtain a fair empirical comparison, the number of particles used with each filter was chosen so that the number of likelihood evaluations per frame was equal. All filters were run with the same transition density and the same noise parameters. Particle set sizes for SIR, APF and ILW were 2000, 1000 and 400 respectively. ILW used an additional 8 iterations per frame giving a total of 2000 likelihood evaluations.
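
The particle set sizes quoted above can be checked against the per-frame budget with a few lines of arithmetic. The sketch below rests on assumptions that the text does not state explicitly: that SIR evaluates the likelihood once per particle, that APF evaluates it twice per particle (as in the standard two-stage auxiliary particle filter), and that ILW costs N evaluations for the initial SIR pass plus N/2 for each of the K additional iterations, following Table 2. The function names are hypothetical.

```python
def sir_evals(n):
    """Assumed cost of SIR: one likelihood evaluation per particle."""
    return n

def apf_evals(n):
    """Assumed cost of APF: two likelihood evaluations per particle."""
    return 2 * n

def ilw_evals(n, k):
    """Cost of ILW per Table 2: N evaluations, then K rounds of N/2 each."""
    return n + k * (n // 2)

# Particle set sizes reported for the experiments.
budget = {
    "SIR": sir_evals(2000),
    "APF": apf_evals(1000),
    "ILW": ilw_evals(400, 8),   # 400 + 8 * 200
}
print(budget)
```

Under these assumptions all three filters use the same number of likelihood evaluations per frame, consistent with the total of 2000 stated for ILW.
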

Fig. 2. Likelihoods as the ellipse (a) translates and (b) changes scale away from the correct ellipse. Dashed: gradient likelihood. Dotted: colour likelihood. Solid: combined likelihood (8).

Fig. 3. Frames from the sequence in Figure 1 tracked using SIR. The tracker loses lock in frame 56 and is unable to recover.

Fig. 4. Frames from a 400-frame sequence in which the occupant stands up, moves around the room, sits on a chair, leans over and finally sits on the floor. ILW tracked successfully throughout. [Panels: Frame 1, Frame 75, Frame 200, Frame 400.]

Fig. 5. Frames from the sequence of Figure 4 showing the SIR tracker losing lock after frame 50. It did not recover.

Figures 1, 3, 4 and 5 show typical runs of SIR and ILW. A red ellipse indicates the mean estimated from the particle set and a white ellipse indicates the most heavily weighted particle for that frame. In Figure 3 the SIR filter loses track when the person falls due to the sudden, poorly modelled motion. However, Figure 1 shows this sequence being successfully tracked using ILW. Similarly, Figure 4 shows ILW successfully tracking a 400-frame sequence while the SIR filter was easily distracted by clutter (Figure 5).

Although the above runs were typical for these sequences, isolated runs of particle filters are not sufficient to evaluate performance. The filters were compared over multiple runs on the sequence shown in Figure 1. This is a challenging sequence in several respects. The carpet, in particular, contains strongly structured edge clutter and many regions with similar colour distributions to the head being tracked. The sequence also contains large inter-frame motion when the person falls over. Figure 6 compares the mean and strongest ellipses obtained after 55 frames in 20 separate runs of the three filters on the sequence. SIR failed in the great majority of runs. In only 3 of the 20 runs did it provide a reasonable estimate in terms of the most heavily weighted particle. The mean did not provide good estimates of the state indicating that the distribution was clearly multimodal due to clutter. APF gave reasonable estimates in 9 of the 20 runs. ILW gave good estimates in terms of both the mean and the most heavily weighted particle in all but one run.

Figure 7 shows trajectories obtained by 20 separate runs of each of the three methods from identical initial conditions. Ground-truth data were acquired by manually fitting an ellipse to the head in each frame. The distances of the estimated ellipse centres from the ground-truth centres were computed for each frame. Figure 8 plots these errors for each filter averaged over 20 runs. At about frame 40 some of the trackers lost lock due to a strong mode in the distribution over the carpet but many subsequently recovered at around frame 50. After frame 55 the person begins to fall over causing many of the trackers to fail.

Fig. 6. The mean ellipse (left) and the most highly weighted particle (right) after 55 frames for each of twenty runs. Panels: (a) Sampling Importance Resampling (SIR), (b) Auxiliary Particle Filter, (c) Iterated Likelihood Weighting.

Fig. 8. Distances between estimated and ground-truth ellipse centres averaged over 20 runs of each filter. [Plot of Average Error against Frame Number, with curves labelled SIR, APF and ILW.]

8. DISCUSSION

King and Forsyth [9] point out that expectations computed using Condensation have high variance so that different runs of the tracker lead to very different answers.
They also comment that “the tracker will appear to be following tight peaks in the posterior even in the absence of any meaningful measurement”. The experiments conducted here show that the variance can indeed be high while the approximation accuracy is often poor. The use of an APF improved matters a little. However, ILW (a simply implemented modification to SIR) yielded better accuracy and lower variance. In particular, it was able to successfully track motion that was poorly accounted for by the dynamic model.

It should be stressed that the effect of the ILW algorithm is not the same as that obtained by simply increasing the variance of the motion model (or adopting a model with heavier tails) in the SIR filter. Asymptotically, the implied dynamics in ILW is a K times convolved version of the original dynamic kernel but because the particle set is finite and a sampling step is applied at each iteration of ILW, the outcome is very different.

This paper has compared ILW with SIR and an unbiased filter motivated by exploring regions of high likelihood (APF). Other particle filters have been suggested that share some of the motivations discussed here. It would be interesting to compare these in future work. Partitioned sampling [14, 15] was proposed for tracking multiple or articulated objects and as such is not appropriate for the application presented here. Layered sampling [17] can reduce the complexity of factored sampling when the likelihood function is narrow. It was developed to address the problem of “overloading” when observations are made at a fine spatial scale. Annealed particle filtering [12] uses a heuristic annealing process to avoid Markov chains becoming trapped in a mode near the starting point. It is useful when the likelihood is very peaked. Rather than propagate
