M.I.T. Media Laboratory Perceptual Computing Section Technical Report No. 400

Detecting and Segmenting Periodic Motion

Fang Liu and Rosalind W. Picard
The Media Laboratory, E15-390
Massachusetts Institute of Technology, Cambridge, MA 02139
liu@media.mit.edu, picard@media.mit.edu

Abstract

A new algorithm is presented for the detection and segmentation of multiple periodic motions in image sequences. A periodicity measure is proposed which provides an accurate quantitative measure of signal periodicity in the form of a temporal harmonic energy ratio. The periodicity templates, which incorporate the segmented region with the associated periodicity energy distribution and motion fundamental frequencies, provide a new means of periodic motion representation. The algorithm is computationally simple and robust in the presence of noise. Real-world examples are used to illustrate the techniques.

1 Introduction

1.1 Overview

Periodic motion is common in the natural world. For instance, locomotion often exhibits cyclic movement. Periodicity is also an important cue in human motion perception. Information regarding the nature of a periodic motion, such as location, extent, and frequency, is important for motion understanding. Techniques for periodic motion detection and segmentation can assist in many applications requiring object and activity recognition and representation.

The main body of work on periodic motion is model-based (e.g., [1][2]). More recently there has been work on motion recognition directly using low-level features of motion information (e.g., [3][4][5]). However, to date, there has not been a method which uses low-level motion features to detect and segment periodic motion simultaneously.
In this work, we attempt to tackle this problem by using a Fourier spectral based approach. The periodicity templates generated in the process incorporate the location, extent, and other characteristic information, such as frequency, of a periodic motion. Therefore, the templates can also be used for periodic motion representation.

1.2 Our Approach

The term temporal texture is defined in [4] as "the motion patterns of indeterminate spatial and temporal extent". We would like to use "texture" to refer to any approximately homogeneous phenomenon. In this sense, periodic temporal activity is a type of temporal texture. (This work was supported in part by IBM and NEC.)

The algorithm we present here is motivated by theory for textured static images that assumes an underlying random field representation for the data [4]. In particular, 1-D signals along the temporal dimension can be considered as stochastic processes. When assuming stationarity, a stochastic signal can be decomposed into deterministic (periodic) and indeterministic (random) components (Wold decomposition [6]). In the frequency domain, the two components correspond to the singular and continuous parts of the Fourier spectra, respectively. In practical applications, this is to say that the repetitive structure in the signal contributes only to the spectral harmonic peaks, and the random behavior to the smooth part of the spectra. Therefore, the energy contained in the harmonic peaks is a good measure of periodicity in the signal. In this work, we apply this analysis to the temporal dimension of the image sequences and propose a new measure of temporal periodicity.
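The Wold intuition above can be sketched numerically. The following is a toy illustration, not the paper's implementation, and all names are ours: a deterministic periodic signal concentrates its spectral energy in a few harmonic bins, while a random signal spreads its energy across the continuous part of the spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
t = np.arange(n)
periodic = np.sign(np.sin(2 * np.pi * 8 * t / n))  # deterministic, period 32
noise = rng.normal(size=n)                         # indeterministic

def peak_energy_fraction(x, k=8):
    """Fraction of spectral energy held by the k largest frequency bins."""
    p = np.abs(np.fft.rfft(x - x.mean())) ** 2
    return np.sort(p)[-k:].sum() / p.sum()

# The periodic square wave puts nearly all of its energy into its odd
# harmonics (a handful of bins); the white noise spectrum is flat, so
# its top bins hold only a small fraction of the total energy.
```

For the square wave the top eight bins capture essentially all of the energy, whereas for white noise the same statistic stays well below one half.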
This measure is then used to build a periodicity template for simultaneous periodic motion detection and segmentation.

The actual implementation of the approach above assumes that the repetitiveness of an action is observable along lines parallel to the temporal (T) axis. In other words, the moving object needs to be tracked, just as we fix our eyes on a walking person. Typically, optical flow based techniques are used for object tracking. However, flow based methods are usually susceptible to noise. We present here a non-flow-based procedure for foveating on moving objects by frame alignment.

1.3 Related Work

Of the large amount of research in this area, the work of Polana and Nelson on periodic motion detection [4] is perhaps the most relevant to the approach presented in this paper. In their work, reference curves, which are lines parallel to the trajectory of the motion flow centroid, are extracted and their power spectra computed. The periodicity measure p_f of each reference curve is defined as the normalized difference between the sum of the spectral energy at the highest amplitude frequency and its multiples, and the sum of the energy at the frequencies halfway between. Besides the value of the periodicity measure itself, there is no checking of the signal harmonicity along the curve, which is a weakness of the method. The periodicity measure for an entire sequence is the maximum of p_f averaged among pixels whose highest power spectrum values appear at the same frequency. The final periodicity measure is used to distinguish periodic and non-periodic motion by thresholding.

In [3], flow based algorithms are used to transform the image sequence so that the object in consideration stays at the center of the image frame. Then flow magnitudes in tessellated frame areas of periodic motion are used as feature vectors for motion classification. There is no precise segmentation of regions of periodic motion involved.

Figure 1: Frames 20, 40, 60, and 80 of the 97 frame sequence Walker. Frame size is 320 by 240.

Figure 2: Head and ankle level XT slices of the Walker sequence. (a) Head leaves a straight track. (b) Walking ankles make a crisscross pattern.

2 Method

2.1 Overview

Our system for periodic motion detection and segmentation consists of two stages: 1) foveating by frame alignment; 2) simultaneous detection and segmentation of regions of periodic motion.

In this work, two types of image sequences are considered. (1) Each object which may have a periodic motion is as a whole stationary with respect to the camera, but the background can be moving. (2) The relative motion between the camera and the background is small, permitting minor shake and gradual drift as in the hand-held situation, and each moving object is as a whole moving approximately frontoparallel to the camera at a relatively constant speed and in a translatory manner. In practice, a large number of image sequences containing natural periodic motions can be categorized into one of the two types.

When watching a sequence of a person walking across the image plane, we experience a notion of repetitive movement. However, if we examine the individual frames, there are no re-occurring scenes. Several frames of a sequence Walker are shown in Figure 1 (see also http://[web-seq-1]). The reason why we see a periodic motion in the sequence is due to our ability to focus our visual attention on the moving object, or foveate.
The effect of foveating can be accomplished computationally by frame alignment. Obviously, foveating is not necessary for sequence type (1); it is in fact a process of transforming sequences of type (2) into type (1).

In the second stage, 1-D Fourier transforms are performed at each pixel location along the temporal dimension of the aligned frames. The spectral harmonic peaks are then detected and used to compute the harmonic energy involved at each frame pixel location. A periodicity template of frame size is generated based on the ratio between the harmonic energy and the total energy. The original sequence is then masked for regions of periodic motion.

In the following, we use the term data cube to refer to the three-dimensional (X: horizontal, Y: vertical, and T: temporal) data volume formed by putting together all the frames in a sequence, one behind the other. The XT and YT slices of the data cube reveal the temporal behavior usually hidden from the viewer. Figure 2 shows the head and ankle level XT slices of the Walker sequence. The head leaves a more or less straight track while the walking ankles make a crisscross pattern. Throughout this section, the Walker sequence will be used to illustrate the technical points. More examples are given in Section 3.

2.2 Foveating by Frame Alignment

To align a sequence to a particular moving object, we first need to detect the trajectory of the object. In existing work, tracking techniques based on differencing consecutive frames are usually used to locate the moving objects. The flow-based methods are inevitably subject to the noise sensitivity (which will be demonstrated later) inherent in pairwise frame comparison.
To avoid these shortcomings, we take advantage of the two types of restrictions on the data and use a method similar to the one in [7] to look directly in the data cube for tracks left by the moving objects.

Since the background of the sequence changes slowly, a 1-D median filter can be applied to the data volume along the temporal dimension T to exclude moving objects, resulting in a sequence containing mostly the background. A filter length of 11 was used for the Walker sequence. Subtracting the background from the original sequence, we obtain a difference sequence containing mainly the moving objects. By condition (2) on the data, the trajectories of the objects as a whole in the difference cube should be approximately linear. To obtain 2-D representations of the trajectories, we simply compute the average of the XT or the YT slices of the difference cube, which is equivalent to collapsing the difference cube top down onto the XT plane or sideways onto the YT plane. Next, lines in the 2-D trajectory images are detected via the Hough transform. The detected lines give the X or Y positions of the moving objects in each frame. These position values are called alignment indices. The averaged XT image of the Walker difference sequence and the line found by the Hough transform method are shown in Figure 3. Note that multiple object trajectories can be detected simultaneously using this procedure. An example of three walking persons is shown in Section 3.

Figure 3: (a) Averaged XT image of the Walker sequence after background removal. (b) Line found in (a) by using the Hough transform method.

Using the alignment indices of an object, each frame in the image sequence can be repositioned to center the object at any specified position in the XY plane. After alignment, the object should appear to be moving in place. For instance, after aligning the Walker sequence in the X dimension (alignment in the Y dimension is not necessary since the overall Y position of the person does not change much), the position of the walker's torso remains in place, but the surroundings move to the right. In effect, this is equivalent to focusing our visual attention on an object when viewing a sequence in which the object's position changes frame by frame. We call this process foveating by frame alignment. The aligned sequences are passed on to the second stage of the system.

2.3 Detection and Segmentation

In this stage, we deal only with sequences already aligned to a particular object. To save computation and storage, an aligned sequence can be cropped to limit processing to the object and its vicinity. It will become clear later that the cropping has no effect on the estimation of periodic motion.
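The foveating stage of Section 2.2 can be sketched as follows. This is a simplified illustration under our own assumptions, with our own function names: a single object, a global temporal median standing in for the sliding length-11 median filter, and a least-squares line fit to per-frame intensity peaks standing in for the Hough transform (adequate only for a strictly linear track).

```python
import numpy as np

def alignment_indices(cube):
    """Estimate per-frame X positions of a translating object.

    cube: array of shape (T, Ny, Nx). The temporal median estimates the
    slowly varying background; averaging the difference cube over Y
    collapses it onto an XT trajectory image, in which the object's
    linear track is fit by least squares (cf. the paper's Hough step).
    """
    background = np.median(cube, axis=0)         # per-pixel median over T
    diff = np.abs(cube - background)             # difference sequence
    xt = diff.mean(axis=1)                       # (T, Nx) trajectory image
    x_peak = xt.argmax(axis=1).astype(float)     # per-frame object X
    t = np.arange(cube.shape[0])
    slope, intercept = np.polyfit(t, x_peak, 1)  # linear track
    return np.round(slope * t + intercept).astype(int)

def align(cube, idx):
    """Shift each frame horizontally so the object stays at frame center."""
    center = cube.shape[2] // 2
    return np.stack([np.roll(frame, center - x, axis=1)
                     for frame, x in zip(cube, idx)])
```

After `align`, a pattern that translates through the original frames occupies the same columns in every frame, so its periodicity becomes observable along the temporal lines used in the next stage.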
The location and size of the cropping window are estimated from the average XY image of the difference sequence, which is first aligned to the particular object. Figure 4 shows the averaged aligned XY image and the aligned and cropped Walker sequence, with splits near the center of the frames to show the inside of the data cube.

Figure 4: (a) Averaged XY image of the aligned Walker sequence after background removal. (b) Aligned and cropped Walker sequence with splits near the center of the frames to show the inside of the data cube.

Facing the data cube, a line can be drawn from a pixel (x0, y0) in the first frame all the way through the cube to arrive at pixel (x0, y0) in the last frame. Clearly, this line contains pixel (x0, y0) of all frames. We name this line the temporal line at (x0, y0), which is of essential importance to our discussion. If the frame size is Nx by Ny, then there are Nx * Ny temporal lines in the data cube.

Since the image sequence is aligned to a particular object, the object should be moving in place. If the object is moving cyclically in any manner, the periodicity will be reflected in some of the temporal lines. The top two images of Figure 5 show the head and ankle level XT slices of 64 frames (Frames 17 to 80) of the data cube in Figure 4 (b). These slices are in fact the aligned and cropped versions of the two XT slices in Figure 2. Every column in the two images is a temporal line. The Fourier power spectra of these temporal lines are the columns of the corresponding bottom two images in Figure 5. Note that the power spectrum values are normalized among all temporal lines in the data cube.
While the spectra of the head slice show no harmonic peaks, the periodicity of the moving ankles is captured by the spectral harmonic peaks. We refer to the spectral energy corresponding to these harmonic peaks as the temporal harmonic energy, and propose using the ratio between the harmonic energy and the total energy of a temporal line (the temporal harmonic energy ratio) as a measure of temporal periodicity at the corresponding pixel location.

In [8], a method was presented to detect spectral harmonic peaks in 2-D random fields. Here the algorithm is adapted for use with 1-D signals. Given a temporal line, it is first zero-meaned and Gaussian tapered; then its power spectrum values are computed via a fast Fourier transform. To locate the harmonic peaks, local maxima of the spectrum values (excluding values below 10% of the value range in the data cube) are found by searching a size 7 neighborhood of each frequency sample. These local maxima provide candidate locations of harmonic peaks. A local maximum marks the location of a harmonic peak only when its spectral frequency is either a fundamental or a harmonic. A fundamental is defined as a frequency which can be used to linearly express the frequencies of some other local maxima. A harmonic is a frequency which can be expressed as a linear combination of some fundamentals. Starting from the lowest frequency and proceeding to the highest, each local maximum is checked first for its harmonicity (whether its frequency can be expressed as a linear combination of the existing fundamentals), and then for its fundamentality (whether the multiples of its frequency, combined with the multiples of existing fundamentals, coincide with the frequency of another local maximum).
A tolerance of one sample point is used in the frequency matching.

Due to the nature of natural temporal signals and the windowing effect in spectrum computation, a harmonic peak usually does not appear as a single impulse. Therefore the procedure so far only provides the summit location of each peak. The support of a peak is determined by growing outward from the summit location along the frequency axis until the spectrum value falls below a certain small value (5% of the spectrum value range in this work). After the harmonic peaks are identified, it is straightforward to compute the harmonic energy ratio of the temporal line.

Figure 5: Temporal lines and their normalized power spectra at the head and ankle level of the aligned Walker data cube. Top row: XT slices; every column in the images is a temporal line. Bottom row: power spectra; each column is the power spectrum of the corresponding column in the XT slice. (a) Head level, showing no harmonic peaks. (b) Ankle level; the periodicity of the moving ankles is captured by the harmonic peaks in the spectra.

One may argue that the technique discussed above will fail when a temporal line contains only a sinusoidal signal, which produces only a single spectral peak. Theoretically, this is correct. However, this situation almost never arises in image sequences of natural scenes and objects. To explain this, we trace back to the formation of the signal on a temporal line. Since a temporal line corresponds to a particular pixel in the image plane, having a pattern moving across the pixel is equivalent to having the pixel scan across the pattern. When will this scan create a sinusoidal signal? Only when the pattern has a sinusoidal profile. (An example is to translate a vertical sine grating pattern horizontally, frontoparallel to the camera, at a constant speed.) However, natural edges, patterns, and surfaces hardly ever have such a profile. Therefore, it is safe to say that higher harmonics will usually accompany the fundamentals in the Fourier spectra of the temporal lines.

Applying the peak detection procedure to all temporal lines in a data cube and registering the temporal harmonic energy ratio at each pixel location as a gray level value in a blank plane of frame size, a periodicity template for periodic motion is built for the aligned sequence. The larger the template value, the more periodic energy at the location. At places where no periodic motion is involved, the template value remains zero.
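The per-line measurement can be sketched as follows. This is a simplified version under our own assumptions, with our own names: it checks candidate peaks only against integer multiples of the lowest candidate, rather than carrying out the paper's full fundamental/harmonic bookkeeping over linear combinations, and it uses a Hanning rather than a Gaussian taper.

```python
import numpy as np

def detect_harmonic_peaks(power, floor_frac=0.10, support_frac=0.05, nbhd=3):
    """Mark spectrum bins belonging to harmonic peaks of a 1-D spectrum.

    Local maxima in a size-7 neighborhood (nbhd bins on each side) above
    10% of the value range are candidates; those within one bin of an
    integer multiple of the lowest candidate are kept, and each peak's
    support is grown until the spectrum drops below 5% of the range.
    """
    rng = power.max() - power.min()
    candidates = [i for i in range(1, len(power))
                  if power[i] >= power[max(i - nbhd, 0):i + nbhd + 1].max()
                  and power[i] > floor_frac * rng]
    mask = np.zeros(len(power), dtype=bool)
    if not candidates:
        return mask
    fund = candidates[0]
    peaks = [c for c in candidates
             if min(c % fund, fund - c % fund) <= 1]  # one-bin tolerance
    for p in peaks:                                   # grow peak support
        j = p
        while j >= 0 and power[j] > support_frac * rng:
            mask[j] = True
            j -= 1
        j = p + 1
        while j < len(power) and power[j] > support_frac * rng:
            mask[j] = True
            j += 1
    return mask

def harmonic_energy_ratio(line):
    """Temporal harmonic energy ratio of one temporal line."""
    x = (line - line.mean()) * np.hanning(len(line))  # zero-mean, taper
    power = np.abs(np.fft.rfft(x)) ** 2
    mask = detect_harmonic_peaks(power)
    return power[mask].sum() / power.sum() if power.sum() else 0.0
```

A line crossing a periodically moving pattern scores close to one, while a constant line scores zero.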
Under circumstances such as a noisy background, some speckle noise may appear in the template. Simple morphological closing and opening operations can be applied to the template to clean up the speckle.

Figure 6 (a) is the periodicity template of the Walker sequence after one closing and one opening operation with a 3-pixel-diameter circular structuring element. As expected, the brightest region is the wedge shape created by the walking legs. The head, the shoulder, and the outline of the backpack are shown because the walker bounces. The hands appear at the front of the body since in most parts of the sequence the walker was fixing his gloves and moving his hands in a rather periodic manner. Note that the moving background and parts of the walker do not appear in the template, since there is no periodic motion in those areas.

Figure 6: (a) Periodicity template generated from the aligned Walker sequence of Figure 4 (b). The value at each pixel is that of the temporal harmonic energy ratio of the corresponding temporal line. A high value indicates more periodic energy at the location. (b) Using the periodicity template to mask the original sequence. The four frames in Figure 1 are masked and stacked together into one frame.

Since the non-periodic motion of the background does not light up in the templates, it is clear that the sequence cropping earlier in the second stage does not affect the algorithm itself, but only increases the computational efficiency.

Using the alignment indices generated at the first stage, the periodicity template can also be applied to the original sequence to mask the regions of periodic motion in each frame. The masked frames corresponding to the ones in Figure 1 are stacked together and shown in Figure 6 (b) (see also http://[web-seq-2]).

3 Examples

Three example sequences are used in this paper. One is the Walker sequence.
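The speckle cleanup can be sketched with a hand-rolled grey-scale closing and opening. The cross-shaped structuring element below is our approximation of a 3-pixel-diameter disk, and all names are ours.

```python
import numpy as np

# Cross-shaped structuring element approximating a 3-pixel-diameter disk.
DISK3 = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

def _shifted(img, dy, dx):
    """View of img shifted by (dy, dx), edge-padded at the borders."""
    padded = np.pad(img, 1, mode="edge")
    return padded[1 + dy:1 + dy + img.shape[0], 1 + dx:1 + dx + img.shape[1]]

def grey_dilate(img, offsets):
    return np.max([_shifted(img, dy, dx) for dy, dx in offsets], axis=0)

def grey_erode(img, offsets):
    return np.min([_shifted(img, dy, dx) for dy, dx in offsets], axis=0)

def clean_template(template):
    """One grey-scale closing then one opening, removing isolated speckle."""
    closed = grey_erode(grey_dilate(template, DISK3), DISK3)   # closing
    return grey_dilate(grey_erode(closed, DISK3), DISK3)       # opening
```

The opening removes bright structures smaller than the element (isolated speckle pixels), while the preceding closing fills pinholes inside genuinely periodic regions.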
Another walking sequence is Trio, in which three people walk and pass each other. The last one is an aerobics sequence called Jumping Jack. Walker and Trio were taken by a hand-held consumer-grade camcorder during a snow storm. Camera drift and the influence of the cameraman's breathing are visible in the sequences. Jumping Jack was taken by a fixed Betacam camera in an indoor setting. These examples are used to demonstrate: 1) the effectiveness of the new algorithm in the detection and segmentation of both single and multiple objects with periodic motions; 2) the robustness of the algorithm under noisy conditions; 3) the noise sensitivity of optical flow based estimation methods, which are used for trajectory detection in many existing works, and avoided by the method proposed here.


Figure 7: Left column: frames 40, 61, and 88 of the 156 frame sequence Trio, with frame size 320 by 240. Right column: frames 40, 61, and 88 masked by the periodicity templates of the three individuals.

Figure 8: (a) Averaged XT image of the Trio sequence after background removal. (b) Lines found in (a) by using the Hough transform method.

3.1 Trio

Trio is a 156 frame sequence of three people walking and passing each other. Frames 40, 61, and 88 of the sequence are shown in the left column of Figure 7 (see also http://[web-seq-3]).

As in the Walker example, a temporal median filter is used to extract the background. After the background is largely removed from the sequence, the averaged XT image is computed. The lines in the XT image are then detected via the Hough transform. Figure 8 shows the averaged XT image and the lines detected from the image. The detected lines provide the alignment indices of each object. Note that the alignment indices of all three objects are estimated simultaneously.

Next, as in the Walker example, the original sequence is aligned and cropped for each of the three moving individuals. All aligned sequences contain 64 frames. The aligned sequences then go through the process of power spectrum estimation and harmonic peak detection. Finally, the temporal harmonic energy ratio at each pixel location is computed to generate the periodicity templates. Figure 9 shows example frames of each aligned sequence and the periodicity templates. The templates are then used to mask the original sequence. Examples of masked frames are shown in the right column of Figure 7 (see also http://[web-seq-4]).

Notice that, besides the center person, there is a second or even a third person passing through in all three aligned sequences. However, their appearance has no effect on the result of periodic motion detection and segmentation.
This is because, to any individual temporal line, these passers-by are a one-time event and do not contribute to the temporal harmonic energy of the temporal line. The Trio example demonstrates that the proposed algorithm is well suited for the detection and segmentation of multiple periodic motions, even under circumstances of temporary object occlusion.

3.2 Jumping Jack

In the Jumping Jack sequence, there is no translatory motion involved and most of the background is very smooth. We use this sequence and noisy versions of it to demonstrate the robustness of the new algorithm under noisy conditions, and also to show the sensitivity of optical flow based motion estimation to noise. Three different kinds of input are given to the system: the original sequence and the sequences corrupted by additive Gaussian white noise (AGWN) with variance 100 and 400. The length of the sequences used in power spectrum estimation is increased to 128 due to the cycle of the jumping motion. All the data related to the Jumping Jack sequence are shown in Figure 10 (see also http://[web-seq-5] to [web-seq-10]).

The second row in Figure 10 shows the 57th TY (not YT!) image of each data cube. They show the tracks left by the right hand and leg. The rows in these images are temporal lines, and the corresponding power spectra are shown in the third row of the figure. The periodicity templates can be found in the fourth row of the figure. Although the noise does cause some degradation in the arm region, other areas of the templates are well preserved.

The reason why the proposed algorithm is not affected by a certain amount of white noise in the input is that white noise contributes only to the continuous component of the power spectrum.
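This robustness can be checked numerically on a toy signal. The sketch below is ours, with assumptions ours: a square-wave temporal line with a ~50 gray-level swing and a known fundamental bin, so the harmonic energy can be read off directly without peak detection.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 128
t = np.arange(n)
clean = 50.0 * np.sign(np.sin(2 * np.pi * 8 * t / n))  # ~50 gray-level swing

def harmonic_fraction(x, fund=8, n_harm=4, halfwidth=1):
    """Energy fraction near the first few multiples of a known fundamental."""
    p = np.abs(np.fft.rfft(x - x.mean())) ** 2
    harm = sum(p[k * fund - halfwidth:k * fund + halfwidth + 1].sum()
               for k in range(1, n_harm + 1))
    return harm / p.sum()

# AGWN raises the flat continuous floor but leaves the harmonic peaks in
# place, so the measure degrades only gracefully as the variance grows.
fractions = {var: harmonic_fraction(clean + rng.normal(scale=np.sqrt(var),
                                                       size=n))
             for var in (0, 100, 400)}
```

The noise energy spreads over all frequency bins, while the signal energy stays concentrated in a handful of harmonic bins, so the ratio stays high until the noise floor approaches the peak heights.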
As long as the noise energy is not so high that it overwhelms the spectral harmonic peaks, the algorithm works.

Most of the related works use flow based methods to extract spatiotemporal surfaces or curves to locate the moving objects in a sequence, but we would like to demonstrate here that the noise sensitivity of the flow based methods can be a drawback in real applications. The optical flow magnitudes are estimated here by using the hierarchical least-squares algorithm [9], which is based on a gradient approach described in [10][11]. Two pyramids are built, one for each of the two consecutive frames, and motion parameters are progressively refined by residual motion estimation from coarse images to the higher resolution images. This algorithm is representative of existing optical flow estimation techniques.

Figure 9: Example frames of the aligned sequences and the corresponding periodicity templates for each individual of the Trio sequence. Two left columns: example frames. Right column: periodicity templates.

The last row of images in Figure 10 shows the optical flow magnitudes of frame 61 in the sequences. When given a clean input such as the original Jumping Jack sequence, the flow magnitudes can be used to segment out the moving object. However, under the noisy conditions, the algorithm is mostly ineffective.

3.3 Walker

The detection and segmentation results of the Walker sequence were mostly shown in the previous section. Here we show the results from noisy inputs. Additive Gaussian white noise of variance 100 and 400 was used to corrupt the original sequence. The length of the aligned sequences for power spectrum estimation is 64. The resulting periodicity templates in Figure 11 clearly show that the proposed algorithm is robust in the presence of noise.

Figure 10: Data of the 128 frame Jumping Jack sequence with frame size 155 by 170. Left column: original sequence. Middle column: corrupted by AGWN with variance 100. Right column: corrupted by AGWN with variance 400. Row 1: frame 61 of the Jumping Jack sequence. Row 2: TY slice 57, showing the tracks left by the right hand and leg; each row of these images is a temporal line. Row 3: temporal line power spectra of TY slice 57. Row 4: periodicity templates. Row 5: optical flow magnitudes of frame 61.


Figure 11: Periodicity templates of the Walker sequence: (a) from the original sequence; (b) with AGWN of variance 100; (c) with AGWN of variance 400. The proposed algorithm is robust in the presence of noise.

4 Discussion

4.1 Algorithm

Compared to the one used in [4], the periodicity measure proposed here, in the form of the temporal harmonic energy ratio, is a more accurate and reliable measure of signal periodicity. It can not only indicate the presence of periodic behavior, but also give a quantitative measure of how much energy at each pixel location is contributed by the periodicity.

The use of the periodicity template allows the detection and segmentation of periodic motion to be accomplished simultaneously. Since the fundamental frequencies of the temporal lines are extracted in the process of computing the templates, they can be registered to the templates as well. Using this information, areas involved in periodic motions with different cycles can be easily distinguished. This is useful in situations such as a person walking in front of a picket fence. When the sequence is aligned to the person, the fence will be moving in the background and leave a periodic signature on some temporal lines. Consequently, the fence area will light up in the periodicity template. However, the fundamental frequency of the moving fence is usually quite different from that of the walker.

The proposed algorithm can also be considered as a periodic motion filter. At first, all moving objects are targets for foveating, but then only the ones exhibiting periodic motions remain. Consider an input of a street with cars and pedestrians: the system would filter out the cars and keep the pedestrians.

Existing related work often uses flow based methods to extract the trajectories of moving objects. Since flow based methods can be susceptible to noise, we instead use foveating by frame alignment to focus on individual objects.
This approach not only helps to improve the performance of the system in the presence of noise, but is also efficient in that it generates the alignment parameters of all moving objects at once.

The method introduced here is not computationally taxing. The most machine-intensive part of the algorithm is the 1-D fast Fourier transform used in power spectrum computation. However, when the motion cycle is reasonably short, such as walking at normal speed, a sequence length of 64 suffices. Cropping of aligned sequences also helps to speed up the processing.

In the current work, we put restrictive conditions on the data. The steady background condition in the translatory moving object case is for the background subtraction. The system actually tolerates small camera movement quite well. When an object is not moving in a translatory manner with respect to the camera, its trajectory will not be linear in the data cube, and a scheme more sophisticated than Hough line detection will have to be used for the frame alignment. If the object is not moving frontoparallel to the camera, the perspective effect will change the size of the object in the sequence. However, this change should not be large during a period of 64 frames when the distance between the camera and the object is sufficiently large compared to the size of the object, and in many practical situations this is the case. When the data is reasonably clean, flow based tracking and scaling methods such as the one in [3] can also be used for frame alignment.

4.2 Applications

Among other possible applications, the proposed algorithm can be applied readily to motion classification and recognition. In [5], the shape of the active region in a sequence was used for activity recognition.
In [3], the sums of the flow magnitudes in tessellated frame areas of periodic motion were used as feature vectors for motion classification. The periodicity templates produced by the algorithm introduced here can provide both distinct shapes of regions of periodic motion, such as the wedge for the walking motion and the snow angel for the jumping jack, and an accurate pixel-level description of a periodic action in the form of the temporal harmonic energy ratio and motion fundamental frequencies.

The characterization of periodic motion is also important to video database related applications. The presence, position, extent, and frequency information of a periodic motion can be used for video representation and retrieval.

5 Summary

We have presented a new method for detecting and segmenting multiple periodic motions in image sequences. The method consists of two main parts: 1) frame alignment to foveate on individual moving objects; 2) Fourier spectral harmonic peak detection and energy computation to identify regions of periodic motion. This method detects and segments a periodic motion simultaneously and is not computationally taxing. We illustrated the technique using real-world examples and demonstrated its robustness to noise.

A new periodicity measure in the form of the temporal harmonic energy ratio is proposed. Compared to the periodicity measure used in other work, this is a more accurate quantitative measure of how much energy at each pixel location is contributed by the signal periodicity.

The periodicity templates contain information on the segmented region as well as the associated energy distribution and characteristic cyclic frequencies. The templates can be used as a means of periodic motion representation.


Acknowledgments

The authors would like to thank John Wang for the hierarchical optical flow estimation program, Aaron Bobick for insightful discussions, and Jim Davis for the Jumping Jack sequence.

References

[1] D.D. Hoffman and B.E. Flinchbaugh. The interpretation of biological motion. Biological Cybernetics, pages 195-204, 1982.

[2] M. Allmen and C.R. Dyer. Cyclic motion detection using spatiotemporal surfaces and curves. In Proc. Int. Conf. on Pattern Recognition, pages 365-370, 1990.

[3] R. Polana and R. Nelson. Low level recognition of human motion. In IEEE Workshop on Motion of Non-rigid and Articulated Objects, Austin, TX, 1994.

[4] R. Polana and R.C. Nelson. Detecting activities. In Proceedings CVPR '93, pages 2-7, New York City, NY, June 1993. IEEE Computer Society Press.

[5] A.F. Bobick and J.W. Davis. Real-time recognition of activity using temporal templates. In Workshop on Applications of Computer Vision, Sarasota, Florida, December 1996.

[6] H. Wold. A Study in the Analysis of Stationary Time Series. Almqvist & Wiksell, Stockholm, 1954.

[7] S.A. Niyogi and E.H. Adelson. Analyzing gait with spatiotemporal surfaces. In IEEE Workshop on Motion of Non-rigid and Articulated Objects, pages 64-69, Austin, Texas, Nov. 11-12, 1994.

[8] F. Liu and R.W. Picard. Periodicity, directionality, and randomness: Wold features for image modeling and retrieval. IEEE Trans. Pattern Analysis and Machine Intelligence, 18(7):722-733, July 1996.

[9] John Y.A. Wang. Layered Image Representation: Identification of Coherent Components in Image Sequences. PhD thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, September 1996.

[10] J. Bergen et al. Hierarchical model-based motion estimation. In Proc. Second European Conf. on Computer Vision, pages 237-252, 1992.

[11] B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proc. Image Understanding Workshop, pages 121-131, 1981.
