12.07.2015 Views

an innovative algorithm for key frame extraction in video ...


The compressed domain is often considered when developing key frame extraction algorithms, since it makes it easy to express the dynamics of a video sequence through motion analysis. Nara et al. [19] propose a neural network approach using motion intensities computed from MPEG compressed video. A fuzzy system classifies the motion intensities in five categories, and those frames that exhibit high intensities are chosen as key frames. In Calic et al. [20], video features extracted from the statistics on the macro-blocks of an MPEG compressed video are used to compute frame differences. A discrete contour evolution algorithm is applied to extract key frames from the curve of the frame differences. In Liu et al. [21], a perceived motion energy (PME) computed on the motion vectors is used to describe the video content. A triangle model is then employed to model motion patterns and extract key frames at the turning points of acceleration and deceleration.

The drawback to most of these approaches is that the number of representative frames must be set in some manner a priori, depending, for example, on the length of the video shots. This cannot guarantee that the frames selected will not be highly correlated.
It is also difficult to set a suitable interval of time, or frames: large intervals mean a large number of frames will be chosen, while small intervals may not capture enough representative frames, and those chosen may not be in the right places to capture significant content. Still other approaches work only on compressed video, are threshold-dependent, or are computationally intensive (e.g. [22] and [21]).

In this paper, we propose instead an approach to the selection of key frames that determines the complexity of the sequence in terms of changes in the pictorial content using three visual features: its color histogram, wavelet statistics, and an edge direction histogram. Similarity measures are computed for each descriptor and combined to form a frame difference measure. The frame differences are then used to dynamically and rapidly select a variable number of key frames within each shot. The method works fast on all kinds of video (compressed or not), and does not exhibit the complexity of existing methods based, for example, on clustering strategies.
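
As a rough illustration of how three descriptor differences might be combined into a single frame difference, the following Python sketch may help. It is not the implementation used in this paper: the function names, the bin counts, the one-level Haar decomposition, the L1-style distances, and the plain average used to combine them are all assumptions chosen for brevity.

```python
import numpy as np

def color_histogram(frame, bins=8):
    """Normalized per-channel histogram of an H x W x 3 uint8 frame."""
    hists = [np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    hist = np.concatenate(hists).astype(float)
    return hist / hist.sum()

def edge_direction_histogram(frame, bins=8):
    """Histogram of gradient directions, weighted by gradient magnitude."""
    gray = frame.mean(axis=2)
    gy, gx = np.gradient(gray)
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx)  # values in [-pi, pi]
    hist, _ = np.histogram(angle, bins=bins, range=(-np.pi, np.pi),
                           weights=magnitude)
    total = hist.sum()
    # A frame with no edges gets a uniform histogram (assumed convention).
    return hist / total if total > 0 else np.full(bins, 1.0 / bins)

def haar_statistics(frame):
    """Mean and standard deviation of the four one-level Haar sub-bands."""
    gray = frame.mean(axis=2)
    h, w = gray.shape
    gray = gray[: h - h % 2, : w - w % 2]  # crop to even dimensions
    a, b = gray[0::2, 0::2], gray[0::2, 1::2]
    c, d = gray[1::2, 0::2], gray[1::2, 1::2]
    bands = [(a + b + c + d) / 4, (a - b + c - d) / 4,
             (a + b - c - d) / 4, (a - b - c + d) / 4]
    return np.array([s for band in bands for s in (band.mean(), band.std())])

def frame_difference(f1, f2):
    """Average of three per-descriptor distances, each scaled to [0, 1]."""
    d_color = np.abs(color_histogram(f1) - color_histogram(f2)).sum() / 2
    d_edge = np.abs(edge_direction_histogram(f1)
                    - edge_direction_histogram(f2)).sum() / 2
    d_wavelet = np.abs(haar_statistics(f1) - haar_statistics(f2)).mean() / 255
    # Hypothetical combination rule: a plain average of the three distances.
    return float((d_color + d_edge + d_wavelet) / 3)
```

Because each frame difference depends only on a pair of frames, a measure of this kind can be computed incrementally as frames arrive, without buffering the whole shot.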
The method can also extract key frames on the fly, that is, it can output key frames while computing the frame differences without having to process the whole shot.

3. Summary evaluation

One of the most challenging topics in the field of video analysis and summarization is that of evaluating the summaries produced by the different key frame extraction algorithms. In their work to design a framework for video summarization, Fayzullin et al. [23] define three properties that must be taken into account when creating a video summary: continuity, priority and repetition. Continuity means that the summarized video must be as uninterrupted as possible. Priority means that, in a given application, certain objects or events may be more important than others, and thus the summary must contain high priority items. Repetition means that it is important to not represent the same events over and over again. It is often very difficult to successfully incorporate these semantic properties in a summarization algorithm. Priority, in particular, is a highly task-dependent property.
It requires that video experts carefully define the summarization rules most suitable for each genre of video sequence processed.

The most common evaluation of a summary relies on the subjective opinion of a panel of users. This shifts the problem of incorporating semantic information
