2.2. Shot Boundary Detection

wipe A wipe is a transition in which shot B is revealed progressively, from one side of the frame

to another or in a special shape. Well-known examples of wipes are barn door wipes

(where the wipe proceeds from two opposite edges of the image towards the center)

or matrix wipes. Wipes are often used in presentations but they are very rare in

TV broadcasts.

In their 1996 comparison paper, Boreczky and Rowe (University of California, Berkeley)

[Boreczky and Rowe, 1996] have shown that hard cuts are the most common type

of transition. Gradual transitions are less frequent. Their results, based on 3.8 hours of

TV broadcasts, are shown in Table 2.1.

Video Type       # of frames    Cuts    Gradual Transitions
Television           133,204     831                     42
News                  81,595     293                     99
Movies               142,507     564                     95
Commercials           51,733     755                    254
Miscellaneous         10,706      64                     16
Total                419,745    2507                    506

Table 2.1: Frequency of hard cuts versus gradual transitions in different video types

[Boreczky and Rowe, 1996].
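
To make the proportions in Table 2.1 easier to compare, the following minimal sketch computes the share of gradual transitions among all transitions for each video type. The counts are copied from the table. The names TABLE_2_1 and gradual_share are illustrative assumptions and do not come from [Boreczky and Rowe, 1996].

```python
# Counts copied from Table 2.1: (frames, cuts, gradual transitions).
# The variable and function names are illustrative, not from the cited paper.
TABLE_2_1 = {
    "Television": (133204, 831, 42),
    "News": (81595, 293, 99),
    "Movies": (142507, 564, 95),
    "Commercials": (51733, 755, 254),
    "Miscellaneous": (10706, 64, 16),
}


def gradual_share(cuts, gradual):
    """Return the fraction of all transitions that are gradual."""
    return gradual / (cuts + gradual)


# Share of gradual transitions per video type.
shares = {
    video_type: gradual_share(cuts, gradual)
    for video_type, (_frames, cuts, gradual) in TABLE_2_1.items()
}

# Share of gradual transitions over the whole data set.
total_cuts = sum(cuts for _frames, cuts, _gradual in TABLE_2_1.values())
total_gradual = sum(gradual for _frames, _cuts, gradual in TABLE_2_1.values())
overall_share = gradual_share(total_cuts, total_gradual)

for video_type, share in shares.items():
    print(f"{video_type:<14}{share:6.1%}")
print(f"{'Total':<14}{overall_share:6.1%}")
```

With these counts, roughly 17% of all transitions are gradual. The share ranges from about 5% for television to about 25% for news and commercials, which agrees with the observation that hard cuts dominate in every video type.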


Besides [Boreczky and Rowe, 1996], several more detailed surveys exist, comparing further

shot boundary detection approaches and/or working on different data sets. Some

were already written in the 1990s ([Aigrain and Joly, 1994], [Ahanger and Little, 1996],

[Patel and Sethi, 1996], [Lienhart, 1999] or [Browne et al., 1999]), others are more recent, such as

[Gargi et al., 2000], [Koprinska and Carrato, 2001], [Hanjalic, 2002], [Lefevre et al., 2003],

[Cotsaces et al., 2006] or [Yuan et al., 2007]. Furthermore, since 2001, the shot boundary

detection problem has been included as a task in the Text REtrieval Conference series

(TREC), sponsored by the National Institute of Standards and Technology (NIST).

Since 2003, one of the workshops at this conference (TRECVID) has handled only video

data [Smeaton et al., 2006]. This also involves a good video dataset to benchmark new


The amount of literature existing in this field reflects the number of different approaches

for shot boundary detection. Of course, all methods are based on quantifying the

difference between consecutive frames in a video sequence, and most of them have in

common that features based on the image colors are chosen to distinguish between

different shots. Swain and Ballard [Swain and Ballard, 1991] have shown that “color has

excellent discrimination power in image retrieval systems. It is very rare that two images



