
An Integrated and Scalable Approach to Video Enhancement in Challenging Lighting Conditions

Xuan Dong, Jiangtao (Gene) Wen, Weixin Li, Yi (Amy) Pang, Guan Wang, Yao Lu, Wei Meng

Abstract—We describe a novel, integrated and scalable approach to video enhancement for video acquired under a broad range of challenging lighting conditions. We show that by using the same core enhancement algorithm with the proper pre-processing module, video clips captured in low lighting, bad weather (e.g. hazy, rainy and snowy conditions), and high dynamic range situations can all benefit from the proposed system. We also propose to utilize the temporal and spatial redundancies inherent in video signals not only to facilitate real-time processing but also to improve the temporal and spatial consistency of the output and the overall visual quality. Various techniques to further improve the visual quality of the output are described, making the proposed approach a scalable system that can be deployed as an integrated module in either a video encoder or a video decoder, or as a combination of a codec module and a post-processing system for better visual quality.

Index Terms—Video Enhancement, Computational Photography.

I. INTRODUCTION

Mobile cameras such as those embedded in smart phones are increasingly widely deployed, and are expected to acquire, record, and sometimes compress and transmit video in all lighting and weather conditions. On the iPhone, software such as Skype and FaceTime supports real-time two-way video conferencing over 3G or WiFi networks using the video camera on the phone. The popular Flip camera can not only record video in HD resolution, but also upload the clips to video sharing or social network sites such as YouTube or Facebook. On YouTube, among the over 13 million hours of video that users uploaded in 2010, at least 3 of the top 10 most watched clips (No. 9, "Jimmy Surprises Bieber Fan", No. 6, "Yosemite Bear Mountain Giant Double Rainbow", and No. 3, "Greyson Chance singing Paparazzi") were apparently shot with non-professional equipment [1].

The majority of portable cameras, however, are not specifically designed to be all-purpose and weather-proof, rendering the video footage unusable under many circumstances.

Image and video processing and enhancement techniques, including gamma correction, de-hazing, and de-blurring, are well-studied areas. Although many algorithms perform well for specific lighting impairments, they often require tedious and sometimes input-dependent manual fine-tuning of algorithm parameters. In addition, different types of impairments often require different specific algorithms. Take low lighting video enhancement as an example. Although far and near infrared based systems ([2], [3], [4], [5]) are widely used, especially for "professional" video surveillance applications, they are usually more expensive, harder to maintain, and have a relatively shorter life-span than conventional systems. They also introduce extra, and often considerable, power consumption. In many consumer applications such as video capture and communications on smart phones, it is usually not feasible to deploy infrared systems due to such cost and power consumption issues. On the other hand, image and video processing based low lighting enhancement algorithms combining noise reduction, contrast enhancement, tone-mapping, histogram stretching, equalization, and gamma correction techniques have made tremendous progress over the years, and algorithms such as [6] and [7] have produced very good enhancement results.

Xuan Dong, Jiangtao (Gene) Wen, Yi (Amy) Pang and Wei Meng are with the Computer Science and Technology Department, Tsinghua University, Beijing, China, 100084. Weixin Li and Guan Wang are with the Computer Science and Technology Department, Beihang University, Beijing, China, 100191. Yao Lu is with the Electronic Engineering Department, Tsinghua University, China, 100084. E-mail: jtwen@tsinghua.edu.cn

The algorithm in [6] utilized the temporal correlations of the color and lighting information of pixels, and used spatial-temporal smoothing to reduce the noise level of each frame, followed by further improvement through tone mapping. As the approach was pixel-based and made no distinction between foreground objects and background, the algorithm sometimes resulted in spatial inconsistencies, and/or under-enhancement of the foreground or over-enhancement of the background. The complexity of the overall algorithm was also fairly high: the reported processing speed was only 6 fps even with GPU acceleration. [7] took moving objects into consideration and used bilateral filtering to improve visual quality. However, its computational complexity was also very high, and enhancing each frame took more than ten seconds.

Recently, we proposed a novel low complexity video enhancement algorithm [8]. The algorithm was based on the observation that "after inverting the input, pixels in the background regions of the inverted low-lighting video usually have high intensities in all color (RGB) channels while those of foreground regions usually have at least one color channel whose intensity is low", and that "this is very similar to video captured in hazy weather conditions". As a result, in [8], we proposed to apply image de-hazing algorithms to inverted low-lighting video for enhancement.

On the other hand, even though many video editing software packages of different levels of sophistication are available, due to the time and expertise required, more and more users rely on web-based ("cloud-based") video editing software to



automate the process of editing and uploading video as much as possible. For video clips to be processed by popular web-based systems such as JayCut, they must first be compressed before they can be uploaded over the Internet, and oftentimes the compression is done with sub-optimal settings for the video encoder. Even for local editing by experts with professional software, where compression and uploading are not necessary, video clips captured by portable cameras such as mobile phones or the Flip camera are already compressed, usually by a low power video encoder that introduces significant quality loss. It is therefore required that video processing algorithms be able to handle video containing artifacts created both by capture conditions (e.g. low lighting, high dynamic range, etc.) and by compression.

In this paper, we describe a novel integrated and scalable video enhancement approach applicable to a wide range of input impairments commonly encountered in mobile video applications. The core enhancement algorithm has much lower computational and memory complexity than other existing solutions of similar enhancement performance. In our system, a low complexity automatic module first determines the predominant source of impairment in the input video. The input is then pre-processed based on the particular source of impairment, followed by processing by the core enhancement module. Finally, post-processing is applied to produce the enhanced output. In addition, spatial and temporal correlations are utilized to improve the speed of the algorithm and the visual quality, enabling it to be embedded into video encoders or decoders and to share the temporal and spatial prediction modules of the video codec to further lower complexity.

Although the system in this paper is described in detail in the context of using de-hazing as the core enhancement algorithm, it should be noted that the main contribution of the work is to establish the connections between the enhancement problems for video captured with a wide range of lighting impairments. We show that it is possible to achieve reasonably good enhancement results in real time or close to real time, even with the limited resources available on a netbook or even a mobile phone. We also show that by introducing more optimization techniques, the processing quality can be further improved, thereby achieving a scalable architecture. Using the approach in this paper, one can focus on designing a suitable core enhancement algorithm (based on either de-hazing or low-lighting enhancement techniques), post-optimization algorithms, and/or temporal/spatial acceleration techniques. By integrating specific algorithms and techniques tailored for individual applications into the scalable system, optimized results can be achieved for generic applications as well as for highly specific requirements.

A major advantage of the approach described in this paper is its flexibility, which is reflected in several key aspects. First of all, the system is of a low enough complexity that it can be embedded into a portable camera system. It can also be incorporated into post-processing software with various degrees of complexity for different quality-complexity tradeoffs targeting different applications. Secondly, it can be adopted as a standalone module, or as an integrated part of a video encoder or decoder. By integrating the system into an encoder or a decoder, one can not only share information between the codec and the enhancement system, thereby lowering the combined complexity, but also usually improve the quality after processing. Finally, the multiple steps of the algorithm can be implemented as a complete system, or the baseline features of the system can be implemented on a portable device for basic enhancement in real time applications, while the more sophisticated steps (e.g. further noise reduction and better tone mapping of the preliminarily enhanced video) can be performed on a cloud server. It is also conceivable that, with the advance of computational photography and the development of good high dynamic range or low lighting capable image sensors, the core enhancement module could become the sensor itself, with only the pre-processing and post-processing modules required to handle the challenges in a large variety of applications.

The paper is organized as follows. In Section II, we present the evidence for and establish the connections between video de-hazing, low-lighting video enhancement and high dynamic range video enhancement. We show that for a wide range of applications, especially applications targeting low complexity and mobile platforms, satisfactory results can be achieved with a single core algorithm for a wide range of video enhancement problems. Then, using the low-complexity de-hazing algorithm explained in Section III as an example of a possible choice for the core algorithm, we describe in Section IV various techniques for reducing the computational and memory complexity of the algorithm, and various techniques that can be used in conjunction with the core algorithm to further improve its visual quality. Given that in real-world applications the video enhancement module could be deployed at multiple stages of the end-to-end system, e.g. before compression and transmission/storage, after compression and transmission/storage but before decompression, or after decompression and before the video content is displayed on the monitor, we examine the complexity and rate-distortion (RD) tradeoffs associated with applying the proposed algorithm at these different stages with experimental results in Section V. Finally, we conclude the paper in Section VI.

II. AN INTEGRATED APPROACH TO VIDEO ENHANCEMENT

The motivation for our algorithm is the observation made in [8] that if one performs a pixel-wise inversion of low lighting video, the results look quite similar to hazy video. Through experiments, we found that the same also holds true for a significant percentage of high dynamic range video. Here, the "inversion" operation is simply

R_c(x) = 255 - I_c(x), \qquad (1)

where I_c(x) and R_c(x) are the intensities of color (RGB) channel c at pixel x in the input and inverted frames, respectively.
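As a concrete illustration, the inversion in (1) amounts to a single vectorized operation per frame. The short sketch below is our own minimal example, assuming 8-bit frames stored as NumPy arrays; the function name is ours.

```python
import numpy as np

def invert_frame(frame: np.ndarray) -> np.ndarray:
    """Pixel-wise inversion of an 8-bit frame: R_c(x) = 255 - I_c(x)."""
    assert frame.dtype == np.uint8
    return np.uint8(255) - frame

# Inverting twice recovers the original frame exactly:
# frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
# assert np.array_equal(invert_frame(invert_frame(frame)), frame)
```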

To verify this claim, we randomly selected (via Google) and captured a total of 100 images and video clips each in hazy, low lighting and high dynamic range conditions.




Fig. 1: Examples of original (Top), inverted low lighting videos/images (Middle) and hazy videos/images (Bottom).

Some examples are shown in Fig. 1. As can be clearly seen from Fig. 1, the videos captured in hazy weather are indeed visually similar to videos captured in low lighting and high dynamic range conditions after inversion. This can be understood using the widely used pixel degradation model for hazy images introduced by Koschmieder in 1924 [9],

R(x) = J(x)\,t(x) + A\,(1 - t(x)), \qquad (2)

where A is the global "airlight" (ambient light reflected into the line of sight by atmospheric particles), R(x) is the intensity of pixel x that the camera captures, J(x) is the original intensity of the pixel, and t(x) is the medium transmission function describing the percentage of the light emitted from the objects that reaches the camera. In this model, each degraded pixel is a mixture of the airlight and an unknown surface radiance, and the contribution of each is governed by the medium transmission, which is determined by the scene depth and the scattering coefficient of the atmosphere.

For hazy, low lighting and high dynamic range videos, the light captured by the camera is blended with the airlight. The main difference is the actual brightness of the airlight: brighter in the case of hazy videos, darker in high dynamic range videos, and black in the case of low lighting.
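For intuition, the degradation model (2) can be simulated directly. The sketch below is our own illustration with synthetic J, t and A; it shows that only the brightness of the airlight distinguishes the three cases.

```python
import numpy as np

def koschmieder(J: np.ndarray, t: np.ndarray, A: np.ndarray) -> np.ndarray:
    """Degrade a clean frame J (HxWx3, float in [0,1]) with transmission t (HxW)
    and airlight A (length 3) following R(x) = J(x) t(x) + A (1 - t(x))."""
    t3 = t[..., None]                         # broadcast t over the color channels
    return J * t3 + A * (1.0 - t3)

# A bright airlight produces a hazy look, a near-black airlight a low lighting look:
# J = np.random.rand(240, 320, 3); t = np.full((240, 320), 0.6)
# hazy = koschmieder(J, t, np.array([0.9, 0.9, 0.9]))
# dark = koschmieder(J, t, np.array([0.05, 0.05, 0.05]))
```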

We also performed chi-square tests to examine the statistical similarity between hazy videos and inverted low lighting and high dynamic range videos. The chi-square test is a standard statistical tool widely used to determine whether observed data are consistent with a hypothesis. As explained in [10], in chi-square tests a p value is calculated, and usually, if p > 0.05, it is reasonable to assume that the deviation of the observed data from the expectation is due to chance alone. In our experiments, the expected distribution was calculated from hazy videos, and the observed statistics from inverted low lighting and high dynamic range videos were tested against it. We divided the range [0, 255] of color channel intensities into eight equal intervals, corresponding to 7 degrees of freedom. According to the chi-square distribution table, if we adopt the common standard of p > 0.05, the corresponding upper threshold for the chi-square value should be 14.07. The histograms of the minimum intensity over all color channels of all pixels of the hazy videos, inverted low lighting videos and inverted high dynamic range videos used in the tests were compared; some examples are shown in Fig. 2. The results of the chi-square tests are given in Table I. As can be seen from the table, the chi-square values are smaller than 14.07, indicating that our hypothesis of the similarity between hazy videos and inverted low lighting videos, and between hazy videos and inverted high dynamic range videos, is reasonable.
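The statistic itself is straightforward to compute. The sketch below is our reading of the procedure: the per-pixel minimum over the RGB channels is binned into eight equal intervals of [0, 255], the hazy set provides the expected distribution, and the resulting value is compared against the 14.07 threshold (p > 0.05 at 7 degrees of freedom). The normalization of the expected counts to the observed pixel total is an assumption on our part.

```python
import numpy as np

def min_channel_histogram(frames, bins=8):
    """Proportions of per-pixel minimum RGB intensities over `bins` equal bins of [0, 255]."""
    counts = np.zeros(bins)
    for f in frames:                                   # each f is an HxWx3 uint8 frame
        h, _ = np.histogram(f.min(axis=2), bins=bins, range=(0, 255))
        counts += h
    return counts / counts.sum()

def chi_square_value(observed_frames, expected_frames, bins=8):
    """Chi-square value of the observed histogram against the expected one,
    with both distributions scaled to the observed pixel count."""
    n = sum(f.shape[0] * f.shape[1] for f in observed_frames)
    expected = min_channel_histogram(expected_frames, bins) * n
    observed = min_channel_histogram(observed_frames, bins) * n
    return float(((observed - expected) ** 2 / np.maximum(expected, 1e-9)).sum())

# consistent_with_haze = chi_square_value(inverted_low_light_frames, hazy_frames) < 14.07
```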

Finally, the observation was confirmed by various haze detection algorithms: we implemented haze detection using the HVS threshold range based method [11], the Dark Object Subtraction (DOS) approach [12], and the spatial frequency based technique [13], and found that hazy videos, inverted low lighting videos and inverted high dynamic range videos were all classified as hazy video clips, whereas "normal" clips were not.

In our experiments, we also tested image and video clips captured in bad weather conditions such as rainy and snowy weather. Some of the examples are given in later sections of the paper.

Based on these visual observations and statistical tests, we believe that for the purpose of video enhancement, especially for applications on mobile devices, it is reasonable to categorize lighting impairments into two large classes, namely low lighting video and hazy video. As the experiments and analysis also show similarities between inverted low lighting video and hazy video, it is conceivable that for applications such as mobile systems, a video enhancement system could employ the same core algorithm, integrated with an automatic classifier that decides whether the input is low lighting or hazy video, followed by the inversion operation if necessary, and then processing by the core algorithm, as sketched below. Although the rest of the paper uses a de-hazing algorithm as the core enhancement algorithm, it is also possible for some systems to use a low-lighting enhancement module as the core processing module while performing the inversion operation on hazy, as opposed to low lighting, inputs.
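The resulting pipeline is compact. The sketch below summarizes the flow under the stated design; `detect_impairment` and `dehaze` are placeholders for the modules described in Section III.

```python
def enhance_frame(frame, detect_impairment, dehaze):
    """Integrated enhancement with a single de-hazing core.
    `detect_impairment` returns 'normal', 'haze' or 'low_light'
    (the latter also covering high dynamic range input)."""
    kind = detect_impairment(frame)
    if kind == 'normal':
        return frame                       # no processing needed
    if kind == 'low_light':
        return 255 - dehaze(255 - frame)   # invert, de-haze, invert back
    return dehaze(frame)                   # hazy / rainy / snowy input
```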

III. BASELINE INTEGRATED VIDEO ENHANCEMENT SYSTEM FOR CHALLENGING LIGHTING CONDITIONS - AN EXAMPLE

Given the connections between the enhancement problems for video captured in different challenging lighting conditions, our baseline experimental system consists of an automatic impairment detection module and a video de-hazing based core enhancement module. As already pointed out in the introduction, this particular implementation is simply one among many possible designs of the concept. It is intended to show that even a relatively simple design with off-the-shelf techniques can achieve reasonably good results for many applications and on many platforms.



Fig. 2: Histograms of the minimum intensity over each pixel's three color channels for hazy videos (Left), low lighting videos (Middle) and high dynamic range videos (Right).

TABLE I: Results of chi-square tests

Data of chi-square test                              Degrees of freedom    Chi-square value
Hazy videos and inverted low lighting videos                  7                 13.21
Hazy videos and inverted high dynamic range videos            7                 11.53

TABLE II: Parameter settings for the haze detection algorithm.

Color attribute    Value range    Threshold range
S                  0 ∼ 255        0 ∼ 130
V                  0 ∼ 255        90 ∼ 240

A. Automatic Impairment Source Detection

The function of the automatic impairment source detection module is to classify the input video into normal video that does not need to be processed, low lighting video (which also includes high dynamic range video), for which pixel-wise inversion is performed first, or hazy video (which also includes video captured in rainy and snowy weather), which is processed by the core enhancement module directly.

A flow diagram of this automatic detection system is shown in Fig. 3. Our detection algorithm is based on the technique introduced by R. Lim et al. [11]. To reduce complexity, we only perform the detection for the first frame in a Group of Pictures (GOP), coupled with scene change detection. The corresponding algorithm parameters are given in Table II. The same test is conducted for each pixel in the frame. If the percentage of hazy pixels in a picture is higher than 60%, we designate the picture as a hazy picture. Similarly, if an image is determined to be a hazy picture after inversion, it is a low lighting image. A sketch of this classification logic is given below.
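The sketch below is a minimal version of this classifier, assuming 8-bit HSV attributes on the 0-255 scale (e.g. as returned by OpenCV) so that the thresholds of Table II apply directly; details of [11] beyond the table are not reproduced here.

```python
import cv2
import numpy as np

S_RANGE = (0, 130)     # saturation threshold range from Table II
V_RANGE = (90, 240)    # value threshold range from Table II

def is_hazy(frame_bgr: np.ndarray, ratio: float = 0.6) -> bool:
    """A frame is designated hazy if more than `ratio` of its pixels fall
    inside both threshold ranges."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    s, v = hsv[..., 1], hsv[..., 2]
    hazy = (s >= S_RANGE[0]) & (s <= S_RANGE[1]) & (v >= V_RANGE[0]) & (v <= V_RANGE[1])
    return float(hazy.mean()) > ratio

def classify_impairment(frame_bgr: np.ndarray) -> str:
    """Return 'haze', 'low_light' (hazy after inversion) or 'normal'."""
    if is_hazy(frame_bgr):
        return 'haze'
    if is_hazy(255 - frame_bgr):
        return 'low_light'
    return 'normal'
```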

B. Video De-Hazing Based Core Enhancement

Similar to [8], in our experiments we used a system in which the core enhancement algorithm is an improved video de-hazing algorithm based on the image de-hazing algorithm of [14].

Fig. 3: Flow diagram of the impairment source detection module.

Like many other advanced haze-removal algorithms such as [15], [16], and [17], the algorithm of [14] is based on the aforementioned Koschmieder model in (2). The critical part of all image de-hazing algorithms based on the Koschmieder model is to estimate A and t(x) from the recorded image intensity R(x) so as to recover J(x).

Following [14], we estimate the medium transmission and the airlight using the Dark Channel method:

t(x) = 1 - \omega \min_{c \in \{r,g,b\}} \min_{y \in \Omega(x)} \frac{R_c(y)}{A_c}, \qquad (3)

where ω = 0.8 and Ω(x) is a local 3 × 3 block centered at x.
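A direct NumPy sketch of (3) is given below; the per-channel airlight A is assumed to be known, and edge pixels are handled by replication, which is our choice rather than something specified in [14].

```python
import numpy as np

def transmission(R: np.ndarray, A: np.ndarray, omega: float = 0.8, block: int = 3) -> np.ndarray:
    """t(x) = 1 - omega * min_c min_{y in Omega(x)} R_c(y) / A_c for an HxWx3 frame R,
    per-channel airlight A (length 3) and a block x block window Omega(x)."""
    dark = (R.astype(np.float64) / A.reshape(1, 1, 3)).min(axis=2)  # min over channels
    pad = block // 2
    padded = np.pad(dark, pad, mode='edge')
    H, W = dark.shape
    local_min = np.full_like(dark, np.inf)
    for dy in range(block):                 # min over the local window
        for dx in range(block):
            local_min = np.minimum(local_min, padded[dy:dy + H, dx:dx + W])
    return 1.0 - omega * local_min
```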

In our experiments, the CPU- and memory-intensive soft matting method proposed in [14] was not implemented in the baseline system, but it could be used as a post-processing step, e.g. if the output of the baseline system is subsequently uploaded to a high-powered server in the cloud.

To estimate the airlight, we first note that the schemes in existing image haze removal algorithms are usually not very robust: even very small changes to the airlight value may lead to very large changes in the recovered images or video frames. As a result, calculating the airlight frame by frame



not only increases the overall complexity of the system, but also introduces visual inconsistencies between frames, thereby creating annoying visual artifacts. Fig. 5 shows an example using the results of the algorithm in [14]; notice the difference between the first and second frames in the middle row.

Fig. 4: Examples of the processing steps of the low lighting enhancement algorithm (Left to Right, Top to Bottom): input image I, inverted input image R, haze removal result J of the image R, and output image.

Fig. 5: Comparison of original, haze removal, and optimized haze removal video clips. Top: input video sequences; Middle: outputs of the image haze removal algorithm of [14]; Bottom: outputs of haze removal using our optimized algorithm for calculating the airlight.

Based on this observation, we calculate the airlight value only for the first frame in a GOP. The same value is then used for all subsequent frames in the same GOP. In the implementation, we also incorporated a scene change detection module so as to detect sudden changes in airlight that are not aligned with GOP boundaries but merit recalculation. Among successive GOPs, to avoid abrupt changes of the global airlight value A, we refresh the airlight by

A = 0.4\,A + 0.6\,A_t, \qquad (4)

where A_t is the airlight value calculated in GOP t and A is the global airlight value. Examples of the recovered results are shown in Fig. 5: the frames in the bottom row change gradually using our algorithm, as opposed to the results for the same frames produced by the frame-by-frame approach in the middle row.
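A sketch of the per-GOP airlight refresh of (4) is shown below. How A_t is obtained for the first frame of a GOP only loosely follows the usual dark-channel practice of [14] (the mean color of the brightest dark-channel pixels), so that part is an assumption of ours.

```python
import numpy as np

def estimate_airlight(frame: np.ndarray, top_fraction: float = 0.001) -> np.ndarray:
    """Per-channel airlight estimate: mean color of the brightest dark-channel pixels."""
    dark = frame.min(axis=2).ravel()
    k = max(1, int(top_fraction * dark.size))
    idx = np.argpartition(dark, -k)[-k:]
    return frame.reshape(-1, 3)[idx].astype(np.float64).mean(axis=0)

class AirlightTracker:
    """Keeps the global airlight A fixed within a GOP and refreshes it at GOP
    boundaries (or detected scene changes) with A = 0.4 A + 0.6 A_t, as in (4)."""
    def __init__(self):
        self.A = None

    def refresh(self, first_frame_of_gop: np.ndarray) -> np.ndarray:
        A_t = estimate_airlight(first_frame_of_gop)
        self.A = A_t if self.A is None else 0.4 * self.A + 0.6 * A_t
        return self.A
```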

Once A is found, from (2),

J(x) = \frac{R(x) - A}{t(x)} + A. \qquad (5)

Although (5) works reasonably well for haze removal, for low-lighting enhancement we found that it might lead to under-enhancement of low luminance areas and over-enhancement of high luminance areas. To solve this problem, we modified (5) to

J(x) = \frac{R(x) - A}{P(x)\,t(x)} + A, \qquad (6)

where

P(x) = \begin{cases} K\,t(x), & 0 < t(x) \le 0.5, \\ -K\,t^2(x) + M, & 0.5 < t(x) \le 1. \end{cases} \qquad (7)

In (7), K = 0.6 and M = 0.5, determined through experiments.

The idea behind (6) is as follows. When t(x) is smaller than 0.5, which means that the corresponding pixel needs boosting, we assign P(x) a small value to make P(x)t(x) even smaller, thereby increasing the corresponding J(x) and hence the RGB intensities of the pixel. On the other hand, when t(x) is greater than 0.5, we refrain from overly boosting the corresponding pixel intensity. When t(x) is close to 1, the quantity P(x)t(x) leads to a slight "dulling" of the pixel. This makes the overall visual quality more balanced and visually pleasant.
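Equations (6) and (7) translate directly into a few vectorized lines. The sketch below uses the experimentally chosen K = 0.6 and M = 0.5; the small epsilon guarding the division and the final clipping to [0, 255] are our additions.

```python
import numpy as np

def recover(R: np.ndarray, t: np.ndarray, A: np.ndarray,
            K: float = 0.6, M: float = 0.5) -> np.ndarray:
    """J(x) = (R(x) - A) / (P(x) t(x)) + A with the piecewise P(x) of (7).
    R is HxWx3 float, t is HxW, A is a per-channel airlight of length 3."""
    P = np.where(t <= 0.5, K * t, -K * t ** 2 + M)
    denom = np.maximum(P * t, 1e-3)[..., None]   # guard against division by ~zero
    J = (R - A) / denom + A
    return np.clip(J, 0.0, 255.0)
```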



Fig. 6: Examples of optimizing the low lighting and high dynamic range enhancement algorithm by introducing P(x): input (Left), output of the enhancement algorithm without P(x) (Middle), and output of the enhancement algorithm with P(x) (Right).

For low lighting and high dynamic range videos, once J(x) is recovered, the inversion operation (1) is applied again to produce the enhanced version of the original input. This process is illustrated in Fig. 4. The improvement obtained by introducing P(x) can be seen in Fig. 6.

Fig. 7: Histogram of the differences in t(x) values between each predicted block's pixels and its reference block's pixels (horizontal axis: relative difference of t(x)).

IV. OPTIMIZATIONS OF THE BASELINE SYSTEM

A. Algorithmic Optimizations

1) Motion Estimation Based Acceleration and Quality Improvement:

The algorithm described in Section III is a frame-based approach, and the calculation of t(x) consumes about 60% of the total computation time. For real-time, low complexity processing of video inputs, calculating t(x) frame by frame not only has high computational complexity, but also makes the output much more sensitive to temporal and spatial noise, and harms the temporal and spatial consistency of the processed outputs.

To remedy these problems, we note that t(x) and the other model parameters are correlated temporally and spatially. As a result, their calculation can be expedited using motion estimation/compensation (ME/MC) techniques.

ME/MC is a key procedure in all state-of-the-art video compression algorithms. By matching blocks in subsequently encoded frames to find the "best" match between a current block and a block of the same size that has already been encoded and then decoded (the "reference"), video compression algorithms use the reference as a prediction of the current block and encode only the difference (termed the "residual") between the reference and the current block, thereby improving coding efficiency. The process of finding the best match between a current block and a block in a reference frame is called "motion estimation", and the "best" match is usually determined by jointly considering the rate and distortion costs of the match. If a "best" match block is found, the current block is encoded in the inter mode and only the residual is encoded. Otherwise, the current block is encoded in the intra mode. The most commonly used distortion metric in motion estimation is the Sum of Absolute Differences (SAD).

To verify the feasibility of using temporal block matching and ME to expedite the t(x) calculation, we calculated the differences of t(x) values for pixels in the predicted and reference blocks. The statistics in Fig. 7 show that the differences are less than 10% in almost all cases. As a result, we can utilize ME/MC to bypass the calculation of t(x) for the majority of the pixels/frames, and only calculate t(x) for a small number of selected frames. For the remainder of the frames, we use the corresponding t(x) values of the reference pixels. For motion estimation, we used mature fast motion estimation algorithms, e.g. Enhanced Predictive Zonal Search (EPZS) [18]. When calculating the SAD, similar to [19] and [20], we only utilized a subset of the pixels in the current and reference blocks, using the pattern shown in Fig. 8. With this pattern, our calculation "touches" a total of 60 pixels in a 16 × 16 block, or roughly 25%. These pixels are located on either the diagonals or the edges, resulting in about a 75% reduction in SAD calculation when implemented in software on a general purpose processor.

Fig. 8: Subsampling pattern of the proposed fast SAD algorithm.

Specifically, when the proposed algorithm is deployed prior to video compression or after video decompression, we first divide the input frames into GOPs. The GOPs can either contain a fixed number of frames, or be determined based on a maximum GOP size (in frames) and scene change detection. Each GOP starts with an intra-coded frame (I frame), for which all t(x) values are calculated. ME is performed for the remaining frames (P




frames) of the GOP, similar to conventional video encoding. To this end, each P frame is divided into non-overlapping 16 × 16 blocks, for each of which a motion search using the SAD is conducted. A threshold T is defined for the SAD of a block: if the SAD is below the threshold, which means a "best" match block has been found, the calculation of t(x) for the entire macroblock is skipped. Otherwise, t(x) still needs to be calculated. In both cases, the t(x) values for the current frame are stored for reference by the next frame. The flow diagram is shown in Fig. 9; a sketch of the per-block decision is given below.

Fig. 9: Flow diagram of the core enhancement algorithm with ME acceleration.
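In the sketch below, the subsampling mask simply takes the 60 border pixels of a 16 × 16 block as a stand-in for the exact diagonal/edge pattern of Fig. 8, and the threshold T, the motion search itself (e.g. EPZS) and `compute_t` are assumed to be supplied by the surrounding system.

```python
import numpy as np

def border_mask(n: int = 16) -> np.ndarray:
    """Boolean mask selecting the 60 border pixels of an n x n block (about 25%)."""
    m = np.zeros((n, n), dtype=bool)
    m[0, :] = m[-1, :] = m[:, 0] = m[:, -1] = True
    return m

def subsampled_sad(cur: np.ndarray, ref: np.ndarray, mask: np.ndarray) -> int:
    """SAD evaluated only on the masked subset of the block."""
    return int(np.abs(cur[mask].astype(np.int32) - ref[mask].astype(np.int32)).sum())

def block_transmission(cur_block, ref_block, ref_t_block, mask, T, compute_t):
    """Reuse the reference block's t(x) when the subsampled SAD is below T,
    otherwise recompute t(x) for the current block."""
    if subsampled_sad(cur_block, ref_block, mask) < T:
        return ref_t_block.copy()     # temporal reuse of t(x)
    return compute_t(cur_block)       # fall back to the frame-based estimate
```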

In addition to operating as a stand-alone module with uncompressed pixel information as both the input and output, the ME-accelerated enhancement algorithm can also be integrated into a video encoder or a video decoder. When the algorithm is integrated with a video encoder, the encoder and the enhancement can share the ME module. When integrated with the decoder, the system has the potential of using the motion information contained in the input video bitstream directly, thereby bypassing the entire ME process. Such integration will usually lead to a rate-distortion (RD) loss. The reason for this loss is first and foremost that the ME module in the encoder with which the enhancement module is integrated, or the encoder that produced the bitstream that a decoder with enhancement decodes, may not be optimized for finding the best matches in t(x) values. For example, when the enhancement module is integrated with a decoder, it may have to decode an input bitstream encoded by a low complexity encoder using a very small ME range. The traditional SAD or SAD-plus-rate metrics for ME are also not optimal for the t(x) match search. However, through extensive experiments with widely used encoders and decoders, we found that such quality loss was usually small, and well justified by the savings in computational cost. The flow diagrams of integrating the ME acceleration enhancement algorithm into the encoder and decoder are shown in Fig. 13 and Fig. 14. Some of the comparisons can be found in Section V.
2) Visual Quality Improvement with Motion Detection: As mentioned in previous sections, depending on the target application and the camera and processing platforms used, different systems can introduce different add-on modules on top of the baseline system for further improvements in visual quality. In this section we describe one, among many, such possible modules. The idea here is to focus the processing on the moving objects, which are more likely to be in the Regions of Interest (ROIs) and/or more visible to the human visual system. In our experiments, we implemented the algorithm in [21] for the segmentation of moving objects and static background. Then, depending on whether a pixel belongs to the background or a moving object, we modify the parameters K and M in the calculation of P(x) in (7) to K_moving and M_moving for moving objects, or K_background and M_background for the background, respectively. In addition, to avoid abrupt changes of luminance around the edges of moving objects, we define a band W_trans pixels wide around the moving objects as the transition area. For the transition area, P(x) is calculated using

K_{trans} = \frac{d}{W_{trans}}\, K_{moving} + \frac{W_{trans} - d}{W_{trans}}\, K_{background}, \qquad (8)

and

M_{trans} = \frac{d}{W_{trans}}\, M_{moving} + \frac{W_{trans} - d}{W_{trans}}\, M_{background}, \qquad (9)

where d is the distance between the pixel x and the edge of the moving object with which the transition area borders. In our experiments, K_moving is set to 0.6, M_moving to 0.5, K_background to 0.8, and M_background to 1.2.
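The blending of (8) and (9) is a simple linear interpolation across the transition band. The sketch below uses the parameter values quoted above; the clamping of d to [0, W_trans] is our addition.

```python
def transition_params(d: float, W_trans: float,
                      K_moving: float = 0.6, M_moving: float = 0.5,
                      K_background: float = 0.8, M_background: float = 1.2):
    """K_trans and M_trans for a transition-band pixel at distance d from the
    moving object's edge, following (8) and (9)."""
    w = min(max(d, 0.0), W_trans) / W_trans
    K_trans = w * K_moving + (1.0 - w) * K_background
    M_trans = w * M_moving + (1.0 - w) * M_background
    return K_trans, M_trans
```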

B. Implementation Optimizations

In addition to the algorithmic optimizations described in the previous sections, the implementation of the core algorithm can be further optimized by taking advantage of the redundancies inherent in the pixel-wise calculations of t(x) and I_c(x).

First of all, we integrate the calculation of t(x) in (3) into the calculation of J(x) in (6), so that

J(x) = \frac{I(x) - \omega A \min_{c}\min_{y \in \Omega(x)}\left(\frac{I_c(y)}{A_c}\right)}{1 - \omega \min_{c}\min_{y \in \Omega(x)}\left(\frac{I_c(y)}{A_c}\right)}. \qquad (10)

This allows the input I(x) to be enhanced directly without explicitly calculating t(x). It should be noted that the aforementioned ME-based acceleration is still applicable to (10) after replacing the caching of t(x) values with caching of the I_c(y)/A_c terms.

Although the algorithm in this paper, like the de-hazing algorithms in many of the referenced papers, has been described in the RGB space, it can easily be adapted to work




in the YUV space to match the input format of most practical video applications:

Y_{out}(x) = \frac{Y_{in}(x) - \omega A \min_{c}\min_{y \in \Omega(x)}\left(\frac{I_c(y)}{A_c}\right)}{1 - \omega \min_{c}\min_{y \in \Omega(x)}\left(\frac{I_c(y)}{A_c}\right)}, \qquad (11)

U_{out}(x) = \frac{U_{in}(x) - 128}{1 - \omega \min_{c}\min_{y \in \Omega(x)}\left(\frac{I_c(y)}{A_c}\right)} + 128, \qquad (12)

V_{out}(x) = \frac{V_{in}(x) - 128}{1 - \omega \min_{c}\min_{y \in \Omega(x)}\left(\frac{I_c(y)}{A_c}\right)} + 128. \qquad (13)
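A sketch of the YUV-domain form (11)-(13) is given below. It reuses the dark-channel term of (3) computed from the color frame, approximates the luma of the airlight by the mean of its channels, assumes Y, U and V at the same resolution as the color frame, and clips the outputs to the 8-bit range; those last three choices are ours.

```python
import numpy as np

def dark_channel_term(I_rgb: np.ndarray, A: np.ndarray, block: int = 3) -> np.ndarray:
    """min_c min_{y in Omega(x)} I_c(y) / A_c over a block x block window."""
    dark = (I_rgb.astype(np.float64) / A.reshape(1, 1, 3)).min(axis=2)
    pad = block // 2
    padded = np.pad(dark, pad, mode='edge')
    H, W = dark.shape
    out = np.full_like(dark, np.inf)
    for dy in range(block):
        for dx in range(block):
            out = np.minimum(out, padded[dy:dy + H, dx:dx + W])
    return out

def enhance_yuv(Y, U, V, I_rgb, A, omega=0.8):
    """Apply (11)-(13): enhance Y directly and rescale the U-128, V-128 chroma offsets."""
    m = dark_channel_term(I_rgb, A)
    denom = np.maximum(1.0 - omega * m, 1e-3)
    Y_out = (Y - omega * float(A.mean()) * m) / denom
    U_out = (U - 128.0) / denom + 128.0
    V_out = (V - 128.0) / denom + 128.0
    return (np.clip(Y_out, 0.0, 255.0),
            np.clip(U_out, 0.0, 255.0),
            np.clip(V_out, 0.0, 255.0))
```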

Finally, to further speed up the implementation, we exploited the inherent redundancies in the pixel-wise calculation of the minimization in equations (10)-(13), which corresponds to a complexity of k² × W × H comparisons for an input frame of resolution W × H and a search window (for the minimization) of size k × k pixels. To expedite the process, we first find and store the smaller of every two horizontally neighboring pixels in the frame using a sliding horizontal window of size 2, requiring W × H comparisons. Then, by again using a horizontal sliding window of size 2 over the values stored in the previous step, we can find the minimum of every 4 horizontally neighboring pixels in the original input frame. This process is repeated in both the horizontal and vertical directions, until we have found the minimum of all k × k neighborhoods of the input. It is easy to see that such a strategy has a complexity of roughly 2 log₂ k × W × H comparisons, as opposed to k² × W × H for the naive implementation. The process is illustrated for one row of W pixels in Fig. 10, where the red and black lines refer to the comparisons made, each with a sliding window of 2 values; a sketch is given below.

Fig. 10: Fast calculation of the 1-D local minimum value over a row of W pixels.
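The doubling-window minimum is sketched below for a window size k that is a power of two. It computes, for each pixel, the minimum over the k × k block that starts at that pixel (a centered window only needs an extra shift), and replicates edge values near the borders, which is our choice.

```python
import numpy as np

def fast_local_min(img: np.ndarray, k: int) -> np.ndarray:
    """k x k local minimum using repeated width-2 sliding minima: about
    2 * log2(k) * W * H comparisons instead of k^2 * W * H."""
    def running_min(a: np.ndarray, axis: int) -> np.ndarray:
        out = a.copy()
        span = 1
        while span < k:                       # effective window doubles: 2, 4, ..., k
            pad = [(0, 0)] * a.ndim
            pad[axis] = (0, span)
            shifted = np.pad(out, pad, mode='edge')
            shifted = np.take(shifted, np.arange(span, span + a.shape[axis]), axis=axis)
            out = np.minimum(out, shifted)
            span *= 2
        return out
    # horizontal pass followed by vertical pass
    return running_min(running_min(img, axis=1), axis=0)

# Away from the right/bottom borders this matches the naive k x k minimum.
```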

Fig. 11: Example of the low lighting video enhancement algorithm: original input (Left) and the enhancement result (Right).

Fig. 12: Example of the high dynamic range video enhancement algorithm: original input (Left) and the enhancement result (Right).


V. EXPERIMENTAL RESULTS

To evaluate the proposed approach, a series of experiments were conducted on a Windows PC (Intel Core 2 Duo processor running at 2.0 GHz with 3 GB of RAM) and an iPhone 4. On the iPhone, our software can process images and videos either directly from the camera or from the photo album. After the processing is complete, the output is shown on the screen and saved to the photo album automatically. The resolution of the test videos in our experiments was 640 × 480 on the PC and 192 × 144 on the iPhone 4. The enhancement effects and processing speed are reported below. Due to time constraints, we only implemented the frame-by-frame baseline system on the iPhone, and did not employ many further possible optimizations on the PC platform (e.g. assembly coding).

Examples of the enhancement outputs for low lighting, high dynamic range and hazy videos are shown in Fig. 11, Fig. 12 and Fig. 15 respectively. As can be seen from these figures, the improvements in visibility are obvious. In Fig. 11, the yellow light from the windows and signs such as "Hobby Town" and other Chinese characters were recovered in the correct colors. In Fig. 12, the headlight of the car in the original input made the letters on the license plate very difficult to read; after enhancement with our algorithm, the license plate became much more legible. The algorithm also worked well for video captured in hazy, rainy and snowy weather, as shown in Fig. 15, Fig. 16 and Fig. 17. An example of the visual quality improvement using motion detection is shown in Fig. 18.

Fig. 15: Example of the haze removal algorithm: original input (Left) and the enhancement result (Right).

Fig. 16: Example of rainy video enhancement using the haze removal algorithm: original input (Left) and the enhancement result (Right).



Fig. 13: Flow diagram of the integration of the encoder and the ME acceleration enhancement algorithm.

Fig. 14: Flow diagram of the integration of the decoder and the ME acceleration enhancement algorithm.

Fig. 17: Example of snowy video enhancement using the haze removal algorithm: original input (Left) and the enhancement result (Right).

Fig. 18: Example of visual quality improvement with motion detection: original input (Left), the enhancement result (Middle) and the improved result with motion detection (Right).


As mentioned above, there are three possible ways of incorporating ME into the enhancement algorithm: through a separate ME module in the enhancement system, or by reusing the ME module and information already available in a video encoder or in a video decoder. Some example outputs of the frame-wise enhancement algorithm and of these three ways of incorporating ME are shown in Fig. 22, with virtually no visual difference. We also calculated the average RD curves of ten randomly selected experimental videos using the three acceleration methods. The reference was enhancement using the proposed frame-wise enhancement algorithm in the YUV domain. The RD curves of performing the frame-wise enhancement algorithm before encoding or after decoding are shown in Fig. 19, while the results for acceleration using a separate ME module are given in Fig. 20, and those for integrating the ME acceleration into the codec are shown in Fig. 21. As the RD curves in our experiments reflect the aggregated outcome of both coding and enhancement, and because the enhancement was not optimized for PSNR-based distortion, the shape of our RD curves looks slightly different from RD curves for video compression systems.
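To make the evaluation procedure concrete, the following minimal Python sketch shows one way a per-bitrate distortion point could be computed, with the frame-wise enhanced output as the reference, as described above. The function names, the synthetic frames and the luma-only PSNR are assumptions for illustration, not the paper's exact tooling.

import numpy as np

def psnr(ref, test, peak=255.0):
    # Standard PSNR between two frames (assumed 8-bit luma planes here).
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak * peak / mse)

def rd_point(reference_frames, test_frames, bitrate_kbps):
    # One rate-distortion point: the bitrate of the coded stream together with
    # the average PSNR of the coded/accelerated output measured against the
    # frame-wise enhanced reference sequence.
    avg_psnr = float(np.mean([psnr(r, t) for r, t in zip(reference_frames, test_frames)]))
    return bitrate_kbps, avg_psnr

# Hypothetical usage with synthetic frames; in practice the two sequences would
# be the frame-wise enhanced YUV output and one of the ME-accelerated outputs.
ref = [np.random.randint(0, 256, (144, 192), dtype=np.uint8) for _ in range(3)]
test = [np.clip(f.astype(np.int16) + np.random.randint(-2, 3, f.shape), 0, 255).astype(np.uint8) for f in ref]
print(rd_point(ref, test, bitrate_kbps=2000))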

From the results, we found that, in general, performing enhancement before encoding gives better overall RD performance. Although enhancing after decoding means we can transmit un-enhanced video clips, which usually have lower contrast and less detail and are therefore easier to compress, the reconstructed quality after decoding and enhancement is heavily affected by the quality loss introduced during encoding, leading to an overall RD performance loss of about 2 dB for the cases in our experiments. In addition, in Fig. 19 the RD loss of frame-wise enhancement was due to encoding and decoding alone. In Fig. 20, the RD loss resulted from the ME acceleration plus encoding/decoding, while in Fig. 21 it resulted from integrating the ME acceleration algorithm into the encoder and decoder. Overall, however, the RD loss introduced by ME acceleration and integration was small in PSNR terms and not visible subjectively.

We also measured the computational complexity of frame-wise enhancement, of acceleration with a separate ME module, and of integration into an encoder or a decoder. The computational cost was measured as the average time for enhancing each frame. For the cases where the enhancement was integrated into the codec, we did not count the actual encoding or decoding time, so as to measure only the enhancement itself. As shown in Table III, using a separate ME module saved about 28% of the processing time on average compared with the frame-wise algorithm. Integrating with the decoder saved about 40% of the processing time compared with the frame-wise algorithm, while integrating with the encoder saved about 77%.
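As an illustration of how such per-frame timings could be collected, the sketch below times only the enhancement call and averages over the frames of a clip, mirroring the exclusion of encoding/decoding time described above. The enhance_frame function is a placeholder assumption, not the paper's implementation.

import time
import numpy as np

def enhance_frame(frame):
    # Placeholder for the actual per-frame enhancement step; any enhancement
    # function could be timed the same way.
    return 255 - frame

def average_ms_per_frame(frames):
    # Time only the enhancement call, excluding any encoding/decoding work,
    # and report the mean cost per frame in milliseconds.
    total = 0.0
    for frame in frames:
        start = time.perf_counter()
        enhance_frame(frame)
        total += time.perf_counter() - start
    return 1000.0 * total / len(frames)

frames = [np.random.randint(0, 256, (480, 640), dtype=np.uint8) for _ in range(30)]
print("avg ms/frame:", average_ms_per_frame(frames))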

VI. CONCLUSIONS

In this paper, we propose a novel integrated approach to the enhancement of videos acquired under challenging lighting conditions, including low lighting, bad weather (hazy, rainy, snowy) and high dynamic range conditions. We show that for many applications it is usually acceptable to first classify the input video into "normal" video, low lighting video (including high dynamic range video) and hazy video (including video acquired in other bad weather conditions such as rain and snow). Then, because low lighting video and hazy video exhibit very similar visual and statistical characteristics after either one undergoes a pixel-wise inversion operation, a single enhancement module can be used for processing video captured under a broad range of bad lighting conditions.
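The observation above suggests a simple unified pipeline: invert low lighting (or high dynamic range) frames so that they resemble hazy frames, run a single dehazing core, and invert the result back. The Python sketch below illustrates that flow under the stated assumption; dehaze_core is a placeholder for whatever core dehazing algorithm is used, and classify_frame is a hypothetical classifier, neither taken from the paper's code.

import numpy as np

def dehaze_core(frame):
    # Placeholder for the single core dehazing/enhancement module
    # (e.g., a dark-channel-prior style method); not the paper's exact code.
    return frame

def classify_frame(frame):
    # Hypothetical classifier returning 'normal', 'low_light' or 'hazy';
    # the mean-intensity threshold is an arbitrary illustrative choice.
    return 'low_light' if frame.mean() < 80 else 'hazy'

def enhance(frame):
    # Unified pipeline: low lighting frames are inverted so that they look
    # like hazy frames, the shared dehazing core is applied, and the output
    # is inverted back; hazy frames go through the core directly.
    kind = classify_frame(frame)
    if kind == 'normal':
        return frame
    if kind == 'low_light':
        return 255 - dehaze_core(255 - frame)
    return dehaze_core(frame)

frame = np.random.randint(0, 256, (144, 192, 3), dtype=np.uint8)
out = enhance(frame)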

We also present, as an example, a baseline video enhancement system built on a very simple video dehazing algorithm using "off-the-shelf" technologies. The resulting system runs in real time on a PC and achieved good speed and enhancement quality even on an iPhone 4. The results given in the paper demonstrate the usefulness and promise of the approach. Through experiments, we also examine the tradeoffs associated with integrating the proposed system into different "links" of the video acquisition, coding, transmission and consumption chain. Potentially, the proposed approach could be integrated into sensors, codecs and video processing software and systems, or its different processing steps could be distributed across these links to offer a tiered scheme for quality improvement.

Areas of further improvement include better pre-processing filters targeting specific sources of impairment, especially high dynamic range inputs, further optimization using denoising, tone mapping and other techniques, improved core enhancement algorithms, and better acceleration techniques. Also of great importance is a system that can process inputs with compounded impairments (e.g., video of foggy nights, with both haze and low lighting).

[Plot: PSNR (dB) versus bitrate (kb/s), with curves for frame-wise enhancement before encoding and frame-wise enhancement after decoding.]

Fig. 19: RD performance of frame-wise enhancement in encoder and decoder.

[Plot: PSNR (dB) versus bitrate (kb/s), with curves for stand-alone ME enhancement before encoding and after decoding.]

Fig. 20: RD performance of separate ME acceleration enhancement in encoder and decoder.



[Plot: PSNR (dB) versus bitrate (kb/s), with curves for integration of ME enhancement before encoding and after decoding.]

Fig. 21: RD performance of integration of ME acceleration enhancement into encoder and decoder.



TABLE III: Processing speeds of the proposed algorithms on PC (640 × 480) and iPhone 4 (192 × 144)

Algorithm                                                       PC (ms/frame)   iPhone 4 (ms/frame)   Time saved
Frame-wise enhancement algorithm                                27.1            66.3                  N/A
Separate ME acceleration enhancement algorithm                  19.8            N/A                   27.5%
Integration of ME acceleration enhancement into encoder         6.2             N/A                   77.3%
Integration of ME acceleration enhancement into decoder         16.7            N/A                   40.0%


Fig. 22: Examples of comparisons among the frame-wise algorithm and the three proposed ME acceleration methods (Left to Right): Original input, output of frame-wise algorithm, output of separate ME acceleration algorithm, and output of integration of ME acceleration algorithm into encoder and decoder.


