An Integrated and Scalable Approach to Video Enhancement in ...

automate the process of editing and uploading video as much as possible. Video clips to be processed by popular web-based systems such as JayCut must first be compressed before they can be uploaded over the Internet, and the compression is often done with sub-optimal settings for the video encoder. Even for local editing by experts with professional software, where compression and uploading are not necessary, video processing algorithms must be able to handle video containing artifacts created by both capturing (e.g., low lighting, high dynamic range) and compression, because video clips captured by portable cameras, such as mobile phones or the Flip camera, are already compressed, usually by a low-power video encoder that introduces significant quality loss.

In this paper, we describe a novel integrated and scalable video enhancement approach applicable to a wide range of input impairments commonly encountered in mobile video applications. The core enhancement algorithm has much lower computational and memory complexities than other existing solutions of similar enhancement performance. In our system, a low-complexity automatic module first determines the predominant source of impairment in the input video. The input is then pre-processed based on that particular source of impairment, followed by processing by the core enhancement module. Finally, post-processing is applied to produce the enhanced output. In addition, spatial and temporal correlations are utilized to improve the speed of the algorithm and the visual quality, enabling it to be embedded into video encoders or decoders, sharing the temporal and spatial prediction modules in the video codec to further lower complexity.
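As a concrete illustration, the staged pipeline described above can be sketched as follows. This is a minimal sketch under stated assumptions only: the impairment labels, the threshold-based classifier, and the pluggable core are illustrative stand-ins, not the paper's actual modules.

```python
from typing import Callable
import numpy as np

# Illustrative impairment categories; the paper's low-complexity
# automatic module would assign one label per input video.
PRE_PROCESS: dict[str, Callable[[np.ndarray], np.ndarray]] = {
    "low_light": lambda f: 255 - f,  # invert so the frame resembles haze
    "haze": lambda f: f,             # already haze-like: pass through
}

def classify_impairment(frame: np.ndarray) -> str:
    # Toy stand-in for the automatic detector: very dark frames
    # are labeled low-light, everything else haze.
    return "low_light" if frame.mean() < 60 else "haze"

def enhance(frame: np.ndarray,
            core: Callable[[np.ndarray], np.ndarray]) -> np.ndarray:
    # Pre-process -> core enhancement (e.g. de-hazing) -> post-process.
    kind = classify_impairment(frame)
    enhanced = core(PRE_PROCESS[kind](frame))
    # Post-processing undoes the inversion for non-hazy inputs.
    return 255 - enhanced if kind == "low_light" else enhanced
```

With an identity core, the pre- and post-processing steps cancel, which makes the staging easy to check in isolation before a real de-hazing core is plugged in.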

Although the system in this paper is described in detail in the context of using de-hazing as the core enhancement algorithm, it should be noted that the main contribution of the work is to establish the connections between the problems of enhancing video captured with a wide range of lighting impairments. We show that it is possible to achieve reasonably good enhancement results in real time or close to real time, even with the limited resources available on a netbook or even a mobile phone. We also show that by introducing more optimization techniques, the processing quality can be further improved, thereby achieving a scalable architecture. Using the approach in this paper, one could focus on designing a suitable core enhancement algorithm (based on either de-hazing or low-lighting enhancement techniques), post-optimization algorithms, and/or temporal/spatial acceleration techniques. By integrating specific algorithms and techniques tailored for individual applications into the scalable system, optimized results could be achieved for generic applications as well as for highly specific requirements.
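For readers unfamiliar with de-hazing cores, one widely used building block is the dark channel of He et al.: a per-pixel minimum over the color channels followed by a local minimum filter. The sketch below computes only that standard quantity; it is not the paper's own low-complexity algorithm (which is described in Section III), and the function name and patch size are our choices.

```python
import numpy as np

def dark_channel(frame: np.ndarray, patch: int = 7) -> np.ndarray:
    # Standard dark channel: minimum over RGB, then a patch-wise local
    # minimum. In haze-free regions this tends to be near zero, which
    # is the prior that de-hazing methods exploit.
    per_pixel_min = frame.min(axis=2)
    pad = patch // 2
    padded = np.pad(per_pixel_min, pad, mode="edge")
    h, w = per_pixel_min.shape
    out = np.empty_like(per_pixel_min)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out
```

The explicit double loop keeps the sketch readable; a real low-complexity implementation would replace it with a separable or incremental minimum filter.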

A major advantage of the approach described in this paper is its flexibility, which is reflected in several key aspects. First, the system has a low enough complexity to be embedded into portable camera systems. It can also be incorporated into post-processing software at various levels of complexity, offering different quality-complexity tradeoffs for different applications. Second, it can be adopted as a standalone module or as an integrated part of a video encoder or decoder. By integrating the system into an encoder or a decoder, one can not only share information between the codec and the enhancement system, thereby lowering the combined complexity, but also usually improve the quality after processing. Finally, the multiple steps of the algorithm can be implemented as a complete system; alternatively, the baseline features can be implemented on a portable device for basic enhancement in real-time applications, while the more sophisticated steps (e.g., further noise reduction and better tone mapping of the preliminarily enhanced video) can be done on a cloud server. It is also conceivable that, with the advance of computational photography and the development of image sensors with good high-dynamic-range or low-lighting capability, the core enhancement module could become the sensor itself, with only the pre-processing and post-processing modules required to handle the challenges in a large variety of applications.

The paper is organized as follows. In Section II, we present the evidence for, and establish the connections between, video de-hazing, low-lighting video enhancement, and high dynamic range video enhancement. We show that for a wide range of applications, especially those targeting low complexity and mobile platforms, satisfactory results can be achieved with a single core algorithm for a wide range of video enhancement problems. Then, using the low-complexity de-hazing algorithm explained in Section III as an example of a possible choice for the core algorithm, we explain in Section IV various techniques for reducing the computational and memory complexities of the algorithm, as well as various techniques that can be used in conjunction with the core algorithm to further improve its visual quality. Given that in real-world applications the video enhancement module could be deployed at multiple stages of the end-to-end system, e.g., before compression and transmission/storage, or after compression and transmission/storage but before decompression, or after decompression and before the video content is displayed on the monitor, we examine the complexity and rate-distortion (RD) tradeoffs associated with applying the proposed algorithm at these different stages, with experimental results, in Section V. Finally, we conclude the paper in Section VI.

II. AN INTEGRATED APPROACH TO VIDEO ENHANCEMENT

The motivation for our algorithm is the observation made in [8] that if one performs a pixel-wise inversion of low-lighting video, the result looks quite similar to hazy video. Through experiments, we found that the same also holds true for a significant percentage of high dynamic range video. Here, the "inversion" operation is simply

R_c(x) = 255 - I_c(x),    (1)

where I_c(x) and R_c(x) are the intensities of the corresponding color (RGB) channel c at pixel x in the input and inverted frame, respectively.
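For an 8-bit frame, Eq. (1) can be applied directly to the whole array; the NumPy sketch below is a straightforward illustration (the function name is ours).

```python
import numpy as np

def invert_frame(frame: np.ndarray) -> np.ndarray:
    # Pixel-wise inversion R_c(x) = 255 - I_c(x), applied to every
    # RGB channel of an 8-bit frame at once.
    return 255 - frame

# A dark, low-lighting frame inverts into a bright, hazy-looking one.
dark = np.full((2, 3, 3), 20, dtype=np.uint8)   # mostly-black pixels
hazy_looking = invert_frame(dark)               # every value becomes 235
```

Note that the operation is an involution: inverting twice recovers the original frame, which is what allows the post-processing stage to undo the pre-processing inversion.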

To verify the claim, we randomly selected (by Google) and captured a total of 100 images and video clips each in
