
Effect of Illumination on Automatic Expression Recognition: A Novel 3D Relightable Facial Database

Giota Stratou    Abhijeet Ghosh    Paul Debevec    Louis-Philippe Morency
Institute for Creative Technologies, University of Southern California
{stratou, ghosh, debevec, morency}@ict.usc.edu

Abstract— One of the main challenges in facial expression recognition is illumination invariance. Our long-term goal is to develop a system for automatic facial expression recognition that is robust to light variations. In this paper, we introduce a novel 3D Relightable Facial Expression (ICT-3DRFE) database that enables experimentation in the fields of both computer graphics and computer vision. The database contains 3D models for 23 subjects and 15 expressions, as well as photometric information that allows for photorealistic rendering. It is also annotated with facial action units, following FACS standards. Using the ICT-3DRFE database, we create an image set of different expressions/illuminations to study the effect of illumination on automatic expression recognition. We compared the output scores from automatic recognition with expert FACS annotations and found that they agree when the illumination is uniform. Our results show that the output distribution of the automatic recognition can change significantly with light variations, sometimes diminishing the discrimination between two different expressions. We propose a ratio-based light transfer method to factor out unwanted illumination from given images and show that it reduces the effect of illumination on expression recognition.

I. INTRODUCTION

One of the main challenges in facial expression recognition is to achieve illumination invariance. Prior studies show that changing the direction of illumination can influence the perception of object characteristics such as 3D shape and location [1]. Relative to common image representations, changes in lighting result in large image differences; these observed changes can be even larger than those caused by varying the identity of the subject [2].

These studies suggest that both human and automated facial identification are impaired by variations in illumination. By extension, we expect a similar impediment to facial expression recognition. This intuition is strengthened by four observations: i) changes in facial expression are manifested as deformations of the shape and texture of the facial surface; ii) illumination variance has been shown to influence the perception of shape, which confounds face recognition; iii) most methods for automated expression recognition use image representations, features, and processing techniques similar to those of face recognition methods [3], which are also confounded by illumination variance; and iv) the training set for most classifiers consists mainly of uniformly lit images.

While most automatic systems for facial expression recognition assume input images with relatively uniform illumination, research such as Li et al. [4], Kumar et al. [5] and Toderici et al. [6] has worked toward illumination invariance by extracting features that are insensitive to lighting.

Fig. 1. A sample 3D model from ICT-3DRFE and some of its corresponding textures and photometric surface normal maps. First row: a) diffuse texture, b) specular texture, c) diffuse (red channel) normals, d) specular normals. Second row, from left to right: 3D geometry of a subject posing for the "eyebrows up" expression; the same pose rendered with texture and simple point lights; a different pose of the model rendered under environmental lighting.

To serve this direction of research, facial databases have been assembled which capture the same face and pose under different illumination conditions; lately, the development of 3D facial databases has also become of interest, since they allow the exploration of new 3D features.

In this paper, we introduce a novel 3D Relightable Facial Expression (ICT-3DRFE) database which enables studies of facial expression recognition and synthesis. We demonstrate the value of having such a database while exploring the effect of illumination on facial expression recognition. First, we use the ICT-3DRFE database to create a sample database of images to study the effect of illumination. We use the Computer Expression Recognition Toolbox (CERT) [7] to evaluate specific facial Action Units (AUs) on that image set, and we compare the CERT output with a FACS (Facial Action Coding System) expert coder's annotations. We also compare the CERT output of specific expressions under different illuminations to observe how lighting variation affects its ability to distinguish between expressions. Second, we present


an approach to factor out lighting variation to improve the accuracy of automatic expression recognition. For this purpose, we employ ratio images, as in the approach of Peers et al. [8], to transfer the uniformly lit appearance of a similar face in the ICT-3DRFE database to a target face seen under non-uniform illumination. In this approach, we use the ICT-3DRFE database to select a matching subject and transfer the illumination. We evaluate whether "unlighting" a face in this way can improve the performance of expression recognition software. Our experiments show promising results.

The remainder of this paper is arranged as follows: in Section II, we discuss previous work on automatic facial expression recognition. We also survey the state of the art in facial expression databases and mention other face relighting techniques relevant to facial expression recognition. In Section III, we introduce our new ICT-3DRFE database, discussing its advantages and how it was assembled. Section IV describes our experiment on the effect of illumination on facial expression recognition using the ICT-3DRFE database. Section V describes our illumination transfer technique for mitigating the effects of illumination on expression recognition, showing how this improves AU classification. We conclude with a discussion of future work in Section VI.

II. PREVIOUS WORK

A. Facial Expression Recognition

There has been significant progress in the field of facial expression recognition in the last few decades. Two popular schemes for describing facial expressions are: i) facial Action Units (AUs) according to the Facial Action Coding System (FACS) proposed by Ekman et al. [10], and ii) the set of prototypic expressions, also defined by Ekman [11], that relate to emotional states including happiness, sadness, anger, fear, disgust and surprise. Systems for automatic expression recognition commonly use AU analysis as a low-level expression classification, followed by a second level of classification of AU combinations into one of the basic expressions [13]. Traditional automatic systems use geometric features such as the locations of facial landmarks (corners of the eyes, nostrils, etc.) and the spatial relations among them (shape of eyes, mouth, etc.) [3], [12]. Bartlett et al. found in practice that image-based representations contain more information for facial expressions than representations based on shape only [14]. Recent methods focus either solely on appearance features (representing the facial texture), like Bartlett et al. [14], who use Gabor wavelets or eigenfaces, or on hybrid methods using both shape- and appearance-based features, as in the case of Lucey et al. [15], which uses an Active Appearance Model (AAM). There is also a rising interest in the use of 3D facial geometry to extract expression representations that are view and pose invariant [13].
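To make the two-level pipeline concrete, the following minimal Python sketch maps hypothetical level-one AU scores onto a prototypic expression. The AU-to-expression rules and the threshold are illustrative simplifications for this sketch only, not the actual tables or classifiers used in the systems cited above.

    # Illustrative two-level classification: level one yields AU scores,
    # level two maps AU combinations to a prototypic expression.
    # Hypothetical level-one output: AU number -> score in [0, 1].
    au_scores = {1: 0.8, 2: 0.7, 4: 0.0, 5: 0.6, 26: 0.5}

    # Simplified example rules (a small subset, for illustration only).
    RULES = {
        "surprise":  {1, 2, 5, 26},  # brow raises, upper lid raise, jaw drop
        "happiness": {6, 12},        # cheek raise, lip corner pull
        "sadness":   {1, 4, 15},     # inner brow raise, brow lower, lip depress
    }

    def classify(au_scores, threshold=0.3):
        """Return the expression whose required AUs are best covered."""
        active = {au for au, s in au_scores.items() if s >= threshold}
        best, best_overlap = "neutral", 0.0
        for label, required in RULES.items():
            overlap = len(active & required) / len(required)
            if overlap > best_overlap:
                best, best_overlap = label, overlap
        return best

    print(classify(au_scores))  # -> surprise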

B. Facial Databases

Facial expression databases are very important for facial expression recognition, because there is a need for common ground on which to evaluate various algorithms. These databases usually consist of static images or image sequences.

Fig. 2. Acquisition setup for ICT-3DRFE. Left: LED sphere with 156 white LED lights. Right: layout showing the positioning of the stereo pair of cameras and the projector for face scanning.

The most commonly used facial expression databases include the Cohn-Kanade facial expression database [16], which is AU coded; the Japanese Female Facial Expression (JAFFE) database [17]; the MMI database [18], which includes both still images and image sequences; the CMU-PIE database [19], with pose and illumination variation for each subject; and other databases [20]. Since the introduction of 3D into facial expression recognition, 3D databases have gained in popularity. The most common is the BU-3DFE database, which includes 3D models and considers intensity levels of expressions [21]. BU-3DFE was extended to BU-4DFE by including temporal data [22]. The latest facial expression databases are the Radboud Faces Database (RaFD), which considers contempt, a non-prototypic expression, and different gaze directions [23], and the extended Cohn-Kanade (CK+) database, which extends the older CK, is fully FACS coded, and includes emotion labels [24].

Our new ICT-3DRFE database also includes 3D models, considers different gaze directions, and is AU annotated. In contrast to the other databases, however, our ICT-3DRFE database offers much higher resolution in its 3D models, and it is the only photorealistically relightable database.

C. Face Relighting

One of our ultimate goals is to factor out the effect of illumination on facial expression recognition. For that, we leverage image-based relighting techniques, which have been extensively studied in computer graphics. Debevec et al. [26] photograph a face under a dense sampling of lighting directions using a spherical light stage and exploit the linearity of light transport to accurately render the face under any distant illumination environment from such data. While realistic and accurate, the technique can be applied only to subjects captured in a light stage. Peers et al. [8] overcame this restriction through an appearance transfer technique based on ratio images [9], allowing a single photograph of a face to be approximately relit using light stage data of a similar-looking subject from a database. Ratio images have also been used to transfer facial expressions from one image to another by Liu et al. [25]. A few other researchers have explored relighting methods to enhance facial recognition: Kumar et al. [5] use morphable reflectance fields to augment image databases with relit images of the existing set, and Toderici et al. [6] use bidirectional relighting.


Fig. 3. The 23 subjects of the ICT-3DRFE database.

Our approach to factoring out the effect of illumination on a target face is similar in principle to that of Peers et al. [8]. The main difference from most of these approaches is that while they relight a uniformly illuminated target face to a desired non-uniform lighting condition, we relight the target face image from a known non-uniform lighting condition to a uniform lighting condition for robust facial expression recognition.
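The linearity of light transport that Debevec et al. [26] exploit reduces relighting to a weighted sum of one-light-at-a-time basis images. A minimal numpy sketch of that principle follows; the array shapes and the random stand-ins for captured data and light-probe weights are assumptions for illustration, not the authors' pipeline.

    import numpy as np

    # Relighting as a linear combination of basis images: a face
    # photographed under each of N light-stage directions can be
    # re-rendered under a distant environment as sum_i w_i * I_i.
    N, H, W = 156, 480, 360                 # 156 LEDs; hypothetical resolution
    olat = np.random.rand(N, H, W, 3)       # stand-in for captured basis images
    env_weights = np.random.rand(N, 3)      # environment sampled at each LED [29]

    # Weighted per-channel sum over the N lighting directions.
    relit = np.einsum('nhwc,nc->hwc', olat, env_weights)
    print(relit.shape)                      # (480, 360, 3)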

III. 3D-RFE DATASET

The main contribution of this paper is the introduction of a new 3D Relightable Facial Expression database.¹ As with any 3D database, a great advantage of having 3D geometry is that one can use it to extract geometric features that are pose and viewpoint invariant. In our ICT-3DRFE database, the detail of the geometry is higher than in any other existing 3D database, with each model having approximately 1,200,000 vertices and reflectance maps of 1296 × 1944 pixels. This resolution captures detail down to the sub-millimeter skin pore level, increasing its utility for the study of geometric and 3D features. Besides high resolution, relightability is the other main novelty of this database. The reflectance information provided with every 3D model allows the faces to be rendered realistically under any given illumination. For example, one could use a light probe [29] to capture the illumination in a specific scene and render a face in the ICT-3DRFE database with that lighting. This property, along with the traditional advantages of a 3D model database (such as controlling the pose while rendering), enables many uses. In Section IV, we use our ICT-3DRFE database to study the effect of illumination on facial expressions by creating a database of facial images under chosen illumination conditions and poses. In Section V, we use the database as a tool for removing illumination effects from facial images. Figure 1 displays a sample 3D model from the ICT-3DRFE database under different poses and illuminations.

A. Acquisition Setup

The ICT-3DRFE dataset introduced in this paper was acquired using a high-resolution face scanning system that employs a spherical light stage with 156 white LED lights (Figure 2A).

¹ This database will be made publicly available at http://projects.ict.usc.edu/3drfe/

Fig. 4. The 15 expressions captured for every subject: neutral (eyes closed), neutral (eyes open), eyebrows up, eyebrows down, scrunch; happiness, sadness, anger, fear, disgust, surprise; eye gaze up, down, right, and left. The ones annotated with an "Ex" label are the expressions used in our experiments.

The lights are individually controllable in intensity and are used to light the face with a series of controlled spherical lighting conditions which reveal detailed shape and reflectance information. An LCD video projector subsequently projects a series of colored stripe patterns to aid stereo correspondence. The face's appearance under these conditions is photographed by a stereo pair of Canon 1D Mark III digital cameras (10 megapixels) (Figure 2B). Computational stereo between the two cameras produces a millimeter-accurate estimate of the facial shape; this shape is refined using sub-millimeter surface orientation estimates from the spherical lighting conditions, as in Ma et al. [27], revealing fine detail at the level of pores and creases. Linear polarizing filters on the LED lights and an active polarizer on the left camera allow specular reflections (the shine off the skin) and subsurface reflection (the skin's diffuse appearance) to be recorded independently, yielding the diffuse and specular reflectance maps (Figure 1) needed for photorealistic rendering under new lighting.

Each facial capture takes five seconds, acquiring approximately 20 stereo photographs under the different lighting conditions. Our subjects had no difficulty maintaining the facial expressions for the capture time, particularly since we used the complementary gradient technique of Wilson et al. [28] to digitally remove subject motion during the capture.
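As a rough illustration of how spherical gradient illumination yields photometric normals, the sketch below implements the diffuse case described by Ma et al. [27]: for a Lambertian surface, the ratio of each gradient-lit image to the fully lit image encodes one normal component, remapped from [0, 1] to [-1, 1]. The images here are random stand-ins, and the real pipeline (polarization separation, calibration) is omitted.

    import numpy as np

    H, W = 480, 360
    i_full = np.random.rand(H, W) + 0.1     # fully lit sphere (stand-in)
    i_x = i_full * np.random.rand(H, W)     # linear gradient along x (stand-in)
    i_y = i_full * np.random.rand(H, W)     # gradient along y
    i_z = i_full * np.random.rand(H, W)     # gradient along z

    # Per axis: i_gradient / i_full ~ (n_axis + 1) / 2 for a diffuse
    # surface, so each ratio maps back to a component in [-1, 1].
    n = np.stack([2.0 * i_x / i_full - 1.0,
                  2.0 * i_y / i_full - 1.0,
                  2.0 * i_z / i_full - 1.0], axis=-1)
    n /= np.linalg.norm(n, axis=-1, keepdims=True)   # unit normals per pixel
    print(n.shape)                                   # (480, 360, 3)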

B. Dataset Description

For the purposes of this dataset, 23 people were captured, as shown in Figure 3. Our database consists of 17 male and 6 female subjects from different ethnic backgrounds, all between the ages of 22 and 35. Each subject was asked to perform a set of 15 expressions, as shown in Figure 4. The set of posed expressions consists of the six prototypic ones (according to Ekman [11]), two neutral expressions (eyes closed and open), two eyebrow expressions, a scrunched face expression, and four eye gaze expressions (see Figure 4).


Fig. 5. Distribution of AU scores (AU1, AU2, AU4, AU5) for a selected set of expressions (see Table II) under uniform illumination. Top: distribution of AU scores as annotated by an expert FACS coder. Bottom: distribution of AU output from CERT, a system for automatic facial expression recognition [7]. From these graphs it becomes obvious that Ex3 (surprise) and Ex4 (eyebrows-up) involve different degrees of eyebrow raising (expressed, among others, by AU1 and AU2), and that Ex2 (disgust) and Ex5 (eyebrows-down) include a frown (expressed by AU4).

For the six emotion-driven expressions (middle row), the subjects were given the freedom to perform the expression as naturally as they could, whereas for the action-specific expressions the subjects were asked to perform specific facial actions. Our motivation for this was to capture some of the variation with which people express different emotions, and not to force one standardized face for each expression.

Each model in the database contains high-resolution (sub-millimeter) geometry as a triangle mesh, as well as a set of high-resolution reflectance maps, including a diffuse color map (like a traditional "texture map", but substantially without "baked-in" shading), a specular intensity map (how much shine each part of the face has), and several surface normal maps (indicating the local orientation of each point of the skin surface). Normal maps are provided for the red, green, and blue channels of the diffuse component, as well as for the colorless specular component, to enable efficient and realistic skin rendering using the hybrid normal technique of Ma et al. [27].
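A minimal shading sketch suggests how these maps might combine at render time under the hybrid normal idea of Ma et al. [27]: each diffuse color channel is lit with its own normal map, and a Blinn-Phong lobe uses the sharper specular normals. The maps below are random stand-ins for the database assets, and the simple directional-light model and exponent are our own assumptions.

    import numpy as np

    H, W = 480, 360
    albedo = np.random.rand(H, W, 3)          # diffuse color map
    spec_map = np.random.rand(H, W)           # specular intensity map
    n_diff = np.random.rand(H, W, 3, 3)       # per-channel diffuse normals (RGB)
    n_spec = np.random.rand(H, W, 3)          # specular normals

    def unit(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)

    n_diff, n_spec = unit(n_diff), unit(n_spec)
    light = unit(np.array([0.3, 0.5, 0.8]))   # directional light (assumed)
    half = unit(light + np.array([0.0, 0.0, 1.0]))   # Blinn-Phong half vector

    # Diffuse term: per-channel n.l, clamped, modulated by the albedo.
    ndotl = np.clip(np.einsum('hwcx,x->hwc', n_diff, light), 0.0, 1.0)
    diffuse = albedo * ndotl

    # Specular term: (n.h)^k scaled by the specular intensity map.
    ndoth = np.clip(n_spec @ half, 0.0, 1.0)
    specular = spec_map[..., None] * ndoth[..., None] ** 40
    print((diffuse + specular).shape)         # (480, 360, 3)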

C. Action Unit Annotations

Our ICT-3DRFE database is also fully AU annotated by an expert FACS coder. Action units are assigned scores between 0 and 1, depending on the degree of muscle activity. In Figure 5, we show the distribution of the scores for some eyebrow-related AUs, for the subject/expression set we have chosen for further analysis in this paper.

The displayed AUs are: AU1 (inner brow raise), AU2 (outer brow raise), AU4 (brow lower) and AU5 (upper lid raise). The AU score distribution over different expressions demonstrates which AUs are activated in a specific expression and to what degree. For example, from the first row of Figure 5, we can tell that expressions Ex3 and Ex4 (surprise and eyebrows-up, respectively) usually employ inner and outer eyebrow raises, since they have both AU1 and AU2 activated. Moreover, we can tell from the distribution of the scores that during expression Ex4 subjects tend to raise their inner eyebrows more than during Ex3 (the degree of AU1 differs between these two expressions). Similarly, among the selected set of expressions, only Ex2 and Ex5 (disgust and eyebrows-down, respectively) include a frown, which is represented by AU4.

IV. INFLUENCE OF ILLUMINATION ON EXPRESSION RECOGNITION

In this section, we explore and quantify the effect of illumination on expression recognition. For the scope of this study, we focus on automatic recognition of facial expressions. We evaluate automatic classification of AUs, since they are the prevailing classification method for facial expressions. We intend to find patterns in the variation of the AU response when changing the illumination (either during an expression or during a neutral face) and to explore which characteristics of illumination affect specific facial AUs.

We decided to focus our first effort on investigating eyebrow facial actions, with the intuition that this area of the face is one of the most expressive ones. Muscle activation along the eyebrows causes large shape and texture variations during expressions.

We set our experiment goals as follows: i) we examine the correlation of our expert FACS coder's annotations with the AU output from automatic expression recognition; ii) we explore the changes in automatic recognition output caused by illumination variation on the neutral face; and iii) we examine whether two different expressions, distinguished by different AU scores, remain separable to the same degree when the illumination changes.


Fig. 6. Example of the expressions (Ex0: neutral, Ex1: happy, Ex2: disgusted, Ex3: surprised, Ex4: eyebrows up, Ex5: eyebrows down) and illumination conditions (uniform, L1-L8) used in our experiment. Illuminations are the same column-wise, and expressions are the same row-wise.


A. Evaluation Methodology

First, we need to create an image set of different facial expressions under different illumination conditions. Based on the analysis of the FACS-annotated AU scores, we chose a set of expressions which activate eyebrow-related AUs. Specifically, we picked six expressions for our study, as described in Table II. Expressions Ex2-Ex5 were chosen because they usually come with intense eyebrow activation, and the first two (Ex0-Ex1) for calibration of what constitutes neutral and close-to-neutral eyebrow motion, respectively.

For our lighting set, we chose nine different illumination conditions, as seen in Figure 6 and described in Table I. The first one (L0) was picked to evaluate the best performance of CERT, since it is a uniform lighting condition, desirable for automatic facial expression recognition systems. L1-L5 were picked for their directionality, which is one of the main parameters that impairs shape perception. L6-L8 were picked as representatives of more realistic, environmental lighting conditions that one can actually come across. L7 and L8 are also cases of low illumination intensity.

To produce our experimental image set for analysis, we used our newly developed ICT-3DRFE database. The image set for one of the subjects can be seen in Figure 6. All 3D models were rendered under the same 6 expressions and 9 illumination conditions. We did this for a subset of fifteen subjects, generating 6 × 9 = 54 images for each subject. For the automatic evaluation of AUs, we used the Computer Expression Recognition Toolbox (CERT) [7], which is a robust AU classifier that uses appearance-based features [14] and performs with high accuracy. Using CERT, we obtained output for several eyebrow-related AUs.
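The resulting evaluation grid can be summarized with a short driver loop. In the sketch below, render_face and run_cert are hypothetical placeholders standing in for the database renderer and a CERT wrapper; neither name comes from the toolbox itself.

    from itertools import product

    SUBJECTS = range(15)                           # subset used in the study
    EXPRESSIONS = [f"Ex{i}" for i in range(6)]     # see Table II
    ILLUMINATIONS = [f"L{i}" for i in range(9)]    # L0 (uniform) through L8

    def render_face(subject, expression, illumination):
        """Placeholder: render one 3D model under one condition."""
        return f"subj{subject:02d}_{expression}_{illumination}.png"

    def run_cert(image_path):
        """Placeholder: return eyebrow-related AU margins for one image."""
        return {"AU1": 0.0, "AU2": 0.0, "AU4": 0.0, "AU5": 0.0}

    scores = {}
    for subj, expr, light in product(SUBJECTS, EXPRESSIONS, ILLUMINATIONS):
        scores[(subj, expr, light)] = run_cert(render_face(subj, expr, light))
    print(len(scores))                             # 15 x 54 = 810 images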

B. Results

First, we want to evaluate the correlation of the CERT output with our FACS coder's annotations. The AU1, AU2, AU4 and AU5 outputs evaluated with CERT are shown in the second row of Figure 5, below the expert FACS AU annotations. Note that in both cases the uniform illumination condition (L0) was used. The CERT outputs are the Support Vector Machine margins from classification and can be positive or negative [7], whereas the annotated scores range from 0 to 1, with 1 signifying the highest intensity and 0 meaning that the AU is not present.

TABLE I
ILLUMINATION CONFIGURATIONS FOR EXPERIMENTS DESCRIBED IN SECTION IV

    Label   Description
    L0      uniform
    L1      ambient + point light at the right side of the head
    L2      ambient + point light at the top of the head
    L3      ambient + point light at the left
    L4      ambient + point light at the bottom
    L5      ambient + point light at the bottom left
    L6      environmental light
    L7      environmental light (modified 1)
    L8      environmental light (modified 2)

TABLE II
SELECTED EXPRESSIONS FOR EXPERIMENTS DESCRIBED IN SECTION IV

    Label   Description
    Ex0     Neutral - eyes open
    Ex1     Happy
    Ex2     Disgusted
    Ex3     Surprised
    Ex4     Eyebrows Up
    Ex5     Eyebrows Down

Although CERT was trained as a discriminative classifier, Figure 5 shows that its output is directly correlated with the expert FACS coder's annotations. Indeed, the distribution patterns are similar to those of the annotated scores, which validates the CERT output on our image set and confirms that the uniform illumination condition is a suitable input for establishing ground truth.

To answer our second question, about the AU variation with illumination on a neutral face, we plot the distributions of the CERT AU output for the different illumination conditions, evaluated on neutral faces. Figure 7 shows such a plot for AU4 (eyebrows drawn medially and down). The first distribution (first highlighted column on the left) contains the scores for uniform light, and we consider it to be the ground-truth AU score for the neutral face. From Figure 7, we observe that the AU4 output changes with illumination; more specifically, illumination conditions L4 and L5 (directional light from the bottom and the bottom left) seem to affect it the most. To analyze the statistical significance of the variation in AU scores, we performed a paired t-test, annotating a standard 5% significance level with "*" and a 1% significance level with "**" in the figures.
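The significance test above can be reproduced in a few lines of scipy; the arrays below are synthetic stand-ins for the 15 per-subject CERT margins under two conditions, not the paper's data.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    au4_L0 = rng.normal(-1.0, 0.3, size=15)          # AU4 margins, uniform L0
    au4_L4 = au4_L0 + rng.normal(0.5, 0.3, size=15)  # same subjects, light from below

    # Paired t-test over subjects, starred as in the figures.
    t_stat, p_value = stats.ttest_rel(au4_L0, au4_L4)
    stars = "**" if p_value < 0.01 else "*" if p_value < 0.05 else ""
    print(f"t = {t_stat:.2f}, p = {p_value:.4f} {stars}")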

Similarly, we performed more experiments for some other eyebrow-related AUs (figures not shown) and observed that: i) light from the side affects AU1 (inner eyebrow raise) the most, and ii) light from the top or bottom affects AU9 (nose wrinkle) the most. These observations agree with our intuition.

Our third topic of interest is to understand whether different expressions remain distinguishable under different illuminations. To answer this, we examine the distributions of a specific AU output under different illumination conditions.


[Figure residue: Fig. 7, CERT output over 15 subjects, AU4 (eyebrows drawn medially and down), with "*"/"**" significance marks; a second panel with similarity probability and the CERT output distribution over 15 subjects for AU1 (inner eyebrow raise).]


V. RATIO-BASED LIGHT TRANSFER

Fig. 9. Overview of the ratio-based light transfer method: factoring out a specific illumination from a target image. An original image (A) is taken under known illumination (B). A subject from the ICT-3DRFE database is selected and brought to a similar pose as the target. We then render the database subject under the same illumination as the target (C) and under a desired uniform illumination (D). From these two images (C and D), we obtain the ratio image (E) for the database subject. This ratio image represents the light difference between the two illuminations: the one we have (B) and the one we want (uniform). The ratio image is then used to transfer the desired illumination to our target image, as seen in the result (F).

In this section, we apply our ratio-based light transfer method to factor out a known illumination that adversely affects AU scores. To show our approach, we proceed with the case of AU1, where light coming from the left side (L1) causes the CERT output to change significantly, as demonstrated in the first two columns of Figure 11. We extracted that illumination (L1) from the neutral faces of the subjects and changed their images to a more uniformly lit illumination condition (L0), which was used for the definition of the baseline. We evaluated the AU scores of the new "pseudo-L0" set of images using CERT; the results of the new output are shown in the last column of Figure 11.

L1 affects the output of CERT to the point that the distribution of AU1 outputs under L1 becomes statistically different from the one under L0. However, when we process the L1 images with our ratio-based light transfer method and bring them under a uniform illumination close to L0, the AU1 output distribution changes correctively toward the expected one, and the statistical difference becomes insignificant. This is a very encouraging result, given our goal of light-invariant AU classification.
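In essence, the transfer of Fig. 9 multiplies the target photograph by a per-pixel ratio of two renderings of the database subject. A minimal numpy sketch follows, with random arrays standing in for the registered photograph and renderings; the epsilon guard is our own assumption rather than part of the published method.

    import numpy as np

    H, W, eps = 480, 360, 1e-4
    target_L1 = np.random.rand(H, W, 3)   # (A) photo under known illumination L1
    db_L1 = np.random.rand(H, W, 3)       # (C) database face rendered under L1
    db_L0 = np.random.rand(H, W, 3)       # (D) same face under uniform L0

    # (E) Ratio image: per-pixel lighting change from L1 to L0.
    ratio = db_L0 / (db_L1 + eps)

    # (F) Apply the ratio to "unlight" the target toward uniform L0.
    target_L0 = np.clip(target_L1 * ratio, 0.0, None)
    print(target_L0.shape)                # (480, 360, 3)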

VI. CONCLUSIONS AND FUTURE WORK

In this paper, we introduced a new database called ICT-3DRFE, which includes 3D models of 23 participants under 15 expressions, at the highest resolution among existing 3D databases. It also includes photometric information which enables photorealistic rendering under any illumination condition. We showed how such properties can be employed in the design of experiments where illumination conditions are modified to study their effect on systems for automatic expression recognition.

We presented a novel approach toward a light-invariant expression recognition system. Using ratio images, we are able to factor out unwanted illumination and, in some cases, improve the output of automatic AU classification. Our current approach, however, requires that the facial image to be recognized be taken under known (although arbitrary) illumination conditions. For future work, we would like to remove this restriction by estimating the illumination environment directly from the image.

Since our observations generally agree with our intuitions, a goal for future work would also be to study the effect of illumination on human judgment.

VII. ACKNOWLEDGMENTS

The authors would like to thank the following collaborators for their help with this paper: Ning Wang, for the AU annotation of the database; Simon Lucey from the CSIRO ICT Center (Australia), for the face tracking code used in this paper for sparse correspondence during image alignment; and Cyrus Wilson and Jay Busch, for help with processing the acquired data.

REFERENCES

[1] Braje, W.L., Kersten, D., Tarr, M.J., and Troje, N.F. (1998). Illumination effects in face recognition. Psychobiology, 26(4), 371-380.
[2] Adini, Y., Moses, Y., and Ullman, S. (1995). Face recognition: the problem of compensating for changes in illumination direction (Report No. CS93-21). The Weizmann Institute of Science.
[3] Chibelushi, C.C., and Bourel, F. (2003). Facial expression recognition: A brief tutorial overview. In R. Fisher (ed.), CVonline: On-Line Compendium of Computer Vision.
[4] Li, H., Buenaposada, J.M., and Baumela, L. (2008). Real-time facial expression recognition with illumination-corrected image sequences. In Proc. 8th IEEE International Conference on Automatic Face and Gesture Recognition (FG '08), pp. 1-6.


Fig. 10. Factoring out known illumination from a non-frontal pose: A) original image, with illumination that we want to "neutralize"; B) output of our method, the target with the desired illumination (this image was produced by our image-based illumination transfer method); C) ground truth for comparison: the target subject illuminated with the desired illumination condition (this image is a rendering from the 3D model).

[5] Kumar, R., Jones, M., and Marks, T.K. (2010). Morphable reflectance fields for enhancing face recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2606-2613.
[6] Toderici, G., Passalis, G., Zafeiriou, S., Tzimiropoulos, G., Petrou, M., Theoharis, T., and Kakadiaris, I.A. (2010). Bidirectional relighting for 3D-aided 2D face recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2721-2728.
[7] Bartlett, M., Littlewort, G., Wu, T., and Movellan, J. (2008). Computer Expression Recognition Toolbox. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition. University of California, San Diego.
[8] Peers, P., Tamura, N., Matusik, W., and Debevec, P. (2007). Post-production facial performance relighting using reflectance transfer. ACM Transactions on Graphics (Proc. ACM SIGGRAPH), 26(3).
[9] Riklin-Raviv, T., and Shashua, A. (1999). The quotient image: class based recognition and synthesis under varying illumination conditions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 262-265.
[10] Ekman, P., Friesen, W.V., and Hager, J.C. (2002). Facial Action Coding System (FACS): Manual. Salt Lake City, USA: A Human Face.
[11] Ekman, P. (1982). Emotion in the Human Face. Cambridge University Press.
[12] Pantic, M., and Rothkrantz, L.J.M. (2000). Automatic analysis of facial expressions: The state of the art. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12), 1424-1445.
[13] Zeng, Z., Pantic, M., Roisman, G.I., and Huang, T.S. (2009). A survey of affect recognition methods: Audio, visual, and spontaneous expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(1), 39-58.
[14] Bartlett, M.S., Littlewort, G.C., Frank, M.G., Lainscsek, C., Fasel, I., and Movellan, J.R. (2006). Automatic recognition of facial actions in spontaneous expressions. Journal of Multimedia, 1(6), 22-35.
[15] Lucey, S., Ashraf, A.B., and Cohn, J.F. (2007). Investigating spontaneous facial action recognition through AAM representations of the face. In K. Delac and M. Grgic (eds.), Face Recognition, pp. 275-286. I-Tech Education and Publishing.
[16] Kanade, T., Tian, Y., and Cohn, J.F. (2000). Comprehensive database for facial expression analysis. In Proc. Fourth IEEE International Conference on Automatic Face and Gesture Recognition (FG '00), p. 46.
[17] Lyons, M., Akamatsu, S., Kamachi, M., and Gyoba, J. (1998). Coding facial expressions with Gabor wavelets. In Proc. 3rd International Conference on Automatic Face and Gesture Recognition.
[18] Pantic, M., Valstar, M., Rademaker, R., and Maat, L. (2005). Web-based database for facial expression analysis. In Proc. IEEE International Conference on Multimedia and Expo (ICME '05).

Fig. 11. Distributions of AU1 scores for images illuminated with L0 (uniform illumination), with L1 (light source brighter on the left side), and for images from which L1 was factored out to bring them to L0. Inset: sample image from one subject.

[19] Sim, T., Baker, S., and Bsat, M. (2002). The CMU Pose, Illumination, and Expression (PIE) database. In Proc. IEEE International Conference on Automatic Face and Gesture Recognition.
[20] Anitha, C., Venkatesha, M.K., and Adiga, B.S. (2010). A survey on facial expression databases. International Journal of Engineering Science and Technology, 2(10), 5158-5174.
[21] Yin, L., Wei, X., Sun, Y., Wang, J., and Rosato, M.J. (2006). A 3D facial expression database for facial behavior research. In Proc. 7th International Conference on Automatic Face and Gesture Recognition (FGR06).
[22] Yin, L., Chen, X., Sun, Y., Worm, T., and Reale, M. (2008). A high-resolution 3D dynamic facial expression database. In Proc. 8th International Conference on Automatic Face and Gesture Recognition (FGR08).
[23] Langner, O., Dotsch, R., Bijlstra, G., Wigboldus, D.H.J., Hawk, S.T., and van Knippenberg, A. (2010). Presentation and validation of the Radboud Faces Database. Cognition and Emotion. Psychology Press.
[24] Lucey, P., Cohn, J.F., Kanade, T., Saragih, J., Ambadar, Z., and Matthews, I. (2010). The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression. In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 94-101.
[25] Liu, Z., Shan, Y., and Zhang, Z. (2001). Expressive expression mapping with ratio images. In Proc. 28th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '01). ACM, New York.
[26] Debevec, P., Hawkins, T., Tchou, C., Duiker, H.-P., Sarokin, W., and Sagar, M. (2000). Acquiring the reflectance field of a human face. In Proc. ACM SIGGRAPH, Computer Graphics Proceedings, Annual Conference Series, pp. 145-156.
[27] Ma, W.-C., Hawkins, T., Peers, P., Chabert, C., Weiss, M., and Debevec, P. (2007). Rapid acquisition of specular and diffuse normal maps from polarized spherical gradient illumination. In Proc. Eurographics Symposium on Rendering, pp. 183-194.
[28] Wilson, C.A., Ghosh, A., Peers, P., Chiang, J., Busch, J., and Debevec, P. (2010). Temporal upsampling of performance geometry using photometric alignment. ACM Transactions on Graphics, 29(2).
[29] Debevec, P. (1998). Rendering synthetic objects into real scenes: bridging traditional and image-based graphics with global illumination and high dynamic range photography. In Proc. 25th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '98), pp. 189-198.
[30] Lucey, S., Wang, Y., Saragih, J., and Cohn, J.F. (2010). Non-rigid face tracking with enforced convexity and local appearance consistency constraint. Image and Vision Computing, 28(5), 781-789.
