Back Room Front Room 2
302 ENTERPRISE INFORMATION SYSTEMS VI<br />
based on established psychological studies, as well<br />
as empirical analysis of actual video footage from<br />
human-computer interaction sessions and human-to-human<br />
dialogues. The results of the synthesis process<br />
can then be applied to avatars, so as to convey<br />
the communicated messages more vividly than plain<br />
textual information or simply to make interaction<br />
more lifelike.<br />
2 MPEG-4 REPRESENTATION<br />
In the framework of the MPEG-4 standard, parameters<br />
have been specified for Face and Body Animation<br />
(FBA) by defining specific Face and Body nodes in<br />
the scene graph. The goal of the FBA definition is the<br />
animation of both realistic and cartoon-like characters.<br />
Thus, MPEG-4 has defined a large set of parameters<br />
and the user can select subsets of these parameters<br />
according to the application, especially for the body,<br />
for which the animation is much more complex. The<br />
FBA part can also be combined with multimodal<br />
input (e.g. linguistic and paralinguistic speech analysis).<br />
2.1 Facial Animation<br />
MPEG-4 specifies 84 feature points on the neutral<br />
face, which provide a spatial reference for defining FAPs.<br />
The FAP set contains two high-level parameters,<br />
visemes and expressions. In particular, the Facial<br />
Definition Parameter (FDP) and the Facial Animation<br />
Parameter (FAP) set were designed in the<br />
MPEG-4 framework to allow the definition of a facial<br />
shape and texture, eliminating the need for<br />
specifying the topology of the underlying geometry,<br />
through FDPs, and the animation of faces reproducing<br />
expressions, emotions and speech pronunciation,<br />
through FAPs. By monitoring facial gestures corresponding<br />
to FDP and/or FAP movements over time,<br />
it is possible to derive cues about the user’s expressions<br />
and emotions. Various results have been presented<br />
regarding the classification of archetypal facial<br />
expressions, based mainly on features or points<br />
extracted from the mouth and eye areas of the<br />
face. These results indicate that facial expressions,<br />
possibly combined with gestures and speech, when<br />
the latter is available, provide cues that can be used<br />
to perceive a person’s emotional state.<br />
The second version of the standard, following<br />
the same procedure as for facial definition and<br />
animation (through FDPs and FAPs), describes the<br />
anatomy of the human body with groups of distinct<br />
tokens, eliminating the need to specify the topology<br />
of the underlying geometry. These tokens can then<br />
be mapped to automatically detected measurements<br />
and indications of motion in a video sequence. They<br />
can thus help to estimate the real motion conveyed by<br />
the subject and, if required, approximate it by means<br />
of a synthetic one.<br />
2.2 Body Animation<br />
In general, an MPEG body is a collection of<br />
nodes. The Body Definition Parameter (BDP) set<br />
provides information about body surface, body<br />
dimensions and texture, while Body Animation<br />
Parameters (BAPs) transform the posture of the<br />
body. BAPs describe the topology of the human<br />
skeleton, taking into consideration joint limitations<br />
and independent degrees of freedom in the<br />
skeleton model of the different body parts.<br />
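
As a rough illustration of how joint limitations and independent degrees of freedom might be handled, the sketch below clamps a requested rotation to a per-joint range. The class `JointDOF`, the function `apply_rotation` and the elbow limits are hypothetical names and values chosen for illustration; they are not taken from the MPEG-4 specification.

```python
from dataclasses import dataclass


@dataclass
class JointDOF:
    """One independent degree of freedom of a joint (hypothetical structure)."""
    name: str
    min_angle: float  # lower limit in degrees (illustrative value, not from MPEG-4)
    max_angle: float  # upper limit in degrees (illustrative value, not from MPEG-4)


def apply_rotation(dof: JointDOF, requested_angle: float) -> float:
    """Return the requested angle clamped to the joint's allowed range."""
    return max(dof.min_angle, min(dof.max_angle, requested_angle))


# Assumed limits for an elbow flexion DOF, for demonstration only.
elbow_flexion = JointDOF("elbow_flexion", 0.0, 150.0)
angle_in_range = apply_rotation(elbow_flexion, 90.0)
angle_too_large = apply_rotation(elbow_flexion, 170.0)
```

Treating each degree of freedom separately keeps the limit check local to one value, so a posture is simply a collection of such clamped angles.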
2.2.1 BBA (Bone Based Animation)<br />
The MPEG-4 BBA offers a standardized interchange<br />
format extending the MPEG-4 FBA (Preda &<br />
Preteux, 2002). In BBA the skeleton is a hierarchical<br />
structure made of bones. In this hierarchy every<br />
bone has one parent and can have as children other<br />
bones, muscles or 3D objects. For the movement of<br />
every bone, we have to define the influence of this<br />
movement on the skin of the model, the movement<br />
of its children and the related inverse kinematics.<br />
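
The hierarchical structure described above can be sketched as a simple tree in which rotating a bone moves all of its descendants. This is a minimal 2D sketch under stated assumptions, not the BBA format itself: the `Bone` class and `world_pose` function are hypothetical, the root bone is given no parent, only child bones are modelled (no muscles or 3D objects), and skin influence and inverse kinematics are left out.

```python
import math


class Bone:
    """Hypothetical bone node: at most one parent, any number of child bones."""

    def __init__(self, name, offset, angle=0.0):
        self.name = name
        self.offset = offset  # (x, y) position relative to the parent bone
        self.angle = angle    # local rotation in radians
        self.parent = None
        self.children = []

    def add_child(self, child):
        """Attach a child bone and return it so calls can be chained."""
        child.parent = self
        self.children.append(child)
        return child


def world_pose(bone):
    """Return (x, y, angle) of the bone origin in world coordinates."""
    if bone.parent is None:
        return bone.offset[0], bone.offset[1], bone.angle
    px, py, pa = world_pose(bone.parent)
    ox, oy = bone.offset
    # Rotate the local offset by the parent's accumulated angle, then translate.
    x = px + ox * math.cos(pa) - oy * math.sin(pa)
    y = py + ox * math.sin(pa) + oy * math.cos(pa)
    return x, y, pa + bone.angle
```

For example, with a shoulder at the origin rotated by 90 degrees, an elbow one unit along the upper arm and a wrist one unit along the forearm, `world_pose(wrist)` places the wrist two units along the rotated axis, which shows how a parent's movement propagates to its children.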
3 EMOTION REPRESENTATION<br />
The obvious goal for emotion analysis applications<br />
is to assign category labels that identify emotional<br />
states. However, labels as such are very poor descriptions,<br />
especially since humans use a daunting<br />
number of labels to describe emotion. Therefore, we<br />
need a representation that is more transparent and<br />
continuous, and that closely matches our<br />
conception of what emotions are or, at least, how<br />
they are expressed and perceived.<br />
Activation-emotion space (Whissel, 1989) is a<br />
representation that is both simple and capable of<br />
capturing a wide range of significant issues in emotion.<br />
It rests on a simplified treatment of two key<br />
themes:<br />
• Valence: The clearest common element of emotional<br />
states is that the person is materially influenced<br />
by feelings that are ‘valenced’, i.e.<br />
they are centrally concerned with positive or<br />
negative evaluations of people or things or<br />
events. The link between emotion and valencing<br />
is widely agreed.<br />
• Activation level: Research has recognised that<br />
emotional states involve dispositions to act in<br />
certain ways. A basic way of reflecting that