SALEM - Statistical AnaLysis of Elan files in Matlab

Marc Hanheide 1 , Manja Lohse 2 , Angelika Dierker 2 

1 University of Birmingham, School of Computer Science, B15 2TT, UK 

2 Bielefeld University, Technical Faculty, Universitätsstraße 25, 33615 Bielefeld, Germany 

E-mail: m.hanheide@cs.bham.ac.uk, mlohse@techfak.uni-bielefeld.de, adierker@techfak.uni-bielefeld.de 

Abstract 

This document proposes SALEM (Statistical AnaLysis of Elan files in Matlab) as a toolbox for the statistical analysis 

of data from human-machine interaction that are annotated in Elan. The authors show how SALEM supports analyzing

annotations quantitatively in an effective manner. The paper introduces the position of SALEM in the data processing 

chain, its functionalities, and how it contributes to the evaluation process. 

1. Introduction 

Interaction studies with humans and intelligent systems 

are gaining more and more attention with systems 

exhibiting more advanced abilities. This is true in 

various areas such as cognitive robotics, assistance, 

interaction analysis, and others. In the interaction of

systems and humans, a corpus must not only encode the 

behavior of the participating humans, but also the 

behavior of the intelligent system to facilitate a manifold 

analysis of more complex interaction scenarios. Both 

need to be brought together in order to allow a real 

cross-disciplinary analysis. Merging and temporally 

aligning the system-focused annotations with manual 

annotations that describe the humans’ behavior results in 

a comprehensive and rich representation of the 

interaction situation. With the help of these so-called 

systemic corpora we can (i) learn and study patterns of 

deviation or failures in interaction and (ii) identify 

correlations between system and human behaviors. 

Figure 1 shows an exemplary process of creating a 

systemic corpus comprising the recording of data, 

integrating, and finally analyzing it. The left side 

displays the data source. It includes audio and video data, 

as well as system log files from the intelligent system. In 

our work with multiple intelligent systems we use 

logging frameworks such as LOG4J 1 or LOG4CXX 2 to

generate the system log files. These frameworks natively 

support time-stamped events and can also set the type 

(usually corresponding to a logging level) and the emitter 

(e.g., called “logger” in log4j). The content of these 

events is either unstructured text or structured text (if the 

developers agree on a coding scheme in advance). It 

would also be possible to include other data sources. 

However, since our aim is to display these logging 

events as (time-stamped) annotations in Elan, all sources 

need to have an extent in time. In the next step of the 

processing chain all data are aligned and synchronized in 

time. After transformation and synchronization all 

recorded data can be displayed and manually revised, 

1 http://logging.apache.org/log4j/ 

2 http://logging.apache.org/log4cxx/ 

e.g., in Elan 3 , which furthermore allows adding manual

annotations (Figure 1 [data display]). Alternatively, other 

XML-based annotation tools with more or less similar 

abilities could be used (e.g., Anvil 4 , Interact 5 ). With 

respect to manual annotations one important step is to 

check their correctness for later analysis (e.g., if a coding 

scheme is used, only codes that are defined in the 

scheme are allowed; the annotations do not contain 

misspellings). Once all the data are represented in one 

format (here in the format of the annotation tool) they 

can be analyzed automatically. This is where the 

SALEM (Statistical AnaLysis of Elan files in Matlab) 

toolbox comes in. In the following, we will introduce the 

basic idea behind the usage of SALEM, its 

functionalities, and advantages. 
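The correctness check on manual annotations described above can be sketched as follows. This is a minimal illustration in Python, not part of SALEM (which is a Matlab toolbox whose function names are not given here). The coding scheme, the allowed values, and the function name `find_invalid_annotations` are assumptions made for the example; annotations are represented as (start, end, value) tuples in seconds.

```python
# Hypothetical coding scheme: tier name -> set of allowed annotation values.
# Tiers that are not listed (e.g., free-text transcriptions) are not checked.
CODING_SCHEME = {
    "gaze_direction": {"1", "2", "3"},
    "speech_robot": {"greet", "ask", "confirm"},
}


def find_invalid_annotations(tiers, scheme):
    """Return (tier, index, value) for every value that the scheme does not define."""
    problems = []
    for tier_name, annotations in tiers.items():
        allowed = scheme.get(tier_name)
        if allowed is None:
            continue  # no scheme for this tier, nothing to validate
        for index, (_start, _end, value) in enumerate(annotations):
            if value.strip() not in allowed:
                problems.append((tier_name, index, value))
    return problems


# Illustrative data: one undefined code ("4") and one misspelling ("confrim").
tiers = {
    "gaze_direction": [(0.0, 1.5, "1"), (1.5, 2.0, "4"), (2.0, 3.2, "2")],
    "speech_robot": [(0.4, 1.1, "greet"), (2.1, 2.9, "confrim")],
    "speech_human": [(1.2, 2.0, "hello robot")],
}
problems = find_invalid_annotations(tiers, CODING_SCHEME)
for tier_name, index, value in problems:
    print(f"{tier_name}[{index}]: unexpected value {value!r}")
```

Running such a check before the statistical analysis keeps misspelled or undefined codes from being counted as separate values.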

2. Basic idea of SALEM 

After conducting interaction studies, every researcher 

encounters the question of how to analyze data 

efficiently and effectively. This question is strongly 

influenced by what and how data are recorded. As has 

been introduced in Figure 1, we propose to collect the 

data in annotation tools like Elan, because with the help 

of these tools, multiple data sources can be integrated 

with manual annotations. The annotations are structured 

in multiple layers, so-called ‘tiers’. For example, in 

human-robot interaction (HRI) one tier may contain the 

speech of the human that has been manually transcribed 

based on the video. Another tier may contain what the 

3 http://www.lat-mpi.eu/tools/elan/ 

4 http://www.anvil-software.de/ 

5 http://www.mangold-international.com/en/products/interact.html

Figure 1: Data processing chain

robot understood, which has been extracted from the log

files. This example shows that manual annotation might 

have to be integrated with data from automatic log files. 

The integration can easily be achieved with an XML 

format as used by Elan and other annotation tools such as 

Anvil. However, annotation tools usually only include 

limited functionalities for quantitative analysis. Thus, to 

analyze the data, one can go through the files manually 

or, alternatively, use the tools’ export options to acquire 

text or XML-based files. Thereafter, these files have to 

be imported into other software for analysis (for

example, Matlab 6 , SPSS 7 ). This approach is rather 

laborious because whenever a change is made to the 

annotation file, it has to be exported and imported to the 

analysis software again. To work around this problem, 

the SALEM toolbox parses Elan files directly and offers 

advanced and in-depth statistical analyses with Matlab. 

Thus, it closes the cycle of importing automatic log files 

into the annotation tool, importing the corresponding 

video/audio streams into the annotation tool, annotating 

manually, and analyzing the data in an efficient way. In 

this process, one main advantage of the SALEM toolbox 

is that it allows comparing annotations of different 

modalities, structural features of the interaction, or 

whatever has been annotated in the tiers. For example, 

the video can be used to manually annotate human 

speech which can then be compared to the speech that 

the robot understood because they are represented in one 

file that can be analyzed, in our case using Matlab (see 

Figure 2). Analyzing the data with the SALEM toolbox 

not only eases the analysis process, but also

increases the consistency of the analysis. This is because 

all evaluations are conducted using the statistical 

6 http://www.mathworks.com/products/matlab/ 

7 http://www.spss.com/software/statistics/ 

Figure 2: Example of Elan display 

functions that are predefined by the toolbox; for example,

a T-test will always be calculated in the same way based 

on the same formula. 
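To illustrate what parsing an Elan file directly involves, the following Python sketch reads time-aligned annotations from a small document written in the style of an Elan .eaf file. It is not SALEM's Matlab parser. The sample document is hand-written for this example, the function name `parse_eaf` is hypothetical, and the sketch assumes that every time slot carries a time value in milliseconds and that only time-aligned annotations are of interest.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the style of an Elan .eaf file (illustrative only).
EAF_SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="400"/>
    <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="1200"/>
  </TIME_ORDER>
  <TIER TIER_ID="speech_human">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>hello robot</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="speech_robot">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
          TIME_SLOT_REF1="ts3" TIME_SLOT_REF2="ts4">
        <ANNOTATION_VALUE>hello</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def parse_eaf(xml_text):
    """Return {tier_id: [(start_s, end_s, value), ...]} for time-aligned annotations."""
    root = ET.fromstring(xml_text)
    # Map time slot ids to seconds (assumes every slot has a TIME_VALUE in ms).
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE")) / 1000.0
        for slot in root.iter("TIME_SLOT")
    }
    tiers = {}
    for tier in root.iter("TIER"):
        annotations = []
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            value = ann.findtext("ANNOTATION_VALUE", default="")
            annotations.append((start, end, value))
        tiers[tier.get("TIER_ID")] = sorted(annotations)
    return tiers


tiers = parse_eaf(EAF_SAMPLE)
for tier_id, annotations in tiers.items():
    print(tier_id, annotations)
```

Because the annotation file is read as it is, a change made in the annotation tool is picked up by simply re-running the analysis, without an export and import step.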

3. Functionalities of SALEM 

Our proposed automatic statistical analysis of annotation 

files consists of a number of routines to compute 

statistics on the temporal distribution of specific 

annotations, their correlation to one another, and their 

comparison with regard to duration and dedicated values. 

A core concept of SALEM that enables a rich analysis is

“slicing”. It has been introduced to facilitate temporal 

correlation between annotations of different tiers. The 

idea is that we can automatically select subsets of 

annotations for the computation of the statistics based on 

synchrony or overlaps. To this end, one tier is chosen as

the master tier and the whole set of annotations is sliced 

according to the existence of annotations in that master 

tier. We also support slicing based on specific values of 

the annotation in that master tier. Slicing is based on the 

analysis of overlaps of the master tier annotations with 

annotations in all other tiers as illustrated in Figure 3. If, 

for instance, one tier codes all time intervals in which the 

robot was speaking, slicing makes it possible to compute statistics

only for those annotations that overlap with this 

‘speech_robot’ annotation. 
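The slicing idea can be sketched in a few lines. The following Python example is only an illustration of the concept, not SALEM's Matlab implementation. The function names, the example data, and the rule that intervals which merely touch do not count as overlapping are assumptions. Annotations are (start, end, value) tuples in seconds.

```python
def overlaps(a, b):
    """True if annotations a and b share some time (touching ends do not count)."""
    return a[0] < b[1] and b[0] < a[1]


def slice_by_master(tiers, master, values=None):
    """Keep, in every other tier, only annotations overlapping the master tier.

    If `values` is given, only master annotations with one of these values
    are used as the slicing reference.
    """
    masters = [m for m in tiers[master] if values is None or m[2] in values]
    sliced = {}
    for name, annotations in tiers.items():
        if name == master:
            sliced[name] = masters
        else:
            sliced[name] = [
                a for a in annotations if any(overlaps(a, m) for m in masters)
            ]
    return sliced


# Illustrative data with hypothetical values.
tiers = {
    "speech_robot": [(2.0, 4.0, "greet"), (8.0, 9.0, "ask")],
    "gaze_direction": [(0.0, 1.0, "2"), (1.5, 2.5, "1"),
                       (3.0, 3.5, "1"), (8.5, 10.0, "2")],
    "speech_human": [(4.0, 6.0, "hello"), (8.2, 8.8, "yes")],
}

sliced = slice_by_master(tiers, "speech_robot")
sliced_greet = slice_by_master(tiers, "speech_robot", values={"greet"})
print(sliced["gaze_direction"])
print(sliced_greet["gaze_direction"])
```

Statistics computed on `sliced` then describe only what happened while the robot was speaking; passing `values` restricts the reference further to specific master annotations.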

Built upon this general slicing concept, SALEM to date 

has the following functionalities which were developed 

based on requirements that arose during the analysis of 

two corpora of HRI data (Lohse, 2010; Lohse, 

submitted). These functionalities are currently

limited; however, new ones can be integrated into the

toolbox if needed.

Parsing, displaying structure of parsed files, and 

slicing: 

• parse single Elan files or a set of Elan files at once

(which allows for the analysis of the data of single 

users, groups of users, groups of trials that belong to 

certain conditions, and all users of an experiment) 

• plot all annotations in the tiers 

• slice the files with respect to time (specifying one or 

more beginnings and endings of timeslots) 

• slice all annotations of a single tier (for example, if 

the file is sliced on the basis of the tier 

‘speech_human’, then in all other tiers only the 

annotations that are overlapping with instances of 

human speech are taken into account) 

• slice the files with respect to one or more values of 

the annotations in a single tier (for example, slice all 

annotations of eye gaze that have the value “1” 

which means that the user is looking at the robot) 

• examine one specific annotation in a tier (for 

example, the 12th annotation in the tier ‘gaze

direction’) 

Analyzing: 

• descriptive statistics for all data of the parsed files or 

the slices (for each tier): 

o count of annotations (number of occurrences) 

o minimum duration of the annotations (in 

seconds) 

o maximum duration of the annotations (in 

seconds) 

o mean duration of all annotations (in seconds) 

o median of the durations (in seconds) 

o overall duration of all annotations (in seconds) 

o variance and standard deviation of the 

duration of all annotations 

o beginning of first annotation (in seconds) 

o end of last annotation (in seconds) 

o overall duration of all annotations as a 

percentage of the time between the beginning 

of the first annotation and the end of the last 

annotation 

• the descriptive statistics for slices additionally 

include for all tiers 

o count and percentage of time that the 

annotations in a tier overlap with the 

reference tier for four types of overlap: (a) 

the annotation extends the annotation in the 

reference tier (begins before the annotation 

and ends after the annotation in the reference 

tier), (b) the annotation is included in the 

annotation in the reference tier (begins after 

the annotation in the reference tier and ends 

before the annotation in the reference tier), (c) 

the annotation begins before the annotation in 

the reference tier begins and ends before it 

ends, (d) the annotation begins after the beginning

of the annotation in the reference tier and 

ends after the end of the annotation in the 

reference tier (see Figure 3) 

• statistics for the annotated values in a certain tier: 

o duration of all annotations for all values 

o descriptive statistics for all values (see 

descriptive statistics for all tiers) 

o T-tests for the duration of the annotations and 

the duration of the overlap 

o predecessor transition matrix for all values in 

the tier (percentages with which all values 

preceded all other values; for example, 

annotations with the value ‘1’ are preceded 

by annotations with the value ‘2’ in 62% of 

the cases) 

o successor transition matrix for all values in the 

tier (percentages with which all values 

succeeded all other values; for example, 

annotations with the value ‘1’ are succeeded 

by annotations with the value ‘2’ in 36% of 

the cases) 
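Two of the analyses listed above, the per-tier descriptive statistics and the transition matrices, can be sketched as follows. This Python example is an illustration under stated assumptions, not SALEM's Matlab code. The function names and the data are hypothetical, the variance is computed as the sample variance, and transition percentages are taken relative to all transitions of the respective value.

```python
from collections import Counter, defaultdict
from statistics import mean, median, stdev, variance


def describe(annotations):
    """Descriptive statistics of one tier; annotations are (start, end, value)."""
    durations = [end - start for start, end, _ in annotations]
    first_begin = min(start for start, _, _ in annotations)
    last_end = max(end for _, end, _ in annotations)
    total = sum(durations)
    return {
        "count": len(durations),
        "min": min(durations),
        "max": max(durations),
        "mean": mean(durations),
        "median": median(durations),
        "total": total,
        "variance": variance(durations),  # sample variance (assumption)
        "std": stdev(durations),
        "first_begin": first_begin,
        "last_end": last_end,
        "percent_covered": 100.0 * total / (last_end - first_begin),
    }


def transition_percentages(annotations, successor=True):
    """For each value, the percentage of cases in which each other value
    follows it (successor=True) or precedes it (successor=False)."""
    ordered = sorted(annotations)  # sort by start time
    counts = defaultdict(Counter)
    for (_, _, earlier), (_, _, later) in zip(ordered, ordered[1:]):
        if successor:
            counts[earlier][later] += 1
        else:
            counts[later][earlier] += 1
    return {
        value: {other: 100.0 * n / sum(row.values()) for other, n in row.items()}
        for value, row in counts.items()
    }


# Illustrative gaze tier with hypothetical values.
gaze = [(0.0, 2.0, "1"), (2.0, 3.0, "2"), (4.0, 6.0, "1"),
        (6.0, 7.0, "3"), (8.0, 10.0, "1")]
stats = describe(gaze)
successors = transition_percentages(gaze, successor=True)
predecessors = transition_percentages(gaze, successor=False)
print(stats)
print(successors)
print(predecessors)
```

Applying the same two functions to a slice instead of the full tier yields the corresponding statistics for that slice.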

Figure 3: Overlap types 
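The four overlap types (a) to (d) can be told apart by comparing begin and end times. The Python sketch below is one possible reading of these definitions, not SALEM's Matlab implementation. The type labels, the function names, and the treatment of exactly coinciding boundaries are assumptions, since the text does not specify how ties are handled. The overlap is reported in seconds.

```python
from collections import Counter


def overlap_type(annotation, reference):
    """Classify how `annotation` overlaps `reference`; both are (start, end, value)."""
    a_start, a_end = annotation[0], annotation[1]
    r_start, r_end = reference[0], reference[1]
    if a_end <= r_start or a_start >= r_end:
        return None  # no shared time
    if a_start < r_start and a_end > r_end:
        return "extends"        # (a) begins before and ends after the reference
    if a_start >= r_start and a_end <= r_end:
        return "included"       # (b) lies within the reference (ties count here)
    if a_start < r_start:
        return "begins_before"  # (c) begins before the reference, ends within it
    return "ends_after"         # (d) begins within the reference, ends after it


def overlap_seconds(annotation, reference):
    """Length of the time shared by both annotations, 0.0 if there is none."""
    shared = min(annotation[1], reference[1]) - max(annotation[0], reference[0])
    return max(0.0, shared)


# Illustrative data: one reference annotation and five candidates.
reference = (2.0, 6.0, "speech_robot")
candidates = [
    (1.0, 7.0, "x"),  # extends
    (3.0, 4.0, "x"),  # included
    (1.0, 3.0, "x"),  # begins before
    (5.0, 8.0, "x"),  # ends after
    (6.0, 9.0, "x"),  # only touches the reference, no overlap
]

type_counts = Counter()
type_seconds = Counter()
for candidate in candidates:
    kind = overlap_type(candidate, reference)
    if kind is not None:
        type_counts[kind] += 1
        type_seconds[kind] += overlap_seconds(candidate, reference)

print(dict(type_counts))
print(dict(type_seconds))
```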

4. Conclusion 

In this paper, we presented a toolbox for the statistical 

analysis of systemic, and thus inherently multi-modal, 

corpora. It supports a quantitative analysis of interaction 

corpora by merging automatically generated annotations 

with manual annotations and subsequently analyzing 

them using functions provided by the SALEM toolbox. 

We successfully applied the proposed approach in 

several user trials with interactive robots. The universal 

concept of slicing introduced by SALEM makes it possible to

define subsets of annotations to compare different cases 

or situations and consequently facilitates the analysis in a 

context-aware manner. SALEM also eases the workflow

by processing the annotations directly without requiring 

prior export, which is often error-prone. Moreover, the

toolbox is usable for people with little knowledge about 

Matlab. It saves them from learning single commands for 

the statistical functions and speeds up the analysis. Since 

the same functions are used for all data, the toolbox 

supports a consistent analysis. SALEM is extendable if 

more statistical functions are needed or if it is to be used

with other XML-based annotation tools.

The SALEM toolbox is available free of charge from: 

http://aiweb.techfak.uni-bielefeld.de/content/salem-statistical-analysis-elan-annotation-using-matlab

5. Acknowledgments 

This work has been supported by the Collaborative 

Research Centre 673 Alignment in Communication, 

funded by the German Research Foundation (DFG),

and the European Community’s Seventh Framework 

Programme [FP7/2007-2013] under grant agreement No.

215181, CogX. 

6. References 

Lohse, M. (2010). Social, functional, and problem-related

tasks in HRI - a comparative analysis of body

orientation and gaze. 2nd AISB workshop on New 

Frontiers in Human-Robot Interaction. Leicester, UK. 

Lohse, M. (submitted). Investigating the influence of 

situations and expectations on user behavior - 

empirical analyses in human-robot interaction. 

Doctoral Thesis. Bielefeld University, Technical 

Faculty, Germany.
