The ASA Spring Methodology Conference 

Organized in Europe by the Department of Methodology and Statistics at Tilburg 

University, the Netherlands. 

SESSION : Applications of Event History and Panel Analysis 

PRESENTERS: 

Nicoletta Balbo n.f.g.balbo@rug.nl 

University of Groningen; Dondena Centre, Bocconi University 

Hans Dietrich hans.dietrich@iab.de 

Institute for Employment Research 

Isabel Haeberling haeberling@soziologie.uzh.ch 

University of Zurich, Institute of Sociology 

Dimitris Pavlopoulos d.pavlopoulos@vu.nl 

VU Amsterdam

Author and presenter 

Balbo, Nicoletta; University of Groningen; Dondena Centre, Bocconi University 

Title 

Does fertility behavior spread among friends? 

Abstract 

Social interaction theories in fertility research (Montgomery & Casterline 

1996; Kohler 2001) give evidence that an individual’s fertility decision-making is 

not only driven by his or her own characteristics or contextual factors, but also 

influenced by the behavior of people whom that individual interacts with. 

Previous studies on fertility in developed countries focus on socialization 

processes that operate through the transmission of fertility attitudes from 

parents to children (Barber, 2000), or through intra-family interactions, 

especially between siblings (Lyngstad & Prskawetz, 2010). However, 

socialization does not only occur within the kinship network, but also outside it, 

through social interaction with peers and friends. Therefore the aim of this paper 

is to investigate whether and how friends’ fertility behaviour affects an 

individual’s transition to parenthood, making use of a model specification that 

allows us to properly identify interaction effects and distinguish them from any 

selection and contextual effect. Indeed, the lack of research on this topic stems 

mainly from the difficulty of modelling social interaction (Manski, 1993). The 

contribution of this study is twofold: 1) extending research on the effect of social 

interaction on fertility outside the family network; 2) proposing an innovative 

way to deal with endogeneity issues, typical of social interaction processes. 

Using the 4 waves of the Add Health data, we engage in a series of discrete time 

event history models with random effects at the dyadic level. In order to 

investigate the effect of a friend’s childbearing on an individual’s risk of 

becoming a parent, we include in our sample 8,905 dyads of female friends, 

whom we follow during their young adulthood (from the age of 15 till around age 

30). In a dyad-month file, assuming that each dyad is independent, we set as 

dependent variable a dummy that takes on value 1 when the first friend of the 

dyad gives birth, 0 in the other months. To measure cross-friend effects, we 

include a time-varying variable indicating when the other friend of the dyad had 

a child.
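
The dyad-month layout described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `dyad_months`, the month indexing, and the choice to switch the friend covariate on from the month after the friend's birth are all hypothetical.

```python
def dyad_months(start, end, ego_birth, alter_birth):
    """Expand one dyad into monthly records.

    All arguments are month indices; None means no birth was observed.
    Records stop at the month of ego's first birth (the event) or at
    `end` (censoring).
    """
    rows = []
    for month in range(start, end + 1):
        # Dependent variable: 1 only in the month ego gives birth.
        event = 1 if ego_birth is not None and month == ego_birth else 0
        # Time-varying covariate (assumption): 1 from the month after
        # the friend's birth onwards, 0 before.
        friend_has_child = 1 if alter_birth is not None and month > alter_birth else 0
        rows.append({"month": month, "event": event,
                     "friend_has_child": friend_has_child})
        if event:
            break  # ego leaves the risk set after the first birth
    return rows


# Hypothetical dyad: friend gives birth in month 3, ego in month 8.
rows = dyad_months(start=1, end=12, ego_birth=8, alter_birth=3)
censored = dyad_months(start=1, end=5, ego_birth=None, alter_birth=None)
```

Each row would then enter a discrete-time hazard model (for example a logit on `event`), with a dyad-level random effect as described in the abstract.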

Friendships under study were formed when adolescents were around 12 (Wave 

I), so we assume their formation is exogenous to the decision to have a child. At 

Wave III, each respondent had to indicate from a list of 10 previous school 

mates and friends (at Wave I), those who are current friends and those who are 

not. In this way, we can distinguish dyads of friends from those of people who 

simply share a common social context (they went to school together). By 

including these two types of ties in our analysis, we can separate true cross- 

friend interaction from contextual effects. Moreover, to distinguish selection from 

influence (people might remain friends with those who share similar family 

attitudes), we estimate a simultaneous equation model, in which we jointly estimate 

the probability of being a current friend of the other person in the 

dyad and one friend’s risk of becoming a parent, using as exclusion restriction 

the geographical distance between the two friends. 

Results show that net of contextual and selection effects, a friend’s childbearing 

positively influences an individual’s risk of becoming a parent. We find this effect 

to be strong in the short term and inverse U-shaped: it increases and starts to 

become significant one year after the friend’s childbearing, reaches its peak 

24-36 months later, and then decreases.

Presenter 

Dietrich, Hans; Institute for Employment Research 

Authors 

Hans Dietrich; Institute for Employment Research 

Anna Manzoni; Yale University; IAB/University Nürnberg 

Title 

The manifold effect of social background on youth unemployment outcome: 

unemployment outcome estimates using survey data and register data 

Abstract 

Previous research consistently found measurement error in retrospective 

data on unemployment. Mixed findings are reported on the relation 

between unemployment reports and respondents’ socio-economic characteristics, 

depending on type of data compared, the complexity of the event structure or 

the distance between the occurrence of the event and the time of the report. 

However, only a few research findings are available about the effect of 

measurement error or competing measurements on model outcomes. 

In particular, we are substantially interested in the effect of social 

background on the labor market outcomes of the unemployed. Background 

related findings are reported both concerning the duration of unemployment and 

the transition out of unemployment. However, we are also concerned with the 

extent of error in the report of unemployment. Using register based data on 

unemployment as reference (quasi gold standard), Dietrich described systematic 

effects of social background, level of education, school performance, and labor 

market status at the time of interview on individuals’ reporting of occurrence and 

duration of unemployment in retrospective survey data. 

However, we know from combined register data on employment and 

unemployment in Germany (IEB) that register information on unemployment is 

subject to measurement error, too. Our hypothesis 

is that social background works in a manifold way, affecting the report of 

unemployment episodes on the one hand, and labor market outcomes, such as 

the hazard of leaving unemployment on the other hand. We aim to disentangle

the interplay of social background effects on unemployment report and on the 

labor market outcomes of the unemployed. 

Two datasets: 

-The so-called Jugalo sample, a multi-wave-survey where 4000 German 

youths who were below 25 and registered as unemployed between 1998 and 

1999 were interviewed in the years 2000, 2001 and 2004. It delivers monthly 

information on educational and work biographies, as well as longitudinal 

information on household composition, social background and individual 

characteristics, like work orientation or mental health. 

-Official register information on unemployment, employment and active 

labour market scheme participation from the German Social-Security-System, 

containing longitudinal information about individuals’ labor market activities on a 

daily basis. 

Using an anonymized personal ID provided in both the datasets, we 

successfully match the two datasets on a monthly basis for 3,635 respondents. 

For our analysis we consider information for the first two waves of the Jugalo 

survey and restrict our analysis to the episodes of unemployment in 1998/99 

from which the sample was generated, and their corresponding outcomes. 

Method: We first look descriptively at the timing and type of labor market 

episodes. Then we apply a latent Markov model, which allows us to account for 

measurement error. In particular, we relax the assumption that register based 

data work as a gold standard and account for correlated measurement error. We 

assume that individual characteristics and social background in particular, as 

well as the labor market state at the time of survey, affect both measurement 

error and labor market outcomes. Eventually, we use the estimates of the true 

(latent) unemployment state from the latent Markov model in a hazard model to 

predict the labor market outcome of interest. The first findings support 

our hypothesis.


Haeberling, Isabel; University of Zurich, Institute of Sociology 

Title 

Children: statistically rare events. On the importance of using logistic regression 

for rare events data 

Abstract 

“People have kids anyway” – This statement from the German statesman 

Konrad Adenauer in the 1950s no longer holds true. Only 10 out of 1000 persons 

started or expanded their families in Switzerland per year between 2002 and 

2009. This is what makes starting and/or enlarging a family a rare event – not 

only in a statistical sense. This research project explores the determinants of this 

rare event from a sociological and statistical perspective – using different 

statistical research methods and revealing inherent differences. Research 

focuses particularly on the factors that cause individuals to carry through the 

process of having a baby. 

Recent reports in the media on demographic topics, such as global aging, 

human beings as an endangered species, and worries about who is going to care 

for all the elderly in the future, in combination with the conflict of reconciling 

work and family life, have triggered public interest and concern; and there are 

also important implications for (family) policies. Therefore, one has to analyze 

this topic carefully, using appropriate statistical methods. Basically, the 

majority of individuals show a general desire for children, which can easily be 

analyzed via conventional logistic regression. The precise intention to actually 

have a baby is much less common but still need not be categorized 

as rare, so the use of conventional regression methods is adequate. Finally, the 

realization of desire and intention is a rather rare event, a fact that points to a 

statistical problem that is widely ignored by scientists. Customary 

regression methods are no longer appropriate for analyzing this particular 

problem. 

Rare events data have binary dependent variables with dozens to thousands of times 

fewer ones than zeros. Popular statistical procedures such as conventional logistic 

regression can sharply underestimate the probability of already rare events 

(King 2001). In addition, data collection strategies for rare events data are

greatly inefficient: only very few events are surrounded by a very large number 

of non-events. These facts call for a method that takes the rareness of the 

dependent variable into account. As a consequence, a logistic regression for 

rare events data has to be conducted to analyze the reasons for starting and 

expanding a family. King & Zeng (2001a; 2001b) developed a method that 

addresses these two issues together, enabling both types of corrections to work 

simultaneously. 
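
As a hedged illustration of one correction from the rare-events logit literature, the sketch below implements the so-called prior correction of the intercept, which adjusts a logit intercept estimated on a sample whose share of events differs from the population share. The abstract does not state which corrections were applied to the SHP data, so this is an assumption-laden example only: `tau` uses the 10-in-1000 rate quoted above, while `ybar` and `b0_hat` are hypothetical values.

```python
import math


def prior_corrected_intercept(b0_hat, tau, ybar):
    """Prior correction of a logit intercept.

    b0_hat: intercept estimated on the (event-enriched) sample
    tau:    population share of events
    ybar:   sample share of events
    """
    return b0_hat - math.log(((1 - tau) / tau) * (ybar / (1 - ybar)))


def logistic(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


tau = 10 / 1000   # population rate quoted in the abstract
ybar = 0.5        # hypothetical: balanced sample of events and non-events
b0_hat = 0.0      # hypothetical intercept-only logit fitted on that sample

b0_corrected = prior_corrected_intercept(b0_hat, tau, ybar)
p_corrected = logistic(b0_corrected)   # recovers the population rate
```

In this intercept-only case the corrected probability equals `tau`; when the sample share already equals the population share, the correction term is zero and the intercept is left unchanged.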

For the first time in the field of Swiss demography, a comparison of a 

conventional logistic regression and a logistic regression for rare events data is 

conducted. This helps to statistically explore and research the different 

structures and factors that influence fertility behavior. The results indicate the 

importance of using a logistic regression specifically designed for rare events 

data. This study examines the aforementioned comparison on the basis of the 

Swiss Household Panel (SHP) by way of statistical methods for rare events data 

and conventional logistic regression. In so doing, it reveals important 

background factors and motives for fertility behavior without 

underestimating the probability of having a baby. In terms of empirical considerations, 

a new dimension is developed.


Pavlopoulos, Dimitris; Free University Amsterdam; KUL - University of Leuven 

Title 

Temporary unemployment: a flexibility arrangement to overcome the crisis. A 

study using 2-way fixed effects 

Motivation 

Temporary unemployment is a flexibility arrangement that was applied in 

many countries to mitigate the employment effects of economic crisis. As, for 

some countries, this is a new policy tool, we know very little about the long-term 

effects of this policy measure. In Belgium, temporary unemployment has been in 

use for a long time. Therefore, we can use the Belgian experience to draw conclusions 

on the short-term and long-term effects of this flexibility arrangement. 

Aim 

This paper investigates the effect of temporary unemployment on the 

wage growth of workers in the Belgian labour market by controlling for both 

worker and firm effects. We study the effects of both present and past 

experiences of temporary unemployment on wages. Furthermore, we study 

whether this effect varies according to the age and the tenure of the worker as 

well as with the sector and the size of the firm. In this way, we study whether 

temporary unemployment has long-term scarring effects on the career of the 

workers. 

Method 

We apply a panel regression model with 2-way fixed effects. In the study 

of wages, the use of the usual mixed models is inappropriate as the observed 

individual characteristics are believed to be correlated with the unobserved 

individual characteristics. Therefore, to estimate a panel wage regression, 

economists typically apply a so-called fixed-effects model where they estimate 

the first differences or the differences from the individual mean. When matched 

employer-employee data is available, we can control for 2-way fixed effects – 

individual and firm unobserved characteristics. Following similar approaches in 

the literature, we include 2-way fixed effects by first-differencing out the individual 

effects and then including dummies for the firms.
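
The first step of this strategy, removing the worker effect by first-differencing, can be sketched as follows. This is a toy illustration, not the authors' estimation code: the data, the worker labels, and the helper names are hypothetical, there is no noise term, and the firm dummies of the second step are omitted.

```python
def first_differences(series):
    """Return the period-to-period changes of a sequence."""
    return [b - a for a, b in zip(series, series[1:])]


def slope_through_origin(dx, dy):
    """OLS slope of dy on dx without an intercept."""
    return sum(x * y for x, y in zip(dx, dy)) / sum(x * x for x in dx)


# Hypothetical panel: x = 1 in periods with temporary unemployment.
# Each worker has a different fixed effect (alpha) but the same true effect.
true_beta = -2.0
workers = {
    "w1": {"alpha": 10.0, "x": [0, 1, 1, 0]},
    "w2": {"alpha": 50.0, "x": [1, 1, 0, 1]},
}

dx_all, dy_all = [], []
for w in workers.values():
    wage = [w["alpha"] + true_beta * x for x in w["x"]]
    dx_all += first_differences(w["x"])
    # Differencing cancels alpha, which is constant within a worker.
    dy_all += first_differences(wage)

beta_hat = slope_through_origin(dx_all, dy_all)
```

Because the worker effect is constant over time, it drops out of the differenced wage, so the slope is recovered even though the two workers have very different wage levels.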

Data 

We use matched employer-employee longitudinal data for two samples of 

5,000 workers that are initially employed in 302 firms from the Datawarehouse 

of the Belgian Crossroadsbank for Social Security. The Datawarehouse offers 

employment information at the trimester-level and unemployment information at 

the monthly level. The first sample was selected in the first trimester of 1998 

and workers were followed until the last trimester of 2003. The second sample 

was initially selected in the first trimester of 2002 and workers were followed 

until the last trimester of 2007. 

Results 

Our results indicate that current experiences of temporary unemployment 

are associated with lower wages for workers that have been employed by the 

firm for more than 1.5 years. This effect is stronger for older workers. In 

contrast, for workers that have been employed for shorter periods, no effect of 

recent temporary unemployment is found. Past experiences of temporary 

unemployment are especially harmful for workers 25-30 years old. As in the 

case of recent experiences of temporary unemployment, this effect increases 

with tenure. For workers with long tenure in the firm, the length of past 

temporary unemployment matters as well. Specifically, for these workers, the 

longer they have been in temporary unemployment in the last year, the lower their 

wage is. In contrast, for workers older than 30, past experiences of temporary 

unemployment are not associated with a lower wage. Therefore, it seems 

that temporary unemployment has some scarring effect on the career of young 

workers but is not harmful for the career of prime-age or older workers.

KEY NOTE SPEAKER 

Sijtsma, Klaas k.sijtsma@uvt.nl 

Dept. Methodology and Statistics, Tilburg School of Social and Behavioral Sciences 

Title 

Psychological Measurement Between Physics and Statistics 

Abstract 

This contribution discusses the physical perspective on psychological 

measurement represented by additive conjoint measurement and the statistical 

perspective represented by item response theory, and argues that both fail to 

adequately address the real measurement problem in psychology, namely the 

absence of well-developed theories about psychological attributes. I argue that 

the two perspectives leave psychology out of the equation and by doing that 

come up with proposals for psychological measurement that are fruitless. Only 

the rigorous development of attribute theories can lead to meaningful 

measurement. I provide two examples of the measurement of well-developed 

attributes and suggest future directions for psychological measurement.

KEY NOTE SPEAKER 

Snijders, Tom A.B. Tom.Snijders@nuffield.ox.ac.uk 

Nuffield College 

University of Oxford 

Title 

Statistical models for dynamics of social networks: inference and applications 

Abstract 

The main issue for statistical modelling of social networks (represented 

mathematically mainly by directed graphs) is how to express the dependencies 

between the ties in the network. This is less complicated for longitudinally than 

for cross-sectionally observed networks, because the time-ordering assists in the 

representation of these dependencies. Stochastic actor-oriented models are a 

class of continuous-time Markov chain models for representing network 

dynamics. These models assume that the actors, represented by the nodes in 

the network, control their outgoing network ties, subject to inertia and 

contextual constraints, and with an element of randomness to represent the 

unpredictability of social behaviour. The transition distribution can depend in 

potentially complex ways on current network structure and monadic or dyadic 

covariates. Estimation procedures have been developed for such models using 

network panel data, i.e., repeated measures of the network collected at two or 

more discrete time points, according to the method of moments, the maximum 

likelihood principle, as well as Bayesian methods. 

The actor-oriented model is presented with an outline of the estimation 

procedures, and a review is given of some of the applications that have 

appeared in the literature.




SESSION A: Categorical Marginal Models 

SUMMARY 

Introduction and applications of categorical marginal models 

Chair of this session: Andries van der Ark a.vdark@uvt.nl 

PRESENTERS: 

Jacques Hagenaars jacques.hagenaars@uvt.nl 


Wicher Bergsma w.p.bergsma@lse.ac.uk 

Dept. of Statistics, London School of Economics and Political Science, U.K. 

Renske Kuijpers r.e.kuijpers@uvt.nl 


Marcel Croon m.a.croon@uvt.nl 

Dept. Methodology and Statistics, Tilburg School of Social and Behavioral Sciences

Presenter 

Hagenaars, Jacques A.P.; Dept. Methodology and Statistics, Tilburg School of 

Social and Behavioral Sciences 

Authors 

Jacques Hagenaars and Marcel Croon; Tilburg University, the Netherlands 

Wicher Bergsma; London School of Economics and Political Science, U.K. 

Title 

Introduction to CMMs: 

Marginal Models for dependent, clustered and longitudinal categorical data. 

Abstract 

Dependent observations may arise in many research settings (e.g., in 

cluster, matched or longitudinal samples) or may arise in contexts where 

otherwise the observations are independent from each other, but where the 

research question 'makes' them dependent. Ignoring such dependencies and 

treating the observations as independent will distort the standard errors of the 

estimates but may also bias the estimates of the (effect) parameters. One 

solution is to model the dependencies, as in autocorrelation or random effect 

models. However, for many research questions marginal modeling is the best 

solution, in which the dependency is treated as a nuisance and the parameters 

of interest are estimated taking this nuisance into account (without modeling 

it). In this presentation, the emphasis will be on showing the potentialities of 

marginal models for answering many different types of important research 

questions.

Presenter 

Bergsma, Wicher; Dept. of Statistics, London School of Economics and Political 

Science 

Authors 

Wicher Bergsma; London School of Economics and Political Science, U.K. 

Marcel Croon and Jacques Hagenaars; Tilburg School of Social and Behavioral 

Sciences 

Title 

Marginal Models for Dependent, Clustered, and Longitudinal Categorical Data 

Abstract 

In the social, behavioural, educational, economic, and biomedical 

sciences, data are often collected in ways that introduce dependencies in the 

observations to be compared. For example, the same respondents are 

interviewed at several occasions, several members of networks or groups are 

interviewed within the same survey, or, within families, both children and 

parents are investigated. Statistical methods that take the dependencies in the 

data into account must then be used, e.g., when observations at time one and 

time two are compared in longitudinal studies. At present, researchers almost 

automatically turn to multi-level models or to GEE estimation to deal with these 

dependencies. Despite the enormous potential and applicability of these recent 

developments, they require restrictive assumptions on the nature of the 

dependencies in the data. Marginal models provide another way of dealing with 

these dependencies, without the need for such assumptions, and can be used to 

answer research questions directly at the intended marginal level. The present 

talk will focus on the maximum likelihood method, which has many attractive 

statistical properties, for fitting marginal models. 

This talk is based on a recent book by the authors in the Springer series 

Statistics for the Social Sciences, see www.cmm.st.

Presenter 

Croon, Marcel; Tilburg School of Social and Behavioral Sciences 

Authors 

Marcel Croon; Jacques A. Hagenaars; Department Methodology and Statistics, 

Tilburg University 

Wicher Bergsma; London School of Economics and Political Science, UK 

Francesca Bassi; University of Padova, Padova, Italy 

Title 

Marginal models for longitudinal categorical data from a complex rotating design 

Abstract 

In their book Marginal Models for Dependent, Clustered, and Longitudinal 

Categorical Data (2009), Bergsma, Croon & Hagenaars discuss several 

applications of marginal models for categorical data observed in longitudinal 

studies. They distinguish between the analysis of trend data, when different 

random samples from the same population are drawn at different time points, 

and panel data, when the same random sample from a population is observed at 

different time points. For both types of data, they discuss how various 

hypotheses about gross and net changes over time can be tested by marginal 

modeling. 
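
To make the distinction concrete, the sketch below computes net change (the shift in the marginal distributions) and gross change (the share of individuals who switch category) from a two-wave turnover table. The table values and function names are hypothetical and serve only as an illustration; testing hypotheses about these quantities, as in the book, requires the marginal modeling machinery itself.

```python
def marginals(table):
    """Row and column marginal proportions of a square turnover table."""
    n = sum(sum(row) for row in table)
    rows = [sum(row) / n for row in table]
    cols = [sum(row[j] for row in table) / n for j in range(len(table[0]))]
    return rows, cols


def net_change(table):
    """Difference between time-2 and time-1 marginal proportions."""
    rows, cols = marginals(table)
    return [c - r for r, c in zip(rows, cols)]


def gross_change(table):
    """Proportion of cases off the main diagonal (individual turnover)."""
    n = sum(sum(row) for row in table)
    moved = sum(table[i][j]
                for i in range(len(table))
                for j in range(len(table[i])) if i != j)
    return moved / n


# Hypothetical panel table: rows = status at time 1, columns = status at time 2.
turnover = [[40, 10],
            [5, 45]]

net = net_change(turnover)      # small shift in the marginals
gross = gross_change(turnover)  # larger share of individual switches
```

Here 15% of respondents change category while the marginal distribution shifts by only 5 percentage points, which is why gross and net change call for separate hypotheses.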

These methods can be extended to the case in which the data are collected in a 

more complex way, for instance, by means of a rotating design in which different 

random cross-sectional samples are followed over time at different measurement 

occasions. The data which will be analyzed come from the Italian Continuous 

Quarterly Labour Force Survey, which is cross-sectional with a 2-2-2 rotating 

design. The questionnaire yields multiple indicators of labour force participation 

for each quarter: (i) each respondent is classified as employed, unemployed or 

out of the labour market according to the definition of the International Labour 

Office on the basis of answers given to a group of questions; (ii) each 

respondent is asked to classify himself as employed, unemployed or out of the 

labour market, the so-called self-perceived condition; and (iii) a retrospective

question asks about condition in the labour market one year before the 

interview. 

In the analysis of the data from this survey, the emphasis is on studying how 

changes in labour status are reflected by each of the three indicators, and how 

differences and similarities among them change over time.


Kuijpers, Renske E.; Tilburg School of Social and Behavioral Sciences 

Title 

Testing Cronbach's Alpha Using Feldt's Approach and a New Marginal Modelling 

Approach 

Abstract 

Feldt developed an approach for testing three relevant hypotheses 

involving Cronbach's alpha: H01, alpha equals a particular criterion; H02, two 

alpha coefficients computed on two independent samples are equal; and H03, 

two alpha coefficients computed on the same sample are equal. The assumptions 

of Feldt's approach are unrealistic for many test and questionnaire data, and 

little is known about the robustness of the approach against violations of the 

assumptions. We propose a new approach to testing the three hypotheses. The 

new approach uses marginal modelling and is based on weaker assumptions. 

The Type I error rate and the power of both approaches were compared in a 

simulation study using realistic conditions. In general, the two approaches 

showed similar results, indicating that Feldt's approach is robust against violations 

of the assumptions. In some cases, however, the marginal modelling approach 

was more accurate: For computing Type I error rates for very high values of 

alpha, for computing Type I error rates for hypothesis H03, and for computing 

the power of hypothesis H03 using a small sample size.
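
For readers unfamiliar with the coefficient being tested, the following sketch computes the sample value of Cronbach's alpha from a respondents-by-items score matrix using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). It does not implement Feldt's tests or the marginal modelling approach; the function name and the toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Sample Cronbach's alpha.

    scores: list of rows, one per respondent, each a list of k item scores.
    """
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the total (sum) score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: three respondents, perfectly consistent items.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha_consistent = cronbach_alpha(consistent)

# Hypothetical data: two items that agree only partly.
partial = [[1, 2], [2, 1], [3, 3]]
alpha_partial = cronbach_alpha(partial)
```

The hypotheses H01 to H03 in the abstract concern the population value of this coefficient, for which the sample value above is the usual estimate.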




SESSION B: Categorical Marginal Models 

SUMMARY 

New developments in categorical marginal models 

Chair of this session: Wicher P. Bergsma w.p.bergsma@lse.ac.uk 

PRESENTERS: 

Andries van der Ark a.vdark@uvt.nl 


Tamás Rudas rudas@tarki.hu 

Dept. of Statistics, Eötvös Loránd University (ELTE), Budapest, Hungary 

Antonio Forcina forcina@stat.unipg.it 

Dept. of Economics, Finance and Statistics, University of Perugia, Italy 

Alberto Roverato alberto.roverato@unibo.it 

Dept. of Science Statistics, University of Bologna, Italy


Forcina, Antonio; Dept. of Economics, Finance and Statistics, University of 

Perugia, Italy 

Title 

Smoothness of Conditional Independence Models for Discrete Data 

Abstract 

The paper is about a family of conditional independence models which require 

constraints on complete but non hierarchical marginal log-linear parameters. For 

such models, whose dependence structure cannot be represented by any of the 

known graphical separation criteria, it is not known whether the model is 

smooth, so that the usual asymptotics can be applied. A model is called non 

smooth when the variety which it defines in the parameter space contains points 

which do not admit a local approximation by a linear space. 

By exploiting results on the mixed parameterization within the exponential 

family, we determine a condition which has to be satisfied for the model to be 

smooth. The condition has to do with the possibility of reconstructing the joint 

distribution from the set of marginal log-linear parameters in a unique way. In 

technical terms, the condition requires that a certain Jacobian matrix has spectral 

radius strictly less than 1. In the simple case in which only two marginals are 

involved, we show that this condition is always satisfied. In the general case, we 

describe an efficient numerical test for checking whether the condition is 

satisfied with high probability. This approach is illustrated with several examples 

of non hierarchical conditional independence models and by a directed cyclic 

graph model; we establish that all these models are smooth.

Presenter 

Rudas, Tamás; Eötvös Loránd University 

Authors 

Tamás Rudas and Renáta Németh; Eötvös Loránd University 

Title 

Marginal Models of Social Mobility 

Abstract 

The talk shows how path models may be defined within the marginal 

modeling framework. The key assumption of a path model is that only effects 

associated with edges or arrows of a graph exist among the variables. This 

assumption is straightforward within the Gaussian framework but is a real 

restriction for categorical data. Marginal log-linear parameters are used to 

quantify the magnitude of the effects allowed by the model. As illustrative 

applications, status attainment models will be defined and analyzed.


Roverato, Alberto; Dept. of Science Statistics, University of Bologna, Italy 

Title 

Log-linear Moebius models for binary data 

Abstract 

Models of marginal independence can be useful in several contexts and 

sometimes they may be used to represent independence structures induced 

after marginalizing over latent variables. A relevant class of marginal models is 

given by graphical models for marginal independence that use either bi-directed 

or dashed undirected graphs to encode marginal independence patterns between 

the variables of a random vector (Cox and Wermuth, 1993). When variables 

follow a multinomial distribution, graphical models for marginal independence 

are curved exponential families and the marginal independence restrictions 

correspond to complicated non-linear restrictions on the parameters of the 

traditional log-linear models. Parameterizations more suitable in this context 

have been proposed by Drton and Richardson (2008), hereafter DR2008, and by 

Lupparelli, Marchetti and Bergsma (2009), hereafter LMB2009. DR2008 introduced 

the Moebius parameters and showed that marginal independence constraints 

correspond to the factorization of certain mean parameters of the exponential 

family representation of the model. Although it is not straightforward to identify 

the set of factorizations corresponding to a given independence model, this 

parameterization has several advantages and, in particular, the likelihood can be 

written in closed form as a function of the Moebius parameters. Subsequently, 

LMB2009 proposed a mixed parametrization, denoted by lambda, based on 

marginal log-linear parameters such that graphical models for marginal 

independence can be specified by setting to zero certain lambda terms. In this 

framework, however, it is not possible to write the parameters of the 

multinomial distribution as a function of lambda in closed form. In this paper, we 

introduce a class of models for binary variables that we call the log-linear 

Moebius models. A first feature of this class of models is that it includes, as a 

special case, graphical models for marginal independence. The parameters of our 

class of models, which we call gamma, are not a mixed parametrization and, in 

fact, they are closely related to the Moebius parameters of DR2008 and allow us

to write the likelihood in closed form. Nevertheless, similarly to the 

parametrization of LMB2009, marginal independence can be specified directly by 

a set of zero constraints. More generally, log-linear Moebius models can be seen 

as an extension of graphical models for marginal independence because they 

make it possible to specify additional independence relationships, in 

subpopulations of interest, by imposing linear constraints on the gamma 

parameters. 

Cox, D. R. and Wermuth, N. (1993). Linear dependencies represented by chain 

graphs (with discussion). Statist. Sci. 8, 204–218, 247–277. 

Drton, M. and Richardson, T. S. (2008). Binary models for marginal 

independence. J. R. Stat. Soc. Ser. B Stat. Methodol. 70, 287–309. 

Lupparelli M., Marchetti, G. M. and Bergsma, W. P. (2009). Parameterization 

and fitting of discrete bi-directed graph models. Scandinavian Journal of 

Statistics 36, 559–576.


Ark van der, Andries; Tilburg School of Social and Behavioral Sciences 

Title 

Categorical Marginal Models for Large Sparse Contingency Tables 

Abstract 

Categorical marginal models (CMMs) are flexible tools to model location, 

spread, and association in categorical data that have some dependence 

structure. The categorical data are collected in a contingency table; location, 

spread, or association are modelled by restricting certain marginals of the 

contingency table. If contingency tables are large, maximum likelihood 

estimation of the CMMs is no longer feasible due to computer memory problems. 

We propose a maximum empirical likelihood estimation (MEL) procedure for 

estimating CMMs for large contingency tables, and discuss three related 

problems. The problem of finding the correct design matrices and the so-called 

empty set problem can be solved satisfactorily, but the problem of obtaining good 

starting values remains unsolved. A simulation study shows that for small 

contingency tables ML and MEL yield comparable estimates. For large tables, 

when ML does not work, MEL has good sensitivity and specificity if good 

starting values are available.




SESSION: Multilevel analysis 

SUMMARY 

Multilevel analysis is extensively used in cross-national survey research and in 

the current session the latest developments in this field are presented. 

The first talk is about distinguishing longitudinal from cross-sectional variation 

and explaining why some societies change more than others. The question “How 

many countries are needed for an accurate multilevel SEM?” is answered in the 

second presentation. Third, a Stata command is presented that recently became 

available to fit multilevel models in MLwiN from within Stata. Last, models for 

predicting (dichotomous) outcomes at the national-level from explanatory 

variables at the individual-level are presented. 

PRESENTERS: 

Malcolm Fairbrother m.fairbrother@bristol.ac.uk 

University of Bristol 

Bart Meuleman bart.meuleman@soc.kuleuven.be 

Katholieke Universiteit Leuven 

George Leckie g.leckie@bristol.ac.uk 

University of Bristol, Centre for Multilevel Modelling 

Margot Bennink (chair of the session) m.bennink@uvt.nl 



Fairbrother, Malcolm; University of Bristol 

Title 

On the Multiple Ways of Using Multilevel Models to Study Social Change 

Abstract 

Analyses of repeated cross-sectional survey data have relied increasingly 

on multilevel/random effects models, in two ways. First, multilevel models have 

been used to distinguish age, period, and cohort effects, where the goal is to 

understand the mechanism by which some social change is occurring. Second, 

models of survey respondents nested within social units (typically countries or 

states) have been used to examine the effects of society-level conditions on 

individual-level outcomes. Both approaches, however, provide limited insights 

into the drivers of change over time. The former approach does not exploit 

differences among societies experiencing more or less change, and the latter 

does not distinguish longitudinal from cross-sectional variation. This paper 

illustrates how to overcome these limitations, by group mean-centring time- 

varying covariates—allowing for longitudinal effects to be distinguished from 

cross-sectional effects—and by fitting growth curves at the group level. Growth 

curves, where units of analysis are presumed to have unique random slopes for 

time, allow for the rate of some social change to be a function of a time- 

invariant covariate. This is the relationship many social theories implicitly 

expect, no matter whether change is mostly due to period or cohort effects. The 

paper concludes with an application to the study of why religiosity has declined 

(or secularism expanded) in some countries and not others.
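
The group mean-centring step mentioned in this abstract can be sketched as follows. This is a minimal illustration in Python, not the author's code: the helper name group_mean_centre, the field names (country, year, gdp) and all numbers are hypothetical and chosen only to show the decomposition of a time-varying covariate into a cross-sectional (between) part and a longitudinal (within) part.

```python
from collections import defaultdict


def group_mean_centre(rows, group_key, covariate):
    """Split a time-varying covariate into between and within components.

    rows: list of dicts, one per group-time observation (hypothetical layout).
    Returns copies of the rows with two added fields:
      <covariate>_between: the group's mean over all its time points
      <covariate>_within:  the observation's deviation from that group mean
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[covariate]
        counts[row[group_key]] += 1
    means = {group: totals[group] / counts[group] for group in totals}

    centred = []
    for row in rows:
        mean = means[row[group_key]]
        new_row = dict(row)
        new_row[covariate + "_between"] = mean
        new_row[covariate + "_within"] = row[covariate] - mean
        centred.append(new_row)
    return centred


# Invented illustrative values, not data from the paper.
country_years = [
    {"country": "A", "year": 1990, "gdp": 10.0},
    {"country": "A", "year": 2000, "gdp": 14.0},
    {"country": "B", "year": 1990, "gdp": 20.0},
    {"country": "B", "year": 2000, "gdp": 22.0},
]
centred = group_mean_centre(country_years, "country", "gdp")
for row in centred:
    print(row["country"], row["year"], row["gdp_between"], row["gdp_within"])
```

In a multilevel model fitted to such data, the coefficient on the within part would reflect the longitudinal association and the coefficient on the between part the cross-sectional one.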


Meuleman, Bart; Katholieke Universiteit Leuven, Belgium 

Title 

A Monte Carlo sample size study: how many countries are needed for accurate 

multilevel SEM? 

Abstract 

Thanks to the increasing availability of international survey data (e.g. the 

European Values Study and the European Social Survey), there is growing 

scientific interest in cross-national comparisons of values, attitudes and 

opinions. Various scholars have used international surveys to link individual 

characteristics to aspects of the national context. Often, multilevel techniques 

are applied to explain individual-level variables by means of country-level 

features. 

However, the application of multilevel models in the field of cross-national 

research is far from unproblematic. Due to budget limitations, the number of 

participating countries does not exceed 25 for most international surveys. 

Consequently, the group level sample sizes are often substantially lower than 

what rules of thumb suggest (at least 50 or 100 units). On the other hand, 

cross-national surveys typically contain a large number of respondents per 

country (> 1000). 

This paper summarizes the results of a Monte Carlo study that was carried 

out to assess the accuracy of multilevel modeling in the domain of cross-national 

research. More specifically, the study concentrates on a rather recent but very 

promising statistical tool, namely multilevel structural equation modeling (SEM). 

A multilevel SEM, in which a latent factor is explained by a within- and a 

between-level variable, is simulated. In order to reproduce realistic 

circumstances as much as possible, the situation of the ESS round 1 (2002-2003), 

with 22 countries and over 40,000 respondents, is taken as a starting point. 

The size of the between-level variable effect and the intra-class correlations are 

manipulated. In order to test whether trade-off effects between individual 

sample size and group sample size are present, various numbers of countries 

and respondents per country are simulated. For all conditions, the parameter 

estimates and their respective standard errors for both the within- and between-level 

model are evaluated. Special attention is given to the power for detecting the 

effect of the between-level variable.
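
The logic of such a Monte Carlo design can be sketched in a few lines of Python. The sketch below is deliberately much simpler than the study described in this abstract: it swaps the multilevel SEM for a plain random-intercept model and estimates only the intra-class correlation with the one-way ANOVA estimator. The function names, the true ICC of 0.10, the 50 respondents per country and the 100 replications are all assumptions made for illustration.

```python
import random
import statistics


def simulate_icc_estimate(n_countries, n_per_country, true_icc, rng):
    """Simulate one balanced two-level data set and return the ANOVA ICC estimate."""
    tau2 = true_icc          # between-country variance (total variance fixed at 1)
    sigma2 = 1.0 - true_icc  # within-country variance
    group_means = []
    within_ss = 0.0
    for _ in range(n_countries):
        u = rng.gauss(0.0, tau2 ** 0.5)  # country-level random intercept
        ys = [u + rng.gauss(0.0, sigma2 ** 0.5) for _ in range(n_per_country)]
        mean = statistics.fmean(ys)
        group_means.append(mean)
        within_ss += sum((y - mean) ** 2 for y in ys)
    # Balanced design: the mean of the group means equals the grand mean.
    grand = statistics.fmean(group_means)
    msb = n_per_country * sum((m - grand) ** 2 for m in group_means) / (n_countries - 1)
    msw = within_ss / (n_countries * (n_per_country - 1))
    tau2_hat = (msb - msw) / n_per_country
    return tau2_hat / (tau2_hat + msw)


def monte_carlo(n_countries, n_per_country, true_icc, replications, seed=1):
    """Return the mean and standard deviation of the ICC estimates over replications."""
    rng = random.Random(seed)
    estimates = [
        simulate_icc_estimate(n_countries, n_per_country, true_icc, rng)
        for _ in range(replications)
    ]
    return statistics.fmean(estimates), statistics.stdev(estimates)


# 22 countries (as in ESS round 1) versus a hypothetical 100 countries.
mean_few, sd_few = monte_carlo(22, 50, 0.10, replications=100)
mean_many, sd_many = monte_carlo(100, 50, 0.10, replications=100)
print(f"22 countries:  mean ICC = {mean_few:.3f}, SD = {sd_few:.3f}")
print(f"100 countries: mean ICC = {mean_many:.3f}, SD = {sd_many:.3f}")
```

Comparing the spread of the estimates across the two conditions illustrates why the number of countries, rather than the number of respondents per country, tends to limit the precision of between-level quantities.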

Presenter 

Leckie, George; University of Bristol, Centre for Multilevel Modelling 

Authors 

George Leckie; Chris Charlton 

Title 

Running MLwiN from within Stata: the runmlwin command 

Abstract 

The Centre for Multilevel Modelling is developing runmlwin, a Stata 

command to fit multilevel models in MLwiN from within Stata. There are three 

steps to using runmlwin: (1) The researcher specifies the desired model using 

the runmlwin command syntax; (2) The model is sent to and fitted in MLwiN; 

and (3) The results are returned to and displayed in Stata where they can be 

accessed for further analyses. 

runmlwin will benefit Stata users by enabling them to fit a considerably 

wider range of multilevel models than they can currently and to fit these models 

quickly and to large data sets using fast estimation engines. Stata users can 

then examine these models using the many interactive tools available in MLwiN. 

runmlwin will also benefit MLwiN users familiar with Stata as they can now type 

all the commands for their analysis into a single file and run them all at once. 

This makes it easy to document and reproduce the results for an entire series of 

MLwiN models. MLwiN users can then make use of Stata’s many inbuilt post- 

estimation commands to calculate predictions, perform hypothesis tests, and 

produce publication quality graphics. Even simulation studies are now easy to 

perform. 

In this talk, we shall provide an overview of the runmlwin command and then 

demonstrate runmlwin in action with several example multilevel analyses.

Presenter 

Bennink, Margot; Dept. Methodology and Statistics, Tilburg School of Social 

and Behavioral Sciences 

Authors 

Margot Bennink; Marcel A. Croon; Jeroen K. Vermunt; Dept. Methodology and 

Statistics, Tilburg School of Social and Behavioral Sciences 

Title 

Micro-Macro analysis for discrete outcomes 

Abstract 

This study deals with models for predicting outcomes at the higher level (e.g. 

team performance) from explanatory variables at the lower level (e.g. 

employee’s motivation and skills). This “reversed” multilevel analysis problem is 

rather common in social sciences, and is sometimes referred to as micro-macro 

analysis. Recently, Croon and Van Veldhoven proposed a statistical model for 

micro-macro multilevel analysis which involves using a factor analytic structure 

in which the scores of the lower-level units are seen as indicators of latent 

factors at the group level. The key is that the outcome variable is not regressed 

on the aggregated group mean(s) of the micro-level predictor(s) but on the 

latent macro-level variable(s). The aim of the project, of which the current 

study is a part, is to generalize this approach so that it can also be applied when 

the explanatory and/or outcome variables are discrete instead of continuous and 

normally distributed. Two new models for micro-macro relations between 

discrete variables are presented: a simple 1-2 model in which a dichotomous 

micro-level variable affects a dichotomous macro-level outcome variable, and a 

more complex 2-1-2 model in which a dichotomous macro-level variable has a 

direct effect on a dichotomous macro-level outcome variable and an indirect 

effect on the outcome through a dichotomous mediating variable defined at the 

micro-level. In both models the latent variable at the group level is defined to be 

discrete (latent classes). We present the theoretical background of the models, a 

simulation study in which their performance is evaluated, as well as an empirical 

application.




SESSION: Response Styles and Response Behavior 

SUMMARY 

The quality of survey data is strongly influenced by variations in the 

cognitive effort respondents are willing to invest in answering interview 

questions. As pointed out by Krosnick (1996), respondents are likely to simplify 

challenging cognitive processes, and to reduce the cognitive effort necessary 

(“survey satisficing”). Effects of satisficing include item-nonresponse, 

acquiescence, response sets, and non-differentiation in item batteries. Satisficing 

therefore substantially lowers data quality and contributes to the total survey 

error. This session comprises papers on variations in response style and 

response behavior, and studies different aspects of satisficing strategies. We 

present research on the impact of interviewing mode, question type and 

response scale format on response behavior. 

Presenters: 

Cornelia Züll & Evi Scholz cornelia.zuell@gesis.org 

Gesis – Leibniz Institute for the Social Sciences, Germany 

Henning Best (chair of this session) henning.best@gesis.org 

Gesis – Leibniz Institute for the Social Sciences and University of Mannheim 

Juergen H.P. Hoffmeyer-Zlotnik juergen.hoffmeyer-zlotnik@gesis.org 

Gesis – Leibniz Institute for the Social Sciences and University of Giessen 

Dagmar Krebs dagmar.krebs@sowi.uni-giessen.de 

University of Giessen, Germany

Presenter 

Züll, Cornelia & Scholz, Evi; Gesis – Leibniz Institute for the Social Sciences, 

Germany 

Authors 

Cornelia Züll & Evi Scholz 

Title 

Item Nonresponse in Open Ended Questions: Empirical Analyses of Respondents’ 

Answering Behaviour on the Meaning of Left and Right. 

Abstract 

One of the main topics of the German Social Survey (ALLBUS) in 2008 

was “political attitudes and political participation”. As in many other political 

science based surveys, self-placement on a left-right scale was asked as an 

indicator of ideological self-identification. Though left-right self-placement is 

one of the most frequently used measures in empirical political science research, 

the respondents' associations with “left” and “right” have been queried only rarely in 

the last decades of survey research. ALLBUS 2008 included two open-ended 

questions directly following the left-right scale itself and thus allows us to gain 

important insights into how respondents use the left-right scale: “What do you 

mean by left/right”. However, item non-response on these open-ended questions 

has to be well considered before the associations with “left” and “right” are 

analyzed and results are interpreted. About 20% of the respondents answered 

“don’t know” or did not answer the question at all. Such a considerable amount 

of non-response might have effects on data quality and, hence, on the 

interpretation of the results. We assume that respondents answering “don’t 

know” or those who did not give any answer would have problems with the self- 

placement on the left-right scale. We further assume that demographic and 

political indicators, i.e. education – both formal and political – or political 

interest, influence the non-response behavior. We will present the results of our 

investigation of item non-response on the questions about associations with 

left/right and discuss quality problems related to the validity of the left-right 

scale itself.


Best, Henning; Gesis – Leibniz Institute for the Social Sciences and University 

of Mannheim 

Title 

Survey-Satisficing in Telephone and Face-to-Face Interviews. A Comparison of 

Non-Differentiation in Item Batteries. 

Abstract 

Answering interview questions requires substantial cognitive effort from 

respondents, no matter which interview mode is used (telephone, face-to-face or 

mail). The respondents need to concentrate on the interview, interpret the 

question’s meaning, recall knowledge on attitudes or past behaviors, and finally 

formulate an appropriate answer. Starting from Simon’s (1955) concept of 

bounded rationality, Krosnick (1996) argues that respondents tend to simplify 

cognitive processes and therefore to reduce the effort involved in answering 

survey questions (“survey satisficing”). Effects of satisficing include acquiescence, 

response sets, and non-differentiation in item batteries. Holbrook et al. (2003) 

hypothesize the tendency for survey satisficing to be stronger in telephone 

interviewing, as compared to personal interviews. Satisficing 

behavior may then lead to lower data quality in telephone surveys 

(“satisficing bias”). We test Krosnick’s and Holbrook et al.’s hypotheses using 

data from a large German survey on media consumption. The survey was 

conducted in 2000 using CATI as well as CAPI, using identical questions and item 

batteries. First results indicate the amount of survey satisficing to be higher in 

the CATI survey. Additionally, the mode effect is stronger among respondents with 

low education.


Hoffmeyer-Zlotnik, Juergen H.P.; Gesis – Leibniz Institute for the Social 

Sciences and University of Giessen 

Title 

Effects of Response Scale Formats in Comparative Survey Research 

Abstract 

In international comparative social survey research many important 

problems in translating and harmonizing the questions have been solved in the 

recent years. However, critical issues regarding the response scales used for 

attitude measurement still remain unresolved. In international comparative 

survey research we know that the perceived distance between scale points will 

change when the response scales are translated. The interpretation of vague 

quantifiers used to verbalize scale points strongly varies by culture. Additionally, 

there is a lack of research on effects of the response scale format on response 

behavior. Although there has been research on the mid-point of rating scales as 

well as on the direction of response scales from a national perspective, only a 

small number of studies has been published on these important questions in the 

context of international surveys. In this paper I will argue that international 

research projects oftentimes rely on national traditions in formulating response 

scales, which are more often based on ideology than on research. I 

present effects of response scale formats found in national research and discuss 

these effects with regard to the practice of international survey research and the 

cross-cultural comparability of survey data.


Krebs, Dagmar; University of Giessen, Germany 

Title 

The Impact of Direction and Polarity in Response Scales on Response Behavior 

Abstract 

The application of cognitive theory to survey methodology uncovered that 

answering survey questions is a cognitive process consisting basically of four 

tasks: question interpretation, memory retrieval, judgment formation, and 

response editing. This paper deals with the latter two tasks in examining the 

effect of polarity (uni- versus bipolar response scales) within answering 

categories running either from negative to positive or from positive to negative. 

The effect of polarity is expected to materialize primarily in the middle category 

of the scale: On a unipolar scale, the middle category indicates medium intensity 

whereas on a bipolar scale, the midpoint indicates neutrality. At the same time, 

responses on the bipolar scale are expected to tend more to the positive than to 

the negative area of the scale. However, with changing direction of the response 

scale, these effects might be stronger in the scale format starting with the 

negative response option than in the format starting with the positive response 

option. The study was conducted with repeated measurement, asking identical 

respondents identical questions with different methods – here uni- versus bipolar 

scales, formulated in positive as well as negative directions. For all questions a 

7-point response scale was used. Question content refers to achievement and 

job motivation. Based on the repeated measures with different scale 

polarization, reliability and validity of indicators are tested and the impact of uni- 

versus bipolar scale format on measurement quality (method effect) is tested.




SESSION: Applications of Latent Variable Models 

SUMMARY 

For many constructs of interest in the social sciences, educational 

measurement, psychology, biology, and economics, no direct method exists for 

measurement. Nonetheless, examples of such constructs are legion, for instance 

think of political attitudes, abilities, personal traits, or product preferences. To 

get a hold of such constructs, researchers gather observable variables (hereafter 

called indicators, manifest variables, or items) which they hope will provide 

indirect evidence for the constructs of interest. Latent variable models are 

statistical models built to quantify and help objectify this type of inference by 

deriving a small set of latent unobserved variables that underlies the set 

of manifest variables and should reflect the constructs of interest. Well-known 

instances of this type of modeling approach are factor analysis (Thurstone, 

1947), latent class and latent profile analysis (Lazarsfeld & Henry, 1968), and 

item response theory (Lord & Novick, 1968). 

Although the foundations of latent variable models were laid several years 

ago, it has taken quite some time before they became widely applied. This is 

mainly due to the statistical nature of these models as well as the sometimes 

complicated computations and algorithms needed to estimate latent variable 

models. With the recent advances in computation speed and optimization 

algorithms and the availability of general purpose software able to fit latent 

variable models, this has become less of an issue. 

This session is intended as a brief showcase of the possibilities that a 

latent variable model framework can offer for research in the social and 

behavioral sciences. Each presentation will fill in and illustrate one of the most 

familiar model instances in the framework (see Figure).

PRESENTERS: 

Johan Braeken (chair of the session) j.braeken@uvt.nl 


Mart van Dinther m.vandinther@fontys.nl 

Dept. Pedagogical Studies: Educational Theory, Fontys University of Applied Sciences, 

Sittard/Tilburg 

Phoebe Mui phoebe.mui@gmail.com 

Research Master, Tilburg School of Social and Behavioral Sciences 

Gabriela Koppenol-Gonzalez g.v.koppenolgonzalez@uvt.nl 



Mui, Phoebe; Tilburg School of Social and Behavioral Sciences 

Title 

Latent Profile Analysis: 

Typologies of immigrants' acculturation attitudes: Comparing theory and data. 

Abstract 

Classifying people, concepts, or other entities into categories to streamline one's 

thinking and perceptions, is common practice in everyday life: By grouping 

entities the world gains structure. 

Also in scientific research such classification schemes are quite common: 

think of Durkheim's four types of suicide, Jung's psychological types, or Berry's 

acculturation strategies. The main advantage of such a typology is that a 

theoretical reference frame for further investigations is created. However, what 

is the value of such a theoretical typology in practice and how should one 

classify entities into the theoretical categories based upon the available data? 

Latent profile analysis [LPA] provides a starting point for answering these 

two questions. LPA can be used as a model-based clustering procedure, grouping 

entities based upon their similar properties. The model can be either entirely 

data-driven or restricted to correspond to a theoretical typology. This allows for 

a direct comparison between the theoretically expected typology and the 

prominent categories that are put forward by the data. 

LPA will be applied within the context of acculturation attitudes of 

immigrants in multicultural societies. How do immigrants typically deal with their 

cultural heritage and with the mainstream culture? Do they maintain their home 

culture, or do they adapt their cultural practice to fit in with the host culture? 

The most prominent account of acculturation is due to Berry. His model of 

acculturation consists of two dimensions: cultural maintenance and cultural 

change. Depending on one's relative preference along these two dimensions, 

four typical acculturation strategies are possible: integration, marginalization, 

assimilation, or separation.


Dinther, Mart van; Dept. Pedagogical Studies: Educational Theory, Fontys 

University of Applied Sciences, Sittard/Tilburg 

Title 

Factor Analysis: 

Perceived competence for higher education: Underlying structure and utility. 

Abstract 

Competence-based education emphasizes the development of 

competences, instead of acquiring isolated knowledge and skills. A competence 

is an integrated set of related knowledge, skills and attitudes, that enables the 

student to perform professional tasks. 

This study draws attention to the role of students’ own perceptions and beliefs in 

acquiring professional competencies. Based upon social cognitive theory and 

competence-based educational theory, perceived competence is anticipated to 

be a complex and broad construct. As an initial measure for this construct, a 

comprehensive self-report survey is created. 

To shed some more light on the underlying structure of, and possible 

driving forces behind perceived self competence, factor analysis is used. A 

series of 4 confirmatory factor analysis models (i.e., one-factor, multi-factor, 

second-order factor, and bi-factor model) was fitted. A substantive 

interpretation of the model results is provided and a link is made to the 

predictive validity of the survey. Although self-report measures are often 

criticized as being uninformative and not objective, some initial evidence is 

provided for the potential diagnostic value of a perceived competence measure 

for competence learning.


Koppenol-Gonzalez, Gabriela; Dept. Methodology and Statistics, Tilburg 

School of Social and Behavioral Sciences 

Title 

Latent Class Analysis: 

Understanding planning ability measured by the Tower of London: 

Identifying and characterizing cognitive strategies. 

Abstract 

The Tower of London (TOL) is a widely used instrument for assessing 

planning ability. People may adopt different strategies when confronted with a 

TOL problem. For instance, some people try to solve the problems by adopting a 

trial-and-error strategy, whereas others try to look ahead and think through 

every move before actually making the first one. It is obvious that some 

cognitive strategies are more efficient than others and that strategy 

effectiveness may interact with specific properties of given problems to be 

solved. 

TOL problem properties are directly observable, yet the cognitive strategies 

that people use to solve the problems are not. Hence to study this, a technique 

is needed that indirectly infers these cognitive strategies based upon the 

available data. In this study, latent class analysis was used to identify and 

characterize the most prominent cognitive strategies used when solving TOL- 

problems. 

The results suggest that four strategy groups can be distinguished which differ 

with respect to preplanning time, effects of problem properties on performance 

and overall performance. The findings offer an explanation for inconsistent 

findings in the literature on the relation between TOL problem solving and 

cognitive inhibition.


Braeken, Johan; Dept. Methodology and Statistics, Tilburg School of Social and 

Behavioral Sciences 

Title 

Item Response Models (Latent Trait Analysis): 

The ABC structure of aggression: Acknowledging context or not? 

Abstract 

Aggression is at the basis of much societal and personal suffering. As a 

consequence, the broad and complex construct of aggression has been studied 

extensively in the social and behavioral sciences. The aggression construct is 

commonly conceptualized as consisting of several, interrelated components that 

reflect the affective, behavioral, and cognitive aspects that are involved in 

aggression. This is the so-called ABC model. 

Prior studies mainly consider aggression at the general trait-and-attitude 

level. In contrast, this directed-imagery study is based upon a survey that 

assesses each aggression component in three different situational contexts; it 

allows aggression to be studied while acknowledging the context in which it arises. 

To test theories about the interrelations between the three aggression 

components, and to investigate the potential importance of context, 

multidimensional item response theory is used. A series of item response models 

is fitted that can be applied to any study with categorical responses and a

similar multi-trait multi-method design. The substantive interpretation of the 

model results illustrates that the difference between either accounting for the 

context, or not, gives rise to both qualitative as well as quantitative changes in 

related model inferences. Hence, context does matter when studying aggression.




SESSION: PPSM Session A: 

Panel Research, Nonresponse and Missing Data Analysis 

SUMMARY 

The German Priority Programme on Survey Methodology (PPSM) started 

in January 2008. It consists of a total of 16 projects and is still running (see 

www.survey-methodology.de). The present session aims at reporting on work 

from PPSM projects that address research questions related to longitudinal 

research (Bloom filter based cryptographic personal identification keys), access- 

panel research (Who joins a probability based access panel), administrative data 

for nonresponse analysis, and multiple imputation of incomplete count data. In 

addition, the session provides a brief introduction to the PPSM network. 

PRESENTERS: 

Uwe Engel (chair of the session) uengel@empas.uni-bremen.de 

Dept. of Social Sciences, University of Bremen, Germany. 

Rainer Schnell rainer.schnell@uni-due.de 

Institut für Soziologie, University of Duisburg-Essen, Germany 

Tobias Gramlich tobias.gramlich@uni-due.de 

University of Duisburg-Essen, Germany 

Kristian Kleinke kristian.kleinke@uni-bielefeld.de 

University of Bielefeld, Germany

Presenter 

Engel, Uwe; Dept. of Social Sciences, University of Bremen 

Authors 

Uwe Engel; Simone Bartsch and Helen Vehre; University of Bremen 

Title 

Who joins a probability based access panel? 

Abstract 

As part of the German Priority Programme on Survey Methodology 

(www.survey-methodology.de), large random telephone samples for the adult 

population of Germany were drawn to build up an access panel for the three 

survey modes fixed-line, mobile-phone, and online-interviewing. 14,200 realized 

interviews yielded a net panel size of 6,600 people. 

The study design involves recruitment interviews averaging 20 minutes in 

length. The questionnaire programme focuses on variables expected to be 

relevant for explaining survey participation in one way or another (e.g., survey 

items related to social exchange and integration, attitudes toward survey 

research, prior survey experience, personality traits, communication habit). 

A key feature of the study consists in an experimental design that, for 

refusal conversion attempts, combines interviewer tailoring efforts with the offer 

of ‘core interviews’ (of about half the full interview length) and ‘exit interviews’ 

(consisting of just two questions on prior survey experience). In this way a 

limited amount of survey data is obtained for respondents who would otherwise be 

non-respondents. In addition to the survey data from full-interview participants, 

we now also have, for two subsets of the whole variable list, corresponding 

information from core-interview and exit-interview participants. Hence it is possible 

to include the respective survey variables (along with paradata) in a model to 

predict the probability of obtaining a full recruitment interview. 

A further key feature of the study consists in a heavy use of paradata and 

metadata to predict response propensities. The available paradata includes 

information about the sequences of events that occurred during the process of 

repeated contact attempts. We identified all such sequences and coded them 

into a typology of 22 contact courses. Also available is the number of contact

attempts. In addition, a large set of items was collected on convincing efforts 

(the various arguments raised) to convert reluctant persons. Finally, we let the 

interviewers rate the degree of necessary convincing effort in cases of both 

success and failure, i.e., in cases ending in a full, core or exit interview or in a 

refusal by the target person. Besides, the available metadata includes a 

set of questions on possible reasons why the person decided to take part in the 

interview, the perceived interview atmosphere as well as the perceived 

sensitivity and other aspects of selected survey items. 

At the ASA Methodology Conference we would like to present two related 

models. The first model is a multilevel model of participation in the initial 

recruitment interview itself. It combines a series of related mixed-effects 

logistic regression equations in an attempt to exploit, in view of a 

pronounced missing data pattern, as much predictor information as possible. 

The equations use parts of the aforementioned sets of paradata and survey data, 

respectively, while controlling for interviewer effects. 

One outcome of this modelling approach is the estimated probability of 

providing a full recruitment interview. Along with survey data from the 

full-interview variable list and the aforementioned metadata variables, this estimated 

response propensity is used in a second step as a predictor variable in a latent 

variable model of access-panel membership.

Presenter 

Schnell, Rainer; Institut für Soziologie, University of Duisburg-Essen 

Authors 

Rainer Schnell; Tobias Bachteler and J. Reiher; University of Duisburg-Essen 

Title 

Bloom filter based cryptographic personal identification keys for longitudinal 

research. 

Abstract 

Longitudinal micro data are a rich source of information on important 

research topics all through the social sciences. To obtain longitudinal data 

individuals must however be tracked over time. For example, in epidemiological 

research, a national cohort may be tracked life-long in databases of health care 

providers. In criminological research, the identity of offenders has to be known

for computing individual risk of recidivism. 

If no unique national personal identification numbers are available, the 

linkage of personal data of the same individual across time is usually based on

pseudonyms. Since this raises privacy concerns, methods of privacy-preserving

identity management in longitudinal research are needed.

So far, quite simple algorithms for the generation of pseudonyms based 

on personal characteristics (names, date and place of birth) are in common use.

However, these algorithms will yield non-matching pseudonyms when errors or

changes in the underlying information occur. 

In Schnell et al. (2009) we suggested using Bloom filters for calculating

string similarities in a privacy-preserving manner. Here, we claim that this 

principle can also be used for a cryptographic long-term stable key (CLK) that 

provides both privacy and fault-tolerance. Using simulated data we evaluate its

practicability and compare it to previously proposed alternative methods.
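To make the idea concrete, the following is a minimal sketch of how a Bloom-filter-based key could be built and compared. It is not the authors' implementation. The filter length, the number of hash functions, the double-hashing scheme, the field prefix, and the names `bigrams`, `clk` and `dice` are all assumptions chosen for illustration. Each identifier is split into padded bigrams, every bigram sets several bit positions in one shared filter, and two keys are compared with the Dice coefficient.

```python
import hashlib
import hmac


def bigrams(value):
    """Return the set of character bigrams of a padded, lower-cased string."""
    padded = f" {value.strip().lower()} "
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def clk(fields, secret, length=1000, k=20):
    """Hash the bigrams of all fields into ONE Bloom filter.

    The filter is represented as the set of bit positions that are set.
    length, k and the field prefix are illustrative assumptions.
    """
    bits = set()
    for field_index, value in enumerate(fields):
        for gram in bigrams(value):
            msg = f"{field_index}:{gram}".encode()
            h1 = int.from_bytes(hmac.new(secret, msg, hashlib.sha1).digest(), "big")
            h2 = int.from_bytes(hmac.new(secret, msg, hashlib.sha256).digest(), "big")
            for i in range(k):
                # Double hashing: k positions derived from two keyed hashes.
                bits.add((h1 + i * h2) % length)
    return bits


def dice(a, b):
    """Dice coefficient of two sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


secret = b"shared-secret-key"  # hypothetical key, known only to the data owners
original = clk(["smith", "john", "1970-01-01"], secret)
with_typo = clk(["smyth", "john", "1970-01-01"], secret)
other = clk(["miller", "anna", "1985-06-30"], secret)
print(dice(original, with_typo), dice(original, other))
```

Because a single spelling error changes only a few bigrams, the key with the typo stays close to the original, whereas a key for a different person does not.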

References: 

Schnell, R., Bachteler, T. & Reiher, J. (2009): Privacy-preserving record linkage 

using Bloom filters; in: BMC Medical Informatics and Decision Making 9 (41).

Presenter 

Gramlich, Tobias; University of Duisburg-Essen, Germany 

Authors 

Rainer Schnell and Tobias Gramlich; University of Duisburg-Essen 

Title 

Potential Undercoverage and Bias in Name-based Samples of Foreigners 

Abstract 

In many cases there are no sampling frames for rare or special 

populations like foreigners or migrants. Therefore, often sampling frames for a 

more general population are screened for members of the target population. 

Whereas this is an efficient way of sampling rather rare and specific populations, 

knowledge of potential limitations and threats of this widely used approach is 

very limited. 

Screening and classifying members of a population according to some 

criteria may produce false positive matches (e.g. natives wrongly classified as 

foreigners) as well as false negatives (e.g. foreigners wrongly classified as 

domestic). Whereas false positives only increase screening or survey costs, false 

negatives potentially introduce bias if they are systematically different in 

variables relevant to the topic of the survey. 

Name-based sampling has been applied to different nationalities and 

groups of migrants. It has also been shown to be an efficient method for sampling

Turkish migrants (and their descendants), who represent the largest group of

migrants in Germany (Razum et al. 2000, 2001). 

We use a large scale German panel survey ('PASS', on labour market and 

social security) to investigate coverage problems and potential biases when 

using a Bayesian-based classification of names to screen for foreigners in a 

general population sampling frame. We will present results on biased estimates 

of migration and labour force variables, introduced by false negative name 

classifications.

References: 

Razum, Oliver; Zeeb, Hajo; Beck, K.; Becher, Heiko; Ziegler, H. & Stegmaier,

Christa, 2000: Combining a name algorithm with a capture-recapture method to 

retrieve cases of Turkish descent from a German population-based cancer 

registry. European Journal of Cancer 36:2380-2384 

Razum, Oliver; Zeeb, Hajo & Akgün, Seval, 2001: How Useful is a Name-based 

Algorithm in Health Research Among Turkish Migrants in Germany. Tropical 

Medicine and International Health 6(8): 654-661.

Presenter 

Kleinke, Kristian; University of Bielefeld 

Authors 

Kristian Kleinke and Jost Reinecke; University of Bielefeld

Title 

Multiple imputation of incomplete count data. 

Abstract 

Multiple imputation is one of the state-of-the-art procedures for analyzing

incomplete data. Multiple imputation technology is nowadays implemented in

nearly every statistical package. Unfortunately, currently available software is

still quite limited in two regards: it lacks efficient and robust procedures for (a) "special"

data types like count data and (b) complex data structures like clustered or

panel data.

We present imputation routines for ordinary, overdispersed, zero-inflated 

and multilevel count data, discuss their respective advantages and 

disadvantages and present fruitful avenues for future software development.




SESSION: PPSM Session B: Measurement Techniques, Online Research 

and the Role of the Interviewer 

SUMMARY: The German Priority Programme on Survey Methodology (PPSM) 

started in January 2008. It consists of a total of 16 projects and is still running

(see www.survey-methodology.de). The present session aims at reporting on 

work from PPSM projects that address research questions related to 

measurement techniques for asking sensitive questions (Testing a new 

alternative to the Randomized Response Technique) and the factorial survey 

design and interviewer effects (Just gross earnings: Why respondents prefer 

lower inequalities in earnings while an interviewer is sitting next to them). The 

role of the interviewer is also addressed, both in relation to interviewer effects in the

recruitment of a probability-based access panel and in relation to an

indicator-based method for ex-post identification of falsifications in survey data. Finally,

the session provides an international comparison of the availability of 

technologies for online surveys. 

PRESENTERS: 

Marc Höglinger (or Ben Jann) marc.hoeglinger@soz.gess.ethz.ch 

ETH Zurich and University of Bern 

Stefan Liebig stefan.liebig@uni-bielefeld.de 

University of Bielefeld 

Lars Kaczmirek lars.kaczmirek@gesis.org

GESIS - Leibniz-Institut für Sozialwissenschaften, Mannheim 

Nina Storfinger nina.storfinger@zeu.uni.giessen.de 

University of Giessen

Uwe Engel (chair of the session) uengel@empas.uni-bremen.de 

Dept. of Social Sciences, University of Bremen

Presenter 

Höglinger, Marc (or Jann, Ben); ETH Zurich and University of Bern 

Authors 

Andreas Diekmann and Marc Höglinger; ETH Zurich

Ben Jann; University of Bern 

Title 

Asking sensitive questions: Testing a new alternative to the Randomized 

Response Technique 

Abstract 

Eliciting truthful answers to sensitive questions is an age-old challenge in 

survey research. Respondents tend to underreport socially undesired or illegal 

behavior and to overreport desired behavior. The often used Randomized 

Response Technique is intended to overcome this problem by adding some 

randomness to the answering process which provides full protection to 

respondents. In practice however, the Randomized Response Technique shows 

some serious drawbacks. Respondents’ compliance with the procedure is crucial 

to get unbiased results. If some respondents don’t understand the underlying 

principle and do not completely trust the technique they tend to give self- 

protective answers that can substantially distort results. 

A new alternative technique, the Crosswise Model, promises to be much 

more robust to noncompliance because there exists no obvious self-protective 

answering strategy. In addition, respondents never have to answer the sensitive 

question directly. The Crosswise Model was proposed by Yu, Tian, and Tang 

(2008, Metrika 67: 251–263) and was first implemented in a survey by

Jann, Jerke, and Krumpal (forthcoming, Public Opinion Quarterly). 

We tested the Crosswise Model in an experimental online survey on

plagiarism and cheating in exams with university students as subjects. The 

performance of the Crosswise Model is compared to that of the Randomized 

Response Technique and of direct questioning along several dimensions such as

the resulting prevalence estimates and respondents’ perception of the usability 

and privacy protection.
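As an illustration of how a prevalence estimate can be obtained under the Crosswise Model, the sketch below assumes the usual design: respondents see the sensitive item together with a non-sensitive item whose "yes" probability p is known (p ≠ 0.5) and only report whether their two answers are the same or different. The probability of a "same" response is then λ = πp + (1 − π)(1 − p), which can be solved for the prevalence π. The function name and the example numbers are hypothetical and are not results from the study.

```python
import math


def crosswise_estimate(n_same, n, p):
    """Estimate the prevalence of the sensitive trait from crosswise data.

    n_same: number of respondents reporting "both answers are the same"
    n:      total number of respondents
    p:      known probability of "yes" on the non-sensitive item (p != 0.5)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if p == 0.5:
        raise ValueError("p = 0.5 makes the prevalence unidentifiable")
    lam = n_same / n
    # Solve lam = pi * p + (1 - pi) * (1 - p) for pi.
    pi_hat = (lam + p - 1) / (2 * p - 1)
    # Standard error follows from the binomial variance of lam.
    se = math.sqrt(lam * (1 - lam) / (n * (2 * p - 1) ** 2))
    return pi_hat, se


# Hypothetical numbers: 650 of 1000 respondents report "same", p = 0.25.
pi_hat, se = crosswise_estimate(650, 1000, 0.25)
print(round(pi_hat, 3), round(se, 3))
```

The closer p is to 0.5, the better respondents are protected, but the larger the standard error becomes, since it is inflated by the factor 1/|2p − 1|.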

Presenter 

Liebig, Stefan; University of Bielefeld 

Authors 

Stefan Liebig and Carsten Sauer; University of Bielefeld 

Katrin Auspurg and Thomas Hinz; University of Konstanz

Title 

Just gross earnings: Why respondents prefer lower inequalities in earnings while 

an interviewer is sitting next to them.

Abstract 

The factorial survey design has become a popular method in survey 

research. It integrates experimental set-ups into a survey: Respondents react to 

hypothetical descriptions (vignettes) while the values of each attribute 

(dimension) systematically vary in order to estimate the impact of each

dimension on respondents' judgments. So far there is little empirical

knowledge about whether and to what extent this approach causes methodological artefacts

especially in attitude research. Using the example of justice evaluations of gross 

earnings we address two methodological problems in this paper. First, as 

respondents have to evaluate a number of complex descriptions (vignettes of 

fictitious earners) the complexity may result in quite arbitrary reactions, varying 

from time to time and causing a very low reliability of the instrument. Second, 

as the factorial survey was designed for an indirect measurement of attitudes 

one of its advantages is seemingly a low sensitivity for social desirability 

response sets. Therefore we present two studies focusing (1) on the reliability of 

attitude measures using a test-retest design (three wave panel study, 2008) and 

(2) on the sensitivity for interviewer effects using a mixed mode design (German 

population survey, 2009). The results based on the student panel study show a 

fairly high reliability of the attitude measurement. In the population survey we 

find strong interviewer effects, meaning that the perceived just magnitude of 

income inequality is more egalitarian in the presence of an interviewer than in 

the absence of an interviewer. We discuss the latter from a methodological but 

also from a substantive point of view as it is in line with the experimental findings

from behavioral economics and an evolutionary theory of justice attitudes.

Presenter 

Kaczmirek, Lars; GESIS - Leibniz-Institut für Sozialwissenschaften, Mannheim 

Authors 

Lars Kaczmirek, Dorothé Behr and Wolfgang Bandilla;

GESIS - Leibniz-Institut für Sozialwissenschaften, Mannheim 

Title 

Availability of technologies for online surveys – an international comparison 

Abstract 

Design decisions for Web surveys are restricted by the assumptions about 

the technologies respondents have available. Measurement problems might 

occur when fully labelled scales are displayed on small computer screens or 

when respondents participate via cell phones and other mobile devices such as 

netbooks, iPhone, iPad, or BlackBerry. In these cases, the required equidistance

of scale points could be violated. Other technologies whose availability is

relevant in this context are Flash technology and the respondents’ connection

speed, which are key indicators for successful video presentations, and JavaScript,

which is widely used in automatic data validation procedures. JavaScript is also

necessary for all interactive question types such as automatic tally questions or 

visual analog scales. In the process of designing a survey, the availability of 

these technologies is then highly relevant for the technical pretest. As

pretesting is restricted to the most common combinations of technology, such as 

specific browsers, mobile devices, and connection speed, it is important to know 

which combinations really are the most common in the target group. 

This study provides exactly this data on available technologies for 

countries with different Internet penetration rates, namely Canada, Denmark, 

Germany, Hungary, Spain, and the United States (N=480 per country, with quotas

on age, gender and education). Data were collected automatically, similarly to

the collection of paradata, in January 2011 while respondents participated in an

Internet survey. The participants were sampled from online access panels. The

results provide information about the availability of technology in different

demographic groups: How do respondents access online surveys (connection

speed, browser, mobile devices)? What technology can survey researchers safely 

design for (screen size and used window size, Flash, JavaScript)? The study 

shows that most surveys can use a wide range of design choices, but also that 

specific groups of respondents need a conservative approach.

Presenter 

Storfinger, Nina; University of Giessen

Authors 

Nina Storfinger; University of Giessen 

Natalja Menold; GESIS - Leibniz-Institut für Sozialwissenschaften, Mannheim 

Peter Winker; University of Giessen 

Title 

Indicator based method for ex-post identification of falsifications in survey data. 

Abstract 

Data quality in face-to-face interviews might be affected by interviewers' 

irregular behaviour like intentional deviation from the prescribed interviewing 

procedures, called cheating or interviewer falsification. As a part of a DFG 

research project we develop a multivariate statistical method - based on the

motivation of such cheating behaviour - for ex-post identification of

falsifications in survey data. 

As a first step in the project we conducted two explorative studies to identify the 

attributes of questionnaires, which would be useful to identify falsified data. 

During this step, existing real survey data is compared with “falsified” data 

which is fabricated by people participating in the explorative study. First results 

indicate clear differences between falsified and real data. Falsifiers choose the

option “Others” more often (in all semi-open

questions which offer this option), give fewer extreme answers in scale

questions, and overestimate the political knowledge of real respondents.

Furthermore, the falsifiers tend to round their answers to open-ended questions

which require a metric answer like income or the frequency of a specific

behaviour. They also show higher internal consistencies in item sets, as

measured by reliability coefficients.

Based on these results we compute for every interviewer some specific 

“indicators of cheating” which are included in the multivariate analysis. For 

example we calculate the share of extreme answers in all scale questions or the 

share of rounded answers in all open-ended questions, and incorporate them in

a cluster analysis. Using this multivariate method we try to split the interviewers

into two groups, honest and possibly cheating ones. The performance of this

method is then assessed by the fraction of correctly assigned

interviewers. Because we know the cheating interviewers beforehand, we are able

to validate the clustering process immediately. We conduct several separate

cluster analyses that differ in the number of included indicators to assess the

performance of every single indicator.
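The two indicators named above, the share of extreme answers and the share of rounded answers, can be sketched per interviewer as follows. This is only an illustration under stated assumptions: a 1 to 5 answer scale, "rounded" meaning a multiple of 100, and a hypothetical function name and data layout. The abstract does not specify these details, and the cluster analysis itself is not shown.

```python
def cheating_indicators(scale_answers, open_answers,
                        scale_min=1, scale_max=5, rounding_base=100):
    """Compute two per-interviewer indicators from that interviewer's interviews.

    scale_answers: answers to scale questions (assumed range scale_min..scale_max)
    open_answers:  numeric answers to open-ended questions (e.g. income)
    """
    n_extreme = sum(a in (scale_min, scale_max) for a in scale_answers)
    n_rounded = sum(a % rounding_base == 0 for a in open_answers)
    return {
        "extreme_share": n_extreme / len(scale_answers),
        "rounded_share": n_rounded / len(open_answers),
    }


# Hypothetical answers collected by one interviewer.
indicators = cheating_indicators(
    scale_answers=[1, 3, 3, 5],
    open_answers=[1500, 1730, 2000, 2100],
)
print(indicators)  # {'extreme_share': 0.5, 'rounded_share': 0.75}
```

One such indicator vector per interviewer would then form the input to the cluster analysis.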

Results of the analysis show that a high share of all falsifiers is actually 

pooled together in one cluster albeit some of the honest interviewers are also 

added to this group. Concerning the performance of every single indicator the 

“extreme-answers ratio” shows the highest share of correctly assigned 

interviewers; more than half of the honest and half of the cheating interviewers

could be identified. Thus, we might argue that the “indices of cheating”

employed help to identify cheaters. The sensitivity of the clustering method is then

analysed by means of bootstrapping. In a synthetic setting, we modify the 

number of interviewers and the number of interviews to obtain results with 

different sample sizes. As expected, a higher number of interviews (by each

interviewer) induces a more correct clustering, meaning that the identification of 

the cheating interviewers improves markedly.

Presenter 

Engel, Uwe; Dept. of Social Sciences, University of Bremen 

Authors: 

Uwe Engel; Simone Bartsch; Helen Vehre; University of Bremen 

Title 

Interviewer effects in the recruitment of a probability based access panel 

Abstract 

As part of the German Priority Programme on Survey Methodology 

(www.survey-methodology.de), large random telephone samples for the adult 

population of Germany were drawn to build up an access panel for the three 

survey modes fixed-line, mobile-phone, and online-interviewing. 14,200 realized 

interviews yielded a net panel size of 6,600 people. 

An accompanying interviewer survey was carried out to study possible 

interviewer effects. In addition, we conducted a study to evaluate the

interviewers’ voices as well as communicative aspects of the initial contact 

situation. At the ASA Methodology Conference we would like to present first 

findings of this study on interviewer effects. Using the Mplus modelling 

framework we estimated two-level models for categorical indicator variables

and continuous latent factors. These models relate the probability of a full

interview, or the individual response propensity (within part), to several

interviewer attitudes and beliefs at the between level (k=185 interviewers).

Indicators include attitudes and beliefs about the possibility and necessity of

convincing reluctant target persons, the acceptance of refusals, the emphasis on

voluntariness, and the need to tailor the contact situation. All this information

was gathered prior to the fieldwork phase of the study.

To estimate the effects of interviewers’ voice characteristics and perceived 

communicative aspects, we applied a two-step procedure. First, the individual

response propensity was estimated as a function of a large set of paradata and 

survey data while allowing for random intercept variation at the interviewer 

level. Then, again using the Mplus modelling framework, this response

propensity (within part) is modelled at the between level as a function of some

basic voice characteristics (7-point scales, including scales for vocal tone, quiet –

loud tone, speaking fluently – hesitantly, perceiving the voice as

agreeable – disagreeable, personal – impersonal address, and whether the opening of the

conversation with the target person appears as phrased freely – as read from a

paper).




SESSION : Analysis of Rankings and Sequences 

PRESENTERS: 

Tim F. Liao tfliao@illinois.edu 

University of Illinois 

Katharina Lochner katharina.lochner@cut-e.com

cut-e GmbH

Nicola Barban nicola.barban@unibocconi.it 

Dondena Centre, Università Bocconi 

Brian Francis B.Francis@Lancaster.ac.uk 

Lancaster University & Regina Dittrich and Reinhold Hatzinger; Wirtschaftsuniversitaet, Vienna

Presenter 

Liao, Tim F.; University of Illinois 

Authors 

Tim F. Liao, University of Illinois; Anette Fasang, Yale University 

Title 

A Permutation Test for Comparing Groups of Social Science Sequences 

Abstract 

Sequence analysis has seen recent advances as well as wider applications 

in the social sciences. However, no formal way exists in the literature for 

directly comparing groups of sequences to determine whether they are different 

in a statistically meaningful way. To fill this gap, we propose a permutation test 

for comparing groups of social science sequences. We view a typical social 

science sequence such as life-course or employment-history sequences as 

having certain characteristics such as transition to first marriage, first birth, or 

first job, that contribute some unique information. Therefore, in addition to 

proposing a permutation test for comparing overall sequence-group differences 

via sequence-based distance such as the Levenshtein distance, we propose to 

apply the permutation test on statistics that isolate specific aspects of 

sequences. Examples of such statistics include the relative frequency of 

transitions and the timing of certain events. We apply the test to both simulated 

groups of sequences and data from the German Life History Study (GLHS) on 

family formation of East and West German women.
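A permutation test of this kind can be sketched as follows. The abstract does not state the test statistic, so this example assumes one plausible choice: the mean Levenshtein distance between groups minus the mean distance within groups. The function names and the toy sequences (S = single, M = married, C = with child) are hypothetical, and each group needs at least two sequences.

```python
import itertools
import random


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute = 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def separation(dist, labels):
    """Mean between-group distance minus mean within-group distance."""
    between, within = [], []
    for i, j in itertools.combinations(range(len(labels)), 2):
        (within if labels[i] == labels[j] else between).append(dist[i][j])
    return sum(between) / len(between) - sum(within) / len(within)


def permutation_test(group_a, group_b, n_perm=999, seed=0):
    """Return the observed statistic and a one-sided permutation p-value."""
    seqs = list(group_a) + list(group_b)
    dist = [[levenshtein(s, t) for t in seqs] for s in seqs]
    labels = [0] * len(group_a) + [1] * len(group_b)
    observed = separation(dist, labels)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # reassign group membership at random
        if separation(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


group_a = ["SSSSMMMM", "SSSMMMMM", "SSSSSMMM", "SSSSMMMC", "SSSMMMMC", "SSSSMMMM"]
group_b = ["SSCCCCCC", "SCCCCCCC", "SSSCCCCC", "SCCCCCMM", "SSCCCCCM", "SCCCCCCC"]
observed, p_value = permutation_test(group_a, group_b)
print(observed, p_value)
```

The same skeleton works for statistics that isolate one aspect of the sequences, such as the timing of a first transition: only `separation` would need to be replaced.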

Presenter 

Lochner, Katharina; cut-e GmbH 

Authors 

Katharina Lochner; Maike Wehrmaker; Achim Preuss; cut-e GmbH 

Title 

Normative, ipsative, and beyond 

Abstract 

For online personality tests, two formats are established: normative and 

ipsative. Both have advantages and disadvantages. Normative questionnaires 

are pleasant to answer for test takers because they can indicate for each item to 

what extent they agree, but the resulting profiles are not always as 

differentiated as desired by the evaluator. The ipsative format yields profiles 

with a much higher degree of differentiation, but is not as pleasant to answer for 

the test takers because they are forced to make a choice, no matter to what 

extent they agree. A third format that strives to combine the advantages of the 

two formats will be presented: adalloc (adaptive allocation of consent). Adalloc 

presents items in blocks and test takers have to make a choice, like the ipsative 

method. They do so by allocating points to the items. However, they are not 

required to allocate all points, and they may also allocate an equal number of 

points to all items, like in the normative format. The method allows for 

shortening the questionnaire because it weights the responses and thus the 

underlying concepts during the administration. Therefore, not all combinations of 

constructs assessed have to be presented to the test taker. The weights also 

allow for a high amount of differentiation between the constructs assessed. 

Therefore, the test administrator benefits from the format. And so does the test 

taker because the questionnaire is short, and decisions are not forced. It would 

be desirable to discuss after the presentation how IRT models can be applied to 

estimate item qualities when using the adalloc format.


Barban, Nicola; Dondena Centre, Università Bocconi; Centre for Population 

Studies, Ageing and Living Conditions programme, Umea University 

Title 

Sequence analysis and causality. The effect of age at retirement on health using 

Swedish register data 

Abstract 

Life expectancy is increasing steadily in developed countries. Governments 

are seeking to increase the proportion of elderly people in paid employment to 

balance the ratio of employed people to dependent ones. This has led to a

considerable debate about the timing of retirement and its influence on health: 

is early retirement good or bad for your health? Several studies have shown that 

retirement at a younger age has adverse effects on health (e.g., Westerlund et

al. 2010; Hult et al. 2010). However, selection into retirement may obscure

the effect of retirement on health. The individual decision to retire can be 

influenced by previous health trajectory, marital status and widowhood, social 

relations with relatives and work career. Moreover, the transition to retirement 

has become blurred, and the actual range of retirement age has expanded, 

making the transition “longer and fuzzier” (Kohli and Rein 1991; Han and 

Moen 1999). As a result, retirement is becoming more “destandardized” and

“deinstitutionalized” (Guillemard and Rein 1993, Guillemard and van Gunsteren 

1991) with people anticipating retirement entering periods of inactivity or 

reducing their labor supply. Starting from this theoretical framework, we develop 

a new matching approach to investigate the causal effect of age at retirement on 

later health outcomes. Standard matching estimators (Rosenbaum and Rubin

1985) based on the propensity score pair each treated participant with a single

(or multiple) non-treated participant based on a set of observed characteristics. 

However, we claim that selection into treatment can be affected by the 

trajectories of a set of observed characteristics before treatment. For this 

reason, using sequence analysis with Optimal Matching (OM) (Abbott, 1995), we 

develop a matching procedure based on the trajectory before treatment. Our 

method uses an extension of the nearest-neighbor matching estimator based on OM

distances. In this way we match individuals with the most similar trajectory

before retirement. We identify four different sources of selection into retirement:

health trajectory, partnership trajectory, work career and family support history. 

We combine the four trajectories with a standard propensity score and develop a 

complex measure of dissimilarity among individuals. We use Swedish register 

data and we restrict the analysis to the cohort of people born in Sweden during 

1935. Our measure of outcome is the average days of hospitalization from 

retirement to age 71. We conduct separate analyses for different ages at

retirement focusing on retirement between age 60 and 65. Our preliminary 

results confirm that early retirement is associated with poorer health outcomes. 

Once we control for selection issues the negative effect of retirement is 

negligible except for men and women who retire at age 60.
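The matching step can be illustrated with a simple sketch. It assumes that a matrix of dissimilarities between treated and non-treated individuals (for example OM distances between pre-retirement trajectories) has already been computed, and that matching is done with replacement to the single nearest neighbor. These choices, the function name and the toy numbers are assumptions for illustration and do not describe the authors' actual estimator.

```python
def nearest_neighbor_att(dist, y_treated, y_control):
    """Average outcome difference between treated units and their nearest controls.

    dist[i][j] is the dissimilarity between treated unit i and control unit j.
    Matching is with replacement: a control may serve several treated units.
    """
    effects = []
    for i, row in enumerate(dist):
        j = min(range(len(row)), key=row.__getitem__)  # index of the closest control
        effects.append(y_treated[i] - y_control[j])
    return sum(effects) / len(effects)


# Hypothetical toy data: 2 treated and 2 control individuals.
dist = [[0, 5],
        [4, 1]]
att = nearest_neighbor_att(dist, y_treated=[10, 20], y_control=[8, 15])
print(att)  # 3.5
```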


Francis, Brian 

Authors 

Brian Francis; Lancaster University, Regina Dittrich and Reinhold Hatzinger; 

Wirtschaftsuniversitaet, Vienna 

Title 

Modelling ranked survey data - a new approach accounting for covariates and 

latent heterogeneity. 

Abstract 

This talk focuses on the analysis of ranked survey response data and is 

motivated by a Eurobarometer survey on science knowledge. As part of the 

survey, respondents were asked to rank sources of science information in order 

of importance. The official statistical analysis of these data examined only the 

first two rank positions, and the percentage of times a source was mentioned in 

either the first or second position was reported. This failed to use all the 

information available in the dataset. 

Another issue concerns the heterogeneity of ranked responses. We might 

suppose that there is variability in the ranks across individuals which can be 

explained either through known covariates or through a random effects 

formulation which would incorporate the effect of unknown and unmeasured 

covariates. 

In this talk we propose a method which treats ranked data as a set of 

paired comparisons which places the problem in the standard framework of 

generalized linear models. This formulation also allows respondent covariates to 

be incorporated. The model can be interpreted through the worths of each item, 

and the effects of covariates on the worths. 
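The first step, recasting a ranking as a set of paired comparisons, can be sketched as follows. The item names and function names are hypothetical, and the sketch covers only the data transformation, not the generalized linear model fitted to the resulting comparisons.

```python
from collections import Counter
from itertools import combinations


def ranking_to_pairs(ranking):
    """Turn a ranking (most to least important) into (preferred, other) pairs."""
    return list(combinations(ranking, 2))


def win_counts(rankings):
    """Count, over all respondents, how often each item is preferred in a pair."""
    wins = Counter()
    for ranking in rankings:
        for preferred, _other in ranking_to_pairs(ranking):
            wins[preferred] += 1
    return wins


# Hypothetical rankings of three information sources by two respondents.
rankings = [["TV", "Internet", "Press"], ["Internet", "TV", "Press"]]
print(ranking_to_pairs(rankings[0]))
print(win_counts(rankings))
```

A ranking of J items yields J(J − 1)/2 comparisons, so every rank position contributes information rather than only the first two.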

An extension is proposed to allow for heterogeneity in the ranked 

responses. The resulting model uses a nonparametric formulation of the random 

effects structure, fitted using the EM algorithm. Each mass point is multivalued, 

with a parameter for each item and mass point. The resultant model is equivalent

to a covariate latent class model, where the latent class profiles are provided by 

the mass point components and the covariates act on the class profiles. This

provides an alternative interpretation of the fitted model. The approach is also 

suitable for paired comparison data. 

Using age and sex as covariates, we found that a six-class solution gave the best

fit. Both age and sex were important in explaining the ranked responses.

The six classes are interpretable in terms of different response profiles, and may 

be explained through omitted covariates such as degree of urbanisation and 

country.




SESSION: Statistical Social Network Analysis 

SUMMARY 

by Johan Koskinen (chair of the session) johan.koskinen@gmail.com 

Social network analysis (SNA) is concerned with the study of social 

interaction among social actors. Introduced as Sociometry, SNA was formalised 

using graph theory with the obvious analogue in the social world of nodes and 

edges being people connected by social relations. While visual and mathematical 

analysis, as well as rudimentary tests against simple null models, have been in

use since at least the 1930s, it is only in the last thirty or forty years that

progress has been made in statistical modelling of networks and the 

dependencies these induce among observations. We deal here with small 

networks (under a thousand individuals) where we assume binary relational 

measurement for all of the pairs of individuals. 

Presenters: 

Josh Lospinoso lospinos@stats.ox.ac.uk 

Dept. of Statistics, University of Oxford, U.K. 

Marijtje van Duijn m.a.j.van.duijn@rug.nl 

Dept. Sociology, University of Groningen, the Netherlands. 

Mark Huisman and Christian Steglich j.m.e.huisman@rug.nl 

Dept. Sociology, University of Groningen, the Netherlands 

Nial Friel nial.friel@ucd.ie 

School of Mathematical Sciences, University College Dublin, Ireland


Lospinoso, Joshua A.; Dept. of Statistics, University of Oxford 

Title 

Joint inference on informant accuracy and social network dynamics 

Abstract 

To date, models for social network dynamics have been formulated 

(perhaps implicitly) in such a manner that complete trust is bestowed upon the 

observed data; the observed data panels are taken for the true state of 

relationships and data augmentation proceeds by constructing chains that could 

have connected them. This trend of placing complete trust in the data follows 

the general social networks literature. However, the extensive literature on 

informant accuracy suggests that placing such trust in the observed data as a 

representation of the true state of relations may be a tenuous proposition. 

In this talk, a flexible class of models is considered which relaxes the strict 

trust in observed network data.

This intermediate model for informant accuracy can represent basic noise with 

some false positive rate and some false negative rate and be elaborated by 

random and fixed effects models. Bayesian ideas can be leveraged, as the notion 

of genuine prior belief about informant accuracy may be particularly appropriate. 

Further, the exponential random graph (ERG) family of models can be naturally 

employed for this purpose. In this sense, the model is a generalization of the 

Stochastic Actor Oriented Models of Snijders (2001) which permits joint 

inference on the social network dynamics and on the informant accuracy.


Duijn van, Marijtje; Dept. Sociology, Groningen University 

Title 

Social Network Analysis of Gossip Triads 

Abstract 

A model for binary three-way social network data is presented relating the 

probability of a tie to individual properties of the actors, network relations that 

may exist between any pair of them, possibly available three-way characteristics 

of them as a triplet, and a number of random components, taking care of the 

dependence between the triads. 

The model was motivated by a study investigating how instrumental and 

expressive ties influence gossip in employee triads. Two models were estimated 

for positive and negative gossip, whose results indicate different mechanisms of 

cowork (instrumental ties) and friendship (expressive ties) for the two types of 

gossip. 

The model is estimated using WinBUGS.


Huisman, Mark; Dept. Sociology, Groningen University 

Title 

Statistical models for ties and actors 

Abstract 

An overview of statistical models that can deal with the combination of 

cross-sectional network data and individual actor and/or dyadic attributes is

presented. These models can be categorized by the type of research question 

they can answer, either focusing on the relationships (ties) or on explaining 

differences at the individual level (actors). The accompanying models are on the 

one hand (logistic) regression-type models with a complex dependence 

structure, predicting the occurrence or strength of ties, and on the other hand stochastic block 

models classifying or grouping the actors in the network. The models are 

presented and compared using an example data set. Some attention will be 

given to available software for the estimation of the models.


Friel, Nial; School of Mathematical Sciences, University College Dublin 

Title 

Bayesian inference for the exponential random graph model 

Abstract 

This talk will present a new approach to carry out inference for the 

exponential random graph model. The exponential random graph is very widely 

used in the analysis of social networks, yet from a statistical viewpoint it 

presents many difficulties, most notably because the likelihood cannot be 

evaluated for reasonably sized networks. The approach which we describe here 

sidesteps these difficulties to a certain extent. The algorithm which we use to 

perform Bayesian inference is based on a Markov chain Monte Carlo algorithm 

and is therefore simulation-based -- it relies on the ability to draw 

networks exactly from the likelihood model. We outline how this algorithm can 

be extended to also yield estimates of the marginal likelihood or model evidence, 

thereby allowing one to make probability statements about the uncertainty of 

the statistical model itself. We will present several examples of how this 

methodology performs, and will also outline how it may be applied using the 

statistical software R.




SESSION: Bayesian Methods 

SUMMARY 

Often researchers have an expectation about the ordering of the model 

parameters. This can be directly evaluated using Bayesian statistics. In this, one 

uses prior knowledge with respect to the model parameters. There are several 

ways of specifying the prior. 

The first speaker will go into detail in specifying reasonable and relevant priors. 

Then, Bayesian hypothesis testing is addressed. Subsequently, Bayesian factor 

analysis is discussed. Finally, a demonstration of combining evidence regarding 

multiple studies is given. 

Presenters: 

Rebecca Kuiper (chair of the session) r.m.kuiper@uu.nl 

Dept. Methodology & Statistics, Utrecht University 

Ruud Wetzels R.M.Wetzels@uva.nl 

University of Amsterdam, the Netherlands 

Carel Peeters C.F.W.Peeters@uu.nl 


Floryt van Wesel F.vanWesel@uu.nl 

Dept. Methodology & Statistics, Utrecht University

Presenter 

Kuiper, Rebecca M.; Dept. Methodology & Statistics, Utrecht University 

Authors 

R. M. Kuiper and H. Hoijtink 

Title 

Combining Statistical Evidence from Several Studies 

Abstract 

The effect of an independent variable on a dependent variable is often 

evaluated with hypothesis testing. Sometimes, multiple studies are available 

that test the same hypothesis. 

In such studies the dependent variable and the main predictors might differ, 

while they do measure the same theoretical concepts. 

In this presentation, I demonstrate a Bayesian method that can be used 

to quantify the joint evidence in multiple studies regarding the effect of one 

variable of interest. The method proposed here quantifies evidence for the 

hypothesis at hand using Bayesian model selection. By way of example, the 

method is applied to four studies from economic sociology and social dilemma 

research on how trust in social and economic exchange depends on learning 

effects through dyadic embeddedness, i.e., experience from previous exchange 

with the same partner. Subsequently, simulation shows that our method has 

good frequency properties.
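
One simple way to see how evidence can accumulate over studies is to update the odds of a hypothesis study by study. The sketch below assumes the studies are independent and that each yields a Bayes factor for the hypothesis of interest against a competitor; the function name and the numbers in the usage note are made up for illustration and are not the authors' implementation or results.

```python
def combine_bayes_factors(bayes_factors, prior_odds=1.0):
    """Combine per-study Bayes factors into posterior odds and probability.

    bayes_factors -- Bayes factors (hypothesis vs. competitor), one per study,
                     assumed to come from independent studies
    prior_odds    -- assumed prior odds of the hypothesis before any study
    """
    odds = prior_odds
    for bf in bayes_factors:
        if bf <= 0:
            raise ValueError("Bayes factors must be positive")
        odds *= bf                      # sequential Bayesian updating
    probability = odds / (1.0 + odds)   # posterior probability of the hypothesis
    return odds, probability
```

For example, hypothetical Bayes factors of 2, 3, 0.5 and 4 from four studies give posterior odds of 12 under equal prior odds, even though one study on its own favours the competitor.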


Wetzels, Ruud; University of Amsterdam 

Title 

A Default Bayesian Hypothesis Test for ANOVA Designs 

Abstract 

We outline a Bayesian hypothesis test for ANOVA designs. The test is an 

application of a default Bayesian method for variable selection in regression 

models. One advantage of this test is that the user does not need to specify 

priors through subjective elicitation. We believe that this Bayesian test for 

ANOVA designs is useful for empirical researchers and for students; both groups 

will get a more acute appreciation of Bayesian inference when they can apply it 

to practical statistical problems such as ANOVA.

Presenter 

Peeters, Carel; Dept. Methodology & Statistics, Utrecht University 

Authors 

Carel F.W. Peeters and Herbert Hoijtink 


Title 

Inequality Constrained Confirmatory Factor Analysis: Bayesian Specification and 

Model Selection 

Abstract 

An important topic in factor analysis (FA) is the restriction of parameters. 

A Bayesian framework is proposed which takes restriction of parameters in the 

context of confirmatory FA beyond the purpose of identification and prevention 

of impermissible estimates by allowing inequality and approximate equality 

constraints to express substantive theoretical ideas regarding direction and 

magnitude of effect of factor loadings. We are subsequently interested in the 

demarcation of competing inequality constrained formulations of factor analytic 

correlation structure.

Presenter 

Wesel van, Floryt; Dept. Methodology & Statistics, Utrecht University 

Authors 

Floryt van Wesel and Hennie Boeije; Dept. of Methodology and Statistics, 

Utrecht University 

Title 

Priors & Prejudice: Using existing knowledge in social science research 

Abstract 

Using Bayesian statistics implies the use of prior distributions. These prior 

distributions can contain information about the topic at hand that is already 

known from previous research. In this presentation we discuss how to acquire 

such existing information on the one hand and how to translate this information 

into a statistical prior distribution on the other hand. For the information part of 

the prior distribution we use three sources to formulate a single integrated 

theoretical model. The sources are: meta-analysis, qualitative synthesis and 

expert elicitation. The results that emerge from each individual source will be 

used to formulate an inequality constrained hypothesis. The three hypotheses 

are then integrated leading to an overall hypothesis representing the integrated 

model. This overall hypothesis is the existing information that determines the 

first part of the prior distribution. As the hypothesis does not contain any 

numerical information to base a statistical prior distribution on, we update a 

non-informative prior with a small part of the data, called a training sample. The 

final prior distribution consists of a combination of the information in the 

inequality constrained hypothesis and the training sample. The case we use to 

exemplify this procedure is that of factors influencing the development of post- 

traumatic stress disorder (PTSD) in children who have gone through trauma.




SESSION : Reliability, Stability and Discrimination 

PRESENTERS: 

Peter M. Kruyen p.m.kruyen@uvt.nl 

Tilburg School of Social and Behavioral Sciences 

Jarl Kampen jarl.kampen@wur.nl 

Wageningen University and Research Centre 

Marcel Noack marcel.noack@uni-due.de 

University of Duisburg-Essen


Kampen, Jarl; Wageningen University and Research Centre 

Title 

Ferguson's and Hankin's delta revisited: Towards a renewed interest in the 

discriminating power of tests 

Abstract 

Discriminating power is a characteristic of health indices, tests and 

questionnaires that is crucial for use of test scores in practice. Recently, renewed 

attention has been paid to Ferguson's Coefficient of Test Discrimination (delta) 

for test scores based on dichotomous items, and an extension thereof that 

quantifies discrimination for test scores based on polytomous ordinal items. In 

this article, four potential problems relating to Ferguson's delta and Hankins' 

recent generalization are discussed. Alternative methods of analysis that test for 

certain aspects of discriminative power are proposed. 

The properties of Ferguson's delta and its generalization are illustrated by 

mathematical argument, numerical examples, and the analysis of a real data set 

consisting of ordinal scaled items (WHOQOL-BREF Domain 2). 

It is shown that 1) in practical applications the maximal value of Ferguson's 

delta cannot be attained, which obfuscates interpretation, 2) its statistical 

significance cannot be computed reliably, 3) it is insensitive to the fineness of 

test scores and 4) it is insensitive to variation in discriminating power over the 

range of possible test scores. 

It is concluded that the renewed attention for discriminative power can 

help improve measurement in health. However, Ferguson's delta is not the most 

effective coefficient for this purpose. The proposed alternative methods are 

promising but require further assessment.
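
For readers unfamiliar with the coefficient, the sketch below computes delta in its commonly cited form: with n respondents, k items, m response categories per item and f_i the frequency of total score i, delta = (k(m-1) + 1)(n^2 - sum f_i^2) / (k(m-1) n^2). For m = 2 this is Ferguson's original coefficient; for m > 2 it is the generalization attributed to Hankins. The function and argument names are illustrative assumptions, not part of the paper.

```python
from collections import Counter


def discrimination_delta(total_scores, n_items, n_categories=2):
    """Ferguson's delta (n_categories=2) or its polytomous generalization.

    total_scores -- one total test score per respondent
    n_items      -- number of items k
    n_categories -- number of response categories m per item
    """
    n = len(total_scores)
    if n == 0:
        raise ValueError("at least one respondent is required")
    steps = n_items * (n_categories - 1)   # number of possible scores minus one
    freq = Counter(total_scores)           # f_i for every observed score
    sum_sq = sum(f * f for f in freq.values())
    return (steps + 1) * (n * n - sum_sq) / (steps * n * n)
```

The coefficient equals 1 when all possible scores occur equally often and 0 when every respondent obtains the same score.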


Kruyen, Peter M.; Tilburg School of Social and Behavioral Sciences 

Title 

Test length and decision making: When is short too short? 

Abstract 

To efficiently assess multiple psychological attributes and to minimize the 

burden on patients, psychologists increasingly use shortened versions of existing 

tests. Meanwhile, the importance of psychological testing has increased. For 

example, patients are routinely measured to monitor their progress in the course 

of a therapy and to evaluate treatment programs. These measurements are not 

only used to evaluate changes in the individual patient, but they are also used 

by insurance companies to make financial decisions on whether or not to 

reimburse certain treatment programs. However, the shortened tests are less 

reliable compared to long tests and may therefore substantially impair reliable 

decision-making. 

In this study, we reviewed recent trends in the use of short tests and 

examined the impact of test length reduction on individual decision-making. 

First, we present the results of a literature review on the use and validation of 

abbreviated tests in psychology. Second, we present the results of simulation 

studies comparing the risks of making incorrect decisions for the long and 

abbreviated tests. These simulations showed that the number of items needed to 

take decisions about patients depends on various factors including the 

application envisaged. For some applications five to ten items are sufficient, 

whereas in other applications one needs at least twenty items.
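
The loss of reliability that comes with shortening a test can be approximated with the Spearman-Brown formula. The abstract does not state which formula or simulation design the authors used, so the sketch below is only an illustration under the classical assumption of parallel items; the function name and the example values are hypothetical.

```python
def spearman_brown(reliability, old_length, new_length):
    """Predicted reliability after changing test length (parallel items assumed)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if old_length <= 0 or new_length <= 0:
        raise ValueError("test lengths must be positive")
    factor = new_length / old_length   # < 1 when the test is shortened
    return factor * reliability / (1.0 + (factor - 1.0) * reliability)


if __name__ == "__main__":
    # Hypothetical 20-item test with reliability 0.90, shortened step by step.
    for items in (20, 10, 5):
        print(items, round(spearman_brown(0.90, 20, items), 3))
```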


Noack, Marcel; University of Duisburg-Essen 

Title 

Reliability and stability of the "Alone in the dark" indicator 

Abstract 

The measurement of fear of crime with the classical fear of crime indicator 

has a long tradition. Despite the rich discussion on its doubtful usefulness, no 

estimates for the reliability of this indicator as a necessary condition for validity 

are available. Using panel data from the British Household Panel Survey and the 

German DEFECT project, the reliability and stability of the classical fear of crime 

indicator are estimated using quasi-Markov simplex models for the first time. A 

crucial assumption of these models is that only the measurement in the 

preceding wave has an influence on the answer of a respondent. This 

assumption appears to be violated in the given data. One plausible explanation 

could be the respondents' reactions to the terrorist attacks of 9/11, which took 

place between the waves.
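
The simplest member of this model family is the three-wave estimator usually attributed to Heise (1969), which separates reliability from stability using only the three between-wave correlations. The sketch below shows that special case; it assumes equal reliability in all waves and the lag-1 dependence described above, and it is not necessarily the specification estimated in the paper. The function name and the test values are illustrative.

```python
def heise_three_wave(r12, r13, r23):
    """Reliability and stability from three between-wave correlations.

    Assumes a single indicator, equal reliability in every wave, and true
    scores that depend only on the immediately preceding wave.
    """
    if min(r12, r13, r23) <= 0:
        raise ValueError("all three correlations must be positive")
    reliability = r12 * r23 / r13   # common reliability of the indicator
    stability_12 = r13 / r23        # true-score correlation, wave 1 to wave 2
    stability_23 = r13 / r12        # true-score correlation, wave 2 to wave 3
    return reliability, stability_12, stability_23
```

If the lag-1 assumption fails, as the abstract reports for the data at hand, these estimates are no longer trustworthy.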




SESSION : Qualitative and Mixed Methods 

PRESENTERS: 

Daniela P. Blettner d.p.blettner@uvt.nl 

Tilburg School of Economics and Management 

Anna Kuchenkova a.kuchenkova@rggu.ru 

Russian State University for the Humanities 

Meike Morren m.morren@uvt.nl 

Tilburg School of Social and Behavioral Sciences

Presenter 

Blettner, Daniela P.; 

Dept. of Organisation and Strategy, 

Tilburg School of Economics and Management 

Authors 

Daniela P. Blettner 

Philipp Tuertscher; Vienna University (Austria) 

Title 

Comparative assessment of three content analysis methods for research on 

organizational attention 

Abstract 

Content analysis has gained great interest in strategic management 

research and organization studies for revealing organizational attention. 

Although these methods are now moving more into the mainstream of strategic 

management, researchers do not have a clear understanding of the various 

methods and their respective strengths and weaknesses. In this paper, we 

compare three major approaches to content analysis: causal mapping, 

frequency-based analyses, and psychological linguistic analyses. We assess the 

insights that can be gained from these three methods in a study based on 

longitudinal data from the US airline industry.


Kuchenkova, Anna; Russian State University for the Humanities 

Title 

Analysis of causes by means of two logical-combinatorial methods: QCA and 

JSM-method 

Abstract 

The aim of the paper is to examine two non-statistical methods of causal 

relation analysis. The first, QCA, is an approach (introduced by C.C. Ragin in 

the late 1980s) comprising several techniques of formalized comparative analysis 

for small- and intermediate-N research designs. The second, the JSM-method, is a 

method of automatic hypothesis generation in intelligent data analysis that is 

used for the analysis of respondents’ opinions (introduced by V.K. Finn in the early 

1980s). 

Both methods have a lot in common, though they were devised 

independently at the same time. First, they are non-probabilistic methods 

based on mathematical logic (Boolean algebra, fuzzy sets, predicate logic). 

Second, they share the same epistemological foundation: the ideas of J.S. Mill (his 

“method of agreement”, “method of difference”, and “joint method of agreement and 

difference” are formalized in QCA and the JSM-method). Third, they imply the 

same interpretation of causality as a combination of necessary and sufficient 

conditions that lead to a certain outcome. Fourth, these methods are labeled 

formalized qualitative methods, which combine elements of quantitative and 

qualitative research. On the one hand, they involve the analysis of rigidly 

structured data, with objects described through a set of variables. On the other 

hand, they implement an inductive strategy of data analysis: individual 

cases are examined in order to find similarities, differences, and empirical 

regularities. It is thus a process of generalization during which hypotheses are not 

tested but generated, which accords with the logic of 

qualitative research. Finally, QCA and the JSM-method are intended for the discovery of 

interconnections between values of different variables. They constitute a special 

group among the methods of causal relationship analysis.

Presenter 

Morren, Meike; Tilburg School of Social and Behavioral Sciences 

Authors 

Meike Morren; John P.T.M. Gelissen; Tilburg School of Social and Behavioral 

Sciences 

Title 

Response Strategies and Response Styles in Cross-Cultural Surveys 

Abstract 

This paper addresses the following research questions: Do respondents 

participating in cross-cultural surveys differ in terms of their response style and 

response strategy when responding to attitude statements, and if so are these 

characteristics affecting the response process associated with a respondent’s 

ethnicity and generation of immigration? To answer these questions we 

conducted a mixed method study. Quantitative analysis of a large representative 

sample of minorities in the Netherlands shows that cross-cultural differences in 

responding can partly be explained by a differential response style. These 

differences in response style turn out to be related to the generation of 

immigration, both in the representative sample and in a purposively selected 

qualitative sample of persons belonging to the same four cultural groups. 

Analysis of cognitive interviews performed with the latter shows that 

respondents use three types of response strategies to overcome the difficulties 

of responding to survey items in a cross-cultural survey. The selected response 

strategy turns out to be strongly related to a respondent’s generation of 

immigration.




SESSION: Dealing with Measurement Inequivalence in Cross Cultural 

Research 

SUMMARY 

by Guy Moors (chair of the session) guy.moors@uvt.nl 

Cross-cultural comparative research has become inevitable in a globalizing 

society. Researchers are becoming increasingly aware that solid comparative 

research is not a straightforward matter. Cultural groups may not only differ in 

attitudes and values, but may assign different meanings to the questions asked 

to measure attitudes and values. The latter issue is the topic of the papers that 

are presented in this session, i.e. testing for measurement equivalence in cross- 

cultural research. The papers include methodological advances as well as 

applications. 

Presenters: 

Alain De Beuckelaer a.debeuckelaer@fm.ru.nl 

Radboud University Nijmegen, the Netherlands 

Eldad Davidov / Hermann Dülmer davidov@soziologie.uzh.ch 

University of Zurich / University of Cologne hduelmer@uni-koeln.de 

Miloš Kankaraš m.kankarash@uvt.nl 


Bart Meuleman bart.meuleman@soc.kuleuven.be 

Catholic University of Leuven, Belgium

Presenter 

Beuckelaer De, Alain; Radboud University Nijmegen 

Authors 

Nele Libbrecht; Ghent University 

Alain De Beuckelaer; Ghent University, Belgium; Renmin University China, P.R. 

China; and Radboud University Nijmegen 

Filip Lievens; Ghent University 

Thomas Rockstuhl; Nanyang Business School 

Title 

Measurement Invariance of the Wong and Law Emotional Intelligence Scale 

Scores: Does the Measurement Structure Hold Across Far Eastern and European 

Countries? 

Abstract 

In recent years, emotional intelligence and emotional intelligence 

measures have been widely examined in a plethora of countries and cultures. 

This is also the case for the Wong and Law Emotional Intelligence Scale (WLEIS), 

prompting the importance of examining whether the WLEIS is invariant across 

regions other than the Far Eastern region (China) where it was originally 

developed. This study investigated the measurement invariance of the WLEIS 

scores across two countries, namely Singapore (N = 505) and Belgium (N = 

339). Results showed that the measurement structure underlying the WLEIS 

ratings was invariant across these different countries as there was no departure 

from measurement invariance in terms of factor form, factor pattern coefficients, 

and factor intercorrelations. The scalar invariance model was partially supported. 

These results show promise for the equivalence of the WLEIS scores across 

different countries. Future research is needed to further test the equivalence 

across other countries and samples.

Presenters 

Davidov, Eldad; University of Zurich 

Dülmer, Hermann; University of Cologne 

Authors 

Eldad Davidov; University of Zurich 

Hermann Dülmer; University of Cologne 

Elmar Schlüter; University of Cologne 

Peter Schmidt; State University Higher School of Economics (HSE) in Moscow 

Title 

Explanation of Cross-Cultural Measurement In-equivalence using a Multilevel 

Structural Equation Modeling Approach 

Abstract 

Testing for equivalence of measurements across groups (such as countries 

or time points) is essential before meaningful comparisons of correlates and 

means may be conducted. However, often equivalence is not present and, as a 

result, comparisons across groups are problematic and biased. Scalar 

equivalence is only seldom supported by the data. In the current study we 

propose utilizing a multilevel structural equation modeling (SEM) approach to 

model and explain scalar in-equivalence. This method does not resolve in- 

equivalence but rather illuminates why it is present. We illustrate the method 

using data on human values (Schwartz 1992) from the second round of the 

European Social Survey. Thus, a new direction for research even when 

equivalence is not present is proposed. 

Key words: Human values; Configural, metric, and scalar equivalence; 

Multilevel confirmatory factor analysis (CFA) / Multilevel structural equation 

modelling (SEM); European Social Survey; Comparisons over time and/or 

countries.

Presenter 

Kankaraš, Miloš; Dept. Methodology and Statistics, Tilburg School of Social and 

Behavioral Sciences 

Authors 

Miloš Kankaraš and Guy Moors; Tilburg University 

Title 

Cross-National and Cross-Ethnic Differences in Attitudes. How do minorities’ 

attitudes align? 

Abstract 

Minorities’ attitudes can be compared to attitudes of fellow citizens within 

the host country as well as to attitudes of the motherland. Given the 

heterogeneity of Luxembourg’s minority groups, this country is a relevant 

example case in which the comparison needs to involve answering a twofold 

question. First we analyze the level of measurement equivalence, i.e. the extent 

to which different groups can be compared. Secondly, we examine whether 

ethnic-cultural groups within Luxembourg resemble citizens from their native 

country more than their country of residence. Using EVS data from 2008 we 

demonstrate different types of outcomes. Results indicated that cultural 

background is more important than national context in the case of culturally 

more distant minorities to Luxembourg’s resident population, and that national 

setting is the prevailing factor when minorities are from neighboring countries. 

The effect of a common national setting is also important with regards to the 

issue of measurement equivalence, where it contributes to greater comparability 

of intra-national, cross-ethnic comparisons.


Meuleman, Bart; Catholic University Leuven 

Title 

When are item intercept differences substantial in measurement equivalence 

testing? An application on ESS data. 

Abstract 

Applied comparative researchers are becoming increasingly aware of the 

issue of measurement equivalence. By now, there exists considerable agreement 

on the concrete operationalization and implications of (the various levels of) 

measurement equivalence. Multiple group confirmatory factor analysis (MGCFA) 

has become widely recognized as a useful statistical tool to test for equivalence. 

In this framework, measurement equivalence is assessed by constraining certain 

parameters – e.g. factor loadings or item intercepts - across groups. 

Despite growing consensus, important issues in equivalence testing by 

means of MGCFA remain unresolved. One of the most compelling problems 

relates to the specific criteria that should be used to decide whether an equality 

constraint is violated or not. Various authors warn against relying on statistical 

criteria alone, because due to the large sample sizes often used, even negligible 

differences between groups can become significant. Saris et al. (2009) suggest 

that one should only pay attention to substantial model misspecifications (i.e. 

with a high expected parameter change). 

Yet, how large should differences between groups be to be judged as 

substantial? This paper proposes a concrete strategy to predict whether 

differences in item intercepts will have perceivable impact on substantial 

conclusions drawn from latent mean comparisons. The proposed strategy is 

applied using European Social Survey (round 4) data on welfare attitudes. 

Key words 

Measurement equivalence, MGCFA, European Social Survey




SESSION : Survey Methodology (Data Collection) 

PRESENTERS: 

Shishi Chen chenshishi@gmail.com 

University of Hong Kong 

Britta Busse busse@ifs.tu-darmstadt.de 

Darmstadt University of Technology 

Jorre van Nieuwenhuyze jorre.vannieuwenhuyze@soc.kuleuven.be 

Catholic University, Leuven 

Mark Trappmann mark.trappmann@iab.de 

Institute for Employment Research; University of Leipzig


Chen Shishi; University of Hong Kong 

Title 

Survey errors and fieldwork recommendation from a call back survey in Mainland 

China 

Abstract 

Fixed lines and mobile phones have been widely used as national 

telephone survey tools and there are many studies of fixed line and mobile 

phone survey methodology and comparisons of telephone surveys with other survey 

modes. 

This paper builds upon a great opportunity for methodological work on 

fixed line and mobile phone surveys in Mainland China, using a follow-up survey 

interviewing the respondents from a prior face-to-face survey. This is innovative. 

Understanding the challenges in fixed line and mobile phone surveys in Mainland 

China is a very topical issue in the field of survey research and the results can be 

used to study survey errors and contribute to that literature as well as to 

improve the quality of survey fieldwork procedures. 

A database with telephone contact information for 4041 individuals was 

obtained from a household survey in Mainland China, for which the Social 

Sciences Research Centre of the University of Hong Kong was commissioned to 

conduct a follow-up telephone survey of the same individuals. The households 

were sampled randomly for the first wave national face-to-face survey and the 

individuals are respondents who left their telephone numbers after the face-to-face 

survey and accepted in principle a call back interview within two weeks. 

This paper details global telephone coverage over the past ten years and 

identifies the trends over time by geographical region and level of development 

in Mainland China, including both fixed lines and mobile phones. 

This paper analyzes the quality of the face-to-face database and the 

outcomes of the call back survey. As the demographics of respondents and non-respondents 

were known from the database, studies of the influence of day, 

time, household demographics and individual demographics on the first and 

second contact attempt outcomes were undertaken using logistic regression. The 

findings include an effective calling design to improve telephone survey field 

work strategy and contribute valuable information for further studies in Mainland 

China.

Presenter 

Busse, Britta; Darmstadt University of Technology 

Authors 

Marek Fuchs; Britta Busse; Darmstadt University of Technology 

Title 

Using an adaptive design in gaining cooperation. Enhancing the recruitment 

success in a mobile phone panel survey 

Abstract 

In recent years declining response and cooperation rates have become a 

serious threat to all kinds of surveys. This implies, first, reduced sample sizes and 

therefore inflated standard errors and decreased accuracy of survey results, 

and, second, a potential increase of non-response biases since the population of 

survey respondents might differ significantly from the non-responding 

population. Both effects have especially severe consequences for panel surveys 

since panel studies require large initial samples due to panel attrition (which 

reduces sample size in addition to initial non-response). Also, non-response 

biases that might be introduced into the panel will be carried on into every 

following panel wave. Thus, when recruiting for a panel survey it is necessary to 

avoid initial refusals and increase cooperation with the help of an effective 

recruitment question wording. A common practice used in the initial phase of 

telephone interviews in order to gain cooperation allays respondents’ concerns 

with an appropriate interviewer statement addressing the respondents’ qualms. 

We propose that – similar to these proactive interviewer persuasion statements 

in the beginning of telephone interviews - a respondent-tailored conviction 

strategy could enhance the success of a panel recruitment question. In practice 

this could be implemented by differential recruitment question versions among 

which the most promising one will be presented to the respondent, chosen based 

on questions already answered by the respondent like survey attitudes items. 

In this paper we will present results from a recruitment survey (n=1,600) 

conducted in Germany for refreshing an ongoing mobile phone panel. We tested 

four different recruitment question versions in a randomized between-subjects

design, each emphasizing a specific notion directly linked to potential causes of 

non-response (=declining further panel participation). In the interview section 

prior to the panel recruitment question we also measured corresponding survey 

attitudes. This design allows us to determine the effectiveness of the various 

recruitment question versions with respect to subgroups of respondents who are 

prone to specific positive or negative survey attitudes. We will discuss results in 

light of an adaptive recruitment strategy that matches the recruitment question 

wording to previously answered survey attitude items. In addition to recruitment 

success (proportion of respondents agreeing to further panel participation) we 

also examine potential non-response biases that might be introduced in the 

panel due to selective panel cooperation.


Nieuwenhuyze, Jorre van; Catholic University, Leuven 

Title 

Evaluating mode effects on moments of continuous variables in mixed mode 

data 

Abstract 

Mixed mode surveys are surveys where data of different respondents is 

gathered by different survey modes. To research the quality gain of mixed mode 

surveys relative to single mode surveys, selection effects between the modes 

should be evaluated. Nevertheless, direct estimation of selection effects is 

difficult because they are completely confounded with measurement effects of 

the modes. This paper first discusses the shortcomings and problems of the 

common methods to disentangle mode effects reported in the existing literature. 

Next, it discusses a recent technique which avoids the former problems and 

which can be used as an alternative method within certain circumstances. This 

paper aims to broaden this technique to mode effects estimations on the 

moments of continuous variables (with special attention to means and 

covariances). Data of a mixed mode experiment parallel to the European Social 

Survey will be used for illustration.


Trappmann, Mark 

Co-Author 

Antje Kirchner; Institute for Employment Research; University of Leipzig 

Title 

Eliciting illicit work. Item Count and Randomized Response Technique put to the 

test 

Abstract 

We address an ongoing debate on how to assess sensitive topics in telephone 

surveys. Examining three existing methods and implementing one new method, 

we developed a module to measure illicit work and tested this in two CATI 

studies (both conducted in 2010). In an experimental setting, we compare a 

double-list implementation of the Item Count Technique (ICT) with direct 

questioning as well as a forced-response implementation of the Randomized 

Response Technique (RRT) with direct questioning. In the first study (ICT; 

n=1,603), respondents were selected from the German general population. In 

the second study (RRT; n=3,211), respondents of two specific populations were 

sampled from a register: employed persons and those qualifying for basic 

income support in Germany, i.e. people depending on state transfer payments. 

The goal of the studies is to evaluate which method elicits more socially 

undesirable answers in the context of illicit work and moonlighting, particularly 

with regard to the specific mode of data collection and different subpopulations. 

Furthermore, we developed a novel method which can be applied to the 

measurement of sensitive metric variables. This method requires no randomizer 

and can be easily administered in CATI surveys. Also, in both studies data on a 

number of background variables were collected that, according to theory, foster 

illicit work. These theories are empirically tested and the results are briefly 

discussed in the paper.
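
The two established techniques compared in this abstract rest on simple moment estimators, sketched below. The design probabilities and data in the usage note are hypothetical and do not reflect the actual CATI design of the studies; the function names are illustrative, and the authors' novel method for metric variables is not covered.

```python
from statistics import mean


def rrt_forced_response_estimate(answers, p_truth, p_forced_yes):
    """Prevalence estimate under a forced-response RRT design.

    answers      -- observed yes/no answers (True = yes)
    p_truth      -- assumed probability that the respondent answers truthfully
    p_forced_yes -- assumed probability that a "yes" answer is forced
    Since P(yes) = p_truth * prevalence + p_forced_yes, solve for prevalence.
    """
    observed_yes = mean(1 if a else 0 for a in answers)
    return (observed_yes - p_forced_yes) / p_truth


def ict_double_list_estimate(g1_list_a_long, g1_list_b_short,
                             g2_list_a_short, g2_list_b_long):
    """Prevalence estimate under a double-list item count design.

    Group 1 gets list A with the sensitive item and list B without it;
    group 2 gets the reverse. Each argument holds the reported item counts.
    """
    diff_a = mean(g1_list_a_long) - mean(g2_list_a_short)
    diff_b = mean(g2_list_b_long) - mean(g1_list_b_short)
    return (diff_a + diff_b) / 2   # average the two difference estimators
```

For instance, with an assumed 75% chance of a truthful answer and a 12.5% chance of a forced "yes", an observed "yes" share of 30% corresponds to an estimated prevalence of about 23%.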




SESSION : Item Response Theory 

PRESENTERS: 

Wilco H.M. Emons w.h.m.emons@uvt.nl 


Hendrik J.H. Straat j.h.straat@uvt.nl 


Marie-Anne Mittelhaëuser m.mittelhaeuser@uvt.nl 

Tilburg School of Social and Behavioral Sciences/CITO 

Rosalie Gorter r.gorter@vumc.nl 

VUmc, Dept. of Epidemiology and Biostatistics


Emons, Wilco H.M.; Tilburg School of Social and Behavioral Sciences 

Title 

On the Usefulness of Latent Variable Hybrid Models to Distinguish Categories 

from Dimensions 

Abstract 

In this presentation, we discuss the usefulness of latent variable hybrid 

models, including latent class item response theory and item response theory 

mixture models, to distinguish qualitative from quantitative individual differences 

on multidimensional psychological attributes. Different latent variable hybrid 

approaches will be discussed and contrasted with traditional approaches 

including taxometrics. Results from empirical data analysis and simulation 

studies will be presented and limitations and implications for future research will 

be discussed. As an example, we use distressed personality, which refers to a 

general propensity to psychological distress defined by the combination of two 

distinct personality attributes, negative affectivity (NA) and social inhibition (SI). 

Currently, persons are categorized as Type D if they score above a certain cutoff 

on both the NA and SI dimensions and as non-Type D otherwise. We used latent 

variable hybrid models to advance the current debate as to whether individual 

differences in distressed personality should be conceived as representing gradual 

differences on its constituent continuous NA and SI dimensions rather than as a 

categorical Type D/non-Type D dichotomy.


Gorter, Rosalie; VUmc, Dept. of Epidemiology and Biostatistics, EMGO+ 

Institute of Health and Care Research 

Authors 

Rosalie Gorter; Martijn W. Heymans; Jos W.R. Twisk; VUmc, Dept. of 

Epidemiology and Biostatistics, EMGO+ Institute of Health and Care Research. 

Michiel R. de Boer, VU University, Faculty of Earth and Life Sciences, Institute for 

Health Sciences, Dept. of Methodology and Applied Biostatistics. 

Rien van der Leeden, Leiden University, Faculty of Social Sciences, Institute 

of Psychology, Methodology & Statistics. 

Title 

Comparing the performance of software packages in estimating the parameters 

of multilevel IRT models for longitudinal data 

Background 

Many questionnaires used in patient research consist of items with a Likert 

answering scale. An example is the increasing utilization of quality of life 

questionnaires in epidemiological and medical research. When the answers on 

such a questionnaire are used as an outcome variable, usually a score is attached to 

the answering categories and these scores are then added in order to obtain a 

total score for the construct. A theoretically more appropriate way of analyzing 

these data is by using an IRT model that estimates item and person parameters. 

An adjacent category (ordinal) logit model can be used to estimate the 

probability that a person chooses a specific category given his or her level of the 

latent variable theta. In addition to using IRT specific software packages for such 

analysis, the models can also be formulated as hierarchical models and analyzed 

with general software packages. An important advantage of this reformulation is 

that levels can be added for analyzing longitudinal or otherwise clustered data. 

There are several different software packages for fitting ordinal logit models 

which are capable of estimating the parameters for this type of longitudinal data. 

However, these packages use different estimation methods, which may lead to 

different estimates depending on the combination of parameter-specific 

characteristics of the data, such as sample size and item and person characteristics.

The aim of this study therefore is to compare these packages with respect to 

parameter estimates and user friendliness. 
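
To make the adjacent category logit model mentioned above concrete, the following Python sketch computes the category probabilities of a single item for a given level of the latent variable theta. It assumes the common partial-credit parameterization, in which the log-odds of choosing category k rather than k - 1 equal theta minus a threshold; the function name and the threshold values are hypothetical and do not come from the study.

```python
import math


def adjacent_category_probs(theta, thresholds):
    """Category probabilities under an adjacent category logit model.

    Assumed parameterization: for categories 0..m and thresholds
    delta_1..delta_m, log(P(k) / P(k - 1)) = theta - delta_k.
    """
    # Cumulative sums of (theta - delta_j); category 0 has an empty sum.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical three-category item evaluated at theta = 1.0.
probs = adjacent_category_probs(theta=1.0, thresholds=[-1.0, 0.5])
print(probs)
```

In a multilevel formulation, theta would be decomposed further, for example into person-level and occasion-level components; that extension is not shown here.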

Method and results 

Datasets were simulated under several conditions with variation in the number 

of participants. Results from multilevel IRT analyses in different software 

packages are compared on their precision of estimating parameters, and the 

time and number of iterations needed for convergence. We started our analysis 

in GLLAMM (Stata) and as expected we found that the bias of the estimated 

parameters was smaller in the n = 500 condition than in the n = 150 condition. 

The time to convergence differed considerably between these conditions: 1.5 

hours (n = 500) and 30 minutes (n = 150). Our results indicate that using data with a 

larger number of participants gives better estimates of the parameters, although 

the time until convergence increases. The true parameters were underestimated in 

all conditions. We will present a comparison of these 

results with the performance of other software packages. 

Keywords: bias, multilevel IRT, simulation, longitudinal, ordinal, quality of life.


Mittelhaëuser, Marie-Anne; Tilburg School of Social and Behavioral 

Sciences/CITO 

Title 

Using mixed IRT models and person-fit methods to model motivation: an 

application in educational measurement 

Abstract 

The goal of the current study was to compare a linking procedure for two 

test forms using different types of common items. It was hypothesized that the 

test-taking condition of the common items influences the linking procedure. The 

results support the hypothesis. A mixed Rasch model was used to model some 

examinees as being more motivated than others to solve the items. Removal of 

aberrant item-score vectors or items displaying differential item functioning did 

not improve the linking procedure.


Straat, Hendrik J.H.; Tilburg School of Social and Behavioral Sciences 

Title 

Conditional Association as a Powerful Tool for Assessing IRT Model Fit 

Abstract 

The ordinal, unidimensional latent variable model assumes 

unidimensionality, local independence, and monotonicity, and implies the 

general property of conditional association between sets of items. We specialized 

conditional association into three useful observable consequences and 

implemented them in a new scaling procedure that we coined CA scaling. CA 

scaling aims at identifying items that are inconsistent with the unidimensional 

latent variable model, removing those items from the initial item set, and 

producing a subset of items that is consistent with the unidimensional latent 

variable model. We compared CA scaling with the scaling procedures DETECT 

and Mokken scale analysis, and found that CA scaling produced longer scales 

consistent with the unidimensional latent variable model.
