Proceedings of the 13th ESSLLI Student Session

4–15 August 2008, Hamburg, Germany 

Kata Balogh 

(editor)

Copyright © the authors

Contents 

Preface

Martin Avanzini
POP* and Semantic Labelling using SAT

Timo Baumann
Simulating Spoken Dialogue With a Focus on Realistic Turn-Taking

Christopher Brumwell
Epistemic Modals in Dialogue

Bert Le Bruyn
Bare predication and kinds

James Burton
Diagrammatic Reasoning with Enhanced Static Constraints

Gemma Celestino
Fictional Contingencies

Michael Franke
Meaning & Inference in Case of Conflict

Michael Hartwig
Towards a New Characterisation of Chomsky’s Hierarchy via Acceptance Probability

Simon Hopp
Distance Effects in Sentence Processing

Pierre Lison
A Salience-driven Approach to Speech Recognition for Human-Robot Interaction

Petar Maksimović – Dragan Doder – Bojan Marinković – Aleksandar Perović
A logic with a conditional probability operator

Scott Martin
A Proof-theoretic Approach to French Pronominal Clitics

Takako Nemoto
Infinite games from an intuitionistic point of view

Ivelina Nikolova
Language Technologies for Instructional Resources in Bulgarian

Yves Peirsman
Word Space Models of Semantic Similarity and Relatedness

Maren Schierloh
Examining the Noticing Function of Output

Andreas Schnabl
Cdiprover3: a Tool for Proving Derivational Complexities of Term Rewriting Systems

Éva Szilágyi
The Rank(s) Of A Totally Lexicalist Syntax

Camilo Thorne
Expressing Conjunctive and Aggregate Queries over Ontologies with Controlled English

Christina Unger – Gianluca Giorgolo
Interrogation in Dynamic Epistemic Logic

Melanie Uth
The Semantic Change of the French -age-Derivation

Grégoire Winterstein
Adversary Implicatures

List of Authors

Preface 


This year's Student Session is the thirteenth in the twenty-year history of the annual

European Summer School on Logic, Language and Information. The first edition was

held in Prague in 1996, invented and organized by students, and ever since, ESSLLI

has been accompanied by a separate Student Session. The aim of the Student

Session is to give an opportunity to students at all levels (Bachelor-, Master-, and 

PhD-students) to present and discuss their work in progress with a possibility to get 

feedback from senior researchers. 

As in previous years, the quality of the submissions was high, which made

the selection procedure difficult. This year 17 papers were selected for oral presentation 

and 5 for poster presentation from a total of 46 submissions. All the accepted 

papers are included in this volume. 

I would like to thank the ESSLLI organization, in particular Rineke Verbrugge and Benedikt Loewe, for their continuous support and for making the Student Session possible. I am grateful to the StuS Program Committee: the co-chairs Laia Mayol, Manuel Kirschner and Ji Ruan, for their efforts in coordinating the reviewing process, and the senior area experts Anke Lüdeling, Paul Egré, Guram Bezhanishvili and Alexander Rabinovich, for their continuous presence and helpful advice. I also want to thank the anonymous reviewers, whose detailed comments have not only proved invaluable during the selection procedure, but also provide useful feedback to the authors. Many thanks to Kluwer Academic Publishers, who offered, as in previous years, prizes in the categories “Best Student Paper in the Oral Session” and “Best Student Paper in the Poster Session”.

We are very much looking forward to the 13th ESSLLI Student Session in Hamburg,

and believe that it will be again a very inspiring meeting. 


Kata Balogh 

Amsterdam, May 2008




POP* AND SEMANTIC LABELING USING SAT

Martin Avanzini

University of Innsbruck

Abstract. The polynomial path order (POP* for short) is a termination method that induces polynomial bounds on the innermost runtime complexity of term rewrite systems (TRSs). Semantic labeling is a transformation technique used for proving termination. In this paper we propose an efficient implementation of POP* together with finite semantic labeling. This automation works by a reduction to the problem of Boolean satisfiability. Satisfiability of the resulting formula is checked by a state-of-the-art SAT-solver. We have implemented the technique and experimental results confirm the feasibility of our approach. By semantic labeling, we significantly increase the power of POP*.

Term rewrite systems provide a conceptually simple but powerful abstract model of 

computation. In rewriting, proving termination is a long-standing research field and

consequently termination techniques applicable in an automated setting have been introduced

quite early. Earlier research concentrated mainly on direct termination techniques

(TeReSe, 2003). One such technique is the use of recursive path orders (RPOs), for instance 

the multiset path order (MPO) (Baader and Nipkow, 1998). Recently, the emphasis 

shifted toward transformation techniques like the dependency pair method (Arts and 

Giesl, 2000) or semantic labeling (Zantema, 1995). These methods significantly increase 

the possibility to automatically conclude termination. 

For direct termination techniques it is often possible to infer upper bounds on the 

derivational complexity of a rewrite system R from the termination proof. For instance, 

Hofbauer was the first to observe that termination via MPO implies the existence of a 

primitive recursive bound on the derivational complexity (Hofbauer, 1992). Here derivational 

complexity refers to the function that relates the length of the longest derivation 

sequence to the size of the initial term. It is thus quite natural to extend such a termination 

analysis of rewrite systems to the analysis of complexity properties. For the study of lower 

complexity bounds we recently introduced in (Avanzini and Moser, 2008) the polynomial 

path order (POP* for short). This order is in essence a miniaturization of MPO, carefully

crafted to induce polynomial bounds on the number of rewrite steps (cf. Theorem 4).

In this work, we show how to increase the power of POP* by semantic labeling (Zantema, 1995). The idea behind semantic labeling is to label the function symbols of a rewrite system R with semantic information in such a way that direct termination methods become applicable for the labeled rewrite system $R_{lab}$. In order to label R, one needs to define suitable interpretation and labeling functions for all symbols appearing in R. Naturally, these functions have to be chosen such that POP* is applicable to the labeled system. To find them automatically, we extend the propositional encoding from (Avanzini and Moser, 2008). Satisfiability of the constructed formula certifies the existence of a labeled system $R_{lab}$ that is compatible with POP*. Finite semantic labeling is non-termination preserving and, moreover, complexity preserving. Thus from compatibility of $R_{lab}$ with POP* we conclude that R admits a polynomial runtime complexity (cf. Lemma 6).


A translation of infinite semantic labeling in conjunction with RPOs has already been 

given in (Koprowski and Middeldorp, 2007). Unfortunately, this approach is inapplicable 

in our context since the runtime complexity of the original system cannot be related to the 

runtime complexity of the infinite labeled system in general. Furthermore, finite semantic

labeling using heuristics is implemented, for instance, in the termination prover TPA (Koprowski, 2006).

We consider the approach presented here favorable, as the choice of a labeling

suitable for the base order can be left to a state-of-the-art SAT-solver.

1 The Polynomial Path Order 

We briefly recall the basic concepts of term rewriting; for details, (Baader and Nipkow, 1998) provides a good resource. Let $V$ denote a countably infinite set of variables and $F$ a signature. The set of terms over $F$ and $V$ is denoted by $T(F, V)$. We write $\trianglelefteq$ for the subterm relation; the converse is denoted by $\trianglerighteq$ and the strict part of $\trianglerighteq$ by $\triangleright$.

A term rewrite system (TRS for short) $R$ over $T(F, V)$ is a set of rewrite rules $l \to r$ such that $l, r \in T(F, V)$, $l \notin V$ and all variables of $r$ also appear in $l$. In the following, $R$ will always denote a TRS and in our context, $R$ is finite. A binary relation on $T(F, V)$ is a rewrite relation if it is compatible with $F$-operations and closed under substitutions. The smallest extension of $R$ that is a rewrite relation is denoted by $\to_R$. The innermost rewrite relation $\xrightarrow{i}_R$ is a restriction of $\to_R$, where innermost terms have to be reduced first. The transitive and reflexive closure of a rewrite relation $\to$ is denoted by $\to^*$ and we write $s \to^n t$ for the contraction of $s$ to $t$ in $n$ steps. We say that $R$ is (innermost) terminating if there exists no infinite chain of terms $t_0, t_1, \ldots$ such that $t_i \to_R t_{i+1}$ ($t_i \xrightarrow{i}_R t_{i+1}$) for all $i \in \mathbb{N}$.

The root symbols of left-hand sides of rewrite rules in $R$ are called defined symbols and collected in $D(R)$, while all other symbols are called constructor symbols and collected in $C(R)$. A term $f(s_1, \ldots, s_n)$ is constructor-based with respect to $R$ if $f \in D(R)$ and $s_1, \ldots, s_n \in T(C(R), V)$. We write $T_{cb}(R)$ for the set of all constructor-based terms over $R$. If every left-hand side of $R$ is constructor-based, then $R$ is called a constructor TRS. Constructor TRSs allow us to model the computation of functions in a very natural way. Consider the following TRS:

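Example 1 The constructor TRS $R_{mult}$ is defined by

$$\begin{aligned}
\mathsf{add}(0, y) &\to y & \mathsf{mult}(0, y) &\to 0 \\
\mathsf{add}(\mathsf{s}(x), y) &\to \mathsf{s}(\mathsf{add}(x, y)) & \mathsf{mult}(\mathsf{s}(x), y) &\to \mathsf{add}(y, \mathsf{mult}(x, y)).
\end{aligned}$$

$R_{mult}$ defines the function symbols add and mult, i.e. $D(R) = \{\mathsf{add}, \mathsf{mult}\}$. Natural numbers are represented using the constructor symbols from $C(R) = \{\mathsf{s}, 0\}$. Define the encoding function $\ulcorner \cdot \urcorner : \Sigma^* \to T(C(R), \emptyset)$ by $\ulcorner 0 \urcorner = 0$ and $\ulcorner n + 1 \urcorner = \mathsf{s}(\ulcorner n \urcorner)$. Then for all $n, m \in \mathbb{N}$, $\mathsf{mult}(\ulcorner n \urcorner, \ulcorner m \urcorner) \xrightarrow{i}{}^*_R \ulcorner n * m \urcorner$. We say that $R_{mult}$ computes multiplication (and addition) on natural numbers. For instance, the system admits the innermost rewrite sequence $\mathsf{mult}(\mathsf{s}(0), 0) \xrightarrow{i} \mathsf{add}(0, \mathsf{mult}(0, 0)) \xrightarrow{i} \mathsf{add}(0, 0) \xrightarrow{i} 0$, computing $1 * 0$. Notice that in the second step we have to reduce the innermost redex $\mathsf{mult}(0, 0)$ first.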
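As an illustration of innermost rewriting, the following Python sketch evaluates ground terms with the four rules of Example 1 and counts the rewrite steps. The term representation (nested tuples) and all names (`encode`, `decode`, `innermost_step`, `normalize`) are assumptions made for this illustration; they are not taken from the implementation described in this paper.

```python
# Ground terms as nested tuples: ("0",), ("s", t), ("add", t1, t2), ("mult", t1, t2).
ZERO = ("0",)

def encode(n):
    """Unary encoding of a natural number: 0, s(0), s(s(0)), ..."""
    t = ZERO
    for _ in range(n):
        t = ("s", t)
    return t

def decode(t):
    """Inverse of encode; only defined on constructor terms."""
    n = 0
    while t[0] == "s":
        n, t = n + 1, t[1]
    assert t == ZERO
    return n

def root_step(t):
    """Apply a rule of R_mult at the root, or return None if no rule matches."""
    if t[0] == "add":
        x, y = t[1], t[2]
        if x == ZERO:
            return y                              # add(0, y) -> y
        if x[0] == "s":
            return ("s", ("add", x[1], y))        # add(s(x), y) -> s(add(x, y))
    if t[0] == "mult":
        x, y = t[1], t[2]
        if x == ZERO:
            return ZERO                           # mult(0, y) -> 0
        if x[0] == "s":
            return ("add", y, ("mult", x[1], y))  # mult(s(x), y) -> add(y, mult(x, y))
    return None

def innermost_step(t):
    """One innermost step: a redex is contracted only if its arguments are normal forms."""
    for i in range(1, len(t)):
        reduced = innermost_step(t[i])
        if reduced is not None:
            return t[:i] + (reduced,) + t[i + 1:]
    return root_step(t)

def normalize(t):
    """Rewrite t to normal form; return the normal form and the number of steps."""
    steps = 0
    while True:
        reduced = innermost_step(t)
        if reduced is None:
            return t, steps
        t, steps = reduced, steps + 1
```

For instance, `normalize(("mult", encode(1), encode(0)))` follows the three-step innermost sequence given in Example 1 and returns the encoding of 0 together with the step count 3.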

In (Lescanne, 1995) it is proposed to conceive the complexity of a rewrite system 

R as the complexity of the functions computed by R. Whereas this view falls into the 

realm of implicit complexity analysis, we conceive rewriting under R as the evaluation 


mechanism of the encoded function. Thus it is natural to define the runtime complexity 

based on the number of rewrite steps admitted by R. Let |s| denote the size of a term 

s. The (innermost) runtime complexity of a terminating rewrite system R is defined by 

$$\mathrm{Dl}_R(m) = \max\{\, n \mid \exists s, t.\ s \xrightarrow{i}{}^n t,\ s \in T_{cb}(R) \text{ and } |s| \leqslant m \,\}.$$

To verify whether the runtime complexity of a rewrite system R is polynomially 

bounded, we employ the polynomial path order. Similar to the recursion-theoretic characterization 

of the polytime functions given in (Bellantoni and Cook, 1992), POP* relies

on the separation of safe and normal inputs. For this, the notion of safe mappings is introduced. 

A safe mapping safe associates with every n-ary function symbol f the set of 

safe argument positions. If f ∈ D(R) then safe(f) ⊆ {1, . . . , n}, for f ∈ C(R) we fix 

safe(f) = {1, . . . , n}. The argument positions not included in safe(f) are called normal 

and denoted by nrm(f). A precedence is an irreflexive and transitive order on F. The 

polynomial path order >pop∗ is an extension of the auxiliary order >pop, both defined in 

the following definitions: 

Definition 2 Let $>$ be a precedence and safe a safe mapping. We define the order $>_{pop}$ inductively as follows: $s = f(s_1, \ldots, s_n) >_{pop} t$ if one of the following alternatives holds:

1. $f \in C(R)$ and $s_i \geqslant_{pop} t$ for some $i \in \{1, \ldots, n\}$, or

2. $s_i \geqslant_{pop} t$ for some $i \in nrm(f)$, or

3. $t = g(t_1, \ldots, t_m)$ with $f \in D(R)$ and $f > g$ and $s >_{pop} t_i$ for all $1 \leqslant i \leqslant m$.

Definition 3 Let $>$ be a precedence and safe a safe mapping. We define the polynomial path order $>_{pop*}$ inductively as follows: $s = f(s_1, \ldots, s_n) >_{pop*} t$ if either

1. $s >_{pop} t$, or

2. $s_i \geqslant_{pop*} t$ for some $i \in \{1, \ldots, n\}$, or

3. $t = g(t_1, \ldots, t_m)$, with $f \in D(R)$, $f > g$, and the following properties hold:

   • $s >_{pop*} t_{i_0}$ for some $i_0 \in safe(g)$ and

   • either $s >_{pop} t_i$, or $s \triangleright t_i$ and $i \in safe(g)$, for all $i \neq i_0$, or

4. $t = f(t_1, \ldots, t_m)$ and for $nrm(f) = \{i_1, \ldots, i_p\}$, $safe(f) = \{j_1, \ldots, j_q\}$ both $[s_{i_1}, \ldots, s_{i_p}] \,(>_{pop*})_{mul}\, [t_{i_1}, \ldots, t_{i_p}]$ and $[s_{j_1}, \ldots, s_{j_q}] \,(\geqslant_{pop*})_{mul}\, [t_{j_1}, \ldots, t_{j_q}]$ hold.

Here $\geqslant_{pop*}$ ($\geqslant_{pop}$) denotes the reflexive closure of $>_{pop*}$ ($>_{pop}$) and $(>_{pop*})_{mul}$ the multiset extension of $>_{pop*}$. When $R \subseteq {>_{pop*}}$ holds, we say that $>_{pop*}$ is compatible with $R$.

The main theorem from (Avanzini and Moser, 2008) states: 

Theorem 4 Let R be a finite, constructor TRS compatible with >pop∗, i.e., R ⊆ >pop∗. 

Then the runtime complexity of R is polynomial. The polynomial depends only on the 

cardinality of F and the sizes of the right-hand sides in R. 

We conclude this section by demonstrating the application of POP* to the TRS Rmult:



Example 5 Reconsider the rewrite system Rmult from Example 1. We suppose that the 

second argument of addition (add) is safe (safe(add) = {2}) and that all arguments of 

multiplication (mult) are normal (safe(mult) = ∅). Furthermore let the precedence > 

be defined as mult > add > s. Then Rmult is compatible with >pop∗. As a consequence 

of Theorem 4, the number of rewrite steps starting from $\mathsf{mult}(\ulcorner n \urcorner, \ulcorner m \urcorner)$ is polynomially

bounded in n and m. 

In order to verify compatibility for this particular instance of $>_{pop*}$, we need to show that all the rules in $R_{mult}$ are strictly decreasing with respect to $>_{pop*}$, that is, $l >_{pop*} r$ holds for every $l \to r \in R_{mult}$. To exemplify this, consider the rule $\mathsf{add}(\mathsf{s}(x), y) \to \mathsf{s}(\mathsf{add}(x, y))$. We write 〈i〉 for the i-th case of Definition 3. From $\mathsf{s}(x) >_{pop*} x$ by rule 〈2〉 we infer $[\mathsf{s}(x)] \,(>_{pop*})_{mul}\, [x]$. Furthermore $[y] \,(\geqslant_{pop*})_{mul}\, [y]$ holds and thus by rule 〈4〉 we obtain $\mathsf{add}(\mathsf{s}(x), y) >_{pop*} \mathsf{add}(x, y)$. Finally, from this and $\mathsf{add} > \mathsf{s}$ we conclude by one application of rule 〈3〉 that $\mathsf{add}(\mathsf{s}(x), y) >_{pop*} \mathsf{s}(\mathsf{add}(x, y))$ holds.
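The compatibility check of Example 5 can also be carried out mechanically. The Python sketch below is a direct, unoptimised transcription of Definitions 2 and 3 for the fixed precedence and safe mapping of Example 5; it is not the SAT-based procedure of this paper. The term representation, the names `gt_pop`, `gt_pop_star`, `mul_ext` and `compatible`, and the way the multiset extension is decided (cancel equal elements, then require every remaining right-hand element to be dominated) are assumptions of this illustration; the latter presupposes that the compared relation is a strict partial order.

```python
# Variables are strings; applications are tuples (symbol, arg_1, ..., arg_n).
DEFINED = {"add", "mult"}
# Safe argument positions (1-based); constructors have all positions safe.
SAFE = {"add": {2}, "mult": set(), "s": {1}, "0": set()}
# Precedence mult > add > s, stored as its transitive closure.
PREC = {("mult", "add"), ("add", "s"), ("mult", "s")}

def is_var(t):
    return isinstance(t, str)

def nrm(f, n):
    """Normal argument positions of the n-ary symbol f."""
    return set(range(1, n + 1)) - SAFE[f]

def superterm(s, t):
    """True if t is a proper subterm of s."""
    return not is_var(s) and any(a == t or superterm(a, t) for a in s[1:])

def mul_ext(gt, ms, ns, strict):
    """Multiset extension of gt (assumed to be a strict partial order)."""
    ms, ns = list(ms), list(ns)
    for x in list(ms):            # cancel common elements pairwise
        if x in ns:
            ms.remove(x)
            ns.remove(x)
    if not ms and not ns:         # the two multisets are equal
        return not strict
    return bool(ms) and all(any(gt(m, n) for m in ms) for n in ns)

def gt_pop(s, t):
    """The auxiliary order of Definition 2."""
    if is_var(s):
        return False
    f, ss = s[0], s[1:]
    ge = lambda u: u == t or gt_pop(u, t)
    if f not in DEFINED and any(ge(u) for u in ss):                # case 1
        return True
    if any(ge(ss[i - 1]) for i in nrm(f, len(ss))):                # case 2
        return True
    if not is_var(t) and f in DEFINED and (f, t[0]) in PREC:       # case 3
        return all(gt_pop(s, u) for u in t[1:])
    return False

def gt_pop_star(s, t):
    """The polynomial path order of Definition 3."""
    if is_var(s):
        return False
    if gt_pop(s, t):                                               # case 1
        return True
    f, ss = s[0], s[1:]
    if any(u == t or gt_pop_star(u, t) for u in ss):               # case 2
        return True
    if is_var(t):
        return False
    g, ts = t[0], t[1:]
    if f in DEFINED and (f, g) in PREC:                            # case 3
        for i0 in SAFE[g]:
            if i0 <= len(ts) and gt_pop_star(s, ts[i0 - 1]) and all(
                gt_pop(s, ts[i - 1]) or (i in SAFE[g] and superterm(s, ts[i - 1]))
                for i in range(1, len(ts) + 1) if i != i0
            ):
                return True
    if f == g and len(ss) == len(ts):                              # case 4
        normal, safe = sorted(nrm(f, len(ss))), sorted(SAFE[f])
        return (mul_ext(gt_pop_star, [ss[i - 1] for i in normal],
                        [ts[i - 1] for i in normal], True)
                and mul_ext(gt_pop_star, [ss[i - 1] for i in safe],
                            [ts[i - 1] for i in safe], False))
    return False

R_MULT = [
    (("add", ("0",), "y"), "y"),
    (("add", ("s", "x"), "y"), ("s", ("add", "x", "y"))),
    (("mult", ("0",), "y"), ("0",)),
    (("mult", ("s", "x"), "y"), ("add", "y", ("mult", "x", "y"))),
]

def compatible(rules):
    """True if every rule is strictly decreasing with respect to the order."""
    return all(gt_pop_star(l, r) for l, r in rules)
```

With these settings `compatible(R_MULT)` holds, in line with Example 5. The search is a naive recursion over the term structure, which is adequate for small examples only.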

2 A Propositional Encoding of POP* with Finite Semantic Labeling

In (Zantema, 1995) the transformation technique semantic labeling is introduced. From $R$ a labeled TRS $R_{lab}$ is obtained by labeling the function symbols in $R$ with semantic information. Semantics are given to $R$ by defining a model. A model is an $F$-algebra $\mathcal{A}$, i.e. a carrier $A$ equipped with operations $f_{\mathcal{A}} : A^n \to A$ for every $n$-ary symbol $f \in F$, such that for every rule $l \to r \in R$ and any assignment $\alpha : V \to A$, the equality $[\alpha]_{\mathcal{A}}(l) = [\alpha]_{\mathcal{A}}(r)$ holds. Here $[\alpha]_{\mathcal{A}}(t)$ denotes the interpretation of $t$ with assignment $\alpha$, inductively defined by $[\alpha]_{\mathcal{A}}(t) = \alpha(t)$ if $t \in V$ and $[\alpha]_{\mathcal{A}}(t) = f_{\mathcal{A}}([\alpha]_{\mathcal{A}}(t_1), \ldots, [\alpha]_{\mathcal{A}}(t_n))$ if $t = f(t_1, \ldots, t_n)$. The system is then labeled according to a labeling $\ell$ for $\mathcal{A}$, i.e. a set of mappings $\ell_f : A^n \to A$ for every $n$-ary function symbol $f \in F$. [1]

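For every assignment $\alpha$, the mapping $lab_\alpha(t)$ is defined by $lab_\alpha(t) = t$ if $t \in V$ and $lab_\alpha(f(t_1, \ldots, t_n)) = f_a(lab_\alpha(t_1), \ldots, lab_\alpha(t_n))$ where $a = \ell_f([\alpha]_{\mathcal{A}}(t_1), \ldots, [\alpha]_{\mathcal{A}}(t_n))$. The labeled TRS $R_{lab}$ is obtained by labeling all rules for all assignments $\alpha$, that is

$$R_{lab} = \{\, lab_\alpha(l) \to lab_\alpha(r) \mid l \to r \in R \text{ and } \alpha \text{ an assignment} \,\}.$$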

The main theorem from (Zantema, 1995) states that Rlab is terminating if and only if R 

is terminating. In the following, we restrict to algebras B with carrier B = {true, false}, 

however the approach is extensible to arbitrary finite carriers. 
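To make the construction concrete, the following Python sketch labels the rules of Example 1 over the Boolean carrier. The model (every number is interpreted by its parity) and the labeling (add and mult are labeled by the interpretation of their first argument, all other symbols receive a constant label) are hypothetical choices made for this illustration and are not claimed to be the ones found by the encoding. Only assignments to the variables of a rule are enumerated, which yields the same set of labeled rules as ranging over all assignments.

```python
from itertools import product

# Hypothetical Boolean model: True stands for "odd", False for "even".
INTERP = {
    "0": lambda: False,
    "s": lambda a: not a,
    "add": lambda a, b: a != b,       # parity of a sum (exclusive or)
    "mult": lambda a, b: a and b,     # parity of a product
}
# Hypothetical labeling functions, one per symbol.
LABEL = {
    "0": lambda: False,
    "s": lambda a: False,
    "add": lambda a, b: a,
    "mult": lambda a, b: a,
}

R_MULT = [
    (("add", ("0",), "y"), "y"),
    (("add", ("s", "x"), "y"), ("s", ("add", "x", "y"))),
    (("mult", ("0",), "y"), ("0",)),
    (("mult", ("s", "x"), "y"), ("add", "y", ("mult", "x", "y"))),
]

def variables(t):
    if isinstance(t, str):
        return {t}
    return set().union(*[variables(a) for a in t[1:]])

def interpret(alpha, t):
    """[alpha](t): the value of t in the model under the assignment alpha."""
    if isinstance(t, str):
        return alpha[t]
    return INTERP[t[0]](*[interpret(alpha, a) for a in t[1:]])

def label(alpha, t):
    """lab_alpha(t): attach to every symbol the label computed from its arguments."""
    if isinstance(t, str):
        return t
    a = LABEL[t[0]](*[interpret(alpha, u) for u in t[1:]])
    return ((t[0], a),) + tuple(label(alpha, u) for u in t[1:])

def assignments(l):
    """All Boolean assignments to the variables of the left-hand side l."""
    names = sorted(variables(l))
    for values in product([False, True], repeat=len(names)):
        yield dict(zip(names, values))

def is_model(rules):
    """Model condition: both sides of every rule have the same interpretation."""
    return all(interpret(al, l) == interpret(al, r)
               for l, r in rules for al in assignments(l))

def labeled_system(rules):
    """The set of labeled rules, one per rule and assignment (duplicates merged)."""
    return {(label(al, l), label(al, r)) for l, r in rules for al in assignments(l)}
```

A labeled symbol is represented as a pair such as `("add", True)`. Under the choices above the four rules of Example 1 give rise to eight distinct labeled rules.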

To encode a Boolean function $b : B^n \to B$, we make use of unique propositional atoms $b_w$ for every sequence of arguments $w = w_1, \ldots, w_n \in B^n$. The atom $b_w$ will denote the result of applying $b$ to $w_1, \ldots, w_n$. Let $a_1, \ldots, a_n$ be propositional formulas. To impose restrictions on the encoded function $b$, we introduce the formula $\ulcorner b \urcorner(a_1, \ldots, a_n)$ such that for a satisfying assignment $\nu$ the equality $\nu(\ulcorner b \urcorner(a_1, \ldots, a_n)) = b_{\nu(a_1), \ldots, \nu(a_n)}$ holds. For instance with $\ulcorner b \urcorner(a_1, a_2) \leftrightarrow r$ we assert that the encoded function $b$ satisfies $b(\nu(a_1), \nu(a_2)) = \nu(r)$.
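The text above does not spell out how the formula for an encoded function is built. One possible construction, assumed here purely for illustration, is a disjunction over all argument tuples w in which each disjunct selects the atom b_w when the argument formulas evaluate to w. The Python sketch below builds this formula and evaluates it under a valuation; the formula representation and the names `encode_apply` and `evaluate` are hypothetical.

```python
from itertools import product

# Formulas as tuples: ("atom", name), ("not", p), ("and", p, ...), ("or", p, ...).
def atom(name):
    return ("atom", name)

def evaluate(nu, p):
    """Truth value of formula p under the valuation nu (a dict from atom names to bool)."""
    tag = p[0]
    if tag == "atom":
        return nu[p[1]]
    if tag == "not":
        return not evaluate(nu, p[1])
    values = [evaluate(nu, q) for q in p[1:]]
    return all(values) if tag == "and" else any(values)

def encode_apply(b, formulas):
    """Formula for the application of the encoded function b to the given formulas.

    For every argument tuple w there is an atom named (b, w) holding the value b(w).
    The disjunct for w is guarded by the condition that each a_i evaluates to w_i,
    so exactly one guard is true under any valuation.
    """
    disjuncts = []
    for w in product([False, True], repeat=len(formulas)):
        guards = [a if w_i else ("not", a) for a, w_i in zip(formulas, w)]
        disjuncts.append(("and", atom((b, w)), *guards))
    return ("or", *disjuncts)
```

Under any valuation, `encode_apply("b", [a1, a2])` evaluates to the value of the atom for the tuple selected by a1 and a2, which is the property required of the formula in the text. The size of the formula grows exponentially with the arity of b, so this construction is only sensible for small arities.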

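For every assignment $\alpha : V \to A$ and term $t$ appearing in $R$ we introduce the atoms $int_{\alpha,t}$ and $lab_{\alpha,t}$ for $t \notin V$. The meaning of $int_{\alpha,t}$ will be the result of $[\alpha]_{\mathcal{B}}(t)$; $lab_{\alpha,t}$ will denote the label of the root symbol of $t$ under $\alpha$. In order to ensure this for $t = f(t_1, \ldots, t_n)$ and a particular assignment $\alpha$, we define

$$\begin{aligned}
INT_\alpha(t) &= int_{\alpha,t} \leftrightarrow \ulcorner f_{\mathcal{B}} \urcorner(int_{\alpha,t_1}, \ldots, int_{\alpha,t_n}), \text{ and} \\
LAB_\alpha(t) &= lab_{\alpha,t} \leftrightarrow \ulcorner \ell_f \urcorner(int_{\alpha,t_1}, \ldots, int_{\alpha,t_n}).
\end{aligned}$$

Furthermore for $t \in V$ we set $INT_\alpha(t) = int_{\alpha,t} \leftrightarrow \alpha(t)$. We extend $\trianglerighteq$ to TRSs as follows: $R \trianglerighteq t$ if $l \trianglerighteq t$ or $r \trianglerighteq t$ for some rule $l \to r \in R$. Besides the model condition, the above constraints have to be enforced for every term appearing in $R$. This is covered by

$$LAB(R) = \bigwedge_{\alpha} \Big( \bigwedge_{R \trianglerighteq t} \big( INT_\alpha(t) \wedge LAB_\alpha(t) \big) \wedge \bigwedge_{l \to r \in R} \big( int_{\alpha,l} \leftrightarrow int_{\alpha,r} \big) \Big).$$

[1] The definition from (Zantema, 1995) allows the labeling of a subset of F and leaves other symbols unchanged. In our context, this has no consequence and simplifies the translation.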

Assume $\nu$ is a satisfying assignment for $LAB(R)$ and $R_{lab}$ denotes the system obtained by labeling $R$ according to the encoded labeling and model. In order to show compatibility of $R_{lab}$ with POP*, we need to find a precedence $>$ and safe mapping safe such that $R_{lab} \subseteq {>_{pop*}}$ holds for the induced order $>_{pop*}$. To compare the labeled versions of two concrete terms $s, t \in T(F, V)$ under a particular assignment $\alpha$, we define

$$\ulcorner s >_{pop*} t \urcorner_\alpha = \ulcorner s >^{(1)}_{pop*} t \urcorner_\alpha \vee \ulcorner s >^{(2)}_{pop*} t \urcorner_\alpha \vee \ulcorner s >^{(3)}_{pop*} t \urcorner_\alpha \vee \ulcorner s >^{(4)}_{pop*} t \urcorner_\alpha.$$

Here $\ulcorner s >^{(i)}_{pop*} t \urcorner_\alpha$ refers to the encoding of case 〈i〉 from Definition 3. We discuss the cases 〈2〉 – 〈4〉; case 〈1〉, the comparison using the weaker order $>_{pop}$, is obtained similarly.

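Note that $s_i = t$ implies $lab_\alpha(s_i) = lab_\alpha(t)$. Thus case 〈2〉 is perfectly captured by $\ulcorner f(s_1, \ldots, s_n) >^{(2)}_{pop*} t \urcorner_\alpha = \top$ [2] if $s_i = t$ holds for some $s_i$. Otherwise, we define $\ulcorner f(s_1, \ldots, s_n) >^{(2)}_{pop*} t \urcorner_\alpha = \bigvee_{i=1}^{n} \ulcorner s_i >_{pop*} t \urcorner_\alpha$. For $f \in F$ and formula $a$ representing the label, the formula $SF(f_a, i)$ ($NRM(f_a, i)$) asserts that, depending on the valuation of $a$, the $i$-th position of $f_{true}$ or $f_{false}$ is safe (normal). Likewise, for $f, g \in F$, the formula $\ulcorner f_a > g_b \urcorner$ is defined such that for a satisfying assignment $\nu$, $f_{\nu(a)} > g_{\nu(b)}$ is asserted. Assume the unlabeled symbol $f$ is a defined symbol of $R$. We define for $f \neq g$

$$\begin{aligned}
\ulcorner f(s_1, \ldots, s_n) >^{(3)}_{pop*} g(t_1, \ldots, t_m) \urcorner_\alpha = {} & \ulcorner f_{lab_{\alpha,s}} > g_{lab_{\alpha,t}} \urcorner \\
& \wedge \bigvee_{i_0=1}^{m} \Big( \ulcorner s >_{pop*} t_{i_0} \urcorner_\alpha \wedge SF(g_{lab_{\alpha,t}}, i_0) \\
& \qquad \wedge \bigwedge_{i=1, i \neq i_0}^{m} \big( \ulcorner s >^{(1)}_{pop*} t_i \urcorner_\alpha \vee ( SF(g_{lab_{\alpha,t}}, i) \wedge \ulcorner s \triangleright t_i \urcorner ) \big) \Big).
\end{aligned}$$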

Here we employ that the superterm property $\triangleright$ is closed under labeling. Additionally we add the rule $f_a(x_1, \ldots, x_n) \to c$ with $c$ a fresh constant to the labeled system and require $f_a > c$ in the precedence. This guarantees that $f_a$ is defined with respect to $R_{lab}$, as otherwise case 〈3〉 is not applicable. Alternatively one could encode whether $f_a$ is defined and adapt the encoding of case 〈3〉 accordingly, but experimental findings indicate that the described approach is favorable.

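To encode multiset comparisons, we make use of multiset covers (Schneider-Kamp, Thiemann, Annov, Codish and Giesl, 2007). A multiset cover is a pair of total mappings $\gamma : \{1, \ldots, n\} \to \{1, \ldots, n\}$ and $\varepsilon : \{1, \ldots, n\} \to B$, encoded using fresh atoms $\gamma_{i,j}$ and $\varepsilon_i$. The underlying idea is that for the comparison $[s_1, \ldots, s_n] \,(\geqslant_{pop*})_{mul}\, [t_1, \ldots, t_n]$ to hold, every term $t_j$ has to be covered by some term $s_i$ (encoded as $\gamma_{i,j} = true$), either by $s_i = t_j$ ($\varepsilon_i = true$) or $s_i >_{pop*} t_j$ ($\varepsilon_i = false$). For the case $s_i = t_j$, $s_i$ must not cover any element besides $t_j$. To assert a correct encoding of $(\gamma, \varepsilon)$, we introduce the formula $\ulcorner (\gamma, \varepsilon) \urcorner$. By means of multiset covers we are able to encode case 〈4〉 using one multiset comparison. We define

$$\begin{aligned}
\ulcorner f(s_1, \ldots, s_n) >^{(4)}_{pop*} f(t_1, \ldots, t_n) \urcorner_\alpha = {} & (lab_{\alpha,s} \leftrightarrow lab_{\alpha,t}) \wedge \ulcorner (\gamma, \varepsilon) \urcorner \wedge \bigvee_{i=1}^{n} \big( NRM(f_{lab_{\alpha,s}}, i) \wedge \neg \varepsilon_i \big) \\
& \wedge \bigwedge_{i=1}^{n} \bigwedge_{j=1}^{n} \Big( \gamma_{i,j} \to \big( (SF(f_{lab_{\alpha,s}}, i) \leftrightarrow SF(f_{lab_{\alpha,t}}, j)) \\
& \qquad \wedge (\varepsilon_i \to \ulcorner s_i = t_j \urcorner) \wedge (\neg \varepsilon_i \to \ulcorner s_i >_{pop*} t_j \urcorner_\alpha) \big) \Big)
\end{aligned}$$

[2] We use ⊤ and ⊥ to denote truth and falsity in propositional formulas.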

where we restrict comparisons of arguments by their kind. Assuming STRICT(R) and 

SMSL(R) cover the restrictions on the precedence and safe mapping, satisfiability of 

$$POP^*_{SL}(R) = \bigwedge_{\alpha} \bigwedge_{l \to r \in R} \ulcorner l >_{pop*} r \urcorner_\alpha \wedge SM(R) \wedge STRICT(R) \wedge LAB(R)$$

certifies the existence of a model $\mathcal{B}$ and labeling $\ell$ such that the rewrite system

$$R'_{lab} = R_{lab} \cup \{\, f_a(x_1, \ldots, x_n) \to c \mid f \in D(R) \text{ and } f_a \in C(R_{lab}) \,\}$$

is compatible with $>_{pop*}$. Since every rewrite sequence in $R$ translates to a sequence in $R_{lab}$, by Theorem 4 it is an easy exercise to prove the following lemma:

Lemma 6 Let $R$ be a finite constructor TRS and assume $POP^*_{SL}(R)$ is satisfiable. Then the induced runtime complexity is polynomial.
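The multiset covers used in the encoding of case 〈4〉 can be illustrated without a SAT-solver. The Python sketch below searches for a cover by brute force, whereas the encoding in this section leaves the search to the solver. The function name `multiset_cover`, the tuple representation of the two mappings, and the reading that an element used for equality covers exactly one element are assumptions of this illustration. The order `gt` is passed in as a parameter, and the two multisets may differ in size, which is slightly more general than the setting of the text.

```python
from itertools import product

def multiset_cover(gt, ss, ts, strict=False):
    """Search for a multiset cover (gamma, eps) witnessing that ss dominates ts.

    gamma[j] == i means that ss[i] covers ts[j].
    eps[i] is True if ss[i] is used for equality and False if it is used strictly.
    Returns a pair (gamma, eps), or None if no cover exists.
    """
    n, m = len(ss), len(ts)
    for gamma in product(range(n), repeat=m):        # every ts[j] is covered
        for eps in product([False, True], repeat=n):
            ok = True
            for i in range(n):
                covered = [j for j in range(m) if gamma[j] == i]
                if eps[i]:
                    # equality: ss[i] covers one element only, and equals it
                    ok = len(covered) == 1 and ss[i] == ts[covered[0]]
                else:
                    # strict: ss[i] is greater than everything it covers
                    ok = all(gt(ss[i], ts[j]) for j in covered)
                if not ok:
                    break
            # a strict comparison needs at least one element used strictly
            if ok and (not strict or not all(eps)):
                return gamma, eps
    return None
```

With the usual order on integers, for instance, `[5, 3]` strictly dominates `[4, 3]`, while `[3, 3]` dominates `[3, 3]` only in the non-strict sense. The search space is exponential in the number of elements, which is unproblematic for argument lists of function symbols but is the reason a propositional encoding is preferable in general.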

3 Experimental Results 


We implemented the encoding of POP* with semantic labeling (denoted by $POP^*_{SL}$) in OCaml and compare it to the implementation without labeling from (Avanzini and Moser, 2008) (denoted by POP*) and an implementation of a restricted class of polynomial interpretations (denoted by SMC). To check satisfiability of the obtained formulas we employ the MiniSat SAT-solver (Eén and Sörensson, 2003).

SMC refers to a restrictive class of polynomial interpretations: every constructor symbol is interpreted by a strongly linear polynomial, i.e. a polynomial of shape $P(x_1, \ldots, x_n) = \sum_{i=1}^{n} x_i + c$ with $c \in \mathbb{N}$, $c \geqslant 1$. Furthermore, each defined symbol is interpreted by a simple-mixed polynomial $P(x_1, \ldots, x_n) = \sum_{i_j \in \{0,1\}} a_{i_1 \ldots i_n} x_1^{i_1} \cdots x_n^{i_n} + \sum_{i=1}^{n} b_i x_i^2$ with coefficients in $\mathbb{N}$. For this class of polynomial interpretations it is trivial to check that they induce polynomial bounds on the runtime complexity. To find these interpretations automatically we employ cdiprover3 (Moser and Schnabl, 2008).

The table below presents experimental results based on two testbeds. Testbed T consists of the 957 examples from the Termination Problem Database 4.0 [3] (TPDB) that were automatically verified terminating in the competition of 2007 [4]. Testbed C is a restriction of T where only constructor TRSs have been considered (449 in total). Experimental results, performed on a PC with 512 MB of RAM and a 2.4 GHz Intel® Pentium™ IV processor, are collected in Table 1 [5].

Table 1: Experimental results on TPDB 4.0.

                              POP*          POP*_SL           SMC
                             T      C       T      C       T      C
Yes                         65     41     128     74     156     83
Maybe                      892    408     800    370     495    271
Timeout (60 sec.)            0      0      29      5     306     95
Average Time Yes (sec.)      0.037          0.130          0.183

The results confirm that semantic labeling significantly increases the power of POP*, yielding results comparable to SMC. What is noteworthy is that the union of yes-instances of the three methods consists of 218 examples for testbed T and 112 for testbed C. For these 112 out of 449 constructor TRSs we are able to conclude a polynomial runtime complexity. Interestingly, $POP^*_{SL}$ and SMC succeed on a quite different range of systems. There are 29 constructor TRSs that only $POP^*_{SL}$ can deal with, whereas 38 constructor yes-instances of SMC cannot be handled by $POP^*_{SL}$. Table 1 reflects that for both suites SMC runs into a timeout for approximately every fourth system. This indicates that purely semantic methods similar to SMC tend to get impractical when the size of the input system increases. Compared to this, the number of timeouts of $POP^*_{SL}$ is rather low, confirming the feasibility of our new approach.

We perform various optimizations in our implementation: First of all, the constraint formula can be reduced during construction. In combination with this, it is usually beneficial to construct the formula lazily. For example, $\ulcorner f(s_1, \ldots, s_n) >^{(2)}_{pop*} s_i \urcorner_\alpha$ reduces to $\top$ and thus one can directly conclude $\ulcorner f(s_1, \ldots, s_n) >_{pop*} s_i \urcorner_\alpha = \top$ without constructing encodings for the other cases. Furthermore, $s >_{pop*} t$ is doomed to failure if $t$ contains variables not appearing in $s$; in this case we replace the constraint by $\bot$. SAT-solvers expect their input in CNF (worst case exponential in size). We employ the transformation proposed in (Plaisted and Greenbaum, 1986) to obtain an equisatisfiable CNF linear in size. This approach is analogous to Tseitin’s transformation (Tseitin, 1968) but additionally takes the polarity of atoms into account, usually resulting in shorter transformations.
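The idea of a polarity-aware clause form translation can be sketched as follows. The Python code below is a simplified illustration for formulas built from atoms, negation, conjunction and disjunction only; it introduces one fresh variable per occurrence of a compound subformula and emits only the defining implication needed for the polarity of that occurrence. The formula and clause representations, the function names and the brute-force satisfiability checks are assumptions of this sketch and do not reflect the OCaml implementation described in this section.

```python
from itertools import count, product

# Formulas: ("atom", name), ("not", p), ("and", p, ...), ("or", p, ...).
# A literal is a pair (name, sign); a clause is a list of literals.

def negate(literal):
    return (literal[0], not literal[1])

def pg_cnf(formula):
    """Polarity-aware translation of a formula into an equisatisfiable CNF."""
    clauses, fresh = [], count()

    def lit(p, positive):
        tag = p[0]
        if tag == "atom":
            return (p[1], True)
        if tag == "not":
            return negate(lit(p[1], not positive))
        x = ("aux", next(fresh))                  # fresh name for this occurrence
        subs = [lit(q, positive) for q in p[1:]]
        if tag == "and" and positive:             # x -> q_i for every i
            clauses.extend([[(x, False), q] for q in subs])
        elif tag == "and":                        # (q_1 and ... and q_k) -> x
            clauses.append([(x, True)] + [negate(q) for q in subs])
        elif positive:                            # x -> (q_1 or ... or q_k)
            clauses.append([(x, False)] + subs)
        else:                                     # q_i -> x for every i
            clauses.extend([[(x, True), negate(q)] for q in subs])
        return (x, True)

    clauses.append([lit(formula, True)])          # the formula itself must hold
    return clauses

def evaluate(nu, p):
    tag = p[0]
    if tag == "atom":
        return nu[p[1]]
    if tag == "not":
        return not evaluate(nu, p[1])
    values = [evaluate(nu, q) for q in p[1:]]
    return all(values) if tag == "and" else any(values)

def atoms(p):
    return {p[1]} if p[0] == "atom" else set().union(*[atoms(q) for q in p[1:]])

def formula_satisfiable(p):
    """Brute-force satisfiability of a formula (for cross-checking only)."""
    names = sorted(atoms(p))
    return any(evaluate(dict(zip(names, vs)), p)
               for vs in product([False, True], repeat=len(names)))

def cnf_satisfiable(clauses):
    """Brute-force satisfiability of a clause set (for cross-checking only)."""
    names = sorted({name for c in clauses for name, _ in c}, key=repr)
    for vs in product([False, True], repeat=len(names)):
        nu = dict(zip(names, vs))
        if all(any(nu[name] == sign for name, sign in c) for c in clauses):
            return True
    return False
```

Compared to a translation that emits both directions of every definition, each conjunction or disjunction occurrence contributes roughly half of the clauses. In practice the resulting clause set would be handed to a SAT-solver rather than checked by enumeration.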

4 Conclusion 


In this paper we have shown how to automatically verify polynomial runtime complexities of rewrite systems. For that we employ semantic labeling and the polynomial path order POP*. Our automation works by a reduction to SAT and employing a state-of-the-art SAT-solver. To the best of our knowledge, this is the first SAT encoding of recursive path orders with finite semantic labeling. The experimental results confirm the feasibility of our approach. Moreover, they demonstrate that by semantic labeling we significantly increase the power of POP*.

[3] Available at http://www.lri.fr/~marche/tpdb.

[4] Cf. http://www.lri.fr/~marche/termination-competition/2007/.

[5] Detailed results available at http://homepage.uibk.ac.at/~csae2496/esslli08.

Our research seems also comparable to (Bonfante, Marion and Péchoux, 2007), where recursive path orders together with strongly linear polynomial quasi-interpretations are employed in the complexity analysis. However, this method relies on caching techniques to achieve polytime computability. In contrast, we only demand an eager evaluation strategy.

In future work we will strengthen the applicability of our methods. Currently we are investigating

the integration of POP* into the dependency pair framework for an automatic

complexity analysis as proposed in (Hirokawa and Moser, 2008). As this framework allows 

the use of argument filterings (Kusakari, Nakamura and Toyama, 1999) and usable 

rules (Arts and Giesl, 2000), we expect a significant increase in the ability to automatically 

verify polynomial runtime complexities. 

Finally we want to mention another exciting field of application. There is a long-standing interest

in the functional programming community in automatically verifying complexity properties

of programs. For brevity we just mention (Rosendahl, 1989; Anderson, Khoo, 

Andrei and Luca, 2005; Bonfante et al., 2007). Rewriting naturally models the evaluation 

of functional programs, and termination behavior of functional programs via transformations 

to rewrite systems has been extensively studied. For instance, one recent approach is 

described in (Giesl, Swiderski, Schneider-Kamp and Thiemann, 2006) where Haskell programs 

are covered. In joint work with Hirokawa, Middeldorp and Moser (Avanzini, Hirokawa, 

Middeldorp and Moser, 2007) we propose a translation from (a subset of higher-order)

Scheme programs to term rewrite systems. The transformation is designed to be 

complexity preserving and thus allows the study of the complexity of a Scheme program 

P by the analysis of the transformed rewrite system R. Hence from compatibility of R 

with POP* we can directly conclude that the number of evaluation steps of the Scheme

program P is polynomially bounded with respect to the input sizes. All necessary steps 

can be performed mechanically and thus we arrive at a completely automatic complexity 

analysis for Scheme, and eagerly evaluated functional programs in general. 

References 


Anderson, H., Khoo, S.-C., Andrei, S. and Luca, B. (2005). Calculating polynomial runtime properties, Proc. 3rd APLAS, pp. 230–246.

Arts, T. and Giesl, J. (2000). Termination of term rewriting using dependency pairs, TCS 236(1-2): 133–178.

Avanzini, M., Hirokawa, N., Middeldorp, A. and Moser, G. (2007). Proving termination of Scheme programs by rewriting. Draft [6].

Avanzini, M. and Moser, G. (2008). Complexity analysis by rewriting, Proc. 9th FLOPS, Vol. 4989 of LNCS, pp. 130–146.

Baader, F. and Nipkow, T. (1998). Term Rewriting and All That, Cambridge University Press.

Bellantoni, S. and Cook, S. A. (1992). A new recursion-theoretic characterization of the polytime functions, CC 2: 97–110.

Bonfante, G., Marion, J.-Y. and Péchoux, R. (2007). Quasi-interpretation synthesis by decomposition, Proc. 4th ICTAC, Vol. 4711 of LNCS, pp. 410–424.

Eén, N. and Sörensson, N. (2003). An extensible SAT-solver, Proc. 6th SAT, Vol. 2919 of LNCS, pp. 502–518.

Giesl, J., Swiderski, S., Schneider-Kamp, P. and Thiemann, R. (2006). Automated termination analysis for Haskell: From term rewriting to programming languages, Proc. 17th RTA, Vol. 4098 of LNCS, pp. 297–312.

Hirokawa, N. and Moser, G. (2008). Automated complexity analysis based on the dependency pair method, Proc. 4th IJCAR. To appear.

Hofbauer, D. (1992). Termination proofs by multiset path orderings imply primitive recursive derivation lengths, TCS 105(1): 129–140.

Koprowski, A. (2006). TPA: Termination proved automatically, Proc. 17th RTA, pp. 257–266.

Koprowski, A. and Middeldorp, A. (2007). Predictive labeling with dependency pairs using SAT, Proc. 21st CADE, Vol. 4603 of LNCS, pp. 410–425.

Kusakari, K., Nakamura, M. and Toyama, Y. (1999). Argument filtering transformation, Proc. 1st PPDP, Vol. 1702 of LNCS, pp. 47–61.

Lescanne, P. (1995). Termination of rewrite systems by elementary interpretations, Formal Aspects of Computing 7(1): 77–90.

Moser, G. and Schnabl, A. (2008). Proving quadratic derivational complexities using context dependent interpretations, Proc. 19th RTA. To appear.

Plaisted, D. A. and Greenbaum, S. (1986). A structure-preserving clause form translation, J. Symb. Comput. 2(3): 293–304.

Rosendahl, M. (1989). Automatic complexity analysis, Proc. 4th FPCA, pp. 144–156.

Schneider-Kamp, P., Thiemann, R., Annov, E., Codish, M. and Giesl, J. (2007). Proving termination using recursive path orders and SAT solving, Proc. 6th FroCoS, number 4720 in LNCS, pp. 267–282.

TeReSe (2003). Term Rewriting Systems, Vol. 55 of CTTCS, Cambridge University Press.

Tseitin, G. (1968). On the complexity of derivation in propositional calculus, SCML, Part 2 pp. 115–125.

Zantema, H. (1995). Termination of term rewriting by semantic labelling, FI 24(1/2): 89–105.

[6] Available at http://cl-informatik.uibk.ac.at/~georg/list.publications


SIMULATING SPOKEN DIALOGUE 

WITH A FOCUS ON REALISTIC TURN-TAKING 

Timo Baumann 

University of Potsdam 

Abstract. We present a system for testing turn-taking strategies in a simulation environment, 

in which artificial dialogue participants exchange audio streams in real time – unlike earlier 

turn-taking simulations, which interchanged unambiguous symbolic messages. Dialogue 

participants autonomously determine their turn-taking behaviour, based on their analysis of 

the incoming audio. We use machine-learning methods to classify the continuous audio 

signal into symbolic turn-taking states. We experiment with various rule sets and show how 

simple, local management rules can create realistic behavioural patterns. 

1 Introduction 


Turn-taking management, i. e. deciding who may speak when in a dialogue, is an important 

subtask of interaction management. The classical model of turn-taking (Sacks, 

Schegloff and Jefferson, 1974) describes turn-taking as locally managed (depending only 

on a local context) and predictive (upcoming turn endings are signalled in advance by the 

interplay of syntax, semantics and prosody). Current speech dialogue systems (SDSes), on 

the other hand, use reactive turn-taking schemes, with the turn being taken after a silence 

of fixed length or of contextually determined length (Ferrer, Shriberg and Stolcke, 2002). 

This limits the interactivity of SDSes, as turns have to be separated by intervening silence. 

The prediction of turn endings (EoT prediction) has been investigated by a number 

of authors. Schlangen (2006) trains classifiers to predict the end of turn (EoT) but uses 

features that are not calculated strictly incrementally. Turn-management has also been 

studied before, but typically in simulation systems that interchange symbolic messages 

and work in a centrally managed environment (Padilha, 2006). In the present paper, we 

combine the efforts for EoT-prediction and turn-taking simulation. We propose an incremental 

classification of speech into speech states that control the system’s turn-taking. We 

first evaluate the classification itself and then in combination with different turn-management 

strategies in a dialogue simulation environment. 

Dialogue simulation itself has a long standing tradition in the development of SDSes, 

but the main focus seems to be on the improvement of dialogue strategies (Schatzmann, 

Weilhammer, Stuttle and Young, 2006) and audio is usually just used to trigger realistic 

ASR errors (López-Cózar, De la Torre, Segura and Rubio, 2003), which contrasts with 

the focus of the present paper: Our goal is to show how realistic turn-taking behaviour 

can be simulated using only local context for the classification of speech into classes relevant 

to turn-taking management combined with simple, locally managed rules. Dialogue 

strategies in general are not locally managed and thus learning dialogue strategies seems 

to require the more complex reinforcement learning instead of simple classifier training 

which we use. 



Figure 1: A human user conversing with an artificial DP in our interaction environment 

(structured as in section 2). A dialogue recorder wiretaps their conversation. 

We do not (and do not need to) take into account the content of the dialogues and 

in fact we limit our speech analysis to simple prosodic features for the EoT prediction. 

Thus, for this work, we abstract away from all questions of content management and let 

our dialogue participants speak randomly selected pre-recorded utterances – though with 

proper turn-taking. 

The remainder of the paper is structured as follows: Section 2 describes the system 

architecture and Section 3 the corpora we use. Section 4 evaluates the speech state classification 

and Section 5 demonstrates and evaluates some simple turn-management strategies. 

We close with conclusions and ideas for further work. 

2 Architecture of the Interaction Environment 

Our architecture defines an interaction environment in which dialogue participants (DPs) 

communicate with each other. Interaction is purely non-symbolic, using asynchronous 

audio streams over RTP (Schulzrinne, Casner, Frederick and Jacobson, 2003). There is no 

common clock, or other synchronisation required between DPs. The architecture provides 

a headset tool for human DPs, and monitoring tools to listen to ongoing dialogues and to 

record them to disk. 

Figure 1 shows two dialogue participants – one human, one artificial – conversing in 

the environment described above. The artificial DP on the right of figure 1 is structured 

as described below. 

Artificial DPs are realized as modular and extensible collections of event-driven software 

agents in the open agent architecture, OAA (Martin, Cheyer and Moran, 1999). 

In the OAA each software agent advertises its own abilities to solve problems (such as 

generating utterances) and may itself request other agents to solve sub-problems (e. g. 

sending data over RTP). For audio processing inside the DP we rely on the Sphinx-4 

framework (Walker, Lamere, Kwok, Raj, Singh, Gouvea, Wolf and Woelfel, 2004) which 

we extended for our audio-processing pipeline. In the current system, we do not yet use 

Sphinx’ abilities as a speech recognizer and most other modules that would be needed for 

a real dialogue system are missing. These are obvious enhancements for later versions. 


2.1 Speech Generation 

Speech generation consists of a synthesizer and a dispatcher. The synthesizer currently 

selects from a corpus of pre-recorded utterances and will be extended to include text-to-speech. 

To make turn-taking management harder and the system more realistic, a fixed 

delay of 100 ms between the signal to the module and the onset of the recorded utterance is 

introduced at this point.[1] This delay is realized by sending 100 ms of recorded silence 

before the utterance; utterances are also followed by 100 ms of recorded silence. (If 

DPs were to send digital zeros directly before and after their utterances, speech state 

classification, as described below, would become trivial.) 

The speech dispatcher continuously sends an RTP stream in packets of 10 ms, either 

audio from a file or sine waves if so instructed by the synthesizer, or silence (digital zero). 

It can also be ordered to interrupt the audio and to revert to silence. The dispatcher also 

publishes its current speech state which may be one of sil, start of turn (SoT), talk, or end 

of turn (EoT) to the DP it is part of. 

2.2 Speech Analysis 


Speech analysis focuses solely on local prosodic analysis for the classification of the 

listening state (which should reflect the interlocutor’s speech state, as described above). 

In order to be effective, classification must happen with as short a lag as possible. While 

short lags would allow for reactive behaviour, we aim to predict when the interlocutor’s 

end of turn is approaching in order to achieve smooth turn changes and counter-balance 

the 100 ms lag before a response can be uttered by the speech generation. 

We use machine learning to classify each received frame (10 ms) of audio as silence (sil), 

ongoing talk (talk) or end of turn (EoT). Classification is based exclusively on signal 

power, pitch and derived features. Our pitch extraction is modelled after the first three 

steps of the YIN algorithm (de Cheveigné and Kawahara, 2002). As no smoothing or dynamic 

programming is applied to the pitch extraction, results are computed incrementally 

in real-time and become available instantaneously. The algorithm runs at several times 

real-time on average hardware. On the corpora described below, the gross error rate is 

1.6 % compared to the well known ESPS algorithm (Talkin, 1995). 
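As we read de Cheveigné and Kawahara (2002), the first three steps of YIN lead from the autocorrelation view via the difference function to the cumulative mean normalized difference function. The following Python sketch illustrates the last two of these computations. It is not the implementation used in the system: the function names, the default 60 to 400 Hz search range and the choice of the global minimum as the period estimate are assumptions, since the text does not say how a period is picked once the later YIN steps (thresholding, interpolation) are left out.

```python
def difference_function(frame, max_lag):
    """d(tau) = sum over j of (x[j] - x[j + tau])^2, using a fixed-size window."""
    window = len(frame) - max_lag
    return [
        sum((frame[j] - frame[j + tau]) ** 2 for j in range(window))
        for tau in range(max_lag + 1)
    ]


def cumulative_mean_normalized(diff):
    """d'(0) = 1 and d'(tau) = d(tau) / (mean of d(1..tau)) for tau > 0."""
    normalized = [1.0]
    running_sum = 0.0
    for tau in range(1, len(diff)):
        running_sum += diff[tau]
        if running_sum > 0:
            normalized.append(diff[tau] * tau / running_sum)
        else:
            normalized.append(1.0)  # all-zero input: no period information
    return normalized


def estimate_pitch(frame, sample_rate, min_hz=60.0, max_hz=400.0):
    """Return a pitch estimate in Hz for one analysis window.

    Assumption: the lag with the smallest normalized difference inside the
    search range is taken as the period; no threshold or smoothing is used.
    """
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    if len(frame) <= max_lag:
        raise ValueError("analysis window must be longer than the longest lag")
    cmnd = cumulative_mean_normalized(difference_function(frame, max_lag))
    best_lag = min(range(min_lag, max_lag + 1), key=lambda tau: cmnd[tau])
    return sample_rate / best_lag
```

Because each window is processed on its own, an estimate is available as soon as the samples have arrived. Note that the analysis window has to be longer than the longest lag searched, so for low pitch it spans more than one 10 ms frame.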

In order to track changes over time, we derive features by windowing over past values 

of pitch and power with sizes ranging from 20 to 500 ms. While the features calculated 

on smaller windows help to smooth and to remove outliers due to failures of the pitch 

extraction, the larger windows are expected to capture long-term trends. We calculate the 

arithmetic mean and the range of the values, the mean difference between values within 

the window and the relative position of the minimum and maximum. We also perform 

a linear regression and use its slope, the MSE of the regression and the error of the 

regression for the last value in the window. 
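The window features listed above can be sketched as follows. This is only an illustration under stated assumptions: the function and key names are hypothetical, "mean difference" is read as the mean of successive differences, relative positions run from 0 (oldest frame) to 1 (newest frame), and ties are resolved in favour of the earliest frame.

```python
def latest_window(history, window_ms, frame_ms=10):
    """Return the most recent values covering window_ms of audio."""
    n_frames = window_ms // frame_ms
    return history[-n_frames:]


def window_features(values):
    """Compute the derived features for one window of pitch or power values."""
    n = len(values)
    if n < 2:
        raise ValueError("a window needs at least two values")
    mean = sum(values) / n
    lowest, highest = min(values), max(values)
    # Mean of successive differences (assumed reading of "mean difference").
    mean_diff = sum(values[i + 1] - values[i] for i in range(n - 1)) / (n - 1)
    # Least-squares line through the points (frame index, value).
    x_mean = (n - 1) / 2
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - mean) for i, v in enumerate(values))
    slope = sxy / sxx
    intercept = mean - slope * x_mean
    residuals = [v - (intercept + slope * i) for i, v in enumerate(values)]
    return {
        "mean": mean,
        "range": highest - lowest,
        "mean_diff": mean_diff,
        "rel_pos_min": values.index(lowest) / (n - 1),
        "rel_pos_max": values.index(highest) / (n - 1),
        "slope": slope,
        "mse": sum(r * r for r in residuals) / n,
        "last_error": residuals[-1],
    }
```

With 10 ms frames, windows of 20 to 500 ms correspond to between 2 and 50 values, so every feature can be recomputed for each incoming frame.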

2.3 Turn-Taking Management 

The turn-taking management agent determines whether to start or stop emitting utterances 

on the basis of the states of the generation and analysis modules. An important aspect in 

turn-taking management is robustness. To be robust, the turn-taking strategy must not 

depend on its interlocutor acting and reacting in certain ways. Naturally, “good” dialogue 

will only evolve from friendly dialogue partners, but the turn-management strategy must 

prevent deadlocks due to the interlocutor’s behaviour. 

[1] In a dialogue system, NLG and TTS would require processing time; for humans there is a delay between 

starting to plan an utterance and the start of the articulation (Levinson, 1983). 

Upon the reception of dialogue state change notifications from the analysis module, the 

agent decides about emitting messages to the generation module, ordering it to talk or to 

hush, according to a defined turn-taking strategy. Messages are only emitted with certain 

probabilities. The probabilities to start or hush were determined empirically to lead to 

natural performance. If no action is taken, the agent sleeps for a short while (currently, 

50 ms) being awakened if another message is received (for example EoT changing to 

sil). Thus, exact timings are non-deterministic and randomly differ between agents. The 

probability to start an utterance is set to 0.1, and to hush during an utterance to 0.3. 
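A minimal sketch of this probabilistic gating, assuming that a proposed action is re-evaluated after every 50 ms sleep and that no other message wakes the agent in between. The probabilities and the sleep time are taken from the text; the function names and the string encoding of actions are hypothetical.

```python
import random

START_PROBABILITY = 0.1   # probability to start an utterance (from the text)
HUSH_PROBABILITY = 0.3    # probability to hush during an utterance (from the text)
POLL_INTERVAL_MS = 50     # sleep time when no action is taken (from the text)


def gate(proposed_action, rng):
    """Pass on 'talk' or 'hush' with its probability, otherwise return None."""
    if proposed_action == "talk":
        threshold = START_PROBABILITY
    elif proposed_action == "hush":
        threshold = HUSH_PROBABILITY
    else:
        return None
    return proposed_action if rng.random() < threshold else None


def delay_until_emitted(proposed_action, rng, max_polls=10_000):
    """Milliseconds spent sleeping before the action is finally emitted.

    Assumes the strategy keeps proposing the same action on every poll.
    Returns None if the action is never emitted within max_polls polls.
    """
    for poll in range(max_polls):
        if gate(proposed_action, rng) is not None:
            return poll * POLL_INTERVAL_MS
    return None
```

Passing a seeded `random.Random` instance makes a simulated run repeatable, while two agents with different seeds will still differ in their exact timings.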

3 Corpora 


We perform our experiments with two different corpora, one of simple pseudo-speech, one 

of read speech. Each corpus contains material from two different speakers (one female, 

one male) for which we train separate speech analyzers, in order to be able to simulate 

dialogues with one male and one female each. 

For pseudo-speech, our speakers repeatedly uttered the syllable /ba/ instead of the syllables 

actually occurring in a script of 50 utterances (questions, informative sentences, 

confirmations, etc). By always uttering the same syllable, we remove segment-inherent 

influences on power and pitch variation, while at the same time retaining sentence intonation. 

For read speech we relied on the two major speakers of the Kiel Corpus of 

Read Speech, KCoRS (IPDS, 1994). That corpus contains some 600 utterances for each 

speaker. 

The two corpora differ in size and complexity. Our controlled pseudo-speech poses 

hardly any problem for pitch-extraction and does not contain voiceless speech, silence 

during the occlusion of voiceless plosives or other potentially “difficult” audio. The 

KCoRS on the other hand contains far more training material. Also, as the pseudo-speech 

does not convey any semantic meaning, subjects in a listening test for the evaluation of 

generated turn-taking patterns would not be distracted by nonsense dialogue. 

The performance of a speech state classifier on both of our corpora is likely to be better 

than on a corpus of real dialogue speech as it is more homogeneous (especially compared to 

speaker-independent speech state classification). Thus, our results should be considered 

an upper bound on realistic results. 

The start and end of each utterance were hand-annotated and each 10 ms of audio was 

assigned to one of the listening states as described above with EoT being assigned to 

frames in the vicinity of ± 50 ms of the utterance end. For the turn-taking management 

experiments, we crop the audio files so that each utterance is preceded and followed by 

100 ms of silence. 
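The frame labelling can be sketched as below. The sketch assumes that a frame is judged by its centre and that the EoT label takes precedence over talk and sil near the utterance end; neither detail is stated in the text, and the function name is hypothetical.

```python
def label_frames(total_ms, utterances, frame_ms=10, eot_margin_ms=50):
    """Label each frame of one speaker's audio as 'sil', 'talk' or 'EoT'.

    utterances: list of (start_ms, end_ms) pairs from the hand annotation.
    """
    labels = []
    for frame_start in range(0, total_ms, frame_ms):
        centre = frame_start + frame_ms / 2
        label = "sil"
        for utt_start, utt_end in utterances:
            if abs(centre - utt_end) <= eot_margin_ms:
                label = "EoT"  # within 50 ms before or after the utterance end
                break
            if utt_start <= centre < utt_end:
                label = "talk"
        labels.append(label)
    return labels
```

For an utterance from 100 ms to 300 ms in 400 ms of audio, this yields ten EoT frames around the utterance end, which shows how rare the EoT class is compared to sil and talk.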

4 Speech Analysis Evaluation 

We used the machine learning toolkit Weka (Witten and Frank, 2000) to train various
speaker-dependent classifiers. For the evaluation, 80 % of each corpus was used as
training set and 20 % as test set. Tables 1 and 2 show the results of the OneR, J48 and
JRip algorithms for each corpus. OneR finds the most predictive feature to be the dynamic
range of frame energy over the last 100 or 200 ms. JRip outperforms J48, but
has far worse training complexity. Separation of speech and silence (which here is the
recorded silence in the corpus, not digital zero) is done with high accuracy. Recognition
of EoT regions is of lower quality, but still surpasses the results in Schlangen (2006).[2]

                                  female speaker                           male speaker
classifier               Acc. (%)  Fsil  Ftalk  FEoT  FAR (%)    Acc. (%)  Fsil  Ftalk  FEoT  FAR (%)
OneR                       96.1    0.98  0.96   0.00   21.4        92.8    0.96  0.93   0.13   65.5
J48                        94.8    0.98  0.95   0.50   68.9        96.3    0.97  0.97   0.71   64.3
JRip                       95.3    0.98  0.95   0.55   68.3        96.2    0.97  0.97   0.80   59.2
Stateful JRip              95.9    0.98  0.95   0.59   48.4        95.5    0.97  0.96   0.72   50.0
Stateful JRip, shifted     96.2    0.98  0.96   0.59   48.4        96.4    0.97  0.97   0.80   47.5

Table 1: Accuracy, per-class f-measures and false alarm rate for various speech state
classifiers for the pseudo-speech corpus.

                                  female speaker                           male speaker
classifier               Acc. (%)  Fsil  Ftalk  FEoT  FAR (%)    Acc. (%)  Fsil  Ftalk  FEoT  FAR (%)
OneR                       94.5    0.97  0.96   0.03   65.4        93.7    0.92  0.96   0.10   80.7
J48                        97.3    0.98  0.98   0.61   71.1        96.1    0.96  0.98   0.42   84.1
JRip                       96.6    0.97  0.98   0.73   61.1        95.9    0.97  0.96   0.61   65.7
Stateful JRip              96.4    0.96  0.98   0.70   31.9        94.9    0.97  0.96   0.58   50.0
Stateful JRip, shifted     96.9    0.97  0.98   0.74   31.6        95.5    0.97  0.96   0.64   48.9

Table 2: Accuracy, per-class f-measures and false alarm rate for various speech state
classifiers for the KCoRS speakers.

While the data and their states are sequential in nature, the classifiers as described 

above evaluate each frame independently. At the same time, recognizing the other speaker’s 

start or end of turn a little too late or too early hardly matters, while frequently 

changing the listening state may lead to bad dialogue behaviour. This is measured in the 

false alarm rate (FAR), defined as the proportion of over-generated state changes. 

The analysis of classification output showed that wrong classifications would often 

last for only one frame. We implemented a stateful classifier that only changes state 

after two consecutive classifications of the underlying classifier. This strongly decreases 

FAR but introduces systematic errors in the classification (every actual state change will 

be registered one frame too late) and reduces precision/recall measures. When this is 

accounted for in the evaluation, the stateful classifier outperforms the base classifier also 

in these measures. 
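The stateful wrapper can be sketched as a small state machine around the frame-wise output of any underlying classifier. The class and parameter names are hypothetical, and the helper that counts state changes is only meant to make the effect on over-generated changes visible; it is not the FAR computation used in the evaluation.

```python
class StatefulClassifier:
    """Change state only after the same new label was seen on consecutive frames."""

    def __init__(self, initial_state="sil", required_repeats=2):
        self.state = initial_state
        self.required_repeats = required_repeats
        self._candidate = None
        self._count = 0

    def update(self, frame_label):
        """Feed one frame-wise label, return the (possibly unchanged) state."""
        if frame_label == self.state:
            # The underlying classifier agrees again: drop any pending change.
            self._candidate, self._count = None, 0
        else:
            if frame_label == self._candidate:
                self._count += 1
            else:
                self._candidate, self._count = frame_label, 1
            if self._count >= self.required_repeats:
                self.state = frame_label
                self._candidate, self._count = None, 0
        return self.state


def count_state_changes(labels):
    """Number of positions where a label differs from its predecessor."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)
```

Single-frame outliers never reach the output, while every genuine change is reported one frame late, which is the systematic error described above.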

The results show that the complexity of the KCoRS is counterbalanced by its 10 times 

larger size. This may indicate that speech state classification for real dialogue speech 

would be feasible with a sufficiently large corpus and speaker-normalized prosodic features. 

5 Simple Strategies for Turn-Taking 

We outline some simple strategies to turn-control. Their purpose is to exemplify how 

very restricted locally managed behaviour with some simple rules can already lead to 

acceptable turn-taking behaviour as postulated by the local management model of Sacks 

et al. (1974), without the need for a dialogue history, or complex temporal reasoning. 

[2] Results cannot be easily compared, as Schlangen (2006) recognizes turn-final words using prosodic 

and syntactic features on a more complex corpus, reaching an f-measure of 0.36. 



             strategy 1            strategy 2            strategy 3
measure      share   mean dur.     share   mean dur.     share   mean dur.
gap          14.0 %    351 ms      18.7 %    358 ms      17.4 %    362 ms
speaker a    31.4 %   1259 ms      35.9 %   1009 ms      36.5 %   1079 ms
speaker b    39.3 %   1415 ms      39.8 %   1165 ms      40.8 %   1225 ms
clash        15.4 %   1184 ms       5.6 %    317 ms       5.3 %    278 ms

Table 3: Distribution and mean duration of dialogue states for three turn-taking strategies
with pseudo-speech.

             strategy 1            strategy 2            strategy 3
measure      share   mean dur.     share   mean dur.     share   mean dur.
gap          14.1 %    528 ms      20.7 %    477 ms      18.9 %    454 ms
speaker a    36.2 %   1764 ms      40.5 %   1456 ms      34.7 %   1232 ms
speaker b    26.2 %   1437 ms      24.8 %   1307 ms      42.0 %   1540 ms
clash        23.5 %   1915 ms       4.0 %    253 ms       4.4 %    243 ms

Table 4: Distribution and mean duration of dialogue states for three turn-taking strategies
with KCoRS speakers.

5.1 Measuring Turn-Management Success 

The dialogue state can be described by the current speech state of each of the dialogue 

participants, with each speech state being either talk or sil. For two-party dialogue, this 

results in four states: two “good” states where either one of the dialogue participants is 

talking and two “bad” states: Clashes when both participants talk simultaneously, and 

gaps with neither of them talking. 

According to Sacks et al. (1974), speakers try to optimize their behaviour so as to 

minimize the occurrence of both clashes and gaps. That is why we choose clashes and 

gaps as basic measures for turn-taking success. Slight gaps and clashes occur all the time, 

but they are not always perceptually relevant. We thus decided to calculate the proportion 

of clashes and gaps over the course of the dialogue as well as their mean duration. 
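The two measures can be sketched as follows, assuming that the recorded speech states of both participants have been resampled onto a common grid of 10 ms frames (an assumption made for illustration; the participants themselves share no clock). Function names are hypothetical.

```python
from itertools import groupby


def dialogue_states(talking_a, talking_b):
    """Combine two per-frame talk flags into the four dialogue states."""
    states = []
    for a, b in zip(talking_a, talking_b):
        if a and b:
            states.append("clash")
        elif a:
            states.append("speaker a")
        elif b:
            states.append("speaker b")
        else:
            states.append("gap")
    return states


def summarize(states, frame_ms=10):
    """Return {state: (share in percent, mean duration in ms)}."""
    frames = {}
    runs = {}
    for state, group in groupby(states):
        length = len(list(group))
        frames[state] = frames.get(state, 0) + length  # total frames in this state
        runs[state] = runs.get(state, 0) + 1           # number of separate stretches
    return {
        state: (100.0 * frames[state] / len(states),
                frame_ms * frames[state] / runs[state])
        for state in frames
    }
```

The share corresponds to the proportion of dialogue time spent in a state, and the mean duration is averaged over uninterrupted stretches of that state.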

For evaluation purposes, we set up two artificial dialogue participants and let them talk 

with each other for about 10 minutes for each of the following strategies. We recorded 

the internal states and calculated the described measures. The audio itself was recorded 

but not further analyzed in the evaluation. The results of the strategies described below 

are shown in Tables 3 and 4. 

5.2 Strategy 1: Talk When Nobody Talks 

Rule: Start an utterance when neither you nor your interlocutor is talking. (Implicitly: 

Continue talking until your utterance is finished.) 

The performance with this strategy strongly depends on the round-trip time from one 

agent’s decision to take the turn until the other agent notices the turn being taken. The 

shorter the lags introduced by the talking agent’s internal communication, audio transmission, 

prosodic processing and classification, and the listening agent’s internal communication, 

the more likely it is for a dialogue participant to notice its interlocutor talking (and 

then listen until the interlocutor has finished) before it has started talking itself. For longer lags, 

the DP will decide to talk even though its interlocutor may already have started talking. 

As can be seen in Tables 3 and 4, this strategy leads to a large number of clashes. 

5.3 Strategy 2: Hush When Both Talk 

Rule as above, plus: Stop your utterance when both you and your interlocutor are talking. 



The rule proves effective in reducing simultaneous talk as clashes are reduced by 65 % 

(pseudo-speech) and over 80 % (KCoRS) respectively. At the same time, this strategy leads 

to the introduction of utterance truncations, where an utterance is stopped prematurely. 

(Actually, the majority of utterances (71 % for pseudo-speech) were truncated, but many of 

these truncations occur in the silent phases before or after the actual talk and do not have 

any deteriorating effect on the perceived turn-taking performance.) Truncations could be 

reduced with a higher probability to hush during SoT. 

5.4 Strategy 3: Start Talking Early 

The previous strategies only react after turns have started or ended. In order to initiate 

actions early and anticipate turn changes, this strategy exploits the EoT class of the 

speech analysis (which was ignored before) in the first rule: Start an utterance, when you 

are not talking and your interlocutor is ending their turn or has already finished. 

By starting utterance planning before the interlocutor’s preceding utterance is finished, 

the dialogue participant can hide some of the lag introduced by its speech generation 

module. The duration of both gaps and clashes is reduced compared to strategy 2, for 

gaps because turns will be taken over more quickly and for clashes due to the original 

talk-owner noticing the turn-change earlier, avoiding the start of a new utterance. 

The durations for gaps and clashes with this strategy are similar to those reported 

for parts of the Verbmobil corpus by Weilhammer and Rabold (2003), with 363 ms and 

331 ms respectively.[3] Performance could be further improved by using a lower probability 

to hush during EoT. 
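The three rule sets can be written as small decision functions that map the own speech state and the classified listening state to a proposed action. This is a sketch under assumptions the text leaves open: the own states SoT and EoT are treated as talking, a heard EoT triggers neither rule in strategies 1 and 2, and the hush rule of strategy 2 is kept in strategy 3. Function names and the string encoding are hypothetical, and the probabilities with which proposed actions are actually emitted are left out.

```python
def is_talking(own_state):
    """Own dispatcher states other than 'sil' count as talking (assumption)."""
    return own_state != "sil"


def strategy_1(own_state, heard_state):
    """Talk when nobody talks."""
    if not is_talking(own_state) and heard_state == "sil":
        return "talk"
    return None


def strategy_2(own_state, heard_state):
    """As strategy 1, plus: hush when both talk."""
    if is_talking(own_state) and heard_state == "talk":
        return "hush"
    return strategy_1(own_state, heard_state)


def strategy_3(own_state, heard_state):
    """Start talking early: a heard EoT already licenses taking the turn."""
    if is_talking(own_state) and heard_state == "talk":
        return "hush"
    if not is_talking(own_state) and heard_state in ("sil", "EoT"):
        return "talk"
    return None
```

None of the functions looks at anything but the two current states, which is what makes the rules locally managed: no dialogue history and no temporal reasoning are involved.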

6 Conclusion and Future Directions 

We have presented a flexible, modular architecture for dialogue strategy evaluation where 

arbitrary pairings of human users and artificial dialogue participants can be created. We 

have discussed a case-study in this environment, where pairs of artificial DPs converse in 

real time via audio. Each DP autonomously decides on their turn-taking behaviour (start 

or stop talking) based on a local analysis of the audio signal and using machine-learned 

classifiers. We tested these with corpora of simplified speech and achieved good recognition 

performance. Three implemented turn-management rule sets, all of them locally managed 

in the sense of Sacks et al. (1974), i. e. not requiring dialogue memory, were 

shown to create increasingly realistic behavioural patterns. 

We plan to use the components developed for this system in an interactive speech 

dialogue system. For the speech state classification, we will need normalized prosodic 

features that allow for speaker independent speech state classification. At the same time, 

ASR will make features relative to syllable information (stress patterns, speech rate, ...) 

accessible, as well as word hypotheses. We may also want to look into classifier confidence 

scores, only emitting speech state changes if the classifier is reasonably certain. 

In real dialogue, the problem of hesitations arises. Our classification will have to be 

extended to distinguish hesitational interruptions from normal EoT. We would also like to 

identify positions in a turn where a back-channelling utterance might be appropriate. 

[3] Note that their numbers are for turn changes only, while we do not distinguish between gaps at turn 

changes and at turn continuations. 


Acknowledgements 

I would like to thank my supervisor David Schlangen for his constant guidance and support 

and the anonymous reviewers for their insightful comments and suggestions. 

References 


de Cheveigné, A. and Kawahara, H. (2002). YIN, a fundamental frequency estimator for 

speech and music, The Journal of the Acoustical Society of America 111(4): 1917– 

1930. 

Ferrer, L., Shriberg, E. and Stolcke, A. (2002). Is the speaker done yet? Faster and more 

accurate end-of-utterance detection using prosody, Proceedings of the International 

Conference on Spoken Language Processing (ICSLP2002), Denver, USA. 

IPDS (1994). The Kiel Corpus of Read Speech, CD-ROM. 

Levinson, S. C. (1983). Pragmatics, Cambridge Textbooks in Linguistics, Cambridge 

University Press. 

López-Cózar, R., De la Torre, A., Segura, J. and Rubio, A. (2003). Assessment of dialogue 

systems by means of a new simulation technique, Speech Communication 

40(3): 387–407. 

Martin, D., Cheyer, A. and Moran, D. (1999). The Open Agent Architecture: a framework 

for building distributed software systems, Applied Artificial Intelligence 13(1/2): 91– 

128. 

URL: citeseer.ist.psu.edu/martin99open.html 

Padilha, E. G. (2006). Modelling Turn-taking in a Simulation of Small Group Discussion, 

PhD thesis, School of Informatics, University of Edinburgh, Edinburgh, UK. 

Sacks, H., Schegloff, E. A. and Jefferson, G. A. (1974). A simplest systematics for the 

organization of turn-taking for conversation, Language 50(4): 696–735. 

Schatzmann, J., Weilhammer, K., Stuttle, M. and Young, S. (2006). A survey of statistical 

user simulation techniques for reinforcement-learning of dialogue management 

strategies, The Knowledge Engineering Review 21(02): 97–126. 

Schlangen, D. (2006). From reaction to prediction: Experiments with computational 

models of turn-taking, Interspeech 2006, Pittsburgh, USA. 

URL: http://www.ling.uni-potsdam.de/ das/papers/schlangen intersp2006.pdf 

Schulzrinne, H., Casner, S., Frederick, R. and Jacobson, V. (2003). RTP: A Transport 

Protocol for Real-Time Applications, RFC 3550 (Standard). 

URL: http://www.ietf.org/rfc/rfc3550.txt 

Talkin, D. (1995). A robust algorithm for pitch tracking (RAPT), in W. B. Kleijn and K. K. 

Paliwal (eds), Speech Coding and Synthesis, Elsevier, chapter 14, pp. 495–518. 



Walker, W., Lamere, P., Kwok, P., Raj, B., Singh, R., Gouvea, E., Wolf, P. and Woelfel, 

J. (2004). Sphinx-4: A flexible open source framework for speech recognition, 

Technical Report SMLI TR2004-0811, Sun Microsystems Inc. 

Weilhammer, K. and Rabold, S. (2003). Durational aspects in turn taking, Proc. of the 

ICPhS, Barcelona, Spain. 

URL: http://www.phonetik.uni-muenchen.de/Publications/WeilhammerRabold-03-ICPhS.pdf 

Witten, I. H. and Frank, E. (2000). Data Mining. Practical Machine Learning Tools and 

Techniques with Java Implementations., Morgan Kaufmann. 



EPISTEMIC MODALS IN DIALOGUE 

Chris Brumwell 

University of Amsterdam 

Abstract. I present an update semantics for epistemic modals in which a formula of the 

form might φ acts on a context Γ by introducing a salient possibility constructed from φ 

into Γ. This theory is meant to account for the intuitions and data that suggest that assertions 

of epistemic modals do not provide information to the participants in a conversation, but 

instead suggest certain possibilities for their consideration. Among these data is the important 

empirical fact that epistemic modals can answer questions. To account for this, I also define a 

semantics for questions and show that in this system epistemic modals can count as answers 

to questions. 

1 Introduction and Motivations 

In the classic picture of communication given in Stalnaker (1978), a conversation is a 

process of distinguishing between various possibilities, or ways the world might be. It is 

clear, however, that in a conversation not all possibilities are given equal attention by the 

interlocutors. People talking about whether or not John murdered Jack are not trying to 

distinguish a possibility in which chocolate makes cats sick from a possibility in which 

chocolate doesn’t make cats sick. In this paper, I call the possibilities the interlocutors are 

most interested in salient possibilities. 

Asking a question is the canonical way of introducing salient possibilities into a discourse: 

questions introduce possibilities corresponding to their different answers. But 

other constructions introduce salient possibilities as well. The disjunction Jones works 

at a bank or a hospital introduces the salient possibilities that Jones works at a bank and 

Jones works at a hospital. Constructions containing indefinite NPs such as somebody 

stole the jewels can introduce salient possibilities corresponding to various instantiations 

of somebody. Free choice commands such as Take any apple you like introduce salient 

possibilities corresponding to your various choices. Finally, a statement expressing epistemic 

modality such as John might be hiding upstairs introduces the salient possibility 

that John is hiding upstairs. 

Recent work by Groenendijk (2007) proposes an analysis of disjunction 

and existential quantification that captures their potential to introduce salient possibilities 

into a dialogue. In this paper, I formalize the notion of a salient possibility and use it 

to define a dynamic semantics for questions and epistemic modals. In the semantics, 

a question introduces salient possibilities corresponding to its possible answers, and an 

epistemic modal of the form might φ introduces a salient possibility constructed from φ 

and, following Veltman (1996), tests the common ground to see whether it is consistent 

with φ. 

Salient possibilities are almost perfectly suited for an analysis of epistemic modals. 

Unlike other kinds of assertions, an assertion of an epistemic modal does not contribute 

information to a conversation. Instead, its function is to call attention to certain possibilities 

that the conversational participants should, for some reason, find interesting. Thus, 



to analyze epistemic modals one must develop a framework in which assertions can significantly 

change a context without providing information. Since this paper’s framework 

postulates that epistemic modals affect the salient possibilities in a context rather than its 

information, the non-informative yet non-trivial effects of epistemic modals are properly 

represented. 

One advantage of this analysis is that it is able to account for the felicity of a modalized 

construction as an answer to a question. For example: 

(1) A: Where are my keys? 

B: They might be in the basement. 

(2) A: Are John and Bill coming to the party? 

B: They might. 

In dialogue (1), B doesn’t answer A’s question by saying where her keys are (because, 

if he’s acting felicitously, he doesn’t know where they are), but by suggesting a possibility 

for her to consider. Similarly in (2): B suggests that A should not overlook the possibility 

that Bill and John come to the party. If she really dislikes them, the very possibility that 

they attend may be reason enough for her to skip the party. 

Enemies of salient possibilities may think that a modal answer to a question really says 

nothing more than I don’t know, or Any answer is consistent with my knowledge. Against 

this, consider the following case: suppose A is frantically looking for her husband Joe, 

and comes across B, who has never met Joe and has never given him one thought. If 

she asks him Where is Joe? and he responds I don’t know, this is perfectly acceptable. 

However, if he responds He might be in Boston this is completely infelicitous: if A takes 

him seriously, she’s on her way to a wild goose chase. Intuitively, this is because she 

seriously takes into account his (inappropriate) suggestion to consider the possibility that 

Joe is in Boston. 

A classical partition theory of questions has difficulty accounting for (1) and (2). This 

is the case because in a partition theory an answer to a question has to give information. 

However, as dialogues (1) and (2) demonstrate, answers to questions do not need to be 

informative: it suffices that they suggest informative answers. Below, I give a more detailed 

and formal discussion of the problem partition theories of questions face from noninformative 

answers to questions, and discuss the similarities and differences between the 

theory presented in this paper and a partition theory. 

This analysis also accounts for a puzzling feature of the behavior of epistemic modals 

under attitude reports. Statements of the form x believes that might φ mean, in part, 

that the attitude holder x considers φ to be a salient possibility. For example, suppose 

that John has never given a thought to what the weather is like in Amsterdam. Then (3) 

certainly seems wrong: 

(3) John believes it might be raining in Amsterdam. 

Using this paper’s theory, one could account for (3) by analyzing a belief state as composed 

of both information and salient possibilities. The content of (3) then states, roughly, 

that it’s consistent with John’s beliefs that it’s raining in Amsterdam and that this is a salient 



possibility in his belief state. Several contemporary theories of epistemic modality do not 

appeal to any notion similar to that of a salient possibility, and hence have no clear way 

of accounting for (3) (e.g. DeRose (1991) and Egan et al. (2005); for similar reasons 

these theories also have problems accounting for the question and answer data presented 

above). Due to constraints on length we will not formalize this theory of the interaction 

between epistemic modals and attitude reports below. 

As mentioned above, the analysis is carried out in a dynamic semantic framework. In 

dynamic semantics, the meaning of a formula is not identified with its truth conditions, 

but rather with the way it changes a context. More specifically, our theory is a version of 

update semantics in the style of Veltman (1996), i.e. we give a definition of an information 

state and the meanings of formulas are functions from information states to information 

states. 

2 Questions and Salient Possibilities 

In this section, we define a 1st-order language with a question operator and an epistemic 

possibility operator. We then define the structures (information states) used to give a 

semantics for this language and define the notion of a salient possibility. Finally, we give 

the semantics for this language and define what it means for a formula to be an answer to 

a question. This definition will allow modal and non-modal formulas to answer questions. 

DEFINITION 1. We define the languages L1, L2, and L3 as follows: 

(i) If P is an n-place predicate and t1...tn are terms, then P(t1...tn) ∈ L1 

(ii) If φ,ψ ∈ L1, then φ ∧ ψ ∈ L1 and ¬ φ ∈ L1 

(iii) If φ ∈ L1, then ⋄φ ∈ L2 

(iv) If φ,ψ ∈ L2, then φ ∧ ψ ∈ L2 and ¬ φ ∈ L2 

(v) If φ ∈ L1, then ?φ ∈ L3 

(vi) If φ,ψ ∈ L3, then φ ∧ ψ ∈ L3 

The language L we discuss in this paper is defined as L = L1 ∪ L2 ∪ L3. As a notational

convention, we write atomic sentences (i.e. atomic formulas with no free variables) and 

Boolean combinations of atomic sentences as p, q, ¬q,p ∧ q, etc. 
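
As an informal aid (not part of the formal system), the membership conditions of Definition 1 can be sketched in Python. The tuple encoding of formulas and the function names in_L1, in_L2, in_L3 and in_L are assumptions made for illustration, and atomic formulas are collapsed to propositional letters.

```python
# Formulas are nested tuples (an illustrative encoding, not the paper's notation):
#   ("atom", "p"), ("not", f), ("and", f, g), ("might", f), ("ask", f)

def in_L1(f):
    """Clauses (i)-(ii): atoms, and conjunctions and negations of L1 formulas."""
    if f[0] == "atom":
        return True
    if f[0] == "not":
        return in_L1(f[1])
    if f[0] == "and":
        return in_L1(f[1]) and in_L1(f[2])
    return False

def in_L2(f):
    """Clauses (iii)-(iv): 'might' applied to L1, closed under 'and' and 'not'."""
    if f[0] == "might":
        return in_L1(f[1])
    if f[0] == "not":
        return in_L2(f[1])
    if f[0] == "and":
        return in_L2(f[1]) and in_L2(f[2])
    return False

def in_L3(f):
    """Clauses (v)-(vi): questions over L1, closed under 'and' only."""
    if f[0] == "ask":
        return in_L1(f[1])
    if f[0] == "and":
        return in_L3(f[1]) and in_L3(f[2])
    return False

def in_L(f):
    """L is the union of L1, L2 and L3."""
    return in_L1(f) or in_L2(f) or in_L3(f)

p, q = ("atom", "p"), ("atom", "q")
print(in_L(("and", ("might", p), ("might", q))))  # True: a conjunction of modals
print(in_L(("and", ("ask", p), ("ask", q))))      # True: a conjunction of questions
print(in_L(("and", p, ("might", q))))             # False: mixes L1 and L2
```

On the definition as stated, a conjunction that mixes a non-modal and a modal conjunct, such as p ∧ ⋄q, belongs to none of L1, L2 and L3.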

In a standard update semantics, information states are sets of indices, where an index 

assigns an individual from a domain D to each constant of the language and an n-ary 

relation to each n-place predicate. In this paper's framework, an information state is a set

of sets of indices A such that there is an I* ∈ A such that for all I_m ∈ A, I_m ⊆ I*. The intuition

behind this definition is that this maximal set I* represents the common ground at a point

in a conversation. Any subset of I* is a possible future state of the common ground, and 

hence could be a possibility that the discourse participants are interested in. However, 

recalling the introduction, not all such subsets are of interest to the discourse

participants. With that in mind we think of the subsets I_m of I* as salient possibilities. We

formally define information states below: 

DEFINITION 2. Let I be the set of all indices for the language L. We define an information 

state to be a set Γ = {P1,...,Pn,...} such that: 

(i) Pi ⊆ I for all i (ii) For some i, Pi = ∅

(iii) There is an i such that for all j, Pj ⊆ Pi. This maximal set Pi is called the common 

ground. 

We write CG (common ground) for the maximal set Pi defined in (iii), and write 

Γ = {CG,P1,...,Pn,...,∅}. In some cases, we refer to information states as contexts. 

Though every element of an information state is a salient possibility (except the empty 

set, which is present to simplify the definition of an answer to a question), the sets in an 

information state do not exhaust its salient possibilities. Rather, the salient possibilities 

in an information state are generated by closing it under union and intersection. Salient 

possibilities are defined this way because, intuitively, if P1 and P2 are salient possibilities 

in a context, then if they are not mutually exclusive it is also possible that they both obtain. 

Thus, their intersection should count as a salient possibility as well. Similar reasoning 

supports considering the union of salient possibilities to be a salient possibility. 

DEFINITION 3. Let Γ be an information state. Then 〈Γ〉, the set of salient possibilities in 

Γ, is defined as the smallest set such that:

(i) If P ∈ Γ, then P ∈ 〈Γ〉 (ii) If P1, P2 ∈ 〈Γ〉, then P1 ∪ P2 ∈ 〈Γ〉 

(iii) If P1, P2 ∈ 〈Γ〉, then P1 ∩ P2 ∈ 〈Γ〉. 
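
The closure in Definition 3 can be computed by brute force when the set of indices is finite. The following Python sketch is only an illustration: the function name salient and the use of integers as stand-ins for indices are assumptions, not part of the paper's formalism.

```python
from itertools import combinations

def salient(gamma):
    """Smallest superset of gamma closed under pairwise union and intersection."""
    closed = set(gamma)
    while True:
        # All unions and intersections of pairs that are not yet in the closure.
        new = {c for a, b in combinations(closed, 2) for c in (a | b, a & b)} - closed
        if not new:
            return closed
        closed |= new

# Hypothetical toy state: four indices 0..3, with {p} = {0, 1} and {q} = {0, 2}.
I = frozenset({0, 1, 2, 3})
P = frozenset({0, 1})
Q = frozenset({0, 2})
gamma = {I, P, Q, frozenset()}

closure = salient(gamma)
print(len(closure))      # 6: gamma plus the union and the intersection of P and Q
print(P & Q in closure)  # True
```

In this toy state the only possibilities that the closure adds are P ∪ Q and P ∩ Q, which matches the intuition that two compatible salient possibilities may both obtain.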

We need one more concept in order to define the semantics of wh-questions. On our 

analysis, wh-questions introduce salient possibilities corresponding to each of their possible 

answers into an information state. To represent the possible answers to a wh-question, 

we use the relations defined in definition 5 (Definition 4 is a standard account of satisfaction, 

which is necessary for articulating definition 5): 

DEFINITION 4. Let φ, ψ ∈ L1, let i be an index, and let g be a variable assignment function.

(i) If φ = Qt1...tn, then i |= φ [g] iff 〈[t1]^{i,g}, ..., [tn]^{i,g}〉 ∈ i(Q)

(ii) i |= φ ∧ ψ [g] iff i |= φ [g] and i |= ψ [g] (iii) i |= ¬φ [g] iff i ⊭ φ [g]

DEFINITION 5. Let φ ∈ L1, and let i and j be indices. We say that i ≡ j (mod φ) if for all

assignments g, i |= φ [g] iff j |= φ [g]

Given a formula φ, definition 5 defines the conditions under which two indices give

the same answer to the question ?φ. For a sentence φ of L1, i ≡ j (mod φ) will hold as 

long as i and j assign φ the same truth value. But for a formula of L1 with free variables, 

congruence modulo φ requires that the indices assign the same denotations (or just similar 

denotations if the formula contains both free variables and constants) to predicates that 

occur in φ. The following examples illustrate how this definition works.

EXAMPLE 1. 


(i) Let ?φ = ?Px (Who came to the party?) i ≡ j (mod φ) if i(P) = j(P), or informally, if the 

same people came to the party according to indices i and j. 

(ii) Let ?φ = ?Ibx (Who did Bill invite to the party?) i ≡ j (mod φ) if 

{d ∈ D | 〈i(b), d〉 ∈ i(I)} = {d ∈ D | 〈j(b), d〉 ∈ j(I)}.

(iii) Let ?φ = ?p (Did Alice help Bill?) i ≡ j (mod φ) if i |= p iff j |= p. 

In our update semantics, the effect of a formula on an information state will be defined 

in terms of the effects it has on certain elements of the information state. Thus, to state 

our update semantics for information states we require an update semantics for sets of 

indices as well. The update semantics for sets of indices is fairly simple, and is roughly the

same as that given in Veltman (1996). 

DEFINITION 6. Let φ ∈ L1 ∪ L2 be a sentence, and let P be a set of indices. We define the 

update of P with φ, written P[φ], as follows: 

(i) P[p] = {i ∈ P | i |= p} (ii) P[φ ∧ ψ] = P[φ][ψ] 

(iii) P[¬φ] = {i ∈ P | i ∉ P[φ]} (iv) P[⋄φ] = P if P[φ] ≠ ∅

(v) P[⋄φ] = ∅ if P[φ] = ∅ 
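
For a propositional fragment, Definition 6 translates almost line by line into code. The sketch below is illustrative only: it assumes that an index can be modelled as the set of atomic sentences it makes true, and the names ATOMS, I and update are hypothetical.

```python
from itertools import combinations

ATOMS = ("p", "q")
# An index is modelled as the set of atoms it makes true; I is the set of all indices.
I = frozenset(frozenset(c) for r in range(len(ATOMS) + 1)
              for c in combinations(ATOMS, r))

def update(P, phi):
    """Update a set of indices P with a sentence phi of L1 or L2."""
    kind = phi[0]
    if kind == "atom":                      # clause (i)
        return frozenset(i for i in P if phi[1] in i)
    if kind == "and":                       # clause (ii): sequential update
        return update(update(P, phi[1]), phi[2])
    if kind == "not":                       # clause (iii): P minus P[phi]
        return P - update(P, phi[1])
    if kind == "might":                     # clauses (iv) and (v): a consistency test
        return P if update(P, phi[1]) else frozenset()
    raise ValueError(f"not a sentence of L1 or L2: {phi!r}")

p = ("atom", "p")
print(update(I, ("might", p)) == I)                 # True: the test succeeds
print(update(update(I, ("not", p)), ("might", p)))  # frozenset(): the test fails
```

The two printed lines show the test-like behaviour of ⋄: it either returns the set unchanged or returns the empty set, and never removes only some of the indices.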

We now state our update semantics for information states. 

DEFINITION 7. Let Γ = {CG, P1,...,Pn, ∅} be an information state, and let φ ∈ L be a 

sentence. We define the update of Γ with φ as follows: 

(i) Γ[p] = { CG[p], P1[p],...,Pn[p], ∅} 

(ii) Γ[¬φ] = { CG[¬φ], P1[¬φ],...,Pn[¬φ], ∅} 

(iii) Γ[φ ∧ ψ] = Γ[φ][ψ] 

(iv) Γ[⋄φ] = {CG[⋄φ], P1[⋄φ],...,Pn[⋄φ], ∅} if there is a non-empty P ∈ Γ such that P[φ] = P

(v) Γ[⋄φ] = {CG[⋄φ], CG[φ], P1,...,Pn, ∅} if there is no non-empty P ∈ Γ such that P[φ] = P

(vi) Γ[?φ] = Γ ∪ {{ i | i ≡ j (mod φ)} | j ∈ CG}.

Clauses (i) - (vi) apply as long as CG[φ] ≠ ∅. In the degenerate case that CG[φ] = ∅, we set

Γ[φ] = {∅}, the absurd state. 
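
The update of information states can be prototyped for polar questions over atomic sentences. The sketch below rests on several assumptions that go beyond the text: indices are modelled as sets of true atoms; clause (iv) is taken to apply when some non-empty element P of Γ satisfies P[φ] = P, and clause (v) otherwise, adding CG[φ] and leaving the other elements unchanged; and a polar question ?φ contributes the two sets CG[φ] and CG[¬φ]. The names update, update_state, I and EMPTY are hypothetical.

```python
from itertools import combinations

ATOMS = ("p", "q")
# An index is modelled as the set of atoms it makes true; I is the set of all indices.
I = frozenset(frozenset(c) for r in range(len(ATOMS) + 1)
              for c in combinations(ATOMS, r))
EMPTY = frozenset()

def update(P, phi):
    """Definition 6: update a set of indices with a sentence of L1 or L2."""
    kind = phi[0]
    if kind == "atom":
        return frozenset(i for i in P if phi[1] in i)
    if kind == "and":
        return update(update(P, phi[1]), phi[2])
    if kind == "not":
        return P - update(P, phi[1])
    if kind == "might":
        return P if update(P, phi[1]) else EMPTY
    raise ValueError(f"not a sentence of L1 or L2: {phi!r}")

def update_state(gamma, phi):
    """Definition 7 for polar questions, on the reading stated in the lead-in."""
    kind = phi[0]
    if kind == "and":                                   # clause (iii)
        return update_state(update_state(gamma, phi[1]), phi[2])
    cg = max(gamma, key=len)                            # the common ground
    if kind == "ask":                                   # clause (vi), polar case
        return gamma | {update(cg, phi[1]), update(cg, ("not", phi[1]))}
    if not update(cg, phi):                             # degenerate case: absurd state
        return frozenset({EMPTY})
    if kind == "might" and not any(P and update(P, phi[1]) == P for P in gamma):
        return gamma | {update(cg, phi[1])}             # clause (v): add a possibility
    return frozenset(update(P, phi) for P in gamma) | {EMPTY}  # clauses (i), (ii), (iv)

# Example 4: asserting p and then "might q" adds the possibility {p and q}.
p, q = ("atom", "p"), ("atom", "q")
start = frozenset({I, EMPTY})
after = update_state(update_state(start, p), ("might", q))
expected = frozenset({update(I, p), update(I, ("and", p, q)), EMPTY})
print(after == expected)  # True
```

The final lines replay Example 4: the common ground stays {p} while {p ∧ q} is added as a new element, so the modal changes the state without adding information.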

In the semantics defined above, although epistemic modals can change information 

states they cannot have a non-trivial effect on the common ground. This is as it should 

be: only constructions that provide information should change the common ground, and 

epistemic modals do not play that role in a dialogue. Thus, this semantics complies with 

the requirement set forward in the introduction: epistemic modals change a context in a 

significant yet non-informative manner. 

More specifically, epistemic modals change a context by drawing attention to certain 

possibilities. However, the manner in which an epistemic modal accomplishes this depends 

on the possibilities that are already salient in the dialogue’s context. If the possibility 

an epistemic modal calls attention to is not under discussion at all, then the epistemic 

modal adds this possibility to the set of salient possibilities in the context, acting in the 

manner specified in clause (v) (see example 4 below). But if this possibility is already 

under consideration, an epistemic modal draws attention to it by eliminating salient possibilities 

that are inconsistent with it from the context. In this latter case, epistemic modals 

act in the manner specified in clause (iv) (see examples 2 and 3 below).

An epistemic modal acts in accordance with clause (iv) when it functions as an answer 

to a question. Questions introduce several salient possibilities in a context, and an 

epistemic modal acts to draw attention to some answers rather than others. But epistemic 

modals aren't always used to answer questions. For example, they can be used to provide

someone with a warning: 

(4) A: Alice and I are going fishing in Leiden tomorrow. 

B: It might be illegal to fish in Leiden. 

A: Oh, I hadn’t thought to check that; thanks. 

B draws A’s attention to the possibility that fishing is illegal in Leiden, a possibility that 

A had overlooked but should investigate. Here, it is essential that B’s utterance contributes 

a new salient possibility to the context. 

Using this framework, we now define the conditions under which a formula φ answers 

a question ψ. Note that this definition admits full and partial answers. 

DEFINITION 8. 


Let φ ∈ L, and let ψ ∈ L3. We say that φ answers ψ if 〈{I,∅}[ψ][φ]〉 ⊂ 〈{I,∅}[ψ]〉. 

Thus, φ answers ψ if φ removes some salient possibilities that ψ introduces. This 

notion of answerhood should be familiar from a partition theory of questions: in both 

cases, answering a question amounts to eliminating some of the possibilities it introduces. 

But an important, unique feature of this definition is that an answer doesn't necessarily give

information: it suffices that an answer suggest certain possibilities for the questioner to 

consider. 

We close this section by working through a few examples. We use the following notational 

conventions: {p} = {i ∈ I | i |= p}, {¬p} = {i ∈ I | i ⊭ p} etc.

Example 2: A Polar Question. 

Recall example (2), and let p and q be the propositions ‘Bill is coming to the party’ and ‘John 

is coming to the party’ respectively. Let Γ = {I, ∅}; then ⋄p ∧ ⋄q answers ?p ∧ ?q: Γ[?p ∧ 

?q] = {I, {p}, {¬p}, {q}, {¬q}, ∅} = Γ1. Then: Γ1[⋄p ∧ ⋄q] = {I, {p}, {q}, ∅} = Γ2, and

since 〈Γ2〉 ⊂ 〈Γ1〉, ⋄p ∧ ⋄q answers ?p ∧ ?q.

Example 3: A Wh-Question. 

Consider the question ‘Who likes to paint?’, and note that ‘Bill might like to paint’ felicitously

answers this question. Let Px be ‘x likes to paint’, and let b be Bill. Let Γ = {I, ∅}. 

Then: Γ[?Px] = Γ ∪ {{i | i ≡ j (mod Px)} | j ∈ I } 

= Γ ∪ {{i | i(P) = D*} | D* ⊆ D} = Γ1. Then

Γ1[⋄Pb] = Γ ∪ {{i | i(P) = D*}[⋄Pb] | D* ⊆ D}

= Γ ∪ {{i | i(b) ∈ i(P) and i(P) = D*} | D* ⊆ D such that i(b) ∈ D*}

Since Γ1[⋄Pb] ⊂ Γ1, ⋄Pb is an answer to ?Px.

Examples 3 and 4 bring out an important feature of this paper’s framework: epistemic 

modals behave much like questions. Both questions and epistemic modals draw attention 

to certain possibilities without committing the speaker to a position on whether or not 

these possibilities are actual. Epistemic modals, however, are stronger than questions: 

modals draw attention to fewer possibilities than questions, suggesting that the chosen 

possibilities are somehow more important than the ignored possibilities. The notion of a 

salient possibility allows us to represent this similarity between questions and epistemic 

modals in a fully formal way.

Example 4: Raising Issues Without Questions. 

Recall (4), and let p and q be ‘Alice and A are going fishing in Leiden tomorrow’ and ‘It’s 

illegal to fish in Leiden’ respectively. Let Γ = {I, ∅}. Then 

Γ[p][⋄q] = {{p}, {p ∧ q}, ∅}. Here, since no possibility in Γ[p] satisfied q, the epistemic 

modal acted to add the possibility {p ∧ q} to the context. Thus, even though no questions 

have been asked in this context, B is able to bring A’s attention to some issue by using an 

epistemic modal. 

Example 5: Infelicitous Answer. 

Responding to a polar question ?φ with ⋄φ ∧ ⋄¬φ should not count as answering the question: 

rather, responding to a question with ‘maybe, maybe not’ is a deliberate and almost 

reticent refusal to answer the question. Our semantics allows us to account for this: {I, 

∅}[?p][⋄p ∧ ⋄¬p] = {I, {p}, {¬p}, ∅}[⋄p ∧ ⋄¬p] 

= {I, {p}, {¬p}, ∅}[⋄p][⋄¬p] = {I, {p}, ∅}[⋄¬p] = {I, {p}, {¬p}, ∅}. Thus, 

⋄p ∧ ⋄¬p does not answer ?p. Moreover, ⋄p ∧ ⋄¬p is actually equivalent to ?p in this information 

state. 

In general, ?φ and ⋄φ ∧ ⋄¬φ are equivalent in any information state that is consistent with 

both φ and ¬φ, so polar questions can almost be defined using epistemic modals (if we assume 

that polar questions presuppose that both of their answers are possible, polar questions 

can be defined in terms of the epistemic modality operator). 
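
Examples 2 and 5 can be replayed mechanically. The following Python sketch combines Definitions 3, 6, 7 and 8 for polar questions over atomic sentences. It is an illustration under stated assumptions rather than a definitive implementation: indices are modelled as sets of true atoms; clause (iv) of Definition 7 is taken to apply when some non-empty element P satisfies P[φ] = P, and clause (v) otherwise, adding CG[φ] and leaving the other elements unchanged; and all function and variable names are hypothetical.

```python
from itertools import combinations

ATOMS = ("p", "q")
# An index is modelled as the set of atoms it makes true; I is the set of all indices.
I = frozenset(frozenset(c) for r in range(len(ATOMS) + 1)
              for c in combinations(ATOMS, r))
EMPTY = frozenset()

def update(P, phi):
    """Definition 6: update a set of indices with a sentence of L1 or L2."""
    kind = phi[0]
    if kind == "atom":
        return frozenset(i for i in P if phi[1] in i)
    if kind == "and":
        return update(update(P, phi[1]), phi[2])
    if kind == "not":
        return P - update(P, phi[1])
    if kind == "might":
        return P if update(P, phi[1]) else EMPTY
    raise ValueError(f"not a sentence of L1 or L2: {phi!r}")

def update_state(gamma, phi):
    """Definition 7 for polar questions, on the reading stated in the lead-in."""
    kind = phi[0]
    if kind == "and":
        return update_state(update_state(gamma, phi[1]), phi[2])
    cg = max(gamma, key=len)                       # the common ground
    if kind == "ask":
        return gamma | {update(cg, phi[1]), update(cg, ("not", phi[1]))}
    if not update(cg, phi):                        # degenerate case: absurd state
        return frozenset({EMPTY})
    if kind == "might" and not any(P and update(P, phi[1]) == P for P in gamma):
        return gamma | {update(cg, phi[1])}        # clause (v)
    return frozenset(update(P, phi) for P in gamma) | {EMPTY}

def salient(gamma):
    """Definition 3: closure under pairwise union and intersection."""
    closed = set(gamma)
    while True:
        new = {c for a, b in combinations(closed, 2) for c in (a | b, a & b)} - closed
        if not new:
            return closed
        closed |= new

def answers(phi, psi):
    """Definition 8: phi answers psi if it strictly shrinks the salient possibilities."""
    asked = update_state(frozenset({I, EMPTY}), psi)
    return salient(update_state(asked, phi)) < salient(asked)

p, q = ("atom", "p"), ("atom", "q")
might = lambda f: ("might", f)
print(answers(("and", might(p), might(q)), ("and", ("ask", p), ("ask", q))))  # True
print(answers(("and", might(p), might(("not", p))), ("ask", p)))              # False
print(answers(p, ("ask", p)))                                                 # True
```

The first two lines correspond to Examples 2 and 5 respectively; the third shows that a plain assertion of p also counts as an answer, but by shrinking the common ground rather than leaving it intact.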

3 Comparison With a Partition Semantics of Questions 

In this section, we will slightly change our semantics to yield a partition theory of questions,¹

and examine the difficulties it faces. These difficulties will bring to light problems

that any partition theory of questions faces in accounting for non-informative answers to 

questions, and point to an important feature of the theory above that allows it to account 

for non-informative answers. For ease of exposition, we only consider polar questions: in 

this section, suppose that we only allow atomic sentences to be well-formed elements of 

L1. 

Using our terminology, in a partition theory of questions a question divides the common 

ground into the salient possibilities corresponding to its different answers. Crucially, 

salient possibilities are not added to the context as they were in section 2. Thus, to state 

a partition theory of questions in our framework we have to alter the definition of an 

information state: we no longer assume an information state contains a maximal set of 

indices, and for purposes of this section we remove clause (ii) from the definition of an 

information state. 

Since information states no longer contain a common ground, clause (v) in the update 

semantics for information states is difficult to translate to this new system. For purposes 

of this section, then, we also remove clause (v) from this definition, and stipulate that 

epistemic modals always change an information state according to clause (iv). 

Our partition theory of questions results from changing definition 8 and clause (vi) in 

definition 7 to the following. 

DEFINITION 9. 


(i) Let Γ = {P1,...,Pn} be an information state, and let ?φ ∈ L. Then we define 

Γ[?φ] = {P1[φ], P1[¬φ],...,Pn[φ], Pn[¬φ]}

(ii) Let φ ∈ L and let ψ ∈ L3. We say that φ answers ψ if {I}[ψ][φ] ⊂ {I}[ψ]. 

An immediate problem with this theory is that modal formulas can eliminate blocks of 

a partition. This is the case because after a question ?p, ⋄p will eliminate any possibility 

that was just updated with ¬p. While this is good in so far as under this theory modal formulas 

can answer questions, it has other disastrous consequences. Since modal formulas 

can eliminate blocks of a partition, they can provide as much information as non-modal 

formulas: for any information state Γ, Γ[?p][⋄p] = Γ[?p][p]. This is the case because both 

p and ⋄p will eliminate the possibilities from Γ that have been updated with ¬p and have 

no effect on the possibilities that have been updated with p. This is a bad result: Γ[?p][⋄p] 

[¬p] should be consistent, but Γ[?p][p] [¬p] shouldn’t be. While modals and non-modals 

should both count as answers to questions, they should not answer questions in the same 

way. 
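
The collapse of ⋄p into p under the partition variant can be checked for a single atomic sentence. The sketch below is illustrative: it models indices as sets of true atoms, applies modals and non-modals element-wise as stipulated for this section, and uses the hypothetical names ask and tell for the question update of Definition 9(i) and for all other updates.

```python
from itertools import combinations

ATOMS = ("p",)
# An index is modelled as the set of atoms it makes true; I is the set of all indices.
I = frozenset(frozenset(c) for r in range(len(ATOMS) + 1)
              for c in combinations(ATOMS, r))

def update(P, phi):
    """Definition 6: update a set of indices with a sentence of L1 or L2."""
    kind = phi[0]
    if kind == "atom":
        return frozenset(i for i in P if phi[1] in i)
    if kind == "and":
        return update(update(P, phi[1]), phi[2])
    if kind == "not":
        return P - update(P, phi[1])
    if kind == "might":
        return P if update(P, phi[1]) else frozenset()
    raise ValueError(f"not a sentence of L1 or L2: {phi!r}")

def ask(gamma, phi):
    """Definition 9(i): split every block into its phi-part and its not-phi-part."""
    return frozenset(block for P in gamma
                     for block in (update(P, phi), update(P, ("not", phi))))

def tell(gamma, phi):
    """Element-wise update, used here for both modal and non-modal sentences."""
    return frozenset(update(P, phi) for P in gamma)

p = ("atom", "p")
asked = ask(frozenset({I}), p)
after_modal = tell(asked, ("might", p))
after_plain = tell(asked, p)
print(after_modal == after_plain)                                  # True
print(tell(after_modal, ("not", p)) == frozenset({frozenset()}))   # True
```

The first line confirms that the modal and the plain assertion yield the same state after ?p; the second shows that a later ¬p leaves only the empty block, which is the unwanted inconsistency described above.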

On a more general level, the problem with the partition semantics is that any update 

has to provide information or add possibilities, and possibilities can only be removed by 

information. This leads to trouble with epistemic modals: if one lets an epistemic modal 

answer a question, it must provide information and hence function far too much like a 

non-modal. But, on the other hand, if one posits that an epistemic modal doesn't provide

information, there is no way to say how it could change an information state in a way that

answers a question.

¹ For purposes of this paper, a partition semantics for questions is a semantics that holds: (i) a question

changes a context by partitioning an information state, and (ii) to answer a question is to remove blocks

from this partition. The partition semantics given in Groenendijk (1999) is similar to the one we present in

this section.

In the framework presented above this problem is dealt with by separating the common 

ground, and hence the information, from the salient possibilities. This change makes non-informative

answers to questions possible: epistemic modals can eliminate possibilities 

without changing the information in the common ground. However, by connecting the 

meaning of a question to its possible answers in a context, and by identifying answers to 

questions with the elimination of possibilities, this approach retains much of the spirit of 

the partition theory of questions. 

4 Further Issues and Expansions of the System 

In this section, I will discuss some expansions of the system defined above and consider 

two objections to it. 

First, I will discuss the objections. Though the idea that epistemic modals can answer 

wh-questions or other complex questions by suggesting possible answers is quite natural, 

some readers may find the suggestion that epistemic modals answer polar questions by 

suggesting possible answers a bit odd. After all, someone asking a polar question clearly 

has both possibilities in mind, so how can simply making one of them more salient in the 

context count as felicitously answering her question? 

Dealing with this objection involves delving into the pragmatics of epistemic modals, 

and more specifically the pragmatic role that salient possibilities play in a context. This 

topic would take a great deal of space to treat, and is beyond the scope of this paper. But 

to respond to the objection we note that one very plausible pragmatic principle governing 

the use of epistemic modals is that, in general, one should only focus attention on some

possibility if one has some reason to believe that it is the case. To see this, note how 

infelicitous dialogue (5) sounds: 

(5) A: Are John and Bill coming to the party? 

B: They might. 

A: Why do you say that? 

B: I don't know; they just might.

Thus, pragmatically, answering a polar question with an epistemic modal can commit the 

speaker to having some reason to believe that the possibility made salient by her answer 

actually obtains. This pragmatic dimension of epistemic modals makes it clear how a 

speaker can answer a polar question simply by making one of the possible answers rather 

than the other salient in the context. 

Another objection to this framework questions the idea that, given the informal description 

of salient possibilities in the introduction, it makes sense to say that epistemic 

modals actually eliminate salient possibilities that questions introduce. After all, if a question 

is answered by an epistemic modal, its possible answers that are inconsistent with the 

epistemic modal aren’t completely forgotten about. But in the formal system, these possibilities 

have the same status as many other possibilities that the interlocutors haven’t 

given any thought to. Thus, this objection concludes, holding that epistemic modals actually 

eliminate salient possibilities from a context is far too strong. 

We take this objection seriously, and admit that the definition of salient possibilities 

given above is too coarse. A better definition would make salience into a scalar notion. 

With a scalar notion of salience, we could say that the salient possibilities eliminated by 

an epistemic modal acting as an answer to a question are less salient than those still in 

the context, but more salient than many other subsets of the common ground. A potential 

candidate for this scale is defined below: 

Scale. 

Let Γ = {CG, P1,...,Pn, ∅} be an information state, and let P ⊆ CG.

(i) P is 1-salient if P ∈ 〈Γ〉 and CG − P ∉ 〈Γ〉

(ii) P is 2-salient if P ∈ 〈Γ〉 and CG − P ∈ 〈Γ〉

(iii) P is 3-salient if P ∉ 〈Γ〉 and CG − P ∈ 〈Γ〉

(iv) P is 4-salient if P ∉ 〈Γ〉 and CG − P ∉ 〈Γ〉

Here, 1-salient propositions are most salient, and 4-salient propositions are least salient. 

In general, after an epistemic modal answers a question it changes its possible answers 

from 2-salient propositions to either 1-salient propositions or 3-salient propositions, thus 

making possible answers either more or less salient and not rendering any forgotten. Thus, 

replacing an absolute notion of salience with a scalar one solves the problem raised by the 

objection. 
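
The proposed scale can be tried out on a toy state. The sketch below is an illustration under assumptions: it reads the second condition of each clause as concerning the complement of P within the common ground (CG − P), uses integers as stand-ins for indices, and introduces the hypothetical names salient and salience_level.

```python
from itertools import combinations

def salient(gamma):
    """Closure of gamma under pairwise union and intersection (Definition 3)."""
    closed = set(gamma)
    while True:
        new = {c for a, b in combinations(closed, 2) for c in (a | b, a & b)} - closed
        if not new:
            return closed
        closed |= new

def salience_level(P, gamma):
    """Return 1-4 according to whether P and its complement in CG are salient."""
    cg = max(gamma, key=len)              # the common ground is the largest element
    closure = salient(gamma)
    p_in, rest_in = P in closure, (cg - P) in closure
    if p_in:
        return 2 if rest_in else 1
    return 3 if rest_in else 4

# Hypothetical toy state after a polar question ?p: YES = {p}, NO = {not p}.
I = frozenset({0, 1, 2, 3})
YES, NO, EMPTY = frozenset({0, 1}), frozenset({2, 3}), frozenset()
asked = {I, YES, NO, EMPTY}
answered = {I, YES, EMPTY}                # the state after the answer "might p"

print(salience_level(YES, asked), salience_level(NO, asked))        # 2 2
print(salience_level(YES, answered), salience_level(NO, answered))  # 1 3
print(salience_level(frozenset({0, 2}), asked))                     # 4
```

On this reading, the modal answer moves {p} from 2-salient to 1-salient and {¬p} from 2-salient to 3-salient, while a subset that was never raised, such as {0, 2}, remains 4-salient.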

In this paper's semantics, epistemic modals can only focus attention on possibilities that

are subsets of the common ground. This is problematic because some uses of epistemic 

modals make possibilities that lie outside of the common ground salient in a conversation. 

(6) A: There aren't any deer in this part of the forest.

B: (2 hours later) Look over there! Hoofprints! There might be deer after all. 

These modal assertions also challenge previously accepted information without directly 

contradicting it. To account for this use of epistemic modals, one could posit that if ⋄φ 

is inconsistent with the common ground of an information state, then ⋄φ acts on this 

information state by: (i) introducing a salient possibility corresponding to the revision of 

CG with φ, (ii) transforming the information state's common ground into the union of this

revision and the old common ground, and (iii) performing a similar operation on the other 

possibilities in the information state. Thus, though the paper's theory itself cannot account

for uses of epistemic modals like (6), augmented with a theory of belief revision it can 

provide an elegant analysis. 

I would like to thank Paul Dekker. 

References 



DeRose, K. (1991). Epistemic possibilities, Philosophical Review 100: 581–605.

Egan, A., Hawthorne, J. and Weatherson, B. (2005). Epistemic modals in context, in 

G. Preyer and P. Peter (eds), Contextualism in Philosophy, Oxford University Press, 

Oxford, pp. 131–170.

Groenendijk, J. (1999). The logic of interrogation, in T. Matthews and D. Strolovitch 

(eds), The Proceedings of the Ninth Conference on Semantics and Linguistic Theory, 

CLC Publications, Ithaca, NY, pp. 109–126. 

Groenendijk, J. (2007). Inquisitive semantics: Two possibilities for disjunction. 

Groenendijk, J. and Stokhof, M. (1997). Questions, in J. van Benthem and A. T. Meulen 

(eds), Handbook of Logic and Language, Elsevier. 

Stalnaker, R. (1978). Assertion, Syntax and Semantics 9. 

Veltman, F. (1996). Defaults in update semantics, Journal of Philosophical Logic 25(3). 

BARE PREDICATION AND KINDS ∗ 

Bert Le Bruyn 

Utrecht University 

Abstract. This paper treats the distinction between singular nominal predication with and 

without indefinite article in languages like Dutch. The former variant is referred to as non-bare

predication, the latter as bare predication. I make the following claims: (i) temporal 

analyses of the distinction between bare and non-bare predication are on the wrong track, 

(ii) bare predication needn’t be analyzed as a lexical phenomenon, (iii) non-bare predication 

should be analyzed as kind-membership predication. 



In order to understand the role played by the indefinite article in predicate position it is instructive 

to look at instances of singular nominal predication in which the indefinite article 

does not appear. These instances are subsumed under the notion of bare predication (see 

(Kupferman, 1991), (Broekhuis, Keizer and Den Dikken, 2003), (de Swart, Winter and 

Zwarts, 2005), (de Swart, Winter and Zwarts, 2007), (Matushansky and Spector, 2005), 

(Déprez, 2005), (Munn and Schmitt, 2005), (Roy, 2006), (Beyssade and Dobrovie-Sorin, 

2005)). In English bare predication is marginal but a language like Dutch seems to have 

a productive paradigm: 

(1) (a) Jan is slager. (litt. John is butcher) (b) Jan is moslim. (litt. John is muslim) 

(c) Jan is Belg. (litt. John is Belgian) (d) Jan is hertog. (litt. John is duke) 

Nouns that typically occur in bare predication are linked to professions (1a), religions 

(1b), nationalities (1c) and titles (1d). It is important to note that this is not an idiosyncrasy

of Dutch but a pervasive phenomenon in Romance and Germanic languages (examples 

taken from (de Swart et al., 2007)): 

(2) Es negrero. (Spanish, litt. Is trader in black slaves); João é médico. (Portuguese, 

litt. John is doctor); Gianni è dottore. (Italian, litt. John is doctor); Jean est 

médecin. (French, litt. John is doctor); Olivier var skuespiller. (Danish, litt. Oliver 

was actor); Herr Weber är katolik. (Swedish, litt. Mr Weber is catholic); Han er 

lærer. (Norwegian, litt. He is teacher); Er ist praktizierender Katholik. (German, 

litt. He is practicing catholic). 

∗ This paper should be read as a working paper that presents thoughts and bits of analysis that are not 

finished yet. I’m very grateful to audiences at ConSOLE XVI, my UiL-OTS kermit-lecture and the LSB 

2008 Linguists’ Day and to the reviewers of the ESSLLI student session for very useful comments and 

discussion. Special thanks also to Min Que, Gianluca Giorgolo, Dorota Klimek, Sander Lestrade, Joost 

Zwarts and Henriëtte de Swart. 

In this paper I will defend three claims concerning bare predication. The first is that analyses 

that reduce the distinction between bare and non-bare predication to a temporal one 

are not on the right track (see paragraph 2). The second is that a purely lexical approach to 

bare predication is not tenable (see paragraph 3). The third and final one is that non-bare 

predication should be analyzed as kind-membership predication (see paragraph 4). 

2 Bare predication and time 

When comparing sentences (3a) and (3b) most informants tend to say that the a-variant is 

more ’eventive’ than the b-variant (Roy, 2006): 

(3) (a) Paul est acteur. (French, litt. Paul is actor) 

(b) Paul est un acteur. (French, litt. Paul is an actor) 

This intuition has led linguists to explore a temporal analysis of bare predication. In its 

simplest form it would state that bare predication is concerned with transient properties 

whereas non-bare predication is concerned with permanent ones. The most convincing 

argument in favour of this analysis comes from ’lifetime effects’: 

(4) (a) Paul était médecin. (French, litt. Paul was doctor) 

(b) Paul était un médecin. (French, litt. Paul was a doctor) 

Sentence (4a) can be understood as stating that Paul used to be a doctor and that he’s 

retired now. Sentence (4b) can only mean that Paul is dead. Under the assumption that 

non-bare predication is concerned with permanent properties the interpretation of sentence 

(4b) follows: to cancel a permanent property one has to cancel the existence of the 

entity the property applies to. The problem this analysis faces is that it predicts that inherently 

transient properties should always occur bare in predicate position. This prediction 

is not borne out (cf. (de Swart et al., 2007)): 

(5) ?? Marie est fille. (French, litt. Mary is girl) 

Another temporal approach to bare predication is the one presented in (Roy, 2006) (variants 

are (Munn and Schmitt, 2005) and (Déprez, 2005)). Roy assumes all nouns come 

with an event argument that has to be bound. When bound by the indefinite article it is 

signalled that the predication holds for the maximal event around the ’time of utterance’ 

(given by the Tense on the copula) and that this event cannot be split up into smaller intervals. 

When bound by Tense it is signalled that the maximal event can be split up. The 

facts that led to this analysis are presented in (6) and (7): 

(6) (a) Jean est professeur le jour, danseur la nuit. 

(French, litt. John is teacher by day, dancer by night) 

(b) ?? Jean est un professeur le jour, un danseur la nuit. 

(French, litt. John is a teacher by day, a dancer by night) 

(7) (a) Paul est devenu chanteur. 

(French, litt. Paul has become singer) 

(b) ?? Paul est devenu un chanteur. 

(French, litt. Paul has become a singer) 

The reason why the b-variants are out on Roy’s analysis is that adverbials like le jour 

... la nuit (’by day ... by night’) and verbs like devenir (’become’) split up the ’time of 

utterance’. This is depicted for the adverbials in (8) and for the verb in (9). 

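(8) [Figure not reproduced: the 'time of utterance' split up by the adverbials le jour ... la nuit]

(9) [Figure not reproduced: the 'time of utterance' split up by the verb devenir]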


It is important to note that in absence of temporal adverbials or verbs like devenir there is 

no clear reason in Roy’s analysis to prefer bare over non-bare predication or vice versa. 

In order to account for preferences like in (5) Roy has to assume that whenever world 

knowledge makes it implausible / impossible that the maximal event is split up the indefinite 

article is obligatory and that whenever world knowledge makes it plausible / possible 

that the maximal event is split up the indefinite article ceases to be obligatory.

The problem Roy’s analysis faces is that the incompatibility of non-bare predication with 

temporal adverbials or verbs like devenir is only a strong tendency that surfaces as an 

epiphenomenon. To show this it is necessary to anticipate the analysis presented in paragraph 

4. There it is claimed that non-bare predication signals kind-membership. A sentence 

like (10) e.g. would mean that White Fang belongs to the kind wolf. 

(10) White Fang is een wolf. 

(Dutch, litt. White Fang is a wolf) 

What makes kind-membership special is that in general one cannot change from one kind 

into another. White Fang e.g. cannot turn into a sheep or a wild boar. This explains why 

non-bare predication in general is incompatible with temporal adverbials or verbs like devenir. 

There are however instances of transformations in nature and in folklore: e.g. the 

transformation from a caterpillar into a butterfly and from a man into a werewolf. The former 

can be described in a sentence with the verb devenir and the latter in a sentence with 

temporal adverbials. Roy’s analysis predicts that in these sentences non-bare predication 

is not allowed. An analysis that takes non-bare predication to signal kind-membership 

predicts the opposite. As shown by the acceptability of (11) and (12) it is the latter that 

makes the right prediction. 

(11) In Lady Hawke is Rutger Hauer ’s nachts een wolf en overdag een mens. 

(Dutch, litt. In Lady Hawke is Rutger Hauer by night a wolf and by day a man) 

(12) La chenille est devenue un papillon. 

(French, litt. The caterpillar has become a butterfly) 

From the preceding I conclude that the existing analyses that try to reduce the distinction 

between bare and non-bare predication to a temporal one are not on the right track. It was 

important to establish this given that most existing analyses are cast in temporal terms 

whereas the one I will defend in paragraph 4 is not. 

3 Bare predication and the lexicon 

In the literature on bare predication one of the following positions is often taken: (i) 

nouns that usually appear in non-bare predication are marked in the lexicon (see e.g. 

(Matushansky and Spector, 2005)); (ii) nouns that usually appear in bare predication are 

marked in the lexicon (see e.g. (de Swart et al., 2005), (de Swart et al., 2007)). In this 

section it will be argued that purely lexical standpoints like (i) and (ii) should be amended. 

In order to do so it will be shown that:

(a) all nouns that usually appear in bare predication can appear in non-bare predication; 

(b) all nouns that usually appear in non-bare predication can appear in bare predication. 

It should be noted that (a) and (b) don’t constitute decisive arguments against lexical 

analyses. They do however make them less appealing. 

3.1 Bare predication nouns


As stated in paragraph 1 there is a subclass of nouns that usually appear in bare predication. 

They include nouns related to professions, religions, nationalities and titles. It is 

however well-known that these nouns appear fairly frequently in non-bare predication too 

(see e.g. (de Swart et al., 2005)). When they do, they allow for

their normal interpretation and an enriched one. This will be illustrated on the basis of 

(13): 

(13) (a) Sil is beenhouwer. (Dutch, litt. Sil is butcher) 

(b) Sil is een beenhouwer. (Dutch, litt. Sil is a butcher) 

The a-variant is the unmarked one and simply states that Sil works as a butcher. The b-variant

has the same interpretation but also allows the interpretation according to which 

Sil is not a butcher but has the characteristics we usually associate with butchers. A 

typical person the b-variant would apply to is a violent boxer. The enriched interpretation 

projects the (stereotypical) characteristics that are associated with a profession on 

an individual. From a lexical standpoint one could see the enriched interpretation as an 

instance of coercion. Note though that if we store in our world knowledge that butcher is 

a profession we can get the same coercion effect to arise. 

3.2 Non-bare predication nouns

The majority of nouns in languages like Dutch usually appear in non-bare predication.

To date these nouns have been defined negatively; they are those that are not related to

professions, religions, nationalities and titles. 

In the literature there are two claims about nouns appearing in bare predication. The 

first is that they are usually [+ human] (cf. (Matushansky and Spector, 2005) and (Roy, 


2006)). The second is that nouns referring to kinds (which would be a subset of non-bare

predication nouns) can never appear in bare predication (cf. (Kupferman, 1991) and 

(Roy, 2006)). In order to argue that all non-bare predication nouns can in principle appear 

in bare predication the strongest claim would therefore be to say that even [-human] and 

[+kind] nouns can appear in bare predication. This is the claim I defend here. 

A noun that meets both the [-human] and the [+kind] criterion is wolf. An example of 

wolf in non-bare predication was given in (10). Its bare variant would look as follows: 

(14) Ik ben wolf. (Dutch, litt. I am wolf) 

Even though (14) might seem ungrammatical at first sight it is acceptable in Dutch under 

a very specific interpretation, viz. the one in which wolf is a role in a game (e.g. the 

werewolf game). This should not come as a surprise given that it is often claimed that 

bare predication nouns refer to roles in society: 

“[Bare predication nouns] usually [...] denote specific roles in society: professions, religions

or nationalities. Other nominals (non-human or human) that are not related to such 

roles generally resist taking up a bare nominal position.” (de Swart et al., 2007) 

Under the assumption that any noun can be reinterpreted as referring to a role in a game 

there is no reason to expect a principled limit on nouns appearing in bare predication. 

Note that the reinterpretation referred to can be seen as a coercion mechanism from a lexical 

standpoint. Once again it is not obvious though that we couldn’t get the same effect 

through world knowledge. 

3.3 Conclusion

In 3.1. and 3.2. it was argued that any noun can appear in both bare predication and 

non-bare predication. As noted before these facts cannot be seen as decisive arguments 

against a lexical approach. They do however make lexical approaches less appealing and 

clear the road for non-lexical analyses like the one that will be presented in paragraph 4. 

4 Bare predication and kinds 

In this paragraph I will introduce the basic ingredients for an analysis in which non-bare 

predication is seen as kind-membership predication. The basic claim is that a sentence 

involving non-bare predication should be interpreted as ‘X belongs to the kind Y’. The

paragraph is organized as follows. I first present my background assumptions about kinds 

and articles (4.1. and 4.2.). Afterwards I present a pragmatic analysis of the contrast between 

bare and non-bare predication (4.3). I close the paragraph defending the claim that 

there is a one-to-one correspondence between non-bare predication and kind-membership 

predication (4.4). 

4.1 Background on kinds


I follow Chierchia (1998) in his intuition that kinds are regularities that occur in nature. 

This translates into two constraints on kinds and their instantiations. The first (see (15)) 

captures the intuition that for something to be regular it should be hypothesized that there 


could be more than one. Note though that for K to qualify as a kind in w0 it is not 

necessary for there to be more than one or even one single instantiation of K in w0 (this 

makes it possible to talk about unicorns, dodos and new inventions as kinds). 

(15) For K to be a kind in w0 there has to be at least one world in which K has more 

than one instantiation. 

The second constraint (see (16)) captures the intuition that the instantiations of kinds 

behave in a regular way, i.e. that their kind-membership is not accidental. Note though 

that it prohibits neither kinds from displaying properties that vary over time nor individuals

from starting or ceasing to be instantiations of a kind (this is left to world knowledge).

(16) If k is an instantiation of the kind K in w0 at tn and if k exists in a world wn 

accessible from w0 at tn, then k is an instantiation of the kind K in wn at tn.

I will call (15) the non-uniqueness constraint and (16) the non-accidentality constraint on 

kinds and their instantiations. 
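
One possible way of spelling out (15) and (16) symbolically is sketched below. The notation is an assumption made for illustration and is not part of the original formulation: Inst(x, K, w, t) is read as "x is an instantiation of K in w at t", E(x, w, t) as "x exists in w at t", and R as the accessibility relation between worlds. The time variable in the first formula is likewise an addition, since (15) leaves time implicit.

```latex
\begin{align*}
\text{(15$'$)}\quad & \mathrm{Kind}(K, w_0) \;\rightarrow\;
  \exists w\, \exists t\; \bigl|\{\, x : \mathrm{Inst}(x, K, w, t) \,\}\bigr| > 1 \\
\text{(16$'$)}\quad & \bigl[\, \mathrm{Inst}(k, K, w_0, t_n) \wedge w_0 \mathrel{R} w_n
  \wedge E(k, w_n, t_n) \,\bigr] \;\rightarrow\; \mathrm{Inst}(k, K, w_n, t_n)
\end{align*}
```

Read this way, the first formula only requires some world with several instantiations, so it can hold of a kind that has no instantiation in w0. The second fixes kind-membership across accessible worlds at a given time, while leaving it free to change from one time to another.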

4.2 Background on articles

I follow Partee (1987) in assuming that articles are default type-shifters from type ⟨e,t⟩ to

type e or type ⟨⟨e,t⟩,t⟩. In short this means that they are markers of argumenthood and

that they cannot be omitted in the absence of other determiners in argument position:

(17) *I have cat. 

(18) *Man came to see me. 

I furthermore follow (Hawkins, 1991) and (Farkas, 2002) in assuming that the definite 

article is a uniqueness marker whereas the indefinite article is unmarked for uniqueness. 

This means that (19) signals that there is only one teacher present in a particular setting 

whereas (20) is in principle neutral with respect to there being one or more teachers. 

(19) I saw the teacher. 

(20) I saw a teacher. 


As Hawkins and Farkas note, however, choosing the indefinite instead of the definite

triggers the implicature that there is more than one teacher.

Finally, in line with Partee’s type-shifting analysis I expect indefinite articles to be omissible 

in predicate position. The instances of bare predication treated in this paper show that 

this expectation is borne out. The crucial question is why they cannot always be omitted. 

The answer, I claim, does not lie in the semantics but in the pragmatics. The pragmatic 

analysis I defend is presented in 4.3. 


4.3 Non-bare predication and non-uniqueness

The analysis I defend is cast in (Weak) Bi-directional Optimality Theory (cf. (Blutner, 

2000)) and is based on five standard assumptions. The first is that bare and non-bare 

predication are truth-conditionally equivalent (cf. (Partee, 1987)). The second assumption 

is that both bare and non-bare predication in principle trigger an implicature of non-uniqueness.

This assumption builds on the insights of Hawkins and Farkas according to 

whom not using the definite triggers an implicature of non-uniqueness. The third assumption 

is that non-bare predication is syntactically more marked than bare predication (cf. 

(de Swart and Zwarts, To appear)). Syntactic markedness can be understood in terms of 

projections: whereas non-bare predication involves DPs, bare predication only involves 

NPs (or NumPs). The fourth assumption is that conveying non-uniqueness is semantically 

more marked than conveying neutrality with respect to uniqueness (cf. (de Swart and 

Zwarts, To appear)). Semantic markedness can be understood in terms of compatibility: 

non-uniqueness is compatible with neutrality but neutrality is not necessarily compatible 

with non-uniqueness. The fifth and final assumption is that unmarked forms and meanings 

are preferred over marked forms and meanings (a standard assumption in the OT 

literature). The resulting (Weak) Bi-directional OT tableau is presented in (21). 

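(21) [(Weak) Bi-directional OT tableau: the forms bare and non-bare against the meanings neutral and non-unique]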


What comes out of this analysis is that bare predication is neutral with respect to uniqueness 

whereas non-bare predication marks non-uniqueness. 
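
The reasoning behind this outcome can be replayed mechanically. The Haskell sketch below is only an illustration under stated assumptions: it uses the usual recursive definition of super-optimality from the weak bidirectional OT literature, and the names Form, Meaning, superOptimal and superOptimalPairs are hypothetical. Markedness is encoded by constructor order (earlier means less marked), following the third and fourth assumptions above.

```haskell
-- Forms and meanings, listed from least to most marked, so that the
-- derived Ord instances encode markedness (an illustrative encoding).
data Form    = Bare | NonBare      deriving (Eq, Ord, Show, Enum, Bounded)
data Meaning = Neutral | NonUnique deriving (Eq, Ord, Show, Enum, Bounded)

-- A form-meaning pair is super-optimal iff no super-optimal pair beats it,
-- either with a less marked form for the same meaning or with a less
-- marked meaning for the same form. The recursion terminates because
-- every recursive call strictly lowers markedness.
superOptimal :: (Form, Meaning) -> Bool
superOptimal (f, m) =
  not (any superOptimal [ (f', m) | f' <- [minBound .. maxBound], f' < f ])
    && not (any superOptimal [ (f, m') | m' <- [minBound .. maxBound], m' < m ])

-- All super-optimal pairs among the four candidates.
superOptimalPairs :: [(Form, Meaning)]
superOptimalPairs =
  [ p
  | f <- [minBound .. maxBound]
  , m <- [minBound .. maxBound]
  , let p = (f, m)
  , superOptimal p
  ]
```

Evaluating superOptimalPairs gives [(Bare,Neutral),(NonBare,NonUnique)]. The unmarked pair wins outright and blocks the two mixed pairs, which leaves the marked form free to pair with the marked meaning.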

4.4 Kinds and non-bare predication

In 4.1. I claimed - on the basis of common intuitions - that kinds are subject to a non-uniqueness

constraint. In 4.3. I claimed - on the basis of standard assumptions - that 

non-bare predication marks non-uniqueness whereas bare predication is neutral with respect 

to uniqueness. When we combine both claims it follows that non-bare predication 

is best suited to signal kind-membership. As I will demonstrate in what follows this is 

indeed what it does in languages like Dutch. I will show this on the basis of five predictions 

that follow from the claim that there is one-to-one correspondence between non-bare 

predication and kind-membership predication. 

The first prediction is that all predication involving kind-membership has to involve the 

indefinite article. That this is the case has been suggested by (Kupferman, 1991) and 

(Roy, 2006) and as far as I know this has never been challenged. Note that (14) is not 

a counterexample. (14) shows that bare predication may involve nouns that are usually 

associated with kinds but it is not an instance of kind-membership predication. Note also 


that kinds are not restricted to plants or animals but may involve things as diverse as bottles, 

chairs, etc., insofar as they show sufficiently regular behaviour (see 4.1).

The second prediction my claim about non-bare predication and kind-membership makes 

is that bare predication should be concerned with the predication of properties that are unlike 

those that link a kind to its instantiations. In view of the non-accidentality constraint 

on kinds and their instantiations (see 4.1) it is then predicted that bare predication is concerned 

with accidental properties. To see that this is exactly what happens it is instructive 

to look at those nouns that usually appear in bare predication: nouns linked to professions, 

religions, nationalities and titles. These “do not depend on the inherent, natural properties

of a person or what the person actually does, but on the social or cultural status of that 

person” (de Swart et al., 2007). 

The third prediction is that whenever a noun that is usually associated with kinds is used 

in bare predication it is reinterpreted in such a way that it no longer predicates a non-accidental

property. An example was given in (14): being a wolf in (14) is an accidental 

property that comes with the distribution of roles in a game. 

The fourth prediction is that whenever a noun that is usually not associated with kinds 

is used in non-bare predication it is reinterpreted in such a way that it starts predicating 

non-accidental properties. An example was given in (13b): for Sil to be a butcher is no 

longer seen as an accidental property but rather as something that is linked to his inherent 

properties. This explains why Sil needn’t be a butcher by profession to make (13b) true. 

The fifth prediction is that whenever it is not clear whether something is an accidental 

property or not there is variation in the predication that is used. One telling example is 

that of diseases like alcoholism. According to some, alcoholism is a disease that people

may or may not get; according to others, alcoholics are themselves responsible and are

not sick in the classical meaning of the word. Interestingly this division is reflected in the 

use of the more clinical alcoholieker (Dutch, ‘alcoholic’) and the more popular drinker

(Dutch, ‘drinker’). On Google I found the former 43 times in bare predication and 8 times

in non-bare predication whereas the latter appeared 364 times in non-bare predication and 

only 6 times in bare predication. 1 

5 Conclusion 


This paper started out as an investigation into the role of the indefinite article in predicate 

position. The analysis I defended is that through its competition with the bare form it 

marks non-uniqueness which in turn can be linked to kind-membership predication. This 

analysis is attractive in at least three respects. The first is that the indefinite article maintains 

its standard semantics and pragmatics and is not reduced to a vacuous item. The 

second is that it offers a formalizable alternative to temporal analyses that were shown to 

make wrong predictions. The third is that it brings together intuitions and claims from 

work on kinds and work on bare predication that lend themselves to an interesting remix. 

1 The Google search was done on www.google.nl (restricted to Dutch pages) and concerned searches of

the form “is drinker” / “is alcoholieker”.


References 


Beyssade, C. and Dobrovie-Sorin, C. (2005). Bare predicate nominals in Dutch, Proceedings

of SALT 15. 

Blutner, R. (2000). Some aspects of optimality in natural language interpretation, Journal 

of Semantics 17. 

Broekhuis, H., Keizer, E. and Den Dikken, M. (2003). Modern grammar of Dutch. Occasional 

papers 4, Tilburg University, Tilburg. 

de Swart, H., Winter, Y. and Zwarts, J. (2005). Bare predicate nominals in Dutch, in

E. Maier, C. Bary and J. Huitink (eds), Proceedings of SuB9. 

de Swart, H., Winter, Y. and Zwarts, J. (2007). Bare nominals and reference to capacities, 

Natural Language and Linguistic Theory 25. 

de Swart, H. and Zwarts, J. (To appear). Nominals with and without an article: Distribution, 

interpretation and variation, in P. Hendriks, H. de Hoop, I. Krämer, H. de Swart 

and J. Zwarts (eds), Conflicts in Interpretation. 

Déprez, V. (2005). Morphological number, semantic number and bare nouns, Lingua 115. 

Farkas, D. (2002). Specificity distinctions, Journal of Semantics 19. 

Hawkins, J. (1991). On (in)definite articles: implicatures and (un)grammaticality prediction, 

Journal of Linguistics 27. 

Kupferman, L. (1991). Structure événementielle de l’alternance un / ∅ devant les noms

humains attributs, Langage 102. 

Matushansky, O. and Spector, B. (2005). Tinker, tailor, soldier, spy, in E. Maier, C. Bary 

and J. Huitink (eds), Proceedings of SuB9. 

Munn, A. and Schmitt, C. (2005). Number and indefinites, Lingua 115. 

Partee, B. (1987). Noun phrase interpretation and type-shifting principles, in J. Groenendijk, 

D. de Jongh and M. Stokhof (eds), Studies in Discourse Representation 

Theory and the Theory of Generalized Quantifiers, Foris, Dordrecht. 

Roy, I. (2006). Non-verbal predications: a syntactic analysis of predicational copular 

sentences, PhD thesis, University of Southern California. 


DIAGRAMMATIC REASONING 

WITH ENHANCED STATIC CONSTRAINTS 

James Burton 

University of Brighton 

Abstract. This paper reports on ongoing work to create a proof-carrying Domain Specific 

Embedded Language (DSEL) for diagrammatic logics, using Euler diagrams as a case study. 

The DSEL is written in Haskell with type system extensions that allow the exploitation of 

a combination of ideas from Constructive Type Theory. These extensions offer an increase 

in expressiveness over Hindley-Milner type systems and have been used for program verification. 

We use these extensions to create enhanced static constraints to enforce invariants 

on diagrams and transformations (inference rules). Our work is at an early stage and we 

describe the goals and challenges ahead. The major goal is to create a DSEL for generalized 

constraint diagrams, a visual logic expressive enough to be useful for modelling software, 

and to extract the types of the resulting diagrams for use as software artefacts. 

1 Introduction



A great deal of effort is spent on attempts to increase software reliability and the productivity 

of programmers, by both the research community and the software industry. Of 

the techniques employed (development methodologies, systematic modelling, automated 

testing), formal methods have been little used outside of the most safety-critical sectors 

where they are used to verify semantic properties of software and to assure desired runtime 

conditions. We believe the benefits of their more widespread use could be great, but 

the impact of factors inhibiting adoption needs to be reduced. These factors may include 

the fact that existing techniques are seen as difficult to use, time-consuming and requiring 

specialised expertise. There is, therefore, a need for more “lightweight” formal methods 

which are accessible to programmers with a minimum of specialised training and which 

fit in seamlessly with the tools they employ. Sheard has said that enabling programmers 

to make statements about semantic properties of the code they write directly, rather than 

turning to external tools with high barriers to entry (likely to be written by, and for, mathematicians) 

will make it more likely that they do so — in short, that the semantic gap 

between the tools for programming and those for formal reasoning is damaging to the 

cause of both (Sheard, 2004). 

At the same time as the Unified Modelling Language (UML) was adopted as a standard 

visual language for modelling software in the 1990s, breakthroughs occurred in the

use of diagrams as visual logics (Shin, 1994; Hammer, 1995). Shin proved soundness and 

completeness results for the so-called Venn-II reasoning system, equivalent in expressive 

power to Monadic First Order Logic, and research began into a number of diagrammatic 

reasoning systems varying in notation and expressive power. The connection between 

the new formalised diagrams and those used in software modelling was quickly made. 

Although the UML works well to describe the architecture of a system it is not always expressive 

enough to capture all invariants we might wish to enforce, a fact which led to the 

development of the (non-graphical) Object Constraint Language (OCL). Kent proposed 

constraint diagrams as a purely diagrammatic alternative to the OCL, more appropriately 



complementing the UML’s visual nature (Kent, 1997). The constraint diagram in figure 1 

shows a constraint in a library management system. Amongst other things it states that 

people can only borrow books that are in the collections of libraries they have joined. 

Figure 1: A constraint diagram and an Euler diagram. 

There are many reasons that we might want to use diagrams to represent information, 

including the potential of diagrams for well matchedness and free rides. A diagram is 

well matched to its subject if it presents the key features of that subject effectively and in 

a way that seems intuitively clear to the viewer (Gurr and Tourlas, 2000). A well matched 

diagram can make certain reasoning tasks appear to be easier when compared with a symbolic 

representation of the same information. Free rides occur when a diagram provides 

some information ‘naturally’ or ‘for free’ which would need to be explicitly stated in, or 

derived from, a symbolic representation (Shimojima, 2004). For example, in the Euler diagram 

in figure 1, the fact that the contour Spaniels is placed within GunDogs asserts directly 

that Spaniels ⊆ GunDogs but also allows the viewer to infer Spaniels ⊆ Dogs and 

Spaniels ∩ Cats = ∅. Details of well matchedness and free rides in constraint diagrams 

can be found in (Stapleton and Delaney, 2008). In some circumstances the expressive 

power of diagrams can produce ambiguity, or lead the viewer to make false inferences. 

However, many diagrammatic notations now have formal, unambiguous semantics, of 

which Euler and constraint diagrams are prominent examples. 

Our ultimate goal is to create a Domain Specific Embedded Language (DSEL) for 

several systems of diagrammatic reasoning, with two main aims: to explore the benefits 

and boundaries of the emerging style of programming that mixes formal methods with 

programming, and to support the work which aims to establish visual logics as a valuable 

tool in formal methods. 

The DSEL will be written in Haskell and will consist of statically verified code which 

will allow the user to manipulate and reason with a variety of visual logics such as Euler 

diagrams, spider diagrams and constraint diagrams (see Section 2). The DSEL, therefore, 

shares one of the primary aims of visual logics — to make formal reasoning more accessible 

and widely used. Reasoning about design and reasoning about implementation have traditionally

taken place in separate phases of the software process, with the onus on the programmer 

to bridge the gap between the two. One of the benefits of combining both activities in one 

phase is that constraints modelled by a programmer using the DSEL will form software 

components in their own right, resulting in diagrams with the same type as functions in 



the modelled system. This suggests that such constraints could eventually form part of 

working software, perhaps as part of a “trusted kernel” used by other components, following 

the approach of (Kiselyov and Shan, 2007). The form and function of the DSEL will 

therefore be closely linked — a formally specified language to assist formal reasoning. 

Advances in Programming Language Theory are typically explored in research languages 

before percolating into more widely used languages. This is especially true of 

modern functional languages and, in particular, Haskell. The Haskell type system with 

the extensions provided by the GHC compiler makes it possible to explore what Sheard

called (when speaking of the closely related language Ωmega) “a new point in the design 

space of formal reasoning systems — part programming language, part logical framework” 

(Sheard, 2004) and to do so directly within the environment of a practical language 

with efficient implementations. The language features that enable this can be used to emulate 

the behaviour of fully dependently typed languages such as Epigram (Altenkirch, 

McBride and McKinna, 2005), resulting in what have been called “pseudo-dependently

typed” systems, described in Section 4. The syntactic clarity, referential transparency 

and similarity to mathematical notation of functional languages are also of benefit to us. 

These features help us in our goal to minimise syntactic differences between the DSEL 

and the diagrammatic logics we implement, making it easier to demonstrate a clear mapping 

between the two. The point of this mapping is to demonstrate “literal preservation 

of syntactic relations under denotation”, as Hammer states the conditions for resemblance 

between a sign and that which it signifies (Hammer, 1995). 

In Section 2 we describe reasoning with Euler diagrams. Section 3 gives an overview 

of type theoretic features making their way into programming languages while Section 

4 looks ahead to the form our DSEL will take, using Euler diagrams as a case study. In 

Section 5 we consider the goals of the research, evaluate the strategies used to reach them 

and identify some of the challenges ahead. 

2 Reasoning with Euler diagrams 

Although diagrams have often been used to aid understanding in mathematical proofs, 

they have until fairly recently been treated as informal and secondary to formalized symbolic 

content. In the 1990s the work of Shin began to put diagrams on a different standing 

by proving soundness and completeness results for the Venn-II reasoning system, an extension 

and formalisation of earlier work by Venn and Peirce (Shin, 1994). Stapleton 

provides a summary of the history of diagrammatic reasoning since then, which is now 

a rapidly evolving and active research area (Stapleton, 2007). What makes such logics 

interesting, given the existence of mature symbolic reasoning techniques, is the combination 

of formal reasoning with the compact and intuitive nature of diagrams referred 

to previously. We expect that this, and the efforts to create supporting tools, will make 

formal reasoning more accessible to non-logicians. 

An Euler diagram is a collection of closed curves called contours which represent sets, 

within an enclosing rectangle. Figure 2 shows an example with three contours, labelled 

A, B and C. Containment, intersection and disjointness are represented by the placement 

of contours, so the same diagram asserts C ⊆ A and B ∩ C = ∅. A zone is a set of 

points in the diagram that can be described as being inside certain contours and outside 

all others. The diagram in figure 2 has five zones: one inside A but outside B and C, one

inside A and C but outside B, and so forth. The region outside of all contours is also a

zone. Shading within a zone asserts the emptiness of the set represented by that zone. So,

the shading of the diagram in figure 2 asserts A ∩ B = ∅ and A − C = ∅.

Figure 2: An Euler diagram.

Reasoning is carried out by the application of rules which transform one diagram into 

another, such as Add Contour and Remove Shading; a sound and complete set is given in 

(Stapleton, Masthoff, Flower, Fish and Southern, 2007). A proof using Euler diagrams is 

formed by applying these rules repeatedly to transform an initial diagram (the premise) 

into the target diagram (the conclusion); figure 3 shows a short example. The Add Shaded 

Zone rule is applied to transform d1 to d2. A new shaded zone can be added at any 

time since both a shaded zone and a missing zone assert the emptiness of the represented 

set; both d1 and d2 state that A and B are disjoint. The Add Contour rule is applied 

to transform d2 to d3. The new contour C intersects all existing zones without changing 

their shading. Since this operation introduces no new shading and the way that C is added 

ensures that no missing zones are created, d2 and d3 have the same meaning. 

Figure 3: An Euler diagram proof. 

The diagrams are formalised using an abstract syntax. The abstraction of Euler diagrams 

that we present here is obtained from (Stapleton et al., 2007). Each zone is represented 

as a tuple of the set of labels of contours that the zone is inside and the set of 

labels of contours the zone is outside. For example, in diagram d1, figure 3, the only zone 

inside A has the abstraction ({A}, {B}). Diagrams are represented as a tuple of the set 

of labels (L), the set of zones (Z) and the set of shaded zones (Z*). Thus, diagram d2 in

figure 3 has abstraction:

〈L = {A, B}, Z = {({A}, {B}), ({B}, {A}), ({A, B}, ∅), (∅, {A, B})}, Z* = {({A, B}, ∅)}〉
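
To make the abstract syntax concrete, the following term-level Haskell sketch encodes zones and diagrams with Data.Set and replays the two proof steps of figure 3. It is an illustration only and not the type-level encoding developed in Section 4. The names Diagram, zone, wellFormed, addShadedZone and addContour are hypothetical, the abstraction given for d1 is inferred from the description of figure 3, and the well-formedness conditions (each zone partitions the label set, shaded zones are zones) are assumptions.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

type Label = Char
type Zone  = (Set Label, Set Label)   -- (labels the zone is inside, labels it is outside)

data Diagram = Diagram
  { labels :: Set Label
  , zones  :: Set Zone
  , shaded :: Set Zone
  } deriving (Eq, Show)

-- Build a zone from the labels it is inside and outside.
zone :: String -> String -> Zone
zone inside outside = (Set.fromList inside, Set.fromList outside)

-- d2 follows the abstraction in the text; d1 is an assumed reading of figure 3.
d1, d2 :: Diagram
d1 = Diagram (Set.fromList "AB")
             (Set.fromList [zone "A" "B", zone "B" "A", zone "" "AB"])
             Set.empty
d2 = Diagram (Set.fromList "AB")
             (Set.fromList [zone "A" "B", zone "B" "A", zone "AB" "", zone "" "AB"])
             (Set.fromList [zone "AB" ""])

-- Assumed invariants: every zone splits the label set into disjoint
-- inside/outside parts, and only existing zones can be shaded.
wellFormed :: Diagram -> Bool
wellFormed d = all okZone (zones d) && shaded d `Set.isSubsetOf` zones d
  where
    okZone (i, o) = Set.null (Set.intersection i o) && Set.union i o == labels d

-- Add Shaded Zone: a missing zone may be added as long as it is shaded.
addShadedZone :: Zone -> Diagram -> Maybe Diagram
addShadedZone z d
  | z `Set.member` zones d = Nothing
  | wellFormed d'          = Just d'
  | otherwise              = Nothing
  where
    d' = d { zones = Set.insert z (zones d), shaded = Set.insert z (shaded d) }

-- Add Contour: the new contour splits every zone in two, preserving shading.
addContour :: Label -> Diagram -> Maybe Diagram
addContour c d
  | c `Set.member` labels d = Nothing
  | otherwise =
      Just (Diagram (Set.insert c (labels d)) (split (zones d)) (split (shaded d)))
  where
    split zs = Set.fromList
      (concatMap (\(i, o) -> [(Set.insert c i, o), (i, Set.insert c o)]) (Set.toList zs))
```

Under these assumptions, addShadedZone (zone "AB" "") d1 returns Just d2, and addContour 'C' d2 yields a diagram with eight zones, two of them shaded, which mirrors the steps from d1 to d2 and from d2 to d3.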

There are a number of logics that extend this system of Euler diagrams, including spider 

diagrams (Howse, Stapleton and Taylor, 2005) and the constraint diagrams mentioned 


previously. 


3 Dependent Typing and Proof-Carrying Code 

The Curry-Howard Isomorphism has a long history and arises from the observation of a 

correspondence between Hilbert-style deductive logic and combinatory models of computation. 

The work of Martin-Löf cast it as a more general principle linking logical formalisms 

and the type systems of programming languages (Martin-Löf, 1984). Rather 

than classifying values, types can be viewed as propositions; a value inhabiting type T 

corresponds to a proof of T. Martin-Löf’s type theory can be used as an environment for 

programming with dependent types (Nordstrom, Petersson and Smith, 1990). Dependent 

type systems are so-called because types may depend on a value, such as List a n, the 

type of collections of elements of type a with length n. For different values of n we have 

different types. A sketch of the logical rules for type-safe list operations is given as type 

judgements below. We assume the types Nat (of Peano numbers with constructors Zero 

and Succ n) and List a n. Γ is a typing context and Γ ⊢ σ type means that σ is a type in 

Γ. 

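\[
\frac{}{\Gamma \vdash \mathit{Nat}\ \text{type}}
\qquad
\frac{}{\Gamma \vdash \mathit{Zero} : \mathit{Nat}}
\qquad
\frac{\Gamma \vdash n : \mathit{Nat}}{\Gamma \vdash \mathit{Succ}\ n : \mathit{Nat}}
\]

\[
\frac{\Gamma \vdash t\ \text{type}}{\Gamma \vdash \mathit{empty}\ t : \mathit{List}\ t\ \mathit{Zero}}
\]

\[
\frac{\Gamma \vdash t\ \text{type} \quad \Gamma \vdash x : t \quad \Gamma \vdash n : \mathit{Nat} \quad \Gamma \vdash l : \mathit{List}\ t\ n}{\Gamma \vdash \mathit{cons}\ x\ l : \mathit{List}\ t\ (\mathit{Succ}\ n)}
\]

\[
\frac{\Gamma \vdash t\ \text{type} \quad \Gamma \vdash n : \mathit{Nat} \quad \Gamma \vdash l : \mathit{List}\ t\ (\mathit{Succ}\ n)}{\Gamma \vdash \mathit{tail}\ l : \mathit{List}\ t\ n}
\]

\[
\frac{\Gamma \vdash t\ \text{type} \quad \Gamma, n : \mathit{Nat} \vdash l : \mathit{List}\ t\ (\mathit{Succ}\ n)}{\Gamma \vdash \mathit{head}\ l : t}
\]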

Dependent type theory makes Curry-Howard (or propositions-as-types) useful in practical 

ways. The resulting type systems form the basis of automated theorem provers 

(Bertot and Casteran, 2004) and, on the other hand, purely functional and total programming 

languages (Altenkirch et al., 2005). The same insights inform more widely used 

languages at an accelerating rate, especially Haskell, which plays the dual rôle of research 

language and practical tool. The type system of Haskell with extensions is flexible 

enough to emulate many aspects of dependent typing and to create programs whose types 

act as proof that their implementation conforms to their specification. 

4 Haskell and the DSEL for Euler Diagrams 

Programming our diagrammatic DSEL is at the prototype stage. Its foundation is a type-level

Set library which encodes and ensures constraints such as set membership, disjointness 

and so on. Above this will sit the implementation of several diagrammatic logics. 

Two diagrammatic transformations corresponding to inference rules in an Euler diagram 

system are presented as type judgements below.


In a language such as Haskell we may not mix types and terms in the way described in 

Section 3. The collection of techniques used to achieve something often called “pseudo-dependent

typing” includes type-level representations of the indexing term supplied to the 

type constructor; to use the example from Section 3, since we have no type-level numbers 

we represent n in List a n by types formed of the empty Haskell type constructors Z and 

Succ n, such as Succ (Succ Z ). 
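
A minimal sketch of this encoding, assuming GHC with the GADTs extension, is given below. The empty types Z and Succ n index the length of a list, so that head and tail are only typeable on non-empty lists, as in the judgements of Section 3. The constructor names Nil and Cons and the function names headL, tailL and toPlainList are hypothetical, chosen to avoid clashes with the Prelude.

```haskell
{-# LANGUAGE GADTs, EmptyDataDecls #-}

-- Type-level natural numbers: empty types with no values.
data Z
data Succ n

-- A list of elements of type a whose length n is tracked in its type.
data List a n where
  Nil  :: List a Z
  Cons :: a -> List a n -> List a (Succ n)

-- Only defined for lists whose length has the form Succ n, so the
-- Nil case cannot arise and no runtime check is needed.
headL :: List a (Succ n) -> a
headL (Cons x _) = x

-- The result type records that the tail is one element shorter.
tailL :: List a (Succ n) -> List a n
tailL (Cons _ xs) = xs

-- Forget the length index and return an ordinary Haskell list.
toPlainList :: List a n -> [a]
toPlainList Nil         = []
toPlainList (Cons x xs) = x : toPlainList xs
```

With these definitions an expression such as headL Nil is rejected by the type checker, because Z cannot be matched against Succ n, whereas headL (Cons 1 Nil) is accepted.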

It is important to distinguish type-level from term-level computations. In the term-level

of a programming language with partial functions the result of any function may be 

undefined (⊥), and so programs are not proofs. Type functions like union below are not 

functions over values, are defined extensionally and exclude the undefined. The DSEL 

comprises two main components: the domain-specific, dependently typed theory

of diagrammatic reasoning, which provides assurances about the correct formation of 

diagrams and application of reasoning rules, and the interactive front end which makes use 

of this type system and is subject to the usual limitations of the host language. Although 

we do not use a dependently typed host language, our approach is similar in spirit to 

(Oury and Swierstra, 2008) who use Agda to enforce sophisticated constraints statically 

in a series of DSELs. 

Since type-level values are distinct from terms, special measures are required to handle 

them at runtime. We use a combination of techniques involving empty and existential 

types (Peyton Jones, 2008) to do this. As an example of our strategy, the types A, B and 

C below are empty types used to represent the labels of contours in a diagram: 

    data A ; data B ; data C
    data L a where
      AL :: L A
      BL :: L B

    data Nil
    data t ⊲ ts

The type L a lifts labels into a more general type, allowing us to consider labels of any 

type. The type constructors Nil and ⊲ are used as the building blocks of sets of labels. 

LBox and LSetBox use “existential boxing” to wrap type-level values of LSet t, allowing 

us to handle the outer type at runtime but for the “boxed” value to remain available for 

inspection by constraints: 

    data LSet t where
      Empty :: LSet Nil
      Ins   :: L a → LSet t → LSet (a ⊲ t)

    data LBox    = ∀a. LBox (L a)
    data LSetBox = ∀t. LSetBox (LSet t)

By creating a function fromChar :: Char → LBox we can box runtime values and insert 

them into boxed sets with a function that calls on fromChar, insertChar :: Char → 

LSetBox → LSetBox. When insertChar is used to add elements to a set of type 

LSetBox, a correspondence is enforced between the collection of values and the type 

of its LSet t parameter. The value of a collection can be seen as fully determined by 

the type of this parameter, which is a proof ensuring that inserted elements are members 

of the resulting collection. Assurances for the semantics of sets may be encoded using 

constraints written using Indexed Type Families (Peyton Jones, 2008). 
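
The boxing strategy described above can be sketched as follows. This is a minimal illustration under stated assumptions: the operator :> is an ASCII stand-in for ⊲, the constructor CL is assumed by analogy with AL and BL, fromChar and insertChar return Maybe (unlike the signatures quoted in the text) so that unknown characters are handled totally, and labelChar, toChars and unbox are hypothetical helpers.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE TypeOperators #-}
module Main where

-- Empty types representing contour labels and type-level sets of labels.
data A
data B
data C
data Nil
data t :> ts            -- ASCII stand-in for the paper's ⊲
infixr 5 :>

data L a where
  AL :: L A
  BL :: L B
  CL :: L C             -- assumed by analogy with AL and BL

data LSet t where
  Empty :: LSet Nil
  Ins   :: L a -> LSet t -> LSet (a :> t)

-- Existential boxes hide the type index so values can be built at runtime.
data LBox    = forall a. LBox (L a)
data LSetBox = forall t. LSetBox (LSet t)

-- Returns Maybe (an assumption) so that unknown characters are rejected.
fromChar :: Char -> Maybe LBox
fromChar 'A' = Just (LBox AL)
fromChar 'B' = Just (LBox BL)
fromChar 'C' = Just (LBox CL)
fromChar _   = Nothing

-- Unbox the set, insert the label, and box the result again.
insertChar :: Char -> LSetBox -> Maybe LSetBox
insertChar c (LSetBox s) =
  case fromChar c of
    Just (LBox l) -> Just (LSetBox (Ins l s))
    Nothing       -> Nothing

-- Hypothetical helpers that read the labels back out of a boxed set.
labelChar :: L a -> Char
labelChar AL = 'A'
labelChar BL = 'B'
labelChar CL = 'C'

toChars :: LSet t -> String
toChars Empty     = ""
toChars (Ins l s) = labelChar l : toChars s

unbox :: LSetBox -> String
unbox (LSetBox s) = toChars s

main :: IO ()
main = do
  print (fmap unbox (insertChar 'A' (LSetBox Empty) >>= insertChar 'B'))
  print (fmap unbox (insertChar 'x' (LSetBox Empty)))
```

Inside insertChar the set has the type LSet (a :> t) for some hidden a and t, so the index still tracks every inserted label even though the outer type is just LSetBox.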

4.1 Judgement Rules


We model the tuples found in the Euler diagram abstraction with the types Z l1 l2 (zones) 

and D l z z∗ (diagrams). The type judgements below are a fragment of a self-contained

type theory of Euler diagrams based on the abstract syntax given in (Stapleton et al., 

2007). Once complete, this type theory will be implemented using the techniques in the 

previous section to produce a DSEL with enhanced static constraints. 

Two kinds of element appear in the judgement rules: typing judgements, e.g. x : a,

and type constraints, e.g. Γ ⊢ C x y type, meaning that the type C x y can be formed in

the context Γ. Type constraints presumed to be defined in the Set library such as Disjoint 

appear capitalised, while functions from types to types are in lower case, e.g. union. 

We use the constraints Label, LabelSet, Zone and ZoneSet to restrict the input to type 

constructors. 

Supplied with disjoint sets of labels, l1 and l2 , Z constructs a zone: 

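\[
\frac{\Gamma \vdash \mathit{LabelSet}\ l_1\ \mathrm{type} \qquad \Gamma \vdash \mathit{LabelSet}\ l_2\ \mathrm{type} \qquad \Gamma \vdash \mathit{Disjoint}\ l_1\ l_2\ \mathrm{type}}
     {\Gamma \vdash Z\ l_1\ l_2\ \mathrm{type}}
\]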
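
One possible way to mimic the zone formation rule in Haskell is sketched below. It relies on closed type families, a GHC feature that postdates the indexed type families cited in this paper, so it should be read as an illustration rather than as the Set library's definition. The families Member, Disjoint and DisjointStep, the constructor MkZ, the class KnownBool and the operator :> (standing in for ⊲) are all assumptions made for this example.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where

import Data.Proxy (Proxy (..))

data A
data B
data Nil
data t :> ts            -- ASCII stand-in for the paper's ⊲
infixr 5 :>

-- Membership test on type-level sets; the repeated x in the second
-- equation matches only when both occurrences are the same type.
type family Member x s :: Bool where
  Member x Nil      = 'False
  Member x (x :> s) = 'True
  Member x (y :> s) = Member x s

-- Two sets are disjoint when no element of the first occurs in the second.
type family Disjoint s t :: Bool where
  Disjoint Nil      t = 'True
  Disjoint (x :> s) t = DisjointStep (Member x t) s t

type family DisjointStep (found :: Bool) s t :: Bool where
  DisjointStep 'True  s t = 'False
  DisjointStep 'False s t = Disjoint s t

-- A zone can be built only when its two label sets are provably disjoint.
data Z l1 l2 where
  MkZ :: (Disjoint l1 l2 ~ 'True) => Proxy l1 -> Proxy l2 -> Z l1 l2

-- Hypothetical helper class that reflects a type-level Bool as a value.
class KnownBool (b :: Bool) where
  boolVal :: Proxy b -> Bool
instance KnownBool 'True  where boolVal _ = True
instance KnownBool 'False where boolVal _ = False

-- Accepted: {A} and {B} are disjoint.
okZone :: Z (A :> Nil) (B :> Nil)
okZone = MkZ Proxy Proxy

main :: IO ()
main = do
  okZone `seq` putStrLn "zone accepted"
  print (boolVal (Proxy :: Proxy (Disjoint (A :> Nil) (B :> Nil))))
  print (boolVal (Proxy :: Proxy (Disjoint (A :> B :> Nil) (B :> Nil))))
```

Changing the signature of okZone to Z (A :> Nil) (A :> Nil) makes the module fail to compile, which is the static guarantee the judgement expresses.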

The syntactic rules state that given a diagram D l z z∗, the zones z form a superset of

the shaded zones z∗. Also, for each zone Z l1 l2 in z and z∗, l1 and l2 form a partition of

l. The Invs rule applies these constraints to a diagram:

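\[
\frac{\Gamma \vdash \mathit{Invs}\ l\ z\ \mathrm{type} \qquad \Gamma \vdash \mathit{Invs}\ l\ z^{*}\ \mathrm{type} \qquad \Gamma \vdash \mathit{Subset}\ z^{*}\ z\ \mathrm{type}}
     {\Gamma \vdash D\ l\ z\ z^{*}\ \mathrm{type}}
\]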

The Inv rule applies the relevant constraint to an individual zone. The base case for 

applying Inv is: 

\[
\frac{\Gamma \vdash \mathit{Label}\ l\ \mathrm{type}}
     {\Gamma \vdash \mathit{Invs}\ l\ \mathit{Nil}\ \mathrm{type}}
\]

The inductive case for applying Inv is: 

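\[
\frac{\Gamma \vdash \mathit{Label}\ l\ \mathrm{type} \qquad \Gamma \vdash \mathit{ZoneSet}\ (z \triangleleft \mathit{zs})\ \mathrm{type} \qquad \Gamma \vdash \mathit{Inv}\ l\ z\ \mathrm{type} \qquad \Gamma \vdash \mathit{Invs}\ l\ \mathit{zs}\ \mathrm{type}}
     {\Gamma \vdash \mathit{Invs}\ l\ (z \triangleleft \mathit{zs})\ \mathrm{type}}
\]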

Since l1 ∩ l2 = ∅, they partition l if l1 ∪ l2 = l: 

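\[
\frac{\Gamma \vdash Z\ l_1\ l_2\ \mathrm{type} \qquad \Gamma \vdash u : \mathit{union}\ l_1\ l_2 \qquad \Gamma \vdash \mathit{LabelSet}\ \mathit{ls}\ \mathrm{type} \qquad \Gamma \vdash \mathit{Eq}\ l\ u\ \mathrm{type}}
     {\Gamma \vdash \mathit{Inv}\ \mathit{ls}\ (Z\ l_1\ l_2)\ \mathrm{type}}
\]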

The quotes that begin the following subsections are from Stapleton et al. (2007), from

which we take reasoning rules and translate them to typing judgements. The invariants are

not tested after applying the rules since previous judgements guarantee that if a diagram 

can be formed, the invariants have been met. 

4.1.1 Remove Shaded Zone


“A shaded zone can be removed but only if there is at least one zone inside each contour 

in the resulting diagram and the zone outside all the contours remains”. In Figure 4, the

Remove Shaded Zone rule can be applied to transform d1 into d2. 

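\[
\frac{\begin{array}{c}
\Gamma \vdash \mathit{Zone}\ x\ \mathrm{type} \qquad \Gamma \vdash D\ l\ z\ z^{*}\ \mathrm{type} \\
\Gamma \vdash z' : \mathit{delete}\ x\ z \qquad \Gamma \vdash z^{*\prime} : \mathit{delete}\ x\ z^{*} \qquad \Gamma \vdash \mathit{Member}\ x\ z^{*}\ \mathrm{type}
\end{array}}
{\Gamma \vdash \mathit{transform}\ \mathit{RemoveShadedZone}\ x\ (D\ l\ z\ z^{*}) : (D\ l\ z'\ z^{*\prime})}
\]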
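
The type function delete used in this rule could be approximated with a closed type family, as in the sketch below. This is an illustration under assumptions: the family names Delete and RemoveShadedZone, the operator :> (for ⊲), the zone synonyms and the two example diagrams are hypothetical, and the sketch omits the Member premise that the rule requires.

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
module Main where

import Data.Type.Equality ((:~:) (Refl))

data Nil
data t :> ts            -- ASCII stand-in for the paper's ⊲
infixr 5 :>
data Z l1 l2            -- zone: labels containing it, labels excluding it
data D l z zs           -- diagram: labels, zones, shaded zones
data A
data B

-- Remove every occurrence of x from a type-level set.
type family Delete x s where
  Delete x Nil      = Nil
  Delete x (x :> s) = Delete x s
  Delete x (y :> s) = y :> Delete x s

-- Delete the zone from both the zone set and the shaded zone set.
-- The Member premise of the rule is not checked in this sketch.
type family RemoveShadedZone x d where
  RemoveShadedZone x (D l z zs) = D l (Delete x z) (Delete x zs)

-- Hypothetical example: two contours A and B whose overlap is shaded.
type Out  = Z Nil (A :> B :> Nil)
type InA  = Z (A :> Nil) (B :> Nil)
type InB  = Z (B :> Nil) (A :> Nil)
type InAB = Z (A :> B :> Nil) Nil

type Before = D (A :> B :> Nil) (Out :> InA :> InAB :> InB :> Nil) (InAB :> Nil)
type After  = D (A :> B :> Nil) (Out :> InA :> InB :> Nil) Nil

-- Compiles only if the type family computes exactly the expected diagram.
example :: RemoveShadedZone InAB Before :~: After
example = Refl

main :: IO ()
main = case example of
  Refl -> putStrLn "RemoveShadedZone example type-checks"
```

Because the check is a type equality, a wrong expectation for After is reported at compile time rather than at runtime.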

4.1.2 Add Contour


Figure 4: Three Euler diagrams. 

“A contour can be added to a diagram provided its label is not already in the diagram. Each 

zone is split into two zones (one inside and one outside the new contour), and shading is 

preserved”. In Figure 4, the Add Contour rule can be applied to transform d1 into d3.

Before we can add contours we need a way of replacing each zone z : Z l1 l2 in a set

with two copies of itself, one with an extra label added to l1, one with that same label 

added to l2. 

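\[
\frac{\Gamma \vdash \mathit{Label}\ c\ \mathrm{type}}
     {\Gamma \vdash \mathit{splitZones}\ c\ \mathit{Nil} : \mathit{Nil}}
\]

\[
\frac{\begin{array}{c}
\Gamma \vdash \mathit{ZoneSet}\ (z \triangleleft \mathit{zs})\ \mathrm{type} \qquad \Gamma \vdash \mathit{Label}\ c\ \mathrm{type} \\
\Gamma \vdash z_2 : \mathit{insertLabel}\ \mathit{Excl}\ c\ z \qquad \Gamma \vdash z_1 : \mathit{insertLabel}\ \mathit{Incl}\ c\ z
\end{array}}
{\Gamma \vdash \mathit{splitZones}\ c\ (z \triangleleft \mathit{zs}) : (z_1 \triangleleft z_2 \triangleleft (\mathit{splitZones}\ c\ \mathit{zs}))}
\]

\[
\frac{\Gamma \vdash Z\ l_1\ l_2\ \mathrm{type} \qquad \Gamma \vdash \mathit{Label}\ c\ \mathrm{type} \qquad \Gamma \vdash l_3 : c \triangleleft l_1}
     {\Gamma \vdash \mathit{insertLabel}\ \mathit{Incl}\ c\ (Z\ l_1\ l_2) : (Z\ l_3\ l_2)}
\]

\[
\frac{\Gamma \vdash Z\ l_1\ l_2\ \mathrm{type} \qquad \Gamma \vdash \mathit{Label}\ c\ \mathrm{type} \qquad \Gamma \vdash l_3 : c \triangleleft l_2}
     {\Gamma \vdash \mathit{insertLabel}\ \mathit{Excl}\ c\ (Z\ l_1\ l_2) : (Z\ l_1\ l_3)}
\]

\[
\frac{\begin{array}{c}
\Gamma \vdash \mathit{Label}\ c\ \mathrm{type} \qquad \Gamma \vdash D\ l\ z\ z^{*}\ \mathrm{type} \qquad \Gamma \vdash l' : c \triangleleft l \\
\Gamma \vdash z' : \mathit{splitZones}\ c\ z \qquad \Gamma \vdash z^{*\prime} : \mathit{splitZones}\ c\ z^{*}
\end{array}}
{\Gamma \vdash \mathit{transform}\ \mathit{AddContour}\ c\ (D\ l\ z\ z^{*}) : (D\ l'\ z'\ z^{*\prime})}
\]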
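
The rules above translate fairly directly into type families. The following sketch is one hedged reading of them: the family names InsertIncl, InsertExcl, SplitZones and AddContour, the operator :> (for ⊲) and the example diagrams Before and After are assumptions made for illustration, and the Label and ZoneSet premises are not checked.

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where

import Data.Type.Equality ((:~:) (Refl))

data Nil
data t :> ts            -- ASCII stand-in for the paper's ⊲
infixr 5 :>
data Z l1 l2            -- zone: labels containing it, labels excluding it
data D l z zs           -- diagram: labels, zones, shaded zones
data A
data B

-- insertLabel Incl: the zone lies inside the new contour c.
type family InsertIncl c z where
  InsertIncl c (Z l1 l2) = Z (c :> l1) l2

-- insertLabel Excl: the zone lies outside the new contour c.
type family InsertExcl c z where
  InsertExcl c (Z l1 l2) = Z l1 (c :> l2)

-- Replace every zone by its two copies, inside and outside c.
type family SplitZones c zs where
  SplitZones c Nil       = Nil
  SplitZones c (z :> zs) = InsertIncl c z :> InsertExcl c z :> SplitZones c zs

-- Add the label and split both the zones and the shaded zones.
type family AddContour c d where
  AddContour c (D l z zs) = D (c :> l) (SplitZones c z) (SplitZones c zs)

-- Hypothetical example: a single contour A with no shading, then add B.
type Before = D (A :> Nil) (Z (A :> Nil) Nil :> Z Nil (A :> Nil) :> Nil) Nil

type ZonesAfter =
  Z (B :> A :> Nil) Nil :> Z (A :> Nil) (B :> Nil) :>
  Z (B :> Nil) (A :> Nil) :> Z Nil (B :> A :> Nil) :> Nil

type After = D (B :> A :> Nil) ZonesAfter Nil

-- Compiles only if AddContour computes exactly the expected diagram.
example :: AddContour B Before :~: After
example = Refl

main :: IO ()
main = case example of
  Refl -> putStrLn "AddContour example type-checks"
```

Each of the two original zones yields an inside copy followed by an outside copy, in the same order as z1 and z2 in the splitZones rule.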

5 Conclusions and Further Work 

We have presented part of a DSEL for Euler diagrams that closely mirrors their abstract 

syntax and which allows us to inherit the definitions of reasoning rules in a seamless 

way. We have extended the approach of Section 4.1 to a complete set of reasoning rules,

providing a type theoretical version of Euler diagrams. Providing a self-contained type 

theory for the DSEL (beginning with the simplest case of a set of rules for reasoning with 

Euler diagrams and extending this to more complex cases) will make results relating to 

the logics (soundness, completeness, etc.) transferable, giving the DSEL the status of a 

reasoning tool in its own right. 

Our goal is to extend the current approach to more expressive notations, such as generalized 

constraint diagrams, which are expressive enough to be used when modelling 

software (Stapleton and Delaney, 2008). It is ultimately expected that the DSEL will be 

used by higher level tools which allow the user to select from contextually legitimate diagram 

transformations. Diagrams created using the DSEL (with or without the support of 

additional tools) will have a type which captures the modelled constraint. If the modelled 

software is written in the same language as the constraint and there is a correspondence 

between the datatypes used in each, we may be able to use the constraint as part of a 

“trusted kernel” exporting a safe subset of constructors via the module system. This scenario, 

in which the programmer uses tools to model constraints then applies them directly 

within the implementation phase, will provide a more unified and, ideally, a more usable 

programming/verification environment than exists today. 

Combining types with terms requires careful design. Some of the solutions, such as 

existential boxing, introduce levels of indirection which are unnecessary in more specialised 

environments and which may threaten to obscure the relationship with underlying 

diagrammatic logics, at least superficially. If we were to use a language such as 

Coq or Epigram to implement the DSEL it is possible that we could find a more natural 

expression of many types and constraints. We believe, however, given our central aim of

accessibility, that these risks are offset by the benefits of using a more practical and accessible 

language than is available in the current generation of dependently typed systems. 

The limitations of these techniques and how they might be used to form a general strategy 

to combine verification and programming are some of the subjects of the research. The 

research will support the longer term goals of the diagrammatic reasoning community by 

providing an implementation of various visual logics which can be clearly linked to their 

related abstract syntax. Once extended to the case of constraint diagrams, the DSEL has 

the potential to shrink the toolchain used by programmers who wish to make statements 

about the semantic properties of the code they write. There are a number of interesting 

challenges involved in reaching that point, such as the issue of extracting the type of a 

diagram in a usable form. The work reported in this paper is a first step towards achieving 

these goals. 

Acknowledgements


I would like to express my sincere thanks to John Howse, Gem Stapleton and Richard 

Bosworth for their support and encouragement, and to the anonymous reviewers for their 

helpful comments. The author is supported by EPSRC Grant EP/P501318/1. 

References 


Altenkirch, T., McBride, C. and McKinna, J. (2005). Why dependent types matter, Available

online http://www.cs.nott.ac.uk/~txa/publ/ydtm.pdf Accessed 01/02/08.

Bertot, Y. and Castéran, P. (2004). Interactive Theorem Proving and Program Development,

Springer-Verlag.

Gurr, C. and Tourlas, K. (2000). Towards the principled design of software engineering 

diagrams, Proceedings of 22nd International Conference on Software Engineering, 

ACM Press, pp. 509–518. 

Hammer, E. (1995). Logic and Visual Information, CSLI, Stanford. 

Howse, J., Stapleton, G. and Taylor, J. (2005). Spider diagrams, LMS Journal of Computation

and Mathematics 8: 145–194. 

Kent, S. (1997). Constraint diagrams: Visualizing invariants in object oriented modelling, 

Proceedings of OOPSLA97, ACM Press, pp. 327–341. 

Kiselyov, O. and Shan, C.-C. (2007). Lightweight static capabilities, Electronic Notes in 

Theoretical Computer Science 174(7): 79–104. 

Martin-Löf, P. (1984). Constructive mathematics and computer programming, Royal Society 

of London Philosophical Transactions Series A pp. 501–518. 

Nordström, B., Petersson, K. and Smith, J. M. (1990). Programming in Martin-Löf’s Type

Theory, OUP. 

Oury, N. and Swierstra, W. (2008). The power of pi, Submitted to ICFP 2008. 

Available online http://www.cs.nott.ac.uk/~wss/Publications/ThePowerOfPi.pdf Accessed

01/05/08. 

Peyton Jones, S. (2008). GHC language features, Accessed 01/02/08

http://www.haskell.org/ghc/docs/latest/html/users_guide/ghc-language-features.html.

Sheard, T. (2004). Languages of the future, SIGPLAN Notices 39(12): 119–132. 

Shimojima, A. (2004). Inferential and expressive capacities of graphical representations: 

Survey and some generalizations, Proceedings of Diagrams 2004, Vol. 2980 

of LNAI, Springer, pp. 18–21. 

Shin, S. J. (1994). The Logical Status of Diagrams, CUP. 

Stapleton, G. (2007). Diagrammatic logics: Past, present and future, International Conference 

on Logic, Navya Nyaya and Applications, Jadavpur University, pp. 4–15. 

Stapleton, G. and Delaney, A. (2008). Evaluating and generalizing constraint diagrams, 

Accepted for Journal of Visual Languages and Computing. Available online from 

JVLC. 

Stapleton, G., Masthoff, J., Flower, J., Fish, A. and Southern, J. (2007). Automated theorem 

proving in Euler diagram systems, Journal of Automated Reasoning 39: 431–470.

FICTIONAL CONTINGENCIES 

Gemma Celestino


University of British Columbia & LOGOS Research Group 

Abstract. I argue that fictional contingencies, such as the fact that, in Tolstoy’s Anna Karenina,

Anna Karenina might not have fallen for Vronsky, pose a serious problem for a descriptivist

and possible worlds view of fiction such as the one defended by David Lewis and

Gregory Currie. Their view cannot account for the fact that in Tolstoy’s Anna Karenina, it

is Anna Karenina herself who contingently falls for Vronsky. In Tolstoy’s Anna Karenina, 

Anna Karenina falls for Vronsky in the actual world but she fails to fall for him in some 

possible world. 

1 Introduction



An interesting issue that arises within the topic of fiction is the issue of how to account for 

the intuitive contingencies of fictional characters. For at least some of the things that happen

to fictional characters within a story are supposed to happen only contingently. There is a

way in which certain views of fiction might account for these modal properties of fictional

characters that I think is mistaken, and I shall explain why in this paper.

Gregory Currie recently advanced such an account in his “Characters and Contingency” 

(2003). But his account is one that must be attractive to any follower of the 

Lewis-Currie descriptivist view of fictional names, or of what I take would be a natural 

two-dimensionalist extension of Robert Stalnaker’s position on true negative existentials 

and related matters. The account, in fact, only makes sense within a possible worlds 

framework of fiction. In short, the descriptivist view is the view that fictional names, 

unlike ordinary proper names, are, or are used by the author of the fiction as, non-rigid

definite descriptions. 

First, I shall explain the problem of fictional contingencies and argue that the explanation 

Currie offered does not work. I will also argue that this is a real problem for the descriptivist view of

fiction. Secondly, I shall consider other alternatives to descriptivism

within the possible worlds framework and conclude that no possible worlds view of

fiction looks promising. Finally, I will end with some positive suggestions that I would

like to develop elsewhere.

2 The Problem of Fictional Contingencies 

I shall motivate the problem I want to address in this paper by introducing the following 

pair of sentences: 

(1) Necessarily, someone who did not fall for Vronsky would not be Anna Karenina 

(2) Someone who necessarily fell for Vronsky would not be Anna Karenina 

Despite the apparent inconsistency between these two claims, both seem intuitively 

true. (1) is true because anything that a fictional story tells about its characters is essential 

to them. Tolstoy’s story about Anna Karenina tells us, among other things, that Anna 

Karenina falls for Vronsky. Hence, unlike what happens to non-fictional people like you 

and me, and due to its fictionality, it is a constitutive feature of Anna Karenina that she 

falls for Vronsky. Thus, it is necessary that she does. (2) is true because Tolstoy’s story 

is not a story in which Anna Karenina cannot but fall for Vronsky, but a story in which 

Anna Karenina falls for Vronsky only contingently. Thus, anyone who necessarily fell 

for Vronsky, who fell for Vronsky not contingently, would not be Anna Karenina. 

The apparent incompatibility or tension between (1) and (2) cannot be explained in 

terms of the distinction between truth in fiction and truth simpliciter, or any other similar 

distinction. For both seem to be true on one and the same reading. Neither of them is true

in the fiction. Rather, they are about the fictional character Anna Karenina. They specify 

some of its necessary qualities. 

3 The Descriptivist Way Out of the Problem 

The view I want to show to be wrong in this paper would accept the truth of both claims and

would explain it as follows: Anna Karenina possibly exists. That is to say, even if –as 

we all agree– Anna Karenina does not actually exist, there is some other possible world 

where she does. For to be Anna Karenina is simply to play the Anna Karenina-role and 

to play the Anna Karenina-role merely amounts to satisfying the general definite description

that could be extracted from the story told by Tolstoy, constructed out of everything

Tolstoy says about Anna in the story he tells, which is the exact meaning of the fictional 

name ‘Anna Karenina’, at least as it is used by Tolstoy. 

On this view, what one does when telling a fiction is to tell a story which, although not

actual, is possible. It is to qualitatively describe part of some possible worlds other than 

the actual. It is to explain some ways the actual world might have been but is not. Thus, 

the view is that Anna Karenina could have existed and fallen for Vronsky even if in fact 

this never occurred and never will in actuality. That Anna Karenina falls for Vronsky

is as possible as my turning off my laptop in a moment. 

What would explain the truth of (1), according to this view, is the fact that there is 

no possible world where someone plays the role of Anna Karenina but does not fall for 

Vronsky. This is so precisely because part of what it means to play this role is to fall for

Vronsky. Thus, it is true in every world that anyone who plays the Anna Karenina-role in 

that world falls for Vronsky. 

Nevertheless, (2) would be true as well because for every person who plays the Anna-role

in some possible world, there is at least one more world where that same person does

not fall for Vronsky, i.e. a world where she does not play the role of Anna Karenina (this

would be so because it is impossible to necessarily fall in love). The existence of these 

other possible worlds is what would explain the contingency of the falling for Vronsky 

by Anna Karenina. In “Characters and Contingency”, Currie advances such an account of 

the truth of (1) and (2). 

The reason why the explanation provided above does not work is that it does not explain 

what it has to explain, that is, the fact that in the fiction, Anna Karenina has the 

property of falling for Vronsky but only contingently so. This amounts to the fact that 

Anna Karenina herself must have the property in every story-world –i.e. where the Anna 

Karenina-role is satisfied–, but at the same time she (Anna Karenina and no one else) 

must fail to have that property while being Anna Karenina at some world, which must 

be possible with respect to the story-world. But it is Anna Karenina herself who must 

have the contingent property at one world and lack it at another. This is what contingency 

means. Otherwise, it is not true that Anna Karenina falls for Vronsky in a contingent way, 

but that someone else does. The problem is that the only way for this view to try to explain 

that contingency is by appealing to the possible worlds –which are not story-worlds– 

where the possible persons that, on this view, occupy the Anna Karenina-role, and thus 

are Anna, in some story-worlds, do not fall for Vronsky and, thereby, neither occupy the 

Anna Karenina-role nor are Anna in them. 

I see no way a possible worlds descriptivist view can handle this problem. However, I 

can see how one might reply. But the replies I envisage seem to be wrong as well. 

One might find the possible worlds explanation of fictional contingencies plausible 

and be easily misled into thinking that it is in fact right merely due to a natural tendency 

to forget what this possible worlds view tells us being Anna Karenina consists in and, 

as a result, come to have the following confused thought: that this person who does not 

fall for Vronsky in some world in which she does not occupy the Anna Karenina-role is, 

nevertheless, Anna Karenina also in such a world due to the fact that she is Anna Karenina 

in one of the worlds of the story, where she does occupy the Anna Karenina-role and does 

fall for Vronsky. But to evaluate this possible worlds view under this impression is to 

misunderstand what the view (at least, about being Anna Karenina) is. 

If that other person were to be Anna Karenina in any sense also in this other world 

where she does not fall for Vronsky, (1) would not be true. It would not be a necessary 

condition for being Anna Karenina to fall for Vronsky, for there would be some possible

worlds where Anna Karenina would not fall for him. These would precisely be the worlds 

where someone who occupies the Anna-role in one of the story-worlds exists and does 

not fall for Vronsky. As I argued above, however, there is no such sense for the case of 

being Anna Karenina. To think of that person, let’s say Jane, as being Anna Karenina also 

in that other world where she does not fall for Vronsky only because she does occupy the 

Anna Karenina-role at some world is to mistake what being Anna Karenina is, on such

a view, for what being Jane (or, in fact, any other real person) is. Currie explains this as 

follows: “Now consider Jane, a respectable inhabitant of the actual world. In the actual 

world she does not fall for Vronsky; in fact she never meets him. But, given what I have 

said just now, it may well be the case that Jane in some other world does fall for Vronsky; 

in that other world, Jane occupies the Anna-role. Does that make Jane, in this world, 

Anna Karenina? No. Being Anna is, according to me, something that happens to you in 

some worlds and not in others. It happens to you in worlds where you occupy the Anna 

role. In any world in which Jane occupies that role she is Anna. But that does not make 

her Anna in this world. Being Anna is not at all like being Jane. The person who is Jane in 

one world is Jane in all worlds. Being Jane is a matter of being a certain individual; being 

Anna, on the other hand, is a matter of occupying a certain role. Moving up a semantic 

step we can say that “Jane” is a proper name of an individual, whereas “Anna”, where it is 

the proper name of anything, is the proper name of a function from worlds to individuals. 

Of course when Tolstoy says that Anna did this or that, we are not from the point of view 

of our imaginative engagement with the work, to understand this as meaning that a role 

did this or that. This is because it is part of the fiction that “Anna” is the name of a person. 

But “Anna”, as used by Tolstoy, is not in fact the name of a person, nor does it purport to 

be. Names are expressions used in order to pick out individuals, and Tolstoy does not use 

“Anna” in order to do this, nor does he expect us to believe that he is. “Anna”, as used by 

Tolstoy, is not a name.” (Currie 2003, p. 141) 

On the other hand, one might also contemplate the possibility of fictional characters

enjoying a certain autonomy with respect to their stories, in such a way that one could

say that Tolstoy’s Anna Karenina could have had a different end, for instance. The idea

is that the characters would be well defined from the very beginning of the fiction, thus

opening up possibilities for their fate other than the ones that the author chose. Considering

this, one might think that the contingency of the properties of the characters could be 

reduced to the contingency of the writing process itself. Anna Karenina, for instance, 

might not have fallen for Vronsky precisely because Tolstoy might not have written that 

she did. However, this possibility would not save Currie’s explanation of the fictional 

contingency, or descriptivism about fictional names, since it is a wholly different explanation

not compatible with them. But also one could see that it would not work by considering 

the fact that one can write a fiction where characters have certain properties necessarily, 

and notwithstanding this, the contingency of the writing process remains; the author could 

have written a different story or this story a bit differently.

The conclusions I think we should draw from all of this go farther than the mere conclusion 

that the explanation of fictional contingencies I criticized is wrong and should be 

rejected. This problem that fictional contingencies pose and the incorrectness of this explanation 

indicate a deeper or more fundamental problem. It really shows why at least any 

descriptivist view that tries to explain fiction in terms of possible worlds –which seems to 

be their only way– is mistaken, and maybe it even shows that fiction cannot be accounted 

for in possible worlds terms at all; at least, for the case of fictions told by the use of 

singular terms such as proper names. In short, the problem is that this possible worlds 

descriptivist view cannot explain the truth of pairs like (1) and (2). For, in particular, it 

cannot explain the possession of any fictional contingency by any fictional character. 

4 Other Possible Descriptivist Ways Out 

If the possible worlds view has it that being Anna Karenina amounts to satisfying the non-rigid

definite description, which has as a part the description of this woman as falling 

for Vronsky, it will not succeed in explaining that Anna Karenina falls for Vronsky only 

contingently. For the simple reason that any woman who would be Anna Karenina at 

all would be so only in some worlds and precisely in those worlds where she falls for 

Vronsky. One might think, even against what Currie seems to insist, that there are two 

ways of being Anna Karenina, though: one of them, the one we already contemplated 

and the one that Currie tells us; the other, the one that the possible worlds view would 

like to have, while keeping the previous one, which is to be someone who at some story-world

satisfies the description that ‘Anna Karenina’ is or conveys, even if she does not do

so at some other possible worlds. In this sense anyone who met the description at some 

possible world, would be also Anna Karenina at all the other worlds where she existed 

even if she did not meet the description in them. This last sense does not seem to be 

compatible with the view that claims that ‘Anna Karenina’ is used as a non-rigid definite

description, and that when it is not, when it is used literally, it does not refer at all. But let’s

assume for a moment, for the sake of the argument, that it is.

This way there would be two ways of understanding the relevant pair of claims. According

to the interpretation corresponding to the first sense of being Anna Karenina,

(1) would be true but (2) false. And according to the interpretation corresponding to the 

second sense, while (2) would be true, (1) would be false. On neither of these two interpretations

do both claims come out true. Intuitively at least, however, they seem to

be true under one and the same interpretation. Both claims are about the features that 

characterize a fictional character, Anna Karenina. One of these features is to be someone 

who falls for Vronsky; another, to be someone who falls for Vronsky in a contingent way. 

One might think, though, that the intuitive truth of these two claims may be very well 

accounted for by considering a different interpretation of them in each case. However, 

there is no independent reason to interpret them this differently. This does not seem to be 

why we think they are both true. This way out of the problem fictional contingencies pose 

to this view would be completely ad hoc. 

In any case, there is no way on such a view to obtain what the view really needs. That 

is, that Anna Karenina, one and the same thing, has the property of falling for Vronsky, 

but lacks it at another possible world. For it is a condition on being Anna Karenina that 

she does so contingently. This is what having a contingent property amounts to. Note that 

the independent reason to argue for the legitimacy of using two different interpretations 

cannot be that ‘Anna Karenina’ can be used both as a non-rigid definite description and as

a rigid proper name and that while it is used as a non-rigid definite description in the case

of (1), it is used as a rigid proper name in the case of (2). For, according to the possible

worlds view, only within the fiction, ‘Anna Karenina’ is or comes to be used as an ordinary

rigid proper name. We cannot use the proper names that are used in these other possible

worlds. For these proper names are only possible, not actual. Note too that an appeal to the

ambiguity in scope due to the interaction between modalities and definite descriptions in 

(1) and (2) does not work either. For the problem is that we are dealing with fiction and 

fictional names and hence, there are no individuals that could stand in the place of these 

fictional characters other than the ones that satisfy the definite descriptions in question in 

each of the possible worlds. Thus, we can explain the consistency of the following pair 

of sentences: 

(3) Necessarily, the Queen of England is queen 

(4) The Queen of England may not have been queen 

by noticing the distinction in scope of the occurrences of the definite description ‘the 

Queen of England’ in (3) and (4), and explain that (4) can be true compatibly with the

truth of (3) because there is an individual –i.e. the Queen of England– who can exist in 

another possible world and not be the Queen of England in it. As I said, unlike in the 

case of fiction, this is possible precisely because there is in fact an individual who is the 

Queen of England in the actual world, whereas there is no such individual for the definite 

description that the fictional name ‘Anna Karenina’ allegedly abbreviates.
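
One standard way of displaying the scope distinction between (3) and (4) is sketched below. The notation is an assumption made for illustration and does not appear in the original discussion: Q(x) abbreviates "x is Queen of England", the bracketed prefix marks the scope of the definite description, and worlds in which nobody satisfies the description are set aside.

```latex
\[
\begin{aligned}
(3')\quad & \Box\, [\iota x\, Q(x)]\; Q(x)
  && \text{description inside the scope of } \Box \\
(4')\quad & [\iota x\, Q(x)]\; \Diamond\, \neg Q(x)
  && \text{description outside the scope of } \Diamond
\end{aligned}
\]
```

In (3') the description is evaluated afresh at each world, so whoever satisfies it there is queen there. In (4') the description is evaluated at the actual world first, and the individual it picks out is then considered at other worlds. The second reading is available only if some actual individual satisfies the description, which is the condition that fails in the fictional case.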

5 Non-Descriptivist Possible Worlds Views of Fiction 

One might think that perhaps there are other possible worlds views of fiction that are 

not descriptivist that could handle this problem of the fictional contingencies of fictional 

characters. I shall very briefly argue that the only available ones are not very attractive. 

Descriptivism seems to be the most plausible possible worlds view of fiction. 

I see two options: one might defend Meinongianism and say that fictional characters

actually exist in some special mysterious way and that fictional names are like ordinary 

proper names that rigidly refer to them. Or one might defend the view that fictional characters 

are abstract objects, which actually exist and to which the fictional names rigidly 

refer. Within this last option I see two further options: one might say that these abstract 

objects are only contingently so, so that in other worlds these same objects exist but are 

concrete instead of abstract in these worlds. The existence of these contingently nonconcrete 

objects is defended by Bernard Linsky and Edward N. Zalta not with respect to fictional

characters but with respect to mere possible objects –i.e. possibilia. Or one might defend 

that these abstract objects, like any other abstract objects, are necessarily abstract, in 

which case they can only do what their fictions say they do in worlds that are impossible,

for there are things that only concrete objects can do. Thus, if these abstract objects are 

to do them, it can only occur in impossible worlds rather than possible ones. This is the 

Millian view defended by Nathan Salmon. 

On the one hand, the first option, Meinongianism, is wholly mysterious and hence not

plausible at all. On the other hand, the only option left which explains fictions in terms of 

possibilities is the option that sees fictional characters as contingently nonconcrete objects 

and, hence, consists in the very implausible claim that some actual abstract objects can be 

concrete and some actual concrete objects can be abstract. In view of the alternatives to 

descriptivism about fiction, I think we can conclude that fiction should not be dealt with 

in terms of possible worlds. 

6 Some Positive Suggestions 

I think this problem is easily solved once we simply abandon the idea of explaining fiction 

in terms of possible worlds. I would like to defend that story-worlds are not possible 

worlds even if they are ontologically the same kind of thing: that is, sets of sentences 

or propositions. The difference between story worlds and possible worlds would just be 

that only the latter represent possibilities with respect to the actual world. The fictional

contingencies of fictional characters should be explained by appealing to those worlds 

which would be possible but only with respect to the world of the story and not with 

respect to our actual world. 

This way out of the problem would be possible because fictional names, on the other 

hand, are not abbreviated non-rigid definite descriptions, but merely empty rigid proper 

names, that is, proper names that do not have a referent. The meaning of fictional names 

should be derived, in my view, from the fact that part of the meaning of any proper name 

is the meaning of a rigid definite description associated with them. Any proper name N, 

when used, in addition to rigidly referring to its bearer, semantically expresses some definite

description like ‘the bearer of N’ or ‘the individual called N’, where the token of the name

N that occurs within that description is used the same way as the name N. This view

about proper names in general is a view that I learnt from Manuel Garcia-Carpintero’s

work. Note that this view does not say that proper names are synonymous with definite

descriptions, which Saul Kripke showed to be incorrect, and that it is compatible with saying that

Anna Karenina neither actually nor possibly exists.

Finally, I also think that in addition to the fictional operator ‘in the fiction f’, there is

another fictional operator that we use, whether explicitly or implicitly, in our fictional 

discourse. When we use fictional names to talk about them as fictional characters instead 

of as the individuals that these fictional characters represent in the fictions, we either say 

‘the fictional character N’ or we just utter the name N. It is my view that even in the latter

case, the expression ‘the fictional character’ is there, though only in an implicit way. It

is the interaction between this expression and fictional names that makes our fictional 

discourse when talking about fictional characters meaningful. How this interaction works 

is something we have yet to discover. I do not know. 

7 Conclusion 

I have argued that there is a problem with the fictional contingent properties of fictional 

characters that descriptivism about fiction cannot solve. I have also argued that other 

alternative views on fiction that explain it in terms of possible worlds do not seem at all

plausible. Finally, I have provided some positive suggestions to develop in order to explain 

fiction and the problem posed by fictional contingencies. These are suggestions that 

I plan to develop soon. 


I am grateful for the extremely useful comments on earlier drafts of this work that I

have received from Manuel Garcia-Carpintero, Dominic McIver Lopes, Genoveva Marti,

Francis Jeffry Pelletier, Pablo Rychter and Ori Simchen, as well as for the

patience and interest that Stefano Predelli showed in discussing it with me. I am also

thankful to the anonymous referees for their interesting points that I have tried to include 

in this final version the best I could. 

References 


Currie, G. (1988). Fictional names, Australasian Journal of Philosophy 66. 

Currie, G. (1990). The Nature of Fiction, Cambridge: Cambridge University Press. 

Currie, G. (2003). Characters and contingency, Dialectica 57. 

Kripke, S. (1972). Naming and Necessity, Harvard University Press. 

Lewis, D. (1978/1983). Truth in fiction, reprinted in David Lewis, Philosophical Papers I,

Oxford: Oxford University Press.

Linsky, B. and Zalta, E. N. (1996). In defense of the contingently nonconcrete, Philosophical

Studies 84/2-3.

Salmon, N. (1998). Nonexistence, Noûs 32.

Stalnaker, R. (1999). Assertion, Context and Content, Oxford: Oxford University Press.



MEANING & INFERENCE IN CASE OF CONFLICT

Michael Franke

Universiteit van Amsterdam

Abstract. This paper applies a model of boundedly rational “level-k thinking” (cf. Stahl

and Wilson, 1995; Crawford, 2003; Camerer, Ho and Chong, 2004) to a classical concern of 

game theory: when is information credible and what shall I do with it if it is not? The 

model presented here extends and generalizes recent work in game-theoretic pragmatics 

(Stalnaker, 2006; Jäger, 2007; Benz and van Rooij, 2007). Pragmatic inference is modeled 

as a sequence of iterated best responses, defined here in terms of the interlocutors’ epistemic 

states. Credibility considerations are a special case of a more general pragmatic inference 

procedure at each iteration step. The resulting analysis of message credibility improves on 

previous game-theoretic analyses, is more general and places credibility in the linguistic 

context where it, arguably, belongs. 

1 Semantic Meaning and Credible Information in Signaling Games 

The perhaps simplest game-theoretic model of language use is a signaling game with 

meaningful signals. A sender S observes the state of the world t ∈ T in private and 

chooses a message m from a set of alternatives M all of which are assumed to be meaningful 

in the (unique and commonly known) language shared by S and a receiver R. In 

turn, R observes the sent message and chooses an action a from a given set A. In general, 

the payoffs for both S and R depend on the state t, the sent message m and the action a 

chosen by the receiver. Formally, a SIGNALING GAME WITH MEANINGFUL SIGNALS is 

a tuple ⟨{S, R}, T, Pr, M, [·], A, U_S, U_R⟩ where Pr ∈ ∆(T) is a probability distribution

over T; [·] : M → P(T) is a semantic denotation function and U_S, U_R : M × A × T → ℝ

are utility functions for both sender and receiver. 1 We can conceive of such signaling

games as abstract mathematical models of a conversational context whose most important 

features they represent: the interlocutors’ beliefs, behavioral possibilities and preferences. 

If a signaling game is a context model, the game’s solution concept is what yields a 

prediction of the behavior of agents in the modelled conversational situation. The following 

easy example of a scalar implicature, e.g., the inference that not all students came 

when hearing the sentence “Some of the students came”, makes this distinction clear. A 

simple context model for this case is the signaling game G1: 2 there are two states t∃¬∀ and 

t∀, two messages msome and mall with semantic meaning as indicated and two receiver 

interpretation actions a∃¬∀ or a∀ which correspond one-to-one with the states; sender and 

receiver payoffs are aligned: an implementation of the standard assumption that conversation 

and implicature calculation revolve around the cooperative principle (Grice, 1989). A 

solution concept, whatever it may be, should then ideally predict that S^t∃¬∀ (S^t∀) chooses

msome (mall) and the receiver responds with action a∃¬∀ (a∀). 3 

1 I will assume throughout that (i) all sets T , M and A are non-empty and finite, that (ii) Pr(t) > 0 for 

all t ∈ T , that (iii) for each state t there is at least one message m which is true in that state and that (iv) no 

message is contradictory, i.e., there is no m for which [m] = ∅. 

2 Unless indicated, I assume that states are equiprobable in example games. 

3 For t ∈ T, I write S^t as an abbreviation for “a sender of type t”.



             a∃¬∀    a∀      msome   mall
   t∃¬∀      1,1     0,0     √       −
   t∀        0,0     1,1     √       √

G1: “Scalar Implicatures”

             amate   aignore   mhigh   mlow
   thigh     1,1     0,0       √       −
   tlow      1,0     0,1       −       √

G2: “Partial Conflict”
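
As a rough illustration, the two example games can be written down as plain data. The sketch below is not part of the model itself: the class name `SignalingGame`, its field names and the string labels for states, messages and actions are hypothetical, and payoffs are stored per (state, action) pair on the assumption that messages do not affect payoffs in these examples. The method `well_formed` checks assumptions (i) to (iv) of footnote 1.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class SignalingGame:
    """Hypothetical container for the components T, Pr, M, [.], A, U_S, U_R."""
    prior: dict     # state t -> Pr(t)
    meaning: dict   # message m -> set of states in which m is true, i.e. [m]
    actions: tuple  # receiver actions A
    payoff: dict    # (state, action) -> (sender utility, receiver utility)

    def well_formed(self):
        """Check assumptions (i)-(iv) of footnote 1."""
        nonempty = bool(self.prior) and bool(self.meaning) and bool(self.actions)
        full_support = all(p > 0 for p in self.prior.values())
        expressible = all(any(t in den for den in self.meaning.values())
                          for t in self.prior)
        no_contradiction = all(den for den in self.meaning.values())
        return nonempty and full_support and expressible and no_contradiction

    def aligned(self):
        """True iff sender and receiver payoffs coincide everywhere."""
        return all(u_s == u_r for u_s, u_r in self.payoff.values())


HALF = Fraction(1, 2)  # states are equiprobable (footnote 2)

G1 = SignalingGame(
    prior={"t_some_not_all": HALF, "t_all": HALF},
    meaning={"m_some": {"t_some_not_all", "t_all"}, "m_all": {"t_all"}},
    actions=("a_some_not_all", "a_all"),
    payoff={("t_some_not_all", "a_some_not_all"): (1, 1),
            ("t_some_not_all", "a_all"): (0, 0),
            ("t_all", "a_some_not_all"): (0, 0),
            ("t_all", "a_all"): (1, 1)},
)

G2 = SignalingGame(
    prior={"t_high": HALF, "t_low": HALF},
    meaning={"m_high": {"t_high"}, "m_low": {"t_low"}},
    actions=("a_mate", "a_ignore"),
    payoff={("t_high", "a_mate"): (1, 1), ("t_high", "a_ignore"): (0, 0),
            ("t_low", "a_mate"): (1, 0), ("t_low", "a_ignore"): (0, 1)},
)

print(G1.well_formed(), G1.aligned())  # G1: preferences coincide
print(G2.well_formed(), G2.aligned())  # G2: preferences partially conflict
```

Exact fractions are used for the prior so that later comparisons of expected utilities would not depend on floating-point rounding.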

It is obvious that in order to arrive at this prediction, a special role has to be assigned to 

the conventional, semantic meaning of the messages involved. For instance, in the above 

example anti-semantic play, as we could call it, that simply reverses the use of messages, 

should be excluded. Most game-theoretic models of language use hard-wire semantic 

meaning into the game play, either as a restriction on available moves of sender and receiver, 

or into the payoffs, but in both cases effectively enforcing truthfulness and trust. 

This is fine as long as conversation is mainly cooperative and preferences aligned. But 

let’s face it: the central Gricean assumption of cooperation is an optimistic idealization 

after all; conflict, lies and deceit are as ubiquitous as air. But then, hard-wiring of truthfulness 

and trust limits the applicability of our models as it excludes the possibility that 

senders may wish to mislead their audience. We should aim for more general models and, 

ideally, let the agents, not the modeller decide when to be truthful and what to trust. 

Opposed to hard-wiring truthfulness and trust, the most liberal case at the other end 

of the spectrum is to model communication, not considering reputation or further psychological 

constraints at all, as cheap talk. Here messages do not impose restrictions on 

the game play and are entirely payoff irrelevant: U_{S,R}(m, a, t) = U_{S,R}(m′, a, t) for all

m, m′ ∈ M, a ∈ A and t ∈ T. However, if talk is cheap, yet exogenously meaningful, the

question arises how to integrate semantic meaning into the game. Standard solution concepts, 

such as sequential equilibrium or rationalizability, are too weak to predict anything 

reasonable in this case: they allow for nearly all anti-semantic play and also for babbling, 

where signals are sent, as it were, arbitrarily and therefore ignored by the receiver. 

In response to this problem, game theorists have proposed various refinements of the 

standard solution concepts based on the notion of credibility. 4 The idea is that semantic 

meaning should be respected (in the solution concept) wherever this is reasonable in view 

of the possibly diverging preferences of interlocutors. As an easy example, look at game 

G2 where S is of either a high quality or a low quality type, and where R would like 

to pair with S^thigh only, while S wants to pair with R irrespective of her type. Interests

are in partial conflict here and, intuitively, a costless, non-committing message mhigh

is not credible, because S^tlow would have every reason to send it untruthfully. Therefore,

intuitively, R should ignore whatever S says in this game. In general, if nothing prevents 

S from babbling, lying or deceiving, she might as well do so; whenever she even has an 

incentive to, she certainly will. For the receiver the central question becomes: when is a 

signal credible and what should I do if it is not? 

This paper offers a fresh look at this classical problem of game theory. The novelty 

is, so to speak, a “linguistic turn”: I suggest that credibility considerations are pragmatic 

inferences, in some sense very much alike —and in another sense very much unlike— 

conversational implicatures. I argue that this linguistic approach to credibility of information 

improves on the classical game-theoretic analyses by Farrell (1993) and Rabin 

4 The standards in the debate about credibility were set by Farrell (1993) for equilibrium and by Rabin 

(1990) for rationalizability. I will mainly focus on these two classical papers here for reasons of space. 



(1990). In order to implement conventional meaning of signals in a cheap talk model, the 

present paper takes an epistemic approach to the solution of games: the model presented 

in this paper spells out the reasoning of interlocutors in terms of their beliefs about the 

behavior of their opponents as a sequence of iterated best responses (IBR) which takes 

semantic meaning as a starting point. For clarity: the IBR model places no restriction 

whatsoever on the use of signals; conventional meaning is implemented merely as a focal 

element in the deliberation of agents. This way, the IBR model extends recent work 

in game-theoretic pragmatics (Jäger, 2007; Benz and van Rooij, 2007), to which it adds 

generality by taking diverging preferences into account and by implementing the basic assumptions 

of “level-k models” of reasoning in games (cf. Stahl and Wilson, 1995; Crawford, 

2003; Camerer et al., 2004). In particular, agents in the model are assumed to be 

boundedly rational in the sense that each agent computes only finitely many steps of the 

best response sequence. Section 2 scrutinizes the notion of credibility, section 3 spells out 

the formal model and section 4 discusses its properties and predictions. 

2 Credibility and Pragmatic Inference 

The classical idea of message credibility is due to Farrell (1993). Farrell seeks an equilibrium 

refinement that pays due respect to the semantic meaning of messages. His notion 

of credibility is therefore tied to a given reference equilibrium as a status quo. According 

to Farrell, then, a message m is FARRELL-CREDIBLE with respect to a given equilibrium 

if all t ∈ [m] prefer the receiver to interpret m literally, i.e., to play a best response to the 

belief Pr(·| [m]) that m is true, over the equilibrium play, while no type t ∉ [m] does.

A number of objections can be raised against Farrell-credibility. First of all, the definition 

requires all types in [m] to prefer a literal interpretation of m over the reference 

equilibrium. This makes sense under Farrell’s Rich Language Assumption (RLA) that

for every X ⊆ T there is a message m with [m] = X. This assumption is prevalent in 

game-theoretic discussions of credibility, but restricts applicability. I will show in section 

4 that this assumption seriously restricts Rabin’s (1990) account. But for now, suffice 

it to say that, in particular, the RLA excludes models like G1, used to study pragmatic 

inference in the light of (partial) inexpressibility. I will drop the RLA here to aim for 

more generality and compatibility with linguistic pragmatics. 5 Doing so implies amending

Farrell-credibility to require only that some types in [m] prefer a literal interpretation 

of m over the reference equilibrium. 

Still, there are further problems. Matthews, Okuno-Fujiwara and Postlewaite (1991) 

criticize Farrell-credibility as being too strong. Their argument builds on example G3. 

Compared to the babbling equilibrium, in which R performs a3, messages m1 and m2 are 

intuitively credible: S^t1 and S^t2 have good reason to send m1 and m2, respectively.

Communication seems possible and utterly plausible. However, neither message is 

Farrell-credible, because for i, j ∈ {1, 2} and i ≠ j not only S^tj, but also S^ti prefers R to

play a best response to a literal interpretation of mj, which would trigger action aj, over 

5 A reviewer points out that the RLA has a correspondent in the linguistic world in Katz’s (1981) “principle 

of effability”. The reviewer supports dropping the RLA, because otherwise pragmatic inference is

limited to context and effort considerations. It is also very common (and, to my mind, reasonable) to restrict 

attention to certain alternative expressions only, namely those that are salient (in context) after observing a 

message. Of course, game theory is silent as to where the alternatives come from, since this is a question 

for the linguist, perhaps even the syntactician (cf. Katzir, 2007). 



           a1    a2    a3    m1    m2
   t1      4,3   3,0   1,2   √     −
   t2      3,0   4,3   1,2   −     √

G3: “Best Message Counts”

           a1    a2    a3    a4    m12   m23   m13
   t1      4,5   5,4   0,0   1,4   √     −     √
   t2      0,0   4,5   5,4   1,4   √     √     −
   t3      5,4   0,0   4,5   1,4   −     √     √

G4: “Further Iteration”

the no-communication outcome a3. The problem with Farrell’s notion is obviously that 

just doing better than equilibrium is not enough reason to send a message, when sending 

another message is even better for the sender. When evaluating the credibility of a 

message m, we have to take into account alternative forms that t ∉ [m] might want to

send. 
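
The objection can be checked mechanically for G3. The following is a minimal sketch rather than Farrell's own formulation: the function names are hypothetical, "prefers" is read as strict preference, ties among the receiver's best actions are assumed to be broken uniformly, and the reference equilibrium is the babbling equilibrium in which R plays a3.

```python
from fractions import Fraction

# G3 ("Best Message Counts"); PAYOFF[(state, action)] = (sender, receiver).
PRIOR = {"t1": Fraction(1, 2), "t2": Fraction(1, 2)}
MEANING = {"m1": {"t1"}, "m2": {"t2"}}
ACTIONS = ("a1", "a2", "a3")
PAYOFF = {("t1", "a1"): (4, 3), ("t1", "a2"): (3, 0), ("t1", "a3"): (1, 2),
          ("t2", "a1"): (3, 0), ("t2", "a2"): (4, 3), ("t2", "a3"): (1, 2)}


def literal_response(m):
    """Receiver's best actions against the belief Pr(. | [m])."""
    mass = sum(PRIOR[t] for t in MEANING[m])
    eu = {a: sum(PRIOR[t] / mass * PAYOFF[t, a][1] for t in MEANING[m])
          for a in ACTIONS}
    top = max(eu.values())
    return [a for a in ACTIONS if eu[a] == top]


def prefers_literal(t, m, eq_action):
    """Does type t strictly prefer the literal response to m over eq_action?"""
    best = literal_response(m)
    # Assumption: ties between best actions are broken uniformly.
    literal_payoff = sum(Fraction(PAYOFF[t, a][0]) for a in best) / len(best)
    return literal_payoff > PAYOFF[t, eq_action][0]


def farrell_credible(m, eq_action):
    """All types in [m] prefer the literal reading, and no type outside [m] does."""
    inside = all(prefers_literal(t, m, eq_action) for t in MEANING[m])
    outside = any(prefers_literal(t, m, eq_action)
                  for t in PRIOR if t not in MEANING[m])
    return inside and not outside


for m in MEANING:
    print(m, literal_response(m), farrell_credible(m, "a3"))
```

Under these assumptions both messages fail the test, because the type outside the denotation also does better than the babbling payoff of 1.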

Compare this with the scalar implicature in G1. Message msome is interpreted as communicating 

that the true state of affairs is t∃¬∀, because in t∀ the sender would have used 

mall. In other words, the receiver discards a state t ∈ [m] as a possible sender of m 

because that type has a better message to send. Of course, such pragmatic enrichment 

does not make a message intuitively incredible, as it is still used in line with its semantic 

meaning. Intuitively speaking, in G1 S even wants R to draw this pragmatic inference. 

This is, of course, different in G2. In general, if S wants to mislead, she intuitively 

wants the receiver to adopt a certain belief, but she does not want the receiver to realize 

that this belief might be false: we could say, somewhat loosely, that S wants her purported 

communicative intention to be recognized (and acted upon), but she does not want 

her deceptive intention to be recognized. Nevertheless, if the receiver does manage to 

recognize a deceptive intention, this too may lead to some kind of pragmatic inference, 

albeit one that the sender did not intend the receiver to draw. While the implicature in G1 

rules out a semantically feasible possibility, credibility considerations, in a sense, do the 

exact opposite: message mhigh is pragmatically weakened in G2 by ruling in state tlow. 

Despite the differences, there is a common core to both implicature and credibility 

inference. Here and there, the receiver seems to reason: which types of senders would 

send this message given that I believe it literally? Indeed, exactly this kind of reasoning 

underlies Benz and van Rooij’s (2007) model of implicature calculation for the purely 

cooperative case. The driving observation of this paper is that the same reasoning might 

not only rule out states t ∈ [m] to yield implicatures but may also rule in states t ∉ [m].

When the latter is the case, m seems intuitively incredible. Still, the reasoning pattern 

by which implicatures and credibility-based inferences are computed is the same. On 

a superficial reading, this view on message credibility can be found in Stalnaker (2006): 6

call a message m BVRS-CREDIBLE (Benz, van Rooij, Stalnaker) iff for some types

t ∈ [m], but for no type t ∉ [m], S^t’s expected utility of sending m given that R interprets

literally is at least as great as S^t’s expected utility of sending any alternative message m′.
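
The definition of BvRS-credibility can be spelled out as a short check. This is a sketch under stated assumptions, not a reconstruction of Stalnaker's proposal: games are encoded as hypothetical tuples (prior, meaning, actions, payoff), payoffs are taken to be independent of the message, and a receiver who is indifferent between several best actions is assumed to mix uniformly.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Each game: (prior, meaning, actions, payoff[(state, action)] = (sender, receiver)).
G2 = ({"t_high": HALF, "t_low": HALF},
      {"m_high": {"t_high"}, "m_low": {"t_low"}},
      ("a_mate", "a_ignore"),
      {("t_high", "a_mate"): (1, 1), ("t_high", "a_ignore"): (0, 0),
       ("t_low", "a_mate"): (1, 0), ("t_low", "a_ignore"): (0, 1)})
G3 = ({"t1": HALF, "t2": HALF},
      {"m1": {"t1"}, "m2": {"t2"}},
      ("a1", "a2", "a3"),
      {("t1", "a1"): (4, 3), ("t1", "a2"): (3, 0), ("t1", "a3"): (1, 2),
       ("t2", "a1"): (3, 0), ("t2", "a2"): (4, 3), ("t2", "a3"): (1, 2)})


def literal_payoff(game, t, m):
    """Expected sender utility for type t if R best-responds to Pr(. | [m])."""
    prior, meaning, actions, payoff = game
    # The normalising constant of Pr(. | [m]) does not affect the arg max.
    eu = {a: sum(prior[s] * payoff[s, a][1] for s in meaning[m]) for a in actions}
    top = max(eu.values())
    best = [a for a in actions if eu[a] == top]
    # Assumption: ties between best actions are broken uniformly.
    return sum(Fraction(payoff[t, a][0]) for a in best) / len(best)


def optimal_under_literal(game, t, m):
    """Is m at least as good for type t as every alternative message?"""
    return all(literal_payoff(game, t, m) >= literal_payoff(game, t, alt)
               for alt in game[1])


def bvrs_credible(game, m):
    """Optimal for some type in [m], and for no type outside [m]."""
    prior, meaning = game[0], game[1]
    some_inside = any(optimal_under_literal(game, t, m) for t in meaning[m])
    none_outside = not any(optimal_under_literal(game, t, m)
                           for t in prior if t not in meaning[m])
    return some_inside and none_outside


for name, game in (("G2", G2), ("G3", G3)):
    print(name, {m: bvrs_credible(game, m) for m in game[1]})
```

With these encodings mhigh comes out as not credible in G2, since the low type also finds it optimal, while both messages of G3 pass the test.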

The notion of BvRS-credibility matches our intuitions in all the cases discussed so far, 

but it is, in a sense, self-refuting, as G4 from Matthews et al. (1991) shows. In this game, 

all the available messages m12, m23 and m13 are BvRS-credible, because if R interprets 

6 It is unfortunately not entirely clear to me what exactly Stalnaker’s proposal amounts to, as insightful 

as it might be, because the account is not fully spelled out formally. The basic idea seems to be that 

(something like) the notion of BvRS-credibility, as it is called here, should be integrated as a constraint on 

receiver beliefs —believe a message iff it is BvRS-credible— into an epistemic model of the game together 

with some appropriate assumption of (common) belief in rationality. The class of game models that satisfies 

rationality and credibility constraints would then ultimately define how signals are used and interpreted. 



literally S^t1 will use message m12, S^t2 will use message m23 and S^t3 will use message m13.

No message is used untruthfully by any type. However, if R realizes that exactly S^t1 uses

message m12, he would rather not play a2, but a1. But if the sender realizes that message

m12 triggers the receiver to play a1, suddenly S^t3 wants to send m12 untruthfully. This

example shows that BvRS-credibility is a good start, but stops short. If messages

are deemed credible and therefore believed, this may create an incentive to mislead. What 

seems needed to rectify the formal analysis of message credibility is a fully spelled-out 

model of iterated best responses that starts in the Benz-van-Rooij-Stalnaker way and then 

carries on iterating. Here is such a model. 

3 The IBR Model and its Assumptions 

3.1 Assumptions: Focal Meaning and Bounded Rationality 

The IBR model presented in this paper rests on three assumptions with which it also sets 

itself apart from previous best-response models in formal pragmatics (Jäger, 2007; Benz 

and van Rooij, 2007; Jäger, 2008). The first assumption is the Focal Meaning Assumption: 

semantic meaning is focal in the sense that the sequence of best responses starts with a 

purely semantic truth-only sender strategy. Semantic meaning is also assumed focal in 

the sense that throughout the IBR sequence R believes messages to be truthful unless 

S has a positive incentive to be untruthful. This is the second, so-called Truth Ceteris

Paribus Assumption (TCP). These two (epistemic) assumptions assign semantic meaning 

its proper place in this model of cheap-talk communication. 

The third assumption is the Bounded Rationality Assumption: I assume that players 

in the game have limited resources which allow them to reason only up to some finite 

iteration depth k. At the same time I take agents to be overconfident: each agent believes

that she is smarter than her opponent. Camerer et al. (2004) make an empirical case for 

these assumptions about the psychology of reasoners. 7 However, for simplicity, I do not 

implement Camerer et al.’s (2004) Cognitive Hierarchy Model in full. Camerer et al. 

assume that each agent who is able to reason up to strategic depth k has a proper belief 

about the population distribution of players who reason up to depth l < k, but I will 

assume here, just to keep things simple, that each player believes that she is exactly one 

step ahead of her opponent (cf. Crawford, 2003; Crawford, 2007). (I will discuss this 

simplifying assumption critically in section 4.) 

3.2 Beliefs & Best Responses 

Given a signaling game, a SENDER SIGNALING-STRATEGY is a function σ ∈ S =

(∆(M))^T and a RECEIVER RESPONSE-STRATEGY is a function ρ ∈ R = (∆(A))^M.

In order to define which strategies are best responses to a given belief, we need to define

the game-relevant beliefs of both S and R. Since the only uncertainty of S concerns what

R will do, the set of relevant SENDER BELIEFS ΠS is just the set of receiver response-strategies:

ΠS = R. On the receiver’s side, we may say, with some redundancy, that there

7 A good, intuitively accessible example of why this should be so is a so-called beauty contest game (cf. Ho,

Camerer and Weigelt, 1998). Each player from a group of size n > 2 chooses a number from 0 to 100. The

player closest to 2/3 of the average wins. When this game is played with a group of subjects who have never

played the game before, the usual group average lies somewhere between 20 and 30. This is quite far from

the group average 0 which we would expect from common (true) belief in rationality. Everybody seems to

believe that they are just a bit smarter than everybody else, without noticing their own limitations.


are three components in any game-relevant belief (cf. Battigalli, 2006): firstly, R has a 

prior belief Pr(·) about the true state of the world; secondly, he has a belief about the 

sender’s signaling strategy; and thirdly, he has a posterior belief about the true state after 

hearing a message. Posteriors should be derived by Bayesian update from the former two 

components, but also specify R’s beliefs after unexpected surprise messages. Taken together,

the set of relevant RECEIVER BELIEFS ΠR is the set of all triples ⟨π^1_R, π^2_R, π^3_R⟩ for

which π^1_R = Pr, π^2_R ∈ S = (∆(M))^T and π^3_R ∈ (∆(T))^M such that for any t ∈ T and

m ∈ M, if π^2_R(t, m) ≠ 0, then:

$$\pi^3_R(m, t) = \frac{\pi^1_R(t) \times \pi^2_R(t, m)}{\sum_{t' \in T} \pi^1_R(t') \times \pi^2_R(t', m)}.$$
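
The Bayesian update that links the three belief components can be sketched as follows. The function name `posterior` and the dictionary encoding are illustrative assumptions; the example uses G1 together with a sender strategy that sends every true message with equal probability, chosen here purely for illustration.

```python
from fractions import Fraction


def posterior(prior, sigma, m):
    """Bayesian update pi3(m, .) from the prior pi1 and the sender strategy pi2.

    Returns None for a surprise message (one that sigma never sends), since
    Bayes' rule leaves the posterior unconstrained in that case.
    """
    joint = {t: prior[t] * sigma[t].get(m, 0) for t in prior}
    total = sum(joint.values())
    if total == 0:
        return None
    return {t: joint[t] / total for t in prior}


# Game G1 with equiprobable states.
prior = {"t_some_not_all": Fraction(1, 2), "t_all": Fraction(1, 2)}
# Illustrative sender strategy: every true message is sent with equal probability.
sigma = {"t_some_not_all": {"m_some": Fraction(1)},
         "t_all": {"m_some": Fraction(1, 2), "m_all": Fraction(1, 2)}}

print(posterior(prior, sigma, "m_some"))
print(posterior(prior, sigma, "m_all"))
```

After hearing m_some the receiver in this example assigns probability 2/3 to t_some_not_all and 1/3 to t_all, so the support of the posterior still contains both states.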

Given a sender belief ρ ∈ ΠS, say that σ is a BEST RESPONSE SIGNALING STRATEGY

to belief ρ iff for all t ∈ T and m ∈ M we have:

$$\sigma(t, m) \neq 0 \;\rightarrow\; m \in \arg\max_{m' \in M} \sum_{a \in A} \rho(m', a) \times U_S(m', a, t).$$

The set of all such best responses to belief ρ is denoted by S(ρ). Given a receiver belief 

πR ∈ ΠR, say that ρ is a BEST RESPONSE STRATEGY to belief πR iff for all m ∈ M and

a ∈ A we have:

$$\rho(m, a) \neq 0 \;\rightarrow\; a \in \arg\max_{a' \in A} \sum_{t \in T} \pi^3_R(m, t) \times U_R(m, a', t).$$

The set of all such best responses to belief πR is denoted by R(πR). Also, if Π′R ⊆ ΠR is

a set of receiver beliefs, let

$$R(\Pi'_R) = \bigcup_{\pi_R \in \Pi'_R} R(\pi_R).$$
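
The two best-response conditions can be illustrated with a small sketch for G1. The helper names and the belief values below are hypothetical; a strategy counts as a best response exactly when every message (or action) it uses with positive probability lies in the corresponding list returned here.

```python
from fractions import Fraction

# Game G1; PAYOFF[(state, action)] = (sender utility, receiver utility).
STATES = ("t_some_not_all", "t_all")
MESSAGES = ("m_some", "m_all")
ACTIONS = ("a_some_not_all", "a_all")
PAYOFF = {("t_some_not_all", "a_some_not_all"): (1, 1),
          ("t_some_not_all", "a_all"): (0, 0),
          ("t_all", "a_some_not_all"): (0, 0),
          ("t_all", "a_all"): (1, 1)}


def sender_best_messages(t, rho):
    """Messages maximising sum_a rho(m, a) * U_S(m, a, t) for a sender of type t."""
    eu = {m: sum(rho[m].get(a, 0) * PAYOFF[t, a][0] for a in ACTIONS)
          for m in MESSAGES}
    top = max(eu.values())
    return [m for m in MESSAGES if eu[m] == top]


def receiver_best_actions(m, pi3):
    """Actions maximising sum_t pi3(m, t) * U_R(m, a, t) after message m."""
    eu = {a: sum(pi3[m][t] * PAYOFF[t, a][1] for t in STATES) for a in ACTIONS}
    top = max(eu.values())
    return [a for a in ACTIONS if eu[a] == top]


# Illustrative posterior beliefs (assumed for this example only).
pi3 = {"m_some": {"t_some_not_all": Fraction(2, 3), "t_all": Fraction(1, 3)},
       "m_all": {"t_some_not_all": Fraction(0), "t_all": Fraction(1)}}
# A receiver strategy that mixes uniformly over the best actions per message.
rho = {m: {a: Fraction(1, len(receiver_best_actions(m, pi3)))
           for a in receiver_best_actions(m, pi3)} for m in MESSAGES}

print({m: receiver_best_actions(m, pi3) for m in MESSAGES})
print({t: sender_best_messages(t, rho) for t in STATES})
```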

3.3 Strategic Types and the IBR sequence 

In line with the Bounded Rationality Assumption of Section 3.1, I assume that senders 

and receivers are of different strategic types. Strategic types correspond to the level k of 

strategic depth a player in the game performs (while believing she thereby outperforms her

opponent by exactly one step of reasoning). I will give an inductive definition of strategic

types in terms of players’ beliefs, starting with a fixed strategy σ*_0 of S0. 8 Then, for any

k ≥ 0, Rk is characterized by a belief set π*_{Rk} ⊆ ΠR that S is a level-k sender and Sk+1 is

characterized by a belief π*_{Sk+1} ∈ ΠS that R is a level-k receiver.

I assume that S0 plays according to the signaling strategy σ*_0 which simply sends any

true message with equal probability in all states. There need not be any belief to which

this is a best response, as level-0 senders are (possibly irrational) dummies to implement

the Focal Meaning Assumption. R0 then believes that he is facing S0. With unique σ*_0,

which sends all messages in M with positive probability (M is finite and contains no

contradictions), R0 is characterized entirely by the unique belief π*_{R0} that S plays σ*_0.

In general, Rk believes that he is facing a level-k sender. For k > 0, Sk is characterized

by a belief π*_{Sk} ∈ ΠS. Rk consequently believes that Sk plays a best response σk ∈

S(π*_{Sk}) to this belief. We can leave this unrestricted and assume that Rk considers any

σk ∈ S(π*_{Sk}) possible. But it will transpire that for an intuitively appealing analysis of


8 I will write Sk and Rk to refer to a sender or receiver of strategic type k. Likewise, S^t_k refers to a sender

of strategic type k and knowledge type t.


message credibility we need to assume that Rk takes Sk to be truthful all else being equal

(see also discussion in section 4). We implement the TCP Assumption of Section 3.1 as

a restriction S*(π*_{Sk}) ⊆ S(π*_{Sk}) on signaling strategies held possible by R. Of course,

even when restricted, there need not be a unique signaling strategy here. As a general

tie-break rule, assume the “principle of insufficient reason” that all σk ∈ S*(π*_{Sk}) are

equiprobable to Rk. That means that Rk effectively believes that his opponent is playing

signaling strategy

$$\sigma^*_k(t, m) = \frac{\sum_{\sigma \in S^*(\pi^*_{S_k})} \sigma(t, m)}{|S^*(\pi^*_{S_k})|}.$$

This fixes Rk’s beliefs about the behavior of his opponent, but it need not fix Rk’s belief

π^3_R about surprise messages. Since this matter is intricate and moreover Rk’s counterfactual

beliefs do not play a crucial role in any examples discussed in this paper, I will not

pursue this issue at all in this paper (but see also footnote 10 below). In general, let us

say that Rk is characterized by any belief whose second component is σ*_k and whose third

component satisfies some (coherent, but possibly vacuous) assumption about the interpretation

of surprise messages. Let π*_{Rk} ⊆ ΠR be the set of all such beliefs. Rk is then fully

characterized by π*_{Rk}.

In turn, Sk+1 believes that her opponent is a level-k receiver who plays a best response

ρk ∈ R(π*_{Rk}). With the above tie-break rule Sk+1 is fully characterized by the belief

$$\rho^*_k(m, a) = \frac{\sum_{\rho \in R(\pi^*_{R_k})} \rho(m, a)}{|R(\pi^*_{R_k})|}.$$

3.4 Credibility and Inference

Say that a signal m is k-OPTIMAL in t iff σ*_{k+1}(t, m) ≠ 0. The k-optimal messages

in t are all messages that Rk+1 believes S^t_{k+1} might send (thus taking the TCP

Assumption into account). 9 Similarly, distill from R’s beliefs his INTERPRETATION-

STRATEGY δ : M → P(T) as given by belief πR: δ_{πR}(m) = {t ∈ T | π^3_R(m, t) ≠ 0}.

This simply is the support of the posterior beliefs of R after receiving message m. Let’s 

write δk for the interpretation strategy of a level-k receiver. 

For any k > 0, since Sk believes that she faces Rk−1 with interpretation strategy δk−1, wanting

to send message m would intuitively count as an attempt to mislead if sent by S^t_k just in

case t ∉ δk−1(m). Such an attempt would moreover be untruthful if t ∉ [m]. While

Rk−1 would be deceived, Rk would see through the attempted deception. From the

point of view of Rk, who adheres to the TCP Assumption, a message m is incredible if it is

(k − 1)-optimal in some t ∉ [m]. But then Rk will include t in his interpretation of

m: recognizing a deceptive intention leads to pragmatic inference. In general, we should

consider a message m credible unless some type t ∉ [m] would want to use m somewhere

along the IBR sequence; precisely, m is CREDIBLE iff δk(m) ⊆ [m] for all k ≥ 0. 10
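
The whole sequence can be sketched in a few lines. What follows is one possible reading of the definitions, not a definitive implementation: only pure best responses are averaged (which amounts to mixing uniformly over the best options for each state or message), the TCP Assumption is implemented by discarding untrue best messages whenever a true one is also best, and surprise messages are interpreted literally. The last point is an assumption of this sketch, since the text leaves counterfactual beliefs open. All function names are hypothetical.

```python
from fractions import Fraction as F

# A game is (prior, meaning, actions, payoff); payoff[(t, a)] = (U_S, U_R).
# Payoffs ignore the message, i.e. talk is cheap.


def uniform(items):
    """Uniform distribution over items (the 'principle of insufficient reason')."""
    items = sorted(items)
    return {x: F(1, len(items)) for x in items}


def receiver_step(game, sigma):
    """Return (delta_k, rho*_k) for a receiver who believes S plays sigma."""
    prior, meaning, actions, payoff = game
    delta, rho = {}, {}
    for m, denotation in meaning.items():
        weight = {t: prior[t] * sigma[t].get(m, 0) for t in prior}
        if sum(weight.values()) == 0:
            # Surprise message. Assumption of this sketch: interpret it literally.
            weight = {t: prior[t] if t in denotation else 0 for t in prior}
        total = sum(weight.values())
        post = {t: w / total for t, w in weight.items()}
        delta[m] = {t for t in prior if post[t] > 0}
        eu = {a: sum(post[t] * payoff[t, a][1] for t in prior) for a in actions}
        top = max(eu.values())
        rho[m] = uniform(a for a in actions if eu[a] == top)
    return delta, rho


def sender_step(game, rho, tcp=True):
    """Return sigma*_{k+1}: uniform over (truthful, if possible) best messages."""
    prior, meaning, actions, payoff = game
    sigma = {}
    for t in prior:
        eu = {m: sum(p * payoff[t, a][0] for a, p in rho[m].items()) for m in meaning}
        top = max(eu.values())
        best = [m for m in meaning if eu[m] == top]
        true_best = [m for m in best if t in meaning[m]]
        sigma[t] = uniform(true_best if tcp and true_best else best)
    return sigma


def ibr(game, tcp=True):
    """Interpretation strategies delta_0, delta_1, ... until the sequence cycles."""
    prior, meaning = game[0], game[1]
    sigma = {t: uniform(m for m in meaning if t in meaning[m]) for t in prior}
    seen, deltas = [], []
    while sigma not in seen:
        seen.append(sigma)
        delta, rho = receiver_step(game, sigma)
        deltas.append(delta)
        sigma = sender_step(game, rho, tcp)
    return deltas


def credible(game, m):
    """m is credible iff delta_k(m) is a subset of [m] at every level k."""
    return all(delta[m] <= game[1][m] for delta in ibr(game))


H, T = F(1, 2), F(1, 3)
G2 = ({"t_high": H, "t_low": H},
      {"m_high": {"t_high"}, "m_low": {"t_low"}},
      ("a_mate", "a_ignore"),
      {("t_high", "a_mate"): (1, 1), ("t_high", "a_ignore"): (0, 0),
       ("t_low", "a_mate"): (1, 0), ("t_low", "a_ignore"): (0, 1)})
G4 = ({"t1": T, "t2": T, "t3": T},
      {"m12": {"t1", "t2"}, "m23": {"t2", "t3"}, "m13": {"t1", "t3"}},
      ("a1", "a2", "a3", "a4"),
      {("t1", "a1"): (4, 5), ("t1", "a2"): (5, 4), ("t1", "a3"): (0, 0), ("t1", "a4"): (1, 4),
       ("t2", "a1"): (0, 0), ("t2", "a2"): (4, 5), ("t2", "a3"): (5, 4), ("t2", "a4"): (1, 4),
       ("t3", "a1"): (5, 4), ("t3", "a2"): (0, 0), ("t3", "a3"): (4, 5), ("t3", "a4"): (1, 4)})

print("G2:", {m: credible(G2, m) for m in G2[1]})
for k, delta in enumerate(ibr(G4)):
    print("G4 level", k, "interprets m12 as", delta["m12"])
```

Under these assumptions the level-0 and level-1 receivers in G4 interpret m12 within its literal meaning, while the level-2 receiver assigns it to t3 alone, so m12 is not credible.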

9 Without the TCP Assumption, 0-optimality would be equivalent to the notion of an optimal assertion 

in Benz and van Rooij (2007). 

10 It may seem that messages which would not be sent by any type (after the first round or later) come out 

credible under this definition, which would not be a good prediction. (Thanks to Daniel Rothschild (p.c.) for 

pointing this out to me.) However, this is not quite right: we get into this predicament only for some versions 

of the IBR sequence, not for others. It all depends on how the receiver forms his counterfactual beliefs. If, 

for instance, we assume that R rationalizes observed behavior even if it surprises him, we can keep the 


4 Discussion 


           a1    a2    m12   m3
   t1      1,1   0,0   √     −
   t2      0,0   1,1   √     −
   t3      0,0   1,1   −     √

G5: “White Lie”

           Pr(t)   a1    a2    a3    m12   m23
   t1      1/8     1,1   0,0   0,0   √     −
   t2      3/4     0,0   1,1   0,0   √     √
   t3      1/8     0,0   0,0   1,1   −     √

G6: “Some Game without a Name”

The IBR model makes intuitively correct predictions about message credibility for the 

games considered so far. In G1, R0 responds to msome with the appropriate action a∃¬∀, 

but still interprets δ0(msome) = {t∃¬∀, t∀}. In turn, R1 interprets as δ1(msome) = {t∃¬∀}; he 

has pragmatically enriched the semantic meaning by taking the sender’s payoff structure 

and available messages into account. After one round a fixed point is reached, with fully

revealing credible signaling in accordance with intuition. In G2, IBR predicts that both

S^thigh_1 and S^tlow_1 will use mhigh, which is therefore not credible. In G3, fully revealing

communication is also predicted, and for G4 IBR predicts that all messages are credible for R0

and R1, but not for R2, hence incredible as such. In general, the IBR model predicts that 

communication in games of pure coordination is always credible: 

Proposition 4.1. Take a signaling game with T = A and U_{S,R}(·, t, t′) = c > 0 if t = t′

and 0 otherwise. Then δk(m) ⊆ [m] for all k and m.

Proof. Clearly, δ0(m) ⊆ [m] for arbitrary m. So assume that δk(m) ⊆ [m]. In this case

S^t_{k+1} will use m only if t ∈ δk(m). But then t ∈ [m] and therefore δk+1(m) ⊆ [m].

However, the IBR model does not guarantee generally that communication is credible 

even when preferences are perfectly aligned, i.e., US = UR. This may seem surprising at 

first, but is naturally due to the possibility of what we could call white lies: untruthful

signaling that is beneficial for the receiver. These may occur if the set of available signals

is not expressive enough. As an easy example, consider G5 where S^t2 will use m3

untruthfully to induce action a2, which, however, is best for both receiver and sender. 

To understand the central role of the TCP assumption in the present proposal, consider 

the game G6. In G6, R0 has the following posterior beliefs: after hearing message m12 he 

rules out t3 and believes that t2 is three times as likely as t1; similarly, after hearing message 

m23 he rules out t1 and believes that t2 is three times as likely as t3. Consequently, 

R0 responds to both signals with a2. Now, S^t1_1, for instance, does not care which message

to choose, as far as her expected utilities are concerned. But R1 nevertheless

assumes that S^t1_1 speaks truthfully. It’s thanks to the TCP Assumption that IBR predicts

messages to be credible in this game. 
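
The role of the TCP Assumption in G6 can be made concrete by computing the first sender step with and without it. This is a minimal sketch with hypothetical helper names; it hard-codes the pure-coordination payoffs of G6 (both players receive 1 iff the action matches the state) and assumes that the receiver's most probable state is unique.

```python
from fractions import Fraction as F

PRIOR = {"t1": F(1, 8), "t2": F(3, 4), "t3": F(1, 8)}
MEANING = {"m12": {"t1", "t2"}, "m23": {"t2", "t3"}}
# Pure coordination: both players get 1 iff action a_i meets state t_i.
ACTION_FOR = {"t1": "a1", "t2": "a2", "t3": "a3"}


def uniform(items):
    items = list(items)
    return {x: F(1, len(items)) for x in items}


def interpret(sigma, m):
    """Posterior over states after m, given sender strategy sigma."""
    joint = {t: PRIOR[t] * sigma[t].get(m, 0) for t in PRIOR}
    total = sum(joint.values())
    return {t: p / total for t, p in joint.items()}


def best_action(post):
    """Receiver's best action: match the most probable state (assumes no ties)."""
    return ACTION_FOR[max(post, key=post.get)]


def level1_sender(response, tcp):
    """Uniform mix over best messages, restricted to true ones if tcp is set."""
    sigma = {}
    for t in PRIOR:
        payoff = {m: int(response[m] == ACTION_FOR[t]) for m in MEANING}
        top = max(payoff.values())
        best = [m for m in MEANING if payoff[m] == top]
        true_best = [m for m in best if t in MEANING[m]]
        sigma[t] = uniform(true_best if tcp and true_best else best)
    return sigma


def support(sigma, m):
    """Interpretation of m: the states with positive posterior probability."""
    return {t for t, p in interpret(sigma, m).items() if p > 0}


# Level 0: every true message is sent with equal probability.
sigma0 = {t: uniform(m for m in MEANING if t in MEANING[m]) for t in PRIOR}
post0 = {m: interpret(sigma0, m) for m in MEANING}
response0 = {m: best_action(post0[m]) for m in MEANING}

for tcp in (True, False):
    sigma1 = level1_sender(response0, tcp)
    print("TCP" if tcp else "no TCP", {m: support(sigma1, m) for m in MEANING})
```

With the TCP restriction the level-1 receiver's interpretations stay within the literal meanings. Without it, the indifferent types t1 and t3 are taken to mix over both messages, and every state ends up in the interpretation of each message.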

G6 also shows a difference between the IBR model and Rabin’s (1990) model of credible 

communication, which superficially look very similar. Rabin’s model consists of two 

components: the first component is a definition of message credibility which is almost a 

two-step iteration of best responses starting from the semantic meaning; the second component 

is iterated strict dominance around a fixed core set of Rabin-credible messages 

definition unchanged: if no type whatsoever has an outstanding reason to send m, the receiver’s posterior 

beliefs after m will support any type. So, unless m is tautologous, it is incredible. Still, Rothschild’s 

criticism is appropriate: the definition of message credibility offered here is, in a sense, incomplete as long 

as we do not properly define the receiver’s counterfactual beliefs; something left for another occasion. 



being sent truthfully and believed. In particular, Rabin requires for m to be credible that 

m induces, when taken literally, exactly the set of all sender-best actions (from the set of 

actions that are inducible by some receiver belief) of all t ∈ [m]. This is defensible under 

the Rich Language Assumption, but both messages in G6 fail this requirement. Consequently, 

with no credible message to restrict iterated strict dominance, Rabin’s model 

predicts a total anything-goes for game G6. This shows the limited applicability of approaches 

to message credibility that are inseparable from the Rich Language Assumption. 

The present notion of message credibility and the IBR model are not restricted in this 

sense and fare well with (partial) inexpressibility and the resulting inferences. 

To wrap up: as a solution concept, the epistemic IBR model offers, basically, a set of 

beliefs, viz., beliefs obtained under certain assumptions about the psychology of agents 

from a sequence of iterated best responses. I do not claim that this model is a reasonable 

model for human reasoning in general. Certainly, the simplifying assumption that
players believe that they are facing a level-k opponent, and not possibly a level-l opponent with l < k,
becomes increasingly implausible as k grows, and especially so for agents that have, in
a manner of speaking, already reasoned themselves through a cycle multiple times. (It is
easily verified that for finite M and T the IBR sequence always enters a cycle after some

k ∈ N.) 11 Still, I wish to defend that the IBR model does capture (our intuitions about) 

certain aspects of (idealized) linguistic behavior, namely pragmatic inference in cooperative 

and non-cooperative situations. Whether it is a plausible model of belief formation 

and reasoning in the envisaged linguistic situations is ultimately an empirical question. 

In conclusion, the IBR model offers a novel perspective on message credibility and 

the pragmatic inferences based on this notion. The model generalizes existing game-theoretical
models of pragmatic inference by taking conflicting interests into account. It

also generalizes game-theoretic accounts of credibility by giving up the Rich Language 

Assumption. The explicitly epistemic perspective on agents’ deliberation assigns a natural 

place to semantic meaning in cheap-talk signaling games as a focal starting point. It also 

highlights the unity in pragmatic inference: in this model both credibility-based inferences 

and implicatures are different outcomes of the same reasoning process. 


I’d like to thank Tikitu de Jager, Robert van Rooij, Daniel Rothschild, Marc Staudacher 

and three anonymous referees for insightful comments, help and discussion. I moreover 

benefited greatly from discussing with Gerhard Jäger an early version of his paper (Jäger, 

2008), which also defines and applies a general iterated best response model different 

from what I did here. Also, I am thankful to Sven Lauer for waking my interest by first 

explaining to me with enormous patience some puzzles about credibility that I did not 

fully understand at the time (see Lauer, 2007). Errors are my own. 

11 It is tempting to assume that “looping reasoners” may have an Aha-Erlebnis and to extend the IBR 

sequence by transfinite induction assuming, for instance, that level-ω players best respond to the belief 

that the IBR sequence is circling. I do not know whether this is necessary and/or desirable for linguistic 

applications. We should keep in mind though that in some cases human reasoners may not get to the ideal 

level of reasoning in this model and in others they might even go beyond it. 


References 


Battigalli, P. (2006). Rationalization in signaling games: Theory and applications, International 

Game Theory Review 8(1): 67–93. 

Benz, A. and van Rooij, R. (2007). Optimal assertions and what they implicate, Topoi 

26: 63–78. 

Camerer, C. F., Ho, T.-H. and Chong, J.-K. (2004). A cognitive hierarchy model of games, 

The Quarterly Journal of Economics 119(3): 861–898. 

Crawford, V. P. (2003). Lying for strategic advantage: Rational and boundedly rational 

misrepresentation of intentions, American Economic Review 93(1): 133–149. 

Crawford, V. P. (2007). Let’s talk it over: Coordination via preplay communication with 

level-k thinking. Unpublished Manuscript. 

Farrell, J. (1993). Meaning and credibility in cheap-talk games, Games and Economic 

Behavior 5: 514–531. 

Grice, P. H. (1989). Studies in the Ways of Words, Harvard University Press. 

Ho, T.-H., Camerer, C. and Weigelt, K. (1998). Iterated dominance and iterated best 

response in experimental “p-beauty contests”, The American Economic Review 

88(4): 947–969. 

Jäger, G. (2007). Game dynamics connects semantics and pragmatics, in A.-V. Pietarinen 

(ed.), Game Theory and Linguistic Meaning, Elsevier, pp. 89–102. 

Jäger, G. (2008). Game theory in semantics and pragmatics. Manuscript, University of 

Bielefeld. 

Katz, J. J. (1981). Language and Other Abstract Objects, Basil Blackwell. 

Katzir, R. (2007). Structurally-defined alternatives. To appear in Linguistics and Philosophy. 

Lauer, S. (2007). Some kinds of deception do not occur: Credibility and the maxim of 

sincerity. Unpublished Manuscript. Amsterdam, Stanford. 

Matthews, S. A., Okuno-Fujiwara, M. and Postlewaite, A. (1991). Refining cheap talk 

equilibria, Journal of Economic Theory 55: 247–273. 

Rabin, M. (1990). Communication between rational agents, Journal of Economic Theory 

51: 144–170. 

Stahl, D. O. and Wilson, P. W. (1995). On players’ models of other players: Theory and 

experimental evidence, Games and Economic Behavior 10: 218–254. 

Stalnaker, R. (2006). Saying and meaning, cheap talk and credibility, in A. Benz, G. Jäger 

and R. van Rooij (eds), Game Theory and Pragmatics, Palgrave MacMillan, pp. 83–100.


TOWARDS A NEW CHARACTERISATION 

OF CHOMSKY'S HIERARCHY VIA ACCEPTANCE PROBABILITY 


Michael Hartwig

Multimedia University, Cyberjaya, Malaysia

Abstract. Researchers have recently studied the acceptance probability of P and
NP languages, hoping to find new ways of differentiating the two classes. The paper
outlines the author's findings related to the acceptance probability of regular and
context-free languages, which we describe using the notion of a difference shrinking
chain. A first proof technique, the inflating lemma, which is based on these results and is able
to separate higher languages from regular languages up to star height 1, is given, together with
some incentives to apply those techniques to higher classes.



“The major quest for the complexity theory community is finding methods that may 

separate classes.” (Buhrmann & Torenvliet 2005) Although impressive progress has
recently been made in complexity theory, the need for new,
creative approaches that may result in methods for separating classes
has not diminished and is nicely exemplified by the long-outstanding P vs. NP problem.
One of the recent approaches studies properties of the acceptance
probability function of such languages, that is, the form of the graph of the
function which takes as an argument a natural number n and returns the ratio between
the number of accepted words of length n in the given language and the number of all possible words
of the same length. This study has led to many discoveries, such as the so-called phase
transition in the acceptance probability graph of NP-complete problems (Clote &
Kranakis 2002, Dubois et al. 2000). There has been hope that if we were able to
describe this phase transition with more and more precision (Achlioptas et al.
2001, Kirousis et al. 1998) we would then also be able to separate P from NP.
Unfortunately, this has not yet happened.

Like other researchers we have therefore turned our attention first to smaller classes
such as regular and context-free languages. Given such a language L, we define the
density function dL(n) = |L ∩ Σ^n|, counting the number of words of length n in L. The
study of the density of regular languages has a longer history (Schützenberger 1962,
Eilenberg 1974, Rozenberg et al. 1997, Bodirsky et al. 2004). Languages with a density
function that can be bounded from above by a polynomial (i.e. there exists a polynomial
p(x) such that dL(n) ≤ p(n) for all n) are called sparse. If on the other hand there exists a real
number h > 1 such that dL(n) ≥ h^n for infinitely many n ≥ 0, then L is called dense
(Demaine et al. 2003, Krieger et al. 2007). Notice that the language a*b* is a sparse language, while
the language that includes all words over a binary alphabet that start with the letter a
(i.e. a(a+b)*) is dense. As described in (Szilard et al. 1992, Rozenberg et al. 1997) a
regular language is sparse “if and only if it can be represented as a finite union of
regular expressions of the form xy1*z1...ym*zm, where x, y1, z1, ..., ym, zm are all strings in
Σ*”. Such regular languages are also called SLRE and are equivalent to bounded regular



languages (Habermehl et al. 2000). Nevertheless, it is not difficult to see that the
majority of all regular languages are dense. Flajolet (1987) demonstrated that a regular
language is either sparse or dense, which was recently generalized to context-free
languages (Ilie 2000, Incitti 2000). While it is interesting in its own right to study such
properties, Demaine et al. (2003) showed that only sparse regular languages have
the power to restrict NP-complete problems such that they are polynomially solvable. In
other words, the intersection of such a regular language with an NP-complete
problem results in a language from P. Eisman et al. (2005) proposed another

application by stating that the density function could be used in some application areas 

such as streaming algorithms, where “rapid computation must be performed (often in a 

single pass)”. 

Still, we feel that it is often more interesting to study the acceptance probability
Acc(L, n) = |L ∩ Σ^n| / |Σ^n| of a given language rather than its density, that is, the ratio
between the number of accepted words and all possible words of a given length. As
mentioned above, a(a+b)* has exponential density, but its acceptance
probability is stable, as Acc(a(a+b)*, n) = 0.5 for all n ≥ 1, which seems to describe the quantity of accepted
words more appropriately. Secondly, such a view allows us to treat both
sparse and dense languages together and to study common properties. In (Hartwig et al. 2006a,
Hartwig et al. 2006b) we showed that the acceptance probability graph is indeed
expressive enough to separate complexity classes, making it an acceptable candidate in
the above-mentioned quest. Our objective in using such properties to separate these
classes is to familiarize ourselves with the properties, techniques and applications involved, and
to get a better understanding of possible uses of acceptance probability
graphs in higher classes. In (Hartwig et al. 2006a) we described the acceptance
probability of very low regular languages, and in (Hartwig et al. 2006b) we presented a
proof technique (the inflating lemma) that is powerful enough to separate many higher
languages from regular languages up to star height 1 and can be compared with the
well-known pumping lemma (Sipser 1997).¹
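The two running examples can be checked by brute force. The following Python sketch is an illustration only and not part of the original work: it enumerates all words of a given length over Σ = {a, b} and counts those matching a pattern. The helper names `density` and `acc`, and the translation of the two expressions into Python `re` syntax, are assumptions introduced here.

```python
from fractions import Fraction
from itertools import product
import re


def density(pattern: str, n: int, alphabet: str = "ab") -> int:
    """Count the words of length n over `alphabet` that fully match `pattern`."""
    regex = re.compile(pattern)
    return sum(
        1
        for letters in product(alphabet, repeat=n)
        if regex.fullmatch("".join(letters))
    )


def acc(pattern: str, n: int, alphabet: str = "ab") -> Fraction:
    """Acceptance probability: accepted words of length n over all words of length n."""
    return Fraction(density(pattern, n, alphabet), len(alphabet) ** n)


SPARSE = r"a*b*"    # the sparse example a*b*
DENSE = r"a[ab]*"   # the dense example a(a+b)*

for n in range(1, 7):
    print(n, density(SPARSE, n), density(DENSE, n), acc(DENSE, n))
```

For these lengths the count for a*b* grows linearly (n + 1 words), the count for a(a+b)* doubles with every step, and the acceptance probability of a(a+b)* stays at 1/2. Exhaustive enumeration is exponential in n, so this is only suitable for small lengths.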

Inflating Lemma If L ∈ REG(1) and L has increasing acceptance probability then
there exist a length n0 and a natural number k ≥ 1 such that for all w ∈ L with |w| ≥ n0:

w = pr ∈ L → p(Σ^k)*r ⊆ L.

An example application would be the following proof. 

Example (MAJORITY does not belong to REG(1)) L = {w | w ∈ Σ* and w has at least as
many a's as b's} ∉ REG(1).

Proof. Acc(L, n) is constantly increasing; hence the inflating lemma can be applied. But 

none of the words accepted can be inflated. We could take any word and position and 

insert (or: inflate with) as many b’s as needed until the word has more b’s than a’s. 

□ 
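The inflation step of this proof can be made concrete. The sketch below is an illustration under our own assumptions (the function names are hypothetical, and the values of k are arbitrary since the lemma only asserts that some k exists): for a word w in MAJORITY, any split w = pr and any k, it builds a member of p(Σ^k)*r that falls outside MAJORITY by inserting enough blocks b^k.

```python
def in_majority(word: str) -> bool:
    """MAJORITY as defined in the example: at least as many a's as b's."""
    return word.count("a") >= word.count("b")


def failing_inflation(word: str, position: int, k: int) -> str:
    """Return a word from p(Sigma^k)*r that is not in MAJORITY.

    `word` is split as p = word[:position] and r = word[position:]; the
    inserted material consists of blocks b^k, each of which lies in Sigma^k.
    """
    surplus = word.count("a") - word.count("b")
    blocks = surplus // k + 1  # guarantees blocks * k > surplus
    return word[:position] + "b" * (k * blocks) + word[position:]


w = "aabab"
for k in (1, 2, 3):
    for position in range(len(w) + 1):
        inflated = failing_inflation(w, position, k)
        assert in_majority(w) and not in_majority(inflated)
```

The loop checks every split position of one sample word; it illustrates the argument rather than replacing it.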

1 Although the inflating lemma seems to have only limited applicability the following work suggests 

that every regular language has either increasing, stable or decreasing chains. Furthermore, if L is regular 

and of decreasing acceptance probability, then the lemma could be applied to the complement of L. 


The present paper continues this work by providing an overview of the status of our
work on the acceptance probability of regular and context-free languages over binary
alphabets, claiming that both classes have acceptance probability graphs that can be split
into increasing, decreasing or stable chains with a decreasing (or shrinking)
difference. We think that the minimal number of such chains should be studied in
more detail and related to the size of any program or machine accepting
the language. Knowing that NP-complete problems exhibit phase transitions in their
acceptance probability graphs, switching from difference shrinking to difference
increasing sections and vice versa, we believe that techniques making use of those
properties may contribute to the separation of higher classes, too.

2 Preliminaries 


We use the following definitions: The alphabet for all strings is Σ = {a, b}. The length
of a string w is given by |w|; all sets L1, L2, ... are considered subsets of Σ*. A regular
expression e over Σ is built from the symbols in Σ, the symbol λ, the binary operators +
and ·, and the unary operator *. The language specified by a regular expression e is denoted by
L(e) and is referred to as a regular language (Kleene 1956, McCulloch et al. 1943). We call
a regular expression unambiguous (or non-overlapping) if and only if its
corresponding NFA is unambiguous. “An NFA is called unambiguous if for each word
w there is at most one path from the initial state to a final state that spells out w.”
(Bruggemann-Klein et al. 2007, Moreira et al. 2005) It is important to know that all
regular languages are unambiguous (Giammarresi et al. 2001) and can hence be
described by an unambiguous regular expression. sh(e) denotes the star height of a
regular expression e, and REG(1) denotes the class of all regular languages having a star height of 1
or less.

As mentioned in the introduction, the density of a language counts the number of
accepted words per given length and is defined as

dL(n) = |L ∩ Σ^n|,

while the acceptance probability of a language is defined as the ratio between the
number of accepted words dL(n) and the number of all words of a given length,

Acc(L, n) = |L ∩ Σ^n| / |Σ^n|.

3 Regular acceptance probability 

3.1 Low regular languages 

Describing the acceptance probability of a finite language is straightforward. 

Lemma (Finite Languages) For any finite language L: Acc(L, n) = O(0). 

Proof. If L is finite then there exists a length after which no word is accepted by the 

language. The acceptance probability reaches 0. 


□ 

Regular languages which can be described by a regular expression having star height 

0 or at most one expression using the star operator and being of the form (a+b)* have 

constant acceptance probability. 

Lemma (Simple Regular Languages) If L = w1(a+b)*w2 with w1, w2 words, there exists
a constant c such that:

Acc(L, n) = O(c).

Proof. The smallest accepted word of the language L is of length |w| = |w1| + |w2|. As
there is only one such smallest word, Acc(L, |w|) = 1/2^|w| = c. For any length n greater
than |w| we have dL(n) = 2 · dL(n-1). Hence the acceptance ratio
remains stable.

□ 
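As a worked instance of the lemma, consider L2 = ab(a+b)* from Figure 1. Reading it as w1 = ab and w2 = λ is our own instantiation of the lemma's symbols, chosen for illustration:

```latex
% Worked instance of the lemma for L_2 = ab(a+b)^*, with w_1 = ab and w_2 = \lambda.
\begin{align*}
  d_{L_2}(n) &= 2^{\,n-2}
    && \text{for } n \ge 2 \text{ (the prefix } ab \text{ is fixed, the remaining } n-2 \text{ letters are free)},\\
  \mathrm{Acc}(L_2, n) &= \frac{d_{L_2}(n)}{2^{n}} = \frac{2^{\,n-2}}{2^{n}} = \frac{1}{4}
    = \frac{1}{2^{|w_1| + |w_2|}} = c .
\end{align*}
```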

It is then not difficult to see that any union of simple regular languages (in
the above sense) will again only yield a language with constant acceptance
probability.

[Plots omitted; horizontal axis: word length 0 to 8, vertical axis: acceptance probability 0 to 1.]

Figure 1. Acceptance probability graphs of low regular languages.
Left L1 = {a, aba} (finite), right L2 = ab(a+b)*.

3.2 Regular languages having one star 

Languages built upon regular expressions using the star operator at most once include 

also languages with a decreasing acceptance probability, if the expression under the star 

is not entirely composed of (a+b)* expressions. The length of the expression under the 

star defines the step width d decomposing the acceptance probability graph into d 

chains. We will have d-1 chains with the acceptance probability O(0) and one chain 

being either stable or decreasing. 

[Plots omitted; horizontal axis: word length 0 to 8, vertical axis: acceptance probability 0 to 1.]

Figure 2. Acceptance probability graph of L3 = b(ba)*. L3 has a step width of 2 with one chain being
stable (dL3(0) = dL3(2) = ... = 0), while the remaining elements belong to a chain with its peaks
constantly decreasing by ¾.



Lemma (Regular Languages with One Star) For any regular language L = w1w2*w3 with w1, 

w3 words and sh(w2) = 0 there exists a minimal length n0 such that for all n > n0: 

Acc(L, n) ≤ Acc(L, n-|w2|). 

Proof. The length d = |w2| is referred to as the step width of this language; it connects the
peaks of the acceptance probability graph. The number of accepted words of length n can be
traced back to the number of accepted words of length n-|w2|, as we can apply the word under
the star once more. Hence, Acc(L, n) = c · Acc(L, n-|w2|). The constant c is easily determined from w2, and the
fact that the chains are either decreasing or stable is obvious and also follows directly from the
inflating lemma.

□ 
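The constant c can be written out explicitly in the simplest case. The derivation below is a sketch under the assumption, made here for illustration, that w2 is a single non-empty word, so that every length carries at most one accepted word:

```latex
% Assumption: w_2 is a single non-empty word and n \ge |w_1| + |w_2| + |w_3|.
\begin{align*}
  d_L(n) &= d_L(n - |w_2|),\\
  \mathrm{Acc}(L, n) &= \frac{d_L(n)}{2^{n}}
     = \frac{d_L(n - |w_2|)}{2^{|w_2|}\, 2^{\,n - |w_2|}}
     = 2^{-|w_2|}\, \mathrm{Acc}(L, n - |w_2|),
  \qquad\text{so } c = 2^{-|w_2|} .
\end{align*}
% Example: L_3 = b(ba)^* has |w_2| = 2, hence c = 1/4,
% i.e. each peak is 3/4 lower than the previous one on the same chain.
```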

3.3 Regular languages up to star height 1 

Regular languages up to star height 1 provide already a wide range of different 

acceptance probability graphs. 

Lemma (Regular Languages up to Star Height 1) If L ∈ REG(1) then there exist
constants s, um, and vm such that:

dL(s) = u0,

dL(s+1) = u1,

... ,

dL(s+m) = um,

dL(n) = u1dL(n-v1) + u2dL(n-v2) + ... + umdL(n-vm).

Proof. (Sketch) See (Hartwig 2008) for the complete proof. If L ∈ REG(1) then L has an 

unambiguous regular expression of the following form: 

L = L1 + L2 + ... + Lk 

where Li = Ri0Ri1...Rit with sh(Rij) ≤ 1 

Calculating the number of accepted words for each Li is done successively, starting from
the left. The number of accepted words of length n for Ri0 can be determined from the
lengths of all expressions under the star. For example, let

L4 = b (aa + bbb)* b (ab + bba)* b.

We would have R4,0 = (aa + bbb)* and R4,1 = (ab + bba)*, which would give us for R4,0:

dR4,0(3) = 1 // as |b| + |b| + |b| = 3

dR4,0(n) = dR4,0(n-|aa|) + dR4,0(n-|bbb|)

= dR4,0(n-2) + dR4,0(n-3)

This process continues until the last expression within Li is reached, successively
adding all the accepted words of the previously considered components.


dR4,1(n) = dR4,1(n-|ab|) + dR4,1(n-|bba|) + (number of accepted words for R4,0)

= dR4,1(n-2) + dR4,1(n-3) + dR4,0(n)

And this would give us in our (simple) case,

dL4(n) = dR4,1(n)

The above result (here depending on R4,0 and R4,1) could then be converted into a recursive
formula referring only to itself and obeying the requirements. In the example case,

dL4(3) = 1,

dL4(n) = 2dL4(n-2) + 2dL4(n-3) - dL4(n-4) - 2dL4(n-5) - dL4(n-6).
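The elimination step that turns the two coupled recurrences into a single self-referential one can be checked numerically. The Python sketch below is an illustration, not the author's algorithm: the function names are hypothetical, and treating all values below n = 3 as zero is an assumption added here. It verifies only the algebra of the elimination; it does not test whether the recurrences count every word of L4 exactly once, which depends on the expression being unambiguous.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def d_r0(n: int) -> int:
    """Recurrence given for R4,0: start value 1 at n = 3, zero below (assumed)."""
    if n < 3:
        return 0
    if n == 3:
        return 1
    return d_r0(n - 2) + d_r0(n - 3)


@lru_cache(maxsize=None)
def d_r1(n: int) -> int:
    """Recurrence given for R4,1, which also adds the count for R4,0."""
    if n < 3:
        return 0
    return d_r1(n - 2) + d_r1(n - 3) + d_r0(n)


def closed_recurrence(n: int) -> int:
    """Right-hand side of the self-referential formula for dL4(n)."""
    d = d_r1
    return 2 * d(n - 2) + 2 * d(n - 3) - d(n - 4) - 2 * d(n - 5) - d(n - 6)


# The single recurrence agrees with the coupled one from n = 4 onwards.
for n in range(4, 40):
    assert d_r1(n) == closed_recurrence(n)
```

The coefficients arise from applying the operator of the inner recurrence twice: (1 - x^2 - x^3)^2 = 1 - 2x^2 - 2x^3 + x^4 + 2x^5 + x^6.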

[Plots omitted; horizontal axis: word length 1 to 8, vertical axis: acceptance probability 0.00 to 0.20.]

Figure 3. Acceptance probability graphs of higher regular languages up to star height 1. Left
L4 = b (aa + bbb)* b (ab + bba)* b from the above example,
right L5 = a (a+b)* + (b + ba)* with a union operator also outside the star.

To describe the acceptance probability graphs of such regular and higher languages we
introduce the notion of a difference shrinking chain.

Definition (Difference Shrinking Chain) We say that a language L has a difference
shrinking chain if there exist a step width d and a length n0 such that for all i ≥ 0:

|Acc(L, n0+(i+2)d) - Acc(L, n0+(i+1)d)| ≤ |Acc(L, n0+(i+1)d) - Acc(L, n0+i·d)|
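The definition translates directly into a check on a finite prefix of a chain. The following Python sketch is an illustration with hypothetical helper names; it can only give evidence for the property on the lengths inspected, since the definition quantifies over all i ≥ 0. The example values are those of L3 = b(ba)*, which has exactly one word of each odd length.

```python
from fractions import Fraction


def chain(acc, n0: int, d: int, length: int):
    """Values Acc(L, n0), Acc(L, n0 + d), ... along one chain of step width d."""
    return [acc(n0 + i * d) for i in range(length)]


def is_difference_shrinking(values) -> bool:
    """Check |v[i+2] - v[i+1]| <= |v[i+1] - v[i]| on a finite prefix of a chain."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return all(later <= earlier for earlier, later in zip(diffs, diffs[1:]))


def acc_l3(n: int) -> Fraction:
    """Acc(L3, n) for L3 = b(ba)*: one accepted word for odd n, none for even n."""
    return Fraction(1 if n % 2 == 1 else 0, 2 ** n)


odd_chain = chain(acc_l3, n0=1, d=2, length=6)    # 1/2, 1/8, 1/32, ...
even_chain = chain(acc_l3, n0=0, d=2, length=6)   # constantly 0
print(is_difference_shrinking(odd_chain), is_difference_shrinking(even_chain))
```

With step width 2 both chains of L3 pass the check on the inspected prefix, in line with the description of Figure 2.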

[Plot omitted; vertical axis: acceptance probability 0.00 to 1.00, with two successive differences Δ1 and Δ2 marked on one chain.]

Figure 4. An example language with only difference shrinking chains. A chain is called
difference shrinking if for such a chain and any length n the speed of the increase (or
decrease) slows constantly, i.e. Δ2 ≤ Δ1.

We call a language difference shrinking if there exists a step width d ≥ 1
decomposing the acceptance probability graph into d difference shrinking chains. We
call a language a regular increasing language if it can be decomposed into at
least one increasing and 0 or more stable chains. (Regular decreasing languages are
defined in a similar way.) A language is furthermore called strongly increasing if
only one increasing chain completely describes the graph. Similar concepts
apply to strongly decreasing languages; such languages are also said to have
monotone acceptance probability.

Lemma (Star Height 1 Languages are Difference Shrinking) If L ∈ REG(1), then L 

is difference shrinking. 

Proof. See (Hartwig 2008). 

The proof includes an algorithm that computes, for any given regular language,
a step width which might not be minimal but which decomposes the
language's acceptance probability graph into such difference shrinking chains. It is not
difficult to see that most regular languages up to star height 1 also have only
monotone chains, and we claim that this is also true for the remaining languages.

4 The acceptance probability of context free languages 

Calculating the number of accepted words of a regular language with a star height of 2
or higher seems to require a different approach. Let L6 = (w1*w2*)*. We could then
compute the number of accepted words of length n as follows: dL6(n) = dL6(1)·dL6(n-1) +
dL6(2)·dL6(n-2) + ... A word of length n is a composition of an accepted word of
length c ≤ n from w1 and an accepted word of length n-c from w2. Surprisingly, the
same approach also works for calculating the acceptance probability of a
context-free language, as the following examples suggest.

Example. Let G1 be the following grammar: 

S => SaN | a 

N => bN | bb 

We could compute the number of accepted words that are derived from each of the given
nonterminals. The rule S => SaN specifies that a terminal word of length n can be constructed from any
smaller words from S and N as long as the sum of their lengths equals n-1. (n-1, because the
letter a takes up one place.) This would bring us to the following:

\[
\begin{aligned}
d_S(1) &= 1, \qquad d_N(2) = 1,\\
d_N(n) &= d_N(n-1),\\
d_S(n) &= \sum_{i=0}^{n-1} d_S(i)\, d_N(n-1-i).
\end{aligned}
\]

Having S as the start symbol, we can calculate the number of accepted words for the given
grammar with dG1(n) = dS(n). As the language is also regular (a(abb⁺)*, i.e. a followed by blocks
a b^k with k ≥ 2), the number of
accepted words could also be calculated with dS(1) = 1, dS(2) = 0 and dS(n) = dS(n-1) + dS(n-3) for n ≥ 3,
following the approach of the previous sections.
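Both ways of counting the words of G1 can be compared against a brute-force enumeration. The Python sketch below is an illustration with hypothetical function names; the regular expression used in it, an initial a followed by blocks a b^k with k ≥ 2, is read off the grammar and is our own rendering. Values outside the stated start values are assumed to be zero.

```python
from functools import lru_cache
from itertools import product
import re


@lru_cache(maxsize=None)
def d_n(n: int) -> int:
    """Words of length n derived from N => bN | bb."""
    if n < 2:
        return 0
    if n == 2:
        return 1
    return d_n(n - 1)


@lru_cache(maxsize=None)
def d_s(n: int) -> int:
    """Words of length n derived from S => SaN | a (sum over all splits)."""
    if n < 1:
        return 0
    if n == 1:
        return 1
    return sum(d_s(i) * d_n(n - 1 - i) for i in range(n))


def brute_force(n: int) -> int:
    """Count words of length n of the form a(ab^k)* with k >= 2 by enumeration."""
    regex = re.compile(r"a(?:ab{2,})*")
    return sum(1 for t in product("ab", repeat=n) if regex.fullmatch("".join(t)))


for n in range(1, 11):
    print(n, d_s(n), brute_force(n))
```

For the lengths inspected the grammar-based count and the enumeration coincide, and from n = 3 onwards the values also follow the shorter recurrence over n-1 and n-3.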

Example. Let G2 be the following grammar: 


S => aSb | ab 

Although L(G2) is a truly context-free language, calculating its density
remains quite simple and suggests that the acceptance probability of all context-free
languages can be described completely with a form similar to the one presented for
star-height 1 languages.

dS(2) = 1 

dS(n) = dS(n-2) 

Based on the above examples and on the Chomsky-Schützenberger Theorem, which states that
for every context-free language accepted by a PDA M = (Q, Σ, Γ, δ, q0, Z0, F) there are a regular
language R, the Dyck set D2 and two homomorphisms g, h such that L(M) = h(g^{-1}(D2)
∩ R), we claim that context-free languages are equally difference shrinking and
monotone.

While we can foresee challenges in using our results for higher
classes in the construction of new proof techniques, the long-outstanding P vs NP
problem should provide enough incentive to make an attempt. The phase transition
that such NP-complete problems exhibit is only possible because the language's
acceptance probability switches from sections that are difference shrinking to sections that are difference
increasing, as shown in the example below.

Figure 6. Example languages from NP complete having acceptance probability graphs with 

sections of increasing difference (some of them indicated). 

We think that finding the minimal step width for a given language would help in the 

search for new proof techniques. As mentioned earlier, the minimal step width should 

indicate more properties related to the complexity of accepting the language. 

5 Conclusions 


We have given a first overview of a new attempt at characterizing classes from
the Chomsky Hierarchy using properties derived from the languages' acceptance
probability graphs. Regular languages up to star height 1 have graphs that
can be split into difference shrinking chains. Current research suggests that this also holds
for context-free languages. Knowing that NP-complete languages usually have
graphs performing a phase transition between difference shrinking and difference
increasing sections, we recommend further work. In particular, the problem of finding
the minimal step width seems to be crucial in the construction of new proof
techniques.

Class                    Acceptance Probability                                             Properties

finite                   Acc(L, n) = 0                                                      convergent to 0

simple regular           Acc(L, n) = 2 dL(n-1) / 2^n                                        convergent to a constant (stable)

one star                 Acc(L, n) = c dL(n-d) / 2^n                                        as above & at most one decreasing chain

star height 1            Acc(L, n) = [u1dL(n-v1) + u2dL(n-v2) + ... + umdL(n-vm)] / 2^n     monotone², difference shrinking chains

regular, context-free    Acc(L, n) = [Σ_{i=0}^{n-d} dS(i) dN(n-d-i) ...] / 2^n              monotone, difference shrinking chains³

context sensitive        Acc(L, n) = ?                                                      ? as above & difference increasing chains, non-monotonic chains

Table 1. Acceptance probability of different classes from the Chomsky Hierarchy (state of
the art; the class of context-free languages is currently being looked at).

Acknowledgments 

We'd like to thank the anonymous referees for their comments. 

References 

H. Buhrmann & L. Torenvliet (2005). 'A Post's Program for Complexity Theory', BEATCS 85 

(pp. 41-51) 

P. Clote & E. Kranakis (2002). 'Boolean Functions and Computation Models', Springer

M. Hartwig et al. (2006a). 'In Search of a New Proof Technique', M2USIC06 

M. Hartwig et al. (2006b). 'Proving Non Regularity using Acceptance Probability Techniques', 

CSCM2006 

A. Bruggemann-Klein & R. Mesing (2007). 'Regular Expressions into Finite Automata',
http://webcourse.cs.technion.ac.il/236826/Spring2005/ho/WCFiles/RegularExpressions into Finite Automata.doc

D. Giammarresi, R. Montalbano, D. Wood (2001). 'Block-Deterministic Languages', 

ICTCS01 

M. Sipser (1997). 'Introduction to the Theory of Computation', PWS Publishing Company (pp.
63ff.)

2 Claimed for some languages.

3 Claimed.

O. Dubois et al. (2000). 'Typical Random 3-SAT Formulae and the Satisfiability Threshold', 

SODA '00 (pp. 126-127) 

D. Achlioptas et al. (2001). 'The Phase Transition in 1-in-k SAT and NAE 3-SAT', SODA '01 

(pp. 721-722) 

L. Kirousis et al. (1998). 'Approximating the unsatisfiability threshold of random formulas', 

Random Structures and Algorithms 12(3) (pp. 253-269) 

D. Achlioptas et al. (2001). 'A Sharp Threshold in Proof Complexity Yields a
Lower Bound for Satisfiability Search', Journal of Comp. & Sys. Sci. 68 (2)

M. Hartwig (2008), 'Regular Languages up to Star Height 1 have Difference Shrinking 

Acceptance Probability', TMFCS-08 

M. Bodirsky et al. (2004), 'Efficiently computing the density of regular languages', Proceedings 

of Latin American Informatics (LATIN'04), pages 262-270, Buenos Aires

M.P. Schützenberger (1962), 'Finite counting automata', Information and Control 5(2), 91-107 

S. Eilenberg (1974), 'Automata, Languages, and Machines', Academic Press, Inc., Orlando,

Florida, USA 

A. Szilard et al.(1992), 'Characterizing Regular Languages with Polynomial Densities', Lecture 

Notes in Computer Science, Volume 629, Springer, 494-503 

G. Rozenberg et al. (1997), 'Handbook of Formal Languages', Chapter 2: Regular Languages, 

Springer 

E. D. Demaine et al. (2003), 'On Universally Easy Classes for NP-complete Problems', 

Theoretical Computer Science, Vol. 304, pages 471-476 

D. Krieger et al. (2007), 'Finding the Growth Rate of a Regular Language in Polynomial Time', 

CoRR abs/0711.4990 

P. Habermehl et al. (2000), 'A Note on SLRE', http://citeseer.ist.psu.edu/375870.html 

P. Flajolet (1987), 'Analytic Models and Ambiguity of Context-Free Languages', TCS, 49:283- 

309 

L. Ilie et al. (2000), 'A Characterization of Polyslender Context-Free Languages', Theoret. 

Informatics Appl., 34(1):77-86 

R. Incitti (2000), 'The Growth Function of Context-Free Languages', Theoretical Computer 

Science, 255:601-605 

G. Eisman et al. (2005), 'Approximate Recognition of Non-regular Languages by Finite 

Automata', Proceedings of the Twenty-Eighth Australasian Computer Science Conference 

(ACSC2005), Newcastle, Australia 

S. Kleene (1956), 'Representation of events in nerve nets and finite automata', Automata 

Studies, Princeton University Press, Princeton, USA, 3-42 

W. S. McCulloch et al. (1943), 'A logical calculus of the ideas immanent in nervous activity',

Bull. Math. Biophys, 5:115-133 

N. Moreira et al. (2005), 'On the Density of Languages Representing Finite Set Partitions', 

Journal of Integer Sequences, Vol. 8 



DISTANCE EFFECTS IN SENTENCE PROCESSING 

Simon Hopp 

University of Konstanz 

Abstract. This paper reports results from two experiments investigating distance 

effects in sentence processing. It is well known that the processing difficulty of a
dependency relation increases with the distance between the two items concerned.

The paper addresses the question what exactly determines ‘distance’: Time or 

amount of linguistic material between the first and the second item. Experiment 1 

disentangles these factors and suggests that linguistic material is the source of 

difficulty. Experiment 2 investigates the role of the characteristics of that 

intervening material. The logic of this experiment is based on Gibson’s (2000) 

claim that the ease of integrating a word into the CPPM decreases with the number 

of newly introduced discourse referents. In particular, experiment 2 asks whether 

adverbials which do not introduce new discourse referents have the same effect. 

The results indicate that while intervening discourse referents elicit the expected 

effect, adverbials do not show any effect at all. 

1 Working Memory and Sentence Processing 

In cognitive science there is a broad agreement that a certain kind of store is necessary 

for all kinds of complex cognitive tasks such as mental arithmetic or language 

processing. The following example (cf. Gibson 2000) illustrates the need for a short 

term store in sentence processing. 

(1) The reporter [that the senator attacked] admitted the error. 

In (1) the short term store (or working memory) has to keep the determiner phrase (DP) 

the reporter active over the period of time in which the relative clause is processed, to 

ensure that the human sentence parser is able to link the DP to the verb admitted and 

then to check the grammatical features correctly. Since sentences can contain several 

dependencies between items and these items can be separated by further items, storing 

linguistic information over a short time is a basic requirement for sentence processing. 

As has long been noticed in linguistic theory, sentences like (1) often lead to processing 

difficulties (e.g. Just & Carpenter 1992). One of the reasons for this fact is the distance 

between the linguistic items dependent on each other. It seems that integrating a word w 

into the CPPM (Current Partial Phrase Marker) is often adversely affected by the 

distance between w and information within the CPPM necessary for integrating w. 

However, it is still unclear why prior pieces in the CPPM are difficult to retrieve at later 

points. There are two prominent mechanisms that are said to be responsible for 

forgetting over a short term: The amount of time that passes between two items and 

linguistic material that has to be processed between two items. According to time-based 

decay earlier information might already have faded away at the point when it is needed 

again. In current models of working memory involvement in sentence processing, time-based

decay either plays a decisive role (e.g., Lewis & Vasishth 2005) or is taken as one 

possible candidate for contributing to the cost of integrating a word into the sentence 



(Levy et al. 2007). The alternatives to theories of time-based decay are event-based 

models (cf. Lewandowsky et al. 2004). Those models admit that forgetting in working 

memory is observed over time, but they predict that time is not the crucial factor for this 

phenomenon. Some event-based models argue that it is rather interference of linguistic 

material that leads to processing difficulties (e.g. Nairne 1990). Items that have already

been processed may be forgotten by the time they are needed again, because new 

incoming material interferes. Clarifying the role of time-based decay versus 

interference-based forgetting is complicated because normally amount of linguistic 

material and amount of time are confounded. 

2 Case Checking as a Test Case 

In this paper, I present two experiments that were run to investigate the nature of 

forgetting in working memory. 1 The focus was on the process of linking and checking 

in German verb-final clauses adhering to the scheme in (2). When integrating the verb 

in clause-final position, the case of the NP must be retained until the end of the sentence in

order to check it against the case feature of the verb. 

(2) .. dass NP[case: X] … {distance} … verb[case: Y] 

An example of a verb-final clause in German, as it was used in the following 

experiments, is given in (3). 

(3) Ich glaube, dass die Studentin das wichtige Buch gelesen hat. 

I think that the student(fem) the important book read has 

‘I think, that the student has read the important book.’ 

The auxiliary in clause-final position hat asks for nominative case in the NP die 

Studentin. The memory trace of the case feature of this NP has to be memorized over a 

certain distance until the auxiliary hat is reached. The human sentence parser is then 

able to link the two dependent items - NP and verb - and to check the case features of 

both items. If, however, the distance between the verb and the related NP is too long,

then working memory is unable to keep the memory trace until the end of the sentence.

In this case processing difficulties arise which can be measured experimentally. 

As mentioned above, amount of linguistic material and amount of time are 

normally confounded. In the first experiment the two factors were disentangled to 

investigate their respective impact on the human sentence parser independently. This 

builds on related work by Lewandowsky et al. (2004) and Saito & Miyake (2004).

The second experiment focused on sentence complexity according to the 

Dependency Locality Theory (DLT) (Gibson 2000). The DLT assumes that the costs of 

integrating a word w increase with the number of new discourse referents intervening 

between w and information needed to integrate w. For case-checking in German this 

prediction has not been tested so far. 

3 Experiment 1: Time-Based Decay versus Interference 

As shown in (3), the issue of forgetting in working memory was addressed by 

investigating the process of case-checking during the parsing of German verb-final 

clauses. By integrating the verb in clause-final position, the case of NP must be 

1 The experiments were part of a bigger project on sentence processing together with Markus Bader. 

retrieved in order to check it against the case feature of the verb. If the intervening

distance is too long, essential information about case features will be lost at a later point

when it is needed again. To be able to investigate the nature of ‘distance’ the crucial 

factors have to be disentangled. This is achieved by manipulating the factors 

independently. First of all, a procedure was chosen that allowed the stimuli to be presented

at an experimenter-controlled pace in a non-cumulative word-by-word fashion (for details see section

Procedure). Two different presentation rates, one for a slow and one for a fast 

presentation, were preset. Second, the intervening material between the related items 

was manipulated. Sentences as in (3) were created in a long and in a short version. 

Additional adverbials (e.g. ‘für die letzte Prüfung im Mai’) were inserted for the long 

versions, as can be seen in (4): 

(4) Ich glaube, dass die Studentin (für die letzte Prüfung im Mai) 

I think that the student(fem) (for the last exam in May)

das wichtige Buch gelesen hat. 

the important book read has 

‘I think, that the student has read the important book.’ 

A cross-combination of the two independently manipulated factors led to four different 

conditions that were presented (see Figure 1). Sentence (a) is a short sentence presented 

in the fast presentation rate (short-fast). Sentence (b) contains additional material and is 

also presented in the fast pace (long-fast). Sentences (c) and (d) are both presented in 

the slow pace. Note that (c) does not contain any additional material (short-slow), 

whereas (d) contains an additional adverbial (long-slow). Note especially that 

conditions (b) and (c) differ in the amount of intervening material, but - due to the 

different presentation rates - they are matched in the amount of time. 

(a) short-fast: NP1 das wichtige Buch V AUX
(b) long-fast: NP1 für die letzte Prüfung im Mai das wichtige Buch V AUX
(c) short-slow: NP1 das wichtige Buch V AUX
(d) long-slow: NP1 für die letzte Prüfung im Mai das wichtige Buch V AUX

Figure 1: Presentation Time of all 4 Sentence Types of Experiment 1

This design allows the impact of both factors to be analyzed independently. As this experiment

partly builds on the work of Lewandowsky et al. (2004), their terminology for the crucial factors is

adopted: Time (amount of time) and Event (intervening material).

Participants and Material

16 students of the University of Konstanz participated for course credit or payment. All 

participants were native speakers of German and naive with respect to the purpose of the 

experiment. 

128 sentences were created, each in 16 versions according to the factors Voice (active 

versus passive), Status (grammatical versus ungrammatical), Time (fast versus slow) and Event 

(long versus short). Table 1 shows a sample stimulus item of Experiment 1.


Table 1. Sample Stimulus Item of Experiment 1

Intervening material for all "(adverbial)": ([…] für die letzte Prüfung im August […])
([…] for the last exam in August […])

(Active / Grammatical)
Der Dozent hofft, dass die Studentin (adverbial) das wichtige Buch gelesen hat.
the lecturer hopes that the(nom) student(fem) (adverbial) the important book read has
'The lecturer hopes that the student has read the important book (for the last exam in August).'

(Passive / Grammatical)
Der Dozent hofft, dass der Studentin (adverbial) das wichtige Buch besorgt wurde.
the lecturer hopes that the(dat) student(fem) (adverbial) the important book obtained was
'The lecturer hopes that the important book (for the last exam in August) was obtained for the student.'

(Active / Ungrammatical)
Der Dozent hofft, dass der Studentin (adverbial) das wichtige Buch gelesen hat.
the lecturer hopes that the(dat) student(fem) (adverbial) the important book read has
'The lecturer hopes that the student has read the important book (for the last exam in August).'

(Passive / Ungrammatical)
Der Dozent hofft, dass die Studentin (adverbial) das wichtige Buch besorgt wurde.
the lecturer hopes that the(nom) student(fem) (adverbial) the important book obtained was
'The lecturer hopes that the important book (for the last exam in August) was obtained for the student.'

The length of intervening material and presentation rate were manipulated 

independently. The factor Event (intervening material) was varied by adding adverbials 

of six words for the long version (cf. Table 1). The factor Time (presentation rate) was 

either fast (188 ms/word + 25 ms/character) or slow (369 ms/word + 44 ms/character).
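
The rate specification above can be read as a per-word display duration: a fixed base plus an increment per character. The following minimal sketch computes durations under that reading. The function names are illustrative, and counting characters with len() (punctuation included) is an assumption, not a detail given in the text.

```python
def word_duration_ms(word, base_ms, per_char_ms):
    """Display time of one word: fixed base plus a per-character increment."""
    return base_ms + per_char_ms * len(word)


def presentation_time_ms(words, base_ms, per_char_ms):
    """Total display time of a word sequence shown one word at a time."""
    return sum(word_duration_ms(w, base_ms, per_char_ms) for w in words)


# Material between NP1 and the verb, taken from the sample sentence in (4).
short_region = "das wichtige Buch".split()
long_region = "für die letzte Prüfung im Mai das wichtige Buch".split()

# The two rate settings quoted in the text (ms per word, ms per character).
for base_ms, per_char_ms in [(188, 25), (369, 44)]:
    for label, region in [("short", short_region), ("long", long_region)]:
        total = presentation_time_ms(region, base_ms, per_char_ms)
        print(f"{base_ms} ms + {per_char_ms} ms/char, {label}: {total} ms")
```

A helper of this kind makes it straightforward to check how closely two conditions are matched in elapsed time for a given sentence region.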

Procedure 


In both experiments the speeded grammaticality judgment method was used. In this 

procedure sentences are presented in a word-by-word fashion. Each trial begins with the 

presentation of the words "Bitte Leertaste drücken" ("Please Press Spacebar") to start 

the sentence. After pressing the spacebar, a fixation point appears in the center of the 

screen for 1050ms. Thereafter the sentence is shown word by word in the center of the 

screen. Immediately after the last word the participants are asked to judge the 

grammaticality of the sentence as fast as possible by pressing one of two response 

buttons. Type of response and response time are recorded automatically. If a subject 

does not give a response within 2000ms after the last word appeared, the words "zu 

langsam" ("too slow") are shown and the trial is finished. In both experiments each 

subject received at least 10 practice items before the experimental sessions started. 

In experiment 1, all sentences were presented in two separate blocks in two 

different paces (according to the manipulations of the factor Time in a slow and in a fast 

pace). Every participant had to complete the experiment in both paces within one

experimental session. Each block contained half of the entire set of sentences. Therefore 

each participant saw half of the sentences in the slow condition and the other half in the 

fast condition. The order of the two blocks alternated between participants. The 

sentences were presented with filler sentences. The proportion of experimental 

sentences to filler sentences was 1:1. Filler sentences covered a range of various 

constructions and were half grammatical and half ungrammatical. Most of the fillers 

served as experimental items in two other experiments. 


Results 

The percentages of correct judgments in Experiment 1 are shown in Figure 2 

(grammatical conditions) and Figure 3 (ungrammatical conditions). Statistical analyses 

were conducted with subject as the random factor (F1) and with sentences as the 

random factor (F2). The following main effects occurred: First, a significant effect of 

the factor Event is obtained (F1(1,15)=22.30, p

4 Experiment 2: The Role of Complexity in Sentence Parsing 

Experiment 2 investigated the role of sentence complexity according to Gibson’s 

Dependency Locality Theory (Gibson 2000) in the context of verb-final clauses in German.

The DLT is a resource-driven model of language processing. The model assumes two 

major kinds of resource use. First, integrating a new word w into the current structure 

causes some cost (integration cost). Second, keeping the structure in memory also 

causes a certain kind of cost (storage cost). A central idea of the DLT is locality. Gibson 

assumes that the cost of integrating a new element into the current structure depends on 

the distance between the new element and the related element already processed. The 

assumption is that the distance is defined by the number of discourse referents that are

newly introduced between the items concerned. 
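
The distance notion described above can be illustrated with a deliberately simplified sketch: the integration cost of a word is taken to be the count of new discourse referents in the intervening material. The token annotations below are hand-assigned for two relative clauses from the Experiment 2 materials, event referents introduced by verbs are ignored (as in the experimental design), and storage cost is left out entirely. The names Token and integration_cost are illustrative only.

```python
from typing import List, Tuple

# (word, does this word introduce a new discourse referent?)
Token = Tuple[str, bool]


def integration_cost(intervening: List[Token]) -> int:
    """Count the new discourse referents in the intervening material."""
    return sum(1 for _, introduces_referent in intervening if introduces_referent)


# Relative clause with an adverbial but no new NP referents.
clause_adverbial = [
    ("die", False), ("immer", False), ("wieder", False),
    ("sehr", False), ("gut", False), ("erklärt", False),
]

# Relative clause of the same length with two new NP referents.
clause_referents = [
    ("die", False), ("dem", False), ("Studenten", True),
    ("das", False), ("Skript", True), ("ausleiht", False),
]

for name, clause in [("adverbial", clause_adverbial), ("referents", clause_referents)]:
    print(name, "words:", len(clause), "cost:", integration_cost(clause))
```

Under this simplification the two clauses have the same length in words but differ in cost, which is the contrast the adverbial manipulation is designed to probe.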

If this is so, an interesting question is whether material not introducing a new 

discourse referent also affects the ease of integrating w into the CPPM. This was tested 

in experiment 2 by means of adverbial material. The crucial factors of experiment 2

therefore are: Adverbial and Discourse Referents (DR). 

Participants and Material 

16 students of the University of Konstanz participated for course credit or payment. All 

participants were native speakers of German and naive with respect to the purpose of 

the experiment. 

We created 128 sentences, each in 16 versions according to the factors Voice 

(active versus passive), Status (grammatical versus ungrammatical), Adverbial (NoAdv 

versus Adv) and Discourse Referents (0 DR versus 2 DR). 
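
As a sanity check on the design, the fully crossed factors can be enumerated programmatically. This is only an illustrative sketch: the dictionary keys and level labels are taken from the description above, while the function name enumerate_versions is hypothetical.

```python
from itertools import product

# Factors and levels as named in the text.
FACTORS = {
    "Voice": ("active", "passive"),
    "Status": ("grammatical", "ungrammatical"),
    "Adverbial": ("NoAdv", "Adv"),
    "DR": ("0 DR", "2 DR"),
}


def enumerate_versions(factors):
    """Return one dict per cell of the fully crossed design."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


versions = enumerate_versions(FACTORS)
print(len(versions), "versions per sentence")
for version in versions[:4]:
    print(version)
```

Crossing four two-level factors yields the 16 versions per sentence mentioned above.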

Table 2 shows a sample stimulus item of Experiment 2.

Table 2. Sample Stimulus Item of Experiment 2

Ich vermute, dass […]

I guess that […]

(NoAdv. / 0 DR) 

[…] meine Professorin, die sehr gut erklärt, eine freie Stelle ausgeschrieben hat. 

[…] my professor(fem) who very good explains a vacant position offered has 

‘I guess that my professor, who explains very well, has offered a vacant position.’ 

(Adv. / 0 DR) 

[…] meine Professorin, die immer wieder sehr gut erklärt, eine freie Stelle ausgeschrieben hat. 

[…] my professor(fem) who again and again very good explains a vacant position offered has 

‘I guess that my professor, who explains very well repeatedly, has offered a vacant position.’ 

(NoAdv. / 2 DR) 

[…] meine Professorin, die dem Studenten das Skript ausleiht, eine freie Stelle ausgeschrieben hat. 

[…] my professor(fem) who the student(dat) the script lends a vacant position offered has 

‘I guess that my professor, who lends the script to the student, has offered a vacant position.’ 

(Adv. / 2 DR) 

[…] meine Professorin, die dem Studenten doch noch das Skript ausleiht, eine freie Stelle 

[…] my professor(fem) who the student(dat) eventually the script lends a vacant position 

ausgeschrieben hat. 

offered has 

‘I guess that my professor, who eventually lends the script to the student, has offered a vacant position.’ 


The complexity of relative clauses was manipulated in a two-factorial way. First, 

the relative clause contains either 0 or 2 new NP-related discourse referents. The event 

referent introduced by the verb is ignored as it is introduced in all four relative clause 

types. Second, the relative clause does or does not contain an additional adverbial of 

two words. Both factors were crossed. The resulting conditions are shown below in 

Figure 4. Relative-clause complexity increases from (a) to (d). Furthermore, (b) and (c) 

are matched according to the number of words they contain, but they differ in their 

internal structure. As one can see below, (b) contains additional adverbials of two words 

(“immer wieder”), but does not include any newly introduced discourse referents. 

Sentence type (c), on the other hand, only introduces two new discourse referents 

(“Studenten” and “Skript”). 

Procedure 

Experiment 2 used the same procedure as experiment 1, the speeded grammaticality judgment task.

The presentation time was not manipulated, and the experiment was conducted in one block.

A presentation rate of 252 ms per word plus an additional 28 ms per letter was used.

(a) NP1 die sehr gut erklärt NP2 V
(b) NP1 die immer wieder sehr gut erklärt NP2 V
(c) NP1 die dem Studenten das neue Skript ausleiht NP2 V
(d) NP1 die dem Studenten doch noch das neue Skript ausleiht NP2 V

Figure 4: Length of Relative Clauses (According to the Number of Words)

Results

The percentages of correct judgments in Experiment 2 are provided in Figure 5 (for 

grammatical conditions) and Figure 6 (for ungrammatical conditions). Statistical 

analyses revealed main effects for the factors Status (F1(1,15)= 26.57, p < .001; 

F2(1,15)= 213.43, p

Figure 5. Percentages of correct judgments for Grammatical Sentences
(y-axis: Percentage Correct (%); legend: Active, Passive; bar values as extracted)
noNP / noAdv: 92, 89
2NPs / noAdv: 86, 83
noNP / Adverbial: 91, 87
2NPs / Adverbial: 79, 78

Figure 6. Percentages of correct judgments for Ungrammatical Sentences
(y-axis: Percentage Correct (%); legend: Active, Passive; bar values as extracted)
noNP / noAdv: 61, 60
2NPs / noAdv: 56, 51
noNP / Adv: 68, 58
2NPs / Adv: 56, 51

5 General Discussion

In Experiment 1, the factors Time and Event were disentangled to investigate the nature of 

distance in sentence processing. The experiment had a clear-cut outcome for both factors. 

First, the factor Event clearly affects sentence processing. This can be seen especially in

ungrammatical passive sentences. In that condition, the percentage of correct judgments

is about 14% lower for long than for short sentences. As earlier

experimental work has shown, ungrammatical passive sentences are always judged less 

reliably (cf. Bader & Bayer 2006). More material to process increases processing difficulty 

immensely, which results in a higher error rate of long sentences compared to short 

sentences. Second, the factor Time does not seem to affect sentence processing as predicted 

by time-based models. For short sentences, the slow presentation rate resulted in better 

performance than the fast presentation rate. This goes against the predictions. Long rather 

than short time intervals should affect sentence processing adversely (note that the fast 

presentation rate was not too fast, as can be seen in high percentages of correct judgments 

with up to 92%). For long sentences the presentation rate had no effect at all. The results 

suggest that time-based decay does not contribute to the difficulty of integrating a new word 

into the CPPM. 

Experiment 2 has two major results. First, confirming prior results, the number of new 

discourse referents had a major effect. Sentences containing two new discourse referents in 

the relative clause received significantly more judgment errors. Second, an intervening 

adverbial had no effect at all. This can be seen clearly in the sentences which were equal in

length according to the number of words they contain, but which were manipulated with 

different material. Sentences that contained new discourse referents but no additional 

adverbial received substantially more judgment errors than sentences containing the same 

number of words, but only containing additional adverbials. The results suggest that the

pure linear distance between w and information necessary to integrate w cannot be the 

source of the observed difficulty. In particular, finding no differences between (a) versus (b) 

and (c) versus (d), but a substantial difference between (b) and (c) (cf. Figure 4) argues 

against theories assuming that time or pure length - not introducing a new discourse referent 

- leads to forgetting in working memory. The results therefore support the Dependency 

Locality Theory of Gibson (2000). 


References 


M. Bader & J. Bayer (2006). Case and Linking in Language Comprehension. Evidence 

from German, Springer, Dordrecht. 

E. Gibson (2000). ‘The dependency locality theory: A distance-based theory of 

linguistic complexity’. In A. Marantz et al. (eds.), Image, Language, Brain. MIT Press.

S. Hopp & M. Bader (in prep.). ‘Forgetting in Working Memory: Interference versus 

Decay? Evidence from German Sentence Processing’. 

M. A. Just & P. A. Carpenter (1992). ‘A Capacity Theory of Comprehension: Individual 

Differences in Working Memory’. Psychological Review, vol. 99, no.1. 

R. L. Lewis & S. Vasishth. (2005). ‘An activation-based model of sentence processing 

as skilled memory retrieval’. Cognitive Science 29. 

R. Levy et al. (2007). ‘The syntactic complexity of Russian relative clauses’. Paper 

presented at the Annual Conference on Human Sentence Processing – CUNY 2007, 

San Diego, CA. 

S. Lewandowsky et al. (2004). ‘Time does not cause forgetting in short-term serial 

recall’. Psychonomic Bulletin & Review 11. 

J. S. Nairne (1990). ‘A feature model of immediate memory’. Memory & Cognition 18.

S. Saito & A. Miyake (2004). ‘On the nature of forgetting and the processing-storage

relationship in reading span performance’. Journal of Memory and Language 50.


A SALIENCE-DRIVEN APPROACH TO 

SPEECH RECOGNITION FOR HUMAN-ROBOT INTERACTION 

Pierre Lison 

German Research Center for Artificial Intelligence 

Abstract. We present an implemented model for speech recognition in natural environments 

which relies on contextual information about salient entities to prime utterance recognition. 

The hypothesis underlying our approach is that, in situated human-robot interaction, speech 

recognition performance can be significantly enhanced by exploiting knowledge about the 

immediate physical environment and the dialogue history. To this end, visual salience (objects 

perceived in the physical scene) and linguistic salience (previously referred-to objects 

within the current dialogue) are integrated into a single cross-modal salience model. The 

model is dynamically updated as the environment evolves, and is used to establish expectations 

about uttered words which are most likely to be heard given the context. The update is 

realised by continuously adapting the word-class probabilities specified in the statistical language

model. The present article discusses the motivations behind our approach, describes 

our implementation as part of a distributed, cognitive architecture for mobile robots, and 

reports the evaluation results on a test suite. 

1 Introduction



Recent years have seen increasing interest in service robots endowed with communicative 

capabilities. In many cases, these robots must operate in open-ended environments 

and interact with humans using natural language to perform a variety of service-oriented 

tasks. Developing cognitive systems for such robots remains a formidable challenge. 

Software architectures for cognitive robots are typically composed of several cooperating 

subsystems, such as communication, computer vision, navigation and manipulation 

skills, and various deliberative processes such as symbolic planners (Langley, Laird and 

Rogers, 2005). 

These subsystems are highly interdependent. It is not enough to equip the robot with 

basic functionalities for dialogue comprehension and production to make it interact naturally 

in situated dialogues. We also need to find meaningful ways to relate language, 

action and situated reality, and enable the robot to use its perceptual experience to continuously 

learn and adapt itself to the environment. 

The first step in comprehending spoken dialogue is automatic speech recognition [ASR]. 

For robots operating in real-world noisy environments, and dealing with utterances pertaining 

to complex, open-ended domains, this step is particularly error-prone. In spite of 

continuous technological advances, the performance of ASR remains for most tasks at 

least an order of magnitude worse than that of human listeners (Moore, 2007). 

One strategy for addressing this issue is to use context information to guide the speech 

recognition by percolating contextual constraints to the statistical language model (Gruenstein, 

Wang and Seneff, 2005). In this paper, we follow this approach by defining a context-sensitive

language model which exploits information about salient objects in the visual 

scene and linguistic expressions in the dialogue history to prime recognition. To this end, 


a salience model integrating both visual and linguistic salience is used to dynamically 

compute lexical activations, which are incorporated into the language model at runtime. 

Our approach departs from previous work on context-sensitive speech recognition by 

modeling salience as inherently cross-modal, instead of relying on just one particular 

modality such as gesture (Chai and Qu, 2005), eye gaze (Qu and Chai, 2007) or dialogue 

state (Gruenstein et al., 2005). The FUSE system described in (Roy and Mukherjee, 2005) 

is a closely related approach, but limited to the processing of object descriptions, whereas 

our system was designed from the start to handle generic situated dialogues (cf. §3.3). 

The structure of the paper is as follows: in the next section we briefly introduce the 

software architecture in which our system has been developed. We then describe the 

salience model, and explain how it is utilised within the language model used for ASR. 

We finally present the evaluation of our approach, followed by conclusions. 

Figure 1: Robotic platform (left) and example of a real visual scene (right) 

2 Architecture 


Our approach has been implemented as part of a distributed cognitive architecture (Hawes, 

Sloman, Wyatt, Zillich, Jacobsson, Kruijff, Brenner, Berginc and Skocaj, n.d.). Each subsystem 

consists of a number of processes, and a working memory. The processes can 

access sensors, effectors, and the working memory to share information within the subsystem. 

Figure 2 illustrates the spoken dialogue comprehension. Numbers 1-11 in the 

figure indicate the usual sequential order for the processes.

The speech recognition utilises Nuance Recognizer v8.5 together with a statistical language 

model (§ 3.4). For the online update of word class probabilities according to the 

salience model, we use the “just-in-time grammar” functionality provided by Nuance. 

Syntactic parsing is based on an incremental chart parser 1 for Combinatory Categorial 

Grammar (Steedman and Baldridge, 2003), and yields a set of interpretations – that is, 

1 Built on top of the OpenCCG NLP library: http://openccg.sf.net 



Figure 2: Schematic view of the architecture for spoken dialogue comprehension 


logical forms expressed as ontologically rich, relational structures (Baldridge and Kruijff, 

2001). Figure 3 gives an example of such logical form. 

These interpretations are then packed into a single representation (Oepen and Carroll, 

2000; Kruijff, Lison, Benjamin, Jacobsson and Hawes, in submission), a technique which 

enables us to efficiently handle syntactic ambiguity. 

Once the packed logical form is built, it is retrieved by the dialogue recognition module, 

which performs dialogue-level analysis tasks such as discourse reference resolution 

and dialogue move interpretation, and consequently updates the dialogue structure. 

@w1:cognition(want ∧ 

ind ∧ 

pres ∧ 

(i1 : person ∧ I ∧ 

sg ∧ 

(t1 : action-motion ∧ take ∧ 

y1 : person ∧ 

(m1 : thing ∧ mug ∧ 

unique ∧ 

sg ∧ 

specific singular)) ∧ 

(y1 : person ∧ you ∧ 

sg)) 

Figure 3: Logical form generated for the utterance ‘I want you to take the mug’ 

Linguistic interpretations must finally be associated with extra-linguistic knowledge 

about the environment – dialogue comprehension hence needs to connect with other subarchitectures 

like vision, spatial reasoning or planning. We realise this information binding 

between different modalities via a specific module, called the “binder”, which is responsible 

for the ontology-based mediation across modalities (Jacobsson, Hawes, Kruijff

and Wyatt, 2008). 

3 Approach 

3.1 Motivation 

As psycholinguistic studies have shown, humans do not process linguistic utterances in 

isolation from other modalities. Eye-tracking experiments notably highlighted that, during 

utterance comprehension, humans combine, in a closely time-locked fashion, linguistic 

information with scene understanding and world knowledge (Altmann and Kamide, 

2004; Knoeferle and Crocker, 2006). 

These observations – along with many others – therefore provide solid evidence for the 

embodied and situated nature of language and cognition (Lakoff, 1987; Barsalou, 1999). 

Humans thus systematically exploit dialogue and situated context to guide attention 

and help disambiguate and refine linguistic input by filtering out unlikely interpretations. 

Our approach is essentially an attempt to reproduce this mechanism in a robotic system. 

3.2 Salience modeling 


In our implementation, we define salience using two main sources of information: 

1. the salience of objects in the perceived visual scene; 


2. the linguistic salience or “recency” of linguistic expressions in the dialogue history. 

In the future, other sources could be added, for instance the possible presence of gestures 

(Chai and Qu, 2005), eye gaze tracking (Qu and Chai, 2007), entities in large-scale 

space (Zender and Kruijff, 2007), or the integration of a task model – as salience generally 

depends on intentionality (Landragin, 2006). 

3.2.1 Visual salience 

Via the “binder”, we can access the set of objects currently perceived in the visual scene. 

Each object is associated with a concept name (e.g. printer) and a number of features, 

for instance spatial coordinates or qualitative properties like colour, shape or size.

Several features can be used to compute the salience of an object. The ones currently 

used in our implementation are (1) the object size and (2) its distance relative to the robot 

(i.e. spatial proximity). Other features could also prove to be helpful, like the reachability

of the object, or its distance from the point of visual focus – similarly to the spread of 

visual acuity across the human retina. To derive the visual salience value for each object, 

we assign a numeric value for the two variables, and then perform a weighted addition. 

The associated weights are determined via regression tests. 

At the end of the processing, we end up with a set Ev of visual objects, each of which 

is associated with a numeric salience value s(ek), with 1 ≤ k ≤ |Ev|. 
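
The weighted addition described above can be sketched as follows. This is a minimal illustration, not the implemented system: the weights are placeholders (the text states that the actual weights come from regression tests), and mapping distance to a proximity score via 1 / (1 + distance) is an assumption made here so that nearer objects score higher.

```python
def visual_salience(size, distance, w_size=0.5, w_proximity=0.5):
    """Weighted sum of object size and proximity to the robot.

    size and distance are assumed to be non-negative numbers on
    comparable scales; the default weights are placeholders.
    """
    proximity = 1.0 / (1.0 + distance)  # assumed mapping: closer -> larger
    return w_size * size + w_proximity * proximity


# Hypothetical scene: concept name -> (size, distance to robot).
scene = {"printer": (0.8, 2.0), "mug": (0.1, 0.5), "ball": (0.2, 4.0)}

salience_v = {name: visual_salience(size, dist) for name, (size, dist) in scene.items()}
for name, value in sorted(salience_v.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.3f}")
```

The resulting dictionary plays the role of the set Ev with its associated salience values.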

3.2.2 Linguistic salience 


There is a vast amount of literature on the topic of linguistic salience. Roughly speaking, 

linguistic salience can be characterised either in terms of hierarchical recency, according 

to a tree-like model of discourse structure, or in terms of linear recency of mention 

(Kelleher, 2005). Our implementation can theoretically handle both types of linguistic

salience, but, at the time of writing, only the linear recency is calculated. 

To compute the linguistic salience, we extract a set El of potential referents from the 

discourse structure, and for each referent ek we assign a salience value s(ek) equal to 

the distance (measured on a logarithmic scale) between its last mention and the current 

position in the discourse structure. 
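
A minimal sketch of the linear-recency computation is given below. It only computes the log-scaled distance between the last mention and the current position; how that value is oriented so that recently mentioned referents count as more salient is not spelled out in the text and is left open here. Positions are assumed to be integer utterance indices, and the offset of 1 inside the logarithm is an assumption that keeps a distance of zero well defined.

```python
import math


def log_distance(last_mention, current_position):
    """Log-scaled distance between a referent's last mention and now."""
    if last_mention > current_position:
        raise ValueError("last mention cannot lie after the current position")
    return math.log(1 + (current_position - last_mention))


# Hypothetical discourse: referent -> index of the utterance mentioning it last.
last_mentions = {"mug": 7, "box": 4, "table": 0}
current = 8

distances = {ref: log_distance(pos, current) for ref, pos in last_mentions.items()}
for ref, value in distances.items():
    print(f"{ref}: {value:.3f}")
```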

3.2.3 Cross-modal salience model 

Once the visual and linguistic salience are computed, we can proceed to their integration 

into a cross-modal statistical model. We define the set E as the union of the visual and 

linguistic entities: E = Ev ∪ El, and devise a probability distribution P (E) on this set: 

\[ P(e_k) = \frac{\delta_v \, I_{E_v}(e_k) \, s_v(e_k) + \delta_l \, I_{E_l}(e_k) \, s_l(e_k)}{|E|} \tag{1} \]

where $I_A(x)$ is the indicator function of set $A$, and $\delta_v$, $\delta_l$ are factors controlling the

relative importance of each type of salience. They are determined empirically, subject to

the following constraint to normalise the distribution:

\[ \delta_v \sum_{e_k \in E_v} s_v(e_k) + \delta_l \sum_{e_k \in E_l} s_l(e_k) = |E| \tag{2} \]

The statistical model P (E) thus simply reflects the salience of each visual or linguistic 

entity: the more salient, the higher the probability. 
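
The following sketch shows one way the distribution in (1) could be computed so that the constraint in (2) holds. It is an illustration under stated assumptions: the factors are determined empirically in the paper, whereas here a single hypothetical parameter ratio (the value of δv relative to δl) is fixed by hand and both factors are then scaled to satisfy (2). Salience values are assumed to be positive.

```python
def cross_modal_distribution(visual, linguistic, ratio=1.0):
    """Combine visual and linguistic salience into P(e_k) as in (1).

    visual, linguistic: dicts mapping entity -> salience value.
    ratio: assumed value of delta_v / delta_l (hypothetical knob).
    """
    entities = set(visual) | set(linguistic)
    n = len(entities)
    sum_v = sum(visual.values())
    sum_l = sum(linguistic.values())
    # Solve constraint (2) with delta_v = ratio * delta_l.
    delta_l = n / (ratio * sum_v + sum_l)
    delta_v = ratio * delta_l
    # dict.get(e, 0.0) plays the role of the indicator functions.
    return {
        e: (delta_v * visual.get(e, 0.0) + delta_l * linguistic.get(e, 0.0)) / n
        for e in entities
    }


visual = {"mug": 2.0, "ball": 1.0}
linguistic = {"mug": 1.0, "box": 1.0}
distribution = cross_modal_distribution(visual, linguistic)
for entity, p in sorted(distribution.items()):
    print(f"P({entity}) = {p:.3f}")
```

An entity present in both modalities, such as the mug in this toy input, receives contributions from both terms of (1).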


3.3 Lexical activation 

In order for the salience model to be of any use for speech recognition, a connection 

between the salient entities and their associated words in the ASR vocabulary needs to 

be established. To this end, we define a lexical activation network, which lists, for each 

possible salient entity, the set of words activated by it. The network specifies the words 

which are likely to be heard when the given entity is present in the environment or in 

the dialogue history. It can therefore include words related to the object denomination, 

subparts, common properties or affordances. The salient entity laptop will activate words 

like ‘laptop’, ‘notebook’, ‘screen’, ‘opened’, ‘ibm’, ‘switch on/off’, ‘close’, etc. The list 

is structured according to word classes, and a weight can be set on each word to modulate 

the lexical activation: supposing a laptop is present, the word ‘laptop’ should receive a 

higher activation than, say, the word ‘close’, which is less situation specific. 

The use of lexical activation networks is a key difference between our model and (Roy 

and Mukherjee, 2005), which relies on a measure of “descriptive fitness” to modify the 

word probabilities. One advantage of our approach is the possibility to go beyond object 

descriptions and activate word types denoting subparts, properties or affordances of 

objects².

If the probability of specific words is increased, we need to re-normalise the probability 

distribution. One solution would be to decrease the probability of all non-activated words 

accordingly. This solution, however, suffers from a significant drawback: our vocabulary 

contains many context-independent words like ‘thing’, or ‘place’, whose probability 

should remain constant. To address this issue, we mark an explicit distinction in our 

vocabulary between context-dependent and context-independent words. 

In the current implementation, the lexical activation network is constructed semi-manually, 

using a simple lexicon extraction algorithm. We start with the list of possible 

salient entities, which is given by 

1. the set of physical objects the vision subsystem can recognise ; 

2. the set of nouns specified in the CCG lexicon with ‘object’ as ontological type. 

For each entity, we then extract its associated lexicon by matching domain-specific syntactic 

patterns against a corpus of dialogue transcripts. 

3.4 Language modeling 

We now detail the language model used for the speech recognition – a class-based trigram 

model enriched with contextual information provided by the salience model. 

3.4.1 Corpus generation 


We need a corpus to train any statistical language model. Unfortunately, no corpus of 

situated dialogue adapted to our task domain was available. Collecting in-domain data via 

Wizard of Oz experiments is a very costly and time-consuming process, so we decided 

to follow the approach advocated in (Weilhammer, Stuttle and Young, 2006) instead and 

generate a class-based corpus from a task grammar we had at our disposal. 

Practically, we first collected a small set of WOz experiments, totalling about 800 utterances. This set is of course too small to be directly used as a corpus for language model training, but sufficient to get an intuitive idea of the kind of utterances we had to deal with.

² In the context of a laptop object, ‘screen’ and ‘switch on/off’ would for instance be activated.

Based on it, we designed a domain-specific context-free grammar able to cover most 

of the utterances. Weights were then automatically assigned to each grammar rule by 

parsing our initial corpus, hence leading to a small stochastic context-free grammar. 

As a last step, this grammar is randomly traversed a large number of times, which gives 

us the generated corpus. 
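
The last step, random traversal of a stochastic context-free grammar, can be sketched as follows. The toy grammar, its weights and the function names are hypothetical and only serve as an illustration; the actual task grammar is much larger, and the sketch emits plain words rather than class labels.

```python
import random

# Hypothetical toy grammar: nonterminal -> list of (weight, right-hand side).
# Symbols that do not appear as keys are treated as terminal words.
GRAMMAR = {
    "S": [(0.6, ["COMMAND"]), (0.4, ["QUESTION"])],
    "COMMAND": [(1.0, ["take", "the", "OBJECT"])],
    "QUESTION": [(1.0, ["where", "is", "the", "OBJECT"])],
    "OBJECT": [(0.5, ["ball"]), (0.3, ["box"]), (0.2, ["laptop"])],
}


def generate(symbol, rng):
    """Expand one symbol by sampling rules according to their weights."""
    if symbol not in GRAMMAR:
        return [symbol]
    rules = GRAMMAR[symbol]
    weights = [weight for weight, _ in rules]
    _, rhs = rng.choices(rules, weights=weights, k=1)[0]
    words = []
    for child in rhs:
        words.extend(generate(child, rng))
    return words


def generate_corpus(n_sentences, seed=0):
    """Traverse the grammar n_sentences times from the start symbol S."""
    rng = random.Random(seed)
    return [" ".join(generate("S", rng)) for _ in range(n_sentences)]


corpus = generate_corpus(1000)
```

A fixed seed keeps the generated corpus reproducible, which is convenient when comparing language models trained on it.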

3.4.2 Salience-driven, class-based language models 

The objective of the speech recognizer is to find the word sequence W ∗ which has the 

highest probability given the observed speech signal O and a set E of salient objects: 

\[ W^* = \arg\max_W \; \underbrace{P(O \mid W)}_{\text{acoustic model}} \times \underbrace{P(W \mid E)}_{\text{salience-driven language model}} \tag{3} \]

For a trigram language model, the probability of the word sequence P(w_1^n | E) is:

\[ P(w_1^n \mid E) \simeq \prod_{i=1}^{n} P(w_i \mid w_{i-1} w_{i-2}; E) \tag{4} \]

Our language model is class-based, so it can be further decomposed into word-class 

and class transitions probabilities. The class transition probabilities reflect the language 

syntax; we assume they are independent of salient objects. The word-class probabilities, 

however, do depend on context: for a given class – e.g. noun -, the probability of hearing 

the word ‘laptop’ will be higher if a laptop is present in the environment. Hence: 

\[ P(w_i \mid w_{i-1} w_{i-2}; E) = \underbrace{P(w_i \mid c_i; E)}_{\text{word-class probability}} \times \underbrace{P(c_i \mid c_{i-1}, c_{i-2})}_{\text{class transition probability}} \tag{5} \]

We now define the word-class probabilities P(w_i | c_i; E):

\[ P(w_i \mid c_i; E) = \sum_{e_k \in E} P(w_i \mid c_i; e_k) \times P(e_k) \tag{6} \]

To compute P(w_i | c_i; e_k), we use the lexical activation network specified for e_k:

\[ P(w_i \mid c_i; e_k) = \begin{cases} P(w_i \mid c_i) + \alpha_1 & \text{if } w_i \in \text{activatedWords}(e_k) \\ P(w_i \mid c_i) - \alpha_2 & \text{if } w_i \notin \text{activatedWords}(e_k) \wedge w_i \in \text{contextDependentWords} \\ P(w_i \mid c_i) & \text{else} \end{cases} \tag{7} \]

The optimum value of α1 is determined using regression tests. α2 is computed relative to α1 in order to keep the sum of all probabilities equal to 1:

\[ \alpha_2 = \frac{|\text{activatedWords}|}{|\text{contextDependentWords}| - |\text{activatedWords}|} \times \alpha_1 \]

These word-class probabilities are dynamically updated as the environment and the dialogue evolve, and are incorporated into the language model at runtime.
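
Equations (6) and (7) can be illustrated with a short Python sketch. The names below are hypothetical, and the sketch assumes that all words belong to a single class, that every activated word is also context-dependent, and that α1 is small enough to keep all probabilities within [0, 1]; it is not the implementation used in the system.

```python
def word_class_prob(word, base, activated, context_dependent, alpha1):
    """Equation (7): P(w | c; e_k) for one salient entity e_k.

    base: dict word -> P(w | c) within a single class c.
    activated: words activated by e_k (assumed subset of context_dependent).
    """
    n_activated = len(activated)
    n_remaining = len(context_dependent) - n_activated
    if n_remaining <= 0:
        raise ValueError("need at least one non-activated context-dependent word")
    # alpha2 redistributes the added probability mass so the class still sums to 1.
    alpha2 = n_activated / n_remaining * alpha1
    if word in activated:
        return base[word] + alpha1
    if word in context_dependent:
        return base[word] - alpha2
    return base[word]  # context-independent words are left unchanged


def salience_driven_prob(word, base, network, context_dependent, salience, alpha1):
    """Equation (6): mix the per-entity probabilities with weights P(e_k)."""
    return sum(
        word_class_prob(word, base, network[entity], context_dependent, alpha1) * p
        for entity, p in salience.items()
    )


# Hypothetical noun class with a uniform baseline distribution.
base = {"laptop": 0.2, "screen": 0.2, "ball": 0.2, "box": 0.2, "thing": 0.2}
context_dependent = {"laptop", "screen", "ball", "box"}
network = {"laptop_entity": {"laptop", "screen"}, "ball_entity": {"ball"}}
salience = {"laptop_entity": 0.75, "ball_entity": 0.25}
adapted = {
    w: salience_driven_prob(w, base, network, context_dependent, salience, 0.05)
    for w in base
}
```

In this toy setting the context-independent word ‘thing’ keeps its baseline probability, while ‘laptop’ gains probability mass at the expense of the non-activated context-dependent words.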

4 Evaluation 

4.1 Evaluation procedure 

We evaluated our approach using a test suite of 250 spoken utterances recorded during 

Wizard of Oz experiments. The participants were asked to interact with the robot while 

looking at a specific visual scene. We designed 10 different visual scenes by systematic 

variation of the nature, number and spatial configuration of the objects presented. Figure 

4 gives an example of a visual scene. 

The interactions could include descriptions, questions and commands. No particular 

tasks were assigned to the participants. The only constraint we imposed was that all 

interactions with the robot had to be related to the shared visual scene. 

Figure 4: Sample visual scene including three objects: a box, a ball, and a chocolate bar. 

4.2 Results 

Table 1 summarises our experimental results. Due to space constraints, we focus our 

analysis on the WER of our model compared to the baseline – that is, compared to a 

class-based trigram model not based on salience. 

Word Error Rate [WER]            Classical LM                  Salience-driven LM

vocabulary size ≃ 200 words      25.04 % (NBest 3: 20.72 %)    24.22 % (NBest 3: 19.97 %)

Table 1: Comparative results of recognition performance

4.3 Analysis


As the results show, the use of a salience model can enhance the recognition performance 

in situated interactions: with a vocabulary of about 600 words, the WER is indeed reduced 

by 16.1 % compared to the baseline. According to the Sign test, the differences for the 

last two tests (400 and 600 words) are statistically significant. As we could expect, the 

salience-driven approach is especially helpful when operating with a larger vocabulary, 


where the expectations provided by the salience model can really make a difference in the 

word recognition. 

The word error rate remains nevertheless quite high. This is due to several reasons. 

The major issue is that the words causing most recognition problems are – at least in 

our test suite – function words like prepositions, discourse markers, connectives, auxiliaries, 

etc., and not content words. Unfortunately, the use of function words is usually not 

context-dependent, and hence not influenced by salience. We estimated that 89 % of the 

recognition errors were due to function words. Moreover, our chosen test suite is constituted 

of “free speech” interactions, which often include lexical items or grammatical 

constructs outside the range of our language model. 

5 Conclusion 

We have presented an implemented model for speech recognition based on the concept of 

salience. This salience is defined via visual and linguistic cues, and is used to compute 

degrees of lexical activations, which are in turn applied to dynamically adapt the ASR 

language model to the robot’s environment and dialogue state. 

As future work we will examine the potential extension of our approach in three directions. 

First, we are investigating how to use the situated context to perform some priming 

of function words like prepositions or discourse markers. Second, we wish to take other 

information sources into account, particularly the integration of a task model, relying on 

data made available by the symbolic planner. And finally, we want to go beyond speech 

recognition, and investigate the relevance of such a salience model for the development of 

a robust understanding system for situated dialogue. 


My thanks go to G.-J. Kruijff, H. Zender, M. Wilson and N. Yampolska for their insightful comments. 

The research reported in this article was supported by the EU FP6 IST Cognitive Systems 

Integrated project Cognitive Systems for Cognitive Assistants “CoSy” FP6-004250-IP. 

References 


Altmann, G. T. and Kamide, Y. (2004). Now you see it, now you don’t: Mediating 

the mapping between language and the visual world, Psychology Press, New York, 

pp. 347–386. 

Baldridge, J. and Kruijff, G.-J. M. (2001). Coupling ccg and hybrid logic dependency 

semantics, ACL ’02: Proceedings of the 40th Annual Meeting on Association for 

Computational Linguistics, ACL, Morristown, NJ, USA, pp. 319–326. 

Barsalou, L. W. (1999). Perceptual symbol systems., Behavioral & Brain Sciences 22(4). 

Chai, J. Y. and Qu, S. (2005). A salience driven approach to robust input interpretation in 

multimodal conversational systems, Proceedings of Human Language Technology 

Conference and Conference on Empirical Methods in Natural Language Processing 

2005, Association for Computational Linguistics, Vancouver, Canada, pp. 217–224. 

Gruenstein, A., Wang, C. and Seneff, S. (2005). Context-sensitive statistical language 

modeling, Proceedings of INTERSPEECH 2005, pp. 17–20. 



Hawes, N., Sloman, A., Wyatt, J., Zillich, M., Jacobsson, H., Kruijff, G.-J. M., Brenner, 

M., Berginc, G. and Skocaj, D. (n.d.). Towards an integrated robot with multiple 

cognitive functions., AAAI, AAAI Press, pp. 1548–1553. 

Jacobsson, H., Hawes, N., Kruijff, G.-J. and Wyatt, J. (2008). Crossmodal content binding 

in information-processing architectures, Proceedings of the 3rd ACM/IEEE International 

Conference on Human-Robot Interaction (HRI), Amsterdam, The Netherlands. 

Kelleher, J. (2005). Integrating visual and linguistic salience for reference resolution, in 

N. Creaney (ed.), Proceedings of the 16th Irish conference on Artificial Intelligence 

and Cognitive Science (AICS-05), Portstewart, Northern Ireland. 

Knoeferle, P. and Crocker, M. (2006). The coordinated interplay of scene, utterance, and 

world knowledge: evidence from eye tracking, Cognitive Science 30(3): 481–529. 

Kruijff, G.-J. M., Lison, P., Benjamin, T., Jacobsson, H. and Hawes, N. (in submission). 

Incremental, multi-level processing for comprehending situated dialogue in human-robot 

interaction, Connection Science . 

Lakoff, G. (1987). Women, fire and dangerous things: what categories reveal about the 

mind, University of Chicago Press, Chicago. 

Landragin, F. (2006). Visual perception, language and gesture: A model for their understanding 

in multimodal dialogue systems, Signal Processing 86(12): 3578–3595. 

Langley, P., Laird, J. E. and Rogers, S. (2005). Cognitive architectures: Research issues 

and challenges, Technical report, Institute for the Study of Learning and Expertise, 

Palo Alto. 

Moore, R. K. (2007). Spoken language processing: piecing together the puzzle, Speech 

Communication: Special Issue on Bridging the Gap Between Human and Automatic 

Speech Processing 49: 418–435. 

Oepen, S. and Carroll, J. (2000). Ambiguity packing in constraint-based parsing - practical 

results, Proceedings of the 1st Conference of the North America Chapter of the 

Association of Computational Linguistics, Seattle, WA, pp. 162–169. 

Qu, S. and Chai, J. (2007). An exploration of eye gaze in spoken language processing for 

multimodal conversational interfaces, Proceedings of the Conference of the North 

America Chapter of the Association of Computational Linguistics, pp. 284–291. 

Roy, D. and Mukherjee, N. (2005). Towards situated speech understanding: visual context 

priming of language models, Computer Speech & Language (2): 227–248. 

Steedman, M. and Baldridge, J. (2003). Combinatory categorial grammar. MS Draft 4. 

Weilhammer, K., Stuttle, M. N. and Young, S. (2006). Bootstrapping language models 

for dialogue systems, Proceedings of INTERSPEECH 2006, Pittsburgh, PA. 

Zender, H. and Kruijff, G.-J. M. (2007). Towards generating referring expressions in 

a mobile robot scenario, Language and Robots: Proceedings of the Symposium, 

Aveiro, Portugal, pp. 101–106. 


A LOGIC WITH A CONDITIONAL PROBABILITY OPERATOR 

Petar Maksimović, Dragan Doder, Bojan Marinković and Aleksandar Perović 

Mathematical Institute of the Serbian Academy of Sciences and Arts, Belgrade, Serbia 

Abstract. This paper presents a sound and strongly complete axiomatization of the reasoning 

about linear combinations of conditional probabilities, including comparative statements. 

The developed logic is decidable, with a PSPACE containment for the decision procedure. 

1 Introduction



The present paper constitutes an effort to proceed along the lines of the research presented 

in (Fagin, Halpern and Megiddo, 1990; Lukasiewicz, 2002; Ognjanović and Rašković, 1996; Ognjanović and Rašković, 1999; Ognjanović and Rašković, 2000; Ognjanović, Marković and Rašković, 2005; Ognjanović, Perović and Rašković, 2008; Rašković, Ognjanović and Marković, 2004), on the formal development of probabilistic logics, where 

probability statements are expressed by probabilistic operators expressing bounds on the 

probability of a propositional formula. 

The main technical novelty of this paper is a sound and strongly complete axiomatization of the reasoning about linear combinations of conditional probabilities, which also allows for qualitative statements. For instance, we formally write the statement “the conditional probability of α given β is at least the sum of the conditional probability of α given γ and twice the conditional probability of γ given α” as

CP(α, β) ≥ CP(α, γ) + 2 · CP(γ, α).

It should be noted that all of the probabilities we use are Kolmogorov-style. We also prove 

that the developed logic is decidable. 

As is well known, the conditional probability of α given β has meaning only if P(β) > 0, and is, by definition, calculated by

\[ P(\alpha \mid \beta) = \frac{P(\alpha \wedge \beta)}{P(\beta)}. \]

To avoid technical difficulties, we will adopt the convention that 0⁻¹ = 1. Namely, it is more convenient to assume that ⁻¹ is a total operation, with this being considered usual practice in quantifier elimination for the theory of real closed fields. In this way, we make sure that conditional events are always defined.

The rest of the paper is organized as follows. In Section 2 the syntax of the logic is 

given and the class of measurable probabilistic models is described. Section 3 contains 

the corresponding axiomatization and introduces the notion of deduction. A proof of the 

completeness theorem is presented in Section 4, whereas the decidability of the logic is 

analyzed in Section 5. Concluding remarks are in Section 6. 


2 Syntax and semantics 

Let Var = {p_n | n < ω} be the set of propositional variables. The corresponding set of all propositional formulas over Var will be denoted by For_C, where C stands for classical, and is defined in the usual way. Propositional formulas will be denoted by α, β and γ, possibly with indices.

Definition 1 The set Term of all probabilistic terms is recursively defined as follows:

• Term(0) = {s | s ∈ Q} ∪ {CP(α, β) | α, β ∈ For_C}.

• Term(n + 1) = Term(n) ∪ {(f + g), (s · g), (−f) | f, g ∈ Term(n), s ∈ Q}.

• Term = ⋃_{n=0}^{∞} Term(n). □

Probabilistic terms will be denoted by f, g and h, possibly with indices. To simplify notation, we introduce the following convention: f + g is (f + g), f + g + h is ((f + g) + h). For n > 3, Σ_{i=1}^{n} f_i is ((···((f_1 + f_2) + f_3) + ···) + f_n). Similarly, −f is (−f) and f − g is (f + (−g)).

If α and β are propositional formulas, then the probabilistic term CP (α, β) reads “the 

conditional probability of α given β”. To simplify notation, we will write P (α) instead 

of CP (α, ⊤), where ⊤ is an arbitrary tautology instance. 

Definition 2 A basic probabilistic formula is any formula of the form f ≥ 0. Furthermore, we define the following abbreviations:

• f ≤ 0 is −f ≥ 0;

• f > 0 is ¬(f ≤ 0);

• f < 0 is ¬(f ≥ 0);

• f = 0 is f ≤ 0 ∧ f ≥ 0;

• f ≠ 0 is ¬(f = 0);

• f ≥ g is f − g ≥ 0.

We define f ≤ g, f > g, f < g, f = g and f ≠ g in a similar way. □

We define the notion of a probabilistic formula as a Boolean combination of basic 

probabilistic formulas. As in the propositional case, ¬ and ∧ are the primitive connectives, 

while all of the other connectives are introduced in the usual way. Probabilistic formulas 

will be denoted by φ, ψ and θ, possibly with indices. The set of all probabilistic formulas 

will be denoted by For_P.

By “formula” we mean either a classical formula or a probabilistic formula. We do not allow for the mixing of those types of formulas, nor for the nesting of the probability operator P. Formulas will be denoted by Φ, Ψ and Θ, possibly with indices. The set of all formulas will be denoted by For.

We define the notion of a model as a special kind of Kripke model. Namely, a model 

M is any tuple 〈W, H, µ, v〉 such that: 

• W is a nonempty set. As usual, its elements will be called worlds. 

• H is an algebra of sets over W . 

• µ : H −→ [0, 1] is a finitely additive probability measure. 

• v : For_C × W −→ {0, 1} is a truth assignment¹ compatible with ¬ and ∧. That is, v(¬α, w) = 1 − v(α, w) and v(α ∧ β, w) = v(α, w) · v(β, w).

For a given model M, let [α]_M be the set of all w ∈ W such that v(α, w) = 1. If the context is clear, we will write [α] instead of [α]_M. We say that M is measurable if [α] ∈ H for all α ∈ For_C.

Definition 3 Let M = 〈W, H, µ, v〉 be any measurable model. We define the satisfiability relation |= recursively as follows:

• M |= α if v(α, w) = 1 for all w ∈ W.

• M |= f ≥ 0 if f^M ≥ 0, where f^M is recursively defined in the following way:

– s^M = s.

– CP(α, β)^M = µ([α ∧ β]) · µ([β])⁻¹.

– (f + g)^M = f^M + g^M.

– (s · g)^M = s · g^M.

– (−f)^M = −(f^M).

• M |= ¬φ if M ⊭ φ.

• M |= φ ∧ ψ if M |= φ and M |= ψ. □

A formula Φ is satisfiable if there is a measurable model M such that M |= Φ; Φ is 

valid if it is satisfied in every measurable model. We say that the set T of formulas is 

satisfiable if there is a measurable model M such that M |= Φ for all Φ ∈ T . 

Notice that the last two clauses of Definition 3 provide validity of each tautology instance. 
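
To make Definition 3 concrete, the evaluation of probabilistic terms can be sketched in Python for a finite model in which every subset of W is measurable. The representation is our own assumption, chosen only for illustration: worlds are dictionaries of truth values paired with their measure, propositional formulas are Python predicates on worlds, and terms are nested tuples.

```python
from fractions import Fraction


def measure(model, alpha):
    """mu([alpha]): total weight of the worlds in which alpha holds."""
    return sum((weight for world, weight in model if alpha(world)), Fraction(0))


def inv(x):
    """Total inverse, using the convention that the inverse of 0 is 1."""
    return Fraction(1) if x == 0 else 1 / x


def evaluate(model, term):
    """Compute f^M by recursion on the structure of the term f."""
    kind = term[0]
    if kind == "const":                      # s^M = s
        return Fraction(term[1])
    if kind == "cp":                         # CP(alpha, beta)^M
        _, alpha, beta = term
        both = lambda w: alpha(w) and beta(w)
        return measure(model, both) * inv(measure(model, beta))
    if kind == "add":                        # (f + g)^M
        return evaluate(model, term[1]) + evaluate(model, term[2])
    if kind == "scale":                      # (s . g)^M
        return Fraction(term[1]) * evaluate(model, term[2])
    if kind == "neg":                        # (-f)^M
        return -evaluate(model, term[1])
    raise ValueError(f"unknown term constructor: {kind}")


def satisfies_geq0(model, term):
    """M |= f >= 0 for a basic probabilistic formula."""
    return evaluate(model, term) >= 0


# Hypothetical four-world model over the variables p and q.
model = [
    ({"p": True, "q": True}, Fraction(1, 4)),
    ({"p": True, "q": False}, Fraction(1, 4)),
    ({"p": False, "q": True}, Fraction(1, 2)),
    ({"p": False, "q": False}, Fraction(0)),
]
p = lambda w: w["p"]
q = lambda w: w["q"]
# The basic formula CP(q, p) - CP(p, q) >= 0.
difference = ("add", ("cp", q, p), ("neg", ("cp", p, q)))
holds = satisfies_geq0(model, difference)
```

Exact rational arithmetic is used so that comparisons with 0 are not affected by rounding.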

3 Axiomatization 


In this section we will introduce the axioms and inference rules and prove that the proposed 

axiomatization is sound and strongly complete with respect to the class of all measurable 

models. The set of axioms from our axiomatic system, which we denote AXLPCP, 

is divided into three groups: axioms for propositional reasoning, axioms for probabilistic 

reasoning and arithmetical axioms. 

Axioms for propositional reasoning 

A1. τ(Φ1, . . . , Φn), where τ(p1, . . . , pn) ∈ For_C is any tautology and Φi are either all 

propositional or all probabilistic. 

Axioms for probabilistic reasoning 

A2. P(α) ≥ 0;

A3. P(⊤) = 1;

A4. P(⊥) = 0;

A5. P(α ↔ β) = 1 → P(α) = P(β);

A6. P(α ∨ β) = P(α) + P(β) − P(α ∧ β);

A7. (P(α ∧ β) = r ∧ P(β) = s) → CP(α, β) = r · s⁻¹.

¹ 1 stands for “true”, while 0 stands for “false”.

Arithmetical axioms

A8. r ≥ s, whenever r ≥ s;

A9. s · r = sr;

A10. s + r = s + r;

A11. f + g = g + f;

A12. (f + g) + h = f + (g + h);

A13. f + 0 = f;

A14. f − f = 0;

A15. (r · f) + (s · f) = (r + s) · f;

A16. s · (f + g) = (s · f) + (s · g);

A17. r · (s · f) = (r · s) · f;

A18. 1 · f = f;

A19. f ≥ g ∨ g ≥ f;

A20. (f ≥ g ∧ g ≥ h) → f ≥ h;

A21. f ≥ g → f + h ≥ g + h;

A22. (f ≥ g ∧ s > 0) → s · f ≥ s · g.

Inference rules 

R1. From Φ and Φ → Ψ infer Ψ. 

R2. From α infer P (α) = 1. 

R3. From the set of premises {φ → f ≥ −n⁻¹ | n = 1, 2, 3, . . .} infer φ → f ≥ 0. 

Let us briefly comment on the axioms and inference rules. The axioms A1-A7 provide 

the required properties of probability, while the axioms A8-A22 provide the properties 

required for computation. In the inference rules, R1 is modus ponens, R2 resembles 

necessitation, while R3 provides that non-Archimedean probabilities are not permitted. 

Definition 4 A formula Φ is a theorem (⊢ Φ) if there is an at most countable sequence 

of formulas Φ0, Φ1, . . . , Φ, such that every Φi is either an axiom or it is derived from the 

preceding formulas of the sequence by an inference rule. In this paper we will also use 

the notion of deducibility. A formula Φ is deducible from a set T of sentences (T ⊢ Φ) if 

there is an at most countable sequence of formulas Φ0, Φ1, . . . , Φ, such that every Φi is 

an axiom or a formula from the set T , or it is derived from the preceding formulas by an 

inference rule. A formula Φ is a theorem (⊢ Φ) if it is deducible from the empty set. A set of sentences T is consistent if there is at least one formula from For_C, and at least one formula from For_P that are not deducible from T. Otherwise, T is inconsistent. A set T is deductively closed if for every Φ ∈ For, if T ⊢ Φ, then Φ ∈ T. □

Observe that the length of the inference may be any successor ordinal less than the 

first uncountable ordinal ω1. Using a straightforward induction on the length of the inference, 

one can easily show that the above axiomatization is sound with respect to the class 

of all measurable models. 

4 Completeness 


Theorem 1 (Deduction theorem) Suppose that T is an arbitrary set of formulas and that Φ, Ψ ∈ For. Then, T ⊢ Φ → Ψ iff T ∪ {Φ} ⊢ Ψ.

Proof: If T ⊢ Φ → Ψ, then clearly T ∪ {Φ} ⊢ Φ → Ψ, so, by modus ponens (R1), T ∪ {Φ} ⊢ Ψ. Conversely, let T ∪ {Φ} ⊢ Ψ. As in the classical case, we will use induction on the length of the inference to prove that T ⊢ Φ → Ψ. The proof differs from the classical one only in the cases when we apply the infinitary inference rule R3.

Suppose that Ψ is the formula φ → f ≥ 0 and that T ⊢ Φ → (φ → f ≥ −n⁻¹) for all n. Since the formula (p0 → (p1 → p2)) ↔ ((p0 ∧ p1) → p2) is a tautology, we obtain T ⊢ (Φ ∧ φ) → f ≥ −n⁻¹ for all n (A1). Now, by R3, T ⊢ (Φ ∧ φ) → f ≥ 0. Hence, by the same tautology, T ⊢ Φ → Ψ. □

The next technical lemma will be used in the construction of a maximally consistent 

extension of a consistent set of formulas. 

Lemma 2 Suppose that T is a consistent set of formulas. If T ∪ {φ → f ≥ 0} is inconsistent, then there is a positive integer n such that T ∪ {φ → f < −n⁻¹} is consistent.

Proof: The proof is based on a reductio ad absurdum argument. Thus, let us suppose that T ∪ {φ → f < −n⁻¹} is inconsistent for all n. Due to the Deduction theorem, we can conclude that

T ⊢ φ → f ≥ −n⁻¹

for all n. By R3, T ⊢ φ → f ≥ 0, so T is inconsistent; a contradiction. □

Definition 5 Suppose that T is a consistent set of formulas and that For_P = {φ_i | i = 0, 1, 2, 3, . . .}. We define a completion T* of T recursively as follows:

1. T_0 = T ∪ {α ∈ For_C | T ⊢ α} ∪ {P(α) = 1 | T ⊢ α}.

2. If T_i ∪ {φ_i} is consistent, then T_{i+1} = T_i ∪ {φ_i}.

3. If T_i ∪ {φ_i} is inconsistent, then:

(a) If φ_i has the form ψ → f ≥ 0, then T_{i+1} = T_i ∪ {ψ → f < −n⁻¹}, where n is a positive integer such that T_{i+1} is consistent. The existence of such an n is provided by Lemma 2.

(b) Otherwise, T_{i+1} = T_i. □

Obviously, each T_i is consistent. In the next theorem we will prove that T* is deductively closed, consistent and maximal with respect to For_P.

Theorem 3 Suppose that T is a consistent set of formulas and that T* is constructed as above. Then:

1. T* is deductively closed, that is, T* ⊢ Φ implies Φ ∈ T*.

2. There is φ ∈ For_P such that φ ∉ T*.

3. For each φ ∈ For_P, either φ ∈ T*, or ¬φ ∈ T*.


Proof: We will prove only the first clause, since the remaining clauses can be proved in the same way as in the classical case. In order to do so, it is sufficient to prove the following four claims:

(i) Each instance of any axiom is in T*.

(ii) If Φ ∈ T* and Φ → Ψ ∈ T*, then Ψ ∈ T*.

(iii) If α ∈ T*, then P(α) = 1 ∈ T*.

(iv) If {φ → f ≥ −n⁻¹ | n = 1, 2, 3, . . .} is a subset of T*, then φ → f ≥ 0 ∈ T*.

(i): If Φ ∈ For_C, then Φ ∈ T_0. Otherwise, there is a nonnegative integer i such that Φ = φ_i. Since ⊢ φ_i, T_i ⊢ φ_i as well, so φ_i ∈ T_{i+1}.

(ii): If Φ, Φ → Ψ ∈ For_C, then Ψ ∈ T_0. Otherwise, let Φ = φ_i, Ψ = φ_j, and Φ → Ψ = φ_k. Then, Ψ is a deductive consequence of each T_l, where l ≥ max(i, k) + 1. Let ¬Ψ = φ_m. If φ_m ∈ T_{m+1}, then ¬Ψ is a deductive consequence of each T_n, where n ≥ m + 1. So, for every n ≥ max(i, k, m) + 1, T_n ⊢ Ψ ∧ ¬Ψ, a contradiction. Thus, ¬Ψ ∉ T*. On the other hand, if also Ψ ∉ T*, we have that T_n ∪ {Ψ} ⊢ ⊥, and T_n ∪ {¬Ψ} ⊢ ⊥, for n ≥ max(j, m) + 1, a contradiction with the consistency of T_n. Thus, Ψ ∈ T*.

(iii): If α ∈ T*, then α ∈ T_0, so P(α) = 1 ∈ T_0.

(iv): Suppose that {φ → P(α) ≥ −n⁻¹ | n = 1, 2, 3, . . .} is a subset of T*. We want to prove that φ → P(α) ≥ 0 ∈ T*. The proof uses a reductio ad absurdum argument. So, let φ → P(α) ≥ 0 = φ_i and let us suppose that T_i ∪ {φ_i} is inconsistent. By 3.(a) of Definition 5, there is a positive integer n such that

T_{i+1} = T_i ∪ {φ → P(α) < −n⁻¹}

and T_{i+1} is consistent. Then, for all sufficiently large k, T_k ⊢ φ → P(α) < −n⁻¹ and T_k ⊢ φ → P(α) ≥ −n⁻¹, so T_k ⊢ φ → ψ for all ψ ∈ For_P. In particular, T_k ⊢ φ → P(α) ≥ 0, i.e., T_k ⊢ φ_i for all sufficiently large k. But φ_i ∉ T*, so φ_i is inconsistent with all T_k, k ≥ i. It follows that each T_k is inconsistent for sufficiently large k, a contradiction.

Thus, T_i ∪ {φ_i} is consistent, so φ → P(α) ≥ 0 ∈ T_{i+1}. □

For the given completion T*, we define a canonical model M* as follows:

• W is the set of all functions w : For_C −→ {0, 1} with the following properties:

– w is compatible with ¬ and ∧.

– w(α) = 1 for each α ∈ T*.

• v : For_C × W −→ {0, 1} is defined by v(α, w) = 1 iff w(α) = 1.

• H = {[α] | α ∈ For_C}.

• µ : H −→ [0, 1] is defined by µ([α]) = sup{s ∈ [0, 1] ∩ Q | T* ⊢ P(α) ≥ s}.

Lemma 4 M* is a measurable model.

Proof: We need to prove that H is an algebra of sets and that µ is a finitely additive probability measure. It is easy to see that H is an algebra of sets, since [α] ∩ [β] = [α ∧ β], [α] ∪ [β] = [α ∨ β] and W \ [α] = [¬α]. Concerning µ, it is sufficient to prove that A3, A4 and A6 are satisfied in M*. Here we will only give a sketch of the proof for A6, which provides finite additivity of µ.

Let µ([α]) = a, µ([β]) = b and µ([α ∧ β]) = c. We claim that

µ([α ∨ β]) = a + b − c.

This is an immediate consequence of the following facts:

• µ([γ]) = sup{s ∈ Q | T* ⊢ P(γ) ≥ s}, γ ∈ For_C.

• The real function F(x, y, z) = x + y − z is continuous.

• For each r, s ∈ Q, T* ⊢ r ≥ s iff r ≥ s.

• Q³ is dense in R³.

Namely, for each positive ε, there are positive δ1, δ2, δ3 such that for all 〈r1, r2, r3〉 ∈ ((a − δ1, a] × (b − δ2, b] × (c − δ3, c]) ∩ Q³,

r1 + r2 − r3 ∈ (a + b − c − ε, a + b − c + ε).

In particular, for each s′, s″ ∈ Q such that

a + b − c − ε < s′ ≤ r1 + r2 − r3 ≤ s″ < a + b − c + ε,

using the axioms about rational numbers, we have that

T* ⊢ s′ ≤ r1 + r2 − r3 ≤ s″,

i.e., µ([α ∨ β]) = µ([α]) + µ([β]) − µ([α ∧ β]). □

Theorem 5 (Strong completeness theorem) Every consistent set of formulas has a measurable model.

Proof: Let T be a consistent set of formulas. We can extend it to a maximally consistent set T*, and define a canonical model M*, as above. By induction on the complexity of the formulas we can prove that M* |= Φ iff Φ ∈ T*.

To begin the induction, let Φ = α ∈ For_C. If α ∈ T*, i.e., T* ⊢ α, then by definition of M*, M* |= α. Conversely, if M* |= α, by the completeness of classical propositional logic, T* ⊢ α, and α ∈ T*.

Let us suppose that f ≥ 0 ∈ T*. Then, using the axioms for ordered commutative rings, we can prove that

\[ T^* \vdash f = s + \sum_{i=1}^{m} s_i \cdot CP(\alpha_i, \beta_i) \quad \text{and} \quad T^* \vdash s + \sum_{i=1}^{m} s_i \cdot CP(\alpha_i, \beta_i) \geq 0, \]

for some s, s_i ∈ Q and some α_i, β_i ∈ For_C such that T* ⊢ P(β_i) > 0. Let a_i = µ([α_i ∧ β_i]) and b_i = µ([β_i]). It remains to prove that

\[ s + \sum_{i=1}^{m} s_i \cdot a_i \cdot b_i^{-1} \geq 0. \tag{1} \]

As in the proof of Lemma 4, we can show that (1) is an immediate consequence of the following facts:

• µ([γ]) = sup{s ∈ Q | T* ⊢ P(γ) ≥ s}, γ ∈ For_C.

• The real function F(x_1, . . . , x_m, y_1, . . . , y_m) = s + Σ_{i=1}^{m} s_i · x_i · y_i⁻¹ is continuous.

• For each r, s ∈ Q, T* ⊢ r ≥ s iff r ≥ s.

• Q^k is dense in R^k.

For the other direction, let M* |= f ≥ 0. If f ≥ 0 ∉ T*, by construction of T*, there is a positive integer n such that f < −n⁻¹ ∈ T*. Reasoning as above, we have that f^{M*} < 0, which is a contradiction. So, f ≥ 0 ∈ T*.

Let Φ = ¬φ ∈ For_P. Then M* |= ¬φ iff M* ⊭ φ iff φ ∉ T* iff (by Theorem 3) ¬φ ∈ T*.

Finally, let Φ = φ ∧ ψ ∈ For_P. M* |= φ ∧ ψ iff M* |= φ and M* |= ψ iff φ, ψ ∈ T* iff (by Theorem 3) φ ∧ ψ ∈ T*. □

5 Decidability 


Theorem 6 Satisfiability of probabilistic formulas is decidable.

Proof: Up to equivalence, each probabilistic formula is a finite disjunction of finite conjunctions of literals, where a literal is either a basic probabilistic formula, or a negation of a basic probabilistic formula. Thus, it is sufficient to show the decidability of the satisfiability problem for the formulas of the form

\[ \bigwedge_{i} f_i \geq 0 \;\wedge\; \bigwedge_{j} g_j < 0. \tag{2} \]

Suppose that p_1, . . . , p_n are all of the propositional variables appearing in (2). Let A_1, . . . , A_{2^n} be all of the formulas of the form

±p_1 ∧ · · · ∧ ±p_n,

where +p = p and −p = ¬p. Clearly, the A_i are pairwise disjoint and form a partition of ⊤. Furthermore, for each α appearing in (2) there is a unique set I_α ⊆ {1, . . . , 2^n} such that

\[ \alpha \leftrightarrow \bigvee_{i \in I_\alpha} A_i \]

is a tautology. Now we can equivalently rewrite (2) as

\[ \bigwedge_{i} \sum_{i'} s_{ii'} \, CP\Big( \bigvee_{k \in I_{\alpha_{ii'}}} A_k, \bigvee_{l \in I_{\beta_{ii'}}} A_l \Big) \geq 0 \;\wedge\; \bigwedge_{j} \sum_{j'} s_{jj'} \, CP\Big( \bigvee_{k \in I_{\alpha_{jj'}}} A_k, \bigvee_{l \in I_{\beta_{jj'}}} A_l \Big) < 0. \]

Let σ_i(x_1, . . . , x_{2^n}), δ_j(x_1, . . . , x_{2^n}) be the formulas

\[ \sum_{i'} s_{ii'} \cdot \Big( \sum_{k \in I_{\alpha_{ii'}}} x_k \Big) \cdot \Big( \sum_{l \in I_{\beta_{ii'}}} x_l \Big)^{-1} \geq 0 \]

and

\[ \sum_{j'} s_{jj'} \cdot \Big( \sum_{k \in I_{\alpha_{jj'}}} x_k \Big) \cdot \Big( \sum_{l \in I_{\beta_{jj'}}} x_l \Big)^{-1} < 0. \]

Then, it is easy to see that (2) is satisfiable iff the sentence

\[ \exists x_1 \ldots \exists x_{2^n} \Big( \bigwedge_{i} \sigma_i(\bar{x}) \wedge \bigwedge_{j} \delta_j(\bar{x}) \Big) \]

is satisfied in the ordered field of reals. Since the latter question is decidable, we have our claim. □
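
The first step of the proof, passing from a propositional formula α to its index set I_α over the atoms A_1, . . . , A_{2^n}, can be sketched in Python. The representation is an assumption made only for this illustration: atoms are dictionaries assigning a truth value to each variable, formulas are predicates on such dictionaries, and the ordering of the atoms is the one produced by itertools.product.

```python
from itertools import product


def atoms(variables):
    """All 2^n sign assignments, each standing for one atom ±p1 ∧ ... ∧ ±pn."""
    return [
        dict(zip(variables, signs))
        for signs in product([True, False], repeat=len(variables))
    ]


def index_set(alpha, atom_list):
    """I_alpha: the 1-based indices i such that the atom A_i implies alpha."""
    return {i for i, atom in enumerate(atom_list, start=1) if alpha(atom)}


# Hypothetical example over two variables p and q.
atom_list = atoms(["p", "q"])
i_p = index_set(lambda a: a["p"], atom_list)
i_q = index_set(lambda a: a["q"], atom_list)
i_p_or_q = index_set(lambda a: a["p"] or a["q"], atom_list)
i_p_and_q = index_set(lambda a: a["p"] and a["q"], atom_list)
```

Since the atoms partition ⊤, disjunction and conjunction of formulas correspond to union and intersection of their index sets, which is what allows each probability to be written as a sum of the variables x_k.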

It should be noted that this logic can be embedded into the logic described in (Fagin et al., 1990), whose decision procedure is in PSPACE. Also, the rewriting of formulas from our logic into that logic can be accomplished in linear time: CP(α, β) is equivalent to

w(α ∧ β) / w(β),

which is representable in (Fagin et al., 1990).

Thus, we conclude that our logic is also decidable in PSPACE.

6 Conclusion 


In this paper we introduced a sound and strongly-complete axiomatic system for the probabilistic 

logic with the conditional probability operator CP , which allows for linear combinations 

and comparative statements. As was noted in (van der Hoek, 1997), it is not possible to give a finitary strongly complete axiomatization for such a system. In our case the strong completeness was made possible by adding an infinitary rule of inference.

The obtained formalism is quite expressive and allows for the representation of uncertain knowledge, where uncertainty is modeled by probability formulas. For instance, a conditional statement of the form “the sum of probabilities of α given β and γ given δ is at least 0.95” can be written as

CP(α, β) + CP(γ, δ) ≥ 0.95.

A similar approach can be applied to de Finetti style conditional probabilities. Future 

research will also consider a possibility of dealing with probabilistic first-order formulas. 
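To make the example statement concrete, the sketch below evaluates a formula of this shape in a small finite probability model, reading CP(α, β) as μ(α ∧ β)/μ(β). The model, the weights, and the function names `mu` and `cp` are hypothetical, and the case μ(β) = 0 is simply rejected here, which is an assumption of this sketch rather than the convention of the logic.

```python
from fractions import Fraction

# Hypothetical model: each world is the tuple of letters true in it, with its probability.
worlds = {
    ("p", "q"): Fraction(1, 2),
    ("p",): Fraction(1, 4),
    ("q",): Fraction(1, 8),
    (): Fraction(1, 8),
}

def mu(formula):
    """Probability of the set of worlds satisfying `formula`."""
    return sum(weight for world, weight in worlds.items() if formula(set(world)))

def cp(alpha, beta):
    """Conditional probability of alpha given beta (assumes mu(beta) > 0)."""
    denom = mu(beta)
    if denom == 0:
        raise ZeroDivisionError("this sketch does not handle mu(beta) = 0")
    return mu(lambda w: alpha(w) and beta(w)) / denom

p = lambda w: "p" in w
q = lambda w: "q" in w

# Instance of CP(alpha, beta) + CP(gamma, delta) >= 0.95 with alpha = delta = p, beta = gamma = q.
total = cp(p, q) + cp(q, p)
holds = total >= Fraction(95, 100)
```

Exact rationals are used so that the comparison with 0.95 is not affected by floating-point rounding.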


References 


Fagin, R., Halpern, J. Y. and Megiddo, N. (1990). A logic for reasoning about probabilities, Information and Computation 87(1/2): 78–128.

Lukasiewicz, T. (2002). Probabilistic default reasoning with conditional constraints, Annals 

of Mathematics and Artificial Intelligence 34: 35–88. 

Ognjanović, Z., Marković, Z. and Rašković, M. (2005). Completeness theorem for a logic with imprecise and conditional probabilities, Publications de l'Institut Mathématique, Nouvelle Série 78(92): 35–49.

Ognjanović, Z., Perović, A. and Rašković, M. (2008). Logics with the qualitative probability operator, Logic Journal of IGPL 16(2): 105–120.

Ognjanović, Z. and Rašković, M. (1996). A logic with higher order probabilities, Publications de l'Institut Mathématique, Nouvelle Série 60(74): 1–4.

Ognjanović, Z. and Rašković, M. (1999). Some probability logics with new types of probability operators, Journal of Logic and Computation 9(2): 181–195.

Ognjanović, Z. and Rašković, M. (2000). Some first-order probability logics, Theoretical Computer Science 247(1-2): 191–212.

Rašković, M., Ognjanović, Z. and Marković, Z. (2004). A logic with conditional probabilities, in J. Leite and J. Alferes (eds), 9th European Conference JELIA'04 Logics in Artificial Intelligence, Vol. 3229, Springer-Verlag, pp. 226–238.

van der Hoek, W. (1997). Some considerations on the logic pfd: a logic combining 

modality and probability, Journal of Applied Non-Classical Logics 7(3): 287–307. 


A PROOF-THEORETIC APPROACH TO FRENCH PRONOMINAL CLITICS ◦ 

Scott Martin 

The Ohio State University 

Abstract. This paper sketches an account of the behavior of French pronominal clitics in 

CVG, a proof-theoretic categorial grammar formalism. The approach shown here differs 

from most categorial analyses of French clitics in that it treats clitics as noun phrases rather 

than as functions that operate on under-saturated verb phrases. Basic French cliticization, 

clitics in infinitival constructions, and both auxiliary and non-auxiliary clitic climbing are 

analyzed. 

1 Introduction



Cliticization in French is a set of phenomena in which pronominal complements to a 

verbal host are systematically realized as affixes. Linguistic generalizations about these 

phenomena have been structured using several different frameworks, with Sag & Miller’s 

(1997) HPSG treatment of French clitics as morphological affixes being the most comprehensive 

and successful. Categorial accounts of cliticization phenomena, among them 

Kraak (1998) for French and Morrill & Gavarro (1992) for Catalan, have largely analyzed 

clitics as functors over under-saturated verb phrases. Stabler (2001) and Amblard (2006) 

are two recent approaches to French clitics in the Minimalist Grammar formalism, both 

of which treat them as syntactic elements with certain feature sets. 

In this paper, I give a preliminary account of some of the phenomena involving French 

clitics using Convergent Grammar (CVG), a categorial grammar framework that uses natural 

deduction with hypothetical proof. 1 This treatment is limited to a subset of what 

Bonami & Boyé (2005) call French Pronominal Clitics (FPCs), specifically, those FPCs 

that appear as verbal complements. From Kraak (1998) I borrow the idea of a specialized 

combinatory mode for FPC attachment to a verbal host (analogous to her •ca) that is 

“stronger” than normal Complement Merge and reflects the status of clitic attachment as 

a process more morphological than syntactic. In contrast to Kraak’s and much other work 

on FPCs in categorial frameworks, however, the account sketched here partly follows the 

work of Stabler and Amblard in analyzing FPCs not as functors over verb phrases but 

as sets of morphological features that also represent a syntactic and semantic argument, 

much like ordinary NPs. 

Drawing on Sag & Miller’s work on French clitics as inspiration, the analysis reflected 

here relies mainly on properly-structured lexical axioms to describe the behavior of FPCs. 

Basic instances of cliticization are considered as well as more complicated situations, 

such as argument composition and the interaction of FPCs with infinitivals. However, this paper does not take a firm stance on the question of whether cliticization phenomena should be considered syntactic or morphological, since CVG’s tectogrammatical terms represent syntactic dependency relations and do not necessarily correspond exactly to surface word order or prosodic form.

◦ For many helpful comments and suggestions on this and earlier drafts of this paper, I am grateful to Yusuke Kubota, Carl Pollard, Chris Worth, and three anonymous ESSLLI reviewers.

1 Pollard (2007) provides an introduction to CVG.

2 Pronominal Complement Clitics in French 

French verbs take canonical complements in a manner that resembles complement selection 

for their English analogs: the verbal head combines with its complement(s) to the 

right and with its subject to the left to form a finite or infinitive clause. When certain 

complements are pronominalized, however, they can optionally appear to the immediate 

left of the verb in a variant form as proclitics. The following data, replicated in part from 

(1) in Sag & Miller (1997), show the verb voir ‘to see’ with its complement realized both 

canonically and as a proclitic: 2 

(1) a. Marie voit Jean. ‘Marie sees Jean.’ 

b. Marie voit lui. ‘Marie sees him.’ [boldface = prosodic stress] 

c. Marie le voit. 

Marie ACC.3S sees 

‘Marie sees him.’ 

The cliticized configuration is given in (1c), with the complement in its clitic form (le) 

instead of the canonical one (here Jean, or lui with appropriate stress). 

Among the other distinctive characteristics of complement FPCs noted by Kraak (1998), 

the ones that bear most on the account given here are that: 

• as verbal complements, they do not co-occur with their non-pronominal or non-cliticized versions (exemplified in (1)).

• they do not serve as the complement to bare past participles. This fact gives rise to 

an instance of the phenomenon known as “clitic climbing”: 

(2) a. *Marie a le vu. ‘Marie saw him.’ 

b. Marie l’a vu. 

Marie ACC.3S has seen 

‘Marie saw him.’ 

Here, (2a) is unacceptable because although the clitic le is the accusative complement 

of vu, it must be realized on the tense auxiliary form a as in (2b). However, 

causatives and certain verbs of perception exhibit different behavior. For these 

verbs, it is possible for some of their arguments to be realized as clitics on the 

upstairs verb and some on the downstairs one: 

(3) Jean le fera la réparer. 

Jean ACC.3S make.FUT ACC.3FS repair 

‘Jean will make him repair it.’ 

(From Abeillé, Godard and Miller (1995, example (2a)).)

2 I adopt Bonami & Boyé’s (2005) scheme here for annotating morphological features.


• No syntactic material except another clitic can intervene between an FPC and its 

host verb. This fact distinguishes cliticized complements from their canonical counterparts 

in which certain adverbials can occur between a verb and its complements: 

(4) a. Marie l’a souvent dit à lui.

Marie ACC.3S has often said to him

‘Marie has often said it to him.’

b. Marie l’a dit souvent à lui.

Marie ACC.3S has said often to him

c. Marie le lui a souvent dit.

Marie ACC.3S DAT.3S has often said

d. *Marie le lui souvent a dit.

Marie ACC.3S DAT.3S often has said

e. *Marie le souvent lui a dit.

Marie ACC.3S often DAT.3S has said

(Example (4d) is from Kraak (1998, (7d)).) Here, (4d) and (4e) show the disallowed 

intervention of the adverbial souvent ‘often’ between an FPC and its host verb, 

while (4b) demonstrates the allowable intervention of souvent in the canonical form. 

• they are normally realized on the verb they complement, illustrated here with an 

embedded infinitival: 

(5) a. *Marie le veut voir. ‘Marie wants to see him.’ 

b. Marie veut le voir. 

Marie wants ACC.3S to see 

‘Marie wants to see him.’ 

The cliticized accusative le here is the complement of the infinitive voir, and does not attach to the upstairs verb veut.

These are the most basic facts about cliticization of declarative verbal complements in 

French. FPCs also occur in passive constructions and in constructions like those in (6): 

(6) a. i. Pierre reste fidèle à Jean.

‘Pierre remains faithful to Jean.’

ii. Pierre lui reste fidèle.

Pierre DAT.3S remains faithful

‘Pierre remains faithful to him.’

b. i. Marie connaît la fin de l’histoire.

‘Marie knows the end of the story.’

ii. Marie en connaît la fin.

Marie GEN.3S knows the end

‘Marie knows the end of it.’

(Both are from Sag & Miller (1997, example 3).) Constructions involving FPCs like those 

in (6) are similar to the clitic climbing that occurs with auxiliaries like avoir (as shown in 

(2)). 

In §3, I sketch an analysis of the basic facts about cliticization in some of the situations 

described above. 

3 Accounting for the Data 


Sag & Miller (1997) give extensive argumentation for considering clitics as morphological 

rather than syntactic in nature. Their account constrains the inflectional paradigm 

of French verbs, treating clitics as pronominal affixes that reduce the valence requirements 

of a given verb. In examining French clitics from a deductive perspective, Kraak 

(1998) instead describes cliticization as occurring on a “sliding scale” between morphology 

(affix-host attachment) and syntax (complement selection). The view presented here is more in line with Kraak’s in that it uses CVG tectogrammatical proof terms to describe the combinatoric potential of functions and arguments.

However, this account diverges from Kraak’s and most other categorial grammar treatments 

in that it construes FPCs as regular pronominal NPs, instead of formulating them 

as functors over under-saturated verb phrases. This approach allows the semantics to be 

nearly identical between canonical and cliticized forms by specifying a separate mode of 

complement selection specifically for clitics. 

3.1 FPCs as a Local Dependency 

Because cliticization differs from the canonical form of complement selection (⊸C) in 

various ways, a separate implication mode, called ⊸PC (for proclitic), is used. As a local 

implication mode, it has modus ponens (elimination) but not hypothetical proof (introduction), 

which is used in CVG for non-local extractions. The elimination (or “merge”) 

rule for ⊸PC is as follows: 3 

Proclitic Merge 

If Γ ⊢ a, x : A, C ⊣ ∆ 

and Γ ′ ⊢ f, v : A ⊸PC B, C ⊃ D ⊣ ∆ ′ 

then Γ, Γ ′ ⊢ ( PC a f), v(x) : B, D ⊣ ∆, ∆ ′ 

This rule formalizes the affixation of clitics to a verbal host, taking into account both 

the syntactic and semantic proof terms. This new ⊸PC implication mode allows lexical 

axioms to specify the cliticized complement mode of combination as opposed to 

the canonical one, and is central to the account of clitic behavior sketched here. As a 

mnemonic meant to reflect French word order in derivational history, function application for ⊸PC writes an FPC to the left of its host. This rule also states that hypotheses present in both the syntactic context (to the left of ⊢) and the semantic co-context (to the right of ⊣) of both premises are propagated into the conclusion. This ensures that the application of this rule does not have any effect on any non-local extractions (filler-gap path information), stored quantifiers, or anaphoric pronouns.

3 A CVG sign is a triple made up of the prosodic/phonological form, syntactic tectogrammatical term, and semantic content. For brevity, I omit the prosodic element and only include the syntactic tecto-term and semantic denotation.

With this new implication mode and merge rule, an account of FPC behavior as demonstrated 

in §2 is possible that requires no other machinery than the CVG merge rules described 

in Pollard (2007). All that remains is to correctly specify the necessary lexical 

axioms. First are the canonical forms of the verbs and complements: 4 

⊢ Marie, marie ′ : Nom, Ind 

⊢ Jean, jean ′ : Acc, Ind 

⊢ lui1 , a : Acc, Ind 

⊢ voit1 , λyλxsee ′ (x, y) : (Acc \ Pcl) ⊸C (Nom ⊸SU Fin), Ind ⊃ (Ind ⊃ Prop) 

The new type Pcl is assigned to proclitics in order to differentiate them from their canonical 

counterparts. Here, voit selects a complement of type Acc \ Pcl to indicate that it 

does not combine with proclitics in canonical complement position: the set complement 

specifies all inhabitants of type Acc except those that inhabit Pcl. Next, the lexicon is 

extended to reflect the syntactic/morphological features of le and the cliticization mode 

of complement selection for voir: 

⊢ le, b : Acc ∩ 3Sg ∩ Pcl, Ind 

⊢ voit2 , λyλxsee ′ (x, y) : (Acc ∩ Pcl) ⊸PC (Nom ⊸SU Fin), 

Ind ⊃ (Ind ⊃ Prop) 

These axioms allow the following proof terms for the data in (1): 5 

(7) a. ⊢ ( SU Marie (voit1 Jean C )), see ′ (marie ′ , jean ′ ) : Fin, Prop 

b. ⊢ ( SU Marie (voit1 lui1 C )), see ′ (marie ′ , a) : Fin, Prop 

c. ⊢ ( SU Marie ( PC le voit2 )), see ′ (marie ′ , b) : Fin, Prop 

Aside from the different implication mode, the only difference between the canonical 

form of voit (voit1 ) and the cliticized variant (voit2 ) is that the argument to voit2 must 

be of the intersective type Acc ∩ Pcl. The type 3Sg represents the argument’s agreement 

features. So stated, this selectional restriction ensures that voit2 can only combine in 

cliticized mode with accusative complements that are also proclitics, as desired. It is 

important to note that not only are the semantics of both variants of voit identical, but 

both cliticized and canonical complements are of the same semantic type (Ind) as well. 
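
The selectional behavior of voit1 and voit2 can be mimicked in a few lines of code. The sketch below is only an illustration under stated assumptions: argument types are encoded as sets of features, `Acc \ Pcl` and `Acc ∩ Pcl` as a pair of required and forbidden features, contexts and co-contexts are omitted, and the names `Sel`, `Impl`, `Sign`, `merge`, and `rejected` are hypothetical and not part of CVG.

```python
from dataclasses import dataclass

def F(*features):
    """A basic (argument or result) type as a set of features."""
    return frozenset(features)

@dataclass(frozen=True)
class Sel:
    """Argument selector: features the argument must have and must lack."""
    need: frozenset
    lack: frozenset = frozenset()

    def admits(self, feats):
        return self.need <= feats and not (self.lack & feats)

@dataclass(frozen=True)
class Impl:
    mode: str      # "C", "PC" or "SU"
    arg: Sel
    res: object    # a feature set or another Impl

@dataclass(frozen=True)
class Sign:
    tecto: object  # tectogrammatical proof term
    sem: object    # semantic term
    ty: object     # feature set (argument) or Impl (functor)

def merge(mode, f, a):
    """Apply functor sign f to argument sign a in the given implication mode."""
    if not isinstance(f.ty, Impl) or f.ty.mode != mode:
        raise TypeError(f"{f.tecto} does not combine in mode {mode}")
    if not (isinstance(a.ty, frozenset) and f.ty.arg.admits(a.ty)):
        raise TypeError(f"{a.tecto} is not an admissible argument of {f.tecto}")
    # Proclitics and subjects are written to the left of the functor.
    tecto = (mode, a.tecto, f.tecto) if mode in ("PC", "SU") else (f.tecto, a.tecto, mode)
    return Sign(tecto, f.sem(a.sem), f.ty.res)

def rejected(thunk):
    try:
        thunk()
    except TypeError:
        return True
    return False

see = lambda y: lambda x: ("see'", x, y)
VP_FIN = Impl("SU", Sel(F("Nom")), F("Fin"))

marie = Sign("Marie", "marie'", F("Nom"))
jean = Sign("Jean", "jean'", F("Acc"))
le = Sign("le", "b", F("Acc", "3Sg", "Pcl"))
voit1 = Sign("voit1", see, Impl("C", Sel(F("Acc"), F("Pcl")), VP_FIN))   # Acc \ Pcl
voit2 = Sign("voit2", see, Impl("PC", Sel(F("Acc", "Pcl")), VP_FIN))     # Acc ∩ Pcl

s7a = merge("SU", merge("C", voit1, jean), marie)   # corresponds to (7a)
s7c = merge("SU", merge("PC", voit2, le), marie)    # corresponds to (7c)
```

In this encoding, `merge("C", voit1, le)` and `merge("PC", voit2, jean)` both raise `TypeError`, mirroring the restriction that proclitics are excluded from canonical complement position and that only proclitics combine in ⊸PC mode.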

4 The basic tectogrammatical types used here are Nom for nominative NPs, Acc for accusative NPs, and 

Fin for finite clauses. The hyperintensional types Ind, the type of individual concepts; and Prop, the type 

of propositions, are the basic semantic types. In addition to the new combinatory mode ⊸PC, implicative 

tectogrammatical types are constructed using ⊸SU and ⊸C, which invoke Subject Merge and Complement 

Merge, respectively. 

5 For clarity, the proof terms given in this account show the semantics but not the co-context as quantification, 

wh-phrases, and anaphoric binding are not discussed here. 



3.2 “Clitic Climbing” and Tense Auxiliaries 

The axioms for tense auxiliaries are structured so that they take the complements of their 

verbal complement. Past-participial verbs in turn need to be specified in such a way that 

the proclitic merge rule does not apply to them. This approach is reminiscent of the argument composition approach employed by Sag & Miller (1997) and Abeillé, Godard & Sag (1998). The axioms necessary to describe the “climbing” behavior in (2) are the following:

⊢ a_A, λv.v
: ((A \ Pcl) ⊸C (Nom ⊸SU Psp)) ⊸C ((A ∩ Pcl) ⊸PC (Nom ⊸SU Fin)),
(Ind ⊃ (Ind ⊃ Prop)) ⊃ (Ind ⊃ (Ind ⊃ Prop))

⊢ vu, λyλx see′(x, y) : (Acc \ Pcl) ⊸C (Nom ⊸SU Psp), Ind ⊃ (Ind ⊃ Prop)

The tense auxiliary form a (from avoir) is schematically defined to combine with a verb 

in past participial form missing its complement, of polymorphic type A, to yield a finite 

sentence missing both that same A complement and a nominative subject. In this way, the 

A-type complement is “passed along” from the past participle to the tense auxiliary, whose 

semantics are just to apply the identity function to the meaning of its past-participial 

complement. 

A proof term that correctly predicts the allowed form of (2b) is then possible: 6 

(8) ⊢ ( SU Marie ( PC le (a_Acc vu C ))), see′(marie′, b) : Fin, Prop

No proof is available for the disallowed form in (2a) because the lexical axiom vu only 

uses the ⊸C mode of implication, and as a result proclitics can not directly combine with 

it. 
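
The argument-composition behavior of the auxiliary can be sketched as a function on types. This is a simplified illustration, not CVG itself: signs are plain dictionaries, a functor type is a tuple `(mode, need, lack, result)`, the polymorphic variable A is instantiated by reading it off the participle's type, and the names `aux_a`, `merge`, and `rejected` are hypothetical.

```python
def F(*features):
    return frozenset(features)

# A functor type is (mode, needed features, forbidden features, result type).
SUBJ_PSP = ("SU", F("Nom"), F(), "Psp")   # Nom -oSU Psp
SUBJ_FIN = ("SU", F("Nom"), F(), "Fin")   # Nom -oSU Fin

def merge(mode, f, a):
    """Apply functor sign f to argument sign a in the given implication mode."""
    if not isinstance(f["ty"], tuple):
        raise TypeError("not a functor")
    fmode, need, lack, res = f["ty"]
    if fmode != mode or not need <= a["ty"] or lack & a["ty"]:
        raise TypeError("inadmissible combination")
    tecto = (mode, a["tecto"], f["tecto"]) if mode in ("PC", "SU") else (f["tecto"], a["tecto"], mode)
    return {"tecto": tecto, "sem": f["sem"](a["sem"]), "ty": res}

def aux_a(participle):
    """Schematic a_A: from (A minus Pcl) -oC (Nom -oSU Psp) to (A and Pcl) -oPC (Nom -oSU Fin)."""
    mode, need, lack, res = participle["ty"]
    if not (mode == "C" and "Pcl" in lack and res == SUBJ_PSP):
        raise TypeError("not a past participle missing a canonical complement")
    name = "a_" + "+".join(sorted(need))       # records the instantiation of A
    identity = lambda v: v                      # the auxiliary's semantics
    return {"tecto": (name, participle["tecto"], "C"),
            "sem": identity(participle["sem"]),
            "ty": ("PC", need | F("Pcl"), F(), SUBJ_FIN)}

def rejected(thunk):
    try:
        thunk()
    except TypeError:
        return True
    return False

marie = {"tecto": "Marie", "sem": "marie'", "ty": F("Nom")}
le = {"tecto": "le", "sem": "b", "ty": F("Acc", "3Sg", "Pcl")}
vu = {"tecto": "vu", "sem": lambda y: lambda x: ("see'", x, y),
      "ty": ("C", F("Acc"), F("Pcl"), SUBJ_PSP)}

s8 = merge("SU", merge("PC", aux_a(vu), le), marie)   # corresponds to (8)
```

Here `merge("PC", vu, le)` fails because vu only offers the ⊸C mode, which is the reason given above for the unavailability of (2a).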

3.3 FPCs in Infinitival Constructions 

Ensuring that cliticized complements of infinitival complements stay on the infinitive-form verb, as depicted in (5), can also be accomplished with well-formulated lexical axioms. This ends up being simply a matter of making sure that infinitive-form verbs can take proclitic complements and the verbs that select infinitivals cannot:

⊢ voir1 , λyλxsee ′ (x, y) 

: (Acc ∩ Pcl) ⊸PC (Nom ⊸SU Inf), Ind ⊃ (Ind ⊃ Prop) 

⊢ veut, λPλxwant ′ (x, P (x)) 

: (Nom ⊸SU Inf) ⊸C (Nom ⊸SU Fin), (Ind ⊃ Prop) ⊃ (Ind ⊃ Prop) 

The semantic representation of veut given here is the “equi” version of the denotation 

λP∈Propλx∈Indwant ′ (x, P ) that might be used where veut takes a sentential complement, 

as in Marie veut qu’elle gagne ‘Marie wants that she wins’. 

With the lexicon so extended, a proof term for (5b) can be derived: 

(9) ⊢ ( SU Marie (veut ( PC le voir) C )), want ′ (marie ′ , see ′ (marie ′ , b)) : Fin, Prop 

A derivation for (5a) is not possible because veut does not employ the ⊸PC mode of 

combination required for FPCs. 

6 Note that the tectogrammatical proof term in (8) does not describe the phonological elision between le 

and a that occurs in French. 



3.4 FPCs and Non-auxiliary Composition 

Extending CVG to account for FPCs that combine with argument composition verbs other 

than auxiliaries, whose behavior is exemplified in (6), requires defining special lexical 

axioms for those verbs. Similar to the data examined so far, “non-local pronominal affixation” 

(in the terminology of Sag & Miller (1997)) is very short distance in nature, and as 

such employs the local implication ⊸PC that was introduced to handle procliticization. 

It is not necessary to invoke CVG’s hypothetical proof machinery for handling extraction 

phenomena to explain the data in (6). 

Here, a strategy is adopted of composing a predicative adjectival (for example, fidèle) 

or transitive verb (like connaît) with a version of its complement that is itself expecting 

a complement. The necessary extensions to the lexicon for the data in (6a) are the 

following: 7 

⊢ Pierre, pierre ′ : Nom, Ind 

⊢ lui2 , d : Dat ∩ 3Sg ∩ Pcl, Ind 

⊢ fidèle, λyλxfaithful ′ (x, y) : (Dat \ Pcl) ⊸C (Nom ⊸SU Adj), 

Ind ⊃ (Ind ⊃ Prop) 

⊢ reste, λPλyλxremain ′ (P (x, y)) 

: ((Dat \ Pcl) ⊸C (Nom ⊸SU Adj)) ⊸C ((Dat ∩ Pcl) ⊸PC (Nom ⊸SU Fin)), 

(Ind ⊃ (Ind ⊃ Prop)) ⊃ (Ind ⊃ (Ind ⊃ Prop)) 

These axioms describe fidèle as an adjective missing a dative complement to form an 

adjectival small clause and the form of rester that takes an adjectival complement that is 

itself missing its complement. These extensions permit a proof term for (6a-ii): 

(10) ⊢ ( SU Pierre ( PC lui2 (reste fidèle C ))), remain ′ (faithful ′ (pierre ′ , d)) : Fin, Prop 

(A full derivation of (10) is given in Figure 1 in the appendix.) With a few further extensions 

to the lexicon, (6b) can also be accounted for: 

⊢ connaît, λf λyλxknow ′ (x, f(y)) 

: ((De \ Pcl) ⊸C Acc) ⊸C ((De ∩ Pcl) ⊸PC (Nom ⊸SU Fin)), 

(Ind ⊃ Ind) ⊃ (Ind ⊃ Prop) 

⊢ fin, end ′ : N, Ind 

⊢ la, λf λxf(x) : N ⊸SP ((De \ Pcl) ⊸C Acc), Ind ⊃ (Ind ⊃ Ind) 

⊢ en, e : De ∩ Pcl, Ind 

Here, connaît is formulated as just an ordinary transitive verb except that it selects an 

accusative complement that is itself missing its De complement. The definite article la 

is treated as a function from common nouns (type N) to possessive NPs (functions from 

canonical de-phrases to accusatives), using the specifier combinatory mode ⊸SP. The 

clitic en is represented as an axiom whose type is the intersection of De and Pcl. These 

axioms allow a proof term like the one in (10) for (6b-ii): 

7 This account assumes the analysis of predicatives given by Pollard (2006) pp. 52–65, for example, for 

adjectival small clauses of the type Nom ⊸SU Adj. 



(11) ⊢ ( SU Marie ( PC en (connaît (la fin SP ) C ))), know ′ (marie ′ , end ′ (e)) : Fin, Prop 

The lexical axioms introduced here predict that FPCs in non-auxiliary composition 

contexts behave in a way largely parallel with that of FPCs that combine with auxiliary 

verbs. The main difference between FPCs with auxiliaries and with non-auxiliaries is that 

the complement types for non-auxiliaries must be more constrained than the free-ranging 

polymorphic complement allowed by auxiliaries. Since this approach does not appeal to CVG’s unbounded dependency machinery, instead relying on axioms that specify the ⊸PC local dependency, these instances of cliticization are guaranteed to remain short-distance. If FPCs in non-auxiliary composition contexts were construed as non-local extractions, it would be difficult to rule out constructions like (12), for example, which do not occur in French: 8

(12) *Marie lui_i reste certaine que Céline a donné le livre __i.

4 Conclusions and Future Work 

This paper sketches a proof-theoretic account of the behavior of FPCs as complements. 

For local cliticization, a new valence implication mode ⊸PC is introduced to differentiate 

procliticization from the canonical form of verbal complement selection. Combined with 

properly-formulated lexical axioms, this new mode can account for some of the behavior 

of FPCs, including the basic instances of cliticization, FPCs in infinitival constructions, 

and two forms of “clitic climbing” via an argument composition analysis. 

The analysis given here departs from traditional categorial analyses of cliticization 

by construing FPCs as special instances of NPs. An advantage of this approach is that 

a cliticized complement has identical semantics and a nearly identical tectogrammatical 

form as its canonical counterpart. This fact, in combination with the new ⊸PC mode 

of implication for FPC affixation, allows lexical axioms to more strictly constrain the 

behavior of FPCs in comparison to other types of verbal complements. This ability may 

be central to correctly predicting, for example, the distribution of souvent as shown in (4). 

This approach suffers, however, from the proliferation of lexical axioms that must occur 

since all verbs that take complements need at least two distinct representations in the 

lexicon. Such a requirement would have especially adverse implications for computational 

applications like parsing. Since very often, as with voit1 and voit2 , the canonical 

form of a verb closely resembles its cliticized variant, it is clear that a lexical rule associating 

these forms is crucial to the success of this type of approach. The instances of 

auxiliary and non-auxiliary composition presented here are also largely similar between 

cliticized and non-cliticized versions. A general account of FPCs in French along the 

lines of the analyses presented here must include a mapping between these similar forms 

that captures their common linguistic and information-structural characteristics. 

Future work on FPCs will aim to develop a correspondence between canonical and 

cliticized verb forms that predicts FPC behavior in a general way. This work will need to 

account for multiple clitic constructions, the rigid (and sometimes idiosyncratic) ordering 

of FPC clusters, agreement between FPCs and past participles, FPCs in passive, causative, 

and perceptual-verb constructions, and the enclitic attachment to imperative-form verbs 

in French. 

8 This example is due to Carl Pollard (personal communication of March 18, 2008). 


References 

Abeillé, A., Godard, D. and Miller, P. (1995). Causatifs et Verbes de Perception en 

Français, Actes du Deuxième Colloque Langues et Grammaire, Paris VIII, Saint 

Denis. 

Abeillé, A., Godard, D. and Sag, I. A. (1998). Two Kinds of Composition in French 

Complex Predicates, Syntax and Semantics: Complex Predicates in Nonderivational 

Syntax 30: 1–41. 

Amblard, M. (2006). Treating clitics with minimalist grammars, in S. Wintner (ed.), 

Proceedings of the Eleventh Conference on Formal Grammar, CSLI Publications, 

pp. 9–20. 

Bonami, O. and Boyé, G. (2005). French pronominal clitics and the design of Paradigm 

Function Morphology, in G. Booij, L. Ducceschi, B. Fradin, E. Guevara, A. Ralli 

and S. Scalise (eds), Proceedings of the Fifth Mediterranean Morphology Meeting, 

pp. 291–322. 

Kraak, E. (1998). A Deductive Account of French Object Clitics, Syntax and Semantics: 

Complex Predicates in Nonderivational Syntax 30: 271–312. 

Morrill, G. and Gavarro, A. (1992). Catalan Clitics, in A. Lecomte (ed.), Word Order in 

Categorial Grammar, Editions Adosa, Clermont-Ferrand, pp. 211–232. 

Pollard, C. (2006). Higher Order Grammar: A Tutorial. Unpublished ms., available at 

http://www.ling.osu.edu/∼hana/hog/pollard2006-synners.pdf. 

Pollard, C. (2007). Nonlocal dependencies via variable contexts, in R. Muskens (ed.), Proceedings of the Workshop on New Directions in Type-Theoretic Grammar. ESSLLI 2007, Dublin.

Sag, I. A. and Miller, P. H. (1997). French Clitic Movement without Clitics or Movement, 

Natural Language and Linguistic Theory 15(3): 573–639. 

Stabler, E. P. (2001). Recognizing Head Movement, LACL ’01: Proceedings of the 4th International Conference on Logical Aspects of Computational Linguistics, Springer-Verlag, London, UK, pp. 245–260.

Appendix A: Full Derivation

Tecto-terms (each conclusion follows the premises it is derived from):

⊢ reste : ((Dat \ Pcl) ⊸C (Nom ⊸SU Adj)) ⊸C ((Dat ∩ Pcl) ⊸PC (Nom ⊸SU Fin))

⊢ fidèle : (Dat \ Pcl) ⊸C (Nom ⊸SU Adj)

⊢ (reste fidèle C ) : (Dat ∩ Pcl) ⊸PC (Nom ⊸SU Fin)

⊢ lui2 : Dat ∩ 3Sg ∩ Pcl

⊢ ( PC lui2 (reste fidèle C )) : Nom ⊸SU Fin

⊢ Pierre : Nom

⊢ ( SU Pierre ( PC lui2 (reste fidèle C ))) : Fin

Semantic terms:

⊢ λPλyλx remain′(P(x, y)) : (Ind ⊃ (Ind ⊃ Prop)) ⊃ (Ind ⊃ (Ind ⊃ Prop))

⊢ λyλx faithful′(x, y) : Ind ⊃ (Ind ⊃ Prop)

⊢ λyλx remain′(faithful′(x, y)) : Ind ⊃ (Ind ⊃ Prop)

⊢ d : Ind

⊢ λx remain′(faithful′(x, d)) : Ind ⊃ Prop

⊢ pierre′ : Ind

⊢ remain′(faithful′(pierre′, d)) : Prop

Figure 1: Full derivation of (10), with tecto-terms (above) and semantic terms (below) given separately for space considerations.
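
The semantic half of Figure 1 can be replayed with ordinary curried functions. The sketch below is only illustrative: semantic constants are encoded as tagged tuples, and P(x, y) is read as applying the curried term P first to y and then to x, an assumption chosen so that the result matches the terms shown in the figure.

```python
# Curried encodings of the semantic terms for fidèle and reste (tagged tuples stand in for constants).
fidele = lambda y: lambda x: ("faithful'", x, y)
reste = lambda P: lambda y: lambda x: ("remain'", P(y)(x))

step1 = reste(fidele)        # λyλx remain'(faithful'(x, y))
step2 = step1("d")           # λx remain'(faithful'(x, d))
step3 = step2("pierre'")     # remain'(faithful'(pierre', d))
```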

INFINITE GAMES 

FROM AN INTUITIONISTIC POINT OF VIEW 

Takako Nemoto 

Tohoku University 

Abstract. In this paper, we consider determinacy in Brouwerian intuitionistic mathematics. 

We give some examples of games such that the character of this mathematical setting—the 

lack of the law of excluded middle and the adoption of continuity principle—makes the 

behavior of determinacy drastically different from that on the classical setting. 

1 Introduction



Games on N^N have been of great interest in mathematical logic for a long time. On the one hand, determinacy of games has been used as a strong tool to investigate Baire space N^N or Cantor space {0, 1}^N. On the other hand, as is well known, determinacy statements are quite sensitive to the mathematical setting: for example, with the axiom of choice, full determinacy is inconsistent, and determinacy of analytic games is beyond ZFC.

The ultimate purpose of the author is to know how Baire space and Cantor space vary 

depending on settings other than usual ones. As the first step toward this, she has been 

investigating the promising tool, determinacy, on these settings. Among these are subsystems 

of second order arithmetic, much weaker ones than ZFC (cf. (Nemoto, Ould Med- 

Salem and Tanaka, 2007), (Nemoto, 2008)). 

This paper treats another setting, Brouwerian intuitionistic mathematics. It denies the law of excluded middle (LEM) and adopts the continuity principle, asserting that all functions from N^N to N^N or to N are continuous (for details, see Section 2). We give some examples of games which show that the continuity principle and the lack of LEM make the behavior of determinacy drastically different from that in the classical setting. To explicate the role of classical principles in determinacy, we also treat predeterminacy (a formalization of determinacy in intuitionistic mathematics) in classical mathematics.

2 Axioms of the intuitionistic mathematics 

In this section, we clarify the mathematical setting of this paper. 

The logical constants have their constructive meanings and the rules of the intuitionistic 

logic are employed. In particular, a disjunctive statement A ∨ B means that there exists a proof of A or one of B, and an existential statement ∃x ∈ V [A(x)] means that there exist an element a of V and a proof of A(a). A statement A is decidable if A ∨ ¬A holds. A set X ⊆ V is decidable if the statement a ∈ X is decidable for each a ∈ V.

An infinite sequence α of natural numbers α(0), α(1), α(2), ... may be determined by 

some finitely described algorithm, i.e., the n-th element α(n) of α is the result of the 

algorithm for input n. Sometimes, however, such an infinite sequence may be constructed 

step by step by choosing its elements one by one. In this case, the construction of the 



sequence is never finished: At any point in time, only finitely many elements have been 

chosen, and so we can only know a finite part of the sequence. 

The latter construction is not permitted in the constructive mathematics, and so this 

point divides the intuitionistic mathematics from the constructive mathematics. 

Note that every infinite sequence, even if it is given by an algorithm, can be regarded 

as a result of step-by-step construction. This is the reason we do not distinguish infinite sequences of natural numbers by their manner of construction.

Let N be the set of natural numbers. X^N is the set of infinite sequences from X. In particular, N^N is called Baire space and 2^N is called Cantor space. X^n is the set of sequences from X of length n and X


The strict fan theorem 

For a fan S and a decidable bar B in S, there is a bounded sub-bar B ′ ⊆ B in S. 

While König’s lemma and the strict fan theorem are equivalent in classical mathematics, they are not in intuitionistic mathematics. In fact, we can construct a so-called intuitionistic counterexample, i.e., a fan T which has sequences of every finite length such that we cannot prove that T has an infinite path, i.e., α : N → N such that ᾱn ∈ T for all n (where ᾱn is the initial segment of α of length n). Let i^n ∈ {0, 1}^n be such that i^n(k) = i for all k < n, and let i^N ∈ {0, 1}^N be such that i^N(n) = i for all n. Define T ⊆ {i^n : i < 2, n ∈ N} by

0^n ∈ T ↔ if there is k < n such that p_{k+i} = 9 for all i < 99, then the least such k is even,

1^n ∈ T ↔ if there is k < n such that p_{k+i} = 9 for all i < 99, then the least such k is odd,

where p_k denotes the k-th digit of the decimal expansion of π. We can easily see that T is a fan which has sequences of every finite length and that if T has an infinite path α, then α = 0^N or α = 1^N. Assume that T has an infinite path α. If α(0) = 0 (or 1), then we must have a proof of the statement “if there is an uninterrupted run of 99 nines in the decimal expansion of π, the least such run starts at an even (resp. odd) digit.” Up to now, we do not have a proof of either statement, and so we cannot assert that T has an infinite path. (If a proof is found in the future, we can find another so-called counterexample using another unsolved problem in a similar way.)
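
Membership of 0^n and 1^n in T is decidable for each fixed n, since only finitely many digits of π need to be inspected. The sketch below illustrates this under stated assumptions: p_0 is taken to be the first digit after the decimal point (the indexing convention is not fixed by the text), the digits are produced by Gibbons' spigot algorithm, and the run length is a parameter `run` so that a shorter run than 99 can be used to see the definition at work. The function names are hypothetical.

```python
from itertools import islice

def pi_decimals():
    """Yield the decimal digits of pi after the leading 3 (Gibbons' spigot algorithm)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    first = True
    while True:
        if 4 * q + r - t < n * t:
            if first:
                first = False          # skip the integer part
            else:
                yield n
            q, r, t, k, n, l = 10 * q, 10 * (r - n * t), t, k, (10 * (3 * q + r)) // t - 10 * n, l
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

def least_run_start(n, run):
    """Least k < n with p_k = ... = p_{k+run-1} = 9, or None if there is no such k."""
    digits = list(islice(pi_decimals(), n + run))
    for k in range(n):
        if all(d == 9 for d in digits[k:k + run]):
            return k
    return None

def in_T(i, n, run=99):
    """Decide whether the constant sequence i^n (i in {0, 1}) belongs to T."""
    k = least_run_start(n, run)
    return k is None or k % 2 == i
```

For every n that can actually be computed with `run=99`, both 0^n and 1^n are in T; what cannot be decided in this way is which of 0^N and 1^N is a path through T.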

3 Determinacy in intuitionistic mathematics 

In this section, we introduce the notion of determinacy and variants. 

For A ⊆ N^N, the game G(A) in N^N is defined as follows. Two players, called players I and II, starting with player I, alternately choose a natural number to construct α ∈ N^N. 

Player I wins if and only if the resulting play α is in A. Player II wins if and only if player 

I does not win. A strategy for player I (resp. II) is a function which assigns a natural 

number to each even-(resp. odd-)length sequence in N


Veldman (2004) gave three formalizations of determinacy in intuitionistic mathematics. $G(A)$ is strongly determinate if, in $G(A)$, either player I or player II has a winning strategy. This is the simplest formalization, but almost no game is strongly determinate.
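To make the notions of strategy and play concrete, here is a small Python sketch. It models a strategy as a function from finite sequences (tuples) to natural numbers and builds a finite initial segment of the resulting play. Checking that a finite sequence is consistent with a strategy for player II only approximates the relation α ∈II τ on initial segments. The function names are hypothetical.

```python
def play(sigma, tau, length):
    """First `length` moves of the play where I follows sigma and II follows tau."""
    moves = ()
    while len(moves) < length:
        # Player I moves at even-length positions, player II at odd-length ones.
        strategy = sigma if len(moves) % 2 == 0 else tau
        moves += (strategy(moves),)
    return moves


def follows_II(moves, tau):
    """Check that every move of player II in `moves` is the one prescribed by tau."""
    return all(moves[k] == tau(moves[:k]) for k in range(1, len(moves), 2))
```

For instance, if player I always plays the current length and player II always plays one more than the previous move, `play` produces the sequence 0, 1, 2, 3, and so on.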

$G(A)$ is determinate from the viewpoint of player I if the following holds: if for every strategy $\tau$ of player II there is $\alpha \in_{II} \tau$ with $\alpha \in A$, then player I has a winning strategy in $G(A)$. This statement corresponds to the classical statement “if player II has no winning strategy, then player I has one in $G(A)$,” which is classically equivalent to “$G(A)$ is determinate.”

To describe the last formalization, we need a new notion. An anti-strategy for player I in $G(A)$ is a function $\eta$ which assigns some $\alpha \in_{II} \tau$ to each strategy $\tau$ for player II in $G(A)$. An anti-strategy $\eta$ for player I secures $A$ if, for any strategy $\tau$ for player II, $\eta(\tau) \in A$. $G(A)$ is predeterminate from the viewpoint of player I if the following holds: if he has an anti-strategy securing $A$, then he has a winning strategy in $G(A)$.

Note that $G(A)$ is predeterminate from the viewpoint of player I whenever it is determinate from his viewpoint.
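The three formalizations can be compared side by side as formulas. The display below is one possible rendering, reading $\alpha \in_{I} \sigma$ (resp. $\alpha \in_{II} \tau$) as "α is a play in which player I follows σ" (resp. "player II follows τ"). The labels are added here for orientation only.

```latex
\begin{align*}
&\text{(strong determinacy)} &&
  \exists\sigma\,\forall\alpha\,(\alpha \in_{I} \sigma \rightarrow \alpha \in A)
  \;\lor\;
  \exists\tau\,\forall\alpha\,(\alpha \in_{II} \tau \rightarrow \alpha \notin A),\\
&\text{(determinacy for player I)} &&
  \forall\tau\,\exists\alpha\,(\alpha \in_{II} \tau \wedge \alpha \in A)
  \;\rightarrow\;
  \exists\sigma\,\forall\alpha\,(\alpha \in_{I} \sigma \rightarrow \alpha \in A),\\
&\text{(predeterminacy for player I)} &&
  \exists\eta\,\forall\tau\,(\eta(\tau) \in_{II} \tau \wedge \eta(\tau) \in A)
  \;\rightarrow\;
  \exists\sigma\,\forall\alpha\,(\alpha \in_{I} \sigma \rightarrow \alpha \in A).
\end{align*}
```

The hypothesis of the third line implies the hypothesis of the second (take $\alpha = \eta(\tau)$), which is why determinacy from the viewpoint of player I implies predeterminacy.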

Moreover, in a game $G(X)$ in $\mathbb{N}^{\mathbb{N}}$ (or in a spread $[S]$), the second axiom of continuous choice yields the converse, i.e., predeterminacy implies determinacy. Indeed, a strategy for a player can be regarded as a function from $\mathbb{N}$ to $\mathbb{N}$, and if for every strategy $\tau$ for player II there is $\alpha \in_{II} \tau$ with $\alpha \in X$, then by the second axiom of continuous choice an anti-strategy for player I securing $X$ is given by a code $\eta$ of a continuous function.

The intuitionistic determinacy theorem (Veldman, 2004, Theorem 3.5) If $[S]$ is a II-finitary branching spread, i.e., $S$ is a spread-law such that, for every odd-length $s \in S$, there are at most finitely many $n$ with $s * \langle n \rangle \in S$, then $G_{[S]}(A)$ is predeterminate from the viewpoint of player I for every $A \subseteq [S]$.

In particular, if $A \subseteq \{0,1\}^{\mathbb{N}}$, then $G_{\{0,1\}^{\mathbb{N}}}(A)$ is predeterminate from the viewpoint of player I. Veldman (2004) also gave $A \subseteq \mathbb{N}^{\mathbb{N}}$ such that $G(A)$ is not predeterminate from the viewpoint of player I.

Remark The notion of predeterminacy can be formalized from the viewpoint of player II 

and we can obtain similar results to the last theorem. 

4 Variations of games and predeterminacy 

In this section, we consider other variations of games in intuitionistic mathematics.

For these games, we can define the three formalizations of determinacy in the same way. 

4.1 2-length games in $\{0,1\}^{\mathbb{N}} \times \{0,1\}$

This subsection treats one of the simplest cases in which fewer strategies are allowed than in the classical context. $\{0,1\}^{\mathbb{N}} \times \{0,1\}$ denotes the topological product of Cantor space and the discrete space $\{0,1\}$.

For given $A \subseteq \{0,1\}^{\mathbb{N}} \times \{0,1\}$, the game $G_1(A)$ is defined as follows:

• Player I chooses $\alpha \in \{0,1\}^{\mathbb{N}}$.

• Player II chooses $i \in \{0,1\}$.

• Player I wins if $(\alpha, i) \in A$ and player II wins if player I does not win.



Although $\{0,1\}^{\mathbb{N}} \times \{0,1\}$ is homeomorphic to Cantor space, we must be sensitive to the order type of the indexing set for the sequences.

In this game, a strategy for player I is his initial move $\alpha$, and a strategy for player II is a function from $\{0,1\}^{\mathbb{N}}$ to $\{0,1\}$. The continuity principle forces all the strategies for player II to be continuous, and so we may regard a strategy $\tau$ for player II as a code

of a continuous function such that (τ|α)(0) ∈ {0, 1} for all α ∈ {0, 1} N . B = {s ∈ 

{0, 1} 0} is a decidable bar in the fan {0, 1} N . Then, by the strict fan theorem, 

there is a bounded sub-bar $B' \subseteq B$. Take $n$ such that $\mathrm{lh}(s) < n$ for every $s \in B'$. Then $\{0,1\}^n$ is also a bar in $\{0,1\}^{\mathbb{N}}$, and, for every $\alpha, \beta \in \{0,1\}^{\mathbb{N}}$, $\bar{\alpha}n = \bar{\beta}n$ implies $(\tau|\alpha)(0) = (\tau|\beta)(0)$. Thus we can regard $\tau$ as a function from $\{0,1\}^{n_\tau}$ to $\{0,1\}$ for some $n_\tau$, which can be coded by a natural number. Because an anti-strategy $\eta$ for player I is a function from the set of all strategies for player II to the set of plays in this game, it can be regarded as a function from $\mathbb{N}$ with the discrete topology to $\{0,1\}^{\mathbb{N}} \times \{0,1\}$.
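The remark that such a strategy can be coded by a natural number can be made explicit. The Python sketch below shows one possible coding, not one fixed by the text: the truth table of a function on bit strings of length n is packed into an integer, which is then combined with n by the Cantor pairing function. The names `encode` and `decode` are hypothetical.

```python
from itertools import product
from math import isqrt


def pair(a, b):
    """Cantor pairing of two natural numbers."""
    return (a + b) * (a + b + 1) // 2 + b


def unpair(z):
    """Inverse of `pair`."""
    w = (isqrt(8 * z + 1) - 1) // 2
    b = z - w * (w + 1) // 2
    return w - b, b


def encode(n, tau):
    """Code a function tau from bit tuples of length n to {0, 1} as one natural number."""
    table = 0
    for idx, s in enumerate(product((0, 1), repeat=n)):
        table |= tau(s) << idx          # bit idx of `table` stores tau(s)
    return pair(n, table)


def decode(code):
    """Recover (n, tau) from a code produced by `encode`."""
    n, table = unpair(code)
    index = {s: idx for idx, s in enumerate(product((0, 1), repeat=n))}
    return n, (lambda s: (table >> index[tuple(s)]) & 1)
```

Since every strategy for player II then corresponds to a natural number, an anti-strategy becomes a function defined on (a subset of) the natural numbers, as stated above.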

The following examples show that games for even simple sets, such as open or closed sets, need not be predeterminate from the viewpoint of player I.

Example 1 An open game $G_1(A)$ which is not predeterminate from the viewpoint of player I: Define $A_i = \{0^n * \langle 1, i \rangle : n \in \mathbb{N}\}$ and $A = \{(\alpha, i) : \exists n[\bar{\alpha}n \in A_i]\}$. Then $A$ is open. Let $\eta$ be the anti-strategy for player I which assigns $(0^{n_\tau} * \langle 1, \tau(0^{n_\tau}) \rangle * 0^{\mathbb{N}}, \tau(0^{n_\tau}))$ to each strategy $\tau$ for player II. Then $\eta(\tau) \in A$ for each strategy $\tau$ for player II, and so $\eta$ is an anti-strategy for player I securing $A$. On the other hand, it is clear that player I has no winning strategy in $G_1(A)$.
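The anti-strategy of Example 1 is simple enough to write down as a program. In the Python sketch below, a strategy for player II is given as a pair of a number n_τ and a function on bit tuples of length n_τ, and an infinite sequence is represented by a finite prefix that is understood to be continued with zeros. This representation and the function names are assumptions made for the illustration.

```python
def eta(n_tau, tau):
    """Anti-strategy of Example 1.

    tau depends only on the first n_tau bits of alpha. Returns (prefix, i),
    where alpha is `prefix` followed by zeros and i is player II's reply.
    """
    zeros = (0,) * n_tau
    i = tau(zeros)
    return zeros + (1, i), i


def alpha_bit(prefix, k):
    """k-th bit of the infinite sequence prefix * 0^N."""
    return prefix[k] if k < len(prefix) else 0


def in_A(prefix, i):
    """(alpha, i) is in A iff alpha starts with 0^n * <1, i> for some n."""
    k = 0
    while k < len(prefix) and prefix[k] == 0:
        k += 1
    # k is now the position of the first 1 in alpha, if there is one.
    return k < len(prefix) and alpha_bit(prefix, k + 1) == i
```

The sketch also shows why player I has no winning strategy: a fixed α has an initial segment in at most one of A_0 and A_1, so `in_A(prefix, 0)` and `in_A(prefix, 1)` never hold together, and player II can reply with the other index.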

Example 2 A closed game $G_1(B)$ which is not predeterminate from the viewpoint of player I: Let $T$ be an intuitionistic counterexample to König’s lemma, i.e., an unbounded binary tree without infinite paths. Let $T_i = \{t * i^n \mid t \in T \wedge n \in \mathbb{N}\}$. Then $B = \{(\alpha, i) \mid \forall n[\bar{\alpha}n \in T_i]\}$ is a closed set. If player I had a winning strategy $\alpha$ in $G_1(B)$, $\alpha$ would be an infinite path of $T$. Thus player I cannot have a winning strategy in $G_1(B)$. On the other hand, player I has an anti-strategy securing $B$. Fix an enumeration of $T$ and let $t_n$ be the minimum $s \in T$ such that $\mathrm{lh}(s) = n$ with respect to this enumeration. Let $\eta$ be the anti-strategy for player I which assigns $(t_n * (\tau(t_n))^{\mathbb{N}}, \tau(t_n))$ to each strategy $\tau : \{0,1\}^n \to \{0,1\}$ for player II. Clearly $\eta$ secures $B$.

4.2 ω + 1-length games in $\{0,1\}^{\mathbb{N}} \times \{0,1\}$

In this subsection, we consider another kind of game in $\{0,1\}^{\mathbb{N}} \times \{0,1\}$.

For given $A \subseteq \{0,1\}^{\mathbb{N}} \times \{0,1\}$, the game $G_2(A)$ is defined as follows.

• Player I and player II alternately choose $i \in \{0,1\}$ to form $\alpha \in \{0,1\}^{\mathbb{N}}$.

• After $\alpha$ is formed, player I chooses $i \in \{0,1\}$.

• Player I wins $G_2(A)$ if and only if $(\alpha, i) \in A$.

In this game, a strategy $\sigma$ for player I is a pair $(\sigma_0, \sigma_1)$ of functions $\sigma_0 : \bigcup_{n \in \mathbb{N}} \{0,1\}^{2n} \to \{0,1\}$ and $\sigma_1 : \{0,1\}^{\mathbb{N}} \to \{0,1\}$. By the strict fan theorem, we can regard $\sigma_1$, as in the last subsection, as a function from $\{0,1\}^n$ to $\{0,1\}$ for some $n \in \mathbb{N}$.

A strategy for player II is a function $\tau : \bigcup_{n \in \mathbb{N}} \{0,1\}^{2n+1} \to \{0,1\}$, which can be regarded as an element of $\{0,1\}^{\mathbb{N}}$. Then an anti-strategy $\eta$ for player I is a function from



$\{0,1\}^{\mathbb{N}}$ to $\{0,1\}^{\mathbb{N}} \times \{0,1\}$, which can be regarded as a pair $(\eta_0, \eta_1)$ of codes of continuous functions such that, for any strategy $\tau$ for player II, $(\eta_0|\tau, (\eta_1|\tau)(0)) \in_{II} \tau$. By the strict fan theorem, there is $n$ such that for any strategies $\tau$ and $\tau'$, $\bar{\tau}n = \overline{\tau'}n$ implies $(\eta_1|\tau)(0) = (\eta_1|\tau')(0)$, and so we can regard $\eta_1$ as a function from $\{0,1\}^n$ to $\{0,1\}$.

Theorem 1 For any $C \subseteq \{0,1\}^{\mathbb{N}} \times \{0,1\}$, $G_2(C)$ is predeterminate from the viewpoint of player I.

Proof. For $i \in \{0,1\}$, set $C_i = \{\alpha : (\alpha, i) \in C\}$. Assume that $\eta = (\eta_0, \eta_1)$ is an anti-strategy for player I securing $C$ and that $\eta_1$ can be regarded as a function from $\{0,1\}^n$ to $\{0,1\}$ for some $n$. Note that, in $G_{\{0,1\}^{\mathbb{N}}}(C_0 \cup C_1)$, $\eta_0$ is an anti-strategy for player I securing $C_0 \cup C_1$. Let $\sigma_0$ be the winning strategy for player I in $G_{\{0,1\}^{\mathbb{N}}}(C_0 \cup C_1)$ constructed in the proof of the intuitionistic determinacy theorem. Set $P_{\sigma_0} = \{\alpha : \alpha \in_I \sigma_0\}$. Note that $P_{\sigma_0}$ is a spread. By the proof of the intuitionistic determinacy theorem, for any $\alpha \in P_{\sigma_0}$, there exists a strategy $\delta$ for player II with $\eta_0|\delta = \alpha$. By the second axiom of continuous choice, there exists a code $\zeta$ of a continuous function such that, for any $\alpha \in P_{\sigma_0}$, $\zeta|\alpha$ is a strategy for player II with $\eta_0|(\zeta|\alpha) = \alpha$. By the strict fan theorem, there exists a natural number $N$ such that, for any $\alpha$ and $\beta$ in $P_{\sigma_0}$, $\bar{\alpha}N = \bar{\beta}N$ implies $\overline{(\zeta|\alpha)}n = \overline{(\zeta|\beta)}n$. Then we can define $\sigma_1 : P_{\sigma_0} \to \{0,1\}$ by $\sigma_1(\alpha) = \eta_1(\overline{(\zeta|\alpha)}n)$, since $\sigma_1(\alpha)$ is determined by $\bar{\alpha}N$. Define a new strategy $\sigma = (\sigma_0, \sigma_1)$ for player I in $G_2(C)$. Then, for any $(\alpha, i) \in_I \sigma$, the strategy $\delta = \zeta|\alpha$ for player II satisfies $(\alpha, i) = (\eta_0|\delta, (\eta_1|\delta)(0))$, and so $\sigma$ is a winning strategy for player I in $G_2(C)$. □

Comparing this theorem with the examples in the last subsection, we can conclude that predeterminacy depends on how the players construct the sequence rather than on which sequence they construct.

4.3 ω + 2-length games in $\{0,1\}^{\mathbb{N}} \times \{0,1\}^2$

Next we consider slightly longer games.

For a given set $A \subseteq \{0,1\}^{\mathbb{N}} \times \{0,1\}^2$, consider the following game $G_3(A)$.

• First, player I and player II alternately choose $n \in \{0,1\}$ to form $\alpha \in \{0,1\}^{\mathbb{N}}$.

• After $\alpha$ is formed, player I chooses $i \in \{0,1\}$ and player II chooses $j \in \{0,1\}$.

• Player I wins if $(\alpha, \langle i, j \rangle) \in A$ and player II wins if player I does not win.

Similarly to the previous subsection, a strategy $\sigma$ for player I is a pair $(\sigma_0, \sigma_1)$, where $\sigma_0$ is a function from $\bigcup_{n \in \mathbb{N}} \{0,1\}^{2n}$ to $\{0,1\}$ and $\sigma_1$ is a function from $\{0,1\}^{\mathbb{N}}$ to $\{0,1\}$. We can regard $\sigma_1$ as a function from $\{0,1\}^n$ to $\{0,1\}$ for some $n \in \mathbb{N}$.

A strategy $\tau$ for player II is a pair $(\tau_0, \tau_1)$, where $\tau_0$ is a function from $\bigcup_{n \in \mathbb{N}} \{0,1\}^{2n+1}$ to $\{0,1\}$ and $\tau_1$ is a function from $\{0,1\}^{\mathbb{N}} \times \{0,1\}$ to $\{0,1\}$. Note that since $\tau_1$ is continuous, its restriction $\tau_{1,i}$ to $\{0,1\}^{\mathbb{N}} \times \{i\}$ is also continuous, and so we can regard $\tau_1$ as a pair $(\tau_{1,0}, \tau_{1,1})$ of functions from $\{0,1\}^{n_i}$ to $\{0,1\}$ for some $n_i$.

Hence, the set of strategies for player II can be regarded as $\{0,1\}^{\mathbb{N}} \times \mathbb{N}$, and so an anti-strategy for player I can be regarded as a function $\eta$ from $\{0,1\}^{\mathbb{N}} \times \mathbb{N}$ to $\{0,1\}^{\mathbb{N}} \times \{0,1\}^2$ such that $\eta(\tau) \in_{II} \tau$ for each strategy $\tau$ for player II.

As in the case of G1(X), we have the following examples. For any s ∈ {0, 1}


Example 3 Recall Ai defined in Example 1. Then the open game G3(A ′ ) defined by 

A ′ = {(α, 〈i, j〉) : ∃n[(αn) ′ ∈ Aj]} is not predeterminate from the viewpoint of player I. 

Example 4 Recall Ti defined in Example 2. Then the closed game G3(B ′ ) defined by 

B ′ = {(α, 〈i, j〉) : ∀n(αn) ′ ∈ Tj} is not predeterminate from the viewpoint of player I. 

5 Predeterminacy in classical mathematics

In this section, we consider predeterminacy in classical mathematics in order to investigate the role of classical principles in predeterminacy. Note that all the definitions and statements in this section are made in classical mathematics, including the countable axiom of choice.

Recall that, in intuitionistic mathematics, an anti-strategy is a function $\eta$ such that $\eta(\tau) \in_{II} \tau$ for each strategy $\tau$ for player II. We translate this definition into classical mathematics, noting that every function on $\mathbb{N}^{\mathbb{N}}$ is continuous in intuitionistic mathematics:

Let $G(X)$ be any of the games treated in the previous sections. An anti-strategy for player I in $G(X)$ is a continuous function which assigns some $\alpha \in_{II} \tau$ to every continuous strategy $\tau$ for player II in $G(X)$. An anti-strategy $\eta$ for player I in $G(X)$ secures $X$ if $\eta(\tau) \in X$ for all continuous strategies $\tau$ for player II. $G(X)$ is predeterminate from the viewpoint of player I if the following holds:

if player I has an anti-strategy $\eta$ securing $X$, then player I has a winning strategy in $G(X)$.

Note that the ordinary determinacy statement can be read as “if there is a function $\eta$ such that $\eta(\tau) \in_{II} \tau$ and $\eta(\tau) \in X$ for all strategies $\tau$ for player II, then player I has a winning strategy in $G(X)$.”

For $X \subseteq \mathbb{N}^{\mathbb{N}}$, strategies for the players in the game $G(X)$ can be regarded as functions from $\mathbb{N}$ to $\mathbb{N}$, and so all strategies are continuous. Therefore the condition “continuous” on strategies has no effect in the games $G(X)$, but it does in the games $G_1(X)$, $G_2(X)$ and $G_3(X)$. Moreover, the continuity in the definition of anti-strategy is essential in the following discussion.

As mentioned in (Veldman, 2004, 1.1), the intuitionistic determinacy theorem also holds in classical mathematics. In particular, for all $A \subseteq \{0,1\}^{\mathbb{N}}$, $G_{\{0,1\}^{\mathbb{N}}}(A)$ is predeterminate from the viewpoint of player I in classical mathematics.

Now we consider, in classical mathematics, the predeterminacy of the games $G_1(X)$, $G_2(X)$ and $G_3(X)$ defined in the last section. By König’s lemma, the classical counterpart of the strict fan theorem, a continuous function $\{0,1\}^{\mathbb{N}} \to \{0,1\}$ or $\{0,1\}^{\mathbb{N}} \to \{0,1\}^{\mathbb{N}}$ is, in classical mathematics as well, given by its code $\eta$ as defined in Section 2. In particular, a strategy for player II in $G_1(A)$ can be seen as a function $\tau : \{0,1\}^n \to \{0,1\}$ for some $n$, and an anti-strategy for player I in $G_2(A)$ can be seen as a pair $(\eta_0, \eta_1)$ of a code $\eta_0$ of a continuous function and $\eta_1 : \{0,1\}^m \to \{0,1\}$ for some $m$.

The game $G_1(A)$ is not predeterminate from the viewpoint of player I, where $A$ is the set defined in Example 1. For closed games, the situation differs: whereas Example 2 is a closed game which is not predeterminate from the viewpoint of player I in intuitionistic mathematics, we will show that there is no such closed game in classical mathematics.


For X ⊆ {0, 1} N × {0, 1} and s ∈ {0, 1}

6 Further problems 


Predeterminacy of closed games $G_3(X)$ in classical mathematics The first problem the author is interested in is whether the closed games $G_3(X)$ are predeterminate in classical mathematics. The author expects it to be solved by analyzing the properties of continuous functions on Cantor space.

Classical investigation of predeterminacy We can consider various formalizations of predeterminacy in classical mathematics other than the one defined in Section 5, e.g.,

If player I has an anti-strategy such that $\eta(\tau) \in A$ for each continuous strategy $\tau$ for player II, then player I has a continuous winning strategy in $G(A)$.

Note that the requirement that the winning strategy be continuous is newly added. Again, in games $G(X)$ in $\mathbb{N}^{\mathbb{N}}$, this modification has no effect. However, we can easily find $X \subseteq \{0,1\}^{\mathbb{N}}$ which is not predeterminate in this sense but which is predeterminate in the sense of Section 5. The author expects that the investigation of these variations will clarify how continuity constrains functions on Baire space or Cantor space.

Constructive reverse mathematical analysis of predeterminacy Constructive reverse mathematics is the study of the strength of mathematical statements, measured by non-constructive principles, using constructive mathematics as a base theory. Constructive mathematics is mathematics which is based on intuitionistic logic but which does not adopt the axioms introduced in Section 2. Therefore it is included both in classical mathematics and in intuitionistic mathematics.

(1) The role of the second axiom of continuous choice for predeterminacy Under the second axiom of continuous choice, predeterminacy implies determinacy. This implication needs only a fragment of the second axiom of continuous choice, and it is natural to ask exactly how strong a fragment is required. If we measure the strength of fragments by the complexity of $R$ in the axiom, the difficulty lies in reducing general formulas of the form $\forall\alpha\,\exists\beta\,R(\alpha, \beta)$ to the form $\forall\tau\,\exists\sigma\,\forall\alpha\,(\alpha \in_I \sigma \wedge \alpha \in_{II} \tau \to R'(\alpha))$.

(2) Equivalences between predeterminacy and intuitionistic axioms Veldman (200x) proposed a system of intuitionistic second order arithmetic and proved that the predeterminacy of open subsets of II-finitary branching spreads in N is equivalent to the strict fan theorem over the system BIM, which corresponds to RCA0, a popular classical base theory in the field called Friedman-Simpson reverse mathematics (cf. (Simpson, 1999)). The author of the present paper is now looking for similar equivalences beyond open sets. The first task in this direction is to find a suitable intuitionistic axiom to compare with. One candidate is the almost-fan theorem proposed in (Veldman, 2001).

(3) The role of LEM for predeterminacy In the proof of Theorem 2, we use the law of excluded middle. It seems impossible to prove the theorem without this classical law, because we have $B$ of Example 2 in intuitionistic mathematics. The next natural question is what fragment of classical logic (such as the law of excluded middle or double negation elimination) is necessary and sufficient for determinacy or predeterminacy statements. (Akama, Berardi, Hayashi and Kohlenbach, 2004) discovered a hierarchy consisting of such fragments over Heyting arithmetic HA, which is the constructive counterpart of Peano arithmetic. The author of the present paper is trying to measure predeterminacy and determinacy statements along this hierarchy.

(4) Equivalences between predeterminacy and classical axioms Since we treat predeterminacy also in classical mathematics, it is natural to consider a Friedman-Simpson style reverse mathematical study of predeterminacy. Using constructive mathematics as a base theory, we can make a finer reverse mathematical study of predeterminacy.


Parts of this paper were written as the final assignment of the Master Class 2006/2007 in Logic at the Mathematical Research Institute, the Netherlands. The author would like to express her gratitude to her supervisor, Dr. Wim Veldman, who introduced her to the attractions of intuitionistic mathematics.

References 


Akama, Y., Berardi, S., Hayashi, S. and Kohlenbach, U. (2004). An arithmetical hierarchy of the law of excluded middle and related principles, in H. Ganzinger (ed.), Proceedings of the Nineteenth Annual IEEE Symp. on Logic in Computer Science, LICS 2004, IEEE Computer Society Press, pp. 192–201.

Nemoto, T. (2008). Determinacy of Wadge classes and subsystems of second order arithmetic. Accepted for publication in Math. Log. Q., available at
http://www.math.tohoku.ac.jp/˜sa4m20/wadge.pdf.

Nemoto, T., Ould MedSalem, M. and Tanaka, K. (2007). Infinite games in the Cantor space and subsystems of second order arithmetic, Math. Log. Q. 53: 226–236.

Simpson, S. G. (1999). Subsystems of second order arithmetic, Springer.

Veldman, W. (2001). Almost the fan theorem, Technical report, Department of Mathematics, University of Nijmegen.

Veldman, W. (2004). The problem of the determinacy of infinite games from an intuitionistic point of view, Technical report, Department of Mathematics, University of Nijmegen. To appear in the proceedings of Logic, Games and Philosophy: Foundational Perspectives, Prague 2004.

Veldman, W. (200x). Brouwer’s fan theorem as an axiom and as a contrast to Kleene’s alternative. Preprint.



LANGUAGE TECHNOLOGIES FOR INSTRUCTIONAL RESOURCES IN BULGARIAN


University of Sofia 

Abstract. This paper describes a system that applies language technologies to instructional materials in order to provide computer-aided design of test items. The approach employs lexical and syntactic information obtained with techniques such as POS tagging, constituency parsing and term extraction. The system compiles a list of central terms for the instructional materials, creates drafts of fill-in-the-blank questions and suggests possible distractors. The experiment is carried out on Bulgarian high-school textbooks in geography, biology and history.

1 Introduction and related work 

Asking questions is a way to keep students' attention in class and to verify their understanding. Depending on the type of education and the goal of the teacher, questions can be asked in different forms: orally, as a short written examination, in the manner of a game, etc. One common technique is to ask multiple-choice questions, which has become even more popular in recent years because it is also applicable to e-learning. However, designing thousands of tests is a time- and effort-consuming educational activity. All questions in a test should be carefully tuned to the target group of test-takers and should neither underestimate nor overestimate their knowledge. Hence the teaching experts who prepare the tests must have much broader knowledge of the field than the content explicitly included in the particular textbook, and they have to tune the tests to the knowledge of the test-takers. One of the most difficult tasks in producing test items is to decide whether a question really has its answer in the instructional materials.

These difficulties gave rise to a relatively new research area dealing with support for the generation of test items and the suggestion of answers and distractors. Generation of multiple-choice questions with the help of NLP technologies is an active area in which different text-processing tools are used to transform the facts in the instructional materials into questions that can be used for student assessment. One of the most interesting approaches in this respect is presented by (Mitkov, Ha and Karamis, 2006), who apply language technologies (LT) to the generation of test items for English, focusing on the automatic choice of distractors. They report that the process of test development becomes about 6-10 times faster than manual test elicitation. Their approach is not domain specific and can be applied to any area. Other authors actively working in the area are (Aldabe, De Lacalle and Maritxalar, 2007), who focus on different types of question models, with application primarily in language learning. We are not aware of any related work on this activity for learning materials in Bulgarian except for the previous work of the author (Nikolova, 2007). Our efforts are thus strongly inspired by the growing interest in this field, which is due to its significant practical importance. On the other hand, we are motivated and encouraged by the availability of sophisticated LT for the Bulgarian language, which enable relatively complex text preprocessing, so the automatic acquisition of learning objects from raw texts does not start from scratch.

This article presents the idea of the author's master's thesis, which is still work in progress. The aim is to develop a workbench that supports test designers by applying language technologies to the instructional materials. The task has three aspects: (1) suggestion of key terms for (2) question generation and (3) distractor suggestion. For our purpose the text is preprocessed by a number of already available LT modules, and lexical and syntactic features are extracted and kept in a metadata format. Those features are used later for the generation of the draft learning objects. The experiment described in the article has been applied to three different domains: Geography, Biology and History. The materials are taken from textbooks for the 9th, 10th and 11th grade respectively.

The remainder of this article is organised as follows: we first sketch the general architecture of the system in section 2; in section 3 we describe the data processing; section 4 explains in detail the experiment done so far; section 5 concerns the evaluation at the current stage of the experiment; section 6 presents the conclusion and issues for future work.

Figure 1: Workbench supporting the development of multiple-choice test items. 

2 Workbench description 


The system suggests draft learning objects to the test designers in order to help them during the preparation of test items. As shown in Fig. 1, the instructional materials are supplied by the test maker. They are preprocessed and two main data sets are created: (a) a list of key terms (terms central to the supplied text), whose construction is explained in section 4.1, and (b) lexical and syntactic information about the supplied text, which is kept in a metadata format. Then the user may obtain all possible questions generated from the supplied material or only the ones related to a certain key term she is interested in. If the system does not find appropriate sentences which contain the term and match its internal question templates (explained in section 4.2), it returns a list of pointers to the text, containing the local context in which the term appears, and a list of related concepts, generated by the same model as the distractors.

3 Data processing 


Our task is to support test makers during the process of building educational resources, namely test questions and a vocabulary of important concepts for the domain. We do this by applying language technologies to the raw instructional materials and obtaining linguistic resources which are loaded into a workbench that helps the test designers during their work. The processing passes through several phases, as shown in Fig. 2.

Figure 2: Data processing. 

The instructional material is taken in plain text format and is first parsed with an NP extractor, which yields nouns and noun phrases in order to make a list of potential key terms to be suggested to the test designers. At the same time as the extracted terms are marked, an inverted index is produced. It contains a list of the extracted NPs (nouns and noun phrases) and their corresponding absolute positions in the text. A threshold for the importance of the extracted terms is set and all NPs with frequency higher than the threshold are included in the list of key terms. In addition, all the NPs that contain a noun which is a key term are also included in the key terms list.
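The inverted index and the frequency threshold described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the text is already tokenised, NPs are given as tuples of tokens, and English placeholder words stand in for the Bulgarian data. The function names are hypothetical and do not come from the system itself.

```python
from collections import defaultdict


def build_inverted_index(tokens, noun_phrases):
    """Map each NP (a tuple of tokens) to the absolute positions where it starts."""
    index = defaultdict(list)
    for np in noun_phrases:
        size = len(np)
        for pos in range(len(tokens) - size + 1):
            if tuple(tokens[pos:pos + size]) == np:
                index[np].append(pos)
    return dict(index)


def frequent_terms(index, threshold):
    """NPs whose number of occurrences is higher than the threshold."""
    return {np for np, positions in index.items() if len(positions) > threshold}
```

Keeping positions rather than bare counts lets the workbench point back to the local context of a term, which is what the system returns when no question can be generated.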

During the next phase the raw text is tagged for POS categories. In our case we found it practical to use the SVMTool of (Giménez and Márquez, 2004), which was trained on the newspaper part of BulTreeBank¹. The proper names recognised by the tagger were added to the list of key terms, and then the output was processed with Dan Bikel's multilingual statistical parsing engine (Bikel, 2004), which is an implementation and extension of the Collins parser, referred to below as (Collins, 1999). The parsing model

1 HPSG-based Syntactic Treebank of Bulgarian (BulTreeBank), http://bultreebank.org/ 


was trained on BulTreeBank. All the syntactic and lexical information obtained in these phases is kept in a metadata format and used later to produce draft learning objects (key terms, test items), which are suggested to the test designers.

4 The experiment 

4.1 Key terms suggestion 


We build our approach on the understanding that questions given to the learner concern terms which are central to the domain. These are the terms which serve as a basis for the learned material and represent a specific domain vocabulary. Here those terms are referred to as key terms. Although verbs might also qualify as good key terms in some domains, in this experiment we pay attention only to nouns and noun phrases as potential key terms. They were extracted with the classic frequency-based approach to automatic term extraction. In order to overcome the problem of inflection in the language, the raw texts were first lemmatized and then parsed with the NP extractor Morena. Once we had obtained a list of nouns (LN) and a list of noun phrases (LNP), we had to rank them in order to extract only the most important ones, which are the focus of our approach and of the users' queries. We applied two different techniques for measuring term importance over LN: simple frequency counting and TF-IDF. As also reported by (Mitkov et al., 2006), we noticed that TF-IDF produces worse results, as it tends to give a low score to frequently used words (for example, the word for “economy”) which are actually quite important in instructional materials (it is common to repeat the same information to the learners in order to help them remember it better). At the same time, sorting the list of nouns by their frequencies, after removing the stop words, gave us quite satisfactory results.

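Word frequency f_i    Number of words w_f with frequency f_i    Relation
55                    1                                         w_f ≤ f_i
46                    1                                         w_f ≤ f_i
22                    6                                         w_f ≤ f_i
20                    1                                         w_f ≤ f_i
18                    1                                         w_f ≤ f_i
16                    1                                         w_f ≤ f_i
14                    1                                         w_f ≤ f_i
12                    5                                         w_f ≤ f_i
10                    5                                         w_f ≤ f_i
8                     6                                         w_f ≤ f_i
6                     8                                         w_f ≥ f_i
4                     44                                        w_f ≥ f_i
2                     174                                       w_f ≥ f_i

Table 1: Word frequency distribution in a text of about 1000 words.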

To set the threshold between important and less important terms, in previous experiments we examined test items prepared manually by test designers for the same material as the corpora we are processing. The test items were parsed with an NP extractor. We checked the popularity (frequency) in the whole corpus of the NPs extracted from the test items, and the lowest value was accepted as the threshold. After repeating the same procedure for corpora from different domains, we noticed that the importance border lies near the term frequency which equals the number of words having that frequency. For example, for a comparatively short text we have the distribution shown in Table 1, where the threshold is set to frequency f = 7.
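One possible reading of this rule can be checked against Table 1 with a short Python sketch. It scans the frequencies from the highest down, finds the first frequency f for which at least f words have that frequency, and takes the midpoint between it and the previous frequency. Taking the midpoint is an assumption made here so that the table yields f = 7; the text does not say how the value was obtained, and the function name is hypothetical.

```python
# Frequency -> number of words with that frequency (values copied from Table 1).
TABLE_1 = {55: 1, 46: 1, 22: 6, 20: 1, 18: 1, 16: 1, 14: 1,
           12: 5, 10: 5, 8: 6, 6: 8, 4: 44, 2: 174}


def crossover_threshold(distribution):
    """Estimate the frequency at which the word count overtakes the frequency."""
    previous = None
    for f in sorted(distribution, reverse=True):
        if distribution[f] >= f:
            # First frequency (from the top) shared by at least f words.
            return f if previous is None else (previous + f) / 2
        previous = f
    return None
```

On Table 1 the crossover happens between the rows for frequencies 8 and 6, which gives 7.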

Once the threshold is adjusted, we consider all the terms above it as key terms which should be suggested to the test makers. We then add to the list of key terms all NPs which contain key terms. For example, along with the term (economy) from the materials in geography we add the following NPs, listed here by their English glosses:

(rural economy),

(world economy),

(national economy),

(market economy),

(national market economy),

(contemporary world economics),

(Japanese economy),

(natural economy),

(contemporary modern rural economy)

Removing the NPs containing stop words prevented the use of phrases like (their economy). After the POS tagging, the recognised proper nouns were also added to the list of key terms and the final list of key terms was formed.
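The assembly of the final key-term list can be summarised in a small Python sketch. It assumes that NPs are plain strings of space-separated lemmas and uses English placeholder words; the function name and its parameters are hypothetical.

```python
def expand_key_terms(key_nouns, noun_phrases, stop_words, proper_nouns=()):
    """Final key-term list: key nouns, NPs containing them, and proper nouns."""
    terms = set(key_nouns)
    for np in noun_phrases:
        words = np.split()
        if any(w in stop_words for w in words):
            continue                      # drops phrases such as "their economy"
        if any(w in key_nouns for w in words):
            terms.add(np)
    terms.update(proper_nouns)
    return terms
```

For example, with the key noun "economy" and the stop word "their", the phrase "rural economy" is kept while "their economy" is discarded.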

4.2 Question generation 


In order to filter out clauses which are appropriate for question generation, a module processes the lexico-syntactic information collected during the preprocessing phase and decides that a clause is eligible if:

(1) it contains at least one key term,

(2) the term is in an NPA phrase of its VPS² (the NPA is the subject daughter of the VPS phrase) and

(3) the clause is finite.

If the three conditions hold, we consider that the term is in the subject phrase of the sentence, which means that it has a central meaning for the sentence, and we apply a rule which replaces the focal term with a blank. The system additionally checks that the sentence does not refer to figures, tables or appendices.
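The eligibility test can be illustrated on a bracketed parse of the kind produced by the parser. The Python sketch below is a simplified reading of the three conditions: it checks whether the term occurs under an NPA node that is a direct daughter of a VPS node, and it takes finiteness as a flag supplied by the caller, since how the system detects finite clauses is not described. The helper names are hypothetical and the example sentence uses English placeholder words.

```python
import re


def parse(text):
    """Parse a bracketed tree into nested lists [label, child, ...]; leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                          # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                          # skip ")"
        return [label] + children

    return node()


def leaves(tree):
    """All words below a node, in order."""
    out = []
    for child in tree[1:]:
        out.extend(leaves(child) if isinstance(child, list) else [child])
    return out


def subject_contains(tree, term):
    """True if some VPS node has an NPA daughter whose words include `term`."""
    if not isinstance(tree, list):
        return False
    if tree[0] == "VPS":
        for child in tree[1:]:
            if isinstance(child, list) and child[0] == "NPA" and term in leaves(child):
                return True
    return any(subject_contains(child, term) for child in tree[1:])


def eligible(tree, term, key_terms, is_finite=True):
    """Conditions (1)-(3): key term, inside the subject NPA, finite clause."""
    return term in key_terms and is_finite and subject_contains(tree, term)
```

A term that occurs only inside an NPA below the VPC (for instance in object position) is rejected, because that NPA is not a daughter of the VPS node.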

For example, in the Biology materials the terms for “heredity” and “inheritance” are key terms, and we have the following constituent information for one of the sentences which contain them.

(S (VPS (NPA (N (NN ��)) (PP (Prep (IN ��)) (Ncfsd ��) (CoordP (Conj (C (CC 

�)) (Ncnsd ��)) (ConjArg (NPA (N (NN ��)) (PP (Prep (IN �)) (N (NN ��)))))))) 

(VPC (V (T (RP ��)) (Pron (Ppxta ��)) (V (VB ��))) (NPA (A (JJ ��)) (N (NN ��))))) (PUNC .)) 

Whichever of the two terms is chosen by the user, the system will try to produce a stem from this sentence because it satisfies the three necessary conditions. It will replace the chosen key term with a blank and suggest the key term as the answer.

E.g. (Due to ... and inheritance the species remain unchanged for long periods.)

2 NPA: head-adjunct noun phrase; VPS: head-subject verb phrase. For full definitions see HPSG-based Syntactic Treebank of Bulgarian (BulTreeBank), BulTreeBank Project Technical Report 05, 2004, http://bultreebank.org/TechRep/BTB-TR05.pdf


correct answer: (the heredity)

In the following sentence, the same key term is present again.

(S (VPS (NPA (CoordP (ConjArg (NPA (N (NN ��)) (PP (Prep (IN ��)) (Ncfsd ��) 

(CoordP (Conj (C (CC �))) (Ncfsd ��))))) (Conj (C (CC �))) (ConjArg (N (NN ��)))) (PP 

(Prep (IN )) (Ncmpd ) (Pron (Ppetdp3 )))) (VPC (V (VB )) (NPA (A (JJ )) (N (NN )) (IN ))) (Ncfsd )) (PUNC .)) 

The term is a part of the subject phrase, so it is possible to make a fill in the blank 

question, where the blank will replace the focal term ��. 

�� 

�� 

(The study of ... and variability and the discovery of their regularities are the basic tasks of genetics.) 

correct answer: ��(heredity) 

Apart from replacing the focal term with a blank, we do not apply any other transformation

to the chosen sentence.
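
The stem construction described above can be illustrated with a minimal Python sketch. It is not the system's implementation. It assumes that the parse is available as a bracketed string in the style shown earlier, that the subject daughter is the first daughter of a VPS node, and that the key term is a single token. The finiteness check and the filter for references to figures or tables are left out, and all function names and the English example tree are hypothetical.

```python
import re

def parse(bracketed):
    """Parse a bracketed tree such as '(S (NPA (N x)))' into [label, children]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", bracketed)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return [label, children]

    return node()

def leaves(tree):
    """Return the words of a tree in surface order."""
    out = []
    for child in tree[1]:
        out.extend(leaves(child) if isinstance(child, list) else [child])
    return out

def subject_phrases(tree):
    """Yield every NPA node that is the first daughter of a VPS node (assumption)."""
    label, children = tree
    kids = [c for c in children if isinstance(c, list)]
    if label == "VPS" and kids and kids[0][0] == "NPA":
        yield kids[0]
    for kid in kids:
        yield from subject_phrases(kid)

def make_stem(bracketed, term, blank="......"):
    """Return (stem, answer) if the term occurs in a subject NPA, else None."""
    tree = parse(bracketed)
    if not any(term in leaves(np) for np in subject_phrases(tree)):
        return None
    words = [blank if w == term else w for w in leaves(tree)]
    return " ".join(words), term

# Hypothetical English example in the same bracket notation.
example = ("(S (VPS (NPA (N (NN heredity))) "
           "(VPC (V (VB preserves)) (NPA (N (NN species))))) (PUNC .))")
stem = make_stem(example, "heredity")
rejected = make_stem(example, "species")
```

In this toy tree the term in the subject NPA yields a stem, while the term in the object NPA is rejected because it is not in the subject daughter of the VPS.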

4.3 Distractor generation 


For the purpose of our application we need to suggest distractors in two cases: (1) when

questions are generated automatically and (2) when a key term was chosen by the designer

but no questions could be generated for it. In the second case only related concepts

are shown to the user; they are extracted by the same principle as distractors, which is

why we explain their construction in this section.

In well-designed multiple-choice tests, the distractors are always semantically close

to the correct answer (as well as to each other, in a sense). To find such distractors,

in previous studies we tried paragraph clustering in order to define groups of text

sections which have similar topics, but on short texts this method does not give

promising results. We therefore chose a rather simple working solution. We examined

existing tests for beginner level and noticed that most of the distractors

looked very similar at first sight. They were mostly phrases with the same noun and

different modifiers or, conversely, with the same modifier and different nouns.

That is why we suggest as distractors NPs which contain the

same noun as the key term chosen by the user, but with a different modifier.

Conversely, we also replace the noun of the chosen key

term and suggest phrases with the same modifier and a different noun. All these phrases are

taken from the NP list generated in the first stage.

Such an example is:

Constant modifier            Constant noun

�� (natural complex)         �� (rural economy)

�� (natural zone)            �� (world economy)

�� (natural component)       �� (national economy)
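
The selection rule for distractors can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: every NP is a two-word "modifier noun" string, the English glosses from the table stand in for the Bulgarian phrases, and the function name is hypothetical.

```python
def distractors(key_term, np_list):
    """Split candidate NPs into those sharing the modifier and those sharing the noun."""
    modifier, noun = key_term.split()
    same_modifier, same_noun = [], []
    for np in np_list:
        parts = np.split()
        if len(parts) != 2 or np == key_term:
            continue                      # skip the key term itself and other shapes
        if parts[0] == modifier:
            same_modifier.append(np)      # constant modifier, different noun
        elif parts[1] == noun:
            same_noun.append(np)          # constant noun, different modifier
    return same_modifier, same_noun

# NP list built from the English glosses of the example table.
np_list = ["natural complex", "natural zone", "natural component",
           "rural economy", "world economy", "national economy"]

by_modifier, _ = distractors("natural zone", np_list)
_, by_noun = distractors("world economy", np_list)
```

For the key term "natural zone" the sketch proposes the other "natural" phrases, and for "world economy" it proposes the other "economy" phrases.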

5 Evaluation 


At the current stage the system has been tested by three teachers, who are professional 

test designers. Each one of them is a specialist in one of the three areas and has a degree 

also in one of the others. They have experimented with materials in the three domains 

biology, geography and history. Each designer had to choose 20 key terms in total and 

to evaluate with a YES/NO mark (YES - acceptable question with or without need to be 

changed; NO - not acceptable question) the questions produced by the system, related to 

the chosen key terms. 

From the materials in biology and geography useful definitions were extracted and 

they were appreciated by the designers while for the history domain mainly proper names 

were helpful. On average, 61% of the generated fill-in-the-blank questions were reported as

acceptable by the designers (with or without post-editing). The professionals

reported that the context and the distractors helped them a lot, because they gave them

more options for finding the information needed to correct an ill-formed

question. The remaining questions were discarded mainly for three reasons: some

of the sentences had a general meaning and did not express a specific definition; in some

others the blank was ambiguous, with too many possible

correct answers; or the chosen term was not central to the sentence which

was chosen.

The designers were especially satisfied with the high quality of the key terms which 

served as a cross-reference over the whole material. They found them useful for

systematising the topics on which the student could be examined. In this way the key terms saved

them time, because they could use the vocabulary of key terms as a summary of the 

contents. Deeper analysis of the speeding-up of the process will be done after improving 

the user interface of the system. 

The test designers were certain that the so-prepared question items are useful only in 

the case of beginner level testing, where deep understanding is not required and learners 

are taught mostly basic definitions. 

6 Conclusion and future work 

This experiment represents a step towards automatic test generation and it shows the

advances gained using more sophisticated tools and deeper processing of the instructional 

materials. 

Although the approach is intended to be domain independent, we consider Biology and

Geography more suitable, as they produce better results than History. One of the reasons is that

in history single-sentence definitions are rare and normally many references

are used. In this domain an important role was played by the proper names, which were also included

in the list of key terms. 

As this article represents a work in progress we plan to go deeper in the data analysis 

by adding dependency parsing. Then we can observe the subject and object clauses and 

make additional inferences. We will also try different techniques for distractor selection, 

such as using term similarity measures over the corpus and different types of questions. 

We plan to improve the user interface, because it is a main issue, which concerns the 

efficiency of the work of the test designers. Overall we plan deeper evaluation of the 

system, including classical test theory and error analysis, in order to improve the produced

items. 

7 Acknowledgements 

My thanks go to my supervisor Galia Angelova and to Atanas Chanev, who kindly

provided models for the SVMTool and Dan Bikel’s parser for Bulgarian. 

References 


Aldabe, I., De Lacalle, M. L. and Maritxalar, M. (2007). Automatic acquisition of didactic

resources: generating test-based questions, in I. F. de Castro (ed.), Proceedings of

SINTICE 07, pp. 105–111.

Bikel, D. (2004). A distributional analysis of a lexicalized statistical parsing model, in

D. Lin and D. Wu (eds), Proceedings of EMNLP.

URL: http://www.cis.upenn.edu/~dbikel/software.html#stat-parser

Collins, M. (1999). Head-Driven Statistical Models for Natural Language Parsing, PhD

thesis, University of Pennsylvania.

Giménez, J. and Márquez, L. (2004). SVMTool: A general POS tagger generator based on

support vector machines, Proceedings of the 4th International Conference LREC’04.

Mitkov, R., Ha, L. A. and Karamanis, N. (2006). A computer-aided environment for generating

multiple-choice test items, Natural Language Engineering 12(2): 177–194.

Nikolova, I. (2007). Supporting the development of multiple-choice tests in Bulgarian

by language technologies, in E. Paskaleva and M. Slavcheva (eds), Proceedings of

the Workshop A Common Natural Language Processing Paradigm for Balkan Languages,

pp. 31–34.

WORD SPACE MODELS OF 

SEMANTIC SIMILARITY AND RELATEDNESS 

Yves Peirsman 

University of Leuven & Research Foundation – Flanders 

Abstract. Word Space Models provide a convenient way of modelling word meaning in 

terms of a word’s contexts in a corpus. This paper investigates the influence of the type of 

context features on the kind of semantic information that the models capture. In particular, 

we make a distinction between semantic similarity and semantic relatedness. It is shown 

that the strictness of the context definition correlates with the models’ ability to identify 

semantically similar words: syntactic approaches perform better than bag-of-word models, 

and small context windows are better than larger ones. For semantic relatedness, however, 

syntactic features and small context windows are at a clear disadvantage. Second-order bag-of-word

models perform below average across the board.

1 Introduction



Word Space Models have become the standard approach to the computational modelling 

of lexical semantics (Landauer and Dumais, 1997; Lin, 1998; Schütze, 1998; Padó and 

Lapata, 2007). They indeed offer a convenient way of capturing the meaning of a word 

simply on the basis of the contexts in which it is used in a corpus. In that way, they can 

retrieve the most similar words for a given target word. Yet, there is no agreement on how 

context should be defined exactly. Context features vary from sentences or paragraphs to 

single words, with or without the addition of syntactic relations. While all these features 

definitely capture some semantic information, it is only to be expected that the choice of 

context definition has an influence on the kind of semantic relatives that the Word Space 

Models will find. 

It is well known that words may be semantically related along a number of dimensions 

(Cruse, 1986). In the NLP literature, similarity takes up a central position, with synonymy 

as the most obvious example. But there are other types of semantic relations, too. For 

instance, two words like doctor and hospital have a clear connection, although they are 

in no way semantically similar. Recovering this semantic relatedness from a corpus may 

have to proceed along different lines than the modelling of semantic similarity. Specific 

Word Space Models may thus have a bias towards one or the other of these relations. In the 

literature, however, the investigation of this semantic behaviour of Word Space Models 

has only recently come to the fore (Sahlgren, 2006; Peirsman, Heylen and Speelman, 

2007). 

In this paper, we investigate eleven Word Space Models, representing three broad 

classes, with respect to their performance in the fields of semantic similarity and semantic 

relatedness. It will be shown that there is no such thing as a single best Word Space 

Model: the ranking of the approaches depends on the type of semantic information we 

want to find. The paper is structured as follows: in the next section, we will introduce the 

different context models and the two types of semantic relationship that we investigate. 

Section 3 then presents the precise setup of our experiments, while section 4 discusses 

their results. Section 5 wraps up with conclusions and an outlook for future research. 

2 Word Space Models 


2.1 Competing definitions of context 

All Word Space Models of lexical semantics rely on the so-called distributional hypothesis 

(Harris, 1954), which claims that words with similar meanings occur in similar contexts. 

From this hypothesis, it follows that semantic similarity can be modelled in terms of 

contextual or distributional similarity. This is done by constructing for each target word 

a so-called context vector, which contains the scores of its target word for all possible 

context features. These scores can be the number of times that the contextual feature 

co-occurs with the target, or more often, some kind of weighted frequency that captures 

the statistical link between the target word and that feature. The distributional similarity 

between two words is then calculated as the similarity between their vectors, on the basis 

of a function like the cosine. In this way, it is possible to find for each target word the n 

most distributionally similar words in any given corpus. We call these words the nearest 

neighbours of the target. 
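
As a minimal illustration of the procedure just described, the following Python sketch builds context vectors from raw co-occurrence counts, compares them with the cosine and returns the n nearest neighbours. The toy corpus, the window size and the function names are assumptions made for this example only. No weighting or stop list is applied here.

```python
from collections import Counter
from math import sqrt

def context_vector(target, tokens, window):
    """Count the words occurring within `window` positions of each target occurrence."""
    vec = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), i + window + 1
            vec.update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
    return vec

def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as Counters."""
    dot = sum(u[f] * v[f] for f in u if f in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def nearest_neighbours(target, vectors, n):
    """Return the n words whose vectors are most similar to the target's vector."""
    others = [w for w in vectors if w != target]
    others.sort(key=lambda w: cosine(vectors[target], vectors[w]), reverse=True)
    return others[:n]

# Toy corpus (hypothetical), window of two words on either side.
tokens = ("the doctor treats the patient the nurse treats the patient "
          "the pilot flies the plane").split()
vectors = {w: context_vector(w, tokens, 2) for w in ("doctor", "nurse", "pilot")}
top = nearest_neighbours("doctor", vectors, 1)
```

In this toy corpus doctor and nurse share the context word treats, so nurse comes out as the nearest neighbour of doctor.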

Based on the definition of context, it is possible to define a hierarchy of Word Space 

Models, each with its own kind of contextual features. At the top of the tree we make a distinction 

between document-based and word-based approaches. Document-based models 

use sentences, paragraphs or documents as dimensions, and count how often a target word 

appears in each of these entities in the corpus (Landauer and Dumais, 1997; Sahlgren, 

2006). Word-based models, by contrast, take not the context itself, but features from this 

context as dimensions. They can be subdivided into syntactic and bag-of-word models. 

So-called bag-of-word or co-occurrence models take into account all words within a predefined 

distance of the target word (generally with the exception of semantically empty 

words like articles, etc.), whereas syntactic models consider only those words to which 

the target is syntactically related. Sometimes the features of such syntactic models consist 

of these syntactically related words alone (Padó and Lapata, 2007), sometimes they are 

formed by the word plus its relation (Lin, 1998). Finally we can distinguish between first-order

and second-order approaches. First-order bag-of-word approaches count the context 

words directly (Levy and Bullinaria, 2001), while second-order bag-of-word approaches 

sum the vectors of these context words. In this last case, the target’s context vector thus 

contains frequency information about the context words of its (first-order) context words 

(Schütze, 1998). Although it is in principle possible to construct second-order syntactic 

models, to our knowledge no implementation has been presented in the literature. 
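
The difference between first-order and second-order bag-of-word vectors can be made concrete with a small Python sketch. It is a simplified reading of the description above, assuming raw counts and a symmetric window; the function names and the three-word corpus are hypothetical.

```python
from collections import Counter

def first_order(tokens, window):
    """Map each word to the counts of the words within `window` positions of it."""
    vectors = {}
    for i, tok in enumerate(tokens):
        lo, hi = max(0, i - window), i + window + 1
        context = tokens[lo:i] + tokens[i + 1:hi]
        vectors.setdefault(tok, Counter()).update(context)
    return vectors

def second_order(tokens, window):
    """Sum, for each word, the first-order vectors of its context words."""
    first = first_order(tokens, window)
    second = {}
    for word, context in first.items():
        total = Counter()
        for ctx_word, freq in context.items():
            # One copy of the context word's vector per co-occurrence.
            for feature, count in first[ctx_word].items():
                total[feature] += freq * count
        second[word] = total
    return second

tokens = ["pilot", "flies", "plane"]
first = first_order(tokens, 1)
second = second_order(tokens, 1)
```

With a window of one word, pilot and plane never co-occur directly, so plane is absent from the first-order vector of pilot. It does appear in the second-order vector, because both words share the context word flies.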

2.2 Semantic similarity and semantic relatedness 

While it is claimed that all Word Space Models capture some kind of semantic information, 

so far we have only very limited knowledge about the influence of the context 

definition on the types of semantic relationship that the models find. In this paper we 

investigate two such types: semantic similarity and semantic relatedness. The first applies 

to synonyms (e.g., plane and airplane), hyponyms and hypernyms (e.g., bird and 

blackbird) and co-hyponyms (e.g., blackbird and robin) — two words with a relationship 

of similarity between the concepts they refer to. Semantic relatedness, by contrast, exists 

between words whose concepts are not necessarily similar, but still related, for instance 

because they belong to the same script, frame or lexical field. This is true for pairs like 

bird and beak or plane and pilot. Note that it is not possible to draw a clear boundary

between semantic similarity and semantic relatedness. Take the word pair pepper–salt, for

instance. These two words are clearly semantically similar, since they both refer to spices. 

At the same time, however, they are also semantically related: not only do they both belong 

to the lexical fields of food or spices, they also often co-occur together in the phrase 

salt and pepper. Instead of mutually exclusive classes, semantic similarity and relatedness 

can thus better be thought of as the two ends of a continuum, or two perpendicular 

axes in a two-dimensional plane. 

For many NLP applications, similarity might be the most important relation to model. 

In typical Query Expansion, for instance, only semantically similar words (synonyms or 

possibly hyponyms) make for a desired extension of a search query. Similarly, in Question 

Answering a word in the question should only be matched with semantically similar 

words in the database where the computer looks for the answer. Semantic similarity, however, 

is just one way in which words may be related in our mental lexicon, as suggested 

by psycholinguistic association experiments. According to Aitchison (2003), the four

major types of associations that people give in response to a cue word are, in order of 

frequency, co-ordination (co-hyponyms like pepper and salt), collocation (like salt and 

water), superordination (hypernyms like butterfly and insect) and synonymy (like starved 

and hungry). A similar observation is made by Schulte im Walde and Melinger (2005). 

Comparing the results of their German verb association experiment with GermaNet, they 

note that only 6% of the associations are synonyms, 14% are hypernyms and 16% are 

hyponyms, while no less than 54% of the associations are unrelated to their cue words in 

the GermaNet taxonomy. Although part of this can be explained by the incompleteness 

of the database, such results will be difficult to replicate with models of semantic similarity. 

After all, these are meant to prefer synonyms over hypernyms and co-hyponyms, and 

even exclude collocates altogether. The best Word Space Models of semantic similarity 

may thus not be the best models of relatedness, and vice versa. 

Despite the wealth of research into Word Space Models, studies into their semantic 

characteristics are scarce. Most often one model is applied to a specific computational-linguistic

task, and “comparisons between the (...) models have been few and far between 

in the literature” (Padó and Lapata, 2007, p. 166). Sahlgren (2006) is one exception to this 

rule. Focusing on document-based and first-order bag-of-word models, he showed that the 

latter are better geared towards the modelling of paradigmatic (similarity) relations, while 

the former have a clear bias towards syntagmatic relations. Unfortunately, Sahlgren left 

out a number of popular word space approaches, like those based on syntactic relations or 

second-order co-occurrences. Peirsman et al. (2007) also included syntactic models, but 

concentrated on similarity relations only. This article thus sets out to fill these gaps in the 

literature, by discussing a wide variety of model types from the perspectives of similarity 

as well as relatedness. 

3 Experimental setup 


We investigate three classes of Word Space Models, for a total of eleven approaches: five 

first-order bag-of-word models, five second-order bag-of-word models and one syntactic 

model. Our corpus is the 300 million word Twente Nieuws Corpus of Dutch newspaper 

articles, collected at the University of Twente and parsed by the Alpino parser at the 

University of Groningen. As our test set, we selected from this corpus the 10,000 most 

frequent nouns. For each of these, we had all models retrieve the 100 most similar neighbours 

from the 9,999 remaining nouns in the set. 

The bag-of-word models, both first-order and second-order, varied the size of the context 

window they took into account — 1, 3, 5, 10 or 20 words to either side of the target — 

for a total of ten models. Sentence boundaries were ignored; article boundaries were not. 

The syntactic model considered eight different types of syntactic dependency relations, 

in which the target word could be (1) the subject of verb v, (2) the direct object of verb 

v, (3) a prepositional complement of verb v introduced by preposition p, (4) the head of 

an adverbial prepositional phrase (PP) of verb v introduced by preposition p, (5) modified 

by adjective a, (6) postmodified by a PP with head n introduced by preposition p, 

(7) modified by an apposition with head n, or (8) coordinated with head n. Each specific 

instantiation of the variables v, p, a, or n was responsible for a new context feature. 
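
To show how each instantiation of a relation becomes its own context feature, here is a small Python sketch. The input format (tuples of target, relation name and fillers) and the feature naming scheme are assumptions for illustration; they are not the output format of the Alpino parser.

```python
from collections import Counter, defaultdict

def syntactic_vectors(relations):
    """Build one feature per (relation, filler...) combination for each target noun."""
    vectors = defaultdict(Counter)
    for target, relation, *fillers in relations:
        feature = "_".join([relation, *fillers])   # e.g. "subj_of_fly"
        vectors[target][feature] += 1
    return vectors

# Hypothetical dependency observations for two target nouns.
relations = [
    ("bird", "subj_of", "fly"),           # subject of verb v
    ("plane", "subj_of", "fly"),
    ("bird", "mod_by_adj", "small"),      # modified by adjective a
    ("bird", "pc_of", "look", "at"),      # prepositional complement of v introduced by p
]
vectors = syntactic_vectors(relations)
shared = sorted(set(vectors["bird"]) & set(vectors["plane"]))
```

Here bird and plane share only the feature for being the subject of fly, which is the kind of overlap the syntactic model bases its similarity on.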

The other parameter settings were shared by all eleven models: 

• Dimensionality: For all approaches, we used the 2,000 most frequent contextual 

features in the corpus as dimensions. This is a simple but common way of reducing 

the otherwise huge dimensionality of the vectors, which leads to state-of-the-art 

results, particularly for the syntactic model (Levy and Bullinaria, 2001; Padó and 

Lapata, 2007). For the syntactic model these dimensions are the 2,000 most frequent 

syntactic features, like subj of fly. For the bag-of-word models, they are 

formed by the 2,000 most frequent words in the corpus. Function words and other 

semantically empty words were excluded a priori on the basis of a stop list. 

• Frequency cut-off: Depending on the context size, we established a cut-off value n, 

so that the models ignored those features that occurred together with the target fewer 

than n times. For context size 3, this cut-off was fixed at 3, for the larger context 

sizes it lay at 5. The syntactic model and the bag-of-word model with context size 

1 did not use a cut-off, since it led to data sparseness. 

• Frequency weighting: As is usual in the literature, the context vectors of the target 

words did not contain the simple frequencies of the features. Instead, they listed 

the point-wise mutual information between each feature and the target word. This 

measure expresses whether the two occur together more or less often in the corpus 

than we expect on the basis of their individual relative frequencies. 

• Similarity measure: Finally, the distributional similarity between two target words 

was measured by the cosine between their context vectors. 
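
The frequency cut-off and the point-wise mutual information weighting can be sketched as follows. The formula used is the standard PMI(t, f) = log [ p(t, f) / (p(t) p(f)) ], estimated from co-occurrence counts. The base-2 logarithm, the estimation of the probabilities from the count table itself and the function name are assumptions of this sketch, since the text does not specify them.

```python
from collections import Counter
from math import log2

def pmi_vectors(counts, cutoff=1):
    """Turn {target: Counter(feature -> frequency)} into PMI-weighted vectors.

    Features seen with a target fewer than `cutoff` times are ignored for that target.
    """
    total = sum(sum(c.values()) for c in counts.values())
    target_total = {t: sum(c.values()) for t, c in counts.items()}
    feature_total = Counter()
    for c in counts.values():
        feature_total.update(c)
    weighted = {}
    for t, c in counts.items():
        weighted[t] = {
            f: log2(n * total / (target_total[t] * feature_total[f]))
            for f, n in c.items()
            if n >= cutoff
        }
    return weighted

# Hypothetical co-occurrence counts.
counts = {
    "doctor": Counter({"hospital": 4, "the": 4}),
    "pilot": Counter({"plane": 4, "the": 4}),
}
weighted = pmi_vectors(counts)
```

In this toy table hospital occurs only with doctor and receives a positive score, while the occurs equally often with both targets and receives a score of zero.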

4 Results 

4.1 Semantic similarity 


We evaluated the ability of our models to find semantically similar words on the basis of 

a comparison with Dutch EuroWordNet (Vossen, 1998). This lexical database contains 

more than 34,000 sets of noun synonyms and the relations that exist between them. Two 

evaluation measures were applied. First, we focused on the general ability of our models 

to capture semantic similarity. Then we looked into the distribution of four more specific 

similarity relations. 

[Figure 1 plot not reproduced. x-axis: word space models (syn, c1, c3, c5, c10, c20, cc1, cc3, cc5, cc10, cc20); y-axis: Wu & Palmer score (0.0 to 1.0).]

Figure 1: Wu & Palmer similarity scores between target and nearest neighbour.

syn: syntactic model, cn: first-order bag-of-words, ccn: second-order bag-of-words

n: context size (number of words on either side of target)

The general performance of the models was quantified by the average Wu and Palmer 

score between a target word and its single nearest neighbour (Wu and Palmer, 1994). This 

Wu and Palmer score is a popular way of measuring the semantic similarity between two 

words, based on their depth and their distance from each other in a taxonomic structure 

like EuroWordNet. If either the target or its nearest neighbour were not present in the 

database, the pair was simply ignored. In order to make the results perfectly comparable 

across models, we restricted the results to the 4183 target words with a nearest neighbour 

in EuroWordNet for all models. The resulting Wu and Palmer scores are given in Figure 1. 
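
For readers unfamiliar with the measure, the Wu and Palmer score is commonly computed as 2 · depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest node that dominates both words. The Python sketch below applies this formula to a toy taxonomy. The taxonomy, the convention that the root has depth 1 and the restriction to a single hypernym per node are assumptions of the example and do not reflect the structure of EuroWordNet.

```python
def path_to_root(node, parent):
    """Return the list of nodes from `node` up to the root of the taxonomy."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def wu_palmer(a, b, parent):
    """Wu and Palmer similarity in a tree given as a child -> parent mapping."""
    path_a, path_b = path_to_root(a, parent), path_to_root(b, parent)
    ancestors_b = set(path_b)
    # The first shared node on the way up from `a` is the lowest common subsumer.
    lcs = next(n for n in path_a if n in ancestors_b)
    depth_lcs = len(path_to_root(lcs, parent))   # root has depth 1
    return 2 * depth_lcs / (len(path_a) + len(path_b))

# Toy taxonomy (hypothetical).
parent = {
    "animal": "entity",
    "bird": "animal",
    "blackbird": "bird",
    "robin": "bird",
    "plane": "entity",
}
close = wu_palmer("blackbird", "robin", parent)
distant = wu_palmer("blackbird", "plane", parent)
```

The co-hyponyms blackbird and robin meet at bird and score much higher than blackbird and plane, which only meet at the root.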

Figure 1 shows a clear decrease in Wu and Palmer score as the definition of context 

becomes less strict. A Friedman test indeed confirms the influence of the type of Word 

Space Model on performance (Friedman chi-squared = 3541.575, df = 10, p-value < 

.001). The syntactic model achieves the highest average similarity score by far, followed 

by the first-order bag-of-word models and finally the second-order bag-of-word models. 

Moreover, small contexts appear to model semantic similarity better than large ones. A 

test of multiple comparisons after Friedman showed that the differences between all pairs 

of models are indeed statistically significant at the .05 level, except for those between 

context sizes 1 and 3 (both first-order and second-order) and that between the first-order 

model with context size 20 and the second-order model with context size 5. 

Of course, this general similarity score does not give any information about what specific 

type of similarity relation the models find. We therefore defined four taxonomic 

similarity relations, again with EuroWordNet as a gold standard. Synonyms were defined 

as words in the same synonym set as the target word, hypernyms as words exactly one 

node above the target, hyponyms those one node below and co-hyponyms as words one 

node below any of the target’s hypernyms. Together, these relations make up the target’s 

EuroWordNet environment. Note that our strict definition of these relationships does not

allow for more than one or two steps in the tree, and thus disregards possible hypernyms

or hyponyms that are more than one step away from the target. This approach ensures the

reliability of our gold standard, but constitutes a test that a relatively low percentage of

nearest neighbours will pass. Figure 2 shows how the single nearest neighbours of our target

words are distributed over the four similarity relations. Again we restricted ourselves

to the 4183 target words with a neighbour in EuroWordNet for all models.

[Figure 2 plot not reproduced. y-axis: frequency (0 to 2500); legend: cohyponym, hyperonym, hyponym, synonym; bar labels as extracted: 0.512, 0.384, 0.405, 0.369, 0.327, 0.264, 0.25, 0.273, 0.247, 0.217, 0.185.]

Figure 2: Distribution of semantic similarity relations for all models.
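
The four relations defined above can be expressed as a short classification function. The sketch below assumes a simplified database in which every word belongs to one synonym set and every synonym set has at most one hypernym; the identifiers and data are hypothetical.

```python
def relation(target, neighbour, synset_of, hypernym_of):
    """Classify a neighbour as synonym, hypernym, hyponym or cohyponym of the target."""
    s, n = synset_of[target], synset_of[neighbour]
    if s == n:
        return "synonym"                      # same synonym set
    if hypernym_of.get(s) == n:
        return "hypernym"                     # exactly one node above the target
    if hypernym_of.get(n) == s:
        return "hyponym"                      # exactly one node below the target
    if s in hypernym_of and hypernym_of.get(n) == hypernym_of[s]:
        return "cohyponym"                    # one node below the target's hypernym
    return None                               # outside the target's environment

# Toy synonym sets and hypernym links (hypothetical identifiers).
synset_of = {
    "plane": "aircraft.1", "airplane": "aircraft.1",
    "bird": "bird.1", "blackbird": "blackbird.1", "robin": "robin.1",
}
hypernym_of = {"blackbird.1": "bird.1", "robin.1": "bird.1"}

labels = [
    relation("plane", "airplane", synset_of, hypernym_of),
    relation("blackbird", "bird", synset_of, hypernym_of),
    relation("bird", "blackbird", synset_of, hypernym_of),
    relation("blackbird", "robin", synset_of, hypernym_of),
    relation("plane", "robin", synset_of, hypernym_of),
]
```

A neighbour that receives none of the four labels, like robin for the target plane, falls outside the target's environment and counts as a miss under the strict definition.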

Not surprisingly, the number of retrieved similarity relations mirrors the general Wu 

and Palmer similarity score. Again the syntactic model performs best: 51.2% of its single 

nearest neighbours that occur in EuroWordNet are situated in the environment of the target 

word. This precision drops to between 40.5% and 26.4% for the first-order bag-of-word 

methods and even lower for the second-order models. As above, the performance of 

the models seems to depend on the strictness of their context definition. The stricter they 

view context — i.e., syntactic context rather than a bag of words, smaller context windows 

rather than large ones — the more examples of semantic similarity they find. This pattern 

remains unchanged when a larger number of nearest neighbours is taken into account. 

With one exception, the distribution of the four relations is comparable across the different 

models. Co-hyponyms figure most prominently among the nearest neighbours, 

followed by synonyms, hypernyms and hyponyms. Only the syntactic model finds an 

unexpectedly high number of hypernyms. This can probably be explained by the way 

syntactic relations are typically inherited in a taxonomy: all characteristics of a (prototypical) 

concept (can fly, for instance) also apply to its hypernyms, so that these are often 

most similar in terms of syntactic distribution in a corpus. 

4.2 Semantic relatedness 


The results in the previous section do not necessarily express the overall quality of the investigated

Word Space Models. It is possible that the models that scored relatively badly

in the similarity experiments are simply biased towards a different kind of semantic relation.

In this second round of experiments we therefore turn our attention from semantic

similarity to semantic relatedness.

[Figure 3 plot not reproduced. x-axis: number of nearest neighbours (0 to 100); y-axis ticks: 0.0 to 0.4; curves: precision, recall, F-score.]

Figure 3: Evolution of the precision, recall and F-score of the first-order bag-of-word

model with context size 5 in its retrieval of associations.

For this task, we relied on a psycholinguistic experiment of human associations, described 

in De Deyne and Storms (in press). In this experiment, participants were asked 

to list three different word associations for 1,424 cue words. Each word was presented 

to at least 82 participants, resulting in a total of 381,909 responses. For instance, aap 

(‘monkey’) triggered the response zoo (‘zoo’) 27 times, aarde (‘earth’) prompted planeet 

(‘planet’) 14 times and bikini (‘bikini’) elicited vakantie (‘holiday’) 6 times. These examples 

show that this experiment taps into a different kind of semantic relationship than 

the previous one. Note that at this moment, we ignore the fact that association strength is 

often asymmetric (Michelbacher, Evert and Schütze, 2007). 

In order to make the results comparable to those in section 4.1, we reduced the data set 

to those cue words and associations that belong to the 10,000 most frequent nouns in our 

corpus. This gave a gold standard of 768 cue words with a total of 31,862 different cue– 

association pairs. When these associations are checked against EuroWordNet, we indeed 

find that only 8% belong to the EuroWordNet environment of their target word. 9% of 

these are synonyms, 19% are hypernyms, 16% are hyponyms and 56% are co-hyponyms.

We evaluated the Word Space Models against this gold standard by counting the number 

of associations that they find as the nearest neighbours to the cue words. If we consider 

just one nearest neighbour, the results already show a considerable difference from 

the previous experiments. As the top chart in Figure 4 indicates, the syntactic model still 

performs best, with 340 associations (a precision of .443), followed by the first-order and 

then the second-order bag-of-word models. However, within the bag-of-word models, the 

ideal context size has changed. The first-order bag-of-word models with context sizes 10 

and 20 have 299 and 293 associations among their single nearest neighbours, respectively. 

For 768 targets, this gives precision values of .389 and .382. Then we find context sizes 5 

(n = 281, P = .366), 3 (n = 269, P = .350) and 1 (n = 228, P = .297). Larger contexts 

thus outperform their smaller competitors here. Note that the two best models share only 

90 correct predictions, which indicates that they have different preferences among the associations. 

A look at the data suggests that the syntactic model indeed picks out those 

associations that are also semantically similar to their target word, while the first-order 

bag-of-word models with large contexts cover collocational relatedness better. With the 

second-order models, finally, context size 3 seems optimal. 

When we consider one nearest neighbour, the models cannot find more than 768 associations, 

and recall thus stays extremely low. We therefore increased the number of nearest 

neighbours from 1 to 100 and calculated the precision, recall and F-score at each step. 

By way of example, Figure 3 plots the evolution of these values for the best-performing 

model. The bottom bar chart in Figure 4, then, shows the maximum F-score of all the 

models. The syntactic approach has lost its lead, which suggests that it is able to model 

only a small number of associations well — probably those that also score highly on 

similarity. Instead it is now the first-order bag-of-word model with context size 5 that 

outclasses all others, with an F-score of .127 (P = .112, R = .148) at 55 neighbours. 

Extending the context window to 10 words brings the F-score down to .122 (P = .102, 

R = .150, 61 neighbours); reducing the window to 3 words takes it to .120 (P = .104, 

R = .143, 57 neighbours). Next, we have the bag-of-word model with context size 20 

(F = .115, P = .102, R = .133, 54 neighbours) and only then the syntactic model 

(F = .111, P = .102, R = .123, 50 neighbours). Large contexts now score slightly worse

than intermediate ones, which probably strike the best balance between similarity relations 

and collocational links. Second-order models never attain an F-score above .10, and 

neither do the smallest context windows, which are thus clearly biased towards similarity. 
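
The evaluation at increasing numbers of neighbours can be sketched as follows. Precision is the share of retrieved neighbours that are gold associations, recall is the share of gold cue–association pairs that are retrieved, and the F-score is their harmonic mean. The data structures and function names are assumptions; aap–zoo and aarde–planeet come from the examples above, while the remaining words are hypothetical fillers.

```python
def prf_at_k(neighbours, gold, k):
    """Precision, recall and F-score when the k nearest neighbours of each cue are kept.

    neighbours: {cue: ranked list of neighbours}; gold: {cue: set of associations}.
    """
    hits = sum(len(set(neighbours[cue][:k]) & gold[cue]) for cue in gold)
    retrieved = sum(len(neighbours[cue][:k]) for cue in gold)
    relevant = sum(len(assocs) for assocs in gold.values())
    p = hits / retrieved if retrieved else 0.0
    r = hits / relevant if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

def best_k(neighbours, gold, max_k):
    """Return the number of neighbours with the highest F-score."""
    return max(range(1, max_k + 1), key=lambda k: prf_at_k(neighbours, gold, k)[2])

neighbours = {
    "aap": ["zoo", "banaan", "auto"],
    "aarde": ["maan", "planeet", "zon"],
}
gold = {"aap": {"zoo", "banaan"}, "aarde": {"planeet"}}

p2, r2, f2 = prf_at_k(neighbours, gold, 2)
k_star = best_k(neighbours, gold, 3)
```

As in the experiment, adding neighbours raises recall but eventually lowers precision, so the F-score peaks at an intermediate number of neighbours (two in this toy example).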

4.3 Discussion 


In part, our experiments have confirmed earlier results in the literature. For instance, 

Sahlgren (2006) already noted that with first-order bag-of-word models, larger contexts 

score better in his association experiment, while smaller contexts score better in the synonymy 

test. Peirsman et al. (2007) found even better results for a syntactic model in 

Dutch, at least with respect to semantic similarity evaluated against EuroWordNet. Both 

findings are borne out by our experiments. 

At the same time, our results add some new insights to these earlier observations. We 

have shown that the syntactic model and the bag-of-word models with context size 1 are 

most biased towards semantic similarity. The syntactic model scored best in our first 

round of experiments, while the results of the bag-of-word models with context size 1 

were either not statistically different from or better than those of models with larger context 

windows. When it came to the discovery of semantic associations, however, context 

size 1 proved the least advisable choice, and the syntactic model was outperformed by 

all first-order bag-of-word models with an intermediate or large context window. Second-order

bag-of-word models scored below average in both experiments. They probably only 

show their power when data sparseness is an issue, as with Word Sense Discrimination 

(Schütze, 1998) or with corpora smaller than ours. 
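
To make the notion of context size concrete, the following minimal sketch builds first-order bag-of-word co-occurrence counts for a given window. It is an illustration under simplifying assumptions, not the implementation used in the experiments: the sentence is a hypothetical toy example, raw counts are used without any weighting, and the function name `cooccurrence_vectors` is invented here.

```python
from collections import Counter, defaultdict


def cooccurrence_vectors(tokens, window):
    """For each word, count the words occurring within `window` positions
    to its left or right (a first-order bag-of-word context)."""
    vectors = defaultdict(Counter)
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[target][tokens[j]] += 1
    return vectors


# Hypothetical toy sentence, for illustration only.
tokens = "the doctor treats the patient in the hospital".split()

narrow = cooccurrence_vectors(tokens, window=1)
wide = cooccurrence_vectors(tokens, window=5)

print(dict(narrow["doctor"]))  # only the immediately adjacent words
print(dict(wide["doctor"]))    # also includes more distant words such as "patient"
```

With a window of 1 the vector for "doctor" contains only its direct neighbours, whereas the wider window also records topically related words further away, which is in line with the observation above that larger contexts favour associations.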

Figure 4: Frequency of associations among single nearest neighbours (top) and maximal F-scores for all models (bottom).

5 Conclusions and future research

In this paper, we investigated the influence of the context definition on the ability of several Word Space Models to capture two kinds of semantic information: semantic similarity and semantic relatedness. We studied a total of eleven Word Space Models:

one syntactic approach and ten bag-of-word models with context sizes 1, 3, 5, 10 and 

20, first-order as well as second-order. Both for semantic similarity and semantic relatedness, 

first-order models clearly beat their second-order competitors. However, while 

syntactic models gave the best results for semantic similarity, first-order bag-of-word approaches 

with intermediate to large context windows fared better in the retrieval of associated 

words. 

In the short term, we aim to extend the repository of Word Space Models that we are 

investigating — document-based models and second-order syntactic models are particularly 

high on our list. In the longer term, we will try and determine if the differences we 

observed in the modelling of semantic relations between word types also play a role in 

Word Sense Discrimination. In this task, all contexts of a word are clustered in order to 

automatically find the multiple senses of that word. Given the results here, we suspect that 

different kinds of polysemy or homonymy may not demand the same context definitions. 

References 


Aitchison, J. (2003). Words in the Mind. An Introduction to the Mental Lexicon, Oxford:

Blackwell. 

Cruse, D. A. (1986). Lexical Semantics, London: Cambridge University Press. 



De Deyne, S. and Storms, G. (in press). Word associations: Norms for 1,424 Dutch words in a continuous task, Behaviour Research Methods.

Harris, Z. (1954). Distributional structure, Word 10(2–3): 146–162.

Landauer, T. K. and Dumais, S. T. (1997). A solution to Plato’s problem: The Latent 

Semantic Analysis theory of the acquisition, induction, and representation of knowledge, 

Psychological Review 104: 211–240. 

Levy, J. P. and Bullinaria, J. A. (2001). Learning lexical properties from word usage 

patterns: Which context words should be used, in R. French and J. Sougne (eds), 

Connectionist Models of Learning, Development and Evolution: Proceedings of the 

6th Neural Computation and Psychology Workshop, London: Springer, pp. 273– 

282. 

Lin, D. (1998). Automatic retrieval and clustering of similar words, Proceedings of 

COLING-ACL98, Montreal, Canada, pp. 768–774. 

Michelbacher, L., Evert, S. and Schütze, H. (2007). Asymmetric association measures, 

Proceedings of the International Conference on Recent Advances in Natural Language 

Processing (RANLP-07), Borovets, Bulgaria. 

Padó, S. and Lapata, M. (2007). Dependency-based construction of semantic space models, 

Computational Linguistics 33(2): 161–199. 

Peirsman, Y., Heylen, K. and Speelman, D. (2007). Finding semantically related words in 

Dutch. Co-occurrences versus syntactic contexts, Proceedings of the CoSMO Workshop,

Roskilde, Denmark, pp. 9–16. 

Sahlgren, M. (2006). The Word-Space Model. Using Distributional Analysis to Represent 

Syntagmatic and Paradigmatic Relations Between Words in High-dimensional 

Vector Spaces, PhD thesis, Stockholm University. 

Schulte im Walde, S. and Melinger, A. (2005). Identifying Semantic Relations and Functional 

Properties of Human Verb Associations, Proceedings of the joint Conference 

on Human Language Technology and Empirical Methods in Natural Language Processing, 

Vancouver, Canada, pp. 612–619. 

Schütze, H. (1998). Automatic word sense discrimination, Computational Linguistics 

24(1): 97–124. 

Vossen, P. (ed.) (1998). EuroWordNet: a Multilingual Database with Lexical Semantic 

Networks for European Languages, Dordrecht: Kluwer. 

Wu, Z. and Palmer, M. (1994). Verb semantics and lexical selection, Proceedings of the 

32nd Annual Meeting of the Association for Computational Linguistics (ACL-94), 

Las Cruces, NM, pp. 133–138. 


EXAMINING THE NOTICING FUNCTION OF OUTPUT 

Maren Schierloh


Michigan State University 

Abstract. Following Izumi and Bigelow’s research (Izumi and Bigelow, 2000), this study 

re-investigates the noticing function of output; that is, whether producing the target language 

focuses learners’ attention on second language (L2) structures in subsequent input. Izumi

and Bigelow found no effects of output on either noticing or acquisition. They attributed 

their findings to limitations in operationalizing noticing via underlining, coupled with the 

relative difficulty of the target-structure (past-hypothetical-conditional). Under the premise 

that the learner’s developmental level and attentional resources may constrain noticing, this 

partial replication addresses whether a less difficult structure may yield greater noticing and, 

consequently, greater L2 gains. Fifteen intermediate ESL learners were randomly assigned 

to two experimental groups (EGs) and one control group (CG). The first EG was given opportunities 

for output that elicited the past hypothetical conditional (more difficult structure), 

while the second EG had opportunities to produce the present hypothetical conditional (less 

difficult structure). The CG was not prompted to produce output that required use of either 

structure. All groups engaged in follow-up reading and underlining activities. The reading 

texts modeled target-like use of the relevant structure for both EGs. Methodological 

triangulation measured noticing through underlining of the target-structure and stimulated 

recall to elicit data about cognitive processes involved. Additionally, noticing and L2 gains 

were assessed based on participants’ performance on subsequent essay-writing activities and 

posttests. Quantitative raw data revealed no effect of output (EGs vs. CG) or difficulty-level 

(EG1 vs. EG2) on the underlining of target forms in subsequent texts. Qualitative stimulated 

recall data, however, showed that output influences subsequent noticing of certain input 

elements; e.g., ‘This is a good word for my essay’. Overall findings suggest that output

can trigger noticing of vocabulary and further illustrate how methodological triangulation 

can enhance insights into learners’ L2 processes. Thus, this study has ramifications for both 

classroom practices and research methodology. 

1 Introduction



In the past decade of second language acquisition (SLA) research, the notion that noticing 

is essential for the acquisition of new linguistic systems has been a matter of debate 

(Jourdenais, 2001; Leow, 2002; Robinson, 2001; Schmidt, 2001; Simard and Wong, 2001; 

Tomlin and Villa, 1994; Truscott, 1998). Much of the argumentation is grounded in the 

difficulty of operationalizing and measuring the second language (L2) learner’s internal 

cognitive processes. Research in SLA and cognitive science has raised questions as to 

the type and amount of ‘attention’ necessary for language learning, the specific aspects of language that are more likely to be noticed, and the extent to which the developmental

level of the learner determines what is noticed. 

Recently, researchers have turned their attention to the role output plays in noticing. 

The oral or written production of language may consciously induce learners to realize 

the gap between what they want to say and what they can say. This noticing of linguistic 

limitations may prompt learners to seek solutions in subsequent input. A study by 

Izumi and Bigelow (2000) centered on the noticing function of output. They investigated 

whether L2 written output promotes noticing of form in subsequent text. They compared 

an experimental group, which produced output, to a control group, which did not produce 

any output but engaged in comprehension-based activities instead. The noticing of the 


participants was operationalized through the participants’ underlining of the target structure 

in written text. Both groups underlined the same amount and Izumi and Bigelow 

concluded that output does not trigger noticing. Because Izumi and Bigelow’s inquiry may inform L2 pedagogy, the present study partially replicates their study by asking analogous research questions and by implementing a similar design.

Yet, to achieve a more valid measure of noticing, this study uses stimulated recall to tap 

into learners’ cognitive processes. In addition to the stimulated recall data, this study 

quantitatively and qualitatively analyzes the data from learners’ underlining and written 

production to better examine a possible relationship between output, noticing and L2 development. 

This research also addresses whether a cognitively less demanding structure 

may have an effect on noticing by the learner. The following section reviews the literature on noticing, including the difficulties associated with measuring noticing, the role of the learner’s developmental level, and the role output plays in noticing. Sections 3 and 4 present the research questions and the methodology, and the subsequent sections provide a discussion of findings and limitations and a conclusion.

2 Review of the Literature 

2.1 Noticing and SLA 


Since Schmidt (1990) first proposed his well-known “noticing hypothesis”, a large body 

of SLA and cognitive science research has focused on the role of noticing, or conscious 

attention 1 , in promoting L2 development (Alanen, 1995; Leow, 2002; Rosa and O’Neill, 

1999). The noticing hypothesis claims that noticing requires awareness and is a necessary 

condition for second language acquisition. Yet, some research findings are not in line with 

the premise that conscious attention is a necessary prerequisite for L2 acquisition (Gass, 

Svetics and Lemelin, 2003; Robinson, 1995). 

Truscott rejects the crucial role of noticing in the L2 learning process from a theoretical perspective, maintaining that noticing advances only metalinguistic knowledge, not competence. He further contends that “awareness is not only unnecessary but also unhelpful” (Truscott, 1998, page 126). Such a narrow account of the role of noticing in SLA is challenged by substantial L2 research data supporting the view that noticing facilitates L2 learning (Ellis, 1994; Long, 1996; Robinson, 1995; Swain and Lapkin, 1998).

2.1.1 Operationalizing and Measuring Noticing 

At the heart of the ongoing debate on the role of noticing in SLA is the difficulty in 

operationalizing it, which requires introspection and assessment of learner-internal cognitive 

activities. For example, Schmidt (2001) operationalized noticing in terms of the learners’ self-reporting either during or immediately after exposure to the input. Yet the lack of self-reporting should not be interpreted as a lack of awareness, as some thinking processes may be difficult to verbalize (Jourdenais, 2001; Schmidt, 2001). As such, the

challenge facing the measurement of noticing is to accurately link observable behaviors 

by language learners to the construct of noticing. Methodologies used to qualitatively and quantitatively account for a learner’s noticing of specific target language features fall into two categories: online, which measure the language learner’s noticing during performance of a certain language task, and offline, which employ post-treatment assessment of noticing. Neither online nor offline methodologies enable an absolute account of the learners’ attentional processes.

1 Due to the terminological vagueness of ‘noticing’ resulting from related terms such as ‘attention’ (Leow, 2002) and ‘awareness’ (Tomlin and Villa, 1994) in the noticing literature, Schmidt’s definition of noticing as ‘conscious attention’ has been adopted for the present study (Schmidt, 2001). Schmidt equates consciousness with awareness and/or attention.

Online methodologies include, for example, think-aloud protocols which require the 

participants to monitor and orally self-report their mental processes while they perform a 

certain language task. Izumi and Bigelow used the online methodology of participants’ 

underlining of “the word, words, or parts of the words that are [felt to be] particularly 

necessary for subsequent production” (Izumi and Bigelow, 2000, page 250). Izumi and 

Bigelow characterize underlining as an authentic procedure readers naturally do during 

a reading task, and argue that the marking of words would not occur without conscious 

awareness of the importance of that particular word or phrase. In partially replicating 

Izumi and Bigelow, the present study utilizes underlining as one integral attribute of the 

triangulated measurement of noticing. 

The advantage of online measures, as opposed to post-exposure measures, is their instantaneous 

access to L2 processing, thus minimizing the risk of possible memory decay 

by the L2 learner (Gass and Mackey, 2000). Yet, stimulated recall has evolved as a 

sound and widely used offline method to obtain data of the language learner’s thought 

processes. During stimulated recall, the learner is prompted with a stimulus (e.g., the learner’s written products or a video showing the learner engaged in the language task) and is asked to report on the thought processes he/she had while performing the language task. Note, however, that the lack of evidence of noticing in online or offline protocols does not necessarily imply absence of noticing.

2.1.2 Developmental Level as a Factor in Noticing 

In addition to the concern over how noticing data should be collected and analyzed, current 

SLA research has scrutinized connections between the difficulty level of the target 

language input and the learner’s attentional resources (Ellis, 1994; Gass et al., 2003; Philp, 

2003; VanPatten, 1996). Long (1996), for instance, found that the proficiency of the 

learner may modulate noticing. Advanced learners may benefit from the increasing automaticity 

which allows them to attend to more complex structures. A recent study by Philp (2003) similarly revealed that the developmental level of the learners was one factor determining accurate recall of the reformulation by the native speaker. Thus, developmental

readiness may constrain the learner’s attention to aspects of more difficult structures. In 

a similar vein, Robinson (1995) argued that the extent to which a language learner may 

notice a particular form of their linguistic limitations is dependent on the demands of the 

pedagogical task. 

In the research by Izumi and Bigelow (2000), the study to be partially replicated here, 

the past hypothetical conditional was selected as the target structure, based on the rationale 

that this structure poses some difficulty to the learner, which may trigger noticing of 

linguistic limitations. Yet, given the learner’s level and attentional capacities for the target structure, it is of present interest whether reduced cognitive demands may yield greater noticing and, in turn, greater L2 gains.



2.2 The Noticing Function of Output 

Underlying the relationship between noticing and SLA is the question of under what circumstances 

L2 learners may notice linguistic forms. Is it through input or through output, 

or both in combination? While the essential role of input for SLA is universally accepted, 

the sufficiency of input for acquisition has been debated since Swain first proposed her 

Output Hypothesis (Swain, 1985) in reaction to Krashen’s view of primacy of comprehensible 

input (Krashen, 1982). While Swain does not negate the importance of input, 

she argues that “L2 output pushes learners to process language more deeply (with more 

mental effort) than does input” (Swain, 1995). A series of studies by Swain and Lapkin 

revealed noticing as one of the main reasons why producing output mediates L2 development 

(Swain and Lapkin, 1995; Swain and Lapkin, 1998). As such, their argument corresponds 

to Schmidt’s Noticing Hypothesis. Because output focuses the learner’s attention 

on the L2 structures they produce (their interlanguage), it enables them to compare their 

interlanguage to the target language they receive, thereby attending to their linguistic limitations 

(Gass and Varonis, 1994). If relevant input is immediately available afterwards, 

the noticing of the gap may cause the learner to process the subsequent input with more 

focused attention. This hypothesis was examined by Izumi and Bigelow (2000), whose study constitutes the basis of the research reported here.

2.3 Izumi and Bigelow 2000 

Izumi and Bigelow addressed the issue of output and noticing in their study guided by 

two questions: (1) “Do output activities promote the noticing of linguistic form in subsequent 

input?” and (2) “Do these output-input-activities result in improved production of 

the target form?” (Izumi and Bigelow, 2000, page 247). They compared an EG, which 

was engaged in output tasks (essay writing and text reconstruction) to a CG, which did 

not produce any written output. Both groups received the same textual input for the subsequent 

reading and underlining activity; however, the groups were given different purposes 

for underlining, which may have influenced participants’ attentional focus. In the

present study, all participants received the same instructions for the reading and underlining 

activity. In Izumi and Bigelow (2000), noticing of the target form (past hypothetical 

conditional in English 2 ) was assessed through underlining and through the demonstration 

of uptake (correct use of the target form by the learner) as a complementary measure of 

noticing and acquisition of that form. The study presented here did not treat uptake as a 

distinct measurement of noticing or acquisition, but qualitatively examined to what extent 

learners’ uptake corresponds to prior noticing. Izumi and Bigelow attributed their

non-significant findings to the relative difficulty of the target structure. Thus, this study 

investigates learners’ noticing when engaging with a less complex yet similar structure: 

the present hypothetical conditional 3 . 

2 E.g., If Lisa had traveled to Spain, she would have seen the Olympic Games.

3 E.g., If Lisa traveled to Spain, she would see the Olympic Games.

Except for one statistically significant increase in performance from the pretest to the second posttest of the experimental group, Izumi and Bigelow found no statistically significant between-group differences on any measure. Both groups underlined nearly the same percentage of conditional-related forms. They concluded that output did not draw the learner’s attention to the targeted form, and the nonsignificant results were attributed to effects of input flood and individual variation. I argue that underlining as a single measure

gives an insufficient account of learners’ cognitive processes, and I hypothesize that the 

output treatment could have been observed to trigger noticing if additional qualitative and 

quantitative measures had been employed. Therefore, the present study follows Izumi and 

Bigelow’s suggestion to implement “methodological triangulation as the research design 

allows” (Izumi and Bigelow, 2000, page 271) by operationalizing noticing through target-structure

underlining and reporting of conscious attention during the stimulated recall 

session. In other words, through triangulated data collection, noticing is investigated 

from multiple perspectives. 

3 Research Questions and Hypotheses 

In order to validly replicate Izumi and Bigelow’s study (Izumi and Bigelow, 2000), similar 

research questions are pursued along with their congruent hypotheses: 

RQ1: Do output activities promote noticing of linguistic form in subsequent input? 

RQ2: Do these output-input activities result in improved production of the target 

form? 

It is hypothesized that the experimental groups, which are required to produce output, 

would show greater noticing of the target-structure contained in the input than the control 

group, which does not produce output requiring the use of the target-structure. Furthermore, 

on the posttests, the experimental groups are expected to demonstrate greater gains 

in accuracy of their use of the target form than the control group. Given that prior research 

found the language learner’s developmental level to be associated with attentional 

resources available for the target-structure, it is hypothesized that a less difficult target-structure

promotes greater noticing and greater L2 gains. Thus, the present study is further 

guided by the following research question: 

RQ3: Does the present hypothetical conditional, as a less difficult structure, promote 

greater noticing compared to the past-hypothetical-conditional structure? 

4 Methodology 

4.1 Participants 


Fifteen intermediate ESL learners enrolled in the second semester ESL academic writing 

class at Michigan State University have participated up to this point. Students’ enrollment 

in the ESL academic writing class is determined by a placement test or by passing the 

previous course. The ESL learners were from a variety of L1 backgrounds, including Cantonese, Japanese, Korean and Arabic, with an average of 8.7 years of previous English study 4 . Three students have lived in the United States for more than two years, and the remaining participants have resided there for at least a year. Upon completion of the questionnaire,

participants were randomly assigned to one of the two experimental groups (EGs) or to 

the single control group (CG). 

4 It needs to be noted that the different native languages of the learners affect their proximity to (distance 

from) English, which could make some structures easier (more difficult) to process for some learners than 

for others. The native language of the participant was not systematically investigated here. 


4.2 Procedures 

The experiment followed a pretest-posttest design. The researcher met one-on-one with 

each participant for about 1 or 1.5 hours depending on whether the participants chose 

to take part in the stimulated recall session or not. The participants were informed of 

the sequence of the activities before they completed the pretest (see Appendix A for an 

example) and the reading and writing activities. Participants assigned to the first experimental 

group (EG1) composed an essay that elicited the past hypothetical conditional 

(Appendix B), whereas participants assigned to the second experimental group (EG2) 

composed an essay that elicited the present hypothetical conditional. Participants in the 

control group (CG) engaged in a writing task that did not require the use of either structure. 

Each participant subsequently received input that modeled the correct use of the 

relevant target structures (Appendix C); yet, for the CG, the reading text did not serve 

as a model. All groups were instructed either to “underline what [they] feel is important for re-writing the essay” or to “underline what [they] feel is important for writing an essay about this topic”. By leaving the words to be underlined unspecified, the learner’s attentional

foci were not predisposed. Before the participants carried out the actual task, the 

grammar-focused underlining was demonstrated to the students using a passage that did 

not contain the target-structure 5 . Following the reading and underlining activity, all participants 

in the EGs reproduced their initial essay, whereas the CG wrote about

the EGs’ initial essay topic for the first time. The immediate posttest was administered 

upon completion of the second essay writing activity or after the stimulated recall session 

depending on whether or not participants took part in the stimulated recall interview. The 

delayed posttest was given one week later 6 . Four participants from each EG and three participants from the CG volunteered to be videotaped during the reading activity.

To better track their focus during the reading and underlining task, the videotaped participants 

were asked to read aloud. Immediately following completion of the second essay, 

the videotape was rewound and played to the learner. During playback, the researcher stopped the tape after episodes that appeared to involve noticing of linguistic features (i.e., underlining or hesitation) and asked the student to describe his/her thoughts during that time. English was used during all interactions between the participants and the researcher, which were audio recorded for transcription purposes.

5 Results and Discussion 


The first research question asked whether output activities promote noticing of grammatical 

features in subsequent input. In a restricted way, the hypothesis predicting greater 

noticing of the target forms for the EGs than for the CG was not confirmed (p 0.5) 7 . All

participants underlined vocabulary items rather than the grammatical cues in the reading 

text. However, the present study does show that output had an effect on learners’ 

attentional foci and input processing. While no participant appeared to notice the target 

form, most participants’ attention was drawn to the vocabulary in order to process the 

main message of the input passages. The predicted effect of output in promoting noticing of the correct use of conditional sentences was not supported in this study. Similarly, the output-input-output treatment did not alter the students’ level of performance on the immediate and delayed posttests when compared to the input-output treatment. However, output appeared to trigger noticing of vocabulary, style, and some content issues. This finding will be discussed in more detail below.

5 Modeling familiarizes the learners with the underlining procedure and increases the precision of the measure.

6 Three students did not show up for the delayed posttest.

7 Wilcoxon signed-rank tests were used for within-group comparisons.

The second research question addressed the acquisition issue and inquired whether 

output-input activities result in improved production of the target form. The present

study did not yield clear results in support of such a relationship, mainly because the 

noticing scores could not be sufficiently squared with posttest scores as there was a lack 

of grammar-related noticing with all candidates during the treatment phase. Put another 

way, the posttests do not provide a measure of the effect of noticing. Future research will 

need to use correlation analyses in order to square the underlining, stimulated recall, and 

second essay scores (as a measure of noticing) with gain on individualized vocabulary 

posttests. While data from the underlining, second essay, and stimulated recall point to a 

link between noticing and the subsequent use of noticed items, it would be premature to claim a causal relationship between noticing and acquisition.

Under the premise that attentional resources constrain noticing, the third research question 

asked whether a less difficult structure promotes greater noticing than a more difficult 

structure. The results of this study suggest that the less difficult structure had no effect 

on noticing or L2 learning 8 . There was no notable difference between EG1 and EG2 performance 

on any measure. Of course, any interpretation of the test-, noticing-, or essay 

scores would be unconvincing, given that only three candidates could be compared to 

another set of three candidates. The small number of participants notwithstanding, one 

possible explanation for this finding might be that the less difficult structure was not significantly 

easier. The production and processing of the present hypothetical conditional may have been just as cognitively demanding as those of the past hypothetical conditional. Thus, the results of this research can neither validate nor invalidate the claim that the demands

of the targeted grammatical structure or the complexity of the pedagogical task have no 

effect on learners’ attentional resources. In order to tap into a possible relationship between 

cognitive demands and noticing, the fact that one task is indeed cognitively more 

demanding must first be established. Before this research project is further pursued, the 

relative difficulty of both structures needs to be evaluated with a larger number of ESL 

learners. 
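
The caution about comparing two sets of three candidates can be illustrated with the Mann-Whitney U statistic named in footnote 8. The sketch below is a plain-Python illustration, not the analysis script used in the study: the scores are hypothetical, the function name `mann_whitney_u` is invented here, and the smallest attainable p-value is computed under the assumption of an exact two-sided test without ties.

```python
from math import comb


def mann_whitney_u(sample_a, sample_b):
    """Mann-Whitney U for sample_a versus sample_b: the number of pairs
    (a, b) in which a exceeds b, with ties counted as one half."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical scores for illustration only; these are not data from the study.
group_one = [3, 5, 4]
group_two = [4, 6, 2]

u_one = mann_whitney_u(group_one, group_two)
u_two = mann_whitney_u(group_two, group_one)
print(u_one, u_two)  # the two statistics always sum to len(a) * len(b)

# With three scores per group there are comb(6, 3) equally likely ways of
# assigning the six ranks, so even the most extreme outcome has a two-sided
# exact p-value of 2 / comb(6, 3) (assuming no ties).
min_two_sided_p = 2 / comb(6, 3)
print(min_two_sided_p)
```

Under these assumptions, a comparison of three against three participants cannot fall below the conventional .05 threshold regardless of how large the observed difference is.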

Although the research hypotheses of this study have not been supported, this research 

demonstrates that noticing has occurred. These insights contrast with Izumi and Bigelow’s 

conclusion that output does not trigger learners’ noticing (Izumi and Bigelow, 2000). The present study demonstrates that output treatment influences learners’ subsequent cognitive processes, e.g., “that is good way to say. I also wanna say something like that, but my essay is not so good, so I try to remember.” As such, output focused the learner’s attention on specific linguistic features in the output, and those noticed features were then compared

to the features the learner had produced in their first writing activity. Yet, the data from 

this study leaves us to wonder whether (and to what extent) the noticed features were 

incorporated into the interlanguage system. Chaudron (1985) argued that L2 learning involves 

two stages: first, the perception of input (noticing), and second, the integration of 

intake into the learner’s interlanguage system. Gass and Varonis (1994) similarly suggested that learners need to apperceive input before it can become intake. According to Ellis, “Intake occurs when learners take features into their short- or medium-term memories, whereas interlanguage change occurs only when they become part of long term memory” (Ellis, 1997, page 119). Accordingly, the learner has to move from preliminary to final intake. It would be interesting to investigate whether the learners in this study process the new linguistic items (e.g., vocabulary) beyond noticing and immediate intake, in order to contribute to theory building on input, intake, and L2 acquisition. A possible way of approaching this would be to include a delayed essay production task to see whether, and to what extent, the learners arrived at the “final intake stage”.

8 Mann-Whitney U tests were used for between-group comparisons.

The overall findings of the present study indicate that learners processed the input 

primarily for meaning. Although no form-focused comparisons were invoked, EG candidates 

noticed a difference between their word choice and style and those of the native 

speaker. These findings are in line with VanPatten (1996) who proposed input processing 

principles: 

1. The Primacy of Meaning Principle: Learners process input for meaning before they 

process it for form. 

2. The Primacy of Content Words Principle: Learners process content words in the 

input before anything else. 

3. The Lexical Preference Principle: Learners will tend to rely on lexical items as 

opposed to grammatical form to get meaning when both encode the same semantic 

information 9 . 

Applying VanPatten’s principles to the present study, it might be that all participants 

processed meaningful elements in the input while reading the input text. This may explain 

why they did not underline grammatical elements such as modals like would and could, 

auxiliaries and past participles. Because learners were not capable of attending to both vocabulary and grammar at once, the past/present hypothetical conditional may have been processed

only peripherally. 

Based on the input processing principles, VanPatten (1996) investigated the effects 

of processing instruction, revealing that learners’ focal attention during processing can 

be directed toward the relevant grammatical items and, in turn, enhance L2 learning. 

Follow-up research should investigate whether input enhancement or specific instructions 

to underline grammatical structures (e.g., the past/present hypothetical conditional) would 

enhance noticing, intake, and L2 acquisition. The present study deliberately gave no specific
instructions for the underlining beyond asking learners to underline what they considered
important for subsequent production. The study’s objective was to see whether output which requires
use of a particular structure results in underlining of that particular structure in the subsequent
input passage. Had the learners been told to underline grammatical structures, their
attentional foci would have been predisposed, as was the case in Izumi and Bigelow
(2000).

The findings in the present study also raise important methodological issues that should 

be addressed in future studies that investigate the role of noticing in SLA. First and foremost, 

this study has shown that triangulated or multiple data-elicitation measures can 

9 Only the relevant subset of the entire set of input processing principles is presented here 



provide a much more complex picture of learners’ internal processes. In this study, the 

underlining, the essays, the test scores, and the verbal reports from the stimulated recall

session, all helped to puzzle out the role of output and noticing in second language 

acquisition. Although verbal stimulated recall reports cannot provide a complete reflection 

of actual internal processing, they provided useful information as to how learners’ 

minds process language information when the learners articulated their concerns (e.g., I 

wanna say something like that, but my essay is not so good, so I try to remember) or 

when they made comparisons to their first essay (e.g., This is a big word I want to remember). 

Learners underlined the words that captured the author’s key message, and 

their comments reflected their intent in seeking meaning and better vocabulary for use in 

their second essay. The stimulated recall protocols obtained in this study collectively

demonstrate that the learners did not attend to grammatical features. Additionally, the 

data from the first and second essays illustrate that students improved their expression 

and word choice, but not their grammatical accuracy. Izumi and Bigelow were unable to 

draw such conclusions as they limited their measurement of noticing and their measurement 

of acquisition to the underlining of conditional related items and posttest scores, 

respectively. Izumi and Bigelow found that output does not prompt the learners to “notice 

the gap”. The present study, by contrast, reveals that some learners were aware that they
could not express themselves as fully as they wished (e.g., I want to say negotiate in
my essay, but I don’t remember it). They noticed their restricted lexicon and searched
for more appropriate words in the input passage. In other words, students became aware of lexical
gaps, which directed their attention to vocabulary in subsequent input.

6 Limitations and Future Research 

Although the present study sheds some light on meaning-focused processing and noticing 

as well as methodological issues, there are some limitations that need to be acknowledged. 

First and foremost, the small number of participants clearly limits the generalizability of the
findings to a broader variety of L2 learners 10 . Continuing this research with at least
twenty-one participants should reveal whether the current trends hold true. Further
studies could use a short questionnaire to ask participants who did not take part in the
stimulated recall what they noticed and what they assumed the purpose of the reading and
writing tasks was.

The testing instruments employed in this study are limited in length and scope, which
may have impacted the measurement of L2 attainment. Whereas a more comprehensive

test of the past-hypothetical-conditional may yield more valid results, it may also prompt 

participants to pay closer attention to the form in the input passage. Consequently, a tenable 

comparison between output and no-output treatment would be difficult, as all groups 

would produce the target form to the same extent. As mentioned earlier, in order to better 

understand the relationship between attention and learning, future research may develop 

tests that examine students’ acquisition of noticed vocabulary items. For such measurement, 

individualized delayed posttests in which the noticed (underlined and commented) 

items are assessed in terms of adequate usage and comprehension would be appropriate. 

10 The fact that the participants were willing to take part in the study outside of class time may have led

to a participant body that is more motivated and eager to improve than the average intermediate ESL learner 


7 Conclusions 

The purpose of this study was to investigate the effects of output and cognitive demands
on noticing and second language acquisition. The study has two main merits. First,
it has demonstrated how multiple perspectives can help to obtain insights into
learners’ cognitive processes. Second, its results support the noticing
function of output to some extent. Output-input treatment has been shown to trigger comparison
of the learner’s interlanguage lexicon with language produced by a native speaker.

Furthermore, this study demonstrates that learners primarily attend to meaning, which 

is in line with VanPatten’s input processing principles (VanPatten, 1996). However, the 

overall results do not allow for clear conclusions. Much more research is needed to find 

the extent to which learners notice specific features in the input as well as to explore the 

very mechanisms of noticing. Until then, our understanding of what takes place in the
learner’s head remains incomplete.

References 


Alanen, R. (1995). Input enhancement and rule representation in second language acquisition, 

in R. Schmidt (ed.), Attention and Awareness in Foreign Language Learning, 

University of Hawai’i Press, Honolulu. 

Chaudron, C. (1985). Intake: On models and methods for discovering learners’ processing 

of input, Studies in Second Language Acquisition 7(1): 1–14. 

Ellis, R. (1994). Factors in the incidental acquisition of second language vocabulary from 

oral input: A review essay, Applied Language Learning 5(1): 1–32. 

Ellis, R. (1997). SLA Research and Language Teaching, University Press, Oxford. 

Gass, S. and Mackey, A. (2000). Stimulated recall methodology in second language 

research, Lawrence Erlbaum Associates, London. 

Gass, S., Svetics, I. and Lemelin, S. (2003). Differential effects of attention, Language 

Learning 53(3): 497–545. 

Gass, S. and Varonis, E. M. (1994). Input, interaction and second language production, 

Studies in Second Language Acquisition 16(3): 283–302. 

Izumi, S. and Bigelow, M. (2000). Does output promote noticing and second language 

acquisition, TESOL Quarterly 34(2): 239–287. 

Jourdenais, R. (2001). Cognition, instruction and protocol analysis, in P. Robinson (ed.), 

Cognition and Second Language Instruction, Cambridge University Press, New 

York. 

Krashen, S. (1982). Principles and Practice in Second Language Acquisition, Pergamon, 

Oxford. 

Leow, R. P. (2002). Models, attention, and awareness in SLA, Studies in Second Language

Acquisition 24(1): 113–119. 



Long, M. (1996). The role of linguistic environment in second language acquisition, 

in W. C. Ritchie and T. K. Bhatia (eds), The Handbook of Language Acquisition, 

Academic Press, San Diego. 

Philp, J. (2003). Constraints on noticing the gap, Studies in Second Language Acquisition 

25(1): 99–126. 

Robinson, P. (1995). Attention, memory, and the noticing hypothesis, Language Learning 

45(2): 283–331. 

Robinson, P. (2001). Individual differences, cognitive abilities, aptitude complexes and 

learning conditions in second language acquisition, Second Language Research 

17(4): 368–392. 

Rosa, E. and O’Neill, M. D. (1999). Explicitness, intake and the issue of awareness, 

Studies in Second Language Acquisition 21(4): 511–556. 

Schmidt, R. (1990). The role of consciousness in second language learning, Applied 

Linguistics 11(2): 129–158. 

Schmidt, R. (2001). Attention, in P. Robinson (ed.), Cognition and Second Language 

Instruction, Cambridge University Press, New York. 

Simard, D. and Wong, W. (2001). Alertness, orientation and detection, Studies in Second 

Language Acquisition 23(1): 103–124. 

Swain, M. (1985). Communicative competence: Some roles of comprehensible input and 

comprehensible output in its development, in S. Gass and C. Madden (eds), Input in 

Second Language Acquisition, Heinle & Heinle, Boston. 

Swain, M. (1995). Three functions of output in second language learning, in G. Cook and 

B. Seidlhofer (eds), Principles and practice in applied linguistics: Studies in honor 

of H. Widdowson, University Press, Oxford. 

Swain, M. and Lapkin, S. (1995). Problems in output and the cognitive processes they 

generate: A step towards second language learning, Applied Linguistics 16(3): 371– 

391. 

Swain, M. and Lapkin, S. (1998). Interaction and second language learning: Two adolescent
French immersion students working together, Modern Language Journal
82(3): 320–337.

Tomlin, R. and Villa, V. (1994). Attention in cognitive science and second language 

acquisition, Studies in Second Language Acquisition 16(2): 183–204. 

Truscott, J. (1998). Noticing in second language acquisition: A critical review, Second
Language Research 14(2): 103–135.

VanPatten, B. (1996). Input processing and grammar instruction in second language 

acquisition, Ablex, Westport. 


CDIPROVER3: A TOOL FOR PROVING DERIVATIONAL COMPLEXITIES 

OF TERM REWRITING SYSTEMS 



Abstract. This paper describes cdiprover3, a tool for proving termination of term rewrite

systems by polynomial interpretations and context dependent interpretations. The methods 

used by cdiprover3 induce small bounds on the derivational complexity of the considered 

system. We explain the tool in detail, and give an overview of the employed proof methods. 

1 Introduction



Term rewriting is a Turing complete model of computation, which is conceptually closely 

related to declarative and (first-order) functional programming. One of its most studied 

properties, termination, is also a central problem in computer science. This property is 

undecidable in general, but many partial decision methods have been developed in the 

last decades. Beyond showing termination of a given rewriting system, some of these 

methods can also give bounds on different measures of its complexity. As suggested in 

(Hofbauer and Lautemann, 1989), a natural way of measuring the complexity of a term 

rewrite system is to analyze its derivational complexity. The derivational complexity is 

a function which relates the size of a term and the maximal number of rewrite steps that 

can be executed starting from any term of that size in the given rewrite system. We
are particularly interested in small, i.e. polynomial, upper bounds on this function. In
contrast to our approach of measuring derivational complexity, the constructor discipline
mentioned in (Lescanne, 1995) considers the complexity of the function
that is encoded by a constructor system. It is either measured by the number of rewrite

steps needed to bring the term into normal form (Bonfante, Cichon, Marion and Touzet, 

n.d.; Avanzini and Moser, 2008), or by counting the number of steps needed by some 

evaluation mechanism different from standard term rewriting (Marion, 2003; Bonfante, 

Marion and Péchoux, 2007). 

In this paper, we describe cdiprover3, a tool which uses polynomial and context-dependent
interpretations in order to prove termination and complexity bounds of term
rewrite systems. The tool, its predecessors, and full experimental data are available at
http://cl-informatik.uibk.ac.at/~aschnabl/experiments/cdi/ .

Polynomial interpretations, introduced in (Lankford, 1979), are a standard direct termination

proof method. Besides showing termination of rewrite systems, they also provide 

an easy way to extract upper bounds on the derivational complexity (Hofbauer and 

Lautemann, 1989). However, as noticed in (Hofbauer, 2001), this often heavily overestimates 

the derivational complexity. Context dependent interpretations, also introduced in 

(Hofbauer, 2001), are an effort to improve these upper bounds. 


The remainder of this paper is organised as follows: Section 2 outlines the basics of 

term rewriting needed to state all relevant results. In Section 3, we briefly describe polynomial 

and context dependent interpretations, which are used by cdiprover3. Section 

4 describes the implementation of cdiprover3, and mentions some experimental results. 

In Section 5, we explain the input and output of cdiprover3 in detail. Last, in 

Section 6, we state conclusions and potential future work. 

2 Term Rewriting 


In this section, we review some basics of term rewriting. We only cover the concepts 

which are relevant to this paper. A general introduction to term rewriting can be found in 

(Baader and Nipkow, 1998; TeReSe, 2003), for instance. 

A term rewrite system (TRS) R consists of a signature F, a countably infinite set of 

variables V disjoint from F, and a finite set of rewrite rules l → r, where l and r are terms 

such that l ∉ V and all variables which occur in r also occur in l. The signature F defines

a set of function symbols, and assigns to each function symbol f its arity. We assume that 

every signature contains at least one function symbol of arity 0. The set of terms built 

from F and V is denoted by T (F, V). The set of terms T (F) without any variables is 

called the set of ground terms over F. A function symbol is defined if it occurs at the 

root of a left hand side of a rewrite rule. All non-defined function symbols are called 

constructors. A constructor based term is a term containing exactly one defined function 

symbol, which appears at the root of that term. We call the total number of function 

symbol and variable occurrences in a term t its size, denoted by |t|. A substitution is a 

mapping σ : Dom(σ) → T (F, V), where Dom(σ) is a finite subset of V. The result of 

replacing all occurrences of variables x ∈ Dom(σ) in a term t by σ(x) is denoted by tσ. 

A context is a term C[□] containing a single occurrence of a fresh function symbol □ of
arity 0. If we replace □ with a term t, we denote the resulting term by C[t]. Given a TRS

R and two terms s, t, we say that s rewrites to t ($s \to_R t$) if there exist a context C, a
substitution σ and a rewrite rule l → r in R such that s = C[lσ] and t = C[rσ]. The
transitive closure of this relation is $\to_R^+$. The reflexive and transitive closure is $\to_R^*$. We
write $\to_R^n$ to express the n-fold composition of $\to_R$. A TRS R is terminating if there exists
no infinite chain of terms $t_0, t_1, \ldots$ such that $t_i \to_R t_{i+1}$ for each i ∈ N. For a terminating
TRS R, the derivation length of a ground term t is defined as
$dl_R(t) = \max\{n \mid \exists s : t \to_R^n s\}$. The derivational complexity is the function
$dc_R : \mathbb{N} \to \mathbb{N}$ which maps n to $\max\{dl_R(t) \mid |t| = n\}$.
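The definitions of this section can be made concrete with a small Python sketch. It is only an illustration and not part of cdiprover3: the tuple encoding of terms, the two-rule toy system for addition, and all function names are assumptions chosen for this example. The derivation length is computed by exhaustive search, which is feasible only for small terms of a terminating system.

```python
from functools import lru_cache

# Terms are nested tuples ("f", arg1, ..., argn); variables are plain strings.
# Hypothetical toy TRS: addition on unary numbers.
RULES = [
    (("+", ("0",), "y"), "y"),
    (("+", ("s", "x"), "y"), ("s", ("+", "x", "y"))),
]

def match(pattern, term, subst):
    """Extend subst so that pattern instantiated by subst equals term, else None."""
    if isinstance(pattern, str):  # pattern is a variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
        subst = match(sub_pattern, sub_term, subst)
        if subst is None:
            return None
    return subst

def substitute(term, subst):
    """Apply a substitution to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(substitute(arg, subst) for arg in term[1:])

def successors(term):
    """All terms reachable from term in exactly one rewrite step."""
    if isinstance(term, str):
        return []
    result = []
    for lhs, rhs in RULES:  # rewrite at the root
        subst = match(lhs, term, {})
        if subst is not None:
            result.append(substitute(rhs, subst))
    for i in range(1, len(term)):  # rewrite inside an argument (the context)
        for new_arg in successors(term[i]):
            result.append(term[:i] + (new_arg,) + term[i + 1:])
    return result

def size(term):
    """Number of function symbol and variable occurrences."""
    return 1 if isinstance(term, str) else 1 + sum(size(arg) for arg in term[1:])

@lru_cache(maxsize=None)
def dl(term):
    """Derivation length: the longest rewrite sequence starting from term."""
    return max((1 + dl(t) for t in successors(term)), default=0)
```

For instance, the term +(s(0), 0) is written `("+", ("s", ("0",)), ("0",))`; it has size 4 and derivation length 2 in this toy system.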

3 Used Termination Proof Methods 

3.1 Polynomial Interpretations 

An F-algebra A for some signature F consists of a carrier A and interpretation functions
$\{f_A : A^n \to A \mid f \in F, n = \mathrm{arity}(f)\}$. Given an assignment α : V → A, we denote the
evaluation of a term t into A by $[\alpha]_A(t)$. It is defined inductively as follows:

\[ [\alpha]_A(x) = \alpha(x) \quad \text{for } x \in V \]
\[ [\alpha]_A(f(t_1, \ldots, t_n)) = f_A([\alpha]_A(t_1), \ldots, [\alpha]_A(t_n)) \quad \text{for } f \in F \]



A well-founded monotone F-algebra is a pair (A, >) where A is an F-algebra and > is 

a well-founded proper order such that for every function symbol f ∈ F, fA is monotone 

with respect to >. It is compatible with a TRS R if for every rewrite rule l → r in R 

and every assignment α, [α]A(l) > [α]A(r) holds. It is a well-known fact that a TRS R 

is terminating if and only if there exists a well-founded monotone algebra that is compatible 

with R. A polynomial interpretation (Lankford, 1979) is an interpretation into a 

well-founded monotone algebra (A, >) such that A ⊆ N, > is the standard order on the 

natural numbers, and $f_A$ is a polynomial for every function symbol f. If a polynomial
interpretation is compatible with a TRS R, then we clearly have $dl_R(t) \le [\alpha]_A(t)$ for all
terms t.

Example 1. Consider the TRS R with the following rewrite rules over the signature containing 

the function symbols 0 (arity 0), s (arity 1), + and - (arity 2). The system is 

example SK90/2.11.trs in the termination problems database 1 (TPDB), which is the 

standard benchmark for termination provers: 

    +(0, y) → y
    +(s(x), y) → s(+(x, y))
    -(0, y) → 0
    -(x, 0) → x
    -(s(x), s(y)) → -(x, y)

The following interpretation functions build a compatible polynomial interpretation A
over the carrier N:

\[ +_A(x, y) = 2x + y \qquad -_A(x, y) = 3x + 3y \qquad s_A(x) = x + 2 \qquad 0_A = 1 \]
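Compatibility of the interpretation in Example 1 can be spot-checked with a few lines of Python. The sketch below is an illustration only: the term encoding and function names are assumptions, and testing a finite grid of assignments is a sanity check rather than a proof (for linear polynomials, a coefficient-wise comparison would give one).

```python
from itertools import product

# Interpretation functions of Example 1 over the natural numbers.
INTERP = {
    "+": lambda x, y: 2 * x + y,
    "-": lambda x, y: 3 * x + 3 * y,
    "s": lambda x: x + 2,
    "0": lambda: 1,
}

# Rules of Example 1; variables are strings, all other terms are tuples.
RULES = [
    (("+", ("0",), "y"), "y"),
    (("+", ("s", "x"), "y"), ("s", ("+", "x", "y"))),
    (("-", ("0",), "y"), ("0",)),
    (("-", "x", ("0",)), "x"),
    (("-", ("s", "x"), ("s", "y")), ("-", "x", "y")),
]

def evaluate(term, alpha):
    """Evaluate a term under the assignment alpha (a dict from variables to naturals)."""
    if isinstance(term, str):
        return alpha[term]
    return INTERP[term[0]](*(evaluate(arg, alpha) for arg in term[1:]))

def decreasing_on_samples(limit=20):
    """Check lhs > rhs for every rule on all assignments with values below limit."""
    for x, y in product(range(limit), repeat=2):
        alpha = {"x": x, "y": y}
        if any(evaluate(lhs, alpha) <= evaluate(rhs, alpha) for lhs, rhs in RULES):
            return False
    return True
```

The value of a ground term bounds its derivation length from above: +(s(0), 0) evaluates to 2 · 3 + 1 = 7, while only two rewrite steps are possible from it.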

A strongly linear interpretation is a polynomial interpretation such that every interpretation
function $f_A$ has the form $f_A(x_1, \ldots, x_n) = \sum_{i=1}^{n} x_i + c$ with c ∈ N. A surprisingly
simple property is that compatibility with a strongly linear interpretation induces a linear
upper bound on the derivational complexity (Schnabl, 2007).

A linear polynomial interpretation is a polynomial interpretation where each interpretation
function $f_A$ has the shape $f_A(x_1, \ldots, x_n) = \sum_{i=1}^{n} a_i x_i + c$ with $a_i \in \mathbb{N}$ and c ∈ N.

For instance, the interpretation given in Example 1 is a linear polynomial interpretation. 

Because of their simplicity, this class of polynomial interpretations is the one most commonly 

used in automatic termination provers. As illustrated by Example 2 below, if only 

a single one of the coefficients $a_i$ in any of the functions $f_A$ is greater than 1, there might

already exist derivations whose length is exponential in the size of the starting term. 

Example 2. Consider the TRS S with the following single rule over the signature containing 

the function symbols a, b (arity 1), and c (arity 0). The system is example 

SK90/2.50.trs in the TPDB: 

a(b(x)) → b(b(a(x))) 

The following interpretation functions build a compatible linear polynomial interpretation 

A over N: 

\[ a_A(x) = 3x \qquad b_A(x) = x + 1 \qquad c_A = 0 \]

If we start a rewrite sequence from the term $a^n(b(c))$, we reach the normal form $b^{2^n}(a^n(c))$
after $2^n - 1$ rewriting steps. Therefore, the derivational complexity of S is at least exponential.
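The exponential step count of Example 2 is easy to reproduce. Since a and b are unary, terms can be written as strings and the rule becomes the string rewriting rule ab → bba. This encoding, which omits the constant c, and the function name are assumptions made for the sketch below.

```python
def steps_to_normal_form(word):
    """Rewrite with ab -> bba until no redex is left; return (steps, normal form)."""
    steps = 0
    while "ab" in word:
        word = word.replace("ab", "bba", 1)  # rewrite the leftmost redex
        steps += 1
    return steps, word

for n in range(1, 6):
    # The start word a^n b stands for the term a^n(b(c)).
    steps, normal_form = steps_to_normal_form("a" * n + "b")
    print(n, steps, len(normal_form) - n)  # n, number of steps, number of b symbols
```

Every rewrite step adds exactly one b, so the number of steps from a^n b to b^(2^n) a^n does not depend on the rewriting strategy.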

1 http://www.lri.fr/~marche/tpdb/.


3.2 Context Dependent Interpretations 

Even though polynomial interpretations provide an easy way to obtain an upper bound 

on the derivational complexity of a TRS, they are not very suitable for proving polynomial 

derivational complexity. Strongly linear interpretations only capture linear derivational 

complexity, but even a slight generalization already admits examples of exponential

derivational complexity, as illustrated by Example 2. In (Hofbauer, 2001), context dependent 

interpretations are introduced. They use an additional parameter (usually denoted 

by ∆) in the interpretation functions, which changes in the course of evaluating the interpretation 

of a term, thus making the interpretation dependent on the context. This way of 

computing interpretations also allows us to bridge the gap between linear and polynomial 

derivational complexity. 

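Definition 3. A context-dependent interpretation C for some signature F consists of functions
$\{f_C[\Delta] : (\mathbb{R}_0^+)^n \to \mathbb{R}_0^+ \mid f \in F, n = \mathrm{arity}(f), \Delta \in \mathbb{R}^+\}$ and
$\{f_C^i : \mathbb{R}^+ \to \mathbb{R}^+ \mid f \in F, i \in \{1, \ldots, \mathrm{arity}(f)\}\}$. Given a ∆-assignment
$\alpha : \mathbb{R}^+ \times V \to \mathbb{R}_0^+$, the evaluation of a
term t by C is denoted by $[\alpha, \Delta]_C(t)$. It is defined inductively as follows:

\[ [\alpha, \Delta]_C(x) = \alpha(\Delta, x) \quad \text{for } x \in V \]
\[ [\alpha, \Delta]_C(f(t_1, \ldots, t_n)) = f_C[\Delta]([\alpha, f_C^1(\Delta)]_C(t_1), \ldots, [\alpha, f_C^n(\Delta)]_C(t_n)) \quad \text{for } f \in F \]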

Definition 4. For each $\Delta \in \mathbb{R}^+$, let $>_\Delta$ be the order defined by $a >_\Delta b \iff a - b \ge \Delta$.
A context-dependent interpretation C is compatible with a TRS R if for all rewrite rules
l → r in R, all $\Delta \in \mathbb{R}^+$, and every ∆-assignment α, we have $[\alpha, \Delta]_C(l) >_\Delta [\alpha, \Delta]_C(r)$.

Definition 5. A ∆-linear interpretation is a context dependent interpretation C whose
interpretation functions have the form

\[ f_C[\Delta](z_1, \ldots, z_n) = \sum_{i=1}^{n} a_{(f,i)} z_i + \sum_{i=1}^{n} b_{(f,i)} z_i \Delta + c_f \Delta + d_f \qquad f_C^i(\Delta) = \frac{\Delta}{a_{(f,i)} + b_{(f,i)} \Delta} \]

with $a_{(f,i)}, b_{(f,i)}, c_f, d_f \in \mathbb{N}$ and $a_{(f,i)} + b_{(f,i)} \ne 0$ for all f ∈ F, 1 ≤ i ≤ n. If we have
$a_{(f,i)} \in \{0, 1\}$ for all f, i, we also call it a ∆-restricted interpretation.

We consider ∆-linear interpretations because of the similarity between the functions
$f_C[\Delta]$ and the interpretation functions of linear polynomial interpretations. Another point
of interest is that the simple syntactical restriction to ∆-restricted interpretations yields a
quadratic upper bound on the derivational complexity. Moreover, because of the special
shape of ∆-linear interpretations, we need no additional monotonicity criterion for our
main theorems:

Theorem 6 (Moser and Schnabl, 2008). Let R be a TRS and suppose that there exists
a compatible ∆-linear interpretation. Then R is terminating and $dc_R(n) = 2^{O(n)}$.

Theorem 7 (Schnabl, 2007). Let R be a TRS and suppose that there exists a compatible
∆-restricted interpretation. Then R is terminating and $dc_R(n) = O(n^2)$.

Example 8. Consider the TRS given in Example 1 again. A compatible ∆-restricted (and 

∆-linear) interpretation C is built from the following interpretation functions: 

\[ +_C[\Delta](x, y) = (1 + \Delta)x + y + \Delta \qquad +_C^1(\Delta) = \frac{\Delta}{1 + \Delta} \qquad +_C^2(\Delta) = \Delta \]
\[ -_C[\Delta](x, y) = x + y + \Delta \qquad -_C^1(\Delta) = \Delta \qquad -_C^2(\Delta) = \Delta \]
\[ s_C[\Delta](x) = x + \Delta + 1 \qquad s_C^1(\Delta) = \Delta \qquad 0_C[\Delta] = 0 \]

Note that this interpretation gives a quadratic upper bound on the derivational complexity.
However, from the polynomial interpretation given in Example 1, we can only infer an exponential
upper bound (Hofbauer and Lautemann, 1989). Consider the term $P_{n,n}$, where
we define $P_{0,n} = s^n(0)$ and $P_{m+1,n} = +(P_{m,n}, 0)$. We have $|P_{n,n}| = 3n + 1$. For every
m, n ∈ N, $P_{m+1,n}$ rewrites to $P_{m,n}$ in n + 1 steps. Therefore, $P_{n,n}$ reaches its normal form
$s^n(0)$ after n(n + 1) rewriting steps. Hence, the derivational complexity is also $\Omega(n^2)$ for
this example, so the inferred bound $O(n^2)$ is tight.
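The evaluation scheme of Definition 3 and the compatibility condition of Definition 4 can be tried out on Example 8 with the following Python sketch. It is an illustration under stated assumptions: the interpretation functions are transcribed as read from Example 8, exact rational arithmetic stands in for the reals, the sample ∆-assignment α(∆, v) = k + ∆ is hypothetical, and checking finitely many samples is a sanity check, not a proof of compatibility.

```python
from fractions import Fraction
from itertools import product

# Interpretation functions f_C[Delta] as read from Example 8; the first
# parameter d is Delta, the remaining ones are the evaluated arguments.
F = {
    "+": lambda d, x, y: (1 + d) * x + y + d,
    "-": lambda d, x, y: x + y + d,
    "s": lambda d, x: x + d + 1,
    "0": lambda d: Fraction(0),
}
# Parameter functions f_C^i, one per argument position.
PARAM = {
    "+": [lambda d: d / (1 + d), lambda d: d],
    "-": [lambda d: d, lambda d: d],
    "s": [lambda d: d],
    "0": [],
}
# Rules of Example 1; variables are strings, all other terms are tuples.
RULES = [
    (("+", ("0",), "y"), "y"),
    (("+", ("s", "x"), "y"), ("s", ("+", "x", "y"))),
    (("-", ("0",), "y"), ("0",)),
    (("-", "x", ("0",)), "x"),
    (("-", ("s", "x"), ("s", "y")), ("-", "x", "y")),
]

def evaluate(term, delta, alpha):
    """Compute [alpha, delta]_C(term); Delta changes on the way down the term."""
    if isinstance(term, str):
        return alpha(delta, term)
    symbol, args = term[0], term[1:]
    values = [evaluate(arg, PARAM[symbol][i](delta), alpha)
              for i, arg in enumerate(args)]
    return F[symbol](delta, *values)

def compatible_on_samples():
    """Check lhs - rhs >= Delta for every rule on a few sample values."""
    deltas = [Fraction(1, 10), Fraction(1, 2), Fraction(1), Fraction(7)]
    for delta, x, y in product(deltas, range(4), range(4)):
        base = {"x": x, "y": y}
        # Hypothetical Delta-assignment that also depends on Delta.
        alpha = lambda d, v: base[v] + d
        for lhs, rhs in RULES:
            if evaluate(lhs, delta, alpha) - evaluate(rhs, delta, alpha) < delta:
                return False
    return True
```

For the rule +(s(x), y) → s(+(x, y)) the difference between both sides is exactly ∆, which shows why the parameter passed to the first argument of + has to shrink to ∆/(1 + ∆).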

4 Implementation 


cdiprover3 is written fully in OCaml 2 . It employs the libraries of the termination
prover TTT2 3 . From these libraries, functionality for handling TRSs and SAT encodings,
and an interface to the SAT solver MiniSAT 4 are used. Not counting these libraries, the tool
consists of about 1700 lines of OCaml code. About 25% of that code is devoted to
the manipulation of polynomials and extensions of polynomials that stem from our use
of the parameter ∆. Another 35% is used for constructing parametric interpretations
and building suitable Diophantine constraints (see below) which enforce the necessary
conditions for termination. Using TTT2’s library for propositional logic and its interface
to MiniSAT, 15% of the code deals with encoding Diophantine constraints into SAT. The
remaining code is used for parsing input options and the given TRS, generating output,
and controlling the program flow.

In order to find polynomial interpretations automatically, Diophantine constraints are 

generated according to the procedure described in (Contejean, Marché, Tomás and Urbain, 

2005). Putting an upper bound on the coefficients makes the problem finite. Essentially 

following (Fuhs, Giesl, Middeldorp, Schneider-Kamp, Thiemann and Zankl, 2007), 

we then encode the (finite domain) constraints into a propositional satisfiability problem. 

This problem is given to MiniSAT. From a satisfying assignment for the SAT problem, 

we construct a polynomial interpretation which is monotone and compatible with the 

given TRS. 
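To illustrate how bounding the coefficients makes the search finite, the following Python sketch looks for a linear polynomial interpretation for the single rule a(b(x)) → b(b(a(x))) of Example 2. It is not the procedure implemented in cdiprover3: plain enumeration of all coefficient values replaces the SAT encoding, and the function name and the coefficient names a1, a0, b1, b0 are assumptions. The constraints compare the coefficients of both sides, which is exact for linear polynomials over the natural numbers.

```python
from itertools import product

def find_linear_interpretation(bound):
    """Search a_A(x) = a1*x + a0 and b_A(x) = b1*x + b0 with coefficients in
    0..bound that are monotone and compatible with a(b(x)) -> b(b(a(x)))."""
    for a1, a0, b1, b0 in product(range(bound + 1), repeat=4):
        if a1 < 1 or b1 < 1:  # monotonicity of both interpretation functions
            continue
        # Left-hand side:  a1*b1*x + (a1*b0 + a0)
        # Right-hand side: a1*b1^2*x + (b1^2*a0 + b1*b0 + b0)
        lhs_x, lhs_const = a1 * b1, a1 * b0 + a0
        rhs_x, rhs_const = a1 * b1 * b1, b1 * b1 * a0 + b1 * b0 + b0
        # Diophantine constraints: lhs - rhs > 0 for all natural numbers x.
        if lhs_x >= rhs_x and lhs_const > rhs_const:
            return {"a": (a1, a0), "b": (b1, b0)}
    return None

for bound in (1, 2, 3):
    print(bound, find_linear_interpretation(bound))
```

The constant c does not occur in the rule, so any natural number can interpret it. The search space grows as (bound + 1)^4 even for this tiny system, which hints at why a SAT solver is preferable to enumeration for realistic inputs.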

This procedure is also the basis of the automatic search for ∆-linear and ∆-restricted 

interpretations. The starting point of that search is an interpretation with uninstantiated 

coefficients. If we want to be able to apply Theorem 6 or 7, we need to find coefficients 

which make the resulting interpretation compatible with the given TRS. Furthermore, 

we need to make sure that no divisions by zero occur in the interpretation functions. 

Again, we encode these properties into Diophantine constraints on the coefficients of a 

∆-linear or ∆-restricted interpretation. The encoding is an adaptation of the procedure in 

2 http://caml.inria.fr. 

3 http://colo6-c703.uibk.ac.at/ttt2. 

4 http://minisat.se. 


Table 1: Performance of cdiprover3

Method             -i -b X   # success   average success time (ms)   # timeout
SL                 31        41          20                          0
SL+∆-restricted    31        87          3010                        237
∆-linear           31        83          5527                        797
∆-restricted       3         83          3652                        144
∆-restricted       7         86          4041                        189
∆-restricted       15        86          4008                        221
∆-restricted       31        86          3986                        238

(Contejean et al., 2005) to context-dependent interpretations. It is described in detail in 

(Schnabl, 2007; Moser and Schnabl, 2008). Once we have built the constraints, we continue 

using the same techniques as for searching polynomial interpretations: we encode 

the constraints in a propositional satisfiability problem, apply the SAT solver, and use a 

satisfying assignment to construct a context-dependent interpretation. 

Table 1 shows experimental results of applying cdiprover3 to the 957 known terminating
examples of the TPDB. The tests were performed single-threaded on a 2.40 GHz
Intel® Core™ 2 Duo with 2 GB of memory. For each system, cdiprover3 was given

a timeout of 60 seconds. All times in the table are given in milliseconds. The method 

SL denotes strongly linear interpretations. In all tests, we called cdiprover3 with the 

options -i -b X (see Section 5 below), where X is specified in the second row of the 

table. As we can see, cdiprover3 is currently able to prove polynomial derivational 

complexity for 87 of the 368 known terminating non-duplicating rewrite systems of the 

TPDB (duplicating rewrite systems have at least exponential derivational complexity, so 

this restriction is harmless here). The results indicate that an upper bound of 7 on the coefficient 

variables suffices to capture all examples on our test set. Therefore, 3 and 7 seem 

to be good candidates for default values of the -b option. However, it should be noted
that our handling of the divisions introduced by the functions $f_C^i$ is computationally rather
expensive, which is indicated by the number of timeouts and the average time needed
for successful proofs. This also explains the slight decrease in performance when we
extend the search space to ∆-linear interpretations. However, there is one system which
can be handled by ∆-linear interpretations, but not by ∆-restricted interpretations: system
SK90/2.50 in the TPDB, which we mentioned in Example 2.

5 Using cdiprover3 


cdiprover3 is called from the command line. The basic usage pattern for cdiprover3
is

$ ./cdiprover3 

• The timeout argument specifies the maximum number of seconds until cdiprover3 stops
looking for a suitable interpretation.

• The file argument specifies the path to the file which contains the considered TRS.

• For the options, the following switches are available:

-c defines the desired subclass of the searched polynomial or context-dependent
interpretation. The following values of its argument are legal:



    linear, simple, simplemixed, quadratic: These classes correspond to the respective
    subclasses of polynomial interpretations, as defined in (Steinbach,
    1992). Linear polynomial interpretations imply an exponential upper
    bound on the derivational complexity. The other classes imply a double
    exponential upper bound, cf. (Hofbauer and Lautemann, 1989).

    pizerolinear, pizerosimple, pizerosimplemixed, pizeroquadratic: For these
    values, cdiprover3 tries to find a polynomial interpretation with the
    following restrictions: defined function symbols are interpreted by linear,
    simple, simple-mixed, or quadratic polynomials, respectively. Constructors
    are interpreted by strongly linear polynomials. These interpretations
    guarantee that the derivation length of all constructor based terms is polynomial
    (Bonfante et al., n.d.).

    sli: This option corresponds to strongly linear interpretations. As mentioned
    in Section 3, they induce a linear upper bound on the derivational complexity
    of a compatible TRS.

    deltalinear: This value specifies that the tool should search for a ∆-linear
    interpretation. By Theorem 6, compatibility with such an interpretation
    implies an exponential upper bound on the derivational complexity.

    deltarestricted: This option corresponds to ∆-restricted interpretations. By
    Theorem 7, they induce a quadratic upper bound.

-b sets the upper bound for the coefficient variables. The default value
for this bound is 3.

-i This switch activates an incremental strategy for handling the upper bound on 

the coefficient variables. First, cdiprover3 tries to find a solution using 

an intermediate upper bound of 1 (which corresponds to encoding each coefficient 

variable by one bit). Whenever the tool fails to find a proof for some 

upper bound b, it is checked whether b is equal to the bound specified by the 

-b option. If that is the case, then the search for a proof is given up. Otherwise, 

b is set to the minimum of the bound specified by the -b option and 

2(b+1)−1 (which corresponds to increasing the number of bits used for each 

coefficient variable by 1). 
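The sequence of bounds tried by the incremental strategy follows directly from the update rule b ↦ min(B, 2(b + 1) − 1), where B is the bound given with -b. The Python sketch below only illustrates this arithmetic; the function name is hypothetical and B ≥ 1 is assumed.

```python
def bound_schedule(max_bound):
    """Yield the successive upper bounds tried by the incremental strategy."""
    b = 1  # one bit per coefficient variable
    while True:
        yield b
        if b >= max_bound:  # the bound given with -b is reached: give up after this try
            return
        b = min(max_bound, 2 * (b + 1) - 1)  # one more bit per coefficient variable

print(list(bound_schedule(31)))  # [1, 3, 7, 15, 31]
print(list(bound_schedule(5)))   # [1, 3, 5]
```

For a bound of the form 2^k − 1 the schedule visits exactly the bounds with 1, 2, ..., k bits, which matches the values of X used in Table 1.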

If the -c switch is not specified, then the standard strategy for proving polynomial 

derivational complexity is employed. First, cdiprover3 looks for a strongly linear 

interpretation. If that is not successful, then a suitable ∆-restricted interpretation is
searched for. The input TRS files are expected to have the same format as the files in the
TPDB. The format specification for this database is available at
http://www.lri.fr/~marche/tpdb/format.html.

The output given by cdiprover3, as exemplified by Example 9, is structured as 

follows. The first line contains a short answer to the question whether the given TRS 

is terminating: YES, MAYBE, or TIMEOUT. TIMEOUT means that cdiprover3 was

still busy after the specified timeout. MAYBE means that a termination proof could not 

be found, and cdiprover3 gave up before time ran out. The answer YES indicates 

that an interpretation of the given class has been found which guarantees termination of 

the given TRS. It is followed by the inferred bound on the derivational complexity and a 


listing of the interpretation functions. After the interpretation functions, the elapsed time 

between the call of cdiprover3 and the output of the proof is given. In all cases, the 

answer is concluded by statistics stating the total number of monomials in the constructed 

Diophantine constraints, and the upper bound for the coefficients that was used in the last 

call to MiniSAT. 

Example 9. Given the TRS shown in Example 1, cdiprover3 produces the output 

shown in Figure 1. The interpretations in Example 8 and in the output are equivalent. 

Note that the parameter ∆ in the interpretation functions $f_C[\Delta]$ is treated like another
argument of the function. The interpretation functions $f_C^i$ are represented by f tau i in the
output.

6 Conclusion 

In this paper, we have presented the (as far as we know) first tool which is specifically 

designed for automatically proving polynomial derivational complexity of term rewriting. 

We have also given a brief introduction to the applied proof methods. With our current 

implementation, we are able to prove polynomial derivational complexity for 87 of 

the 368 known terminating non-duplicating rewrite systems of the TPDB. By adding new 

termination methods to our tool which can prove polynomial derivational complexity of 

rewrite systems, we could extend the range of problems that the prover can solve. The 

matchbounds technique comes to mind here, which induces a linear upper bound on the 

derivational complexity of the considered system (Geser, Hofbauer, Waldmann and Zantema, 

2007; Korp and Middeldorp, 2007). Another avenue for future work is the search for 

other subclasses of context-dependent interpretations which imply non-quadratic and non-linear, 

but polynomial upper bounds on the derivational complexity. A further possibility 

would be to find more efficient ways of handling the divisions introduced by the functions 

$f_C^i$. Results in this area would help to further improve the power of cdiprover3. 

References 


Avanzini, M. and Moser, G. (2008). Complexity analysis by rewriting, Proc. 9th FLOPS, 

Vol. 4989 of LNCS, pp. 130–146. 

Baader, F. and Nipkow, T. (1998). Term Rewriting and All That, Cambridge University 

Press. 

Bonfante, G., Cichon, A., Marion, J.-Y. and Touzet, H. (n.d.). Algorithms with polynomial 

interpretation termination proof, J. Funct. Program. (1): 33–53. 

Bonfante, G., Marion, J.-Y. and Péchoux, R. (2007). Quasi-interpretation synthesis by 

decomposition, Proc. 4th ICTAC, Vol. 4711 of LNCS, pp. 410–424. 

Contejean, E., Marché, C., Tomás, A. P. and Urbain, X. (2005). Mechanically proving 

termination using polynomial interpretations, J. Autom. Reason. 34(4): 325–363. 

Fuhs, C., Giesl, J., Middeldorp, A., Schneider-Kamp, P., Thiemann, R. and Zankl, H. 

(2007). SAT solving for termination analysis with polynomial interpretations, Proc. 

SAT 2007, Vol. 4501 of LNCS, pp. 340–354. 



Geser, A., Hofbauer, D., Waldmann, J. and Zantema, H. (2007). On tree automata that 

certify termination of left-linear term rewriting systems, Inf. Comput. 205(4): 512–534. 

Hofbauer, D. (2001). Termination proofs by context-dependent interpretations, Proc. 12th 

RTA, Vol. 2051 of LNCS, pp. 108–121. 

Hofbauer, D. and Lautemann, C. (1989). Termination proofs and the length of derivations, 

Proc. 3rd RTA, Vol. 355 of LNCS, pp. 167–177. 

Korp, M. and Middeldorp, A. (2007). Proving termination of rewrite systems using 

bounds, Proc. 18th RTA, Vol. 4533 of LNCS, pp. 273–287. 

Lankford, D. (1979). On proving term-rewriting systems are noetherian, Technical Report 

MTP-2, Math. Dept., Louisiana Tech. University. 

Lescanne, P. (1995). Termination of rewrite systems by elementary interpretations, Formal 

Aspects of Computing 7(1): 77–90. 

Marion, J.-Y. (2003). Analysing the implicit complexity of programs, Inf. Comput. 

183(1): 2–18. 

Moser, G. and Schnabl, A. (2008). Proving quadratic derivational complexities using 

context dependent interpretations, Proc. 19th RTA. Accepted for publication. 

Schnabl, A. (2007). Context Dependent Interpretations, Master’s thesis, Universität 

Innsbruck. Available online at http://cl-informatik.uibk.ac.at/~aschnabl/ 

Steinbach, J. (1992). Proving polynomials positive, Proc. 12th FSTTCS, Vol. 652 of 

LNCS, pp. 191–202. 

TeReSe (2003). Term Rewriting Systems, Vol. 55 of Cambridge Tracts in Theoretical 

Computer Science, Cambridge University Press. 


Figure 1: Output produced by cdiprover3. 

$ cat tpdb-4.0/TRS/SK90/2.11.trs 

(VAR x y) 

(RULES 

+(0,y) -> y 

+(s(x),y) -> s(+(x,y)) 

-(0,y) -> 0 

-(x,0) -> x 

-(s(x),s(y)) -> -(x,y) 

) 

(COMMENT Example 2.11 (Addition and Subtraction) in \cite{SK90}) 

$ ./cdiprover3 -i tpdb-4.0/TRS/SK90/2.11.trs 60 

YES 

QUADRATIC upper bound on the derivational complexity 

This TRS is terminating using the deltarestricted interpretation 

-(delta, X1, X0) = + 1*X0 + 1*X1 + 0 + 0*X0*delta + 0*X1*delta + 1*delta 

s(delta, X0) = + 1*X0 + 1 + 0*X0*delta + 1*delta 

0(delta) = + 0 + 0*delta 

+(delta, X1, X0) = + 1*X0 + 1*X1 + 0 + 0*X0*delta + 1*X1*delta + 1*delta 

- tau 1(delta) = delta/(1 + 0 * delta) 

- tau 2(delta) = delta/(1 + 0 * delta) 

s tau 1(delta) = delta/(1 + 0 * delta) 

+ tau 1(delta) = delta/(1 + 1 * delta) 

+ tau 2(delta) = delta/(1 + 0 * delta) 

Time: 0.024418 seconds 

Statistics: 


Number of monomials: 187 

Last formula building started for bound 1 

Last SAT solving started for bound 1 


THE RANK(S) OF A TOTALLY LEXICALIST SYNTAX 


University of Pécs 

Abstract. Our project works on the implementation of a totally lexicalist grammar. The syntax 

has now been worked out; in this approach it resembles a dependency grammar, except that word 

order is handled as well. In harmony with the idea of total lexicalism, there are no PS-trees (and no 

transformations). We use rank parameters, close in spirit to Optimality Theory, to express word order 

variation in a language. A special kind of rank parameter accounts for Hungarian focus phenomena, 

which make radical surface changes in word order (beyond intonational effects). 

The system is implemented in a relational database (SQL). 



Predicates seek their arguments in every language of the world, and adjuncts likewise 

seek their attachment points. We claim that only 8-10 operations are at work in languages, 

but their effectiveness differs. This can be captured by rank parameters: a universal 

tool (as in Optimality Theory (Archangeli and Langendoen, 1997)) with language-specific 

settings. Our project aims to develop an MT system based on GASG (Generalized Argument 

Structure Grammar), a totally lexicalist theory (Alberti, 1999). We are primarily 

linguists, so our main goal is a linguistic one. Lexicalist theories are successful nowadays, 

and we aim to try out this extreme form of lexicalism both theoretically and practically. 

For us this is more important than efficiency in size, speed or time. 

The lexicon is stored in a relational database. The essence of relational databases lies in the 

definition of relations. Relations describe facts and constitute the database as well. Each 

entity is an n-tuple: the elements of the tuple stand in a relation and constitute a record. 

The elements are attributes, which constitute the fields of a record. A relation is a table 

in the database, where each row (record) is an n-tuple and each column is an attribute 

(Halassy, 1994). We chose Microsoft SQL 2005 for our implementation, so we have a 

complete and complex database management framework. 

A morphophonological component has been transferred from our former project. Now 

rules of syntax are being built in. The main component will be the semantic component: 

the implementation of the DRT-based (Kamp, van Genabith and Reyle, 2004; Asher and 

Lascarides, 2003) ReALIS dynamic semantic system (Alberti, 2005). 

GASG is a monostratal declarative grammar which is considered to be “totally lexicalist”. 

Total lexicalism means that all information is in the description of the lexical 

items, and unification alone drives the combination of lexical elements. Thus, it can 

be considered a modified unificational categorial grammar (even function application 

is omitted). It carries on radical lexicalism, introduced by Karttunen (1986), which states 

that if the lexicon is sufficiently rich, sentences can be produced by unification in such a way that 

phrase structure is practically redundant; moreover, phrase structure leads to spurious ambiguities. Works 

in computational linguistics (for example (Schneider, 2005)) also come to the conclusion that 

reducing phrase structure could be useful. Many applications rely on phrase structure, 

because otherwise a dependency grammar, without restrictions on word order, is not efficient 

in computation. GASG accounts for word order by rank parameters, so giving up phrase 

structure does not result in exponential running time of the analysing algorithm. 

Thus, the ’rules’ mentioned above are not really rules, but properties which can be unified. 

Required arguments and their realizations are properties, too. Word order requirements 

are also properties: requirements of different strength. Our grammar model uses rank 

parameters for expressing word order, which means that a requirement can not only be 

satisfied or violated, but can also compete with (partially) incompatible requirements. 

A special variant of these rank parameters also expresses those cases where focus (or 

another operator) “re-orders” the words (compared to a neutral sentence). In written 

Hungarian sentences there is no other sign of focus (in spoken sentences there is emphasis 

as well). 

2 Rank parameters 

Primitive syntactic relations (like one element being before or after another) can be stated as 

direct precedence requirements in the description of the lexical item. This is because if an 

element is in a relationship with a head, it wants to be its neighbour. To give a short example 

from Hungarian: a definite article needs a noun immediately after itself (1a). If an adjective 

is present, it also needs the noun immediately after itself (1b). If this noun has a 

possessive suffix, the suffix wants the possessor between the article and the adjective (1c). 

Another adjective, expressing nationality, has to be before the noun (1d). The two adjectives 

cannot both immediately precede the noun: nationality gets priority in this case. Since sentences are linear, 

a head theoretically has only two neighbours. In practice, languages usually pick their 

complements from one direction. 

(1) a. a tanárom 

the teacher-Poss1Sg 

’my teacher’ 


b. az okos tanárom 

the clever teacher-Poss1Sg 

’my clever teacher’ 

c. az én okos tanárom / *az okos én tanárom 

the I clever teacher-Poss1Sg / the clever I teacher-Poss1Sg 

’my clever teacher’ 

d. az én okos magyar tanárom 

the I clever Hungarian teacher-Poss1Sg 

’my clever Hungarian teacher’ 

These relations can be expressed by a parameter, called a rank parameter: a number 

expressing how close two lexical items need to be to each other to express the relationship 

between them. In this way we can calculate how a requirement can be satisfied 

indirectly (or partially). The article in (1a) and the nationality adjective in (1d) are 

cases of direct satisfaction of a requirement. The requirement of the article in (1b) 

or of the adjective okos ’clever’ in (1d) is satisfied indirectly. 



Figure 1: Indirect satisfaction in (1d). 

Rank parameters also show in which direction the satisfying word should be. This is expressed 

by a character: a, b or c, referring to a following position, a previous position, or both. 

We differentiate two types of rank parameters based on the way requirements are satisfied. 

Recessive rank parameters (r) give neighbourhood relations (as in (1a-d)), and 

they are satisfied either if the two elements are immediately adjacent or if another element with a stronger 

rank (a smaller number) is wedged in 1 . In Figure 1, the strength-5 requirement of the determiner 

az ’the’ towards the noun tanárom ’teacher-Poss1Sg’ is satisfied. This is a case of partial 

or indirect satisfaction (Alberti, 1999) (see further examples in (6-7)). Of conflicting 

dominant rank parameters (d), only the strongest one can be satisfied; all others are deleted 

(see section 6). 
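
The satisfaction condition for recessive rank parameters can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Rank, parse_rank and recessive_satisfied are hypothetical, the codes follow the pattern used in this paper (type letter, strength, direction letter, e.g. r5a), and the strengths given below for én, okos and magyar are invented for the example. Only the strength 5 of the article comes from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rank:
    kind: str       # "r" = recessive, "d" = dominant
    strength: int   # smaller number = stronger requirement
    direction: str  # "a" = following, "b" = previous, "c" = either side

def parse_rank(code):
    """Parse a code such as 'r5a' or 'd6a' into a Rank."""
    return Rank(code[0], int(code[1:-1]), code[-1])

def recessive_satisfied(words, source, target, rank, attach_strength):
    """Check a recessive requirement of `source` towards `target`.

    words: the sentence as a list of distinct word forms.
    attach_strength: word -> strength with which that word is attached
    (hypothetical values supplied by the caller).
    """
    i, j = words.index(source), words.index(target)
    if rank.direction == "a" and j < i:
        return False
    if rank.direction == "b" and j > i:
        return False
    between = words[min(i, j) + 1 : max(i, j)]
    # Satisfied if adjacent (nothing in between) or if only elements
    # with a stronger rank (smaller number) are wedged in.
    return all(attach_strength[w] < rank.strength for w in between)

# Example (1d); the three strengths below are assumptions for illustration.
attach = {"én": 4, "okos": 3, "magyar": 2}
good = ["az", "én", "okos", "magyar", "tanárom"]
bad = ["az", "okos", "én", "tanárom"]  # the starred order in (1c)

print(recessive_satisfied(good, "az", "tanárom", parse_rank("r5a"), attach))
print(recessive_satisfied(bad, "okos", "tanárom", parse_rank("r3a"), attach))
```

With these assumed values the article's requirement in (1d) is satisfied indirectly, while in the starred order of (1c) the adjective's requirement fails because the possessor wedged in has a weaker rank than the adjective.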

Dominant parameters come language-specifically from either syntax or semantics. For 

example, in Hungarian the subject of a sentence precedes the verb by a dominant semantic 

rank parameter, and in no language is it morpheme-marked (thus, it is not a separate 

lexical item). In contrast, the subject obligatorily precedes the verb in English, even if it is 

semantically empty. Dominant parameters also play an important part in the Hungarian 

focus phenomena (see examples (6-11)). 

3 Predicates and arguments, heads and complements 

Argument structures are considered as entities. Their elements are given by a stock table 

of argument types. Therefore, an argument is formed by a relationship between the argument 

structure and an argument type. For example, the Hungarian verb lakik ’live’ has 

two arguments: the one who lives somewhere and the place where the one lives. 

Argument types are described by a numeric parameter which places the argument on 

a scale from agentive to patient-like. Those types which are not in the central frame 

which describes relations between subjects and objects get a neutral parameter. 

In Hungarian we consider nominal parts of speech as having more than one argument 

structure: they can be arguments themselves as their basic role (in most languages 

the only one) (2a), or they can be nominal predicates, too, because the copula is phonetically 

null in Hungarian in the present tense, third person singular (2b). We also count the short 

possessive form here, which searches for a possessive suffix (2c). 

(2) a. Péter Budapesten lakik. 

Peter-NOM Budapest-SUPERESS live-3Sg 

’Peter lives in Budapest.’ 

1 Wedging in has perceptional limits. 


b. Annak a fiúnak a neve Péter. 

That-DAT the boy-DAT the name-Poss3Sg Peter-NOM 

’That boy’s name is Peter.’ 

c. Péter kalapja. 

Peter-NOM hat-Poss3Sg 

’Peter’s hat’ 


We store the required complements in the same way: there is a case frame, where the 

word ’case’ has an extended meaning. We record here all forms such as infinitives or 

postpositional phrases, as well as fixed phrases that can replace a case-suffixed word form (3a), 

as in (3b). Therefore, cases are stored as a relationship between the case frame 

and a case type. 

(3) a. Péter elárult pár dolgot Mariról. 

Peter-NOM disclose-Past3Sg couple thing-ACC Mary-DELAT 

’Peter disclosed a couple of things about Mary.’ 

b. Péter elárult pár dolgot Marival kapcsolatban. 

Peter-NOM disclose-Past3Sg couple thing-ACC Mary-INS relation-INESS 

’Peter disclosed a couple of things about Mary/related to Mary.’ 

Sometimes the lexical item does not select a particular case for its argument. The verb 

lakik ’live’ has two cases for its arguments: the first one gets the nominative case, 

by the link between the argument and the case. The other one is a joker type: ’not 

specified’. If this argument is not filled, the sentence may be ungrammatical, even 

though at this point we do not know the exact case (case type) it is realized as. Therefore, 

argument types and case types can be linked, too. For the ’PLACE’ type argument, several 

case types can be selected, as these examples show: 

(4) a. Péter egy szép házban lakik. 

Peter-NOM a nice house-INESS live-3Sg 

’Peter lives in a nice house’ 

b. Péter Budapesten lakik. 

Peter-NOM Budapest-SUPERESS live-3Sg 

’Peter lives in Budapest.’ 



c. Péter az iskola mellett lakik. 

Peter-NOM the school-NOM next-POSTPOS live-3Sg 

’Peter lives next to the school.’ 
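
The linking of arguments, argument types and case types described above can be sketched as a small relational schema. The example below uses Python's built-in sqlite3 module rather than the database system of our implementation, and all table names, column names and the argument type name RESIDENT are hypothetical choices made for illustration. A NULL case stands for the ’not specified’ joker, which is then resolved through the link between argument types and case types.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE argument_type (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE case_type     (id INTEGER PRIMARY KEY, name TEXT);
-- an argument links a lexical item to an argument type;
-- case_type_id is NULL for the 'not specified' joker
CREATE TABLE argument (lemma TEXT, argument_type_id INTEGER, case_type_id INTEGER);
-- which case types may realize a given argument type
CREATE TABLE argument_case (argument_type_id INTEGER, case_type_id INTEGER);
""")
con.executemany("INSERT INTO argument_type VALUES (?, ?)",
                [(1, "RESIDENT"), (2, "PLACE")])
con.executemany("INSERT INTO case_type VALUES (?, ?)",
                [(1, "NOM"), (2, "INESS"), (3, "SUPERESS"), (4, "POSTPOS")])
con.executemany("INSERT INTO argument VALUES (?, ?, ?)",
                [("lakik", 1, 1), ("lakik", 2, None)])
con.executemany("INSERT INTO argument_case VALUES (?, ?)",
                [(2, 2), (2, 3), (2, 4)])

def possible_cases(lemma, arg_type):
    """Case types that can realize the given argument of a lexical item."""
    rows = con.execute(
        """
        SELECT ct.name
        FROM argument a
        JOIN argument_type t ON t.id = a.argument_type_id
        LEFT JOIN argument_case ac
               ON a.case_type_id IS NULL AND ac.argument_type_id = t.id
        JOIN case_type ct ON ct.id = COALESCE(a.case_type_id, ac.case_type_id)
        WHERE a.lemma = ? AND t.name = ?
        ORDER BY ct.id
        """,
        (lemma, arg_type),
    ).fetchall()
    return [name for (name,) in rows]

print(possible_cases("lakik", "RESIDENT"))
print(possible_cases("lakik", "PLACE"))
```

In this sketch the first argument of lakik is fixed to the nominative, while the ’PLACE’ argument inherits the three case types illustrated in (4a-c).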

Syntax may account for adjuncts too. A suffixed noun is an adjunct when the suffix is 

compositional, but all those compositional elements are complements which are required 

by another element. In this case the suffix (or the lexical item: ott ’there’) tells about 

itself that it is an adjunct requiring a noun. 

4 Rank parameters in operation 

Rank parameters come from description, by experience. In the following, some Hungarian 

examples show how they work. 



In Hungarian a head-complement relation is given by a rank parameter of strength 7. We 

do not give any direction because (since lexical items are morphemes) the place of the 

complement is underspecified at this point. Semantic requirements search for an aspectualization 

argument in the pre-verbal position. There is always an argument giving aspect: 

usually it is a pre-verb (5a) 2 or a bare NP (5b), or occasionally it can be the verb itself (5c) 

(Alberti, 2004). 

(5) a. Péter megírta a leckét. 

Peter-NOM Perf+write-Past3Sg the homework-ACC 

’Peter has written the homework.’ 

b. Már három hete újságot árulok. 

Already three week newspaper-ACC sell-1Sg 

’I have been selling newspapers for three weeks already.’ 

c. Péter csalódik Mariban. 

Peter-NOM get-disappointed-3Sg Mary-INESS 

’Peter gets disappointed in Mary.’ 

Pre-verbs have two rank parameters, both recessive. In neutral sentences like (6a) 

the pre-verb el ’away’ must precede indul ’starts going’, given by a strong (r2b) rank 

parameter. The emphasis is on the pre-verb, and the verb has no emphasis, so practically 

they form one phonological word. In other cases, like in (6b), the pre-verb may follow the 

verb by a weaker (r3a) rank parameter. This time they are separate phonological words. 

(6) a. Péter elindul horgászni. 

Peter-NOM away+go3Sg fish-INF 

’Peter goes fishing.’ 

b. Péter ’horgászni indul el. / ’Péter indul el horgászni. 

Peter-NOM fish-INF go-3Sg away / Peter-NOM go-3Sg away fish-INF 

’Why Peter goes away is that he will fish.’ / ’It is Peter who goes fishing.’ 

Sometimes a certain argument gives aspect. For example, the verb lakik ’live’ has an 

argument for ’PLACE’, and it is in the preceding position with a strong (r2b) rank (7a), 

or in the following position with a weaker (r3a) rank (7b) 3 . 

(7) a. Péter Budapesten lakik. 

Peter-NOM Budapest-SUPERESS live-3Sg 

’Peter lives in Budapest.’ 



b. *Péter lakik Budapesten / ’Péter lakik Budapesten. 

Peter-NOM live-3Sg Budapest-SUPERESS 

’*Peter lives in Budapest.’ / ’It is Peter who lives in Budapest.’ 

There are even more special cases when a verb having a pre-verb still gets the aspect 

from another argument. 

2 Pre-verbs in Hungarian are considered complements (as in other theories), because they are 

separate words. It is a matter of orthography that if the pre-verb immediately precedes the verb, they are 

written as one word. 

3 In the examples apostrophe means strong emphasis. Besides word order, this denotes focus in a Hungarian 

sentence. 


(8) a. Péter Budapesten szállt meg. 

Peter-NOM Budapest-SUPERESS stay-Past3Sg Perf 

’Peter stayed in Budapest.’ 

b. *Péter megszállt Budapesten / Péter ’megszállt Budapesten. 

Peter-NOM Perf+stay-Past3Sg Budapest-SUPERESS 

’*Peter stayed in Budapest.’ / ’What Peter did in Budapest was that he stayed there.’ 

As we can see in (8b), the first sentence without emphasis is ungrammatical. The 

second variant is grammatical, but never neutral: a focus throws the locative 

back, so only the weaker requirement can be satisfied (see further in the next two sections). 

The aspect-giving argument has to be stored with two rank parameters in every case. 

5 Focus in Hungarian 

Focus in Hungarian can be noticed by emphasis and word order (Kiss, 2000). In the 

following examples (9a) is a neutral sentence and (9b-c) are variants with a focus pointing 

on different complements of the verb. 

(9) a. Mari süteményt süt Péternek. 

Mary-NOM cookie-ACC bake-3Sg Peter-DAT 

’Mary is baking cookies for Peter.’ 

b. Mari ’Péternek süt süteményt. 

’It is Peter for whom Mary is baking cookies.’ 

c. Mari ’süteményt süt Péternek (és nem kenyeret). 

’It is cookies (and not bread) that Mary is baking for Peter.’ 

In our solution focus is a separate lexical item 4 , because it influences other elements 

in the sentence by its own requirements. It searches for two other elements: the focused 

element and a verb. Focus gives the verb a strong dominant rank parameter to be in the 

following position (d6a). 5 

In the previous section we claimed that the aspect-giving argument (mostly a pre-verb) 

has to be stored with two rank parameters. In neutral sentences (as in (6a)) the stronger 

(r2b) rank parameter is satisfied. But when a focus comes (see (6b)), the requirement of 

the pre-verb cannot be satisfied. The weaker (r3a) requirement is still there, and it can be 

satisfied. 
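
The fallback from the stronger to the weaker stored parameter can be sketched as follows. This is a simplified Python illustration: the function name first_satisfied is hypothetical, the pre-verb is written as a separate token, and only immediate adjacency is checked (wedged-in elements are not modelled).

```python
def first_satisfied(words, head, dependent, ranks):
    """Return the strongest stored rank code that the word order satisfies.

    ranks: codes of the head's requirement towards the dependent, such as
    ["r2b", "r3a"]; "b" = dependent in the previous position, "a" = in the
    following position, "c" = either. Only immediate adjacency is checked.
    """
    h, d = words.index(head), words.index(dependent)
    # Try the codes from strongest (smallest number) to weakest.
    for code in sorted(ranks, key=lambda c: int(c[1:-1])):
        direction = code[-1]
        if direction == "b" and d == h - 1:
            return code
        if direction == "a" and d == h + 1:
            return code
        if direction == "c" and abs(d - h) == 1:
            return code
    return None

neutral = ["Péter", "el", "indul", "horgászni"]   # (6a)
focused = ["Péter", "horgászni", "indul", "el"]   # (6b), focus on horgászni

print(first_satisfied(neutral, "indul", "el", ["r2b", "r3a"]))
print(first_satisfied(focused, "indul", "el", ["r2b", "r3a"]))
```

In the neutral order the stronger r2b requirement is met; when the focused element occupies the pre-verbal position, only the weaker r3a requirement remains satisfiable.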

6 Processing 


The search starts from the finite verb. Those elements which turn out not to be required by the 

verb or any of its complements (mostly adjuncts) are legitimate if they find an element to 

attach to. 

The first step of the process is to check dominant rank parameters. (In Figure 2, the focused 

element tortát ’cake-ACC’ directly precedes the verb hozott ’bring-Past3Sg’.) Then 

all conflicting requirements are deleted: 

4 Although it is phonetically null in Hungarian, in some languages it is a morpheme (e.g. Eskimo, 

Quechua, Tamil). This explains why we consider it as a separate lexical item. 

5 Progressive form of telic situations may work the same. 



Figure 2: Processing. 

1. Ranks applying to the same element from the same element (in Figure 2, r3a between 

the pre-verb be ’in’ and the verb; only r7b remains); 

2. All other ranks between the two elements (r7a from the verb to the focused tortát, 

and r7c from tortát to the verb); 

3. Ranks applying to another element with a reverse direction (r7b rank of the verb to 

the subject Péter ’Peter-NOM’ changes 6 to r7c, because subject could be anywhere 

around the verb if there is a focus); 

4. The dominant rank parameter wins if there are two conflicting requirements of the 

same element (between Péter ’Peter-NOM’ and the verb hozott ’bring-Past3Sg’ 

there are r7c and d7a; due to the focus the former one remains in this sentence, but 

in a neutral sentence d7a applies). 

The next step is to check recessive rank parameters: either two elements are neighbours 

directly or there is another element between them which is required with a stronger rank 

parameter (this may bring adjoining elements). In the example it goes as follows: 

1. egy tortát ’a cake-ACC’, a szobába ’the room-ILLAT’, hozott be ’bring-Past3Sg in’ 

are neighbours directly; 

2. in be a szobába ’in the room-ILLAT’ the definite article is in between, but it has a 

stronger rank parameter (r5a against r7c); 

3. Péter and hozott have egy tortát in between, due to the strength-6 rank parameter 

given by the focus. 

In our system, contrary to phrase-structure grammars, any element can be focused. 

Sometimes the verb does not immediately follow the focused element. An adjoining 

word may follow it which wedges itself in by a stronger rank parameter, as (10) shows: 

(10) a. Péter egy lánnyal találkozott. 

Peter-NOM a girl-INS meet-Past3Sg 

’Peter met a girl.’ 

6 Practically its direction is deleted, see section 2. 


b. Péter egy ’okos lánnyal találkozott. 

Peter-NOM a clever girl-INS meet-Past3Sg 

’It was a clever girl whom Peter met.’ 

c. Péter ’két okos lánnyal találkozott. 

Peter-NOM two clever girl-INS meet-Past3Sg 

’It was two clever girls whom Peter met.’ 

(11) a. Péter olvasott egy verset Adytól. 

Peter-NOM read-Past3Sg a poem-ACC Ady-ABL 

’Peter read a poem by Ady.’ 

b. *Péter egy ’verset Adytól olvasott. 

Peter-NOM a poem-ACC Ady-ABL read-Past3Sg 

In (11) the focused element (verset ’poem-ACC’) has a complement (Adytól ’Ady-ABL’), 

but complements are required with a rank parameter of strength 7, which is weaker 

than the strength-6 rank parameter between the focus and the verb. 

7 Conclusion 

We are working on the implementation of this system in which predicate-argument and 

head-complement relations, adjuncts and word order are all handled in the lexicon. Rank 

parameters account for word order variations in a language, and for other phenomena like 

scrambling (this shows clear differences between languages) or focus and progressive 

(which are sometimes invisible). The next step will be a semantic component, because 

we believe that intelligent applications can be made only on a real linguistic basis which 

requires fine semantics. 


I am grateful to the Hungarian National Scientific Research Fund (OTKA K60595) for 

their contribution to my costs. 

References 


Alberti, G. (1999). GASG: The grammar of total lexicalism, Working Papers in the Theory 

of Grammar 6(1). Theoretical Linguistics Programme, Budapest University and 

Research Institute for Linguistics, Hungarian Academy of Sciences. 

Alberti, G. (2004). Climbing for aspect with no rucksack, in K. É. Kiss and H. van 

Riemsdijk (eds), Verb Clusters; A study of Hungarian, German and Dutch, Linguistics 

Today 69, John Benjamins, Amsterdam/Philadelphia, pp. 253–289. 

Alberti, G. (2005). ReALIS. Doctoral dissertation at Hungarian Academy of Sciences, 

ms. HAS Research Institute for Linguistics and University of Pécs. 

URL: http://lingua.btk.pte.hu/gelexi.asp 

Archangeli, D. and Langendoen, T. D. (eds) (1997). Optimality Theory: an Overview, 

Blackwell, Oxford. 



Asher, N. and Lascarides, A. (2003). Logics of Conversation, Cambridge University 

Press, Cambridge. 

Halassy, B. (1994). Az adatbázis-tervezés alapjai és titkai [Basics and Secrets of Designing 

a Database], IDG, Budapest. 

Kamp, H., van Genabith, J. and Reyle, U. (2004). Discourse representation theory. ms. 

to appear in Handbook of Philosophical Logic. 

URL: http://www.ims.uni-stuttgart.de/~hans 

Karttunen, L. (1986). Radical lexicalism, Report No. CSLI-86-68, CSLI Publications. 

Kiss, K. É. (2000). Az egyszerű mondat szerkezete [The Structure of the Simple Sentence], 

in F. Kiefer (ed.), Strukturális magyar nyelvtan I. Mondattan [Structural Hungarian 

Grammar Vol. 1 Syntax], Vol. 7., Akadémiai Kiadó, Budapest, pp. 79–177. 

Schneider, G. (2005). A broad-coverage, representationally minimal LFG parser: Chunks 

and f-structures are sufficient, in M. Butt and T. H. King (eds), Proceedings of the 

LFG05 Conference, CSLI Publications, University of Bergen, pp. 388–407. 


EXPRESSING CONJUNCTIVE AND AGGREGATE QUERIES OVER 

ONTOLOGIES WITH CONTROLLED ENGLISH 

Camilo Thorne 

Free University of Bozen-Bolzano 

Abstract. We propose to characterize the computational complexity of answering questions 

in ontology-mediated controlled language interfaces to structured data sources by expressing 

ontology-based data access in controlled English. This means: compositionally mapping a 

controlled subset of English into knowledge bases and formal queries for which the computational 

complexity of ontology-based data access is known. In the present paper, we extend 

this approach to conjunctive queries and to conjunctive queries with aggregation functions. 



Lately, there has been a renewed interest within the computational linguistics community 

(Minock, 2005; Lesmo and Robaldo, 2007) in natural language interfaces to databases 

(NLIDBs), which aim at managing relational databases (DBs) with natural language (NL). 

In particular, robust interfaces supporting controlled fragments (CLs) 

of English and based on ontologies, computational semantics and deep semantic parsing 

have been developed by, for instance, the Attempto project (Bernstein et al., 2003; 

Fuchs et al., 2005). Controlled languages are fragments of NL tailored to fit data management 

tasks, typically by restricting their vocabulary (and syntax), thereby 

stripping them of ambiguity, whether structural or semantic. Controlled languages allow 

a trade-off between the coverage and the accuracy of the translation of questions into 

formal queries. Ontologies (the conceptualizations of the domain) play the intermediate 

role between the CL’s vocabulary and the domain terminology. 

However, some important issues regarding controlled English interfaces have not been, 

to the best of our knowledge, fully addressed. One of them is the tractability or intractability 

of processing CL information requests and utterances, viz., how difficult is 

declaring and accessing structured data with a controlled English interface? By difficulty 

we mean computational complexity. We believe that a way of addressing this issue 

consists in expressing ontology based data access with CLs. By this we mean designing 

declarative and interrogative controlled subsets of English that compositionally map, 

through a semantic mapping ⟦.⟧ (taken from NL formal semantics), into formal queries, 

ontologies and database facts, their meaning representations (MRs). Ontology based data 

access provides the logical underpinning of accessing structured data w.r.t. ontologies and 

its computational complexity, a measure of how difficult a task it might be. 

The main purpose of this paper is twofold. On the one hand, we will say what it means to 

express ontology based data access in CL. On the other hand, we will proceed to express 

in controlled English a class of formal queries known as conjunctive queries. Conjunctive 

queries are good in that with them we reach an optimal computational complexity. Last, 

but not least, we will extend our controlled language to cover aggregate queries, which 

are conjunctive queries to which the basic SQL aggregation functions, COUNT, MIN, MAX 

and SUM, have been added. 



2 Ontology Based Data Access 

Accessing and declaring data w.r.t. an ontology or conceptualization can be characterized 

in terms of formal logic as follows (Rosati, 2007). A relational query q of arity n is a 

formal expression q(x) ← Qyβ(x, y), where q(x) is the head and x denotes a sequence 

of n variables, the query’s distinguished variables, and Qyβ(x, y) is the body, a first order 

logic (FOL) quantified boolean combination of relational atoms where the distinguished 

variables occur free and the others (the sequence y) bound to a quantifier. Qy denotes 

the sequence of its quantifier prefixes. When no confusion arises, we shall abbreviate 

Qyβ(x, y) with Φ[x]. A query is said to be boolean if its arity is n = 0. A collection 

of such queries is called a query language. A relational database (DB) D is a finite set 

of ground atoms over a schema R := {R1, ..., Rn}, where, for i ∈ [1, n], Ri is a relation 

symbol of arity m ≥ 1, and over a countably infinite domain Dom of constants. The 

active domain adom(D) of D is the set of constants that occur in D (a finite subset of 

Dom). An ontology O is a set of FOL axioms that make explicit a certain number of 

constraints holding over a domain. They are typically defined over some fragment of 

FOL called an ontology language. This language should be rich enough to express DBs 

(i.e., DB atoms). The pair 〈O, D〉 is called a knowledge base (KB), and can be seen as a 

FOL logical theory: a set of ground atoms (the DB) plus a set of axioms (the ontology). 

A ground substitution is a function σ(.) from Var(q), the set of variables of q, into Dom. 

Substitutions are extended to sequences of variables in the standard way. KBs and substitutions 

give rise to the certain answers semantics of a query q of arity n over a KB 〈O, D〉, denoted 

q(〈O, D〉). It consists in collecting the values in adom(D) of all the ground substitutions 

σ(.) for which 〈O, D〉 logically entails qσ, where qσ denotes the grounding of q by σ(.). 

Formally, $q(\langle O, D\rangle) := \{\sigma(x) \in adom(D)^n \mid \sigma \text{ s.t. } \langle O, D\rangle \models q\sigma\}$. To investigate its 

computational complexity we must look at the associated recognition problem: 

Definition 1. (QA) The KB query answering (QA) decision problem is the FOL entailment 

problem stated as follows: given a KB 〈O, D〉, a sequence $c \in Dom^n$ of n constants, a 

CQ q of arity n and distinguished variables x, check if there exists a ground substitution 

σ(.) s.t. σ(x) = c and 〈O, D〉 |= qσ holds, where qσ is the grounding of q by σ(.). 

When we focus on #(adom(D)) (the number of constants of D) while considering 

constant both size(q) (the number of symbols of the query) and #(O) (the number of 

axioms), we speak, in a manner set by (Vardi, 1982), of the data complexity of QA. Such 

complexity will depend on the query language and the ontology language chosen (Rosati, 

2007). 
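
To make the role of ground substitutions concrete, here is a minimal Python sketch that computes the answers to a conjunctive query by enumerating substitutions over adom(D). It assumes an empty ontology, in which case entailment of a grounded conjunctive query reduces to checking that all its atoms occur in D; with a non-empty ontology a reasoner would be needed instead. The function name, the atom encoding and the sample data are hypothetical.

```python
from itertools import product

def certain_answers_no_ontology(db, head_vars, body):
    """Answers to a conjunctive query over a DB, assuming an empty ontology.

    db: set of ground atoms, each a tuple (relation, const1, ..., constk).
    head_vars: tuple of distinguished variables (each must occur in body).
    body: list of atoms (relation, term1, ..., termk); strings starting
    with '?' are variables, all other terms are constants.
    """
    adom = sorted({c for atom in db for c in atom[1:]})
    variables = sorted({t for atom in body for t in atom[1:] if t.startswith("?")})
    answers = set()
    for values in product(adom, repeat=len(variables)):
        sigma = dict(zip(variables, values))  # a ground substitution
        grounded = [(a[0],) + tuple(sigma.get(t, t) for t in a[1:]) for a in body]
        if all(g in db for g in grounded):    # D satisfies the grounded query
            answers.add(tuple(sigma[x] for x in head_vars))
    return answers

# Hypothetical database and the query q(x) <- exists y. teaches(x, y) & course(y)
db = {
    ("teaches", "mary", "logic"),
    ("teaches", "john", "chess"),
    ("course", "logic"),
}
query_body = [("teaches", "?x", "?y"), ("course", "?y")]
print(certain_answers_no_ontology(db, ("?x",), query_body))
```

For a fixed query the loop runs over a number of substitutions that is polynomial in the number of constants of D, which is the quantity that data complexity measures. A boolean query is obtained by passing an empty tuple of distinguished variables.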

The certain answers semantics can provide a formal semantics for ontology-mediated 

CL data access interfaces, and QA's data complexity provides both a measure of their difficulty and 

a criterion for optimality. To implement this strategy we need, we believe, to go through 

two stages: (i) We need to choose an ontology language and a query language for which 

the computational complexity of QA is known and for which data complexity is optimal. 

(ii) We need to express QA in controlled English. 

3 Expressing QA with Controlled English 

A compositional translation ⟦.⟧, as proposed by Montague (Montague, 

1970), is a function that homomorphically maps a fragment of natural language (English 

in our case) into, basically, FOL augmented with the types, the lambda abstraction and the 

function application constructs of the simply typed λ-calculus, a.k.a. λ-FOL. It assigns 

to each NL utterance a λ-FOL formula: its meaning representation (MR). The key feature of 

compositional translations is that they can be made to map declarative fragments of NL 

into ontology languages and interrogative fragments into query languages. 

Definition 2. (Expressing QA) Given an ontology language L and a query language Q, 

expressing QA in controlled English consists in: (i) Defining a grammar G and a compositional 

translation ⟦.⟧ for a controlled declarative fragment L(G) s.t. ⟦.⟧ maps L(G) 

into L. (ii) Defining a grammar G′ and a compositional translation ⟦.⟧ for a controlled 

interrogative fragment L(G′) s.t. ⟦.⟧ maps L(G′) into Q. 

We have dealt elsewhere with the problem of expressing KBs and ontology languages 

by expressing, in particular, the DL-Lite_{R,⊓} ontology language or logic and, in general, 

the DL-Lite family of DLs (Calvanese, De Giacomo, Lembo, Lenzerini and Rosati, 

2007). Description logics (DLs) are knowledge representation logics that conceptually 

model a domain in terms of classes, roles (binary relations among classes) and inheritance 

relations between classes and roles. In (Bernardi, Calvanese and Thorne, 2007; Thorne, 

2007) we define a declarative CL, Lite-English, and a compositional translation ⟦.⟧, and 

show that: 

Theorem 1. (Bernardi et al., 2007) For every sentence S in the CL Lite-English, 

there exists a DL-Lite_{R,⊓} assertion α s.t. ⟦S⟧ = α. Conversely, every DL-Lite_{R,⊓} 

assertion α is the image by ⟦.⟧ of some sentence S in Lite-English. 

To get the whole picture we now need to look at query languages. It turns out that 

QA for DL-Lite_{R,⊓} is optimal w.r.t. data complexity, falling under LOGSPACE (actually, 

AC^0), a minimal complexity class, when we choose as query language the class of 

relational queries known as rule-based conjunctive queries (CQs). Conjunctive queries 

are queries over a schema R whose body is a conjunction of existentially quantified relational 

atoms. Expressing query languages w.r.t. which QA’s computational complexity is 

optimal can shed light on the conditions under which the task of accessing data w.r.t. an 

ontology with CL might be a relatively easy task. 

4 Expressing Conjunctive Queries 

In this section we will show how to express graph-shaped simple conjunctive queries, a 

subclass of the class of CQs, for which QA is optimal too. A typical boolean graph-shaped 

query over, say, the constant Mary and the binary predicates loves and hates is 

(1) q() ← ∃x∃y(loves(Mary, x) ∧ hates(x, y)) 

which we would like to express through the CL Y/N-question 

(2) Does Mary love somebody who hates somebody? 

And a typical non-boolean graph-shaped query over the same set of relational symbols 

(i.e., the schema {loves, hates}) is 

(3) q(x) ← ∃y(loves(x, y) ∧ hates(y, x)) 



(Lexical rule)              (Value of ⟦.⟧ on word and category)

Det → some                  λP.λQ.∃x(P(x) ∧ Q(x)): (e → t) → ((e → t) → (e → t))
Pro_i → somebody            λP.∃xP(x): (e → t) → t
Pro^-_i → anybody           λP.∃xP(x): (e → t) → t
Coord → and                 λP.λQ.∃x(P(x) ∧ Q(x)): (e → t) → ((e → t) → (e → t))
Relpro_i → who              λP.λx.P(x): (e → t) → (e → t)
Pro_i → him                 λP.P(x): (e → t) → t
Pro_i → himself             λP.P(x): (e → t) → t
Intpro → which              λP.λQ.λx.P(x) ∧ Q(x): (e → t) → (e → t)
Intpro_i → who_i            λP.λx.P(x): (e → t) → (e → t)
NPgap_i → ɛ                 λP.P(x): (e → t) → t
N_i → man,...               λx.man(x): e → t,...
IV_i → runs,...             λx.run(x): e → t,...
IV^-_i → run,...            λx.run(x): e → t,...
TV_{i,j} → loves,...        λα.λx.α(λy.loves(x, y)): ((e → t) → t) → (e → t),...
TV^-_{i,j} → love,...
TV^p_{i,j} → loved,...
Adj_i → mortal,...          λx.mortal(x): e → t,...
Pn_i → Mary,...             λP.P(Mary): (e → t) → t,...

Table 1: Lexical rules for GCQ-English. 

which we would like to express through the CL Wh-question (containing an anaphoric 

pronoun) 

(4) Who loves somebody who hates him? 

Definition 3. (GCQs) A non-boolean graph-shaped simple conjunctive query (GCQ) of 

arity ≤ 1 is a CQ over a schema R composed of relation symbols of arity ≤ 2 of the form 

q := q(x) ← Φ[x] where the body Φ[x] is inductively defined as: 

Φ[x] := A_{i_0}(x) ∧ ... ∧ A_{i_m}(x) ∧ R_{j_0}(x, x) ∧ ... ∧ R_{j_m}(x, x) ∧ R_{j_0}(x, c) ∧ ... ∧ R_{j_m}(x, c). 

Φ[x] := Φ′[x] ∧ ∃y(A_{i_0}(x) ∧ ... ∧ A_{i_m}(x) ∧ R_{j_0}(x, y) ∧ ... ∧ R_{j_m}(x, y) ∧ R_{j_0}(y, x) ∧ ... ∧ R_{j_m}(y, x) ∧ Φ′′[y]). 

Note that we allow in this definition for empty sequences of conjuncts, e.g., |A_{i_0}(x) ∧ ... ∧ 

A_{i_m}(x)| ≥ 0 (where |.| is the function that returns the number of predicates in the body 

of a relational query). A boolean GCQ is a query of the form q := q() ← ∃yΦ[y], where 

Φ[y] is the body of a non-boolean GCQ. 

4.1 Expressing Conjunctive Queries with GCQ-English 

GCQs are captured by the interrogative CL GCQ-English. Questions in GCQ-English 

fall under two main classes: (i) Wh-questions, which map into non-boolean GCQs, and 

(ii) Y/N-questions, which map into boolean GCQs. For simplicity, we assume grammars 

to be phrase structure grammars augmented with semantic actions. Phrase structure 

grammars are composed of two sets of rewriting rules: lexical rules (a.k.a. lexicons) 

and phrase-structure rules. Table 2 shows the phrase-structure rules of GCQ-English’s 

grammar and Table 1 its lexicon. Moreover, the latter is divided into two sets: a closed 

set of function word rules, that express (at the semantical level) logical operations and 

connectives, and an open set of content word rules (nouns, adjectives, verbs), a feature we 

convey through dots. 


(Rule)                             (Semantic Action)

Qwh → Intpro N_i Sgap_i ?          ⟦Qwh⟧ := ⟦Intpro⟧(⟦N_i⟧)(⟦Sgap_i⟧)
Qwh → Intpro_i Sgap_i ?            ⟦Qwh⟧ := ⟦Intpro_i⟧(⟦Sgap_i⟧)
Q_{Y/N} → does NP^-_i VP^-_i ?     ⟦Q_{Y/N}⟧ := ⟦NP^-_i⟧(⟦VP^-_i⟧)
Q_{Y/N} → is NP_i VP_i ?           ⟦Q_{Y/N}⟧ := ⟦NP_i⟧(⟦VP_i⟧)
Sgap_i → NPgap_i VP_i              ⟦Sgap_i⟧ := ⟦NPgap_i⟧(⟦VP_i⟧)
VP_i → VP_i Coord VP_i             ⟦VP⟧ := ⟦Coord⟧(⟦VP⟧)(⟦VP⟧)
VP^-_i → VP^-_i Coord VP^-_i       ⟦VP^-_i⟧ := ⟦Coord⟧(⟦VP_i⟧)(⟦VP^-_i⟧)
VP_i → TV_{i,j} NP_j               ⟦VP_i⟧ := ⟦TV_{i,j}⟧(⟦NP_j⟧)
VP_i → is Adj_i                    ⟦VP_i⟧ := ⟦Adj_i⟧
VP_i → is a N_i                    ⟦VP_i⟧ := ⟦N_i⟧
VP^-_i → IV^-_i                    ⟦VP^-_i⟧ := ⟦IV^-_i⟧
VP_i → IV_i                        ⟦VP_i⟧ := ⟦IV_i⟧
VP^-_i → TV^-_{i,j} NP_j           ⟦VP^-_i⟧ := ⟦TV^-_{i,j}⟧(⟦NP_j⟧)
VP_i → VP^p_i                      ⟦VP_i⟧ := ⟦VP^p_i⟧
VP^p_i → TV^p_{i,j} NP_j           ⟦VP^p_i⟧ := ⟦TV^p_{i,j}⟧(⟦NP_j⟧)
NP^-_i → Det^- N_i                 ⟦NP^-_i⟧ := ⟦Det^-⟧(⟦N_i⟧)
NP_i → Pro_i                       ⟦NP_i⟧ := ⟦Pro_i⟧
NP_i → Det N_i                     ⟦NP_i⟧ := ⟦Det⟧(⟦N⟧)
NP_i → Pn_i                        ⟦NP_i⟧ := ⟦Pn_i⟧
NP_i → Pro_i                       ⟦NP_i⟧ := ⟦Pro_i⟧
N_i → Adj N_i                      ⟦N_i⟧ := ⟦Adj⟧(⟦N_i⟧)
N_i → N_i Relpro_i Sgap_i          ⟦N_i⟧ := ⟦Relpro_i⟧(⟦N_i⟧)(⟦Sgap_i⟧)

Table 2: Phrase structure rules for GCQ-English. 

The empty expression ɛ is what in linguistic theory is called a trace, a placeholder for 

the antecedent of the relative pronoun. Symbols occurring in the phrase-structure rewriting 

rules are called components and represent the syntactic chunks into which sentences 

can be analysed. Symbols that rewrite into words, that is, symbols in the lexicon, are 

called categories or terminal components and represent parts of speech, that is, verbs, 

common and proper nouns, pronouns, adjectives, etc. Some basic morpho-syntactic and 

semantic features are attached to (some) components. The feature .^- means that the component 

is of negative polarity; the feature .^p, associated with verbs and verb phrase components, 

indicates that such a component is to be inflected in the passive voice. Absence of 

features indicates that components are in positive polarity and verbs and verb phrases in 

the active voice. Furthermore, indexes are assigned to components following the standard 

set by (Pratt, 2001) to: (i) Resolve intrasentential anaphora: anaphoric pronouns ("him", 

"himself") resolve with their nearest (antecedent) head noun. (ii) Indicate gap-filler dependencies. 

For simplicity, verbs are in 3rd person singular and in present tense. 

A quick glance at the grammar rules of GCQ-English will convince the reader that, 

for instance, the (English) question 

(5) Does John love Mary? 

and the question 


(6) Which man is mortal and loves somebody who hates him? 

lie within GCQ-English. By the same token, it is easy to see that the question 

(7) *Which teacher gives a lesson to his pupils? 



lies outside this CL. Why? Because we have no possessive adjectives (e.g., "his") and no 

ditransitive verbs (e.g., "gives"). 

Semantic actions mean that we define the translation ⟦.⟧ by recursion over the syntactic 

components of GCQ-English in such a way that the application of each grammar rule, 

lexical or otherwise, "triggers" ⟦.⟧ (Jurafsky and Martin, 2000). The intermediate values 

of this function are called partial MRs. When we reach the Qwh component in a Wh-question, 

⟦.⟧ will map the λ-FOL expression obtained, of the form ⟦Qwh⟧ = λx.Φ[x]: e → t, 

into the GCQ q(x) ← Φ[x], where Φ[x] denotes a conjunction of existentially quantified 

atoms where variable x occurs free. In the case of a Y/N-question, the λ-FOL expression 

⟦Q_{Y/N}⟧ = Φ: t will be mapped into the boolean GCQ q() ← Φ, where Φ stands for a 

conjunction of existentially quantified atoms with no free variables. Types ensure that ⟦.⟧ 

always terminates. We can compute, given a GCQ-English question Q, ⟦Q⟧ as follows: 

(i) We compute the parse tree of Q. (ii) We compute ⟦Q⟧ bottom-up, from leaves to root, 

as in Figure 1. We start by assigning a λ-expression to the leaves. Then, at each internal 

node, we unify types and compute the λ-application and the β-reduction of its siblings. 

We omit types for reasons of space. In the end we obtain, at the root of the tree, a GCQ. 

The circle delimits an island; the dotted line, a gap-filler dependency forced by the 

use of the pronoun. 

Figure 1: Translating ”Who loves Mary?”. 
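
The bottom-up computation for "Who loves Mary?" can be sketched as follows, with Python closures standing in for λ-terms and strings for FOL formulas, so that function application plays the role of β-reduction. The lexical values follow Table 1; the names `who`, `gap`, `mary`, `loves` and `translate` are illustrative, and treating the gapped sentence as a function of the trace variable is an assumption of this sketch, made so that the interrogative pronoun can abstract over it.

```python
# Lexical entries (values taken from Table 1).
who = lambda P: lambda x: P(x)                    # Intpro_i -> who
gap = lambda x: (lambda P: P(x))                  # NPgap_i -> epsilon (trace x)
mary = lambda P: P("Mary")                        # Pn_i -> Mary
loves = lambda alpha: lambda x: alpha(            # TV_{i,j} -> loves
    lambda y: f"loves({x}, {y})")

def translate():
    """Compose the meaning of 'Who loves Mary?' from leaves to root."""
    vp = loves(mary)                 # VP_i -> TV_{i,j} NP_j
    sgap = lambda x: gap(x)(vp)      # Sgap_i -> NPgap_i VP_i (assumption:
                                     # kept open in the trace variable)
    qwh = who(sgap)                  # Qwh -> Intpro_i Sgap_i ?
    body = qwh("x")                  # instantiate the distinguished variable
    return f"q(x) ← {body}"

print(translate())
```

Strings are used here only to make the result readable; a fuller implementation would build typed terms and check types at each application.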

Lemma 1. (Expressing GCQs) For every question Q in GCQ-English, there exists a 

GCQ q s.t. ⟦Q⟧ = q. Conversely, every GCQ q is the image by ⟦.⟧ of some question Q in 

GCQ-English. 

Proof. (Sketch) We prove each implication separately: 

(⇒) We need to show that for every Wh-question Q in GCQ-English there exists a 

GCQ q of distinguished variable x and body Φ[x] s.t. ⟦Q⟧ = q(x) ← Φ[x]. Given 

that the only recursive components in GCQ-English's grammar are verb phrases 

(VPs) and nominals (Ns), this can be proved by an easy simultaneous induction on 

Ns and VPs and by discarding all possible parse states where components do not 

satisfy co-indexing, polarity and voice constraints. For Y/N-questions we reason 

analogously. 


(⇐) We will prove, by induction on the body Φ[x] of a non-boolean GCQ q of distinguished 

variable x, that we can construct a question Q s.t. q is the image of 

Q by ⟦.⟧. The result will then follow both for boolean and non-boolean GCQs. Recall 

that Ns translate into unary predicates, TVs into binary predicates and Pns into 

constants: 

– (Basis) q(x) ← Φ[x] is the image of the question "which A_{i_0} who is a A_{i_1} who 

. . . who is a A_{i_m} R_{j_0}s himself and . . . and R_{j_m}s himself and R_{j_0}s c and . . . and 

R_{j_m}s c and is R_{j_0}d by c and . . . and is R_{j_m}d by c?". 

– (Inductive step) q(x) ← Φ[x] is the image of the question "which Φ′[x] R_{j_0}s 

and R_{j_1}s and . . . and R_{j_m}s some A_{i_0} who is a A_{i_1} and who is a A_{i_2} and . . . and 

who is a A_{i_m} and who R_{j_0}s him and . . . and who R_{j_m}s him and who Φ′′[y]?", 

by induction hypothesis on Φ′[x] and Φ′′[y]. ✷ 

Theorem 2. (Expressing QA) The QA problem for Lite-English and GCQ-English 

is in LOGSPACE w.r.t. data complexity. 

Proof. It follows immediately from Theorem 1 and Lemma 1. ✷ 

5 Expressing Aggregate Queries 

The question we now need to answer is: how can we expand the coverage of our CL 

without compromising the tractability of QA? In this section we propose to cover graph-shaped 

aggregate queries, that is, GCQs augmented with (some of) the basic SQL aggregation 

functions, COUNT, MIN, MAX and SUM. These functions are defined on finite subsets 

of Dom ∪ Q, i.e., on DB domains plus the linearly ordered set of rational numbers and 

take values in Q, that is, they compute a rational number. For the purposes of the current 

paper, we will restrict our analysis to only two of them, namely MAX and MIN, although 

this analysis can be easily generalized to cover all of these functions. 

Aggregates arise frequently in domains and systems containing numerical data, e.g. 

geographical domains and systems. One of them, the GEOQUERY geography database 

system, comes with a NL interface that supports NL questions expressing such functions 

(Mooney, 2007). The corpus of these questions showed that user questions did basically 

convey either a CQ or a CQ with aggregation functions (see Table 3). 

            CQs       Aggregations   Negation
Questions   34.54%    65.35%         0.11%

Table 3: Frequency of CQs in GEOQUERY. 

Most importantly, answering CQs (and a fortiori GCQs) with aggregation functions over DL-Lite ontologies 

is polynomial w.r.t. data complexity. So, what do these queries look like and what 

kind of questions do we want to have in our CL? We would like to capture queries over 

unary predicates computing a maximum like 

(8) q(max(n)) ← height(n) ∧ odd(n) 

with a CL Wh-question like 


(9) Which is the greatest height that is odd? 


Or queries computing a sum 

(10) q(sum(n)) ← height(n) ∧ odd(n) 

with the question 

(11) Which is the sum of all heights that are odd? 

(Rule)                     (Semantic action)

VP_{i,j} → COP NP_j        ⟦VP_{i,j}⟧ := ⟦COP⟧(⟦NP_j⟧)

(Lexical rule)             (Value of ⟦.⟧ on word and category)

Det → the greatest         λP.max(P): (Q → t) → Q
Det → the smallest         λP.min(P): (Q → t) → Q
Det → some                 λP.λQ.∃n(P(n) ∧ Q(n)): (Q → t) → ((Q → t) → (Q → t))
Pro_i → something          λP.∃nP(n): (Q → t) → t
Pro^-_i → anything         λP.∃nP(n): (Q → t) → t
Pro_i → it                 λP.P(n): (Q → t) → t
Pro_i → itself             λP.P(n): (Q → t) → t
Coord → and                λP.λQ.∃n(P(n) ∧ Q(n)): (Q → t) → ((Q → t) → (Q → t))
Relpro_i → that            λP.λn.P(n): (Q → t) → (Q → t)
Intpro_i → which           λP.λn.P(n): (Q → t) → (Q → t)
COP_{i,j} → is             λn.λm.n ≈ m: Q → (Q → t)
NPgap_i → ɛ                λP.P(n): (Q → t) → t
N_i → height,...           λn.height(n): Q → t,...
Adj → odd,...              λn.odd(n): Q → t,...

Table 4: Grammar rules for AGCQ-English. 

Definition 4. (AGCQs) A graph-shaped conjunctive aggregate query (AGCQ) over a relational 

schema R is a query of the form q(α(n)) ← Φ[n], where α ∈ {min, max}, n 

is q’s distinguished variable, a numerical variable, and Φ[n] is the body of a non boolean 

GCQ. Note that there are no boolean AGCQs. 
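
As a minimal sketch of what an AGCQ computes, assuming a body made only of unary predicates and an empty ontology, the query (8) can be evaluated over a small set of facts. The function name `aggregate_answer`, the encoding of facts as pairs and the choice to return `None` for an empty group are illustrative assumptions, not part of the definition.

```python
def aggregate_answer(db, alpha, body_preds):
    """Evaluate q(alpha(n)) <- P_1(n) ∧ ... ∧ P_k(n) over unary facts.

    db         : set of pairs (predicate, number)
    alpha      : "max" or "min"
    body_preds : list of unary predicate names forming the body
    Assumption: no ontology, so the body is evaluated directly on db.
    """
    aggregate = {"max": max, "min": min}[alpha]
    candidates = None
    for pred in body_preds:
        extension = {n for (p, n) in db if p == pred}
        # Conjunction of unary atoms = intersection of their extensions.
        candidates = extension if candidates is None else candidates & extension
    if not candidates:
        return None  # assumption: no value when the group is empty
    return aggregate(candidates)

# Hypothetical facts, for illustration only.
db = {("height", 3), ("height", 7), ("height", 8),
      ("odd", 3), ("odd", 7), ("odd", 9)}
```

On these facts, "Which is the greatest height that is odd?" corresponds to `aggregate_answer(db, "max", ["height", "odd"])`.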

5.1 Expressing Aggregate Queries with AGCQ-English 

To express AGCQs in CL we extend GCQ-English into a new fragment of English 

called AGCQ-English as follows. Aggregation functions min and max are conveyed, 

in English, by, respectively, definite NPs like "the smallest N" and "the greatest N", only 

this time they must denote not a set of properties, but, instead, a numeric value. The 

symbol N stands for a nominal component that denotes sets of numerical values. The 

rest of the expression behaves in a manner similar to a determiner. We must thus start by 

enriching our set of primitive λ-FOL types from {e, t} into {e, t, Q} and allow for new 

determiners of type (Q → t) → Q. 

Definition 5. (Aggregate Determiners) An aggregate determiner is any of the following: 

(i) The determiner ”the greatest”, associated to max and of partial MR λP.max(P ): (Q → 

t) → Q. (ii) The determiner ”the smallest”, associated to the aggregation function min 

and of partial MR λP.min(P ): (Q → t) → Q. 

Once aggregate determiners have been introduced, there are three steps left to finish 

covering aggregate queries with AGCQ-English. (i) We introduce a new interrogative 

pronoun "which" of semantics λP.λn.P(n): (Q → t) → (Q → t), where P is a predicate 

symbol of type Q → t (the type of sets of numbers) and n a variable of type Q (the 

type of numeric values). (ii) We introduce new entries for function words to take into 

account the new basic type Q. (iii) We introduce the identity predicate "is" (of category 

COP, for copula) of semantics λn.λm.m ≈ n: Q → (Q → t). The reader can see in Table 

4 the (new) lexical rules that extend CL coverage to aggregations. The semantic mapping 

⟦.⟧ is then computed in the standard way over the parse tree of an AGCQ-English 

question, only it will now output, at the root of the tree, a λ-FOL expression of the form 

λm.m ≈ α(λn.Φ[n]): Q → t that ⟦.⟧ will proceed to map into q(α(n)) ← Φ[n]. The 

reader can see a sample run of the procedure in Figure 2. 

Figure 2: Translating "Which is the greatest height?". 

Whence: 

Lemma 2. For every question Q in AGCQ-English, there exists an AGCQ q s.t. ⟦Q⟧ = q. 

Conversely, every AGCQ q is the image by ⟦.⟧ of some question Q in AGCQ-English. 

Proof. (Sketch) As before, the first implication is proved by simultaneous induction on 

the Ns and TVs of question Q. The second implication is proved by induction on the body 

of the AGCQ q. ✷ 

Theorem 3. QA is in P for Lite-English and AGCQ-English. 

Proof. It follows from Theorem 1 and Lemma 2. ✷ 

6 Conclusions and Further Work 

We have provided a certain number of guidelines on how to characterize the computational 

complexity of CL interfaces to ontology-driven data access and management systems. 

This is achieved by expressing QA in controlled English. We have also shown that 

we reach tractability when we choose DL-Lite as ontology language and GCQs and 

AGCQs as query languages, for which two CLs, GCQ-English and AGCQ-English, 

have been introduced. As further work we plan to extend the coverage of AGCQ-English 

to the rest of the basic SQL functions, namely COUNT and SUM and to substantiate (or 

validate) the intuitiveness of these CLs and of their English constructs by analysing more 

question corpora. 



I would like to thank my supervisors, R. Bernardi and D. Calvanese, together with I. Pratt, 

for their help and suggestions. 

References 


Bernardi, R., Calvanese, D. and Thorne, C. (2007). Lite Natural Language, Proceedings 

of the 7th International Workshop on Computational Semantics (IWCS-7). 

Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M. and Rosati, R. (2007). 

Tractable Reasoning and Efficient Query Answering in Description Logics: The 

DL-Lite Family, JAR . 

Jurafsky, D. and Martin, J. (2000). Speech and Language Processing, Prentice Hall. 

Lesmo, L. and Robaldo, L. (2007). Use of Ontologies in Practical NL Query Interpretation, 

Proceedings of AI*IA 2007. 

Minock, M. (2005). A Phrasal Approach to Natural Language Interfaces over Databases, 

Natural Language Processing and Information Systems, 10th International Conference 

on Applications of Natural Language to Information Systems (NLDB 2005). 

Montague, R. (1970). Universal Grammar, Theoria (36). 

Mooney, R. J. (2007). Learning for Semantic Parsing, Proceedings of CICLing2007. 

Pratt, I. (2001). On the Semantic Complexity of some Fragments of English, Technical 

report, Department of Computer Science – University of Manchester. 

Rosati, R. (2007). The Limits of Querying Ontologies, Proceedings of the Eleventh International 

Conference on Database Theory (ICDT 2007). 

Thorne, C. (2007). Managing Structured Data with Controlled English - An Approach 

Based on Description Logics, Proceedings of ESSLLI 2007 Student Session. 

Vardi, M. (1982). The Complexity of Relational Query Languages, Proceedings of the 

Fourteenth Annual ACM Symposium on Theory of Computing. 



INTERROGATION IN DYNAMIC EPISTEMIC LOGIC ∗ 

Christina Unger – Gianluca Giorgolo 

UiL-OTS, Universiteit Utrecht 

Questions still exhibit an aura of mystery and challenge, as is commonly found with 

natural language phenomena that lie on the border between semantics and speech acts. 

Several treatments have been proposed within denotational semantics, nevertheless there 

is no strong consensus about what kind of a semantic object their meaning is. One of the 

early and most well-known approaches to the semantics of interrogatives was introduced 

by (Hamblin, 1973) and further developed by (Karttunen, 1977). Their line of work reduces 

the meaning of interrogatives to propositions by letting questions denote the set 

of possible or true answers. A slightly different approach is the partition semantics by 

(Higginbotham and May, 1981) and (Groenendijk and Stokhof, 1984). It is based on the 

intuition that the meanings of questions are partitions of the logical space constituted by 

the mutually exclusive possibilities that can serve as answers. 

In this paper we propose Dynamic Epistemic Logic (DEL) as a powerful tool for a 

unified treatment of all question types. We will start with an epistemic interpretation of 

Dynamic Propositional Logic and show how to use it to formalize yes/no questions and 

answerhood. We will then extend it with public announcements and add public questions 

as well as a possibility to embed questions. Finally we sketch how this can also be generalized 

to the case of constituent questions in the line of Groenendijk & Stokhof. In the 

last section we will give an outlook on additional benefits of using DEL as a framework, 

e.g. the interaction of questions and presuppositions. 

2 Yes/no questions 


The basis for our investigations is propositional dynamic logic (PDL), an extension of 

propositional logic with programs, under an epistemic interpretation. If P is a set of 

propositions and A a set of relational atoms, with p ∈ P an arbitrary proposition and i 

ranging over A, the language is given by 

φ ::= ⊤ | p | ¬φ | φ ∧ φ | [π] φ 

π ::= i | π ; π | π ∪ π | π ∗ | TEST φ 

The main idea of giving this PDL-language an epistemic interpretation (see e.g. (van 

Benthem, van Eijck and Kooi, 2006)) is that relational atoms represent epistemic accessibilities 

of single agents. The picture thus is the following: the state of knowledge of 

∗ For valuable comments we are very grateful to the referees and to Jan van Eijck. 



a group of agents is modeled as a multimodal S5 Kripke model M = (W, V, R), where 

W is a non-empty set of worlds, V is a valuation function that assigns to every basic 

proposition the set of all worlds where that proposition is true, and R is a function that 

assigns to every agent i an equivalence relation ∼i, where w ∼i w ′ expresses that i cannot 

distinguish between w and w ′ , i.e. that w and w ′ are epistemic alternatives for i. 

The semantics is defined with respect to a model M = (W, V, R), with the usual 

interpretation for ⊤, p, negation, and conjunction. The interpretation of [π] φ is given by: 

M, w |= [π] φ iff for all w′ with (w, w′) ∈ ⟦π⟧^M : M, w′ |= φ 

where ⟦π⟧^M is the meaning of the epistemic construct π, given as follows: basic epistemic 

constructs i are interpreted by ∼i and composed ones by means of regular operations on 

relations: ⟦π ; π′⟧^M = ⟦π⟧^M ◦ ⟦π′⟧^M where ◦ is relational composition, ⟦π ∪ π′⟧^M = 

⟦π⟧^M ∪ ⟦π′⟧^M, ⟦π∗⟧^M = (⟦π⟧^M)∗ where ∗ is the reflexive transitive closure, and TEST is the 

usual test of dynamic logics with ⟦TEST φ⟧^M = {(w, w) | w ∈ W and M, w |= φ}. The 

epistemic modalities thus express knowledge of an agent or a group of agents, including 

higher-order knowledge. Common knowledge among a group of agents is given by the 

reflexive transitive closure of the union of all individual accessibilities of agents in the 

group: [(i∪j ∪. . .) ∗ ] φ expresses that φ is common knowledge. 1 As an example, consider 

the following model (arrows in both directions are drawn as simple lines, and reflexive 

arrows are not drawn): 

[Figure: Kripke model over the worlds w0, w1 and w2, with valuations for p, q and r and accessibility edges labelled i, j and j] 

Connections between two worlds with label i indicate that agent i confuses these 

worlds. What is depicted is a knowledge state in which it is common knowledge among 

i and j that p (i.e. both know that p and both know that they both know p, etc.) and that 

r → q, and where i also knows q but does not know r, whereas j is ignorant both about q 

and r. 
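
The semantics of E-PDL given above can be sketched as a small model checker over finite models, with relations represented as sets of pairs. The tuple encoding of formulas and programs, the function names and the two-world model are illustrative assumptions; in particular, the model below is a hypothetical one and not the model of the figure.

```python
def compose(r, s):
    """Relational composition r ; s."""
    return {(a, d) for (a, b) in r for (c, d) in s if b == c}

def star(r, worlds):
    """Reflexive transitive closure of r over the given worlds."""
    closure = {(w, w) for w in worlds} | set(r)
    while True:
        bigger = closure | compose(closure, closure)
        if bigger == closure:
            return closure
        closure = bigger

def rel(model, prog):
    """Meaning of a program as a relation on worlds."""
    worlds, val, acc = model
    op = prog[0]
    if op == "agent":
        return acc[prog[1]]
    if op == "seq":
        return compose(rel(model, prog[1]), rel(model, prog[2]))
    if op == "union":
        return rel(model, prog[1]) | rel(model, prog[2])
    if op == "star":
        return star(rel(model, prog[1]), worlds)
    if op == "test":
        return {(w, w) for w in worlds if holds(model, w, prog[1])}
    raise ValueError(op)

def holds(model, w, phi):
    """Truth of a formula at world w."""
    worlds, val, acc = model
    op = phi[0]
    if op == "top":
        return True
    if op == "atom":
        return w in val[phi[1]]
    if op == "not":
        return not holds(model, w, phi[1])
    if op == "and":
        return holds(model, w, phi[1]) and holds(model, w, phi[2])
    if op == "box":
        return all(holds(model, v, phi[2])
                   for (u, v) in rel(model, phi[1]) if u == w)
    raise ValueError(op)

def equivalence(cells):
    """Equivalence relation whose classes are the given cells."""
    return {(u, v) for cell in cells for u in cell for v in cell}

# Hypothetical model: agent i distinguishes both worlds, agent j does not.
model = ({"w0", "w1"},
         {"p": {"w0", "w1"}, "q": {"w0"}},
         {"i": equivalence([{"w0"}, {"w1"}]),
          "j": equivalence([{"w0", "w1"}])})
common_p = ("box", ("star", ("union", ("agent", "i"), ("agent", "j"))),
            ("atom", "p"))
```

In this model p is common knowledge among i and j, agent i knows q at w0, and agent j does not.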

2.1 Direct yes/no questions and answerhood 

Now it is possible to formulate the ideas of Groenendijk & Stokhof’s partition semantics 

in E-PDL. Given the language above, we can define an additional process Fφ: 

F φ =def (TEST φ ; G ; TEST φ)∪(TEST ¬φ ; G ; TEST ¬φ) where G = W ×W 

With respect to the semantics given above, Fφ denotes an equivalence relation: 

⟦F φ⟧^M = {(w, w′) | M, w |= φ iff M, w′ |= φ} 
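
That the composed process indeed yields this equivalence relation can be checked on a small example. The sketch below assumes a finite set of worlds and represents a formula by its extension (the set of worlds where it holds); the names `partition_relation` and `cell` are illustrative.

```python
def compose(r, s):
    """Relational composition r ; s."""
    return {(a, d) for (a, b) in r for (c, d) in s if b == c}

def partition_relation(worlds, extension):
    """(TEST φ ; G ; TEST φ) ∪ (TEST ¬φ ; G ; TEST ¬φ) with G = W × W."""
    test_pos = {(w, w) for w in worlds if w in extension}
    test_neg = {(w, w) for w in worlds if w not in extension}
    g = {(w, v) for w in worlds for v in worlds}
    return (compose(compose(test_pos, g), test_pos)
            | compose(compose(test_neg, g), test_neg))

def cell(relation, w):
    """The partition cell containing w."""
    return {v for (u, v) in relation if u == w}

# Hypothetical three-world example in which φ holds only at w1.
worlds = {"w0", "w1", "w2"}
extension = {"w1"}
f_phi = partition_relation(worlds, extension)
expected = {(w, v) for w in worlds for v in worlds
            if (w in extension) == (v in extension)}
```

Here `f_phi` coincides with `expected`, and the cell of w0 is {w0, w2}.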

1 The possibility to express common knowledge makes E-PDL more expressive than simple multi-agent 

epistemic logics that arise from extending propositional logic with knowledge modalities. 



The process F thus partitions the model with respect to a formula. Using the restriction 

operation on relations, we will refer to the partition cell containing a world w (in a 

model M) as [w]F φ. An example is given in the following figure, where the model was 

partitioned with respect to p. 

[Figure: a model partitioned with respect to p, showing the cell [w]F p that contains the world w] 

It is important to note that a question is not a formula, as in (Groenendijk and Stokhof, 

1984), but a process. This might seem like a minor difference, but it will allow us in the 

next section to use questions as updates and thereby talk about the communicative act of 

questioning. 

Defining the process F is enough already to define what it means for a formula to be a 

true or a possible answer to a question. 

Definition 1. A formula ψ is a true answer to the question whether φ, w.r.t. a model M 

and a world w, if for all w ′ ∈ W : M, w ′ |= ψ iff w ′ ∈ {v | (w, v) ∈ [w]F φ}. 

Definition 2. A formula ψ is a possible (or: appropriate) answer to the question whether 

φ, w.r.t. a model M, if there is some w ∈ W , such that ψ is a true answer to the question 

whether φ w.r.t. M and w. 

In other words, a possible answer is a formula with a denotation that spans exactly one 

of the partition cells induced by the question; it is a true answer if this partition cell is 

the actual one with respect to a particular world w. The somewhat small difference to the 

picture of Groenendijk & Stokhof is that we do not rely on entailment between questions 

to define answerhood. Entailment between questions can nevertheless be explicated by 

requiring, for the question whether φ to entail the question whether ψ, that for all M: 

⟦F φ⟧^M ⊆ ⟦F ψ⟧^M 

Up to now, we have a logic with basic propositions and boolean combinations, together 

with epistemic operations on these, that represent knowledge of (groups of) agents. This 

gave us a possibility to talk about questions as partitioning processes and about formulas 

being answers to questions. However, it tells us nothing about what it means to pose 

a question in a communication, about its effects, and about what it means to answer it, 

because we have no means yet to talk about communicative actions. For that we will 

move to a Dynamic Epistemic Logic that also contains public announcements and public 

questions. 



3 Questioning and answering 

Dynamic Epistemic Logic provides a logical framework for reasoning about knowledge of 

(groups of) agents and change of this knowledge due to communication. Another modal 

operator, taken from Public Announcement Logic (Plaza, 1989), is added to E-PDL, that 

models the event of all agents being told simultaneously and transparently that a certain 

formula holds. Change of knowledge induced by announcements corresponds to updates 

of knowledge states. Analogously, we add a modal operator for public questions and 

a one-place predicate to turn a formula φ into a formula WH φ. Thus, the language is 

extended in the following way: 

φ ::= ... | [!φ] φ | [?φ] φ | WH φ 

The communicative effect of a public announcement is given by a restriction operation on 

epistemic models: 

M, w |= [!φ] φ ′ iff M, w |= φ implies M | φ, w |= φ ′ 

where M | φ is the restriction of M to φ, i.e. the epistemic model M′ = (W′, V′, R′) 

with W′ = {w ∈ W | M, w |= φ}, V′ the restriction of V to W′ and R′ the result of 

restricting each ∼i to W′ × W′. 

The interpretation of the question update has to be different, because asking a question in 

a communicative situation obviously has no effect on the knowledge of agents. It rather 

creates or shifts the focus of the conversation, in many cases to particular alternatives 

among which the answer lies. To model the focus of a conversation we add to R in the 

model an additional accessibility relation FOCUS, which initially denotes W × W and is 

visible to all agents. The question update can then be interpreted as resetting FOCUS: 

M, w |= [?φ] ψ iff M[FOCUS := ⟦F φ⟧^M], w |= ψ 

where M[FOCUS := ⟦F φ⟧^M] is like M except that the denotation of the relation 

FOCUS is set to the denotation of F φ. 

Answering can then simply be seen as announcement of an answer. Note that an appropriate 

answer automatically resets the FOCUS relation to its default value (i.e. neutral 

focus W × W ), because the update with the answer will eliminate all but one partition 

cell. From the definition of an appropriate answer it also follows immediately that the 

following holds. 
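
The two updates can be sketched as operations on finite models, again representing formulas by their extensions. The model is a four-tuple (worlds, valuation, agent relations, FOCUS); the function names `ask` and `announce` are illustrative. The example model is a hypothetical one with two agents t and c, and its valuation of q is an assumption of this sketch.

```python
def equivalence(cells):
    """Equivalence relation whose classes are the given cells."""
    return {(u, v) for cell in cells for u in cell for v in cell}

def ask(model, extension):
    """Question update: knowledge is untouched, FOCUS is reset to the
    partition induced by the questioned formula."""
    worlds, val, acc, _ = model
    focus = {(u, v) for u in worlds for v in worlds
             if (u in extension) == (v in extension)}
    return (worlds, val, acc, focus)

def announce(model, extension):
    """Public announcement: restrict worlds, valuation and all
    relations (including FOCUS) to the worlds where the formula holds."""
    worlds, val, acc, focus = model
    kept = worlds & extension
    restrict = lambda r: {(u, v) for (u, v) in r if u in kept and v in kept}
    return (kept,
            {p: ws & kept for p, ws in val.items()},
            {i: restrict(r) for i, r in acc.items()},
            restrict(focus))

# Hypothetical model: p holds only at w1; q holds at w0 (assumption).
worlds = {"w0", "w1", "w2"}
val = {"p": {"w1"}, "q": {"w0"}}
acc = {"t": equivalence([{"w0", "w1"}, {"w2"}]),
       "c": equivalence([{"w0", "w2"}, {"w1"}])}
initial = (worlds, val, acc, equivalence([worlds]))

asked = ask(initial, val["p"])                   # update with ?p
answered = announce(asked, worlds - val["p"])    # update with !¬p
```

After the question, FOCUS has the two cells {w1} and {w0, w2} while the agents' relations are unchanged; after the answer, only w0 and w2 remain and the restricted FOCUS is again the total relation on the remaining worlds.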

Proposition 1. If ψ is an appropriate answer to the question whether φ, then for all 

w : M, w |= [?φ][!ψ] φ or M, w |= [?φ][!ψ]¬φ. 

It expresses that appropriate answers do indeed answer the question. (For announcements 

in general it is of course not the case that either [!ψ]φ or [!ψ]¬φ holds.) 

This is best demonstrated by an example. Assume two agents Turing (t) and Church 

(c), and let p and q be propositions (for example, p for "For every program we can 

decide whether it halts" and q for "There is no general algorithm to decide whether a 

property of natural numbers is true or not"). The knowledge state of Turing and Church 

is depicted in the leftmost model of the figure below. For example, Turing does not know 

p, but he knows that Church knows whether p. These are preconditions usually considered 


to hold for questions to be felicitous. So Turing decides to ask whether p. 2 After updating 

with this question, appropriate answers will be p and ¬p (or formulas equivalent to these). 

Assuming that w0 is our world of reference, the true answer is ¬p. Announcement of the 

true answer eliminates world w1, which results in Turing knowing ¬p. In fact he now also 

happens to know q, whereas Church is still ignorant about it. 

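[Figure: three epistemic models over the worlds w0, w1 and w2 with valuations for p and q and accessibility edges labelled t and c: the initial knowledge state, the state after the question update ?p, and the state after the announcement !¬p, in which only w0 and w2 remain] 

3.1 Embedded yes/no questions 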

This approach is not at all restricted to direct questions but can straightforwardly deal with 

embedded questions as well, by means of the predicate WH. Embedded questions differ 

from direct questions in that they seem to refer to a specific partition cell (namely the true 

one) and not the partitioning as a whole, yet do not give away which partition cell is the 

true one. We already have the means to achieve this: 

⟦WH φ⟧^{M,w} =def {w′ | (w, w′) ∈ [w]F φ} 

The formula WH φ can now be embedded in other formulas, for example in statements 

about the knowledge of agents. E.g., "Ian knows whether Penicillin was flown in" can 

be represented as [i] (WH p). In general, the following facts hold. 

Proposition 2. [i] (WH φ) |= [i] φ ∨ [i] ¬φ 

I.e. if an agent knows whether a formula is the case, he knows either the formula or its

negation. 

Proposition 3. [i] (WH φ) ⊭ φ and analogously [i] (WH φ) ⊭ ¬φ

I.e. the statement that an agent knows whether a formula is the case does not provide the

information whether the formula or its negation is the case. 
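Propositions 2 and 3 can be checked by brute force on a small frame. The sketch below is an illustration, not a proof: it assumes a single agent i with a fixed partition of three worlds, reads [i](WH p) as "every world i considers possible agrees with the evaluation world on p", and uses hypothetical names throughout.

```python
from itertools import product

WORLDS = ["u", "v", "w"]
# ASSUMED information cells of a single agent i (an S5-style partition).
CELLS_I = [{"u", "v"}, {"w"}]


def cell_i(world):
    """Worlds agent i considers possible at the given world."""
    return next(c for c in CELLS_I if world in c)


def wh(val, world):
    """Denotation of WH p at `world`: the worlds that agree with it on p."""
    return {x for x in WORLDS if val[x] == val[world]}


def knows_wh(val, world):
    """[i](WH p): every world i considers possible lies in the WH-cell."""
    return cell_i(world) <= wh(val, world)


def knows(val, world, value):
    """[i]p (value=True) or [i]not-p (value=False)."""
    return all(val[x] == value for x in cell_i(world))


def valuations():
    """All assignments of truth values for p to the three worlds."""
    for bits in product([True, False], repeat=len(WORLDS)):
        yield dict(zip(WORLDS, bits))


# Proposition 2: whenever [i](WH p) holds, [i]p or [i]not-p holds.
prop2 = all(
    knows(val, x, True) or knows(val, x, False)
    for val in valuations()
    for x in WORLDS
    if knows_wh(val, x)
)

# Proposition 3: [i](WH p) holds at some p-world and at some not-p-world,
# so it entails neither p nor not-p.
witnesses = {val[x] for val in valuations() for x in WORLDS if knows_wh(val, x)}
```

On this frame `prop2` is true, and `witnesses` contains both truth values, which is the pattern the two propositions describe.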

4 Constituent questions 


Let us briefly sketch how our approach can be extended to the predicate logic case, to

also account for constituent questions along the lines of Groenendijk & Stokhof’s work.

For this, we start with a first order dynamic logic: 

t ::= c | x 

φ ::= ⊤ | t | P t . . . t | ¬φ | φ ∧ φ | ∃x φ | [π] φ

π ::= i | π ; π | π ∪ π | π ∗ | TEST φ 

2 Actually announcements and questions are not parametrized with respect to an agent. We will shortly 

come back to that in section 5. 


where c ranges over constants, and x ranges over variables. To this language, public 

announcements, questions, and a question embedding predicate are added, as above: 

φ ::= ... | [!φ] φ | [?φ] φ | WH φ 

This language is interpreted with respect to a first order model M with fixed domain, a 

world w, and a variable assignment g, as usual. 

Like Groenendijk & Stokhof, we add another operator to bind variables. 

Definition 3. If φ is a formula in which all and only the variables x1, . . . , xn have one or 

more free occurrences, then Qx1 . . . xn φ is a formula. 

A partitioning process over such a formula will model constituent questions. E.g. 

F Qx P x corresponds to the question about which entity lies in the denotation of P . The 

semantics can be adopted from (Groenendijk and Stokhof, 1997): 

⟦Qx1 . . . xn φ⟧^{M,w,g} = {w′ | ⟨Qx1 . . . xn φ⟩^{M,w′,g} = ⟨Qx1 . . . xn φ⟩^{M,w,g}}

where

⟨Qx1 . . . xn φ⟩^{M,w,g} = {(g′(x1), . . . , g′(xn)) | M, w, g′ |= φ, where g′(x) = g(x) for all x ≠ x1, . . . , xn}

I.e. ⟦Qx1 . . . xn φ⟧^{M,w,g} is the set of all worlds in which the same entities belong to the

extension of φ as in w. For example, for a question like Who is coming to the party? 

we get

⟦F Qx P x⟧^{M,g} = {(w, w′) | ⟦Qx P x⟧^{M,w,g} = ⟦Qx P x⟧^{M,w′,g}}

This means that F Qx P x partitions the model with respect to all possible extensions of 

the predicate P . Thus, John is coming to the party is indeed a true answer to the 

question if j is in the extension of P in the actual world. 

Notice that in the case of closed formulas φ, the process F φ models a yes/no question 

as above. 
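The partitioning effect of F Qx P x can be made concrete with a toy model. The following Python sketch assumes four hypothetical worlds that differ only in the extension of P (read as "is coming to the party"); the names and the model are illustrative assumptions, not part of the formal system above.

```python
from collections import defaultdict

# ASSUMED toy model: each world fixes the extension of P.
EXT_P = {
    "w0": frozenset({"john"}),
    "w1": frozenset({"john", "mary"}),
    "w2": frozenset({"john"}),
    "w3": frozenset(),
}


def partition_by_extension(ext):
    """Group worlds that assign the same extension to P (the cells of F Qx P x)."""
    cells = defaultdict(set)
    for world, extension in ext.items():
        cells[extension].add(world)
    return dict(cells)


def cell_of(world, ext):
    """The cell of a world: all worlds with the same extension of P."""
    return partition_by_extension(ext)[ext[world]]


CELLS = partition_by_extension(EXT_P)
```

Here w0 and w2 end up in the same cell because P has the same extension in both, whereas w1 and w3 each form a cell of their own.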

4.1 Answerhood 


In the case of constituent questions, we have to distinguish between partial and exhaustive 

true answers. This does not pose a problem, assuming that the result of announcing an 

exhaustive answer to a question φ is equal to [w]F φ, where w is the actual world, while 

the result of announcing a partial answer is a subset of FOCUS that contains [w]F φ. 
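The difference between exhaustive and partial answers can be sketched in the same style. The snippet below is a simplification under stated assumptions: it works with sets of worlds rather than with the FOCUS relation itself, uses a hypothetical four-world model, and takes w0 to be the actual world.

```python
# ASSUMED toy model: each world fixes the extension of P ("is coming to the party").
EXT_P = {
    "w0": frozenset({"john"}),
    "w1": frozenset({"john", "mary"}),
    "w2": frozenset({"john"}),
    "w3": frozenset(),
}
ACTUAL = "w0"


def announce(condition, worlds):
    """Keep the worlds whose extension of P satisfies the announced condition."""
    return {w for w in worlds if condition(EXT_P[w])}


def actual_cell():
    """The cell of the actual world: worlds with the same extension of P."""
    return {w for w in EXT_P if EXT_P[w] == EXT_P[ACTUAL]}


ALL = set(EXT_P)
# Exhaustive answer: "John, and nobody else, is coming."
exhaustive = announce(lambda ext: ext == frozenset({"john"}), ALL)
# Partial answer: "John is coming."
partial = announce(lambda ext: "john" in ext, ALL)
```

In this model the exhaustive answer leaves exactly the cell of the actual world, while the partial answer leaves a strictly larger set that still contains that cell.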

4.2 Embedded constituent questions 

The WH predicate introduced in section 3 for embedding yes/no questions also works for

embedding constituent questions. Consider, for example, the embedding of who is coming 

to the party (as in Ian knows who is coming to the party): its interpretation is

⟦WH Qx P x⟧^{M,g} = {w′ | (w, w′) ∈ [w]F Qx P x}

As mentioned above, F Qx P x partitions the model with respect to all possible extensions 

of P . Thus [w]F Qx P x corresponds to the true exhaustive answer. Therefore, Ian knows 

who is coming to the party means that Ian knows the exhaustive answer to the question 

about who is coming to the party. 


5 Further research 

One goal for a semantics of direct questions that we have not touched upon yet is offering

updates YES and NO as answers to yes/no questions. Having them correspond to the

actual use of natural language yes and no, however, is quite a complex matter. Leaving

this discussion aside, the most straightforward way to accommodate the possibility of

simple yes/no answers in our system would be to add a yes/no predicate to the language, 

which a question whether φ sets to the denotation of φ. YES would then correspond 

to the announcement of this predicate, whereas NO would be the announcement of its 

negation. Having such a predicate provides another possibility to specify answerhood: 

a formula ψ would be an answer to the question whether φ if it were equivalent to the 

predicate or its negation. This would actually suffice to also get all the partitioning effects 

and propositions we talked about in section 2. As long as only alternative questions are 

addressed, it is just a question of design whether to use such a predicate or a relation as we 

did. The advantage of our proposal, however, is that it can be extended to the predicate logic

case for constituent questions. It thus allows one uniform mechanism to underlie

both kinds of questions. 

Possibly the main benefit of our proposal is the general benefit of Dynamic Epistemic 

Logic for natural language analysis: DEL provides a powerful framework for the formalization 

of pragmatic concepts that play a role in communicative situations. Based on 

this, it can also serve to explore the interaction between these phenomena. Let us illustrate 

this by briefly looking at presuppositions. An update of a question ψ that carries a

presupposition φ can be expressed as the update (TEST φ ; ?ψ), which first tests whether 

the presupposition is satisfied, and, if so, updates with the actual question. If the presupposition 

test fails, the whole update is not successful. Such a presupposition could, for

example, be that the speaker who asks does not know the answer but considers it possible that

someone among the addressees knows it. For this, one would need a means to parametrize 

announcements (and question updates) to a certain agent, which, to our knowledge, has 

not been done yet. On the other hand, one can ask whether a proposition that itself carries

a presupposition is the case, e.g. Did John stop smoking? Adopting

the treatment of presuppositions in (Eijck and Unger, 2007), asking a proposition ψ that

carries a presupposition φ would set FOCUS to the denotation of F (Cφ ∧ ψ) (where Cφ

abbreviates that φ is common knowledge). If Cφ is false, however, the partitioning induced 

by Cφ ∧ ψ consists of just one partition cell and thus fails to create a focus with which

an informative answer would be possible. 

Furthermore, incorporating not only knowledge but also belief would allow for modeling, 

among the presuppositions, different expectations in question pairs like Are you 

going to Groningen? and Aren’t you going to Groningen?. 

Last but not least, another possible line of research is to investigate how our FOCUS 

relation connects to Rooth’s theory of focus interpretation (Rooth, 1992), and thus explore 

the connection between questions and information-structural focus. 

References 


Eijck, J. v. and Unger, C. (2007). The epistemics of presupposition projection, in 

M. Aloni, P. Dekker and F. Roelofsen (eds), Proceedings of the Sixteenth Amsterdam 

Colloquium, December 17–19, 2007, ILLC, Amsterdam, pp. 235–240. 



Groenendijk, J. and Stokhof, M. (1984). Studies on the Semantics of Questions and the 

Pragmatics of Answers, PhD thesis, Universiteit van Amsterdam. 

Groenendijk, J. and Stokhof, M. (1997). Questions, in J. van Benthem and A. ter Meulen 

(eds), Handbook of logic and language, Elsevier, chapter 19, pp. 1055–1124. 

Hamblin, C. (1973). Questions in Montague English, Foundations of Language 10: 41–53.

Higginbotham, J. and May, R. (1981). Questions, quantifiers and crossing, Linguistic 

Review 1: 41–79. 

Karttunen, L. (1977). Syntax and semantics of questions, Linguistics and Philosophy 1: 1– 

44. Also published in: Portner & Partee (eds.): Formal Semantics. The Essential 

Readings. Blackwell, 2003, pp 382–420. 

Plaza, J. (1989). Logics of public communications, in M. Emrich, M. Pfeifer, 

M. Hadzikadic and Z. Ras (eds), Proceedings of the 4th International Symposium 

on Methodologies for Intelligent Systems, pp. 201–216.

Rooth, M. (1992). A theory of focus interpretation, Natural Language Semantics 

1(1): 75–116. 

van Benthem, J., van Eijck, J. and Kooi, B. (2006). Logics of communication and change, 

Information and Computation 204(11): 1620–1662. 


THE SEMANTIC CHANGE OF THE FRENCH -AGE-DERIVATION 

Melanie Uth 

University of Stuttgart 

Abstract. In this paper, I will investigate the diachrony of the French -age-derivation. I will 

argue that -age originally served to derive kind terms that have been reinterpreted as group 

terms and as true event nominalizations. The main hypothesis will be that the reinterpretation 

of the -age-suffixation was enabled by the fact that the original derivatives and the 

new derivatives share important features of their abstract conceptual representations. This 

approach to the diachrony of the -age-suffixation predicts that even nowadays, true event 

nominalizations in -age focus on the atelic parts of the event denoted by the base verb. As 

such, the proposal constitutes (further) evidence in favor of the hypothesis that -age may

be differentiated from its rival -ment by means of its specific aspectual characteristics. 



Modern French -ment and -age are often described as competing nominalization suffixes, 

since they frequently attach to the same verbal bases. For example, Lüdtke (1987) gives 

a list of 187 doublets, e.g. gonflement/gonflage (’inflation’), and refers to ”the numerically

most important overlap in the French lexicon” (ibid.: 103). By contrast, in

Old French, the -ment-suffixation was one of the standard procedures for deverbal event 

nominalization, while event nominalizations in -age were marginal. In our database, we

find 99.5% event nominalizations in -ment as opposed to 0.5% deverbal nominalizations in

-age. In this paper, I will investigate the conditions that enabled the propagation of true 

event nominalizations in -age from Old to Modern French. I will argue that -age could 

develop into an event nominalization suffix next to -ment because the two suffixes systematically 

differ as concerns the abstract conceptual representation of their derivatives. 

In section 2, I will concentrate on the Latin antecedents of the French -age-derivation, the 

relational adjectives in -aticu, as well as on their substantivized forms that are transferred

to Old French. Section 3 focuses on the genuinely French formations, i.e. group terms

and true event nominalizations. In section 4, I will argue that the genesis of group terms 

in -age resulted from a reinterpretation of the borrowed terms, that was enabled by the 

fact that the original derivatives and the new derivatives share important features of their 

abstract conceptual representations. In section 5, I hypothesize that an analogous reinterpretation 

occurred in the deverbal domain, predicting that true event nominalizations in

-age only attach to bases that are in some sense atelic. Finally, we will consider different 

analyses of Modern French -age-nominalizations showing that these may indeed be 

characterized by the salience of atelic aspectual values. 

2 The antecedents of New French -age: relational -aticu-adjectives and borrowed substantivizations

The French -age-suffixation developed from the Latin denominal relational adjectives in 

-aticu that served to sub-classify the type of object or event denoted by the head nouns 



(e.g. census terraticus ’tax on land’). Denominal relational adjectives establish a relation 

between the head noun and the base noun whose exact semantic function needs to be 

specified by the semantics of the derivational constituents, and possibly by further contextual 

influences, cf. e.g. Fradin (2008: 3ff). For the present purposes, the crucial point is

that relational adjectives classify types of nouns (in the sense of Vergnaud & Zubizarreta

(1992)), instead of concrete tokens. 

In recent work on kind terms it is common to treat ”kinds” on a par with ”classes” or

”types” of objects (in the relevant sense). For example, Krifka et al. (1995) as well as

Chierchia (1998) assume that common nouns generally have both a kind-referring function 

and a predicative function, predicates and kinds being related by virtue of the realization 

relation R in the sense that every object y in the extension of the predicate δ is 

an instantiation of the kind x (ιx.∀y[δp(y) ↔ R(y,x)]). 1 Building on this approach to 

kinds, we may conceive of the relational -aticu-adjectives as deriving terms referring to 

(sub-)kinds, in a way such that census terraticus is a kind of census, just as e.g. porcus 

silvaticus (’wild pig’) is a kind of porcus and canis venaticus (’staghound’) is a kind of 

canis. 2 

During the transition from Latin to Old French, the -aticu-adjectives were substantivized 

and resulted in designations of taxes, rights, status etc. that are lexicalized and 

entail the traditional head noun as a semantic constituent (cf. Fleischman (1990: 10ff)): 

(1) TAX (chevage = bounty, capitation; from chief ): 

la fud subjecte e rendid chevage . . . 

there was-3Sg subjected and paid-3Sg capitation . . . 

’he subjected it and paid capitation’ (NCA:reis) 

(2) RIGHT (passage = right to cross a territory; from passer ): 

si (. . . ) disent . . . , que il queroient passage . . . 

prt. say-3Pl that they ask for passing 

’and (they) said that they ask for the right to pass. . . ’ (NCA: clari) 

It is important to note that, whereas the base nouns of these lexicalized substantivizations 

consistently retain their kind-referring function, the interpretation of the incorporated 

head nouns varies between kind-reference and object-reference, depending on the 

context. This difference is signalled by the determiner system of Old French, where terms 

that do not refer to actual extensions show up as bare nouns (cf. Foulet (1998: 49)): 

(3) a. et cele claciele guardoit en zz escrignet k il avoit quanqu 

and this little key keep-3Sg in a shrine that he got-3Sg when 

estovoit a monniage. 

was-3Sg in monasticism. 

‘and he kept this little key in a shrine that he got when he lived in monasticism’ 

(NCA: P. Mouskes) 

b. ne fait sanblant que il s en faingne le singne fait dou 

not do-3Sg seeming that he refl of it feign-3Sg the monkey done by the 

moniage. 

monkhood.

‘he did not seem to feign the monkey made by the monkhood’ (NCA: Renart)

1 Contrary to Chierchia (1998), Krifka et al. (1995: 66) ”leave it open as to whether every predicate has a

corresponding kind individual”. 

2 See McNally & Boleda (2004) for a similar approach to relational adjectives. However, for the time 

being the above analysis is meant to refer only to Latin -aticu, not to relational adjectives in general.



Building on a representational format proposed by Fradin (2008: 4), the lexical semantics 

of the above substantivizations may be represented as in (4), where the index k is meant

to indicate kind-reference, o signals object-reference, and REL designates the relation that

is introduced by the denominal adjective relating the denotation of the head noun (’rank’) 

to the one of the base noun (’monk’): 3 

(4) a. T (moniage) = (λxk.REL(xk, yk) ∧ rank′(xk) ∧ monk′(yk))

b. T (moniage) = (λxo.REL(xo, yk) ∧ rank′(xo) ∧ monk′(yk))

3 Genuinely French -age-derivatives: group terms and event nominals

Next to the substantivized -age-derivatives borrowed from Latin there are two genuinely

French coinages, i.e. group terms (5) and true event nominalizations (6): 

(5) GROUP (porcage, ’porc’ = herd of swine): 

toutes mes bestes et le meilleur porc du porcage 

all my beasts and the best pig of the herd of pigs 

’all my beasts and the best pig of the herd of pigs’ (GO: B. deCaux) 

(6) EVENT NOMINALIZATION (mariage, ’marier’ = marriage): 

il firent le mariage du dit chevalier et de . . . 

they make-3Pl the marriage of the said knight and of . . . 

’they conducted the marriage of the mentioned knight and . . . ’ (NCA: vilhar) 

These forms developed from the substantivized -aticu-adjectives by means of three 

innovations: the replacement of the semantically incorporated head nouns by

”group of” and ”event of”, respectively, the strictly extensional interpretation of these

new ”head nouns”, as well as the strictly extensional interpretation of the derivational 

bases, in a way such that the new derivatives generally refer to actual objects and events 

instead of kinds and event types: 

(7) T (porcage) = (λxo.REL(xo, yo) ∧ rank ′ (xo) ∧ pork ′ (yo)) 

4 Approaching the origin of the group terms 

As regards the (semantic) constituents of the newly coined group nouns in -age, note that 

the new ”head nouns” are interpreted as denoting singular individuals (one group), while 

the kind-denoting base nouns have been reinterpreted as denoting plural individuals (e.g. 

several pigs). In the following I would like to argue that this hybrid character of the new 

coinages may be traced back to the fact that the kind-reference of the traditional -aticu base

nouns was much more dependent on the instantiations of the respective kinds than 

the kind-reference of the head nouns. 

In section 2, we already argued that a common noun may in principle denote both a

kind and its instantiations. The relevant definition by Krifka et al. (1995: 66) is

repeated in (8):

3 This relation is specified as REL instead of R in order to separate it from the realization relation R

mediating between a kind and its instances (cf. above). 



(8) ιx.∀y[δp(y) ↔ R(y,x)] 

Roughly: “There is a kind x, such that every y that is in the extension of the 

corresponding predicate δp is a realization of x.”

Furthermore, Krifka et al. (1995:78ff) show that, when a common noun shows up with 

kind-reference, the interpretation of the corresponding sentences often also involves the 

instantiations of the kind. According to these authors, ”even kind predicates [such as be 

extinct or be widespread (MU)] are related to properties of instances of the kind, if we 

engage in an analysis of the lexical meaning of such predicates (. . . ). For example, in 

order to show that the dodo is extinct, one has to show that there have been realizations of 

this kind in the past, that there are no present realizations of this kind now, and, perhaps, 

that there will be no more in the future” (ibid.: 78f). In the remaining contexts triggering kind-reference

that are discussed by Krifka et al. (1995), the interpretation of the common 

noun relies to a still greater extent on the instantiations of the relevant kind. A central 

example is the so-called distinguishing property interpretation as in Dutchmen are good 

sailors meaning that ”the Dutch distinguish themselves from other comparable nations by 

having good sailors” (ibid.: 82f). According to this interpretation, the verbal predicate is

definitely related to properties of instantiations of the kind. 4 

Coming back to the Latin -aticu-adjectives, I would like to argue that the interpretation 

of their base nouns also essentially relied on the close relation between the kinds and their 

instantiations. For example, the kind-reference of the base noun baron of barnage (’quality 

of barons’) builds on the fact that an undefined number of instantiations of the kind is 

said to be distinguishably noble. This analysis largely holds for all borrowed derivatives 

derived from bases that refer to human beings, as e.g. eschevinage (’rank of jury men’), 

veuvage (’widowhood’), etc. By contrast, the kind interpretation of fiscal terms such as porcage

(’tax on swine’) is closely related to the instantiations since these are the ones the taxes

are to be paid for. Note that if we highlight the impact of the instantiations for the referential

characteristics of the -aticu base nouns, this is not to say that the base nouns do 

not refer to kinds any longer. We are still faced with an intensional interpretation, i.e. the 

instantiations do not need to exist in the actual world. For the sake of convenience, we 

will distinguish in what follows between intensionally defined instantiations and extensionally 

defined (actual) instances of kinds. Finally note that the instantiations of a kind 

are necessarily non-singular in the sense of Chierchia (1998:350) who argues that ”kinds 

(. . . ) will generally have a plurality of instances (even though sometimes they may have 

just one or none). But something that is necessarily instantiated by just one individual (e.g.,

the individual concept or transworld line associated with Gennaro Chierchia) would not 

qualify as a kind.” 

Contrary to that, if a given head noun like ’rank’ or ’quality’ refers to a kind, the entire 

kind as a whole is much more salient than the plurality of its instantiations. The second 

difference between the -aticu head nouns and the corresponding base nouns is already 

illustrated by example (3) above, i.e. whereas the base nouns consistently retain their kind-referring

function, the head nouns refer to actual instances of the kind as soon as the 

derivative shows up with an adequate predicate, as e.g. faire in (3b), repeated below as 

(9): 

4 Krifka et al. (1995) argue that the above sentence clearly differs from characterizing sentences such as ”Potatoes

contain vitamin C” in that the former unlike the latter is not adequately paraphrased by an indefinite

singular NP (cf. ”A Dutchman is a good sailor” vs. ”A potato contains vitamin C”).



(9) ne fait sanblant que il s en faingne le singne fait 

not do-3Sg seeming that he refl. of it feign-3Sg the monkey done 

dou moniage. 5 

by the monkhood.

‘he did not seem to feign the monkey made by the monkhood’ (cf. (3b))

Hence, we may generalize that the incorporated head nouns of the borrowed -age-derivatives

are interpreted as denoting either the kind as an entirety or a concrete instance

of it, whereas the morphologically established kind-reference of the corresponding base 

nouns essentially relies on the (non-singular) instantiations. I would like to argue that 

this difference between the interpretation of head nouns and base nouns is reflected at an 

abstract conceptual level of semantic representation, in a way such that the concepts denoted 

by the head nouns are interpreted as representing bounded entities without internal 

structure, whereas the concepts denoted by the base nouns are interpreted as representing 

unbounded entities composed of sub-individuals. Relying on Jackendoff (1991), we may 

represent the conceptual structure of e.g. le moniage (’the rank of monks’) as in (10), 

where the feature [± b] encodes the distinction between bounded entities like PIG and 

non-bounded entities like WATER, while the feature [± i] signals the distinction between 

non-structured individuals like BRICK and those that are composed of sub-individuals, 

like BUSES or CATTLE (cf. Jackendoff (1991: 20)). PL (”plural”) and REL (”relation”) 

denote functions that map between different values of b and i: 

(10) conceptual structure of le moniage (’the rank of monks’): 

[+b, -i RANK (REL ([-b, +i MONKS (PL ([+b, -i MONK]))]))]

As regards the group terms in -age, it is interesting to note that recent analyses tend to 

define even non-derived group terms like English committee as being semantically hybrid 

in a sense reminiscent of our substantivized -aticu-adjectives. For example, Barker (1992)

proposes that a group term denotes an atomic individual that is (merely) related to the 

plural individual constituting its members by a membership function f. An argument for 

the difference in extension between the group term and the related plural predicate is that 

there are properties common to all of the members which are never true of the group. For 

example, Bill can be a member of committee A, whereas committee A cannot (cf. ibid.:

73). Likewise a group may have properties that the collection of its members does not 

have, e.g. a group has members while a plurality does not. 

One piece of evidence corroborating the validity of this approach for the group nouns 

in -age comes from predicates that directly refer to the members of the group denoted by 

the corresponding -age-derivative: 

(11) li quens (. . . ) a fait son barnage asanbler. 

the count has done his knights assemble

‘the count assembled his knights’ (NCA: elie) 

Adopting this approach to group nouns, we may conclude that the abstract conceptual 

representation of the genuinely French group nouns in -age is very similar to the one of

the borrowed -aticu-substantivizations, since it likewise entails a bounded non-composed 

5 We may assume that borrowed -age-derivatives as in (9) require a determiner since the verbal predicate

triggers the type shifting from kind (e) to predicate (e,t), cf. e.g. Chierchia (1998: 353). 



concept (in this case the group individual), followed by an unbounded concept that is 

composed of sub-individuals (i.e. the members). The relevant conceptual representation 

is given in (12), COMP representing Jackendoff’s (1991) ”composed of”-function (cf.

ibid.: 23).

(12) [+b, -i GROUP (COMP ([-b, +i BARONS (PL ([+b, -i BARON]))]))]

According to this analysis, the substantivized -aticu-adjectives borrowed from Latin 

and the genuinely French group nouns in -age have the same ’skeleton’ (in the sense of

Lieber (2004)): 

(13) a. [+b, -i ([-b, +i (+b, -i) ])] (borrowed substantivizations) 

rank monks monk 

b. [+b, -i ([-b, +i (+b, -i) ])] (genuinely French group terms) 

group barons baron 

This suggests that the genesis of the group nouns in -age essentially relied on the 

abstract conceptual representation of the substantivized -aticu-adjectives. 
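The claim that (10) and (12) share one skeleton can be checked mechanically. The following Python sketch uses a hypothetical tuple encoding of the conceptual structures (boundedness, internal structure, label, function, argument); the encoding and the function name are assumptions made for illustration only.

```python
# A concept is encoded as (bounded, internal_structure, label, function, argument);
# function and argument are None for a simple concept. This encoding is hypothetical.
MONK = (True, False, "MONK", None, None)
MONKS = (False, True, "MONKS", "PL", MONK)
RANK_OF_MONKS = (True, False, "RANK", "REL", MONKS)

BARON = (True, False, "BARON", None, None)
BARONS = (False, True, "BARONS", "PL", BARON)
GROUP_OF_BARONS = (True, False, "GROUP", "COMP", BARONS)


def skeleton(concept):
    """Strip labels and function names, keeping only the nested [±b, ±i] features."""
    features = []
    while concept is not None:
        bounded, internal, _label, _function, argument = concept
        features.append(("+b" if bounded else "-b", "+i" if internal else "-i"))
        concept = argument
    return features
```

With this encoding, `skeleton(RANK_OF_MONKS)` and `skeleton(GROUP_OF_BARONS)` return the same list of feature pairs, although the two structures differ in their labels and in the function (REL vs. COMP) that links the outer concept to the pluralized base.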

Still, the differences between the borrowed derivatives and the new ones are remarkable, 

the most important development arguably being the change from the kind level to 

the level of actual instances, that accompanies the replacement of the incorporated head 

nouns. One possible approach to this change would be to assume that the incorporated 

head nouns were replaced by concepts whose quantitative constitution is closest to the 

abstract conceptual representation of the traditional derivatives, hence the introduction of 

the group-concept. According to this view, the new coinages represent the default realizations 

chosen by the native speakers because of their proximity to the skeleton in

(13a). 

However, since we do not have any concrete evidence that could corroborate such

an analysis, this reasoning remains highly speculative. Further investigation is needed 

to shed light on the exact diachronic development of the -age-derivatives. For our purposes, 

the most important conclusion from the above is that the internally plural shape 

of the base nouns exhibited by the substantivized -aticu-adjectives is transferred to the 

genuinely French -age-derivatives, in a way such that, from a synchronic point of view on

the group nouns, we may generalize that the attachment of -age necessarily involves the 

pluralization of the base noun. This generalization may be captured by assuming that 

-age introduces a plural operator *P (in the sense of Link (1983)) into the relevant representation. 

In the following, I will argue that the restriction to pluralized bases extends to 

the deverbal domain and that it is this restriction that enabled the event nominalization in 

-age to become more and more productive despite the existence of the akin procedure

in -ment. 

5 Etymologically conditioned pluractionality and aspectual properties of New French 

-age 

Example 14 contrasts a deverbal substantivized -aticu-adjective (14a) with a true event 

nominalization in -age (14b). The nominalization in (14a) means ’right of crossing’, the 

head noun as well as the base verb denoting event types. Contrary to that, the nominalization 

in (14b) means ’the event of passing’, the new head noun as well as the base verb 

referring to actual instances of events: 



(14) a. si disent que il queroient passage (. . . ). 

prt. say-3Pl that they ask for passing 

’they said that they ask for the right to pass’ (cf.(2)) 

b. cil m abandona le passage de la haie mout doucement. 

this me allow-3Sg the passing of the hedge very gently 

’he very gently allowed me to pass the hedge’ (NCA: rose) 

Evidently, the borrowed deverbal -age-derivatives exhibit the same semantics as the 

borrowed denominal ones. That is, the event type denoted by e.g. passage in (14a) is

closely related to its instantiations, since the right is conceded for instantiations of crossing

that are, furthermore, restricted to distinguished territories. Similarly, fees denoted by

e.g. pressoirage or troillage (’fee for use of the village press’, cf. Fleischman (1990:74))

are paid for instantiations of pressing events that display specific properties, etc. Note

that the fees and rights are still estimated and conceded for events in general, i.e. for event 

types. Nevertheless, the derivation of the corresponding -aticu-adjectives obviously relied 

on (typical) instantiations. 

Naturally, the parallel between the denominal and the deverbal borrowed substantivized 

-age-derivatives also extends to the head noun. That is, due to the accidental 

character of their kind-reference (showing up only since they occur in the realm of an

-aticu-adjective), the head nouns are largely independent from their instantiations, focussing 

on the entire type of event as a whole. Furthermore, just as the denominal derivatives, 

the head nouns of the deverbal substantivizations may also be coerced to refer to 

actual instances by contextual means: 

(15) et rendi chascuns son passage a ceuls qui leur avoient presté.

and gave-3Sg everyone his toll to those that them had lent

’and everyone returned his toll to those who had lent it to them.’ (NCA: vilhar)

This parallelism of denominal and deverbal -aticu-substantivizations suggests that the 

extensional reinterpretation of the deverbal ones (i.e. their shift from kind-reference to 

object-reference) may be modelled along the lines of the analysis proposed above for the 

group nouns. In order to put forward this hypothesis, I will draw on van Geenhoven (2005) 

who introduces the so-called pluractional operator, that corresponds to Link’s plural operator 

*P and that operates on verbal bases in order to ”distribute subevent times in various

ways over the overall event time of an utterance”. Pluractionality and (indefinite) plurality

share the characteristic of cumulative reference, a concept that was originally introduced

to define the reference of mass nouns and indefinite plurals denoting homogeneous pluralities 

or masses. The crucial characteristic of an entity being in the extension of, for 

example, a mass term, is that its parts, as well as any sum of its parts, are in the extension 

of the same term. As is pointed out by Quine (1960:19), ”[s]o called mass terms like ’water’, 

’footwear’, and ’red’ have the semantic property of referring cumulatively: any sum 

of parts which are water is water.” Evidently, this characteristic can easily be transferred 

to the domain of eventualities, in the sense that atelic expressions refer cumulatively to 

eventualities, whereas telic expressions refer non-cumulatively to eventualities. Accordingly, 

van Geenhoven (2005: 6) takes pluractionality to be ”the true source of atelicity”, 

covering the ”atelic nature” (ibid.) of different lexical items as e.g. simple activity verbs

(to sing), imperfective aspectual markers (engl. -ing) or frequency adverbs (occasionally). 
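Cumulative reference and closure under sum can be illustrated with a small set-based model. In the Python sketch below, individuals are modelled as sets of atoms and sums as set unions; this encoding, the three atoms, and the function names are assumptions for illustration, with `star` only meant in the spirit of a plural operator *P.

```python
from itertools import combinations

ATOMS = ["a", "b", "c"]


def sums(atoms):
    """All non-empty sums (modelled as frozensets) that can be built from the atoms."""
    return {
        frozenset(combo)
        for size in range(1, len(atoms) + 1)
        for combo in combinations(atoms, size)
    }


def is_cumulative(extension):
    """A predicate refers cumulatively iff the sum of any two members is again a member."""
    return all(x | y in extension for x in extension for y in extension)


def star(extension):
    """Close an extension under sum, in the spirit of a plural operator *P."""
    closed = set(extension)
    changed = True
    while changed:
        changed = False
        for x in list(closed):
            for y in list(closed):
                if x | y not in closed:
                    closed.add(x | y)
                    changed = True
    return closed


SINGULAR = {frozenset({a}) for a in ATOMS}  # atomic extension, not cumulative
PLURAL = star(SINGULAR)                     # closed under sum, hence cumulative
```

The atomic extension fails the cumulativity test, whereas its closure under sum passes it; on the view summarized above, atelic (pluractional) event descriptions pattern with the latter.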



Based on this approach, we may hypothesize that, due to the specific conceptual ’skeleton’ 

of their antecedents, the innovative event nominalizations in -age referring to individual 

events are conceptualized as being internally pluractional (PLUR), just as the 

innovative denominal group terms are perceived as having pluralized bases: 

(16) conceptual representation of passage (’event of crossing’): 

[+b, -i EVENT (COMP ([-b, +i CROSSING (PLUR ([+b, -i CROSSING]))]))] 

Unfortunately, this hypothesis can hardly be verified for Old French, since true event 

nominalizations are only marginally represented in Old French corpora. However, evidence 

in favour comes from analyses of -age and -ment in Modern French, showing that 

event nominalizations in -age even nowadays exhibit aspectual characteristics related to 

pluractionality. One example is Bally (1965), who argues that -ment-nominalizations are 

generally very likely to be punctual or terminative, whereas -age-nominalizations rather 

realize durative and iterative aspectual values. Since aspectual values such as iterativity, continuativity, 

durativity etc. may all be traced back to pluractionality (cf. van Geenhoven 

(2005: ibid.)), Bally’s differentiation of Modern French -age and -ment clearly supports our 

analysis. 

Interestingly, Martin (2008) offers a detailed analysis of various aspectual differences 

between -age and -ment that is largely in line with Bally’s classification. For example, 

the non-terminativity of -age is illustrated by the complementary distribution of -age- and 

-tion-nominals in contexts such as (17): 

(17) a. Le dénazifiage de l’Allemagne (par X) a abouti à sa dénazification (par X). 

’The denazifying of Germany (by X) resulted in its denazification (by X)’ 

b. *La dénazification de l’Allemagne (par X) a abouti à son dénazifiage (par X). 

’The denazification of Germany (by X) resulted in its denazifying (by X)’ 

(Martin (2008: 12)) 

Secondly, Martin argues that -age is able to denote longer eventive chains than -ment, 

as is evidenced by the fact that -age-nominals derived from unergative intransitive bases exhibit 

an iterative interpretation, whereas the corresponding -ment-nominals are forced to 

show up with plural inflection in iterative contexts: 

(18) a. OK Une séance de miaulage. (singular) 

’A meowing session’ 

b. vs. * Une séance de miaulement. (singular) 

c. vs. OK Une séance de miaulements. (plural) (Martin (2008: 6)) 

Thirdly, Martin states that -age, contrary to -ment, prefers internal arguments that are 

incrementally affected by the event denoted by the relevant base verb, a feature that is also 

displayed by other atelic expressions such as the English Progressive (cf. van Geenhoven 

(2005: 12)). 

In my view, the above findings may be unified by assuming that the diachronically 

motivated restriction of the -age-derivation to pluractional bases carries over to Modern 

French where it is reflected by the fact that -age-nominalizations exhibit several aspectual 

values related to pluractionality, such as iterativity, durativity or imperfectivity. A further 

advantage of this analysis is that we may answer the question relating to the alleged suffix 

rivalry by arguing that -age could develop into an event nominalization suffix since it 

displays specific aspectual properties that distinguish it from rival suffixes such as -ment. 


6 Conclusion 

In this paper, I investigated the semantic development of the French -age-derivation. I 

argued that the genesis of group terms in -age resulted from a reinterpretation of the 

borrowed substantivized -aticu-derivatives that was enabled by the fact that the original 

procedure and the new procedure share the same quantitative structure at an abstract level 

of conceptual representation. The result of this reinterpretation is that denominal -age 

is restricted to pluralized bases. With reference to van Geenhoven (2005), I then related 

nominal plurality to verbal pluractionality through the notion of cumulative reference and 

I argued that the development in the deverbal domain strictly parallels the development 

in the denominal domain, in a way such that deverbal -age is restrained to pluractional 

bases. This analysis enables us to approach several questions concerning the change of 

the -age-derivation. First of all, the change turns out to only affect more concrete levels 

of word formation, the basic skeleton being retained through the course of the diachronic 

development. This common ground constitutes both the condition that enabled the change 

to take place and the basic frame that determines the specific characteristics of the -age-derivation 

to this day. Secondly, we may answer the question relating to the suffix rivalry 

by arguing that -age could develop into an event nominalization suffix since it displays 

specific aspectual properties that distinguish it from its alleged rival -ment. 


I wish to thank Martin Becker, Steffen Heidinger, Fabienne Martin, Achim Stein and 

Johannes Wespel for helpful discussions. Many thanks to the reviewers for their helpful 

comments, as well as to Fabienne Martin and Dennis Spohr for their technical support 

and their patience. 

References 


Bally, C. (1965). Linguistique générale et linguistique française, Francke, Berne. 

Barker, C. (1992). Group terms in English, Journal of Semantics 9: 69–93. 

Blum, C. (2002). Godefroy – Le Dictionnaire de l’Ancienne Langue française du IXe au 

XVe siècle [GO], Université Paris-Sorbonne. Electronic edition. 

Chierchia, G. (1995). Reference to kinds across languages, Natural Language and Linguistic 

Theory 6: 339–405. 

Fradin, B. (2008). On the semantics of denominal adjectives, On Line Proceedings of the 

6th Mediterranean Morphology Meeting, Sept. 27-30, 2007, Vol. 2, Ithaca. 

Jackendoff, R. S. (1991). Parts and boundaries, Cognition 41: 9–45. 

Krifka, M., Pelletier, F. J., Carlson, G. N., ter Meulen, A., Chierchia, G. and Link, G. 

(1995). Genericity: An introduction, in G. N. Carlson and F. J. Pelletier (eds), The 

Generic Book, University of Chicago Press, Chicago. 



Lieber, R. (2004). Morphology and Lexical Semantics, Cambridge University Press, Cambridge. 

Link, G. (1983). The logical analysis of plurals and mass terms: A lattice theoretical 

approach, in G. Link, R. Bäuerle, C. Schwarze and A. von Stechow (eds), Meaning, 

Use and Interpretation of Language, Walter de Gruyter, Berlin, pp. 302–323. 

Lüdtke, J. (1978). Prädikative Nominalisierungen mit Suffixen im Katalanischen, Spanischen 

und Französischen, Niemeyer, Tübingen. 

Martin, F. (2008). The semantics of eventive suffixes in French. Paper presented to Formal 

Semantics in Moscow 4, 5th April 2008. 

McNally, L. and Boleda, G. (2004). Relational adjectives as properties of kinds, in 

O. Bonami and P. Cabredo Hofherr (eds), Empirical Issues in Syntax and Semantics 

5, Papers from CSSP 2003, pp. 179–196. 

Quine, W. V. (1960). Word and Object, MIT Press, Cambridge, Mass. 

Stein, A. and Kunstmann, P. (2006). Le Nouveau Corpus d’Amsterdam [NCA], Universität 

Stuttgart, Institut für Linguistik/Romanistik, Stuttgart. 

van Geenhoven, V. (2005). Atelicity, pluractionality and adverbial quantification, in 

H. Verkuyl, H. de Swart and A. van Hout (eds), Perspectives on Aspect, Springer, 

The Netherlands, pp. 107–125. 

Vergnaud, J. and Zubizarreta, M. (1975). The definite determiner and the inalienable 

construction in French and English, Linguistic Inquiry pp. 595–652. 


ADVERSARY IMPLICATURES ∗ 

Grégoire Winterstein 

Université Paris 7 

Abstract. The work reported in this paper deals with a certain preference that speakers show 

when reinforcing some conversational implicatures. We look at the apparent correlation between 

this class of inferences and the bi-partite classification of conversational implicatures 

proposed by L. Horn. We then argue for a separation between the argumentative and inferential 

dimensions of an utterance and propose a brief explanation based on propositions by 

Ducrot. 

In this work we are interested in one aspect of what is classically considered as the reinforcement 

of conversational implicatures. In the first section we show that the felicitous 

reinforcement of implicatures isn’t free, as it is often considered to be (e.g. in 

(Levinson, 2000)). In some cases speakers show a preference for marking a contrast 

when reinforcing inferences, in others a contrast can’t be used. We examine the properties 

of each class of implicatures defined in this manner. We then look at the similarities 

between the class of inferences exhibiting this preference for contrast and the Q-based 

class of implicatures as defined by Horn. Ultimately, we discard the similarity as irrelevant 

to our purpose. More generally, we argue that a classical neo-Gricean approach can’t 

give an explanation for the facts at hand. 

The second section aims at explaining these facts in an argumentative perspective based 

on the works of Anscombre and Ducrot. We claim that some implicatures are in a systematic 

rhetorical opposition to the utterance they are derived from, a fact which licenses 

the use of a contrast for reinforcement. Besides licensing it, this opposition seemingly 

requires the presence of contrast. We propose two different views to explain this preference. 

1 Empirical Domain 

1.1 Core data 


The data presented in (1) is our prime example of study. In (1b) B’s answer is interpreted 

as carrying with it the implicature in (1c) 1 , a standard example of scalar implicature as 

presented, among others, in (Horn, 1989). 

(1) a. A: Do you know whether John will come? 

b. B: It’s possible 

c. ❀It’s not sure 

d. It’s possible, but it’s not sure 

∗ I thank Pascal Amsili, Jacques Jayez, Frédéric Laurens, François Mouret and the audiences of FSIM’4 

and JSM’08 for their precious help and remarks during the preparation of this work. 

1 We use the notation A❀B to mean that the utterance of A implicates B 



The inference (1c) can be reinforced as in (1d). What interests us is that an utterance such 

as (2), without an adversative discourse marker, sounds degraded compared to (1d) (as an 

answer to (1a)). 

(2) B: # It’s possible and it’s not sure 

We believe that the preference for (1d) over (2) is somewhat unexpected. If the implicature 

(1c) is indeed conveyed by the utterance of (1b), one has to explain how it can be construed 

as “opposed” to the utterance that allowed its presence in the first place (as suggested by 

the adversative but). A similar fact is already noted in (Anscombre and Ducrot, 1983) 

with the following example: 

(3) Pierre s’imagine que Jacques et moi sommes de vieilles connaissances, mais pourtant 

on ne s’est jamais rencontrés. 

Pierre figures that Jacques and I are old-time friends, but we never met. 

Anscombre and Ducrot use (3) to illustrate the difference between their notions of argumentation 

2 and inference. Although the first part of the utterance allows an inference 

towards the second part, it is nevertheless argumentatively opposed to it and thus licenses 

a contrast. (Horn, 1991) shows that, more generally, any kind of content related to an 

utterance U (by relations of implicature, presupposition, logical entailment. . . ) can be 

felicitously affirmed redundantly as long as it is argumentatively opposed to U. Therefore, as unexpected 

as the preference for a contrast might be in (1d), the situation appears common. 

This prompts us to look at the argumentative properties of the implicatures relative to 

their mother-utterances. More specifically, we’ll be checking whether certain subtypes of 

implicatures are distinguished by this argumentative behaviour. 

On a last note about the core-data, we wish to mention the case of the scale of quantifiers: 

〈all, some〉. Usually, scalar implicatures are exemplified with this latter scale as in 

(4). 

(4) a. A: How is your experiment going? 

b. B: I tested some of the subjects. 

c. ❀B didn’t test all the subjects. 

d. I tested some of the subjects, but not all. 

e. # I tested some of the subjects, and not all. 

We prefer to rely on (1) because the preference for using an adversative appears stronger 

in (1d) than in (4d). Neither (2) nor (4e) can be entirely ruled out. Both can be used as 

corrections of a previous statement (in those cases they would probably have specific 

prosodic patterns). Putting this aside, we also observe that the preference for marking 

a contrast is less strong for the examples with quantifiers. Simple Google searches for 

the French quelques-uns et pas tous or English some and not all yield several thousands 

of occurrences, not all of them corrections, whereas a search for possible and not certain 

only provides results of the form only possible and not certain. The presence of the adverb 

2 The notion of argumentation is rooted in Anscombre and Ducrot’s view on discourse. According to 

them a speaker always talks to a point and his utterances argue for a certain conclusion, quite often the topic 

of the discourse, which may or may not be explicit. Merin considers that understanding the nature of this 

topic is what “figuring out the speaker’s apparent and real intentions” is about. Anscombre and Ducrot 

consider that some linguistic items or structures, such as almost, bear specific argumentative properties and 

thus entertain a systematic argumentative opposition or correlation with other propositions. 



only restricts the meaning of possible and these examples aren’t conclusive compared to 

the some and not all ones. However, the effect of only is an interesting one and we shall 

return to it below. 

1.2 First attempt at a classification 

Rather unsurprisingly, if we look at the cancellation of the implicature (1c), we find that 

the use of an adversative is odd in (5a). A reformulation as in (5b) sounds better. 

(5) a. # It’s possible but it’s sure 

b. It’s possible and it’s even sure 

Such observations have already been made in (Benndorf and Koenig, 1998). Using data 

about the cancellation of implicatures, the authors argue for a treatment of the semantic 

contribution of the adversative but based on Horn’s distinction between Q-based and R-based 

implicatures. This distinction appears relevant since R-based implicatures 3 allow a 

contrast for their cancellation as shown with various examples in (6). 

(6) a. Gwen took off her socks and jumped into bed, but not in that order 

b. Billy cut a finger, but it wasn’t his 

c. Sam and Max moved the piano, but not together 

As expected, the use of an adversative to reinforce the same implicatures yields odd sentences, 

as in (7). 

(7) a. # Gwen took off her socks and jumped into bed, but in that order 

b. # Billy cut a finger, but it was his 

c. # Sam and Max moved the piano, but together 

It should be noted that the sentences in (7) are out only under the assumption that the 

considered implicatures are present. It is easy to imagine contexts for which all these 

sentences are correct. For example, if sentence (7b) is uttered about some mafia henchman 

who breaks other people’s fingers on a daily basis, the sentence is quite felicitous but the 

implicature we’re interested in isn’t conveyed in the first place. 

In (5) we’ve seen that the cancellation of scalar implicatures doesn’t allow a contrast. 

The same goes for all other types of Q-based implicatures 4 : (8a) is a clausal implicature 

as first described in (Gazdar, 1979), (8b) is based on an attitude predicate, (8c) is based 

on Grice’s maxim of Manner rather than of Quantity (and belongs to Levinson’s M-based 

implicatures class). 

(8) a. Bill is in the kitchen or the living room, (?but/and in fact) I know which 

b. John thinks that Mary is pregnant, (?but/and in fact) she is indeed expecting a 

child 

c. Sam caused Max’s death, (?but/and in fact) he actually killed him on purpose 

3 R-based implicatures are enrichments of an utterance related to underspecified aspects of the propositional 

content (temporal ordering, causal relations etc.) They come about in a wide variety of shapes. In 

(Levinson, 2000) these inferences are called I-based implicatures. 

4 For Horn, Q-based implicatures are essentially negative in nature: an implicated meaning is calculated 

by taking into account which stronger, or more informative, relevant forms the speaker could have uttered 

but chose not to. This notion of Q-implicatures subsumes Levinson’s Q and M implicatures. 



As in (1c) the reinforcement of these inferences seems better with some contrast 5 . 

(9) a. Bill is in the kitchen or the living room, ?(but) I don’t know which 

b. John thinks that Mary is pregnant, ?(but) she’s not 

c. Sam caused Max’s death, ?(but) he didn’t kill him on purpose 

Relying on these observations, Benndorf and Koenig propose to change the classical 

description of but, as given in (Anscombre and Ducrot, 1977) and reproduced in (10), by 

reducing Ducrot’s notion of argumentativity to Gricean inferences. 

(10) a. A sentence p but q is felicitous iff there is a proposition H such that: 

b. p is an argument for H 

c. q is an argument for ¬H 

d. q argues more strongly for ¬H than p argues for H 
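
The conditions in (10) can be made concrete with a small sketch. It is only an illustration under stated assumptions: argumentative force is modelled as a signed number (positive values argue for H, negative values for ¬H), and the table FORCES and the function felicitous_but are hypothetical names with invented values; nothing of the sort appears in Anscombre and Ducrot's own formulation.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical argumentative forces of a proposition relative to a conclusion H.
# Positive: argues for H. Negative: argues for not-H. Values are illustrative only.
Force = Dict[Tuple[str, str], float]

FORCES: Force = {
    ("it's possible", "John will come"): 0.4,
    ("it's not sure", "John will come"): -0.6,
    ("it's sure", "John will come"): 1.0,
}


def felicitous_but(p: str, q: str, conclusions: Iterable[str], forces: Force) -> bool:
    """Return True if some conclusion H satisfies conditions (10b)-(10d)."""
    for h in conclusions:
        force_p = forces.get((p, h), 0.0)  # unknown pairs count as neutral
        force_q = forces.get((q, h), 0.0)
        argues_for_h = force_p > 0          # (10b)
        argues_against_h = force_q < 0      # (10c)
        q_is_stronger = -force_q > force_p  # (10d)
        if argues_for_h and argues_against_h and q_is_stronger:
            return True
    return False
```

With these invented values, felicitous_but("it's possible", "it's not sure", ["John will come"], FORCES) holds, matching (1d), whereas replacing the second conjunct by "it's sure" fails condition (10c), matching the oddness of (5a).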

Benndorf and Koenig’s description is given in (11), where “world inference” stands for 

any inference deriving from world knowledge. 

(11) a. A sentence p but q is felicitous iff there is a proposition H such that: 

b. H is an R-inference or a “world inference” derived from p 

c. q together with the common ground entails ¬H 
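
For comparison, the description in (11) can be sketched in a toy possible-worlds setting. This is a hedged illustration, not Benndorf and Koenig's formalization: propositions are modelled as sets of worlds, the three worlds and the function felicitous_but_11 are assumptions made for the example, and the R-inferences of p are simply supplied by hand, which stands in for condition (11b).

```python
from typing import FrozenSet, Iterable

World = str
Prop = FrozenSet[World]

# Toy model for "Billy cut a finger": three mutually exclusive worlds (an assumption).
WORLDS: Prop = frozenset({"own_finger", "other_finger", "no_cut"})


def entails(premise: Prop, conclusion: Prop) -> bool:
    """A proposition entails another if all of its worlds are worlds of the other."""
    return premise <= conclusion


def felicitous_but_11(inferences_from_p: Iterable[Prop], q: Prop, common_ground: Prop) -> bool:
    """Return True if q, together with the common ground, entails the negation
    of some inference H derived from p (conditions (11b)-(11c))."""
    for h in inferences_from_p:
        not_h = WORLDS - h
        if entails(q & common_ground, not_h):
            return True
    return False


# R-inference of "Billy cut a finger": the finger was his own.
own = frozenset({"own_finger"})
was_not_his = frozenset({"other_finger", "no_cut"})  # "it wasn't his"
was_his = frozenset({"own_finger"})                  # "it was his"
```

In this toy model felicitous_but_11([own], was_not_his, WORLDS) holds, as in (6b), while felicitous_but_11([own], was_his, WORLDS) does not, as in (7b).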

1.3 The limits of a purely Gricean description 

The description of but given in (11) is attractive because it makes Ducrot’s argumentativity 

explicit in terms of well-studied inference mechanisms. However, this proposal raises several 

issues. 

As noted about (3), Anscombre and Ducrot are adamant about distinguishing inference 

and argumentation. A good illustration of the difference between the two is given 

in (12). 

(12) a. Mary almost fell. 

b. → Mary didn’t fall. 

c. Mary almost fell but she caught herself. 

The utterance of (12a) conventionally conveys (12b) 6 and yet a contrast is preferred in 

(12c) where the first sentence is connected with one entailing (12b) (we don’t use (12b) as 

such because the repetition of the lexical material alters the judgment on (12c)). The use of 

an adversative shows that (12a) and (12b) are argumentatively opposed. According to the 

description of but given in (11), this amounts to saying that on one hand (12a) conventionally 

conveys (12b) and at the same time R-implicates its opposite. Put more simply, this 

means that an utterance could, and should, convey two opposite inferences at the same 

time. If we adopt the classical Gricean view of an implicature as a part of meaning 

mutually recognized by both speaker and addressee, then a speaker uttering (12c) should 

be contradicting himself, or at the very least sound “dissonant”. 

5 Actually the versions without any connector might sound acceptable with the second conjunct as an 

explanation of the first (especially for (9c)). We acknowledge such readings but won’t deal with them 

directly. Our point lies in the fact that it’s not possible to reinforce these inferences without enforcing a 

discourse relation. A Contrast relation is the most “natural” one to convey and it is the most compatible 

with all studied inferences. 

6 For a detailed study of the properties of almost see (Jayez and Tovena, 2008). 



Moreover, should we be able to find a sentence coordinated by but such that the second 

conjunct is the cancellation of a Q-based implicature, it would be a counter-example to 

the description in (11). We believe (13) is such an example 7 . 

(13) a. Mother: I hope Kevin has been polite with Granny and he has managed to eat 

some of her terrible cookies. 

b. Father: The problem is, he did eat some of them, but in fact he ate all of them 

and Granny said that he was greedy. 

The use of some in answer (13b) is such that it excludes that Kevin ate all of the cookies: 

an implicature restricting the meaning of some seems present. Two options are available: 

1. In this particular utterance the implicature from some to not all isn’t a scalar implicature 

but an R-based one. On one hand, this would be consistent with (11). 

On the other hand, the presence of the reformulative item in fact is similar to the 

standard cases of scalar implicature cancellation. At this stage, it would mean that 

there are two different mechanisms for producing the same inference with similar 

characteristics except on the argumentative side: not a very desirable situation. 

2. The implicature is indeed a scalar implicature: in this case the argumentative orientation 

of these inferences isn’t always opposed to their base-utterance. A simple 

Gricean approach is then unable to provide a satisfactory analysis of the core data 

in (1). Since the description of but is usually given in an argumentative framework, 

this isn’t surprising. What has now to be explained is how the argumentativity of 

these inferences can be accounted for. 

A last observation we’ll make is that explaining the core data is much simpler once we 

abandon implicatures. Taking the meaning of some as more than 2 and possibly all there 

is a clear opposition with a not all interpretation. Things are however a bit more tricky: as 

shown by (13b) the argumentative relationship between the some and not all propositions 

can vary. What we mean to investigate is on one hand the effect that this relation has on 

the discourse relations one can use to connect discourse segments and on the other hand 

the effect it has, if any, on the derivation of inferences. 

2 The argumentative approach 

Based on the observations of (1.3) we decide not to adopt the description of but given in 

(11) and keep the more traditional one in (10). We now have to explore the argumentative 

properties of implicatures. We will start with a short account of the argumentative 

properties of R-based implicatures and then have a closer look at Q-based inferences. 

2.1 On the reinforcement of R-based implicatures 

We observed that utterances contrasting the content of an R-based implicature with its 

mother-utterance were odd (cf. (7)) and that felicitously interpreting these utterances 

implied contexts such that the targeted implicature didn’t arise in the first place. For these 

7 Attested examples of this sort are rare, and even scarcer if we restrict them to the specific use of but 

we’re interested in (namely Anscombre and Ducrot’s but/aber/sino), but we think that they’re possible. 


particular inferences, it seems that we can argue for a systematic argumentative orientation 

regarding their mother-utterance. 

Contrary to their Q-based counterparts, R-based implicatures lack a propositional content 

of their own (as noted for example in (Levinson, 2000)). Expressing them linguistically 

amounts to explicitly expressing an enriched version of the mother-utterance. Thus, 

expressing a contrast between an utterance B and the linguistic expression I of a hypothetical 

R-implicature attached to B means contrasting two identical propositions: if B 

indeed carries an implicature, its full interpretation is I and B but I should be interpreted 

as I but I. The only way to “redeem” the sentence is to reject the implicature I associated 

with B and interpret B literally or with another implicature. The description (11) is thus 

accounted for as a sufficient condition for the felicitous use of but, albeit not a necessary 

one. 

In the Relevance Theory approach by Sperber and Wilson (see (Wilson and Sperber, 

2005) for an introduction) the inferences in (7) belong to the realm of explicatures (see 

(Carston, 2005) for a presentation). A tempting generalization would then be to say that 

the preference for marking a contrast is limited to the sole “real” implicatures and not 

observed in the case of explicatures. The latter wouldn’t be argumentatively opposed 

to the utterance they’re attached to because they’re enrichments of the meaning of an 

utterance. But, according to (Noveck and Sperber, 2007) and (Carston, 2005), most cases 

of scalar implicatures are really explicatures, including the examples in (1). Furthermore, 

in Grice’s famous “garage” example, reproduced in (14), the relevant inference is an 

implicature, not an explicature, and yet, it is its cancellation that demands a contrast, 

not its reinforcement (cf. the bracketed part in (14b)). 

(14) a. A: I am out of petrol. 

b. B: There is a garage round the corner, [but it’s closed]. 

Therefore the distinction between explicature and implicature in Relevance Theory isn’t 

satisfactory to explain our data. 

2.2 Q-based inferences 


Recent works in experimental pragmatics (see (Breheny, Katsos and Williams, 2005)) distinguish 

contexts according to their relation with a targeted scalar inference: they can be 

upper-bounded (allowing an interpretation with the implicature), lower-bounded (blocking 

an interpretation with the implicature) or neutral. These cognitive studies showed that 

the implicature at hand is only generated in upper-bounded contexts. Our main interest 

will be limited to these upper-bounded contexts, and inside these contexts to have an account 

of both the cases for which the preference is marked and those where it isn’t. In the 

future we shall try to extend our analysis to all kinds of contexts, notably with the benefit 

of experimental data (see (4)). 

We will first outline how scalar implicatures are accounted for in an argumentative perspective 

and then see that the possibility of marking a contrast between an implicature and 

its mother-utterance follows directly from the described mechanism. We’ll base our presentation 

on the account by Anscombre and Ducrot who introduced and first formalized 

the concept of argumentativity in discourse; our explanations are compatible with later 

argumentative frameworks, such as the decision-theoretic one proposed in (Merin, 1999). 


2.2.1 The derivation of Q-Implicatures 

The derivation of Q-implicatures has known various refinements in the argumentative 

perspective. The main argument behind this approach to implicatures is the possibility to 

give an account of various cases where no logical entailment scale is at play although a 

preference over propositions is observed (for numerous examples see (Hirschberg, 1985)). 

Ducrot, and Merin after him, proposes to replace the ordering of items based on logical relations 

by a relevance-based order (Merin’s relevance matches Ducrot’s argumentativity). 

The ordering of the items on a scale is determined by their argumentative force relative to 

the topic at hand in discourse. The apparent ordering by informativity (typically assumed 

in neo-Gricean approaches) is due to the fact that more informative propositions usually 

have greater argumentative value. In (Ducrot, 1980: 61) the derivation of an implicature 

such as (1c) is as follows: 

(15) a. 〈sure, possible〉H is an argumentative scale, i.e. a simple utterance including 

sure has more argumentative power, regarding a certain conclusion H, than one 

relying on possible, and possible has a semantic “at least” interpretation 

b. the utterance of (1b) gets further interpreted by an exhaustivity law, similar to 

standard Gricean reasoning, and yields the desired meaning: since an utterance 

relying on sure would have been argumentatively superior and wasn’t used, one 

is entitled to infer that the corresponding proposition is false 

The point that matters here is that the implicatures come about from the negation of propositions 

that are argumentatively superior (this remains valid in Merin’s framework even 

though the mechanism is different). 
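
The two steps in (15) can be sketched as a small procedure. This is a minimal illustration under stated assumptions, not Ducrot's or Merin's formal system: argumentative force relative to a conclusion H is modelled as a number, the function q_implicatures and both scales are hypothetical, and the values are invented for the example.

```python
from typing import Dict, List


def q_implicatures(uttered: str, force: Dict[str, float]) -> List[str]:
    """Exhaustivity step of (15b): negate every alternative that is
    argumentatively superior to the uttered item on the scale `force`."""
    base = force[uttered]
    stronger = [alt for alt, value in force.items() if value > base]
    # List the negated alternatives from strongest to weakest.
    return [f"not {alt}" for alt in sorted(stronger, key=force.get, reverse=True)]


# Core data (1): relative to H = "John will come", 'sure' outranks 'possible'.
core_scale = {"sure": 1.0, "possible": 0.4}

# A context where the ordering is reversed, e.g. H = "Kevin behaved well":
# saying 'all' argues less well for H than saying 'some'.
reversed_scale = {"some": 0.6, "all": 0.2}

print(q_implicatures("possible", core_scale))  # ['not sure']
print(q_implicatures("some", reversed_scale))  # []
```

The second call shows that the procedure derives a negation only for alternatives that are argumentatively superior in the context at hand: when the logically stronger item argues less well for H, nothing is negated by this mechanism.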

2.2.2 Results 


According to the mechanism in (15), Q-based implicatures are necessarily argumentatively 

opposed to their mother-utterance: they come about from the negation of a proposition 

that is argumentatively superior to their mother-utterance and thus they argue in the 

opposite direction. This explains the core data straightforwardly but not examples such as 

(13). If the sole way to derive Q-implicatures is through a unique argumentation-driven 

mechanism, then the scalar implicature in (13) isn’t accounted for. As it happens, we can 

justify its presence on other grounds. 

In the context of (13) the proposition including all isn’t argumentatively superior to 

that containing some (i.e. to justify Kevin’s good behaviour, it’s better to say that he only 

ate some of the cookies). The mechanism (15) doesn’t exclude the all interpretation. On 

the other hand, what the speaker asserts sets a lower bound on the argumentative force 

of his assertion: he means to convey something at least as argumentatively strong as his 

utterance. Since the all-proposition is argumentatively inferior to the some-proposition, it 

doesn’t belong to the speaker’s commitment (in Merin’s terms the all-proposition doesn’t 

belong to the speaker’s upward relevance cone). 

The second part of (13) should thus be treated as a way of correcting the first part. Such 

examples, where semantic and argumentative information are clearly decoupled could be 

an interesting starting point in the examination of the nature of correction as compared to 

reformulation. 

Cases including only, such as (16), can also be explained. 



(16) The complete extinction of mankind is only possible and not certain. 

In (16) only excludes the necessity of the extinction of mankind; re-asserting this exclusion 

can’t be argumentatively opposed to the first part of the utterance, therefore the use of 

an adversative should be excluded. However, there is a strong feeling for interpreting the 

second conjunct as echoic. This isn’t surprising as the second conjunct is redundant. It’s 

an open question to know whether the use of only in those examples is limited to echoic 

cases (even without the second conjunct of the utterance). 

We can add an interesting side-observation to this. As we already remarked, the second 

conjunct of (13) demands a reformulative marker of some kind to mark the cancellation 

of the implicature. This remains valid even in the core-cases, as shown in (5b). This 

would mean that whereas adversatives aren’t sensitive to the presence of inferences (but 

only to argumentative properties of the utterances), reformulatives are (and are oblivious 

to argumentativity). This is a tentative hypothesis that we shall try to pursue in future 

work. 

3 Obligatoriness of contrast 

We gave arguments to explain why the examples we’re interested in license a contrast. 

We gave no arguments as to why this contrast is preferred when overtly marked. A possibility 

we want to examine is the application of a principle close to Sauerland’s “Maximize 

Redundancy”, as stated in (Sauerland, 2008). This principle can be roughly paraphrased 

as urging a speaker to prefer, among a set of alternatives, a sentence that presupposes 

an already existing proposition over a sentence that presupposes nothing (with a pragmatic 

approach to presupposition as a proposition that is non-controversially part of all 

speakers’ Common Ground). Thus, a speaker should prefer saying the father of the victim 

rather than a father of the victim because the former presupposes a non-controversial 

proposition. Uttering the latter would suggest that the presupposition doesn’t obtain, contrary 

to common knowledge. Applied to our case, this means that given two contextually 

argumentatively opposed propositions p and q, a speaker will prefer to utter p but q rather 

than p and q. Using a simple conjunction implies that a contrast doesn’t hold between p 

and q and thus contradicts intuition, or at least makes the speaker sound “dissonant”. At 

this stage we need to further back up this claim on at least two counts: 

1. by ensuring that the non-felicitousness of (4e) is related to that of utterances such as 

“a father of the victim”, and that the preference is of the same order of magnitude 

(as we already mentioned, the preference for (4d) is far from absolute) 

2. by ensuring that the predictions made by the Maximization principle apply to the 

cases we study; the notion of presupposition used by Sauerland is technical and 

doesn’t necessarily apply to the contrast conveyed by the use of but (i.e. what is

often called a conventional implicature rather than a presupposition) 
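
The selection behaviour that the Maximization principle is meant to capture can be illustrated with a minimal Python sketch. It is only a toy under strong assumptions: propositions are plain strings, the contrast conveyed by but is treated as if it were a presupposition (which is exactly what the second point above puts into question), and the names `Alternative`, `preferred` and `common_ground` are hypothetical, introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Alternative:
    """A candidate utterance and the propositions it takes for granted."""
    form: str
    presupposes: FrozenSet[str] = frozenset()


def preferred(alternatives: List[Alternative],
              common_ground: FrozenSet[str]) -> Alternative:
    """Return the usable alternative that presupposes the most.

    An alternative is usable only if everything it presupposes is already
    in the common ground (toy assumption: propositions are plain strings).
    """
    usable = [a for a in alternatives if a.presupposes <= common_ground]
    if not usable:
        raise ValueError("no alternative is compatible with the common ground")
    # max() returns the first maximal element, so ties keep list order.
    return max(usable, key=lambda a: len(a.presupposes))


# Toy common ground: the victim has exactly one father, and p and q
# are argumentatively opposed in the context.
cg = frozenset({"unique_father", "opposed(p, q)"})

definite = Alternative("the father of the victim", frozenset({"unique_father"}))
indefinite = Alternative("a father of the victim")
but_form = Alternative("p but q", frozenset({"opposed(p, q)"}))
and_form = Alternative("p and q")

print(preferred([indefinite, definite], cg).form)
print(preferred([and_form, but_form], cg).form)
print(preferred([and_form, but_form], frozenset()).form)
```

With the toy common ground, the sketch selects the definite description and the but form; with an empty common ground it falls back to the plain conjunction. The sketch makes the preference categorical, whereas the preference we observed is far from absolute.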

An alternative explanation for the preference for a marked contrast would be to consider 

this preference as an idiosyncratic property of the relation at hand. This would be in line 

with the approach of (Asher and Lascarides, 2003), where it is claimed that the semantics 

of the relation of Contrast (as defined in SDRT) are such that the relation requires a 

specific clue to be used, either an overt cue element such as but or intonation alone. When 

two connected discourse segments are such that the second denies a default consequence 

of the first, the relation of Contrast holds and needs to be marked. As an example, the 

first and second segment of (17) are opposed: that John doesn’t like hockey is a default 

consequence of the first. Since the relation of opposition exists, it needs to be overtly

marked. 

(17) John hates sports but he likes hockey. 

The preference we observe for using an adversative would then be a consequence of the 

particular semantics of the relation of Contrast. In our core data if one ignores the implicature 

the needed opposition is more obvious: the implicature denies part of the denotation 

of its mother-utterance and somehow contradicts part of it in the same way that the second 

conjunct in (17) denies a default consequence of the first, thus triggering the need for a 

Contrast marker. 

4 Conclusion 

We observed what seemed to be a constraint on the felicitous reinforcement of some 

implicatures. Approaches in traditional Gricean terms weren’t sufficient to explain all the 

possible data we encountered. The main conclusion we drew from this data was that, 

despite an apparent strong correlation, inference mechanisms couldn’t be at the source of 

the argumentative orientation of an utterance. Therefore, the observed constraint doesn’t 

seem to apply to the reinforcement operation itself but is rather due to different discourse

coherence mechanisms. We took an argumentative approach and showed that the standard 

accounts of adversatives and implicatures in this approach worked together to justify the 

possibility of marking a contrast. The actual preference for marking this available contrast 

could be related to the intrinsic nature of the Contrast discourse relation. 

We mentioned experimental pragmatics as a means to shed more light on the phenomenon

we studied. Among the different points we intend to study are the following: 

• Depending on the context, the preference for a marked contrast should differ. In

particular, we expect that in lower-bounded contexts the preference might disappear

and that the use of but is odd or takes longer to process. 

• We made a hypothesis about reformulatives such as in fact that needs to be refined.

They appear to be sensitive to informativity scales and somehow indifferent to the 

argumentative orientation of the propositions they connect. A test in lower-bounded 

contexts could prove relevant to determine the truth behind this hypothesis. 

The results of these experiments could provide support for the argumentative approach 

to semantics and pragmatics we presented, and thus to an explanation of the main, nontrivial, 

fact we observed: an utterance can convey an implicature and yet be argumentatively 

opposed to it. 

References 


Anscombre, J.-C. and Ducrot, O. (1977). Deux mais en français, Lingua 43: 23–40. 

Anscombre, J.-C. and Ducrot, O. (1983). L’argumentation dans la langue, Pierre 

Mardaga, Liège:Bruxelles. 

Asher, N. and Lascarides, A. (2003). Logics of Conversation, Cambridge: Cambridge 

University Press. 

Benndorf, B. and Koenig, J.-P. (1998). Meaning and context: German aber and sondern,

in J.-P. Koenig (ed.), Discourse and cognition: bridging the gap, CSLI Publications,

Stanford, pp. 365–386. 

Breheny, R., Katsos, N. and Williams, J. (2005). Are generalised scalar implicatures 

generated by default? An on-line investigation into the role of context in generating

pragmatic inferences, Cognition.

Carston, R. (2005). Relevance theory and the saying/implicating distinction, in L. Horn 

and G. Ward (eds), The handbook of Pragmatics, Blackwell. 

Ducrot, O. (1980). Les échelles argumentatives, Les Éditions de Minuit. 

Gazdar, G. (1979). Pragmatics: Implicature, Presupposition and Logical Form, New York:

Academic Press. 

Hirschberg, J. (1985). A theory of scalar implicature, PhD thesis, Univ. of Pennsylvania. 

Horn, L. (1989). A natural history of negation, The University of Chicago Press. 

Horn, L. (1991). Given as new: when redundant information isn’t, Journal of Pragmatics 

15(4): 313–336. 

Jayez, J. and Tovena, L. (2008). Presque and almost: how argumentation derives from 

comparative meaning, in O. Bonami and P. C. Hofherr (eds), Empirical Issues in 

Syntax and Semantics, Vol. 7, pp. 1–23. 

Levinson, S. C. (2000). Presumptive Meanings: The Theory of Generalized Conversational 

Implicature, MIT Press, Cambridge, MA, USA. 

Merin, A. (1999). Information, relevance and social decision-making, in L. Moss, 

J. Ginzburg and M. de Rijke (eds), Logic, Language, and computation, Vol. 2, CSLI 

Publications, Stanford:CA, pp. 179–221. 

Noveck, I. and Sperber, D. (2007). The why and how of experimental pragmatics: The 

case of ‘scalar inferences’, in N. Burton-Roberts (ed.), Advances in Pragmatics,

Palgrave Macmillan, Basingstoke. 

Sauerland, U. (2008). Implicated presuppositions, in A. Steube (ed.), Sentence and Context, 

Language, Context and Cognition, Mouton de Gruyter, Berlin. to appear. 

Wilson, D. and Sperber, D. (2005). Relevance theory, in L. Horn and G. Ward (eds), The 

handbook of pragmatics, Blackwell. 

List of authors 


Martin Avanzini

Martin.Avanzini@student.uibk.ac.at

Institute of Computer Science 


Austria 

Timo Baumann 

timo@ling.uni-potsdam.de 

Institut für Linguistik 

Universität Potsdam 

Germany 

Christopher Brumwell 

chrisbrumwell@gmail.com 

ILLC 


The Netherlands 

Bert Le Bruyn 

Bert.LeBruyn@let.uu.nl 

Utrecht Institute of Linguistics 

Universiteit Utrecht 


James Burton 

jb162@brighton.ac.uk 

University of Brighton 

United Kingdom 


Gemma Celestino

gceles@interchange.ubc.ca

Department of Philosophy 

University of British Columbia & 

LOGOS Research Group 

Canada 

Dragan Doder 

ddoder@mas.bg.ac.yu 

Faculty of Mechanical Engineering 

Serbian Academy of Sciences and Arts 

Serbia and Montenegro 


Michael Franke

m.franke@uva.nl

ILLC 



Gianluca Giorgolo 

Gianluca.Giorgolo@let.uu.nl 






Michael Hartwig

Michael.Jua.Hartwig@gmail.com

Multimedia University, Cyberjaya 

Malaysia 

Simon Hopp 

simon.hopp@uni-konstanz.de 

Fachbereich Sprachwissenschaft 

University of Konstanz 

Germany 

Pierre Lison 

pierrel@coli.uni-sb.de 

Language Technology Lab 

Research Center for Artificial Intelligence 

Saarbrücken 

Germany 

Petar Maksimović 

petarmax@mi.sanu.ac.yu 

Mathematical Institute 



Bojan Marinković 

bojanm@mi.sanu.ac.yu 




Scott Martin 

scott@ling.osu.edu 

The Ohio State University 

United States 

Takako Nemoto 

nmt0731@yahoo.co.jp 


Tohoku university 

Japan 


Ivelina Nikolova

iva@lml.bas.bg

LMD, IPP 

Bulgarian Academy of Sciences 

Bulgaria 

Yves Peirsman 

yvespeirsman@gmail.com 

QLVL 

Katholieke Universiteit Leuven 

Belgium

Aleksandar Perović 

pera@sf.bg.ac.yu 

Faculty of Transport and Traffic Engineering 




Maren Schierloh

schierl1@msu.edu

Michigan State University 

U.S.A. 


Andreas Schnabl

andreas.schnabl@uibk.ac.at

Institute of Computer Science 


Austria 


essay229@gmail.com 

Department of Linguistics 

University of Pécs 

Hungary 


Camilo Thorne

camilo.thorne@gmail.com 

Faculty of Computer Science 

Free University of Bozen-Bolzano 

Italy 

Christina Unger 

christina.unger@let.uu.nl 




Melanie Uth 

melanie.uth@ling.uni-stuttgart.de 

Institut für Linguistik/Romanistik

University of Stuttgart 

Germany 

Grégoire Winterstein

gregoire.winterstein@linguist.jussieu.fr 

Laboratoire de Linguistique Formelle 

Université Paris 7 

France
