
Kata Balogh


20th European Summer School in Logic, Language and Information

4–15 August 2008

Freie und Hansestadt Hamburg, Germany

Programme Committee. Enrico Franconi (Bolzano, Italy), Petra Hendriks (Groningen, The Netherlands), Michael Kaminski (Haifa, Israel), Benedikt Löwe (Amsterdam, The Netherlands & Hamburg, Germany), Massimo Poesio (Colchester, United Kingdom), Philippe Schlenker (Los Angeles CA, United States of America), Khalil Sima’an (Amsterdam, The Netherlands), Rineke Verbrugge (Chair, Groningen, The Netherlands).

Organizing Committee. Stefan Bold (Bonn, Germany), Hannah König (Hamburg, Germany),

Benedikt Löwe (chair, Amsterdam, The Netherlands & Hamburg, Germany), Sanchit Saraf (Kanpur,

India), Sara Uckelman (Amsterdam, The Netherlands), Hans van Ditmarsch (chair, Otago,

New Zealand & Toulouse, France), Peter van Ormondt (Amsterdam, The Netherlands).



ESSLLI 2008 is organized by the Universität Hamburg under the auspices of the Association for Logic, Language and

Information (FoLLI). The Institute for Logic, Language and Computation (ILLC) of the Universiteit van Amsterdam is

providing important infrastructural support. Within the Universität Hamburg, ESSLLI 2008 is sponsored by the Departments

Informatik, Mathematik, Philosophie, and Sprache, Literatur, Medien I, the Fakultät für Mathematik, Informatik

und Naturwissenschaften, the Zentrum für Sprachwissenschaft, and the Regionales Rechenzentrum. ESSLLI 2008 is

an event of the Jahr der Mathematik 2008. Further sponsors include the Deutsche Forschungsgemeinschaft (DFG), the

Marie Curie Research Training Site GLoRiClass, the European Chapter of the Association for Computational Linguistics,

the Hamburgische Wissenschaftliche Stiftung, the Kurt Gödel Society, Sun Microsystems, the Association for Symbolic

Logic (ASL), and the European Association for Theoretical Computer Science (EATCS). The official airline of ESSLLI 2008

is Lufthansa; the book prize of the student session is sponsored by Springer Verlag.

Kata Balogh

13th ESSLLI Student Session

Proceedings. 20th European Summer School in Logic, Language

and Information (ESSLLI 2008), Freie und Hansestadt Hamburg,

Germany, 4–15 August 2008

The ESSLLI student session material has been compiled by Kata Balogh. Unless otherwise mentioned, the copyright lies

with the individual authors of the material. Kata Balogh declares that she has obtained all necessary permissions for the

distribution of this material. ESSLLI 2008 and its organizers take no legal responsibility for the contents of this booklet.


Proceedings of the

13th ESSLLI Student Session

4–15 August 2008, Hamburg, Germany

Kata Balogh


Copyright © the authors


Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

Martin Avanzini

POP* and Semantic Labelling using SAT . . . . . . . . . . . . 7

Timo Baumann

Simulating Spoken Dialogue

With a Focus on Realistic Turn-Taking . . . . . . . . . . . . . 17

Christopher Brumwell

Epistemic Modals in Dialogue . . . . . . . . . . . . . . . . . . 27

Bert Le Bruyn

Bare predication and kinds . . . . . . . . . . . . . . . . . . . . 37

James Burton

Diagrammatic Reasoning

with Enhanced Static Constraints . . . . . . . . . . . . . . . . 47

Gemma Celestino

Fictional Contingencies . . . . . . . . . . . . . . . . . . . . . 57

Michael Franke

Meaning & Inference in Case of Conflict . . . . . . . . . . . . 65

Michael Hartwig

Towards a New Characterisation of Chomsky’s Hierarchy via

Acceptance Probability . . . . . . . . . . . . . . . . . . . . . . 75

Simon Hopp

Distance Effects in Sentence Processing . . . . . . . . . . . . . 85

Pierre Lison

A Salience-driven Approach to

Speech Recognition for Human-Robot Interaction . . . . . . . . 95

Petar Maksimović – Dragan Doder – Bojan Marinković – Aleksandar


A logic with a conditional probability operator . . . . . . . . . 105

Scott Martin

A Proof-theoretic Approach to French Pronominal Clitics . . . 115

Takako Nemoto

Infinite games from an intuitionistic point of view . . . . . . . 125

Ivelina Nikolova

Language Technologies for Instructional Resources in Bulgarian . . 135



Yves Peirsman

Word Space Models of Semantic Similarity and Relatedness . . 143

Maren Schierloh

Examining the Noticing Function of Output . . . . . . . . . . 153

Andreas Schnabl

Cdiprover3: a Tool for Proving

Derivational Complexities of Term Rewriting Systems . . . . . 165

Éva Szilágyi

The Rank(s) Of A Totally Lexicalist Syntax . . . . . . . . . . 175

Camilo Thorne

Expressing Conjunctive and Aggregate Queries

over Ontologies with Controlled English . . . . . . . . . . . . . 185

Christina Unger – Gianluca Giorgolo

Interrogation in Dynamic Epistemic Logic . . . . . . . . . . . 195

Melanie Uth

The Semantic Change of the French -age-Derivation . . . . . . 203

Grégoire Winterstein

Adversary Implicatures . . . . . . . . . . . . . . . . . . . . . . 213

List of Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223




This year's Student Session is the thirteenth in the twenty-year history of the annual European Summer School in Logic, Language and Information. The first edition was held in Prague in 1996, conceived and organized by students, and ever since ESSLLI has been accompanied by a separate Student Session. The aim of the Student Session is to give students at all levels (Bachelor's, Master's and PhD) the opportunity to present and discuss their work in progress and to receive feedback from senior researchers.

As in previous years, the quality of the submissions was high, which made the selection procedure difficult. This year 17 papers were selected for oral presentation and 5 for poster presentation, from a total of 46 submissions. All accepted papers are included in this volume.

I would like to thank the ESSLLI organization, in particular Rineke Verbrugge and Benedikt Löwe, for their continuous support and for making this session possible. I am grateful to the StuS Programme Committee co-chairs, Laia Mayol, Manuel Kirschner and Ji Ruan, for their efforts in coordinating the reviewing process, and to the senior area experts, Anke Lüdeling, Paul Egré, Guram Bezhanishvili and Alexander Rabinovich, for their continuous presence and helpful advice. I also want to thank the anonymous reviewers, whose detailed comments not only proved invaluable during the selection procedure but also provided useful feedback to the authors. Many thanks to Kluwer Academic Publishers, who, as in previous years, offered prizes in the “Best Student Paper in the Oral Session” and “Best Student Paper in the Poster Session” categories.

We are very much looking forward to the 13th ESSLLI Student Session in Hamburg, and believe that it will again be a very inspiring meeting.

Kata Balogh

Amsterdam, May 2008




POP* and Semantic Labelling using SAT

Martin Avanzini

University of Innsbruck

Abstract. The polynomial path order (POP* for short) is a termination method that induces polynomial bounds on the innermost runtime complexity of term rewrite systems (TRSs). Semantic labeling is a transformation technique used for proving termination. In this paper we propose an efficient implementation of POP* together with finite semantic labeling. This automation works by a reduction to the problem of Boolean satisfiability. Satisfiability of the resulting formula is checked by a state-of-the-art SAT solver. We have implemented the technique, and experimental results confirm the feasibility of our approach. By semantic labeling, we significantly increase the power of POP*.

Term rewrite systems provide a conceptually simple but powerful abstract model of computation. In rewriting, proving termination is a long-standing research field, and consequently termination techniques applicable in an automated setting were introduced quite early. Early research concentrated mainly on direct termination techniques (TeReSe, 2003). One such technique is the use of recursive path orders (RPOs), for instance the multiset path order (MPO) (Baader and Nipkow, 1998). Recently, the emphasis has shifted toward transformation techniques like the dependency pair method (Arts and Giesl, 2000) and semantic labeling (Zantema, 1995). These methods significantly increase the ability to conclude termination automatically.

For direct termination techniques it is often possible to infer upper bounds on the derivational complexity of a rewrite system R from the termination proof. For instance, Hofbauer was the first to observe that termination via MPO implies the existence of a primitive recursive bound on the derivational complexity (Hofbauer, 1992). Here derivational complexity refers to the function that relates the length of the longest derivation sequence to the size of the initial term. It is thus quite natural to extend such a termination analysis of rewrite systems to the analysis of complexity properties. For the study of low complexity bounds we recently introduced in (Avanzini and Moser, 2008) the polynomial path order (POP* for short). This order is in essence a miniaturization of MPO, carefully crafted to induce polynomial bounds on the number of rewrite steps (cf. Theorem 4).

In this work, we show how to increase the power of POP* by semantic labeling (Zantema, 1995). The idea behind semantic labeling is to label the function symbols of a rewrite system R with semantic information in such a way that direct termination methods become applicable to the labeled rewrite system R_lab. In order to label R, one needs to define suitable interpretation and labeling functions for all symbols appearing in R. Naturally, these functions have to be chosen such that POP* is applicable to the labeled system. To find them automatically, we extend the propositional encoding from (Avanzini and Moser, 2008). Satisfiability of the constructed formula certifies the existence of a labeled system R_lab that is compatible with POP*. Finite semantic labeling is nontermination preserving and, moreover, complexity preserving. Thus from compatibility of R_lab with POP* we conclude that R admits a polynomial runtime complexity (cf. Lemma 6).



A translation of infinite semantic labeling in conjunction with RPOs has already been given in (Koprowski and Middeldorp, 2007). Unfortunately, this approach is inapplicable in our context, since in general the runtime complexity of the original system cannot be related to the runtime complexity of the infinitely labeled system. Furthermore, finite semantic labeling using heuristics is implemented, for instance, in the termination prover TPA (Koprowski, 2006). We consider the approach presented here favorable, as the choice of a labeling suitable for the base order can be left to a state-of-the-art SAT solver.

1 The Polynomial Path Order

We briefly recall the basic concepts of term rewriting; for details, (Baader and Nipkow, 1998) provides a good resource. Let V denote a countably infinite set of variables and F a signature. The set of terms over F and V is denoted by T(F, V). We write ⊴ for the subterm relation; its converse is denoted by ⊵, and the strict part of ⊵ by ▷.

A term rewrite system (TRS for short) R over T(F, V) is a set of rewrite rules l → r such that l, r ∈ T(F, V), l ∉ V, and all variables of r also appear in l. In the following, R always denotes a TRS and, in our context, R is finite. A binary relation on T(F, V) is a rewrite relation if it is compatible with F-operations and closed under substitutions. The smallest extension of R that is a rewrite relation is denoted by →R. The innermost rewrite relation i→R is the restriction of →R where innermost redexes have to be reduced first. The transitive and reflexive closure of a rewrite relation → is denoted by →*, and we write s →^n t for the contraction of s to t in n steps. We say that R is (innermost) terminating if there exists no infinite chain of terms t0, t1, ... such that ti →R ti+1 (ti i→R ti+1) for all i ∈ N.

The root symbols of left-hand sides of rewrite rules in R are called defined symbols and collected in D(R), while all other symbols are called constructor symbols and collected in C(R). A term f(s1, ..., sn) is constructor-based with respect to R if f ∈ D(R) and s1, ..., sn ∈ T(C(R), V). We write Tcb(R) for the set of all constructor-based terms over R. If every left-hand side of R is constructor-based, then R is called a constructor TRS. Constructor TRSs allow us to model the computation of functions in a very natural way. Consider the following TRS:

Example 1 The constructor TRS R_mult is defined by the rules

    add(0, y) → y                      mult(0, y) → 0
    add(s(x), y) → s(add(x, y))        mult(s(x), y) → add(y, mult(x, y)).

R_mult defines the function symbols add and mult, i.e., D(R) = {add, mult}. Natural numbers are represented using the constructor symbols from C(R) = {s, 0}. Define the encoding function ⌈·⌉ : N → T(C(R), ∅) by ⌈0⌉ = 0 and ⌈n+1⌉ = s(⌈n⌉). Then for all n, m ∈ N, mult(⌈n⌉, ⌈m⌉) i→*R ⌈n·m⌉. We say that R_mult computes multiplication (and addition) on natural numbers. For instance, the system admits the innermost rewrite sequence

    mult(s(0), 0) i→R add(0, mult(0, 0)) i→R add(0, 0) i→R 0,

computing 1·0. Notice that in the second step we have to reduce the innermost redex mult(0, 0) first.
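To make the operational reading concrete, the following sketch simulates innermost rewriting for R_mult and counts rewrite steps. It is our own illustration in Python (the paper's implementation is in OCaml), and names such as `normalize` and `num` are ours:

```python
# Innermost rewriting for R_mult, counting rewrite steps.
# Terms are tuples ('f', arg1, ...); variables are plain strings.

RULES = [
    (('add', ('0',), 'y'), 'y'),
    (('add', ('s', 'x'), 'y'), ('s', ('add', 'x', 'y'))),
    (('mult', ('0',), 'y'), ('0',)),
    (('mult', ('s', 'x'), 'y'), ('add', 'y', ('mult', 'x', 'y'))),
]

def match(pat, term, sub):
    """Try to extend substitution sub so that pat instantiated by sub equals term."""
    if isinstance(pat, str):                      # pattern variable
        if pat in sub:
            return sub if sub[pat] == term else None
        sub[pat] = term
        return sub
    if isinstance(term, str) or pat[0] != term[0] or len(pat) != len(term):
        return None
    for p, t in zip(pat[1:], term[1:]):
        if match(p, t, sub) is None:
            return None
    return sub

def subst(t, sub):
    """Apply a substitution to a term."""
    if isinstance(t, str):
        return sub.get(t, t)
    return (t[0],) + tuple(subst(a, sub) for a in t[1:])

def innermost_step(t):
    """One innermost rewrite step; returns the reduct, or None for a normal form."""
    if isinstance(t, str):
        return None
    for i, a in enumerate(t[1:], start=1):        # reduce arguments first
        r = innermost_step(a)
        if r is not None:
            return t[:i] + (r,) + t[i + 1:]
    for lhs, rhs in RULES:                        # then try a root step
        sub = match(lhs, t, {})
        if sub is not None:
            return subst(rhs, sub)
    return None

def normalize(t):
    """Rewrite to normal form, returning (normal_form, number_of_steps)."""
    n, r = 0, innermost_step(t)
    while r is not None:
        t, n = r, n + 1
        r = innermost_step(t)
    return t, n

def num(n):
    """The numeral encoding: num(n) builds s(...s(0)...) with n occurrences of s."""
    return ('0',) if n == 0 else ('s', num(n - 1))
```

Running `normalize(('mult', num(1), num(0)))` reproduces exactly the three-step sequence above and yields the numeral 0.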

In (Lescanne, 1995) it is proposed to conceive the complexity of a rewrite system R as the complexity of the functions computed by R. Whereas this view falls into the realm of implicit complexity analysis, we conceive rewriting under R as the evaluation mechanism of the encoded function. Thus it is natural to define the runtime complexity based on the number of rewrite steps admitted by R. Let |s| denote the size of a term s. The (innermost) runtime complexity of a terminating rewrite system R is defined by

    Dl_R(m) = max{ n | ∃ s, t. s i→^n t, s ∈ Tcb(R) and |s| ≤ m }.

To verify whether the runtime complexity of a rewrite system R is polynomially bounded, we employ the polynomial path order. Similar to the recursion-theoretic characterization of the polytime functions given in (Bellantoni and Cook, 1992), POP* relies on the separation of safe and normal inputs. For this, the notion of a safe mapping is introduced. A safe mapping safe associates with every n-ary function symbol f the set of safe argument positions. If f ∈ D(R) then safe(f) ⊆ {1, ..., n}; for f ∈ C(R) we fix safe(f) = {1, ..., n}. The argument positions not included in safe(f) are called normal and denoted by nrm(f). A precedence > is an irreflexive and transitive order on F. The polynomial path order >pop* is an extension of the auxiliary order >pop; both are defined in the following definitions:

Definition 2 Let > be a precedence and safe a safe mapping. We define the order >pop inductively as follows: s = f(s1, ..., sn) >pop t if one of the following alternatives holds:

1. f ∈ C(R) and si ≥pop t for some i ∈ {1, ..., n}, or
2. si ≥pop t for some i ∈ nrm(f), or
3. t = g(t1, ..., tm) with f ∈ D(R), f > g, and s >pop ti for all 1 ≤ i ≤ m.

Definition 3 Let > be a precedence and safe a safe mapping. We define the polynomial path order >pop* inductively as follows: s = f(s1, ..., sn) >pop* t if either

1. s >pop t, or
2. si ≥pop* t for some i ∈ {1, ..., n}, or
3. t = g(t1, ..., tm) with f ∈ D(R), f > g, and the following properties hold:
   • s >pop* t_i0 for some i0 ∈ safe(g), and
   • for all i ≠ i0, either s >pop ti, or s ▷ ti and i ∈ safe(g), or
4. t = f(t1, ..., tn) and, for nrm(f) = {i1, ..., ip} and safe(f) = {j1, ..., jq}, both [s_i1, ..., s_ip] (>pop*)mul [t_i1, ..., t_ip] and [s_j1, ..., s_jq] (≥pop*)mul [t_j1, ..., t_jq] hold.

Here ≥pop* (≥pop) denotes the reflexive closure of >pop* (>pop), and (>pop*)mul the multiset extension of >pop*. When R ⊆ >pop* holds, we say that >pop* is compatible with R.

The main theorem from (Avanzini and Moser, 2008) states:

Theorem 4 Let R be a finite constructor TRS compatible with >pop*, i.e., R ⊆ >pop*. Then the runtime complexity of R is polynomial. The polynomial depends only on the cardinality of F and the sizes of the right-hand sides in R.

We conclude this section by demonstrating the application of POP* to the TRS R_mult:



Example 5 Reconsider the rewrite system R_mult from Example 1. We suppose that the second argument of addition (add) is safe (safe(add) = {2}) and that all arguments of multiplication (mult) are normal (safe(mult) = ∅). Furthermore, let the precedence > be defined by mult > add > s. Then R_mult is compatible with >pop*. As a consequence of Theorem 4, the number of rewrite steps starting from mult(⌈n⌉, ⌈m⌉) is polynomially bounded in n and m.

In order to verify compatibility for this particular instance of >pop*, we need to show that all rules in R_mult are strictly decreasing with respect to >pop*, that is, l >pop* r holds for every l → r ∈ R_mult. To exemplify this, consider the rule add(s(x), y) → s(add(x, y)). We write 〈i〉 for the i-th case of Definition 3. From s(x) >pop* x by case 〈2〉 we infer [s(x)] (>pop*)mul [x]. Furthermore [y] (≥pop*)mul [y] holds, and thus by case 〈4〉 we obtain add(s(x), y) >pop* add(x, y). Finally, from this and add > s we conclude by one application of case 〈3〉 that add(s(x), y) >pop* s(add(x, y)) holds.
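The case analysis of Definitions 2 and 3 can be turned directly into a recursive check. The following sketch (ours, in Python; the paper's tool is in OCaml) instantiates >pop and >pop* for R_mult with the safe mapping and precedence of Example 5 and confirms that all four rules are strictly decreasing. The multiset extension is implemented by canceling equal arguments and requiring each remaining right-hand argument to be dominated:

```python
# >pop and >pop* for R_mult under safe(add) = {2}, safe(mult) = {},
# and the precedence mult > add > s.  Terms: tuples ('f', ...), variables: strings.

RULES = [
    (('add', ('0',), 'y'), 'y'),
    (('add', ('s', 'x'), 'y'), ('s', ('add', 'x', 'y'))),
    (('mult', ('0',), 'y'), ('0',)),
    (('mult', ('s', 'x'), 'y'), ('add', 'y', ('mult', 'x', 'y'))),
]
D = {'add', 'mult'}                                    # defined symbols
SAFE = {'add': {2}, 'mult': set(), 's': {1}, '0': set()}
PREC = {('mult', 'add'), ('mult', 's'), ('add', 's')}  # mult > add > s

def subterm(s, t):
    """s |> t: t is a proper subterm of s."""
    return not isinstance(s, str) and any(a == t or subterm(a, t) for a in s[1:])

def pop(s, t):
    """The auxiliary order s >pop t (Definition 2)."""
    if isinstance(s, str):
        return False
    f, args = s[0], s[1:]
    # cases <1>/<2>: any argument for constructors, normal arguments otherwise
    idxs = (range(1, len(args) + 1) if f not in D
            else [i for i in range(1, len(args) + 1) if i not in SAFE[f]])
    if any(args[i - 1] == t or pop(args[i - 1], t) for i in idxs):
        return True
    if f in D and not isinstance(t, str) and (f, t[0]) in PREC:   # case <3>
        return all(pop(s, ti) for ti in t[1:])
    return False

def mul_ext(ss, ts, strict):
    """Multiset extension of >pop* (strict=True) resp. its reflexive closure."""
    ss, ts = list(ss), list(ts)
    for t in list(ts):                                 # cancel equal elements
        if t in ss:
            ss.remove(t)
            ts.remove(t)
    if strict and not ss:
        return False
    return all(any(pops(s, t) for s in ss) for t in ts)

def pops(s, t):
    """The polynomial path order s >pop* t (Definition 3)."""
    if pop(s, t):                                                  # case <1>
        return True
    if isinstance(s, str):
        return False
    f, args = s[0], s[1:]
    if any(a == t or pops(a, t) for a in args):                    # case <2>
        return True
    if isinstance(t, str):
        return False
    g, targs = t[0], t[1:]
    if f in D and (f, g) in PREC:                                  # case <3>
        for i0 in SAFE[g]:
            if pops(s, targs[i0 - 1]) and all(
                    pop(s, ti) or (i in SAFE[g] and subterm(s, ti))
                    for i, ti in enumerate(targs, 1) if i != i0):
                return True
    if f == g and len(args) == len(targs):                         # case <4>
        nrm = [i for i in range(1, len(args) + 1) if i not in SAFE[f]]
        sf = sorted(SAFE[f])
        return (mul_ext([args[i - 1] for i in nrm],
                        [targs[i - 1] for i in nrm], strict=True)
                and mul_ext([args[i - 1] for i in sf],
                            [targs[i - 1] for i in sf], strict=False))
    return False

# every rule of R_mult is strictly decreasing, so R_mult is compatible with >pop*
compatible = all(pops(l, r) for l, r in RULES)
```

In particular, `pops` reproduces the derivation above for add(s(x), y) → s(add(x, y)) via cases 〈2〉, 〈4〉 and 〈3〉.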

2 A Propositional Encoding of POP* with Finite Semantic Labeling

In (Zantema, 1995) the transformation technique of semantic labeling is introduced. From R a labeled TRS R_lab is obtained by labeling the function symbols in R with semantic information. Semantics are given to R by defining a model. A model is an F-algebra A, i.e., a carrier A equipped with operations fA : A^n → A for every n-ary symbol f ∈ F, such that for every rule l → r ∈ R and every assignment α : V → A, the equality [α]A(l) = [α]A(r) holds. Here [α]A(t) denotes the interpretation of t under the assignment α, inductively defined by [α]A(t) = α(t) if t ∈ V and [α]A(t) = fA([α]A(t1), ..., [α]A(tn)) if t = f(t1, ..., tn). The system is then labeled according to a labeling l for A, i.e., a set of mappings lf : A^n → A for every n-ary function symbol f ∈ F.¹

For every assignment α, the mapping labα(t) is defined by labα(t) = t if t ∈ V and labα(f(t1, ..., tn)) = fa(labα(t1), ..., labα(tn)) where a = lf([α]A(t1), ..., [α]A(tn)). The labeled TRS R_lab is obtained by labeling all rules under all assignments α, that is,

    R_lab = {labα(l) → labα(r) | l → r ∈ R and α an assignment}.

The main theorem from (Zantema, 1995) states that R_lab is terminating if and only if R is terminating. In the following, we restrict ourselves to algebras B with carrier B = {true, false}; however, the approach extends to arbitrary finite carriers.
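As a concrete illustration, the following sketch (ours, in Python) labels R_mult over the Boolean carrier. The model (0 as false, s as the constant-true function, add as disjunction, mult as conjunction) and the labeling (each symbol labeled with the interpreted value of its first argument) are illustrative choices of ours, not taken from the paper:

```python
from itertools import product

# Finite semantic labeling of R_mult over B = {False, True}.
# Terms are tuples ('f', ...); variables are strings.

RULES = [
    (('add', ('0',), 'y'), 'y'),
    (('add', ('s', 'x'), 'y'), ('s', ('add', 'x', 'y'))),
    (('mult', ('0',), 'y'), ('0',)),
    (('mult', ('s', 'x'), 'y'), ('add', 'y', ('mult', 'x', 'y'))),
]

# interpretations f_B (a model, checked below) and labelings l_f
INTERP = {'0': lambda: False, 's': lambda x: True,
          'add': lambda x, y: x or y, 'mult': lambda x, y: x and y}
LABEL = {f: (lambda *vals: vals[0] if vals else False) for f in INTERP}

def value(t, alpha):
    """[alpha]_B(t): interpretation of term t under assignment alpha."""
    if isinstance(t, str):
        return alpha[t]
    return INTERP[t[0]](*(value(a, alpha) for a in t[1:]))

def variables(t):
    if isinstance(t, str):
        return {t}
    return set().union(set(), *(variables(a) for a in t[1:]))

def assignments(vs):
    vs = sorted(vs)
    for vals in product([False, True], repeat=len(vs)):
        yield dict(zip(vs, vals))

# model condition: [alpha](l) = [alpha](r) for every rule and assignment
is_model = all(value(l, a) == value(r, a)
               for l, r in RULES for a in assignments(variables(l)))

def lab(t, alpha):
    """lab_alpha(t): attach the label l_f([alpha](t1), ..., [alpha](tn))."""
    if isinstance(t, str):
        return t
    vals = tuple(value(a, alpha) for a in t[1:])
    name = f"{t[0]}_{LABEL[t[0]](*vals)}"
    return (name,) + tuple(lab(a, alpha) for a in t[1:])

# R_lab: label every rule under every assignment
R_lab = {(lab(l, a), lab(r, a))
         for l, r in RULES for a in assignments(variables(l))}
```

With these choices the model condition holds, and the labeled system contains, for example, the rule add_False(0_False, y) → y.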

To encode a Boolean function b : B^n → B, we make use of unique propositional atoms b_w for every sequence of arguments w = w1, ..., wn ∈ B^n. The atom b_w denotes the result of applying b to w1, ..., wn. Let a1, ..., an be propositional formulas. To impose restrictions on the encoded function b, we introduce the formula b(a1, ..., an) such that for a satisfying assignment ν the equality ν(b(a1, ..., an)) = b_{ν(a1), ..., ν(an)} holds. For instance, with b(a1, a2) ↔ r we assert that the encoded function b satisfies b(ν(a1), ν(a2)) = ν(r).
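The role of the atoms b_w can be seen in a toy setting: an assignment to the four atoms of a binary function picks one of the sixteen Boolean functions, and a constraint cuts the candidates down. A brute-force sketch of ours in Python (a real implementation would hand such constraints to a SAT solver instead):

```python
from itertools import product

# One atom b_w per argument vector w of an unknown binary Boolean
# function b: B^2 -> B; an assignment nu to the four atoms fixes b.
ATOMS = list(product([False, True], repeat=2))   # the argument vectors w

def models(constraint):
    """All assignments nu to the atoms b_w under which the constraint holds."""
    sols = []
    for vals in product([False, True], repeat=len(ATOMS)):
        nu = dict(zip(ATOMS, vals))
        if constraint(nu):
            sols.append(nu)
    return sols

# example constraint: the encoded function must satisfy b(true, x) = x,
# i.e. b_(true,true) holds and b_(true,false) does not; b(false, x) is free
sols = models(lambda nu: nu[(True, True)] and not nu[(True, False)])
```

The two unconstrained atoms leave exactly four candidate functions.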

For every assignment α : V → A and every term t appearing in R we introduce the atom int_{α,t} and, for t ∉ V, the atom lab_{α,t}. The meaning of int_{α,t} is the result of [α]B(t); lab_{α,t} denotes the label of the root symbol of t under α. In order to ensure this for t = f(t1, ..., tn) and a particular assignment α, we define

    INTα(t) = int_{α,t} ↔ fB(int_{α,t1}, ..., int_{α,tn}), and
    LABα(t) = lab_{α,t} ↔ lf(int_{α,t1}, ..., int_{α,tn}).

Furthermore, for t ∈ V we set INTα(t) = int_{α,t} ↔ α(t). We extend ⊵ to TRSs as follows: R ⊵ t if l ⊵ t or r ⊵ t for some rule l → r ∈ R. Besides the model condition, the above constraints have to be enforced for every term appearing in R. This is covered by

    LAB(R) = ⋀_α ( ⋀_{R ⊵ t} (INTα(t) ∧ LABα(t)) ∧ ⋀_{l → r ∈ R} (int_{α,l} ↔ int_{α,r}) ).

¹ The definition from (Zantema, 1995) allows the labeling of just a subset of F, leaving the other symbols unchanged. In our context this has no consequence and simplifies the translation.




Assume ν is a satisfying assignment for LAB(R), and let R_lab denote the system obtained by labeling R according to the encoded labeling and model. In order to show compatibility of R_lab with POP*, we need to find a precedence > and a safe mapping safe such that R_lab ⊆ >pop* holds for the induced order >pop*. To compare the labeled versions of two concrete terms s, t ∈ T(F, V) under a particular assignment α, we define

    ⟦s >pop* t⟧α = ⟦s >(1)pop* t⟧α ∨ ⟦s >(2)pop* t⟧α ∨ ⟦s >(3)pop* t⟧α ∨ ⟦s >(4)pop* t⟧α.

Here ⟦s >(i)pop* t⟧α refers to the encoding of case 〈i〉 from Definition 3. We discuss the cases 〈2〉–〈4〉; case 〈1〉, the comparison using the weaker order >pop, is obtained analogously.

Note that si = t implies labα(si) = labα(t). Thus case 〈2〉 is perfectly captured by ⟦f(s1, ..., sn) >(2)pop* t⟧α = ⊤² if si = t holds for some si. Otherwise, we define ⟦f(s1, ..., sn) >(2)pop* t⟧α = ⋁_{i=1}^{n} ⟦si >pop* t⟧α. For f ∈ F and a formula a representing the label, the formula SF(f_a, i) (NRM(f_a, i)) asserts that, depending on the valuation of a, the i-th position of f_true or f_false is safe (normal). Likewise, for f, g ∈ F, the formula ⟦f_a > g_b⟧ is defined such that for a satisfying assignment ν, f_{ν(a)} > g_{ν(b)} is asserted.

Assume the unlabeled symbol f is a defined symbol of R. For f ≠ g we define

    ⟦f(s1, ..., sn) >(3)pop* g(t1, ..., tm)⟧α = ⟦f_{labα,s} > g_{labα,t}⟧
        ∧ ⋁_{i0=1}^{m} ( ⟦s >pop* t_{i0}⟧α ∧ SF(g_{labα,t}, i0)
        ∧ ⋀_{i=1, i≠i0}^{m} ( ⟦s >pop t_i⟧α ∨ (SF(g_{labα,t}, i) ∧ s ▷ t_i) ) ).

Here we employ the fact that the superterm property ▷ is closed under labeling. Additionally, we add the rule f_a(x1, ..., xn) → c, with c a fresh constant, to the labeled system and require f_a > c in the precedence. This guarantees that f_a is defined with respect to R_lab, as otherwise case 〈3〉 is not applicable. Alternatively, one could encode whether f_a is defined and adapt the encoding of case 〈3〉 accordingly, but experimental findings indicate that the described approach is favorable.

² We use ⊤ and ⊥ to denote truth and falsity in propositional formulas.

To encode multiset comparisons, we make use of multiset covers (Schneider-Kamp, Thiemann, Annov, Codish and Giesl, 2007). A multiset cover is a pair of total mappings γ : {1, ..., n} → {1, ..., n} and ε : {1, ..., n} → B, encoded using fresh atoms γ_{i,j} and ε_i. The underlying idea is that for the comparison [s1, ..., sn] (≥pop*)mul [t1, ..., tn] to hold, every term tj has to be covered by some term si (encoded as γ_{i,j} = true), either by si = tj (ε_i = true) or by si >pop* tj (ε_i = false). For the case si = tj, si must not cover any element besides tj. To assert a correct encoding of (γ, ε), we introduce the formula ⟦(γ, ε)⟧. By means of multiset covers we are able to encode case 〈4〉 using one multiset comparison. We define

    ⟦f(s1, ..., sn) >(4)pop* f(t1, ..., tn)⟧α =
        (lab_{α,s} ↔ lab_{α,t}) ∧ ⟦(γ, ε)⟧ ∧ ⋁_{i=1}^{n} ( NRM(f_{labα,s}, i) ∧ ¬ε_i )
        ∧ ⋀_{i=1}^{n} ⋀_{j=1}^{n} ( γ_{i,j} → ( (SF(f_{labα,s}, i) ↔ SF(f_{labα,t}, j))
            ∧ (ε_i → s_i = t_j) ∧ (¬ε_i → ⟦s_i >pop* t_j⟧α) ) )

where we restrict comparisons of arguments to arguments of the same kind. Assuming STRICT(R) and SM_SL(R) cover the restrictions on the precedence and the safe mapping, satisfiability of

    POP*_SL(R) = ⋀_{l → r ∈ R} ⋀_α ⟦l >pop* r⟧α ∧ SM_SL(R) ∧ STRICT(R) ∧ LAB(R)

certifies the existence of a model B and a labeling l such that the rewrite system

    R'_lab = R_lab ∪ {f_a(x1, ..., xn) → c | f ∈ D(R) and f_a ∈ C(R_lab)}

is compatible with >pop*. Since every rewrite sequence in R translates to a sequence in R_lab, by Theorem 4 it is an easy exercise to prove the following lemma:

Lemma 6 Let R be a finite constructor TRS and assume POP*_SL(R) is satisfiable. Then the induced runtime complexity is polynomial.

3 Experimental Results

We implemented the encoding of POP* with semantic labeling (denoted by POP*_SL) in OCaml and compare it to the implementation without labeling from (Avanzini and Moser, 2008) (denoted by POP*) and to an implementation of a restricted class of polynomial interpretations (denoted by SMC). To check satisfiability of the obtained formulas we employ the MiniSat SAT solver (Eén and Sörensson, 2003).

SMC refers to a restrictive class of polynomial interpretations: every constructor symbol is interpreted by a strongly linear polynomial, i.e., a polynomial of the shape P(x1, ..., xn) = Σ_{i=1}^{n} xi + c with c ∈ N, c ≥ 1. Furthermore, each defined symbol is interpreted by a simple-mixed polynomial P(x1, ..., xn) = Σ_{(i1,...,in) ∈ {0,1}^n} a_{i1...in} x1^{i1} ··· xn^{in} + Σ_{i=1}^{n} b_i x_i^2 with coefficients in N. For this class of polynomial interpretations it is trivial to check that they induce polynomial bounds on the runtime complexity. To find these interpretations automatically we employ cdiprover3 (Moser and Schnabl, 2008).



The table below presents experimental results based on two testbeds. Testbed T consists of the 957 examples from the Termination Problem Database 4.0³ (TPDB) that were automatically verified terminating in the competition of 2007⁴. Testbed C is the restriction of T to constructor TRSs (449 in total). The experiments, performed on a PC with 512 MB of RAM and a 2.4 GHz Intel Pentium processor, are collected in Table 1⁵.

Table 1: Experimental results on TPDB 4.0.

                              POP*          POP*_SL       SMC
                              T      C      T      C      T      C
    Yes                       65     41     128    74     156    83
    Maybe                     892    408    800    370    495    271
    Timeout (60 sec.)         0      0      29     5      306    95
    Average Time Yes (sec.)   0.037         0.130         0.183

The results confirm that semantic labeling significantly increases the power of POP*, yielding results comparable to SMC. It is noteworthy that the union of the yes-instances of the three methods consists of 218 examples for testbed T and 112 for testbed C. For these 112 out of 449 constructor TRSs we are able to conclude a polynomial runtime complexity. Interestingly, POP*_SL and SMC succeed on quite different ranges of systems. There are 29 constructor TRSs that only POP*_SL can deal with, whereas 38 constructor yes-instances of SMC cannot be handled by POP*_SL. Table 1 reflects that for both suites SMC runs into a timeout for approximately every fourth system. This indicates that purely semantic methods similar to SMC tend to become impractical as the size of the input system increases. Compared to this, the number of timeouts of POP*_SL is rather low, confirming the feasibility of our new approach.

We perform various optimizations in our implementation. First of all, the constraint formula can be reduced during construction; in combination with this, it is usually beneficial to construct the formula lazily. For example, ⟦f(s1, ..., sn) >(2)pop* si⟧α reduces to ⊤, and thus one can directly conclude ⟦f(s1, ..., sn) >pop* si⟧α = ⊤ without constructing the encodings of the other cases. Furthermore, ⟦s >pop* t⟧α is doomed to failure if t contains variables not appearing in s; in this case we replace the constraint by ⊥. SAT solvers expect their input in CNF, and the standard translation is worst-case exponential in size. We employ the transformation proposed in (Plaisted and Greenbaum, 1986) to obtain an equisatisfiable CNF linear in size. This approach is analogous to Tseitin's transformation (Tseitin, 1968) but additionally takes the polarity of atoms into account, usually resulting in shorter transformations.
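For illustration, here is a sketch of ours (in Python) of the plain Tseitin variant, introducing one fresh atom per subformula; the Plaisted-Greenbaum transformation additionally tracks polarities and emits only one direction of each defining equivalence:

```python
import itertools

def tseitin(formula):
    """Tseitin-style transformation to an equisatisfiable CNF.
    Formulas are nested tuples: ('var', name), ('not', f),
    ('and', f, g), ('or', f, g).  Returns (clauses, num_vars), with
    clauses as lists of signed integers (DIMACS style)."""
    clauses, var_ids = [], {}
    fresh = itertools.count(1)

    def enc(node):
        op = node[0]
        if op == 'var':
            if node[1] not in var_ids:
                var_ids[node[1]] = next(fresh)
            return var_ids[node[1]]
        if op == 'not':
            return -enc(node[1])
        a, b = enc(node[1]), enc(node[2])
        x = next(fresh)                    # fresh atom naming this subformula
        if op == 'and':                    # clauses for x <-> (a and b)
            clauses.extend([[-x, a], [-x, b], [x, -a, -b]])
        else:                              # clauses for x <-> (a or b)
            clauses.extend([[-x, a, b], [x, -a], [x, -b]])
        return x

    clauses.append([enc(formula)])         # assert the root literal
    return clauses, next(fresh) - 1

def brute_sat(clauses, num_vars):
    """Naive satisfiability check by enumerating all assignments."""
    for bits in itertools.product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False
```

The number of clauses grows linearly with the formula, at the price of the fresh atoms; in practice the CNF is handed to a solver such as MiniSat rather than to a brute-force check.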

³ Available at http://www.lri.fr/~marche/tpdb.
⁴ Cf. http://www.lri.fr/~marche/termination-competition/2007/.
⁵ Detailed results are available at http://homepage.uibk.ac.at/~csae2496/esslli08.

4 Conclusion

In this paper we have shown how to automatically verify polynomial runtime complexities of rewrite systems. For that we employ semantic labeling and the polynomial path order POP*. Our automation works by a reduction to SAT and the use of a state-of-the-art SAT solver. To the best of our knowledge, this is the first SAT encoding of recursive path orders with finite semantic labeling. The experimental results confirm the feasibility of our approach. Moreover, they demonstrate that by semantic labeling we significantly increase the power of POP*.

Our research is also comparable to (Bonfante, Marion and Péchoux, 2007), where recursive path orders together with strongly linear polynomial quasi-interpretations are employed in the complexity analysis. However, this method relies on caching techniques to achieve polytime computability. In contrast, we only demand eager evaluation.

In future work we will strengthen the applicability of our methods. Currently we investigate the integration of POP* into the dependency pair framework for automated complexity analysis, as proposed in (Hirokawa and Moser, 2008). As this framework allows the use of argument filterings (Kusakari, Nakamura and Toyama, 1999) and usable rules (Arts and Giesl, 2000), we expect a significant increase in the ability to automatically verify polynomial runtime complexities.

Finally, we want to mention another exciting field of application. There has long been interest in the functional programming community in automatically verifying complexity properties of programs; for brevity we just mention (Rosendahl, 1989; Anderson, Khoo, Andrei and Luca, 2005; Bonfante et al., 2007). Rewriting naturally models the evaluation of functional programs, and the termination behavior of functional programs via transformations to rewrite systems has been studied extensively. For instance, one recent approach is described in (Giesl, Swiderski, Schneider-Kamp and Thiemann, 2006), where Haskell programs are covered. In joint work with Hirokawa, Middeldorp and Moser (Avanzini, Hirokawa, Middeldorp and Moser, 2007) we propose a translation from (a subset of higher-order) Scheme programs to term rewrite systems. The transformation is designed to be complexity preserving and thus allows the study of the complexity of a Scheme program P by the analysis of the transformed rewrite system R. Hence from compatibility of R with POP* we can directly conclude that the number of evaluation steps of the Scheme program P is polynomially bounded with respect to the input sizes. All necessary steps can be performed mechanically, and thus we arrive at a completely automatic complexity analysis for Scheme, and for eagerly evaluated functional programs in general.


Anderson, H., Khoo, S.-C., Andrei, S. and Luca, B. (2005).

runtime properties, Proc. 3th APLAS, pp. 230–246.

Calculating polynomial

Arts, T. and Giesl, J. (2000). Termination of term rewriting using dependency pairs, TCS

236(1-2): 133–178.

Avanzini, M., Hirokawa, N., Middeldorp, A. and Moser, G. (2007). Proving termination

of scheme programs by rewriting. Draft 6 .

Avanzini, M. and Moser, G. (2008). Complexity analysis by rewriting, Proc. 9th FLOPS,

Vol. 4989 of LICS, pp. 130–146.

6 Available at http://cl-informatik.uibk.ac.at/~georg/list.publications


Proceedings of the 13th ESSLLI Student Session

Baader, F. and Nipkow, T. (1998). Term Rewriting and All That, Cambridge University Press.

Bellantoni, S. and Cook, S. A. (1992). A new recursion-theoretic characterization of the

polytime functions, CC 2: 97–110.

Bonfante, G., Marion, J.-Y. and Péchoux, R. (2007). Quasi-interpretation synthesis by

decomposition, Proc. 4th ICTAC, Vol. 4711 of LNCS, pp. 410–424.

Eén, N. and Sörensson, N. (2003). An extensible SAT-solver, Proc. 6th SAT, Vol. 2919 of

LNCS, pp. 502–518.

Giesl, J., Swiderski, S., Schneider-Kamp, P. and Thiemann, R. (2006). Automated termination

analysis for Haskell: From term rewriting to programming languages, Proc.

17th RTA, Vol. 4098 of LNCS, pp. 297–312.

Hirokawa, N. and Moser, G. (2008). Automated complexity analysis based on the dependency

pair method, Proc. 4th IJCAR. To appear.

Hofbauer, D. (1992). Termination proofs by multiset path orderings imply primitive recursive

derivation lengths, TCS 105(1): 129–140.

Koprowski, A. (2006). TPA: Termination proved automatically, Proc. 17th RTA, pp. 257–


Koprowski, A. and Middeldorp, A. (2007). Predictive labeling with dependency pairs

using SAT, Proc. 21st CADE, Vol. 4603 of LNCS, pp. 410–425.

Kusakari, K., Nakamura, M. and Toyama, Y. (1999). Argument filtering transformation,

Proc. 1st PPDP, Vol. 1702 of LNCS, pp. 47–61.

Lescanne, P. (1995). Termination of rewrite systems by elementary interpretations, Formal

Aspects of Computing 7(1): 77–90.

Moser, G. and Schnabl, A. (2008). Proving quadratic derivational complexities using

context dependent interpretations, Proc. 19th RTA. To appear.

Plaisted, D. A. and Greenbaum, S. (1986). A structure-preserving clause form translation,

J. Symb. Comput. 2(3): 293–304.

Rosendahl, M. (1989). Automatic complexity analysis, Proc. 4th FPCA, pp. 144–156.

Schneider-Kamp, P., Thiemann, R., Annov, E., Codish, M. and Giesl, J. (2007). Proving

termination using recursive path orders and SAT solving, Proc. 6th FroCoS, number

4720 in LNCS, pp. 267–282.

TeReSe (2003). Term Rewriting Systems, Vol. 55 of CTTCS, Cambridge University Press.

Tseitin, G. (1968). On the complexity of derivation in propositional calculus, SCML, Part

2 pp. 115–125.

Zantema, H. (1995). Termination of term rewriting by semantic labelling, FI 24(1/2): 89–105.




Timo Baumann

University of Potsdam

Abstract. We present a system for testing turn-taking strategies in a simulation environment,

in which artificial dialogue participants exchange audio streams in real time – unlike earlier

turn-taking simulations, which interchanged unambiguous symbolic messages. Dialogue

participants autonomously determine their turn-taking behaviour, based on their analysis of

the incoming audio. We use machine-learning methods to classify the continuous audio

signal into symbolic turn-taking states. We experiment with various rule sets and show how

simple, local management rules can create realistic behavioural patterns.

1 Introduction

Turn-taking management, i. e. deciding who may speak when in a dialogue, is an important

subtask of interaction management. The classical model of turn-taking (Sacks,

Schegloff and Jefferson, 1974) describes turn-taking as locally managed (depending only

on a local context) and predictive (upcoming turn endings are signalled in advance by the

interplay of syntax, semantics and prosody). Current speech dialogue systems (SDSes),

on the other hand, use reactive turn-taking schemes, with the turn being taken after a silence

of fixed length or of contextually determined length (Ferrer, Shriberg and Stolcke, 2002).

This limits the interactivity of SDSes, as turns have to be separated by intervening silence.

The prediction of turn endings (EoT prediction) has been investigated by a number

of authors. Schlangen (2006) trains classifiers to predict the end of turn (EoT) but uses

features that are not calculated strictly incrementally. Turn-management has also been

studied before, but typically in simulation systems that interchange symbolic messages

and work in a centrally managed environment (Padilha, 2006). In the present paper, we

combine the efforts for EoT-prediction and turn-taking simulation. We propose an incremental

classification of speech into speech states that control the system’s turn-taking. We

first evaluate the classification itself and then in combination with different turn-management

strategies in a dialogue simulation environment.

Dialogue simulation itself has a long-standing tradition in the development of SDSes,

but the main focus seems to be on the improvement of dialogue strategies (Schatzmann,

Weilhammer, Stuttle and Young, 2006) and audio is usually just used to trigger realistic

ASR errors (López-Cózar, De la Torre, Segura and Rubio, 2003), which contrasts with

the focus of the present paper: Our goal is to show how realistic turn-taking behaviour

can be simulated using only local context for the classification of speech into classes relevant

to turn-taking management combined with simple, locally managed rules. Dialogue

strategies in general are not locally managed, and thus learning dialogue strategies seems

to require the more complex reinforcement learning instead of the simple classifier training

we use.



Figure 1: A human user conversing with an artificial DP in our interaction environment

(structured as in section 2). A dialogue recorder wiretaps their conversation.

We do not (and do not need to) take into account the content of the dialogues and

in fact we limit our speech analysis to simple prosodic features for the EoT prediction.

Thus, for this work, we abstract away from all questions of content management and let

our dialogue participants speak randomly selected pre-recorded utterances – though with

proper turn-taking.

The remainder of the paper is structured as follows: Section 2 describes the system

architecture and Section 3 the corpora we use. Section 4 evaluates the speech state classification

and Section 5 demonstrates and evaluates some simple turn-management strategies.

We close with conclusions and ideas for further work.

2 Architecture of the Interaction Environment

Our architecture defines an interaction environment in which dialogue participants (DPs)

communicate with each other. Interaction is purely non-symbolic, using asynchronous

audio streams over RTP (Schulzrinne, Casner, Frederick and Jacobson, 2003). There is no

common clock or other synchronisation required between DPs. The architecture provides

a headset tool for human DPs, and monitoring tools to listen to ongoing dialogues and to

record them to disk.

Figure 1 shows two dialogue participants – one human, one artificial – conversing in

the environment described above. The artificial DP on the right of figure 1 is structured

as described below.

Artificial DPs are realized as modular and extensible collections of event-driven software

agents in the Open Agent Architecture, OAA (Martin, Cheyer and Moran, 1999).

In the OAA each software agent advertises its own abilities to solve problems (such as

generating utterances) and may itself request other agents to solve sub-problems (e. g.

sending data over RTP). For audio processing inside the DP we rely on the Sphinx-4

framework (Walker, Lamere, Kwok, Raj, Singh, Gouvea, Wolf and Woelfel, 2004) which

we extended for our audio-processing pipeline. In the current system, we do not yet use

Sphinx’ abilities as a speech recognizer and most other modules that would be needed for

a real dialogue system are missing. These are obvious enhancements for later versions.


2.1 Speech Generation

Speech generation consists of a synthesizer and a dispatcher. The synthesizer currently

selects from a corpus of pre-recorded utterances and will be extended to include text-to-speech.

To make turn-taking management harder and the system more realistic, a fixed

delay of 100 ms between signal to the module and onset of the recorded utterance is

introduced at this point. 1 This delay is realized by sending 100 ms of recorded silence

before the utterance and utterances are also followed by 100 ms of recorded silence. (If

DPs were to send digital zeros directly before and after their utterances, speech state

classification, as described below, would become trivial.)

The speech dispatcher continuously sends an RTP stream in packets of 10 ms, either

audio from a file or sine waves if so instructed by the synthesizer, or silence (digital zero).

It can also be ordered to interrupt the audio and to revert to silence. The dispatcher also

publishes its current speech state, which may be one of sil, start of turn (SoT), talk, or end

of turn (EoT), to the DP it is part of.

2.2 Speech Analysis


Speech analysis focuses solely on local prosodic analysis for the classification of the

listening state (which should reflect the interlocutor’s speech state, as described above).

In order to be effective, classification must happen with as short a lag as possible. While

short lags would allow for reactive behaviour, we aim to predict when the interlocutor’s

end of turn is approaching in order to achieve smooth turn changes and counter-balance

the 100 ms lag before a response can be uttered by the speech generation.

We use machine learning to classify each received frame (10 ms) of audio as silence (sil),

ongoing talk (talk) or end of turn (EoT). Classification is based exclusively on signal

power, pitch and derived features. Our pitch extraction is modelled after the first three

steps of the YIN algorithm (de Cheveigné and Kawahara, 2002). As no smoothing or dynamic

programming is applied to the pitch extraction, results are computed incrementally

in real-time and become available instantaneously. The algorithm runs at several times

real-time on average hardware. On the corpora described below, the gross error rate is

1.6 % compared to the well-known ESPS algorithm (Talkin, 1995).

In order to track changes over time, we derive features by windowing over past values

of pitch and power with sizes ranging from 20 to 500 ms. While the features calculated

on smaller windows help to smooth and to remove outliers due to failures of the pitch

extraction, the larger windows are expected to capture long-term trends. We calculate the

arithmetic mean and the range of the values, the mean difference between values within

the window and the relative position of the minimum and maximum. We also perform

a linear regression and use its slope, the MSE of the regression and the error of the

regression for the last value in the window.
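The windowed features just listed can be sketched as follows; the function name and dictionary layout are our own, since the paper does not fix an interface:

```python
import numpy as np

def window_features(values):
    """Features over one window of past pitch or power values
    (the system uses window sizes from 20 to 500 ms)."""
    v = np.asarray(values, dtype=float)
    n = len(v)
    feats = {
        "mean": v.mean(),
        "range": v.max() - v.min(),
        "mean_diff": np.abs(np.diff(v)).mean(),
        "min_pos": np.argmin(v) / (n - 1),  # relative position in window
        "max_pos": np.argmax(v) / (n - 1),
    }
    # Linear regression over the window: slope, MSE of the fit,
    # and the residual of the fit at the most recent value.
    x = np.arange(n)
    slope, intercept = np.polyfit(x, v, 1)
    pred = slope * x + intercept
    feats["slope"] = slope
    feats["mse"] = np.mean((v - pred) ** 2)
    feats["last_err"] = v[-1] - pred[-1]
    return feats
```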

2.3 Turn-Taking Management

The turn-taking management agent determines whether to start or stop emitting utterances

on the basis of the states of the generation and analysis modules. An important aspect in

turn-taking management is robustness. To be robust, the turn-taking strategy must not

1 In a dialogue system NLG and TTS would require processing time; for humans there is a delay between

starting to plan an utterance and the start of the articulation (Levinson, 1983).



depend on its interlocutor acting and reacting in certain ways. Naturally, “good” dialogue

will only evolve from friendly dialogue partners, but the turn-management strategy must

prevent dead-locks due to the interlocutor’s behaviour.

Upon the reception of dialogue state change notifications from the analysis module, the

agent decides about emitting messages to the generation module, ordering it to talk or to

hush, according to a defined turn-taking strategy. Messages are only emitted with certain

probabilities. The probabilities to start or hush were determined empirically to lead to

natural performance. If no action is taken, the agent sleeps for a short while (currently,

50 ms) being awakened if another message is received (for example EoT changing to

sil). Thus, exact timings are non-deterministic and randomly differ between agents. The

probability to start an utterance is set to 0.1, and to hush during an utterance to 0.3.
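A minimal sketch of one such decision step, using the probabilities above (identifiers are illustrative; this corresponds to the strategy-2 behaviour described in section 5):

```python
import random

P_START, P_HUSH = 0.1, 0.3  # empirically chosen values from the text
SLEEP_MS = 50               # sleep time when no action is taken

def decide(listening_state, speaking, rng=random):
    """One decision step of the turn-taking agent: returns 'talk',
    'hush', or None (meaning: sleep SLEEP_MS or until notified)."""
    if not speaking and listening_state == "sil":
        if rng.random() < P_START:  # start only with probability 0.1
            return "talk"
    elif speaking and listening_state in ("talk", "SoT"):
        if rng.random() < P_HUSH:   # hush only with probability 0.3
            return "hush"
    return None
```

Because each agent draws independently and sleeps between steps, exact timings differ between agents, as the text requires.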

3 Corpora

We perform our experiments with two different corpora, one of simple pseudo-speech, one

of read speech. Each corpus contains material from two different speakers (one female,

one male) for which we train separate speech analyzers, in order to be able to simulate

dialogues with one male and one female each.

For pseudo-speech our speakers repeatedly uttered the syllable /ba/ instead of the actually

occurring syllables in a script of 50 utterances (questions, informative sentences,

confirmations, etc). By always uttering the same syllable, we remove segment-inherent

influences on power and pitch variation, while at the same time retaining sentence intonation.

For read speech we relied on the two major speakers of the Kiel Corpus of

Read Speech, KCoRS (IPDS, 1994). That corpus contains some 600 utterances for each speaker.


The two corpora differ in size and complexity. Our controlled pseudo-speech poses

hardly any problem for pitch-extraction and does not contain voiceless speech, silence

during the occlusion of voiceless plosives or other potentially “difficult” audio. The

KCoRS on the other hand contains far more training material. Also, as the pseudo-speech

does not convey any semantic meaning, subjects in a listening test for the evaluation of

generated turn-taking patterns would not be distracted by nonsense dialogue.

The performance of a speech state classifier on both of our corpora is likely to be better

than on a corpus of real dialogue speech, as it is more homogeneous (especially compared to

speaker-independent speech state classification). Thus, our results should be considered

an upper bound on realistic results.

The start and end of each utterance were hand-annotated and each 10 ms of audio was

assigned to one of the listening states as described above with EoT being assigned to

frames in the vicinity of ± 50 ms of the utterance end. For the turn-taking management

experiments, we crop the audio files so that each utterance is preceded and succeeded by

100 ms of silence.
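The gold-standard labelling just described can be sketched as follows (constants and names are illustrative):

```python
FRAME_MS = 10          # one label per 10 ms frame
EOT_HALF_WIN_MS = 50   # EoT covers +/- 50 ms around the utterance end

def label_frame(t_ms, utt_start_ms, utt_end_ms):
    """Gold-standard speech state for the frame starting at t_ms,
    given hand-annotated utterance start and end times."""
    if abs(t_ms - utt_end_ms) <= EOT_HALF_WIN_MS:
        return "EoT"
    if utt_start_ms <= t_ms < utt_end_ms:
        return "talk"
    return "sil"
```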

4 Speech Analysis Evaluation

We used the machine learning toolkit Weka (Witten and Frank, 2000) to train various

speaker-dependent classifiers. For the evaluation, 80 % of each corpus was used as the

training set and 20 % as the test set. Tables 1 and 2 show the results of the OneR, J48 and




                             female speaker                    male speaker
                       Acc.  F sil F talk F EoT  FAR     Acc.  F sil F talk F EoT  FAR
OneR                   96.1  0.98  0.96   0.00  21.4     92.8  0.96  0.93   0.13  65.5
J48                    94.8  0.98  0.95   0.50  68.9     96.3  0.97  0.97   0.71  64.3
JRip                   95.3  0.98  0.95   0.55  68.3     96.2  0.97  0.97   0.80  59.2
Stateful JRip          95.9  0.98  0.95   0.59  48.4     95.5  0.97  0.96   0.72  50.0
Stateful JRip, shifted 96.2  0.98  0.96   0.59  48.4     96.4  0.97  0.97   0.80  47.5

Table 1: Accuracy, per-class f-measures and false alarm rate for various speech state
classifiers for the pseudo-speech corpus.



                             female speaker                    male speaker
                       Acc.  F sil F talk F EoT  FAR     Acc.  F sil F talk F EoT  FAR
OneR                   94.5  0.97  0.96   0.03  65.4     93.7  0.92  0.96   0.10  80.7
J48                    97.3  0.98  0.98   0.61  71.1     96.1  0.96  0.98   0.42  84.1
JRip                   96.6  0.97  0.98   0.73  61.1     95.9  0.97  0.96   0.61  65.7
Stateful JRip          96.4  0.96  0.98   0.70  31.9     94.9  0.97  0.96   0.58  50.0
Stateful JRip, shifted 96.9  0.97  0.98   0.74  31.6     95.5  0.97  0.96   0.64  48.9

Table 2: Accuracy, per-class f-measures and false alarm rate for various speech state
classifiers for the KCoRS speakers.

JRip algorithms for each corpus. OneR finds the most predictive feature to be the dynamic

range of frame energy over the last 100 or 200 ms. JRip outperforms J48, but

has far worse training complexity. Separation of speech and silence (which here is the

recorded silence in the corpus, not digital zero) is done with high accuracy. Recognition

of EoT regions is of lower quality, but still surpasses results in (Schlangen, 2006). 2

While the data and their states are sequential in nature, the classifiers as described

above evaluate each frame independently. At the same time, recognizing the other speaker’s

start or end of turn a little too late or too early hardly matters, while frequently

changing the listening state may lead to bad dialogue behaviour. This is measured in the

false alarm rate (FAR), defined as the proportion of over-generated state changes.

The analysis of classification output showed that wrong classifications would often

last for only one frame. We implemented a stateful classifier that only changes state

after two consecutive classifications of the underlying classifier. This strongly decreases

FAR but introduces systematic errors in the classification (every actual state change will

be registered one frame too late) and reduces precision/recall measures. When this is

accounted for in the evaluation, the stateful classifier outperforms the base classifier also

in these measures.
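A minimal sketch of such a stateful wrapper (class name and interface are our own):

```python
class StatefulClassifier:
    """Wraps a per-frame base classifier: the published state only
    changes after two consecutive identical base decisions. This
    suppresses one-frame misclassifications at the cost of
    registering every real state change one frame late."""

    def __init__(self, initial="sil"):
        self.state = initial  # the published (smoothed) state
        self.last = initial   # the previous base decision

    def update(self, base_decision):
        # Switch only when the base classifier repeats itself.
        if base_decision == self.last and base_decision != self.state:
            self.state = base_decision
        self.last = base_decision
        return self.state
```

A single-frame outlier (e.g. one spurious "talk" inside a run of "sil") never reaches the published state, which is exactly what lowers the false alarm rate.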

The results show that the complexity of the KCoRS is counterbalanced by its 10 times

larger size. This may indicate that speech state classification for real dialogue speech

would be feasible with a sufficiently large corpus and speaker-normalized prosodic features.

5 Simple Strategies for Turn-Taking

We outline some simple strategies to turn-control. Their purpose is to exemplify how

very restricted locally managed behaviour with some simple rules can already lead to

acceptable turn-taking behaviour as postulated by the local management model of Sacks

et al. (1974), without the need for a dialogue history, or complex temporal reasoning.

2 Results cannot be easily compared, as Schlangen (2006) recognizes turn-final words using prosodic

and syntactic features on a more complex corpus, reaching an f-measure of 0.36.



measure strategy 1 strategy 2 strategy 3

gap 14.0 % 351 ms 18.7 % 358 ms 17.4 % 362 ms

speaker a 31.4 % 1259 ms 35.9 % 1009 ms 36.5 % 1079 ms

speaker b 39.3 % 1415 ms 39.8 % 1165 ms 40.8 % 1225 ms

clash 15.4 % 1184 ms 5.6 % 317 ms 5.3 % 278 ms

Table 3: Distribution and mean duration of dialogue states for three turn-taking strategies

with pseudo-speech.

measure strategy 1 strategy 2 strategy 3

gap 14.1 % 528 ms 20.7 % 477 ms 18.9 % 454 ms

speaker a 36.2 % 1764 ms 40.5 % 1456 ms 34.7 % 1232 ms

speaker b 26.2 % 1437 ms 24.8 % 1307 ms 42.0 % 1540 ms

clash 23.5 % 1915 ms 4.0 % 253 ms 4.4 % 243 ms

Table 4: Distribution and mean duration of dialogue states for three turn-taking strategies

with KCoRS speakers.

5.1 Measuring Turn-Management Success

The dialogue state can be described by the current speech state of each of the dialogue

participants, with each speech state being either talk or sil. For two-party dialogue, this

results in four states: two “good” states where either one of the dialogue participants is

talking and two “bad” states: clashes, when both participants talk simultaneously, and

gaps with neither of them talking.

According to Sacks et al. (1974), speakers try to optimize their behaviour so as to

minimize the occurrence of both clashes and gaps. That is why we choose clashes and

gaps as basic measures for turn-taking success. Slight gaps and clashes occur all the time,

but they are not always perceptually relevant. We thus decided to calculate the proportion

of clashes and gaps over the course of the dialogue as well as their mean duration.
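Given per-frame boolean talk indicators for the two participants, these measures can be computed roughly as follows (the interface is our own):

```python
from itertools import groupby

def dialogue_measures(talk_a, talk_b, frame_ms=10):
    """Classify each frame into gap / speaker a / speaker b / clash
    and return, per state, its overall proportion and the mean
    duration (ms) of its contiguous runs."""
    def state(a, b):
        if a and b:
            return "clash"
        if a:
            return "speaker a"
        if b:
            return "speaker b"
        return "gap"
    states = [state(a, b) for a, b in zip(talk_a, talk_b)]
    n = len(states)
    runs = [(s, sum(1 for _ in g)) for s, g in groupby(states)]
    result = {}
    for s in set(states):
        lens = [length for run_state, length in runs if run_state == s]
        result[s] = (states.count(s) / n,               # proportion
                     frame_ms * sum(lens) / len(lens))  # mean duration
    return result
```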

For evaluation purposes, we set up two artificial dialogue participants and let them talk

with each other for about 10 minutes for each of the following strategies. We recorded

the internal states and calculated the described measures. The audio itself was recorded

but not further analyzed in the evaluation. The results of the strategies described below

are shown in Tables 3 and 4.

5.2 Strategy 1: Talk When Nobody Talks

Rule: Start an utterance when neither you nor your interlocutor is talking. (Implicitly:

Continue talking until your utterance is finished.)

The performance with this strategy strongly depends on the round-trip time from one

agent’s decision to take the turn until the other agent notices the turn being taken. The

shorter the lags introduced by the talking agent’s internal communication, audio transmission,

prosodic processing and classification, and the listening agent’s internal communication,

the more likely it is for a dialogue participant to notice its interlocutor talking (and

then listen until the interlocutor has finished) before it starts talking itself. For longer lags,

the DP will decide to talk even though its interlocutor may already have started talking.

As can be seen, this strategy leads to a large number of clashes.

5.3 Strategy 2: Hush When Both Talk

Rule as above, plus: Stop your utterance when both you and your interlocutor are talking.



The rule proves effective in reducing simultaneous talk as clashes are reduced by 65 %

(pseudo-speech) and over 80 % (KCoRS) respectively. At the same time, this strategy leads

to the introduction of utterance truncations, when an utterance is stopped prematurely.

(Actually, the majority of utterances (71 % for pseudo-speech) were truncated, but many of

these truncations occur in the silent phases before or after the actual talk and do not have

any deteriorating effect on the perceived turn-taking performance.) Truncations could be

reduced with a higher probability to hush during SoT.

5.4 Strategy 3: Start Talking Early

The previous strategies only react after turns have started or ended. In order to initiate

actions early and anticipate turn changes, this strategy exploits the EoT class of the

speech analysis (which was ignored before) in the first rule: Start an utterance when you

are not talking and your interlocutor is ending their turn or has already finished.

By starting utterance planning before the interlocutor’s preceding utterance is finished,

the dialogue participant can hide some of the lag introduced by its speech generation

module. The duration of both gaps and clashes is reduced compared to strategy 2, for

gaps because turns will be taken over more quickly and for clashes due to the original

talk-owner noticing the turn-change earlier, avoiding the start of a new utterance.

The durations for gaps and clashes with this strategy are similar to those reported

for parts of the Verbmobil corpus by Weilhammer and Rabold (2003), with 363 ms and

331 ms respectively. 3 Performance could be further improved by using a lower probability

to hush during EoT.
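The start and hush conditions of the three strategies can be summarised as two predicates; this is a sketch, with state names following the dispatcher's speech states (sil, SoT, talk, EoT) and all function names being illustrative:

```python
def wants_to_start(my_state, other_state, strategy):
    """Start condition of the three rule sets (applied before the
    manager's start probability)."""
    if strategy in (1, 2):  # strategies 1 and 2: talk when nobody talks
        return my_state == "sil" and other_state == "sil"
    # strategy 3: also start while the interlocutor's turn is ending
    return my_state == "sil" and other_state in ("sil", "EoT")

def wants_to_hush(my_state, other_state, strategy):
    """Hush condition: strategies 2 and 3 stop on simultaneous talk;
    strategy 1 never hushes."""
    return strategy >= 2 and my_state != "sil" and other_state in ("talk", "SoT")
```

Both predicates inspect only the current states, which is what makes the strategies locally managed in the sense of Sacks et al. (1974).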

6 Conclusion and Future Directions

We have presented a flexible, modular architecture for dialogue strategy evaluation where

arbitrary pairings of human users and artificial dialogue participants can be created. We

have discussed a case-study in this environment, where pairs of artificial DPs converse in

real time via audio. Each DP autonomously decides on their turn-taking behaviour (start

or stop talking) based on a local analysis of the audio signal and using machine-learned

classifiers. We tested these with corpora of simplified speech and achieved good recognition

performance. Three implemented turn-management rule sets, all of them locally managed

in the sense of Sacks et al. (1974), i. e. not requiring dialogue memory, were

shown to create increasingly realistic behavioural patterns.

We plan to use the components developed for this system in an interactive speech

dialogue system. For the speech state classification, we will need normalized prosodic

features that allow for speaker independent speech state classification. At the same time,

ASR will make features relative to syllable information (stress patterns, speech rate, ...)

accessible, as well as word hypotheses. We may also want to look into classifier confidence

scores, only emitting speech state changes if the classifier is reasonably certain.

In real dialogue, the problem of hesitations arises. Our classification will have to be

extended to distinguish hesitational interruptions from normal EoT. We would also like to

identify positions in a turn where a back-channelling utterance might be appropriate.

3 Note that their numbers are for turn changes only, while we do not distinguish between gaps at turn

changes and at turn continuations.




I would like to thank my supervisor David Schlangen for his constant guidance and support

and the anonymous reviewers for their insightful comments and suggestions.


de Cheveigné, A. and Kawahara, H. (2002). YIN, a fundamental frequency estimator for

speech and music, The Journal of the Acoustical Society of America 111(4): 1917–1930.


Ferrer, L., Shriberg, E. and Stolcke, A. (2002). Is the speaker done yet? Faster and more

accurate end-of-utterance detection using prosody, Proceedings of the International

Conference on Spoken Language Processing (ICSLP2002), Denver, USA.

IPDS (1994). The Kiel Corpus of Read Speech, CD-ROM.

Levinson, S. C. (1983). Pragmatics, Cambridge Textbooks in Linguistics, Cambridge

University Press.

López-Cózar, R., De la Torre, A., Segura, J. and Rubio, A. (2003). Assessment of dialogue

systems by means of a new simulation technique, Speech Communication

40(3): 387–407.

Martin, D., Cheyer, A. and Moran, D. (1999). The Open Agent Architecture: a framework

for building distributed software systems, Applied Artificial Intelligence 13(1/2): 91–128.

URL: citeseer.ist.psu.edu/martin99open.html

Padilha, E. G. (2006). Modelling Turn-taking in a Simulation of Small Group Discussion,

PhD thesis, School of Informatics, University of Edinburgh, Edinburgh, UK.

Sacks, H., Schegloff, E. A. and Jefferson, G. A. (1974). A simplest systematics for the

organization of turn-taking for conversation, Language 50(4): 696–735.

Schatzmann, J., Weilhammer, K., Stuttle, M. and Young, S. (2006). A survey of statistical

user simulation techniques for reinforcement-learning of dialogue management

strategies, The Knowledge Engineering Review 21(02): 97–126.

Schlangen, D. (2006). From reaction to prediction: Experiments with computational

models of turn-taking, Interspeech 2006, Pittsburgh, USA.

URL: http://www.ling.uni-potsdam.de/~das/papers/schlangen_intersp2006.pdf

Schulzrinne, H., Casner, S., Frederick, R. and Jacobson, V. (2003). RTP: A Transport

Protocol for Real-Time Applications, RFC 3550 (Standard).

URL: http://www.ietf.org/rfc/rfc3550.txt

Talkin, D. (1995). A robust algorithm for pitch tracking (rapt), in W. B. Kleijn and K. K.

Paliwal (eds), Speech Coding and Synthesis, Elsevier, chapter 14, pp. 495–518.



Walker, W., Lamere, P., Kwok, P., Raj, B., Singh, R., Gouvea, E., Wolf, P. and Woelfel,

J. (2004). Sphinx-4: A flexible open source framework for speech recognition,

Technical Report SMLI TR2004-0811, Sun Microsystems Inc.

Weilhammer, K. and Rabold, S. (2003). Durational aspects in turn taking, Proc. of the

ICPhS, Barcelona, Spain.

URL: http://www.phonetik.uni-muenchen.de/Publications/WeilhammerRabold-03-


Witten, I. H. and Frank, E. (2000). Data Mining: Practical Machine Learning Tools and

Techniques with Java Implementations, Morgan Kaufmann.




Chris Brumwell

University of Amsterdam

Abstract. I present an update semantics for epistemic modals in which a formula of the

form might φ acts on a context Γ by introducing a salient possibility constructed from φ

into Γ. This theory is meant to account for the intuitions and data that suggest that assertions

of epistemic modals do not provide information to the participants in a conversation, but

instead suggest certain possibilities for their consideration. Among this data is the important

empirical fact that epistemic modals can answer questions. To account for this, I also define a

semantics for questions and show that in this system epistemic modals can count as answers

to questions.

1 Introduction and Motivations

In the classic picture of communication given in Stalnaker (1978), a conversation is a

process of distinguishing between various possibilities, or ways the world might be. It is

clear, however, that in a conversation not all possibilities are given equal attention by the

interlocutors. People talking about whether or not John murdered Jack are not trying to

distinguish a possibility in which chocolate makes cats sick from a possibility in which

chocolate doesn't make cats sick. In this paper, I call the possibilities the interlocutors are

most interested in salient possibilities.

Asking a question is the canonical way of introducing salient possibilities into a discourse:

questions introduce possibilities corresponding to their different answers. But

other constructions introduce salient possibilities as well. The disjunction Jones works

at a bank or a hospital introduces the salient possibilities that Jones works at a bank and

Jones works at a hospital. Constructions containing indefinite NPs such as somebody

stole the jewels can introduce salient possibilities corresponding to various instantiations

of somebody. Free choice commands such as Take any apple you like introduce salient

possibilities corresponding to your various choices. Finally, a statement expressing epistemic

modality such as John might be hiding upstairs introduces the salient possibility

that John is hiding upstairs.

Recent work by Groenendijk (Groenendijk, 2007) proposes an analysis of disjunction

and existential quantification that captures their potential to introduce salient possibilities

into a dialogue. In this paper, I formalize the notion of a salient possibility and use it

to define a dynamic semantics for questions and epistemic modals. In the semantics,

a question introduces salient possibilities corresponding to its possible answers, and an

epistemic modal of the form might φ introduces a salient possibility constructed from φ

and, following Veltman (1996), tests the common ground to see whether it is consistent

with φ.
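To make the test component concrete, here is a toy Veltman-style update function on information states (sets of possible worlds, each world a set of true atoms). It implements only the consistency test for might, not the salient-possibility machinery this paper adds; the encoding of formulas as tuples is our own:

```python
def update(state, formula):
    """Eliminative update of an information state with a formula.
    'might' is a test: it returns the state unchanged if updating
    with the prejacent leaves some world alive, and the absurd
    (empty) state otherwise."""
    op = formula[0]
    if op == "atom":
        return {w for w in state if formula[1] in w}
    if op == "not":
        return state - update(state, formula[1])
    if op == "and":
        return update(update(state, formula[1]), formula[2])
    if op == "might":
        return state if update(state, formula[1]) else set()
    raise ValueError(op)
```

On this picture, a might-statement never eliminates worlds, which matches the intuition that epistemic modals provide no information; the paper's contribution is to let them nonetheless change the context, via salient possibilities.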

Salient possibilities are almost perfectly suited for an analysis of epistemic modals.

Unlike other kinds of assertions, an assertion of an epistemic modal does not contribute

information to a conversation. Instead, its function is to call attention to certain possibilities

that the conversational participants should, for some reason, find interesting. Thus,



to analyze epistemic modals one must develop a framework in which assertions can significantly

change a context without providing information. Since this paper's framework

postulates that epistemic modals affect the salient possibilities in a context rather than its

information, the non-informative yet non-trivial effects of epistemic modals are properly accounted for.


One advantage of this analysis is that it is able to account for the felicity of a modalized

construction as an answer to a question. For example:

(1) A: Where are my keys?

B: They might be in the basement.

(2) A: Are John and Bill coming to the party?

B: They might.

In dialogue (1), B doesn't answer A's question by saying where her keys are (because,

if he's acting felicitously, he doesn't know where they are), but by suggesting a possibility

for her to consider. Similarly in (2): B suggests that A should not overlook the possibility

that Bill and John come to the party. If she really dislikes them, the very possibility that

they attend may be reason enough for her to skip the party.

Enemies of salient possibilities may think that a modal answer to a question really says

nothing more than 'I don't know', or 'Any answer is consistent with my knowledge'. Against

this, consider the following case: suppose A is frantically looking for her husband Joe,

and comes across B, who has never met Joe and has never given him one thought. If

she asks him 'Where is Joe?' and he responds 'I don't know', this is perfectly acceptable.

However, if he responds 'He might be in Boston' this is completely infelicitous: if A takes

him seriously, she's on her way to a wild goose chase. Intuitively, this is because she

seriously takes into account his (inappropriate) suggestion to consider the possibility that

Joe is in Boston.

A classical partition theory of questions has difficulty accounting for (1) and (2). This

is the case because in a partition theory an answer to a question has to give information.

However, as dialogues (1) and (2) demonstrate, answers to questions do not need to be

informative: it suffices that they suggest informative answers. Below, I give a more detailed

and formal discussion of the problem partition theories of questions face from noninformative

answers to questions, and discuss the similarities and differences between the

theory presented in this paper and a partition theory.

This analysis also accounts for a puzzling feature of the behavior of epistemic modals

under attitude reports. Statements of the form x believes that might φ mean, in part,

that the attitude holder x considers φ to be a salient possibility. For example, suppose

that John has never given a thought to what the weather is like in Amsterdam. Then (3)

certainly seems wrong:

(3) John believes it might be raining in Amsterdam.

Using this paper's theory, one could account for (3) by analyzing a belief state as composed

of both information and salient possibilities. The content of (3) then states, roughly,

that it's consistent with John's beliefs that it's raining in Amsterdam and that this is a salient



possibility in his belief state. Several contemporary theories of epistemic modality do not

appeal to any notion similar to that of a salient possibility, and hence have no clear way

of accounting for (3) (e.g. DeRose (1991) and Egan et al. (2005); for similar reasons

these theories also have problems accounting for the question and answer data presented

above). Due to constraints on length we will not formalize this theory of the interaction

between epistemic modals and attitude reports below.

As mentioned above, the analysis is carried out in a dynamic semantic framework. In

dynamic semantics, the meaning of a formula is not identified with its truth conditions,

but rather with the way it changes a context. More specifically, our theory is a version of

update semantics in the style of Veltman (1996), i.e. we give a definition of an information

state, and the meanings of formulas are functions from information states to information states.


2 Questions and Salient Possibilities

In this section, we define a 1st-order language with a question operator and an epistemic

possibility operator. We then define the structures (information states) used to give a

semantics for this language and define the notion of a salient possibility. Finally, we give

the semantics for this language and define what it means for a formula to be an answer to

a question. This definition will allow modal and non-modal formulas to answer questions.

DEFINITION 1. We define the languages L1, L2, and L3 as follows:

(i) If P is an n-place predicate and t1...tn are terms, then P(t1...tn) ∈ L1

(ii) If φ, ψ ∈ L1, then φ ∧ ψ ∈ L1 and ¬φ ∈ L1

(iii) If φ ∈ L1, then ⋄φ ∈ L2

(iv) If φ, ψ ∈ L2, then φ ∧ ψ ∈ L2 and ¬φ ∈ L2

(v) If φ ∈ L1, then ?φ ∈ L3

(vi) If φ, ψ ∈ L3, then φ ∧ ψ ∈ L3

The language L we discuss in this paper is defined as L = L1 ∪ L2 ∪ L3. As a notational

convention, we write atomic sentences (i.e. atomic formulas with no free variables) and

Boolean combinations of atomic sentences as p, q, ¬q, p ∧ q, etc.

In a standard update semantics, information states are sets of indices, where an index

assigns an individual from a domain D to each constant of the language and an n-ary

relation to each n-place predicate. In this paper's framework, an information state is a set

of sets of indices A such that there is an I* ∈ A such that for all Im ∈ A, Im ⊆ I*. The intuition

behind this definition is that this maximal set I* represents the common ground at a point

in a conversation. Any subset of I* is a possible future state of the common ground, and

hence could be a possibility that the discourse participants are interested in. However,

recalling the introduction, not all such subsets are of interest to the discourse

participants. With that in mind we think of the subsets I m of I* as salient possibilities. We

formally define information states below:

DEFINITION 2. Let I be the set of all indices for the language L. We define an information

state to be a set Γ = {P1,...,Pn,...} such that:

(i) Pi ⊆ I for all i (ii) For some i, Pi = ∅

(iii) There is an i such that for all j, Pj ⊆ Pi. This maximal set Pi is called the common ground.

We write CG (common ground) for the maximal set Pi defined in (iii), and write

Γ = {CG, P1,...,Pn,...,∅}. In some cases, we refer to information states as contexts.


Though every element of an information state is a salient possibility (except the empty

set, which is present to simplify the definition of an answer to a question), the sets in an

information state do not exhaust its salient possibilities. Rather, the salient possibilities

in an information state are generated by closing it under union and intersection. Salient

possibilities are defined this way because, intuitively, if P 1 and P 2 are salient possibilities

in a context, then if they are not mutually exclusive it is also possible that they both obtain.

Thus, their intersection should count as a salient possibility as well. Similar reasoning

supports considering the union of salient possibilities to be a salient possibility.

DEFINITION 3. Let Γ be an information state. Then 〈Γ〉, the set of salient possibilities in

Γ, is defined as the smallest set such that:

(i) If P ∈ Γ, then P ∈ 〈Γ〉 (ii) If P1, P2 ∈ 〈Γ〉, then P1 ∪ P2 ∈ 〈Γ〉

(iii) If P1, P2 ∈ 〈Γ〉, then P1 ∩ P2 ∈ 〈Γ〉.
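The closure 〈Γ〉 of Definition 3 can be computed by brute-force iteration. The sketch below is illustrative, not from the paper: possibilities are encoded as Python frozensets of index names, and a toy state with common ground I and the two possibilities {p} and {q} is closed under pairwise union and intersection.

```python
def salient_closure(gamma):
    """<Gamma>: close a set of possibilities (frozensets of indices)
    under pairwise union and intersection, per Definition 3."""
    closure = set(gamma)
    changed = True
    while changed:
        changed = False
        for p in list(closure):
            for q in list(closure):
                for r in (p | q, p & q):
                    if r not in closure:
                        closure.add(r)
                        changed = True
    return closure

# Toy state: four indices named by which of p, q they verify.
I = frozenset({"pq", "p-", "-q", "--"})
P, Q = frozenset({"pq", "p-"}), frozenset({"pq", "-q"})
sp = salient_closure({I, P, Q, frozenset()})
# P u Q and P n Q now count as salient too, so the closure has 6 elements.
assert (P | Q) in sp and (P & Q) in sp and len(sp) == 6
```

Since closing under union and intersection can only add subsets of the finite set I, the loop is guaranteed to terminate.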

We need one more concept in order to define the semantics of wh-questions. On our

analysis, wh-questions introduce salient possibilities corresponding to each of their possible

answers into an information state. To represent the possible answers to a wh-question,

we use the relations defined in definition 5 (Definition 4 is a standard account of satisfaction,

which is necessary for articulating definition 5):

DEFINITION 4. Let φ, ψ ∈ L1, let i be an index, and let g be a variable assignment function.

(i) If φ = Qt1...tn, then i |= φ[g] iff 〈[t1]i,g,...,[tn]i,g〉 ∈ i(Q)

(ii) i |= φ ∧ ψ[g] iff i |= φ[g] and i |= ψ[g] (iii) i |= ¬φ[g] iff i ⊭ φ[g]

DEFINITION 5. Let φ ∈ L1, and let i and j be indices. We say that i ≡ j (mod φ) if for all

assignments g, i |= φ[g] iff j |= φ[g].

Given a formula φ, definition 5 defines the conditions under which two indices give

the same answer to the question ?φ. For a sentence φ of L 1 , i ≡ j (mod φ) will hold as

long as i and j assign φ the same truth value. But for a formula of L 1 with free variables,

congruence modulo φ requires that the indices assign the same denotations (or just similar

denotations if the formula contains both free variables and constants) to predicates that

occur in φ. The following examples illustrate how this definition works.



(i) Let ?φ = ?Px (Who came to the party?). i ≡ j (mod φ) if i(P) = j(P), or informally, if the

same people came to the party according to indices i and j.

(ii) Let ?φ = ?Ibx (Who did Bill invite to the party?). i ≡ j (mod φ) if

{d ∈ D | 〈d, i(b)〉 ∈ i(I)} = {d ∈ D | 〈d, j(b)〉 ∈ j(I)}.

(iii) Let ?φ = ?p (Did Alice help Bill?). i ≡ j (mod φ) if i |= p iff j |= p.

In our update semantics, the effect of a formula on an information state will be defined

in terms of the effects it has on certain elements of the information state. Thus, to state

our update semantics for information states we require an update semantics for sets of

indices as well. The update semantics for sets of indices is fairly simple, and is roughly the

same as that given in Veltman (1996).

DEFINITION 6. Let φ ∈ L1 ∪ L2 be a sentence, and let P be a set of indices. We define the

update of P with φ, written P[φ], as follows:

(i) P[p] = {i ∈ P | i |= p}

(ii) P[φ ∧ ψ] = P[φ][ψ]

(iii) P[¬φ] = {i ∈ P | i ∉ P[φ]}

(iv) P[⋄φ] = P if P[φ] ≠ ∅

(v) P[⋄φ] = ∅ if P[φ] = ∅


We now state our update semantics for information states.

DEFINITION 7. Let Γ = {CG, P1,...,Pn, ∅} be an information state, and let φ ∈ L be a

sentence. We define the update of Γ with φ as follows:

(i) Γ[p] = {CG[p], P1[p],...,Pn[p], ∅}

(ii) Γ[¬φ] = {CG[¬φ], P1[¬φ],...,Pn[¬φ], ∅}

(iii) Γ[φ ∧ ψ] = Γ[φ][ψ]

(iv) Γ[⋄φ] = {CG[⋄φ], P1[⋄φ],...,Pn[⋄φ], ∅} if there is a P ∈ Γ such that P[φ] = P

(v) Γ[⋄φ] = {CG[⋄φ], CG[φ], P1,...,Pn, ∅} if there is no P ∈ Γ such that P[φ] = P

(vi) Γ[?φ] = Γ ∪ {{i | i ≡ j (mod φ)} | j ∈ CG}.

Clauses (i)-(vi) apply as long as CG[φ] ≠ ∅. In the degenerate case that CG[φ] = ∅, we set

Γ[φ] = {∅}, the absurd state.
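A minimal executable sketch of a propositional fragment of Definitions 6 and 7 may make the clauses concrete. The encoding is illustrative, not from the paper: an index is a frozenset of the atoms it verifies, formulas are nested tuples, and clause (v) is read as applying when no nonempty P ∈ Γ satisfies P[φ] = P (the empty set satisfies that condition trivially). The sketch reproduces the computation of Example 5: after ?p, updating with ⋄p and then ⋄¬p returns the original state.

```python
# Formulas: ("atom", "p"), ("not", f), ("and", f, g), ("might", f),
# and ("question", f) for a polar question ?f.

def upd(P, f):
    """P[f]: update a set of indices with a sentence (Definition 6)."""
    if f[0] == "atom":
        return frozenset(i for i in P if f[1] in i)
    if f[0] == "and":
        return upd(upd(P, f[1]), f[2])
    if f[0] == "not":
        return frozenset(i for i in P if i not in upd(P, f[1]))
    if f[0] == "might":                    # Veltman-style consistency test
        return P if upd(P, f[1]) else frozenset()
    raise ValueError(f)

def update_state(gamma, f):
    """Gamma[f]: update an information state (Definition 7; polar questions only)."""
    cg = max(gamma, key=len)               # CG is the maximal possibility
    if f[0] == "question":                 # clause (vi), polar case
        return gamma | {upd(cg, f[1]), upd(cg, ("not", f[1]))}
    if not upd(cg, f):                     # degenerate case: CG[f] is empty
        return {frozenset()}               # the absurd state
    if f[0] == "might":
        if any(p and upd(p, f[1]) == p for p in gamma):
            return {upd(p, f) for p in gamma}     # clause (iv)
        return gamma | {upd(cg, f[1])}            # clause (v); CG is already in gamma
    if f[0] == "and":
        return update_state(update_state(gamma, f[1]), f[2])
    return {upd(p, f) for p in gamma}      # clauses (i)-(ii)

# Example 5: ?p, then might-p, then might-not-p, ends where ?p alone ends.
ip, i0 = frozenset({"p"}), frozenset()     # the two indices for one atom
g1 = update_state({frozenset({ip, i0}), frozenset()}, ("question", ("atom", "p")))
g2 = update_state(update_state(g1, ("might", ("atom", "p"))),
                  ("might", ("not", ("atom", "p"))))
assert g2 == g1
```

Under clause (iv) the eliminated possibilities all collapse into the single ∅ element of the state, which is why {¬p} simply disappears after ⋄p and reappears via clause (v) after ⋄¬p.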

In the semantics defined above, although epistemic modals can change information

states they cannot have a non-trivial effect on the common ground. This is as it should

be: only constructions that provide information should change the common ground, and

epistemic modals do not play that role in a dialogue. Thus, this semantics complies with

the requirement set forward in the introduction: epistemic modals change a context in a

significant yet non-informative manner.

More specifically, epistemic modals change a context by drawing attention to certain

possibilities. However, the manner in which an epistemic modal accomplishes this depends

on the possibilities that are already salient in the dialogue’s context. If the possibility

an epistemic modal calls attention to is not under discussion at all, then the epistemic

modal adds this possibility to the set of salient possibilities in the context, acting in the

manner specified in clause (v) (see example 4 below). But if this possibility is already

under consideration, an epistemic modal draws attention to it by eliminating salient possibilities

that are inconsistent with it from the context. In this latter case, epistemic modals

act in the manner specified in clause (iv) (see example 2 - 3 below).

An epistemic modal acts in accordance with clause (iv) when it functions as an answer

to a question. Questions introduce several salient possibilities in a context, and an

epistemic modal acts to draw attention to some answers rather than others. But epistemic

modals aren't always used to answer questions. For example, they can be used to provide

someone with a warning:

(4) A: Alice and I are going fishing in Leiden tomorrow.

B: It might be illegal to fish in Leiden.

A: Oh, I hadn’t thought to check that; thanks.

B draws A’s attention to the possibility that fishing is illegal in Leiden, a possibility that

A had overlooked but should investigate. Here, it is essential that B’s utterance contributes

a new salient possibility to the context.

Using this framework, we now define the conditions under which a formula φ answers

a question ψ. Note that this definition admits full and partial answers.



DEFINITION 8. Let φ ∈ L, and let ψ ∈ L3. We say that φ answers ψ if 〈{I, ∅}[ψ][φ]〉 ⊂ 〈{I, ∅}[ψ]〉.



Thus, φ answers ψ if φ removes some salient possibilities that ψ introduces. This

notion of answerhood should be familiar from a partition theory of questions: in both

cases, answering a question amounts to eliminating some of the possibilities it introduces.

But an important, unique feature of this definition is that an answer doesn't necessarily give

information: it suffices that an answer suggest certain possibilities for the questioner to consider.


We close this section by working through a few examples. We use the following notational

conventions: {p} = {i ∈ I | i |= p}, {¬p} = {i ∈ I | i ⊭ p}, etc.

Example 2: A Polar Question.

Recall example (2), and let p and q be the propositions ‘Bill is coming to the party’ and ‘John

is coming to the party’ respectively. Let Γ = {I, ∅}; then ⋄p ∧ ⋄q answers ?p ∧ ?q: Γ[?p ∧

?q] = {I, {p}, {¬p}, {q}, {¬q}, ∅} = Γ1. Then: Γ1[⋄p ∧ ⋄q] = {I, {p}, {q}, ∅} = Γ2, and

since 〈Γ2〉 ⊂ 〈Γ1〉, ⋄p ∧ ⋄q answers ?p ∧ ?q.

Example 3: A Wh-Question.

Consider the question 'Who likes to paint?', and note that 'Bill might like to paint' felicitously

answers this question. Let Px be 'x likes to paint', and let b be Bill. Let Γ = {I, ∅}.

Then: Γ[?Px] = Γ ∪ {{i | i ≡ j (mod Px)} | j ∈ I}

= Γ ∪ {{i | i(P) = D*} | D* ⊆ D} = Γ1. Then

Γ1[⋄Pb] = Γ ∪ {{i | i(P) = D*}[⋄Pb] | D* ⊆ D}

= Γ ∪ {{i | i(b) ∈ i(P) and i(P) = D*} | D* ⊆ D such that i(b) ∈ D*}

Since Γ1[⋄Pb] ⊂ Γ1, ⋄Pb is an answer to ?Px.

Examples 3 and 4 bring out an important feature of this paper’s framework: epistemic

modals behave much like questions. Both questions and epistemic modals draw attention

to certain possibilities without committing the speaker to a position on whether or not

these possibilities are actual. Epistemic modals, however, are stronger than questions:

modals draw attention to fewer possibilities than questions, suggesting that the chosen

possibilities are somehow more important than the ignored possibilities. The notion of a

salient possibility allows us to represent this similarity between questions and epistemic

modals in a fully formal way.

Example 4: Raising Issues Without Questions.

Recall (4), and let p and q be ‘Alice and A are going fishing in Leiden tomorrow’ and ‘It’s

illegal to fish in Leiden’ respectively. Let Γ = {I, ∅}. Then

Γ[p][⋄q] = {{p}, {p ∧ q}, ∅}. Here, since no possibility in Γ[p] satisfied q, the epistemic

modal acted to add the possibility {p ∧ q} to the context. Thus, even though no questions

have been asked in this context, B is able to bring A’s attention to some issue by using an

epistemic modal.

Example 5: Infelicitous Answer.

Responding to a polar question ?φ with ⋄φ ∧ ⋄¬φ should not count as answering the question:

rather, responding to a question with ‘maybe, maybe not’ is a deliberate and almost

reticent refusal to answer the question. Our semantics allows us to account for this: {I,

∅}[?p][⋄p ∧ ⋄¬p] = {I, {p}, {¬p}, ∅}[⋄p ∧ ⋄¬p]

= {I, {p}, {¬p}, ∅}[⋄p][⋄¬p] = {I, {p}, ∅}[⋄¬p] = {I, {p}, {¬p}, ∅}. Thus,

⋄p ∧ ⋄¬p does not answer ?p. Moreover, ⋄p ∧ ⋄¬p is actually equivalent to ?p in this information state.


In general, ?φ and ⋄φ ∧ ⋄¬φ are equivalent in any information state that is consistent with

both φ and ¬φ, so polar questions can almost be defined using epistemic modals (if we assume

that polar questions presuppose that both of their answers are possible, polar questions

can be defined in terms of the epistemic modality operator).


3 Comparison With a Partition Semantics of Questions

In this section, we will slightly change our semantics to yield a partition theory of questions,¹

and examine the difficulties it faces. These difficulties will bring to light problems

that any partition theory of questions faces in accounting for non-informative answers to

questions, and point to an important feature of the theory above that allows it to account

for non-informative answers. For ease of exposition, we only consider polar questions: in

this section, suppose that we only allow atomic sentences to be well-formed elements of

L1.

Using our terminology, in a partition theory of questions a question divides the common

ground into the salient possibilities corresponding to its different answers. Crucially,

salient possibilities are not added to the context as they were in section 2. Thus, to state

a partition theory of questions in our framework we have to alter the definition of an

information state: we no longer assume an information state contains a maximal set of

indices, and for purposes of this section we remove clause (ii) from the definition of an

information state.

Since information states no longer contain a common ground, clause (v) in the update

semantics for information states is difficult to translate to this new system. For purposes

of this section, then, we also remove clause (v) from this definition, and stipulate that

epistemic modals always change an information state according to clause (iv).

Our partition theory of questions results from changing definition 8 and clause (vi) in

definition 7 to the following.



(i) Let Γ = {P1,...,Pn} be an information state, and let ?φ ∈ L. Then we define

Γ[?φ] = {P1[φ], P1[¬φ],...,Pn[φ], Pn[¬φ]}

(ii) Let φ ∈ L and let ψ ∈ L3. We say that φ answers ψ if {I}[ψ][φ] ⊂ {I}[ψ].
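The problematic prediction discussed below, that Γ[?p][⋄p] = Γ[?p][p], can be checked mechanically. The standalone sketch below is illustrative, not from the paper: indices are frozensets of true atoms, and the partition-style state update splits every block on a question and otherwise updates every block pointwise.

```python
def upd(P, f):
    """P[f] per Definition 6, propositional fragment."""
    if f[0] == "atom":
        return frozenset(i for i in P if f[1] in i)
    if f[0] == "not":
        return frozenset(i for i in P if i not in upd(P, f[1]))
    if f[0] == "might":                  # consistency test
        return P if upd(P, f[1]) else frozenset()
    raise ValueError(f)

def part_update(gamma, f):
    """Partition-style state update: a question splits every block;
    any other formula updates every block pointwise."""
    if f[0] == "question":
        return {upd(p, g) for p in gamma for g in (f[1], ("not", f[1]))}
    return {upd(p, f) for p in gamma}

ip, i0 = frozenset({"p"}), frozenset()   # the two indices for one atom p
gq = part_update({frozenset({ip, i0})}, ("question", ("atom", "p")))
# The modal eliminates the not-p block exactly as p itself would:
assert part_update(gq, ("might", ("atom", "p"))) == part_update(gq, ("atom", "p"))
```

Both updates yield the same state, so under this semantics a hearer could not distinguish the answer ⋄p from the answer p.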

An immediate problem with this theory is that modal formulas can eliminate blocks of

a partition. This is the case because after a question ?p, ⋄p will eliminate any possibility

that was just updated with ¬p. While this is good in so far as under this theory modal formulas

can answer questions, it has other disastrous consequences. Since modal formulas

can eliminate blocks of a partition, they can provide as much information as non-modal

formulas: for any information state Γ, Γ[?p][⋄p] = Γ[?p][p]. This is the case because both

p and ⋄p will eliminate the possibilities from Γ that have been updated with ¬p and have

no effect on the possibilities that have been updated with p. This is a bad result: Γ[?p][⋄p]

[¬p] should be consistent, but Γ[?p][p][¬p] shouldn't be. While modals and non-modals

should both count as answers to questions, they should not answer questions in the same way.


On a more general level, the problem with the partition semantics is that any update

has to provide information or add possibilities, and possibilities can only be removed by

information. This leads to trouble with epistemic modals: if one lets an epistemic modal

answer a question, it must provide information and hence function far too much like a

non-modal. But, on the other hand, if one posits that an epistemic modal doesn't provide

information, there is no way to say how it could change an information state in a way that

answers a question.

¹For purposes of this paper, a partition semantics for questions is a semantics that holds: (i) a question

changes a context by partitioning an information state, and (ii) to answer a question is to remove blocks

from this partition. The partition semantics given in Groenendijk (1999) is similar to the one we present in

this section.

In the framework presented above this problem is dealt with by separating the common

ground, and hence the information, from the salient possibilities. This change makes non-informative

answers to questions possible: epistemic modals can eliminate possibilities

without changing the information in the common ground. However, by connecting the

meaning of a question to its possible answers in a context, and by identifying answers to

questions with the elimination of possibilities, this approach retains much of the spirit of

the partition theory of questions.

4 Further Issues and Expansions of the System

In this section, I will discuss some expansions of the system defined above and consider

two objections to it.

First, I will discuss the objections. Though the idea that epistemic modals can answer

wh-questions or other complex questions by suggesting possible answers is quite natural,

some readers may find the suggestion that epistemic modals answer polar questions by

suggesting possible answers a bit odd. After all, someone asking a polar question clearly

has both possibilities in mind, so how can simply making one of them more salient in the

context count as felicitously answering her question?

Dealing with this objection involves delving into the pragmatics of epistemic modals,

and more specifically the pragmatic role that salient possibilities play in a context. This

topic would take a great deal of space to treat, and is beyond the scope of this paper. But

to respond to the objection we note that one very plausible pragmatic principle governing

the use of epistemic modals is that, in general, one should only focus attention on some

possibility if one has some reason to believe that it is the case. To see this, note how

infelicitous dialogue (5) sounds:

(5) A: Are John and Bill coming to the party?

B: They might.

A: Why do you say that?

B: I don't know; they just might.

Thus, pragmatically, answering a polar question with an epistemic modal can commit the

speaker to having some reason to believe that the possibility made salient by her answer

actually obtains. This pragmatic dimension of epistemic modals makes it clear how a

speaker can answer a polar question simply by making one of the possible answers rather

than the other salient in the context.

Another objection to this framework questions the idea that, given the informal description

of salient possibilities in the introduction, it makes sense to say that epistemic

modals actually eliminate salient possibilities that questions introduce. After all, if a question

is answered by an epistemic modal, its possible answers that are inconsistent with the

epistemic modal aren’t completely forgotten about. But in the formal system, these possibilities

have the same status as many other possibilities that the interlocutors haven’t

given any thought to. Thus, this objection concludes, holding that epistemic modals actually

eliminate salient possibilities from a context is far too strong.


We take this objection seriously, and admit that the definition of salient possibilities

given above is too coarse. A better definition would make salience into a scalar notion.

With a scalar notion of salience, we could say that the salient possibilities eliminated by

an epistemic modal acting as an answer to a question are less salient than those still in

the context, but more salient than many other subsets of the common ground. A potential

candidate for this scale is defined below:



Let Γ = {CG, P1,...,Pn, ∅} be an information state, and let P ⊆ CG.

(i) P is 1-salient if P ∈ 〈Γ〉 and CG − P ∉ 〈Γ〉

(ii) P is 2-salient if P ∈ 〈Γ〉 and CG − P ∈ 〈Γ〉

(iii) P is 3-salient if P ∉ 〈Γ〉 and CG − P ∈ 〈Γ〉

(iv) P is 4-salient if P ∉ 〈Γ〉 and CG − P ∉ 〈Γ〉
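The scale can be prototyped directly. The sketch below is illustrative, not from the paper: it reads the difference in the clauses above as CG − P (the difference P − CG would be empty for every P ⊆ CG), encodes possibilities as frozensets of index names, and recomputes 〈Γ〉 by brute force.

```python
def closure(gamma):
    """<Gamma>: close under pairwise union and intersection (Definition 3)."""
    c = set(gamma)
    changed = True
    while changed:
        changed = False
        for p in list(c):
            for q in list(c):
                for r in (p | q, p & q):
                    if r not in c:
                        c.add(r)
                        changed = True
    return c

def salience_degree(P, gamma):
    """Degree of salience of P in Gamma: 1 (most salient) to 4 (least)."""
    sp = closure(gamma)
    cg = max(gamma, key=len)             # CG is the maximal possibility
    return {(True, False): 1, (True, True): 2,
            (False, True): 3, (False, False): 4}[(P in sp, (cg - P) in sp)]

# After ?p both answers are 2-salient; after a subsequent might-p the
# surviving answer is 1-salient and the eliminated one drops to 3-salient.
I = frozenset({"i_p", "i_notp"})
P, N = frozenset({"i_p"}), frozenset({"i_notp"})
g_question = {I, P, N, frozenset()}      # state after ?p
g_modal = {I, P, frozenset()}            # state after ?p followed by might-p
assert salience_degree(P, g_question) == 2
assert salience_degree(P, g_modal) == 1 and salience_degree(N, g_modal) == 3
```

The final assertions mirror the claim in the text: answering a question with an epistemic modal moves the possible answers from degree 2 to degrees 1 and 3, rather than forgetting any of them.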

Here, 1-salient propositions are most salient, and 4-salient propositions are least salient.

In general, after an epistemic modal answers a question it changes its possible answers

from 2-salient propositions to either 1-salient propositions or 3-salient propositions, thus

making possible answers either more or less salient and not rendering any forgotten. Thus,

replacing an absolute notion of salience with a scalar one solves the problem raised by the objection.


In this paper's semantics, epistemic modals can only focus attention on possibilities that

are subsets of the common ground. This is problematic because some uses of epistemic

modals make possibilities that lie outside of the common ground salient in a conversation.

(6) A: There aren't any deer in this part of the forest.

B: (2 hours later) Look over there! Hoofprints! There might be deer after all.

These modal assertions also challenge previously accepted information without directly

contradicting it. To account for this use of epistemic modals, one could posit that if ⋄φ

is inconsistent with the common ground of an information state, then ⋄φ acts on this

information state by: (i) introducing a salient possibility corresponding to the revision of

CG with φ, (ii) transforming the information state's common ground into the union of this

revision and the old common ground, and (iii) performing a similar operation on the other

possibilities in the information state. Thus, though the paper's theory itself cannot account

for uses of epistemic modals like (6), augmented with a theory of belief revision it can

provide an elegant analysis.

I would like to thank Paul Dekker.



DeRose, K. (1991). Epistemic possibilities, Philosophical Review 100: 581–605.

Egan, A., Hawthorne, J. and Weatherson, B. (2005). Epistemic modals in context, in

G. Preyer and G. Peter (eds), Contextualism in Philosophy, Oxford University Press,

Oxford, pp. 131–170.



Groenendijk, J. (1999). The logic of interrogation, in T. Matthews and D. Strolovitch

(eds), The Proceedings of the Ninth Conference on Semantics and Linguistic Theory,

CLC Publications, Ithaca, NY, pp. 109–126.

Groenendijk, J. (2007). Inquisitive semantics: Two possibilities for disjunction.

Groenendijk, J. and Stokhof, M. (1997). Questions, in J. van Benthem and A. T. Meulen

(eds), Handbook of Logic and Language, Elsevier.

Stalnaker, R. (1978). Assertion, Syntax and Semantics 9.

Veltman, F. (1996). Defaults in update semantics, Journal of Philosophical Logic 25(3).




Bert Le Bruyn

Utrecht University

Abstract. This paper treats the distinction between singular nominal predication with and

without indefinite article in languages like Dutch. The former variant is referred to as non-bare

predication, the latter as bare predication. I make the following claims: (i) temporal

analyses of the distinction between bare and non-bare predication are on the wrong track,

(ii) bare predication needn’t be analyzed as a lexical phenomenon, (iii) non-bare predication

should be analyzed as kind-membership predication.

1 Introduction

In order to understand the role played by the indefinite article in predicate position it is instructive

to look at instances of singular nominal predication in which the indefinite article

does not appear. These instances are subsumed under the notion of bare predication (see

(Kupferman, 1991), (Broekhuis, Keizer and Den Dikken, 2003), (de Swart, Winter and

Zwarts, 2005), (de Swart, Winter and Zwarts, 2007), (Matushansky and Spector, 2005),

(Déprez, 2005), (Munn and Schmitt, 2005), (Roy, 2006), (Beyssade and Dobrovie-Sorin,

2005)). In English bare predication is marginal but a language like Dutch seems to have

a productive paradigm:

(1) (a) Jan is slager. (litt. John is butcher) (b) Jan is moslim. (litt. John is muslim)

(c) Jan is Belg. (litt. John is Belgian) (d) Jan is hertog. (litt. John is duke)

Nouns that typically occur in bare predication are linked to professions (1a), religions

(1b), nationalities (1c) and titles (1d). It is important to note that this is not an idiosyncrasy

of Dutch but a pervasive phenomenon in Romance and Germanic languages (examples

taken from (de Swart et al., 2007)):

(2) Es negrero. (Spanish, litt. Is trader in black slaves); João é médico. (Portuguese,

litt. John is doctor); Gianni è dottore. (Italian, litt. John is doctor); Jean est

médecin. (French, litt. John is doctor); Olivier var skuespiller. (Danish, litt. Oliver

was actor); Herr Weber är katolik. (Swedish, litt. Mr Weber is catholic); Han er

lærer. (Norwegian, litt. He is teacher); Er ist praktizierender Katholik. (German,

litt. He is practicing catholic).

∗ This paper should be read as a working paper that presents thoughts and bits of analysis that are not

finished yet. I’m very grateful to audiences at ConSOLE XVI, my UiL-OTS kermit-lecture and the LSB

2008 Linguists’ Day and to the reviewers of the ESSLLI student session for very useful comments and

discussion. Special thanks also to Min Que, Gianluca Giorgolo, Dorota Klimek, Sander Lestrade, Joost

Zwarts and Henriëtte de Swart.



In this paper I will defend three claims concerning bare predication. The first is that analyses

that reduce the distinction between bare and non-bare predication to a temporal one

are not on the right track (see paragraph 2). The second is that a purely lexical approach to

bare predication is not tenable (see paragraph 3). The third and final one is that non-bare

predication should be analyzed as kind-membership predication (see paragraph 4).

2 Bare predication and time

When comparing sentences (3a) and (3b) most informants tend to say that the a-variant is

more ’eventive’ than the b-variant (Roy, 2006):

(3) (a) Paul est acteur. (French, litt. Paul is actor)

(b) Paul est un acteur. (French, litt. Paul is an actor)

This intuition has led linguists to explore a temporal analysis of bare predication. In its

simplest form it would state that bare predication is concerned with transient properties

whereas non-bare predication is concerned with permanent ones. The most convincing

argument in favour of this analysis comes from ’lifetime effects’:

(4) (a) Paul était médecin. (French, litt. Paul was doctor)

(b) Paul était un médecin. (French, litt. Paul was a doctor)

Sentence (4a) can be understood as stating that Paul used to be a doctor and that he’s

retired now. Sentence (4b) can only mean that Paul is dead. Under the assumption that

non-bare predication is concerned with permanent properties the interpretation of sentence

(4b) follows: to cancel a permanent property one has to cancel the existence of the

entity the property applies to. The problem this analysis faces is that it predicts that inherently

transient properties should always occur bare in predicate position. This prediction

is not borne out (cf. (de Swart et al., 2007)):

(5) ?? Marie est fille. (French, litt. Mary is girl)

Another temporal approach to bare predication is the one presented in (Roy, 2006) (variants

are (Munn and Schmitt, 2005) and (Déprez, 2005)). Roy assumes all nouns come

with an event argument that has to be bound. When bound by the indefinite article it is

signalled that the predication holds for the maximal event around the ’time of utterance’

(given by the Tense on the copula) and that this event cannot be split up into smaller intervals.

When bound by Tense it is signalled that the maximal event can be split up. The

facts that led to this analysis are presented in (6) and (7):

(6) (a) Jean est professeur le jour, danseur la nuit.

(French, litt. John is teacher by day, dancer by night)

(b) ?? Jean est un professeur le jour, un danseur la nuit.

(French, litt. John is a teacher by day, a dancer by night)

(7) (a) Paul est devenu chanteur.

(French, litt. Paul has become singer)

(b) ?? Paul est devenu un chanteur.

(French, litt. Paul has become a singer)


The reason why the b-variants are out on Roy’s analysis is that adverbials like le jour

... la nuit (’by day ... by night’) and verbs like devenir (’become’) split up the ’time of

utterance’. This is depicted for the adverbials in (8) and for the verb in (9).




It is important to note that in the absence of temporal adverbials or verbs like devenir there is

no clear reason in Roy’s analysis to prefer bare over non-bare predication or vice versa.

In order to account for preferences like in (5), Roy has to assume that whenever world

knowledge makes it implausible / impossible that the maximal event is split up, the indefinite

article is obligatory, and that whenever world knowledge makes it plausible / possible

that the maximal event is split up, the indefinite article ceases to be obligatory.

The problem Roy’s analysis faces is that the incompatibility of non-bare predication with

temporal adverbials or verbs like devenir is only a strong tendency that surfaces as an

epiphenomenon. To show this it is necessary to anticipate the analysis presented in paragraph

4. There it is claimed that non-bare predication signals kind-membership. A sentence

like (10) e.g. would mean that White Fang belongs to the kind wolf.

(10) White Fang is een wolf.

(Dutch, litt. White Fang is a wolf)

What makes kind-membership special is that in general one cannot change from one kind

into another. White Fang e.g. cannot turn into a sheep or a wild boar. This explains why

non-bare predication in general is incompatible with temporal adverbials or verbs like devenir.

There are however instances of transformations in nature and in folklore: e.g. the

transformation from a caterpillar into a butterfly and from a man into a werewolf. The former

can be described in a sentence with the verb devenir and the latter in a sentence with

temporal adverbials. Roy’s analysis predicts that in these sentences non-bare predication

is not allowed. An analysis that takes non-bare predication to signal kind-membership

predicts the opposite. As shown by the acceptability of (11) and (12) it is the latter that

makes the right prediction.

(11) In Lady Hawke is Rutger Hauer ’s nachts een wolf en overdag een mens.

(Dutch, litt. In Lady Hawke is Rutger Hauer by night a wolf and by day a man)

(12) La chenille est devenue un papillon.

(French, litt. The caterpillar has become a butterfly)



From the preceding I conclude that the existing analyses that try to reduce the distinction

between bare and non-bare predication to a temporal one are not on the right track. It was

important to establish this given that most existing analyses are cast in temporal terms

whereas the one I will defend in paragraph 4 is not.

3 Bare predication and the lexicon

In the literature on bare predication one of the following positions is often taken: (i)

nouns that usually appear in non-bare predication are marked in the lexicon (see e.g.

(Matushansky and Spector, 2005)); (ii) nouns that usually appear in bare predication are

marked in the lexicon (see e.g. (de Swart et al., 2005), (de Swart et al., 2007)). In this

section it will be argued that purely lexical standpoints like (i) and (ii) should be amended.

In order to do so it will be shown that :

(a) all nouns that usually appear in bare predication can appear in non-bare predication;

(b) all nouns that usually appear in non-bare predication can appear in bare predication.

It should be noted that (a) and (b) don’t constitute decisive arguments against lexical

analyses. They do however make them less appealing.

3.1 Bare predication nouns

As stated in paragraph 1 there is a subclass of nouns that usually appear in bare predication.

They include nouns related to professions, religions, nationalities and titles. It is

however well-known that these nouns appear fairly frequently in non-bare predication too

(see e.g. (de Swart et al., 2005), (de Swart et al., 2007)). When they do they allow for their normal interpretation and an enriched one. This will be illustrated on the basis of (13).


(13) (a) Sil is beenhouwer. (Dutch, litt. Sil is butcher)

(b) Sil is een beenhouwer. (Dutch, litt. Sil is a butcher)

The a-variant is the unmarked one and simply states that Sil works as a butcher. The b-

variant has the same interpretation but also allows the interpretation according to which

Sil is not a butcher but has the characteristics we usually associate with butchers. A

typical person the b-variant would apply to is a violent boxer. The enriched interpretation

projects the (stereotypical) characteristics that are associated with a profession on

an individual. From a lexical standpoint one could see the enriched interpretation as an

instance of coercion. Note though that if we store in our world knowledge that butcher is

a profession, the same coercion effect can arise.

3.2 Non-bare predication nouns

The majority of nouns in languages like Dutch usually appear in non-bare predication. To date these nouns have been defined negatively: they are those that are not related to professions, religions, nationalities and titles.

In the literature there are two claims about nouns appearing in bare predication. The

first is that they are usually [+human] (cf. (Matushansky and Spector, 2005) and (Roy, 2006)). The second is that nouns referring to kinds (which would be a subset of non-bare predication nouns) can never appear in bare predication (cf. (Kupferman, 1991) and (Roy, 2006)). In order to argue that all non-bare predication nouns can in principle appear

in bare predication the strongest claim would therefore be to say that even [-human] and

[+kind] nouns can appear in bare predication. This is the claim I defend here.

A noun that meets both the [-human] and the [+kind] criterion is wolf. An example of

wolf in non-bare predication was given in (10). Its bare variant would look as follows:

(14) Ik ben wolf. (Dutch, litt. I am wolf)

Even though (14) might seem ungrammatical at first sight it is acceptable in Dutch under

a very specific interpretation, viz. the one in which wolf is a role in a game (e.g. the

werewolf game). This should not come as a surprise given that it is often claimed that

bare predication nouns refer to roles in society:

“[Bare predication nouns] usually [...] denote specific roles in society: professions, religions or nationalities. Other nominals (non-human or human) that are not related to such roles generally resist taking up a bare nominal position.” (de Swart et al., 2007)

Under the assumption that any noun can be reinterpreted as referring to a role in a game

there is no reason to expect a principled limit on nouns appearing in bare predication.

Note that the reinterpretation referred to can be seen as a coercion mechanism from a lexical

standpoint. Once again it is not obvious though that we couldn’t get the same effect

through world knowledge.

3.3 Conclusion


In 3.1. and 3.2. it was argued that any noun can appear in both bare predication and

non-bare predication. As noted before these facts cannot be seen as decisive arguments

against a lexical approach. They do however make lexical approaches less appealing and

clear the road for non-lexical analyses like the one that will be presented in paragraph 4.

4 Bare predication and kinds

In this paragraph I will introduce the basic ingredients for an analysis in which non-bare

predication is seen as kind-membership predication. The basic claim is that a sentence

involving non-bare predication should be interpreted as ’X belongs to the kind Y’. The

paragraph is organized as follows. I first present my background assumptions about kinds

and articles (4.1. and 4.2.). Afterwards I present a pragmatic analysis of the contrast between

bare and non-bare predication (4.3). I close the paragraph by defending the claim that

there is a one-to-one correspondence between non-bare predication and kind-membership

predication (4.4).

4.1 Background on kinds

I follow Chierchia (1998) in his intuition that kinds are regularities that occur in nature.

This translates into two constraints on kinds and their instantiations. The first (see (15))

captures the intuition that for something to be regular it should be hypothesized that there


could be more than one. Note though that for K to qualify as a kind in w₀ it is not necessary for there to be more than one or even a single instantiation of K in w₀ (this makes it possible to talk about unicorns, dodos and new inventions as kinds).

(15) For K to be a kind in w₀ there has to be at least one world in which K has more than one instantiation.

The second constraint (see (16)) captures the intuition that the instantiations of kinds

behave in a regular way, i.e. that their kind-membership is not accidental. Note though that it does not prohibit kinds from displaying properties that vary over time, nor individuals from starting or stopping to be instantiations of a kind (this is left to world knowledge).

(16) If k is an instantiation of the kind K in w₀ at tₙ and if k exists in a world wₙ accessible from w₀ at tₙ, then k is an instantiation of the kind K in wₙ at tₙ.

I will call (15) the non-uniqueness constraint and (16) the non-accidentality constraint on

kinds and their instantiations.
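Stated a little more formally, the two constraints might be rendered as follows (a sketch in possible-worlds notation; the predicates Inst, for instantiation, and acc, for accessibility, are my own labels, not the paper's):

```latex
% (15) Non-uniqueness: K is a kind in w_0 only if some world
% contains more than one instantiation of K.
\mathrm{kind}(K, w_0) \rightarrow
  \exists w\, \exists x\, \exists y\,
  \bigl( x \neq y \wedge \mathrm{Inst}(x, K, w) \wedge \mathrm{Inst}(y, K, w) \bigr)

% (16) Non-accidentality: instantiation at t_n persists across
% worlds accessible from w_0 at t_n in which k exists.
\bigl( \mathrm{Inst}(k, K, w_0, t_n) \wedge \mathrm{exists}(k, w_n)
  \wedge \mathrm{acc}(w_0, w_n, t_n) \bigr)
  \rightarrow \mathrm{Inst}(k, K, w_n, t_n)
```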

4.2 Background on articles

I follow Partee (1987) in assuming that articles are default type-shifters from type ⟨e,t⟩ to type e or type ⟨⟨e,t⟩,t⟩. In short this means that they are markers of argumenthood and that they cannot be omitted in the absence of other determiners in argument position:

(17) *I have cat.

(18) *Man came to see me.
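The type-shifting perspective can be illustrated with a toy extensional fragment (a sketch; the Entity type, the three-element domain and the function names are illustrative, not part of Partee's proposal):

```haskell
-- Predicates have type <e,t>; shifted arguments have type <<e,t>,t>.
type Entity = String
type Pred = Entity -> Bool  -- <e,t>
type GQ   = Pred -> Bool    -- <<e,t>,t>: a generalized quantifier

domain :: [Entity]
domain = ["sil", "jean", "paul"]

teacher, came :: Pred
teacher = (== "jean")
came    = (`elem` ["jean", "paul"])

-- The indefinite article shifts an <e,t> predicate into argument
-- position as a generalized quantifier of type <<e,t>,t>.
a :: Pred -> GQ
a p q = any (\x -> p x && q x) domain

main :: IO ()
main = print (a teacher came)  -- "A teacher came": True in this model
```

In argument position the shift is obligatory (hence (17) and (18)); in predicate position an unshifted ⟨e,t⟩ noun can combine with the copula directly, which is why the article is expected to be omissible there.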

I furthermore follow (Hawkins, 1991) and (Farkas, 2002) in assuming that the definite

article is a uniqueness marker whereas the indefinite article is unmarked for uniqueness.

This means that (19) signals that there is only one teacher present in a particular setting

whereas (20) is in principle neutral with respect to there being one or more teachers.

(19) I saw the teacher.

(20) I saw a teacher.


As noted by Hawkins and Farkas, it is the case though that by choosing the indefinite instead of the definite the speaker triggers the implicature that there is more than one teacher.


Finally, in line with Partee’s type-shifting analysis I expect indefinite articles to be omissible

in predicate position. The instances of bare predication treated in this paper show that

this expectation is borne out. The crucial question is why they cannot always be omitted.

The answer, I claim, does not lie in the semantics but in the pragmatics. The pragmatic

analysis I defend is presented in 4.3.


4.3 Non-bare predication and non-uniqueness

The analysis I defend is cast in (Weak) Bi-directional Optimality Theory (cf. (Blutner,

2000)) and is based on five standard assumptions. The first is that bare and non-bare

predication are truth-conditionally equivalent (cf. (Partee, 1987)). The second assumption

is that both bare and non-bare predication in principle trigger an implicature of non-uniqueness.

This assumption builds on the insights of Hawkins and Farkas according to

whom not using the definite triggers an implicature of non-uniqueness. The third assumption

is that non-bare predication is syntactically more marked than bare predication (cf.

(de Swart and Zwarts, To appear)). Syntactic markedness can be understood in terms of

projections: whereas non-bare predication involves DPs, bare predication only involves

NPs (or NumPs). The fourth assumption is that conveying non-uniqueness is semantically

more marked than conveying neutrality with respect to uniqueness (cf. (de Swart and

Zwarts, To appear)). Semantic markedness can be understood in terms of compatibility:

non-uniqueness is compatible with neutrality but neutrality is not necessarily compatible

with non-uniqueness. The fifth and final assumption is that unmarked forms and meanings

are preferred over marked forms and meanings (a standard assumption in the OT

literature). The resulting (Weak) Bi-directional OT tableau is presented in (21).



What comes out of this analysis is that bare predication is neutral with respect to uniqueness

whereas non-bare predication marks non-uniqueness.
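The five assumptions can be checked mechanically. The sketch below computes Blutner-style super-optimal form-meaning pairs by repeatedly selecting the least marked remaining pair and then blocking its form and its meaning; the numeric markedness costs are my own illustration, and this iterative procedure is only one standard way of evaluating weak bi-directional optimality:

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

type Form    = String
type Meaning = String

forms :: [Form]
forms = ["bare", "een N"]            -- bare vs non-bare predication

meanings :: [Meaning]
meanings = ["neutral", "non-unique"] -- with respect to uniqueness

-- Markedness: the bare form and the uniqueness-neutral meaning
-- are unmarked (assumptions three and four in the text).
cost :: (Form, Meaning) -> Int
cost (f, m) = fCost + mCost
  where fCost = if f == "een N"      then 1 else 0
        mCost = if m == "non-unique" then 1 else 0

-- Weak bi-directional evaluation: the cheapest pair is super-optimal;
-- its form and meaning are then blocked and the rest recompete.
superOptimal :: [(Form, Meaning)] -> [(Form, Meaning)]
superOptimal [] = []
superOptimal ps = best : superOptimal rest
  where best = minimumBy (comparing cost) ps
        rest = [p | p@(f, m) <- ps, f /= fst best, m /= snd best]

main :: IO ()
main = mapM_ print (superOptimal [(f, m) | f <- forms, m <- meanings])
```

Running this pairs the unmarked bare form with uniqueness-neutrality and the marked een N form with non-uniqueness, which is the content of tableau (21).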

4.4 Kinds and non-bare predication

In 4.1. I claimed - on the basis of common intuitions - that kinds are subject to a non-uniqueness constraint. In 4.3. I claimed - on the basis of standard assumptions - that

non-bare predication marks non-uniqueness whereas bare predication is neutral with respect

to uniqueness. When we combine both claims it follows that non-bare predication

is best suited to signal kind-membership. As I will demonstrate in what follows this is

indeed what it does in languages like Dutch. I will show this on the basis of five predictions

that follow from the claim that there is a one-to-one correspondence between non-bare

predication and kind-membership predication.

The first prediction is that all predication involving kind-membership has to involve the

indefinite article. That this is the case has been suggested by (Kupferman, 1991) and

(Roy, 2006) and as far as I know this has never been challenged. Note that (14) is not

a counterexample. (14) shows that bare predication may involve nouns that are usually

associated with kinds but it is not an instance of kind-membership predication. Note also



that kinds are not restricted to plants or animals but may involve things as diverse as bottles, chairs, ... insofar as they show a sufficiently regular behaviour (see 4.1).

The second prediction my claim about non-bare predication and kind-membership makes

is that bare predication should be concerned with the predication of properties that are unlike

those that link a kind to its instantiations. In view of the non-accidentality constraint

on kinds and their instantiations (see 4.1) it is then predicted that bare predication is concerned

with accidental properties. To see that this is exactly what happens it is instructive

to look at those nouns that usually appear in bare predication: nouns linked to professions,

religions, nationalities and titles. These “do not depend on the inherent, natural properties of a person or what the person actually does, but on the social or cultural status of that person” (de Swart et al., 2007).

The third prediction is that whenever a noun that is usually associated with kinds is used

in bare predication it is reinterpreted in such a way that it no longer predicates a non-accidental property. An example was given in (14): being a wolf in (14) is an accidental

property that comes with the distribution of roles in a game.

The fourth prediction is that whenever a noun that is usually not associated with kinds

is used in non-bare predication it is reinterpreted in such a way that it starts predicating

non-accidental properties. An example was given in (13b): for Sil to be a butcher is no

longer seen as an accidental property but rather as something that is linked to his inherent

properties. This explains why Sil needn’t be a butcher by profession to make (13b) true.

The fifth prediction is that whenever it is not clear whether something is an accidental

property or not there is variation in the predication that is used. One telling example is

that of diseases like alcoholism. According to some, alcoholism is a disease that people may or may not get; according to others, alcoholics are themselves responsible and are not sick in the classical meaning of the word. Interestingly this division is reflected in the use of the more clinical alcoholieker (Dutch, ’alcoholic’) and the more popular drinker (Dutch, ’drinker’). On Google I found the former 43 times in bare predication and 8 times

in non-bare predication whereas the latter appeared 364 times in non-bare predication and

only 6 times in bare predication. 1

5 Conclusion

This paper started out as an investigation into the role of the indefinite article in predicate

position. The analysis I defended is that through its competition with the bare form it

marks non-uniqueness which in turn can be linked to kind-membership predication. This

analysis is attractive in at least three respects. The first is that the indefinite article maintains

its standard semantics and pragmatics and is not reduced to a vacuous item. The

second is that it offers a formalizable alternative to temporal analyses that were shown to

make wrong predictions. The third is that it brings together intuitions and claims from

work on kinds and work on bare predication that lend themselves to an interesting remix.

1 The Google search was done on www.google.nl (restricted to Dutch pages) and concerned searches of the form “is drinker” / “is alcoholieker”.




Beyssade, C. and Dobrovie-Sorin, C. (2005). Bare predicate nominals in Dutch, Proceedings

of SALT 15.

Blutner, R. (2000). Some aspects of optimality in natural language interpretation, Journal

of Semantics 17.

Broekhuis, H., Keizer, E. and Den Dikken, M. (2003). Modern grammar of Dutch. Occasional

papers 4, Tilburg University, Tilburg.

de Swart, H., Winter, Y. and Zwarts, J. (2005). Bare predicate nominals in Dutch, in

E. Maier, C. Bary and J. Huitink (eds), Proceedings of SuB9.

de Swart, H., Winter, Y. and Zwarts, J. (2007). Bare nominals and reference to capacities,

Natural Language and Linguistic Theory 25.

de Swart, H. and Zwarts, J. (To appear). Nominals with and without an article: Distribution,

interpretation and variation, in P. Hendriks, H. de Hoop, I. Krämer, H. de Swart

and J. Zwarts (eds), Conflicts in Interpretation.

Déprez, V. (2005). Morphological number, semantic number and bare nouns, Lingua 115.

Farkas, D. (2002). Specificity distinctions, Journal of Semantics 19.

Hawkins, J. (1991). On (in)definite articles: implicatures and (un)grammaticality prediction,

Journal of Linguistics 27.

Kupferman, L. (1991). Structure événementielle de l’alternance un / ∅ devant les noms humains attributs, Langages 102.

Matushansky, O. and Spector, B. (2005). Tinker, tailor, soldier, spy, in E. Maier, C. Bary

and J. Huitink (eds), Proceedings of SuB9.

Munn, A. and Schmitt, C. (2005). Number and indefinites, Lingua 115.

Partee, B. (1987). Noun phrase interpretation and type-shifting principles, in J. Groenendijk,

D. de Jongh and M. Stokhof (eds), Studies in Discourse Representation

Theory and the Theory of Generalized Quantifiers, Foris, Dordrecht.

Roy, I. (2006). Non-verbal predications: a syntactic analysis of predicational copular

sentences, PhD thesis, University of Southern California.





James Burton

University of Brighton

Abstract. This paper reports on ongoing work to create a proof-carrying Domain Specific

Embedded Language (DSEL) for diagrammatic logics, using Euler diagrams as a case study.

The DSEL is written in Haskell, with type system extensions that allow us to exploit ideas from Constructive Type Theory. These extensions offer an increase

in expressiveness over Hindley-Milner type systems and have been used for program verification.

We use these extensions to create enhanced static constraints to enforce invariants

on diagrams and transformations (inference rules). Our work is at an early stage and we

describe the goals and challenges ahead. The major goal is to create a DSEL for generalized

constraint diagrams, a visual logic expressive enough to be useful for modelling software,

and to extract the types of the resulting diagrams for use as software artefacts.

1 Introduction

A great deal of effort is spent on attempts to increase software reliability and the productivity

of programmers, by both the research community and the software industry. Of

the techniques employed (development methodologies, systematic modelling, automated

testing), formal methods have been little used outside of the most safety-critical sectors

where they are used to verify semantic properties of software and to assure desired runtime

conditions. We believe the benefits of their more widespread use could be great, but

the impact of factors inhibiting adoption needs to be reduced. These factors may include

the fact that existing techniques are seen as difficult to use, time-consuming and requiring

specialised expertise. There is, therefore, a need for more “lightweight” formal methods

which are accessible to programmers with a minimum of specialised training and which

fit in seamlessly with the tools they employ. Sheard has said that enabling programmers

to make statements about semantic properties of the code they write directly, rather than

turning to external tools with high barriers to entry (likely to be written by, and for, mathematicians)

will make it more likely that they do so — in short, that the semantic gap

between the tools for programming and those for formal reasoning is damaging to the

cause of both (Sheard, 2004).

At the same time as the Unified Modelling Language (UML) was adopted as a standard

visual language for modelling software in the 1990s, breakthroughs occurred in the

use of diagrams as visual logics (Shin, 1994; Hammer, 1995). Shin proved soundness and

completeness results for the so-called Venn-II reasoning system, equivalent in expressive

power to Monadic First Order Logic, and research began into a number of diagrammatic

reasoning systems varying in notation and expressive power. The connection between

the new formalised diagrams and those used in software modelling was quickly made.

Although the UML works well to describe the architecture of a system, it is not always expressive

enough to capture all invariants we might wish to enforce, a fact which led to the

development of the (non-graphical) Object Constraint Language (OCL). Kent proposed

constraint diagrams as a purely diagrammatic alternative to the OCL, more appropriately



complementing the UML’s visual nature (Kent, 1997). The constraint diagram in figure 1

shows a constraint in a library management system. Amongst other things it states that

people can only borrow books that are in the collections of libraries they have joined.

Figure 1: A constraint diagram and an Euler diagram.

There are many reasons that we might want to use diagrams to represent information,

including the potential of diagrams for well matchedness and free rides. A diagram is

well matched to its subject if it presents the key features of that subject effectively and in

a way that seems intuitively clear to the viewer (Gurr and Tourlas, 2000). A well matched

diagram can make certain reasoning tasks appear to be easier when compared with a symbolic

representation of the same information. Free rides occur when a diagram provides

some information ‘naturally’ or ‘for free’ which would need to be explicitly stated in, or

derived from, a symbolic representation (Shimojima, 2004). For example, in the Euler diagram

in figure 1, the fact that the contour Spaniels is placed within GunDogs asserts directly

that Spaniels ⊆ GunDogs but also allows the viewer to infer Spaniels ⊆ Dogs and

Spaniels ∩ Cats = ∅. Details of well matchedness and free rides in constraint diagrams

can be found in (Stapleton and Delaney, 2008). In some circumstances the expressive

power of diagrams can produce ambiguity, or lead the viewer to make false inferences.

However, many diagrammatic notations now have formal, unambiguous semantics, of

which Euler and constraint diagrams are prominent examples.

Our ultimate goal is to create a Domain Specific Embedded Language (DSEL) for

several systems of diagrammatic reasoning, with two main aims: to explore the benefits

and boundaries of the emerging style of programming that mixes formal methods with

programming, and to support the work which aims to establish visual logics as a valuable

tool in formal methods.

The DSEL will be written in Haskell and will consist of statically verified code which

will allow the user to manipulate and reason with a variety of visual logics such as Euler

diagrams, spider diagrams and constraint diagrams (see Section 2). The DSEL, therefore,

shares one of the primary aims of visual logics — to make formal reasoning more accessible

and widely used. Reasoning about design and implementation have traditionally

taken place in separate phases of the software process, with the onus on the programmer

to bridge the gap between the two. One of the benefits of combining both activities in one

phase is that constraints modelled by a programmer using the DSEL will form software

components in their own right, resulting in diagrams with the same type as functions in



the modelled system. This suggests that such constraints could eventually form part of

working software, perhaps as part of a “trusted kernel” used by other components, following

the approach of (Kiselyov and Shan, 2007). The form and function of the DSEL will

therefore be closely linked — a formally specified language to assist formal reasoning.

Advances in Programming Language Theory are typically explored in research languages

before percolating into more widely used languages. This is especially true of

modern functional languages and, in particular, Haskell. The Haskell type system with

the extensions provided by the GHC compiler make it possible to explore what Sheard

called (when speaking of the closely related language Ωmega) “a new point in the design

space of formal reasoning systems — part programming language, part logical framework”

(Sheard, 2004) and to do so directly within the environment of a practical language

with efficient implementations. The language features that enable this can be used to emulate

the behaviour of fully dependently typed languages such as Epigram (Altenkirch,

Mcbride and Mckinna, 2005), resulting in what have been called “pseudo-dependently

typed” systems, described in Section 4. The syntactic clarity, referential transparency

and similarity to mathematical notation of functional languages are also of benefit to us.

These features help us in our goal to minimise syntactic differences between the DSEL

and the diagrammatic logics we implement, making it easier to demonstrate a clear mapping

between the two. The point of this mapping is to demonstrate “literal preservation

of syntactic relations under denotation”, as Hammer states the conditions for resemblance

between a sign and that which it signifies (Hammer, 1995).

In Section 2 we describe reasoning with Euler diagrams. Section 3 gives an overview

of type theoretic features making their way into programming languages while Section

4 looks ahead to the form our DSEL will take, using Euler diagrams as a case study. In

Section 5 we consider the goals of the research, evaluate the strategies used to reach them

and identify some of the challenges ahead.

2 Reasoning with Euler diagrams

Although diagrams have often been used to aid understanding in mathematical proofs,

they have until fairly recently been treated as informal and secondary to formalized symbolic

content. In the 1990s the work of Shin began to put diagrams on a different standing

by proving soundness and completeness results for the Venn-II reasoning system, an extension

and formalisation of earlier work by Venn and Peirce (Shin, 1994). Stapleton

provides a summary of the history of diagrammatic reasoning since then, which is now

a rapidly evolving and active research area (Stapleton, 2007). What makes such logics

interesting, given the existence of mature symbolic reasoning techniques, is the combination

of formal reasoning with the compact and intuitive nature of diagrams referred

to previously. We expect that this, and the efforts to create supporting tools, will make

formal reasoning more accessible to non-logicians.

An Euler diagram is a collection of closed curves called contours which represent sets,

within an enclosing rectangle. Figure 2 shows an example with three contours, labelled

A, B and C. Containment, intersection and disjointness are represented by the placement

of contours, so the same diagram asserts C ⊆ A and B ∩ C = ∅. A zone is a set of

points in the diagram that can be described as being inside certain contours and outside

all others. The diagram in figure 2 has five zones: one inside A but outside B and C, one



Figure 2: An Euler diagram.

inside A and C but outside B, and so forth. The region outside of all contours is also a

zone. Shading within a zone asserts the emptiness of the set represented by that zone. So,

the shading of the diagram in figure 2 asserts A ∩ B = ∅ and A − C = ∅.

Reasoning is carried out by the application of rules which transform one diagram into

another, such as Add Contour and Remove Shading; a sound and complete set is given in

(Stapleton, Masthoff, Flower, Fish and Southern, 2007). A proof using Euler diagrams is

formed by applying these rules repeatedly to transform an initial diagram (the premise)

into the target diagram (the conclusion); figure 3 shows a short example. The Add Shaded

Zone rule is applied to transform d₁ to d₂. A new shaded zone can be added at any time since both a shaded zone and a missing zone assert the emptiness of the represented set; both d₁ and d₂ state that A and B are disjoint. The Add Contour rule is applied to transform d₂ to d₃. The new contour C intersects all existing zones without changing their shading. Since this operation introduces no new shading and the way that C is added ensures that no missing zones are created, d₂ and d₃ have the same meaning.

Figure 3: An Euler diagram proof.

The diagrams are formalised using an abstract syntax. The abstraction of Euler diagrams

that we present here is obtained from (Stapleton et al., 2007). Each zone is represented

as a tuple of the set of labels of contours that the zone is inside and the set of

labels of contours the zone is outside. For example, in diagram d₁, figure 3, the only zone inside A has the abstraction ({A}, {B}). Diagrams are represented as a tuple of the set of labels (L), the set of zones (Z) and the set of shaded zones (Z*). Thus, diagram d₂ in figure 3 has abstraction:

〈L = {A, B}, Z = {({A}, {B}), ({B}, {A}), ({A, B}, ∅), (∅, {A, B})}, Z* = {({A, B}, ∅)}〉
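This abstraction transcribes directly into Haskell. The following term-level sketch is mine (the actual DSEL pushes such constraints to the type level), with illustrative names:

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

type Label = Char

-- A zone: the labels it is inside, paired with the labels it is outside.
type Zone = (Set Label, Set Label)

-- A diagram: labels L, zones Z and shaded zones Z*.
data Euler = Euler { labels :: Set Label
                   , zones  :: Set Zone
                   , shaded :: Set Zone }

-- The abstraction of d2 from figure 3.
d2 :: Euler
d2 = Euler
  { labels = Set.fromList "AB"
  , zones  = Set.fromList
      [ (Set.fromList "A",  Set.fromList "B")
      , (Set.fromList "B",  Set.fromList "A")
      , (Set.fromList "AB", Set.empty)
      , (Set.empty,         Set.fromList "AB") ]
  , shaded = Set.fromList [(Set.fromList "AB", Set.empty)] }

-- One well-formedness invariant: every shaded zone is a zone of the diagram.
wellFormed :: Euler -> Bool
wellFormed d = shaded d `Set.isSubsetOf` zones d

main :: IO ()
main = print (wellFormed d2)  -- True
```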

There are a number of logics that extend this system of Euler diagrams, including spider diagrams (Howse, Stapleton and Taylor, 2005) and the constraint diagrams mentioned in the introduction.




3 Dependent Typing and Proof-Carrying Code

The Curry-Howard Isomorphism has a long history and arises from the observation of a

correspondence between Hilbert-style deductive logic and combinatory models of computation.

The work of Martin-Löf cast it as a more general principle linking logical formalisms

and the type systems of programming languages (Martin-Löf, 1984). Rather

than classifying values, types can be viewed as propositions; a value inhabiting type T

corresponds to a proof of T. Martin-Löf’s type theory can be used as an environment for

programming with dependent types (Nordstrom, Petersson and Smith, 1990). Dependent

type systems are so-called because types may depend on a value, such as List a n, the

type of collections of elements of type a with length n. For different values of n we have

different types. A sketch of the logical rules for type-safe list operations is given as type

judgements below. We assume the types Nat (of Peano numbers with constructors Zero

and Succ n) and List a n. Γ is a typing context and Γ ⊢ σ type means that σ is a type in Γ:


(premises ⇒ conclusion)

Γ ⊢ Nat type

Γ ⊢ Zero : Nat

Γ ⊢ n : Nat  ⇒  Γ ⊢ Succ n : Nat

Γ ⊢ t type  ⇒  Γ ⊢ empty t : List t Zero

Γ ⊢ t type   Γ ⊢ x : t   Γ ⊢ n : Nat   Γ ⊢ l : List t n  ⇒  Γ ⊢ cons x l : List t (Succ n)

Γ ⊢ t type   Γ ⊢ n : Nat   Γ ⊢ l : List t (Succ n)  ⇒  Γ ⊢ tail l : List t n

Γ ⊢ t type   Γ, n : Nat ⊢ l : List t (Succ n)  ⇒  Γ ⊢ head l : t

Dependent type theory makes Curry-Howard (or propositions-as-types) useful in practical

ways. The resulting type systems form the basis of automated theorem provers

(Bertot and Casteran, 2004) and, on the other hand, purely functional and total programming

languages (Altenkirch et al., 2005). The same insights inform more widely used

languages at an accelerating rate, especially Haskell, which plays the dual rôle of research

language and practical tool. The type system of Haskell with extensions is flexible

enough to emulate many aspects of dependent typing and to create programs whose types

act as proof that their implementation conforms to their specification.

4 Haskell and the DSEL for Euler Diagrams

Our diagrammatic DSEL is at the prototype stage. Its foundation is a type-level Set library which encodes and enforces constraints such as set membership, disjointness and so on. Above this will sit the implementation of several diagrammatic logics.

Two diagrammatic transformations corresponding to inference rules in an Euler diagram system are presented as type judgements below.


In a language such as Haskell we may not mix types and terms in the way described in

Section 3. The collection of techniques used to achieve something often called “pseudo-dependent typing” includes type-level representations of the indexing term supplied to the

type constructor; to use the example from Section 3, since we have no type-level numbers

we represent n in List a n by types formed from the empty Haskell type constructors Z and Succ, such as Succ (Succ Z).
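The empty-constructor encoding just described can be sketched as follows; Vec, nil, cons and safeHead are illustrative names of ours, not part of the paper's library:

```haskell
{-# LANGUAGE EmptyDataDecls #-}

-- Pre-DataKinds encoding, as described in the text: natural numbers
-- as empty type constructors, used purely at the type level.
data Z
data Succ n

-- A list with a phantom length index.
newtype Vec a n = Vec [a]

nil :: Vec a Z
nil = Vec []

cons :: a -> Vec a n -> Vec a (Succ n)
cons x (Vec xs) = Vec (x : xs)

-- Only typechecks on vectors whose index proves them non-empty.
safeHead :: Vec a (Succ n) -> a
safeHead (Vec (x:_)) = x
```

Since Z and Succ have no constructors, they exist only to index Vec; the runtime representation is an ordinary list.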

It is important to distinguish type-level from term-level computations. At the term level of a programming language with partial functions the result of any function may be undefined (⊥), and so programs are not proofs. Type functions such as union below are not functions over values; they are defined extensionally and exclude the undefined. The DSEL comprises two main components: the domain-specific, dependently typed theory

of diagrammatic reasoning, which provides assurances about the correct formation of

diagrams and application of reasoning rules, and the interactive front end which makes use

of this type system and is subject to the usual limitations of the host language. Although

we do not use a dependently typed host language, our approach is similar in spirit to that of Oury and Swierstra (2008), who use Agda to enforce sophisticated constraints statically

in a series of DSELs.

Since type-level values are distinct from terms, special measures are required to handle

them at runtime. We use a combination of techniques involving empty and existential

types (Peyton Jones, 2008) to do this. As an example of our strategy, the types A, B and

C below are empty types used to represent the labels of contours in a diagram:

data A ; data B ; data C

data L a where
  AL :: L A
  BL :: L B

data Nil
data t ⊲ ts

The type L a lifts labels into a more general type, allowing us to consider labels of any

type. The type constructors Nil and ⊲ are used as the building blocks of sets of labels.

LBox and LSetBox use “existential boxing” to wrap type-level values of LSet t, allowing

us to handle the outer type at runtime but for the “boxed” value to remain available for

inspection by constraints:

data LSet t where
  Empty :: LSet Nil
  Ins :: L a → LSet t → LSet (a ⊲ t)

data LBox = ∀a. LBox (L a)
data LSetBox = ∀t. LSetBox (LSet t)

By creating a function fromChar :: Char → LBox we can box runtime values and insert them into boxed sets with a function insertChar :: Char → LSetBox → LSetBox that uses fromChar. When insertChar is used to add elements to a set of type

LSetBox, a correspondence is enforced between the collection of values and the type

of its LSet t parameter. The value of a collection can be seen as fully determined by

the type of this parameter, which is a proof ensuring that inserted elements are members

of the resulting collection. Assurances about the semantics of sets may be encoded as constraints written with Indexed Type Families (Peyton Jones, 2008).
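A minimal, runnable sketch of this boxing strategy is given below. fromChar is the function named in the text; its clauses (covering only the labels a and b) and the helper labelName are our own guesses at an implementation:

```haskell
{-# LANGUAGE GADTs, ExistentialQuantification, EmptyDataDecls #-}

-- Empty types representing contour labels, lifted by the GADT L.
data A
data B

data L a where
  AL :: L A
  BL :: L B

-- "Existential boxing": the label's type is hidden but not lost,
-- so it remains available to constraints via pattern matching.
data LBox = forall a. LBox (L a)

-- Box a runtime character as a type-level label (only 'a' and 'b'
-- are handled in this sketch).
fromChar :: Char -> LBox
fromChar 'a' = LBox AL
fromChar _   = LBox BL

-- Recover a runtime view of the boxed, type-level label.
labelName :: LBox -> Char
labelName (LBox AL) = 'a'
labelName (LBox BL) = 'b'
```

Matching on the GADT constructors inside the box refines the hidden type, which is what lets constraints inspect the "boxed" value.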

4.1 Judgement Rules


We model the tuples found in the Euler diagram abstraction with the types Z l 1 l 2 (zones)

and D l z z ∗ (diagrams). The type judgements below are a fragment of a self-contained


type theory of Euler diagrams based on the abstract syntax given in (Stapleton et al.,

2007). Once complete, this type theory will be implemented using the techniques in the

previous section to produce a DSEL with enhanced static constraints.

Two kinds of element appear in the judgement rules: typing judgements, e.g. x : a, and type constraints, e.g. Γ ⊢ C x y type, meaning that the type C x y can be formed in the context Γ. Type constraints presumed to be defined in the Set library, such as Disjoint,

appear capitalised, while functions from types to types are in lower case, e.g. union.

We use the constraints Label, LabelSet, Zone and ZoneSet to restrict the inputs to the type constructors.


Supplied with disjoint sets of labels, l 1 and l 2 , Z constructs a zone:

Γ ⊢ LabelSet l 1 type Γ ⊢ LabelSet l 2 type Γ ⊢ Disjoint l 1 l 2 type

Γ ⊢ Z l 1 l 2 type

The syntactic rules state that given a diagram D l z z ∗ , the zones z form a superset of

the shaded zones z ∗ . Also, for each zone Z l 1 l 2 in z and z ∗ , l 1 ∪ l 2 forms a partition over

l. The Invs rule applies these constraints to a diagram:

Γ ⊢ Invs l z type Γ ⊢ Invs l z ∗ type Γ ⊢ Subset z ∗ z type

Γ ⊢ D l z z ∗ type

The Inv rule applies the relevant constraint to an individual zone. The base case for

applying Inv is:

Γ ⊢ Label l type

Γ ⊢ Invs l Nil type

The inductive case for applying Inv is:

Γ ⊢ Label l type Γ ⊢ ZoneSet (z ⊲ zs) type Γ ⊢ Inv l z type Γ ⊢ Invs l zs type

Γ ⊢ Invs l (z ⊲ zs) type

Since l 1 ∩ l 2 = ∅, they partition l if l 1 ∪ l 2 = l:

Γ ⊢ Z l 1 l 2 type Γ ⊢ u : union l 1 l 2 Γ ⊢ LabelSet ls type Γ ⊢ Eq ls u type

Γ ⊢ Inv ls (Z l 1 l 2 ) type

The quotes that begin the following subsections are from Stapleton et al. (2007), from

which we take reasoning rules and translate them to typing judgements. The invariants are

not tested after applying the rules since previous judgements guarantee that if a diagram

can be formed, the invariants have been met.

4.1.1 Remove Shaded Zone


“A shaded zone can be removed but only if there is at least one zone inside each contour

in the resulting diagram and the zone outside all the contours remains”. In figure 4, the

Remove Shaded Zone rule can be applied to transform d 1 into d 2 .

Γ ⊢ Zone x type

Γ ⊢ D l z z ∗ type Γ ⊢ z ′ : delete x z

Γ ⊢ z ∗′ : delete x z ∗ Γ ⊢ Member x z ∗ type

Γ ⊢ transform RemoveShadedZone x (D l z z ∗ ) : (D l z ′ z ∗′ )



Figure 4: Three Euler diagrams.

4.1.2 Add Contour

“A contour can be added to a diagram provided its label is not already in the diagram. Each

zone is split into two zones (one inside and one outside the new contour), and shading is

preserved”. In figure 4 the Add Contour rule can be applied to transform d 1 into d 3 .

Before we can add contours we need a way of replacing each zone z : Z l 1 l 2 in a set

with two copies of itself, one with an extra label added to l 1 , one with that same label

added to l 2 .

Γ ⊢ Label c type

Γ ⊢ splitZones c Nil : Nil

Γ ⊢ ZoneSet (z ⊲ zs) type Γ ⊢ Label c type

Γ ⊢ z 2 : insertLabel Excl c z Γ ⊢ z 1 : insertLabel Incl c z

Γ ⊢ splitZones c (z ⊲ zs) : (z 1 ⊲ z 2 ⊲ (splitZones c zs))

Γ ⊢ Z l 1 l 2 type Γ ⊢ Label c type Γ ⊢ l 3 : c ⊲ l 1

Γ ⊢ insertLabel Incl c (Z l 1 l 2 ) : (Z l 3 l 2 )

Γ ⊢ Z l 1 l 2 type Γ ⊢ Label c type Γ ⊢ l 3 : c ⊲ l 2

Γ ⊢ insertLabel Excl c (Z l 1 l 2 ) : (Z l 1 l 3 )

Γ ⊢ Label c type

Γ ⊢ D l z z ∗ type Γ ⊢ l ′ : c ⊲ l

Γ ⊢ z ′ : splitZones c z Γ ⊢ z ∗′ : splitZones c z ∗

Γ ⊢ transform AddContour c (D l z z ∗ ) : (D l ′ z ′ z ∗′ )
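At the value level, setting aside the type-level encoding, the effect of the splitZones rules above can be sketched as follows; the pair representation of zones is our own simplification:

```haskell
-- A value-level sketch of splitZones: a zone is a pair of label sets
-- (labels inside, labels outside), and adding contour c replaces each
-- zone with two copies, one including c and one excluding it.
type Zone = (String, String)

splitZones :: Char -> [Zone] -> [Zone]
splitZones c = concatMap split
  where
    split (l1, l2) = [(c : l1, l2), (l1, c : l2)]
```

In the DSEL proper this computation happens at the type level, so that the resulting zone set is checked against the diagram invariants by the compiler.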

5 Conclusions and Further Work

We have presented part of a DSEL for Euler diagrams that closely mirrors their abstract

syntax and which allows us to inherit the definitions of reasoning rules in a seamless

way. We have extended the approach of section 4.1 to a complete set of reasoning rules,

providing a type theoretical version of Euler diagrams. Providing a self-contained type

theory for the DSEL (beginning with the simplest case of a set of rules for reasoning with

Euler diagrams and extending this to more complex cases) will make results relating to

the logics (soundness, completeness, etc.) transferable, giving the DSEL the status of a

reasoning tool in its own right.

Our goal is to extend the current approach to more expressive notations, such as generalized

constraint diagrams, which are expressive enough to be used when modelling



software (Stapleton and Delaney, 2008). It is ultimately expected that the DSEL will be

used by higher level tools which allow the user to select from contextually legitimate diagram

transformations. Diagrams created using the DSEL (with or without the support of

additional tools) will have a type which captures the modelled constraint. If the modelled

software is written in the same language as the constraint and there is a correspondence

between the datatypes used in each, we may be able to use the constraint as part of a

“trusted kernel” exporting a safe subset of constructors via the module system. This scenario,

in which the programmer uses tools to model constraints then applies them directly

within the implementation phase, will provide a more unified and, ideally, a more usable

programming/verification environment than exists today.

Combining types with terms requires careful design. Some of the solutions, such as

existential boxing, introduce levels of indirection which are unnecessary in more specialised

environments and which may threaten to obscure the relationship with underlying

diagrammatic logics, at least superficially. If we were to use a language such as

Coq or Epigram to implement the DSEL it is possible that we could find a more natural

expression of many types and constraints. We believe however, given our central aim of

accessibility, that these risks are offset by the benefits of using a more practical and accessible

language than is available in the current generation of dependently typed systems.

The limitations of these techniques, and how they might be used to form a general strategy for combining verification and programming, are among the subjects of this research. The

research will support the longer term goals of the diagrammatic reasoning community by

providing an implementation of various visual logics which can be clearly linked to their

related abstract syntax. Once extended to the case of constraint diagrams, the DSEL has

the potential to shrink the toolchain used by programmers who wish to make statements

about the semantic properties of the code they write. There are a number of interesting

challenges involved in reaching that point, such as the issue of extracting the type of a

diagram in a usable form. The work reported in this paper is a first step towards achieving

these goals.


I would like to express my sincere thanks to John Howse, Gem Stapleton and Richard

Bosworth for their support and encouragement, and to the anonymous reviewers for their

helpful comments. The author is supported by EPSRC Grant EP/P501318/1.


Altenkirch, T., McBride, C. and McKinna, J. (2005). Why dependent types matter, available online at http://www.cs.nott.ac.uk/~txa/publ/ydtm.pdf, accessed 01/02/08.

Bertot, Y. and Casteran, P. (2004). Interactive Theorem Proving and Program Development, Springer.


Gurr, C. and Tourlas, K. (2000). Towards the principled design of software engineering

diagrams, Proceedings of 22nd International Conference on Software Engineering,

ACM Press, pp. 509–518.



Hammer, E. (1995). Logic and Visual Information, CSLI, Stanford.

Howse, J., Stapleton, G. and Taylor, J. (2005). Spider diagrams, LMS Journal of Computation

and Mathematics 8: 145–194.

Kent, S. (1997). Constraint diagrams: Visualizing invariants in object oriented modelling,

Proceedings of OOPSLA97, ACM Press, pp. 327–341.

Kiselyov, O. and Shan, C.-C. (2007). Lightweight static capabilities, Electronic Notes in

Theoretical Computer Science 174(7): 79–104.

Martin-Löf, P. (1984). Constructive mathematics and computer programming, Royal Society

of London Philosophical Transactions Series A pp. 501–518.

Nordstrom, B., Petersson, K. and Smith, J. M. (1990). Programming in Martin-Löf’s Type

Theory, OUP.

Oury, N. and Swierstra, W. (2008). The power of pi, Submitted to ICFP 2008.

Available online http://www.cs.nott.ac.uk/~wss/Publications/ThePowerOfPi.pdf Accessed


Peyton Jones, S. (2008). GHC language features, http://www.haskell.org/ghc/docs/latest/html/users_guide/ghc-language-features.html, accessed 01/02/08.

Sheard, T. (2004). Languages of the future, SIGPLAN Notices 39(12): 119–132.

Shimojima, A. (2004). Inferential and expressive capacities of graphical representations:

Survey and some generalizations, Proceedings of Diagrams 2004, Vol. 2980

of LNAI, Springer, pp. 18–21.

Shin, S. J. (1994). The Logical Status of Diagrams, CUP.

Stapleton, G. (2007). Diagrammatic logics: Past, present and future, International Conference

on Logic, Navya Nyaya and Applications, Jadavpur University, pp. 4–15.

Stapleton, G. and Delaney, A. (2008). Evaluating and generalizing constraint diagrams,

Accepted for Journal of Visual Languages and Computing. Available online from


Stapleton, G., Masthoff, J., Flower, J., Fish, A. and Southern, J. (2007). Automated theorem

proving in Euler diagram systems, Journal of Automated Reasoning 39: 431–





Gemma Celestino

University of British Columbia & LOGOS Research Group

Abstract. I argue that fictional contingencies, such as the fact that, in Tolstoy’s Anna Karenina, Anna Karenina might not have fallen for Vronsky, pose a serious problem for a descriptivist and possible worlds view of fiction such as the one defended by David Lewis and Gregory Currie. Their view cannot account for the fact that in Tolstoy’s Anna Karenina, it is Anna Karenina herself who contingently falls for Vronsky: in Tolstoy’s Anna Karenina, Anna Karenina falls for Vronsky in the actual world but fails to fall for him in some possible world.

1 Introduction

An interesting issue that arises within the topic of fiction is the issue of how to account for

the intuitive contingencies of fictional characters. For at least some of the things that occur

to fictional characters within a story are supposed to happen only contingently. There is a

way certain views on fiction could take to account for these modal properties of fictional

characters that I think is mistaken, and I shall argue why in this paper.

Gregory Currie recently advanced such an account in his “Characters and Contingency”

(2003). But his account is one that should be attractive to any follower of the

Lewis-Currie descriptivist view of fictional names, or of what I take would be a natural

two-dimensionalist extension of Robert Stalnaker’s position on true negative existentials

and related matters. The account, in fact, only makes sense within a possible worlds

framework of fiction. In short, the descriptivist view is the view that fictional names,

unlike ordinary proper names, are, or are used by the author of the fiction as, non-rigid

definite descriptions.

First, I shall explain the problem of fictional contingencies and argue that the explanation

Currie offered does not work; I shall also argue that this is a real problem for the descriptivist view of fiction. Secondly, I shall consider other alternatives to descriptivism

within the possible worlds framework to conclude that no possible worlds view of

fiction looks promising. Finally, I will close with some positive suggestions that I would like to develop elsewhere.

2 The Problem of Fictional Contingencies

I shall motivate the problem I want to address in this paper by introducing the following

pair of sentences:

(1) Necessarily, someone who did not fall for Vronsky would not be Anna Karenina

(2) Someone who necessarily fell for Vronsky would not be Anna Karenina

Despite the apparent inconsistency between these two claims, both seem intuitively

true. (1) is true because anything that a fictional story tells about its characters is essential

to them. Tolstoy’s story about Anna Karenina tells us, among other things, that Anna



Karenina falls for Vronsky. Hence, unlike what happens to non-fictional people like you

and me, and due to its fictionality, it is a constitutive feature of Anna Karenina that she

falls for Vronsky. Thus, it is necessary that she does. (2) is true because Tolstoy’s story

is not a story in which Anna Karenina cannot but fall for Vronsky, but a story in which

Anna Karenina falls for Vronsky only contingently. Thus, anyone who necessarily fell

for Vronsky, who fell for Vronsky not contingently, would not be Anna Karenina.

The apparent incompatibility or tension between (1) and (2) cannot be explained in

terms of the distinction between truth in fiction and truth simpliciter, or any other similar

distinction. For both seem to be true on one and the same reading. Neither of them is true in the fiction; rather, they are about the fictional character Anna Karenina. They specify

some of its necessary qualities.

3 The Descriptivist Way Out of the Problem

The view I want to show to be wrong in this paper would accept the truth of both claims and

would explain it as follows: Anna Karenina possibly exists. That is to say, even if –as

we all agree– Anna Karenina does not actually exist, there is some other possible world

where she does. For to be Anna Karenina is simply to play the Anna Karenina-role and

to play the Anna Karenina-role merely amounts to satisfying the general definite description that can be extracted from the story told by Tolstoy, constructed out of everything

Tolstoy says about Anna in the story he tells, which is the exact meaning of the fictional

name ‘Anna Karenina’, at least as it is used by Tolstoy.

On this view, what one does when telling a fiction is to tell a story, which although not

actual, is possible. It is to qualitatively describe part of some possible worlds other than

the actual. It is to explain some ways the actual world might have been but is not. Thus,

the view is that Anna Karenina could have existed and fallen for Vronsky even if in fact

this never occurred and never will in actuality. That Anna Karenina falls for Vronsky

is as possible as my turning off my laptop in a moment.

What would explain the truth of (1), according to this view, is the fact that there is

no possible world where someone plays the role of Anna Karenina but does not fall for

Vronsky. This is so precisely because part of what it means to play this role is to fall for

Vronsky. Thus, it is true in every world that anyone who plays the Anna Karenina-role in

that world falls for Vronsky.

Nevertheless, (2) would be true as well because for every person who plays the Anna-role in some possible world, there is at least one more world where that same person does not fall for Vronsky, i.e. a world where she does not play the role of Anna Karenina (this would be so because it is impossible to necessarily fall in love). The existence of these

other possible worlds is what would explain the contingency of the falling for Vronsky

by Anna Karenina. In “Characters and Contingency”, Currie advances such an account of

the truth of (1) and (2).

The reason why the explanation provided above does not work is that it does not explain

what it has to explain, that is, the fact that in the fiction, Anna Karenina has the

property of falling for Vronsky but only contingently so. This amounts to the fact that

Anna Karenina herself must have the property in every story-world –i.e. where the Anna

Karenina-role is satisfied–, but at the same time she (Anna Karenina and no one else)

must fail to have that property while being Anna Karenina at some world, which must



be possible with respect to the story-world. But it is Anna Karenina herself who must

have the contingent property at one world and lack it at another. This is what contingency

means. Otherwise, it is not true that Anna Karenina falls for Vronsky in a contingent way,

but that someone else does. The problem is that the only way for this view to try to explain that contingency is by appealing to the possible worlds –which are not story-worlds– in which the persons who, on this view, occupy the Anna Karenina-role in some story-world, and thus are Anna there, do not fall for Vronsky and thereby neither occupy the Anna Karenina-role nor are Anna in them.

I see no way a possible worlds descriptivist view can handle this problem. However, I

can see how one might reply. But the replies I envisage seem to be wrong as well.

One might find the possible worlds explanation of fictional contingencies plausible

and be easily misled into thinking that it is in fact right, merely through a natural tendency to forget what this possible worlds view tells us being Anna Karenina consists in, and, as a result, come to have the following confused thought: that this person who does not

fall for Vronsky in some world in which she does not occupy the Anna Karenina-role is,

nevertheless, Anna Karenina also in such a world due to the fact that she is Anna Karenina

in one of the worlds of the story, where she does occupy the Anna Karenina-role and does

fall for Vronsky. But to evaluate this possible worlds view under this impression is to

misunderstand what the view (at least, about being Anna Karenina) is.

If that other person were to be Anna Karenina in any sense also in this other world

where she does not fall for Vronsky, (1) would not be true. It would not be a necessary

condition for being Anna Karenina to fall for Vronsky, for there would be some possible

worlds where Anna Karenina would not fall for him. These would precisely be the worlds

where someone who occupies the Anna-role in one of the story-worlds exists and does

not fall for Vronsky. As I argued above, however, there is no such sense for the case of

being Anna Karenina. To think of that person, let’s say Jane, as being Anna Karenina also

in that other world where she does not fall for Vronsky only because she does occupy the

Anna Karenina-role at some world is to mistake what being Anna Karenina is, on such

a view, for what being Jane (or, in fact, any other real person) is. Currie explains this as

follows: “Now consider Jane, a respectable inhabitant of the actual world. In the actual

world she does not fall for Vronsky; in fact she never meets him. But, given what I have

said just now, it may well be the case that Jane in some other world does fall for Vronsky;

in that other world, Jane occupies the Anna-role. Does that make Jane, in this world,

Anna Karenina? No. Being Anna is, according to me, something that happens to you in

some worlds and not in others. It happens to you in worlds where you occupy the Anna

role. In any world in which Jane occupies that role she is Anna. But that does not make

her Anna in this world. Being Anna is not at all like being Jane. The person who is Jane in

one world is Jane in all worlds. Being Jane is a matter of being a certain individual; being

Anna, on the other hand, is a matter of occupying a certain role. Moving up a semantic

step we can say that “Jane” is a proper name of an individual, whereas “Anna”, where it is

the proper name of anything, is the proper name of a function from worlds to individuals.

Of course when Tolstoy says that Anna did this or that, we are not from the point of view

of our imaginative engagement with the work, to understand this as meaning that a role

did this or that. This is because it is part of the fiction that “Anna” is the name of a person.

But “Anna”, as used by Tolstoy, is not in fact the name of a person, nor does it purport to

be. Names are expressions used in order to pick out individuals, and Tolstoy does not use



“Anna” in order to do this, nor does he expect us to believe that he is. “Anna”, as used by

Tolstoy, is not a name.” (Currie 2003, p. 141)

On the other hand, one might also contemplate the possibility of the fictional characters

enjoying a certain autonomy with respect to their stories, in such a way that one could say that Tolstoy’s Anna Karenina could have had a different end, for instance. The idea would be that the characters are well defined from the very beginning of the fiction, this opening up possibilities for their fate other than the ones that the author chose. Considering

this, one might think that the contingency of the properties of the characters could be

reduced to the contingency of the writing process itself. Anna Karenina, for instance,

might not have fallen for Vronsky precisely because Tolstoy might not have written that

she did. However, this possibility would not save Currie’s explanation of the fictional

contingency, or descriptivism of fictional names, since it is a whole different explanation

not compatible with them. One could also see that it would not work by considering

the fact that one can write a fiction where characters have certain properties necessarily,

and notwithstanding this, the contingency of the writing process remains; the author could

have written a different story or this story a bit different.

The conclusions I think we should draw from all of this go farther than the mere conclusion

that the explanation of fictional contingencies I criticized is wrong and should be

rejected. This problem that fictional contingencies pose and the incorrectness of this explanation

indicate a deeper or more fundamental problem. It really shows why at least any

descriptivist view that tries to explain fiction in terms of possible worlds –which seems to

be their only way– is mistaken, and maybe it even shows that fiction cannot be accounted

for in possible worlds terms at all; at least, for the case of fictions told by the use of

singular terms such as proper names. In short, the problem is that this possible worlds

descriptivist view cannot explain the truth of pairs like (1) and (2). For, in particular, it

cannot explain the possession of any fictional contingency by any fictional character.

4 Other Possible Descriptivist Ways Out

If the possible worlds view has it that being Anna Karenina amounts to satisfying the non-rigid definite description, which has as a part the description of this woman as falling

for Vronsky, it will not succeed in explaining that Anna Karenina falls for Vronsky only

contingently. For the simple reason that any woman who would be Anna Karenina at

all would be so only in some worlds and precisely in those worlds where she falls for

Vronsky. One might think, even against what Currie seems to insist, that there are two

ways of being Anna Karenina, though: one of them, the one we already contemplated

and the one that Currie tells us; the other, the one that the possible worlds view would

like to have, while keeping the previous one, which is to be someone who at some story-world satisfies the description that ‘Anna Karenina’ is or conveys, even if she does not do so at some other possible worlds. In this sense anyone who met the description at some

possible world, would be also Anna Karenina at all the other worlds where she existed

even if she did not meet the description in them. This last sense does not seem to be

compatible with the view that claims that ‘Anna Karenina’ is used as a non-rigid definite description, and that when it is not –when it is used literally– it does not refer at all. But let’s assume for a moment, for the sake of the argument, that it is.

This way there would be two ways of understanding the relevant pair of claims. According to the interpretation corresponding to the first sense of being Anna Karenina,

(1) would be true but (2) false. And according to the interpretation corresponding to the

second sense, while (2) would be true, (1) would be false. On neither of these two interpretations does one get that both claims are true. Intuitively at least, however, they seem to

be true under one and the same interpretation. Both claims are about the features that

characterize a fictional character, Anna Karenina. One of these features is to be someone

who falls for Vronsky; another, to be someone who falls for Vronsky in a contingent way.

One might think, though, that the intuitive truth of these two claims may be very well

accounted for by considering a different interpretation of them in each case. However,

there is no independent reason to interpret them so differently. This does not seem to be

why we think they are both true. This way out of the problem fictional contingencies pose

to this view would be completely ad hoc.

In any case, there is no way on such a view to obtain what the view really needs. That

is, that Anna Karenina, one and the same thing, has the property of falling for Vronsky at one world but lacks it at another possible world. For it is a condition on being Anna Karenina that

she does so contingently. This is what having a contingent property amounts to. Note that

the independent reason to argue for the legitimacy of using two different interpretations

cannot be that ‘Anna Karenina’ can be used both as a non-rigid definite description and as

a rigid proper name and that while it is used as a non-rigid definite description in the case

of (1), it is used as a rigid proper name in the case of (2). For, according to the possible

worlds view, it is only within the fiction that ‘Anna Karenina’ is, or comes to be, used as an ordinary

rigid proper name. We cannot use the proper names that are used in these other possible

worlds. For these proper names are only possible, not actual. Note too that appeal to the

ambiguity in scope due to the interaction between modalities and definite descriptions in

(1) and (2) does not work either. For the problem is that we are dealing with fiction and

fictional names and hence, there are no individuals that could stand in the place of these

fictional characters other than the ones that satisfy the definite descriptions in question in

each of the possible worlds. Thus, we can explain the consistency of the following pair

of sentences:

(3) Necessarily, the Queen of England is queen

(4) The Queen of England may not have been queen

by noticing the distinction in scope of the occurrences of the definite description ‘the

Queen of England’ in (3) and (4), and explaining that (4) can be true compatibly with the

truth of (3) because there is an individual –i.e. the Queen of England– who can exist in

another possible world and not be the Queen of England in it. As I said, unlike in the

case of fiction, this is possible precisely because there is in fact an individual who is the

Queen of England in the actual world, whereas there is no such individual for the definite

description that the fictional name ‘Anna Karenina’ allegedly abbreviates.

5 Non-Descriptivist Possible Worlds Views of Fiction

One might think that perhaps there are other possible worlds views of fiction that are

not descriptivist that could handle this problem of the fictional contingencies of fictional

characters. I shall very briefly argue that the only available ones are not very attractive.

Descriptivism seems to be the most plausible possible worlds view of fiction.

I see two options: one might defend Meinongianism and say that fictional characters



actually exist in some special mysterious way and that fictional names are like ordinary

proper names that rigidly refer to them. Or one might defend the view that fictional characters

are abstract objects, which actually exist and to which the fictional names rigidly

refer. Within this last option I see two further options: one might say that these abstract

objects are only contingently so, so that in other worlds these same objects exist but are concrete instead of abstract in those worlds. The existence of such contingently nonconcrete objects is defended by Bernard Linsky and Edward N. Zalta, not with respect to fictional characters but with respect to merely possible objects, i.e. possibilia. Or one might defend

the view that these abstract objects, like any other abstract objects, are necessarily abstract, in which case they can only do what their fictions say they do in worlds that are impossible, for there are things that only concrete objects can do. Thus, if these abstract objects are to do those things, it can only be in impossible worlds rather than possible ones. This is the Millian view defended by Nathan Salmon.

On the one hand, the first option, Meinongianism, is wholly mysterious and hence not plausible at all. On the other hand, the only remaining option that explains fictions in terms of possibilities is the one that sees fictional characters as contingently nonconcrete objects and hence consists in the very implausible claim that some actual abstract objects can be concrete and some actual concrete objects can be abstract. In view of the alternatives to

descriptivism about fiction, I think we can conclude that fiction should not be dealt with

in terms of possible worlds.

6 Some Positive Suggestions

I think this problem is easily solved once we simply abandon the idea of explaining fiction

in terms of possible worlds. I would like to defend that story-worlds are not possible

worlds even if they are ontologically the same kind of thing: that is, sets of sentences

or propositions. The difference between story-worlds and possible worlds would just be that only the latter represent possibilities with respect to the actual world. The fictional

contingencies of fictional characters should be explained by appealing to those worlds

which would be possible but only with respect to the world of the story and not with

respect to our actual world.

This way out of the problem is possible because fictional names are not abbreviated non-rigid definite descriptions, but merely empty rigid proper names, that is, proper names that do not have a referent. The meaning of fictional names should be derived, in my view, from the fact that part of the meaning of any proper name is the meaning of a rigid definite description associated with it. Any proper name N, when used, in addition to rigidly referring to its bearer, semantically expresses some definite description like 'the bearer of N' or 'the individual called N', where the token of the name N that occurs within that description is used in the same way as the name N itself. This view about proper names in general is one that I learnt from Manuel García-Carpintero's work. Note that this view does not say that proper names are synonymous with definite descriptions, which Saul Kripke showed to be incorrect, and that we can compatibly say that Anna Karenina neither actually nor possibly exists.

Finally, I also think that in addition to the fictional operator 'in the fiction f', there is another fictional operator that we use, whether explicitly or implicitly, in our fictional discourse. When we use fictional names to talk about fictional characters as such, instead of as the individuals that these characters represent in the fictions, we either say 'the fictional character N' or we just utter the name N. It is my view that even in the latter case the expression 'the fictional character' is there, though only in an implicit way. It is the interaction between this expression and fictional names that makes our fictional discourse meaningful when we talk about fictional characters. How this interaction works is something we have yet to discover.

7 Conclusion

I have argued that there is a problem with the fictional contingent properties of fictional characters that descriptivism about fiction cannot solve. I have also argued that the alternative views that explain fiction in terms of possible worlds do not seem any more plausible. Finally, I have offered some positive suggestions for explaining fiction and the problem posed by fictional contingencies, suggestions that I plan to develop soon.


Acknowledgments. I would like to thank Manuel García-Carpintero, Dominic McIver Lopes, Genoveva Martí, Francis Jeffry Pelletier, Pablo Rychter and Ori Simchen for their extremely useful comments on earlier drafts of this work, and Stefano Predelli for the extreme patience and interest he showed in discussing it with me. I am also thankful to the anonymous referees for their interesting points, which I have tried to incorporate in this final version as best I could.


References

Currie, G. (1988). Fictional names. Australasian Journal of Philosophy 66.

Currie, G. (1990). The Nature of Fiction. Cambridge: Cambridge University Press.

Currie, G. (2003). Characters and contingency. Dialectica 57.

Kripke, S. (1972). Naming and Necessity. Cambridge, MA: Harvard University Press.

Lewis, D. (1978/1983). Truth in fiction. Reprinted in Philosophical Papers I. Oxford: Oxford University Press.

Linsky, B. and Zalta, E. N. (1996). In defense of the contingently nonconcrete. Philosophical Studies 84/2-3.

Salmon, N. (1998). Nonexistence. Noûs 32.

Stalnaker, R. (1999). Assertion, Context and Content. Oxford: Oxford University Press.




Michael Franke

Universiteit van Amsterdam

Abstract. This paper applies a model of boundedly rational “level-k thinking” (cf. Stahl

and Wilson, 1995; Crawford, 2003; Camerer, Ho and Chong, 2004) to a classical concern of

game theory: when is information credible and what shall I do with it if it is not? The

model presented here extends and generalizes recent work in game-theoretic pragmatics

(Stalnaker, 2006; Jäger, 2007; Benz and van Rooij, 2007). Pragmatic inference is modeled

as a sequence of iterated best responses, defined here in terms of the interlocutors’ epistemic

states. Credibility considerations are a special case of a more general pragmatic inference

procedure at each iteration step. The resulting analysis of message credibility improves on

previous game-theoretic analyses, is more general and places credibility in the linguistic

context where it, arguably, belongs.

1 Semantic Meaning and Credible Information in Signaling Games

Perhaps the simplest game-theoretic model of language use is a signaling game with meaningful signals. A sender S observes the state of the world t ∈ T in private and chooses a message m from a set of alternatives M, all of which are assumed to be meaningful in the (unique and commonly known) language shared by S and a receiver R. In turn, R observes the sent message and chooses an action a from a given set A. In general, the payoffs for both S and R depend on the state t, the sent message m and the action a chosen by the receiver. Formally, a SIGNALING GAME WITH MEANINGFUL SIGNALS is a tuple 〈{S, R}, T, Pr, M, [·], A, U_S, U_R〉 where Pr ∈ ∆(T) is a probability distribution over T, [·] : M → P(T) is a semantic denotation function and U_{S,R} : M × A × T → ℝ are utility functions for both sender and receiver. 1 We can conceive of such signaling games as abstract mathematical models of a conversational context whose most important features they represent: the interlocutors’ beliefs, behavioral possibilities and preferences.

If a signaling game is a context model, the game’s solution concept is what yields a prediction of the behavior of agents in the modelled conversational situation. The following easy example of a scalar implicature, e.g., the inference that not all students came when hearing the sentence “Some of the students came”, makes this distinction clear. A simple context model for this case is the signaling game G1: 2 there are two states t_∃¬∀ and t_∀, two messages m_some and m_all with semantic meaning as indicated and two receiver interpretation actions a_∃¬∀ and a_∀ which correspond one-to-one with the states; sender and receiver payoffs are aligned: an implementation of the standard assumption that conversation and implicature calculation revolve around the cooperative principle (Grice, 1989). A solution concept, whatever it may be, should then ideally predict that S_{t_∃¬∀} (S_{t_∀}) chooses m_some (m_all) and that the receiver responds with action a_∃¬∀ (a_∀). 3

1 I will assume throughout that (i) all sets T, M and A are non-empty and finite, that (ii) Pr(t) > 0 for all t ∈ T, that (iii) for each state t there is at least one message m which is true in that state and that (iv) no message is contradictory, i.e., there is no m for which [[m]] = ∅.

2 Unless indicated otherwise, I assume that states are equiprobable in example games.

3 For t ∈ T, I write S_t as an abbreviation for “a sender of type t”.
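To fix ideas, the tuple just defined can be written down directly. The following Python sketch encodes the game G1 described above; all identifier names are my own illustrative choices, not the paper's notation.

```python
# A minimal encoding of a signaling game with meaningful signals,
# instantiated with the scalar-implicature game G1 (equiprobable states).

T = ["t_some_not_all", "t_all"]          # states
M = ["m_some", "m_all"]                  # messages
A = ["a_some_not_all", "a_all"]          # receiver interpretation actions
Pr = {t: 0.5 for t in T}                 # prior over states

# semantic denotation [.] : M -> P(T)
denotation = {
    "m_some": {"t_some_not_all", "t_all"},   # "some" is true in both states
    "m_all":  {"t_all"},                     # "all" is true only in t_all
}

def utility(m, a, t):
    """Aligned payoffs: both players get 1 iff the action matches the state.
    The message itself is payoff-irrelevant (cheap talk)."""
    return 1.0 if A.index(a) == T.index(t) else 0.0

# sanity checks mirroring assumptions (i)-(iv) of footnote 1
assert all(any(t in denotation[m] for m in M) for t in T)   # every state has a true message
assert all(denotation[m] for m in M)                        # no contradictory message
```

The same skeleton serves for the other example games below; only the payoff table, the prior and the denotation function change.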



            a_∃¬∀   a_∀     m_some   m_all
t_∃¬∀        1,1     0,0       √        −
t_∀          0,0     1,1       √        √

G1: “Scalar Implicatures”

            a_mate   a_ignore   m_high   m_low
t_high       1,1       0,0        √        −
t_low        1,0       0,1        −        √

G2: “Partial Conflict”

It is obvious that in order to arrive at this prediction, a special role has to be assigned to

the conventional, semantic meaning of the messages involved. For instance, in the above

example anti-semantic play, as we could call it, that simply reverses the use of messages,

should be excluded. Most game-theoretic models of language use hard-wire semantic

meaning into the game play, either as a restriction on available moves of sender and receiver,

or into the payoffs, but in both cases effectively enforcing truthfulness and trust.

This is fine as long as conversation is mainly cooperative and preferences aligned. But

let’s face it: the central Gricean assumption of cooperation is an optimistic idealization

after all; conflict, lies and deceit are as ubiquitous as air. But then, hard-wiring of truthfulness

and trust limits the applicability of our models as it excludes the possibility that

senders may wish to mislead their audience. We should aim for more general models and, ideally, let the agents, not the modeller, decide when to be truthful and what to trust.

As opposed to hard-wiring truthfulness and trust, the most liberal case at the other end of the spectrum is to model communication as cheap talk, not considering reputation or further psychological constraints at all. Here messages do not impose restrictions on the game play and are entirely payoff-irrelevant: U_{S,R}(m, a, t) = U_{S,R}(m′, a, t) for all m, m′ ∈ M, a ∈ A and t ∈ T. However, if talk is cheap, yet exogenously meaningful, the

question arises how to integrate semantic meaning into the game. Standard solution concepts,

such as sequential equilibrium or rationalizability, are too weak to predict anything

reasonable in this case: they allow for nearly all anti-semantic play and also for babbling,

where signals are sent, as it were, arbitrarily and therefore ignored by the receiver.

In response to this problem, game theorists have proposed various refinements of the

standard solution concepts based on the notion of credibility. 4 The idea is that semantic

meaning should be respected (in the solution concept) wherever this is reasonable in view

of the possibly diverging preferences of interlocutors. As an easy example, look at game G2, where S is of either a high-quality or a low-quality type, and where R would like to pair with S_{t_high} only, while S wants to pair with R irrespective of her type. Interests are in partial conflict here and, intuitively, a costless, non-committing message m_high is not credible, because S_{t_low} would have every reason to send it untruthfully. Therefore,

intuitively, R should ignore whatever S says in this game. In general, if nothing prevents

S from babbling, lying or deceiving, she might as well do so; whenever she even has an

incentive to, she certainly will. For the receiver the central question becomes: when is a

signal credible and what should I do if it is not?

This paper offers a fresh look at this classical problem of game theory. The novelty

is, so to speak, a “linguistic turn”: I suggest that credibility considerations are pragmatic

inferences, in some sense very much alike —and in another sense very much unlike—

conversational implicatures. I argue that this linguistic approach to credibility of information

improves on the classical game-theoretic analyses by Farrell (1993) and Rabin

4 The standards in the debate about credibility were set by Farrell (1993) for equilibrium and by Rabin

(1990) for rationalizability. I will mainly focus on these two classical papers here for reasons of space.


Proceedings of the 13 th ESSLLI Student Session

(1990). In order to implement conventional meaning of signals in a cheap talk model, the

present paper takes an epistemic approach to the solution of games: the model presented

in this paper spells out the reasoning of interlocutors in terms of their beliefs about the

behavior of their opponents as a sequence of iterated best responses (IBR) which takes

semantic meaning as a starting point. For clarity: the IBR model places no restriction

whatsoever on the use of signals; conventional meaning is implemented merely as a focal

element in the deliberation of agents. This way, the IBR model extends recent work

in game-theoretic pragmatics (Jäger, 2007; Benz and van Rooij, 2007), to which it adds

generality by taking diverging preferences into account and by implementing the basic assumptions

of “level-k models” of reasoning in games (cf. Stahl and Wilson, 1995; Crawford,

2003; Camerer et al., 2004). In particular, agents in the model are assumed to be

boundedly rational in the sense that each agent computes only finitely many steps of the

best response sequence. Section 2 scrutinizes the notion of credibility, section 3 spells out

the formal model and section 4 discusses its properties and predictions.

2 Credibility and Pragmatic Inference

The classical idea of message credibility is due to Farrell (1993). Farrell seeks an equilibrium

refinement that pays due respect to the semantic meaning of messages. His notion

of credibility is therefore tied to a given reference equilibrium as a status quo. According

to Farrell, then, a message m is FARRELL-CREDIBLE with respect to a given equilibrium

if all t ∈ [m] prefer the receiver to interpret m literally, i.e., to play a best response to the

belief Pr(·| [m]) that m is true, over the equilibrium play, while no type t ∉ [m] does.

A number of objections can be raised against Farrell-credibility. First of all, the definition

requires all types in [m] to prefer a literal interpretation of m over the reference

equilibrium. This makes sense under Farrell’s Rich Language Assumption (RLA) that

for every X ⊆ T there is a message m with [m] = X. This assumption is prevalent in

game-theoretic discussions of credibility, but restricts applicability. I will show in section

4 that this assumption seriously restricts Rabin’s (1990) account. But for now, suffice

it to say that, in particular, the RLA excludes models like G1, used to study pragmatic

inference in the light of (partial) inexpressibility. I will drop the RLA here to aim for

more generality and compatibility with linguistic pragmatics. 5 Doing so implies amending

Farrell-credibility to require only that some types in [m] prefer a literal interpretation

of m over the reference equilibrium.

Still, there are further problems. Matthews, Okuno-Fujiwara and Postlewaite (1991)

criticize Farrell-credibility as being too strong. Their argument builds on example G3.

Compared to the babbling equilibrium, in which R performs a_3, messages m_1 and m_2 are intuitively credible: both S_{t_1} and S_{t_2} have good reason to send m_1 and m_2 respectively. Communication seems possible and utterly plausible. However, neither message is Farrell-credible, because for i, j ∈ {1, 2} with i ≠ j not only S_{t_j} but also S_{t_i} prefers R to play a best response to a literal interpretation of m_j, which would trigger action a_j, over

5 A reviewer points out that the RLA has a correspondent in the linguistic world in Katz’s (1981) “principle

of effability”. The reviewer supports dropping the RLA, because otherwise pragmatic inference is limited to context and effort considerations. It is also very common (and, to my mind, reasonable) to restrict

attention to certain alternative expressions only, namely those that are salient (in context) after observing a

message. Of course, game theory is silent as to where the alternatives come from, since this is a question

for the linguist, perhaps even the syntactician (cf. Katzir, 2007).



          a_1    a_2    a_3     m_1   m_2
t_1       4,3    3,0    1,2      √     −
t_2       3,0    4,3    1,2      −     √

G3: “Best Message Counts”

          a_1    a_2    a_3    a_4     m_12   m_23   m_13
t_1       4,5    5,4    0,0    1,4      √      −      √
t_2       0,0    4,5    5,4    1,4      √      √      −
t_3       5,4    0,0    4,5    1,4      −      √      √

G4: “Further Iteration”

the no-communication outcome a_3. The problem with Farrell’s notion is obviously that just doing better than equilibrium is not enough reason to send a message when sending another message is even better for the sender. When evaluating the credibility of a message m, we have to take into account the alternative forms that t ∉ [m] might want to send.


Compare this with the scalar implicature in G1. Message m_some is interpreted as communicating that the true state of affairs is t_∃¬∀, because in t_∀ the sender would have used m_all. In other words, the receiver discards a state t ∈ [m] as a possible sender of m because that type has a better message to send. Of course, such pragmatic enrichment does not make a message intuitively incredible, as it is still used in line with its semantic meaning. Intuitively speaking, in G1 S even wants R to draw this pragmatic inference.

This is, of course, different in G2. In general, if S wants to mislead, she intuitively wants the receiver to adopt a certain belief, but she does not want the receiver to realize that this belief might be false: we could say, somewhat loosely, that S wants her purported communicative intention to be recognized (and acted upon), but she does not want her deceptive intention to be recognized. Nevertheless, if the receiver does manage to recognize a deceptive intention, this too may lead to some kind of pragmatic inference, albeit one that the sender did not intend the receiver to draw. While the implicature in G1 rules out a semantically feasible possibility, credibility considerations, in a sense, do the exact opposite: message m_high is pragmatically weakened in G2 by ruling in state t_low.

Despite the differences, there is a common core to both implicature and credibility inference. In both cases, the receiver seems to reason: which types of senders would send this message given that I believe it literally? Indeed, exactly this kind of reasoning underlies Benz and van Rooij’s (2007) model of implicature calculation for the purely cooperative case. The driving observation of this paper is that the same reasoning might not only rule out states t ∈ [m] to yield implicatures but may also rule in states t ∉ [m]. When the latter is the case, m seems intuitively incredible. Still, the reasoning pattern by which implicatures and credibility-based inferences are computed is the same. On a superficial reading, this view on message credibility can be found in Stalnaker (2006): 6 call a message m BVRS-CREDIBLE (Benz, van Rooij, Stalnaker) iff for some type t ∈ [m], but for no type t ∉ [m], S_t’s expected utility of sending m, given that R interprets m literally, is at least as great as S_t’s expected utility of sending any alternative message m′.

The notion of BvRS-credibility matches our intuitions in all the cases discussed so far, but it is, in a sense, self-refuting, as G4 from Matthews et al. (1991) shows. In this game, all the available messages m_12, m_23 and m_13 are BvRS-credible, because if R interprets
6 It is unfortunately not entirely clear to me what exactly Stalnaker’s proposal amounts to, as insightful

as it might be, because the account is not fully spelled out formally. The basic idea seems to be that

(something like) the notion of BvRS-credibility, as it is called here, should be integrated as a constraint on

receiver beliefs —believe a message iff it is BvRS-credible— into an epistemic model of the game together

with some appropriate assumption of (common) belief in rationality. The class of game models that satisfies

rationality and credibility constraints would then ultimately define how signals are used and interpreted.



literally, S_{t_1} will use message m_12, S_{t_2} will use message m_23 and S_{t_3} will use message m_13. No message is used untruthfully by any type. However, if R realizes that exactly S_{t_1} uses message m_12, he would rather not play a_2, but a_1. But if the sender realizes that message m_12 triggers the receiver to play a_1, suddenly S_{t_3} wants to send m_12 untruthfully. This example shows that BvRS-credibility is a reliable start, but stops too short. If messages are deemed credible and therefore believed, this may create an incentive to mislead. What seems needed to rectify the formal analysis of message credibility is a fully spelled-out model of iterated best responses that starts in the Benz-van-Rooij-Stalnaker way and then carries on iterating. Here is such a model.

3 The IBR Model and its Assumptions

3.1 Assumptions: Focal Meaning and Bounded Rationality

The IBR model presented in this paper rests on three assumptions with which it also sets

itself apart from previous best-response models in formal pragmatics (Jäger, 2007; Benz

and van Rooij, 2007; Jäger, 2008). The first assumption is the Focal Meaning Assumption:

semantic meaning is focal in the sense that the sequence of best responses starts with a

purely semantic truth-only sender strategy. Semantic meaning is also assumed focal in

the sense that throughout the IBR sequence R believes messages to be truthful unless

S has a positive incentive to be untruthful. This is the second, so-called Truth Ceteris Paribus Assumption (TCP). These two (epistemic) assumptions assign semantic meaning

its proper place in this model of cheap-talk communication.

The third assumption is the Bounded Rationality Assumption: I assume that players

in the game have limited resources which allow them to reason only up to some finite iteration depth k. At the same time I take agents to be overconfident: each agent believes that she is smarter than her opponent. Camerer et al. (2004) make an empirical case for

these assumptions about the psychology of reasoners. 7 However, for simplicity, I do not

implement Camerer et al.’s (2004) Cognitive Hierarchy Model in full. Camerer et al.

assume that each agent who is able to reason up to strategic depth k has a proper belief

about the population distribution of players who reason up to depth l < k, but I will

assume here, just to keep things simple, that each player believes that she is exactly one

step ahead of her opponent (cf. Crawford, 2003; Crawford, 2007). (I will discuss this

simplifying assumption critically in section 4.)

3.2 Beliefs & Best Responses

Given a signaling game, a SENDER SIGNALING-STRATEGY is a function σ ∈ S = (∆(M))^T and a RECEIVER RESPONSE-STRATEGY is a function ρ ∈ R = (∆(A))^M.

In order to define which strategies are best responses to a given belief, we need to define the game-relevant beliefs of both S and R. Since the only uncertainty of S concerns what R will do, the set of relevant SENDER BELIEFS Π_S is just the set of receiver response-strategies: Π_S = R. On the receiver’s side, we may say, with some redundancy, that there

7 A good, intuitively accessible example of why this should be so is the so-called beauty contest game (cf. Ho, Camerer and Weigelt, 1998). Each player from a group of size n > 2 chooses a number from 0 to 100. The player closest to 2/3 of the average wins. When this game is played with a group of subjects who have never played the game before, the usual group average lies somewhere between 20 and 30. This is quite far from the group average of 0 which we would expect under common (true) belief in rationality. Everybody seems to believe that they are just a bit smarter than everybody else, without noticing their own limitations.
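The footnote's arithmetic can be sketched as follows, under the common (but here assumed) convention that a level-0 player picks 50, the midpoint of the interval; a level-k player then best responds to a population of level-(k−1) players by playing 2/3 of their choice.

```python
# Level-k choices in the 2/3-average beauty contest, assuming (hypothetically)
# that level-0 players pick 50. A level-k player best responds to a population
# of level-(k-1) players, i.e. plays 2/3 of their (deterministic) choice,
# so the level-k choice is (2/3)^k * 50.
def level_k_choice(k, level0=50.0):
    return (2.0 / 3.0) ** k * level0

for k in range(5):
    print(k, round(level_k_choice(k), 1))
# levels 0..4: 50.0, 33.3, 22.2, 14.8, 9.9
```

Observed group averages of 20 to 30 thus correspond to roughly two or three steps of reasoning, which is the empirical point of Camerer et al. (2004); the choices only approach 0, the prediction of common true belief in rationality, as k grows without bound.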



are three components in any game-relevant belief (cf. Battigalli, 2006): firstly, R has a prior belief Pr(·) about the true state of the world; secondly, he has a belief about the sender’s signaling strategy; and thirdly, he has a posterior belief about the true state after hearing a message. Posteriors should be derived by Bayesian update from the former two components, but should also specify R’s beliefs after unexpected surprise messages. Taken together, the set of relevant RECEIVER BELIEFS Π_R is the set of all triples 〈π¹_R, π²_R, π³_R〉 for which π¹_R = Pr, π²_R ∈ S = (∆(M))^T and π³_R ∈ (∆(T))^M such that for any t ∈ T and m ∈ M, if π²_R(t,m) ≠ 0, then:

    π³_R(m,t) = ( π¹_R(t) × π²_R(t,m) ) / ( Σ_{t′ ∈ T} π¹_R(t′) × π²_R(t′,m) ).

Given a sender belief ρ ∈ Π_S, say that σ is a BEST RESPONSE SIGNALING STRATEGY to belief ρ iff for all t ∈ T and m ∈ M we have:

    σ(t,m) ≠ 0 → m ∈ arg max_{m′ ∈ M} Σ_{a ∈ A} ρ(m′,a) × U_S(m′,a,t).

The set of all such best responses to belief ρ is denoted by S(ρ). Given a receiver belief π_R ∈ Π_R, say that ρ is a BEST RESPONSE STRATEGY to belief π_R iff for all m ∈ M and a ∈ A we have:

    ρ(m,a) ≠ 0 → a ∈ arg max_{a′ ∈ A} Σ_{t ∈ T} π³_R(m,t) × U_R(m,a′,t).

The set of all such best responses to belief π_R is denoted by R(π_R). Also, if Π′_R ⊆ Π_R is a set of receiver beliefs, let R(Π′_R) = ⋃_{π_R ∈ Π′_R} R(π_R).
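To make the Bayesian update concrete, here is a small Python sketch (my own encoding, not the paper's) of the posterior formula, applied to G1 under the truthful-uniform strategy that section 3.3 below calls σ*_0.

```python
# Posterior belief pi3(m, t) from prior pi1 and a conjectured sender strategy
# pi2 (pi2[t][m] = probability that type t sends m), following the Bayes
# formula for receiver beliefs. Names are illustrative.
def posterior(pi1, pi2, m):
    scores = {t: pi1[t] * pi2[t][m] for t in pi1}
    z = sum(scores.values())
    if z == 0:
        return None  # surprise message: Bayes is silent, belief unconstrained
    return {t: s / z for t, s in scores.items()}

# Example: G1 with equiprobable states, sender truthful-uniform, so the
# all-type randomizes over both of its true messages.
pi1 = {"t_some_not_all": 0.5, "t_all": 0.5}
pi2 = {"t_some_not_all": {"m_some": 1.0, "m_all": 0.0},
       "t_all":          {"m_some": 0.5, "m_all": 0.5}}
post = posterior(pi1, pi2, "m_some")
# post["t_some_not_all"] == 2/3, post["t_all"] == 1/3
```

Hearing m_some thus already shifts the receiver's belief toward t_∃¬∀ before any strategic iteration takes place.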

3.3 Strategic Types and the IBR sequence

In line with the Bounded Rationality Assumption of Section 3.1, I assume that senders and receivers are of different strategic types. Strategic types correspond to the level k of strategic depth a player in the game performs (while believing she thereby outperforms her opponent by exactly one step of reasoning). I will give an inductive definition of strategic types in terms of players’ beliefs, starting with a fixed strategy σ*_0 of S_0. 8 Then, for any k ≥ 0, R_k is characterized by a belief set π*_{R_k} ⊆ Π_R that S is a level-k sender, and S_{k+1} is characterized by a belief π*_{S_{k+1}} ∈ Π_S that R is a level-k receiver.

I assume that S_0 plays according to the signaling strategy σ*_0 which simply sends any true message with equal probability in all states. There need not be any belief to which this is a best response, as level-0 senders are (possibly irrational) dummies that implement the Focal Meaning Assumption. R_0 then believes that he is facing S_0. With the unique σ*_0, which sends all messages in M with positive probability (M is finite and contains no contradictions), R_0 is characterized entirely by the unique belief π*_{R_0} that S plays σ*_0.

In general, R_k believes that he is facing a level-k sender. For k > 0, S_k is characterized by a belief π*_{S_k} ∈ Π_S. R_k consequently believes that S_k plays a best response σ_k ∈ S(π*_{S_k}) to this belief. We could leave this unrestricted and assume that R_k considers any σ_k ∈ S(π*_{S_k}) possible. But it will transpire that for an intuitively appealing analysis of

8 I will write S_k and R_k to refer to a sender or receiver of strategic type k. Likewise, S^t_k refers to a sender of strategic type k and knowledge type t.




message credibility we need to assume that R_k takes S_k to be truthful all else being equal (see also the discussion in section 4). We implement the TCP Assumption of Section 3.1 as a restriction S*(π*_{S_k}) ⊆ S(π*_{S_k}) on the signaling strategies held possible by R. Of course, even when restricted, there need not be a unique signaling strategy here. As a general tie-break rule, assume the “principle of insufficient reason”: all σ_k ∈ S*(π*_{S_k}) are equiprobable to R_k. That means that R_k effectively believes that his opponent is playing the signaling strategy

    σ*_k(t,m) = ( Σ_{σ ∈ S*(π*_{S_k})} σ(t,m) ) / |S*(π*_{S_k})|.

This fixes R_k’s beliefs about the behavior of his opponent, but it need not fix R_k’s belief π³_R about surprise messages. Since this matter is intricate, and moreover R_k’s counterfactual beliefs do not play a crucial role in any of the examples discussed in this paper, I will not pursue this issue here (but see also footnote 10 below). In general, let us say that R_k is characterized by any belief whose second component is σ*_k and whose third component satisfies some (coherent, but possibly vacuous) assumption about the interpretation of surprise messages. Let π*_{R_k} ⊆ Π_R be the set of all such beliefs. R_k is then fully characterized by π*_{R_k}.

In turn, S_{k+1} believes that her opponent is a level-k receiver who plays a best response ρ_k ∈ R(π*_{R_k}). With the above tie-break rule, S_{k+1} is fully characterized by the belief

    ρ*_k(m,a) = ( Σ_{ρ ∈ R(π*_{R_k})} ρ(m,a) ) / |R(π*_{R_k})|.

3.4 Credibility and Inference
Define that a signal m is k-OPTIMAL in t iff σ*_{k+1}(t,m) ≠ 0. The k-optimal messages in t are all the messages that R_{k+1} believes S^t_{k+1} might send (thus taking the TCP Assumption into account). 9 Similarly, distill from R’s beliefs his INTERPRETATION STRATEGY δ : M → P(T) as given by belief π_R: δ_{π_R}(m) = {t ∈ T | π³_R(m,t) ≠ 0}. This simply is the support of R’s posterior beliefs after receiving message m. Let us write δ_k for the interpretation strategy of a level-k receiver.

For any k > 0, since S_k believes she faces R_{k−1} with interpretation strategy δ_{k−1}, sending message m would intuitively count as an attempt to mislead on the part of S^t_k just in case t ∉ δ_{k−1}(m). Such an attempt would moreover be untruthful if t ∉ [m]. While R_{k−1} would be deceived, R_k would see through the attempted deception. From the point of view of R_k, who adheres to the TCP Assumption, a message m is incredible if it is (k−1)-optimal in some t ∉ [m]. But then R_k will include t in his interpretation of m: recognizing a deceptive intention leads to pragmatic inference. In general, we should consider a message m credible unless some type t ∉ [m] would want to use m somewhere along the IBR sequence; precisely, m is CREDIBLE iff δ_k(m) ⊆ [m] for all k ≥ 0. 10

9 Without the TCP Assumption, 0-optimality would be equivalent to the notion of an optimal assertion

in Benz and van Rooij (2007).

10 It may seem that messages which would not be sent by any type (after the first round or later) come out

credible under this definition, which would not be a good prediction. (Thanks to Daniel Rothschild (p.c.) for

pointing this out to me.) However, this is not quite right: we get into this predicament only for some versions

of the IBR sequence, not for others. It all depends on how the receiver forms his counterfactual beliefs. If,

for instance, we assume that R rationalizes observed behavior even if it surprises him, we can keep the



          a_1    a_2     m_12   m_3
t_1       1,1    0,0      √      −
t_2       0,0    1,1      √      −
t_3       0,0    1,1      −      √

G5: “White Lie”

          Pr(t)   a_1    a_2    a_3     m_12   m_23
t_1       1/8     1,1    0,0    0,0      √      −
t_2       3/4     0,0    1,1    0,0      √      √
t_3       1/8     0,0    0,0    1,1      −      √

G6: “Some Game without a Name”

4 Discussion

The IBR model makes intuitively correct predictions about message credibility for the

games considered so far. In G1, R 0 responds to m some with the appropriate action a ∃¬∀ ,

but still interprets δ 0 (m some ) = {t ∃¬∀ , t ∀ }. In turn, R 1 interprets as δ 1 (m some ) = {t ∃¬∀ }; he

has pragmatically enriched the semantic meaning by taking the sender’s payoff structure

and available messages into account. After one round a fixed-point is reached, with fully

revealing credible signaling in accordance with intuition. In G2, IBR predicts that both
S 1 of type t high and S 1 of type t low will use m high , which is therefore not credible. In G3,
fully revealing communication is likewise predicted, and for G4 IBR predicts that all
messages are credible for R 0 and R 1 , but not for R 2 , hence incredible as such. In general,
the IBR model predicts that

communication in games of pure coordination is always credible:

Proposition 4.1. Take a signaling game with T = A and U S,R (·, t, t ′ ) = c > 0 if t = t ′

and 0 otherwise. Then δ k (m) ⊆ [m] for all k and m.

Proof. Clearly, δ 0 (m) ⊆ [m] for arbitrary m. So assume that δ k (m) ⊆ [m]. In this case

S t k+1 will use m only if t ∈ δ k(m). But then t ∈ [m] and therefore δ k+1 (m) ⊆ [m].
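The inductive step can be made concrete in a small simulation. The sketch below is my own simplified rendering of the IBR dynamics for a pure-coordination game with T = A: uniform priors, the receiver mixing uniformly over his current interpretation, and δ 0 (m) = [m]. It deliberately ignores the finer points of the TCP Assumption and counterfactual beliefs, and all names are hypothetical.

```python
# A simplified IBR sketch for a pure-coordination game with T = A
# (my own rendering, not the paper's full model: uniform priors,
# receiver mixes uniformly over his current interpretation,
# delta_0(m) = [m]; TCP refinements and counterfactual beliefs omitted).

def ibr_interpretations(types, meaning, rounds=5):
    """Iterate the interpretation sets delta_k(m).

    A type-t sender at level k+1 prefers messages m with t in delta_k(m)
    and |delta_k(m)| minimal (expected payoff c / |delta_k(m)|);
    delta_{k+1}(m) collects the types for which m is such a best message.
    """
    delta = {m: set(ts) for m, ts in meaning.items()}   # delta_0 = [m]
    history = [dict(delta)]
    for _ in range(rounds):
        best = {}
        for t in types:
            options = [m for m in delta if t in delta[m]]
            if options:
                size = min(len(delta[m]) for m in options)
                best[t] = {m for m in options if len(delta[m]) == size}
        delta = {m: {t for t in types if m in best.get(t, set())}
                 for m in delta}
        history.append(dict(delta))
    return history

types = {"t1", "t2", "t3"}
meaning = {"m12": {"t1", "t2"}, "m3": {"t3"}}   # semantic meanings [m]
hist = ibr_interpretations(types, meaning)

# As in Proposition 4.1: delta_k(m) never leaves the semantic meaning.
assert all(delta[m] <= meaning[m] for delta in hist for m in meaning)
```

In this toy game δ 1 is already a fixed point: m 12 keeps the interpretation {t 1 , t 2 } and m 3 keeps {t 3 }, mirroring the subset chain δ k+1 (m) ⊆ δ k (m) ⊆ [m] used in the proof.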

However, the IBR model does not generally guarantee that communication is credible
even when preferences are perfectly aligned, i.e., U S = U R . This may seem surprising at
first, but is naturally due to the possibility of what we might call white lies: untruthful
signaling that is beneficial for the receiver. These may occur if the set of available signals
is not expressive enough. As an easy example, consider G5, where S t2 will use m 3

untruthfully to induce action a 2 , which, however, is best for both receiver and sender.

To understand the central role of the TCP assumption in the present proposal, consider

the game G6. In G6, R 0 has the following posterior beliefs: after hearing message m 12 he

rules out t 3 and believes that t 2 is three times as likely as t 1 ; similarly, after hearing message

m 23 he rules out t 1 and believes that t 2 is three times as likely as t 3 . Consequently,

R 0 responds to both signals with a 2 . Now, S 1 of type t 1 , for instance, does not care
which message she chooses, as far as her expected utilities are concerned. But R 1
nevertheless assumes that S 1 of type t 1 speaks truthfully. It is thanks to the TCP
Assumption that IBR predicts messages to be credible in this game.

G6 also shows a difference between the IBR model and Rabin's (1990) model of credible
communication, which superficially looks very similar. Rabin's model consists of two

components: the first component is a definition of message credibility which is almost a

two-step iteration of best responses starting from the semantic meaning; the second component

is iterated strict dominance around a fixed core set of Rabin-credible messages

definition unchanged: if no type whatsoever has an outstanding reason to send m, the receiver’s posterior

beliefs after m will support any type. So, unless m is tautologous, it is incredible. Still, Rothschild’s

criticism is appropriate: the definition of message credibility offered here is, in a sense, incomplete as long

as we do not properly define the receiver’s counterfactual beliefs; something left for another occasion.



being sent truthfully and believed. In particular, Rabin requires for m to be credible that

m induces, when taken literally, exactly the set of all sender-best actions (from the set of

actions that are inducible by some receiver belief) of all t ∈ [m]. This is defensible under

the Rich Language Assumption, but both messages in G6 fail this requirement. Consequently,

with no credible message to restrict iterated strict dominance, Rabin’s model

predicts a total anything-goes for game G6. This shows the limited applicability of approaches

to message credibility that are inseparable from the Rich Language Assumption.

The present notion of message credibility and the IBR model are not restricted in this

sense and fare well with (partial) inexpressibility and the resulting inferences.

To wrap up: as a solution concept, the epistemic IBR model offers, basically, a set of

beliefs, viz., beliefs obtained under certain assumptions about the psychology of agents

from a sequence of iterated best responses. I do not claim that this model is a reasonable

model for human reasoning in general. Certainly, the simplifying assumption that
players believe that they are facing a level-k opponent, and not possibly a level-l < k opponent,
becomes increasingly implausible as k grows, and especially so for agents that have, in
a manner of speaking, already reasoned themselves through a cycle multiple times. (It is
easily verified that for finite M and T the IBR sequence always enters a cycle after some
k ∈ N.) 11 Still, I wish to defend that the IBR model does capture (our intuitions about)

certain aspects of (idealized) linguistic behavior, namely pragmatic inference in cooperative

and non-cooperative situations. Whether it is a plausible model of belief formation

and reasoning in the envisaged linguistic situations is ultimately an empirical question.

In conclusion, the IBR model offers a novel perspective on message credibility and
the pragmatic inferences based on this notion. The model generalizes existing
game-theoretic models of pragmatic inference by taking conflicting interests into account. It

also generalizes game-theoretic accounts of credibility by giving up the Rich Language

Assumption. The explicitly epistemic perspective on agents’ deliberation assigns a natural

place to semantic meaning in cheap-talk signaling games as a focal starting point. It also

highlights the unity in pragmatic inference: in this model both credibility-based inferences

and implicatures are different outcomes of the same reasoning process.


I’d like to thank Tikitu de Jager, Robert van Rooij, Daniel Rothschild, Marc Staudacher

and three anonymous referees for insightful comments, help and discussion. I moreover

benefited greatly from discussing with Gerhard Jäger an early version of his paper (Jäger,

2008), which also defines and applies a general iterated best response model different

from what I did here. Also, I am thankful to Sven Lauer for awakening my interest by first
explaining to me with enormous patience some puzzles about credibility that I did not
fully understand at the time (see Lauer, 2007). Errors are my own.

11 It is tempting to assume that “looping reasoners” may have an Aha-Erlebnis and to extend the IBR

sequence by transfinite induction assuming, for instance, that level-ω players best respond to the belief

that the IBR sequence is circling. I do not know whether this is necessary and/or desirable for linguistic

applications. We should keep in mind though that in some cases human reasoners may not get to the ideal

level of reasoning in this model and in others they might even go beyond it.




Battigalli, P. (2006). Rationalization in signaling games: Theory and applications, International

Game Theory Review 8(1): 67–93.

Benz, A. and van Rooij, R. (2007). Optimal assertions and what they implicate, Topoi

26: 63–78.

Camerer, C. F., Ho, T.-H. and Chong, J.-K. (2004). A cognitive hierarchy model of games,

The Quarterly Journal of Economics 119(3): 861–898.

Crawford, V. P. (2003). Lying for strategic advantage: Rational and boundedly rational

misrepresentation of intentions, American Economic Review 93(1): 133–149.

Crawford, V. P. (2007). Let’s talk it over: Coordination via preplay communication with

level-k thinking. Unpublished Manuscript.

Farrell, J. (1993). Meaning and credibility in cheap-talk games, Games and Economic

Behavior 5: 514–531.

Grice, P. H. (1989). Studies in the Way of Words, Harvard University Press.

Ho, T.-H., Camerer, C. and Weigelt, K. (1998). Iterated dominance and iterated best

response in experimental “p-beauty contests”, The American Economic Review

88(4): 947–969.

Jäger, G. (2007). Game dynamics connects semantics and pragmatics, in A.-V. Pietarinen

(ed.), Game Theory and Linguistic Meaning, Elsevier, pp. 89–102.

Jäger, G. (2008). Game theory in semantics and pragmatics. Manuscript, University of


Katz, J. J. (1981). Language and Other Abstract Objects, Basil Blackwell.

Katzir, R. (2007). Structurally-defined alternatives. To appear in Linguistics and Philosophy.

Lauer, S. (2007). Some kinds of deception do not occur: Credibility and the maxim of

sincerity. Unpublished Manuscript. Amsterdam, Stanford.

Matthews, S. A., Okuno-Fujiwara, M. and Postlewaite, A. (1991). Refining cheap talk

equilibria, Journal of Economic Theory 55: 247–273.

Rabin, M. (1990). Communication between rational agents, Journal of Economic Theory

51: 144–170.

Stahl, D. O. and Wilson, P. W. (1995). On players’ models of other players: Theory and

experimental evidence, Games and Economic Behavior 10: 218–254.

Stalnaker, R. (2006). Saying and meaning, cheap talk and credibility, in A. Benz, G. Jäger

and R. van Rooij (eds), Game Theory and Pragmatics, Palgrave MacMillan, pp. 83–






Michael Hartwig

Multimedia University, Cyberjaya, Malaysia

Abstract. Researchers have recently studied the acceptance probability of P and
NP languages, hoping to find new ways of differentiating the two classes. The paper
outlines the author's findings on the acceptance probability of regular and
context-free languages, which we describe using the notion of a difference shrinking
chain. We give a first proof technique, the inflating lemma, which is based on the
above results and is able to separate higher languages from regular languages up to
star height 1, as well as some incentives for applying these techniques to higher classes.

1 Introduction

“The major quest for the complexity theory community is finding methods that may
separate classes.” (Buhrmann & Torenvliet 2005) Although impressive progress has been
made recently within the area of complexity theory, the need for new, creative
approaches that may result in methods for separating classes has not diminished, as is
nicely exemplified by the long outstanding P vs. NP problem.

One of the recent approaches is the study of properties of the acceptance
probability function of such languages, that is, the study of the form of the graph of the
function which takes as an argument a natural number n and returns the ratio between
the number of accepted words of length n in the given language and all possible words
of the same length. This study has led to many discoveries, such as the so-called phase
transition in the acceptance probability graph of NP-complete problems (Clote &
Kranakis 2002, Dubois et al. 2000). There has been hope that if we were able to
describe the mentioned phase transition with more and more precision (Achlioptas et al.
2001, Kirousis et al. 1998), we would then also be able to separate P from NP.
Unfortunately, this has not yet happened.

Like other researchers, we have therefore turned our attention first to smaller classes
such as regular and context-free languages. Given such a language, we define the
density function d L (n) = |L n |, counting the number of words of length n in L. The
study of the density of regular languages has a longer history (Schützenberger 1962,
Eilenberg 1974, Rozenberg et al. 1997, Bodirsky et al. 2004). Languages with a density
function that can be bounded from above by a polynomial (i.e., there exists a polynomial
p(x) such that d L (n) ≤ p(n)) are called sparse. If, on the other hand, there exists a real
number h > 1 such that d L (n) ≥ h n for infinitely many n ≥ 0, then L is called dense
(Demaine et al. 2003, Krieger et al. 2007). Notice that the language a*b* is a sparse language, while
the language that includes all words over a binary alphabet that start with the letter a
(i.e., a(a+b)*) is dense. As described in (Szilard et al. 1992, Rozenberg et al. 1997), a
regular language is sparse “if and only if it can be represented as a finite union of
regular expressions of the form xy 1 *z 1 ...y m *z m , where x, y 1 , z 1 , ..., y m , z m are all strings in
Σ*”. Such regular languages are also called SLRE and are equivalent to bounded regular


languages (Habermehl et al. 2000). Nevertheless, it is not difficult to see that the

majority of all regular languages are dense. Flajolet (1987) demonstrated that a regular
language is either sparse or dense, which was recently generalized to context-free
languages (Ilie et al. 2000, Incitti 2000). While it is interesting in its own right to study such
properties, Demaine et al. (2003) showed that only sparse regular languages have
the power to restrict NP-complete problems so that they become polynomially solvable;
in other words, the intersection of such a regular language with an NP-complete
problem results in a language from P. Eisman et al. (2005) proposed another

application by stating that the density function could be used in some application areas

such as streaming algorithms, where “rapid computation must be performed (often in a

single pass)”.
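The sparse/dense contrast between the two example languages can be checked by brute force. The following sketch (helper names are mine) counts words over {a, b} with Python's standard re module, writing | for the paper's + operator.

```python
# Brute-force densities of a*b* (sparse) and a(a+b)* (dense) over the
# binary alphabet {a, b}.
import itertools
import re

def density(pattern, n):
    """d_L(n): number of length-n words over {a, b} matching the regex."""
    rx = re.compile(pattern)
    return sum(1 for w in itertools.product("ab", repeat=n)
               if rx.fullmatch("".join(w)))

for n in range(1, 10):
    assert density(r"a*b*", n) == n + 1            # polynomially bounded
    assert density(r"a(a|b)*", n) == 2 ** (n - 1)  # grows like h^n, h = 2
```

The counts confirm the classification: d(n) = n + 1 for a*b* stays below any quadratic, while a(a+b)* accepts half of all 2^n words of each length.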

Still, we feel that it is often more interesting to study the acceptance probability
Acc(L, n) = |L n | / |Σ n | of a given language rather than its density, that is, the ratio
between the number of accepted words and all possible words of a given length. As
mentioned above, a(a+b)* has exponential density but stable acceptance
probability, as Acc(a(a+b)*, n) = 0.5, which seems to describe the quantity of accepted

words more appropriately. Secondly, such a different view allows us to combine both

sparse and dense languages and study common properties. In (Hartwig et al. 2006a,
Hartwig et al. 2006b) we could show that the acceptance probability graph is indeed
expressive enough to separate complexity classes, making it an acceptable candidate in
the above-mentioned quest. The objective in using such properties to separate the
mentioned classes is to familiarize ourselves with properties, techniques and
applications, aimed at getting a better understanding of possible uses of acceptance
probability graphs in higher classes. In (Hartwig et al. 2006a) we described the
acceptance probability of very low regular languages, and in (Hartwig et al. 2006b) we
presented a proof technique (the inflating lemma) that is powerful enough to separate
many higher languages from regular languages up to star height 1 and can be compared
with the well-known pumping lemma (Sipser 1997) 1 .

Inflating Lemma If L ∈ REG(1) and L has increasing acceptance probability, then
there exist a length n 0 and a natural number k ≥ 1 such that every w ∈ L with |w| ≥ n 0
can be split as w = pr with p(Σ k )*r ⊆ L.

An example application would be the following proof.

Example (MAJORITY does not belong to REG(1)) L = {w | w ∈ Σ* and w has more
(or equally many) a's than b's} ∉ REG(1).

Proof. Acc(L, n) is constantly increasing; hence the inflating lemma can be applied. But
none of the accepted words can be inflated: we could take any word and position and
insert (or: inflate with) as many b's as needed until the word has more b's than a's.


Although the inflating lemma seems to have only limited applicability, the following work suggests
that every regular language has either increasing, stable or decreasing chains. Furthermore, if L is regular
and of decreasing acceptance probability, then the lemma can be applied to the complement of L.



This paper continues that work by providing an overview of the status of our
work on the acceptance probability of regular and context-free languages over binary
alphabets, claiming that both classes have acceptance probability graphs that can be split
into increasing, decreasing or stable chains with a decreasing (or shrinking)
difference. We think that the minimal number of such chains should be studied in
more detail and put into relation with the size of any program or machine accepting
the language. Knowing that NP-complete problems exhibit phase transitions in their
acceptance probability graphs, switching from difference shrinking to difference
increasing sections and vice versa, we believe that techniques making use of those
properties may contribute to the separation of higher classes, too.

2 Preliminaries

We use the following definitions: The alphabet for all strings is Σ = {a, b}. The length
of a string w is given by |w|; all sets L 1 , L 2 , ... are considered subsets of Σ*. A regular
expression e over Σ is built from the symbols in Σ, the symbol ε, the binary operators +
and ·, and the unary operator *. The language specified by a regular expression e is denoted by
L(e) and is referred to as a regular language (Kleene 1956, McCulloch et al. 1943). We call
a regular expression unambiguous (or non-overlapping) if and only if its
corresponding NFA is unambiguous. “An NFA is called unambiguous if for each word
w there is at most one path from the initial state to a final state that spells out w.”
(Bruggemann-Klein et al. 2007, Moreira et al. 2005) It is important to know that all
regular languages are unambiguous (Giammarresi et al. 2001) and can hence be
described by an unambiguous regular expression. sh(e) computes the star height of a
regular expression, and REG(1) specifies all regular languages having a star height of 1
or less.

As mentioned in the introduction, the density of a language counts the number of
accepted words per given length and is defined as

d L (n) = |L n |,

while the acceptance probability of a language is defined as the ratio between the
number of accepted words d L (n) and the number of all words of a given length,

Acc(L, n) = |L n | / |Σ n | = d L (n) / 2 n .

3 Regular acceptance probability

3.1 Low regular languages

Describing the acceptance probability of a finite language is straightforward.

Lemma (Finite Languages) For any finite language L: Acc(L, n) = O(0).

Proof. If L is finite then there exists a length after which no word is accepted by the

language. The acceptance probability reaches 0.



Regular languages that can be described by a regular expression of star height 0, or
with at most one starred subexpression which is of the form (a+b)*, have
constant acceptance probability.

Lemma (Simple Regular Languages) If L = w 1 (a+b)*w 2 with w 1 , w 2 words, there exists
a constant c such that:

Acc(L, n) = O(c).

Proof. The smallest accepted word of the language L is of length |w| = |w 1 | + |w 2 |. As
there is only one such smallest word, Acc(L, |w|) = 1/2^|w| = c. For any length n greater
than |w| we have d L (n) = 2 · d L (n−1). Hence the acceptance ratio
remains stable.
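For a concrete instance of the lemma, take L = ab(a+b)* (so w 1 = ab and w 2 = ε): the smallest accepted word is ab, giving c = 1/4. A brute-force check of my own:

```python
# Verify numerically that Acc(ab(a+b)*, n) is the constant 1/4
# from the smallest accepted word onwards.
import itertools
import re

def acc(pattern, n):
    """Acc(L, n) = |L_n| / 2^n over the binary alphabet {a, b}."""
    rx = re.compile(pattern)
    hits = sum(1 for w in itertools.product("ab", repeat=n)
               if rx.fullmatch("".join(w)))
    return hits / 2 ** n

for n in range(2, 12):
    assert acc(r"ab(a|b)*", n) == 0.25
```

Each added letter doubles both the accepted words and all words, so the ratio fixed at length |w| never moves again.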

It is then not difficult to see that any union of simple regular languages (in
the above sense) will again only yield a language with constant acceptance
probability.

Figure 1. Acceptance probability graphs of low regular languages.
Left L 1 = {a, aba} (finite), right L 2 = ab(a+b)*.

3.2 Regular languages having one star

Languages built from regular expressions using the star operator at most once also
include languages with a decreasing acceptance probability, namely if the expression
under the star is not entirely composed of (a+b)* expressions. The length of the
expression under the star defines a step width d decomposing the acceptance
probability graph into d chains. We will have d−1 chains with acceptance probability
O(0) and one chain that is either stable or decreasing.


Figure 2. Acceptance probability graph of L 3 = b(ba)*. L 3 has a step width of 2 with one chain being
stable (d L3 (0) = d L3 (2) = ... = 0), while the remaining elements belong to a chain with its peaks
constantly decreasing by a factor of 1/4.


Proceedings of the 13 th ESSLLI Student Session

Lemma (Regular Languages with One Star) For any regular language L = w 1 w 2 *w 3 with w 1 ,
w 3 words and sh(w 2 ) = 0, there exists a minimal length n 0 such that for all n > n 0 :

Acc(L, n) ≤ Acc(L, n−|w 2 |).

Proof. The length d = |w 2 | is usually referred to as the step width for this language; it connects
the peaks of the acceptance probability graph. The number of accepted words of any length n can
be traced back to the number of accepted words of length n−|w 2 |, as we can pump the word
under the star. Hence, Acc(L, n) = c · Acc(L, n−|w 2 |). The constant c is easily determined from
w 2 , and the fact that the chains are either decreasing or stable is obvious and also follows
directly from the inflating lemma.
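The lemma can be checked numerically for L 3 = b(ba)* from Figure 2, whose step width is d = 2; the brute-force helper below is my own.

```python
# Check Acc(L3, n) <= Acc(L3, n - 2) for L3 = b(ba)*: the even chain is
# all zeros (stable) and the odd chain decreases by a factor of 4.
import itertools
import re

def acc(pattern, n):
    rx = re.compile(pattern)
    return sum(1 for w in itertools.product("ab", repeat=n)
               if rx.fullmatch("".join(w))) / 2 ** n

probs = [acc(r"b(ba)*", n) for n in range(14)]

assert all(probs[n] <= probs[n - 2] for n in range(2, 14))
assert probs[3] == probs[1] / 4     # c = 1/4, determined from w2 = ba
```

Here c = 1/4 because pumping ba adds two letters (a factor of 4 possible words) but only one new accepted word per old one.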

3.3 Regular languages up to star height 1

Regular languages up to star height 1 already provide a wide range of different
acceptance probability graphs.

Lemma (Regular Languages up to Star Height 1) If L ∈ REG(1) then there exist
constants s, u 0 , ..., u m and v 1 , ..., v m such that:

d L (s) = u 0 ,
d L (s+1) = u 1 ,
... ,
d L (s+m) = u m ,

and for all larger n:

d L (n) = u 1 d L (n−v 1 ) + u 2 d L (n−v 2 ) + ... + u m d L (n−v m ).

Proof. (Sketch) See (Hartwig 2008) for the complete proof. If L ∈ REG(1) then L has an
unambiguous regular expression of the following form:

L = L 1 + L 2 + ... + L k , where L i = R i0 R i1 ...R it with sh(R ij ) ≤ 1.

Calculating the number of accepted words for each L i is done successively, starting from
the left. The number of accepted words of length n for R i0 can be determined from the
lengths of all expressions under the star. For example, let

L 4 = b (aa + bbb)* b (ab + bba)* b;

we would have R 4,0 = (aa + bbb)* and R 4,1 = (ab + bba)*, which gives us for R 4,0 :

d R4,0 (3) = 1 (as |b| + |b| + |b| = 3)
d R4,0 (n) = d R4,0 (n−|aa|) + d R4,0 (n−|bbb|) = d R4,0 (n−2) + d R4,0 (n−3)

This process continues until the last expression within L i is reached, successively
adding all the accepted words of the previously considered components:

d R4,1 (n) = d R4,1 (n−|ab|) + d R4,1 (n−|bba|) + d R4,0 (n)
         = d R4,1 (n−2) + d R4,1 (n−3) + d R4,0 (n)

In our (simple) case this gives us

d L4 (n) = d R4,1 (n).

The above result (here depending on R 4,0 and R 4,1 ) can then be converted into a recursive
formula referring only to itself and obeying the requirements. In the example case,

d L4 (3) = 1,
d L4 (n) = 2d L4 (n−2) + 2d L4 (n−3) − d L4 (n−4) − 2d L4 (n−5) − d L4 (n−6).
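The closed recurrence for d L4 can be cross-checked numerically against the component-wise computation. This is my own sketch of that consistency check: it builds the counts for R 4,0 and R 4,1 from their block recurrences, convolves them with the three fixed b's, and compares with the closed form above.

```python
# Cross-check: component recurrences for R_4,0 = (aa+bbb)* and
# R_4,1 = (ab+bba)*, convolved for L4 = b R_4,0 b R_4,1 b, must satisfy
# the closed recurrence given in the text.
N = 30

r = [0] * N                      # r[n]: words of length n in (aa+bbb)*
r[0] = 1
for n in range(2, N):
    r[n] = r[n - 2] + (r[n - 3] if n >= 3 else 0)

s = [0] * N                      # s[n]: words of length n in (ab+bba)*
s[0] = 1
for n in range(2, N):
    s[n] = s[n - 2] + (s[n - 3] if n >= 3 else 0)

d = [0] * N                      # d[n] = d_L4(n)
for n in range(3, N):
    d[n] = sum(r[i] * s[n - 3 - i] for i in range(n - 2))

assert d[3] == 1                 # the single word bbb
for n in range(9, N):
    assert d[n] == 2*d[n-2] + 2*d[n-3] - d[n-4] - 2*d[n-5] - d[n-6]
```

The check passes because both starred components obey x(n) = x(n−2) + x(n−3), and the closed recurrence is exactly the square of that relation.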













Figure 3. Acceptance probability graphs of higher regular languages up to star height 1.
Left L 4 = b (aa + bbb)* b (ab + bba)* b from the above example,
right L 5 = a(a+b)* + (b + ba)* with a union operator also outside the star.

To describe the acceptance probability graphs of such regular and higher languages we
introduce the notion of a difference shrinking chain.

Definition (Difference Shrinking Chain) We say a language has a difference
shrinking chain if there exist a step width d and a length n 0 such that for all i ≥ 0:

|Acc(L, n 0 +(i+2)d) − Acc(L, n 0 +(i+1)d)| ≤ |Acc(L, n 0 +(i+1)d) − Acc(L, n 0 +i·d)|

Figure 4. An example language with only difference shrinking chains. A chain is called
difference shrinking if for such a chain and any length n the speed of the increase (or
decrease) slows constantly, i.e. Δ 2 ≤ Δ 1 .

We call a language difference shrinking if there exists a step width d ≥ 1
decomposing the acceptance probability graph into d difference shrinking chains. We
call a language a regular increasing language if it can be decomposed into at
least one increasing and zero or more stable chains. (Regular decreasing languages are
defined in a similar way.) A language is furthermore called strongly increasing if
only one increasing chain completely describes the graph. While similar concepts
apply to strongly decreasing languages, such languages are also said to have
monotone acceptance probability.
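The definition of a difference shrinking chain translates directly into a finite-prefix check; the helper names below are mine, and of course the check can only inspect finitely many i. Applied to L 3 = b(ba)* with step width 2:

```python
# Check the difference-shrinking condition on the finitely many chain
# values we can compute, for both chains of L3 = b(ba)*.
import itertools
import re

def acc(pattern, n):
    rx = re.compile(pattern)
    return sum(1 for w in itertools.product("ab", repeat=n)
               if rx.fullmatch("".join(w))) / 2 ** n

def difference_shrinking(values, d, n0):
    """|v(n0+(i+2)d) - v(n0+(i+1)d)| <= |v(n0+(i+1)d) - v(n0+i*d)|
    for all i covered by the given list of values."""
    chain = values[n0::d]
    diffs = [abs(b - a) for a, b in zip(chain, chain[1:])]
    return all(y <= x for x, y in zip(diffs, diffs[1:]))

vals = [acc(r"b(ba)*", n) for n in range(14)]
assert difference_shrinking(vals, d=2, n0=1)  # odd chain: 1/2, 1/8, ...
assert difference_shrinking(vals, d=2, n0=0)  # even chain: all zeros
```

A geometrically decreasing chain such as 1/2, 1/8, 1/32, ... passes because its consecutive differences shrink by the same factor.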

Lemma (Star Height 1 Languages are Difference Shrinking) If L ∈ REG(1), then L
is difference shrinking.

Proof. See (Hartwig 2008).

The proof includes an algorithm that computes, for any given regular language, a step
width which might not be minimal but which decomposes the language's acceptance
probability graph into such difference shrinking chains. It is not difficult to see that most
of the regular languages up to star height 1 also have only monotone chains, and we
claim that this is also true for the languages left out.

4 The acceptance probability of context free languages

Calculating the number of accepted words of a regular language with a star height of 2
or higher seems to require a different approach. Let L 6 = (w 1 *w 2 *)*; we can then
compute the accepted words of length n as follows: d L6 (n) = d L6 (1)·d L6 (n−1) +
d L6 (2)·d L6 (n−2) + ... A word of length n is a composition of an accepted word of
length c ≤ n from w 1 * and an accepted word of length n−c from w 2 *. Surprisingly, the
same approach also works for the calculation of the acceptance probability of a
context-free language, as the following examples suggest.

Example. Let G 1 be the following grammar:

S => SaN | a

N => bN | bb

We can compute the number of accepted words derived from each of the given
non-terminals. The rule S => SaN specifies that a terminal word can be constructed from any
smaller word from S and any word from N, as long as the sum of their lengths equals
n−1 (n−1 because the letter a takes up one position). This brings us to the following:

d S (1) = 1
d N (2) = 1
d S (n) = Σ i d S (i) · d N (n−1−i)
d N (n) = d N (n−1)

Having S as the start symbol, we can calculate the number of accepted words for the given
grammar with d G1 (n) = d S (n). Since this is also a regular language (a(abbb*)*), the number of
accepted words could equally be calculated with d S (1) = 1, d S (n) = d S (n−1) + d S (n−3),
following thoughts from the previous chapters.
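The convolution for G 1 can be run directly; this sketch of my own checks it against the regular-language recurrence d S (n) = d S (n−1) + d S (n−3) mentioned above.

```python
# Density of G1 (S => SaN | a, N => bN | bb) via the convolution
# d_S(n) = sum_i d_S(i) * d_N(n - 1 - i), compared with the
# regular-language recurrence.
N = 20

d_N = [0] * N
for n in range(2, N):
    d_N[n] = 1                   # N derives exactly b^n for n >= 2

d_S = [0] * N
d_S[1] = 1                       # S => a
for n in range(2, N):            # S => SaN: split n-1 between S and N
    d_S[n] = sum(d_S[i] * d_N[n - 1 - i] for i in range(1, n))

for n in range(4, N):
    assert d_S[n] == d_S[n - 1] + d_S[n - 3]
assert d_S[7] == 2               # aabbbbb and aabbabb
```

Both computations agree because every word of the grammar is a followed by blocks ab^k with k ≥ 2, and the convolution simply splits off the last block.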

Example. Let G 2 be the following grammar:



S => aSb | ab

Although this is a properly context-free language, calculating the language's density
remains quite simple and suggests that the acceptance probability of all context-free
languages can be completely described in a form similar to the one presented for
star-height-1 languages:

d S (2) = 1
d S (n) = d S (n−2)

Based on the above examples and on the Chomsky–Schützenberger Theorem, which states
that for every context-free language and PDA M = (Q, Σ, Γ, δ, q 0 , Z 0 , F) there is a regular
language R, the Dyck set D 2 and two homomorphisms g, h such that L(M) = h(g −1 (D 2 )
∩ R), we claim that context-free languages are equally difference shrinking.

While we can foresee challenges in using our results on higher classes in the
construction of new proof techniques, the long outstanding P vs. NP problem should
provide enough incentive to make the attempt. The phase transition that such
NP-complete problems exhibit is only possible because the language's acceptance
probability switches from sections being difference shrinking to difference increasing,
as shown in the example below.

Figure 6. Example NP-complete languages having acceptance probability graphs with
sections of increasing difference (some of them indicated).

We think that finding the minimal step width for a given language would help in the

search for new proof techniques. As mentioned earlier, the minimal step width should

indicate more properties related to the complexity of accepting the language.

5 Conclusions

We have given a first overview of a new attempt at characterizing classes from the
Chomsky Hierarchy using properties derived from the languages' acceptance probability
graphs. Regular languages up to star height 1 have graphs that can be split into
difference shrinking chains. Current research suggests that this also holds for
context-free languages. Knowing that NP-complete languages usually have graphs
exhibiting a phase transition between difference shrinking and difference increasing
sections, we recommend further work. Especially the problem of finding the minimal
step width seems to be crucial in the construction of new proof techniques.

Class           Acceptance Probability                                         Properties
finite          Acc(L, n) = 0                                                  convergent to 0
simple          Acc(L, n) = 2 d L (n−1) / 2 n                                  convergent to a constant (stable)
one star        Acc(L, n) = c d L (n−d) / 2 n                                  as above & at most one decreasing chain
star height 1   Acc(L, n) = [u 1 d L (n−v 1 ) + ... + u m d L (n−v m )] / 2 n  monotone 2 , difference shrinking chains
context free    Acc(L, n) = [Σ i d S (i) d N (n−d−i) + ...] / 2 n              monotone, difference shrinking chains 3
NP complete     Acc(L, n) = ?                                                  as above & difference increasing chains, non-monotonic chains

Table 1. Acceptance probability of different classes from the Chomsky Hierarchy (state of
the art; the class of context-free languages is currently being looked at).


We'd like to thank the anonymous referees for their comments.


H. Buhrmann & L. Torenvliet (2005). 'A Post's Program for Complexity Theory', BEATCS 85

(pp. 41-51)

P. Clote & E. Kranakis (2002). 'Boolean Functions and Computation Models', Springer,

M. Hartwig et al. (2006a). 'In Search of a New Proof Technique', M2USIC06

M. Hartwig et al. (2006b). 'Proving Non Regularity using Acceptance Probability Techniques',


A. Bruggemann-Klein & R. Mesing. (2007). 'Regular Expressions into Finite Automata,

http://webcourse.cs.technion.ac.il/236826/Spring2005/ho/WCFiles/RegularExpressions into Finite


D. Giammarresi, R. Montalbano, D. Wood (2001). 'Block-Deterministic Languages',


M. Sipser (1997). 'Introduction to the Theory of Computation', PWS Publishing Company (pp.



3 Claimed for some languages.





O. Dubois et al. (2000). 'Typical Random 3-SAT Formulae and the Satisfiability Threshold',

SODA '00 (pp. 126-127)

D. Achlioptas et al. (2001). 'The Phase Transition in 1-in-k SAT and NAE 3-SAT', SODA '01

(pp. 721-722)

L. Kirousis et al. (1998). 'Approximating the unsatisfiability threshold of random formulas',

Random Structures and Algorithms 12(3) (pp. 253-269)

D. Achlioptas et al. (2001). 'A Sharp Threshold yields in Proof Complexity Yields a

Lower Bound for Satisfiability Search', Journal of Comp. & Sys. Sci. 68 (2)

M. Hartwig (2008), 'Regular Languages up to Star Height 1 have Difference Shrinking

Acceptance Probability', TMFCS-08

M. Bodirsky et al. (2004), 'Efficiently computing the density of regular languages', Proceedings

of Latin American INformatics (LATIN'04), pages 262-270, Buenos Aires

M.P. Schützenberger (1962), 'Finite counting automata', Information and Control 5(2), 91-107

S. Eilenberg (1974), 'Automata, Languages, and Machines', Academic Press, Inc., Orlando,
Florida, USA

A. Szilard et al.(1992), 'Characterizing Regular Languages with Polynomial Densities', Lecture

Notes in Computer Science, Volume 629, Springer, 494-503

G. Rozenberg et al. (1997), 'Handbook of Formal Languages', Chapter 2: Regular Languages,


E. D. Demaine et al. (2003), 'On Universally Easy Classes for NP-complete Problems',

Theoretical Computer Science, Vol. 304, pages 471-476

D. Krieger et al. (2007), 'Finding the Growth Rate of a Regular Language in Polynomial Time',

CoRR abs/0711.4990

P. Habermehl et al. (2000), 'A Note on SLRE', http://citeseer.ist.psu.edu/375870.html

P. Flajolet (1987), 'Analytic Models and Ambiguity of Context-Free Languages', TCS, 49:283-


L. Ilie et al. (2000), 'A Characterization of Polyslender Context-Free Languages', Theoret.

Informatics Appl., 34(1):77-86

R. Incitti (2000), 'The Growth Function of Context-Free Languages', Theoretical Computer

Science, 255:601-605

G. Eisman et al. (2005), 'Approximate Recognition of Non-regular Languages by Finite

Automata', Proceedings of the Twenty-Eighth Australasian Computer Science Conference

(ACSC2005), Newcastle, Australia

S. Kleene (1956), 'Representation of events in nerve nets and finite automata', Automata

Studies, Princeton University Press, Princeton, USA, 3-42

W. S. Kulloch et al. (1943), 'A logical calculus of the ideas immanent in the nervous activity',

Bull. Math. Biophys, 5:115-133

N. Moreira et al. (2005), 'On the Density of Languages Representing Finite Set Partitions',

Journal of Integer Sequences, Vol. 8


Proceedings of the 13 th ESSLLI Student Session


Simon Hopp

University of Konstanz

Abstract. This paper reports results from two experiments investigating distance

effects in sentence processing. It is well known that the processing difficulty of

a dependency relation increases with the distance between the two items concerned.

The paper addresses the question of what exactly determines ‘distance’: the time or the

amount of linguistic material between the first and the second item. Experiment 1

disentangles these factors and suggests that linguistic material is the source of

difficulty. Experiment 2 investigates the role of the characteristics of that

intervening material. The logic of this experiment is based on Gibson’s (2000)

claim that the ease of integrating a word into the current partial phrase marker (CPPM) decreases with the number

of newly introduced discourse referents. In particular, experiment 2 asks whether

adverbials which do not introduce new discourse referents have the same effect.

The results indicate that while intervening discourse referents elicit the expected

effect, adverbials do not show any effect at all.

1 Working Memory and Sentence Processing

In cognitive science there is a broad agreement that a certain kind of store is necessary

for all kinds of complex cognitive tasks such as mental arithmetic or language

processing. The following example (cf. Gibson 2000) illustrates the need for a short

term store in sentence processing.

(1) The reporter [that the senator attacked] admitted the error.

In (1) the short term store (or working memory) has to keep the determiner phrase (DP)

the reporter active over the period of time in which the relative clause is processed, to

ensure that the human sentence parser is able to link the DP to the verb admitted and

then to check the grammatical features correctly. Since sentences can contain several

dependencies between items and these items can be separated by further items, storing

linguistic information over a short time is a basic requirement for sentence processing.

As has long been noticed in linguistic theory, sentences like (1) often lead to processing

difficulties (e.g. Just & Carpenter 1992). One of the reasons for this fact is the distance

between the linguistic items dependent on each other. It seems that integrating a word w

into the CPPM (Current Partial Phrase Marker) is often adversely affected by the

distance between w and information within the CPPM necessary for integrating w.

However, it is still unclear why prior pieces in the CPPM are difficult to retrieve at later

points. There are two prominent mechanisms that are said to be responsible for

forgetting over a short term: The amount of time that passes between two items and

linguistic material that has to be processed between two items. According to time-based

decay earlier information might already have faded away at the point when it is needed

again. In current models of working memory involvement in sentence processing, time-based

decay either plays a decisive role (e.g., Lewis & Vasishth 2005) or is taken as one

possible candidate for contributing to the cost of integrating a word into the sentence



(Levy et al. 2007). The alternatives to theories of time-based decay are event-based

models (cf. Lewandowsky et al. 2004). Those models admit that forgetting in working

memory is observed over time, but they predict that time is not the crucial factor for this

phenomenon. Some event-based models argue that it is rather interference of linguistic

material that leads to processing difficulties (e.g. Nairne 1990). Items that have already

been processed may be forgotten by the time they are needed again, because new

incoming material interferes. Clarifying the role of time-based decay versus

interference-based forgetting is complicated because normally amount of linguistic

material and amount of time are confounded.

2 Case Checking as a Test Case

In this paper, I present two experiments that were run to investigate the nature of

forgetting in working memory. 1 The focus was on the process of linking and checking

in German verb-final clauses adhering to the scheme in (2). When the verb is integrated

in clause-final position, the case of the NP must be retained until the end of the sentence in

order to be checked against the case feature of the verb.

(2) .. dass NP[case: X] … {distance} … verb[case: Y]

An example of a verb-final clause in German, as it was used in the following

experiments, is given in (3).

(3) Ich glaube, dass die Studentin das wichtige Buch gelesen hat.

I think that the student(fem) the important book read has

‘I think that the student has read the important book.’

The clause-final auxiliary hat requires nominative case on the NP die

Studentin. The case feature of this NP has to be kept in memory over a

certain distance until the auxiliary hat is reached. The human sentence parser is then

able to link the two dependent items - NP and verb - and to check the case features of

both items. If, however, the distance between the verb and the related NP is too long,

then working memory is unable to keep the memory trace until the end of the sentence.

In this case processing difficulties arise which can be measured experimentally.

As mentioned above, amount of linguistic material and amount of time are

normally confounded. In the first experiment the two factors were disentangled to

investigate their respective impact on the human sentence parser independently. This

builds on related work by Lewandowsky et al. (2004) and Saito & Miyake (2004).

The second experiment focused on sentence complexity according to the

Dependency Locality Theory (DLT) (Gibson 2000). The DLT assumes that the costs of

integrating a word w increase with the number of new discourse referents intervening

between w and information needed to integrate w. For case-checking in German this

prediction has not been tested so far.

3 Experiment 1: Time-Based Decay versus Interference

As shown in (3), the issue of forgetting in working memory was addressed by

investigating the process of case-checking during the parsing of German verb-final

clauses. When the verb in clause-final position is integrated, the case of the NP must be

1 The experiments were part of a bigger project on sentence processing together with Markus Bader.



retrieved in order to check it against the case feature of the verb. If the intervening

distance is too long, essential information about case features will be lost at a later point

when it is needed again. To be able to investigate the nature of ‘distance’ the crucial

factors have to be disentangled. This is achieved by manipulating the factors

independently. First of all, a procedure was chosen that allowed the stimuli to be presented

experimenter-paced in a non-cumulative word-by-word fashion (for details see section

Procedure). Two different presentation rates, one for a slow and one for a fast

presentation, were preset. Second, the intervening material between the related items

was manipulated. Sentences as in (3) were created in a long and in a short version.

Additional adverbials (e.g. ‘für die letzte Prüfung im Mai’) were inserted for the long

versions, as can be seen in (4):

(4) Ich glaube, dass die Studentin (für die letzte Prüfung im Mai)

I think that the student(fem) (for the last exam in May)

das wichtige Buch gelesen hat.

the important book read has

‘I think that the student has read the important book.’

A cross-combination of the two independently manipulated factors led to four different

conditions that were presented (see Figure 1). Sentence (a) is a short sentence presented

in the fast presentation rate (short-fast). Sentence (b) contains additional material and is

also presented in the fast pace (long-fast). Sentences (c) and (d) are both presented in

the slow pace. Note that (c) does not contain any additional material (short-slow),

whereas (d) contains an additional adverbial (long-slow). Note especially that

conditions (b) and (c) differ in the amount of intervening material, but - due to the

different presentation rates - they are matched in the amount of time.



(a) short-fast: das wichtige Buch
(b) long-fast: für die letzte Prüfung im Mai das wichtige Buch
(c) short-slow: das wichtige Buch
(d) long-slow: für die letzte Prüfung im Mai das wichtige Buch

Figure 1: Presentation Time of all 4 Sentence Types of Experiment 1

This design allows analyzing the impact of both factors independently. As this experiment

partly builds on work by Lewandowsky et al. (2004), their terminology for the crucial factors will

be adopted: the factors are labeled Time (amount of time) and Event (intervening material).

Participants and Material

16 students of the University of Konstanz participated for course credit or payment. All

participants were native speakers of German and naive with respect to the purpose of the


128 sentences were created, each in 16 versions according to the factors Voice (active

versus passive), Status (grammatical versus ungrammatical), Time (fast versus slow) and Event

(long versus short). Table 1 shows a sample stimulus item of Experiment 1.


Table 1. Sample Stimuli Item of Experiment 1

Intervening material for all „(adverbial)“ slots:
([…] für die letzte Prüfung im August […])
([…] for the last exam in August […])

(Active/ Grammatical)

Der Dozent hofft, dass die Studentin (adverbial) das wichtige Buch gelesen hat

the lecturer hopes that the(nom) student(fem) (adverbial) the important book read has

'The lecturer hopes that the student has read the important book (for the last exam in August).'

(Passive/ Grammatical)

Der Dozent hofft, dass der Studentin (adverbial) das wichtige Buch besorgt wurde

the lecturer hopes that the(dat) student(fem) (adverbial) the important book obtained was

'The lecturer hopes that the important book (for the last exam in August) was obtained for the student.'

(Active/ Ungrammatical)

Der Dozent hofft, dass der Studentin (adverbial) das wichtige Buch gelesen hat

the lecturer hopes that the(dat) student(fem) (adverbial) the important book read has

'The lecturer hopes that the student has read the important book (for the last exam in August).'


(Passive/ Ungrammatical)
Der Dozent hofft, dass die Studentin (adverbial) das wichtige Buch besorgt wurde.

the lecturer hopes that the(nom) student(fem) (adverbial) the important book obtained was.

'The lecturer hopes that the important book (for the last exam in August) was obtained for the student.'

The length of intervening material and presentation rate were manipulated

independently. The factor Event (intervening material) was varied by adding adverbials

of six words for the long version (cf. Table 1). The factor Time (presentation rate) was

either fast (188 ms/word + 25 ms/character) or slow (369 ms/word + 44 ms/character).
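The per-word exposure duration under such a rate can be sketched as follows (a minimal illustration; the function name and the way the rate is packaged as a pair are ours, with the two rates taken from the text):

```python
def word_duration_ms(word: str, base_ms: int, per_char_ms: int) -> int:
    """Exposure time for one word: a fixed base plus a per-character increment."""
    return base_ms + per_char_ms * len(word)

# The two presentation rates of Experiment 1 (ms/word, ms/character):
rate_a = (188, 25)
rate_b = (369, 44)

# e.g. the word "Studentin" (9 characters):
print(word_duration_ms("Studentin", *rate_a))  # 188 + 9*25 = 413
print(word_duration_ms("Studentin", *rate_b))  # 369 + 9*44 = 765
```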


Procedure

In both experiments the speeded grammaticality judgment method was used. In this

procedure sentences are presented in a word-by-word fashion. Each trial begins with the

presentation of the words "Bitte Leertaste drücken" ("Please Press Spacebar") to start

the sentence. After pressing the spacebar, a fixation point appears in the center of the

screen for 1050ms. Thereafter the sentence is shown word by word in the center of the

screen. Immediately after the last word the participants are asked to judge the

grammaticality of the sentence as fast as possible by pressing one of two response

buttons. Type of response and response time are recorded automatically. If a subject

does not give a response within 2000ms after the last word appeared, the words "zu

langsam" ("too slow") are shown and the trial is finished. In both experiments each

subject received at least 10 practice items before the experimental sessions started.
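The trial structure described above can be sketched as an event schedule (a simplified illustration; the function and variable names are our assumptions, and no actual display code is included):

```python
# One trial of the speeded grammaticality judgment task: a fixation point,
# word-by-word presentation, then a judgment prompt with a 2000 ms deadline.
def trial_schedule(words, base_ms, per_char_ms,
                   fixation_ms=1050, deadline_ms=2000):
    events = [("fixation", fixation_ms)]
    for w in words:
        # Each word stays on screen for a base duration plus a
        # per-character increment, as in the presentation rates above.
        events.append((w, base_ms + per_char_ms * len(w)))
    events.append(("judgment deadline", deadline_ms))
    return events

sentence = "Ich glaube, dass die Studentin das wichtige Buch gelesen hat".split()
schedule = trial_schedule(sentence, 188, 25)
print(schedule[0])   # ('fixation', 1050)
print(schedule[-1])  # ('judgment deadline', 2000)
```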

In experiment 1, all sentences were presented in two separate blocks in two

different paces (according to the manipulations of the factor Time in a slow and in a fast

pace). Every participant had to fulfill the experiment in both paces within one

experimental session. Each block contained half of the entire set of sentences. Therefore

each participant saw half of the sentences in the slow condition and the other half in the

fast condition. The order of the two blocks alternated between participants. The

sentences were presented with filler sentences. The proportion of experimental

sentences to filler sentences was 1:1. Filler sentences covered a range of various

constructions and were half grammatical and half ungrammatical. Most of the fillers

served as experimental items in two other experiments.



Results

The percentages of correct judgments in Experiment 1 are shown in Figure 2

(grammatical conditions) and Figure 3 (ungrammatical conditions). Statistical analyses

were conducted with subject as the random factor (F1) and with sentences as the

random factor (F2). The following main effects occurred: First, a significant effect of

the factor Event is obtained (F1(1,15)=22.30, p

4 Experiment 2: The Role of Complexity in Sentence Parsing

Experiment 2 investigated the role of sentence complexity according to Gibson’s

Dependency Locality Theory (Gibson 2000) in the context of verb-final clauses in German.

The DLT is a resource-driven model of language processing. The model assumes two

major kinds of resource use. First, integrating a new word w into the current structure

causes some cost (integration cost). Second, keeping the structure in memory also

causes a certain kind of cost (storage cost). A central idea of the DLT is locality. Gibson

assumes that the cost of integrating a new element into the current structure depends on

the distance between the new element and the related element already processed. The

assumption is that the distance is defined by the number of discourse referents that are

newly introduced between the items concerned.
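Under this locality assumption, distance can be sketched as a count of new discourse referents between two dependent positions (a minimal illustration; the representation of words as (token, introduces-new-referent) pairs and the function name are our assumptions):

```python
# DLT-style distance: count the discourse referents newly introduced
# between a dependent item and the word that integrates it.
def dlt_distance(words, start, end):
    """words: list of (token, introduces_new_referent) pairs."""
    return sum(1 for tok, new_ref in words[start + 1:end] if new_ref)

# "... die dem Studenten das Skript ausleiht ..." introduces two new
# referents (Studenten, Skript) between the relative pronoun and the verb:
clause = [("die", False), ("dem", False), ("Studenten", True),
          ("das", False), ("Skript", True), ("ausleiht", False)]
print(dlt_distance(clause, 0, 5))  # 2
```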

If this is so, an interesting question is whether material not introducing a new

discourse referent also affects the ease of integrating w into the CPPM. This was tested

in experiment 2 by means of adverbial material. The crucial factors of experiment 2

therefore are: Adverbial and Discourse Referents (DR).

Participants and Material

16 students of the University of Konstanz participated for course credit or payment. All

participants were native speakers of German and naive with respect to the purpose of

the experiment.

We created 128 sentences, each in 16 versions according to the factors Voice

(active versus passive), Status (grammatical versus ungrammatical), Adverbial (NoAdv

versus Adv) and Discourse Referents (0 DR versus 2 DR).
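The 16 versions per item arise from fully crossing the four two-level factors; the enumeration can be sketched as follows (factor and level names follow the text, the code itself is ours):

```python
import itertools

factors = {
    "Voice":     ["active", "passive"],
    "Status":    ["grammatical", "ungrammatical"],
    "Adverbial": ["NoAdv", "Adv"],
    "DR":        ["0 DR", "2 DR"],
}

# The full crossing of four binary factors yields the 16 versions of each item.
versions = list(itertools.product(*factors.values()))
print(len(versions))  # 16
```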

Table 2 shows a sample stimulus item of Experiment 2. All versions begin with the matrix clause:

Ich vermute, dass […]
I guess that […]

Table 2. Sample Stimuli Item of Experiment 2

(NoAdv. / 0 DR)

[…] meine Professorin, die sehr gut erklärt, eine freie Stelle ausgeschrieben hat.

[…] my professor(fem) who very good explains a vacant position offered has

‘I guess that my professor, who explains very well, has offered a vacant position.’

(Adv. / 0 DR)

[…] meine Professorin, die immer wieder sehr gut erklärt, eine freie Stelle ausgeschrieben hat.

[…] my professor(fem) who again and again very good explains a vacant position offered has

‘I guess that my professor, who explains very well repeatedly, has offered a vacant position.’

(NoAdv. / 2 DR)

[…] meine Professorin, die dem Studenten das Skript ausleiht, eine freie Stelle ausgeschrieben hat.

[…] my professor(fem) who the student(dat) the script lends a vacant position offered has

‘I guess that my professor, who lends the script to the student, has offered a vacant position.’

(Adv. / 2 DR)

[…] meine Professorin, die dem Studenten doch noch das Skript ausleiht, eine freie Stelle

[…] my professor(fem) who the student(dat) eventually the script lends a vacant position

ausgeschrieben hat.

offered has

‘I guess that my professor, who eventually lends the script to the student, has offered a vacant position.’


The complexity of relative clauses was manipulated in a two-factorial way. First,

the relative clause contains either 0 or 2 new NP-related discourse referents. The event

referent introduced by the verb is ignored as it is introduced in all four relative clause

types. Second, the relative clause does or does not contain an additional adverbial of

two words. Both factors were crossed. The resulting conditions are shown below in

Figure 4. Relative-clause complexity increases from (a) to (d). Furthermore, (b) and (c)

are matched according to the number of words they contain, but they differ in their

internal structure. As one can see below, (b) contains additional adverbials of two words

(“immer wieder”), but does not include any newly introduced discourse referents.

Sentence type (c), on the other hand, contains no adverbial but introduces two new discourse referents

(“Studenten” and “Skript”).


Procedure

Experiment 2 used the same procedure as experiment 1, the speeded grammaticality

judgment task, but the presentation time was not manipulated. The experiment was

conducted in a single block. A presentation rate of

252 ms per word + an additional 28 ms per letter was used.



(a) die sehr gut erklärt
(b) die immer wieder sehr gut erklärt
(c) die dem Studenten das neue Skript ausleiht
(d) die dem Studenten doch noch das neue Skript ausleiht

Figure 4: Length of Relative Clauses (According to the Number of Words)

The percentages of correct judgments in Experiment 2 are provided in Figure 5 (for

grammatical conditions) and Figure 6 (for ungrammatical conditions). Statistical

analyses revealed main effects for the factors Status (F1(1,15)= 26.57, p < .001;

F2(1,15)= 213.43, p

Figure 5. Percentages of correct judgments for Grammatical Sentences

Figure 6. Percentages of correct judgments for Ungrammatical Sentences

5 General Discussion

In Experiment 1, the factors Time and Event were disentangled to investigate the nature of

distance in sentence processing. The experiment had a clear-cut outcome for both factors.

First, the factor Event clearly affects sentence processing. This can be seen especially in

ungrammatical passive sentences. In that condition, the percentage of correct

judgments decreases by about 14% for long compared to short sentences.

experimental work has shown, ungrammatical passive sentences are always judged less

reliably (cf. Bader & Bayer 2006). More material to process increases processing difficulty

immensely, which results in a higher error rate of long sentences compared to short

sentences. Second, the factor Time does not seem to affect sentence processing as predicted

by time-based models. For short sentences, the slow presentation rate resulted in better

performance than the fast presentation rate. This goes against the predictions. Long rather

than short time intervals should affect sentence processing adversely (note that the fast

presentation rate was not too fast, as can be seen in high percentages of correct judgments

with up to 92%). For long sentences the presentation rate had no effect at all. The results

suggest that time-based decay does not contribute to the difficulty of integrating a new word

into the CPPM.

Experiment 2 has two major results. First, confirming prior results, the number of new

discourse referents had a major effect. Sentences containing two new discourse referents in

the relative clause received significantly more judgment errors. Second, an intervening

adverbial had no effect at all. This can be seen clearly in sentences which were equal in

length in terms of the number of words they contain, but which were manipulated with

different material. Sentences that contained new discourse referents but no additional

adverbial received substantially more judgment errors than sentences containing the same

amount of words, but only containing additional adverbials. The results suggest that the

pure linear distance between w and information necessary to integrate w cannot be the

source of the observed difficulty. In particular, finding no differences between (a) versus (b)

and (c) versus (d), but a substantial difference between (b) and (c) (cf. Figure 4) argues

against theories assuming that time or pure length - not introducing a new discourse referent

- leads to forgetting in working memory. The results therefore support the Dependency

Locality Theory of Gibson (2000).




M. Bader & J. Bayer (2006). Case and Linking in Language Comprehension. Evidence

from German, Springer, Dordrecht.

E. Gibson (2000). ‘The dependency locality theory: A distance-based theory of

linguistic complexity’. In A. Marantz et al. (eds.), Image, Language, Brain. MIT Press.

S. Hopp & M. Bader (in prep.). ‘Forgetting in Working Memory: Interference versus

Decay? Evidence from German Sentence Processing’.

M. A. Just & P. A. Carpenter (1992). ‘A Capacity Theory of Comprehension: Individual

Differences in Working Memory’. Psychological Review, vol. 99, no.1.

R. L. Lewis & S. Vasishth (2005). ‘An activation-based model of sentence processing

as skilled memory retrieval’. Cognitive Science 29.

R. Levy et al. (2007). ‘The syntactic complexity of Russian relative clauses’. Paper

presented at the Annual Conference on Human Sentence Processing – CUNY 2007,

San Diego, CA.

S. Lewandowsky et al. (2004). ‘Time does not cause forgetting in short-term serial

recall’. Psychonomic Bulletin & Review 11.

J. S. Nairne (1990). ‘A feature model of immediate memory’. Memory & Cognition, 18.

S. Saito & A. Miyake (2004). ‘On the nature of forgetting and the processing-storage

relationship in reading span performance’. Journal of Memory and Language, 20.





Pierre Lison

German Research Center for Artificial Intelligence

Abstract. We present an implemented model for speech recognition in natural environments

which relies on contextual information about salient entities to prime utterance recognition.

The hypothesis underlying our approach is that, in situated human-robot interaction, speech

recognition performance can be significantly enhanced by exploiting knowledge about the

immediate physical environment and the dialogue history. To this end, visual salience (objects

perceived in the physical scene) and linguistic salience (previously referred-to objects

within the current dialogue) are integrated into a single cross-modal salience model. The

model is dynamically updated as the environment evolves, and is used to establish expectations

about uttered words which are most likely to be heard given the context. The update is

realised by continuously adapting the word-class probabilities specified in the statistical language

model. The present article discusses the motivations behind our approach, describes

our implementation as part of a distributed, cognitive architecture for mobile robots, and

reports the evaluation results on a test suite.

1 Introduction

Recent years have seen increasing interest in service robots endowed with communicative

capabilities. In many cases, these robots must operate in open-ended environments

and interact with humans using natural language to perform a variety of service-oriented

tasks. Developing cognitive systems for such robots remains a formidable challenge.

Software architectures for cognitive robots are typically composed of several cooperating

subsystems, such as communication, computer vision, navigation and manipulation

skills, and various deliberative processes such as symbolic planners (Langley, Laird and

Rogers, 2005).

These subsystems are highly interdependent. It is not enough to equip the robot with

basic functionalities for dialogue comprehension and production to make it interact naturally

in situated dialogues. We also need to find meaningful ways to relate language,

action and situated reality, and enable the robot to use its perceptual experience to continuously

learn and adapt itself to the environment.

The first step in comprehending spoken dialogue is automatic speech recognition (ASR).

For robots operating in real-world noisy environments, and dealing with utterances pertaining

to complex, open-ended domains, this step is particularly error-prone. In spite of

continuous technological advances, the performance of ASR remains for most tasks at

least an order of magnitude worse than that of human listeners (Moore, 2007).

One strategy for addressing this issue is to use context information to guide the speech

recognition by percolating contextual constraints to the statistical language model (Gruenstein,

Wang and Seneff, 2005). In this paper, we follow this approach by defining a context-sensitive

language model which exploits information about salient objects in the visual

scene and linguistic expressions in the dialogue history to prime recognition. To this end,



a salience model integrating both visual and linguistic salience is used to dynamically

compute lexical activations, which are incorporated into the language model at runtime.

Our approach departs from previous work on context-sensitive speech recognition by

modeling salience as inherently cross-modal, instead of relying on just one particular

modality such as gesture (Chai and Qu, 2005), eye gaze (Qu and Chai, 2007) or dialogue

state (Gruenstein et al., 2005). The FUSE system described in (Roy and Mukherjee, 2005)

is a closely related approach, but limited to the processing of object descriptions, whereas

our system was designed from the start to handle generic situated dialogues (cf. §3.3).

The structure of the paper is as follows: in the next section we briefly introduce the

software architecture in which our system has been developed. We then describe the

salience model, and explain how it is utilised within the language model used for ASR.

We finally present the evaluation of our approach, followed by conclusions.

Figure 1: Robotic platform (left) and example of a real visual scene (right)

2 Architecture

Our approach has been implemented as part of a distributed cognitive architecture (Hawes,

Sloman, Wyatt, Zillich, Jacobsson, Kruijff, Brenner, Berginc and Skocaj, n.d.). Each subsystem

consists of a number of processes, and a working memory. The processes can

access sensors, effectors, and the working memory to share information within the subsystem.

Figure 2 illustrates the spoken dialogue comprehension process. Numbers 1-11 in the

figure indicate the usual sequential order of the processes.

The speech recognition utilises Nuance Recognizer v8.5 together with a statistical language

model (§ 3.4). For the online update of word class probabilities according to the

salience model, we use the “just-in-time grammar” functionality provided by Nuance.

Syntactic parsing is based on an incremental chart parser 1 for Combinatory Categorial

Grammar (Steedman and Baldridge, 2003), and yields a set of interpretations – that is,

1 Built on top of the OpenCCG NLP library: http://openccg.sf.net



Figure 2: Schematic view of the architecture for spoken dialogue comprehension



logical forms expressed as ontologically rich, relational structures (Baldridge and Kruijff,

2001). Figure 3 gives an example of such a logical form.

These interpretations are then packed into a single representation (Oepen and Carroll,

2000; Kruijff, Lison, Benjamin, Jacobsson and Hawes, in submission), a technique which

enables us to efficiently handle syntactic ambiguity.

Once the packed logical form is built, it is retrieved by the dialogue recognition module,

which performs dialogue-level analysis tasks such as discourse reference resolution

and dialogue move interpretation, and consequently updates the dialogue structure.

@w1:cognition(want ∧ ind ∧ pres ∧
  (i1:person ∧ I ∧ sg) ∧
  (t1:action-motion ∧ take ∧
    y1:person ∧
    (m1:thing ∧ mug ∧ unique ∧ sg ∧ specific ∧ singular)) ∧
  (y1:person ∧ you))

Figure 3: Logical form generated for the utterance ‘I want you to take the mug’

Linguistic interpretations must finally be associated with extra-linguistic knowledge

about the environment – dialogue comprehension hence needs to connect with other subarchitectures

like vision, spatial reasoning or planning. We realise this information binding

between different modalities via a specific module, called the “binder”, which is responsible

for the ontology-based mediation across modalities (Jacobsson, Hawes, Kruijff

and Wyatt, 2008).

3 Approach

3.1 Motivation

As psycholinguistic studies have shown, humans do not process linguistic utterances in

isolation from other modalities. Eye-tracking experiments notably highlighted that, during

utterance comprehension, humans combine, in a closely time-locked fashion, linguistic

information with scene understanding and world knowledge (Altmann and Kamide,

2004; Knoeferle and Crocker, 2006).

These observations – along with many others – therefore provide solid evidence for the

embodied and situated nature of language and cognition (Lakoff, 1987; Barsalou, 1999).

Humans thus systematically exploit dialogue and situated context to guide attention

and help disambiguate and refine linguistic input by filtering out unlikely interpretations.

Our approach is essentially an attempt to reproduce this mechanism in a robotic system.

3.2 Salience modeling

In our implementation, we define salience using two main sources of information:

1. the salience of objects in the perceived visual scene;


2. the linguistic salience or “recency” of linguistic expressions in the dialogue history.

In the future, other sources could be added, for instance the possible presence of gestures

(Chai and Qu, 2005), eye gaze tracking (Qu and Chai, 2007), entities in large-scale

space (Zender and Kruijff, 2007), or the integration of a task model – as salience generally

depends on intentionality (Landragin, 2006).

3.2.1 Visual salience

Via the “binder”, we can access the set of objects currently perceived in the visual scene.

Each object is associated with a concept name (e.g. printer) and a number of features,

for instance spatial coordinates or qualitative properties like colour, shape or size.

Several features can be used to compute the salience of an object. The ones currently

used in our implementation are (1) the object size and (2) its distance relative to the robot

(i.e. spatial proximity). Other features could also prove to be helpful, like the reachability

of the object, or its distance from the point of visual focus – similarly to the spread of

visual acuity across the human retina. To derive the visual salience value for each object,

we assign a numeric value for the two variables, and then perform a weighted addition.

The associated weights are determined via regression tests.

At the end of this processing, we obtain a set E_v of visual objects, each of which
is associated with a numeric salience value s(e_k), with 1 ≤ k ≤ |E_v|.
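The weighted addition described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature names and the weights are hypothetical (the paper determines the actual weights via regression tests).

```python
def visual_salience(objects, w_size=0.6, w_prox=0.4):
    """Weighted addition of two normalised features: object size and
    spatial proximity to the robot. Weights are illustrative placeholders."""
    max_size = max(o["size"] for o in objects)
    max_dist = max(o["distance"] for o in objects)
    scores = {}
    for o in objects:
        size = o["size"] / max_size             # larger objects -> more salient
        prox = 1.0 - o["distance"] / max_dist   # closer objects -> more salient
        scores[o["id"]] = w_size * size + w_prox * prox
    return scores
```

Normalising each feature before the weighted sum keeps the two terms on comparable scales, so the weights alone control their relative importance.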

3.2.2 Linguistic salience

Proceedings of the 13th ESSLLI Student Session

There is a vast amount of literature on the topic of linguistic salience. Roughly speaking,

linguistic salience can be characterised either in terms of hierarchical recency, according

to a tree-like model of discourse structure, or in terms of linear recency of mention

(Kelleher, 2005). Our implementation can theoretically handle both types of linguistic
salience but, at the time of writing, only linear recency is computed.

To compute the linguistic salience, we extract a set E_l of potential referents from the
discourse structure, and for each referent e_k we assign a salience value s(e_k) equal to
the distance (measured on a logarithmic scale) between its last mention and the current
position in the discourse structure.
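A minimal sketch of linear-recency salience follows. The exact mapping from log-distance to a salience value is not fully specified in the text; the decreasing transform below is one plausible reading (more recently mentioned referents get higher values), and the data structures are hypothetical.

```python
import math

def linguistic_salience(last_mentions, current_pos):
    """Linear recency on a logarithmic scale: salience decays with the
    log of the distance between a referent's last mention and the
    current discourse position (one plausible reading of the model)."""
    scores = {}
    for referent, last_pos in last_mentions.items():
        distance = max(current_pos - last_pos, 1)
        scores[referent] = 1.0 / (1.0 + math.log(distance))
    return scores
```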

3.2.3 Cross-modal salience model

Once the visual and linguistic salience are computed, we can proceed to their integration

into a cross-modal statistical model. We define the set E as the union of the visual and
linguistic entities: E = E_v ∪ E_l, and devise a probability distribution P(E) on this set:

P(e_k) = δ_v · I_{E_v}(e_k) · s_v(e_k) + δ_l · I_{E_l}(e_k) · s_l(e_k)   (1)

where I_A(x) is the indicator function of set A, and δ_v, δ_l are factors controlling the
relative importance of each type of salience. They are determined empirically, subject to

the following constraint to normalise the distribution :

δ_v ∑_{e_k ∈ E_v} s(e_k) + δ_l ∑_{e_k ∈ E_l} s(e_k) = |E|   (2)

The statistical model P(E) thus simply reflects the salience of each visual or linguistic

entity: the more salient, the higher the probability.
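The integration step can be sketched as below. Note a simplification: the paper normalises the factors δ_v, δ_l so the salience mass equals |E| (Eq. (2)), whereas this sketch normalises the combined scores directly into a probability distribution; the function and argument names are hypothetical.

```python
def cross_modal_distribution(vis, lang, delta_v=1.0, delta_l=1.0):
    """Combine visual and linguistic salience scores into a single
    distribution P(E) over the union of entities: P(e_k) is proportional
    to delta_v * s_v(e_k) + delta_l * s_l(e_k)."""
    entities = set(vis) | set(lang)
    raw = {e: delta_v * vis.get(e, 0.0) + delta_l * lang.get(e, 0.0)
           for e in entities}
    total = sum(raw.values()) or 1.0
    return {e: raw[e] / total for e in entities}  # sums to 1
```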



3.3 Lexical activation

In order for the salience model to be of any use for speech recognition, a connection

between the salient entities and their associated words in the ASR vocabulary needs to

be established. To this end, we define a lexical activation network, which lists, for each

possible salient entity, the set of words activated by it. The network specifies the words

which are likely to be heard when the given entity is present in the environment or in

the dialogue history. It can therefore include words related to the object denomination,

subparts, common properties or affordances. The salient entity laptop will activate words

like ‘laptop’, ‘notebook’, ‘screen’, ‘opened’, ‘ibm’, ‘switch on/off’, ‘close’, etc. The list

is structured according to word classes, and a weight can be set on each word to modulate

the lexical activation: supposing a laptop is present, the word ‘laptop’ should receive a

higher activation than, say, the word ‘close’, which is less situation specific.

The use of lexical activation networks is a key difference between our model and (Roy

and Mukherjee, 2005), which relies on a measure of “descriptive fitness” to modify the

word probabilities. One advantage of our approach is the possibility to go beyond object

descriptions and activate word types denoting subparts, properties or affordances of

objects².

If the probability of specific words is increased, we need to re-normalise the probability

distribution. One solution would be to decrease the probability of all non-activated words

accordingly. This solution, however, suffers from a significant drawback: our vocabulary

contains many context-independent words like ‘thing’, or ‘place’, whose probability

should remain constant. To address this issue, we mark an explicit distinction in our

vocabulary between context-dependent and context-independent words.

In the current implementation, the lexical activation network is constructed semi-manually,
using a simple lexicon extraction algorithm. We start with the list of possible

salient entities, which is given by

1. the set of physical objects the vision subsystem can recognise;

2. the set of nouns specified in the CCG lexicon with ‘object’ as ontological type.

For each entity, we then extract its associated lexicon by matching domain-specific syntactic

patterns against a corpus of dialogue transcripts.
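A toy version of this extraction step is sketched below. The patterns and corpus are purely illustrative stand-ins for the paper's domain-specific syntactic patterns and Wizard-of-Oz transcripts.

```python
import re

def extract_lexicon(entities, transcripts):
    """For each salient entity, collect words that co-occur with it in
    simple surface patterns over dialogue transcripts. The three regex
    templates below are hypothetical examples of 'syntactic patterns'."""
    patterns = [r"the (\w+) {e}",   # modifier:  "the red ball"
                r"{e} is (\w+)",    # property:  "ball is big"
                r"(\w+) the {e}"]   # verb/affordance: "take the ball"
    lexicon = {e: set() for e in entities}
    for utt in transcripts:
        for e in entities:
            for p in patterns:
                for m in re.finditer(p.format(e=e), utt.lower()):
                    lexicon[e].add(m.group(1))
    return lexicon
```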

3.4 Language modeling

We now detail the language model used for the speech recognition – a class-based trigram

model enriched with contextual information provided by the salience model.

3.4.1 Corpus generation

We need a corpus to train any statistical language model. Unfortunately, no corpus of

situated dialogue adapted to our task domain was available. Collecting in-domain data via

Wizard of Oz experiments is a very costly and time-consuming process, so we decided

to follow the approach advocated in (Weilhammer, Stuttle and Young, 2006) instead and

generate a class-based corpus from a task grammar we had at our disposal.

In practice, we first collected a small set of WOz experiments, totalling about 800

utterances. This set is of course too small to be directly used as a corpus for language

2 In the context of a laptop object, ‘screen’ and ‘switch on/off’ would for instance be activated.


model training, but sufficient to get an intuitive idea of the kind of utterances we had to

deal with.

Based on it, we designed a domain-specific context-free grammar able to cover most

of the utterances. Weights were then automatically assigned to each grammar rule by

parsing our initial corpus, hence leading to a small stochastic context-free grammar.

As a last step, this grammar is randomly traversed a large number of times, which gives

us the generated corpus.
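The generation step (random traversal of a small stochastic CFG) can be sketched as follows. The grammar here is a hypothetical miniature, not the paper's task grammar; rule weights would come from parsing the initial WOz corpus.

```python
import random

def sample(grammar, symbol="S", rng=random):
    """Generate one utterance by randomly traversing a weighted CFG.
    Grammar format: nonterminal -> list of (weight, expansion) pairs;
    any symbol without an entry is treated as a terminal."""
    if symbol not in grammar:
        return [symbol]
    weights = [w for w, _ in grammar[symbol]]
    expansions = [e for _, e in grammar[symbol]]
    expansion = rng.choices(expansions, weights=weights)[0]
    return [tok for s in expansion for tok in sample(grammar, s, rng)]

# Illustrative miniature grammar (weights are made up).
grammar = {
    "S":   [(3, ["take", "NP"]), (1, ["NP", "is", "ADJ"])],
    "NP":  [(1, ["the", "N"])],
    "N":   [(2, ["ball"]), (1, ["box"])],
    "ADJ": [(1, ["red"])],
}
corpus = [" ".join(sample(grammar)) for _ in range(1000)]
```

Repeating the traversal a large number of times yields the training corpus, with frequent rules producing proportionally frequent utterances.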

3.4.2 Salience-driven, class-based language models

The objective of the speech recognizer is to find the word sequence W* which has the
highest probability given the observed speech signal O and a set E of salient objects:

W* = arg max_W  P(O|W) × P(W|E)   (3)

where P(O|W) is the acoustic model and P(W|E) the salience-driven language model.

For a trigram language model, the probability of the word sequence P(w_1^n|E) is:

P(w_1^n|E) ≃ ∏_{i=1}^{n} P(w_i | w_{i−1} w_{i−2}; E)   (4)


Our language model is class-based, so it can be further decomposed into word-class

and class transitions probabilities. The class transition probabilities reflect the language

syntax; we assume they are independent of salient objects. The word-class probabilities,
however, do depend on context: for a given class – e.g. noun – the probability of hearing
the word ‘laptop’ will be higher if a laptop is present in the environment. Hence:

P(w_i | w_{i−1} w_{i−2}; E) = P(w_i | c_i; E) × P(c_i | c_{i−1}, c_{i−2})   (5)

where the first factor is the word-class probability and the second the class transition
probability.

We now define the word-class probabilities P(w i |c i ;E):


P(w_i | c_i; E) = ∑_{e_k ∈ E} P(w_i | c_i; e_k) × P(e_k)   (6)

To compute P(w i |c i ; e k ), we use the lexical activation network specified for e k :

P(w_i | c_i; e_k) =
  P(w_i | c_i) + α_1   if w_i ∈ activatedWords(e_k)
  P(w_i | c_i) − α_2   if w_i ∉ activatedWords(e_k) ∧ w_i ∈ contextDependentWords
  P(w_i | c_i)          otherwise   (7)

The optimum value of α 1 is determined using regression tests. α 2 is computed relative

to α 1 in order to keep the sum of all probabilities equal to 1:

α_2 = (|activatedWords| × α_1) / (|contextDependentWords| − |activatedWords|)

These word-class probabilities are dynamically updated as the environment and the
dialogue evolve, and incorporated into the language model at runtime.
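The update of the word-class probabilities can be sketched as below. This is a simplified single-entity version of the piecewise rule (before mixing over entities as in Eq. (6)); the function names mirror the equations but are otherwise hypothetical.

```python
def update_word_class_probs(base, activated, context_dependent, alpha1=0.01):
    """Boost activated words by alpha1 and subtract alpha2 from the
    remaining context-dependent words, leaving context-independent
    words (e.g. 'thing', 'place') untouched, so the class distribution
    still sums to 1."""
    n_other = len(context_dependent - activated)
    alpha2 = len(activated) * alpha1 / n_other if n_other else 0.0
    probs = {}
    for w, p in base.items():
        if w in activated:
            probs[w] = p + alpha1
        elif w in context_dependent:
            probs[w] = p - alpha2
        else:
            probs[w] = p  # context-independent: probability kept constant
    return probs
```

The α_2 computed this way removes exactly the probability mass added by α_1, which is why the distinction between context-dependent and context-independent words matters for re-normalisation.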



4 Evaluation

4.1 Evaluation procedure

We evaluated our approach using a test suite of 250 spoken utterances recorded during

Wizard of Oz experiments. The participants were asked to interact with the robot while

looking at a specific visual scene. We designed 10 different visual scenes by systematic

variation of the nature, number and spatial configuration of the objects presented. Figure

4 gives an example of a visual scene.

The interactions could include descriptions, questions and commands. No particular

tasks were assigned to the participants. The only constraint we imposed was that all

interactions with the robot had to be related to the shared visual scene.

Figure 4: Sample visual scene including three objects: a box, a ball, and a chocolate bar.

4.2 Results

Table 1 summarises our experimental results. Due to space constraints, we focus our

analysis on the WER of our model compared to the baseline – that is, compared to a

class-based trigram model not based on salience.

Word Error Rate                Classical LM                  Salience-driven LM
vocabulary size ≃ 200 words    25.04 % (NBest 3: 20.72 %)    24.22 % (NBest 3: 19.97 %)
vocabulary size ≃ 400 words    26.68 % (NBest 3: 21.98 %)    23.85 % (NBest 3: 19.97 %)
vocabulary size ≃ 600 words    28.61 % (NBest 3: 24.59 %)    23.99 % (NBest 3: 20.27 %)

Table 1: Comparative results of recognition performance

4.3 Analysis

As the results show, the use of a salience model can enhance the recognition performance

in situated interactions: with a vocabulary of about 600 words, the WER is indeed reduced

by 16.1 % compared to the baseline. According to the sign test, the differences for the
last two tests (400 and 600 words) are statistically significant. As expected, the

salience-driven approach is especially helpful when operating with a larger vocabulary,


where the expectations provided by the salience model can really make a difference in the

word recognition.

The word error rate remains nevertheless quite high. This is due to several reasons.

The major issue is that the words causing most recognition problems are – at least in

our test suite – function words like prepositions, discourse markers, connectives, auxiliaries,

etc., and not content words. Unfortunately, the use of function words is usually not

context-dependent, and hence not influenced by salience. We estimated that 89 % of the

recognition errors were due to function words. Moreover, our chosen test suite consists
of “free speech” interactions, which often include lexical items or grammatical
constructs outside the range of our language model.

5 Conclusion

We have presented an implemented model for speech recognition based on the concept of

salience. This salience is defined via visual and linguistic cues, and is used to compute

degrees of lexical activations, which are in turn applied to dynamically adapt the ASR

language model to the robot’s environment and dialogue state.

As future work we will examine the potential extension of our approach in three directions.

First, we are investigating how to use the situated context to perform some priming

of function words like prepositions or discourse markers. Second, we wish to take other

information sources into account, particularly the integration of a task model, relying on

data made available by the symbolic planner. And finally, we want to go beyond speech

recognition, and investigate the relevance of such a salience model for the development of

a robust understanding system for situated dialogue.


My thanks go to G.-J. Kruijff, H. Zender, M. Wilson and N. Yampolska for their insightful comments.

The research reported in this article was supported by the EU FP6 IST Cognitive Systems

Integrated project Cognitive Systems for Cognitive Assistants “CoSy” FP6-004250-IP.


References

Altmann, G. T. and Kamide, Y. (2004). Now you see it, now you don’t: Mediating
the mapping between language and the visual world, Psychology Press, New York,
pp. 347–386.

Baldridge, J. and Kruijff, G.-J. M. (2001). Coupling CCG and hybrid logic dependency

semantics, ACL ’02: Proceedings of the 40th Annual Meeting on Association for

Computational Linguistics, ACL, Morristown, NJ, USA, pp. 319–326.

Barsalou, L. W. (1999). Perceptual symbol systems, Behavioral & Brain Sciences 22(4).

Chai, J. Y. and Qu, S. (2005). A salience driven approach to robust input interpretation in

multimodal conversational systems, Proceedings of Human Language Technology

Conference and Conference on Empirical Methods in Natural Language Processing

2005, Association for Computational Linguistics, Vancouver, Canada, pp. 217–224.

Gruenstein, A., Wang, C. and Seneff, S. (2005). Context-sensitive statistical language

modeling, Proceedings of INTERSPEECH 2005, pp. 17–20.


Hawes, N., Sloman, A., Wyatt, J., Zillich, M., Jacobsson, H., Kruijff, G.-J. M., Brenner,

M., Berginc, G. and Skocaj, D. (n.d.). Towards an integrated robot with multiple

cognitive functions., AAAI, AAAI Press, pp. 1548–1553.

Jacobsson, H., Hawes, N., Kruijff, G.-J. and Wyatt, J. (2008). Crossmodal content binding

in information-processing architectures, Proceedings of the 3rd ACM/IEEE International

Conference on Human-Robot Interaction (HRI), Amsterdam, The Netherlands.

Kelleher, J. (2005). Integrating visual and linguistic salience for reference resolution, in

N. Creaney (ed.), Proceedings of the 16th Irish conference on Artificial Intelligence

and Cognitive Science (AICS-05), Portstewart, Northern Ireland.

Knoeferle, P. and Crocker, M. (2006). The coordinated interplay of scene, utterance, and

world knowledge: evidence from eye tracking, Cognitive Science 30(3): 481–529.

Kruijff, G.-J. M., Lison, P., Benjamin, T., Jacobsson, H. and Hawes, N. (in submission).

Incremental, multi-level processing for comprehending situated dialogue in human-robot
interaction, Connection Science.

Lakoff, G. (1987). Women, fire and dangerous things: what categories reveal about the

mind, University of Chicago Press, Chicago.

Landragin, F. (2006). Visual perception, language and gesture: A model for their understanding

in multimodal dialogue systems, Signal Processing 86(12): 3578–3595.

Langley, P., Laird, J. E. and Rogers, S. (2005). Cognitive architectures: Research issues

and challenges, Technical report, Institute for the Study of Learning and Expertise,

Palo Alto.

Moore, R. K. (2007). Spoken language processing: piecing together the puzzle, Speech

Communication: Special Issue on Bridging the Gap Between Human and Automatic

Speech Processing 49: 418–435.

Oepen, S. and Carroll, J. (2000). Ambiguity packing in constraint-based parsing – practical
results, Proceedings of the 1st Conference of the North American Chapter of the
Association for Computational Linguistics, Seattle, WA, pp. 162–169.

Qu, S. and Chai, J. (2007). An exploration of eye gaze in spoken language processing for
multimodal conversational interfaces, Proceedings of the Conference of the North
American Chapter of the Association for Computational Linguistics, pp. 284–291.

Roy, D. and Mukherjee, N. (2005). Towards situated speech understanding: visual context

priming of language models, Computer Speech & Language (2): 227–248.

Steedman, M. and Baldridge, J. (2003). Combinatory categorial grammar. MS Draft 4.

Weilhammer, K., Stuttle, M. N. and Young, S. (2006). Bootstrapping language models

for dialogue systems, Proceedings of INTERSPEECH 2006, Pittsburgh, PA.

Zender, H. and Kruijff, G.-J. M. (2007). Towards generating referring expressions in

a mobile robot scenario, Language and Robots: Proceedings of the Symposium,

Aveiro, Portugal, pp. 101–106.



Petar Maksimović, Dragan Doder, Bojan Marinković and Aleksandar Perović

Mathematical Institute of the Serbian Academy of Sciences and Arts, Belgrade, Serbia

Abstract. This paper presents a sound and strongly complete axiomatization of reasoning
about linear combinations of conditional probabilities, including comparative statements.

The developed logic is decidable, with a PSPACE containment for the decision procedure.

1 Introduction

The present paper constitutes an effort to proceed along the lines of the research presented

in (Fagin, Halpern and Megiddo, 1990; Lukasiewicz, 2002; Ognjanović and Rašković,

1996; Ognjanović and Rašković, 1999; Ognjanović and Rašković, 2000; Ognjanović,

Marković and Rašković, 2005; Ognjanović, Perović and Rašković, 2008; Rašković, Ognjanović

and Marković, 2004), on the formal development of probabilistic logics, where

probability statements are expressed by probabilistic operators expressing bounds on the

probability of a propositional formula.

The main technical novelty of this paper is a sound and strongly complete axiomatization
of reasoning about linear combinations of conditional probabilities, which also allows
for qualitative statements. For instance, we formally

write the statement “the conditional probability of α given β is at least the sum of

conditional probabilities of α given γ and twice γ given α” as

CP(α, β) ≥ CP(α, γ) + 2 · CP(γ, α).

It should be noted that all of the probabilities we use are Kolmogorov-style. We also prove

that the developed logic is decidable.

As it is well known, the conditional probability of α given β has meaning only if

P(β) > 0, and is, by definition, calculated by

P(α|β) = P(α ∧ β) / P(β).



To avoid technical difficulties, we will adopt the convention that 0^{−1} = 1. Namely, it is
more convenient to assume that x ↦ x^{−1} is a total operation, this being considered usual
practice in quantifier elimination for the theory of real closed fields. In this way, we make

sure that conditional events are always defined.

The rest of the paper is organized as follows. In Section 2 the syntax of the logic is

given and the class of measurable probabilistic models is described. Section 3 contains

the corresponding axiomatization and introduces the notion of deduction. A proof of the

completeness theorem is presented in Section 4, whereas the decidability of the logic is

analyzed in Section 5. Concluding remarks are in Section 6.


2 Syntax and semantics

Let Var = {p_n | n < ω} be the set of propositional variables. The corresponding set of all
propositional formulas over Var will be denoted by For_C, where C stands for classical,
and is defined in the usual way. Propositional formulas will be denoted by α, β and γ,

possibly with indices.

Definition 1 The set Term of all probabilistic terms is recursively defined as follows:

• Term(0) = {s | s ∈ Q} ∪ {CP(α, β) | α, β ∈ For_C}.

• Term(n + 1) = Term(n) ∪ {(f + g), (s · g), (−f) | f, g ∈ Term(n), s ∈ Q}.

• Term = ⋃_{n=0}^{∞} Term(n).

Probabilistic terms will be denoted by f, g and h, possibly with indices. To simplify
notation, we introduce the following convention: f + g is (f + g), f + g + h is ((f + g) + h).
For n > 3, ∑_{i=1}^{n} f_i is ((· · · ((f_1 + f_2) + f_3) + · · ·) + f_n). Similarly, −f is (−f) and f − g
is (f + (−g)).

If α and β are propositional formulas, then the probabilistic term CP(α, β) reads “the

conditional probability of α given β”. To simplify notation, we will write P(α) instead

of CP(α, ⊤), where ⊤ is an arbitrary tautology instance.

Definition 2 A basic probabilistic formula is any formula of the form f ≥ 0. Furthermore,
we define the following abbreviations:

• f ≤ 0 is −f ≥ 0;   • f > 0 is ¬(f ≤ 0);   • f < 0 is ¬(f ≥ 0);
• f = 0 is f ≥ 0 ∧ f ≤ 0;   • f ≠ 0 is ¬(f = 0);   • f ≥ g is f − g ≥ 0.

We define f ≤ g, f > g, f < g, f = g and f ≠ g in a similar way.

We define the notion of a probabilistic formula as a Boolean combination of basic

probabilistic formulas. As in the propositional case, ¬ and ∧ are the primitive connectives,

while all of the other connectives are introduced in the usual way. Probabilistic formulas

will be denoted by φ, ψ and θ, possibly with indices. The set of all probabilistic formulas

will be denoted by For P .

By “formula” we mean either a classical formula or a probabilistic formula. We do

not allow for the mixing of those types of formulas, nor for the nesting of the probability

operator P . Formulas will be denoted by Φ, Ψ and Θ, possibly with indices. The set of

all formulas will be denoted by For.

We define the notion of a model as a special kind of Kripke model. Namely, a model

M is any tuple 〈W,H, µ, v〉 such that:

• W is a nonempty set. As usual, its elements will be called worlds.

• H is an algebra of sets over W .

• µ : H −→ [0, 1] is a finitely additive probability measure.


• v : For_C × W −→ {0, 1} is a truth assignment¹ compatible with ¬ and ∧. That is,
v(¬α, w) = 1 − v(α, w) and v(α ∧ β, w) = v(α, w) · v(β, w).

For a given model M, let [α] M be the set of all w ∈ W such that v(α, w) = 1. If

the context is clear, we will write [α] instead of [α] M . We say that M is measurable if

[α] ∈ H for all α ∈ For C .

Definition 3 Let M = 〈W,H, µ, v〉 be any measurable model. We define the satisfiability

relation |= recursively as follows:

• M |= α if v(α, w) = 1 for all w ∈ W .

• M |= f ≥ 0 iff f^M ≥ 0, where f^M is recursively defined in the following way:

– s^M = s.
– CP(α, β)^M = µ([α ∧ β]) · µ([β])^{−1}.
– (f + g)^M = f^M + g^M.
– (s · g)^M = s · g^M.
– (−f)^M = −(f^M).

• M |= ¬φ if M ̸|= φ.

• M |= φ ∧ ψ if M |= φ and M |= ψ.

A formula Φ is satisfiable if there is a measurable model M such that M |= Φ; Φ is

valid if it is satisfied in every measurable model. We say that the set T of formulas is

satisfiable if there is a measurable model M such that M |= Φ for all Φ ∈ T .

Notice that the last two clauses of Definition 3 provide validity of each tautology instance.

3 Axiomatization

In this section we will introduce the axioms and inference rules and prove that the proposed

axiomatization is sound and strongly complete with respect to the class of all measurable

models. The set of axioms from our axiomatic system, which we denote AX LPCP ,

is divided into three groups: axioms for propositional reasoning, axioms for probabilistic

reasoning and arithmetical axioms.

Axioms for propositional reasoning

A1. τ(Φ 1 , ...,Φ n ), where τ(p 1 , ...,p n ) ∈ For C is any tautology and Φ i are either all

propositional or all probabilistic.

Axioms for probabilistic reasoning

A2. P(α) ≥ 0;   A5. P(α ↔ β) = 1 → P(α) = P(β);
A3. P(⊤) = 1;   A6. P(α ∨ β) = P(α) + P(β) − P(α ∧ β);
A4. P(⊥) = 0;   A7. (P(α ∧ β) = r ∧ P(β) = s) → CP(α, β) = r · s^{−1}.

1 1 stands for “true”, while 0 stands for “false”


Arithmetical axioms.

A8. r ≥ s, whenever r ≥ s;          A16. s · (f + g) = (s · f) + (s · g);
A9. s · r = sr;                      A17. r · (s · f) = (r · s) · f;
A10. s + r = s + r;                  A18. 1 · f = f;
A11. f + g = g + f;                  A19. f ≥ g ∨ g ≥ f;
A12. (f + g) + h = f + (g + h);      A20. (f ≥ g ∧ g ≥ h) → f ≥ h;
A13. f + 0 = f;                      A21. f ≥ g → f + h ≥ g + h;
A14. f − f = 0;                      A22. (f ≥ g ∧ s > 0) → s · f ≥ s · g.
A15. (r · f) + (s · f) = (r + s) · f;

Inference rules

R1. From Φ and Φ → Ψ infer Ψ.

R2. From α infer P(α) = 1.

R3. From the set of premises {φ → f ≥ −n^{−1} | n = 1, 2, 3, ...} infer φ → f ≥ 0.

Let us briefly comment on the axioms and inference rules. The axioms A1-A7 provide

the required properties of probability, while the axioms A8-A22 provide the properties

required for computation. In the inference rules, R1 is modus ponens, R2 resembles

necessitation, while R3 ensures that non-Archimedean probabilities are not permitted.

Definition 4 A formula Φ is deducible from a set T of sentences (T ⊢ Φ) if there is an
at most countable sequence of formulas Φ_0, Φ_1, ..., Φ, such that every Φ_i is an axiom or
a formula from the set T, or it is derived from the preceding formulas of the sequence by
an inference rule. A formula Φ is a theorem (⊢ Φ) if it is deducible from the empty set. A set

of sentences T is consistent if there is at least one formula from For C , and at least one

formula from For P that are not deducible from T . Otherwise, T is inconsistent. A set T

is deductively closed if for every Φ ∈ For, if T ⊢ Φ, then Φ ∈ T .

Observe that the length of an inference may be any successor ordinal less than the
first uncountable ordinal ω_1. Using a straightforward induction on the length of the inference,

one can easily show that the above axiomatization is sound with respect to the class

of all measurable models.

4 Completeness

Theorem 1 (Deduction theorem) Suppose that T is an arbitrary set of formulas and that

Φ, Ψ ∈ For. Then, T ⊢ Φ → Ψ iff T ∪ {Φ} ⊢ Ψ.


Proof: If T ⊢ Φ → Ψ, then clearly T ∪ {Φ} ⊢ Φ → Ψ, so, by modus ponens (R1),

T ∪ {Φ} ⊢ Ψ. Conversely, let T ∪ {Φ} ⊢ Ψ. As in the classical case, we will use the

induction on the length of inference to prove that T ⊢ Φ → Ψ. The proof differs from the
classical case only when we apply the infinitary inference rule R3.

Suppose that Ψ is the formula φ → f ≥ 0 and that T ⊢ Φ → (φ → f ≥ −n^{−1}) for all
n. Since the formula (p_0 → (p_1 → p_2)) ↔ ((p_0 ∧ p_1) → p_2) is a tautology, we obtain
T ⊢ (Φ ∧ φ) → f ≥ −n^{−1} for all n (A1). Now, by R3, T ⊢ (Φ ∧ φ) → f ≥ 0. Hence, by
the same tautology, T ⊢ Φ → Ψ.

The next technical lemma will be used in the construction of a maximally consistent

extension of a consistent set of formulas.

Lemma 2 Suppose that T is a consistent set of formulas. If T ∪ {φ → f ≥ 0} is inconsistent,
then there is a positive integer n such that T ∪ {φ → f < −n^{−1}} is consistent.

Proof: The proof is by reductio ad absurdum. Thus, let us suppose
that T ∪ {φ → f < −n^{−1}} is inconsistent for all n. By the Deduction theorem, we
conclude that

T ⊢ φ → f ≥ −n^{−1}

for all n. By R3, T ⊢ φ → f ≥ 0, so T is inconsistent; a contradiction.

Definition 5 Suppose that T is a consistent set of formulas and that For P = {φ i | i =

0, 1, 2, 3, ...}. We define a completion T ∗ of T recursively as follows:

1. T 0 = T ∪ {α ∈ For C | T ⊢ α} ∪ {P(α) = 1 | T ⊢ α}.

2. If T i ∪ {φ i } is consistent, then T i+1 = T i ∪ {φ i }.

3. If T i ∪ {φ i } is inconsistent, then:

(a) If φ_i has the form ψ → f ≥ 0, then T_{i+1} = T_i ∪ {ψ → f < −n^{−1}}, where n
is a positive integer such that T_{i+1} is consistent. The existence of such an n is
provided by Lemma 2.

(b) Otherwise, T i+1 = T i .

Obviously, each T i is consistent. In the next theorem we will prove that T ∗ is deductively

closed, consistent and maximal with respect to For P .

Theorem 3 Suppose that T is a consistent set of formulas and that T ∗ is constructed as

above. Then:

1. T ∗ is deductively closed, id est, T ∗ ⊢ Φ implies Φ ∈ T ∗ .

2. There is φ ∈ For P such that φ /∈ T ∗ .

3. For each φ ∈ For P , either φ ∈ T ∗ , or ¬φ ∈ T ∗ .


Proof: We will prove only the first clause, since the remaining clauses can be proved

in the same way as in the classical case. In order to do so, it is sufficient to prove the

following four claims:

(i) Each instance of any axiom is in T ∗ .

(ii) If Φ ∈ T ∗ and Φ → Ψ ∈ T ∗ , then Ψ ∈ T ∗ .

(iii) If α ∈ T ∗ , then P(α) = 1 ∈ T ∗ .

(iv) If {φ → f ≥ −n^{−1} | n = 1, 2, 3, ...} is a subset of T*, then φ → f ≥ 0 ∈ T*.

(i): If Φ ∈ For C , then Φ ∈ T 0 . Otherwise, there is a nonnegative integer i such that

Φ = φ i . Since ⊢ φ i , T i ⊢ φ i as well, so φ i ∈ T i+1 .

(ii): If Φ, Φ → Ψ ∈ For_C, then Ψ ∈ T_0. Otherwise, let Φ = φ_i, Ψ = φ_j, and Φ →
Ψ = φ_k. Then, Ψ is a deductive consequence of each T_l, where l ≥ max(i, k) + 1.
Let ¬Ψ = φ_m. If φ_m ∈ T_{m+1}, then ¬Ψ is a deductive consequence of each T_n, where
n ≥ m + 1. So, for every n ≥ max(i, k, m) + 1, T_n ⊢ Ψ ∧ ¬Ψ, a contradiction.
Thus, ¬Ψ ∉ T*. On the other hand, if also Ψ ∉ T*, we have that T_n ∪ {Ψ} ⊢ ⊥ and
T_n ∪ {¬Ψ} ⊢ ⊥ for n ≥ max(j, m) + 1, a contradiction with the consistency of T_n.
Thus, Ψ ∈ T*.

(iii): If α ∈ T ∗ , then α ∈ T 0 , so P(α) = 1 ∈ T 0 .

(iv): Suppose that {φ → P(α) ≥ −n^{−1} | n = 1, 2, 3, ...} is a subset of T*. We want
to prove that φ → P(α) ≥ 0 ∈ T*. The proof is by reductio ad absurdum. So,
let φ → P(α) ≥ 0 = φ_i and let us suppose that T_i ∪ {φ_i} is inconsistent. By 3.(a) of
Definition 5, there is a positive integer n such that

T_{i+1} = T_i ∪ {φ → P(α) < −n^{−1}}

and T_{i+1} is consistent. Then, for all sufficiently large k, T_k ⊢ φ → P(α) < −n^{−1}
and T_k ⊢ φ → P(α) ≥ −n^{−1}, so T_k ⊢ φ → ψ for all ψ ∈ For_P. In particular,
T_k ⊢ φ → P(α) ≥ 0, i.e., T_k ⊢ φ_i for all sufficiently large k. But φ_i ∉ T*, so φ_i is
inconsistent with all T_k, k ≥ i. It follows that each T_k is inconsistent for sufficiently large
k, a contradiction.

Thus, T_i ∪ {φ_i} is consistent, so φ → P(α) ≥ 0 ∈ T_{i+1}.

For the given completion T ∗ , we define a canonical model M ∗ as follows:

• W is the set of all functions w : For C −→ {0, 1} with the following properties:

– w is compatible with ¬ and ∧.

– w(α) = 1 for each α ∈ T ∗ .

• v : For C × W −→ {0, 1} is defined by v(α, w) = 1 iff w(α) = 1.

• H = {[α] | α ∈ For C }.

• µ : H −→ [0, 1] is defined by µ([α]) = sup{s ∈ [0, 1] ∩ Q | T* ⊢ P(α) ≥ s}.


Lemma 4 M ∗ is a measurable model.

Proof: We need to prove that H is an algebra of sets and that µ is a finitely additive

probability measure. It is easy to see that H is an algebra of sets, since [α] ∩ [β] = [α ∧ β],
[α] ∪ [β] = [α ∨ β] and W \ [α] = [¬α]. Concerning µ, it is sufficient to prove that A3, A4
and A6 are satisfied in M*. Here we will only give a sketch of the proof for A6, which

provides finite additivity of µ.

Let µ([α]) = a, µ([β]) = b and µ([α ∧ β]) = c. We claim that

µ([α ∨ β]) = a + b − c.

This is an immediate consequence of the following facts:

• µ([γ]) = sup{s ∈ Q | T* ⊢ P(γ) ≥ s}, γ ∈ For_C.
• The real function F(x, y, z) = x + y − z is continuous.
• For each r, s ∈ Q, T* ⊢ r ≥ s iff r ≥ s.
• Q³ is dense in R³.

Namely, for each positive ε, there are positive δ_1, δ_2, δ_3 such that for all 〈r_1, r_2, r_3〉 ∈
((a − δ_1, a] × (b − δ_2, b] × (c − δ_3, c]) ∩ Q³,

r_1 + r_2 − r_3 ∈ (a + b − c − ε, a + b − c + ε).

In particular, for each s′, s″ ∈ Q such that

a + b − c − ε < s′ ≤ r_1 + r_2 − r_3 ≤ s″ < a + b − c + ε,

using the axioms about rational numbers, we have that

T* ⊢ s′ ≤ r_1 + r_2 − r_3 ≤ s″,

i.e., µ([α ∨ β]) = µ([α]) + µ([β]) − µ([α ∧ β]).

Theorem 5 (Strong completeness theorem) Every consistent set of formulas has a measurable
model.

Proof: Let T be a consistent set of formulas. We can extend it to a maximally consistent

set T ∗ , and define a canonical model M ∗ , as above. By induction on the complexity

of the formulas we can prove that M ∗ |= Φ iff Φ ∈ T ∗ .

To begin the induction, let Φ = α ∈ For C . If α ∈ T ∗ , i.e., T ∗ ⊢ α, then by definition

of M ∗ , M ∗ |= α. Conversely, if M ∗ |= α, by the completeness of classical propositional

logic, T ∗ ⊢ α, and α ∈ T ∗ .

Let us suppose that f ≥ 0 ∈ T*. Then, using the axioms for ordered commutative
rings, we can prove that

T* ⊢ f = s + ∑_i s_i · CP(α_i, β_i)   and   T* ⊢ s + ∑_i s_i · CP(α_i, β_i) ≥ 0,

for some s, s_i ∈ Q and some α_i, β_i ∈ For_C such that T* ⊢ P(β_i) > 0. Let a_i = µ([α_i])
and b_i = µ([β_i]). It remains to prove that

s + ∑_i s_i · a_i · b_i^{−1} ≥ 0.   (1)

Similarly as in the proof of Lemma 4, we can show that (1) is an immediate consequence

of the following facts:

• µ([γ]) = sup{s ∈ Q | T* ⊢ P(γ) ≥ s}, γ ∈ For_C.
• The real function F(x_1, ..., x_n, y_1, ..., y_n) = s + ∑_{i=1}^{n} s_i · x_i · y_i^{−1} is continuous.
• For each r, s ∈ Q, T* ⊢ r ≥ s iff r ≥ s.
• Q^k is dense in R^k.

For the other direction, let M* |= f ≥ 0. If f ≥ 0 ∉ T*, by construction of T*,
there is a positive integer n such that f < −n^{−1} ∈ T*. Reasoning as above, we have that
f^{M*} < 0, which is a contradiction. So, f ≥ 0 ∈ T*.

Let Φ = ¬φ ∈ For P . Then M ∗ |= ¬φ iff M ∗ ̸|= φ iff φ ∉ T ∗ iff (by Theorem 3)

¬φ ∈ T ∗ .

Finally, let Φ = φ ∧ ψ ∈ For P . M ∗ |= φ ∧ ψ iff M ∗ |= φ and M ∗ |= ψ iff φ, ψ ∈ T ∗

iff (by Theorem 3) φ ∧ ψ ∈ T ∗ .

5 Decidability

Theorem 6 Satisfiability of probabilistic formulas is decidable.

Proof: Up to equivalence, each probabilistic formula is a finite disjunction of finite conjunctions of literals, where a literal is either a basic probabilistic formula or the negation of a basic probabilistic formula. Thus, it is sufficient to show the decidability of the satisfiability problem for formulas of the form

∧_i f_i ≥ 0 ∧ ∧_j g_j < 0. (2)



Suppose that p_1, ..., p_n are all of the propositional formulas appearing in (2). Let A_1, ..., A_{2^n} be all of the formulas of the form

±p_1 ∧ · · · ∧ ±p_n,

where +p = p and −p = ¬p. Clearly, the A_i are pairwise disjoint and form a partition of ⊤.
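The enumeration of the 2^n atoms can be sketched mechanically; the small illustration below represents a literal as a string, with ~ marking negation (an ad hoc choice, not from the paper):

```python
# Sketch: enumerating the 2^n atoms (+/-)p1 ^ ... ^ (+/-)pn over the
# propositional letters of a formula; each atom is a tuple of signed
# literals and corresponds to exactly one valuation, so the atoms are
# pairwise disjoint and jointly exhaustive.
from itertools import product

def atoms(letters):
    """Return the 2^n atoms as tuples of signed literals."""
    return [tuple(('' if sign else '~') + p for sign, p in zip(signs, letters))
            for signs in product([True, False], repeat=len(letters))]

A = atoms(['p1', 'p2'])
print(len(A))   # 4
print(A[0])     # ('p1', 'p2')
```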

Furthermore, for each α appearing in (2) there is a unique set I_α ⊆ {1, ..., 2^n} such that

α ↔ ∨_{i∈I_α} A_i

is a tautology. Now we can equivalently rewrite (2) as

∧_i ∑_{i′} s_{ii′} · CP( ∨_{k∈I_{α_{ii′}}} A_k , ∨_{l∈I_{β_{ii′}}} A_l ) ≥ 0 ∧ ∧_j ∑_{j′} s_{jj′} · CP( ∨_{k∈I_{α_{jj′}}} A_k , ∨_{l∈I_{β_{jj′}}} A_l ) < 0.

Let σ_i(x_1, ..., x_{2^n}) and δ_j(x_1, ..., x_{2^n}) be the formulas

∑_{i′} s_{ii′} · ( ∑_{k∈I_{α_{ii′}}} x_k ) · ( ∑_{l∈I_{β_{ii′}}} x_l )⁻¹ ≥ 0

and

∑_{j′} s_{jj′} · ( ∑_{k∈I_{α_{jj′}}} x_k ) · ( ∑_{l∈I_{β_{jj′}}} x_l )⁻¹ < 0.

Then, it is easy to see that (2) is satisfiable iff the sentence

∃x_1 ... ∃x_{2^n} ( ∧_i σ_i(x̄) ∧ ∧_j δ_j(x̄) )

is satisfied in the ordered field of reals. Since the latter question is decidable, we have our claim.
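As a concrete illustration of this reduction (not part of the paper's proof), the sketch below fixes one invented instance of the form (2) over two propositional letters and checks a candidate witness measure on the four atoms. Exhibiting a witness suffices to certify satisfiability of that one instance; the general decision procedure would be quantifier elimination over the reals. All constraints and weights here are made up:

```python
# Illustrative satisfiability check for one system of the form (2).
# x encodes a measure (x1, x2, x3, x4) on the four atoms over p1, p2,
# in the order (p1^p2, p1^~p2, ~p1^p2, ~p1^~p2). The two constraints
# below are invented examples of sigma- and delta-formulas.

def satisfies(x, eps=1e-12):
    x1, x2, x3, x4 = x
    # x must be a probability measure on the atoms
    if any(v < -eps for v in x) or abs(sum(x) - 1) > eps:
        return False
    # sigma: CP(p1, p2) >= 1/2, with the denominator x1 + x3 cleared
    if not (x1 + x3 > 0 and 2 * x1 >= x1 + x3 - eps):
        return False
    # delta: CP(p1, ~p2) - 1 < 0, i.e. x2 < x2 + x4
    return x2 + x4 > 0 and x2 < x2 + x4

witness = (0.4, 0.2, 0.2, 0.2)
print(satisfies(witness))   # True
```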


It should be noted that this logic can be embedded into the logic described in (Fagin

et al., 1990), which has a PSPACE containment for the decision procedure. Also, the

rewriting of formulas from our logic into that logic can be accomplished in linear time:

CP(α, β) is equivalent to

w(α ∧ β) / w(β),

which is representable in (Fagin et al., 1990).

Thus, we conclude that our logic is also decidable in PSPACE.

6 Conclusion

In this paper we introduced a sound and strongly-complete axiomatic system for the probabilistic

logic with the conditional probability operator CP , which allows for linear combinations

and comparative statements. As was noticed in (van der Hoek, 1997), it is not

possible to give a finitary strongly complete axiomatization for such a system. In our case

the strong completeness was made possible by adding an infinitary rule of inference.

The obtained formalism is quite expressive and allows for the representation of uncertain

knowledge, where uncertainty is modeled by probability formulas. For instance,

a conditional statement of the form “the sum of probabilities of α given β and of γ given δ is at least 0.95” can be written as

CP(α, β) + CP(γ, δ) ≥ 0.95.

A similar approach can be applied to de Finetti style conditional probabilities. Future research will also consider the possibility of dealing with probabilistic first-order formulas.




Fagin, R., Halpern, J. Y. and Megiddo, N. (1990). A logic for reasoning about probabilities, Information

and Computation 87(1/2): 78–128.

Lukasiewicz, T. (2002). Probabilistic default reasoning with conditional constraints, Annals

of Mathematics and Artificial Intelligence 34: 35–88.

Ognjanović, Z., Marković, Z. and Rašković, M. (2005). Completeness theorem for a

logic with imprecise and conditional probabilities, Publications de l'Institut Mathématique, Nouvelle Série 78(92): 35–49.

Ognjanović, Z., Perović, A. and Rašković, M. (2008). Logics with the qualitative probability

operator, Logic Journal of IGPL 16(2): 105–120.

Ognjanović, Z. and Rašković, M. (1996). A logic with higher order probabilities, Publications de l'Institut Mathématique (N.S.) 60(74): 1–4.

Ognjanović, Z. and Rašković, M. (1999). Some probability logics with new types of

probability operators, Journal of Logic and Computation 9(2): 181–195.

Ognjanović, Z. and Rašković, M. (2000). Some first-order probability logics, Theoretical

Computer Science 247(1-2): 191–212.

Rašković, M., Ognjanović, Z. and Marković, Z. (2004). A logic with conditional probabilities,

in J. Leite and J. Alferes (eds), 9th European Conference Jelia’04 Logics in

Artificial Intelligence, Vol. 3229, Springer-Verlag, pp. 226–238.

van der Hoek, W. (1997). Some considerations on the logic PFD: a logic combining

modality and probability, Journal of Applied Non-Classical Logics 7(3): 287–307.




Scott Martin

The Ohio State University

Abstract. This paper sketches an account of the behavior of French pronominal clitics in

CVG, a proof-theoretic categorial grammar formalism. The approach shown here differs

from most categorial analyses of French clitics in that it treats clitics as noun phrases rather

than as functions that operate on under-saturated verb phrases. Basic French cliticization,

clitics in infinitival constructions, and both auxiliary and non-auxiliary clitic climbing are discussed.


1 Introduction

Cliticization in French is a set of phenomena in which pronominal complements to a

verbal host are systematically realized as affixes. Linguistic generalizations about these

phenomena have been structured using several different frameworks, with Sag & Miller’s

(1997) HPSG treatment of French clitics as morphological affixes being the most comprehensive

and successful. Categorial accounts of cliticization phenomena, among them

Kraak (1998) for French and Morrill & Gavarro (1992) for Catalan, have largely analyzed

clitics as functors over under-saturated verb phrases. Stabler (2001) and Amblard (2006)

are two recent approaches to French clitics in the Minimalist Grammar formalism, both

of which treat them as syntactic elements with certain feature sets.

In this paper, I give a preliminary account of some of the phenomena involving French

clitics using Convergent Grammar (CVG), a categorial grammar framework that uses natural

deduction with hypothetical proof. 1 This treatment is limited to a subset of what

Bonami & Boye (2005) call French Pronominal Clitics (FPCs), specifically, those FPCs

that appear as verbal complements. From Kraak (1998) I borrow the idea of a specialized

combinatory mode for FPC attachment to a verbal host (analogous to her • ca ) that is

“stronger” than normal Complement Merge and reflects the status of clitic attachment as

a process more morphological than syntactic. In contrast to Kraak’s and much other work

on FPCs in categorial frameworks, however, the account sketched here partly follows the

work of Stabler and Amblard in analyzing FPCs not as functors over verb phrases but

as sets of morphological features that also represent a syntactic and semantic argument,

much like ordinary NPs.

Drawing on Sag & Miller’s work on French clitics as inspiration, the analysis reflected

here relies mainly on properly-structured lexical axioms to describe the behavior of FPCs.

Basic instances of cliticization are considered as well as more complicated situations,

such as argument composition and the interaction of FPCs with infinitivals. However,

this paper does not take a firm stance on the question of whether cliticization phenomena

◦ For many helpful comments and suggestions on this and earlier drafts of this paper, I am grateful to

Yusuke Kubota, Carl Pollard, Chris Worth, and three anonymous ESSLLI reviewers.

1 Pollard (2007) provides an introduction to CVG.



should be considered syntactic or morphological, since CVG’s tectogrammatical terms

represent syntactic dependency relations and do not necessarily correspond exactly to

surface word order or prosodic form.

2 Pronominal Complement Clitics in French

French verbs take canonical complements in a manner that resembles complement selection

for their English analogs: the verbal head combines with its complement(s) to the

right and with its subject to the left to form a finite or infinitive clause. When certain

complements are pronominalized, however, they can optionally appear to the immediate

left of the verb in a variant form as proclitics. The following data, replicated in part from

(1) in Sag & Miller (1997), show the verb voir ‘to see’ with its complement realized both

canonically and as a proclitic: 2

(1) a. Marie voit Jean. ‘Marie sees Jean.’

b. Marie voit lui. ‘Marie sees him.’ [boldface = prosodic stress]

c. Marie le voit.

Marie ACC.3S sees

‘Marie sees him.’

The cliticized configuration is given in (1c), with the complement in its clitic form (le)

instead of the canonical one (here Jean, or lui with appropriate stress).

Among the other distinctive characteristics of complement FPCs noted by Kraak (1998),

the ones that bear most on the account given here are that:

• as verbal complements, they do not co-occur with their non-pronominal or non-cliticized versions (exemplified in (1)).

• they do not serve as the complement to bare past participles. This fact gives rise to

an instance of the phenomenon known as “clitic climbing”:

(2) a. *Marie a le vu. ‘Marie saw him.’

b. Marie l’a vu.

Marie ACC.3S has seen

‘Marie saw him.’

Here, (2a) is unacceptable because although the clitic le is the accusative complement

of vu, it must be realized on the tense auxiliary form a as in (2b). However,

causatives and certain verbs of perception exhibit different behavior. For these

verbs, it is possible for some of their arguments to be realized as clitics on the

upstairs verb and some on the downstairs one:

(3) Jean le fera la réparer.

Jean ACC.3S make.FUT ACC.3FS repair

‘Jean will make him repair it.’

(From Abeillé, Godard and Miller (1995, example (2a)).)

2 I adopt Bonami & Boye’s (2005) scheme here for annotating morphological features.



• No syntactic material except another clitic can intervene between an FPC and its

host verb. This fact distinguishes cliticized complements from their canonical counterparts

in which certain adverbials can occur between a verb and its complements:

(4) a. Marie l’a souvent dit à lui.

Marie ACC.3S has often said to him

‘Marie has often said it to him.’

b. Marie l’a dit souvent à lui.

Marie ACC.3S has said often to him

‘Marie has often said it to him.’

c. Marie le lui a souvent dit.

Marie ACC.3S DAT.3S has often said

‘Marie has often said it to him.’

d. *Marie le lui souvent a dit.

Marie ACC.3S DAT.3S often has said

‘Marie has often said it to him.’

e. *Marie le souvent lui a dit.

Marie ACC.3S often DAT.3S has said

‘Marie has often said it to him.’

(Example (4d) is from Kraak (1998, (7d)).) Here, (4d) and (4e) show the disallowed

intervention of the adverbial souvent ‘often’ between an FPC and its host verb,

while (4b) demonstrates the allowable intervention of souvent in the canonical form.

• they are normally realized on the verb they complement, illustrated here with an

embedded infinitival:

(5) a. *Marie le veut voir. ‘Marie wants to see him.’

b. Marie veut le voir.

Marie wants ACC.3S to see

‘Marie wants to see him.’

The cliticized accusative le here is the complement of the infinitive voir, and does not attach to the upstairs verb veut.

These are the most basic facts about cliticization of declarative verbal complements in

French. FPCs also occur in passive constructions and in constructions like those in (6):

(6) a. i. Pierre reste fidèle à Jean.

‘Pierre remains faithful to Jean.’

ii. Pierre lui reste fidèle.

Pierre DAT.3S remains faithful

‘Pierre remains faithful to him.’

b. i. Marie connaît la fin de l’histoire.

‘Marie knows the end of the story.’



ii. Marie en connaît la fin.

Marie GEN.3S knows the end

‘Marie knows the end of it.’

(Both are from Sag & Miller (1997, example 3).) Constructions involving FPCs like those

in (6) are similar to the clitic climbing that occurs with auxiliaries like avoir (as shown in (2)).


In §3, I sketch an analysis of the basic facts about cliticization in some of the situations

described above.

3 Accounting for the Data

Sag & Miller (1997) give extensive argumentation for considering clitics as morphological

rather than syntactic in nature. Their account constrains the inflectional paradigm

of French verbs, treating clitics as pronominal affixes that reduce the valence requirements

of a given verb. In examining French clitics from a deductive perspective, Kraak

(1998) instead describes cliticization as occurring on a “sliding scale” between morphology

(affix-host attachment) and syntax (complement selection). The view presented here

is more in line with Kraak's in that it uses CVG tectogrammatical proof terms to describe

the combinatoric potential of functions and arguments.

However, this account diverges from Kraak’s and most other categorial grammar treatments

in that it construes FPCs as regular pronominal NPs, instead of formulating them

as functors over under-saturated verb phrases. This approach allows the semantics to be

nearly identical between canonical and cliticized forms by specifying a separate mode of

complement selection specifically for clitics.

3.1 FPCs as a Local Dependency

Because cliticization differs from the canonical form of complement selection (⊸ C ) in

various ways, a separate implication mode, called ⊸ PC (for proclitic), is used. As a local

implication mode, it has modus ponens (elimination) but not hypothetical proof (introduction),

which is used in CVG for non-local extractions. The elimination (or “merge”)

rule for ⊸ PC is as follows: 3

Proclitic Merge

If Γ ⊢ a, x : A, C ⊣ ∆

and Γ ′ ⊢ f, v : A ⊸ PC B, C ⊃ D ⊣ ∆ ′

then Γ, Γ ′ ⊢ ( PC a f), v(x) : B, D ⊣ ∆, ∆ ′

This rule formalizes the affixation of clitics to a verbal host, taking into account both

the syntactic and semantic proof terms. This new ⊸ PC implication mode allows lexical

axioms to specify the cliticized complement mode of combination as opposed to

the canonical one, and is central to the account of clitic behavior sketched here. As a

mnemonic meant to reflect French word order in derivational history, function application

for ⊸ PC writes an FPC to the left of its host. This rule also states that hypotheses present

3 A CVG sign is a triple made up of the prosodic/phonological form, syntactic tectogrammatical term,

and semantic content. For brevity, I omit the prosodic element and only include the syntactic tecto-term and

semantic denotation.



in both the syntactic context (to the left of ⊢) and the semantic co-context (to the right

of ⊣) of both premises are propagated into the conclusion. This ensures that the application

of this rule does not have any effect on any non-local extractions (filler-gap path

information), stored quantifiers, or anaphoric pronouns.

With this new implication mode and merge rule, an account of FPC behavior as demonstrated

in §2 is possible that requires no other machinery than the CVG merge rules described

in Pollard (2007). All that remains is to correctly specify the necessary lexical

axioms. First are the canonical forms of the verbs and complements: 4

⊢ Marie, marie ′ : Nom, Ind

⊢ Jean, jean ′ : Acc, Ind

⊢ lui 1 , a : Acc, Ind

⊢ voit 1 , λ y λ x see ′ (x, y) : (Acc \ Pcl) ⊸ C (Nom ⊸ SU Fin), Ind ⊃ (Ind ⊃ Prop)

The new type Pcl is assigned to proclitics in order to differentiate them from their canonical

counterparts. Here, voit selects a complement of type Acc \ Pcl to indicate that it

does not combine with proclitics in canonical complement position: the set complement

specifies all inhabitants of type Acc except those that inhabit Pcl. Next, the lexicon is

extended to reflect the syntactic/morphological features of le and the cliticization mode

of complement selection for voir:

⊢ le, b : Acc ∩ 3Sg ∩ Pcl, Ind

⊢ voit 2 , λ y λ x see ′ (x, y) : (Acc ∩ Pcl) ⊸ PC (Nom ⊸ SU Fin),

Ind ⊃ (Ind ⊃ Prop)

These axioms allow the following proof terms for the data in (1): 5

(7) a. ⊢ ( SU Marie (voit 1 Jean C )), see ′ (marie ′ , jean ′ ) : Fin, Prop

b. ⊢ ( SU Marie (voit 1 lui 1 C )), see ′ (marie ′ , a) : Fin, Prop

c. ⊢ ( SU Marie ( PC le voit 2 )), see ′ (marie ′ , b) : Fin, Prop

Aside from the different implication mode, the only difference between the canonical

form of voit (voit 1 ) and the cliticized variant (voit 2 ) is that the argument to voit 2 must

be of the intersective type Acc ∩ Pcl. The type 3Sg represents the argument’s agreement

features. So stated, this selectional restriction ensures that voit 2 can only combine in

cliticized mode with accusative complements that are also proclitics, as desired. It is

important to note that not only are the semantics of both variants of voit identical, but

both cliticized and canonical complements are of the same semantic type (Ind) as well.
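The interaction of the ⊸ PC merge rule with these lexical axioms can be mimicked by a small toy type-checker. The (term, type) tuple encoding below is ad hoc and purely illustrative (it is not part of CVG), but it shows how the mode annotation licenses the derivation of (1c) while leaving the semantics untouched:

```python
# Toy model of CVG merge as type-checked application. A type is either
# an atomic string or a triple (mode, domain, codomain) standing for
# domain -o_mode codomain; a sign is a (term, type) pair. This encoding
# is an illustrative sketch only, not part of the CVG formalism.

def merge(mode, arg, fun):
    """Elimination for -o_mode: from a : A and f : A -o_mode B, derive (mode a f) : B."""
    a_term, a_type = arg
    f_term, f_type = fun
    f_mode, dom, cod = f_type
    if f_mode != mode or dom != a_type:
        raise TypeError("merge does not apply")
    return ((mode, a_term, f_term), cod)

# Entries mirroring the axioms above ('Acc&Pcl' abbreviates Acc ∩ Pcl):
le = ('le', 'Acc&Pcl')
voit2 = ('voit2', ('PC', 'Acc&Pcl', ('SU', 'Nom', 'Fin')))
marie = ('Marie', 'Nom')

vp = merge('PC', le, voit2)      # (PC le voit2) : Nom -oSU Fin
clause = merge('SU', marie, vp)  # (SU Marie (PC le voit2)) : Fin
print(clause[1])                 # Fin
```

A canonical entry such as voit1, whose type uses the C mode, would make merge('PC', le, voit1) raise, mirroring the fact that a proclitic cannot occupy canonical complement position.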

4 The basic tectogrammatical types used here are Nom for nominative NPs, Acc for accusative NPs, and

Fin for finite clauses. The hyperintensional types Ind, the type of individual concepts; and Prop, the type

of propositions, are the basic semantic types. In addition to the new combinatory mode ⊸ PC , implicative

tectogrammatical types are constructed using ⊸ SU and ⊸ C , which invoke Subject Merge and Complement

Merge, respectively.

5 For clarity, the proof terms given in this account show the semantics but not the co-context as quantification,

wh-phrases, and anaphoric binding are not discussed here.



3.2 “Clitic Climbing” and Tense Auxiliaries

The axioms for tense auxiliaries are structured so that they take the complements of their

verbal complement. Past-participial verbs in turn need to be specified in such a way that

the proclitic merge rule does not apply to them. This approach is reminiscent of the

argument composition approach employed by Sag & Miller (1997) and Abeillé, Godard & Sag (1998). The axioms necessary to describe the “climbing” behavior in (2) are the following:


⊢ a A , λ v v

: ((A \ Pcl) ⊸ C (Nom ⊸ SU Psp)) ⊸ C ((A ∩ Pcl) ⊸ PC (Nom ⊸ SU Fin)),

(Ind ⊃ (Ind ⊃ Prop)) ⊃ (Ind ⊃ (Ind ⊃ Prop))

⊢ vu, λ y λ x see ′ (x, y) : (Acc \ Pcl) ⊸ C (Nom ⊸ SU Psp), Ind ⊃ (Ind ⊃ Prop)

The tense auxiliary form a (from avoir) is schematically defined to combine with a verb

in past participial form missing its complement, of polymorphic type A, to yield a finite

sentence missing both that same A complement and a nominative subject. In this way, the

A-type complement is “passed along” from the past participle to the tense auxiliary, whose

semantics are just to apply the identity function to the meaning of its past-participial complement.


A proof term that correctly predicts the allowed form of (2b) is then possible: 6

(8) ⊢ ( SU Marie ( PC le (a Acc vu C ))), see ′ (marie ′ , b) : Fin, Prop
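The argument-composition behavior of the auxiliary can be sketched with an ad hoc (term, type) tuple encoding of signs (illustrative only, not CVG itself): the auxiliary consumes a past participle missing its complement and yields a ⊸ PC function, so the clitic can only be merged after the auxiliary has combined.

```python
# Toy check of auxiliary argument composition. Types are nested tuples
# (mode, domain, codomain); signs are (term, type) pairs; all names here
# are ad hoc illustrations of the lexical axioms, not CVG itself.

def merge(mode, arg, fun):
    """Elimination for -o_mode: from a : A and f : A -o_mode B, derive (mode a f) : B."""
    a_term, a_type = arg
    f_term, f_type = fun
    f_mode, dom, cod = f_type
    if f_mode != mode or dom != a_type:
        raise TypeError("merge does not apply")
    return ((mode, a_term, f_term), cod)

vu = ('vu', ('C', 'Acc\\Pcl', ('SU', 'Nom', 'Psp')))
# the auxiliary's higher-order type takes the participle and passes its
# complement along, now in the PC mode
a_aux = ('a', ('C', vu[1], ('PC', 'Acc&Pcl', ('SU', 'Nom', 'Fin'))))

vp = merge('C', vu, a_aux)  # (C vu a) : (Acc&Pcl) -oPC (Nom -oSU Fin)
clause = merge('SU', ('Marie', 'Nom'), merge('PC', ('le', 'Acc&Pcl'), vp))
print(clause[1])            # Fin

# (2a) has no derivation: vu itself offers only the C mode, so merging
# the proclitic directly with the bare participle fails.
try:
    merge('PC', ('le', 'Acc&Pcl'), vu)
    print("unexpectedly merged")
except TypeError:
    print("proclitic cannot attach to bare participle")
```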

No proof is available for the disallowed form in (2a) because the lexical axiom vu only uses the ⊸ C mode of implication, and as a result proclitics cannot directly combine with it.


3.3 FPCs in Infinitival Constructions

Ensuring that cliticized complements of infinitival complements stay on the infinitive-form verb, as depicted in (5), can also be accomplished with well-formulated lexical

axioms. This ends up being simply a matter of making sure that infinitive-form verbs can

take proclitic complements and the verbs that select infinitivals cannot:

⊢ voir 1 , λ y λ x see ′ (x, y)

: (Acc ∩ Pcl) ⊸ PC (Nom ⊸ SU Inf), Ind ⊃ (Ind ⊃ Prop)

⊢ veut, λ P λ x want ′ (x, P(x))

: (Nom ⊸ SU Inf) ⊸ C (Nom ⊸ SU Fin), (Ind ⊃ Prop) ⊃ (Ind ⊃ Prop)

The semantic representation of veut given here is the “equi” version of the denotation

λ P∈Prop λ x∈Ind want ′ (x, P) that might be used where veut takes a sentential complement,

as in Marie veut qu’elle gagne ‘Marie wants that she wins’.

With the lexicon so extended, a proof term for (5b) can be derived:

(9) ⊢ ( SU Marie (veut ( PC le voir) C )), want ′ (marie ′ , see ′ (marie ′ , b)) : Fin, Prop

A derivation for (5a) is not possible because veut does not employ the ⊸ PC mode of

combination required for FPCs.

6 Note that the tectogrammatical proof term in (8) does not describe the phonological elision between le

and a that occurs in French.



3.4 FPCs and Non-auxiliary Composition

Extending CVG to account for FPCs that combine with argument composition verbs other

than auxiliaries, whose behavior is exemplified in (6), requires defining special lexical

axioms for those verbs. Similar to the data examined so far, “non-local pronominal affixation”

(in the terminology of Sag & Miller (1997)) is very short distance in nature, and as

such employs the local implication ⊸ PC that was introduced to handle procliticization.

It is not necessary to invoke CVG’s hypothetical proof machinery for handling extraction

phenomena to explain the data in (6).

Here, a strategy is adopted of composing a predicative adjectival (for example, fidèle)

or transitive verb (like connaît) with a version of its complement that is itself expecting

a complement. The necessary extensions to the lexicon for the data in (6a) are the

following: 7

⊢ Pierre, pierre ′ : Nom, Ind

⊢ lui 2 , d : Dat ∩ 3Sg ∩ Pcl, Ind

⊢ fidèle, λ y λ x faithful ′ (x, y) : (Dat \ Pcl) ⊸ C (Nom ⊸ SU Adj),

Ind ⊃ (Ind ⊃ Prop)

⊢ reste, λ P λ y λ x remain ′ (P(x, y))

: ((Dat \ Pcl) ⊸ C (Nom ⊸ SU Adj)) ⊸ C ((Dat ∩ Pcl) ⊸ PC (Nom ⊸ SU Fin)),

(Ind ⊃ (Ind ⊃ Prop)) ⊃ (Ind ⊃ (Ind ⊃ Prop))

These axioms describe fidèle as an adjective missing a dative complement to form an

adjectival small clause and the form of rester that takes an adjectival complement that is

itself missing its complement. These extensions permit a proof term for (6a-ii):

(10) ⊢ ( SU Pierre ( PC lui 2 (reste fidèle C ))), remain ′ (faithful ′ (pierre ′ , d)) : Fin, Prop

(A full derivation of (10) is given in Figure 1 in the appendix.) With a few further extensions

to the lexicon, (6b) can also be accounted for:

⊢ connaît, λ f λ y λ x know ′ (x, f(y))

: ((De \ Pcl) ⊸ C Acc) ⊸ C ((De ∩ Pcl) ⊸ PC (Nom ⊸ SU Fin)),

(Ind ⊃ Ind) ⊃ (Ind ⊃ Prop)

⊢ fin, end ′ : N, Ind

⊢ la, λ f λ x f(x) : N ⊸ SP ((De \ Pcl) ⊸ C Acc), Ind ⊃ (Ind ⊃ Ind)

⊢ en, e : De ∩ Pcl, Ind

Here, connaît is formulated as just an ordinary transitive verb except that it selects an

accusative complement that is itself missing its De complement. The definite article la

is treated as a function from common nouns (type N) to possessive NPs (functions from

canonical de-phrases to accusatives), using the specifier combinatory mode ⊸ SP . The

clitic en is represented as an axiom whose type is the intersection of De and Pcl. These

axioms allow a proof term like the one in (10) for (6b-ii):

7 This account assumes the analysis of predicatives given by Pollard (2006) pp. 52–65, for example, for

adjectival small clauses of the type Nom ⊸ SU Adj.



(11) ⊢ ( SU Marie ( PC en (connaît (la fin SP ) C ))), know ′ (marie ′ , end ′ (e)) : Fin, Prop

The lexical axioms introduced here predict that FPCs in non-auxiliary composition

contexts behave in a way largely parallel with that of FPCs that combine with auxiliary

verbs. The main difference between FPCs with auxiliaries and with non-auxiliaries is that

the complement types for non-auxiliaries must be more constrained than the free-ranging

polymorphic complement allowed by auxiliaries. Since this approach does not appeal

to CVG’s unbounded dependency machinery, instead relying on axioms that specify the

⊸ PC local dependency, these instances of cliticization are guaranteed to remain short-distance.

If FPCs in non-auxiliary composition contexts were construed as non-local

extractions, it would be difficult to rule out constructions like (12), for example, which do

not occur in French: 8

(12) *Marie lui i reste certaine que Céline a donné le livre i.

4 Conclusions and Future Work

This paper sketches a proof-theoretic account of the behavior of FPCs as complements.

For local cliticization, a new valence implication mode ⊸ PC is introduced to differentiate

procliticization from the canonical form of verbal complement selection. Combined with

properly-formulated lexical axioms, this new mode can account for some of the behavior

of FPCs, including the basic instances of cliticization, FPCs in infinitival constructions,

and two forms of “clitic climbing” via an argument composition analysis.

The analysis given here departs from traditional categorial analyses of cliticization

by construing FPCs as special instances of NPs. An advantage of this approach is that

a cliticized complement has identical semantics and a nearly identical tectogrammatical

form as its canonical counterpart. This fact, in combination with the new ⊸ PC mode

of implication for FPC affixation, allows lexical axioms to more strictly constrain the

behavior of FPCs in comparison to other types of verbal complements. This ability may

be central to correctly predicting, for example, the distribution of souvent as shown in (4).

This approach suffers, however, from the proliferation of lexical axioms that must occur

since all verbs that take complements need at least two distinct representations in the

lexicon. Such a requirement would have especially adverse implications for computational

applications like parsing. Since very often, as with voit 1 and voit 2 , the canonical

form of a verb closely resembles its cliticized variant, it is clear that a lexical rule associating

these forms is crucial to the success of this type of approach. The instances of

auxiliary and non-auxiliary composition presented here are also largely similar between

cliticized and non-cliticized versions. A general account of FPCs in French along the

lines of the analyses presented here must include a mapping between these similar forms

that captures their common linguistic and information-structural characteristics.

Future work on FPCs will aim to develop a correspondence between canonical and

cliticized verb forms that predicts FPC behavior in a general way. This work will need to

account for multiple clitic constructions, the rigid (and sometimes idiosyncratic) ordering

of FPC clusters, agreement between FPCs and past participles, FPCs in passive, causative,

and perceptual-verb constructions, and the enclitic attachment to imperative-form verbs

in French.

8 This example is due to Carl Pollard (personal communication of March 18, 2008).




Abeillé, A., Godard, D. and Miller, P. (1995). Causatifs et Verbes de Perception en Français, Actes du Deuxième Colloque Langues et Grammaire, Paris VIII, Saint-Denis.


Abeillé, A., Godard, D. and Sag, I. A. (1998). Two Kinds of Composition in French

Complex Predicates, Syntax and Semantics: Complex Predicates in Nonderivational

Syntax 30: 1–41.

Amblard, M. (2006). Treating clitics with minimalist grammars, in S. Wintner (ed.),

Proceedings of the Eleventh Conference on Formal Grammar, CSLI Publications,

pp. 9–20.

Bonami, O. and Boyé, G. (2005). French pronominal clitics and the design of Paradigm

Function Morphology, in G. Booij, L. Ducceschi, B. Fradin, E. Guevara, A. Ralli

and S. Scalise (eds), Proceedings of the Fifth Mediterranean Morphology Meeting,

pp. 291–322.

Kraak, E. (1998). A Deductive Account of French Object Clitics, Syntax and Semantics:

Complex Predicates in Nonderivational Syntax 30: 271–312.

Morrill, G. and Gavarro, A. (1992). Catalan Clitics, in A. Lecomte (ed.), Word Order in

Categorial Grammar, Editions Adosa, Clermont-Ferrand, pp. 211–232.

Pollard, C. (2006). Higher Order Grammar: A Tutorial. Unpublished ms., available at


Pollard, C. (2007). Nonlocal dependencies via variable contexts, in R. Muskens (ed.),

Proceedings of the Workshop on New Directions in Type-Theoretic Grammar, ESSLLI 2007, Dublin.

Sag, I. A. and Miller, P. H. (1997). French Clitic Movement without Clitics or Movement,

Natural Language and Linguistic Theory 15(3): 573–639.

Stabler, E. P. (2001). Recognizing Head Movement, LACL ’01: Proceedings of the 4th International

Conference on Logical Aspects of Computational Linguistics, Springer-Verlag, London, UK, pp. 245–260.

Appendix A: Full Derivation



⊢ Pierre : Nom

⊢ pierre ′ : Ind

⊢ reste : ((Dat \ Pcl) ⊸ C (Nom ⊸ SU Adj)) ⊸ C ((Dat ∩ Pcl) ⊸ PC (Nom ⊸ SU Fin)) ⊢ fidèle : (Dat \ Pcl) ⊸ C (Nom ⊸ SU Adj)

⊢ lui 2 : Dat ∩ 3Sg ∩ Pcl

⊢ (reste fidèle C ) : (Dat ∩ Pcl) ⊸ PC (Nom ⊸ SU Fin)

⊢ ( PC lui 2 (reste fidèle C )) : Nom ⊸ SU Fin

⊢ ( SU Pierre ( PC lui 2 (reste fidèle C ))) : Fin

⊢ λ P λ y λ x remain ′ (P(x, y)) : (Ind ⊃ (Ind ⊃ Prop)) ⊃ (Ind ⊃ (Ind ⊃ Prop)) ⊢ λ y λ x faithful ′ (x, y) : Ind ⊃ (Ind ⊃ Prop)

⊢ d : Ind

⊢ λ y λ x remain ′ (faithful ′ (x, y)) : Ind ⊃ (Ind ⊃ Prop)

⊢ λ x remain ′ (faithful ′ (x, d)) : Ind ⊃ Prop

⊢ remain ′ (faithful ′ (pierre ′ , d)) : Prop

Figure 1: Full derivation of (10), with tecto-terms (above) and semantic terms (below) given separately for space considerations.




Takako Nemoto

Tohoku University

Abstract. In this paper, we consider determinacy in Brouwerian intuitionistic mathematics.

We give some examples of games such that the character of this mathematical setting—the

lack of the law of excluded middle and the adoption of the continuity principle—makes the behavior of determinacy drastically different from that in the classical setting.

1 Introduction

Games on N N have been of great interest in mathematical logic for a long time. On one

hand, determinacy of games has been used as a strong tool to investigate Baire space N N

or Cantor space {0, 1} N . On the other hand, as has been known, determinacy statements

are quite sensitive to the mathematical setting: For example, with the axiom of choice,

full determinacy is inconsistent; determinacy of analytic games is beyond ZFC.

The ultimate purpose of the author is to know how Baire space and Cantor space vary depending on settings other than the usual ones. As a first step toward this, she has been investigating the promising tool of determinacy in these settings. Among these settings are subsystems of second order arithmetic, much weaker than ZFC (cf. (Nemoto, Ould Med-Salem and Tanaka, 2007), (Nemoto, 2008)).

This paper treats another setting, Brouwerian intuitionistic mathematics. It denies the

law of excluded middle (LEM) and adopts the continuity principle, asserting that all the

functions from N N to N N or to N are continuous (for details, see Section 2). We give

some examples of games, which show that the continuity principle and the lack of LEM

make the behavior of determinacy drastically different from that in the classical setting.

To explicate the role of classical principles in determinacy, we treat predeterminacy—

a formalization of determinacy in the intuitionistic mathematics—also in the classical setting.


2 Axioms of the intuitionistic mathematics

In this section, we clarify the mathematical setting of this paper.

The logical constants have their constructive meanings and the rules of the intuitionistic

logic are employed. In particular, a disjunctive statement A∨B means there exists a proof

of A or one of B, and an existential statement ∃x ∈ V [A(x)] means there exist an element

a of V and a proof of A(a). A statement A is decidable if A ∨ ¬A holds. A set X ⊆ V

is decidable if the statement a ∈ X is decidable for each a ∈ V .

Proceedings of the 13th ESSLLI Student Session

An infinite sequence α of natural numbers α(0), α(1), α(2), ... may be determined by some finitely described algorithm, i.e., the n-th element α(n) of α is the result of the algorithm on input n. Sometimes, however, such an infinite sequence may be constructed step by step, by choosing its elements one by one. In this case, the construction of the sequence is never finished: at any point in time, only finitely many elements have been chosen, and so we can only know a finite part of the sequence.

The latter construction is not permitted in constructive mathematics, and this point divides intuitionistic mathematics from constructive mathematics.

Note that every infinite sequence, even one given by an algorithm, can be regarded as the result of a step-by-step construction. This is the reason we do not distinguish infinite sequences of natural numbers by their manner of construction.

Let N be the set of natural numbers. X^N is the set of infinite sequences from X. In particular, N^N is called Baire space and 2^N is called Cantor space. X^n is the set of sequences from X of length n and X


The strict fan theorem

For a fan S and a decidable bar B in S, there is a bounded sub-bar B ′ ⊆ B in S.

While König's lemma and the strict fan theorem are equivalent in classical mathematics, they are not in intuitionistic mathematics. In fact we can construct a so-called intuitionistic counterexample, i.e., a fan T which has sequences of every finite length but such that we cannot prove that T has an infinite path, i.e., an α : N → N such that αn ∈ T for all n. Let i^n ∈ {0, 1}^n be such that i^n(k) = i for all k < n, and let i^N ∈ {0, 1}^N be such that i^N(n) = i for all n. Define T ⊆ {i^n : i < 2, n ∈ N} by

0^n ∈ T ↔ there is no k < n such that p_{k+i} = 9 for all i < 99, or the least such k is even,
1^n ∈ T ↔ there is no k < n such that p_{k+i} = 9 for all i < 99, or the least such k is odd,

where p_k denotes the k-th digit of the decimal expansion of π. We can easily see that T is a fan which has sequences of every finite length, and that if T has an infinite path α, then α = 0^N or α = 1^N. Assume that T has an infinite path α. If α(0) = 0 (or 1), then we must have a proof of the statement "if there is an uninterrupted run of 99 nines in the decimal expansion of π, the least such run starts at an even (resp. odd) digit." Up to now we have no proof of either statement, and so we cannot prove that T has an infinite path. (If a proof is found in the future, we can construct another so-called counterexample from another unsolved problem in a similar way.)
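Membership of each finite sequence in T is decidable, even though we cannot decide whether T has an infinite path. The following Python sketch makes the membership test concrete; the list `digits` stands in for a hypothetical oracle supplying a prefix of the decimal expansion of π, and the run length (99 in the text) is kept as a parameter so that short runs can be tried.

```python
def least_nine_run_start(digits, run_len=99):
    """Return the least k such that digits[k], ..., digits[k+run_len-1] are all 9,
    or None if no such run occurs in the given finite prefix."""
    count = 0
    for idx, d in enumerate(digits):
        count = count + 1 if d == 9 else 0
        if count == run_len:
            return idx - run_len + 1
    return None

def in_T(i, n, digits, run_len=99):
    """Decide whether the constant sequence i^n belongs to the fan T of the text:
    0^n is in T iff no run of run_len nines starts below n, or the least such
    start is even; 1^n likewise with 'odd'."""
    assert i in (0, 1)
    # a run starting at k < n occupies digits k, ..., k + run_len - 1
    k = least_nine_run_start(digits[:n + run_len - 1], run_len)
    if k is None or k >= n:
        return True  # no witnessing run below n: both branches survive
    return (k % 2 == 0) if i == 0 else (k % 2 == 1)
```

With the digit prefix fixed, `in_T` is a total, decidable predicate on (i, n), which is exactly what makes T a legitimate decidable tree while the existence of an infinite path remains open.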

3 Determinacy in intuitionistic mathematics

In this section, we introduce the notion of determinacy and its variants.

For A ⊆ N^N, the game G(A) in N^N is defined as follows. Two players, called players I and II, starting with player I, alternately choose a natural number to construct α ∈ N^N. Player I wins if and only if the resulting play α is in A. Player II wins if and only if player I does not win. A strategy for player I (resp. II) is a function which assigns a natural number to each even- (resp. odd-)length sequence in N
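The alternating construction of a play can be sketched as follows; strategies are modeled as functions from finite positions to moves (the names `strategy_I`, `strategy_II` and `play` are illustrative, not from the paper).

```python
def play(strategy_I, strategy_II, rounds):
    """Simulate finitely many rounds of the game G(A) described above:
    starting with player I, the players alternately append a natural number."""
    alpha = []
    for k in range(rounds):
        mover = strategy_I if k % 2 == 0 else strategy_II
        # a strategy maps the finite position built so far to the next move
        alpha.append(mover(tuple(alpha)))
    return alpha
```

The infinite play α is the limit of such finite simulations; player I's strategy is consulted at even positions, player II's at odd ones.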


(Veldman, 2004) gave three formalizations of determinacy in intuitionistic mathematics.

G(A) is strongly determinate if, in G(A), either player I or player II has a winning strategy. This is the simplest formalization, but almost no game is strongly determinate.

G(A) is determinate from the viewpoint of player I if: whenever for every strategy τ of player II there is α ∈ II_τ with α ∈ A, player I has a winning strategy in G(A). This statement corresponds to the classical statement "if player II has no winning strategy, then player I has one in G(A)," which is classically equivalent to "G(A) is determinate."

To describe the last, we need a new notion. An anti-strategy for player I in G(A) is a function η which assigns some α ∈ II_τ to each strategy τ for player II in G(A). An anti-strategy η for player I secures A if, for every strategy τ for player II, η(τ) ∈ A. G(A) is predeterminate from the viewpoint of player I if: whenever he has an anti-strategy securing A, he has a winning strategy in G(A).

Note that if G(A) is determinate from the viewpoint of player I, then G(A) is predeterminate from his viewpoint.

Moreover, in a game G(X) in N^N (or in a spread [S]), the second axiom of continuous choice yields the converse, i.e., predeterminacy implies determinacy. Indeed, a strategy for a player can be regarded as a function from N to N, and if for every strategy τ for player II there is α ∈ II_τ with α ∈ X, then by the second axiom of continuous choice an anti-strategy for player I securing X is given by a code η of a continuous function.

The intuitionistic determinacy theorem (Veldman, 2004, Theorem 3.5). If [S] is a II-finitary branching spread, i.e., S is a spread-law such that, for every odd-length s ∈ S, there are at most finitely many n with s ∗ 〈n〉 ∈ S, then G_[S](A) is predeterminate from the viewpoint of player I for every A ⊆ [S].

In particular, if A ⊆ {0, 1}^N, then G_{{0,1}^N}(A) is predeterminate from the viewpoint of player I. (Veldman, 2004) also gave an A ⊆ N^N such that G(A) is not predeterminate from the viewpoint of player I.

Remark. The notion of predeterminacy can also be formalized from the viewpoint of player II, and we can obtain results similar to the last theorem.

4 Variations of games and predeterminacy

In this section, we consider other variations of games in intuitionistic mathematics. For these games, we can define the three formalizations of determinacy in the same way.

4.1 2-length games in {0, 1}^N × {0, 1}

This subsection treats one of the simplest cases in which fewer strategies are allowed than in the classical context. {0, 1}^N × {0, 1} denotes the product of Cantor space and the discrete space {0, 1}.

For a given A ⊆ {0, 1}^N × {0, 1}, the game G_1(A) is defined as follows:

• Player I chooses α ∈ {0, 1}^N.

• Player II chooses i ∈ {0, 1}.

• Player I wins if (α, i) ∈ A, and player II wins if player I does not win.



Although {0, 1}^N × {0, 1} is homeomorphic to Cantor space, we must be sensitive to the order type of the index set of the sequences.

In this game, a strategy for player I is his initial move α, and a strategy for player II is a function from {0, 1}^N to {0, 1}. The continuity principle forces all strategies for player II to be continuous, and so we may regard a strategy τ for player II as a code of a continuous function such that (τ|α)(0) ∈ {0, 1} for all α ∈ {0, 1}^N. The set B of finite sequences s ∈ {0, 1}^{<N} which already decide the value (τ|α)(0) is a decidable bar in the fan {0, 1}^N. Then, by the strict fan theorem, there is a bounded sub-bar B′ ⊆ B. Take n such that lh(s) < n for every s ∈ B′. Then {0, 1}^n is also a bar in {0, 1}^N, and, for every α, β ∈ {0, 1}^N, αn = βn implies (τ|α)(0) = (τ|β)(0). Thus we can regard τ as a function from {0, 1}^{n_τ} to {0, 1}, which can be coded by a natural number. Because an anti-strategy η for player I is a function from the set of all strategies for player II to the set of plays in this game, it can be regarded as a function from N, with the discrete topology, to {0, 1}^N × {0, 1}.

The following examples show that even simple sets, such as open or closed sets, need not be predeterminate from the viewpoint of player I.

Example 1 An open game G_1(A) which is not predeterminate from the viewpoint of player I: Define A_i = {0^n ∗ 〈1, i〉 : n ∈ N} and A = {(α, i) : ∃n[αn ∈ A_i]}. Then A is open. Let η be the anti-strategy for player I which assigns (0^{n_τ} ∗ 〈1, τ(0^{n_τ})〉 ∗ 0^N, τ(0^{n_τ})) to each strategy τ for player II. Then η(τ) ∈ A for each strategy τ for player II, and so η is an anti-strategy for player I securing A. On the other hand, it is clear that player I has no winning strategy in G_1(A).
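The anti-strategy of Example 1 can be checked mechanically against any strategy of player II that reads only finitely many bits. In this Python sketch (the names `eta` and `in_A` are illustrative), a strategy τ is a function on {0,1}-tuples of length n_τ, and only a finite prefix of player I's infinite move α is represented.

```python
def eta(tau, n_tau):
    """Anti-strategy of Example 1 (sketch): tau is player II's strategy,
    which reads only the first n_tau bits of player I's move alpha."""
    i = tau((0,) * n_tau)                 # II's answer to any alpha beginning with 0^{n_tau}
    alpha_prefix = (0,) * n_tau + (1, i)  # I's move 0^{n_tau} * <1, i> * 0^N (prefix shown)
    return alpha_prefix, i

def in_A(alpha_prefix, i):
    """(alpha, i) is in A iff some initial segment of alpha lies in
    A_i = {0^n * <1, i> : n in N}."""
    for n in range(len(alpha_prefix) - 1):
        if alpha_prefix[:n] == (0,) * n and alpha_prefix[n:n + 2] == (1, i):
            return True
    return False
```

For every finitely coded τ, `eta(tau, n_tau)` produces a play in A, illustrating that η secures A even though no single move α of player I wins against all τ.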

Example 2 A closed game G_1(B) which is not predeterminate from the viewpoint of player I: Let T be an intuitionistic counterexample to König's lemma, i.e., an unbounded binary tree for which we cannot prove the existence of an infinite path. Let T_i = {t ∗ i^n : t ∈ T, n ∈ N}. Then B = {(α, i) : ∀n[αn ∈ T_i]} is a closed set. If player I had a winning strategy α in G_1(B), then α would be an infinite path of T. Thus player I cannot have a winning strategy in G_1(B). On the other hand, player I has an anti-strategy securing B. Fix an enumeration of T and let t_n be the least s ∈ T with lh(s) = n with respect to this enumeration. Let η be the anti-strategy for player I which assigns (t_n ∗ (τ(t_n))^N, τ(t_n)) to each strategy τ : {0, 1}^n → {0, 1} for player II. Clearly η secures B.

4.2 ω + 1-length games in {0, 1}^N × {0, 1}

In this subsection, we consider another kind of game in {0, 1}^N × {0, 1}.

For a given A ⊆ {0, 1}^N × {0, 1}, the game G_2(A) is defined as follows.

• Player I and player II alternately choose i ∈ {0, 1} to form α ∈ {0, 1}^N.

• After α is formed, player I chooses i ∈ {0, 1}.

• Player I wins G_2(A) if and only if (α, i) ∈ A.

In this game, a strategy σ for player I is a pair (σ_0, σ_1) of functions σ_0 : ⋃_{n∈N} {0, 1}^{2n} → {0, 1} and σ_1 : {0, 1}^N → {0, 1}. By the strict fan theorem, as in the last subsection, we can regard σ_1 as a function from {0, 1}^n to {0, 1} for some n ∈ N.

A strategy for player II is a function τ : ⋃_{n∈N} {0, 1}^{2n+1} → {0, 1}, which can be regarded as an element of {0, 1}^N. Then an anti-strategy η for player I is a function from {0, 1}^N to {0, 1}^N × {0, 1}, which can be regarded as a pair (η_0, η_1) of codes of continuous functions such that, for any strategy τ for player II, (η_0|τ, (η_1|τ)(0)) ∈ II_τ. By the strict fan theorem, there is an n such that, for any strategies τ and τ′, τn = τ′n implies (η_1|τ)(0) = (η_1|τ′)(0), and so we can regard η_1 as a function from {0, 1}^n to {0, 1}.

Theorem 1 For any C ⊆ {0, 1}^N × {0, 1}, G_2(C) is predeterminate from the viewpoint of player I.

Proof. For i < 2, set C_i = {α : (α, i) ∈ C}. Assume that η = (η_0, η_1) is an anti-strategy for player I securing C, where η_1 can be regarded as a function from {0, 1}^n to {0, 1} for some n. Note that, in G_{{0,1}^N}(C_0 ∪ C_1), η_0 is an anti-strategy for player I securing C_0 ∪ C_1. Let σ_0 be the winning strategy for player I constructed in the proof of the intuitionistic determinacy theorem for G_{{0,1}^N}(C_0 ∪ C_1). Set P_{σ_0} = {α : α ∈ I_{σ_0}}. Note that P_{σ_0} is a spread. By the proof of the intuitionistic determinacy theorem, for any α ∈ P_{σ_0}, there exists a strategy δ for player II with η_0|δ = α. By the second axiom of continuous choice, there exists a code ζ of a continuous function such that, for any α ∈ P_{σ_0}, ζ|α is a strategy for player II with η_0|(ζ|α) = α. By the strict fan theorem, there exists a natural number N such that, for any α and β in P_{σ_0}, αN = βN implies (ζ|α)n = (ζ|β)n. Then we can define σ_1 : P_{σ_0} → {0, 1} by σ_1(α) = η_1((ζ|α)n), since σ_1(α) is determined by αN. Define a new strategy σ = (σ_0, σ_1) for player I in G_2(C). Then, for any (α, i) ∈ I_σ, the strategy δ = ζ|α for player II satisfies (α, i) = (η_0|δ, (η_1|δ)(0)), and so σ is a winning strategy for player I in G_2(C).

Comparing this theorem with the examples in the last subsection, we can conclude that predeterminacy depends on how the players construct the sequence rather than on which sequence they construct.

4.3 ω + 2-length games in {0, 1}^N × {0, 1}^2

Next we consider slightly longer games.

For a given set A ⊆ {0, 1}^N × {0, 1}^2, consider the following game G_3(A).

• First, player I and player II alternately choose elements of {0, 1} to form α ∈ {0, 1}^N.

• After α is formed, player I chooses i ∈ {0, 1} and then player II chooses j ∈ {0, 1}.

• Player I wins if (α, 〈i, j〉) ∈ A, and player II wins if player I does not win.

As in the previous subsection, a strategy σ for player I is a pair (σ_0, σ_1), where σ_0 is a function from ⋃_{n∈N} {0, 1}^{2n} to {0, 1} and σ_1 is a function from {0, 1}^N to {0, 1}. We can regard σ_1 as a function from {0, 1}^n to {0, 1} for some n ∈ N.

A strategy τ for player II is a pair (τ_0, τ_1), where τ_0 is a function from ⋃_{n∈N} {0, 1}^{2n+1} to {0, 1} and τ_1 is a function from {0, 1}^N × {0, 1} to {0, 1}. Since τ_1 is continuous, its restriction τ_{1,i} to {0, 1}^N × {i} is also continuous, and so we can regard τ_1 as a pair (τ_{1,0}, τ_{1,1}) of functions from {0, 1}^{n_i} to {0, 1} for some n_i's.

Hence the set of strategies for player II can be regarded as {0, 1}^N × N, and so an anti-strategy for player I can be regarded as a function η from {0, 1}^N × N to {0, 1}^N × {0, 1}^2 such that η(τ) ∈ II_τ for each strategy τ for player II.

As in the case of G_1(X), we have the following examples. For any s ∈ {0, 1}


Example 3 Recall the sets A_i defined in Example 1. The open game G_3(A′) defined by A′ = {(α, 〈i, j〉) : ∃n[(αn)′ ∈ A_j]} is not predeterminate from the viewpoint of player I.

Example 4 Recall the sets T_i defined in Example 2. The closed game G_3(B′) defined by B′ = {(α, 〈i, j〉) : ∀n[(αn)′ ∈ T_j]} is not predeterminate from the viewpoint of player I.

5 Predeterminacy in classical mathematics

In this section, we consider predeterminacy in classical mathematics in order to investigate the role of classical principles in predeterminacy. Note that all definitions and statements in this section are made in classical mathematics, which includes the countable axiom of choice.

Recall that, in intuitionistic mathematics, an anti-strategy is a function η such that η(τ) ∈ II_τ for each strategy τ for player II. We translate this definition into classical mathematics, noting that every function on N^N is continuous in intuitionistic mathematics.


Let G(X) be any of the games treated in the previous sections. An anti-strategy for player I in G(X) is a continuous function which assigns some α ∈ II_τ to every continuous strategy τ for player II in G(X). An anti-strategy η for player I in G(X) secures X if η(τ) ∈ X for all continuous strategies τ for player II. G(X) is predeterminate from the viewpoint of player I if:

if player I has an anti-strategy η securing X, then player I has a winning strategy in G(X).

Note that the ordinary determinacy statement can be seen as "if there is a function η such that η(τ) ∈ II_τ and η(τ) ∈ X for all strategies τ for player II, then player I has a winning strategy in G(X)."

For X ⊆ N^N, strategies for the players in the game G(X) can be regarded as functions from N to N, and so all strategies are continuous. Therefore the condition "continuous" on strategies has no effect in the games G(X), but it does in the games G_1(X), G_2(X) and G_3(X). Moreover, the continuity in the definition of an anti-strategy is essential in what follows.

As mentioned in (Veldman, 2004, 1.1), the intuitionistic determinacy theorem holds also in classical mathematics. In particular, for all A ⊆ {0, 1}^N, G_{{0,1}^N}(A) is predeterminate from the viewpoint of player I in classical mathematics.

Now we consider the predeterminacy of the games G_1(X), G_2(X) and G_3(X) defined in the last section, in classical mathematics. By König's lemma, the classical counterpart of the strict fan theorem, a continuous function {0, 1}^N → {0, 1} or {0, 1}^N → {0, 1}^N is, also in classical mathematics, given by a code η as defined in Section 2. In particular, a strategy for player II in G_1(A) can be seen as a function τ : {0, 1}^n → {0, 1} for some n, and an anti-strategy for player I in G_2(A) can be seen as a pair (η_0, η_1) of a code η_0 of a continuous function and a function η_1 : {0, 1}^m → {0, 1} for some m.

The game G_1(A) is not predeterminate from the viewpoint of player I, where A is the set defined in Example 1. For closed games, the situation differs: whereas Example 2 gives a closed game which is not predeterminate from the viewpoint of player I in intuitionistic mathematics, we will show that there is no such closed game in classical mathematics.



For X ⊆ {0, 1}^N × {0, 1} and s ∈ {0, 1}

6 Further problems


Predeterminacy of closed games G_3(X) in classical mathematics The first problem the author is interested in is whether the closed games G_3(X) are predeterminate in classical mathematics. We expect it can be solved by analyzing the properties of continuous functions on Cantor space.

Classical investigation of predeterminacy We can consider various formalizations of predeterminacy in classical mathematics other than the one defined in Section 5, e.g.,

if player I has an anti-strategy η such that η(τ) ∈ A for each continuous strategy τ for player II, then player I has a continuous winning strategy in G(A).

Note that the requirement that the winning strategy be continuous is newly added. Again, in a game G(X) in N^N, this modification has no effect. However, we can easily find X ⊆ {0, 1}^N which is not predeterminate in this sense but which is predeterminate in the sense of Section 5. The author expects that the investigation of these variations will explicate how continuity constrains functions on Baire space or Cantor space.

Constructive reverse mathematical analysis of predeterminacy Constructive reverse mathematics measures the strength of mathematical statements by the non-constructive principles they require, using constructive mathematics as a base theory. Constructive mathematics is based on intuitionistic logic but does not adopt the axioms introduced in Section 2. Therefore it is included both in classical mathematics and in intuitionistic mathematics.

(1) The role of the second axiom of continuous choice for predeterminacy Under the second axiom of continuous choice, predeterminacy implies determinacy. This implication needs only a fragment of the second axiom of continuous choice, and it is natural to ask exactly how strong a fragment is required. If we measure the strength of fragments by the complexity of R in the axiom, the difficulty lies in the reduction of general formulas of the form ∀α∃βR(α, β) to the form ∀τ∃σ∀α(α ∈ I_σ ∧ α ∈ II_τ → R′(α)).

(2) Equivalences between predeterminacy and intuitionistic axioms (Veldman, 200x) proposed an intuitionistic second order arithmetic and proved that the predeterminacy of open subsets of II-finitary branching spreads in N is equivalent to the strict fan theorem over the system BIM, which corresponds to RCA_0, a popular classical base theory in the field of Friedman-Simpson reverse mathematics (cf. (Simpson, 1999)). The author of the present paper is now looking for similar equivalences beyond open sets. The first task in this direction is to find a suitable intuitionistic axiom to compare with. One candidate is the almost-fan theorem proposed in (Veldman, 2001).

(3) The role of LEM for predeterminacy In the proof of Theorem 2, we use the law of excluded middle. It seems impossible to prove the theorem without this classical law, because we have the set B of Example 2 in intuitionistic mathematics. The next natural question is which fragment of the classical law (such as excluded middle or double negation elimination) is necessary and sufficient for determinacy or predeterminacy statements.

(Akama, Berardi, Hayashi and Kohlenbach, 2004) discovered a hierarchy of such fragments over Heyting arithmetic HA, the constructive counterpart of Peano arithmetic. The author of the present paper aims to place predeterminacy and determinacy statements in this hierarchy.

(4) Equivalences between predeterminacy and classical axioms Since we treat predeterminacy also in classical mathematics, it is natural to consider a Friedman-Simpson reverse mathematical study of predeterminacy. Using constructive mathematics as a base theory, we can make an even finer reverse mathematical study of predeterminacy.


Parts of this paper were written as the final assignment of the 2006/2007 master class in logic at the Mathematical Research Institute, the Netherlands. The author would like to express her gratitude to her supervisor, Dr. Wim Veldman, who introduced her to the appeal of intuitionistic mathematics.


Akama, Y., Berardi, S., Hayashi, S. and Kohlenbach, U. (2004). An arithmetical hierarchy of the law of excluded middle and related principles, in H. Ganzinger (ed.), Proceedings of the Nineteenth Annual IEEE Symposium on Logic in Computer Science, LICS 2004, IEEE Computer Society Press, pp. 192–201.

Nemoto, T. (2008). Determinacy of Wadge classes and subsystems of second order arithmetic. Accepted for publication in Math. Log. Q.

Nemoto, T., Ould MedSalem, M. and Tanaka, K. (2007). Infinite games in the Cantor space and subsystems of second order arithmetic, Math. Log. Q. 53: 226–236.

Simpson, S. G. (1999). Subsystems of Second Order Arithmetic, Springer.

Veldman, W. (2001). Almost the fan theorem, Technical report, Department of Mathematics, University of Nijmegen.

Veldman, W. (2004). The problem of the determinacy of infinite games from an intuitionistic point of view, Technical report, Department of Mathematics, University of Nijmegen. To appear in the proceedings of Logic, Games and Philosophy: Foundational Perspectives, Prague 2004.

Veldman, W. (200x). Brouwer's fan theorem as an axiom and as a contrast to Kleene's alternative. Preprint.





Ivelina Nikolova

University of Sofia

Abstract. This paper describes a system that applies language technologies to instructional materials in order to support the computer-aided design of test items. The approach employs lexical and syntactic information obtained with techniques such as POS tagging, constituency parsing and term extraction. The system compiles a list of terms central to the instructional materials, creates drafts of fill-in-the-blank questions and suggests possible distractors. The experiment is carried out on geography, biology and history textbooks for Bulgarian high schools.

1 Introduction and related work

Asking questions is a way to keep students' attention in class and to verify their understanding. Depending on the type of education and the goal of the teacher, questions can be asked in different forms: orally, as a short written examination, in a game format, etc. One common technique is asking multiple-choice questions, which have become even more popular in recent years because they are also applicable to e-learning. However, designing thousands of tests is a time- and effort-consuming educational activity. All questions in a test should be carefully tuned to the target group of test-takers and should not underestimate or overestimate their knowledge. Hence the teaching experts who prepare the tests must have much broader knowledge of the field than the content explicitly included in the particular textbook, and they have to tune the tests to the knowledge of the test-takers. One of the most difficult tasks in producing test items is to decide whether a question really has its answer in the instructional materials.


These difficulties gave rise to a relatively new research area dealing with support for the generation of test items and the suggestion of answers and distractors. The generation of multiple-choice questions with the help of NLP technologies is an active area in which various text processing tools are used to transform facts from instructional materials into questions that can be used for student assessment. One of the most interesting approaches in this respect is presented by (Mitkov, Ha and Karamanis, 2006), who apply language technologies (LT) to the generation of test items for English, focusing on the automatic choice of distractors. They report speeding up the test development process by about 6-10 times compared to manual test elicitation. Their approach is not domain specific and can be applied to any area. Other authors actively working in the area are (Aldabe, De Lacalle and Maritxalar, 2007), who focus on different types of question models with applications primarily in language learning. We are not familiar with any related work concerning this activity for learning materials in Bulgarian except for the previous work of the author (Nikolova, 2007). So our efforts are strongly inspired by the growing interest in this field, which is due to its significant practical importance. On the other hand, we are motivated and encouraged by the availability of sophisticated LT for Bulgarian, which enables relatively complex text preprocessing, so that the automatic acquisition of learning objects from raw texts does not start from scratch.

This article presents the idea of the author's master thesis, which is still work in progress. The aim is to develop a workbench that supports test designers with language technologies applied to the instructional materials. The task has three aspects: (1) suggestion of key terms for (2) question generation and (3) distractor suggestion. For this purpose the text is preprocessed by a number of previously available LT modules, and lexical and syntactic features are extracted and kept in a metadata format. Those features are used later for the generation of draft learning objects. The experiment described in this article has been carried out in three domains: geography, biology and history. The materials are taken from textbooks for the 9th, 10th and 11th grade respectively.

The remaining part of this article is organised as follows: section 2 sketches the general architecture of the system; section 3 describes the data processing; section 4 explains in detail the experiment done so far; section 5 concerns the evaluation at the current stage of the experiment; section 6 presents the conclusion and issues for future work.

Figure 1: Workbench supporting the development of multiple-choice test items.

2 Workbench description

The system suggests draft learning objects to the test designers in order to help them during the preparation of test items. As shown in Fig. 1, the instructional materials are supplied by the test maker. They are preprocessed and two main data sets are created: (a) a list of key terms (terms central to the supplied text), whose construction is explained in section 4.1, and (b) lexical and syntactic information about the supplied text, which is kept in a metadata format. The user may then obtain all possible questions generated from the supplied material, or only those related to a certain key term she is interested in. If the system does not find appropriate sentences containing the term which match its internal question templates (explained in section 4.2), it returns a list of pointers to the text, containing the local context in which the term appears, and a list of related concepts generated by the same model as the distractors.

3 Data processing

Our task is to support test makers in the process of building educational resources, namely test questions and a vocabulary of important concepts for the domain. We do this by applying language technologies to the raw instructional materials, obtaining linguistic resources which are then loaded into a workbench that helps the test designers in their work. For this purpose we pass through several phases, as shown in Fig. 2.

Figure 2: Data processing.

The instructional material is taken in plain text format and is first parsed with an NP extractor, which obtains nouns and noun phrases in order to build a list of potential key terms to be suggested to the test designers. At the same time as those terms are extracted, an inverted index is produced. It contains a list of the extracted NPs (nouns and noun phrases) and their corresponding absolute positions in the text. A threshold for the importance of the extracted terms is set, and all NPs with frequency higher than the threshold are included in the list of key terms. In addition, all NPs that contain a noun which is a key term are also included in the key terms list.
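The inverted index described above can be sketched in a few lines of Python; here NPs are represented as tuples of tokens, and the function name `build_inverted_index` is illustrative rather than part of the system.

```python
def build_inverted_index(tokens, nps):
    """Map each extracted NP (a tuple of tokens) to the list of absolute
    positions at which it occurs in the tokenized text."""
    index = {}
    for np in nps:
        n = len(np)
        for pos in range(len(tokens) - n + 1):
            if tuple(tokens[pos:pos + n]) == np:
                index.setdefault(np, []).append(pos)
    return index
```

The position lists make it cheap to return pointers into the text, as the workbench does when no question template matches a key term.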

During the next phase the raw text is tagged with POS categories. In our case we found it practical to use the SVMTool of (Giménez and Márquez, 2004), trained on the newspaper part of BulTreeBank 1. The proper names recognised by the tagger were added to the list of key terms, and the output was then processed with the multilingual statistical parsing engine of Dan Bikel (Bikel, 2004), an implementation and extension of Collins' parser (Collins, 1999). The parsing model was trained on BulTreeBank. All the syntactic and lexical information obtained in these phases is kept in a metadata format and used later to produce the draft learning objects (key terms, test items) which are suggested to the test designers.

1 HPSG-based Syntactic Treebank of Bulgarian (BulTreeBank), http://bultreebank.org/

4 The experiment

4.1 Key terms suggestion

We build our approach on the understanding that questions given to the learner concern terms which are central to the domain. These are the terms which serve as a basis for the learned material and represent a specific domain vocabulary. Here those terms are referred to as key terms. Although verbs might also qualify as good key terms in some domains, in this experiment we pay attention only to nouns and noun phrases as potential key terms. They were extracted by the classic frequency-based approach to automatic term extraction. In order to overcome the inflection of the language, the raw texts were first lemmatized and then parsed with the NP-extractor Morena. Once we obtained a list of nouns (LN) and noun phrases (LNP), we had to rank them in order to extract only the most important ones, which are the focus of our approach and of the users' queries. We applied two different techniques for measuring term importance over LN: simple frequency counting and TF-IDF. Like (Mitkov et al., 2006), we noticed that TF-IDF produces worse results, as it tends to give low scores to frequently used words (for example stopanstvo, 'economy') which are actually quite important in the case of instructional materials (it is common to repeat the same information to the learners in order to help them remember it better). At the same time, sorting the list of nouns by their frequencies after removing the stop words gave us quite satisfactory results.
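The frequency-based ranking with stop-word removal can be sketched as follows; the function name `rank_terms` and the threshold semantics (strictly greater than) are illustrative assumptions, not details fixed by the paper.

```python
from collections import Counter

def rank_terms(lemmas, stopwords, threshold):
    """Frequency-based key-term ranking: drop stop words, count lemma
    frequencies, and keep the lemmas whose frequency exceeds the threshold,
    most frequent first."""
    counts = Counter(w for w in lemmas if w not in stopwords)
    return [term for term, freq in counts.most_common() if freq > threshold]
```

Running this over a lemmatized text with a domain-neutral stop-word list yields the candidate key terms that are then extended with the NPs containing them.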

Word frequency f_i | Number of words w_f with frequency f_i
        55         |    1
        46         |    1
        22         |    6
        20         |    1
        18         |    1      } w_f ≤ f_i
        16         |    1
        14         |    1
        12         |    5
        10         |    5
         8         |    6
         6         |    8
         4         |   44      } w_f ≥ f_i
         2         |  174

Table 1: Word frequency distribution in a text of about 1000 words.

To set the threshold between important and less important terms, in previous experiments we examined test items prepared manually by test designers for the same material as the corpora we are processing. The test items were parsed with an NP extractor. We checked the frequency, in the whole corpus, of the NPs extracted from the test items, and the lowest such frequency was accepted as the threshold. After repeating the same procedure for different domain corpora, we noticed that the importance border is near the term frequency which equals the number of words having that frequency. For example, in a comparatively short text we obtain the distribution in Table 1, where the threshold is set to frequency f = 7.

Once the threshold is set, we consider all terms above it as key terms, which should be suggested to the test-makers. We then add to the list of key terms all NPs that contain a key term. For example, along with the term stopanstvo (economy) from the geography materials we add the following NPs:

agrarno stopanstvo (rural economy),

svetovno stopanstvo (world economy),

nacionalno stopanstvo (national economy),

pazarno stopanstvo (market economy),

nacionalno pazarno stopanstvo (national market economy),

svremenno svetovno stopanstvo (contemporary world economy),

ponsko stopanstvo (Japanese economy),

naturalno stopanstvo (natural economy),

svremenno moderno agrarno stopanstvo (contemporary modern rural economy)

Removing the NPs that contain stop words prevented the use of phrases like thnoto stopanstvo (their economy). After POS tagging, the recognised proper nouns were also added to the list of key terms, and the final list of key terms was formed.
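This expansion step can be sketched as follows (English-glossed toy data for readability; the helper name is ours, not the system's):

```python
def expand_key_terms(key_terms, noun_phrases, stop_words):
    # Add every NP that contains a key term, unless the NP also
    # contains a stop word (which would yield phrases like
    # "their economy").
    expanded = list(key_terms)
    for np in noun_phrases:
        words = np.split()
        if any(w in key_terms for w in words) and not any(w in stop_words for w in words):
            expanded.append(np)
    return expanded

# Hypothetical English-glossed noun phrases.
nps = ["rural economy", "their economy", "world economy"]
key_list = expand_key_terms({"economy"}, nps, stop_words={"their"})
```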

4.2 Question generation

In order to select clauses appropriate for question generation, a module processes the lexico-syntactic information collected during the preprocessing phase and decides that a clause is eligible if:

(1) it contains at least one key term,

(2) the term is in the NPA phrase of its VPS 2 (the NPA phrase is the subject daughter of the VPS phrase), and

(3) the clause is finite.

If all three conditions hold, we consider the term to be in the subject phrase of the sentence, which means that it has central meaning for the sentence, and we apply a rule that replaces the focal term with a blank. The system additionally checks that the sentence does not refer to figures, tables or appendices.
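The three eligibility conditions can be sketched as a simple predicate over a clause record (a hypothetical, much-simplified representation of the parser output, not the system's actual data structures):

```python
def is_eligible(clause, key_terms):
    # Condition (1): the clause contains at least one key term.
    has_key_term = any(t in clause["tokens"] for t in key_terms)
    # Condition (2): a key term sits inside the subject NP
    # (the NPA daughter of the VPS phrase).
    term_in_subject = any(t in clause["subject_np"] for t in key_terms)
    # Condition (3): the clause is finite.
    return has_key_term and term_in_subject and clause["finite"]

# Toy clause record (English glosses for readability).
clause = {"tokens": ["heredity", "keeps", "species", "unchanged"],
          "subject_np": ["heredity"],
          "finite": True}
```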

For example in the materials of Biology the terms nasledstvenost (heredity) and

unasledvane (inheritance) are key terms. And we have the following information about

the constituents for one of the sentences which contain the terms.

(S (VPS (NPA (N (NN Blagodarenie)) (PP (Prep (IN na)) (Ncfsd nasledstvenostta) (CoordP (Conj (C (CC

i)) (Ncnsd unasledvaneto)) (ConjArg (NPA (N (NN vidovete)) (PP (Prep (IN v)) (N (NN prirodata))))))))

(VPC (V (T (RP ne)) (Pron (Ppxta se)) (V (VB proment))) (NPA (A (JJ dlgo)) (N (NN vreme))))) (PUNC .))

Whichever of the two terms is chosen by the user, the system will try to produce a stem from this sentence, because it satisfies the three necessary conditions: it will replace the chosen key term with a blank and suggest the key term as the answer.

E.g. Blagodarenie na ... i unasledvaneto vidovete v prirodata ne se promenqt dlgo vreme.

(Due to ... and inheritance the species remain unchanged for long periods.)

2 NPA - head-adjunct noun phrase; VPS - head-subject verb phrase. For full definitions, see HPSG-based Syntactic Treebank of Bulgarian (BulTreeBank), BulTreeBank Project Technical Report 05, 2004.




correct answer: nasledstvenostta (the heredity)

In the following sentence, again the key term nasledstvenost is present.

(S (VPS (NPA (CoordP (ConjArg (NPA (N (NN Izuqavaneto)) (PP (Prep (IN na)) (Ncfsd nasledstvenostta)

(CoordP (Conj (C (CC i))) (Ncfsd izmenqivostta))))) (Conj (C (CC i))) (ConjArg (N (NN razkrivaneto)))) (PP

(Prep (IN )) (Ncmpd ) (Pron (Ppetdp3 )))) (VPC (V (VB )) (NPA (A (JJ )) (N (NN )) (IN ))) (Ncfsd )) (PUNC .))

The term is part of the subject phrase, so it is possible to make a fill-in-the-blank question, where the blank replaces the focal term nasledstvenostta.

Izuqavaneto na ... i izmenqivostta i razkrivaneto na zakonomernostite im sa osnovnite zadaqi

na genetikata.

(The study of ... and variability and the discovery of their regularities are the basic tasks of genetics.)

correct answer: nasledstvenostta (heredity)

Apart from replacing the focal term with a blank, we do not apply any other transformation to the chosen sentence.

4.3 Distractor generation

For the purposes of our application we need to suggest distractors in two cases: (1) when questions are generated automatically, and (2) when a key term was chosen by the designer but no questions could be generated for it; in the latter case only related concepts are shown to the user (they are extracted on the same principle as distractors, which is why we explain their construction in this section).

In well-designed multiple-choice tests, the distractors are always semantically close to the correct answer (as well as to each other, in a sense). To find such distractors, in previous studies we tried paragraph clustering in order to define groups of text sections with similar topics, but on short texts this methodology does not give promising results. We therefore chose a rather simple working solution. We examined already prepared tests for the beginner level and noticed that most of the distractors looked very similar at first sight. They were mostly phrases sharing the same noun with different modifiers, or the opposite, composed of the same modifier and different nouns. We therefore adopted the practice of suggesting as distractors NPs that contain the same noun as the key term chosen by the user but a different modifier, and, the other way around, NPs that keep the modifier of the chosen key term but contain a different noun. All these phrases are taken from the NP list generated in the first stage.


Such an example is:


Constant modifier

priroden kompleks (natural complex)

prirodna zona (natural zone)

priroden komponent (natural component)

Constant noun

agrarno stopanstvo (rural economy)

svetovno stopanstvo (world economy)

nacionalno stopanstvo (national economy)
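The substitution strategy can be sketched as follows (the helper name is ours; the phrases are taken from the examples above):

```python
def suggest_distractors(key_phrase, np_list):
    # Distractors either share the noun (taken here as the last word)
    # with a different modifier, or share the modifier (first word)
    # with a different noun.
    words = key_phrase.split()
    modifier, noun = words[0], words[-1]
    same_noun = [np for np in np_list
                 if np != key_phrase and np.split()[-1] == noun]
    same_modifier = [np for np in np_list
                     if np != key_phrase and np.split()[0] == modifier]
    return same_noun + same_modifier

# NPs from the extracted list (see the examples above).
nps = ["agrarno stopanstvo", "svetovno stopanstvo",
       "nacionalno stopanstvo", "priroden kompleks"]
distractors = suggest_distractors("agrarno stopanstvo", nps)
```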

5 Evaluation

At the current stage the system has been tested by three teachers, who are professional test designers. Each of them is a specialist in one of the three areas and also holds a degree in one of the others. They experimented with materials in three domains: biology, geography and history. Each designer had to choose 20 key terms in total and to mark the questions the system produced for the chosen key terms as YES (acceptable, with or without changes) or NO (not acceptable).

From the biology and geography materials useful definitions were extracted, and these were appreciated by the designers, while in the history domain mainly proper names were helpful. On average, 61% of the generated fill-in-the-blank questions were reported as acceptable by the designers (with or without post-editing). The professionals said that the context and the distractors helped them a lot, because they gave them more options for finding the information needed to correct a badly formed question. The remaining questions were discarded mainly because some sentences had a general meaning and did not represent a specific definition; others because the blank was ambiguous - there were too many possible correct answers; or because the chosen term was not central to the selected sentence.

The designers were especially satisfied with the high quality of the key terms, which served as a cross-reference over the whole material. They found them useful for systematising the topics on which students could be examined. The key terms also saved them time, because the vocabulary of key terms could be used as a summary of the contents. A deeper analysis of the speed-up of the process will be done after the user interface of the system has been improved.

The test designers were certain that question items prepared in this way are useful only for beginner-level testing, where deep understanding is not required and learners are taught mostly basic definitions.

6 Conclusion and future work

This experiment represents a step towards automatic test generation and shows the advances gained by using more sophisticated tools and deeper processing of the instructional materials.

Although the approach is considered domain independent, we find biology and geography more suitable, producing better results than history. One of the reasons is that in history pure one-sentence definitions are hardly found and normally many references



are used instead. In this domain an important role was played by the proper names, which were also included in the list of key terms.

As this article represents work in progress, we plan to deepen the data analysis by adding dependency parsing. We can then observe subject and object phrases and make additional inferences. We will also try different techniques for distractor selection, such as term similarity measures over the corpus, as well as different types of questions. We plan to improve the user interface, because it is a main issue for the efficiency of the test designers' work. Overall, we plan a deeper evaluation of the system, including classical test theory and error analysis, in order to improve the produced questions.

7 Acknowledgements

My thanks go to my supervisor Galia Angelova and to Atanas Chanev, who kindly provided models for the SVMTool and Dan Bikel's parser for Bulgarian.


References

Aldabe, I., De Lacalle, M. L. and Maritxalar, M. (2007). Automatic acquisition of didactic resources: generating test-based questions, in I. F. de Castro (ed.), Proceedings of SINTICE 07, pp. 105–111.

Bikel, D. (2004). A distributional analysis of a lexicalized statistical parsing model, in D. Lin and D. Wu (eds), Proceedings of EMNLP.
URL: http://www.cis.upenn.edu/~dbikel/software.html#stat-parser

Collins, M. (1999). Head-Driven Statistical Models for Natural Language Parsing, PhD thesis, University of Pennsylvania.

Giménez, J. and Márquez, L. (2004). SVMTool: A general POS tagger generator based on support vector machines, Proceedings of the 4th International Conference LREC'04.

Mitkov, R., Ha, L. A. and Karamanis, N. (2006). A computer-aided environment for generating multiple-choice test items, Natural Language Engineering 12: 177–194.

Nikolova, I. (2007). Supporting the development of multiple-choice tests in Bulgarian by language technologies, in E. Paskaleva and M. Slavcheva (eds), Proceedings of the Workshop A Common Natural Language Processing Paradigm for Balkan Languages, pp. 31–34.





Yves Peirsman

University of Leuven & Research Foundation – Flanders

Abstract. Word Space Models provide a convenient way of modelling word meaning in

terms of a word’s contexts in a corpus. This paper investigates the influence of the type of

context features on the kind of semantic information that the models capture. In particular,

we make a distinction between semantic similarity and semantic relatedness. It is shown

that the strictness of the context definition correlates with the models’ ability to identify

semantically similar words: syntactic approaches perform better than bag-of-word models,

and small context windows are better than larger ones. For semantic relatedness, however,

syntactic features and small context windows are at a clear disadvantage. Second-order bag-of-word models perform below average across the board.

1 Introduction

Word Space Models have become the standard approach to the computational modelling

of lexical semantics (Landauer and Dumais, 1997; Lin, 1998; Schütze, 1998; Padó and

Lapata, 2007). They indeed offer a convenient way of capturing the meaning of a word

simply on the basis of the contexts in which it is used in a corpus. In that way, they can

retrieve the most similar words for a given target word. Yet, there is no agreement on how

context should be defined exactly. Context features vary from sentences or paragraphs to

single words, with or without the addition of syntactic relations. While all these features

definitely capture some semantic information, it is only to be expected that the choice of

context definition has an influence on the kind of semantic relatives that the Word Space

Models will find.

It is well known that words may be semantically related along a number of dimensions

(Cruse, 1986). In the NLP literature, similarity takes up a central position, with synonymy

as the most obvious example. But there are other types of semantic relations, too. For

instance, two words like doctor and hospital have a clear connection, although they are

in no way semantically similar. Recovering this semantic relatedness from a corpus may

have to proceed along different lines than the modelling of semantic similarity. Specific

Word Space Models may thus have a bias towards one or the other of these relations. In the

literature, however, the investigation of this semantic behaviour of Word Space Models

has only recently come to the fore (Sahlgren, 2006; Peirsman, Heylen and Speelman, 2007).


In this paper, we investigate eleven Word Space Models, representing three broad

classes, with respect to their performance in the fields of semantic similarity and semantic

relatedness. It will be shown that there is no such thing as a single best Word Space

Model: the ranking of the approaches depends on the type of semantic information we

want to find. The paper is structured as follows: in the next section, we will introduce the

different context models and the two types of semantic relationship that we investigate.

Section 3 then presents the precise setup of our experiments, while section 4 discusses

their results. Section 5 wraps up with conclusions and an outlook for future research.


2 Word Space Models


2.1 Competing definitions of context

All Word Space Models of lexical semantics rely on the so-called distributional hypothesis

(Harris, 1954), which claims that words with similar meanings occur in similar contexts.

From this hypothesis, it follows that semantic similarity can be modelled in terms of

contextual or distributional similarity. This is done by constructing for each target word

a so-called context vector, which contains the target word's scores for all possible

context features. These scores can be the number of times that the contextual feature

co-occurs with the target, or more often, some kind of weighted frequency that captures

the statistical link between the target word and that feature. The distributional similarity

between two words is then calculated as the similarity between their vectors, on the basis

of a function like the cosine. In this way, it is possible to find for each target word the n

most distributionally similar words in any given corpus. We call these words the nearest

neighbours of the target.
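This nearest-neighbour procedure can be sketched minimally as follows (toy vectors invented for illustration; real models use thousands of dimensions and weighted scores):

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between two context vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbours(target, vectors, n=1):
    # Rank all other words by the cosine of their vectors
    # with the target's vector.
    sims = [(w, cosine(vectors[target], v))
            for w, v in vectors.items() if w != target]
    return sorted(sims, key=lambda p: p[1], reverse=True)[:n]

# Toy context vectors over three invented context features.
vectors = {"doctor":   [4.0, 1.0, 0.0],
           "surgeon":  [5.0, 2.0, 0.0],
           "airplane": [0.0, 1.0, 6.0]}
```

With these toy vectors, surgeon comes out as the nearest neighbour of doctor because the two share most of their contexts.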

Based on the definition of context, it is possible to define a hierarchy of Word Space

Models, each with its own kind of contextual features. At the top of the tree we make a distinction

between document-based and word-based approaches. Document-based models

use sentences, paragraphs or documents as dimensions, and count how often a target word

appears in each of these entities in the corpus (Landauer and Dumais, 1997; Sahlgren,

2006). Word-based models, by contrast, take not the context itself, but features from this

context as dimensions. They can be subdivided into syntactic and bag-of-word models.

So-called bag-of-word or co-occurrence models take into account all words within a predefined

distance of the target word (generally with the exception of semantically empty

words like articles, etc.), whereas syntactic models consider only those words to which

the target is syntactically related. Sometimes the features of such syntactic models consist

of these syntactically related words alone (Padó and Lapata, 2007), sometimes they are

formed by the word plus its relation (Lin, 1998). Finally, we can distinguish between first-order and second-order approaches. First-order bag-of-word approaches count the context

words directly (Levy and Bullinaria, 2001), while second-order bag-of-word approaches

sum the vectors of these context words. In this last case, the target’s context vector thus

contains frequency information about the context words of its (first-order) context words

(Schütze, 1998). Although it is in principle possible to construct second-order syntactic

models, to our knowledge no implementation has been presented in the literature.
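The first-order/second-order difference can be sketched as follows: a target's second-order vector is simply the sum of the first-order vectors of its context words (toy data and hypothetical helper name, not any published implementation):

```python
def second_order_vector(word, first_order, context_words):
    # A target's second-order vector sums the first-order vectors
    # of its context words, so it records the contexts of its
    # contexts rather than the contexts themselves.
    dims = len(next(iter(first_order.values())))
    vec = [0.0] * dims
    for c in context_words[word]:
        for i, score in enumerate(first_order[c]):
            vec[i] += score
    return vec

# Invented first-order vectors over two context features.
first_order = {"wing": [2.0, 0.0], "pilot": [1.0, 3.0]}
context_words = {"plane": ["wing", "pilot"]}
vec = second_order_vector("plane", first_order, context_words)
```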

2.2 Semantic similarity and semantic relatedness

While it is claimed that all Word Space Models capture some kind of semantic information,

so far we have only very limited knowledge about the influence of the context

definition on the types of semantic relationship that the models find. In this paper we

investigate two such types: semantic similarity and semantic relatedness. The first applies

to synonyms (e.g., plane and airplane), hyponyms and hypernyms (e.g., bird and

blackbird) and co-hyponyms (e.g., blackbird and robin) — two words with a relationship

of similarity between the concepts they refer to. Semantic relatedness, by contrast, exists

between words whose concepts are not necessarily similar, but still related, for instance

because they belong to the same script, frame or lexical field. This is true for pairs like

bird and beak or plane and pilot. Note that it is not possible to draw a clear boundary be-



tween semantic similarity and semantic relatedness. Take the word pair pepper–salt, for

instance. These two words are clearly semantically similar, since they both refer to spices.

At the same time, however, they are also semantically related: not only do they both belong

to the lexical fields of food or spices, they also often co-occur together in the phrase

salt and pepper. Instead of mutually exclusive classes, semantic similarity and relatedness

can thus better be thought of as the two ends of a continuum, or two perpendicular

axes in a two-dimensional plane.

For many NLP applications, similarity might be the most important relation to model.

In typical Query Expansion, for instance, only semantically similar words (synonyms or

possibly hyponyms) make for a desired extension of a search query. Similarly, in Question

Answering a word in the question should only be matched with semantically similar

words in the database where the computer looks for the answer. Semantic similarity, however,

is just one way in which words may be related in our mental lexicon, as suggested

by psycholinguistic association experiments. According to Aitchison (2003), the four

major types of associations that people give in response to a cue word are, in order of

frequency, co-ordination (co-hyponyms like pepper and salt), collocation (like salt and

water), superordination (hypernyms like butterfly and insect) and synonymy (like starved

and hungry). A similar observation is made by Schulte im Walde and Melinger (2005).

Comparing the results of their German verb association experiment with GermaNet, they

note that only 6% of the associations are synonyms, 14% are hypernyms and 16% are

hyponyms, while no less than 54% of the associations are unrelated to their cue words in

the GermaNet taxonomy. Although part of this can be explained by the incompleteness

of the database, such results will be difficult to replicate with models of semantic similarity.

After all, these are meant to prefer synonyms over hypernyms and co-hyponyms, and

even exclude collocates altogether. The best Word Space Models of semantic similarity

may thus not be the best models of relatedness, and vice versa.

Despite the wealth of research into Word Space Models, studies into their semantic

characteristics are scarce. Most often one model is applied to a specific computational-linguistic

task, and “comparisons between the (...) models have been few and far between

in the literature” (Padó and Lapata, 2007, p. 166). Sahlgren (2006) is one exception to this

rule. Focusing on document-based and first-order bag-of-word models, he showed that the

latter are better geared towards the modelling of paradigmatic (similarity) relations, while

the former have a clear bias towards syntagmatic relations. Unfortunately, Sahlgren left

out a number of popular word space approaches, like those based on syntactic relations or

second-order co-occurrences. Peirsman et al. (2007) also included syntactic models, but

concentrated on similarity relations only. This article thus sets out to fill these gaps in the

literature, by discussing a wide variety of model types from the perspectives of similarity

as well as relatedness.

3 Experimental setup

We investigate three classes of Word Space Models, for a total of eleven approaches: five

first-order bag-of-word models, five second-order bag-of-word models and one syntactic

model. Our corpus is the 300 million word Twente Nieuws Corpus of Dutch newspaper

articles, collected at the University of Twente and parsed by the Alpino parser at the

University of Groningen. As our test set, we selected from this corpus the 10,000 most



frequent nouns. For each of these, we had all models retrieve the 100 most similar neighbours

from the 9,999 remaining nouns in the set.

The bag-of-word models, both first-order and second-order, varied the size of the context

window they took into account — 1, 3, 5, 10 or 20 words to either side of the target —

for a total of ten models. Sentence boundaries were ignored; article boundaries were not.

The syntactic model considered eight different types of syntactic dependency relations,

in which the target word could be (1) the subject of verb v, (2) the direct object of verb

v, (3) a prepositional complement of verb v introduced by preposition p, (4) the head of

an adverbial prepositional phrase (PP) of verb v introduced by preposition p, (5) modified

by adjective a, (6) postmodified by a PP with head n introduced by preposition p,

(7) modified by an apposition with head n, or (8) coordinated with head n. Each specific

instantiation of the variables v, p, a, or n was responsible for a new context feature.

The other parameter settings were shared by all eleven models:

• Dimensionality: For all approaches, we used the 2,000 most frequent contextual

features in the corpus as dimensions. This is a simple but common way of reducing

the otherwise huge dimensionality of the vectors, which leads to state-of-the-art

results, particularly for the syntactic model (Levy and Bullinaria, 2001; Padó and

Lapata, 2007). For the syntactic model these dimensions are the 2,000 most frequent

syntactic features, like subj of fly. For the bag-of-word models, they are

formed by the 2,000 most frequent words in the corpus. Function words and other

semantically empty words were excluded a priori on the basis of a stop list.

• Frequency cut-off: Depending on the context size, we established a cut-off value n,

so that the models ignored those features that occurred together with the target fewer

than n times. For context size 3, this cut-off was fixed at 3; for the larger context sizes it lay at 5. The syntactic model and the bag-of-word model with context size 1 did not use a cut-off, since for them a cut-off led to data sparseness.

• Frequency weighting: As is usual in the literature, the context vectors of the target

words did not contain the simple frequencies of the features. Instead, they listed

the point-wise mutual information between each feature and the target word. This

measure expresses whether the two occur together more or less often in the corpus

than we expect on the basis of their individual relative frequencies.

• Similarity measure: Finally, the distributional similarity between two target words

was measured by the cosine between their context vectors.
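The point-wise mutual information weighting can be sketched from raw counts as follows (toy numbers; the actual models compute this score for every target–feature pair):

```python
from math import log2

def pmi(cooc, freq_target, freq_feature, total):
    # Point-wise mutual information: log-ratio of the observed
    # co-occurrence probability to the probability expected if
    # target and feature occurred independently.
    p_joint = cooc / total
    p_independent = (freq_target / total) * (freq_feature / total)
    return log2(p_joint / p_independent)

# Toy counts: the pair co-occurs 8 times in a 1000-token corpus;
# the target occurs 20 times and the feature 50 times.
score = pmi(8, 20, 50, 1000)
```

Independence would predict 1 co-occurrence in 1000 tokens; observing 8 gives a PMI of log2(8) = 3 bits.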

4 Results

4.1 Semantic similarity

We evaluated the ability of our models to find semantically similar words on the basis of

a comparison with Dutch EuroWordNet (Vossen, 1998). This lexical database contains

more than 34,000 sets of noun synonyms and the relations that exist between them. Two

evaluation measures were applied. First, we focused on the general ability of our models

to capture semantic similarity. Then we looked into the distribution of four more specific

similarity relations.




Figure 1: Wu & Palmer similarity scores between target and nearest neighbour.

syn: syntactic model, cn: first-order bag-of-words, ccn: second-order bag-of-words

n: context size (number of words on either side of target)

The general performance of the models was quantified by the average Wu and Palmer

score between a target word and its single nearest neighbour (Wu and Palmer, 1994). This

Wu and Palmer score is a popular way of measuring the semantic similarity between two

words, based on their depth and their distance from each other in a taxonomic structure

like EuroWordNet. If either the target or its nearest neighbour were not present in the

database, the pair was simply ignored. In order to make the results perfectly comparable

across models, we restricted the results to the 4183 target words with a nearest neighbour

in EuroWordNet for all models. The resulting Wu and Palmer scores are given in Figure 1.

Figure 1 shows a clear decrease in Wu and Palmer score as the definition of context

becomes less strict. A Friedman test indeed confirms the influence of the type of Word

Space Model on performance (Friedman chi-squared = 3541.575, df = 10, p-value <

.001). The syntactic model achieves the highest average similarity score by far, followed

by the first-order bag-of-word models and finally the second-order bag-of-word models.

Moreover, small contexts appear to model semantic similarity better than large ones. A

test of multiple comparisons after Friedman showed that the differences between all pairs

of models are indeed statistically significant at the .05 level, except for those between

context sizes 1 and 3 (both first-order and second-order) and that between the first-order

model with context size 20 and the second-order model with context size 5.

Of course, this general similarity score does not give any information about what specific

type of similarity relation the models find. We therefore defined four taxonomic

similarity relations, again with EuroWordNet as a gold standard. Synonyms were defined

as words in the same synonym set as the target word, hypernyms as words exactly one

node above the target, hyponyms as words one node below, and co-hyponyms as words one

node below any of the target’s hypernyms. Together, these relations make up the target’s

EuroWordNet environment. Note that our strict definition of these relationships does not





Figure 2: Distribution of semantic similarity relations for all models.

allow for more than one or two steps in the tree, and thus disregards possible hypernyms

or hyponyms that are more than one step away from the target. This approach ensures the

reliability of our gold standard, but constitutes a test that a relatively low percentage of

nearest neighbours will pass. Figure 2 shows how the single nearest neighbours of our target

words are distributed over the four similarity relations. Again we restricted ourselves

to the 4183 target words with a neighbour in EuroWordNet for all models.

Not surprisingly, the number of retrieved similarity relations mirrors the general Wu

and Palmer similarity score. Again the syntactic model performs best: 51.2% of its single

nearest neighbours that occur in EuroWordNet are situated in the environment of the target

word. This precision drops to between 40.5% and 26.4% for the first-order bag-of-word

methods and even lower for the second-order models. As above, the performance of

the models seems to depend on the strictness of their context definition. The more strictly they view context — i.e., syntactic context rather than a bag of words, smaller context windows rather than large ones — the more examples of semantic similarity they find. This pattern

remains unchanged when a larger number of nearest neighbours is taken into account.

With one exception, the distribution of the four relations is comparable across the different

models. Co-hyponyms figure most prominently among the nearest neighbours,

followed by synonyms, hypernyms and hyponyms. Only the syntactic model finds an

unexpectedly high number of hypernyms. This can probably be explained by the way

syntactic relations are typically inherited in a taxonomy: all characteristics of a (prototypical)

concept (can fly, for instance) also apply to its hypernyms, so that these are often

most similar in terms of syntactic distribution in a corpus.

4.2 Semantic relatedness

The results in the previous section do not necessarily express the overall quality of the investigated

Word Space Models. It is possible that the models that scored relatively badly




Figure 3: Evolution of the precision, recall and F-score of the first-order bag-of-word

model with context size 5 in its retrieval of associations.

in the similarity experiments are simply biased towards a different kind of semantic relation.

In this second round of experiments we therefore turn our attention from semantic

similarity to semantic relatedness.

For this task, we relied on a psycholinguistic experiment of human associations, described

in De Deyne and Storms (in press). In this experiment, participants were asked

to list three different word associations for 1,424 cue words. Each word was presented

to at least 82 participants, resulting in a total of 381,909 responses. For instance, aap

(‘monkey’) triggered the response zoo (‘zoo’) 27 times, aarde (‘earth’) prompted planeet

(‘planet’) 14 times and bikini (‘bikini’) elicited vakantie (‘holiday’) 6 times. These examples

show that this experiment taps into a different kind of semantic relationship than

the previous one. Note that at this moment, we ignore the fact that association strength is

often asymmetric (Michelbacher, Evert and Schütze, 2007).

In order to make the results comparable to those in section 4.1, we reduced the data set

to those cue words and associations that belong to the 10,000 most frequent nouns in our

corpus. This gave a gold standard of 768 cue words with a total of 31,862 different cue–association pairs. When these associations are checked against EuroWordNet, we indeed

find that only 8% belong to the EuroWordNet environment of their target word. 9% of

these are synonyms, 19% are hypernyms, 16% are hyponyms and 56% are co-hyponyms.

We evaluated the Word Space Models against this gold standard by counting the number

of associations that they find as the nearest neighbours to the cue words. If we consider

just one nearest neighbour, the results already show a considerable difference from

the previous experiments. As the top chart in Figure 4 indicates, the syntactic model still

performs best, with 340 associations (a precision of .443), followed by the first-order and

then the second-order bag-of-word models. However, within the bag-of-word models, the

ideal context size has changed. The first-order bag-of-word models with context sizes 10

and 20 have 299 and 293 associations among their single nearest neighbours, respectively.

For 768 targets, this gives precision values of .389 and .382. Then we find context sizes 5

(n = 281, P = .366), 3 (n = 269, P = .350) and 1 (n = 228, P = .297). Larger contexts

thus outperform their smaller competitors here. Note that the two best models share only 90 correct predictions, which indicates that they have different preferences among the associations.

A look at the data suggests that the syntactic model indeed picks out those

associations that are also semantically similar to their target word, while the first-order

bag-of-word models with large contexts cover collocational relatedness better. With the

second-order models, finally, context size 3 seems optimal.
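The precision figures just quoted follow from a simple count. A sketch, under the assumption that each model proposes one nearest neighbour per cue and the gold standard maps cues to sets of associations:

```python
def precision_at_one(predictions, gold):
    """Count cues whose single nearest neighbour is a gold association.
    predictions: cue -> nearest neighbour; gold: cue -> set of associations.
    Returns (hits, precision over all cues)."""
    hits = sum(1 for cue, nn in predictions.items()
               if nn in gold.get(cue, set()))
    return hits, hits / len(predictions)

# With 768 cue words, 340 correct nearest neighbours give the reported
# precision of about .443:
print(round(340 / 768, 3))  # 0.443
```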

When we consider one nearest neighbour, the models cannot find more than 768 associations,

and recall thus stays extremely low. We therefore increased the number of nearest

neighbours from 1 to 100 and calculated the precision, recall and F-score at each step.

By way of example, Figure 3 plots the evolution of these values for the best-performing

model. The bottom bar chart in Figure 4, then, shows the maximum F-score of all the

models. The syntactic approach has lost its lead, which suggests that it is able to model

only a small number of associations well — probably those that also score highly on

similarity. Instead it is now the first-order bag-of-word model with context size 5 that

outclasses all others, with an F-score of .127 (P = .112, R = .148) at 55 neighbours.

Extending the context window to 10 words brings the F-score down to .122 (P = .102,

R = .150, 61 neighbours); reducing the window to 3 words takes it to .120 (P = .104,

R = .143, 57 neighbours). Next, we have the bag-of-word model with context size 20

(F = .115, P = .102, R = .133, 54 neighbours) and only then the syntactic model

(F = .111, P = .102 R = .123, 50 neighbours). Large contexts now score slightly worse

than intermediate ones, which probably strike the best balance between similarity relations

and collocational links. Second-order models never attain an F-score above .10, and

neither do the smallest context windows, which are thus clearly biased towards similarity.
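The sweep from 1 to 100 neighbours can be sketched as follows. The code assumes that precision is computed over all k × 768 proposed neighbours and recall over the 31,862 gold pairs, which is consistent with the figures above (e.g. P = .112, R = .148 at 55 neighbours):

```python
def prf_at_k(neighbours, gold, k):
    """neighbours: cue -> ranked neighbour list; gold: cue -> association set.
    Precision is taken over all k * len(neighbours) predictions, recall over
    the total number of gold cue-association pairs."""
    found = sum(len(set(nns[:k]) & gold.get(cue, set()))
                for cue, nns in neighbours.items())
    total_gold = sum(len(a) for a in gold.values())
    p = found / (k * len(neighbours))
    r = found / total_gold
    f = 2 * p * r / (p + r) if found else 0.0
    return p, r, f

def max_f_score(neighbours, gold, max_k=100):
    """Maximum F-score over k = 1..max_k, the quantity plotted per model."""
    return max(prf_at_k(neighbours, gold, k)[2] for k in range(1, max_k + 1))
```

As a design note, precision falls and recall rises monotonically in expectation as k grows, so the maximum F-score picks out each model's best trade-off point.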

4.3 Discussion

Proceedings of the 13th ESSLLI Student Session

In part, our experiments have confirmed earlier results in the literature. For instance,

Sahlgren (2006) already noted that with first-order bag-of-word models, larger contexts

score better in his association experiment, while smaller contexts score better in the synonymy

test. Peirsman et al. (2007) found even better results for a syntactic model in

Dutch, at least with respect to semantic similarity evaluated against EuroWordNet. Both

findings are borne out by our experiments.

At the same time, our results add some new insights to these earlier observations. We

have shown that the syntactic model and the bag-of-word models with context size 1 are

most biased towards semantic similarity. The syntactic model scored best in our first

round of experiments, while the results of the bag-of-word models with context size 1

were either not statistically different from or better than those of models with larger context

windows. When it came to the discovery of semantic associations, however, context

size 1 proved the least advisable choice, and the syntactic model was outperformed by

all first-order bag-of-word models with an intermediate or large context window. Second-order

bag-of-word models scored below average in both experiments. They probably only

show their power when data sparseness is an issue, as with Word Sense Discrimination

(Schütze, 1998) or with corpora smaller than ours.

5 Conclusions and future research

In this paper, we investigated the influence of the context definition on the ability of

several Word Space Models to capture two kinds of semantic information — semantic




[Two bar charts over the models syn, c1, c3, c5, c10, c20, cc1, cc3, cc5, cc10, cc20 (x-axis: word space models); the top panel ranges 0–400, the bottom panel 0.00–0.12.]

Figure 4: Frequency of associations among single nearest neighbours (top) and maximal

F-scores for all models (bottom).

similarity and semantic relatedness. We studied a total of eleven Word Space Models:

one syntactic approach and ten bag-of-word models with context sizes 1, 3, 5, 10 and

20, first-order as well as second-order. Both for semantic similarity and semantic relatedness,

first-order models clearly beat their second-order competitors. However, while

syntactic models gave the best results for semantic similarity, first-order bag-of-word approaches

with intermediate to large context windows fared better in the retrieval of associated words.


In the short term, we aim to extend the repository of Word Space Models that we are

investigating — document-based models and second-order syntactic models are particularly

high on our list. In the longer term, we will try and determine if the differences we

observed in the modelling of semantic relations between word types also play a role in

Word Sense Discrimination. In this task, all contexts of a word are clustered in order to

automatically find the multiple senses of that word. Given the results here, we suspect that

different kinds of polysemy or homonymy may not demand the same context definitions.


Aitchison, J. (2003). Words in the Mind. An Introduction to the Mental Lexicon, Oxford:


Cruse, D. A. (1986). Lexical Semantics, London: Cambridge University Press.



De Deyne, S. and Storms, G. (in press). Word associations: Norms for 1,424 Dutch words

in a continuous task, Behavior Research Methods.

Harris, Z. (1954). Distributional structure, Word 10(23): 146–162.

Landauer, T. K. and Dumais, S. T. (1997). A solution to Plato’s problem: The Latent

Semantic Analysis theory of the acquisition, induction, and representation of knowledge,

Psychological Review 104: 211–240.

Levy, J. P. and Bullinaria, J. A. (2001). Learning lexical properties from word usage

patterns: Which context words should be used, in R. French and J. Sougne (eds),

Connectionist Models of Learning, Development and Evolution: Proceedings of the

6th Neural Computation and Psychology Workshop, London: Springer, pp. 273–


Lin, D. (1998). Automatic retrieval and clustering of similar words, Proceedings of

COLING-ACL98, Montreal, Canada, pp. 768–774.

Michelbacher, L., Evert, S. and Schütze, H. (2007). Asymmetric association measures,

Proceedings of the International Conference on Recent Advances in Natural Language

Processing (RANLP-07), Borovets, Bulgaria.

Padó, S. and Lapata, M. (2007). Dependency-based construction of semantic space models,

Computational Linguistics 33(2): 161–199.

Peirsman, Y., Heylen, K. and Speelman, D. (2007). Finding semantically related words in

Dutch: Co-occurrences versus syntactic contexts, Proceedings of the CoSMO Workshop,

Roskilde, Denmark, pp. 9–16.

Sahlgren, M. (2006). The Word-Space Model. Using Distributional Analysis to Represent

Syntagmatic and Paradigmatic Relations Between Words in High-dimensional

Vector Spaces, PhD thesis, Stockholm University.

Schulte im Walde, S. and Melinger, A. (2005). Identifying Semantic Relations and Functional

Properties of Human Verb Associations, Proceedings of the joint Conference

on Human Language Technology and Empirical Methods in Natural Language Processing,

Vancouver, Canada, pp. 612–619.

Schütze, H. (1998). Automatic word sense discrimination, Computational Linguistics

24(1): 97–124.

Vossen, P. (ed.) (1998). EuroWordNet: a Multilingual Database with Lexical Semantic

Networks for European Languages, Dordrecht: Kluwer.

Wu, Z. and Palmer, M. (1994). Verb semantics and lexical selection, Proceedings of the

32nd Annual Meeting of the Association for Computational Linguistics (ACL-94),

Las Cruces, NM, pp. 133–138.




Maren Schierloh

Michigan State University

Abstract. Following Izumi and Bigelow’s research (Izumi and Bigelow, 2000), this study

re-investigates the noticing function of output; that is, whether producing the target language

focuses learners’ attention on second language (L2) structures in subsequent input. Izumi

and Bigelow found no effects of output on either noticing or acquisition. They attributed

their findings to limitations in operationalizing noticing via underlining, coupled with the

relative difficulty of the target-structure (past-hypothetical-conditional). Under the premise

that the learner’s developmental level and attentional resources may constrain noticing, this

partial replication addresses whether a less difficult structure may yield greater noticing and,

consequently, greater L2 gains. Fifteen intermediate ESL learners were randomly assigned

to two experimental groups (EGs) and one control group (CG). The first EG was given opportunities

for output that elicited the past hypothetical conditional (more difficult structure),

while the second EG had opportunities to produce the present hypothetical conditional (less

difficult structure). The CG was not prompted to produce output that required use of either

structure. All groups engaged in follow-up reading and underlining activities. The reading

texts modeled target-like use of the relevant structure for both EGs. Methodological

triangulation measured noticing through underlining of the target-structure and stimulated

recall to elicit data about cognitive processes involved. Additionally, noticing and L2 gains

were assessed based on participants’ performance on subsequent essay-writing activities and

posttests. Quantitative raw data revealed no effect of output (EGs vs. CG) or difficulty-level

(EG1 vs. EG2) on the underlining of target forms in subsequent texts. Qualitative stimulated

recall data, however, showed that output influences subsequent noticing of certain input

elements; e.g. ’This is a good word for my essay’. Overall findings suggest that output

can trigger noticing of vocabulary and further illustrate how methodological triangulation

can enhance insights into learners’ L2 processes. Thus, this study has ramifications for both

classroom practices and research methodology.

1 Introduction

In the past decade of second language acquisition (SLA) research, the notion that noticing

is essential for the acquisition of new linguistic systems has been a matter of debate

(Jourdenais, 2001; Leow, 2002; Robinson, 2001; Schmidt, 2001; Simard and Wong, 2001;

Tomlin and Villa, 1994; Truscott, 1998). Much of the argumentation is grounded in the

difficulty of operationalizing and measuring the second language (L2) learner’s internal

cognitive processes. Research in SLA and cognitive science has raised questions as to

the type and amount of ’attention’ necessary for language learning, the specific aspects of

language that are more likely to be noticed, and the extent to which the developmental

level of the learner determines what is noticed.

Recently, researchers have turned their attention to the role output plays in noticing.

The oral or written production of language may consciously induce learners to realize

the gap between what they want to say and what they can say. This noticing of linguistic

limitations may prompt learners to seek solutions in subsequent input. A study by

Izumi and Bigelow (2000) centered on the noticing function of output. They investigated

whether L2 written output promotes noticing of form in subsequent text. They compared

an experimental group, which produced output, to a control group, which did not produce

any output but engaged in comprehension-based activities instead. The noticing of the



participants was operationalized through the participants’ underlining of the target structure

in written text. Both groups underlined the same amount and Izumi and Bigelow

concluded that output does not trigger noticing. Because Izumi and Bigelow’s inquiry

is of importance as it may inform L2 pedagogy, the present study partially replicates

their study by asking analogous research questions and by implementing a similar design.

Yet, to achieve a more valid measure of noticing, this study uses stimulated recall to tap

into learners’ cognitive processes. In addition to the stimulated recall data, this study

quantitatively and qualitatively analyzes the data from learners’ underlining and written

production to better examine a possible relationship between output, noticing and L2 development.

This research also addresses whether a cognitively less demanding structure

may have an effect on noticing by the learner. The following section provides a review

of the literature on noticing, followed by sections detailing the difficulties associated with

measuring noticing, the role output plays in noticing as well as the role of the learner

level. The third section details the research methodology, and the subsequent sections

provide a discussion of findings and limitations and a conclusion.

2 Review of the Literature

2.1 Noticing and SLA

Since Schmidt (1990) first proposed his well-known “noticing hypothesis”, a large body

of SLA and cognitive science research has focused on the role of noticing, or conscious

attention 1 , in promoting L2 development (Alanen, 1995; Leow, 2002; Rosa and O’Neill,

1999). The noticing hypothesis claims that noticing requires awareness and is a necessary

condition for second language acquisition. Yet, some research findings are not in line with

the premise that conscious attention is a necessary prerequisite for L2 acquisition (Gass,

Svetics and Lemelin, 2003; Robinson, 1995).

Truscott rejects the crucial role of noticing in the L2 learning process from a theoretical

perspective, maintaining that noticing only advances metalinguistic knowledge but not

competence. He further contends that “awareness is not only unnecessary but also unhelpful”

(Truscott, 1998, page 126). Such a narrow account of the role of noticing in SLA

is certainly challenged by substantial L2 research data supporting that noticing facilitates

L2 learning (Ellis, 1994; Long, 1996; Robinson, 1995; Swain and Lapkin, 1998).

2.1.1 Operationalizing and Measuring Noticing

At the heart of the ongoing debate on the role of noticing in SLA is the difficulty in

operationalizing it, which requires introspection and assessment of learner-internal cognitive

activities. For example, Schmidt (2001) operationalized noticing in terms of the

learners’ self-reporting either during or immediately after exposure to the input, yet, the

lack of self-reporting should not be interpreted as a lack of awareness, as some thinking

processes may be difficult to verbalize (Jourdenais, 2001; Schmidt, 2001). As such, the

challenge facing the measurement of noticing is to accurately link observable behaviors

by language learners to the construct of noticing. Methodologies used to qualitatively and

1 Due to terminological vagueness of ’noticing’ resulting from related terms such as ’attention’

(Leow, 2002) and ’awareness’ (Tomlin and Villa, 1994) in the noticing literature, Schmidt’s definition of

noticing as ’conscious attention’ has been adopted for the present study (Schmidt, 2001). Schmidt equates

consciousness with awareness and/or attention.



quantitatively account for a learner’s noticing of specific target language features fall

into two categories: online methodologies, which measure the language learner’s noticing during

performance of a certain language task, and offline methodologies, which employ post-treatment assessment

of noticing. Neither online nor offline methodologies enable an absolute account of the

learners’ attentional processes.

Online methodologies include, for example, think-aloud protocols which require the

participants to monitor and orally self-report their mental processes while they perform a

certain language task. Izumi and Bigelow used the online methodology of participants’

underlining of “the word, words, or parts of the words that are [felt to be] particularly

necessary for subsequent production” (Izumi and Bigelow, 2000, page 250). Izumi and

Bigelow characterize underlining as an authentic procedure readers naturally do during

a reading task, and argue that the marking of words would not occur without conscious

awareness of the importance of that particular word or phrase. In partially replicating

Izumi and Bigelow, the present study utilizes underlining as one integral attribute of the

triangulated measurement of noticing.

The advantage of online measures, as opposed to post-exposure measures, is their instantaneous

access to L2 processing, thus minimizing the risk of possible memory decay

by the L2 learner (Gass and Mackey, 2000). Yet, stimulated recall has evolved as a

sound and widely used offline method to obtain data of the language learner’s thought

processes. During stimulated recall, learners are prompted with a stimulus (e.g. learner’s

written products or a video displaying the learner while engaging in the language task),

and he/she is asked to report on thought processes while performing the language task.

Note, however, that the lack of evidence of noticing in online or offline protocol does not

necessarily imply absence of noticing.

2.1.2 Developmental Level as a Factor in Noticing

In addition to the concern over how noticing data should be collected and analyzed, current

SLA research has scrutinized connections between the difficulty level of the target

language input and the learner’s attentional resources (Ellis, 1994; Gass et al., 2003; Philp,

2003; VanPatten, 1996). Long (1996), for instance, found that the proficiency of the

learner may modulate noticing. Advanced learners may benefit from the increasing automaticity

which allows them to attend to more complex structures. A recent study by Philp

(2003) similarly revealed that the developmental level of the learners was one factor

determining accurate recall of the reformulation by the native speaker. Thus, developmental

readiness may constrain the learner’s attention to aspects of more difficult structures. In

a similar vein, Robinson (1995) argued that the extent to which a language learner may

notice a particular form of their linguistic limitations is dependent on the demands of the

pedagogical task.

In the research by Izumi and Bigelow (2000), the study to be partially replicated here,

the past hypothetical conditional was selected as the target structure, based on the rationale

that this structure poses some difficulty to the learner, which may trigger noticing of

linguistic limitations. Yet, given the constraints that learner level and attentional capacities

place on processing of the target structure, it is of present interest whether reduced cognitive

demands may yield greater noticing and, in turn, greater L2 gains.



2.2 The Noticing Function of Output

Underlying the relationship between noticing and SLA is the question of under what circumstances

L2 learners may notice linguistic forms. Is it through input or through output,

or both in combination? While the essential role of input for SLA is universally accepted,

the sufficiency of input for acquisition has been debated since Swain first proposed her

Output Hypothesis (Swain, 1985) in reaction to Krashen’s view of primacy of comprehensible

input (Krashen, 1982). While Swain does not negate the importance of input,

she argues that “L2 output pushes learners to process language more deeply (with more

mental effort) than does input” (Swain, 1995). A series of studies by Swain and Lapkin

revealed noticing as one of the main reasons why producing output mediates L2 development

(Swain and Lapkin, 1995; Swain and Lapkin, 1998). As such, their argument corresponds

to Schmidt’s Noticing Hypothesis. Because output focuses the learner’s attention

on the L2 structures they produce (their interlanguage), it enables them to compare their

interlanguage to the target language they receive, thereby attending to their linguistic limitations

(Gass and Varonis, 1994). If relevant input is immediately available afterwards,

the noticing of the gap may cause the learner to process the subsequent input with more

focused attention. This hypothesis has been approached by Izumi and Bigelow (2000),

which constitutes the basis of the research reported here.

2.3 Izumi and Bigelow 2000

Izumi and Bigelow addressed the issue of output and noticing in their study guided by

two questions: (1) “Do output activities promote the noticing of linguistic form in subsequent

input?” and (2) “Do these output-input-activities result in improved production of

the target form?” (Izumi and Bigelow, 2000, page 247). They compared an EG, which

was engaged in output tasks (essay writing and text reconstruction) to a CG, which did

not produce any written output. Both groups received the same textual input for the subsequent

reading and underlining activity; however, the groups were given different purposes

for underlining which may have influenced participants’ attentional focus. In the

present study, all participants received the same instructions for the reading and underlining

activity. In Izumi and Bigelow (2000), noticing of the target form (past hypothetical

conditional in English 2 ) was assessed through underlining and through the demonstration

of uptake (correct use of the target form by the learner) as a complementary measure of

noticing and acquisition of that form. The study presented here did not treat uptake as a

distinct measurement of noticing or acquisition, but qualitatively examined to what extent

learner’s uptake corresponds to prior noticing action. Izumi and Bigelow attributed their

non-significant findings to the relative difficulty of the target structure. Thus, this study

investigates learners’ noticing when engaging with a less complex yet similar structure:

the present hypothetical conditional 3 .

Except for one statistically significant increase of performance from the pretest to the

second posttest of the experimental group, Izumi and Bigelow evidenced no statistically

significant between-group differences on any measure. Both groups underlined nearly the

same percentage of conditional-related forms. They concluded that output did not draw

the learner’s attention to the targeted form, and insignificant results were attributed to

2 e.g. If Lisa had traveled to Spain, she would have seen the Olympic Games.

3 e.g. If Lisa traveled to Spain, she would see the Olympic Games.



effects of input flood and individual variation. I argue that underlining as a single measure

gives an insufficient account of learners’ cognitive processes, and I hypothesize that the

output treatment could have been observed to trigger noticing if additional qualitative and

quantitative measures had been employed. Therefore, the present study follows Izumi and

Bigelow’s suggestion to implement “methodological triangulation as the research design

allows” (Izumi and Bigelow, 2000, page 271) by operationalizing noticing through target-structure

underlining and reporting of conscious attention during the stimulated recall

session. In other words, through triangulated data collection, noticing is investigated

from multiple perspectives.

3 Research Questions and Hypotheses

In order to validly replicate Izumi and Bigelow’s study (Izumi and Bigelow, 2000), similar

research questions are pursued along with their congruent hypotheses:

RQ1: Do output activities promote noticing of linguistic form in subsequent input?

RQ2: Do these output-input activities result in improved production of the target form?


It is hypothesized that the experimental groups, which are required to produce output,

would show greater noticing of the target-structure contained in the input than the control

group, which does not produce output requiring the use of the target-structure. Furthermore,

on the posttests, the experimental groups are expected to demonstrate greater gains

in accuracy of their use of the target form than the control group. Given that prior research

found the language learner’s developmental level to be associated with attentional

resources available for the target-structure, it is hypothesized that a less difficult target-structure

promotes greater noticing and greater L2 gains. Thus, the present study is further

guided by the following research question:

RQ3: Does the present hypothetical conditional, as a less difficult structure, promote

greater noticing compared to the past-hypothetical-conditional structure?

4 Methodology

4.1 Participants

Fifteen intermediate ESL learners enrolled in the second semester ESL academic writing

class at Michigan State University have participated up to this point. Students’ enrollment

in the ESL academic writing class is determined by a placement test or by passing the

previous course. The ESL learners were from a variety of L1 backgrounds including

Cantonese, Japanese, Korean and Arabic with an average of 8.7 years of previous English

study 4 . Three students have lived in the United States for more than two years, and the

remaining have resided there for at least a year. Upon completion of the questionnaire,

participants were randomly assigned to one of the two experimental groups (EGs) or to

the single control group (CG).

4 It needs to be noted that the different native languages of the learners affect their proximity to (distance

from) English, which could make some structures easier (more difficult) to process for some learners than

for others. The native language of the participant was not systematically investigated here.


4.2 Procedures


The experiment followed a pretest-posttest design. The researcher met one-on-one with

each participant for about 1 or 1.5 hours depending on whether the participants chose

to take part in the stimulated recall session or not. The participants were informed of

the sequence of the activities before they completed the pretest (see Appendix A for an

example) and the reading and writing activities. Participants assigned to the first experimental

group (EG1) composed an essay that elicited the past hypothetical conditional

(Appendix B), whereas participants assigned to the second experimental group (EG2)

composed an essay that elicited the present hypothetical conditional. Participants in the

control group (CG) engaged in a writing task that did not require the use of either structure.

Each participant subsequently received input that modeled the correct use of the

relevant target structures (Appendix C); yet, for the CG, the reading text did not serve

as a model. All groups were instructed to either “underline what [they] feel is important

for re-writing the essay” or “underline what [they] feel is important for writing an essay

about this topic”. By leaving the words to be underlined unspecified, the learner’s attentional

foci were not predisposed. Before the participants carried out the actual task, the

grammar-focused underlining was demonstrated to the students using a passage that did

not contain the target-structure 5 . Following the reading and underlining activity, all participants

in the EGs reproduced their initial essay, whereas the CG group wrote about

the EGs’ initial essay topic for the first time. The immediate posttest was administered

upon completion of the second essay writing activity or after the stimulated recall session

depending on whether or not participants took part in the stimulated recall interview. The

delayed posttest was given after one week had passed 6 . Four participants of each EG and

three participants of the CG volunteered to be videotaped during the reading activity.

To better track their focus during the reading and underlining task, the videotaped participants

were asked to read aloud. Immediately following completion of the second essay,

the videotape was rewound and played to the learner. While the learner watched the videotape, the

researcher stopped the tape after episodes that appeared to involve noticing of linguistic

features (i.e. underlining or hesitation), asking the student to describe his/her thoughts

during that time. English was used during all interactions between the participants and

the researchers, which were audio recorded for transcription purposes.

5 Results and Discussion

The first research question asked whether output activities promote noticing of grammatical

features in subsequent input. In a restricted way, the hypothesis predicting greater

noticing of the target forms for the EGs than the CG was not confirmed (p > 0.5) 7 . All

participants underlined vocabulary items rather than the grammatical cues in the reading

text. However, the present study does show that output had an effect on learners’

attentional foci and input processing. While no participant appeared to notice the target

form, most participants’ attention was drawn to the vocabulary in order to process the

main message of the input passages. The predicted effect of output in promoting noticing

5 Modeling familiarizes the learners with the underlining procedure and increases precision of the measure.

6 Three students did not show up for the delayed posttest.

7 Wilcoxon signed-rank tests were used for within-group comparisons.



of the correct use of conditional sentences was not supported in this study. Similarly,

the output-input-output treatment did not alter the students’ level of performance on the immediate

and delayed posttests when compared to the input-output treatment. However,

output appeared to trigger noticing of vocabulary, style, and some content issues. This

finding will be discussed in more detail below.

The second research question addressed the acquisition issue and inquired whether

output-input activities result in improved production of the target form. The present

study did not yield clear results in support of such a relationship, mainly because the

noticing scores could not be sufficiently squared with posttest scores as there was a lack

of grammar-related noticing with all candidates during the treatment phase. Put another

way, the posttests do not provide a measure of the effect of noticing. Future research will

need to use correlation analyses in order to square the underlining, stimulated recall, and

second essay scores (as a measure of noticing) with gain on individualized vocabulary

posttests. While data from the underlining, second essay, and stimulated recall point to a

link between noticing and the subsequent use of noticed items, it would be premature

to claim a causal relationship between noticing and acquisition.
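The within- and between-group comparisons in this section rely on non-parametric rank statistics (Wilcoxon signed-rank within groups, Mann-Whitney U between groups, as noted in the footnotes). As a minimal, illustrative sketch with invented scores (not the study's data), the Mann-Whitney U statistic counts, over all cross-group pairs, how often one group's score outranks the other's:

```python
def mann_whitney_u(xs, ys):
    """U statistic for comparing two independent groups (e.g. EG vs. CG):
    the number of pairs (x, y) with x > y, counting ties as half."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

# Hypothetical posttest scores; under no group difference U would be close
# to len(xs) * len(ys) / 2 = 12.5.
eg = [4, 5, 6, 7, 3]
cg = [3, 4, 5, 6, 4]
print(mann_whitney_u(eg, cg))  # 15.5
```

Significance would then be read off standard critical-value tables for U; in practice one would typically use a statistics package (e.g. SciPy's `mannwhitneyu` and `wilcoxon`), which also returns the p-value.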

Under the premise that attentional resources constrain noticing, the third research question

asked whether a less difficult structure promotes greater noticing than a more difficult

structure. The results of this study suggest that the less difficult structure had no effect

on noticing or L2 learning 8 . There was no notable difference between EG1 and EG2 performance

on any measure. Of course, any interpretation of the test-, noticing-, or essay

scores would be unconvincing, given that only three candidates could be compared to

another set of three candidates. The small number of participants notwithstanding, one

possible explanation for this finding might be that the less difficult structure was not significantly

easier. The production and processing of the present-hypothetical-conditional

may have been just as cognitively demanding as the past-hypothetical-conditional. Thus,

the results of this research cannot validate (nor invalidate) the claim that the demands

of the targeted grammatical structure or the complexity of the pedagogical task have no

effect on learners’ attentional resources. In order to tap into a possible relationship between

cognitive demands and noticing, the fact that one task is indeed cognitively more

demanding must first be established. Before this research project is further pursued, the

relative difficulty of both structures needs to be evaluated with a larger number of ESL learners.


Although the research hypotheses of this study have not been supported, this research

demonstrates that noticing has occurred. These insights contrast with Izumi and Bigelow’s

conclusion that output does not trigger learners’ noticing (Izumi and Bigelow, 2000). The

present study demonstrates that output treatment influences learners’ subsequent cognitive

processes (e.g., “that is good way to say. I also wanna say something like that, but my

essay is not so good, so I try to remember”). As such, output focused the learner’s attention

on specific linguistic features in the output, and those noticed features were then compared

to the features the learner had produced in their first writing activity. Yet, the data from

this study leaves us to wonder whether (and to what extent) the noticed features were

incorporated into the interlanguage system. Chaudron (1985) argued that L2 learning involves

two stages: first, the perception of input (noticing), and second, the integration of

intake into the learner’s interlanguage system.

8 Mann-Whitney U tests were used for between-group comparisons.

Proceedings of the 13th ESSLLI Student Session

Gass and Varonis (1994) similarly suggested that learners need to apperceive input before it can become intake. According to

Ellis “Intake occurs when learners take features into their short- or medium-term memories,

whereas interlanguage change occurs only when they become part of long-term

memory” (Ellis, 1997, page 119). Accordingly, the learner has to convert from preliminary

to final intake. It would be interesting to investigate whether the learners in this

study process the new linguistic items (e.g., vocabulary) beyond noticing and immediate

intake, in order to contribute to theory building on input, intake, and L2 acquisition. A

possible way of approaching this would be to include a delayed essay production task to

see whether, and to what extent, the learners arrived at the “final intake stage”.

The overall findings of the present study indicate that learners processed the input

primarily for meaning. Although no form-focused comparisons were invoked, EG candidates

noticed a difference between their word choice and style and those of the native

speaker. These findings are in line with VanPatten (1996), who proposed the following input processing principles:


1. The Primacy of Meaning Principle: Learners process input for meaning before they

process it for form.

2. The Primacy of Content Words Principle: Learners process content words in the

input before anything else.

3. The Lexical Preference Principle: Learners will tend to rely on lexical items as

opposed to grammatical form to get meaning when both encode the same semantic

information 9 .

Applying VanPatten’s principles to the present study, it might be that all participants

processed meaningful elements in the input while reading the input text. This may explain

why they did not underline grammatical elements such as modals like would and could,

auxiliaries, and past participles. Because learners were not capable of attending to vocabulary

and grammar simultaneously, the past/present hypothetical conditional may have been processed

only peripherally.

Based on the input processing principles, VanPatten (1996) investigated the effects

of processing instruction, revealing that learners’ focal attention during processing can

be directed toward the relevant grammatical items and, in turn, enhance L2 learning.

Follow-up research should investigate whether input enhancement or specific instructions

to underline grammatical structures (e.g., the past/present hypothetical conditional) would

enhance noticing, intake, and L2 acquisition. The present study deliberately provided no specific

instructions for the underlining beyond underlining what is important for subsequent

production: the study’s objective was to see whether output which requires

use of a particular structure results in underlining of that particular structure in the subsequent

input passage. If the learners had been told to underline grammatical structures, their

attentional foci would have been predisposed, as was the case in Izumi and Bigelow (2000).


The findings in the present study also raise important methodological issues that should

be addressed in future studies that investigate the role of noticing in SLA. First and foremost,

this study has shown that triangulated or multiple data-elicitation measures can

provide a much richer picture of learners’ internal processes.

9 Only the relevant subset of the entire set of input processing principles is presented here.

In this study, the underlining, the essays, the test scores, and the verbal reports from the stimulated recall

session all helped to puzzle out the role of output and noticing in second language

acquisition. Although verbal stimulated recall reports cannot provide a complete reflection

of actual internal processing, they provided useful information as to how learners’

minds process language information when the learners articulated their concerns (e.g., “I

wanna say something like that, but my essay is not so good, so I try to remember”) or

when they made comparisons to their first essay (e.g., “This is a big word I want to remember”).

Learners underlined the words that captured the author’s key message, and

their comments reflected their intent in seeking meaning and better vocabulary for use in

their second essay. The stimulated recall protocols obtained in this study collectively

demonstrate that the learners did not attend to grammatical features. Additionally, the

data from the first and second essays illustrate that students improved their expression

and word choice, but not their grammatical accuracy. Izumi and Bigelow were unable to

draw such conclusions as they limited their measurement of noticing and their measurement

of acquisition to the underlining of conditional-related items and posttest scores,

respectively. Izumi and Bigelow found that output does not prompt the learners to “notice

the gap”. The present study, by contrast, reveals that some learners were aware that they

could not express themselves as fully as they wished (e.g., “I want to say negotiate in

my essay, but I don’t remember it”). They noticed their restricted lexicon and searched

for more appropriate words in the input passage. In other words, students recognized lexical

gaps, which directed their attention to vocabulary in subsequent input.

6 Limitations and Future Research

Although the present study sheds some light on meaning-focused processing and noticing

as well as methodological issues, there are some limitations that need to be acknowledged.

First and foremost, the small number of participants clearly limits the generalization of

findings to a broader variety of L2 learners 10 . Extending this research to a minimum

of twenty-one participants will reveal whether the current trends hold true. Further

study may include asking non-stimulated-recall participants, in a short questionnaire, what they have noticed

and what they assume the purpose of the reading and writing tasks to be.


The testing instruments employed in this study are limited in length and scope, which

may have impacted the measurement of L2 attainment. Whereas a more comprehensive

test of the past-hypothetical-conditional may yield more valid results, it may also prompt

participants to pay closer attention to the form in the input passage. Consequently, a tenable

comparison between output and no-output treatment would be difficult, as all groups

would produce the target form to the same extent. As mentioned earlier, in order to better

understand the relationship between attention and learning, future research may develop

tests that examine students’ acquisition of noticed vocabulary items. For such measurement,

individualized delayed posttests in which the noticed (underlined and commented)

items are assessed in terms of adequate usage and comprehension would be appropriate.

10 The fact that the participants were willing to take part in the study outside of class time may have led

to a participant body that is more motivated and eager to improve than the average intermediate ESL learner.


7 Conclusions


The purpose of this study was to investigate the effects of output and cognitive demands

on noticing and second language acquisition, and it offers two main contributions: First,

this study has demonstrated how multiple perspectives can help to obtain insights into

learners’ cognitive processes. Secondly, the results of this study support the noticing

function of output to some extent. Output-input treatment has been shown to trigger comparison

of the learner’s interlanguage lexicon with language produced by a native speaker.

Furthermore, this study demonstrates that learners primarily attend to meaning, which

is in line with VanPatten’s input processing principles (VanPatten, 1996). However, the

overall results do not allow for clear conclusions. Much more research is needed to find

the extent to which learners notice specific features in the input as well as to explore the

very mechanisms of noticing. Until then, what takes place in the

learner’s head remains largely opaque.


References

Alanen, R. (1995). Input enhancement and rule representation in second language acquisition,

in R. Schmidt (ed.), Attention and Awareness in Foreign Language Learning,

University of Hawai’i Press, Honolulu.

Chaudron, C. (1985). Intake: On models and methods for discovering learners’ processing

of input, Studies in Second Language Acquisition 7(1): 1–14.

Ellis, R. (1994). Factors in the incidental acquisition of second language vocabulary from

oral input: A review essay, Applied Language Learning 5(1): 1–32.

Ellis, R. (1997). SLA Research and Language Teaching, Oxford University Press, Oxford.

Gass, S. and Mackey, A. (2000). Stimulated recall methodology in second language

research, Lawrence Erlbaum Associates, London.

Gass, S., Svetics, I. and Lemelin, S. (2003). Differential effects of attention, Language

Learning 53(3): 497–545.

Gass, S. and Varonis, E. M. (1994). Input, interaction and second language production,

Studies in Second Language Acquisition 16(3): 283–302.

Izumi, S. and Bigelow, M. (2000). Does output promote noticing and second language

acquisition?, TESOL Quarterly 34(2): 239–287.

Jourdenais, R. (2001). Cognition, instruction and protocol analysis, in P. Robinson (ed.),

Cognition and Second Language Instruction, Cambridge University Press, New York.


Krashen, S. (1982). Principles and Practice in Second Language Acquisition, Pergamon, Oxford.


Leow, R. P. (2002). Models, attention, and awareness in SLA, Studies in Second Language

Acquisition 24(1): 113–119.



Long, M. (1996). The role of linguistic environment in second language acquisition,

in W. C. Ritchie and T. K. Bhatia (eds), The Handbook of Language Acquisition,

Academic Press, San Diego.

Philp, J. (2003). Constraints on noticing the gap, Studies in Second Language Acquisition

25(1): 99–126.

Robinson, P. (1995). Attention, memory, and the noticing hypothesis, Language Learning

45(2): 283–331.

Robinson, P. (2001). Individual differences, cognitive abilities, aptitude complexes and

learning conditions in second language acquisition, Second Language Research

17(4): 368–392.

Rosa, E. and O’Neill, M. D. (1999). Explicitness, intake and the issue of awareness,

Studies in Second Language Acquisition 21(4): 511–556.

Schmidt, R. (1990). The role of consciousness in second language learning, Applied

Linguistics 11(2): 129–158.

Schmidt, R. (2001). Attention, in P. Robinson (ed.), Cognition and Second Language

Instruction, Cambridge University Press, New York.

Simard, D. and Wong, W. (2001). Alertness, orientation and detection, Studies in Second

Language Acquisition 23(1): 103–124.

Swain, M. (1985). Communicative competence: Some roles of comprehensible input and

comprehensible output in its development, in S. Gass and C. Madden (eds), Input in

Second Language Acquisition, Heinle & Heinle, Boston.

Swain, M. (1995). Three functions of output in second language learning, in G. Cook and

B. Seidlhofer (eds), Principles and practice in applied linguistics: Studies in honor

of H. Widdowson, Oxford University Press, Oxford.

Swain, M. and Lapkin, S. (1995). Problems in output and the cognitive processes they

generate: A step towards second language learning, Applied Linguistics 16(3): 371–391.


Swain, M. and Lapkin, S. (1998). Interaction and second language learning: Two adolescent

French immersion students working together, Modern Language Journal

82(3): 320–337.

Tomlin, R. and Villa, V. (1994). Attention in cognitive science and second language

acquisition, Studies in Second Language Acquisition 16(2): 183–204.

Truscott, J. (1998). Noticing in second language acquisition: A critical review, Second

Language Research 14(2): 103–135.

VanPatten, B. (1996). Input processing and grammar instruction in second language

acquisition, Ablex, Westport.





Andreas Schnabl

University of Innsbruck

Abstract. This paper describes cdiprover3, a tool for proving termination of term rewrite

systems by polynomial interpretations and context dependent interpretations. The methods

used by cdiprover3 induce small bounds on the derivational complexity of the considered

system. We explain the tool in detail, and give an overview of the employed proof methods.

1 Introduction

Term rewriting is a Turing-complete model of computation, which is conceptually closely

related to declarative and (first-order) functional programming. One of its most studied

properties, termination, is also a central problem in computer science. This property is

undecidable in general, but many partial decision methods have been developed in the

last decades. Beyond showing termination of a given rewriting system, some of these

methods can also give bounds on different measures of its complexity. As suggested in

(Hofbauer and Lautemann, 1989), a natural way of measuring the complexity of a term

rewrite system is to analyze its derivational complexity. The derivational complexity is

the function which maps a term size to the maximal number of rewrite steps that

can be executed starting from any term of that size in the given rewrite system. We

are particularly interested in small, i.e. polynomial upper bounds on this function. In

contrast to our approach of measuring derivational complexity, the constructor discipline

is mentioned in (Lescanne, 1995). In that setting, one looks at the complexity of the function

that is encoded by a constructor system. It is either measured by the number of rewrite

steps needed to bring the term into normal form (Bonfante, Cichon, Marion and Touzet,

n.d.; Avanzini and Moser, 2008), or by counting the number of steps needed by some

evaluation mechanism different from standard term rewriting (Marion, 2003; Bonfante,

Marion and Péchoux, 2007).

In this paper, we describe cdiprover3, a tool which uses polynomial and context-dependent

interpretations in order to prove termination and complexity bounds of term

rewrite systems. The tool, its predecessors, and full experimental data are available at

http://cl-informatik.uibk.ac.at/~aschnabl/experiments/cdi/.

Polynomial interpretations, introduced in (Lankford, 1979), are a standard direct termination

proof method. Besides showing termination of rewrite systems, they also provide

an easy way to extract upper bounds on the derivational complexity (Hofbauer and

Lautemann, 1989). However, as noticed in (Hofbauer, 2001), this often heavily overestimates

the derivational complexity. Context dependent interpretations, also introduced in

(Hofbauer, 2001), are an effort to improve these upper bounds.



The remainder of this paper is organised as follows: Section 2 outlines the basics of

term rewriting needed to state all relevant results. In Section 3, we briefly describe polynomial

and context dependent interpretations, which are used by cdiprover3. Section

4 describes the implementation of cdiprover3, and mentions some experimental results.

In Section 5, we explain the input and output of cdiprover3 in detail. Last, in

Section 6, we state conclusions and potential future work.

2 Term Rewriting

In this section, we review some basics of term rewriting. We only cover the concepts

which are relevant to this paper. A general introduction to term rewriting can be found in

(Baader and Nipkow, 1998; TeReSe, 2003), for instance.

A term rewrite system (TRS) R consists of a signature F, a countably infinite set of

variables V disjoint from F, and a finite set of rewrite rules l → r, where l and r are terms

such that l /∈ V and all variables which occur in r also occur in l. The signature F defines

a set of function symbols, and assigns to each function symbol f its arity. We assume that

every signature contains at least one function symbol of arity 0. The set of terms built

from F and V is denoted by T (F, V). The set of terms T (F) without any variables is

called the set of ground terms over F. A function symbol is defined if it occurs at the

root of a left hand side of a rewrite rule. All non-defined function symbols are called

constructors. A constructor-based term is a term containing exactly one defined function

symbol, which appears at the root of that term. We call the total number of function

symbol and variable occurrences in a term t its size, denoted by |t|. A substitution is a

mapping σ : Dom(σ) → T (F, V), where Dom(σ) is a finite subset of V. The result of

replacing all occurrences of variables x ∈ Dom(σ) in a term t by σ(x) is denoted by tσ.

A context is a term C[□] containing a single occurrence of a fresh function symbol □ of

arity 0. If we replace □ with a term t, we denote the resulting term by C[t]. Given a TRS

R and two terms s, t, we say that s rewrites to t (s → R t) if there exist a context C, a

substitution σ and a rewrite rule l → r in R such that s = C[lσ] and t = C[rσ]. The

transitive closure of this relation is → + R . The reflexive and transitive closure is →∗ R . We

write → n R to express n-fold composition of → R. A TRS R is terminating if there exists

no infinite chain of terms t 0 , t 1 , ... such that t i → R t i+1 for each i ∈ N. For a terminating

TRS R, the derivation length of a ground term t is defined as dl R (t) = max{n | ∃s :

t → n R s}. The derivational complexity is the function dc R : N → N which maps n to

max{dl R (t) | |t| = n}.
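The notions above can be made concrete in a few lines of code. The following Python sketch (purely illustrative, not part of cdiprover3; terms are encoded as nested tuples such as ("+", ("s", "x"), "y"), variables as strings) computes all one-step successors of a term and, for a terminating TRS, its derivation length by exhaustive search:

```python
def match(pat, t, s):
    """Try to match pattern pat against term t, extending substitution s."""
    if isinstance(pat, str):                       # pat is a variable
        if pat in s:
            return s if s[pat] == t else None
        return {**s, pat: t}
    if not (isinstance(t, tuple) and t[0] == pat[0] and len(t) == len(pat)):
        return None
    for p_i, t_i in zip(pat[1:], t[1:]):
        s = match(p_i, t_i, s)
        if s is None:
            return None
    return s

def subst(s, t):
    """Apply substitution s to term t."""
    if isinstance(t, str):
        return s.get(t, t)
    return (t[0],) + tuple(subst(s, a) for a in t[1:])

def step(rules, t):
    """All terms reachable from t in one rewrite step, at any position."""
    succs = [subst(s, r) for l, r in rules
             for s in [match(l, t, {})] if s is not None]
    if isinstance(t, tuple):
        for i in range(1, len(t)):
            succs += [t[:i] + (u,) + t[i + 1:] for u in step(rules, t[i])]
    return succs

def dl(rules, t):
    """Derivation length dl_R(t) of a terminating TRS, by exhaustive search."""
    return max((1 + dl(rules, u) for u in step(rules, t)), default=0)
```

For instance, with the rules +(0, y) → y and +(s(x), y) → s(+(x, y)), the ground term +(s(s(0)), 0) has derivation length 3.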

3 Used Termination Proof Methods

3.1 Polynomial Interpretations

An F-algebra A for some signature F consists of a carrier A and interpretation functions

{f A : A n → A | f ∈ F, n = arity(f)}. Given an assignment α : V → A, we denote the

evaluation of a term t into A by [α] A (t). It is defined inductively as follows:

[α] A (x) = α(x) for x ∈ V

[α] A (f(t 1 , ..., t n )) = f A ([α] A (t 1 ), ..., [α] A (t n )) for f ∈ F



A well-founded monotone F-algebra is a pair (A, >) where A is an F-algebra and > is

a well-founded proper order such that for every function symbol f ∈ F, f A is monotone

with respect to >. It is compatible with a TRS R if for every rewrite rule l → r in R

and every assignment α, [α] A (l) > [α] A (r) holds. It is a well-known fact that a TRS R

is terminating if and only if there exists a well-founded monotone algebra that is compatible

with R. A polynomial interpretation (Lankford, 1979) is an interpretation into a

well-founded monotone algebra (A, >) such that A ⊆ N, > is the standard order on the

natural numbers, and f A is a polynomial for every function symbol f. If a polynomial

interpretation is compatible with a TRS R, then we clearly have dl R (t) ≤ [α] A (t) for all

terms t.

Example 1. Consider the TRS R with the following rewrite rules over the signature containing

the function symbols 0 (arity 0), s (arity 1), + and - (arity 2). The system is

example SK90/2.11.trs in the termination problems database 1 (TPDB), which is the

standard benchmark for termination provers:

+(0, y) → y

+(s(x), y) → s(+(x, y))

-(0, y) → 0

-(x, 0) → x

-(s(x), s(y)) → -(x, y)

The following interpretation functions build a compatible polynomial interpretation A

over the carrier N:

+ A (x, y) = 2x + y

- A (x, y) = 3x + 3y

s A (x) = x + 2

0 A = 1
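As a sanity check (not a proof; the actual compatibility argument compares coefficients symbolically), one can evaluate both sides of each rule under this interpretation on a grid of assignments. A hypothetical Python sketch, with terms as nested tuples and variables as strings:

```python
# Interpretation functions from Example 1, over the carrier N.
I = {"+": lambda x, y: 2 * x + y,
     "-": lambda x, y: 3 * x + 3 * y,
     "s": lambda x: x + 2,
     "0": lambda: 1}

def value(t, env):
    """Evaluate term t under interpretation I and assignment env."""
    if isinstance(t, str):                 # variable
        return env[t]
    return I[t[0]](*(value(a, env) for a in t[1:]))

RULES = [
    (("+", ("0",), "y"), "y"),
    (("+", ("s", "x"), "y"), ("s", ("+", "x", "y"))),
    (("-", ("0",), "y"), ("0",)),
    (("-", "x", ("0",)), "x"),
    (("-", ("s", "x"), ("s", "y")), ("-", "x", "y")),
]

def looks_compatible(rules, samples=range(20)):
    """Spot-check [l] > [r] for sampled natural-number assignments."""
    return all(value(l, {"x": x, "y": y}) > value(r, {"x": x, "y": y})
               for l, r in rules for x in samples for y in samples)
```

The check succeeds for all five rules, and e.g. the ground term +(s(0), 0) evaluates to 7, which bounds its derivation length from above.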

A strongly linear interpretation is a polynomial interpretation such that every interpretation

function f A has the form f A (x 1 , ..., x n ) = ∑_{i=1}^{n} x i + c with c ∈ N. A surprisingly

simple property is that compatibility with a strongly linear interpretation induces a linear

upper bound on the derivational complexity (Schnabl, 2007).

A linear polynomial interpretation is a polynomial interpretation where each interpretation

function f A has the shape f A (x 1 , ..., x n ) = ∑_{i=1}^{n} a i x i + c with a i , c ∈ N.

For instance, the interpretation given in Example 1 is a linear polynomial interpretation.

Because of their simplicity, this class of polynomial interpretations is the one most commonly

used in automatic termination provers. As illustrated by Example 2 below, if only

a single one of the coefficients a i in any of the functions f A is greater than 1, there might

already exist derivations whose length is exponential in the size of the starting term.

Example 2. Consider the TRS S with the following single rule over the signature containing

the function symbols a, b (arity 1), and c (arity 0). The system is example

SK90/2.50.trs in the TPDB:

a(b(x)) → b(b(a(x)))

The following interpretation functions build a compatible linear polynomial interpretation

A over N:

a A (x) = 3x    b A (x) = x + 1    c A = 0

If we start a rewrite sequence from the term a^n(b(c)), we reach the normal form b^(2^n)(a^n(c))

after 2^n − 1 rewriting steps. Therefore, the derivational complexity of S is at least exponential.

1 http://www.lri.fr/~marche/tpdb/.
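Since a and b are unary and c is a constant, the derivation in Example 2 can be replayed as string rewriting: the rule becomes ab → bba on words over {a, b}. A small illustrative Python check (each step lengthens the word by exactly one symbol, so the step count is independent of the rewriting strategy):

```python
def normalize(word):
    """Apply ab -> bba (leftmost redex first) until no redex remains;
    return the normal form and the number of rewrite steps."""
    word, steps = list(word), 0
    while True:
        for i in range(len(word) - 1):
            if word[i] == "a" and word[i + 1] == "b":
                word[i:i + 2] = ["b", "b", "a"]
                steps += 1
                break
        else:
            return "".join(word), steps
```

normalize("a" * n + "b") reaches the normal form b^(2^n) followed by a^n after exactly 2^n − 1 steps; e.g. normalize("aab") returns ("bbbbaa", 3).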



3.2 Context Dependent Interpretations

Even though polynomial interpretations provide an easy way to obtain an upper bound

on the derivational complexity of a TRS, they are not very suitable for proving polynomial

derivational complexity. Strongly linear interpretations only capture linear derivational

complexity, but even a slight generalization already admits examples of exponential

derivational complexity, as illustrated by Example 2. In (Hofbauer, 2001), context dependent

interpretations are introduced. They use an additional parameter (usually denoted

by ∆) in the interpretation functions, which changes in the course of evaluating the interpretation

of a term, thus making the interpretation dependent on the context. This way of

computing interpretations also allows us to bridge the gap between linear and polynomial

derivational complexity.

Definition 3. A context-dependent interpretation C for some signature F consists of functions

{f C [∆] : (R + 0 ) n → R + 0 | f ∈ F, n = arity(f), ∆ ∈ R + } and {f i C : R+ → R + | f ∈

F, i ∈ {1, ...,arity(f)}}. Given a ∆-assignment α : R + × V → R + 0 , the evaluation of a

term t by C is denoted by [α, ∆] C (t). It is defined inductively as follows:

[α, ∆] C (x) = α(∆, x) for x ∈ V

[α, ∆] C (f(t 1 , ..., t n )) = f C [∆]([α, f 1 C (∆)] C (t 1 ), ..., [α, f n C (∆)] C (t n )) for f ∈ F

Definition 4. For each ∆ ∈ R + , let > ∆ be the order defined by a > ∆ b ⇐⇒ a − b ≥ ∆.

A context-dependent interpretation C is compatible with a TRS R if for all rewrite rules

l → r in R, all ∆ ∈ R + , and every ∆-assignment α, we have [α, ∆] C (l) > ∆ [α, ∆] C (r).

Definition 5. A ∆-linear interpretation is a context dependent interpretation C whose

interpretation functions have the form

f C [∆](z 1 , ..., z n ) = ∑_{i=1}^{n} a (f,i) z i + ∑_{i=1}^{n} b (f,i) z i ∆ + c f ∆ + d f

f i C (∆) = ∆ / (a (f,i) + b (f,i) ∆)

with a (f,i) , b (f,i) , c f , d f ∈ N and a (f,i) + b (f,i) ≠ 0 for all f ∈ F, 1 ≤ i ≤ n. If we have

a (f,i) ∈ {0, 1} for all f, i, we also call it a ∆-restricted interpretation.

We consider ∆-linear interpretations because of the similarity between the functions

f C [∆] and the interpretation functions of linear polynomial interpretations. Another point

of interest is that the simple syntactical restriction to ∆-restricted interpretations yields a

quadratic upper bound on the derivational complexity. Moreover, because of the special

shape of ∆-linear interpretations, we need no additional monotonicity criterion for our

main theorems:

Theorem 6 ((Moser and Schnabl, 2008)). Let R be a TRS and suppose that there exists

a compatible ∆-linear interpretation. Then R is terminating and dc R (n) = 2^O(n) .

Theorem 7 ((Schnabl, 2007)). Let R be a TRS and suppose that there exists a compatible

∆-restricted interpretation. Then R is terminating and dc R (n) = O(n 2 ).


Example 8. Consider the TRS given in Example 1 again. A compatible ∆-restricted (and

∆-linear) interpretation C is built from the following interpretation functions:

+ C [∆](x, y) = (1 + ∆)x + y + ∆    + 1 C (∆) = ∆ / (1 + ∆)    + 2 C (∆) = ∆

- C [∆](x, y) = x + y + ∆    - 1 C (∆) = ∆    - 2 C (∆) = ∆

s C [∆](x) = x + ∆ + 1    s 1 C (∆) = ∆    0 C [∆] = 0

Note that this interpretation gives a quadratic upper bound on the derivational complexity.

However, from the polynomial interpretation given in Example 1, we can only infer an exponential

upper bound (Hofbauer and Lautemann, 1989). Consider the term P n,n , where

we define P 0,n = s n (0) and P m+1,n = +(P m,n , 0). We have |P n,n | = 3n + 1. For every

m, n ∈ N, P m+1,n rewrites to P m,n in n+1 steps. Therefore, P n,n reaches its normal form

s n (0) after n(n + 1) rewriting steps. Hence, the derivational complexity is also Ω(n 2 ) for

this example, so the inferred bound O(n 2 ) is tight.
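The evaluation of Definition 3 can be replayed mechanically for this interpretation. The following Python sketch uses exact rational arithmetic; the ∆-assignment is taken to be constant in ∆ here, so this is a spot check of compatibility rather than a proof:

```python
from fractions import Fraction

# Interpretation of Example 8: for each symbol, its function f_C[d]
# and the tuple of argument transformations f_C^i.
C = {
    "+": (lambda d, x, y: (1 + d) * x + y + d,
          (lambda d: d / (1 + d), lambda d: d)),
    "-": (lambda d, x, y: x + y + d,
          (lambda d: d, lambda d: d)),
    "s": (lambda d, x: x + d + 1, (lambda d: d,)),
    "0": (lambda d: Fraction(0), ()),
}

def ev(t, d, env):
    """[alpha, d]_C(t), with terms as nested tuples, variables as strings."""
    if isinstance(t, str):
        return env[t]
    f, descend = C[t[0]]
    return f(d, *(ev(a, descend[i](d), env) for i, a in enumerate(t[1:])))
```

For the rule +(s(x), y) → s(+(x, y)) the two sides differ by exactly ∆ under such assignments, so the rule is oriented by > ∆.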

4 Implementation

cdiprover3 is written entirely in OCaml 2 . It employs the libraries of the termination

prover TTT2 3 . From these libraries, functionality for handling TRSs and SAT encodings,

and an interface to the SAT solver MiniSAT 4 are used. Without counting this, the tool

consists of about 1700 lines of OCaml code. About 25% of that code is devoted to

the manipulation of polynomials and extensions of polynomials that stem from our use

of the parameter ∆. Another 35% is used for constructing parametric interpretations

and building suitable Diophantine constraints (see below) which enforce the necessary

conditions for termination. Using TTT2’s library for propositional logic and its interface

to MiniSAT, 15% of the code deals with encoding Diophantine constraints into SAT. The

remaining code is used for parsing input options and the given TRS, generating output,

and controlling the program flow.

In order to find polynomial interpretations automatically, Diophantine constraints are

generated according to the procedure described in (Contejean, Marché, Tomás and Urbain,

2005). Putting an upper bound on the coefficients makes the problem finite. Essentially

following (Fuhs, Giesl, Middeldorp, Schneider-Kamp, Thiemann and Zankl, 2007),

we then encode the (finite domain) constraints into a propositional satisfiability problem.

This problem is given to MiniSAT. From a satisfying assignment for the SAT problem,

we construct a polynomial interpretation which is monotone and compatible with the

given TRS.
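To illustrate why bounding the coefficients makes the search finite, here is a hypothetical brute-force miniature of that search (cdiprover3 instead encodes the constraints into SAT): it enumerates bounded coefficients of a linear interpretation for the two +-rules of Example 1, representing each side as a linear form and checking compatibility coefficient-wise:

```python
from itertools import product

X, Y = (1, 0, 0), (0, 1, 0)          # linear forms (coeff_x, coeff_y, const)

def app2(a, b, c, p, q):             # interpretation a*x + b*y + c on forms
    return (a * p[0] + b * q[0], a * p[1] + b * q[1], a * p[2] + b * q[2] + c)

def app1(u, v, p):                   # interpretation u*x + v on forms
    return (u * p[0], u * p[1], u * p[2] + v)

def gt(p, q):                        # p > q for all x, y in N (coefficient-wise)
    return p[0] >= q[0] and p[1] >= q[1] and p[2] > q[2]

def search(bound):
    """Find coefficients (each at most `bound`) of a compatible linear
    interpretation for +(0, y) -> y and +(s(x), y) -> s(+(x, y))."""
    for a, b, c, u, v, z in product(range(bound + 1), repeat=6):
        if a < 1 or b < 1 or u < 1:  # monotonicity in each argument
            continue
        zero = (0, 0, z)
        lhs2 = app2(a, b, c, app1(u, v, X), Y)     # [+(s(x), y)]
        rhs2 = app1(u, v, app2(a, b, c, X, Y))     # [s(+(x, y))]
        if gt(app2(a, b, c, zero, Y), Y) and gt(lhs2, rhs2):
            return {"a": a, "b": b, "c": c, "u": u, "v": v, "z": z}
    return None
```

search(3) finds a solution; one can also verify that the coefficient a of the first argument of + must be at least 2 in any solution, which connects to the exponential-bound discussion in Section 3.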

This procedure is also the basis of the automatic search for ∆-linear and ∆-restricted

interpretations. The starting point of that search is an interpretation with uninstantiated

coefficients. If we want to be able to apply Theorem 6 or 7, we need to find coefficients

which make the resulting interpretation compatible with the given TRS. Furthermore,

we need to make sure that no divisions by zero occur in the interpretation functions.

Again, we encode these properties into Diophantine constraints on the coefficients of a

∆-linear or ∆-restricted interpretation. The encoding is an adaptation of the procedure in

2 http://caml.inria.fr.

3 http://colo6-c703.uibk.ac.at/ttt2.

4 http://minisat.se.



Table 1: Performance of cdiprover3

Method                  SL    SL+∆-restricted    ∆-linear    ∆-restricted
-i -b X                 31    31                 31          3      7      15     31
# success               41    87                 83          83     86     86     86
average success time    20    3010               5527        3652   4041   4008   3986
# timeout               0     237                797         144    189    221    238

(Contejean et al., 2005) to context-dependent interpretations. It is described in detail in

(Schnabl, 2007; Moser and Schnabl, 2008). Once we have built the constraints, we continue

using the same techniques as for searching polynomial interpretations: we encode

the constraints in a propositional satisfiability problem, apply the SAT solver, and use a

satisfying assignment to construct a context-dependent interpretation.

Table 1 shows experimental results of applying cdiprover3 on the 957 known terminating

examples of the TPDB. The tests were performed single-threaded on a 2.40 GHz

Intel Core 2 Duo with 2 GB of memory. For each system, cdiprover3 was given

a timeout of 60 seconds. All times in the table are given in milliseconds. The method

SL denotes strongly linear interpretations. In all tests, we called cdiprover3 with the

options -i -b X (see Section 5 below), where X is specified in the second row of the

table. As we can see, cdiprover3 is currently able to prove polynomial derivational

complexity for 87 of the 368 known terminating non-duplicating rewrite systems of the

TPDB (duplicating rewrite systems have at least exponential derivational complexity, so

this restriction is harmless here). The results indicate that an upper bound of 7 on the coefficient

variables suffices to capture all examples on our test set. Therefore, 3 and 7 seem

to be good candidates for default values of the -b option. However, it should be noted

that our handling of the divisions introduced by the functions f i C is computationally rather

expensive, which is indicated by the number of timeouts and the average time needed

for successful proofs. This also explains the slight decrease in performance when we

extend the search space to ∆-linear interpretations. However, there is one system which

can be handled by ∆-linear interpretations, but not by ∆-restricted interpretations: system

SK90/2.50 in the TPDB, which we mentioned in Example 2.

5 Using cdiprover3

cdiprover3 is called from the command line. The basic usage pattern for cdiprover3 is:

$ ./cdiprover3 <timeout> <file> <options>

• <timeout> specifies the maximum number of seconds until cdiprover3 stops

looking for a suitable interpretation.

• <file> specifies the path to the file which contains the considered TRS.

• For <options>, the following switches are available:

-c <class> defines the desired subclass of the searched polynomial or context-dependent

interpretation. The following values of <class> are legal:



linear, simple, simplemixed, quadratic These classes correspond to the respective

subclasses of polynomial interpretations, as defined in (Steinbach,

1992). Linear polynomial interpretations imply an exponential upper

bound on the derivational complexity. The other classes imply a double

exponential upper bound, cf. (Hofbauer and Lautemann, 1989).

pizerolinear, pizerosimple, pizerosimplemixed, pizeroquadratic For these

values, cdiprover3 tries to find a polynomial interpretation with the

following restrictions: defined function symbols are interpreted by linear,

simple, simple-mixed, or quadratic polynomials, respectively. Constructors

are interpreted by strongly linear polynomials. These interpretations

guarantee that the derivation length of all constructor based terms is polynomial

(Bonfante et al., n.d.).

sli This option corresponds to strongly linear interpretations. As mentioned

in Section 3, they induce a linear upper bound on the derivational complexity

of a compatible TRS.

deltalinear This value specifies that the tool should search for a ∆-linear

interpretation. By Theorem 6, compatibility with such an interpretation

implies an exponential upper bound on the derivational complexity.

deltarestricted This option corresponds to ∆-restricted interpretations. By

Theorem 7, they induce a quadratic upper bound.

-b <bound> sets the upper bound for the coefficient variables. The default value for this bound is 3.

-i This switch activates an incremental strategy for handling the upper bound on

the coefficient variables. First, cdiprover3 tries to find a solution using

an intermediate upper bound of 1 (which corresponds to encoding each coefficient

variable by one bit). Whenever the tool fails to find a proof for some

upper bound b, it is checked whether b is equal to the bound specified by the

-b option. If that is the case, then the search for a proof is given up. Otherwise,

b is set to the minimum of the bound specified by the-b option and

2(b+1)−1 (which corresponds to increasing the number of bits used for each

coefficient variable by 1).
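The growth schedule of the incremental strategy can be sketched in a few lines (a hypothetical helper of our own, not part of cdiprover3):

```python
def incremental_bounds(b_max):
    """Yield the intermediate coefficient bounds tried by the -i strategy:
    start at 1 (one bit per coefficient variable) and, after each failed
    attempt, grow the bound to min(b_max, 2*(b+1)-1), i.e. one more bit."""
    b = min(b_max, 1)
    while True:
        yield b
        if b == b_max:
            return
        b = min(b_max, 2 * (b + 1) - 1)

# With the default -b value of 3 the tried bounds are 1 and 3; with -b 7
# (the other default candidate suggested in Section 4) they are 1, 3, 7.
assert list(incremental_bounds(3)) == [1, 3]
assert list(incremental_bounds(7)) == [1, 3, 7]
```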

If the -c switch is not specified, then the standard strategy for proving polynomial

derivational complexity is employed. First, cdiprover3 looks for a strongly linear

interpretation. If that is not successful, then a suitable ∆-restricted interpretation is

searched. The input TRS files are expected to have the same format as the files in the

TPDB. The format specification for this database is available at http://www.lri.


The output given by cdiprover3, as exemplified by Example 9, is structured as

follows. The first line contains a short answer to the question whether the given TRS

is terminating: YES, MAYBE, or TIMEOUT. The latter means that cdiprover3 was

still busy after the specified timeout. MAYBE means that a termination proof could not

be found, and cdiprover3 gave up before time ran out. The answer YES indicates

that an interpretation of the given class has been found which guarantees termination of

the given TRS. It is followed by the inferred bound on the derivational complexity and a


listing of the interpretation functions. After the interpretation functions, the elapsed time

between the call of cdiprover3 and the output of the proof is given. In all cases, the

answer is concluded by statistics stating the total number of monomials in the constructed

Diophantine constraints, and the upper bound for the coefficients that was used in the last

call to MiniSAT.
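When scripting around the tool, the answer format described above can be picked apart mechanically. The sketch below assumes exactly the layout shown in Figure 1; the function name is our own:

```python
def parse_answer(output):
    """Return the short answer (YES / MAYBE / TIMEOUT) and, for YES, the
    keyword of the claimed complexity bound from a cdiprover3 answer."""
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    answer = lines[0]
    bound = None
    if answer == "YES":
        for ln in lines[1:]:
            if "upper bound on the derivational complexity" in ln:
                bound = ln.split()[0]  # e.g. "QUADRATIC"
                break
    return answer, bound

demo = "YES\nQUADRATIC upper bound on the derivational complexity\n..."
assert parse_answer(demo) == ("YES", "QUADRATIC")
```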

Example 9. Given the TRS shown in Example 1, cdiprover3 produces the output

shown in Figure 1. The interpretations in Example 8 and in the output are equivalent.

Note that the parameter ∆ in the interpretation functions f C[∆] is treated like another argument of the function. The interpretation functions fC i are represented by f tau i in the output.


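To make the notion of derivational complexity concrete, the addition/subtraction TRS from the figure can be executed directly. The little step-counting interpreter below is our own illustration (not part of cdiprover3); the observed derivation lengths are linear in the term size, comfortably inside the quadratic bound the tool certifies:

```python
# Terms as nested tuples: ('0',), ('s', t), ('+', t1, t2), ('-', t1, t2).
def step(t):
    """Apply one leftmost-innermost rewrite step of the TRS; return
    (new_term, True), or (t, False) if t is a normal form."""
    if t[0] in ('+', '-'):
        for i in (1, 2):                       # normalize arguments first
            sub, changed = step(t[i])
            if changed:
                return (t[0],) + tuple(sub if j == i else t[j]
                                       for j in (1, 2)), True
        a, b = t[1], t[2]
        if t[0] == '+':
            if a == ('0',):                  return b, True           # +(0,y) -> y
            if a[0] == 's':                  return ('s', ('+', a[1], b)), True
        else:
            if a == ('0',):                  return ('0',), True      # -(0,y) -> 0
            if b == ('0',):                  return a, True           # -(x,0) -> x
            if a[0] == 's' and b[0] == 's':  return ('-', a[1], b[1]), True
    elif t[0] == 's':
        sub, changed = step(t[1])
        if changed:
            return ('s', sub), True
    return t, False

def derivation_length(t):
    """Count rewrite steps until a normal form is reached."""
    n = 0
    while True:
        t, changed = step(t)
        if not changed:
            return n
        n += 1

def num(n):  # Peano numeral s^n(0)
    return ('s', num(n - 1)) if n else ('0',)

assert derivation_length(('+', num(5), num(3))) == 6   # n+1 steps
assert derivation_length(('-', num(5), num(3))) == 4   # m+1 steps
```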
6 Conclusion

In this paper, we have presented the (as far as we know) first tool which is specifically

designed for automatically proving polynomial derivational complexity of term rewriting.

We have also given a brief introduction into the applied proof methods. With our current

implementation, we are able to prove polynomial derivational complexity for 87 of

the 368 known terminating non-duplicating rewrite systems of the TPDB. By adding new

termination methods to our tool which can prove polynomial derivational complexity of

rewrite systems, we could extend the range of problems that the prover can solve. The

matchbounds technique comes to mind here, which induces a linear upper bound on the

derivational complexity of the considered system (Geser, Hofbauer, Waldmann and Zantema,

2007; Korp and Middeldorp, 2007). Another avenue for future work is the search for

other subclasses of context-dependent interpretations which imply non-quadratic and nonlinear,

but polynomial upper bounds on the derivational complexity. A further possibility

would be to find more efficient ways of handling the divisions introduced by the functions

fC i. Results in this area would help to further improve the power of cdiprover3.


References

Avanzini, M. and Moser, G. (2008). Complexity analysis by rewriting, Proc. 9th FLOPS, Vol. 4989 of LNCS, pp. 130–146.

Baader, F. and Nipkow, T. (1998). Term Rewriting and All That, Cambridge University Press.


Bonfante, G., Cichon, A., Marion, J.-Y. and Touzet, H. (n.d.). Algorithms with polynomial

interpretation termination proof, J. Funct. Program. (1): 33–53.

Bonfante, G., Marion, J.-Y. and Péchoux, R. (2007). Quasi-interpretation synthesis by

decomposition, Proc. 4th ICTAC, Vol. 4711 of LNCS, pp. 410–424.

Contejean, E., Marché, C., Tomás, A. P. and Urbain, X. (2005). Mechanically proving

termination using polynomial interpretations., J. Autom. Reason. 34(4): 325–363.

Fuhs, C., Giesl, J., Middeldorp, A., Schneider-Kamp, P., Thiemann, R. and Zankl, H.

(2007). SAT solving for termination analysis with polynomial interpretations, Proc.

SAT 2007, Vol. 4501 of LNCS, pp. 340–354.


Geser, A., Hofbauer, D., Waldmann, J. and Zantema, H. (2007). On tree automata that

certify termination of left-linear term rewriting systems, Inf. Comput. 205(4): 512–


Hofbauer, D. (2001). Termination proofs by context-dependent interpretations, Proc. 12th

RTA, Vol. 2051 of LNCS, pp. 108–121.

Hofbauer, D. and Lautemann, C. (1989). Termination proofs and the length of derivations,

Proc. 3rd RTA, Vol. 355 of LNCS, pp. 167–177.

Korp, M. and Middeldorp, A. (2007). Proving termination of rewrite systems using

bounds, Proc. 18th RTA, Vol. 4533 of LNCS, pp. 273–287.

Lankford, D. (1979). On proving term-rewriting systems are noetherian, Technical Report

MTP-2, Math. Dept., Louisiana Tech. University.

Lescanne, P. (1995). Termination of rewrite systems by elementary interpretations, Formal

Aspects of Computing 7(1): 77–90.

Marion, J.-Y. (2003). Analysing the implicit complexity of programs, Inf. Comput. 183(1): 2–18.

Moser, G. and Schnabl, A. (2008). Proving quadratic derivational complexities using

context dependent interpretations, Proc. 19th RTA. Accepted for publication.

Schnabl, A. (2007). Context Dependent Interpretations 5, Master's thesis, Universität Innsbruck.

Steinbach, J. (1992). Proving polynomials positive, Proc. 12th FSTTCS, Vol. 652 of

LNCS, pp. 191–202.

TeReSe (2003). Term Rewriting Systems, Vol. 55 of Cambridge Tracts in Theoretical

Computer Science, Cambridge University Press.

5 Available online at http://cl-informatik.uibk.ac.at/~aschnabl/


Figure 1: Output produced by cdiprover3.

$ cat tpdb-4.0/TRS/SK90/2.11.trs

(VAR x y)
(RULES
+(0,y) -> y
+(s(x),y) -> s(+(x,y))
-(0,y) -> 0
-(x,0) -> x
-(s(x),s(y)) -> -(x,y)
)
(COMMENT Example 2.11 (Addition and Subtraction) in \cite{SK90})

$ ./cdiprover3 -i tpdb-4.0/TRS/SK90/2.11.trs 60
YES
QUADRATIC upper bound on the derivational complexity

This TRS is terminating using the deltarestricted interpretation

-(delta, X1, X0) = + 1*X0 + 1*X1 + 0 + 0*X0*delta + 0*X1*delta + 1*delta

s(delta, X0) = + 1*X0 + 1 + 0*X0*delta + 1*delta

0(delta) = + 0 + 0*delta

+(delta, X1, X0) = + 1*X0 + 1*X1 + 0 + 0*X0*delta + 1*X1*delta + 1*delta

- tau 1(delta) = delta/(1 + 0 * delta)

- tau 2(delta) = delta/(1 + 0 * delta)

s tau 1(delta) = delta/(1 + 0 * delta)

+ tau 1(delta) = delta/(1 + 1 * delta)

+ tau 2(delta) = delta/(1 + 0 * delta)

Time: 0.024418 seconds


Number of monomials: 187

Last formula building started for bound 1

Last SAT solving started for bound 1
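As a rough plausibility check of the interpretation printed above, one can freeze delta at a positive value and compare both sides of each rule as ordinary polynomials. Two caveats: this ignores the context-dependent modulation of delta by the f tau functions, and the X1/X0 argument order is our assumption, so this only illustrates rule orientation, not the actual compatibility proof cdiprover3 performs:

```python
DELTA = 1.0  # any fixed positive value

def I_zero():      return 0.0
def I_s(x):        return x + 1.0 + DELTA
def I_plus(x, y):  return x + y + x * DELTA + DELTA  # assumes X1 = 1st arg
def I_minus(x, y): return x + y + DELTA

# Each rule's left-hand side should receive a strictly larger value:
x, y = 4.0, 7.0
assert I_plus(I_zero(), y) > y                        # +(0,y) -> y
assert I_plus(I_s(x), y) > I_s(I_plus(x, y))          # +(s(x),y) -> s(+(x,y))
assert I_minus(I_zero(), y) > I_zero()                # -(0,y) -> 0
assert I_minus(x, I_zero()) > x                       # -(x,0) -> x
assert I_minus(I_s(x), I_s(y)) > I_minus(x, y)        # -(s(x),s(y)) -> -(x,y)
```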



Éva Szilágyi

University of Pécs

Abstract. Our project works on the implementation of a totally lexicalist grammar. The syntax component has now been worked out; in this approach it is like a dependency grammar, but word order is handled as well. In harmony with the idea of total lexicalism, no PS-trees (nor transformations) exist. We use rank parameters, close to Optimality Theory, for expressing word order variations in a language. A special kind of rank parameter accounts for the Hungarian focus phenomenon, which makes radical surface changes in word order (beyond intonational effects). The system is implemented in a relational database (SQL).

1 Introduction

Predicates seek their arguments in every language of the world, and adjuncts seek their joining points too. We claim that only 8-10 operations work in languages, but their effectiveness differs. This can be ordered by rank parameters: a universal tool (as in Optimality Theory (Archangeli and Langendoen, 1997)) with language-specific settings. Our project aims to develop an MT system based on GASG (Generalized Argument Structure Grammar), a totally lexicalist theory (Alberti, 1999). We are basically linguists, so our high-priority goal is linguistic. Lexicalist theories are successful nowadays, and we aim to try out this extreme of lexicalism both theoretically and practically. For us this is more important than effectiveness in size, speed or time.

The lexicon is stored in a relational database. The essence of relational databases lies in the definition of relations. Relations describe facts and constitute the database as well. Each entity is an n-tuple: the elements of a tuple are in a relation, constituting a record. The elements are attributes, constituting the fields of a record. A relation is a table in the database, where each row (record) is an n-tuple and each column is an attribute (Halassy, 1994). We chose Microsoft SQL Server 2005 for our implementation, so we have a complete and complex database management framework.
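As an illustration of this relational layout (using SQLite as a stand-in for Microsoft SQL Server 2005; all table and column names below are our own invention, not the project's actual schema), a lexical item and one of its requirements might be stored as:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE lexical_item (id INTEGER PRIMARY KEY, form TEXT, gloss TEXT);
CREATE TABLE requirement (
    item_id   INTEGER REFERENCES lexical_item(id),
    target    TEXT,     -- category of the required element
    strength  INTEGER,  -- the numeric rank (smaller = stronger)
    direction TEXT,     -- 'a' following, 'b' preceding, 'c' either
    kind      TEXT      -- 'r' recessive, 'd' dominant
);
""")
# The definite article requires a noun after it with rank 5 (r5a):
db.execute("INSERT INTO lexical_item VALUES (1, 'az', 'the')")
db.execute("INSERT INTO requirement VALUES (1, 'noun', 5, 'a', 'r')")

# Each record is an n-tuple in the relation, and comes back the same way:
row = db.execute("SELECT strength, direction, kind FROM requirement").fetchone()
assert row == (5, 'a', 'r')
```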

A morphophonological component has been transferred from our former project. Now

rules of syntax are being built in. The main component will be the semantic component:

the implementation of the DRT-based (Kamp, van Genabith and Reyle, 2004) (Asher and

Lascarides, 2003) ReALIS dynamic semantic system (Alberti, 2005).

GASG is a monostratal declarative grammar which is considered to be ”totally lexicalist”. Total lexicalism means that all information is in the description of the lexical items, and unification alone drives the combining of lexical elements. Thus, it can be considered a modified unificational categorial grammar (even function application is omitted). It carries on radical lexicalism, introduced by (Karttunen, 1986), which states that if the lexicon is rich enough, then sentences can be produced by unification in such a way that phrase-structure is practically redundant; moreover, phrase-structure leads to false ambiguities. Works in computational linguistics (for example (Schneider, 2005)) also come to the conclusion that


reducing phrase-structure could be useful. Many applications lean on phrase-structure, because otherwise a dependency grammar, without restricting word order, is not effective in computation. GASG accounts for word order by rank parameters, so giving up phrase-structure does not result in exponential running time of the analyzing algorithm.

Thus, the ’rules’ mentioned above are not really rules, but properties which can be unified. Requested arguments and their realizations are properties, too. Word order requirements are also properties: requirements with different strengths. Our grammar model uses rank parameters for expressing word order, so a requirement can not only be satisfied or violated, but can also compete with (partially) incompatible requirements. A special variant of these rank parameters also expresses those cases where focus (or another operator) ”re-orders” the word order (compared to a neutral sentence). In written Hungarian sentences there is no other sign of focus (in spoken sentences there is emphasis as well).

2 Rank parameters

Primitive syntactic relations (like being before or after each other) can be considered as a direct preceding requirement in the description of the lexical item. This is because if an element is in a relationship with a head, it wants to be its neighbour. To give a short example in Hungarian: a definite article needs a noun immediately after itself (1a). If an adjective is there, it needs the noun to be immediately after itself as well (1b). If this noun has a possessive suffix, the suffix wants the possessor between the article and the adjective (1c). Another adjective, expressing nationality, has to be before the noun (1d). Both adjectives cannot immediately precede the noun: nationality gets priority in this case. Since sentences are linear, a head theoretically has only two neighbours. And in practice, languages usually pick their complements from one direction.

(1) a. a tanárom

the teacher-Poss1Sg

’my teacher’

b. az okos tanárom

the clever teacher-Poss1Sg

’my clever teacher’

c. az én okos tanárom / *az okos én tanárom

the I clever teacher-Poss1Sg / the clever I teacher-Poss1Sg

’my clever teacher’

d. az én okos magyar tanárom

the I clever Hungarian teacher-Poss1Sg

’my clever Hungarian teacher’

These relations can be expressed by a parameter, called a rank parameter: a number expressing how close two lexical items need to be to each other to express the relationship between them. So now we can calculate how a requirement can be satisfied indirectly (or partially). In the case of (1a), and for the nationality adjective in (1d), the requirement is satisfied directly. The requirement of the article in (1b) or of the adjective in (1d) is satisfied indirectly.



Figure 1: Indirect satisfaction in (1d).

Rank parameters also show in which direction the satisfying word should be. This is expressed by a character: a, b or c, referring to a following position, a previous position, or both. We differentiate two types of rank parameters based on the way requirements are satisfied. Recessive rank parameters (r) give neighbourhood relations (as in (1a-d)), and they are satisfied either if the two elements are immediately adjacent or if another element with a stronger rank (a smaller number) is wedged in 1 . In Figure 1, the 5-strength requirement of the determiner az ’the’ towards the noun tanárom ’teacher-Poss1Sg’ is satisfied. This case is a partial or indirect satisfaction (Alberti, 1999) (see further examples in (6-7)). Of conflicting dominant rank parameters (d), only the strongest one can be satisfied; all others are deleted (see section 6).
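The compact rank codes used below (r2b, d6a, ...) bundle three pieces of information; a small helper (hypothetical, our own) makes the encoding explicit:

```python
def parse_rank(code):
    """'r2b' -> ('r', 2, 'b'): recessive, strength 2, target precedes.
    kind: 'r' recessive / 'd' dominant; direction: 'a' following,
    'b' preceding, 'c' either; a SMALLER number is a STRONGER rank."""
    return code[0], int(code[1:-1]), code[-1]

def stronger(code1, code2):
    """Is the first requirement stronger than the second?"""
    return parse_rank(code1)[1] < parse_rank(code2)[1]

assert parse_rank('d6a') == ('d', 6, 'a')
assert stronger('r2b', 'r3a')  # the neutral pre-verb order wins over r3a
```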

Dominant parameters come language-specifically from either syntax or semantics. For example, in Hungarian the subject of a sentence precedes the verb by a dominant semantic rank parameter, and in no language is it morpheme-marked (thus, it is not a separate lexical item). By contrast, the subject obligatorily precedes the verb in English, even if it is semantically empty. Dominant parameters also play an important part in the Hungarian focus phenomena (see examples (6-11)).

3 Predicates and arguments, heads and complements

Argument structures are considered as entities. Their elements are given by a stock table of argument types. Therefore, an argument is formed by a relationship between the argument structure and an argument type. For example, the Hungarian verb lakik ’live’ has two arguments: the one who lives somewhere and the place where that one lives. Argument types are described by a numeric parameter which places the argument on a scale from agentive to patient-like. Those types which are not in the central frame describing relations between subjects and objects get a neutral parameter.

In Hungarian, we treat nominal parts of speech as having more than one argument structure: they can be arguments themselves, their basic (in most languages the only) role (2a), or they can be nominal predicates, too, because the copula is phonetically null in Hungarian in the present tense, third person singular (2b). And we count the short possessive form here, which searches for a possessive suffix (2c).

(2) a. Péter Budapesten lakik.

Peter-NOM Budapest-SUPERESS live-3Sg

’Peter lives in Budapest.’

1 Wedging in has perceptual limits.


b. Annak a fiúnak a neve Péter.

That-DAT the boy-DAT the name-Poss3Sg Peter-NOM

’That boy’s name is Peter.’

c. Péter kalapja.

Peter-NOM hat-Poss3Sg

’Peter’s hat’


We store the required complements the same way: there is a case frame, where the word ’case’ now has an extended meaning: we record here all forms like infinitives or postpositional phrases, as well as constant phrases for which a case-suffixed word form (3a) can be exchanged (3b). Therefore, cases are stored as a relationship between the case frame and a case type.

(3) a. Péter elárult pár dolgot Mariról.

Peter-NOM disclose-Past3Sg couple thing-ACC Mary-DELAT

’Peter disclosed a couple of things about Mary.’

b. Péter elárult pár dolgot Marival kapcsolatban.

Peter-NOM disclose-Past3Sg couple thing-ACC Mary-INS relation-INESS

’Peter disclosed a couple of things about Mary/related to Mary.’

Sometimes the lexical item does not select a certain case for its argument. The verb lakik ’live’ has two cases for its arguments: the former gets the nominative case, by the linkage between the argument and the case. The other one is a joker type: ’not specified’. The lack of the filled argument may cause an ungrammatical sentence, even though at this point we do not know the exact case (case type) it is realized as. Therefore, argument types and case types can be linked, too. For the ’PLACE’ type argument, several case types can be selected, as these examples show:

(4) a. Péter egy szép házban lakik.

Peter-NOM a nice house-INESS live-3Sg

’Peter lives in a nice house’

b. Péter Budapesten lakik.

Peter-NOM Budapest-SUPERESS live-3Sg

’Peter lives in Budapest.’

c. Péter az iskola mellett lakik.

Peter-NOM the school-NOM next-POSTPOS live-3Sg

’Peter lives next to the school.’

Syntax may account for adjuncts too. A suffixed noun is an adjunct when the suffix is compositional; however, all those compositional elements which are required by another element are complements. In this case the suffix (or the lexical item: ott ’there’) tells about itself that it is an adjunct requiring a noun.

4 Rank parameters in operation

Rank parameters come from description, by experience. In the following, some Hungarian examples show how they work.



In Hungarian, a head-complement relation is given by a 7-strength rank parameter. We do not give any direction because (since lexical items are morphemes) the place of the complement is underspecified at this point. Semantic requirements search for an aspectualization argument in the pre-verbal position. There is always an argument giving aspect: usually it is a pre-verb (5a) 2 or a bare NP (5b), or occasionally it can be the verb itself (5c) (Alberti, 2004).

(5) a. Péter megírta a leckét.

Peter-NOM Perf+write-Past3Sg the homework-ACC

’Peter has written the homework.’

b. Már három hete újságot árulok.

Already three week newspaper-ACC sell-1Sg

’I have been selling newspaper for three weeks already.’

c. Péter csalódik Mariban.

Peter-NOM get-disappointed-3Sg Mary-INESS

’Peter gets disappointed in Mary.’

Pre-verbs have two rank parameters, both recessive. In neutral sentences like (6a)

the pre-verb el ’away’ must precede indul ’starts going’, given by a strong (r2b) rank

parameter. The emphasis is on the pre-verb, and the verb has no emphasis, so practically

they form one phonological word. In other cases, like in (6b), the pre-verb may follow the

verb by a weaker (r3a) rank parameter. This time they are separate phonological words.

(6) a. Péter elindul horgászni.

Peter-NOM away+go3Sg fish-INF

’Peter goes fishing.’

b. Péter ’horgászni indul el. / ’Péter indul el horgászni.

Peter-NOM fish-INF go-3Sg away / Peter-NOM go-3Sg away fish-INF

’Why Peter goes away is that he will fish.’ / ’It is Peter who goes fishing.’

Sometimes a certain argument gives aspect. For example, the verb lakik ’live’ has an

argument for ’PLACE’, and it is in the preceding position with a strong (r2b) rank (7a),

or in the following position with a weaker (r3a) rank (7b) 3 .

(7) a. Péter Budapesten lakik.

Peter-NOM Budapest-SUPERESS live-3Sg

’Peter lives in Budapest.’

b. *Péter lakik Budapesten / ’Péter lakik Budapesten.

Peter-NOM live-3Sg Budapest-SUPERESS

’*Peter lives in Budapest.’ / ’It is Peter who lives in Budapest.’

There are even more special cases when a verb having a pre-verb still gets the aspect

from another argument.

2 Pre-verbs in Hungarian are considered as complements (as in other theories as well), because they are separate words. It is a matter of orthography that if the pre-verb immediately precedes the verb, the two are written as one word.

3 In the examples, an apostrophe means strong emphasis. Besides word order, this denotes focus in a Hungarian sentence.


(8) a. Péter Budapesten szállt meg.

Peter-NOM Budapest-SUPERESS stay-Past3Sg Perf

’Peter stayed in Budapest.’

b. *Péter megszállt Budapesten / Péter ’megszállt Budapesten.

Peter-NOM Perf+stay-Past3Sg Budapest-SUPERESS

’*Peter stayed in Budapest.’ / ’What Peter did in Budapest was that he stayed there.’

As we can see in (8b), the first sentence, without emphasis, is ungrammatical. The second variant is grammatical, but in no case neutral: a focus throws the locative back, so only the weaker requirement can be satisfied (see further in the next two sections). The aspect-giving argument has to be stored with two rank parameters in every case.

5 Focus in Hungarian

Focus in Hungarian can be recognized by emphasis and word order (Kiss, 2000). In the following examples, (9a) is a neutral sentence and (9b-c) are variants with a focus pointing at different complements of the verb.

(9) a. Mari süteményt süt Péternek.

Mary-NOM cookie-ACC bake-3Sg Peter-DAT

’Mary is baking cookies for Peter.’

b. Mari ’Péternek süt süteményt.

’It is Peter for whom Mary is baking cookies.’

c. Mari ’süteményt süt Péternek (és nem kenyeret).

’Those are cookies (and not bread) what Mary is baking for Peter.’

In our solution focus is a separate lexical item 4 , because it influences other elements

in the sentence by its own requirements. It searches for two other elements: the focused

element and a verb. Focus gives the verb a strong dominant rank parameter to be in the

following position (d6a). 5

In the previous section we claimed that the aspect-giving argument (mostly a pre-verb) has to be stored with two rank parameters. In neutral sentences (as in (6a)) the stronger (r2b) rank parameter is satisfied. But when a focus comes (see (6b)), that requirement of the pre-verb cannot be satisfied. The weaker (r3a) requirement is still there, and it can be satisfied.


6 Processing

The search starts from the finite verb. Those elements which turn out not to be required by the verb or any of its complements (mostly adjuncts) are legitimate if they find an element to attach to.

The first step of the process is to check dominant rank parameters. In Figure 2, the focused element tortát ’cake-ACC’ directly precedes the verb hozott ’bring-Past3Sg’. Then all conflicting requirements are deleted:

4 Although it is phonetically null in Hungarian, in some languages it is a morpheme (e.g. Eskimo, Quechua, Tamil). This explains why we consider it as a separate lexical item.

5 Progressive form of telic situations may work the same.



Figure 2: Processing.

1. Ranks applying to the same element from the same element (in Figure 2, r3a between the pre-verb be ’in’ and the verb is deleted; only r7b remains);

2. All other ranks between the two elements (r7a from the verb to the focused tortát,

and r7c from tortát to the verb);

3. Ranks applying to another element with a reverse direction (r7b rank of the verb to

the subject Péter ’Peter-NOM’ changes 6 to r7c, because subject could be anywhere

around the verb if there is a focus);

4. The dominant rank parameter wins if there are two conflicting requirements of the same element (between Péter ’Peter-NOM’ and the verb hozott ’bring-Past3Sg’ there is r7c and d7a; due to the focus the former remains in this sentence, but in a neutral sentence d7a applies).

The next step is to check recessive rank parameters: either two elements are direct neighbours or there is another element between them which is required there with a stronger rank parameter (this may bring in adjoining elements). In the example it goes as follows:

1. egy tortát ’a cake-ACC’, a szobába ’the room-ILLAT’, hozott be ’bring-Past3Sg in’

are neighbours directly;

2. in be a szobába ’in the room-ILLAT’ the definite article is in between, but it has a

stronger rank parameter (r5a against r7c);

3. Péter and hozott have egy tortát in between, due to the 6-strength rank parameter of the focus.
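The recessive check just described can be sketched in a few lines of Python. This is a simplification of our own (one licensing rank per intervening word; all names illustrative, not the actual implementation):

```python
def recessive_satisfied(words, src, tgt, strength, ranks):
    """The recessive requirement of words[src] towards words[tgt] (with the
    given strength) holds if the two are adjacent, or if every word wedged
    in between carries a STRONGER (smaller) rank. ranks[w] is the strength
    of the requirement that licenses w in that position."""
    lo, hi = sorted((src, tgt))
    return all(ranks[words[k]] < strength for k in range(lo + 1, hi))

# Figure 2's 'be a szobaba': the article 'a' (r5a) wedges into the rank-7
# pair, so the requirement is still (indirectly) satisfied:
words = ['be', 'a', 'szobaba']
assert recessive_satisfied(words, 0, 2, 7, {'a': 5})
# A weaker intruder (rank 8) would block satisfaction:
assert not recessive_satisfied(words, 0, 2, 7, {'a': 8})
```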

In our system, contrary to phrase-structure grammars, any element can be focused. Sometimes the verb does not succeed the focused element immediately: an adjoining word may follow the focused element, wedging itself in by a stronger rank parameter, as (10) shows:

(10) a. Péter egy lánnyal találkozott.

Peter-NOM a girl-INS meet-Past3Sg

’Peter met a girl.’

6 Practically its direction is deleted, see section 2.



b. Péter egy ’okos lánnyal találkozott.

Peter-NOM a clever girl-INS meet-Past3Sg

’It was a clever girl whom Peter met.’

c. Péter ’két okos lánnyal találkozott.

Peter-NOM two clever girl-INS meet-Past3Sg

’It was two clever girls whom Peter met.’

(11) a. Péter olvasott egy verset Adytól.

Peter-NOM read-Past3Sg a poem-ACC Ady-ABL

’Peter read a poem by Ady.’

b. *Péter egy ’verset Adytól olvasott.

Peter-NOM a poem-ACC Ady-ABL read-Past3Sg

In (11) the focused element (verset ’poem-ACC’) has a complement (Adytól ’Ady-ABL’), but complements are required with a 7-strength rank parameter, which is weaker than the 6-strength rank parameter between the focus and the verb.

7 Conclusion

We are working on the implementation of this system, in which predicate-argument and head-complement relations, adjuncts and word order are all handled in the lexicon. Rank parameters account for word order variations in a language, and for other phenomena like scrambling (which shows clear differences between languages) or focus and progressivity (which are sometimes invisible). The next step will be a semantic component, because we believe that intelligent applications can be built only on a real linguistic basis, which requires fine-grained semantics.


Acknowledgements

I am grateful to the Hungarian National Scientific Research Fund (OTKA K60595) for its contribution to my costs.


References

Alberti, G. (1999). GASG: The grammar of total lexicalism, Working Papers in the Theory of Grammar 6(1). Theoretical Linguistics Programme, Budapest University and Research Institute for Linguistics, Hungarian Academy of Sciences.

Alberti, G. (2004). Climbing for aspect with no rucksack, in K. É. Kiss and H. van

Riemsdijk (eds), Verb Clusters; A study of Hungarian, German and Dutch, Linguistics

Today 69, John Benjamins, Amsterdam:Philadelphia, pp. 253–289.

Alberti, G. (2005). ReALIS. Doctoral dissertation at Hungarian Academy of Sciences,

ms. HAS Research Institute for Linguistics and University of Pécs.

URL: http://lingua.btk.pte.hu/gelexi.asp

Archangeli, D. and Langendoen, T. D. (eds) (1997). Optimality Theory: an Overview,

Blackwell, Oxford.



Asher, N. and Lascarides, A. (2003). Logics of Conversation, Cambridge University Press, Cambridge.

Halassy, B. (1994). Az adatbázis-tervezés alapjai és titkai [Basics and Secrets of Designing

a Database], IDG, Budapest.

Kamp, H., van Genabith, J. and Reyle, U. (2004). Discourse representation theory. ms.

to appear in Handbook of Philosophical Logic.

URL: http://www.ims.uni-stuttgart.de/∼hans

Karttunen, L. (1986). Radical lexicalism, Report No. CSLI-86-68, CSLI Publications.

Kiss, K. É. (2000). Az egyszerü mondat szerkezete [the Structure of the Simple Sentence],

in F. Kiefer (ed.), Strukturális magyar nyelvtan I. Mondattan [Structural Hungarian

Grammar Vol. 1 Syntax], Vol. 7., Akadémiai Kiadó, Budapest, pp. 79–177.

Schneider, G. (2005). A broad-coverage, representationally minimal LFG parser: Chunks

and f-structures are sufficient, in M. Butt and T. H. King (eds), Proceedings of the

LFG05 Conference, CSLI Publications, University of Bergen, pp. 388–407.







Camilo Thorne

Free University of Bozen-Bolzano

Abstract. We propose to characterize the computational complexity of answering questions

in ontology-mediated controlled language interfaces to structured data sources by expressing

ontology-based data access in controlled English. This means: compositionally mapping a

controlled subset of English into knowledge bases and formal queries for which the computational

complexity of ontology-based data access is known. In the present paper, we extend

this approach to conjunctive queries and to conjunctive queries with aggregation functions.

1 Introduction

Lately, there has been a renewed interest within the computational linguistics community (Minock, 2005; Lesmo and Robaldo, 2007) in natural language interfaces to databases (NLIDBs), where the aim is to manage relational databases (DBs) with natural language (NL). In particular, robust interfaces supporting controlled fragments (CLs) of English and based on ontologies, computational semantics and deep semantic parsing have been developed, for instance by the Attempto project (Bernstein et al., 2003;

Fuchs et al., 2005). Controlled languages are fragments of NL tailored to fit data management

tasks by, typically, constraining their vocabulary (and syntax), thereby stripping them of ambiguity, whether structural or semantic. Controlled languages allow a trade-off between the coverage and the accuracy of the translation of questions into formal queries. Ontologies (the conceptualizations of the domain) play an intermediary role between the CL's vocabulary and the domain terminology.

However, some important issues regarding controlled English interfaces have not been, to the best of our knowledge, fully addressed. One of them is the tractability or intractability of processing CL information requests and utterances, viz., how difficult is declaring and accessing structured data with a controlled English interface? By difficult, we mean its computational complexity. We believe that a way of addressing this issue consists in expressing ontology-based data access with CLs. By this we mean designing declarative and interrogative controlled subsets of English that compositionally map, through a semantic mapping (taken from NL formal semantics), into formal queries, ontologies and database facts, i.e., their meaning representations (MRs). Ontology-based data access provides the logical underpinning of accessing structured data w.r.t. ontologies, and its computational complexity provides a measure of how difficult a task it might be.

The main purpose of this paper is twofold. On the one hand, we will say what it means to express ontology-based data access in CL. On the other hand, we will proceed to express in controlled English a class of formal queries known as conjunctive queries. Conjunctive queries are good in that with them we reach an optimal computational complexity. Last, but not least, we will extend our controlled language to cover aggregate queries, which are conjunctive queries to which the basic SQL aggregation functions, COUNT, MIN, MAX and SUM, have been added.


Proceedings of the 13 th ESSLLI Student Session

2 Ontology Based Data Access

Accessing and declaring data w.r.t. an ontology or conceptualization can be characterized
in terms of formal logic as follows (Rosati, 2007). A relational query q of arity n is a
formal expression q(x) ← Qyβ(x,y), where q(x) is the head and x denotes a sequence
of n variables, the query's distinguished variables, and Qyβ(x,y) is the body, a first
order logic (FOL) quantified boolean combination of relational atoms in which the
distinguished variables occur free and the others (the sequence y) are bound to a
quantifier. Qy denotes the sequence of its quantifier prefixes. When no confusion arises,
we shall abbreviate Qyβ(x,y) with Φ[x]. A query is said to be boolean if its arity is
n = 0. A collection of such queries is called a query language. A relational database
(DB) D is a finite set of ground atoms over a schema R := {R_1, ..., R_n}, where, for
i ∈ [1, n], R_i is a relation symbol of arity m ≥ 1, and over a countably infinite domain
Dom of constants. The active domain adom(D) of D is the set of constants that occur in
D (a finite subset of Dom). An ontology O is a set of FOL axioms that make explicit a
certain number of constraints holding over a domain. They are typically defined over
some fragment of FOL called an ontology language. This language should be rich enough
to express DBs (i.e., DB atoms). The pair 〈O, D〉 is called a knowledge base (KB), and
can be seen as a FOL theory: a set of ground atoms (the DB) plus a set of axioms (the
ontology). A ground substitution is a function σ(·) from Var(q), the set of variables of
q, into Dom. Substitutions are extended to sequences of variables in the standard way.
KBs and substitutions give rise to the certain answers semantics of a query q of arity
n over a KB 〈O, D〉, denoted q(〈O, D〉). It consists in collecting the values in adom(D)
of all the ground substitutions σ(·) for which 〈O, D〉 logically entails qσ, where qσ
denotes the grounding of q by σ(·). Formally, q(〈O, D〉) := {σ(x) ∈ adom(D)^n | σ s.t.
〈O, D〉 |= qσ}. To investigate its computational complexity we must look at the
associated recognition problem:

Definition 1. (QA) The KB query answering (QA) decision problem is the FOL entailment
problem stated as follows: given a KB 〈O, D〉, a sequence c ∈ Dom^n of n constants, a
CQ q of arity n and distinguished variables x, check if there exists a ground substitution
σ(·) s.t. σ(x) = c and 〈O, D〉 |= qσ holds, where qσ is the grounding of q by σ(·).
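As an illustration of the certain answers semantics just defined, a query over a small KB can be answered by brute force: saturate the DB with the ontology's axioms and then try every ground substitution over the active domain. The schema (teaches, Professor), the facts, and the axiom ∀x∀y. teaches(x,y) → Professor(x) below are hypothetical, and real DL-Lite reasoners answer queries by query rewriting rather than by this naive enumeration; this is only a sketch.

```python
from itertools import product

# Hypothetical DB facts over the schema {teaches/2, Professor/1}.
db = {("teaches", "ann", "logic"), ("teaches", "bob", "algebra")}

def saturate(facts):
    """Apply the (hypothetical) ontology axiom
    forall x,y. teaches(x,y) -> Professor(x)
    until a fixpoint is reached (naive forward chaining)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for f in list(facts):
            if f[0] == "teaches" and ("Professor", f[1]) not in facts:
                facts.add(("Professor", f[1]))
                changed = True
    return facts

def certain_answers(query_body, dist_vars, facts):
    """q(x) <- body: collect sigma(x) for every ground substitution sigma
    over the active domain that makes all body atoms hold.
    Convention: uppercase tokens in the body are variables."""
    adom = {c for f in facts for c in f[1:]}
    variables = sorted({t for (_, *args) in query_body
                        for t in args if t.isupper()})
    answers = set()
    for values in product(adom, repeat=len(variables)):
        sigma = dict(zip(variables, values))
        ground = [(p, *[sigma.get(a, a) for a in args])
                  for (p, *args) in query_body]
        if all(g in facts for g in ground):
            answers.add(tuple(sigma[v] for v in dist_vars))
    return answers

# q(X) <- Professor(X) /\ teaches(X, Y)
facts = saturate(db)
print(certain_answers([("Professor", "X"), ("teaches", "X", "Y")],
                      ["X"], facts))
```

For this KB, both "ann" and "bob" are certain answers, since the axiom forces each of them into Professor.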

When we focus on #(adom(D)) (the number of constants of D) while keeping constant both
size(q) (the number of symbols of the query) and #(O) (the number of axioms), we speak,
in a manner set by (Vardi, 1982), of the data complexity of QA. Such complexity will
depend on the query language and the ontology language chosen (Rosati, 2007).

The certain answers semantics can provide a formal semantics for ontology-mediated
CL data access interfaces, and QA's data complexity both a measure of their difficulty
and a criterion for optimality. To implement this strategy we need, we believe, to go
through two stages: (i) We need to choose an ontology language and a query language for
which the computational complexity of QA is known and for which data complexity is
optimal. (ii) We need to express QA with controlled English.

3 Expressing QA with Controlled English

A compositional translation τ, as proposed and conceived by Montague (Montague, 1970),
is a function that homomorphically maps a fragment of natural language (English in our
case) into, basically, FOL augmented with the types, the lambda abstraction and the
function application constructs of the simply typed λ-calculus, a.k.a. λ-FOL. It assigns
to each NL utterance a λ-FOL formula: its meaning representation (MR). The key feature
of compositional translations is that they can be made to map declarative fragments of
NL into ontology languages and interrogative fragments into query languages.

Definition 2. (Expressing QA) Given an ontology language L and a query language Q,
expressing QA in controlled English consists in: (i) Defining a grammar G and a
compositional translation τ for a controlled declarative fragment L(G) s.t. τ maps
L(G) into L. (ii) Defining a grammar G′ and a compositional translation τ for a
controlled interrogative fragment L(G′) s.t. τ maps L(G′) into Q.

We have dealt elsewhere with the problem of expressing KBs and ontology languages
by expressing, in particular, the DL-Lite_{R,⊓} ontology language or logic and, in
general, the DL-Lite family of DLs (Calvanese, De Giacomo, Lembo, Lenzerini and
Rosati, 2007). Description logics (DLs) are knowledge representation logics that
conceptually model a domain in terms of classes, roles (binary relations among classes)
and inheritance relations between classes and roles. In (Bernardi, Calvanese and
Thorne, 2007; Thorne, 2007) we define a declarative CL, Lite-English, and a
compositional translation τ, and show that:

Theorem 1. (Bernardi et al., 2007) For every sentence S in the CL Lite-English,
there exists a DL-Lite_{R,⊓} assertion α s.t. τ(S) = α. Conversely, every DL-Lite_{R,⊓}
assertion α is the image by τ of some sentence S in Lite-English.
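The flavour of such a declarative translation can be conveyed with a toy, hypothetical pattern matcher: two sentence templates mapped to DL-Lite-style inclusion assertions. Lite-English's actual grammar (Bernardi et al., 2007) is far richer and compositional; the patterns, class names and role names here are illustrative only.

```python
import re

def translate(sentence):
    """Map two hypothetical declarative templates to DL-Lite-style assertions:
    'Every <N> is a <N>.'        -> N1 ⊑ N2
    'Every <N> <TV>s somebody.'  -> N  ⊑ ∃TV
    Anything else is outside this toy fragment."""
    m = re.fullmatch(r"Every (\w+) is a (\w+)\.", sentence)
    if m:
        return f"{m.group(1).capitalize()} ⊑ {m.group(2).capitalize()}"
    m = re.fullmatch(r"Every (\w+) (\w+)s somebody\.", sentence)
    if m:
        return f"{m.group(1).capitalize()} ⊑ ∃{m.group(2)}"
    raise ValueError("outside the toy fragment")

print(translate("Every professor is a teacher."))
print(translate("Every man loves somebody."))
```

The second sentence yields Man ⊑ ∃love, i.e., every man participates in the love role as a first argument.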

To get the whole picture we now need to look at query languages. It turns out that
QA for DL-Lite_{R,⊓} is optimal w.r.t. data complexity, falling within LOGSPACE
(actually, AC^0), a minimal complexity class, when we choose as query language the
class of relational queries known as rule-based conjunctive queries (CQs). Conjunctive
queries are queries over a schema R whose body is a conjunction of existentially
quantified relational atoms. Expressing query languages w.r.t. which QA's computational
complexity is optimal can shed light on the conditions under which accessing data
w.r.t. an ontology with a CL might be a relatively easy task.

4 Expressing Conjunctive Queries

In this section we will show how to express graph-shaped simple conjunctive queries, a
subclass of the class of CQs, for which QA is optimal too. A typical boolean
graph-shaped query over, say, the constant Mary and the binary predicates loves and
hates is

(1) q() ← ∃x∃y(loves(Mary,x) ∧ hates(x,y))

which we would like to express through the CL Y/N-question

(2) Does Mary love somebody who hates somebody?

And a typical non-boolean graph-shaped query over the same set of relational symbols

(i.e., the schema {loves,hates}) is

(3) q(x) ← ∃y(loves(x,y) ∧ hates(x,y))
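Under the certain answers semantics of Section 2 (taking an empty ontology for simplicity), queries (1) and (3) can be evaluated by enumerating ground substitutions over the active domain. The facts below are hypothetical; this is a naive evaluation sketch, not an efficient query engine.

```python
from itertools import product

# Hypothetical facts over the schema {loves, hates}.
facts = {("loves", "Mary", "John"), ("hates", "John", "Paul"),
         ("loves", "Paul", "Eve"), ("hates", "Paul", "Eve")}
adom = sorted({c for f in facts for c in f[1:]})

def answers(body, variables, dist_vars):
    """Evaluate q(dist_vars) <- body by trying every ground substitution
    of `variables` over the active domain; tokens not in `variables`
    are treated as constants."""
    out = set()
    for values in product(adom, repeat=len(variables)):
        sigma = dict(zip(variables, values))
        if all((p, *[sigma.get(a, a) for a in args]) in facts
               for (p, *args) in body):
            out.add(tuple(sigma[v] for v in dist_vars))
    return out

# (1) q() <- Ex Ey (loves(Mary,x) /\ hates(x,y)): boolean, "yes" iff {()}.
print(answers([("loves", "Mary", "x"), ("hates", "x", "y")], ["x", "y"], []))
# (3) q(x) <- Ey (loves(x,y) /\ hates(x,y)).
print(answers([("loves", "x", "y"), ("hates", "x", "y")], ["x", "y"], ["x"]))
```

On these facts the boolean query (1) comes out true (Mary loves John, who hates Paul), and Paul is the only answer to (3), since he both loves and hates Eve.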



(Lexical rule)              (Value of τ on word and category)
Det → some                  λP.λQ.∃x(P(x) ∧ Q(x)) : (e → t) → ((e → t) → (e → t))
Pro_i → somebody            λP.∃xP(x) : (e → t) → t
Pro−_i → anybody            λP.∃xP(x) : (e → t) → t
Coord → and                 λP.λQ.∃x(P(x) ∧ Q(x)) : (e → t) → ((e → t) → (e → t))
Relpro_i → who              λP.λx.P(x) : (e → t) → (e → t)
Pro_i → him                 λP.P(x) : (e → t) → t
Pro_i → himself             λP.P(x) : (e → t) → t
Intpro → which              λP.λQ.λx.P(x) ∧ Q(x) : (e → t) → ((e → t) → (e → t))
Intpro_i → who_i            λP.λx.P(x) : (e → t) → (e → t)
NP_gap,i → ε                λP.P(x) : (e → t) → t
N_i → man, ...              λx.man(x) : e → t, ...
IV_i → runs, ...            λx.run(x) : e → t, ...
IV−_i → run, ...            λx.run(x) : e → t, ...
TV_i,j → loves, ...         λα.λx.α(λy.loves(x,y)) : ((e → t) → t) → (e → t), ...
TV−_i,j → love, ...         λα.λx.α(λy.loves(x,y)) : ((e → t) → t) → (e → t), ...
TVp_i,j → loved, ...        λα.λx.α(λy.loves(x,y)) : ((e → t) → t) → (e → t), ...
Adj_i → mortal, ...         λx.mortal(x) : e → t, ...
Pn_i → Mary, ...            λP.P(Mary) : (e → t) → t, ...

Table 1: Lexical rules for GCQ-English.

which we would like to express through the CL Wh-question (containing an anaphoric
pronoun)

(4) Who loves somebody who hates him?

Definition 3. (GCQs) A non-boolean graph-shaped simple conjunctive query (GCQ) of
arity ≤ 1 is a CQ over a schema R composed of relation symbols of arity ≤ 2, of the
form q := q(x) ← Φ[x], where the body Φ[x] is inductively defined as:

Φ[x] := A_{i0}(x) ∧ ... ∧ A_{im}(x) ∧ R_{j0}(x,x) ∧ ... ∧ R_{jm}(x,x) ∧ R_{j0}(x,c) ∧ ... ∧ R_{jm}(x,c).
Φ[x] := Φ′[x] ∧ ∃y(A_{i0}(x) ∧ ... ∧ A_{im}(x) ∧ R_{j0}(x,y) ∧ ... ∧ R_{jm}(x,y) ∧ R_{j0}(y,x) ∧ ... ∧ R_{jm}(y,x) ∧ Φ′′[y]).

Note that this definition allows for empty sequences of conjuncts, e.g.,
|A_{i0}(x) ∧ ... ∧ A_{im}(x)| ≥ 0 (where |·| is the function that returns the number of
predicates in the body of a relational query). A boolean GCQ is a query of the form
q := q() ← ∃yΦ[y], where Φ[y] is the body of a non-boolean GCQ.

4.1 Expressing Conjunctive Queries with GCQ-English

GCQs are captured by the interrogative CL GCQ-English. Questions in GCQ-English
fall into two main classes: (i) Wh-questions, which will map into non-boolean GCQs,
and (ii) Y/N-questions, which will map into boolean GCQs. For simplicity, we assume
grammars to be phrase structure grammars augmented with semantic actions. Phrase
structure grammars are composed of two sets of rewriting rules: lexical rules (a.k.a.
lexicons) and phrase-structure rules. Table 2 shows the phrase-structure rules of
GCQ-English's grammar and Table 1 its lexicon. Moreover, the latter is divided into
two sets: a closed set of function word rules, which express (at the semantic level)
logical operations and connectives, and an open set of content word rules (nouns,
adjectives, verbs), a feature we convey through dots.




(Phrase structure rule)              (Semantic action)
Q_wh → Intpro N_i S_gap,i ?          Q_wh := Intpro(N_i)(S_gap,i)
Q_wh → Intpro_i S_gap,i ?            Q_wh := Intpro_i(S_gap,i)
Q_Y/N → does NP−_i VP−_i ?           Q_Y/N := NP−_i(VP−_i)
Q_Y/N → is NP_i VP_i ?               Q_Y/N := NP_i(VP_i)
S_gap,i → NP_gap,i VP_i              S_gap,i := NP_gap,i(VP_i)
VP_i → VP_i Coord VP_i               VP_i := Coord(VP_i)(VP_i)
VP−_i → VP−_i Coord VP−_i            VP−_i := Coord(VP−_i)(VP−_i)
VP_i → TV_i,j NP_j                   VP_i := TV_i,j(NP_j)
VP_i → is Adj_i                      VP_i := Adj_i
VP_i → is a N_i                      VP_i := N_i
VP−_i → IV−_i                        VP−_i := IV−_i
VP_i → IV_i                          VP_i := IV_i
VP−_i → TV−_i,j NP_j                 VP−_i := TV−_i,j(NP_j)
VP_i → VPp_i                         VP_i := VPp_i
VPp_i → TVp_i,j NP_j                 VPp_i := TVp_i,j(NP_j)
NP−_i → Det− N_i                     NP−_i := Det−(N_i)
NP_i → Det N_i                       NP_i := Det(N_i)
NP_i → Pn_i                          NP_i := Pn_i
NP_i → Pro_i                         NP_i := Pro_i
N_i → Adj N_i                        N_i := Adj(N_i)
N_i → N_i Relpro_i S_gap,i           N_i := Relpro_i(N_i)(S_gap,i)

Table 2: Phrase structure rules for GCQ-English.

The empty expression ε is what in linguistic theory is called a trace, a placeholder
for the antecedent of the relative pronoun. Symbols occurring in the phrase-structure
rewriting rules are called components and represent the syntactic chunks into which
sentences can be analysed. Symbols that rewrite into words, that is, symbols in the
lexicon, are called categories or terminal components and represent parts of speech,
that is, verbs, common and proper nouns, pronouns, adjectives, etc. Some basic
morpho-syntactic and semantic features are attached to (some) components. The feature
·− means that the component is of negative polarity; the feature ·p, associated with
verb and verb phrase components, indicates that such a component is to be inflected
in the passive voice. Absence of features indicates that components are in positive
polarity and verbs and verb phrases in the active voice. Furthermore, indexes are
assigned to components following the standard set by (Pratt, 2001) to: (i) Resolve
intrasentential anaphora: anaphoric pronouns ("him", "himself") resolve with their
nearest (antecedent) head noun. (ii) Indicate gap-filler dependencies. For simplicity,
verbs are in the 3rd person singular and in the present tense.

A quick glance at the grammar rules of GCQ-English will convince the reader that,
for instance, the (English) question

(5) Does John love Mary?

and the question

(6) Which man is mortal and loves somebody who hates him?

lie within GCQ-English. By the same token, it is easy to see that the question

(7) *Which teacher gives a lesson to his pupils?

lies outside this CL. Why? Because we have no possessive adjectives (e.g., "his") and
no ditransitive verbs (e.g., "gives").

Semantic actions mean that we define the translation τ by recursion over the syntactic
components of GCQ-English in such a way that the application of each grammar rule,
lexical or otherwise, "triggers" τ (Jurafsky and Martin, 2000). The intermediate values
of this function are called partial MRs. When, in a Wh-question, we reach the Q_wh
component, τ maps the λ-FOL expression obtained, of the form Q_wh = λx.Φ[x] : e → t,
into the GCQ q(x) ← Φ[x], where Φ[x] denotes a conjunction of existentially quantified
atoms in which the variable x occurs free. In the case of a Y/N-question, the λ-FOL
expression Q_Y/N = Φ : t will be mapped into the boolean GCQ q() ← Φ, where Φ stands
for a conjunction of existentially quantified atoms with no free variables. Types ensure
that τ always terminates. Given a GCQ-English question Q, we can compute τ(Q) as
follows: (i) We compute the parse tree of Q. (ii) We compute τ(Q) bottom-up, from
leaves to root, as in Figure 1. We start by assigning a λ-expression to the leaves.
Then, at each internal node, we unify types and compute the λ-application and the
β-reduction of its siblings. We omit types for reasons of space. In the end we obtain,
at the root of the tree, a GCQ. The circle delimits an island; the dotted line, a
gap-filler dependency forced by the use of the pronoun.

Figure 1: Translating "Who loves Mary?".
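The bottom-up computation for "Who loves Mary?" can be mimicked with Python closures standing in for the λ-terms of Table 1: formulas are built as strings, function application plays the role of β-reduction, and the gap variable is passed in explicitly. This encoding is only a sketch of the paper's λ-FOL meaning representations, not the authors' implementation.

```python
# Semantic values as Python closures; formulas are plain strings.

def pn(name):                  # Pn: (e -> t) -> t, e.g. "Mary"
    return lambda P: P(name)

def tv(pred):                  # TV: ((e -> t) -> t) -> (e -> t), e.g. "loves"
    return lambda alpha: lambda x: alpha(lambda y: f"{pred}({x},{y})")

def np_gap(var):               # trace: consumes a predicate, fills the gap
    return lambda P: P(var)

who = lambda P: P              # Intpro_i: (e -> t) -> (e -> t), the identity

# "Who loves Mary?" -- bottom-up, as in Figure 1:
vp = tv("loves")(pn("Mary"))       # lambda x. loves(x, Mary)
s_gap = lambda x: np_gap(x)(vp)    # sentence body with the gap filled by x
body = who(s_gap)("x")             # beta-reduce with distinguished variable x
print(f"q(x) <- {body}")
```

Running the sketch prints the GCQ q(x) <- loves(x,Mary), matching the root of the tree in Figure 1.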

Lemma 1. (Expressing GCQs) For every question Q in GCQ-English, there exists a
GCQ q s.t. τ(Q) = q. Conversely, every GCQ q is the image by τ of some question Q in
GCQ-English.

Proof. (Sketch) We prove each implication separately:

(⇒) We need to show that for every Wh-question Q in GCQ-English there exists a
GCQ q of distinguished variable x and body Φ[x] s.t. τ(Q) = q(x) ← Φ[x]. Given
that the only recursive components in GCQ-English's grammar are verb phrases
(VPs) and nominals (Ns), this can be proved by an easy simultaneous induction on
Ns and VPs and by discarding all possible parse states where components do not
satisfy co-indexing, polarity and voice constraints. For Y/N-questions we reason
analogously.



(⇐) We will prove, by induction on the body Φ[x] of a non-boolean GCQ q of
distinguished variable x, that we can construct a question Q s.t. q is the image of
Q by τ. The result will then follow both for boolean and non-boolean GCQs. Recall
that Ns translate into unary predicates, TVs into binary predicates and Pns into
constants.

– (Basis) q(x) ← Φ[x] is the image of the question "which A_{i0} who is a A_{i1} who
  ... who is a A_{im} R_{j0}s himself and ... and R_{jm}s himself and R_{j0}s c and
  ... and R_{jm}s c and is R_{j0}d by c and ... and is R_{jm}d by c?".

– (Inductive step) q(x) ← Φ[x] is the image of the question "which Φ′[x] R_{j0}s
  and R_{j1}s and ... and R_{jm}s some A_{i0} who is a A_{i1} and who is a A_{i2} and
  ... and who is a A_{im} and who R_{j0}s him and ... and who R_{jm}s him and who
  Φ′′[y]?", by induction hypothesis on Φ′[x] and Φ′′[y].

Theorem 2. (Expressing QA) The QA problem for Lite-English and GCQ-English
falls in LOGSPACE w.r.t. data complexity.

Proof. It follows immediately from Theorem 1 and Lemma 1.

5 Expressing Aggregate Queries

The question we now need to answer is: how can we expand the coverage of our CL
without compromising the tractability of QA? In this section we propose to cover
graph-shaped aggregate queries, that is, GCQs augmented with (some of) the basic SQL
aggregation functions COUNT, MIN, MAX and SUM. These functions are defined on finite
subsets of Dom ∪ Q, i.e., on DB domains plus the linearly ordered set of rational
numbers, and take values in Q, that is, they compute a rational number. For the
purposes of the current paper, we will restrict our analysis to only two of them,
namely MAX and MIN, although the analysis can easily be generalized to cover all of
these functions.

Aggregates arise frequently in domains and systems containing numerical data, e.g.
geographical domains and systems. One of them, the GEOQUERY geography database
system, comes with a NL interface that supports NL questions expressing such functions
(Mooney, 2007). The corpus of these questions shows that user questions basically
conveyed either a CQ or a CQ with aggregation functions (see Table 3). Most importantly,

            CQs      Aggregations  Negation
Questions   34.54%   65.35%        0.11%

Table 3: Frequency of CQs in GEOQUERY.

answering CQs (and a fortiori GCQs) with aggregation functions over DL-Lite ontologies
is polynomial w.r.t. data complexity. So, what do these queries look like and what
kind of questions do we want to have in our CL? We would like to capture queries over
unary predicates computing a maximum like

(8) q(max(n)) ← height(n) ∧ odd(n)

with a CL Wh-question like

(9) Which is the greatest height that is odd?




(Phrase structure rule)      (Semantic action)
VP_i,j → COP NP_j            VP_i,j := COP(NP_j)

(Lexical rule)               (Value of τ on word and category)
Det → the greatest           λP.max(P) : (Q → t) → Q
Det → the smallest           λP.min(P) : (Q → t) → Q
Det → some                   λP.λQ.∃n(P(n) ∧ Q(n)) : (Q → t) → ((Q → t) → (Q → t))
Pro_i → something            λP.∃nP(n) : (Q → t) → t
Pro−_i → anything            λP.∃nP(n) : (Q → t) → t
Pro_i → it                   λP.P(n) : (Q → t) → t
Pro_i → itself               λP.P(n) : (Q → t) → t
Coord → and                  λP.λQ.∃n(P(n) ∧ Q(n)) : (Q → t) → ((Q → t) → (Q → t))
Relpro_i → that              λP.λn.P(n) : (Q → t) → (Q → t)
Intpro_i → which             λP.λn.P(n) : (Q → t) → (Q → t)
COP_i,j → is                 λn.λm.n ≈ m : Q → (Q → t)
NP_gap,i → ε                 λP.P(n) : (Q → t) → t
N_i → height, ...            λn.height(n) : Q → t, ...
Adj → odd, ...               λn.odd(n) : Q → t, ...

Table 4: Grammar rules for AGCQ-English.

Or queries computing a sum

(10) q(sum(n)) ← height(n) ∧ odd(n)

with the question

(11) Which is the sum of all heights that are odd?

Definition 4. (AGCQs) A graph-shaped conjunctive aggregate query (AGCQ) over a
relational schema R is a query of the form q(α(n)) ← Φ[n], where α ∈ {min, max}, n
is q's distinguished variable, a numerical variable, and Φ[n] is the body of a
non-boolean GCQ. Note that there are no boolean AGCQs.
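Concretely, an AGCQ such as q(max(n)) ← height(n) ∧ odd(n) from example (8) can be answered by first collecting the answers of the underlying GCQ and then applying the aggregation function to the resulting set. The facts below are hypothetical and the evaluation is a naive sketch; in particular, odd is treated as an ordinary unary predicate stored in the DB rather than a built-in.

```python
# Hypothetical numeric facts over the schema {height/1, odd/1}.
facts = {("height", 3), ("height", 8), ("height", 11),
         ("odd", 3), ("odd", 11)}

def agcq(alpha, body_preds):
    """Answer q(alpha(n)) <- body: collect the certain answers of the
    underlying GCQ q(n) <- body over the active domain, then apply the
    aggregation function alpha (min or max) to that set."""
    adom = {c for (_, c) in facts}
    values = {n for n in adom if all((p, n) in facts for p in body_preds)}
    return alpha(values)

print(agcq(max, ["height", "odd"]))   # the greatest height that is odd
print(agcq(min, ["height", "odd"]))   # the smallest height that is odd
```

On these facts the odd heights are {3, 11}, so the MAX query of (8) returns 11 and its MIN counterpart returns 3.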

5.1 Expressing Aggregate Queries with AGCQ-English

To express AGCQs in CL we extend GCQ-English into a new fragment of English
called AGCQ-English as follows. The aggregation functions min and max are conveyed,
in English, by, respectively, definite NPs like "the smallest N" and "the greatest N",
only this time they must denote not a set of properties but, instead, a numeric value.
The symbol N stands for a nominal component that denotes sets of numerical values. The
rest of the expression behaves in a manner similar to a determiner. We must thus start
by enriching our set of primitive λ-FOL types from {e, t} to {e, t, Q} and allow for
new determiners of type (Q → t) → Q.

Definition 5. (Aggregate Determiners) An aggregate determiner is any of the following: