14.03.2014 Views

Towards a Logical Description of Trees in Annotation Graphs - JLCL

Towards a Logical Description of Trees in Annotation Graphs - JLCL

Towards a Logical Description of Trees in Annotation Graphs - JLCL

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Jens Michaelis, Uwe Mönnich<br />

<strong>Towards</strong> a <strong>Logical</strong> <strong>Description</strong> <strong>of</strong> <strong>Trees</strong> <strong>in</strong> <strong>Annotation</strong> <strong>Graphs</strong><br />

The importance <strong>of</strong> annotation graphs <strong>in</strong>creases rapidly due to new developments<br />

<strong>in</strong> multi-media applications, text technology, and semantic web<br />

technologies. The paper provides a logical specification <strong>of</strong> trees <strong>in</strong> annotation<br />

graphs commonly used to code documents that are <strong>in</strong>herently structured on<br />

various levels.<br />

1 Introduction<br />

It is a matter <strong>of</strong> fact that a long history <strong>in</strong> artificial <strong>in</strong>telligence and computational l<strong>in</strong>guistics<br />

tries to develop tools to extract semantic knowledge from syntactic <strong>in</strong>formation.<br />

In particular, from a text technological po<strong>in</strong>t <strong>of</strong> view the general research perspective<br />

is to extract (semantic) <strong>in</strong>formation from annotated documents. Regard<strong>in</strong>g this aim,<br />

some <strong>of</strong> the relevant annotation models used are:<br />

• Multilayer annotations<br />

• Hyperl<strong>in</strong>ks<br />

• Discourse structure<br />

• (Classical) l<strong>in</strong>guistic description levels<br />

An important aspect concern<strong>in</strong>g annotation models is the development <strong>of</strong> a suitable<br />

logic system <strong>in</strong> order to specify the syntactic and semantic structures <strong>of</strong> such<br />

models. Due to the fact that heterogeneous subsystems must be <strong>in</strong>corporated, an<br />

amalgamation process can be assumed as the underly<strong>in</strong>g architecture. The challenges<br />

<strong>of</strong> such architectures are tw<strong>of</strong>old: first, the dynamic <strong>in</strong>teraction between syntactic and<br />

semantic <strong>in</strong>formation must be represented and second, the efficiency <strong>of</strong> algorithms must<br />

be guaranteed.<br />

Fortunately, <strong>in</strong> the case <strong>of</strong> annotation graphs, the object <strong>of</strong> these remarks, techniques<br />

from parameterized complexity theory can be exploited. This theory provides powerful<br />

tools for a detailed <strong>in</strong>vestigation <strong>of</strong> algorithmic problems. As it turns out, the concept<br />

<strong>of</strong> treewidth, <strong>in</strong>dicat<strong>in</strong>g the similarity <strong>of</strong> a graph or a relational structure with a tree, is<br />

a parameter which helps to show that many otherwise <strong>in</strong>tractable problems become<br />

computable <strong>in</strong> l<strong>in</strong>ear time when restricted to tree-like <strong>in</strong>puts.<br />

The key feature <strong>of</strong> annotation graphs is their abstraction from the diversity <strong>of</strong> concrete<br />

formats used for the transcription <strong>of</strong> text and speech. This feature makes them an<br />

ideal candidate for the comparison <strong>of</strong> different annotation systems, e.g., those currently<br />

developed by several l<strong>in</strong>guistic collaborative research centres <strong>in</strong> Germany.<br />

LDV-FORUM


<strong>Towards</strong> a <strong>Logical</strong> <strong>Description</strong> <strong>of</strong> <strong>Trees</strong> <strong>in</strong> <strong>Annotation</strong> <strong>Graphs</strong><br />

Translat<strong>in</strong>g these different representation schemes <strong>in</strong>to the framework <strong>of</strong> annotation<br />

graphs is a necessary prerequisite for the transfer <strong>of</strong> the pleasant computational properties<br />

<strong>of</strong> annotation graphs to the orig<strong>in</strong>al systems. This is particularly important <strong>in</strong> those<br />

circumstances where the natural data structure <strong>of</strong> the concrete markup system cannot<br />

be readily understood as describ<strong>in</strong>g trees, or at least, multi-rooted trees, taken <strong>in</strong>to<br />

account the possibilty <strong>of</strong> multiple annotation layers. Our ma<strong>in</strong> result can be <strong>in</strong>terpreted<br />

as specify<strong>in</strong>g the conditions under which algorithmic methods that were designed for<br />

trees can be applied to the realm <strong>of</strong> annotation systems that are apparently based on<br />

underly<strong>in</strong>g graph structures.<br />

A classical doma<strong>in</strong> for multilayered annotations is l<strong>in</strong>guistics. Utterances <strong>of</strong> speakers<br />

can be considered from different perspectives: examples are syntactic, semantic, discourse<br />

and <strong>in</strong>tonation aspects, just to mention some <strong>of</strong> them. Represent<strong>in</strong>g these types <strong>of</strong> data<br />

<strong>in</strong> one representation format yields overlapp<strong>in</strong>g hierarchies. Moreover, the result<strong>in</strong>g<br />

structures are naturally considered as graphs rather than trees.<br />

These challenges can be met by represent<strong>in</strong>g annotation graphs <strong>in</strong> logical form.<br />

Benefits <strong>of</strong> such a representation are the low descriptive complexity, the representability<br />

as open diagrams, or the abstract format for the <strong>in</strong>teraction <strong>of</strong> different l<strong>in</strong>guistic levels.<br />

Furthermore logical representations are amenable to the arsenal <strong>of</strong> techniques from<br />

logical graph theory.<br />

2 <strong>Annotation</strong> <strong>Graphs</strong><br />

We start this section by explicitly provid<strong>in</strong>g the correspond<strong>in</strong>g formal def<strong>in</strong>itions, first,<br />

and by look<strong>in</strong>g, <strong>in</strong> particular, at one example <strong>in</strong> more detail, afterwards.<br />

2.1 Def<strong>in</strong>itions and Examples<br />

The underly<strong>in</strong>g def<strong>in</strong>ition <strong>of</strong> an annotation graph is specified as follows:<br />

Def<strong>in</strong>ition 2.1 (Bird and Liberman 2001) An annotation graph (AG), G, over a<br />

label set L and a family <strong>of</strong> timel<strong>in</strong>es 〈 〈T i, ≤ i〉 〉 i∈I<br />

, I be<strong>in</strong>g some <strong>in</strong>dex set, 1 is a 3-tuple<br />

〈N, A, τ〉 consist<strong>in</strong>g <strong>of</strong> a node set N, a collection <strong>of</strong> labeled arcs A ⊆ N × N × L, and a<br />

partial time function τ : N ⇀ ⋃ Ti, which satisfies the follow<strong>in</strong>g two conditions:<br />

i∈I<br />

(i) 〈N, A〉 is a labeled acyclic digraph conta<strong>in</strong><strong>in</strong>g no nodes <strong>of</strong> degree zero, and<br />

(ii) for any path from node n 1 to n 2 <strong>in</strong> A, if τ(n 1) and τ(n 2) are def<strong>in</strong>ed, then there<br />

is a timel<strong>in</strong>e 〈T i, ≤ i〉 such that τ(n 1), τ(n 2) ∈ T i, and such that τ(n 1) ≤ i τ(n 2).<br />

Remarks. AGs may be disconnected or empty, and they must not have orphan nodes. It<br />

follows from the def<strong>in</strong>ition that every piece <strong>of</strong> connected annotation structure can refer<br />

to at least one timel<strong>in</strong>e. In Figure 1, an AG from the l<strong>in</strong>guistics doma<strong>in</strong> is depicted.<br />

1 Thus, for each i ∈ I, timel<strong>in</strong>e 〈T i, ≤ i〉 consists <strong>of</strong> a nonempty set T i and a total order ≤ i on T i.<br />

Band 22 (2) – 2007 69


Michaelis, Mönnich<br />

31<br />

994.19<br />

speaker/B<br />

W/yeah<br />

32<br />

994.46<br />

speaker/A<br />

11<br />

991.75<br />

12<br />

W/he<br />

13<br />

W/said<br />

14<br />

W/,<br />

15 16<br />

W/and<br />

W/%um<br />

17<br />

994.65<br />

speaker/B<br />

33<br />

996.51<br />

W/whatever’s<br />

34<br />

W/helpful<br />

35<br />

997.61<br />

speaker/A<br />

speaker/A<br />

18<br />

995.21<br />

W/he<br />

19 20<br />

W/%um<br />

996.59<br />

21<br />

997.40<br />

W/right<br />

22 23<br />

W/.<br />

W/so<br />

24<br />

25<br />

1002.55<br />

Figure 1: An example <strong>of</strong> an anchored annotation graph (cf. Bird and Liberman 2001).<br />

Def<strong>in</strong>ition 2.2 (Bird and Liberman 2001) An anchored AG is an AG, G, def<strong>in</strong>ed<br />

as <strong>in</strong> Def<strong>in</strong>ition 2.1 and additionally satisfy<strong>in</strong>g the follow<strong>in</strong>g condition:<br />

(iii) if any node n does not have both <strong>in</strong>com<strong>in</strong>g and outgo<strong>in</strong>g arcs then τ : n ↦→ t for<br />

some time t.<br />

Remarks. For anchored annotation graphs, it follows from the def<strong>in</strong>ition that every<br />

node has two bound<strong>in</strong>g times, and timel<strong>in</strong>es partition the node set. The AGs depicted<br />

<strong>in</strong> Figure 1 are anchored.<br />

Def<strong>in</strong>ition 2.3 (Bird and Liberman 2001) A totally anchored AG is an anchored<br />

AG, G, def<strong>in</strong>ed as <strong>in</strong> Def<strong>in</strong>ition 2.2 such that the time function τ : N ⇀ ⋃ Ti is total.<br />

i∈I<br />

Def<strong>in</strong>ition 2.4 A totally anchored AG, G, def<strong>in</strong>ed as <strong>in</strong> Def<strong>in</strong>ition 2.3 is time-cross<strong>in</strong>g<br />

arc free if for all 〈p, q, l〉, 〈r, s, m〉 ∈ A such that τ(p), τ(r) ∈ T i for some i, and such<br />

that τ(p) ≤ i τ(r) then either τ(q) ≤ i τ(r) or τ(s) ≤ i τ(q) holds.<br />

In particular, the EXMARaLDA annotation tool, developed by the l<strong>in</strong>guistic collaborative<br />

research centre <strong>in</strong> Hamburg, can be seen as strongly rely<strong>in</strong>g on the formal concept<br />

<strong>of</strong> AGs (cf. Schmidt 2005). In fact, not us<strong>in</strong>g the full range <strong>of</strong> possibilities provided by<br />

the general AG-def<strong>in</strong>ition given above, the core model underly<strong>in</strong>g an EXMARaLDA<br />

basic transcription provides a so-called s<strong>in</strong>gle timel<strong>in</strong>e, multiple tiers (STMT) model,<br />

which <strong>in</strong> strict AG-terms can be understood as be<strong>in</strong>g a totally anchored AG consist<strong>in</strong>g<br />

<strong>of</strong> exactly one timel<strong>in</strong>e (cf. Figure 2). In l<strong>in</strong>e with the assumption <strong>of</strong> Bird and Liberman<br />

(2001, p. 12f) that reference to a s<strong>in</strong>gle timel<strong>in</strong>e implies that nodes with the same time<br />

reference should be considered to be identical, the AG depicted <strong>in</strong> Figure 2 can formally<br />

be specified as <strong>in</strong> the next example.<br />

Example 2.5 For T ex = {0, 1, 2, 3, 4, 5} and ≤ ex=≤ N ↾ T ex × T ex consider the timel<strong>in</strong>e<br />

〈T ex, ≤ ex〉. 2 Then for 〈T ex, ≤ ex〉 and the label set L ex implicitly specified via the<br />

2 N denotes the set <strong>of</strong> all non-negative <strong>in</strong>tegers, <strong>in</strong>clud<strong>in</strong>g 0. ≤ N is the canonical order on N, and for<br />

each set M ⊆ N, ≤ N ↾ M × M is the restriction <strong>of</strong> ≤ N to M × M.<br />

70 LDV-FORUM


<strong>Towards</strong> a <strong>Logical</strong> <strong>Description</strong> <strong>of</strong> <strong>Trees</strong> <strong>in</strong> <strong>Annotation</strong> <strong>Graphs</strong><br />

Figure 2: Example <strong>of</strong> an STMT model (cf. Schmidt 2005).<br />

def<strong>in</strong>ition <strong>of</strong> the arc collection A ex, let G ex be the totally anchored AG 〈N ex, A ex, τ ex〉<br />

over 〈T ex, ≤ ex〉 (as the s<strong>in</strong>gle timel<strong>in</strong>e) and L ex, where N ex is the set {0, 1, 2, 3, 4, 5},<br />

τ ex is the identity function on {0, 1, 2, 3, 4, 5} and A ex consists <strong>of</strong> the follow<strong>in</strong>g directed<br />

labeled edges: 3<br />

a (1)<br />

1 = 〈1 , 3 , faster〉 ,<br />

a (2)<br />

1 = 〈0 , 1 , Okay.〉 ,<br />

a (2)<br />

2 = 〈1 , 2 , Très bien,〉 ,<br />

a (2)<br />

3 = 〈2 , 3 , Très bien.〉 ,<br />

a (3)<br />

1 = 〈0 , 1 , Okay.〉 ,<br />

a (3)<br />

2 = 〈1 , 3 , Very good, very good.〉 ,<br />

a (4)<br />

1 = 〈2 , 4 , right hand hand raised〉 ,<br />

a (5)<br />

1 = 〈2 , 3 , Alors ça〉 ,<br />

a (5)<br />

2 = 〈3 , 4 , dépend ((cough))〉 ,<br />

a (5)<br />

3 = 〈4 , 5 , un petit peu.〉 ,<br />

a (6)<br />

1 = 〈2 , 5 , That depends, then, a little bit〉 ,<br />

a (7)<br />

1 = 〈4 , 5 , εtipø:〉 .<br />

3 If the collection <strong>of</strong> arcs is, <strong>in</strong> fact, “simply” a set then a (2)<br />

1 and a (3)<br />

1 are identical. S<strong>in</strong>ce the <strong>in</strong>tention<br />

here is actually that they are not identical, we assume them to be dist<strong>in</strong>guishable either by some<br />

(at least technically) different label<strong>in</strong>g or by treat<strong>in</strong>g the collection <strong>of</strong> arcs as a multiset.<br />

Band 22 (2) – 2007 71


Michaelis, Mönnich<br />

2.2 <strong>Logical</strong> Properties<br />

In order to specify some logical properties, we need some additional concepts. First, we<br />

def<strong>in</strong>e a tree-decomposition <strong>of</strong> a graph.<br />

Def<strong>in</strong>ition 2.6 (Robertson and Seymour 1986) A tree-decomposition <strong>of</strong> a graph<br />

G = 〈V, E〉 is a tree T = 〈N, F 〉 together with a collection <strong>of</strong> subsets {T u | u ∈ N} <strong>of</strong> V<br />

such that ⋃ Tu = V and the follow<strong>in</strong>g properties hold:4<br />

u∈N<br />

1. For every edge e = 〈a, b〉 <strong>of</strong> G there is a u ∈ N such that {a, b} ⊆ T u<br />

2. For all u, v, w ∈ N, if w lies on a path from u to v <strong>in</strong> T then T u ∩ T v ⊆ T w<br />

The width <strong>of</strong> a tree-decomposition is equal to max u∈T {|T u| − 1}<br />

Note that a consequence <strong>of</strong> condition 2 from the above def<strong>in</strong>ition could be formulated<br />

as 2’, namely, for all u ∈ V , 〈{v ∈ N | u ∈ T v}, F ∩({v ∈ N | u ∈ T v}×{v ∈ N | u ∈ T v}〉<br />

is a tree. The important concept <strong>of</strong> treewidth, def<strong>in</strong>ed next, is an <strong>in</strong>dicator for the<br />

tree-likeness <strong>of</strong> a given graph G.<br />

Def<strong>in</strong>ition 2.7 (Robertson and Seymour 1986) The treewidth <strong>of</strong> a graph G is the<br />

m<strong>in</strong>imum value <strong>of</strong> k such that G has a tree-decomposition <strong>of</strong> width k.<br />

The treewidth <strong>of</strong> a class <strong>of</strong> graphs C is naturally def<strong>in</strong>ed as the smallest number k<br />

such that for all graphs G <strong>in</strong> C, their treewidth is smaller or equal to k. <strong>Annotation</strong><br />

graphs have unbounded treewidth s<strong>in</strong>ce arbitrary large grids can be considered as<br />

annotation graphs. In practical applications, though, one is ma<strong>in</strong>ly concerned with<br />

f<strong>in</strong>ite families <strong>of</strong> documents represented as annotation graphs. Trivially, these families<br />

are <strong>of</strong> bounded treewidth. In the follow<strong>in</strong>g we therefore restrict our attention to such<br />

f<strong>in</strong>ite families <strong>of</strong> annotation graphs.<br />

Fact 2.8 F<strong>in</strong>ite families <strong>of</strong> anchored annotation graphs are <strong>of</strong> bounded treewidth.<br />

Monadic Second-Order Logic (MSO) is a subset <strong>of</strong> second-order logic. MSO extends<br />

first-order logic by allow<strong>in</strong>g quantification over subsets <strong>of</strong> the universe <strong>of</strong> discourse.<br />

In other words quantification over second-order variables with at most one argument<br />

position is allowed.<br />

<strong>Annotation</strong> graphs can be conceived as f<strong>in</strong>ite relational structures with the set <strong>of</strong><br />

nodes understood as universe <strong>of</strong> discourse, the family <strong>of</strong> arcs as b<strong>in</strong>ary relations and<br />

the times as monadic predicates. A compact representation can be given <strong>in</strong> terms <strong>of</strong> an<br />

open diagram (cf. Prolog facts).<br />

Theorem 2.9 (Courcelle) Every property expressible <strong>in</strong> MSO is verifiable <strong>in</strong> l<strong>in</strong>ear<br />

time on graphs <strong>of</strong> bounded treewidth.<br />

4 Here, a tree is taken to be a connected acyclic graph. Later we will restrict our attention to the<br />

concept <strong>of</strong> a f<strong>in</strong>ite ordered tree as it underlies our Def<strong>in</strong>ition 3.2 <strong>of</strong> a f<strong>in</strong>ite labeled tree.<br />

72 LDV-FORUM


<strong>Towards</strong> a <strong>Logical</strong> <strong>Description</strong> <strong>of</strong> <strong>Trees</strong> <strong>in</strong> <strong>Annotation</strong> <strong>Graphs</strong><br />

Courcelle’s result is conta<strong>in</strong>ed <strong>in</strong> Courcelle 1990. The pr<strong>in</strong>cipal tool <strong>in</strong> the pro<strong>of</strong> <strong>of</strong><br />

l<strong>in</strong>ear time decidability is the annotation <strong>of</strong> tree-like graph structures with logical types<br />

<strong>of</strong> bounded quantifier rank. These types assume the role <strong>of</strong> states <strong>in</strong> a bottom-up tree<br />

automaton.<br />

Let MSO 2 designate the extended monadic second-order logic with quantification over<br />

sets <strong>of</strong> nodes and sets <strong>of</strong> edges.<br />

Theorem 2.10 (Seese 1991) If a set <strong>of</strong> graphs G has a decidable MSO 2 theory then<br />

it is the subset <strong>of</strong> the (homomorphic images <strong>of</strong>) a recognizable set <strong>of</strong> trees, i.e. it is<br />

tree-def<strong>in</strong>able <strong>in</strong> the sense <strong>of</strong> Courcelle (2006).<br />

Remark. <strong>Trees</strong> are constructed from a f<strong>in</strong>ite set F <strong>of</strong> graph operations. Typical examples<br />

are disjo<strong>in</strong>t union, relabel<strong>in</strong>g <strong>of</strong> edges, addition <strong>of</strong> edges. Seese’s theorem is an outgrowth<br />

<strong>of</strong> a comb<strong>in</strong>ation <strong>of</strong> techniques from MSO-def<strong>in</strong>able graph transductions and from the<br />

fundamental work <strong>of</strong> Robertson and Seymour on graph m<strong>in</strong>ors.<br />

The representation <strong>of</strong> trees with<strong>in</strong> the AG-framework is possible <strong>in</strong> terms <strong>of</strong> so-called<br />

chart constructions, where each tree node is mapped to an AG-arc, as outl<strong>in</strong>ed <strong>in</strong> the<br />

follow<strong>in</strong>g easy example (cf. Cotton and Bird 2000):<br />

A <br />

<br />

B<br />

C<br />

The representation <strong>of</strong> trees <strong>in</strong> MSO 2-terms, rely<strong>in</strong>g on an order ≤, can be given via<br />

the notion <strong>of</strong> a match<strong>in</strong>g relation. A set <strong>of</strong> edges M is called a match<strong>in</strong>g relation if the<br />

follow<strong>in</strong>g conditions are satisfied:<br />

• e ∈ M ∧ <strong>in</strong>c(e, u, v) → u ≤ v (M is compatible with ≤)<br />

• e ∈ M ∧ e ′ ∈ M ∧ <strong>in</strong>c(e, u, v) ∧ <strong>in</strong>c(e ′ , w, z) ∧ u ≤ w ≤ v → u ≤ z ≤ v<br />

(M is non-cross<strong>in</strong>g)<br />

where <strong>in</strong>c denotes the usual <strong>in</strong>cidence relation.<br />

Recall from the previous section the notion <strong>of</strong> a totally anchored AG that is time-cross<strong>in</strong>g<br />

arc free. In case such an AG is anchored w.r.t. a s<strong>in</strong>gle timel<strong>in</strong>e, that notion is a<br />

particular <strong>in</strong>stance <strong>of</strong> a match<strong>in</strong>g relation.<br />

The present section has served to emphasize the advantages that come with a logical<br />

description <strong>of</strong> graphs <strong>of</strong> bounded treewidth. The fact that a set <strong>of</strong> f<strong>in</strong>ite graphs is<br />

<strong>of</strong> bounded treewidth does not lead automatically to a representation <strong>of</strong> this set as<br />

a family <strong>of</strong> trees beyond the obvious tree decomposition as def<strong>in</strong>ed <strong>in</strong> Def<strong>in</strong>ition 2.6.<br />

Restrict<strong>in</strong>g the attention to the family <strong>of</strong> annotation graphs that satisfy the comb<strong>in</strong>ed<br />

conditions characteristic <strong>of</strong> the s<strong>in</strong>gle timel<strong>in</strong>e, multiple tiers model, it can <strong>in</strong>deed be<br />

shown that these graphs allow a presentation <strong>in</strong> the form <strong>of</strong> multi-rooted trees. The<br />

next section is devoted to an elaboration <strong>of</strong> this claim.<br />

·<br />

B<br />

A<br />

·<br />

C<br />

·<br />

Band 22 (2) – 2007 73


Michaelis, Mönnich<br />

3 Multilayer <strong>Annotation</strong> and STMT models<br />

3.1 Multi-rooted <strong>Trees</strong><br />

S<strong>in</strong>ce our f<strong>in</strong>al aim will be to formally reconstruct an STMT model as a so-called<br />

multi-rooted tree, we here give some further explicit def<strong>in</strong>itions sett<strong>in</strong>g the stage.<br />

Def<strong>in</strong>ition 3.1 A tree doma<strong>in</strong> is a nonempty set Dτ ⊆ N ∗ such that for all χ ∈ N ∗<br />

and i ∈ N it holds that χ ∈ Dτ if χχ ′ ∈ Dτ for some χ ′ ∈ N ∗ , and χi ∈ Dτ if χj ∈ Dτ<br />

for some j ∈ N with i < j. 5<br />

Def<strong>in</strong>ition 3.2 A f<strong>in</strong>ite labeled tree, τ, is a quadruple <strong>of</strong> the form 〈Nτ , ⊳ ∗ τ , ≺τ , labelτ 〉<br />

where the triple 〈Nτ , ⊳ ∗ τ , ≺τ 〉 is a f<strong>in</strong>ite (ordered) tree def<strong>in</strong>ed <strong>in</strong> the usual sense, 6 i.e.<br />

up to an isomorphism 〈Nτ , ⊳ ∗ τ , ≺τ 〉 is the natural (tree) <strong>in</strong>terpretation <strong>of</strong> some tree<br />

doma<strong>in</strong> Dτ , 7 and where labelτ is the label<strong>in</strong>g function (<strong>of</strong> τ), i.e. a function from Nτ<br />

<strong>in</strong>to some set <strong>of</strong> labels, Lτ .<br />

Note that, if 〈Nτ , ⊳ ∗ τ , ≺τ , labelτ 〉, is a f<strong>in</strong>ite labeled tree, a tree doma<strong>in</strong>, Dτ , whose<br />

natural (tree) <strong>in</strong>terpretation is isomorphic to 〈Nτ , ⊳ ∗ τ , ≺τ 〉 is uniquely determ<strong>in</strong>ed. Note<br />

further that Def<strong>in</strong>ition 3.2 demands 〈Nτ , ⊳ ∗ τ , ≺τ 〉 to be "only" isomorphic to the natural<br />

(tree) <strong>in</strong>terpretation <strong>of</strong> some tree doma<strong>in</strong>, Dτ , and not to be necessarily a tree doma<strong>in</strong><br />

itself. Such a def<strong>in</strong>ition allows us to have two f<strong>in</strong>ite labeled trees with disjo<strong>in</strong>t set<br />

<strong>of</strong> nodes, which, <strong>of</strong> course, is not the case for the correspond<strong>in</strong>g tree doma<strong>in</strong>s whose<br />

natural (tree) <strong>in</strong>terpretations are isomorphic to the underly<strong>in</strong>g (non-labeled) trees. This<br />

possibility is exploited with<strong>in</strong> the next def<strong>in</strong>ition.<br />

Def<strong>in</strong>ition 3.3 For any f<strong>in</strong>ite str<strong>in</strong>g α ∈ Σ ∗ for some f<strong>in</strong>ite alphabet Σ, a multi-rooted<br />

tree (over α) is a f<strong>in</strong>ite tuple <strong>of</strong> the form 〈τ r〉 r


<strong>Towards</strong> a <strong>Logical</strong> <strong>Description</strong> <strong>of</strong> <strong>Trees</strong> <strong>in</strong> <strong>Annotation</strong> <strong>Graphs</strong><br />

the considerations <strong>of</strong> this subsection concern the aim <strong>of</strong> f<strong>in</strong>d<strong>in</strong>g a partition <strong>of</strong> G <strong>in</strong>to<br />

subgraphs each <strong>of</strong> which be<strong>in</strong>g without time-cross<strong>in</strong>g arcs <strong>in</strong> the sense <strong>of</strong> Def<strong>in</strong>ition<br />

2.4. This aim is motivated by the observation that time-cross<strong>in</strong>g <strong>of</strong> two edges is usually<br />

caused by <strong>in</strong>terference <strong>of</strong> two different layers <strong>of</strong> annotation, while s<strong>in</strong>gle layers consist<br />

<strong>of</strong> edges <strong>in</strong> match<strong>in</strong>g form, i.e. time-cross<strong>in</strong>g arc free. A correspond<strong>in</strong>g partition is<br />

not necessarily unique and there are, <strong>of</strong> course, several possible algorithms <strong>in</strong> order<br />

to calculate a particular <strong>in</strong>stance <strong>of</strong> such a partition. Heuristics may help to decide<br />

which algorithm is to be preferred <strong>in</strong> terms <strong>of</strong> its computational properties. As an<br />

example, we here present an algorithm which can be seen as be<strong>in</strong>g a representative<br />

<strong>of</strong> a “left-to-right, top-down” strategy (cf. Figure 3). The perspective motivat<strong>in</strong>g this<br />

“tree-travers<strong>in</strong>g” metaphor results from our f<strong>in</strong>al aim to reconstruct the given AG as<br />

a multi-rooted tree by us<strong>in</strong>g the correspond<strong>in</strong>g partition. In the next section we will<br />

show, that such a reconstruction can be done “on the fly,” when calculat<strong>in</strong>g the partition<br />

accord<strong>in</strong>g to the presented algorithm.<br />

For the rest <strong>of</strong> this section let G = 〈N, A, τ〉 be a totally anchored AG over a label set<br />

L and a s<strong>in</strong>gle timel<strong>in</strong>e 〈T, ≤ T 〉 such that every two nodes with the same time reference<br />

are identical. W.l.o.g. we may assume T is <strong>of</strong> the form {0, 1, . . . , K − 1} for some K ∈ N<br />

such that τ is surjective, i.e., <strong>in</strong> particular the order on T is <strong>in</strong>duced by the canonical<br />

order on N. S<strong>in</strong>ce we are concerned with a s<strong>in</strong>gle timel<strong>in</strong>e and a totally anchored graph,<br />

we can also identify the set <strong>of</strong> vertices, N, with T . Thus, the card<strong>in</strong>ality <strong>of</strong> N is K.<br />

We construct the “upper part” <strong>of</strong> the correspond<strong>in</strong>g adjacency matrix, 〈A p,q〉 0≤p 1 if N is nonempty.<br />

Band 22 (2) – 2007 75


Michaelis, Mönnich<br />

For 0 ≤ p < q < K, explore A p,q is def<strong>in</strong>ed as follows:<br />

explore A p,q<br />

1 if A p,q ≠ ∅ then<br />

2 choose a ∈ A p,q<br />

3 A r ← A r ∪ {a} % add a to partition set A r<br />

4 A p,q ← A p,q \ {a} % remove a from set A p,q<br />

5 fi<br />

6 i ← 0<br />

7 j ← 1<br />

8 while p + i < q − j and A p+i,q−j = ∅ repeat % search<strong>in</strong>g a leftmost “child”<br />

9 if p + i < q − j − 1 then % explor<strong>in</strong>g “top-down” <strong>in</strong>terval [p+i,q-j]<br />

10 j ← j + 1 % reduc<strong>in</strong>g right <strong>in</strong>terval value by 1<br />

11 else<br />

12 i ← i + 1 % <strong>in</strong>creas<strong>in</strong>g left <strong>in</strong>terval value by 1<br />

13 j ← 0 % right <strong>in</strong>terval value back to q<br />

14 fi<br />

15 i p,q ← i<br />

16 j p,q ← j<br />

17 if p + i p,q < q − j p,q then % there is no “child” at all iff p+i p,q<br />

=q-j p,q<br />

18 explore A p+ip,q,q−jp,q % explor<strong>in</strong>g leftmost “child” —- nonempt<strong>in</strong>ess<br />

<strong>of</strong> A p+ip,q ,q-j p,q<br />

is guaranteed by WHILE-loop<br />

19 if j p,q > 0 then<br />

20 explore A q−jp,q,q % search<strong>in</strong>g for right “sibl<strong>in</strong>g” <strong>of</strong> leftmost<br />

“child”<br />

21 fi<br />

22 fi<br />

Fact 3.4 For p, q < K with p < q, i p,q and j p,q (depend<strong>in</strong>g on i p,q) are m<strong>in</strong>imal, i.e.,<br />

A p+i,q−j = ∅ for all j < q − (p + i) <strong>in</strong> case i < i p,q, and for all j < j p,q <strong>in</strong> case i = i p,q.<br />

Remark. Note that for p, q < K with p < q, the while-loop for potentially f<strong>in</strong>d<strong>in</strong>g<br />

m<strong>in</strong>imal i and m<strong>in</strong>imal j, depend<strong>in</strong>g on i, with<strong>in</strong> the subprocedure explore A p,q is a<br />

“left-to-right, top-down” search along the “rows” left-to and below matrix entry A p,q (cf.<br />

Figure 3). In particular, we have A p+i,q−j = ∅ if i < i p,q and q−(p+i p,q) ≤ j < q−(p+i).<br />

Hence explore A p,p+ip,q would yield no contribution to the partition set A r, if it were<br />

part <strong>of</strong> the subprocedure explore A p,q.<br />

Proposition 3.5 The time complexity <strong>of</strong> partition A is <strong>in</strong> O(K 2 m), m be<strong>in</strong>g the<br />

card<strong>in</strong>ality <strong>of</strong> A.<br />

Proposition 3.6 Let k ∈ N. Then for K = 2k, the label set L = {l 0, l 1, . . . , l k−1 } and<br />

the arc set A = {〈p, p + k, l p〉 | 0 ≤ p < k} the time complexity <strong>of</strong> partition A is at<br />

least <strong>in</strong> O(K 2 m) (as before, m be<strong>in</strong>g the card<strong>in</strong>ality <strong>of</strong> A, i.e., m = k <strong>in</strong> this case).<br />

76 LDV-FORUM


<strong>Towards</strong> a <strong>Logical</strong> <strong>Description</strong> <strong>of</strong> <strong>Trees</strong> <strong>in</strong> <strong>Annotation</strong> <strong>Graphs</strong><br />

“y-axis”<br />

∧<br />

p + 1<br />

q<br />

∧<br />

A p,q<br />

p<br />

∧<br />

∧<br />

∧<br />

∧<br />

“x-axis”<br />

∧<br />

∧<br />

∧<br />

q − 1<br />

∧<br />

Figure 3: Potential search space for i and j with<strong>in</strong> WHILE-loop <strong>of</strong> explore A p,q.<br />

As mentioned above, <strong>in</strong> order to f<strong>in</strong>d an appropriate partition <strong>of</strong> A, the algorithm<br />

given above only represents one <strong>of</strong> several more general possibilities depend<strong>in</strong>g on<br />

the strategy pursued. In fact, apply<strong>in</strong>g the algorithm to the example above yields a<br />

partition <strong>in</strong>to three arc sets which might “contra-<strong>in</strong>tuitively” partition the second, third<br />

and fifth layer, depend<strong>in</strong>g on which arc is chosen first from the arc subsets A 0,1, A 1,3,<br />

A 2,3 and A 4,5 accord<strong>in</strong>g to l<strong>in</strong>e 2 <strong>of</strong> the subprocedure explore. If the second layer is<br />

not partitioned, the fifth will be, and vice versa. This is due to the fact that both a (2)<br />

2<br />

and a (5)<br />

2 def<strong>in</strong>itely belong to the set A 0, while a (2)<br />

3 and a (5)<br />

1 never will do so at the same<br />

time.<br />

Example 3.7 (cont<strong>in</strong>ued) Apply<strong>in</strong>g partition to A ex yields a partition <strong>in</strong>to three<br />

sets, A 0, A 1 and A 2, <strong>of</strong> the form<br />

A 0 = {x 0 , x 1 , a (2)<br />

2 , x2 , a(5) 2 , x3} , A1 = {x′ 0 , x ′ 1 , x ′ 2 , x ′ 3} and A 2 = {a (6)<br />

1 , a(4) 1 }<br />

with x i, x ′ i ∈ X i such that x i ≠ x ′ i for 0 ≤ i ≤ 3, where<br />

X 0 = {a (2)<br />

1 , a(3) 1<br />

} , X1 = {a(1)<br />

1 , a(3) 2<br />

} , X2 = {a(2)<br />

3 , a(5) 1<br />

} and X3 = {a(5)<br />

3 , a(7)<br />

Note that the situation <strong>of</strong> “algorithmic arbitrar<strong>in</strong>ess” immediately changes if we have<br />

further access to “external” <strong>in</strong>formation which can be used to guide the selection from<br />

1 }.<br />

Band 22 (2) – 2007 77


Michaelis, Mönnich<br />

an arc subset A p,q, like hierarchical structure and ontological <strong>in</strong>formation <strong>in</strong> the sense<br />

<strong>of</strong> Bird and Liberman (2001, Section 3.2f), e.g., <strong>in</strong> terms <strong>of</strong> type and speaker as <strong>in</strong><br />

an EXMARaLDA Basic Transcription Model itself based on a pure STMT model (cf.<br />

Schmidt 2005 and Figure 4 here).<br />

Figure 4: Example <strong>of</strong> an EXMARaLDA Basic Transcription Model (cf. Schmidt 2005).<br />

3.3 Build<strong>in</strong>g a multi-rooted tree from an STMT model - “Revers<strong>in</strong>g the chart construction”<br />

Consistently tak<strong>in</strong>g over all formal prerequisites, we, <strong>in</strong> particular, let G = 〈N, A, τ〉<br />

be the totally anchored AG over a label set L and a s<strong>in</strong>gle timel<strong>in</strong>e 〈T, ≤ T 〉 from the<br />

previous section. In order to build—on the fly—a multi-rooted tree, MultTree A, from<br />

Part A, the partition <strong>of</strong> A, created by apply<strong>in</strong>g partition to A, we slightly adapt and<br />

extend the algorithm. 9<br />

9 Modified l<strong>in</strong>es are <strong>in</strong>dicated by a prime follow<strong>in</strong>g the l<strong>in</strong>e number (x’). Additional l<strong>in</strong>es are <strong>in</strong>dicated<br />

by a subsequent l<strong>in</strong>e number count<strong>in</strong>g (x.1, x.2 etc.).<br />

78 LDV-FORUM


<strong>Towards</strong> a <strong>Logical</strong> <strong>Description</strong> <strong>of</strong> <strong>Trees</strong> <strong>in</strong> <strong>Annotation</strong> <strong>Graphs</strong><br />

partition A<br />

% <strong>in</strong>clud<strong>in</strong>g the construction <strong>of</strong> a multi-rooted tree, MultTree A, from Part A<br />

1 r ← 0<br />

2’ construct A r, T r % construct first partition set and<br />

correspond<strong>in</strong>g tree<br />

3 Part A ← 〈A r〉 % <strong>in</strong>itialize list <strong>of</strong> partition sets<br />

3.1 MultTree A ← 〈T r〉 % <strong>in</strong>itialize multi-rooted tree (list)<br />

4 until A r = ∅ repeat<br />

5 r ← r + 1<br />

6 construct A r, T r % construct next (potential) partition<br />

set and correspond<strong>in</strong>g tree<br />

7 if A r ≠ ∅ then<br />

8 Part A ← Part A · 〈A r〉 % append next partition set to list<br />

8.1 MultTree A ← MultTree A · 〈T r〉 % append next tree to multi-rooted<br />

tree (list)<br />

9 fi<br />

10 return Part A<br />

10.1 return MultTree A<br />

The subprocedure construct A r, T r is “just” an extension <strong>of</strong> construct A r.<br />

construct A r, T r<br />

1 A r ← ∅<br />

1.1 T r ← ∅<br />

2 if K > 0 then 10<br />

2.1 χ 0,K−1 ← ɛ % def<strong>in</strong>e (essential part <strong>of</strong>) root address <strong>of</strong> T r<br />

2.2 k 0,K−1 ← ɛ % dummy for l<strong>in</strong>e 4.1 <strong>of</strong> procedure explore A 0,K-1<br />

2.3 real root 0,K−1 ← false % potentially no arc from node 0 to node K-1<br />

3 explore A 0,K-1<br />

4 fi<br />

For 0 ≤ p < q < K, the modified subprocedure explore A p,q is now given by<br />

10 Recall that, because <strong>of</strong> (i) <strong>of</strong> Def<strong>in</strong>ition 2.1, K > 1 if N is nonempty.<br />

Band 22 (2) – 2007 79


Michaelis, Mönnich<br />

explore A p,q<br />

1 if A p,q ≠ ∅ then<br />

2 choose a ∈ A p,q<br />

3 A r ← A r ∪ {a} % add a to partition set A r<br />

4 A p,q ← A p,q \ {a} % remove a from set A p,q<br />

4.1 χ p,q ← χ p,q · k p,q % def<strong>in</strong>e (essential part <strong>of</strong>) new<br />

node address for T r<br />

4.2 T r ← T r ∪ {〈〈χ p,q, r〉, label(a)〉} % add to T r a correspond<strong>in</strong>g new<br />

node labeled by label(a) 11<br />

4.3 real root p,q ← true % no dummy node, cf. l<strong>in</strong>e 4.10, 11.1<br />

4.4 k ← 0 % potential next daughter to f<strong>in</strong>d<br />

is leftmost child<br />

4.5 if p + 1 = q then<br />

4.6 T r ← T r ∪ {〈〈χ p,q · k, r〉, 〈p, p + 1〉〉} % yield <strong>of</strong> T r comprises (arc cover<strong>in</strong>g)<br />

<strong>in</strong>terval [p,p+1]<br />

4.7 fi<br />

4.8 else<br />

4.9 if p = 0 and q = K − 1then<br />

4.10 T r ← T r ∪ {〈〈ɛ, r〉, dummy-label r 〉} % dummy root cover<strong>in</strong>g potential nontime-cross<strong>in</strong>g,<br />

non-<strong>in</strong>clusive arcs<br />

(cf. T 0 and T 1 from Example 3.9)<br />

4.11 k ← 0<br />

4.12 else<br />

4.13 k ← k p,q<br />

4.14 fi<br />

5 fi<br />

6 i ← 0<br />

7 j ← 1<br />

11 For each arc a = 〈p, q, l〉 ∈ A for some p, q ∈ N and l ∈ L, we take label(a) to denote its label l.<br />

80 LDV-FORUM


<strong>Towards</strong> a <strong>Logical</strong> <strong>Description</strong> <strong>of</strong> <strong>Trees</strong> <strong>in</strong> <strong>Annotation</strong> <strong>Graphs</strong><br />

8 while p + i < q − j and A p+i,q−j = ∅ repeat % search<strong>in</strong>g leftmost “child”<br />

9 if p + i < q − j − 1 then<br />

10 j ← j + 1<br />

11 else<br />

11.1 if real root p,q = true then<br />

11.2 T r ← T r ∪ {〈〈χ p,q · k, r〉, 〈p + i, p + i + 1〉〉} % yield <strong>of</strong> T r comprises (arc<br />

cover<strong>in</strong>g) <strong>in</strong>terval [p+i,p+i+1]<br />

11.3 k ← k + 1<br />

11.4 fi<br />

12 i ← i + 1<br />

13 j ← 0<br />

14 fi<br />

15 i p,q ← i<br />

16 j p,q ← j<br />

17 if p + i p,q < q − j p,q then<br />

17.1 χ p+ip,q,q−jp,q ← χ p,q<br />

17.2 k p+ip,q,q−jp,q ← k<br />

17.3 real root p+ip,q,q−jp,q ← real root p,q<br />

18 explore A p+ip,q,q−jp,q<br />

19 if j p,q > 0 then<br />

19.1 χ p+ip,q,q−jp,q ← χ p,q<br />

19.2 k q−jp,q,q ← k p+ip,q,q−jp,q + 1<br />

19.3 real root q−jp,q,q ← real root p,q<br />

20 explore A q−jp,q,q % search<strong>in</strong>g for right “sibl<strong>in</strong>g” <strong>of</strong><br />

leftmost “child”<br />

21 fi<br />

22 fi<br />

Remark. The previously stated results on the time complexity bounds <strong>of</strong> partition A<br />

are not affected (cf. Proposition 3.5 and 3.6).<br />

Proposition 3.8 For each T r appear<strong>in</strong>g as a component <strong>in</strong> MultTree A, let L r denote<br />

the (label) set L ∪ {dummy-label r } ∪ {〈p, p + 1〉 | 0 < p < K − 1}. Then the set <strong>of</strong><br />

first components <strong>of</strong> the first components <strong>of</strong> the elements <strong>of</strong> T r, {χ ∈ N ∗ | 〈〈χ, r〉, l〉 ∈<br />

T r for some l ∈ L r}, constitutes a tree doma<strong>in</strong>. In this sense T r can straightforwardly<br />

be <strong>in</strong>terpreted as a f<strong>in</strong>ite labeled tree. All non-term<strong>in</strong>al nodes <strong>of</strong> T r, except for the<br />

root node, are necessarily labeled by arc labels from L. The root node is labeled by<br />

dummy-label r <strong>in</strong> case A 0,K−1 was empty, while process<strong>in</strong>g explore A 0,K−1. Otherwise<br />

it is also labeled by an element from L. Each leaf is labeled by 〈p, p + 1〉 for some<br />

p < K − 1.<br />

Example 3.9 (cont<strong>in</strong>ued) Apply<strong>in</strong>g the modified algorithm partition to A ex, <strong>in</strong><br />

particular yields a multi-rooted tree MultTree A ex = 〈T 0, T 1, T 2〉 with<br />

Band 22 (2) – 2007 81


Michaelis, Mönnich<br />

T 0 = {〈 〈 ɛ , 0 〉 , dummy-label 0 〉 , 〈 〈 0 , 0 〉 , label(x 0) 〉 , 〈 〈 00 , 0 〉 , 〈 0 , 1 〉 〉 ,<br />

〈 〈 1 , 0 〉 , label(x 1) 〉 , 〈 〈 10 , 0 〉 , label(a (2)<br />

2 ) 〉 , 〈 〈 100 , 0 〉 , 〈 1 , 2 〉 〉 ,<br />

〈 〈 11 , 0 〉 , label(x 2) 〉 , 〈 〈 110 , 0 〉 , 〈 2 , 3 〉 〉 , 〈 〈 2 , 0 〉 , label(a (5)<br />

2 ) 〉 ,<br />

〈 〈 20 , 0 〉 , 〈 3 , 4 〉 〉 , 〈 〈 3 , 0 〉 , label(x 3) 〉 〉 , 〈 〈 30 , 0 〉 , 〈 4 , 5 〉 〉 }<br />

T 1 = {〈 〈 ɛ , 1 〉 , dummy-label 1 〉 , 〈 〈 0 , 1 〉 , label(x ′ 0) 〉 , 〈 〈 00 , 1 〉 , 〈 0 , 1 〉 〉 ,<br />

〈 〈 1 , 1 〉 , label(x ′ 1) 〉 , 〈 〈 10 , 1 〉 , 〈 1 , 2 〉 〉 , 〈 〈 11 , 1 〉 , label(x ′ 2) 〉 ,<br />

〈 〈 110 , 1 〉 , 〈 2 , 3 〉 〉 , 〈 〈 2 , 1 〉 , label(x ′ 3) 〉 〉 , 〈 〈 20 , 1 〉 , 〈 4 , 5 〉 〉}<br />

T 2 = {〈 〈 ɛ , 2〉 , dummy-label 2 〉 , 〈 〈 0 , 2〉 , label(a (6)<br />

1 ) , 〈 〈 00 , 2〉 , label(a(4) 1 ) ,<br />

〈 〈 000 , 2〉 , 〈 2 , 3〉〉 , 〈 〈 001 , 2〉 , 〈 3 , 4〉〉 , 〈 〈 01 , 2〉 , 〈 4 , 5〉 〉 }<br />

with x i, x ′ i ∈ X i such that x i ≠ x ′ i for 0 ≤ i ≤ 3, where<br />

X 0 = {a (2)<br />

(cf. Figure 5).<br />

1 , a(3) 1<br />

} , X1 = {a(1)<br />

1 , a(3) 2<br />

} , X2 = {a(2)<br />

3 , a(5) 1<br />

} and X3 = {a(5)<br />

3 , a(7) 1 }<br />

T 0 dummy-label 0<br />

label(x 0)<br />

label(x 1)<br />

label(a (5)<br />

2 )<br />

label(x 3)<br />

〈0 , 1〉<br />

label(a (2)<br />

2 )<br />

label(x 2)<br />

〈3 , 4〉<br />

〈4 , 5〉<br />

〈1 , 2〉<br />

〈2 , 3〉<br />

T 1 dummy-label 1<br />

T 2 dummy-label 2<br />

label(x ′ 0 )<br />

label(x ′ 1 )<br />

label(x 3)<br />

label(a (6)<br />

1 )<br />

〈0 , 1〉<br />

〈1 , 2〉 label(x ′ 2 )<br />

〈4 , 5〉<br />

label(a (4)<br />

1 )<br />

〈4 , 5〉<br />

〈2 , 3〉<br />

〈2 , 3〉 〈3 , 4〉<br />

Figure 5: The tree components <strong>of</strong> the multi-rooted tree MultTree A ex = 〈T 0 , T 1 , T 2 〉.<br />

4 Envoi<br />

In this paper we have sketched the beg<strong>in</strong>n<strong>in</strong>gs <strong>of</strong> a logical theory <strong>of</strong> annotation graphs.<br />

Along the way we have tried to emphasize the follow<strong>in</strong>g po<strong>in</strong>ts:<br />

• Abstract logical framework with multilayer capabilities for l<strong>in</strong>guistic annotations<br />

• Compact logical representation<br />

• Efficient MSO theory<br />

82 LDV-FORUM


<strong>Towards</strong> a <strong>Logical</strong> <strong>Description</strong> <strong>of</strong> <strong>Trees</strong> <strong>in</strong> <strong>Annotation</strong> <strong>Graphs</strong><br />

• Subset <strong>of</strong> regular trees<br />

• Internal def<strong>in</strong>ability <strong>of</strong> tree structure<br />

• Partition <strong>of</strong> annotation layers<br />

While the logical approach towards annotation models provides a unified format for<br />

the syntactic level it still has to be complemented with a component that serves to<br />

<strong>in</strong>tegrate syntactic with semantic structures. Primary candidates for this component<br />

are amalgamation techniques from model theory and the assembly <strong>of</strong> heterogeneous<br />

formal specifications via transformation systems. Care must be taken <strong>in</strong> this effort<br />

to preserve the nice complexity properties that are associated with f<strong>in</strong>ite graphs <strong>of</strong><br />

bounded treewidth. On the other hand, annotation graphs <strong>of</strong>fer a m<strong>in</strong>imal formalization<br />

<strong>of</strong> typical transcription needs by means <strong>of</strong> acyclic graphs with fielded records on the<br />

edges. Semantic <strong>in</strong>formation is easily <strong>in</strong>tegrated <strong>in</strong>to this m<strong>in</strong>imal framework. It is for<br />

this reason that we believe that our general perspective on the formal properties <strong>of</strong><br />

annotation graphs will reta<strong>in</strong> its value if additional types <strong>of</strong> annotation are added to<br />

the current format <strong>of</strong> transcription schemes.<br />

Acknowledgements<br />

This research was partially supported by grant MO 386/3-4 <strong>of</strong> the German Research<br />

Foundation (DFG).<br />

References<br />

Bird, S. and Liberman, M. (2001). A formal framework for l<strong>in</strong>guistic annotation. Speech<br />

Communication, 33:23–60.<br />

Cotton, S. and Bird, S. (2000). An <strong>in</strong>tegrated framework for treebanks and multilayer annotations.<br />

In Proceed<strong>in</strong>gs <strong>of</strong> the Third International Conference on Language Resources<br />

and Evaluation (LREC 2002), Las Palmas de Gran Canaria, pages 1670–1677. European<br />

Language Resources Association.<br />

Courcelle, B. (1990). The monadic second-order logic <strong>of</strong> graphs I: Recognizable sets <strong>of</strong> f<strong>in</strong>ite<br />

graphs. Information and Computation, 85:12–75.<br />

Courcelle, B. (2006). The monadic second-order logic <strong>of</strong> graphs XV: On a conjecture by D.<br />

Seese. Journal <strong>of</strong> Applied Logic, 4:79–114.<br />

Robertson, N. and Seymour, P. D. (1986). Graph m<strong>in</strong>ors. II. Algorithmic aspects <strong>of</strong> tree-width.<br />

Journal <strong>of</strong> Algorithms, 7:309–322.<br />

Schmidt, T. (2005). Time-based data models and the text encod<strong>in</strong>g <strong>in</strong>itiative’s guidel<strong>in</strong>es for<br />

transcription <strong>of</strong> speech. Arbeiten zur Mehrsprachigkeit, Folge B, Nr. 62 (Work<strong>in</strong>g Papers <strong>in</strong><br />

Multil<strong>in</strong>gualism, Series B, No. 62), Universität Hamburg, Hamburg.<br />

Seese, D. (1991). The structure <strong>of</strong> the models <strong>of</strong> decidable monadic theories <strong>of</strong> graphs. Annals<br />

<strong>of</strong> Pure and Applied Logic, 53:169–195.<br />

Band 22 (2) – 2007 83

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!