Introduction to Computational Linguistics
Suppose that the first character is at position 1230. Then the string 129.23
matches the first bracket and the string 145.110 the second bracket. These strings
can be recalled using the function matched_group. It takes as input a number n
and the original string, and it returns the substring matched by the nth bracket. So, if
directly after the match on the string assigned to u we define
(104) let s = "The first half of the IP address is "
          ^ (Str.matched_group 1 u)
we get the following value for s:<br />
(105) "The first half of the IP address is 129.23"
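The recall step can be tried out end to end as follows. This is only a minimal sketch: the regular expression ip and the sample string u are assumptions made for illustration (the text only states that the two brackets match the two halves of the address), and the Str library has to be linked when compiling.

```ocaml
(* Assumed pattern: two bracketed groups, each matching "digits.digits". *)
let ip = Str.regexp "\\([0-9]+\\.[0-9]+\\)\\.\\([0-9]+\\.[0-9]+\\)"

(* Assumed sample string containing one IP address. *)
let u = "host at 129.23.145.110 is up"

(* search_forward returns the start position of the first match
   and raises Not_found if there is none. *)
let pos = Str.search_forward ip u 0

(* The groups refer to the most recent match, so they are read
   before any other matching function is called. *)
let first_half = Str.matched_group 1 u
let second_half = Str.matched_group 2 u

let s = "The first half of the IP address is " ^ first_half

let () = print_endline s
```

Note that matched_group is passed the same string u that was searched; the library only remembers positions, not the text itself.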
To use this in an automated string replacement procedure, there are the variables \\0, \\1,
\\2, ..., \\9. After a successful match, \\0 is assigned the entire matched string, \\1
the string matched by the first bracket, \\2 the string matched by the second bracket, and so on.
A template is a string that may contain these variables in place of ordinary characters (but nothing
more). The function global_replace takes as input a regular expression and
two strings. The first string is used as a template. Whenever a match is found in the second string,
the function uses the template to execute the replacement. For example, to cut the IP address to its first
half, we write the template "\\1". If we want to replace the original IP address
by its first part followed by .0.1, then we use "\\1.0.1". If we want to replace the
second part by the first, we use "\\1.\\1".
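The three templates just described can be sketched in code as follows. The pattern ip and the sample string text are assumptions chosen for illustration; only global_replace and the template variables come from the discussion above. The Str library has to be linked when compiling.

```ocaml
(* Assumed pattern: two bracketed groups, each matching "digits.digits". *)
let ip = Str.regexp "\\([0-9]+\\.[0-9]+\\)\\.\\([0-9]+\\.[0-9]+\\)"

(* Assumed sample string. *)
let text = "Connect to 129.23.145.110 now."

(* Cut the address to its first half. *)
let cut = Str.global_replace ip "\\1" text

(* First half followed by .0.1 *)
let patched = Str.global_replace ip "\\1.0.1" text

(* Second half replaced by the first half. *)
let doubled = Str.global_replace ip "\\1.\\1" text

let () = List.iter print_endline [cut; patched; doubled]
```

Everything outside the match is copied unchanged, so only the address itself is rewritten in each result.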
12 Finite State Automata
A finite state automaton is a quintuple
(106) $\mathfrak{A} = \langle A, Q, i_0, F, \delta \rangle$
where $A$, the alphabet, is a finite set, $Q$, the set of states, is also a finite set,
$i_0 \in Q$ is the initial state, $F \subseteq Q$ is the set of final or accepting states and,
finally, $\delta \subseteq Q \times A \times Q$ is the transition relation. We write $x \xrightarrow{a} y$ if $\langle x, a, y \rangle \in \delta$.
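One possible way to render this definition as a data structure is sketched below. The record type fsa, the choice of characters for the alphabet and integers for the states, the helper functions, and the example automaton are all assumptions made for illustration; finite sets are represented naively as lists.

```ocaml
(* A finite state automaton with characters as letters and integers
   as states. Both type choices are assumptions of this sketch. *)
type fsa = {
  alphabet : char list;             (* the alphabet A *)
  states : int list;                (* the set of states Q *)
  initial : int;                    (* the initial state i_0 *)
  final : int list;                 (* the accepting states F *)
  delta : (int * char * int) list;  (* the transition relation *)
}

(* Checks the side conditions of the definition: i_0 is in Q,
   F is a subset of Q, and delta is a subset of Q x A x Q. *)
let well_formed m =
  List.mem m.initial m.states
  && List.for_all (fun q -> List.mem q m.states) m.final
  && List.for_all
       (fun (x, a, y) ->
         List.mem x m.states && List.mem a m.alphabet && List.mem y m.states)
       m.delta

(* Holds iff the triple <x, a, y> is in delta, i.e. x -a-> y. *)
let has_transition m x a y = List.mem (x, a, y) m.delta

(* All states y with x -a-> y. Since delta is a relation, there may be
   none, one, or several such states. *)
let successors m x a =
  m.delta
  |> List.filter (fun (x', b, _) -> x' = x && b = a)
  |> List.map (fun (_, _, y) -> y)

(* Hypothetical example automaton with two states. *)
let example = {
  alphabet = ['a'; 'b'];
  states = [0; 1];
  initial = 0;
  final = [1];
  delta = [(0, 'a', 0); (0, 'a', 1); (1, 'b', 1)];
}
```

In the example, state 0 has two successors under the letter a, which shows that the transition relation need not be a function.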