17.01.2015 Views

Prosodic Boundary Prediction based on Maximum Entropy Model ...


2 Prosodic Boundary Prediction based on Maximum Entropy Model

Our experiments rest on one basic assumption: a prosodic boundary occurs only at a syntactic word boundary. For Mandarin Chinese this is reasonable, since statistics show that only 6% of prosodic words are part of a long syntactic word; all the rest agree with the assumption.

Given a string of consecutive syntactic words, each boundary between two adjacent words (say $w_i$ and $w_{i+1}$) takes one of three types: LW ($w_i$ and $w_{i+1}$ are within the same prosodic word), PW (within the same prosodic phrase but in different prosodic words), and PPh (in different prosodic phrases). Our task then comes down to deciding the right type for each syntactic word boundary, which can be accomplished with a Maximum Entropy Model.
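The labelling task described above can be sketched in a few lines of Python. This is only an illustration: the names `BoundaryType` and `word_boundaries`, the placeholder tokens, and the example labels are assumptions introduced here, not part of the original system or its data.

```python
from enum import Enum


class BoundaryType(Enum):
    """The three boundary types named in the text."""
    LW = "LW"    # both words lie within the same prosodic word
    PW = "PW"    # same prosodic phrase, different prosodic words
    PPH = "PPh"  # different prosodic phrases


def word_boundaries(words):
    """Return the adjacent pairs (w_i, w_{i+1}); each pair is one boundary to label."""
    return list(zip(words, words[1:]))


# Placeholder tokens standing in for syntactic words (hypothetical, not real data).
words = ["w1", "w2", "w3", "w4"]
pairs = word_boundaries(words)

# Hypothetical labels, one per boundary, only to show the shape of the output.
example_labels = [BoundaryType.LW, BoundaryType.PW, BoundaryType.PPH]
labelled = list(zip(pairs, example_labels))
```

A string of n syntactic words yields n - 1 boundaries, so the predictor makes n - 1 three-way decisions per string.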

2.1 Maximum Entropy Modeling

Consider a random process that produces an output value $y$ based on some contextual information $x$, where $x$ and $y$ are members of the finite sets $X$ and $Y$ respectively. In our case, $y$ is the type of a syntactic word boundary (i.e. LW, PW or PPh), and $x$ can include any available information about that boundary.

Our task is to construct a stochastic model that accurately represents the behavior of the random process. In other words, the model should give a reliable estimate of $p(y \mid x)$, the conditional probability that the process outputs $y$ given a context $x$.

For this purpose, we observe the behavior of the random process for some time, collecting $N$ samples $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$. To express these facts, a feature function, or feature for short, is defined as:

\[
f_i(x, y) =
\begin{cases}
1 & \text{if } y = y_i \text{ and } x = x_i \\
0 & \text{otherwise}
\end{cases}
\tag{1}
\]
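As a minimal sketch of the indicator feature defined above, the following Python closure returns 1 only when both the context and the label match a chosen pair. The helper name `make_feature` and the part-of-speech tuple used as a context are assumptions made for illustration; the text only says that $x$ may hold any available information about the boundary.

```python
def make_feature(x_i, y_i):
    """Build the binary feature f_i tied to one (context, label) pair."""
    def f_i(x, y):
        # 1 if both the context and the label match the chosen pair, else 0
        return 1 if (x == x_i and y == y_i) else 0
    return f_i


# Hypothetical context: a pair of part-of-speech tags around the boundary.
f = make_feature(("noun", "verb"), "PW")

hit = f(("noun", "verb"), "PW")    # context and label both match
miss = f(("noun", "verb"), "LW")   # same context, different label
```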

The expected value of each feature $f_i$ with respect to the statistics of the training samples can then be calculated as:

\[
\tilde{p}(f_i) = \sum_{(x, y)} \tilde{p}(x, y)\, f_i(x, y),
\tag{2}
\]

where $\tilde{p}(x, y)$ is the empirical probability distribution of the samples, defined by:

\[
\tilde{p}(x, y) \equiv \frac{1}{N} \times \text{number of times that } (x, y) \text{ occurs in the sample}.
\]
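The two quantities above, the empirical distribution of the samples and the expected value of a feature under it, can be sketched as follows. The function names, the toy samples, and the context strings `"a"` and `"b"` are hypothetical and chosen only to keep the arithmetic easy to check by hand.

```python
from collections import Counter


def empirical_distribution(samples):
    """Relative frequency of each (x, y) pair among the N samples."""
    n = len(samples)
    counts = Counter(samples)
    return {pair: count / n for pair, count in counts.items()}


def empirical_expectation(feature, samples):
    """Sum over observed (x, y) of empirical probability times feature value."""
    dist = empirical_distribution(samples)
    return sum(prob * feature(x, y) for (x, y), prob in dist.items())


def indicator(x_i, y_i):
    """Binary feature that fires only on the pair (x_i, y_i)."""
    return lambda x, y: 1 if (x == x_i and y == y_i) else 0


# Toy samples (hypothetical): contexts are opaque strings, labels are boundary types.
samples = [("a", "LW"), ("a", "LW"), ("b", "PW"), ("a", "PPh")]
dist = empirical_distribution(samples)
expected = empirical_expectation(indicator("a", "LW"), samples)
```

Here the pair `("a", "LW")` occurs twice in four samples, so its empirical probability is 2/4, and the indicator feature for that pair has the same expected value.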

On the other hand, the expected value of $f_i$ with respect to the unknown model $p(y \mid x)$ is:
