Foundations of Data Science

This is a linear program.

As we remarked earlier, the equation A = BC will not hold exactly. A more practical model views A as a matrix of probabilities rather than exact frequencies. In this model, each document is generated by picking its terms in independent trials. Each trial for document j picks term 1 with probability a_{1j}, term 2 with probability a_{2j}, etc. We are not given entire documents; instead, we are given s independent trials for each document. Our job is to find B and C. We do not discuss the details of either the model or the algorithms. In this new situation, algorithms are known to find B and C when there exist anchor terms, even with a small number s of trials.
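A minimal sketch of this generative model, with made-up dimensions and Dirichlet-sampled B and C (the names and numbers are illustrative, not from the text): each column of A = BC is a probability distribution over terms, and the observed data is s multinomial draws per document.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n terms, r topics, m documents, s trials per document.
n, r, m, s = 6, 2, 4, 50
B = rng.dirichlet(np.ones(n), size=r).T   # n x r; each column sums to 1
C = rng.dirichlet(np.ones(r), size=m).T   # r x m; each column sums to 1
A = B @ C                                 # n x m matrix of term probabilities

# We never see A itself, only s independent term draws for each document.
counts = np.column_stack([rng.multinomial(s, A[:, j]) for j in range(m)])
A_hat = counts / s                        # empirical frequencies approximate A
```

Since each column of B and of C sums to 1, each column of A does too, so the columns of A are valid sampling distributions.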

At the heart of such an algorithm is the following problem:

Approximate NMF: Given an n × m matrix A and the promise that there exist an n × r matrix B and an r × m matrix C, both with nonnegative entries, such that ||A − BC||_F ≤ ∆, find B′ and C′ of the same dimensions, with nonnegative entries, such that ||A − B′C′||_F ≤ ∆′. Here, ∆′ is related to ∆, and if the promise does not hold, the algorithm is allowed to return any answer.
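One widely used heuristic for this problem (not necessarily the algorithm the text alludes to) is the Lee-Seung multiplicative update rule for the Frobenius objective; a minimal sketch on made-up data:

```python
import numpy as np

def approx_nmf(A, r, iters=500, eps=1e-9, seed=0):
    """Heuristic approximate NMF via Lee-Seung multiplicative updates.
    Keeps B, C entrywise nonnegative and decreases ||A - BC||_F."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    B = rng.random((n, r))
    C = rng.random((r, m))
    for _ in range(iters):
        C *= (B.T @ A) / (B.T @ B @ C + eps)   # update C with B fixed
        B *= (A @ C.T) / (B @ C @ C.T + eps)   # update B with C fixed
    return B, C

# A rank-1 nonnegative matrix, so an exact factorization exists (∆ = 0).
A = np.array([[1., 2.], [2., 4.], [3., 6.]])
B, C = approx_nmf(A, r=1)
err = np.linalg.norm(A - B @ C)               # Frobenius norm ||A - B'C'||_F
```

The multiplicative form guarantees the iterates stay nonnegative, but only local convergence; it matches the spirit of the promise problem (return some B′, C′ with small Frobenius error), not its worst-case guarantee.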

Now, for the case when anchor words exist, this reduces to the problem of finding which rows of A have the property that no point close to the row is positively dependent on the other rows. It is easy to write the statement that there is a vector y close to a_i which is positively dependent on the other rows as a convex program:

∃ x_1, x_2, . . . , x_{i−1}, x_{i+1}, . . . , x_n ≥ 0 such that | ∑_{j≠i} x_j a_j − a_i | ≤ ε.

| ∑_{j≠i} x_j a_j − a_i | is a convex function of the x_j, and hence this problem can be solved efficiently.
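In this form the convex program is a nonnegative least-squares problem, which `scipy.optimize.nnls` solves directly. A hedged sketch (`is_anchor_row` is a hypothetical helper; the matrix is illustrative):

```python
import numpy as np
from scipy.optimize import nnls

def is_anchor_row(A, i, eps=1e-6):
    """Return True if no nonnegative combination of the other rows
    comes within eps of row a_i (i.e., a_i looks like an anchor row)."""
    others = np.delete(A, i, axis=0)      # the rows a_j, j != i
    # nnls solves min_{x >= 0} || M x - b ||_2, with the a_j as columns of M
    x, residual = nnls(others.T, A[i])
    return residual > eps

# Rows 0 and 1 are anchors; row 2 = 0.5*row0 + 0.5*row1 is not.
A = np.array([[1., 0.], [0., 1.], [0.5, 0.5]])
flags = [is_anchor_row(A, i) for i in range(3)]
```

Running this over all i picks out exactly the rows not positively dependent on the others, which is the anchor-finding step described above.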

9.2 Hidden Markov Model<br />

A hidden Markov model, HMM, consists of a finite set of states with a transition between each pair of states. There is an initial probability distribution α on the states and a transition probability a_{ij} associated with the transition from state i to state j. Each state has a probability distribution p(O, i) giving the probability of outputting the symbol O in state i. A transition consists of two components: a state transition to a new state followed by the output of a symbol. The HMM starts by selecting a start state according to the distribution α and outputting a symbol.
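As a sketch of this generative process, the following simulates a two-state HMM with made-up transition and output probabilities (the numbers and alphabet are illustrative, not the book's example):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-state HMM over the output alphabet {'H', 'T'}.
alpha = np.array([0.5, 0.5])      # initial distribution on the states
a = np.array([[0.9, 0.1],         # a[i][j] = P(transition to state j | state i)
              [0.2, 0.8]])
p = np.array([[0.6, 0.4],         # p[i] = output distribution in state i
              [0.3, 0.7]])
symbols = ['H', 'T']

def sample_hmm(T):
    """Pick a start state from alpha and output a symbol, then repeatedly
    transition to a new state and output a symbol, for T outputs total."""
    out = []
    i = rng.choice(2, p=alpha)
    for _ in range(T):
        out.append(symbols[rng.choice(2, p=p[i])])
        i = rng.choice(2, p=a[i])
    return ''.join(out)

seq = sample_hmm(10)
```

Only the symbol sequence `seq` is observed; the state sequence that produced it stays hidden, which is what makes the model "hidden" Markov.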

Example: An example of an HMM is the graph with two states q and p illustrated below.
