22.11.2012 Views

Interdisciplinary Journal of Contemporary Research in ... - Webs


ijcrb.webs.com

INTERDISCIPLINARY JOURNAL OF CONTEMPORARY RESEARCH IN BUSINESS

represents the tree topology at site t. For m taxa, there are $K = (2m - 5)!!$ distinct unrooted topologies (where !! denotes the double factorial), hence $S_t \in \{1, \ldots, K\}$. If a recombination event has occurred, then there will be a change in topology in this region, corresponding to a transition into another hidden state at the breakpoint of this region. Our objective is to predict the "optimal" sequence of hidden states, $S = (S_1, \ldots, S_N)$, given the sequence alignment and some optimality criterion to be discussed below.

Obviously, this optimization problem is, in general, intractable. First, the number of possible topologies at a given site, K, increases super-exponentially with the number of sequences m. Second, there are $K^N$ different state sequences, which prevents an exhaustive search even for small values of K. Consequently, the introduction of approximations and restrictions is inevitable.
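To make the two sources of complexity concrete, the following minimal Python sketch evaluates $K = (2m - 5)!!$ and $K^N$ for a few small cases. The function names are illustrative and do not come from the paper; the formula is assumed to count fully resolved unrooted topologies for m >= 3.

```python
def double_factorial(n: int) -> int:
    """Return n!! = n * (n - 2) * (n - 4) * ... ; defined as 1 for n <= 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def num_unrooted_topologies(m: int) -> int:
    """K = (2m - 5)!!, the number of distinct unrooted topologies for m taxa."""
    if m < 3:
        raise ValueError("the formula assumes at least 3 taxa")
    return double_factorial(2 * m - 5)


def num_state_sequences(m: int, n_sites: int) -> int:
    """K**N, the number of hidden-state sequences over n_sites sites."""
    return num_unrooted_topologies(m) ** n_sites


if __name__ == "__main__":
    for m in (4, 5, 6, 10):
        print(f"m = {m:2d}: K = {num_unrooted_topologies(m)}")
    print(f"m = 4, N = 10: K**N = {num_state_sequences(4, 10)}")
```

Even for m = 4, where K is only 3, the number of state sequences grows exponentially with the alignment length, which is why an exhaustive search is ruled out.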

To deal with the second source of computational complexity, interactions between sites are limited to nearest-neighbor interactions. This allows the application of a dynamic programming scheme, which reduces the computational complexity to $O(K^2 N)$, that is, to an expression linear in N. To deal with the first source of complexity, the scheme has to be restricted to alignments with small numbers of sequences. In the current work, we restrict our approach to alignments with only m = 4 taxa. In the Discussion we describe how this restriction can be relaxed.

Hein (1993) defined optimality in a parsimony sense. His algorithm, RECPARS, searches for the most parsimonious state sequence S, that is, the one that minimizes a given parsimony cost function E(S).
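The dynamic programming idea can be sketched in Python as follows. This is not Hein's implementation: the additive form of the cost, E(S) = sum of per-site costs plus a fixed penalty for every change of topology between adjacent sites, is an assumption made for illustration, and `site_cost`, `switch_cost`, and `most_parsimonious_path` are hypothetical names.

```python
from typing import List, Sequence, Tuple


def most_parsimonious_path(
    site_cost: Sequence[Sequence[float]],
    switch_cost: float,
) -> Tuple[float, List[int]]:
    """Minimise an ASSUMED cost E(S) = sum_t site_cost[t][S_t]
    + switch_cost * (number of sites t with S_t != S_{t-1}).

    site_cost[t][k] is a hypothetical cost of topology k at site t.
    Runs in O(K^2 N) time for N sites and K topologies.
    """
    n_sites = len(site_cost)
    if n_sites == 0:
        return 0.0, []
    k_states = len(site_cost[0])

    # best[k]: minimal cost of any state sequence that ends in state k.
    best = [float(c) for c in site_cost[0]]
    # back[t - 1][k]: best predecessor of state k at site t.
    back: List[List[int]] = []

    for t in range(1, n_sites):
        new_best = []
        pointers = []
        for k in range(k_states):
            # Nearest-neighbour interaction: only the previous site matters.
            prev = min(
                range(k_states),
                key=lambda j: best[j] + (switch_cost if j != k else 0.0),
            )
            step = switch_cost if prev != k else 0.0
            new_best.append(best[prev] + step + site_cost[t][k])
            pointers.append(prev)
        best = new_best
        back.append(pointers)

    # Trace back from the cheapest final state.
    state = min(range(k_states), key=lambda k: best[k])
    total = best[state]
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return total, path


if __name__ == "__main__":
    # Toy data: topology 0 fits the first three sites, topology 1 the rest.
    costs = [[0, 1, 1]] * 3 + [[1, 0, 1]] * 3
    print(most_parsimonious_path(costs, switch_cost=1.5))
```

The two nested loops over states inside the loop over sites give the $K^2 N$ operation count; the back-pointers recover the minimizing state sequence, whose changes of state mark the predicted breakpoints.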

2. Detecting Recombination with Hidden Markov Models

Adopting a statistical approach to phylogenetics, as illustrated in Figure 3, the probabilistic equivalent to RECPARS is a hidden Markov model (HMM), whose application to the detection of recombination was first suggested by McGuire, Wright, and Prentice (2000). Figure 4 (left) shows the corresponding probabilistic graphical model. White nodes represent hidden states, $S_t$, which have direct interactions only with the states at adjacent sites, $S_{t-1}$ and $S_{t+1}$. Black nodes represent columns in the DNA sequence alignment, $y_t$. The joint probability of the DNA sequence alignment and the sequence of hidden states, S, factorizes:
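The equation itself does not appear in this excerpt. For a first-order HMM with the dependency structure just described, the standard factorization takes the form below; this is a sketch under that assumption, written with the alignment as its columns $y_1, \ldots, y_N$, and the paper's own equation may carry additional model parameters that are not shown here.

```latex
P(y_1, \ldots, y_N, S)
  = P(S_1) \prod_{t=2}^{N} P(S_t \mid S_{t-1}) \prod_{t=1}^{N} P(y_t \mid S_t)
```

Here $P(S_t \mid S_{t-1})$ plays the role of the transition between topologies at adjacent sites, and $P(y_t \mid S_t)$ is the probability of the alignment column at site t given the topology at that site.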

COPY RIGHT © 2011 Institute of Interdisciplinary Business Research

JANUARY 2011<br />

VOL 2, NO 9<br />

531
