computing the quartet distance between general trees
More interesting are the steps needed to prepare each of the tables used for look-up, because the actual calculation of these is part of the algorithm and a thorough understanding is therefore essential. All tables are prepared during the preprocessing of each pair of internal nodes, along with the table $I$, presented in Sec. 7.1.1. They are all results of further processing of $I$ and necessary to make the constant time calculation. With the exception of the tables $I'''_1$ and $I'''_2$, none of the tables are time-consuming to deal with and no worse than $O(d_v d_{v'})$ which, for all pairs of inner nodes, leads to a total time of

\[
\sum_{v \in T} \sum_{v' \in T'} d_v d_{v'} = \Big( \sum_{v \in T} d_v \Big) \Big( \sum_{v' \in T'} d_{v'} \Big) \le (2|E|)(2|E|) = O(n^2).
\]

These two exceptions, $I'''_1$ and $I'''_2$, are actually identical and only differ in the way they are calculated. With one term, they shall simply be known as $I'''$ and the calculation of this table will be explained thoroughly in the following Sec. 7.1.4.1.

7.1.4.1 The calculation of $I'''$

The table named $I'''$ plays a part in the calculation of the number of different butterflies and in order to keep the entire running time of the algorithm sub-cubic, special care is needed in the calculation of the table. It is defined as follows:

\[
I'''[i, j] = \sum_{k=1,\, k \neq i}^{d_v} \; \sum_{l=1,\, l \neq j}^{d_{v'}} I[i, l] \, I[k, j] \, I[k, l] \tag{7.11}
\]

Filling the table naively, in accordance with the formula, takes time $O(n^4)$ and the sub-cubic time bound promised will be broken.
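To make the cost of the naive approach concrete, the following is a minimal Python sketch that evaluates Eq. (7.11) literally for one pair of internal nodes. The function name `naive_i_triple_prime`, the product counter, and the example matrix are illustrative assumptions and do not come from the thesis. Indices are 0-based here, whereas Eq. (7.11) is 1-based.

```python
def naive_i_triple_prime(I):
    """Fill I''' directly from Eq. (7.11).

    I is a d_v x d_v' table given as a list of rows (hypothetical input).
    Returns the filled table and the number of triple products evaluated.
    """
    d_v = len(I)
    d_v_prime = len(I[0]) if I else 0
    table = [[0] * d_v_prime for _ in range(d_v)]
    products = 0

    for i in range(d_v):
        for j in range(d_v_prime):
            total = 0
            for k in range(d_v):
                if k == i:  # the outer sum skips k = i
                    continue
                for l in range(d_v_prime):
                    if l == j:  # the inner sum skips l = j
                        continue
                    total += I[i][l] * I[k][j] * I[k][l]
                    products += 1
            table[i][j] = total

    return table, products


if __name__ == "__main__":
    # Example values chosen only for illustration.
    table, products = naive_i_triple_prime([[1, 2], [3, 4]])
    print(table)
    print(products)
```

Each of the $d_v d_{v'}$ entries requires $(d_v - 1)(d_{v'} - 1)$ triple products, so the four nested loops are quadratic in each of the two degrees for a single pair of nodes.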
Hence, the calculation constitutes a serious barrier and demands separate attention. The solution is instead to calculate either $I'''_1 = (I I^T) I$ or otherwise $I'''_2 = I (I^T I)$, which are both described in more detail in the appendix. At first sight, this does not seem to solve the problem, since the solution relies on matrix multiplication. The complexity of matrix multiplication, if done naively, is $O(n^3)$. As explained in the article [14], choosing either the first or second solution will result in an explicit running time of either $O(d_v^2 d_{v'})$ or $O(d_v d_{v'}^2)$ for processing a pair of internal nodes $(v, v')$ with degrees $d_v$ and $d_{v'}$ respectively. However, other methods for calculating the matrix product may be utilized and this is essential to the algorithm. Applying a matrix multiplication method with a time complexity of $O(n^\omega)$ on square matrices, one can make the calculation in time $O(\max(d_v, d_{v'})^\omega)$. This value might be smaller for some matrices that are nearly square, but the approach requires that the matrices are padded with zeroes to become square, i.e. that the matrix is extended to fit the requirements of the matrix multiplication method used.

It is difficult to predict the impact of the matrix multiplication on the entire algorithm; since it is applied on an internal node basis, the complete running time is not identical to that of the multiplication method. In the article [14], Section 4 gives a thorough case