
B.3. KARATSUBA'S METHOD 225

additions at each step. While the "rounding up" is not important in $O$-theory, it matters in practice, and [Mon05] shows various other formulae, e.g.
\[
(aX^2 + bX + c)(dX^2 + eX + f) = cf(1 - X) + be(-X + 2X^2 - X^3) + ad(-X^3 + X^4) + (b + c)(e + f)(X - X^2) + (a + b)(d + e)(X^3 - X^2) + (a + b + c)(d + e + f)X^2,
\]

requiring six coefficient multiplications rather than the obvious nine, or eight if we write it as
\[
\left(aX^2 + (bX + c)\right)\left(dX^2 + (eX + f)\right) = adX^4 + aX^2(eX + f) + dX^2(bX + c) + (bX + c)(eX + f)
\]
and use (B.4) on the last summand (asymptotics would predict $3^{\log_2 3} \approx 5.7$, so we are much nearer the asymptotic behaviour). Cascading this formula rather than (B.4) gives $O(n^{\log_3 6})$, which, as $\log_3 6 \approx 1.63$, is not as favorable asymptotically. His most impressive formula describes the product of two six-term polynomials in 17 coefficient multiplications, and $\log_6 17 \approx 1.581$, a slight improvement.

We refer the reader to the table in [Mon05], which shows that his armoury of formulae can get us very close to the asymptotic costs.

Theorem 38 We can multiply two (dense) polynomials with $m$ and $n$ terms respectively in $O\!\left(\max(m,n)\min(m,n)^{\log_2 3 - 1}\right)$ coefficient operations.

Let the polynomials be $f = a_{m-1}Y^{m-1} + \cdots$ and $g = b_{n-1}Y^{n-1} + \cdots$. Without loss of generality, we may assume $m \ge n$ (so we have to prove $O(mn^{\log_2 3 - 1})$), and write $k = \lceil m/n \rceil$. We can then divide $f$ into $k$ blocks of $n$ terms each (possibly fewer for the most significant one): $f = \sum_{i=0}^{k-1} f_i Y^{in}$. Then

\[
fg = \sum_{i=0}^{k-1} (f_i g)\, Y^{in}.
\]

Each $f_i g$ can be computed in time $O(n^{\log_2 3})$, and the addition merely takes time $O(m-n)$ (since there is one addition to be performed for each power of $Y$ where overlap occurs). Hence the total time is $O(kn^{\log_2 3})$, and the constant factor implied by the $O$-notation allows us to replace $k = \lceil m/n \rceil$ by $m/n$, which gives the desired result.
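The blocking step of the proof can be sketched as follows. The helper names are hypothetical, and a schoolbook product stands in for the $O(n^{\log_2 3})$ Karatsuba routine, since only the splitting of $f$ into $n$-term blocks is being illustrated:

```python
def poly_mul_schoolbook(f, g):
    """Reference O(len(f)*len(g)) product of coefficient lists
    (low degree first); stands in for Karatsuba in this sketch."""
    res = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            res[i + j] += fi * gj
    return res

def poly_mul_blocked(f, g):
    """Multiply an m-term f by an n-term g (m >= n) by splitting f
    into k = ceil(m/n) blocks of n terms: fg = sum_i (f_i g) Y^{in}."""
    if len(f) < len(g):
        f, g = g, f
    m, n = len(f), len(g)
    res = [0] * (m + n - 1)
    for start in range(0, m, n):              # block f_i begins at Y^{in}
        block = f[start:start + n]            # the top block may be shorter
        prod = poly_mul_schoolbook(block, g)  # would be Karatsuba in practice
        for j, coeff in enumerate(prod):      # add (f_i g) shifted by Y^{start}
            res[start + j] += coeff
    return res
```

Note that each shifted product overlaps its neighbour in at most $n-1$ coefficients, which is where the $O(m-n)$ addition count in the proof comes from.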

B.3.1 Karatsuba's method in practice

When it comes to multiplying numbers of $n$ digits (in whatever base we are actually using, generally a power of 2), the received wisdom is to use $O(n^2)$ methods for small $n$ (say $n < 16$), Karatsuba-style methods for intermediate $n$ (say $16 \le n < 4096$) and Fast Fourier Transform methods (section B.3.4 or [SS71]) for larger $n$. However, it is possible for the Fast Fourier Transform to require too much memory, and [Tak10] was forced to use these methods on numbers one quarter the required size, and then nested Karatsuba to combine the results.
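The hybrid strategy described above can be sketched for polynomial coefficient lists; the cutoff value and all names here are illustrative (real libraries tune the threshold per platform), and the recursion is the plain (B.4)-style three-product split:

```python
KARATSUBA_CUTOFF = 16  # illustrative threshold; tuned empirically in practice

def _add(p, q):
    """Coefficient-wise sum of two coefficient lists (low degree first)."""
    if len(p) < len(q):
        p, q = q, p
    return [p[i] + (q[i] if i < len(q) else 0) for i in range(len(p))]

def _sub(p, q):
    """Coefficient-wise difference p - q, padding p if necessary."""
    p = p + [0] * max(0, len(q) - len(p))
    return [p[i] - (q[i] if i < len(q) else 0) for i in range(len(p))]

def hybrid_mul(f, g):
    """Schoolbook product below the cutoff, Karatsuba splitting above it."""
    if not f or not g:
        return []
    n = max(len(f), len(g))
    if n <= KARATSUBA_CUTOFF:
        res = [0] * (len(f) + len(g) - 1)
        for i, fi in enumerate(f):
            for j, gj in enumerate(g):
                res[i + j] += fi * gj
        return res
    h = n // 2
    f0, f1 = f[:h], f[h:]            # f = f0 + f1 * Y^h
    g0, g1 = g[:h], g[h:]
    p0 = hybrid_mul(f0, g0)
    p2 = hybrid_mul(f1, g1)
    pm = hybrid_mul(_add(f0, f1), _add(g0, g1))
    mid = _sub(_sub(pm, p0), p2)     # (f0+f1)(g0+g1) - f0 g0 - f1 g1
    res = [0] * (len(f) + len(g) - 1)
    for i, c in enumerate(p0):
        res[i] += c
    for i, c in enumerate(mid):
        res[i + h] += c
    for i, c in enumerate(p2):
        res[i + 2 * h] += c
    return res
```

Below the cutoff the recursion bottoms out in the quadratic method, mirroring the $n < 16$ regime of the received wisdom.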
