1 Montgomery Modular Multiplication in Hardware
FEI KEMT
save memory. If we assume that the ECM unit does not execute addition and
duplication in parallel, at most 7 registers for the values in $\mathbb{Z}_n$ are required for
phase 1. Additionally, we will require 4 temporary registers for intermediate values.
Thus, a total of 11 registers is required for phase 1.
4.1.2 Phase 2<br />
For the prime bounds chosen, 5 621 primes p ∈ [B1, B2] have to be tested in phase
2. With the prime bounds fixed, the computational complexity depends on the size
of D. Hence, D should consist of small primes in order to keep ϕ(D) as small as
possible. We consider the cases D = 6, D = 30, D = 60 and D = 210. The
initial values can be computed by first computing $\hat{Q} = DQ$, then $\frac{B_1}{D}\hat{Q}$ with the
binary method, which automatically yields $(\frac{B_1}{D} - 1)\hat{Q}$. The total number of modular
multiplications is determined by the number of point additions, point duplications
and multiplications for the product Π.
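The dependence on ϕ(D) can be checked directly for the four candidate values of D. The following is a minimal sketch; the helper name `phi` is introduced here for illustration and is not part of the original design.

```python
from math import gcd

def phi(d):
    """Euler's totient: count of 1 <= k <= d that are coprime to d."""
    return sum(1 for k in range(1, d + 1) if gcd(k, d) == 1)

# Candidate values of D considered in the text.
for d in (6, 30, 60, 210):
    print(f"D = {d:3d}  phi(D) = {phi(d)}")
```

For D = 30 this gives ϕ(D) = 8, the value used later in the section when sizing the table.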
Table 4-1 displays the computational complexity and the number of additional registers
required for phase 2. For the numbers in the table, we assume the use
of Algorithm 3-2 for computing the initial values. E.g., in the case D = 30, the cost
of computing $DQ$, $(\frac{B_1}{D} - 1)DQ$, and $\frac{B_1}{D}DQ$ is 8 point additions
and 8 point duplications. For the same D, the computation of the table involves
5 point additions and 2 point duplications, yielding a total of 13 590 modular
multiplications.
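To illustrate why a binary method delivers two neighbouring multiples at once, the sketch below runs a ladder-style binary method on plain integers, which stand in for curve points (integer addition replaces point addition). Algorithm 3-2 is not reproduced in this section, so the ladder variant, the function name `ladder`, and its way of counting operations are assumptions; the counts it reports are not meant to reproduce the figures of Table 4-1.

```python
def ladder(k, q):
    """Binary ladder for k >= 1 on an additive stand-in group.

    Keeps the pair (m*q, (m+1)*q) and returns (k*q, (k+1)*q, adds, dbls).
    Integers are used as stand-ins for curve points (an assumption made
    purely for illustration).
    """
    r0, r1 = q, 2 * q          # invariant: r1 - r0 == q
    adds, dbls = 0, 1          # one doubling to form 2*q
    for bit in bin(k)[3:]:     # bits of k after the leading 1
        if bit == "1":
            r0, r1 = r0 + r1, 2 * r1
        else:
            r0, r1 = 2 * r0, r0 + r1
        adds += 1              # each step costs one addition ...
        dbls += 1              # ... and one duplication
    return r0, r1, adds, dbls

# Running the ladder on k - 1 yields (k - 1)*q and k*q together.
lo, hi, adds, dbls = ladder(31, 7)
print(lo, hi, adds, dbls)
```

Because the pair always differs by the base element, each step needs only one addition with a known difference plus one duplication, which is what makes the method suitable for x-only arithmetic.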
Remark: for the case D = 210, we start with B1 = 1 050 in order to ensure that
D and B1 share the same prime factors.
For phase 2 we choose D = 30 to obtain a minimal area-time (AT) product of the design.
Since ϕ(D) = 8 is small, only 8 additional
registers are required to store all coordinates in a table. Unlike in phase 1, we have
to consider the general case for point addition where $z_{A-B} \neq 1$. Hence, an additional
register for this quantity is needed.
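The role of $z_{A-B}$ can be seen in a small sketch of x-only arithmetic on a Montgomery curve. The formulas are the standard differential addition and duplication formulas; the function names `xadd` and `xdbl`, the toy modulus 101, the curve constant `a24 = 2`, and the starting point (2 : 1) are assumptions chosen only to keep the example small (ECM itself works modulo the composite number to be factored).

```python
def xadd(xa, za, xb, zb, xd, zd, n):
    """Differential addition: (xd : zd) is the known difference A - B."""
    u = (xa - za) * (xb + zb) % n   # multiplication
    v = (xa + za) * (xb - zb) % n   # multiplication
    s = (u + v) * (u + v) % n       # squaring
    t = (u - v) * (u - v) % n       # squaring
    x = zd * s % n                  # multiplication, drops out if zd == 1
    z = xd * t % n                  # multiplication
    return x, z

def xdbl(x, z, a24, n):
    """Duplication with curve constant a24 = (A + 2) / 4."""
    s = (x + z) * (x + z) % n
    d = (x - z) * (x - z) % n
    t = (s - d) % n                 # equals 4*x*z
    return s * d % n, t * (d + a24 * t) % n

n, a24 = 101, 2                     # toy parameters (assumptions)
p1 = (2, 1)
p2 = xdbl(*p1, a24, n)
p3 = xadd(*p2, *p1, *p1, n)         # difference 2P - P = P, here z = 1
p4_dbl = xdbl(*p2, a24, n)
p4_add = xadd(*p3, *p1, *p2, n)     # difference 3P - P = 2P, here z != 1

# Both routes to 4P must agree as projective points.
print(p4_dbl[0] * p4_add[1] % n == p4_add[0] * p4_dbl[1] % n)
```

When the difference has z = 1, the multiplication by `zd` disappears and no storage is needed for it; in the general case the value has to be kept, which matches the extra register mentioned above.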
For the product Π of all $x_A \cdot z_B - z_A \cdot x_B$, one more register is necessary. The
temporary registers from phase 1 suffice to store the intermediate results $x_A \cdot z_B$,
$z_A \cdot x_B$ and $x_A \cdot z_B - z_A \cdot x_B$. Hence, 10 additional registers for phase 2 yield a total
of 21 required registers for both phases. The computational complexity of phase 2 is
1 881 point additions and 10 point duplications. Together with the 13 590 modular
multiplications for computing the product Π, 24 926 modular multiplications and
squarings are required.
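The register and multiplication totals above can be re-derived with a few lines of arithmetic. The per-operation costs of 6 multiplications per point addition and 5 per point duplication are not stated in this section; they are assumed here because they are the usual counts for x-only Montgomery arithmetic and they reproduce the quoted total.

```python
# Register budget, taken from the text.
phase1_regs = 7 + 4                 # values in Z_n + temporaries
phase2_regs = 8 + 1 + 1             # table + z_{A-B} + product register
total_regs = phase1_regs + phase2_regs

# Multiplication budget for phase 2.
MULS_PER_ADD = 6                    # assumption: general-case point addition
MULS_PER_DBL = 5                    # assumption: point duplication
point_adds, point_dbls = 1881, 10   # from the text
product_muls = 13590                # from the text

total_muls = (point_adds * MULS_PER_ADD
              + point_dbls * MULS_PER_DBL
              + product_muls)

print(total_regs, total_muls)
```

Under these assumptions the point arithmetic accounts for 11 336 of the 24 926 operations and the product Π for the remaining 13 590.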
For a high probability of success (p > 80%) of finding a single factor of size of 42