1 Montgomery Modular Multiplication in Hardware
FEI KEMT<br />
and a case study with GNFS based on ECM units are summarised in Section 4.3.<br />
Finally, we conclude the chapter with a discussion of the obtained results.<br />
4.1 Parameterisation of the ECM Algorithm<br />
Our implementation focuses on the factorisation of numbers of up to 200 bits with<br />
factors of up to around 42 bits. Thus, suitable parameters need to be found<br />
for the smoothness bounds B1 and B2 and for the parameter D used in the improved<br />
standard continuation (see the description of the ECM second phase in Section 3.3.2). We<br />
look for values that yield a high probability of success together with a relatively small running<br />
time and area consumption. Since the running time depends on the size of the<br />
(unknown) factors to be found, optimal parameters cannot be known beforehand.<br />
Hence, good parameters are determined by experiments with different prime bounds.<br />
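Such experiments can be mimicked in software with the usual ECM success criterion, which this section does not spell out and which is assumed here: phase 1 finds a factor p when the group order of the curve modulo p is a product of prime powers not exceeding B1, and phase 2 additionally tolerates one remaining prime between B1 and B2. The following Python sketch classifies a given group order under that assumption. All function names are illustrative and not part of the described design.

```python
def primes_up_to(limit):
    """Return all primes <= limit (simple sieve of Eratosthenes)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, flag in enumerate(sieve) if flag]


def is_prime(m):
    """Trial division; sufficient for the small cofactors considered here."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def succeeding_phase(order, b1, b2):
    """Return 1 or 2 for the phase that would succeed, or None for neither.

    Assumption: phase 1 covers every prime power q <= b1, and phase 2
    covers exactly one additional prime r with b1 < r <= b2.
    """
    rest = order
    for p in primes_up_to(b1):
        q = p
        # Strip p as long as the corresponding prime power stays <= b1.
        while q <= b1 and rest % p == 0:
            rest //= p
            q *= p
    if rest == 1:
        return 1
    if b1 < rest <= b2 and is_prime(rest):
        return 2
    return None
```

Running this classifier over many sampled group orders for several candidate pairs (B1, B2) gives an empirical success rate that can be weighed against the cost of each phase.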
4.1.1 Phase 1<br />
Deduced from software experiments, we choose B1 = 960 and B2 = 57 000 as prime<br />
bounds. The value of k has 1 375 bits; hence, assuming the binary method (Algorithm<br />
3–2), 1 374 point additions and 1 374 point duplications are required for the execution of<br />
phase 1. Due to the use of Montgomery coordinates, the coordinate<br />
zP of the starting point P can be set to 1, so that an addition takes only 5 multiplications<br />
instead of 6. The improved phase 1 (with optimal addition chains) has<br />
to use the general case, where zP ≠ 1. For the sake of simplicity and a preferably<br />
simple control logic, we choose the binary method for the time being. For the chosen<br />
parameters, the computational complexity of phase 1 is 13 740 modular multiplications<br />
and squarings³, i.e. 1 374 × (5 + 5). With optimised addition chains this number can be reduced<br />
to approximately 12 000 modular multiplications and squarings.<br />
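The operation count above can be reproduced with a few lines of Python. The sketch assumes that k is the product of all maximal prime powers not exceeding B1 (a common phase 1 choice; the definition used in this work is given outside this section, so the printed bit length need not match the quoted figure exactly). It also assumes that a point duplication, like an addition with zP = 1, costs 5 modular multiplications, which is what the total of 13 740 implies. Function names are illustrative.

```python
def primes_up_to(limit):
    """Return all primes <= limit (simple sieve of Eratosthenes)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, flag in enumerate(sieve) if flag]


def phase1_multiplier(b1):
    """Assumed definition: product over primes p of the largest p**e <= b1."""
    k = 1
    for p in primes_up_to(b1):
        q = p
        while q * p <= b1:
            q *= p
        k *= q
    return k


def binary_method_cost(k_bits, add_cost=5, dbl_cost=5):
    """Modular multiplications for the binary method on a k of k_bits bits.

    Every bit below the leading one triggers one point addition and one
    point duplication.
    """
    steps = k_bits - 1
    return steps * (add_cost + dbl_cost)


if __name__ == "__main__":
    k = phase1_multiplier(960)
    print("bit length of k under the assumed definition:", k.bit_length())
    print("multiplications for a 1375-bit k:", binary_method_cost(1375))
```

The `add_cost` parameter can be set to 6 to model the general addition with zP ≠ 1.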
According to Equation 3.10, duplicating a point 2PA = PC involves the input<br />
values xA, zA, A24 and n, where A24 = (A + 2)/4 is computed from the curve parameter<br />
A (see Equation 3.8) in advance and should be stored in a fixed register.<br />
A point addition PC = PA + PB handles the input values xA, zA, xB, zB, xA−B, zA−B<br />
and n (see Equation 3.9).<br />
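For orientation, the two point operations can be sketched in Python using the standard x/z-only formulas for Montgomery curves. Equations 3.9 and 3.10 are not reproduced in this section, so the formulas below are only assumed to match them. The function names are illustrative, and Python integers stand in for the hardware multiplier.

```python
def xdbl(xa, za, a24, n):
    """Duplicate a point: return (xC, zC) with PC = 2*PA (5 multiplications)."""
    s = (xa + za) ** 2 % n          # squaring
    d = (xa - za) ** 2 % n          # squaring
    t = (s - d) % n                 # equals 4*xa*za, no multiplication needed
    xc = s * d % n                  # multiplication
    zc = t * (d + a24 * t) % n      # two multiplications
    return xc, zc


def xadd(xa, za, xb, zb, xd, zd, n):
    """Add two points whose difference (xd : zd) = PA - PB is known.

    Costs 6 multiplications in general and 5 when zd == 1.
    """
    u = (xa - za) * (xb + zb) % n   # multiplication
    v = (xa + za) * (xb - zb) % n   # multiplication
    xc = (u + v) ** 2 % n           # squaring
    zc = xd * (u - v) ** 2 % n      # squaring and multiplication
    if zd != 1:
        xc = zd * xc % n            # extra multiplication in the general case
    return xc, zc


def ladder(k, x, a24, n):
    """Binary method: x/z coordinates of k*P for P = (x : 1) and k >= 1.

    The difference of the two running points is always P, so xd = x and
    zd = 1 stay constant throughout.
    """
    r0 = (x % n, 1)
    r1 = xdbl(x, 1, a24, n)
    for bit in bin(k)[3:]:          # bits of k below the leading one
        if bit == "1":
            r0, r1 = xadd(*r0, *r1, x, 1, n), xdbl(*r1, a24, n)
        else:
            r0, r1 = xdbl(*r0, a24, n), xadd(*r0, *r1, x, 1, n)
    return r0
```

Each loop iteration performs exactly one addition and one duplication, which is where the equal counts of the two operations in phase 1 come from.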
Notice that the values n, A24, xA−B and zA−B do not change during phase 1.<br />
Furthermore, zA−B = z1 can be chosen to be 1. Thus, no register is required for<br />
zA−B. The output values xC and zC can be written to certain input registers to<br />
³ Squarings and multiplications are considered to have identical complexity in our case, since<br />
the same hardware unit performs both multiplication and squaring.