1 Montgomery Modular Multiplication in Hardware
FEI KEMT<br />
embedded processor with a set of dedicated coprocessors. For such a system a<br />
highly flexible (although typically slower) scalable MMM coprocessor could be more<br />
attractive than a fixed length dedicated one.<br />
That direction was chosen in our research, where our goal is to analyse and<br />
implement a solution that allows quick prototyping of special-purpose hardware<br />
designs and uses features of the target platform to accelerate execution of the<br />
MMM operation.<br />
The radix-2 MMM algorithm (b = 2) is well suited to hardware implementation<br />
because it needs only easily implementable operations: a word-by-bit multiplication, a<br />
bit-shift (division by two) and an addition. Implementations with a higher radix have<br />
also been published [30, 110] and offer a viable alternative, but they require a more complex<br />
algebraic unit.<br />
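The three primitive operations named above can be illustrated with a minimal Python sketch of a single radix-2 iteration. The function name `radix2_step` and its parameters are hypothetical and chosen only for illustration; the sketch assumes an odd modulus `m` and models the datapath with arbitrary-precision integers rather than fixed-width registers.

```python
def radix2_step(s: int, x_bit: int, y: int, m: int) -> int:
    """One radix-2 iteration: returns (s + x_bit*y + q*m) / 2 for odd m."""
    # Word-by-bit multiplication: the single bit x_bit selects either y or 0.
    partial = y if x_bit else 0
    # Addition of the selected operand to the running sum.
    t = s + partial
    # q is the least significant bit of t; since m is odd, adding m when
    # q = 1 makes the sum even, so the division by two below is exact.
    q = t & 1
    if q:
        t += m
    # Division by two implemented as a one-bit right shift.
    return t >> 1
```

For example, `radix2_step(0, 1, 3, 7)` adds 3, finds the sum odd, adds the modulus 7 to get 10, and shifts to obtain 5, which satisfies 2 · 5 ≡ 3 (mod 7).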
Radix-2 Montgomery Multiplication Algorithm. The simplified version of<br />
the MMM algorithm (Algorithm 1 – 2) for radix $b = 2$ and<br />
k-bit operands $X = (x_{k-1}, \ldots, x_1, x_0)$, $Y$, and $M$ is given as Algorithm 1 – 3.<br />
Algorithm 1 – 3 The basic radix-2 Montgomery multiplication algorithm for k-bit<br />
operands $X = (x_{k-1}, \ldots, x_1, x_0)$, $Y$, and $M$<br />
Require: $M = (m_{k-1} \ldots m_0)_2$, $X = (x_{k-1} \ldots x_0)_2$, $Y = (y_{k-1} \ldots y_0)_2$,<br />
$M' = -M^{-1} \bmod 2$, $E = (e_t \ldots e_0)_2$ with $e_t = 1$, $R = 2^k$, and an integer $X$, $1 \le X < M$.<br />
The values $R^2 \bmod M$ and $R \bmod M$ may also be provided as precomputed<br />
inputs.<br />
Ensure: $S = XYR^{-1} \bmod M$.<br />
1: $S_0 \Leftarrow 0$
2: for $i = 0$ to $k - 1$ do
3: $q_i \Leftarrow (S_i + x_i Y) \bmod 2$
4: $S_{i+1} \Leftarrow (S_i + x_i Y + q_i M)/2$
5: end for
6: if $S_k \ge M$ then
7: $S_k \Leftarrow S_k - M$
8: end if
9: $S \Leftarrow S_k$
10: return $S$
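The algorithm above can be sketched in software as follows. This is a minimal Python model under stated assumptions, not the hardware design discussed in the text: the names `mont_mul_radix2` and `mod_mul_via_montgomery` are hypothetical, the modulus is assumed odd, and Python's arbitrary-precision integers stand in for k-bit registers. The second function illustrates one possible use of the precomputed value R^2 mod M mentioned in the requirements, namely converting operands into and out of the Montgomery domain.

```python
def mont_mul_radix2(x: int, y: int, m: int, k: int) -> int:
    """Return x * y * 2^(-k) mod m for odd m, 0 <= x < 2^k, 0 <= y < m."""
    if m % 2 == 0:
        raise ValueError("modulus must be odd")
    s = 0
    for i in range(k):
        x_i = (x >> i) & 1               # i-th bit of X, least significant first
        t = s + (y if x_i else 0)        # S_i + x_i * Y
        q_i = t & 1                      # q_i = (S_i + x_i * Y) mod 2
        s = (t + (m if q_i else 0)) >> 1  # S_{i+1} = (S_i + x_i*Y + q_i*M) / 2
    if s >= m:                           # final conditional subtraction
        s -= m
    return s


def mod_mul_via_montgomery(a: int, b: int, m: int) -> int:
    """Compute a * b mod m (0 <= a, b < m, m odd) using four Montgomery products."""
    k = m.bit_length()                   # assumption: k is the bit length of m
    r2 = pow(2, 2 * k, m)                # R^2 mod M, normally a precomputed input
    a_bar = mont_mul_radix2(a, r2, m, k)         # a * R mod m
    b_bar = mont_mul_radix2(b, r2, m, k)         # b * R mod m
    p_bar = mont_mul_radix2(a_bar, b_bar, m, k)  # a * b * R mod m
    return mont_mul_radix2(p_bar, 1, m, k)       # leave the domain: a * b mod m
```

As a small check, with M = 7, k = 3, X = 3 and Y = 5 the loop produces S = 6, 9, 8 and the final subtraction gives 1, which agrees with 3 · 5 · 8^(-1) mod 7, since 1 · 8 ≡ 15 (mod 7).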
A comparison of Algorithms 1 – 2 and 1 – 3 shows how the choice of<br />
b = 2 simplifies the operations inside the MMM. The modular reduction<br />