1 Montgomery Modular Multiplication in Hardware
FEI KEMT<br />
[Figure: array of full adders (FA), one column per bit position from w-1 down to 0, with inputs x_i, q, Y(j), M(j), S1(j-1), S2(j-1) and outputs S1(j), S2(j)]<br />
Figure 2 – 3 Block diagram of the CSA-based w-bit MWR2MM processing element (CSA PE) based on FA<br />
of the MWR2MM CSA algorithm. A positive property of this implementation is that it
does not depend on carry-chain logic on the target platform.
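The carry-save principle behind the CSA PE can be illustrated with a minimal Python sketch. It is not the thesis's exact wiring: it omits the right shift and the word-to-word carries of MWR2MM, and the function names (`full_adder`, `csa_layer`, `csa_accumulate`) are hypothetical. It only shows how two layers of independent full adders add `x_i*Y` and `q*M` to a sum held as two words, with no carry rippling along the row.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def csa_layer(x, y, z, w):
    """Compress three operands (each below 2**w) into a sum word and a carry word.

    Every bit position uses its own FA; no carry travels between positions
    inside the layer, so the delay does not grow with w.
    """
    sum_word = 0
    carry_word = 0
    for k in range(w):
        s, c = full_adder((x >> k) & 1, (y >> k) & 1, (z >> k) & 1)
        sum_word |= s << k
        carry_word |= c << (k + 1)  # a carry out of bit k has weight 2**(k+1)
    return sum_word, carry_word


def csa_accumulate(s1, s2, x_i, y, q, m, w):
    """Assumed functional model: (s1 + s2) + x_i*y + q*m in carry-save form."""
    t_sum, t_carry = csa_layer(s1, s2, y if x_i else 0, w)
    # The first layer's carry word may occupy bit w, so widen the second layer.
    return csa_layer(t_sum, t_carry, m if q else 0, w + 1)


n1, n2 = csa_accumulate(0xA5, 0x3C, 1, 0xFF, 1, 0xF1, 8)
print(n1 + n2)  # equals 0xA5 + 0x3C + 0xFF + 0xF1
```

The result stays in redundant form (two words); a single carry-propagate addition is needed only when the final value is read out.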
Carry-Propagate Adder Unit Recent FPGAs contain high-speed interconnect
lines between adjacent logic blocks, designed to provide efficient
carry propagation. The CPA PE architecture presented in this thesis is optimal for
implementing the MMM unit on any FPGA that has dedicated carry logic
(e.g. modern Altera and Xilinx FPGAs). The basic organization of the
ALU consists of two layers of conventional CPAs, as shown in Figure 2 – 4.
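As a rough illustration of a two-layer CPA organization, the sketch below models each layer as a ripple-carry adder built from full adders. Figure 2 – 4 is not reproduced here, so the operand assignment (first layer adds `x_i*Y`, second layer adds `q*M`) and all function names are assumptions made for this example, not the thesis's design.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def ripple_carry_add(a, b, w, carry_in=0):
    """w-bit carry-propagate addition; the carry ripples from the LSB upward."""
    carry = carry_in
    total = 0
    for k in range(w):
        s, carry = full_adder((a >> k) & 1, (b >> k) & 1, carry)
        total |= s << k
    return total, carry


def cpa_word_step(s, x_i, y, q, m, w):
    """Assumed model of two CPA layers: s + x_i*y, then + q*m, on w-bit words.

    Returns the w-bit result word and the combined carry-out (0, 1, or 2).
    """
    t, c1 = ripple_carry_add(s, y if x_i else 0, w)
    r, c2 = ripple_carry_add(t, m if q else 0, w)
    return r, c1 + c2


word, carry = cpa_word_step(200, 1, 100, 1, 251, 8)
print(word, carry)  # 200 + 100 + 251 = 551 = 2 * 256 + 39
```

In hardware the loop in `ripple_carry_add` corresponds to the dedicated carry chain, which is why the achievable word width depends on how fast that chain is on the target device.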
Unlike the CSA PE, the CPA PE does not support an arbitrary word
width w. The limit on the number of FAs in one row is set by the target
technology. The more LEs are chained by fast (and short) interconnections, the larger
the word width can be while still achieving speed comparable to the CSA PE. The
carry signal generated in the first FA from the left side (for LSB) is then
processed in the adjacent FA, which outputs another carry signal for the third adder
in the row, and so on. In this way the carry signal is propagated to the rightmost FA (for