1 Montgomery Modular Multiplication in Hardware
FEI KEMT
In the coprocessor we need to store four operands for the MMM computations:
three input operands X, Y, M and the result S. Storing S requires one register
in the non-redundant representation and two registers in the redundant one.
The scalability applied to the ALU needs to be carried over to the
memory block, too.
The scalable design makes the architecture easily adaptable to operand
lengths different from the one for which the system was originally designed.
In the memory block the number of stored variables is constant (four or five,
depending on the chosen implementation). What varies is the number of words
per variable and, consequently, the number of bits needed to address them.
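The dependence of the address width on the operand length can be illustrated with a short calculation. The following is only a sketch: it assumes operands of n bits split into w-bit words, and the function name `words_and_address_bits` is hypothetical, not part of the described design.

```python
def words_and_address_bits(n, w):
    """Return (number of words, internal address bits) for n-bit operands
    stored as w-bit words. Illustrative helper, not part of the design."""
    if n <= 0 or w <= 0:
        raise ValueError("n and w must be positive")
    e = -(-n // w)                             # ceiling of n / w
    addr_bits = max(1, (e - 1).bit_length())   # bits needed to index e words
    return e, addr_bits


if __name__ == "__main__":
    for n, w in [(1024, 16), (2048, 16), (2048, 32)]:
        e, bits = words_and_address_bits(n, w)
        print(f"n={n}, w={w}: {e} words, {bits} address bits")
```

Doubling the operand length at a fixed word width adds one address bit, while the number of stored variables, and hence the register address width, stays unchanged.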
We propose a model in which each word of every variable can be addressed
from the coprocessor as well as from the host unit. We distinguish an internal
address of a word, which specifies its location within a given coprocessor and
register, a register address, which selects the register holding the required
variable, and finally a coprocessor address, which distinguishes between several
ALUs. With this memory management a control unit can address any word of a
chosen coprocessor, store the input values for the computation there and
afterwards read the results for further processing. The number of address bits
at each level can be adapted to the number of coprocessors, variables and
words. The address width is usually given by the word width of the interface
between the processor and the coprocessor. For an address longer than the
interface word width an appropriate address model needs to be chosen: either
accepting several address signals in parallel or distinguishing the address
type in some other way.
Table 2 – 1 Address of operands from host processor level (LSB right)
coprocessor   register   internal
XX            XXX        XXXXXXX
The memory address bits are assigned as shown in Table 2 – 1 (LSB on the right).
In this example address format the CPU can handle up to 4 MMM
coprocessors (two address bits), each with 8 operand registers (three address
bits) of 128 words (seven address bits). Such a configuration is suitable for
RSA computations with operand length n = 2048 bits and word width w = 16 bits,
which gives e = n/w = 128 words.
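The three-level address of Table 2 – 1 can be sketched in software as simple bit-field packing. This is a minimal illustration assuming the 2 + 3 + 7 bit split of the example; the constant and function names are hypothetical and do not come from the described hardware.

```python
# Field widths from the example address format (LSB on the right).
COPROC_BITS = 2   # up to 4 coprocessors
REG_BITS = 3      # up to 8 operand registers per coprocessor
WORD_BITS = 7     # up to 128 words per register


def pack_address(coproc, reg, word):
    """Combine coprocessor, register and internal word index into one address."""
    if not 0 <= coproc < (1 << COPROC_BITS):
        raise ValueError("coprocessor index out of range")
    if not 0 <= reg < (1 << REG_BITS):
        raise ValueError("register index out of range")
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word index out of range")
    return (coproc << (REG_BITS + WORD_BITS)) | (reg << WORD_BITS) | word


def unpack_address(addr):
    """Split an address back into (coprocessor, register, word)."""
    word = addr & ((1 << WORD_BITS) - 1)
    reg = (addr >> WORD_BITS) & ((1 << REG_BITS) - 1)
    coproc = (addr >> (REG_BITS + WORD_BITS)) & ((1 << COPROC_BITS) - 1)
    return coproc, reg, word


if __name__ == "__main__":
    addr = pack_address(1, 2, 5)   # coprocessor 1, register 2, word 5
    print(f"{addr:012b}", unpack_address(addr))
```

The full address occupies 12 bits here, so it fits into a single 16-bit interface word; a wider configuration would need one of the extended address models mentioned above.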