04.11.2012 Views

1 Montgomery Modular Multiplication in Hardware


FEI KEMT

1. Fully software solution implemented on a 32-bit Nios processor.

2. Mixed software-hardware design with a 16-bit Nios processor and the pipelined coprocessor including the CSA PE.

3. Mixed software-hardware design with a 16-bit Nios processor and the pipelined coprocessor including the CPA PE.

Below, we provide the details of each system design and comment on the results obtained.

1. The software implementation of the MMM algorithm was written in Nios assembly language using all known optimization techniques for the target processor. The Separated Operand Scanning (SOS) MMM method [39] was used as the best method for the given Nios RISC architecture [66]. Table 2 – 5 shows the execution times of the MMM for the fully software solution running on the processor clocked at 50 MHz. The 32-bit Nios processor occupies 2137 LEs, excluding the logic for the integer multiplier (for the MUL instruction), which requires an additional 446 LEs.

In a software implementation it is effective to use different algorithms for multiplication and squaring, which reduces the execution time of the squaring operation. However, because differing execution times make the implementation vulnerable to side-channel attacks, it is better to align the execution times of both operations.
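The trade-off described above can be made concrete with a rough operation count. The sketch below counts only word-level multiplications in one SOS Montgomery operation on s words, and it assumes that the dedicated squaring routine computes each symmetric cross product once and doubles it. The text does not state which squaring algorithm was actually used, and the function name `sos_word_mults` is hypothetical.

```python
def sos_word_mults(s, dedicated_squaring=False):
    """Word-level multiplications in one SOS Montgomery operation on s words."""
    # Product phase: s*s partial products for a general multiplication, or
    # s*(s+1)/2 if the symmetric cross terms are computed once and doubled
    # (an assumption about the squaring routine, not taken from the text).
    product = s * (s + 1) // 2 if dedicated_squaring else s * s
    # Reduction phase: s*s products m*n[j] plus s products for the digits m.
    reduction = s * s + s
    return product + reduction


for bits in (1024, 2048):
    s = bits // 32  # number of 32-bit words in the operand
    mul = sos_word_mults(s)
    sqr = sos_word_mults(s, dedicated_squaring=True)
    print(bits, mul, sqr, round(sqr / mul, 3))
```

Under these assumptions a squaring needs roughly three quarters of the word multiplications of a general multiplication. The count ignores loads, stores and carry handling, so it is only a first-order estimate. Calling the general multiplication routine for squarings as well makes both counts identical, which is the alignment of execution times recommended above, at the cost of the squaring speed-up.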

Table 2 – 5 Execution times of software implementation of MMM on Altera Nios development board (with APEX EP20K200 clocked at 50 MHz)

Length (e × w)   Method     Multiplication (ms)   Squaring (ms)
1024             SOS32MEM   2.40                  1.87
2048             SOS32MEM   9.47                  7.24
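For orientation, the SOS method used in the software solution can be sketched as follows. This is a minimal Python illustration of the general technique (full product first, then word-by-word Montgomery reduction, then one conditional subtraction), not the Nios assembly code measured above. The word size `W = 32` and the names `mont_mul_sos`, `to_words` and `from_words` are assumptions made for this example.

```python
W = 32                      # assumed word size in bits
MASK = (1 << W) - 1


def to_words(x, s):
    """Split integer x into s little-endian W-bit words."""
    return [(x >> (W * i)) & MASK for i in range(s)]


def from_words(words):
    """Reassemble an integer from little-endian W-bit words."""
    return sum(word << (W * i) for i, word in enumerate(words))


def mont_mul_sos(a, b, n, s):
    """Return a*b*R^-1 mod n with R = 2^(W*s); n must be odd and a, b < n."""
    aw, bw, nw = to_words(a, s), to_words(b, s), to_words(n, s)
    n0_inv = (-pow(nw[0], -1, 1 << W)) & MASK   # -n^-1 mod 2^W
    t = [0] * (2 * s + 1)

    # Phase 1: schoolbook product t = a * b.
    for i in range(s):
        carry = 0
        for j in range(s):
            acc = t[i + j] + aw[j] * bw[i] + carry
            t[i + j] = acc & MASK
            carry = acc >> W
        t[i + s] = carry

    # Phase 2: add multiples of n so that the low s words of t become zero.
    for i in range(s):
        m = (t[i] * n0_inv) & MASK
        carry = 0
        for j in range(s):
            acc = t[i + j] + m * nw[j] + carry
            t[i + j] = acc & MASK
            carry = acc >> W
        k = i + s
        while carry:                             # propagate the final carry
            acc = t[k] + carry
            t[k] = acc & MASK
            carry = acc >> W
            k += 1

    # Phase 3: divide by R (drop the low s words) and subtract n once if needed.
    u = from_words(t[s:])
    return u - n if u >= n else u
```

A 1024-bit operand corresponds to `s = 32` in this sketch and a 2048-bit operand to `s = 64`. Both inner loops run s × s times, so doubling the operand length roughly quadruples the work, which agrees with the growth of the measured times in Table 2 – 5.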

2. In the mixed hardware-software design, multiplication and squaring are implemented completely in hardware. Both operations share the same arithmetic unit. Because the computational load moves from the main processor to the dedicated coprocessor, the 32-bit version of the Nios core is no longer needed. Instead of the 32-bit controller one can include the 16-bit
