1 Montgomery Modular Multiplication in Hardware

FEI KEMT

The more universal a design is, the lower its speed compared with a system designed for fixed operand parameters. A typical scalable coprocessor consists of two separate blocks, a memory (registers) and an arithmetic logic unit (ALU), connected by a w-bit data path, as shown in Figure 2 – 1. The word width w sets the smallest unit of data operated on, the word. It divides the operand length k into shorter lengths, usually a multiple of 8 bits, that suit the target hardware structure better.

Figure 2 – 1 Architecture of a general scalable coprocessor based on separate memory and ALU connected by a w-bit data path (blocks: data input, data memory, scalable ALU, control logic, data output)

Separation of the ALU and the memory is the first fundamental difference from FPGA designs that include an MMM optimized for fixed-length operands (e.g. [29, 41]). The scalable algorithm requires word-oriented processing that makes it possible to change the number of words, or even the word width w. Normally w is smaller than the operand length k, so the computation time is proportionally longer. Better performance can still be achieved by implementing a smaller but faster ALU that allows a higher clock frequency.

Let us consider w-bit words. For operands with k-bit precision, $e_1 = \lceil (k + 1)/w \rceil$ words are required for Algorithm 1 – 3. The extra bit in the calculation of $e_1$ is required because $S_i$ (the internal variable of the radix-2 algorithm) lies in the range $[0, 2M - 1]$ [108]. All computations of Algorithm 1 – 3 must therefore be done with one extra bit of precision, and the input operands need an extra zero bit at the MSB position to extend their precision to the correct value. Algorithm 1 – 4 requires $e_2 = \lceil (k + 3)/w \rceil$ words in order to support the extended range of the input variables X, Y and the internal variable $S_i$. Note that in many practical configurations $e_1 = e_2$, and no additional words are required for Algorithm 1 – 4.
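The two word counts can be checked numerically. The following is a minimal sketch; the function names and the sample (k, w) pairs are illustrative assumptions and do not come from the thesis.

```python
def words_alg_1_3(k: int, w: int) -> int:
    """e1 = ceil((k + 1) / w): words needed with one extra bit of precision."""
    return -(-(k + 1) // w)  # integer ceiling division


def words_alg_1_4(k: int, w: int) -> int:
    """e2 = ceil((k + 3) / w): words needed with three extra bits of precision."""
    return -(-(k + 3) // w)


if __name__ == "__main__":
    # Hypothetical operand lengths and word widths, chosen only for illustration.
    for k, w in [(1024, 8), (1024, 16), (1024, 32), (1023, 32)]:
        e1, e2 = words_alg_1_3(k, w), words_alg_1_4(k, w)
        print(f"k={k:5d} w={w:3d} e1={e1:4d} e2={e2:4d} equal={e1 == e2}")
```

For k = 1024 the two counts coincide for each of the word widths tried here. They differ only when k + 1 fits exactly into a whole number of words or falls one bit short of it, as with k = 1023 and w = 32.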
The operand X will need two extra zero bits, at the MSB and the subsequent position, in order to extend the precision to the k + 3 cycles required by Algorithm 1 – 4. In practical configurations k ≥ 1024, so the difference in the number of cycles is not significant. On the other hand, the possibility of removing the correction unit from the hardware design of Algorithm 1 – 4 is a valuable advantage. In the rest of the thesis the notations $e_1$ and $e_2$ denote the number of words where we need to emphasize the difference between the algorithms, and e denotes the number of words in general.

2.1.1 Scalable Multiple-Word Algorithms

Operations in Algorithm 1 – 3 and Algorithm 1 – 4 are performed on full-precision operands and do not provide the scalability feature explained above. We analyse the relations between the parameters of the multipliers and the underlying FPGA structure and provide a solution suitable for devices with a fast carry architecture.

A scalable algorithm in which the operand Y (multiplicand) is scanned word by word and the operand X (multiplier) is scanned bit by bit was proposed in [108, 109]. The Multiple Word Radix-2 Montgomery Multiplication algorithm (MWR2MM) uses the following vectors:

$$
\begin{aligned}
M &= (M^{(e-1)}, \ldots, M^{(1)}, M^{(0)}) \\
Y &= (Y^{(e-1)}, \ldots, Y^{(1)}, Y^{(0)}) \\
S &= (S^{(e-1)}, \ldots, S^{(1)}, S^{(0)}) \\
X &= (x_{k-1}, \ldots, x_1, x_0)
\end{aligned}
\tag{2.1}
$$

where the words are marked with superscripts and the bits with subscripts. The concatenation of vectors a and b is written $(a, b)$. A particular range of bits in a vector a from position i to position j, j > i, is expressed as $a_{j..i}$. Bit position i of the k-th word of a is represented by the symbol $a^{(k)}_i$. The details of the MWR2MM algorithm (further referred to as MWR2MM CSA, where CSA stands for Carry-Save Adder) are given in [108]; in the thesis it is denoted Algorithm 2 – 1. The optimized version of the MMM, Algorithm 1 – 4, can be transformed into a multiple-word form (referred to as MWR2MM CPA, where CPA stands for Carry-Propagate Adder) in a similar way, as shown in Algorithm 2 – 2.
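To make the word-by-word and bit-by-bit scanning concrete, the following is a minimal software sketch of a multiple-word radix-2 Montgomery multiplication. It is not a transcription of Algorithm 2 – 1 or Algorithm 2 – 2: it assumes M is odd, X, Y < M < 2^k, uses $e = \lceil (k + 1)/w \rceil$ words, merges everything into a single carry with ordinary integer additions instead of carry-save or carry-propagate hardware adders, and omits any final correction step. The names `split_words`, `join_words` and `mwr2mm` are hypothetical.

```python
def split_words(value: int, w: int, e: int) -> list[int]:
    """Split an integer into e words of w bits, least significant word first."""
    mask = (1 << w) - 1
    return [(value >> (w * j)) & mask for j in range(e)]


def join_words(words: list[int], w: int) -> int:
    """Inverse of split_words."""
    return sum(word << (w * j) for j, word in enumerate(words))


def mwr2mm(x: int, y: int, m: int, k: int, w: int) -> int:
    """Return S with S * 2**k congruent to x * y (mod m) and 0 <= S < 2 * m.

    Assumes m is odd and x, y < m < 2**k (assumptions of this sketch).
    """
    e = -(-(k + 1) // w)          # number of w-bit words, one extra bit of precision
    mask = (1 << w) - 1
    Y, M = split_words(y, w, e), split_words(m, w, e)
    S = [0] * e
    for i in range(k):            # scan X bit by bit
        x_i = (x >> i) & 1
        # M is added only when S + x_i * Y is odd, so the sum becomes even.
        q = (S[0] + x_i * Y[0]) & 1
        carry = 0
        for j in range(e):        # scan Y and M word by word
            t = S[j] + x_i * Y[j] + q * M[j] + carry
            word, carry = t & mask, t >> w
            if j > 0:
                # Right shift by one: bit 0 of word j becomes the top bit of word j-1.
                S[j - 1] |= (word & 1) << (w - 1)
            S[j] = word >> 1
        S[e - 1] |= carry << (w - 1)   # final carry becomes the top bit after the shift
    return join_words(S, w)


if __name__ == "__main__":
    x, y, m, k, w = 100, 200, 239, 8, 4    # illustrative values only
    s = mwr2mm(x, y, m, k, w)
    print(s, (s * 2**k) % m == (x * y) % m)
```

The result stays below 2M rather than M, which is why the extra bit of precision is needed when sizing the word vector.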
The algorithms are named after the way they are implemented, which is explained in the following parts of the thesis. The algorithms compute a partial sum S for each bit of X, scanning the words of Y and M. Once the precision is exhausted, another bit of X is taken and the scan is repeated. Thus, the algorithms MWR2MM CSA as well as MWR2MM CPA