13.07.2015 Views

The PowerPC 604 RISC Microprocessor - eisber.net

The PowerPC 604 RISC Microprocessor - eisber.net

The PowerPC 604 RISC Microprocessor - eisber.net

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

17-bit signed integer. It can sustain onemultiply per rwo cycles for larger 32x32-bit operands. <strong>The</strong> CFX also uses themultiply pipeline stages to execute adivide instruction in 20 cycles. <strong>The</strong> <strong>604</strong>Stwo simple integer units execute allother integer instructions in one cycle.<strong>The</strong> chree-stage floating-point unit cansustain one double-precision multiplyaddper cycle, one single-precisiondivide every 18 cycles, or one doubleprecisiondivide every 31 cycles. It compliesfully with the IEEE-Std 754floating-point arithmetic standard. <strong>The</strong><strong>604</strong> provides hardware support fordenormalization, exceptions, and threegraphics instructions. It also provides aGPR result busesIntegerunit 1non-IEEE mode for graphics support. Tne non-IEEE modeconverts a denorm.alized result to zero to avoid prenormalizationin subsequent operations.Instruction completion and write back. After aninstruction executes, the execution unit copies results to itsrename buffer entries and the execution status to its reorderbuffer entry. Among other things, the execution status indicateswhether the instruction Finished execution without anexception. Of the four reorder buffer entries examined everycycle, up to fcur instructions that finished without an exceptioncomplete in program order.Other than the in-order completion necessary to supportprecise exceptions, the <strong>604</strong> imposes only a few additionalrestrictions on instruction completion. <strong>The</strong>y are1) Stop before a store instruction. Since a store dataoperand is read from the register file in the completionstage, a store instruction cannot complete if its storeoperand is still in the rename buffer. Stopping completionbefore a store instruction allows the store operandto be written to the register File, even if it is producedby an instruction currently being completed.2) Stop after a taken branch instruction. Since a takenbranch instruction sets the program counter to its targetaddress. it is speed critical to advance the programcounter from the new target address by the number ofinstructions completed after the taken branch in onecycle. Stopping completion after a taken branch instructionavoids this logic altogether.To minimize effects of long execution latenc'i on in-ordercompletion. the completion stage overlaps with the last executioncycle for those instructions with multicycle executionstage. <strong>The</strong>se include the multiply. divide. store. load miss, andexecution serialized instructions A store instruction completesas soon :is it is translated without an exception. Similarly. aIntegerunit 2Complexinteger unitGPRrenamebufferGPR General-purpose registerOp OperandFigure 4. GPR result buses, reservation stations, and rename buffer.load instruction that misses in the data cache completes upontranslation without an exception. Since the reservation stationscan forward the load data when it is available to the dependentinstructions, the load miss can safely be completed.Most superscalar designs impose additional restrictionsdue to a limited number of ports to register files. For instance.four write ports would be required to complete up to fourinstructions if each one can update one register. <strong>The</strong> <strong>604</strong>GPR file would require eight write ports to complete fourload-with-update instnictions per cycle. <strong>The</strong> <strong>604</strong> avoids thisproblem by decoupling instruction completion from registerFile updates using the write-back stage. Instructions completewithout regard to the type or number of registers theyupdate. <strong>The</strong> completion stage updates their results if ports areavailable; if not. the write-back stage updates them. <strong>The</strong>rename buffer entries function as temporary buffers for thoseinstructions that are not completed and as write-back buffersfor those that are. All three GPR. FPR. and CR rename bufferscontain two read ports for write back. Correspondingly, thethree register files have two write ports for write back.Memory operationsHigh-speed superscalar processors require a greater memorybandwidth to sustain their performance. <strong>The</strong> <strong>604</strong> meetsthe increased demand with large on-chip caches, nonblockingmemory operations, and a high-bandwidth systeminterface. <strong>The</strong> <strong>604</strong> takes advantage of the weakly orderedmemory model, to which the <strong>PowerPC</strong> architecture subscribes,to offer efficient memory operations. Although loadsand stores that hit in the data cache can bypass earlier loadsand stores, program order memory access can be enforcedwith instructions provided for this purpose.Load/store unit Figure 5 (next page) shows a block diagramof the load/store unit and the memory queues. Tnisunit has a two-cycle execution stage It calculates the memoryaddress and translates that address with a 128-entry, two-October 1994 13

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!