17.11.2012 Views

Soft-Core Processor Design - CiteSeer

Soft-Core Processor Design - CiteSeer

Soft-Core Processor Design - CiteSeer

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

prefetch program counter (PPC), which enables fetching of instructions from the target address.<br />

The target address for BR and BSR instructions is calculated in the operand stage of the pipeline.<br />

Calculating the target address in the operand stage, as opposed to the execute stage, enables the<br />

update of the PPC to happen earlier in the pipeline. Updating the PPC earlier ensures that the<br />

instructions from the target address are fetched sooner, which results in fewer pipeline stalls.<br />

Updating the PPC even earlier in the pipeline is not feasible, unless some form of speculative<br />

execution is implemented. The control-flow instruction might be preceded by a skip instruction<br />

whose outcome is unknown before the skip reaches the execute stage. Furthermore, JMP and<br />

CALL instructions use a register value to specify the target address. The value of the register<br />

might be written by an instruction preceding the JMP or CALL, and the result is unknown until<br />

that instruction reaches the execute stage. Values from the execute stage are forwarded to the<br />

branch unit in the same cycle they are calculated in, so that branching may start immediately. For<br />

the TRAP instruction, a target address has to be fetched from a vector table entry in memory.<br />

Since the address of the vector table entry is calculated in the execute stage of the pipeline,<br />

branching is performed in the execute stage when the data memory responds with the target<br />

address. Details of the execute stage modules are described in the following sections.<br />

4.1.6. Arithmetic and Logic Unit<br />

Arithmetic and logic unit (ALU) performs computations specified by the arithmetic and logic<br />

instructions, as well as the memory address calculation for memory instructions. The ALU<br />

consists of several subunits, including the addition/subtraction unit, shifter unit, logic unit, and<br />

the ALU multiplexer. All subunits operate in parallel, and the ALU multiplexer selects the result<br />

of one of the subunits, depending on the instruction being executed.<br />

The addition/subtraction unit is implemented using the lpm_add_sub LPM library component.<br />

The output of the addition/subtraction unit is used directly as a data master memory address.<br />

Since memory addresses are always calculated using the addition/subtraction unit, there is no<br />

need to use the output of the multiplexer for the memory address. Bypassing the multiplexer<br />

reduces the delay of the memory address signals. Data to be written to the data memory using<br />

store instructions is forwarded directly from the O/X pipeline register. As mentioned previously<br />

in section 4.1.4, the addition/subtraction unit is also used for 2’s complement (NEG), and<br />

absolute (ABS) operations. The 2’s complement of a number is calculated by subtracting the<br />

number from 0. The same result is used by an ABS instruction if the original number is negative,<br />

while the result is the number itself if it is positive or zero.<br />

39

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!