Soft-Core Processor Design - CiteSeer
Similarly, the control-flow instructions require an instruction fetch from the target address in the
instruction memory, so their latency depends on the instruction memory latency. The control-flow
instructions introduce at least two cycles of branch penalty if synchronous memory is used.
Pipeline execution results are stored temporarily in the pipeline registers. The result of the
fetch stage is a fetched instruction, which is stored temporarily in the IR register. The results of
the decode and operand stages are stored in pipeline registers D/O and O/X, respectively.
Unlike traditional RISC architectures [34], UT Nios does not have a write-back stage. The
operand stage was introduced instead to reduce the delay of the critical path in the execute stage.
Introducing the write-back stage would likely decrease the processor performance because of the
stalls caused by data hazards. The operand stage does not incur such stalls. A discussion of the
pipeline organization and its implications for performance is given in Chapter 6. The following
sections present the structure and the functionality of the datapath modules.
4.1.1. Prefetch Unit
The UT Nios prefetch unit serves as the UT Nios instruction master. It
connects directly to the Avalon bus and communicates with the instruction memory
using the predefined Avalon signals. If the pipeline commits one instruction per cycle,
instructions from the prefetch unit are forwarded directly to the decode stage of the pipeline.
Since the instruction master on the Avalon bus supports latency-aware transfers, the prefetch unit issues
several consecutive reads even if the pipeline stalls and the instructions are not required
immediately. In this case, the prefetched instructions are temporarily stored in a FIFO buffer.
When the stall is resolved, the next instruction is ready, and pipeline execution may continue
immediately. Using the FIFO buffer thus reduces the pipeline latency. Without it, a memory read
would have to be issued after the stall, and execution could continue only once the new instruction had
been fetched. While the pipeline is stalled, the prefetch unit issues at most as many memory reads as
the FIFO buffer has entries. The size of the UT Nios FIFO buffer is configurable using a
Verilog defparam statement.
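
The buffering behaviour described above can be illustrated with a small behavioural model. This is only a sketch under simplifying assumptions that do not come from the UT Nios design: memory reads complete in a single cycle, addresses are plain word indices, no branches occur, and no new read is issued in a cycle that drains the FIFO. The names `PrefetchModel`, `fifo_size`, `next_address`, and `run` are hypothetical and chosen for illustration; `fifo_size` merely stands in for the configurable buffer size.

```python
from collections import deque


class PrefetchModel:
    """Toy cycle-level model of a prefetch unit with a FIFO buffer.

    Assumptions (not from the source): single-cycle memory reads, word-indexed
    addresses, no branches, and one instruction delivered per unstalled cycle.
    """

    def __init__(self, memory, fifo_size=2, start_address=0):
        self.memory = memory            # instruction memory as a list of words
        self.fifo_size = fifo_size      # stands in for the configurable FIFO size
        self.next_address = start_address
        self.fifo = deque()
        self.reads_issued = 0

    def _read(self):
        """Issue one memory read and advance the fetch address."""
        instruction = self.memory[self.next_address]
        self.next_address += 1
        self.reads_issued += 1
        return instruction

    def cycle(self, stalled):
        """Advance one clock cycle; return the instruction sent to decode, or None."""
        if stalled:
            # Keep prefetching during a stall, but only while the FIFO has room.
            if len(self.fifo) < self.fifo_size:
                self.fifo.append(self._read())
            return None
        if self.fifo:
            # The stall has cleared: the oldest buffered instruction is ready at once.
            return self.fifo.popleft()
        # No stall and nothing buffered: forward the fetched instruction directly.
        return self._read()


def run(model, stall_pattern):
    """Drive the model with a list of per-cycle stall flags; collect its outputs."""
    return [model.cycle(stalled) for stalled in stall_pattern]
```

Driving the model with one unstalled cycle, three stalled cycles, and three more unstalled cycles shows the intended effect: with a two-entry FIFO, only the first two stalled cycles issue reads, and the instructions buffered during the stall are delivered in order as soon as the stall clears, without waiting for a fresh memory read.
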
On system reset, the prefetch unit starts fetching instructions from a user-defined starting
memory address. To keep track of the instruction that needs to be fetched next, the prefetch unit
maintains a copy of the program counter called the prefetch program counter (PPC). The PPC is
independent of the program counter visible to the programmer and is incremented every time the
prefetch unit issues a memory read. It is updated with the branch target address by the
branch unit when a taken branch executes in the pipeline. Since branches are executed in the third
stage of the pipeline, the prefetch unit may have already fetched instructions past the delay slot of