17.11.2012 Views

Soft-Core Processor Design - CiteSeer


not mean that, for example, the instruction OP code should not be decoded but instead used directly as a set of control signals. The instruction decoder should be designed to generate as few signals as possible per datapath unit, which will in many cases be fewer than the number of bits in the OP code. For instance, a 4-to-1 multiplexer in the datapath can be controlled by only two signals, while the instruction OP code may have 6 or more bits, all of which may be required to determine the function of the multiplexer. Depending on the circuit configuration, minimizing the number of signals feeding the logic may not always reduce the number of LEs used, but in our experience, it often does.
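
As a rough illustration of the multiplexer case, the sketch below models a decoder that reduces a 6-bit OP code to the two select signals of a 4-to-1 multiplexer. The OP code values, the select assignments, and the function names are hypothetical and chosen only for illustration; they are not the Nios encoding or the UT Nios decoder.

```python
# Hypothetical 6-bit OP codes (illustrative values, not the Nios encoding).
OP_ADD = 0b000000
OP_SUB = 0b000010
OP_LOAD = 0b010110
OP_MOVI = 0b110100

# The four data inputs of a 4-to-1 multiplexer, each named by a 2-bit select.
SEL_REG, SEL_IMM, SEL_MEM, SEL_ZERO = 0, 1, 2, 3

# Decoder table: all 6 OP code bits are examined here, so the datapath
# unit itself only ever sees the 2-bit result.
MUX_SELECT = {
    OP_ADD: SEL_REG,
    OP_SUB: SEL_REG,
    OP_LOAD: SEL_MEM,
    OP_MOVI: SEL_IMM,
}


def decode_mux_select(opcode: int) -> int:
    """Map a 6-bit OP code to a 2-bit multiplexer select value."""
    if not 0 <= opcode < 64:
        raise ValueError("OP code must fit in 6 bits")
    # Unlisted OP codes fall back to the constant-zero input (an assumption).
    return MUX_SELECT.get(opcode, SEL_ZERO)


def mux4(select: int, inputs: tuple) -> int:
    """Behavioural model of a 4-to-1 multiplexer driven by a 2-bit select."""
    return inputs[select]


# Usage: the datapath receives only the decoded select, never the OP code.
data_inputs = (0x11, 0x22, 0x33, 0x00)  # register, immediate, memory, zero
result = mux4(decode_mux_select(OP_LOAD), data_inputs)
```

In this model the multiplexer logic depends on two signals regardless of the width of the OP code, which is the property the decoder guideline above aims for.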

We considered introducing additional pipeline stages to improve the Fmax. Since a full implementation of a different pipeline organization is time consuming, we used a simple method for analyzing the possible benefits of the new pipeline organization. We introduced registers into the design where the pipeline registers would be added if the new pipeline stage were introduced. The Fmax of such a design is an upper bound on the Fmax achievable by the new pipeline organization, because additional logic has to be introduced to handle newly introduced hazards and/or stalls, which would likely decrease the Fmax. The cycle count of the new implementation can be estimated by analyzing the newly introduced stalls and estimating their frequency. Although this generally requires detailed architectural research using simulation, we used only rough estimates. Our estimates indicated that possible gains in Fmax (described in the next section) obtained by introducing an additional stage to the pipeline could hardly compensate for the increase in the cycle count. The performance comparison between the UT Nios and the Altera Nios, which is a 5-stage pipelined Nios implementation, presented in the previous chapter suggests that our estimates were correct. However, more research is needed to prove this.
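
The trade-off described above can be sketched numerically. Execution time is the cycle count divided by the clock frequency, so a deeper pipeline pays off only if the relative Fmax gain exceeds the relative increase in cycle count. The function names, the ideal one-cycle-per-instruction baseline, and all numbers below are assumptions made for illustration; they are not measurements from the UT Nios.

```python
def estimated_cycles(instructions: int, stall_sources) -> float:
    """Estimate the cycle count from stall frequencies.

    stall_sources is an iterable of (frequency, penalty) pairs, where
    frequency is the fraction of instructions that incur the stall and
    penalty is its cost in cycles. The baseline of one cycle per
    instruction is a simplifying assumption.
    """
    stall_cycles_per_instruction = sum(f * p for f, p in stall_sources)
    return instructions * (1.0 + stall_cycles_per_instruction)


def speedup(fmax_gain: float, cycle_increase: float) -> float:
    """Ratio of old to new execution time, with time = cycles / Fmax.

    Both arguments are fractions (0.10 means 10%). A result below 1.0
    means the new pipeline organization is slower overall.
    """
    return (1.0 + fmax_gain) / (1.0 + cycle_increase)


# Hypothetical scenario: the extra stage adds a one-cycle stall to 15% of
# instructions, while the register-insertion experiment bounds the Fmax
# gain at 10%.
base = estimated_cycles(1_000_000, [])
deeper = estimated_cycles(1_000_000, [(0.15, 1)])
cycle_increase = deeper / base - 1.0

# Below 1.0 for these numbers: the Fmax gain does not cover the stalls.
print(speedup(0.10, cycle_increase))
```

Because the Fmax measured with the inserted registers is only an upper bound, a speedup below 1.0 computed with that bound is already enough to rule out a gain under the assumed stall frequencies.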

The UT Nios design was tested using several different approaches. In the early stages of the design process, when the implementation was not fully functional, assembly programs that executed only the instructions implemented at that point were run in the ModelSim simulator. ModelSim [19] supports the simulation of a whole system, including memories, and provides means for specifying the initial contents of the memories in the system. In this way, the program can be preloaded into the memory and its execution can be simulated. Furthermore, the UART model generated by the SOPC builder provides a virtual terminal that can be used to communicate with the program whose execution is being simulated in ModelSim, as if the program were actually running on the development board. As more functionality was added to the UT Nios implementation, small programs could be compiled and executed in the simulator, and finally on the development board. After interrupt support was implemented, the design was tested using simple programs with the UART causing interrupts whenever a key on
