15.01.2013 Views

U. Glaeser


FIGURE 5.17<br />

Looking forward to the future of multithreaded processors, the pace of change makes for rich opportunities<br />

and also for great challenges. Although it is difficult to precisely predict where this field will go,<br />

this final section seeks to outline some of the key areas of development in multithreaded processors.<br />

Whatever technological breakthroughs occur and whatever directions the market takes, the fundamental<br />

issues addressed in the “Parallel Processing Software Framework” section will still apply. The realization<br />

of multithreaded processors will still rest upon good techniques to perform thread selection, inter-PE<br />

communication, and synchronization. The core techniques for addressing these issues will remain valid;<br />

however, the way that they are employed will surely change as the critical parameters of clock speeds and<br />

wire delays continue to change.<br />

It is difficult to obtain good performance without having complexity somewhere in the hardware-software<br />

multithreaded system! In a high-performance multithreaded processor, the complexity could be<br />

at the static partitioning side (programming or compiler), at the dynamic partitioning hardware side, or at<br />

the PE interconnect side. Figure 5.17 illustrates this concept. Complexity in the dynamic partitioning<br />

hardware and in the PE interconnect is a hurdle to hardware scalability. In the long run, as more<br />

transistors are integrated into a processor chip, the number of PEs can be expected to scale up.<br />

However, the trend towards higher clock rates will make it more difficult to support complexity in the<br />

dynamic partitioning hardware and in the PE interconnect.² Thus, the end result of the trends in high<br />

transistor count and high clock rates (which encourage multithreading/multiprocessing) is a shift towards<br />

doing more and more things statically, as opposed to dynamically. This means that program partitioning<br />

will eventually be done only at compilation time, and perhaps more and more at programming time.<br />

To Probe Further<br />

Multiprocessing has been around for a long time, and so naturally the computer literature has an<br />

overabundance of articles and textbooks on this subject. The multiprocessing community consists of<br />

different camps, which often use different terminology for the same concepts. This lack of consensus<br />

makes it somewhat difficult to merge the ideas presented in different papers or books. Nevertheless, we<br />

© 2002 by CRC Press LLC<br />

[Figure 5.17: Complexity in multithreading/multiprocessing. Complexity dimensions labeled: static partitioning complexity (compilation and programming), dynamic partitioning complexity (mostly logic delays), and PE interconnect complexity (mostly wire delays). Annotation: difficulty in scaling up the number of PEs. Design points shown: Multiscalar, Superthreading, SpMT processor, Trace processor, DMT, Superscalar.]

² Although it is possible to pipeline a crossbar interconnect so that it can accept new requests every cycle, the long inter-PE latency that it causes would increase the number of clock cycles required to execute a program, compared with what is obtained with scalable interconnects [27].
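The distinction this footnote draws between throughput and latency can be made concrete with a toy cycle-count model. The sketch below is illustrative only and is not taken from the source: the function names, the cycle counts, and the latency values (a 4-cycle pipelined crossbar versus a 1-cycle neighbor link standing in for a scalable interconnect) are all assumptions chosen for arithmetic convenience. It assumes the worst case, in which every inter-PE transfer must complete before the dependent work can proceed.

```python
# Toy model (hypothetical parameters): why pipelining an interconnect
# restores throughput but does not hide latency on dependent transfers.


def dependent_chain_cycles(compute_cycles: int, transfers: int, latency: int) -> int:
    """Cycles when each inter-PE transfer sits on the critical path.

    Every transfer must finish before the consumer can continue, so the
    full latency is paid once per transfer, pipelined or not.
    """
    return compute_cycles + transfers * latency


def pipelined_burst_cycles(transfers: int, latency: int) -> int:
    """Cycles to deliver a burst of independent transfers.

    A pipelined interconnect accepts one new request per cycle, so the
    latency is paid once and the remaining transfers overlap with it.
    """
    if transfers == 0:
        return 0
    return latency + (transfers - 1)


if __name__ == "__main__":
    # All values below are assumptions for illustration only.
    COMPUTE_CYCLES = 1000      # cycles of useful work in the program
    TRANSFERS = 200            # inter-PE value transfers
    CROSSBAR_LATENCY = 4       # assumed latency of a pipelined crossbar
    NEIGHBOR_LATENCY = 1       # assumed latency of a neighbor-to-neighbor link

    print("independent burst over crossbar:",
          pipelined_burst_cycles(TRANSFERS, CROSSBAR_LATENCY))
    print("dependent chain over crossbar:  ",
          dependent_chain_cycles(COMPUTE_CYCLES, TRANSFERS, CROSSBAR_LATENCY))
    print("dependent chain over neighbor:  ",
          dependent_chain_cycles(COMPUTE_CYCLES, TRANSFERS, NEIGHBOR_LATENCY))
```

Under these assumed numbers, the independent burst shows that pipelining amortizes the crossbar latency, while the dependent chain pays it on every transfer. Real programs fall between these two extremes, so the model indicates the direction of the effect rather than its size.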
