12.07.2015 Views

COPYRIGHT 2008, PRINCETON UNIVERSITY PRESS



TABLE 14.2
Computation of Matrix [C] = [A] + [B]

Step 1                Step 2                ···    Step 99
c(1) = a(1) + b(1)    c(2) = a(2) + b(2)    ···    c(99) = a(99) + b(99)

TABLE 14.3
Vector Processing of Matrix [A] + [B] = [C]

Step 1    Step 2    Step 3    ···    Step Z
c(1) = a(1) + b(1)
          c(2) = a(2) + b(2)
                    c(3) = a(3) + b(3)
                              ···
                                     c(Z) = a(Z) + b(Z)

processing of [A] + [B] = [C], the successive fetching and addition of the elements of A and B are grouped together and overlaid, and Z ≃ 64–256 elements (the section size) are processed with one command, as seen in Table 14.3. Depending on the array size, this method may speed up the processing of vectors by a factor of about 10. If all Z elements were truly processed in the same step, then the speedup would be ∼64–256.

Vector processing probably had its heyday during the time when computer manufacturers produced large mainframe computers designed for the scientific and military communities. These computers had proprietary hardware and software and were often so expensive that only corporate or military laboratories could afford them. While the Unix and then PC revolutions have nearly eliminated these large vector machines, some still exist, as do PCs that use vector processing in their video cards. Who is to say what the future holds in store?

14.7 Unit II. Parallel Computing

There is little question that advances in the hardware for parallel computing are impressive. Unfortunately, the software that accompanies the hardware often seems stuck in the 1960s. In our view, message passing has too many details for application scientists to worry about and requires coding at a much more elementary level than we prefer. However, the increasing occurrence of clusters in which the nodes are symmetric multiprocessors has led to the development of sophisticated compilers that follow simpler programming models; for example, partitioned global address space compilers such as Co-Array Fortran, Unified Parallel C, and Titanium. In these approaches the programmer views a global array of data and then manipulates these data as if they were contiguous. Of course the data really are distributed,
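The global-array view described above can be illustrated with a small toy model. The sketch below is not Co-Array Fortran, Unified Parallel C, or Titanium, and it involves no real parallelism or communication; the class name `BlockDistributedArray`, its `locate` method, and the simple block distribution are all hypothetical choices made only for illustration. It shows how a single global index might be translated into an owning node and a local offset, so that the caller can treat the data as if they were contiguous.

```python
class BlockDistributedArray:
    """Toy global-view array whose elements live in per-node local blocks.

    Hypothetical illustration only: each "node" is just a Python list here.
    """

    def __init__(self, n_elements, n_nodes):
        self.n_elements = n_elements
        self.n_nodes = n_nodes
        # Ceiling division: every node holds block_size elements,
        # except that trailing nodes may hold fewer (or none).
        self.block_size = -(-n_elements // n_nodes)
        self.local = [
            [0.0] * max(0, min(self.block_size,
                               n_elements - node * self.block_size))
            for node in range(n_nodes)
        ]

    def locate(self, i):
        """Map a global index to (owning node, offset within that node)."""
        if not 0 <= i < self.n_elements:
            raise IndexError(i)
        return divmod(i, self.block_size)

    def __getitem__(self, i):
        node, offset = self.locate(i)
        return self.local[node][offset]

    def __setitem__(self, i, value):
        node, offset = self.locate(i)
        self.local[node][offset] = value

    def __len__(self):
        return self.n_elements


# The caller indexes the array globally and never mentions a node.
a = BlockDistributedArray(10, 4)
for i in range(len(a)):
    a[i] = float(i * i)
```

With 10 elements on 4 nodes the block size is 3, so `a.locate(7)` returns `(2, 1)`: global element 7 is stored on node 2 at local offset 1. In a real partitioned global address space language the compiler and runtime would perform this translation, and any needed data movement, on the programmer's behalf.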
