10.07.2015 Views

Is Parallel Programming Hard, And, If So, What Can You Do About It?


12.2. MEMORY BARRIERS

[Figure 12.9: Data Dependency Barrier Supplied. Annotation on the barrier: "Makes sure all effects prior to the store of C are perceptible to subsequent loads."]

[Figure 12.10: Read Barrier Needed]

...has the value on hand. It may turn out that the CPU didn't actually need the value (perhaps because a branch circumvented the load), in which case it can discard the value or just cache it for later use. For example, consider the following:

    CPU 1           CPU 2
                    LOAD B
                    DIVIDE
                    DIVIDE
                    LOAD A

On some CPUs, divide instructions can take a long time to complete, which means that CPU 2's bus might go idle during that time. CPU 2 might therefore speculatively load A before the divides complete. In the (hopefully) unlikely event of an exception from one of the divides, this speculative load will have been wasted, but in the (again, hopefully) common case, overlapping the load with the divides will permit the load to complete more quickly, as illustrated by Figure 12.14.

Placing a read barrier or a data dependency barrier just before the second load:

    CPU 1           CPU 2
                    LOAD B
                    DIVIDE
                    DIVIDE
                    <read barrier>
                    LOAD A

will force any value speculatively obtained to be reconsidered to an extent dependent on the type of barrier used. If there was no change made to the speculated memory location, then the speculated value will just be used, as shown in Figure 12.15. On the other hand, if there was an update or invalidation to A from some other CPU, then the speculation will be cancelled and the value of A will be reloaded, as shown in Figure 12.16.
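The two outcomes just described (speculated value kept versus speculation cancelled and A reloaded) can be mimicked with a small single-threaded toy model. This is only a sketch: real CPUs do this in hardware, and every name below (`struct toy_cpu`, `speculate_load`, `remote_store`, `barrier_then_load`) is hypothetical and not taken from the text or from any real API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of CPU 2's speculative load of A (hypothetical names). */
struct toy_cpu {
    int  spec_value;  /* value obtained by the speculative load       */
    bool spec_valid;  /* cleared when another CPU updates A           */
    int  reloads;     /* times the barrier had to fetch A from memory */
};

/* CPU 2 loads A early, while its divides are still in flight. */
void speculate_load(struct toy_cpu *cpu, const int *mem_a)
{
    assert(cpu != NULL && mem_a != NULL);
    cpu->spec_value = *mem_a;
    cpu->spec_valid = true;
}

/* Some other CPU stores to A, invalidating CPU 2's speculated value. */
void remote_store(struct toy_cpu *cpu, int *mem_a, int new_value)
{
    assert(cpu != NULL && mem_a != NULL);
    *mem_a = new_value;
    cpu->spec_valid = false;
}

/* Models "<read barrier>; LOAD A": the speculated value is used only
 * if it is still valid; otherwise A is fetched from memory again. */
int barrier_then_load(struct toy_cpu *cpu, const int *mem_a)
{
    assert(cpu != NULL && mem_a != NULL);
    if (!cpu->spec_valid) {
        cpu->spec_value = *mem_a;
        cpu->spec_valid = true;
        cpu->reloads++;
    }
    return cpu->spec_value;
}
```

In this model, calling `speculate_load()` and then `barrier_then_load()` with no intervening `remote_store()` returns the speculated value and leaves `reloads` unchanged, which corresponds to the Figure 12.15 case. Inserting a `remote_store()` between the two calls makes `barrier_then_load()` return the new value and increment `reloads`, which corresponds to the Figure 12.16 case.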
