11.07.2015 Views

Chapter 4 Introduction

Chapter 4 Introduction

Chapter 4 Introduction

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Morgan Kaufmann Publishers 31 January 2013Loop Unrollingn Replicate loop body to expose moreparallelismn Reduces loop-control overheadn Use different registers per replicationn Called “register renaming”n Avoid loop-carried “anti-dependencies”n Store followed by a load of the same registern Aka “name dependence”n Reuse of a register nameCSE 420 <strong>Chapter</strong> 4 — The Processor — 117Loop Unrolling ExampleLoop: lw $t0, 0($s1) # $t0=array elementaddu $t0, $t0, $s2 # add scalar in $s2sw $t0, 0($s1) # store resultaddi $s1, $s1,–4 # decrement pointerbne $s1, $zero, Loop # branch $s1!=0ALU/branch Load/store cycleLoop: addi $s1, $s1,–16 lw $t0, 0($s1) 1nop lw $t1, 12($s1) 2addu $t0, $t0, $s2 lw $t2, 8($s1) 3addu $t1, $t1, $s2 lw $t3, 4($s1) 4addu $t2, $t2, $s2 sw $t0, 16($s1) 5addu $t3, $t4, $s2 sw $t1, 12($s1) 6nop sw $t2, 8($s1) 7bne $s1, $zero, Loop sw $t3, 4($s1) 8n IPC = 14/8 = 1.75nCloser to 2, but at cost of registers and code size<strong>Chapter</strong> 4 — The Processor — 118

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!