CS184a: Computer Architecture (Structure and ... - Caltech



CS184a: Computer Architecture (Structure and Organization)
Day 13: February 6, 2003
Interconnect 3: Richness
Caltech CS184 Winter2003 -- DeHon

Last Time
• Rent's Rule
– And its implications
• Superlinear growth rate of interconnect
– p > 0.5: area growth Ω(N^(2p))
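The Ω(N^(2p)) bound can be recovered from a short bisection argument. This is a sketch, assuming Rent's Rule in the form IO = C·N^p and a two-dimensional layout with a fixed wire pitch; the symbols w (wire pitch), W_bisect, and L (chip side length) are introduced here for illustration and do not appear on the slides.

```latex
\begin{align*}
\mathrm{IO}(N) &= C\,N^{p} && \text{(Rent's Rule)}\\
W_{\mathrm{bisect}} &\approx C\left(\tfrac{N}{2}\right)^{p} && \text{(wires crossing a cut through the middle)}\\
L &\ge w\,W_{\mathrm{bisect}} = \Omega(N^{p}) && \text{(the cut line must be long enough to hold those wires)}\\
A &\ge L^{2} = \Omega(N^{2p}) && \text{(same argument in both dimensions)}
\end{align*}
```

For p > 0.5 the exponent 2p exceeds 1, so the wiring area grows faster than the Θ(N) area of the gates themselves.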

Step 1: Build Architecture Model
• Assume geometric growth
• Pick parameters: build an architecture we can tune
– F, C
– α, p

Tree of Meshes
• Natural model is hierarchical
• Restricted internal bandwidth
• Can match to model

Parameterize C

Parameterize Growth
• (2 1)* ⇒ α = √2
• (2 2 1)* ⇒ α = (2·2)^(1/3) = 2^(2/3)
• (2 2 2 1)* ⇒ α = 2^(3/4)
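The growth sequences above can be checked numerically. This is a minimal sketch, assuming α is the geometric mean of the per-level bandwidth multipliers in the repeating sequence, and that each tree level doubles the number of LUTs, so the Rent exponent is p = log2(α). The function names `growth_rate` and `rent_exponent` are hypothetical helpers, not part of the lecture.

```python
import math

def growth_rate(sequence):
    """Geometric-mean bandwidth growth per tree level for a repeating sequence.

    Assumption: each entry is the factor by which channel bandwidth grows
    at one level of the tree, e.g. (2, 1) means double, keep, double, keep...
    """
    return math.prod(sequence) ** (1.0 / len(sequence))

def rent_exponent(alpha):
    """Rent exponent implied by alpha, assuming N doubles at every level."""
    return math.log2(alpha)

if __name__ == "__main__":
    for seq in [(2, 1), (2, 2, 1), (2, 2, 2, 1)]:
        alpha = growth_rate(seq)
        print(seq, "alpha =", round(alpha, 3), "p =", round(rent_exponent(alpha), 2))
```

Under these assumptions the three sequences correspond to p = 0.5, 0.67, and 0.75, the values compared later in the area slides.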

Step 2: Area Model
• Need to know effect of architecture parameters on area (costs)
– focus on dominant components
  • wires
  • switches
  • logic blocks(?)

Area Parameters
• A_logic = 40Kλ^2
• A_sw = 2.5Kλ^2
• Wire pitch = 8λ

Switchbox Population
• Full population is excessive (next lecture)
• Hypothesis: linear population adequate
– still to be (dis)proven

"Cartoon" VLSI Area Model
(Example artificially small for clarity)

Larger "Cartoon"
• 1024-LUT network
• P = 0.67
• LUT area: 3%

Effects of P (α) on Area
• 1024-LUT area comparison: P = 0.5, P = 0.67, P = 0.75
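The 3% figure can be turned into a rough total-area estimate. This is a back-of-the-envelope sketch, assuming "LUT area 3%" means the logic blocks occupy 3% of the total network area, that K denotes 1000, and that A_logic = 40Kλ^2 from the Area Parameters slide applies; the variable names are illustrative only.

```python
# Assumed inputs (A_LOGIC from the Area Parameters slide; the 3% reading is an assumption)
A_LOGIC = 40_000        # area of one logic block, in lambda^2
N_LUTS = 1024           # size of the "cartoon" network
LUT_FRACTION = 0.03     # fraction of total area occupied by logic blocks

logic_area = N_LUTS * A_LOGIC                 # total logic-block area, lambda^2
total_area = logic_area / LUT_FRACTION        # implied total network area, lambda^2
area_per_lut = total_area / N_LUTS            # amortized area per LUT, lambda^2

if __name__ == "__main__":
    print(f"logic area   : {logic_area:.3e} lambda^2")
    print(f"total area   : {total_area:.3e} lambda^2")
    print(f"area per LUT : {area_per_lut:.3e} lambda^2")
```

Under this reading each LUT carries roughly 1.3Mλ^2 of total area, of which only 40Kλ^2 is the logic block itself.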

Effects of P on Capacity

Step 3: Characterize Application Requirements
• Identify representative applications.
– Today: IWLS93 logic benchmarks
• How much structure is there?
• How much variation among applications?

Application Requirements
• Max: C=7, P=0.68
• Avg: C=5, P=0.72

Benchmark Wide

Benchmark Parameters

Complication
• Interconnect requirements vary among applications
• Interconnect richness has large effect on area
• What is effect of architecture/application mismatch?
– Interconnect too rich?
– Interconnect too poor?

Interconnect Mismatch in Theory

Step 4: Assess Resource Impact
• Map designs to parameterized architecture
• Identify architectural resource required
Compare: mapping to k-LUTs; LUT count vs. k.

Mapping to Fixed Wire Schedule
• Easy if need fewer wires than net
• If need more wires than net, must depopulate to meet interconnect limitations.

Mapping to Fixed-WS
• Better results if "reassociate" rather than keeping original subtrees.

Observation
• Don't really want a "bisection" of LUTs
– subtree filled to capacity by either of
  • LUTs
  • root bandwidth
– May be profitable to cut at some place other than midpoint
  • not require "balance" condition
– "Bisection" should account for both LUT and wiring limitations

Challenge
• Don't know where to cut the design
– don't know when wires will limit subtree capacity

Brute Force Solution
• Explore all cuts
– start with all LUTs in group
– consider "all" balances
– try cut
– recurse

Brute Force
• Too expensive
• Exponential work
• …viable if solving same subproblems

Simplification
• Single linear ordering
• Partitions = pick split point on ordering
• Reduce to finding cost of [start,end] ranges (subtrees) within linear ordering
• Only n^2 such subproblems
• Can solve with dynamic programming

Dynamic Programming
• Start with base set of size 1
• Compute all splits of size n, from solutions to all problems of size n-1 or smaller
• Done when compute where to split 0,N-1
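The range-based dynamic program can be sketched as follows. This is an illustration under stated assumptions, not the lecture's actual mapper: the cost of a range is taken to be the minimum subtree height that can hold it, a height-h subtree is assumed to have `bandwidth[h]` wires at its root (nondecreasing in h), and a net is a set of positions in the linear ordering. The name `min_tree_height` and its parameters are hypothetical.

```python
def min_tree_height(nets, n, bandwidth):
    """Return (height, split) for placing positions 0..n-1 in a binary tree.

    nets      : list of sets of positions (in the linear ordering) on each net
    n         : number of LUTs in the ordering
    bandwidth : bandwidth[h] = wires at the root of a height-h subtree (assumed nondecreasing)
    """
    def crossing(i, j):
        # Nets with at least one pin inside [i, j) and at least one outside it.
        count = 0
        for net in nets:
            inside = sum(1 for p in net if i <= p < j)
            if 0 < inside < len(net):
                count += 1
        return count

    def height_for_wires(wires):
        # Smallest subtree height whose root bandwidth can carry `wires`.
        for h, bw in enumerate(bandwidth):
            if bw >= wires:
                return h
        raise ValueError("no level has enough root bandwidth")

    best = {}  # (i, j) -> (min height for range [i, j), chosen split point or None)
    for size in range(1, n + 1):                 # solve small ranges first
        for i in range(0, n - size + 1):
            j = i + size
            wire_h = height_for_wires(crossing(i, j))
            if size == 1:
                best[(i, j)] = (wire_h, None)    # a single LUT may still need depopulation
                continue
            # Try every split point; both halves go one level below the root.
            h, k = min((1 + max(best[(i, k)][0], best[(k, j)][0]), k)
                       for k in range(i + 1, j))
            best[(i, j)] = (max(h, wire_h), k)   # root must also carry the crossing nets
    return best[(0, n)]


if __name__ == "__main__":
    chain = [{0, 1}, {1, 2}, {2, 3}]             # four LUTs connected in a line
    print(min_tree_height(chain, 4, [2, 2, 2, 2]))
    print(min_tree_height(chain, 4, [1, 1, 2, 2, 2, 2]))
```

With ample bandwidth the four-LUT chain fits in a height-2 tree split at the midpoint; with one wire at the lowest levels, the middle LUTs no longer fit in leaves and the required height grows, which is the depopulation effect described in the mapping slides.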

Dynamic Programming
• Just one possible "heuristic" solution to this problem
– not optimal
– dependent on ordering
– sacrifices ability to reorder on splits to avoid exponential problem size
• Opportunity to find a better solution here...

Ordering LUTs
• Another problem
– lay out gates in 1D line
– minimize sum of squared wire length
  • tend to cluster connected gates together
– Is solvable mathematically for optimal
  • Eigenvector of connectivity matrix
• Use this 1D ordering for our linear ordering
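One way to obtain such a 1D ordering is sketched below. It assumes the standard quadratic-placement formulation, in which the minimizing coordinates are the eigenvector for the second-smallest eigenvalue of the graph Laplacian (degree matrix minus adjacency matrix), and it assumes a connected graph of two-pin edges. The eigenvector is found with a plain shifted power iteration so the example needs no external libraries; `spectral_order` and its parameters are hypothetical names.

```python
def spectral_order(n, edges, iters=500):
    """Order n nodes along a line using the second Laplacian eigenvector.

    n     : number of gates (nodes), n >= 2, graph assumed connected
    edges : list of (a, b) pairs; multi-pin nets would need to be expanded
            into pairs first (an assumption of this sketch)
    """
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    # 2 * max degree is an upper bound on the largest Laplacian eigenvalue,
    # so (shift*I - L) is positive semidefinite and power iteration applies.
    shift = 2 * max(len(nbrs) for nbrs in adj)

    x = [float(i) for i in range(n)]             # deterministic starting vector
    for _ in range(iters):
        mean = sum(x) / n
        x = [v - mean for v in x]                # remove the constant eigenvector
        norm = sum(v * v for v in x) ** 0.5
        x = [v / norm for v in x]
        # x <- (shift*I - L) x, where (L x)_i = deg(i)*x_i - sum of neighbor values
        x = [shift * x[i] - (len(adj[i]) * x[i] - sum(x[j] for j in adj[i]))
             for i in range(n)]

    # Sorting by coordinate gives the linear ordering (direction is arbitrary).
    return sorted(range(n), key=lambda i: x[i])


if __name__ == "__main__":
    # Gates wired as the path 0 - 2 - 1 - 3, given in scrambled index order.
    print(spectral_order(4, [(0, 2), (2, 1), (1, 3)]))
```

For a simple path the ordering recovers the path itself (possibly reversed), which matches the slide's point that connected gates end up clustered together.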

Mapping Results

Step 5: Apply Area Model
• Assess impact of resource results

Resources × Area Model ⇒ Area

Net Area

Picking Network Design Point
Don't optimize for 100% compute util. (100% yield)
…also don't optimize for highest peak.

What about a single design?

LUT Utilization Predict Area?
(Single design)

Methodology
1. Architecture model (parameterized)
2. Cost model
3. Important task characteristics
4. Mapping algorithm
– Map to determine resources
5. Apply cost model
6. Digest results
– find optimum (multiple?)
– understand conflicts (avoidable?)

Big Ideas [MSB Ideas]
• Interconnect area dominates logic area
• Interconnect requirements vary
– among designs
– within a single design
• To minimize area
– focus on using dominant resource (interconnect)
– may underuse non-dominant resources (LUTs)

Big Ideas [MSB Ideas]
• Two different resources here
– compute, interconnect
• Balance of resources required varies among designs (even within designs)
• Cannot expect full utilization of every resource
• Most area-efficient designs may waste some compute resources (cheaper resource)
