Logic Synthesis slides - UCSD VLSI CAD Laboratory

vlsicad.ucsd.edu

Logic Synthesis slides - UCSD VLSI CAD Laboratory

ECE260B – CSE241AWinter 2004Logic Synthesis(Dr. Cho Moon, guest lecturer)Website: http://vlsicad.ucsd.edu/courses/ece260b-w04ECE 260B – CSE 241A Static Timing Analysis 1Andrew B. Kahng, UCSD


Outline• Introduction• Two-level Logic Synthesis• Multi-level Logic SynthesisECE 260B – CSE 241A Static Timing Analysis 2Andrew B. Kahng, UCSD


Introduction• Cho Moon• PhD from UC Berkeley 92• Lattice Semiconductor (synthesis) 92 - 96• Cadence Design Systems (synthesis, verification and timinganalysis) 96 – present• Why logic synthesis?• Ubiquitous – used almost everywhere VLSI is done• Body of useful and general techniques – same solutions can beused for different problems• Foundation for many applications such as- Formal verification- ATPG- Timing analysis- Sequential optimizationECE 260B – CSE 241A Static Timing Analysis 3Andrew B. Kahng, UCSD


RTL Design FlowHDLRTLSynthesisManualDesignModuleGeneratorsLibrarynetlistLogicSynthesisab01sdclkqnetlistab01dqPhysicalSynthesissclklayoutECE 260B – CSE 241A Static Timing Analysis 4Slide courtesy of Devadas, et. alAndrew B. Kahng, UCSD


Logic Synthesis Problem• Given• Initial gate-level netlist• Design constraints- Input arrival times, output required times, power consumption, noiseimmunity, etc…• Target technology libraries• Produce• Smaller, faster or cooler gate-level netlist that meets constraintsVery hard optimization problem!ECE 260B – CSE 241A Static Timing Analysis 5Andrew B. Kahng, UCSD


Combinational Logic Synthesisnetlist2-levelLogic optLibraryLogicSynthesistechindependenttechdependentmultilevelLogic optnetlistLibraryECE 260B – CSE 241A Static Timing Analysis 6Slide courtesy of Devadas, et. alAndrew B. Kahng, UCSD


Outline• Introduction• Two-level Logic Synthesis• Multi-level Logic Synthesis• Sequential Logic SynthesisECE 260B – CSE 241A Static Timing Analysis 7Andrew B. Kahng, UCSD


Two-level Logic Synthesis Problem• Given an arbitrary logic function in two-level form,produce a smaller representation.• For sum-of-products (SOP) implementation on PLAs,fewer product terms and fewer inputs to each productterm mean smaller area.F = A B + A B CI1I2F = A BO1O2ECE 260B – CSE 241A Static Timing Analysis 8Andrew B. Kahng, UCSD


Boolean Functionsf(x) : B nBB = {0, 1}, x = (x 1 , x 2 , …, x n )• x 1 , x 2 , … are variables• x 1 , x 1 , x 2 , x 2 , … are literals• each vertex of B n is mapped to 0 or 1• the onset of f is a set of input values for which f(x) = 1• the offset of f is a set of input values for which f(x) = 0ECE 260B – CSE 241A Static Timing Analysis 9Andrew B. Kahng, UCSD


LiteralsECE 260B – CSE 241A Static Timing Analysis 10Slide courtesy of Devadas, et. alAndrew B. Kahng, UCSD


Boolean FormulasECE 260B – CSE 241A Static Timing Analysis 11Slide courtesy of Devadas, et. alAndrew B. Kahng, UCSD


Logic Functions:ECE 260B – CSE 241A Static Timing Analysis 12Slide courtesy of Devadas, et. alAndrew B. Kahng, UCSD


Cube RepresentationECE 260B – CSE 241A Static Timing Analysis 13Slide courtesy of Devadas, et. alAndrew B. Kahng, UCSD


Operations on Logic Functions• (1) Complement: f finterchange ON and OFF-SETS• (2) Product (or intersection or logicalAND)h = f • g or h = f ∩ g• (3) Sum (or union or logical OR):h = f + g or h = f ∪ gECE 260B – CSE 241A Static Timing Analysis 14Andrew B. Kahng, UCSD


Sum-of-products (SOP)• A function can be represented by a sum of cubes(products):f = ab + ac + bcSince each cube is a product of literals, this is a “sum ofproducts” representation• A SOP can be thought of as a set of cubes FF = {ab, ac, bc} = C• A set of cubes that represents f is called a cover of f.F={ab, ac, bc} is a cover of f = ab + ac + bc.ECE 260B – CSE 241A Static Timing Analysis 15Andrew B. Kahng, UCSD


Prime Cover• A cube is prime if there is no other cube that contains it(for example, b c is not a prime but b is)• A cover is prime iff all of its cubes are primecabECE 260B – CSE 241A Static Timing Analysis 16Andrew B. Kahng, UCSD


Irredundant Cube• A cube of a cover C is irredundant if C fails to be a coverif c is dropped from C• A cover is irredundant iff all its cubes are irredudant (forexmaple, F = a b + a c + b c)caECE 260B – CSE 241A Static Timing Analysis 17bNot coveredAndrew B. Kahng, UCSD


Quine-McCluskey Method• We want to find a minimum prime and irredundant coverfor a given function.• Prime cover leads to min number of inputs to each product term.• Min irredundant cover leads to min number of product terms.• Quine-McCluskey (QM) method (1960’s) finds aminimum prime and irredundant cover.• Step 1: List all minterms of on-set: O(2^n) n = #inputs• Step 2: Find all primes: O(3^n) n = #inputs• Step 3: Construct minterms vs primes table• Step 4: Find a min set of primes that covers all the minterms:O(2^m) m = #primesECE 260B – CSE 241A Static Timing Analysis 18Andrew B. Kahng, UCSD


QM Example (Step 1)• F = a’ b’ c’ + a b’ c’ + a b’ c + a b c + a’ b c• List all on-set mintermsMintermsa’ b’ c’a b’ c’a b’ ca b ca’ b cECE 260B – CSE 241A Static Timing Analysis 19Andrew B. Kahng, UCSD


QM Example (Step 2)• F = a’ b’ c’ + a b’ c’ + a b’ c + a b c + a’ b c• Find all primes.primesb’ c’a b’a cb cECE 260B – CSE 241A Static Timing Analysis 20Andrew B. Kahng, UCSD


QM Example (Step 3)• F = a’ b’ c’ + a b’ c’ + a b’ c + a b c + a’ b c• Construct minterms vs primes table (prime implicanttable) by determining which cube is contained in whichprime. X at row i, colum j means that cube in row i iscontained by prime in column j.b’ c’a b’a cb ca’ b’ c’Xa b’ c’XXa b’ cXXa b cXXa’ b cXECE 260B – CSE 241A Static Timing Analysis 21Andrew B. Kahng, UCSD


QM Example (Step 4)• F = a’ b’ c’ + a b’ c’ + a b’ c + a b c + a’ b c• Find a minimum set of primes that covers all the minterms“Minimum column covering problem”b’ c’a b’a cb ca’ b’ c’Xa b’ c’XXa b’ cXXa b cXXa’ b cXEssential primesECE 260B – CSE 241A Static Timing Analysis 22Andrew B. Kahng, UCSD


ESPRESSO – Heuristic Minimizer• Quine-McCluskey gives a minimum solution but is onlygood for functions with small number of inputs (< 10)• ESPRESSO is a heuristic two-level minimizer that finds a“minimal” solutionESPRESSO(F) {do {reduce(F);expand(F);irredundant(F);} while (fewer terms in F);verfiy(F);}ECE 260B – CSE 241A Static Timing Analysis 23Andrew B. Kahng, UCSD


ESPRESSO ILLUSTRATEDReduceExpandIrredundantECE 260B – CSE 241A Static Timing Analysis 24Andrew B. Kahng, UCSD


Outline• Introduction• Two-level Logic Synthesis• Multi-level Logic SynthesisECE 260B – CSE 241A Static Timing Analysis 25Andrew B. Kahng, UCSD


Multi-level Logic Synthesis• Two-level logic synthesis is effective and mature• Two-level logic synthesis is directly applicable to PLAsand PLDsBut…• There are many functions that are too expensive toimplement in two-level forms (too many product terms!)• Two-level implementation constrains layout (AND-plane,OR-plane)• Rule of thumb:• Two-level logic is good for control logic• Multi-level logic is good for datapath or random logicECE 260B – CSE 241A Static Timing Analysis 26Andrew B. Kahng, UCSD


Representation: Boolean NetworkBoolean network:• directed acyclic graph (DAG)• node logic function representationf j(x,y)• node variable y j: y j= f j(x,y)• edge (i,j) if f jdepends explicitly ony iInputs x = (x 1 , x 2 ,…,x n )Outputs z = (z 1 , z 2 ,…,z p )ECE 260B – CSE 241A Static Timing Analysis 27Slide courtesy of BraytonAndrew B. Kahng, UCSD


Multi-level Logic Synthesis Problem• Given• Initial Boolean network• Design constraints- Arrival times, required times, power consumption, noise immunity,etc…• Target technology libraries• Produce• a minimum area netlist consisting of the gates from the targetlibraries such that design constraints are satisfiedECE 260B – CSE 241A Static Timing Analysis 28Andrew B. Kahng, UCSD


Modern Approach to Logic Optimization• Divide logic optimization into two subproblems:• Technology-independent optimization- determine overall logic structure- estimate costs (mostly) independent oftechnology- simplified cost modeling• Technology-dependent optimization(technology mapping)- binding onto the gates in the library- detailed technology-specific cost model• Orchestration of various optimization/transformationtechniques for each subproblemECE 260B – CSE 241A Static Timing Analysis 29Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Technology-Independent OptimizationSimplified cost models• Area = sum of factored form literals in all nodes• Number of product terms is not a good measure of area in multilevelimplementationf=ad+ae+bd+be+cd+ce (6 product terms)f’=a’b’c’+d’e’ (2 product terms)The only difference between f and f’ is inversion⇒ f=(a+b+c)(d+e) (5 literals in factored form)⇒ f’= a’b’c’ + d’e’ (5 literals in factored form)• Delay = levels of logic on critical pathsECE 260B – CSE 241A Static Timing Analysis 30Andrew B. Kahng, UCSD


Technology-Independent Optimization• Technology-independent optimization is a bag of tricks:• Two-level minimization (also called simplify)• Constant propagation (also called sweep)f = a b + c; b = 1 => f = a + c• Decomposition (single function)f = abc+abd+a’c’d’+b’c’d’ => f = xy + x’y’; x = ab ; y = c+d• Extraction (multiple functions)f = (az+bz’)cd+e g = (az+bz’)e’ h = cde⇓f = xy+e g = xe’ h = ye x = az+bz’ y = cdECE 260B – CSE 241A Static Timing Analysis 31Andrew B. Kahng, UCSD


More Technology-Independent Optimization• More technology-independent optimization tricks:• Substitutiong = a+b f = a+bc⇓f = g(a+b)• Collapsing (also called elimination)f = ga+g’b g = c+d⇓f = ac+ad+bc’d’ g = c+d• Factoring (series-parallel decomposition)f = ac+ad+bc+bd+e => f = (a+b)(c+d)+eECE 260B – CSE 241A Static Timing Analysis 32Andrew B. Kahng, UCSD


Summary of Typical Recipe for TI Optimization• Propagate constants• Simplify: two-level minimization at Boolean network node• Decomposition• Local “Boolean” optimizations• Boolean techniques exploit Boolean identities (e.g., a a’ = 0)Consider f = a b’ + a c’ + b a’ + b c’ + c a’ + c b’• Algebraic factorization proceduresf = a (b’ + c’) + a’ (b + c) + b c’ + c b’• Boolean factorization producesf = (a + b + c) (a’ + b’ + c’)ECE 260B – CSE 241A Static Timing Analysis 33Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Technology-Dependent OptimizationTechnology-dependent optimization consists of• Technology mapping: maps Boolean network to a set ofgates from technology libraries• Local transformations• Discrete resizing• Cloning• Fanout optimization (buffering)• Logic restructuringECE 260B – CSE 241A Static Timing Analysis 34Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Technology MappingInputOutput1. Technology independent, optimized logic network2. Description of the gates in the library with their cost• Netlist of gates (from library) which minimizes total costGeneral Approach• Construct a subject DAG for the network• Represent each gate in the target library by pattern DAG’s• Find an optimal-cost covering of subject DAG using thecollection of pattern DAG’s• Canonical form: 2-input NAND gates and invertersECE 260B – CSE 241A Static Timing Analysis 35Andrew B. Kahng, UCSD


DAG Covering• DAG covering is an NP-hard problem• Solve the sub-problem optimally• Partition DAG into a forest of trees• Solve each tree optimally using tree covering• Stitch trees back togetherECE 260B – CSE 241A Static Timing Analysis 36Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Tree Covering Algorithm• Transform netlist and libraries into canonical forms• 2-input NANDs and inverters• Visit each node in BFS from inputs to outputs• Find all candidate matches at each node N- Match is found by comparing topology only (no need to comparefunctions)• Find the optimal match at N by computing the new cost- New cost = cost of match at node N + sum of costs for matches atchildren of N• Store the optimal match at node N with cost• Optimal solution is guaranteed if cost is area• Complexity = O(n) where n is the number of nodes innetlistECE 260B – CSE 241A Static Timing Analysis 37Andrew B. Kahng, UCSD


Tree Covering ExampleFind an ``optimal’’ (in area, delay, power) mapping of this circuitinto the technology library (simple example below)ECE 260B – CSE 241A Static Timing Analysis 38Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Elements of a library - 1Element/Area CostTree Representation (normal form)INVERTER 2NAND2 3NAND3 4NAND4 5ECE 260B – CSE 241A Static Timing Analysis 39Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Elements of a library - 2Element/Area CostTree Representation (normal form)AOI21 4AOI22 5ECE 260B – CSE 241A Static Timing Analysis 40Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Trivial Coveringsubject DAG7 NAND2 (3) = 215 INV (2) = 10Area cost 31Can we do better with tree covering?ECE 260B – CSE 241A Static Timing Analysis 41Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Optimal tree covering - 13223``subject tree’’ECE 260B – CSE 241A Static Timing Analysis 42Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Optimal tree covering - 2382235``subject tree’’ECE 260B – CSE 241A Static Timing Analysis 43Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Optimal tree covering - 338132235``subject tree’’Cover with ND2 or ND3 ?1 NAND2 3+ subtree 5Area cost 8ECE 260B – CSE 241A Static Timing Analysis 441 NAND3 = 4Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Optimal tree covering – 3b38132235 4``subject tree’’Label the root of the sub-tree with optimal match and costSlide courtesy of KeutzerECE 260B – CSE 241A Static Timing Analysis 45Andrew B. Kahng, UCSD


Optimal tree covering - 43Cover with INV or AO21 ?81322254``subject tree’’1 Inverter 2+ subtree 13Area cost 15ECE 260B – CSE 241A Static Timing Analysis 461 AO21 4+ subtree 1 3+ subtree 2 2Area cost 9Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Optimal tree covering – 4b3813922254``subject tree’’Label the root of the sub-tree with optimal match and costECE 260B – CSE 241A Static Timing Analysis 47Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Optimal tree covering - 59Cover with ND2 or ND3 ?82NAND2subtree 1 9subtree 2 41 NAND2 3ECE 260B – CSE 241A Static Timing Analysis 484subtree 1 8subtree 2 2subtree 3 41 NAND3 4Area cost 16Area cost 18Slide courtesy of Keutzer``subject tree’’NAND3Andrew B. Kahng, UCSD


Optimal tree covering – 5b981624``subject tree’’Label the root of the sub-tree with optimal match and costECE 260B – CSE 241A Static Timing Analysis 49Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Optimal tree covering - 6Cover with INV or AOI21 ?13165``subject tree’’INVsubtree 1 161 INV 2AOI21subtree 1 13subtree 2 51 AOI21 4ECE 260B – CSE 241A Static Timing Analysis 50Area cost 18Slide courtesy of KeutzerArea cost 22Andrew B. Kahng, UCSD


Optimal tree covering – 6b1318165``subject tree’’Label the root of the sub-tree with optimal match and costECE 260B – CSE 241A Static Timing Analysis 51Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Optimal tree covering - 7Cover with ND2 or ND3 or ND4 ?``subject tree’’ECE 260B – CSE 241A Static Timing Analysis 52Slide courtesy of KeutzerAndrew B. Kahng, UCSD


Cover 1 - NAND2Cover with ND2 ?918164``subject tree’’subtree 1 18subtree 2 01 NAND2 3Slide courtesy of KeutzerECE 260B – CSE 241A Static Timing Analysis 53Area cost 21Andrew B. Kahng, UCSD


Cover 2 - NAND3Cover with ND3?94``subject tree’’ECE 260B – CSE 241A Static Timing Analysis 54subtree 1 9subtree 2 4subtree 3 01 NAND3 4Slide courtesy of KeutzerArea cost 17Andrew B. Kahng, UCSD


Cover - 3Cover with ND4 ?824``subject tree’’ECE 260B – CSE 241A Static Timing Analysis 55subtree 1 8subtree 2 2subtree 3 4subtree 4 01 NAND4 5Slide courtesy of KeutzerArea cost 19Andrew B. Kahng, UCSD


Optimal Cover was Cover 2ND2AOI21ND3INVND3``subject tree’’ECE 260B – CSE 241A Static Timing Analysis 56INV 2ND2 32 ND3 8AOI21 4Slide courtesy of KeutzerArea cost 17Andrew B. Kahng, UCSD


Summary of Technology Mapping• DAG covering formulation• Separated library issues from mapping algorithm (can’t do thiswith rule-based systems)• Tree covering approximation• Very efficient (linear time)• Applicable to wide range of libraries (std cells, gate arrays) andtechnologies (FPGAs, CPLDs)• Weaknesses• Problems with DAG patterns (Multiplexors, full adders, …)• Large input gates lead to a large number of patternsECE 260B – CSE 241A Static Timing Analysis 57Andrew B. Kahng, UCSD


Local Transformations• Given• Technology-mapped netlist• Target technology libraries• Some violations (timing, noise immunity, power, etc…)• Produce• New netlist that corrects the given violations without introducingnew violations• Approach: another bag of tricks• Discrete resizing• Cloning• Buffering• Logic restructuring• More…ECE 260B – CSE 241A Static Timing Analysis 58Andrew B. Kahng, UCSD


Discrete Resizingabab?A0.035def0.20.20.3d0.050.040.030.020.0100 0.2 0.4 0.6 0.8 1loadA B CabC0.026Note that some arrival and requiredtimes become invalidECE 260B – CSE 241A Static Timing Analysis 59DAC-2002, Physical Chip ImplementationAndrew B. Kahng, UCSD


Cloningd0.050.040.030.020.0100 0.2 0.4 0.6 0.8 1loadA B Cd0.2dab?efgh0.20.20.20.2abABefghNote that loads at a, b increaseECE 260B – CSE 241A Static Timing Analysis 60DAC-2002, Physical Chip ImplementationAndrew B. Kahng, UCSD


Bufferingd0.050.040.030.020.0100 0.2 0.4 0.6 0.8 1loadA B Cabd0.2e 0.2af? 0.2bg 0.2h 0.2B0.1Bde0.20.2fgh0.20.20.2ECE 260B – CSE 241A Static Timing Analysis 61DAC-2002, Physical Chip ImplementationAndrew B. Kahng, UCSD


Logic Restructuring 1• Nodes in critical section that fan out outside of criticalsection are duplicatedffaeCollapsednodeabeebhdhECE 260B – CSE 241A Static Timing Analysis 62cLate inputsignalsSlides courtesy of KeutzercdAndrew B. Kahng, UCSD


Logic Restructuring 2• Place timing-critical nodes closer to output• Make them pass through fewer gates• After collapse, a divisor is selected such that substituting k into f places criticalsignal c and d closer to outputCollapse critical sectionRe-extract factor kffCollapsednodeab c deakdivisorbecdclose tooutputECE 260B – CSE 241A Static Timing Analysis 63Slides courtesy of KeutzerAndrew B. Kahng, UCSD


Summary of Local Transformations• Variety of methods for delay optimization• No single technique dominates• The one with more tricks wins? No!• Need a good framework for evaluating and processingdifferent transforms• Accurate, fast timing engine with incremental analysis capability- don’t want to retime the whole design for each local transform• Simultaneous min and max delay analysis- How does fixing the setup violation affect the existing hold checks?ECE 260B – CSE 241A Static Timing Analysis 64Andrew B. Kahng, UCSD

More magazines by this user
Similar magazines