
2 - UCSD VLSI CAD Laboratory


Section II: Basic Issues
Jan. 2003 ASPDAC03 – Physical Chip Implementation

Overview
• The Big Five Dimensions of Physical Design
• Hierarchy Pros and Cons
• RTL practices
• Tools, Machines, Flows
• Data Prep: Netlists and IP
• Packaging


The Big Five
• Area: 5x5mm TSMC shuttle runs up to 18x18mm+ “boomer-class” chips
• Timing: DC to 350MHz-ish in .18u
• Power: Battery to 60 Watts+
• Schedule: 0 to Infinite
• Correctness

The Big Five, part 2
• Area, Timing, Power, Schedule all trade off with each other
• Correctness mostly trades off with Risk (Working chip vs. Rock)
• Choose only two or three to optimize.


Classic Hierarchy
• Various blocks, channel routed together
• Padring constructed at top level

Classic Flat
• Everything at the top level, but….
• May be macro blocks that have been previously PnR’ed
• Simplest and most common technique in use today


Abutted Hierarchy
Nothing at the top level except macro blocks!

Hierarchy Pro
• Enables parallelism when building the chip
  - Engineers already decouple work on RTL and verification, physical design can also benefit
• Results come quicker if you can keep computers and licenses busy
  - Example: 1.5 million placeable objects, place and route in 24 hrs


Hierarchy Pro (#2)
• Memory capacity issues eased, don’t need 64 bit machines, can use Linux (2X boost)
• Vulnerability to tool failure lower if runtime is lower. Easier to use patch releases to fix specific core dumps at specific stages
• Trade runtime for even better quality of results
• Wireload models and other QOR influencing inputs can be tuned on a per-block basis more easily

Hierarchy Pro (#3)
• More deterministic results:
  - Global nets are the same for each build
  - Block boundary conditions (pins) are very similar from build to build
• Speed of builds can enable experimentation in chip Architecture/RTL/Floorplan/Synthesis. Take chip design to the next level


Hierarchy Pro (#4)
• Incremental, deterministic block closure becomes possible
• Easier to mix and match different vendor tools for different blocks.
• Logic == Physical hierarchy at the block level enables rebuild of completely resynthesized netlist late in design (assuming it fits)

Hierarchy Pro (#5)
• Multiply instantiated PnR blocks can be one block build, saving time and resources
• Clock distribution can take advantage of hierarchy to offer lower intra-block skew at the cost of higher inter-block skew


Hierarchy Con:
Pin assignment difficult
• the “Horizon” effect
  - Global timing requires block constraints
  - ERC/DRC difficult
  - More clock tuning required

Hierarchy Con (#2)
• Channels are bad
  - Waste area
  - Coupling issues, or waste even more area
  - Repeater placement and power in channels can be difficult
  - Long timing paths around blocks
  - Feedthroughs difficult for tools, flow


Hierarchy Con (#3)
• Data management problems much worse
  - More files of ALL types
  - Experimentation creates even more versions of blocks... which is the one to tapeout?
  - Multiple tool usage only makes IP setup problems worse

Hierarchy Con (#4)
• Top level partitioning and floorplanning required, can be problematic
• Using hierarchy is fighting against tools not designed with it in mind
• Abutted block hierarchy can eliminate the incremental block closure capability
• Abutted block floorplanning can have issues with various snapping grids in the design (stdcell, power, bga bumps)


Hierarchy Con (#5)
• Multiply instantiated blocks have to diverge due to their different environments
• Classic Hierarchy requires a top level router tool as well as the block level router

Hierarchy ….. or not?
• Weigh design size, team experience, tool selections.
• Probably best to use modified flat design for now


ReShape’s Flow
[Flow diagram: Netlists, Chip Floorplanning, Block PnR (x4), Chip Integration, GDSII]

Abutted Chip Example 1
[Figure]


Chip Example 2
[Figure]

Routed Chip
[Figure]


Abutted Pins
[Figure]

RTL Practices (or……….. “The Rules of the Tavern”)
• Planning milestones
• Large scale issues
• Small scale issues


Concurrent Design Milestone 1
• Standard Cell Libraries
• I/O Libraries
• Custom Macros
• Specifications
[Timeline: Initial IP, Initial Netlist (i), Full Netlist (ii), 99% Final Netlist, 3-4 weeks, Final Netlist, 2-3 weeks, Tape Out; Functional Design and Physical Design tracks]

Concurrent Design Milestone 2
• Contains as much structural content as possible
• All components included in this netlist should have front end views, including abstracts
[Timeline: Initial IP, Initial Netlist (i), Full Netlist (ii), 99% Final Netlist, 3-4 weeks, Final Netlist, 2-3 weeks, Tape Out; Functional Design and Physical Design tracks]


Concurrent Design Milestone 3
• All structural components included in netlist.
• Full views of all IP should be available, even if preliminary
[Timeline: Initial IP, Initial Netlist (i), Full Netlist (ii), 99% Final Netlist, 3-4 weeks, Final Netlist, 2-3 weeks, Tape Out; Functional Design and Physical Design tracks]

Concurrent Design Milestone 4
• Netlist should be at or close to tape out quality with respect to size, timing, and functionality. Only very small changes expected between this and the final netlist
[Timeline: Initial IP, Initial Netlist (i), Full Netlist (ii), 99% Final Netlist, 3-4 weeks, Final Netlist, 2-3 weeks, Tape Out; Functional Design and Physical Design tracks]


Concurrent Design Milestone 5
• This is the final netlist delivered by the customer. It will be taken through the tape out process
[Timeline: Initial IP, Initial Netlist (i), Full Netlist (ii), 99% Final Netlist, 3-4 weeks, Final Netlist, 2-3 weeks, Tape Out; Functional Design and Physical Design tracks]

Large Scale Issues, “Think Physical”
• Where is this logic going to live on the chip? I.e. have a trial floorplan in mind
• No “dirty hierarchy”, i.e. keep logic out of top levels of tree. Even better, generate top levels using tools
• Snoopers, etc. coded in such a way that hierarchy can be changed later


“Think Physical” (#2)
• No snake paths, problematic for optimization and analysis
• Registered-in or Registered-out paradigm for the major blocks
• “Edge” or “io affinity logic” carefully partitioned to make it easier to find later

“Think Physical” (#3)
• Choose global repeater and repeater distance (e.g. BUFX20 every 2.2mm, TSMC .18u)
• Calibrate process for ‘chip crossing time’, i.e. repeater insertion delays. Know these numbers by heart
• Add default constraints for the block synthesis flow that show a full chip crossing delay by default on the input (registered-out) and a repeater loading on every output
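
A rough sketch of the ‘chip crossing time’ arithmetic, assuming one repeater every 2.2mm as in the BUFX20 example. The per-segment delay (stage_delay_ns) is a placeholder, not a measured number: calibrate it for your own process and repeater. The function names are made up for illustration.

```python
import math

def repeaters_needed(distance_mm, spacing_mm=2.2):
    """Repeaters inserted along a route, assuming one every spacing_mm.

    The driver covers the first segment, so a route shorter than one
    spacing needs no repeater at all.
    """
    if distance_mm <= 0:
        return 0
    return max(0, math.ceil(distance_mm / spacing_mm) - 1)

def crossing_delay_ns(distance_mm, spacing_mm=2.2, stage_delay_ns=0.35):
    """Estimated crossing delay: number of segments times per-segment delay.

    stage_delay_ns is an ASSUMED placeholder (repeater plus wire segment);
    replace it with the value calibrated for the target process.
    """
    if distance_mm <= 0:
        return 0.0
    segments = math.ceil(distance_mm / spacing_mm)
    return segments * stage_delay_ns

if __name__ == "__main__":
    # A 10mm route with the default 2.2mm spacing.
    print(repeaters_needed(10.0), crossing_delay_ns(10.0))
```

The same number can then be used as the default input delay in the block synthesis constraints.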


Small Scale Issues
• Just say NO:
  - Tri-states. Problems galore with ERC, timing checks. Rarely seen, not usually any need for them in 6+ layers of metal
  - Multi-cycle paths (complexity for timing, and a pure area saving issue)
  - Bulk use of asynchronous set/resets or latches

Small Scale Issues, #2
• Make sure Synthesis is:
  - Not using “dangerous cells” (don’t_use)
  - Not using XL (lower power) cells
  - Not using AOI gates for muxes (or with feedback for clock enabled flops or sync reset flops)
  - Not using 8:1 muxes (slow, big and congested)
  - Has slightly over-constrained clocks
  - Has a max transition run: 800ps to 1.5 ns


Small Scale Issues (#3)
• Use a “reasonable” wireload model (more on this later). Take post-synthesis timing with a grain of salt
• Bad scan methodology dangers:
  - Test_compiler paradigm assumes ALL synopsys compiles are done “test aware”
  - However, most people too cheap to buy that many licenses, and want to use it as a ‘translator’ after main synthesis is done
  - Careful or it may undo your synthesis QOR, e.g. undoes max transition fixes

Small Scale Issues (#4)
• Assembly of gate-level netlist from RTL should match simulation netlist. Bad sensitivity lists, bad include file handling, RTL not under version control, etc. can cause formal verification to fail.
• Backend netlist should be checked after each build to make sure it is free of gtech cells, raw RTL, other clutter.
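
A minimal sketch of the post-build netlist check, assuming a structural Verilog netlist held as text. It assumes unmapped generic cells carry a GTECH_ name prefix and treats always/initial blocks as leftover raw RTL; both patterns are illustrative and should be extended for your flow.

```python
import re

# ASSUMPTION: unmapped generic cells are instantiated with a GTECH_ prefix.
GTECH_CELL = re.compile(r"^[ \t]*(GTECH_\w+)[ \t]+\w+[ \t]*\(", re.MULTILINE)
# ASSUMPTION: behavioral blocks in a gate-level netlist indicate raw RTL.
RTL_CONSTRUCT = re.compile(r"^[ \t]*(always|initial)\b", re.MULTILINE)

def netlist_clutter(netlist_text):
    """Return (kind, name) findings; an empty list means the netlist looks clean."""
    findings = []
    for match in GTECH_CELL.finditer(netlist_text):
        findings.append(("gtech", match.group(1)))
    for match in RTL_CONSTRUCT.finditer(netlist_text):
        findings.append(("rtl", match.group(1)))
    return findings
```

Run it after each build and fail the flow step when the returned list is non-empty.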


Small Scale Issues (#5)
• Assign statements ok inside pnr block hierarchy, but not ok at block boundary. Use synthesis to add buffers here.

Small Scale Issues (#6)
• Ram instantiation wrappers a good idea: specify logic construct (fifo, reg file), width+depth. Wrapper adds control, BIST and adds optimal physical ram object to construct the whole


Ram wrappers (#2)
• Minimize ram overhead area (decoders and sense amps)
• Make floorplanning easy by keeping each ram the same form factor if possible
• Small changes in widths or depths should keep instance names the same (give the floorplanner a break)

Ram wrappers (#3)
• Re-partitioning will change the name, so try to use leaf names and make these unique
• Round widths and depths up so that little changes in RTL don’t affect the floorplanning (or require new ram builds!)
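
The rounding rule can be sketched as below. The step sizes (8 bits of width, 64 words of depth) are assumptions picked for illustration; choose steps that match what your ram compiler builds efficiently.

```python
def round_up(value, step):
    """Round value up to the next multiple of step (integer arithmetic)."""
    return -(-value // step) * step

def ram_shape(width_bits, depth_words, width_step=8, depth_step=64):
    """Physical ram size for a requested logical size.

    width_step and depth_step are ASSUMED granularities. Small RTL changes
    that stay inside one step map to the same physical ram.
    """
    return round_up(width_bits, width_step), round_up(depth_words, depth_step)

if __name__ == "__main__":
    # 30x500 and 32x512 both land on the same ram, so the floorplan is untouched.
    print(ram_shape(30, 500), ram_shape(32, 512))
```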


Small Scale issues
• Hand instantiation: used for precise control of gates and/or placement. Name no longer changes. Use macros or other tricks to allow actual choice of gate type to be changed later
• Logic loops: e.g. process monitors (procmon ckts), etc. must have loops opened or some timing aware tools will freak out

Choices: Tools, Machines, Flows and Languages
• Pick hardware+OS that is the “first release” platform for your newest, least stable tools. Watch for Linux.
• The tool you already know how to use can be a lot better than the shiny new tool you don’t know. Never overestimate the capabilities of a tool you haven’t used before.
• Avoid “Science Projects” and focus on what really matters. Remember another name for “Engineer” is “A person that fixes problems we don’t have”


Choices (#2)
• On the other hand, anticipate when your current tools/flow will be out of gas in a particular area.
• Beware of “estimator” tools. Push forward before too long to check them.
• Hierarchy capable solutions are many, which do you think is best?

Choices (#3)
• Understand the inherent limitations of accuracy in the tools: e.g. How accurate is extracted timing + primetime? How about switch windows used in coupling analysis? How about IR thermal maps in static power analysis?
• Pick your process and think carefully. .13u is no picnic. Check the availability of all IP on that process. Has it seen silicon yet or are you the guinea pig for it?


Data Prep: Netlists
• Careful with assigns, they indicate fuzzy thinking at the block level.
• 1’b0, 1’b1 and tie-high, tie-low cells.
• Bizarre name space problems:
  - Best to avoid netnames like ‘input, output’. Duh.
  - Truly bizarre cases: qp, next, csh
• Mismatches with IP, case sensitivity issues
• Debus netlists, but not IP ports
• Normalize (remap) block names in backend so that downstream flow is always broken: e.g. “FarbleBlockUnit0” -> “fb0”

Data Prep: IP Types
• Stdcells
  - Power connected by abutment, placed in sea-of-rows (rarely rotated)
  - DRC clean in any combination
  - Circuit clean (i.e. no naked T-gates, no huge input capacitances)
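
Going back to the netlist name-space bullets, here is a small sketch of two data prep checks: flagging net names from the risky list on the slide, and remapping block names through an explicit table. The function names and the table-driven approach are assumptions for illustration; the slide does not say how the remap is implemented.

```python
# Names called out on the slide; extend with whatever your tools choke on.
RISKY_NET_NAMES = {"input", "output", "qp", "next", "csh"}

def risky_net_names(net_names):
    """Return the net names that collide with the risky list (case-insensitive)."""
    return sorted(n for n in net_names if n.lower() in RISKY_NET_NAMES)

def remap_block_names(block_names, table):
    """Map long block names to short backend names via an explicit table.

    Raises KeyError for a block missing from the table and ValueError when
    two different blocks would receive the same short name.
    """
    owner = {}
    result = {}
    for name in block_names:
        short = table[name]
        if short in owner and owner[short] != name:
            raise ValueError(f"{name} and {owner[short]} both map to {short}")
        owner[short] = name
        result[name] = short
    return result
```

An explicit table keeps the mapping stable from build to build, which an automatically derived abbreviation would not guarantee.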


Stdcell Example: BUFX20
[Figure]

Stdcells, Continued
• 8, 9, 10+ tracks in height
• Metal 1 only used (hopefully)
• Separate scan outputs vs. dual rail outputs
• Setup/hold margin on scan flops determines clock skew target (ex: artisan 120ps or so)
• Strange timings (ex: setup on artisan FFs)


Stdcells, Continued
• Multi-height stdcells
• Buffers: sizes, intrinsic delay steps, optimal repeater selection, EM issues for largest?
• Special clock buffers + gates (balanced P:N)
• Special metastability hardened flops
• Cap cells (metal1 used?)
• Gap fillers (metal1 used?)
• Tie-high, tie-low

Stdcells, Continued
• Stdcells, continued
  - Spares, tied off in netlist? Or tied off internally. Added and placed after placement, or as part of incoming netlist? FIB-able? Spare nets and routes?
  - Antenna protected stdcells, .13u


Macrocells: Rams
• Artisan compiled rams example
  - Rings for power: rotated rams require new master?
  - BIST/redundancy
  - Antenna protected?
  - Reasonable drive?
  - Layer use?

Macrocells: analog
• PLLs, DLLs
• DAC, ADC
• XO, Voltage reference generators
• RAC, Serializer/Deserializers


Tool and Flow Setup
• Tech files
  - Correct process and process corners
  - Verify the extraction and library data with test structures
  - Place under revision control, you may have to fiddle with it (e.g. artisan Apollo tech file does not have via stacking turned on)

Queuers and Wrappers for tool execution
• Wrappers: allow version and patch version selection for every tool, can also help with license management.
• Give users correct std version for tool, but always allow special version to be used at any point in flow
• Set up queuer: lsf, openpbs, gridware. Must deal with different machine speeds/memory, OS.
• Automatic machine selection + user specified hostname
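
A minimal sketch of the version-selection part of a tool wrapper: a site-wide default version per tool, with a per-step override. The tool name, version strings and the /tools/<tool>/<version>/bin install layout are all hypothetical.

```python
def resolve_tool(tool, defaults, overrides=None, root="/tools"):
    """Return the executable path for a tool.

    defaults  : std version for each tool (site-wide).
    overrides : special versions requested for this flow step, if any.
    root      : HYPOTHETICAL install layout root/<tool>/<version>/bin/<tool>.
    Raises KeyError if the tool has neither an override nor a default.
    """
    overrides = overrides or {}
    version = overrides.get(tool, defaults[tool])
    return f"{root}/{tool}/{version}/bin/{tool}"

if __name__ == "__main__":
    std = {"placer": "1.0"}  # made-up tool and version
    print(resolve_tool("placer", std))
    print(resolve_tool("placer", std, {"placer": "1.0-patch3"}))
```

Keeping the override per call, rather than per user environment, lets one flow step use a patch release while every other step stays on the std version.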


Batch mode issues with Avant!
• License use problems (variable route rules with tapering option selected suddenly need “HPO” license!? How did I know?)
• Tools should not grab all licenses by default upon startup
• Tools should spin lock by default for license if they don’t have it
• Tools should not hold license if they aren’t using it anymore

Batch mode issues with Avant! (#2)
• Null X Server issues: tools riddled with assumption that X server is present.
• Aserver process may or may not fork upon tool execution, can confuse scripts waiting for exit (e.g. make)
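
When a tool does not wait for a license by itself, the wrapper can do the waiting. The sketch below polls a caller-supplied try_checkout function, which is a hypothetical hook standing in for whatever license query your site uses; no real license manager API is implied.

```python
import time

def wait_for_license(try_checkout, poll_seconds=30.0, max_wait_seconds=None,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until try_checkout() succeeds.

    try_checkout     : HYPOTHETICAL callable returning True once a license is held.
    poll_seconds     : pause between attempts.
    max_wait_seconds : give up after this long (None means wait forever).
    sleep, clock     : injectable so the loop can be exercised without real delays.
    Returns True when a license was obtained, False on timeout.
    """
    start = clock()
    while True:
        if try_checkout():
            return True
        if max_wait_seconds is not None and clock() - start >= max_wait_seconds:
            return False
        sleep(poll_seconds)
```

A batch job then blocks in the queue slot instead of failing at 3am because the license was busy for a few minutes.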


Flow automation
• ‘make’ a great tool, but:
  - No dependencies on program outputs directly
  - No control over depth vs. breadth first execution
• Automatic Log checking essential, but watch out for signal or instance names with ‘error’ as part of the name 8)
• Automatic command file generation

IO and Packaging
• Pin count
• Performance (electrical, thermal)
• Availability/risk
• Cost
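
On the log checking caveat under Flow automation: a plain search for ‘error’ also fires on nets and instances that merely contain the word. One hedged way around it is to match only a severity tag at the start of a line. The log format assumed here ("Error:" or "ERROR (" leading the line) is an assumption; real tools differ, so adapt the pattern per tool.

```python
import re

# ASSUMED log format: the severity tag starts the line and is followed by ':' or '('.
SEVERITY = re.compile(r"^\s*(ERROR|Error|FATAL|Fatal)\s*[:(]")

def error_lines(log_lines):
    """Return only lines that start with an error severity tag."""
    return [line for line in log_lines if SEVERITY.match(line)]

if __name__ == "__main__":
    log = [
        "Error: cannot open file top.v",
        "Info: connected net error_flag to u_error_ctl/D",
        "Warning: instance core/error has no load",
    ]
    # Only the first line is reported; the net and instance names are ignored.
    print(error_lines(log))
```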


Packaging Styles, Level 1
• Level 1 (chip to package)
  - Wirebond
    - Padring -> linear or staggered
    - Right-angle or radial bonding
  - Flipchip (BGA)
    - Padring: peripheral/perimeter vs. area array
    - Eutectic or high temp (C4) solder bumps
    - Under fill for thermal stress management

Packaging Styles, Level 2
• Level 2 (package to board)
  - Leaded (e.g. QFP)
    - Low pin count (typically < 304)
    - Low performance (plastic, inductive), 2.5 to 3 Watts max
    - Low cost
    - Fast turnaround


Packaging Styles, Level 2
• Level 2, continued
  - Leadless (e.g. BGA, CSP [Chip Scale Package], CGA [Column Grid Array])
    - High pinout (> 2000)
    - High performance (planes, matched impedance, low inductance)
    - Higher cost (1.5 to 10 cents/pin, 1 cent/pin the “holy grail”)
    - Longer lead time (10-14 weeks once in the queue!)
• One customer pkg cost == 5 X silicon cost

Padring Shot 1
[Figure: vertical]


Padring Shot 2
[Figure: horizontal]

Packaging “Gotchas”
• Cavity packages have min and max die size limits
• Flip has no die area wiggle room once package build started
• Bumps vs. Balls: Bump to Ball ratio is NOT 1:1


Packaging Gotchas (#2)
• Material limits exist: the coefficient of thermal expansion limits the body size options of some packages (e.g. ceramic)
• Routability is limited by via pitch and fine line technology. I.e. going to bump pitch < 200um and ball pitch < 1mm doesn’t make sense right now

IO and Package
• Power hookup
  - Rings, slotting
  - EM issues, double bonding
  - Unusual LVS issues


Flip-Flop vs. Latch Timing

Latch and Flip-Flop Gates
[Figure: gate-level schematics and symbols (enable, in, out) of an active high latch and a rising edge flip-flop]
Latch and flip-flop schematics from TSMC 0.13um LV Artisan Sage-X Standard Cell Library.


Latch and Flip-Flop Behavior
[Figure: active high latch and rising edge flip-flop, each drawn when clock is high and when clock is low. Labels: t_DQ, 2 inverter delays; t_CQ, 4 inverter delays]

Clock Characteristics
[Figure: (a) clock skew t_sk,AB between clock at A and clock at B; (b) duty cycle jitter, labels T_high and t_duty; (c) cycle-to-cycle edge jitter, labels T and t_j]


Flip-Flop Timing Characteristics
[Figure: rising edge flip-flops A and B with non-ideal clocks. Setup time constraint labels: t_CQ, t_comb,max, t_su, t_sk + t_j, T_flip-flops. Hold time constraint labels: t_CQ,min, t_comb,min, t_sk, t_h]

Latch Setup Time and Transparency
[Figure: active high latches A and B with non-ideal clocks. Setup time constraint labels: t_CQ, t_comb,max, t_su, t_duty, t_sk + t_j. Transparent path labels: t_DQ, t_comb, t_DQ]
No penalty to clock period for setup time constraint!
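
The flip-flop constraints can be written out as arithmetic. This sketch uses the standard textbook form that matches the figure labels (setup: T >= t_CQ + t_comb,max + t_su + t_sk + t_j; hold: t_CQ,min + t_comb,min >= t_h + t_sk); the diagrams themselves did not survive extraction, so treat the exact form as an assumption. All numbers in the example are invented, in picoseconds.

```python
def min_clock_period(t_cq, t_comb_max, t_su, t_sk, t_j):
    """Smallest period that meets flip-flop setup, all times in the same unit.

    Skew and jitter are taken at their worst case, so both add to the period.
    """
    return t_cq + t_comb_max + t_su + t_sk + t_j

def hold_slack(t_cq_min, t_comb_min, t_h, t_sk):
    """Hold margin: positive means the fastest path still meets hold under skew."""
    return (t_cq_min + t_comb_min) - (t_h + t_sk)

if __name__ == "__main__":
    # Invented example values in ps.
    print(min_clock_period(t_cq=200, t_comb_max=2000, t_su=100, t_sk=120, t_j=50))
    print(hold_slack(t_cq_min=150, t_comb_min=50, t_h=60, t_sk=120))
```

Note that the clock period does not appear in the hold check: a hold violation cannot be fixed by slowing the clock.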
