Part V - UCSD VLSI CAD Laboratory

vlsicad.ucsd.edu


Part V: Design Optimizations

Andrew B. Kahng
UCSD and Blaze DFM, Inc.
abk@ucsd.edu

Three Trends

■ Trend 1: Reactions to “failure of WYSIWYG”
• Shape (litho, etch) and thickness (CMP) simulators
• Geometric criteria (process-window hot-spot checkers, etc.) before electrical criteria (Iddq, FMax variation, etc.)
• Library/IP development use models before full-chip use models
• Analyses before optimizations
■ Trend 2: Reactions to “uncontrollable variation”
• Experiments with statistical analysis tools
■ Trend 3: Commoditization of IDM internal technologies
• Defect-oriented yield analyses: critical area analysis
• Simple layout methodologies: post-route via/contact doubling

DAC-2006 DFM Tutorial: Nagaraj, Schoellkopf, Smayling, Wong, Kahng Andrew B. Kahng



Some Moderate Failures of Imagination

■ Linear extrapolation
• Larger guardbands
• More design rules
• Better equipment
■ Putting the “virtual fab” or “litho simulator” onto the designer’s desktop
■ Statistical timing analysis
■ Industry-wide regression
• DFM’s first wave: “All I want is what IBM has been making and using internally for the past 10 years…”

Proposed Precepts for DFM

■ Don’t assume what doesn’t exist
• Example: “detailed process information”
• What drives or even allows the process to improve?
• Process evolves over time/design with long time constant
• What improves the design today may hurt it tomorrow
■ Don’t mess with anything golden
• Handoff: GDSII/OASIS formats, BSIM4 model, .lib model
• Signoff: If the design is closed, don’t un-close it!!!
• Analyses: RC extraction, performance, litho simulation
• Private: Litho setup, OPC recipes
■ Don’t assume a “new silicon engineer”
• 21st-century IC designer = deep and broad (“from C to OPC”?)
• But not unboundedly so → separation of concerns is a good thing
• Don’t ask a designer to become a lithography engineer
• Don’t ask lithography engineers to understand the design



Where We Are Today

■ Huge $$$ still left on the table
• “Left on table” = recoverable by improved design technology without any process or productivity change
• Many concrete examples exist!
• Will recover much of this in the next 3-4 years?
    • Power: 0.5 x full technology node
    • Area: 0.3 x full technology node
    • Frequency: 1.0 x full technology node
    • Variability control: 1.0 x full technology node
■ Simulation- and analysis-centric “first wave” of DFM
• Still has some “failures of imagination”
■ Near-term goals
• Embrace variation and optimize parametric yield
• Give clear ROI for products

Outline

■ Detailed Placement for Process Window Enhancement
■ CMP Fill at 65nm and Below
■ Auxiliary Pattern Methodology for Cell-Based OPC
■ Crosstalk Awareness in SSTA
■ Other


Bias OPC

[Figure: the original design (or mask) and the OPC design (or mask) each pass through the lithography process, and the resulting wafer patterns are compared.]

■ Mask design is modified using a layout sizing technique so that photoresist edges match the layout edges
• Bias OPC is limited in how far it can enhance process margins with respect to defocus and exposure dose

SRAF (Sub-Resolution Assist Feature)

[Figure: layout (or mask) design over active, and wafer structure (SEM), for SB = 0, 1, and 2.]

[Figure: process margin (180nm), CD vs. DOF (0.0 to 0.6), for SB0, SB1, and SB2.]

■ SRAF = Scattering Bar (SB)
■ SRAFs enhance process window (focus, exposure dose)
• Extremely narrow lines → do not print on wafer
• More SBs help to enhance DOF margin and to meet the target CD

#SB        0     1     2
CD (nm)    160   177   182


SRAFs and Bossung Plots

[Figure: Bossung plots of CD (nm) vs. DOF (um, -0.8 to 0.8), one for Bias OPC and one for SRAF OPC.]

■ Bossung plot
• Measurement to evaluate lithographic manufacturability
• For through-pitch process margin, maximize the common process window
• Horizontal axis: Depth of Focus (DOF); vertical axis: CD
■ SRAF OPC
• Improves process margin of isolated pattern
• Larger overlap of process window between dense and isolated lines

Forbidden Pitches

[Figure: CD (nm) vs. pitch (nm, 100 to 1500) for W/O OPC (best DOF), W/O OPC (defocus), Bias OPC (defocus), and SRAF OPC (defocus), with regions for #SB = 1 to 4 and allowable and forbidden pitch intervals marked.]

■ Some pitches do not allow for sufficient SRAFs
• → Lowers printability, DOF and exposure margins
• → Called the forbidden pitch
■ Bias OPC → NOT allowable CD for intermediate and large pitches
■ SRAF OPC has intervals of allowed and forbidden pitches
■ → Must avoid forbidden pitches in layout


Layout Composability for SRAFs

[Figure: spacing x + δx prints better than spacing x.]

■ Small set of allowed feature spacings
• Perturbation makes bad-printing layout assist-correct
■ Two components of SRAF-aware methodology
• Assist-correct libraries
    • Library cell layout should avoid all forbidden pitches
    • Intelligent library design
• Assist-correct placement ← THIS TOPIC
    • Intelligent whitespace adjustment in the placer

AFCorr: SRAF-Correct Placement

[Figure: forbidden pitch at a cell boundary, before and after AFCorr.]

■ By adjusting whitespace, additional SRAFs can be inserted between cells
• Resist image improves and avoids open fault at worst-case defocus
■ Problem: Perturb given placement minimally to achieve as much SRAF insertion as possible


Horizontal AFCorr (H-AFCorr)

[Figure: forbidden pitch at a cell boundary, before and after the H-AFCorr horizontal perturbation.]

■ Horizontal forbidden pitch is caused by interactions of poly geometries in the same row
■ H-AFCorr is cell placement perturbation in the horizontal direction to avoid H-forbidden pitches

Vertical AFCorr (V-AFCorr)

[Figure: forbidden pitch at a cell boundary, before and after V-AFCorr.]

■ Vertical forbidden pitch is caused by interactions of poly geometries across adjacent cell rows
• → Adjust cell row in the left or right direction to remove the forbidden pitch → space becomes assist-correct


Perturbation (H- + V- AFCorr)

[Figure: placement after H-AFCorr and V-AFCorr.]

■ AFCorr: H-AFCorr + V-AFCorr
• Adjusting whitespace → additional SRAFs → reduced number of forbidden pitches

Minimum Perturbation Approach

■ Objective:
• Reduce forbidden pitch violations
• Reduce weighted CD degradation with defocus
• Minimum perturbation: preserve timing
■ Constraint:
• Placement site width must be respected
■ How:
• One standard cell row at a time
• Solve each cell row by dynamic programming


Feasible Placement Perturbation

[Figure: adjacent cells C_{a-1} and C_a at locations x_{a-1} and x_a, showing width w_{a-1} and the border-to-poly spacings S_{a-1}^{RP} and S_a^{LP}.]

$$\text{Minimize } \sum_i |\delta_i|$$

$$\text{s.t. } \delta_a + \delta_{a-1} + S_{a-1}^{RP} + S_a^{LP} + (x_a - x_{a-1} - w_{a-1}) \in AF$$

• w_i and x_i = width and location of cell C_i
• δ_i = perturbation of location of cell C_i
• AF = set of allowed spacings
• RP, LP = boundary poly shapes with overlapping y-spans
• S = spacing from cell border to boundary poly

Dynamic Programming Solution

$$\mathrm{COST}(1,b) = |x_1 - b|$$

(subrow up through cell 1, with cell 1 at location b)

$$\mathrm{COST}(a,b) = \lambda(a)\,|x_a - b| + \min_{x_{a-1}-\mathrm{SRCH} \,\le\, i \,\le\, x_{a-1}+\mathrm{SRCH}} \left[ \mathrm{COST}(a-1,i) + \mathrm{HCost}(a,b,a-1,i) + \mathrm{VCost}(a,b) \right]$$

(SRCH = maximum allowed perturbation of cell location)

HCost = “forbidden-pitch cost” = sum over horizontal adjacencies of slope(j) · |HSpace − AF_j|, where j satisfies AF_{j+1} > HSpace ≥ AF_j

VCost = “forbidden-pitch cost” = sum over vertical adjacencies of slope(j) · |VSpace − AF_j|, where j satisfies AF_{j+1} > VSpace ≥ AF_j

■ λ = proportional to the timing criticality of cell a
■ Slope = ∆CD / ∆Pitch = CD degradation per unit space between AF values
■ AF_j = closest assist-feasible spacing ≤ HSpace
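The recurrence can be sketched for a single row as follows. This is a minimal illustration, not the authors' implementation: it assumes all locations, widths, and spacings are integers in placement-site units, it omits VCost (treated as zero), and it rejects overlapping cells. The names `afcorr_row`, `hcost`, `s_rp`, and `s_lp` are hypothetical.

```python
from bisect import bisect_right

def hcost(space, af, slope):
    """Forbidden-pitch cost of one horizontal spacing.

    af is the sorted list of allowed spacings; slope[j] is the assumed CD
    degradation per unit of space beyond af[j].
    """
    j = bisect_right(af, space) - 1       # largest j with af[j] <= space
    if j < 0:
        return float("inf")               # tighter than any allowed spacing
    return slope[j] * (space - af[j])

def afcorr_row(x, w, s_rp, s_lp, af, slope, lam, srch):
    """Return (total cost, new locations) for one row of cells."""
    n = len(x)
    cost = [dict() for _ in range(n)]     # cost[a][b]: best cost with cell a at b
    back = [dict() for _ in range(n)]     # back[a][b]: chosen location of cell a-1
    for b in range(x[0] - srch, x[0] + srch + 1):
        cost[0][b] = abs(x[0] - b)        # COST(1, b) = |x_1 - b|
    for a in range(1, n):
        for b in range(x[a] - srch, x[a] + srch + 1):
            best, arg = float("inf"), None
            for i, prev in cost[a - 1].items():
                gap = b - (i + w[a - 1])  # whitespace between cell outlines
                if gap < 0:
                    continue              # cells would overlap
                space = gap + s_rp[a - 1] + s_lp[a]
                total = prev + hcost(space, af, slope)
                if total < best:
                    best, arg = total, i
            if arg is not None:
                cost[a][b] = lam[a] * abs(x[a] - b) + best
                back[a][b] = arg
    # Raises ValueError if no feasible placement exists within SRCH.
    b = min(cost[-1], key=cost[-1].get)
    total = cost[-1][b]
    locations = [b]
    for a in range(n - 1, 0, -1):
        b = back[a][b]
        locations.append(b)
    locations.reverse()
    return total, locations

# Two cells whose original poly-to-poly spacing (3) falls between allowed values 2 and 6.
total, locations = afcorr_row(x=[0, 5], w=[4, 4], s_rp=[1, 1], s_lp=[1, 1],
                              af=[2, 6], slope=[3.0, 3.0], lam=[1, 1], srch=2)
```

In this toy row a one-site shift is cheaper than the forbidden-pitch penalty, so the sketch closes the spacing to an allowed value. Raising `lam` for a timing-critical cell makes the DP prefer moving its neighbor instead.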



Experimental Setup

■ KLA-Tencor’s Prolith
• Model generation for OPCpro
• Best focus / worst (0.5 micron) defocus
• Calculating forbidden pitches
■ Mentor’s OPCpro, SBar SVRF
• OPC, SRAF insertion, ORC (Optical Rule Check)
■ Cadence SOC Encounter
• Placement and routing
■ Synopsys Design Compiler
• Benchmark design: ALU from OpenCores.org
• Synthesis

Experimental Metrics

■ SB Count
• Total number of scattering bars or SRAFs inserted in the design
• Higher number of SRAFs indicates less through-focus variation and is hence desirable
■ Forbidden Pitch Count
• Number of border poly geometries estimated as having greater than 10% CD error through-focus
■ EPE Count
• Number of edge fragments on border poly geometries having greater than 10% edge placement error at the worst defocus level


Results: Increased SB Count

[Figure: total SB count (left axis, 0 to 300000) and SB difference (right axis) vs. utilization (90% to 50%), with and without AFCorr, for 130nm and 90nm.]

■ SB count increases as utilization decreases due to increased whitespace
■ #SB increases after AFCorr placement → better DOF

Results: Reduced F/P and EPE

[Figure: reduction (%) in forbidden pitch count and EPE count vs. utilization (90% to 50%), for 130nm and 90nm.]

■ Forbidden pitch count
• Reduced by 89%~100% in 130nm, 93%~100% in 90nm
■ EPE count
• Reduced by 80%~98% in 130nm, 83%~100% in 90nm


Impact on Other Design Metrics

Table columns: Utilization (%), Flow, then for each of 130nm and 90nm: #EPE, OPC run time R/T (s), GDS size (MB), Delay (ns).

Orig

8772

6721

42.9

4.21

7523

4835

41.1

2.478

41.9

1262

5011

■ Overheads: data size 3%, OPC run time 4%, cycle time 6%
■ Other impacts are negligible and/or at inherent noise level, compared to the large improvement in printability metrics

AFCorr Summary

90

AFCorr

2267

6732

4.49

42.3

2.305

Orig

5975

6839

41.8

4.547

4813

5451

41.2

2.458

AFCorr

962

6899

42.3

532

5535

2.602

Orig

4976

6878

42.2

2131

5529

2.522


80

4.444

43.2

4.501

42.2

70

AFCorr

274

6932

42.2

4.371

107

5632

■ AFCorr is an effective approach to achieve assist feature compatibility in physical layout
■ Up to 100% reduction of forbidden pitch and EPE
■ Relatively negligible impacts on GDSII size, OPC runtime, and design clock cycle time
• Compared to huge improvement in printability


42.3

2.47



Etch Dummy Insertion Problem

[Figure: layout showing active, poly, SRAF, and etch dummy shapes.]

[Figure: resist CD and etch CD (nm) vs. space (nm, 100 to 2100).]

■ Etch skew increases as pitch of primary pattern increases → etch dummy → reduces poly-to-poly space → reduces etch skew
■ Etch dummies are placed outside of the diffusion-layer (or active-layer) region

Etch Dummy Correction Problem

[Figure: poly, active, assist feature, and etch dummy shapes, contrasting a case with no forbidden pitch against a forbidden-pitch case where the assist feature is missing.]

■ Given a standard-cell layout,
• determine perturbations to inter-cell spacings so as to simultaneously insert SRAFs in forbidden pitches and insert etch dummies.


Technique 1: SAEDM
(SRAF-Aware Etch Dummy Method)

[Figure: active, poly, SRAF, and etch dummy shapes with left (L) and right (R) spacings. Before SAEDM: SRAF missing, L = R. After SAEDM: SRAF inserted, L ≠ R.]

■ Typical etch dummy rule: fixed rule of active-to-etch dummy spacing
■ SAEDM: flexible etch dummy rule according to active-to-etch dummy spacing
• Calculate left poly-to-dummy and right poly-to-dummy spacings to insert assist features and etch dummies simultaneously
• Inserted etch dummies have asymmetric active-to-dummy spacings

Technique 2: AFCorr + EtchCorr
Placement Correctness Requirements

■ Key idea: Change whitespace distribution of standard-cell placement → best printability
• Maximize number of assist features (AFCorr)
• Optimal location of etch dummy (EtchCorr)
■ AS (ES): sets of feasible spaces between two gates that allow insertion of required assist features (etch dummy)


Algorithmic Approach: Corr Technique
Dynamic Programming (DP)

■ The combined “AFCorr and EtchCorr” problem can be solved by dynamic programming (DP): Corr = AFCorr + EtchCorr
■ Cost(a;b): the cost of placing cell a at placement site number b
• Component 1: perturbation component (x_a - b) from the original placement of cell a, measured in placement sites
• Component 2: AFCost and EtchCost correspond to the printability deterioration of resist and etch CD, respectively
■ λ: a factor that sets the relative importance of preserving the initial placement versus the final EtchCorr benefit achieved
■ α and β are user-defined weights for AFCost and EtchCost, respectively
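The slide's cost formula did not survive extraction. One plausible form, assumed here by analogy with the AFCorr recurrence and the component descriptions above (the argument lists of AFCost and EtchCost and the search range SRCH are assumptions, not taken from the slide), is:

```latex
\mathrm{Cost}(a,b) = \lambda\,|x_a - b|
  + \min_{x_{a-1}-\mathrm{SRCH} \,\le\, i \,\le\, x_{a-1}+\mathrm{SRCH}}
    \Big[ \mathrm{Cost}(a-1,i)
      + \alpha\,\mathrm{AFCost}(a,b,a-1,i)
      + \beta\,\mathrm{EtchCost}(a,b,a-1,i) \Big]
```

Under this reading, a larger λ keeps cells near their original sites, while α and β trade resist printability against etch printability.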


Design and Evaluation Flow

[Flow diagram comparing the typical design flow (placement → route → typical GDSII) with the modified flow. Elements shown: modified library & netlist; lithography & etch model generation; etch dummy and SRAF insertion rules, forbidden pitch; etch dummy generation based on SAEDM; post-placement (Corr); route; assist and etch dummy corrected GDSII; SB OPC (SB insertion, model-based OPC); OPCed GDS; quality metrics: printability (#etch dummy and #SB, EPEs of resist and etch) and performance (delay, OPC run time).]

■ More amenable to insertion of SRAFs and etch dummies
■ Novel design flow: adds forbidden pitch and SRAF insertion rules, and the SAEDM and Corr techniques, to the typical design flow


Experimental Results

[Figure: total SB and etch dummy counts (left axis, 0 to 160000) and SB/dummy differences (right axis, 0 to 40000) vs. utilization (90% to 50%), with and without EtchCorr.]

[Figure: reduction (%) vs. utilization (90% to 50%) for W SAEDM and W/O EtchCorr (resist), W SAEDM and W EtchCorr (resist), and W SAEDM and W EtchCorr (etch).]

■ Number of total SRAFs and etch dummies increases due to increased whitespace
■ Forbidden pitch count reduction for the photo process → 58%-97% with SAEDM and 90%-100% with (SAEDM + Corr)
■ Forbidden pitch count reduction for the etch process → 77%-97% with (SAEDM + Corr)

Corr Summary

■ Corr placement perturbation with SAEDM can achieve up to 100% reduction in the number of cell border poly geometries having forbidden pitch violations. The corresponding reduction in EPE is up to 100% (resist CD) and 97% (etch CD).
■ SB count and etch dummy count, where higher counts indicate less through-focus CD variation and less etch skew, increase by up to 10.8% and 18.6%, respectively.
■ The data size, OPC running time, and maximum delay overheads of Corr are within 3%, 4% and 6%, respectively.


Outline

■ Detailed Placement for Process Window Enhancement
■ CMP Fill at 65nm and Below
■ Auxiliary Pattern Methodology for Cell-Based OPC
■ Crosstalk Awareness in SSTA
■ Other

BEOL Contribution to Variation

Parameter                                                                        Delay Impact
BEOL metal (metal mistrack, thin/thick wires)                                    -10% → +25%
Environmental (voltage islands, IR drop, temperature)                            ±15%
Device fatigue (NBTI, hot electron effects)                                      ±10%
V_t and T_ox device family tracking (can have multiple V_t and T_ox families)    ±5%
Model/hardware uncertainty (per cell type)                                       ±5%
N/P mistrack (fast rise/slow fall, fast fall/slow rise)                          ±10%
PLL (jitter, duty cycle, phase error)                                            ±10%

→ Scalable optimal CMP fill (metal, STI, timing, fill pattern)
→ Combinatorial methods for redundant via insertion
→ “Religious questions”


CMP and DFM

[Diagram relating CMP, topography, R, C parasitics, depth of focus, design timing and power, and lithographic manufacturability.]

• CMP and fill effects
    • Cu erosion and dishing cause resistance change
    • Dummy fill to aid CMP in achieving planarity causes capacitance change
    • Topographic variation translates to focus variation for imaging of subsequent layers → reduced process window → linewidth variation → R, C variation
• CMP interacts closely with design as well as lithography

Fixed-Dissection Regime

■ To make filling more tractable, monitor only a fixed set of w × w windows
• offset = w/r (example shown: w = 4, r = 4)
• Partition n × n layout into nr/w × nr/w fixed dissections
■ Each w × w window is partitioned into r² tiles
■ Basic rules: upper / lower bounds on window densities (original layout + inserted fill)
• Example: windows have w = 100um
• Each window divided into r = 4 “steps”
• Step distance = 25um
• → 20mm, 10LM ASIC chip will have 6.4 million “tiles”
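The tile count in the example follows from simple arithmetic: a 20mm side divided into 25um steps gives 800 tiles per side, so 800 × 800 × 10 layers = 6.4 million tiles. The sketch below reproduces that count and evaluates the overlapping window densities of a fixed dissection. It assumes equal-area tiles (so a window's density is the mean of its r × r tile densities), and the function names are illustrative only.

```python
def tile_count(chip_um, window_um, r, layers):
    """Number of tiles on a square chip under a fixed r-dissection."""
    tile_um = window_um / r                 # step distance, e.g. 100 / 4 = 25 um
    per_side = int(chip_um / tile_um)       # tiles along one chip edge
    return per_side * per_side * layers

def window_densities(tile_density, r):
    """Density of every overlapping window of r x r tiles.

    tile_density is a square grid (list of lists) of per-tile densities;
    consecutive windows are offset by one tile (w/r).
    """
    n = len(tile_density)
    result = []
    for i in range(n - r + 1):
        row = []
        for j in range(n - r + 1):
            total = sum(tile_density[i + di][j + dj]
                        for di in range(r) for dj in range(r))
            row.append(total / (r * r))
        result.append(row)
    return result

tiles = tile_count(chip_um=20000, window_um=100, r=4, layers=10)
windows = window_densities([[0.0, 1.0], [1.0, 0.0]], r=2)
```

Density rules (upper and lower bounds) are then checked on each entry of `windows`, counting both original layout and inserted fill.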

[Figure: n × n layout divided into tiles of side w/r, with overlapping w × w windows.]


Previous / New Objectives in Density Control

■ Objective for Manufacture = Min-Var
    minimize window density variation
    subject to upper bound on window density
■ Objective for Design = Min-Fill
    minimize total amount of added fill features
    subject to upper bound on window density variation

NEW !!!
■ Multi-layer and multi-window constraints
■ Fully staggered fill patterning and/or wire-like (“track”) fill
■ Maximize via fill
■ Maximize smoothness of density
■ Drive with CMP (post-polish wafer topography) simulation
■ Handle analog symmetry requirements
■ …

Previous Works on Fill Synthesis

■ Kahng et al.
• First LP-based approach for Min-Var objective
• Minimize M s.t.
    M ≥ |dens(W_i) – dens(W_j)| ∀ i,j (force minimum variation)
    |dens(W_i) – dens(W_j)| ≤ K ∀ i,j neighbors (smoothness)
    where dens(W) = density(orig layout + added fill) in all tiles of W
    (Problem: there are millions of tiles in the chip!)
• Iterated Monte-Carlo/greedy methods, hierarchical and multiple-layer fill methods
■ Wong et al.
• LP-based approaches for Min-Fill objective
• LP-based approaches for multiple-layer fill problem and dual-material fill problem
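The two quantities bounded by the Min-Var LP can be evaluated directly on a grid of window densities. The sketch below computes the density range (the smallest feasible M) and the largest difference between neighboring windows (the smallest feasible K) for a given fill solution. It only evaluates a solution and does not solve the LP; treating windows that are adjacent horizontally or vertically as "neighbors" is an assumption.

```python
def density_range(win):
    """Smallest M satisfying M >= |dens(Wi) - dens(Wj)| for all window pairs."""
    flat = [d for row in win for d in row]
    return max(flat) - min(flat)

def max_neighbor_difference(win):
    """Smallest K satisfying |dens(Wi) - dens(Wj)| <= K for adjacent windows."""
    worst = 0.0
    for i, row in enumerate(win):
        for j, d in enumerate(row):
            if j + 1 < len(row):                    # right-hand neighbor
                worst = max(worst, abs(d - row[j + 1]))
            if i + 1 < len(win):                    # neighbor below
                worst = max(worst, abs(d - win[i + 1][j]))
    return worst

win = [[0.2, 0.3],
       [0.5, 0.4]]
m_value = density_range(win)
k_value = max_neighbor_difference(win)
```

Comparing these two numbers before and after fill gives a quick check of how much a fill solution improves variation and smoothness.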



What Would “Optimum” CMP Fill Look Like?

■ Kahng et al. 1998: Linear Programming (LP) approach for Min-Var objective

$$\text{Minimize } M \text{ s.t.}$$

$$M \ge |\mathrm{dens}(W_i) - \mathrm{dens}(W_j)| \quad \forall \text{ window pairs } W_i, W_j \qquad \text{(minimize variation)}$$

$$|\mathrm{dens}(W_i) - \mathrm{dens}(W_j)| \le K \quad \forall \text{ window pairs } W_i, W_j \text{ that are neighbors} \qquad \text{(enforce smoothness)}$$

dens(W) = sum of original layout + added fill in all tiles of W

■ Variables in LP = amounts of fill 0 ≤ f_ijk ≤ s_ijk added into each tile, where f_ijk are the variables that we optimize and s_ijk is the “fill slack” computed by initial layout analysis
■ Difficulty: There are millions of variables in this LP!!!

“Difficult” image sensor chip:

                     Min. D    Max. D    delta D   # of Fill   Avg. Smoothness
Original Solution    0.1652    0.4717    0.3065    ---         0.0508
minVar               0.4153    0.5448    0.1295    784,968     0.0234
minFill              0.3234    0.4717    0.1483    416,773     0.0317
maxSmoothness        0.3945    0.5243    0.1298    711,429     0.0174

Density Variation and Smoothness

Cases           Window      Minimum   Maximum   Density Range   Average Delta Density   Fill Area
                Size (um)   Density   Density   (Variation)     (Smoothness)            (um x um)
Original        25          0.0000    0.7600    0.7600          -                       -
                50          0.0000    0.7600    0.7600          -                       -
                100         0.0080    0.7033    0.6953          0.1339                  -
minVar          25          0.1914    0.7600    0.5686          -
                50          0.2273    0.7600    0.5327          -                       302915
                100         0.2355    0.7033    0.4678          0.0435
maxSmoothness   25          0.1914    0.7600    0.5686          -
                50          0.2273    0.7600    0.5327          -                       308204
                100         0.2354    0.7033    0.4679          0.0298
minFill         25          0.1504    0.7600    0.6096          -
                50          0.1555    0.7600    0.6045          -                       201952
                100         0.1612    0.7033    0.5421          0.0532


Religious Questions in BEOL DFM

■ Should CMP fill be owned by the routing / timing closure tool or by the DRC / PG tool?
• Answer: proper fill is best achieved today post-layout by a tool that maintains the signoff
■ Must fill be “timing-driven”, or is “timing-aware” sufficient?
• Answer: “Timing-aware” is likely sufficient through the 45nm node
■ Are CMP and litho simulations for “more accurate parasitics and signoff” really necessary?
• Answer: Probably not. CD and thickness variations are “self-compensating” w.r.t. timing. Guardbands are reasonable. There is a big mess with existing calibrations of the RC extraction tool to silicon.
■ If two solutions both meet the spec, are they of equal value?
■ How elaborate must cost functions and layout knobs be for EDA tools to understand via yield / reliability, EM, etc.?
■ ...

“Intelligent” Fill Goals for 65nm and beyond

■ True timing- and SI-awareness
• Driven by internal engines for incremental extraction, delay calculation, static timing/noise analysis
• Open question: is this done by the router? Or post-layout processing?
■ True multi-layer, multi-window global optimization of effective density smoothness and uniformity
• Recall: millions of “tiles” – can we optimize all fill on all layers simultaneously?
■ Analog fill, capacitor fill, via fill
■ Floating, grounded and track fill
■ Standalone, ECO, and ripup-refill use models
■ Supports thickness bias models (CMP predictors)
■ Key technology for managing BEOL variability and enhancing parametric yield


Density Histogram of Pre-/Post- Fill

[Figures: “oxide” density histograms for the original layout (∆D = 31%), minVar (∆D = 13%), and minFill (∆D = 15%).]

Generate Symmetric Fill (Analog Regions)

[Figure: analog cell with its axis of symmetry.]


Timing-Driven Fill: Early Ideas

■ General guidelines:
• Minimize total number of fill features
• Minimize fill feature size
• Maximize space between fill features
• Maximize buffer distance between original and fill features
■ Sample observations in literature
• Motorola [Grobman et al., 2001]: key parameters are fill feature size and buffer distance
• Samsung [Lee et al., 2003]: floating fills must be included in chip-level RC extraction and timing analysis to avoid timing errors
• MIT MTL [Stine et al., 1998]: proposed a rule-based area fill methodology to minimize added interconnect coupling capacitance

Extensions

■ Consider impact due to fill on overlap and fringe capacitance
• Directly impacts dynamic power (CV²f)
■ Multi-layer filling for better CMP modeling and timing paths across different layers
■ Use fill to intentionally benefit timing robustness
• Shortcut power/ground distribution networks → better IR drop
• Extra capacitance for hold time critical paths → more robust timing
■ Integrate a simplified CMP model in fill insertion and intermediate RC estimation
■ Let’s look at some possibilities for timing-aware flow and CMP model integration


Timing-Aware and Timing-Driven Use Models

[Flow diagram: P&R (ECO) → DEF / DB → Intelligent Fill (ECO) → DEF’ / DB’ (or GDS) → RCX → SI / Timing, with an external CMP model exchanging GDS and a topography map. Intelligent Fill inputs: SPEF, SDC; SI / timing reports; list of critical nets; GDS / LEF/DEF / OA; tech file / DRM; user parameters. Outputs: GDS’ / DEF’ / OA’, reports, and SPEF’ (to signoff analyses). A legend distinguishes the timing-aware paths from the additional timing-driven paths.]

Critical Net File Example

MULT.C[46] { M1 M1 1.4 \
M1 M2 1.12 \
M2 M1 1.12 \
M2 M2 1.4 \
M2 M3 1.12 \
M3 M2 1.12 \
M3 M3 1.4 \
M3 M4 1.12 \
M4 M3 1.12 \
M4 M4 1.4 \
M4 M5 1.12 \
M5 M4 1.12 \
M5 M5 1.4 \
M5 M6 1.12 \
M6 M5 1.12 \
M6 M6 1.4 \
M6 M1_2B 2.24 \
M1_2B M6 2.24 \
M1_2B M1_2B 2.8 \
M1_2B M2_2B 2.24 \
M2_2B M1_2B 2.24 \
M2_2B M2_2B 2.8 \
}

Reading an entry such as “M3 M2 1.12”: for a net segment in M3, block 1.12um from the segment in M2.

For example, for each of the top K critical nets, block out areas in:
(1) layer below the net
(2) layer of the net
(3) layer above the net
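A file in this shape can be read into a lookup table of keepout distances. The parser below is a minimal sketch: it assumes each entry is "net layer, blocked layer, distance in um" (inferred from the annotation on the slide) and that each net's block opens with `{` and ends with a line containing `}`. The name `parse_keepouts` is hypothetical and is not part of any tool.

```python
def parse_keepouts(text):
    """Map net name -> {(net_layer, blocked_layer): keepout distance in um}."""
    rules = {}
    net = None
    for raw in text.splitlines():
        # Drop the trailing line-continuation backslash, if any.
        line = raw.strip().rstrip("\\").strip()
        if not line:
            continue
        if "{" in line:
            name, line = line.split("{", 1)
            net = name.strip()
            rules[net] = {}
            line = line.strip()
        if line.startswith("}"):
            net = None
            continue
        if net is not None and line:
            net_layer, blocked_layer, distance = line.split()
            rules[net][(net_layer, blocked_layer)] = float(distance)
    return rules

# Excerpt of the example file above.
sample = r"""
MULT.C[46] { M1 M1 1.4 \
M1 M2 1.12 \
M3 M2 1.12 \
M3 M3 1.4 \
M3 M4 1.12 \
}
"""
rules = parse_keepouts(sample)
m3_rules = {k: v for k, v in rules["MULT.C[46]"].items() if k[0] == "M3"}
```

For a segment of `MULT.C[46]` on M3, `m3_rules` then lists the keepout to apply on the layer below (M2), the same layer (M3), and the layer above (M4).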



M2 Fragment Showing Timing-Aware Keepout

[Figure: M2 layout fragment with timing-aware keepout regions.]

Timing-Aware Keepout Illustration (M4)

[Figure: M4 route, M4 fill, and M4 keepout shapes.]


Density Variation vs. #Critical Nets

[Figure: Metal 5 layer density range (13.0% to 14.0%) vs. number of critical nets chosen (100 to 900).]

• Density range is a weak function of the number of critical nets.
• Blaze IF can compensate for the loss of potential fill areas.

Timing-Aware and Power-Aware Fill

Design: Image Processor (1.3mm x 1.3mm, 90nm, 8 metal layers, maxSmooth)

                                                  No fill   Fill w/o CNF   Fill w/ CNF
TIMING-AWARE FILL
# of violating endpoints                          0         5              0
Worst endpoint slacks (ns)
..ICACHE/ICACHE/MyBusy_R_reg/D                    0.000     -0.084         0.000
..COPIF3/COPIFX/COPLOGIC1/CWRDATA_R_reg[31]/D     0.040     -0.050         0.044
..ICACHE/ICACHE/IC_HALT_S_R_reg[1]/D              0.045     -0.034         0.045
..ICACHE/ICACHE/IC_HALT_S_R_reg[0]/D              0.048     -0.019         0.048
..ICACHE/ICACHE/IC_HALT_S_R_reg[2]/D              0.048     -0.003         0.048
Layout Density Variation
Metal1                                                      0.659          0.659
Metal2                                                      0.747          0.805
Metal3                                                      0.769          0.721
Metal4                                                      0.703          0.804
Metal5                                                      0.684          0.748
Metal6                                                      0.665          0.730
Metal7                                                      0.600          0.630
Metal8                                                      0.613          0.613
POWER-AWARE FILL
Dynamic power (mW)                                20.131    21.229         20.471


Intelligent Fill With CMP Modeling

[Flow diagrams labeled “(1) TOMORROW?” and “(2) AFTER TOMORROW??”. In each, Intelligent Fill takes layout, design data, and fill constraints and produces a post-fill layout and reports, which go to the signoff CMP model. The objectives shown are uniform effective density, and uniform effective density + step height. One configuration exchanges GDS and a topography map with an external CMP model.]

Approximating the Signoff CMP Model

Calibration data for each grid point:
• X (um), Y (um)
• Density
• Cu thickness (A)
• Dielectric thickness (A)
• Optional: Pre-CMP Cu thickness, trench depth, barrier thickness, etc.

[Flow diagram labeled “(3) AFTER AFTER TOMORROW???”: test layouts are run through the signoff CMP model (or silicon) to obtain topography predictions (or measurements), which are used to build an internal CMP model as an approximation of the signoff CMP model. Intelligent Fill takes layout, design data, and fill constraints, uses the internal CMP model with a uniform effective density + step height objective, and produces a post-fill layout and reports that go to the signoff CMP model.]


Multi-Layer Fill Optimization

[Figure: stacked layers M1, M2, and M3, with axes X, Y, and T.]

M1 topography impacts M3 topography

Min-Var Optimization
M ≥ |D_max,3 – D_min,3|
M ≥ |D_max,2 – D_min,2|
M ≥ |D_max,1 – D_min,1|

Layers co-optimized for minimum density variation

RISC CPU Core Example (90nm)

          Original Thickness   Post-Fill Thickness
          Variation (A)        Variation (A)
Metal1    662                  492
Metal2    1642                 1217
Metal3    1270                 1300
Metal4    1969                 1658
Metal5    1657                 1608
Metal6    1935                 1711
Metal7    1835                 1670

CD Variation Due To Topography

■ Side view showing thickness variation over regions with dense and sparse layout.
■ Top view showing CD variation when a line is patterned over a region with uneven wafer topography, i.e., under conditions of varying defocus.

Goal: OPC technique that is aware of post-CMP topography


Topography-Aware OPC (TOPC) Flow

[Flow diagram. Elements shown: library & technology; GDSII; CMP simulation; DOF marking layer; DOF model database; input GDSII for TOPC; TOPC → TOPCed GDSII; standard OPC flow: SOPC → SOPCed GDSII.]

■ A map of thickness variation from CMP simulation is converted to defocus marking layers and then fed into GDSII for TOPC

TOPC Results

[Figure: number of EPE (0 to 10000) vs. DOF (um, 0 to 0.6) for SOPC and TOPC. (a) CASE I: 53% improvement. (b) CASE II: 90% improvement.]

■ TOPC achieves up to 90% reduction in edge placement errors.
■ The improvement in process window comes at the cost of some increase in data volume and OPC runtime.

Test Case   Original GDS (MB)   SOPC GDS (MB)   SOPC Runtime (min)   TOPC GDS (MB)   TOPC Runtime (min)
CASE I      2.3                 3.8             35                   4.2             43
CASE II     2.3                 3.8             35                   4.4             45


Conclusions: Futures for CMP/Fill in DFM

■ Goal: Design convergence
• Integrate design intent and physical models
• CMP simulation + fill pattern synthesis + RCX + timing/SI driven
■ Performance awareness
• Maintain timing and SI closure
• “Multi-use” fill: IR drop management, decap creation
• Device layer: STI CMP modeling / fill synthesis, etch dummy
■ Topography awareness
• Close the loop back to RCX, fill pattern synthesis, OPC guidance
■ Intelligent fill pattern synthesis
• Minimum variation and smoothness in addition to density bounds
• Handle MANY constraints at once: multi-window, multi-layer, etc.
• Optional mixing of grounded and floating fill
• Mask data volume control (e.g., shot-size aware, compressible fill)


Outline


� Detailed Placement for Process Window

Enhancement

� CMP Fill at 65nm and Below

� Auxiliary Pattern Methodology for Cell-Based OPC

� Crosstalk Awareness in SSTA

� Other



Motivation

� OPC modifies the mask so that the printed photoresist edge matches the layout edge

� It takes a long time

• 12 days for OPC + MDP

• 30 days for a hot lot to go through entire process

� It is expensive: many licenses, many CPUs

� Auxiliary pattern (AP) technique

• Minimizes CD difference between cell-based OPC (COPC) and

design-based OPC (DOPC)

• Enables cell-based timing modeling

• Helps OPC runtime and cell re-spins for ECO


The Ideal of Cell-Based OPC

[Figure: original standard-cell GDSII (cells such as AND2X1, NAND2X2, NOR2X4, XOR2X8) goes through OPC and SBAR insertion to give OPCed standard-cell GDSII; P&R with the OPCed cells then yields the OPCed IC design.]

� Cell-based OPC is a way to save OPC runtime

• Master cell layouts are corrected before placement

• P&R steps are performed with corrected master cells

• OPCed IC design can be completed almost instantly after P&R → OPC runtime is negligible (1~2 hours)


Why Cell-Based OPC Doesn’t Work

[Figure: a cell without a neighboring cell vs. the same cell with a neighboring cell, where OPC has to be re-run]

� “Optical radius” of pattern interactions is between 4λ and 6λ (λ = 193nm)

� OPC must be re-corrected in the interaction areas between

cells of a standard-cell block
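To put the optical radius in layout units, the 4λ to 6λ range can be converted to nanometers. This is a minimal sketch of that arithmetic; the function and constant names are illustrative, not from the tutorial.

```python
WAVELENGTH_NM = 193.0  # exposure wavelength stated on the slide


def optical_radius_nm(wavelength_multiple, wavelength_nm=WAVELENGTH_NM):
    """Interaction distance for a given multiple of the wavelength."""
    return wavelength_multiple * wavelength_nm


radius_min_nm = optical_radius_nm(4)
radius_max_nm = optical_radius_nm(6)
print(f"optical radius: {radius_min_nm:.0f}nm to {radius_max_nm:.0f}nm")
```

A radius of roughly 0.8um to 1.2um is large compared with the poly pitches quoted later in the test structures (300nm), which is why features near a cell boundary see the neighboring cell.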

Cell-Based Timing Modeling Also Fails

[Figure: standard-cell GDSII (cells such as AND2X1, NAND2X2, NOR2X4, XOR2X8) goes through OPC, SBAR insertion and PrintImage; the SPICE netlist of each cell is changed based on the PrintImage result to build a cell-based timing library.]

� OPC, SBAR and PrintImage are applied to each master cell

� SPICE netlist of cell is changed based on PrintImage result, and cell timing

model is then characterized

� Problem: Model is inaccurate due to CD errors of gates located near

boundaries of cell instances



Auxiliary Pattern (AP) Methodology

� Observation: nearest neighbor of a given feature is dominant influence on

proximity CD error

� Idea: Insert “auxiliary patterns” (APs) to shield poly patterns near cell

boundary from proximity effects

� AP minimizes CD difference between cell-based OPC and conventional model-based OPC

[Figure: cell outline with auxiliary patterns, annotated with the dimensions L = R, A and B]

�Example: “Vertical Type-1 AP”: L=R=50nm

�Horizontal AP: 40nm (in 90nm processes)

�Restricted design rule approach needed to maintain

required minimum values of A and B

• A = Space between border poly and vertical AP

• B = Space between border active-layer and vertical AP


Proximity Shielding – Line Body

[Figure: difference error (nm) vs. space S (nm, 100 to 660) in the measurement window, for DOPC-COPC without AP and DOPC-COPC with AP]

� Test pattern structure

• Width = 100nm, Pitch = 300nm, AP-to-outline space = 90nm

� Maximum CD difference between cell-based OPC and standard full-design OPC:

• 2.98nm without AP and 0.98nm with AP


Proximity Shielding – Line End

[Figure: difference (nm) vs. space S (nm, 100 to 600): line-end error without OPC, DOPC-COPC without AP, and DOPC-COPC with AP]

� Test pattern structure

• Minimum space between line-ends to insert AP = 320nm

� Maximum CD difference between COPC and DOPC

• 10.1nm without AP and 2.7nm with AP

Proximity Shielding – Contact

Pitch = 300nm, Width = 100nm, Contact overlap = 200nm x 200nm

� Test pattern structure

• Space between poly and cell outline = 50nm

• Min. space between polys to insert an AP = 380nm

� Maximum CD difference between COPC and DOPC

• 4.37nm without AP and 1.2nm with AP
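The three proximity-shielding experiments (line body, line end, contact) can be compared on a common scale by converting each pair of maximum CD differences into a relative reduction. The sketch below only restates the slide numbers; the names are illustrative.

```python
# Maximum CD difference between COPC and DOPC in nm: (without AP, with AP).
max_cd_difference_nm = {
    "line body": (2.98, 0.98),
    "line end": (10.1, 2.7),
    "contact": (4.37, 1.2),
}


def reduction_percent(without_ap, with_ap):
    """Share of the COPC-vs-DOPC CD difference removed by inserting APs."""
    return 100.0 * (without_ap - with_ap) / without_ap


for structure, (without_ap, with_ap) in max_cd_difference_nm.items():
    print(f"{structure}: {reduction_percent(without_ap, with_ap):.1f}% smaller with AP")
```

In all three cases the AP removes roughly two thirds to three quarters of the mismatch between cell-based and design-based OPC.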

[Figure: difference error (nm) vs. space S (nm, 100 to 660), for DOPC-COPC without AP and DOPC-COPC with AP]


AP Flow Includes Placement Optimization

[Figure: AP flow. Library side: standard cell GDSII, AP generation, SRAF insertion, OPC, OPCed standard cell GDSII. Design side: placement, post-placement optimization, AP-correct placement, route, OPC GDSII.]

� Idea: Use whitespace in the standard-cell block to maximize number of AP-augmented cell instances, and benefit from cell-based OPC

� Recall: *CORR technique in first part of this talk!

Result of AP Insertion

After Post-Placement Opt (PO)

Values plotted on the slide for testcases “ALU” and “AES”, without (W/O PO) and with (W PO) post-placement optimization:

UTIL %   “ALU” W/O PO   “ALU” W PO   “AES” W/O PO   “AES” W PO
90%      7683           4906         9573           5054
80%      3802           207          7292           1250
70%      1300           0            3023           0
60%      1204           0            2113           0
50%      702            0            1315           0

� Placement Opt tries to put one placement site between cells

� Post-placement optimization can lead to 100% cell-based OPC with utilizations of < 70%

� 80+% of model-based OPC work is eliminated at 80% utilization

� Cell-Based OPC Becomes Practical !!!
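The "80+%" claim for 80% utilization can be cross-checked against the numbers on the AP insertion results slide. The sketch below assumes that the plotted values count the work still left for model-based OPC, and that the 80% utilization bars are 3802 (without PO) and 207 (with PO) for "ALU" and 7292 and 1250 for "AES"; this pairing is read from the flattened slide text and is an assumption, as are the names used.

```python
# Assumed pairing of slide values at 80% utilization: (without PO, with PO).
remaining_at_80_util = {
    "ALU": (3802, 207),
    "AES": (7292, 1250),
}


def eliminated_percent(without_po, with_po):
    """Share of the remaining model-based OPC work removed by PO."""
    return 100.0 * (without_po - with_po) / without_po


for testcase, (without_po, with_po) in remaining_at_80_util.items():
    print(f"{testcase}: {eliminated_percent(without_po, with_po):.1f}% eliminated")
```

Under that reading both testcases clear the 80% mark, which is consistent with the bullet on the slide.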


Outline

� Detailed Placement for Process Window

Enhancement

� CMP Fill at 65nm and Below

� Auxiliary Pattern Methodology for Cell-Based OPC

� Crosstalk Awareness in SSTA

� Other

Variability


� Increased variability in nanometer VLSI designs

• Process:

• OPC → Lgate

• CMP → thickness

• Doping → Vth

• Environment:

• Supply voltage → transistor performance

• Temperature → carrier mobility µ and Vth

� These (PVT) variations result in circuit performance variation

[Figure: PVT parameter distributions (p1, p2) map to gate/net delay distributions (d1, d2)]


Timing Analysis

� Min/Max-based

• Inter-die variation

• Pessimistic

� Corner-based

• Intra-die variation

• Computationally expensive

� Statistical

• pdf for delays

• Reports timing yield


Block-Based vs. Path-Based


� Represent signal arrival times as random variables

• Block-based

• Each timing node has an arrival time distribution

• Static worst case analysis

• Efficient for circuit optimization

• Path-based

• Each timing node for each path has an arrival time distribution

• Corner-based or Monte Carlo analysis

• Accurate for signoff analysis
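The block-based idea, one arrival time distribution per timing node combined with add and max operations, can be illustrated with a sampled stand-in. This is a minimal sketch under stated assumptions: the two input arrival distributions, the gate delay distribution, the clock period and all names are hypothetical, and real block-based SSTA tools typically propagate parameterized distributions rather than Monte Carlo samples.

```python
import random
import statistics


def simulate_output_arrival(num_samples=20000, seed=0):
    """Sample the output arrival time of one two-input gate (times in ps)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        arrival_a = rng.gauss(100.0, 5.0)   # hypothetical input A arrival
        arrival_b = rng.gauss(102.0, 8.0)   # hypothetical input B arrival
        gate_delay = rng.gauss(20.0, 2.0)   # hypothetical gate delay
        # Node arrival time = statistical max of the inputs plus the delay.
        samples.append(max(arrival_a, arrival_b) + gate_delay)
    return samples


def timing_yield(samples, clock_period_ps):
    """Fraction of samples that meet the given clock period."""
    return sum(1 for s in samples if s <= clock_period_ps) / len(samples)


samples = simulate_output_arrival()
mean_ps = statistics.mean(samples)
sigma_ps = statistics.stdev(samples)
yield_at_140 = timing_yield(samples, 140.0)
print(f"mean = {mean_ps:.1f}ps, sigma = {sigma_ps:.1f}ps, yield = {yield_at_140:.3f}")
```

The mean of the max is larger than either input mean plus the gate delay, which is the effect a corner-free statistical analysis has to capture at every node.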

[Figure: gate delay pdfs and arrival time pdfs propagated through combinational logic (nodes A, B, C, D) with min and max operations]


Corner vs. Statistical Timing

Key Challenge

[Figure: delay at the 130nm, 90nm and 65nm nodes, typical case vs. worst-case sign-off]

No improvement with worst-case sign-off

Over-design → difficult timing closure

How to reduce design margin?


Solutions

� Reduce process window

• Fast yet accurate OPC simulator

• Accurate RC extraction

• Timing calculation with “real” RC

• Reduce systematic variation

� SSTA

• Accurate manufacturing process model (foundry)

• SSTA can handle non-Gaussian distribution (EDA)

• SI-aware SSTA (EDA)

Example of Emerging Flow

[Figure: old flow (physical design, RC extraction, STA) vs. new flow (physical design, litho simulation, CMP simulation, foundry model, statistical RCx, SSTA)]


Current SSTA Tools

� Main players in SSTA: IBM, Extreme-DA, Magma, Synopsys

• Common key features

• Ability to handle Global, Spatial and Independently Random

variations statistically

• Handling of uncorrelated, fully correlated or partially correlated

variation parameters, with multiple types of distributions

• Sensitivity analysis - Analyze delay/slew sensitivity to

particular process parameters enabling improved robustness

• Handling correlation in reconvergent paths

• Statistical tool kit: min/max/add/sub operations

• Common drawbacks

• Signal integrity blind

• Dynamic variation missing

• Cannot handle non-Gaussian distributions

Current SSTA Tools


• IBM

• Based on EinsTimer

• Emphasis on speed of analysis and optimization/repair

• Multi-mode/multi-corner analysis in a single runtime

• EinsVAT can analyze mixed corners

• Synopsys

• Later to market

• Emphasis on accuracy

• Will support statistical RC extraction

• Extreme-DA

• Startup

• Statistical RC extraction

• Handles spatial correlations

• Sensitivity analysis

• Block-based SSTA

• Variational delay calculation



SSTA Correlations

� Delays and signal arrival times are random variables

� Correlations come from

• Spatial

• inter-chip, intra-chip, random variations

• Re-convergent fanout

• Multiple-input switching

• Cross-coupling

• ……

Multiple-Input Switching

[Figure: gates g1, g2 and g3 with pairwise correlations corr(g1, g2), corr(g1, g3) and corr(g2, g3)]

� Simultaneous signal switching at multiple inputs of a gate leads to up to 20% (26%) mismatch in gate delay mean (standard deviation) [Agarwal-Dartu-Blaauw-DAC’04]

[Figure: probability vs. gate delay]


Crosstalk Aggressor Alignment

� We consider an equally significant source of uncertainty in SSTA: gate delay variation induced by crosstalk aggressor alignment (CAA)

[Figure: multiple-input switching (MIS) and crosstalk aggressor alignment (CAA)]

Problem Formulation (SSTA-SI)

�Given

• a system of coupled interconnects with their

driver gates

• statistical signal arrival time variation at the inputs

of the driver gates, and

• statistical process parameter variations for the

interconnects and their driver gates

�Find

• statistical signal arrival time variations at the

outputs of the system



Methodology

� Process variation extraction

� Performance characterization

� Probabilistic symbolic analysis

� PDF propagation


Process Variation Extraction

� A signal arrival time is a function of multiple process

parameter variabilities

• global (inter-die)

• location dependent (intra-die)

• purely random

� Polynomial approximation

� Principal Component Analysis (PCA) gives a smaller set of uncorrelated r.v.’s

$$x = f(r_1, r_2, \ldots)$$

$$\Pr(r_i) = \frac{1}{\sqrt{2\pi}\,\sigma_{r_i}} \exp\!\left(-\frac{(r_i - \mu_{r_i})^2}{2\sigma_{r_i}^2}\right)$$
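How PCA turns correlated process parameters into uncorrelated random variables can be shown for the smallest case, two parameters, where the principal axes follow from a single rotation angle. This is a sketch under assumptions: the two synthetic parameters (a shared global component plus independent noise) and all names are made up for illustration, and a production flow would use a general eigen-decomposition over many parameters.

```python
import math
import random
import statistics


def covariance(pairs):
    """Sample covariance of a list of (u, v) pairs."""
    n = len(pairs)
    mean_u = sum(u for u, _ in pairs) / n
    mean_v = sum(v for _, v in pairs) / n
    return sum((u - mean_u) * (v - mean_v) for u, v in pairs) / (n - 1)


def pca_2d(samples):
    """Rotate centered 2-D samples onto their principal axes."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    centered = [(x - mean_x, y - mean_y) for x, y in samples]
    var_x = sum(x * x for x, _ in centered) / (n - 1)
    var_y = sum(y * y for _, y in centered) / (n - 1)
    cov_xy = sum(x * y for x, y in centered) / (n - 1)
    # Angle that diagonalizes the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    c, s = math.cos(theta), math.sin(theta)
    return [(x * c + y * s, -x * s + y * c) for x, y in centered]


rng = random.Random(1)
raw = []
for _ in range(2000):
    shared = rng.gauss(0.0, 1.0)  # hypothetical global (inter-die) component
    raw.append((shared + 0.3 * rng.gauss(0.0, 1.0),
                0.8 * shared + 0.3 * rng.gauss(0.0, 1.0)))

components = pca_2d(raw)
var_first = statistics.variance([u for u, _ in components])
var_second = statistics.variance([v for _, v in components])
print(f"cov before = {covariance(raw):.3f}, after = {covariance(components):.1e}")
print(f"first component carries {var_first / (var_first + var_second):.1%} of the variance")
```

Because most of the variance ends up in the first component, the second one can often be dropped, which is the sense in which PCA gives a smaller set of variables.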


Performance Characterization

� Delay calculation for sampled crosstalk alignments

� Least mean square regression for piecewise quadratic polynomial

approximation
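The regression step can be sketched as an ordinary least-squares fit of one quadratic segment, followed by evaluation of a five-region piecewise model with flat delays outside and between the two quadratic regions. Everything below is illustrative: the function names, the breakpoints and the sample data are assumptions, and the fit solves the 3x3 normal equations directly rather than reproducing the authors' implementation.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y ~ c0 + c1*x + c2*x**2 via the normal equations."""
    power_sums = [sum(x ** k for x in xs) for k in range(5)]
    moments = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system, solved by Gauss-Jordan with partial pivoting.
    m = [[power_sums[i + j] for j in range(3)] + [moments[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def piecewise_delay(x, breakpoints, flat, rising, falling):
    """Delay vs. alignment x: flat, quadratic, flat, quadratic, flat."""
    t0, t1, t2, t3 = breakpoints
    d0, d1, d2 = flat
    if x < t0:
        return d2
    if x < t1:
        return rising[0] + rising[1] * x + rising[2] * x * x
    if x < t2:
        return d0
    if x < t3:
        return falling[0] + falling[1] * x + falling[2] * x * x
    return d1


# Hypothetical delay samples that lie exactly on 5 + 2x + 0.5x^2.
alignments = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
delays = [5.0 + 2.0 * x + 0.5 * x * x for x in alignments]
coefficients = fit_quadratic(alignments, delays)
print([round(c, 6) for c in coefficients])
```

Fitting each quadratic region separately keeps every regression a fixed-size 3x3 solve, so the cost grows only with the number of sampled alignments.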

$$\tau = \begin{cases} d_2 & x' < t_0 \\ a_0 + a_1 x' + a_2 x'^2 & t_0 < x' < t_1 \\ d_0 & t_1 < x' < t_2 \\ b_0 + b_1 x' + b_2 x'^2 & t_2 < x' < t_3 \\ d_1 & t_3 < x' \end{cases}$$

PDF Propagation

� Given

• Joint probabilistic density function of k

random variables x

• A piecewise polynomial function y = f(x)

� Find

• Probabilistic density function of y



PDF Propagation

� Integration of conditional probabilities in the variable

space

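$$\Pr(y = \tau) = \sum_{R_i} \int_{\vec{x} \in R_i} \Pr(\vec{x} \mid y = \tau)\, d\vec{x}$$

$$= \sum_{R_i} \int_{\vec{x} \in R_i} \Pr(x_1)\Pr(x_2)\cdots\Pr\!\big(x_k = f_{R_i}^{-1}(x_1, x_2, \ldots, x_{k-1}, y = \tau)\big)\, dx_1\, dx_2 \cdots dx_{k-1}$$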

� Analytical inverse function is available for order-d

polynomial (d
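The role of the regions and the inverse function is easiest to see with a single variable, where no integral remains and the pdf of y is a sum over the regions on which f is invertible. The sketch below takes y = x^2 with x standard normal as a stand-in example (this function is an assumption for illustration, not a delay model from the slides); it has two regions, x < 0 and x >= 0, each contributing one inverse branch weighted by the Jacobian |dx/dy|.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian probability density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def pdf_of_square(tau, mu=0.0, sigma=1.0):
    """pdf of y = x**2 at y = tau for x ~ N(mu, sigma)."""
    if tau <= 0.0:
        return 0.0
    root = math.sqrt(tau)            # inverse branch on the region x >= 0
    jacobian = 1.0 / (2.0 * root)    # |dx/dy| for both branches
    # Sum over the two regions: x = +root and x = -root both map to tau.
    return (normal_pdf(root, mu, sigma) + normal_pdf(-root, mu, sigma)) * jacobian


print(f"pdf of y at 1.0: {pdf_of_square(1.0):.5f}")
```

For a standard normal input the result must equal the chi-square density with one degree of freedom, which gives an independent check on the formula.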


Implementation

� STA-SI goes through an iteration of timing

window refinement for reduced pessimism of

worst case analysis

� SSTA-SI goes through an iteration of

signal arrival time pdf refinement with

reduced deviations

Runtime Analysis


� Performance characterization for N

sampled crosstalk alignments takes O(N)

time, where N = min(t3-t0, 6 σ of crosstalk

alignment) / time_step

� Regression takes O(N) time

� Computing output signal arrival time

distribution takes constant time, e.g.,

updating in an iterative SSTA
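The sample count N in the first bullet follows directly from the stated formula. In this sketch the function name, the example values and the rounding up to a whole number of samples are assumptions; the slide only gives N = min(t3 - t0, 6σ) / time_step.

```python
import math


def num_alignment_samples(t0_ps, t3_ps, sigma_alignment_ps, time_step_ps):
    """N = min(t3 - t0, 6*sigma) / time_step, rounded up (rounding is assumed)."""
    span_ps = min(t3_ps - t0_ps, 6.0 * sigma_alignment_ps)
    return math.ceil(span_ps / time_step_ps)


# Hypothetical numbers: 100ps window, 10ps alignment sigma, 2ps time step.
n_samples = num_alignment_samples(-50.0, 50.0, 10.0, 2.0)
print(n_samples)
```

Here the 6σ spread (60ps) is narrower than the t3 - t0 window (100ps), so the alignment distribution, not the window, sets the number of delay calculations.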



Experimental Setting

� Coupled global interconnects and 16X inverter drivers in

70nm Berkeley Predictive Technology Model

� Extracted coupled interconnects of 451 resistors and 1637

ground and coupling capacitors and 16X inverter drivers

in 130nm industry designs

Interconnect dimensions, 70nm technology:

               L (um)   W (um)   S (um)   T (um)
global         1000     0.45     0.45     1.20
intermediate   200      0.14     0.14     0.35
local          30       0.10     0.10     0.20

Interconnect Delay Distribution

[Figure: probability vs. interconnect delay (ps, 0 to 40); SPICE and model curves for Tr = 10ps, 20ps, 50ps and 100ps]

For a pair of 1000um coupled global interconnects in 70nm BPTM technology, with 10, 20, 50 and 100ps input signal transition time, and crosstalk alignment in a normal distribution N(0, 10ps)


Driver Gate Delay Distribution

For a pair of 1000um coupled global interconnects in 70nm BPTM

technology, with 10, 20, 50 and 100ps input signal transition time, and

crosstalk alignment distribution N(0, 10ps)


Interconnect Output Signal Arrival Time Distribution

Test case 1: 1000µm interconnects of 70nm BPTM technology

                 Delay, SPICE     Delay, Model     Output, SPICE    Output, Model    % diff
3σ    Tr (ps)    µ       σ        µ       σ        µ       σ        µ       σ        µ       σ
50    50         3.83    0.85     3.82    0.83     29.4    16.2     29.7    16.6     0.78    2.46
100   100        3.82    0.92     3.84    0.83     54.8    32.8     55.6    33.9     1.52    3.38
200   200        3.78    0.96     3.78    0.82     105.2   65.9     106.3   67.0     1.06    1.65

Test case 2: interconnects in a 130nm industry design

                 Delay, SPICE     Delay, Model     Output, SPICE    Output, Model    % diff
3σ    Tr (ps)    µ       σ        µ       σ        µ       σ        µ       σ        µ       σ
50    100        4.29    0.16     4.30    0.15     54.5    16.4     53.4    16.1     -2.09   -0.05
100   100        4.30    0.18     4.30    0.17     54.8    32.9     54.1    33.0     -0.17   0.18
200   200        4.25    0.18     4.25    0.16     105.2   65.9     104.9   66.1     -0.28   0.35


Driver Gate Output Signal Arrival Time Distribution

Test case 1: 1000µm interconnects of 70nm BPTM technology

                 Delay, SPICE     Delay, Model     Output, SPICE    Output, Model    % diff
3σ    Tr (ps)    µ       σ        µ       σ        µ       σ        µ       σ        µ       σ
50    50         52.8    8.86     52.74   8.42     78.6    12.19    77.3    12.66    -1.65   3.86
100   100        61.4    16.0     61.85   15.9     112.9   24.9     113.7   24.43    0.71    -2.16
200   200        74.3    23.2     74.13   23.1     177.4   52.9     173.8   53.83    -2.03   1.72

Test case 2: coupled interconnects in a 130nm industry design

                 Delay, SPICE     Delay, Model     Output, SPICE    Output, Model    % diff
3σ    Tr (ps)    µ       σ        µ       σ        µ       σ        µ       σ        µ       σ
50    50         169.7   0.81     168.84  0.8      195.4   16.4     193.6   16.2     -0.92   -1.03
100   200        198.0   1.5      197.68  1.5      299.5   32.96    291.8   33.6     -2.57   1.61
200   200        198.8   2.52     198.73  2.5      301.9   66.49    297.8   65.8     -1.36   -0.93

Outline

� Detailed Placement for Process Window Enhancement

� CMP Fill at 65nm and Below

� Auxiliary Pattern Methodology for Cell-Based OPC

� Crosstalk Awareness in SSTA

� Other


Parametric Yield Optimization – Blaze MO

[Figure: design side (RTL, SP&R, PV) hands off GDSII, together with models, libraries and rules, to Blaze and RET, followed by manufacturing (Mask, FEOL, BEOL, Test)]

� Design driven

• Turn design requirements into manufacturing directives

• Intercept at the hand-off → the first manufacturing step is software

� Parametric focus

• Improve leakage, timing, variability, and yield

� No major changes

• “Same” data, design flow, golden signoff, manufacturing handoff

Design-Specific Manufacturing

[Figure: layout annotated with gate-length targets. Aggressive leakage reduction objective: desired length 94-96nm. Gate on hold-critical path: desired length 90-92nm. Gate on setup-critical path: desired length 88-90nm.]

� Blaze MO: Silicon QOR impact (even “post-tapeout”)

• Small increase in gate length → large reduction in leakage power and variability

• Benefit to customer: Reduce leakage power by 20%, leakage variability by 30%

• Benefit to manufacturing: Same process offers targeted value to customer: power, speed, variability, parametric yield

• Blaze MO design kits available from major foundries at 90nm, 65nm

*Patent Pending


Blaze MO Results – ARM926 Block

Leakage cut by 25%; Variability cut in half


Yield Boost at Sort: (A,B) Silicon Results

[Figure: normalized yield vs. normalized IDDQ at 1.35V (0.0 to 2.0) and normalized yield vs. normalized FMAX (8 to 12), POR vs. BLAZE; values annotated on the slide: 0.83, 0.53, 8.68]

Blaze MO optimization consistently gives lower IDDQ and higher total yield over the entire FMAX-IDDQ range of interest.


Increased Value Likely in 45nm Node

� Parametric yield and variability improvements from CD biasing

are likely to remain significant at 45nm node

� At 45nm, multi-Vt knob for leakage reduction may disappear

• Reduced supply voltages do not leave enough headroom for HVT device

• → Gate length biasing is the main leakage reduction technique available at device level

� For foundry processes, 5nm of CD bias likely to be permitted

� Example 45nm low-power strategy scenario

• Two distinct types of library cell layouts, e.g., with 40nm and 60nm gates

• CD biasing range of 40-45nm (positive biasing only) for 40nm gates

• CD biasing range of 55-65nm (both negative and positive biasing) for 60nm

gates

• With this range of available biasing options, gate-length biasing will likely

continue to offer significant potential for leakage and variability reduction


Other Topics of Interest?

� Restricted layout methodologies?

� … (your questions here)
