29.04.2015 Views

E0286 – VLSI Test VLSI Test


E0286 – VLSI Test

Background material on Test:

Test requirements. Test handoffs. Testers.

Where DUT and DFT fit into the design / manufacturing framework.

Basic philosophy: Test, ATPG, DFT, BIST, COF, TTR.

Test cost metrics and test economics.


Design and Test Cost Projections

Test cost projections have possibly been clamped by better test methods, higher multi-site factors, on-die concurrency, etc.

But test cost as a percentage of total cost is still increasing.


Cost of Test

[Figure: test cost and design cost, each as a percentage of total cost, plotted against year (1950 to 2010). Percentage marks: 100%, 50%, 30%, 50%. Annotations: Uncertainty, Infrastructure, Fault models, Volume, Logic.]

The recurring cost of test for 5 (50) seconds of test time is $0.05 ($0.50) to $0.50 ($5.00), at $0.01 to $0.10 per second of tester use.
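The arithmetic behind these figures can be checked with a short Python sketch. The function name `recurring_test_cost` is hypothetical; the only inputs are the test times and per-second tester rates quoted above.

```python
def recurring_test_cost(test_seconds, cost_per_second):
    """Recurring tester cost (in dollars) of testing one device."""
    return test_seconds * cost_per_second


# Test times (5 s, 50 s) and tester rates ($0.01/s, $0.10/s) quoted on the slide.
for seconds in (5, 50):
    low = recurring_test_cost(seconds, 0.01)
    high = recurring_test_cost(seconds, 0.10)
    print(f"{seconds} s of test: ${low:.2f} to ${high:.2f}")
```

Infrastructure costs (boards, probe cards, program development) are not included in this per-second figure.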


Test Through the Years

[Figure: evolution of test careabouts by year, 1950 to 2010. Stages: curiosity / indifference; research labs / post-design test; university research; design for testability (DFT); design for manufacturability. Questions asked along the way: "Which parts are bad?", "How many bad parts have escaped?", "How good are the good parts?"]


Technology and Failure Modes

[Figure: test complexity versus technology scaling / shrinking. Test types: hookup tests; manufacturing tests / test automation; tests for better DPPM screening; periodic testing. Fault types: board / external faults; stuck-at / Iddq faults; layout / coupling faults; parametric / delay faults; transient faults.]


Terminology

- Yield: The fraction of good chips that the process is entitled to deliver (expressed in %).

- Coverage: The number of faults detected out of the total number of faults that exist (expressed in %).

- DPPM: Defective Parts (test escapes) Per Million devices certified as good. Computed for time zero.

- FIT rate: Failures In Time, measured as the number of failures in 10^9 hours of operation.

- Reliability: Quantitatively, it can be described using the actual FIT rate.
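To make the FIT definition concrete, here is a minimal Python sketch. It assumes a constant failure rate over the period of interest, which is a simplification; the fleet size, FIT value and operating hours are made-up illustration values, and the function names are hypothetical.

```python
FIT_HOURS = 1e9  # a FIT rate counts failures per 10^9 device-hours


def expected_failures(fit_rate, devices, hours):
    """Expected failures for `devices` parts, each operating for `hours`."""
    return fit_rate * devices * hours / FIT_HOURS


def mttf_hours(fit_rate):
    """Mean time to failure implied by a constant FIT rate."""
    return FIT_HOURS / fit_rate


# Hypothetical fleet: 10,000 devices at 100 FIT, each running one year (8760 h).
print(expected_failures(100, 10_000, 8760))
print(mttf_hours(100))
```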


Terminology (2)

- Defect: The actual cause of a fault or failure, e.g. leaky transistor, gate oxide short, etc.

- Fault / Failure: The effect of a defect, e.g. high Iddq, gate output stuck-at 0, etc.

- Error: The manifestation of the fault at an observable output.

- Outlier: A device whose parametric measurements do not conform to an acceptable deviation around the mean.


Few Decades of Test

University:

- 1950s - Gedanken experiments.

- 1960s - D-algebra.

Design Groups:

- 1970s - LSSD and scan design.

- 1980s - DFT. Early automation.

CAD Groups:

- 1990s - Automation for DFT insertion and pattern generation.

- 2000s - DFM. DFY. Cost / Quality tradeoffs.

- 2010s - ... .


Interesting Test Data

Reliability / DPPM control:

- 0% fault coverage -> 100% yield.

- > 99% stuck-at fault coverage at 75% yield for DPPM < 200.

- But likely DPPM > 200. Confidence in yield low.

- DPPM requirements for several products are in the 100s, 10s and 1s range.

- 0.5% yield -> DPPM of 5000.

Test cost:

- Cost of test vs design. Crossover in a few years.

- Cost of logic inside device vs outside.

- Rule of 10s: $1 in device -> $1000+ in the field.

- Cost impact: Few seconds to tens of seconds.

- Cost per second: 2 to 10 cents. Infrastructure extra.


Interesting Test Data (2)

Design effort towards DFT:

- From 10% to 40%.

- Varies with the nature of IP cores and SOC, extent of re-use, etc.

Time to production:

- Design time: Months.

- Test screening / Ramp to production: Also months.

- Fail / pass iterations: Costly. They result in longer manufacturing cycles and increased time to ramp to volume.

SOC designs and DSM (deep sub-micron) effects together aggravate problems in each of the above.


Components of Test Cost

Design costs - primary and derived:

- Area, test generation time, etc.

- Cost of attaining coverage, performance, etc.

Test infrastructure costs:

- Test automation tools.

- Test program creation.

- Test volume. Test application time.

- Probe cards, boards and accessories.

- Tester time.

Test technology costs:

- Capabilities for test screening and debug.

- Impact on design and infrastructure.


Cost of Testing

CPUD = (CTGD + CTBD) / (TNOD * Y)

CPUD = Cost per unit die.

CTGD = Cost of testing good dies.

CTBD = Cost of testing bad dies. (May be = CTGD in multi-site context).

TNOD = Total number of dies.

Y = Yield.
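A minimal Python sketch of the CPUD formula above. The function name and the example dollar amounts are hypothetical, chosen only to show how the terms combine; yield is passed as a fraction rather than a percentage.

```python
def cost_per_unit_die(ctgd, ctbd, tnod, y):
    """CPUD = (CTGD + CTBD) / (TNOD * Y), with Y given as a fraction in (0, 1]."""
    if not 0 < y <= 1:
        raise ValueError("yield must be a fraction in (0, 1]")
    return (ctgd + ctbd) / (tnod * y)


# Hypothetical lot: $800 spent testing good dies, $200 testing bad dies,
# 10,000 dies in total, 80% yield.
print(cost_per_unit_die(ctgd=800.0, ctbd=200.0, tnod=10_000, y=0.8))
```

Because the denominator is the number of good dies, the test cost of every bad die is carried by the dies that ship.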


Cost of Testing (2)

Cw = Wafer cost.

D = Dies per wafer.

Y = Test yield.

Tg = Test time taken to test a good part.

Tb = Average time it takes for a bad part to fail.

Ctu = Tester time cost per unit time.

Test time per wafer (Tt) = [D * Y * Tg] + [D * (1 - Y) * Tb]

Test cost per wafer (Ct) = Ctu * Tt

Test cost per good die (Ctg) = Ct / (D * Y) = Ctu * {Tg + Tb * [(1 / Y) - 1]}

Fabrication cost per good die (Cwg) = Cw / (D * Y)

Total test cost: Add the costs across the different tests / testers.


Cost Tradeoffs – Example 1

Trade off coverage with effectiveness. Select tests based on their effectiveness.

- Test A, Efficiency = 80%, Coverage = 90%.

- Test B, Efficiency = 90%, Coverage = 95%.

- Test C, Efficiency = 70%, Coverage = 100%.

Selection of A + B is more effective than B + C:

0.9*0.95 + 0.8*0.9 = 1.575 > 0.9*0.95 + 0.7*1.0 = 1.555
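The comparison above can be reproduced for every pair of tests with a small Python sketch. It uses the slide's measure, efficiency times coverage summed over the selected tests; the names `TESTS` and `effectiveness` are hypothetical.

```python
from itertools import combinations

# (efficiency, coverage) for each test, as given on the slide.
TESTS = {"A": (0.80, 0.90), "B": (0.90, 0.95), "C": (0.70, 1.00)}


def effectiveness(names):
    """Sum of efficiency * coverage over the selected tests."""
    return sum(TESTS[name][0] * TESTS[name][1] for name in names)


pairs = list(combinations(sorted(TESTS), 2))
for pair in pairs:
    print("+".join(pair), round(effectiveness(pair), 3))

best_pair = max(pairs, key=effectiveness)
print("most effective pair:", "+".join(best_pair))
```

Test C has the highest coverage, yet the pair that includes it scores lower than A + B because of its lower efficiency.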


Cost Tradeoffs – Example 2

Reduce the time taken for the bad parts to fail. Order the tests based on their efficiency.

- Test A, Yield = 80% (less coverage), Time = 7 ms.

- Test B, Yield = 60% (more coverage), Time = 8 ms.

Test A followed by Test B:

Total test time = 7 ms + 8 x 0.8 ms = 13.4 ms.

Test B followed by Test A:

Total test time = 8 ms + 7 x 0.6 ms = 12.2 ms.
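The ordering effect can be sketched in Python as an expected test time per device under stop-on-first-fail. The sketch assumes, as the slide's arithmetic does, that a device reaches a test only if it passed every earlier one, and that each test's pass fraction does not depend on the order; the function name is hypothetical.

```python
def expected_test_time(sequence):
    """Average test time per device for (time_ms, pass_fraction) tests run in order.

    A device that fails a test is not run through the remaining tests.
    """
    total = 0.0
    reaching = 1.0  # fraction of devices that reach the current test
    for time_ms, pass_fraction in sequence:
        total += reaching * time_ms
        reaching *= pass_fraction
    return total


TEST_A = (7.0, 0.80)  # 7 ms, 80% of devices pass
TEST_B = (8.0, 0.60)  # 8 ms, 60% of devices pass

print("A then B:", expected_test_time([TEST_A, TEST_B]), "ms")
print("B then A:", expected_test_time([TEST_B, TEST_A]), "ms")
```

Running the test that rejects more parts first shortens the average time, even though that test is itself the longer one here.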


Reducing Cost of Testing

- Target tests on cheaper testers:

  - Application costs: $0.01/sec to $0.10/sec and above.

  - Actual costs: $0.2 M to $2 M and above.

- Multi-site testing. Concurrent testing.

- Reduce dependency on tester infrastructure.

- Increase / Reduce the test application speed.

- Improve quality of tests for a given cost. Trade off coverage with quality.

- Test sequence: Bad parts to fail early.

- Incur DFT overhead (parallel scan, faster scan, test points, BIST, test modes, isolation, etc.).


Apportioned Test Cost

[Figure: cost (axis 0 to 700000) versus time (1 to 6). Legend: Test time - 1, Die DFT, Tester PE, Tester HW - 1, Tools DFT, Design DFT - 1.]


Product Life-time Cost

[Figure: cost (axis 0 to 3000000) versus time (1 to 16). Legend: Test time - 1, Die DFT, Tester PE, Tester HW - 1, Tools DFT, Design DFT - 1.]


Diverse Markets

- Designs spec'ed and created for one market are re-used for others. Examples:

  - Catalogue wireless connectivity chips re-used inside cars / planes.

  - DSPs re-used for automotive engine control.

  - Micro-controllers for medical applications.

- Quality is an opportunity cost. Price paid to meet vs price incurred not to.

[Table: the parameters Coverage, Time Zero Quality, Field DPPM, Test Cost, Test Power, Performance and Area compared across the Catalog, Portable, Infrastructure and Automotive markets. Cell values did not survive extraction.]


Apportioned Quality Cost

[Figure: cost (axis 0 to 700000) versus time (1 to 6). Legend: Test time - 2, Tester HW - 2, Design DFT - 2, Test time - 1, Die DFT, Tester PE, Tester HW - 1, Tools DFT, Design DFT - 1.]


Apportioned Quality Cost (2)

[Figure: cost (axis 0 to 3500000) versus time (1 to 16). Legend: Test time - 2, Tester HW - 2, Design DFT - 2, Test time - 1, Die DFT, Tester PE, Tester HW - 1, Tools DFT, Design DFT - 1.]


Different Techniques Beyond Production Test

[Table: techniques marked against three columns, Yield, Reliability and Power. The placement of the marks did not survive extraction.]

Techniques:

- Technology: Cell hardening. Physical design rules.

- Margins: Additional design margins. Margin mode testing.

- DFT partitioning: Scan partitioning. Clock skewing / staggering. Test res. part.

- ATPG: Parametric tests. Defect based tests. Bus BIST. Power aware test.

- Power management: Power grid partitioning / over-design. Power isolation switches. Retention.

- Device configurability: Pre-shipment calibration. Memory repair. Module isolation.

- On-chip test / measurement: Self-test. Self-calibration. Self-repair. Adaptivity.

- Die test: Over-test. Stress test. Under-test. Binning. Adaptive test.

- System test: Field test. Periodic testing.

- Tolerance: Error checking and correction. Redundancy and reconfiguration.


How Coverage Impacts Fall-out

[Figure: reject rate (axis 0 to 4500) versus stuck-at test coverage (55% to 100%). Curves: Will Brown; Seth Ag 1.5; Seth Ag 1.8; Seth Ag 2.0; Seth Ag 2.5; DSP empirical.]

The formulae are a guide. Deviations exist for different reasons.


Importance of DPPM

- Theoretical example:

  - Sample of 10^6 devices.

  - 10000 are faulty.

  - 100 escape the manufacturing test screen.

  - Yield = 99%. DPPM = 100.

  - Coverage = 99% (assuming equal distribution and occurrence of modelled faults).

- Practical case:

  - Yield much lower. More faulty devices.

  - Coverage much lower. More test escapes.

  - Modelled faults do not occur uniformly.

  - Several non-modelled faults also occur.

  - Beta quality of silicon test program.

  - Result: DPPM of 10000s. On ramp, DPPM of 100s.
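The theoretical example can be worked through in a few lines of Python. This is a sketch under the slide's own simplification that every faulty device is equally likely to be caught, so coverage is taken as the fraction of faulty devices rejected; the function name `screen_metrics` is hypothetical.

```python
def screen_metrics(total, faulty, escapes):
    """Yield, coverage and DPPM for a sample screened by a manufacturing test."""
    good = total - faulty
    shipped = good + escapes  # devices certified as good, including escapes
    return {
        "yield": good / total,
        "coverage": (faulty - escapes) / faulty,
        "dppm": escapes / shipped * 1e6,
    }


metrics = screen_metrics(total=1_000_000, faulty=10_000, escapes=100)
for name, value in metrics.items():
    print(name, round(value, 4))
```

Counting escapes against the devices actually shipped gives a DPPM of about 101, which the slide rounds to 100.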


Impact of Non-zero DPPM

Consider a board with ten devices, each with a DPPM of 1000:

- The DPPM of this board is approximately 10000, i.e. 1%.

- One in a hundred boards is bad.

Consider an automobile with 100 devices, each with a DPPM of 100:

- The DPPM of this automobile system (chips alone, other systems apart) is approximately 10000, i.e. 1%.

- One in a hundred automobiles is bad.

Consider an automobile with 100 devices, each with a DPPM of 1:

- The DPPM of this automobile system is approximately 100, i.e. 0.01%. One in ten thousand automobiles is bad.
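The system-level figures above are the sum of the device DPPMs, which is a close approximation when each device DPPM is small. A minimal Python sketch, assuming device defects are independent (an assumption not stated on the slide) and using the hypothetical name `system_dppm`, compares the exact value with that sum:

```python
def system_dppm(num_devices, device_dppm):
    """DPPM of a system that is bad if any one of its devices is bad."""
    p_bad = device_dppm / 1e6
    p_system_good = (1 - p_bad) ** num_devices
    return (1 - p_system_good) * 1e6


# The three cases from the slide: (devices per system, DPPM per device).
for n, dppm in [(10, 1000), (100, 100), (100, 1)]:
    exact = system_dppm(n, dppm)
    print(f"{n} devices at {dppm} DPPM: exact {exact:.1f}, sum {n * dppm}")
```

The exact value is always slightly below the simple sum, so the slide's round numbers are mildly pessimistic.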


DPPM Calculation

The Williams and Brown equation relates the escape rate (DPPM) to the fault coverage for a given yield.

D = 1 - Y^(1-C), where

D = Escape rate (defect level), Y = Yield and C = Fault coverage, each expressed as a fraction between 0 and 1.
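A short Python sketch of the equation D = 1 - Y^(1-C), together with its inversion to find the coverage needed for a target DPPM. Here D, Y and C are all fractions between 0 and 1; the function names are hypothetical, and the 75% yield and 200 DPPM inputs are example values only.

```python
import math


def defect_level(y, c):
    """Williams and Brown: D = 1 - Y**(1 - C)."""
    return 1 - y ** (1 - c)


def required_coverage(y, target_dppm):
    """Coverage C that gives the target DPPM at yield y (needs 0 < y < 1)."""
    d = target_dppm / 1e6
    # From Y**(1 - C) = 1 - D:  C = 1 - ln(1 - D) / ln(Y)
    return 1 - math.log(1 - d) / math.log(y)


y = 0.75
for c in (0.90, 0.99, 0.999):
    print(f"coverage {c:.3f}: DPPM {defect_level(y, c) * 1e6:.0f}")

c_needed = required_coverage(y, 200)
print(f"coverage needed for 200 DPPM at 75% yield: {c_needed:.4%}")
```

With no testing (C = 0) the escape rate equals 1 - Y, and with complete coverage (C = 1) it drops to zero, which matches the two limiting cases of the equation.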


IP Mixture in SOC

[Figure: SOC floorplan with blocks A to H.]

- Core A: No compression. No bounding.

- Core B: No compression. Bounding only.

- Core C: Compression + bounding.

- Core D: Compression only. No bounding.

- E: Glue logic.

- F: DFT logic.

  - Test IPs – Memory BIST, scan CoDecs, test mode controls, E-Fuse, etc.

  - Wrappers – Pin-muxing, analog PMT, 1500 bounding, etc.

- G: SOC level CoDec(s).

- H: SOC bounding.


Test Scheduling Options

[Table: candidate schedules for blocks A to H compared on Test Time and Test Qual. The cell values did not survive extraction.]

Schedules:

- C -> D -> G(A+B+E) -> F

- C -> [Init(A+E) + Init(B) + Init(C) -> D]* -> G(A+B) -> E -> F

- C -> [Init(C) -> Init(E) -> [G(A+B) || D]* -> E -> F

- C || G(A+B+E) || D -> F


Test Data Volume Required to Test DUT

[Figure: pattern count (axis 15000 to 31000) versus TDV per pattern (0 to 1750). Series: Patterns.]

The product of the X-Y co-ordinate values is not constant. DUT test entropy drives CoDec selection and QOR. Test data meeting entropy requirements ensures higher coverage.


Generic Self-test Controller

[Figure: block diagram. Blocks: external host or interface; self-test microcode; internal memory (self-test config.); master CPU; DUT with scan compression + addl. control; DUT BIST; status registers. Paths: Read, Write. Timeline: normal appl. time slots A1 to A4 interleaved with test application slots T1 to T3.]

Possibilities: (i) One-time manufacturing test. (ii) Fixed time field test. (iii) Periodic field test. (iv) Online test concurrent with normal operation.


Scatter Plots – How to Distinguish between Good and Bad

[Figure: scatter plots of normalized Fmax versus process spread.]


Illustration: Variability DPPM<br />

- Variability DPPM: Fails intermittently at


Coverage Improvements Across Fault Models

[Figure: coverage versus pattern count for fault models FM1 and FM2, with hybrid ATPG and an optimized pattern set. Fault models listed: stuck-at, transition, path delay, bridging.]

Methodology:

- TC1(FM1) + TC2(FM2) vs defect coverage.

- Pattern sets: P1 + P2 versus merged set of patterns.


Cost versus Coverage

[Figure: test cost versus defect coverage for Test A and Test B, with a fixed test cost level.]

Move to more effective tests B, e.g. transition, bridging, etc.


Cost versus Quality Tradeoffs

[Figure: quality versus process capability. Markers: required quality, unacceptable quality, benefit of test.]

Test allows use of an inherently low quality process to manufacture devices with high quality levels. Yield loss is made up for by increased competitiveness.


Bath-tub Curve

[Figure: bath-tub curve. Regions: extrinsic / latent defects; intrinsic failures / ageing. Labels: Product Eval., Device Eval.]


Test Components under Multi-site

- Test 1: 200 ms.

- Tests 2 to N: Less than 20 ms.

| MS Factor | X1 | X16 | X16 -> X64 | X16 -> X128 |
| --- | --- | --- | --- | --- |
| Option 1 | 220 ms | 14 ms | NA | NA |
| Option 2 | 220 ms | 1.5 ms | 4.5 ms | NA |
| Option 3 | SKIP | 1.5 ms | SKIP | 3.25 ms |

| Die | IP | Type of Test |
| --- | --- | --- |
| Single | Single | Single site |
| Single | Multiple (same) | Single site |
| Single | Multiple (different) | Single site |
| Multiple (same) | Single | Multi-site |
| Multiple (same) | Multiple (same) | Multi-site |
| Multiple (same) | Multiple (different) | Multi-site |
| Multiple (different) | Single | Non-identical |
| Multiple (different) | Multiple (same) | Non-identical |
| Multiple (different) | Multiple (different) | Non-identical |

Multiple insertions required.

Test content varies in different insertions.

Selection of test content, test concurrency and multi-site factor is important.


Tester Board


Test Power Concerns

[Figure: bar chart of normalized power (axis 0 to 15). Bars: Video Decode, Test (Pre-Opt) *, Test (with Opt). Bar annotations: 5.2X, 1.7X.]

Test power can be several times more than normal mode power.

Peak test power issues (IR drop issues) impact yield.

Affects both shift and capture operations.


Distinguishing Good Parts – An Analog Process

Good / Perfect Part (SPECIFICATION / FEATURE TEST):

• Defect free.

• Identification elusive.

• Costly.

Acceptable Part (FUNCTIONAL TEST):

• Error free.

• Used in speed binning.

• Often enabled through outlier analysis.

• May need Shmoo data.

• Parametric defects targeted.

• Targeted tests: Iddq, path delay, DFT R/W controls, DC parametrics, functional, etc.

Bad Part (DEFECT ORIENTED TEST):

• Static defects targeted.

• Gross errors assumed.

• Targeted tests: Stuck-at, transition, small delay defect, bridging, memory algorithmic tests, etc.

• Successfully created, adopted, optimised, adapted.


Four Quadrant Analysis

[Figure: four-quadrant chart (quadrants I to IV). One axis: Built-in Self Tests (Structural), BAD / GOOD. Other axis: Traditional Tests (Functional, Parametric, …), BAD / GOOD. Quadrant labels: Underkill, Yield Loss, Units to Ship, Overkill.]


Test Effectiveness Using Venn Diagrams


Thank you.
