29.04.2015 Views

E0286 – VLSI Test VLSI Test


<strong>E0286</strong> – <strong>VLSI</strong> <strong>Test</strong><br />

Background material on <strong>Test</strong>:<br />

<strong>Test</strong> requirements. <strong>Test</strong> handoffs. <strong>Test</strong>ers.<br />

Where DUT and DFT fit into design / manufacturing framework.<br />

Basic philosophy: <strong>Test</strong>, ATPG, DFT, BIST, COF, TTR.<br />

<strong>Test</strong> cost metrics and test economics.


Design and <strong>Test</strong> Cost Projections<br />

<strong>Test</strong> cost projections have been possibly<br />

clamped due to better test methods, higher<br />

multi-site, on-die concurrency, etc.<br />

But the percentage cost of test is still<br />

increasing.


Cost of <strong>Test</strong><br />

Figure: <strong>Test</strong> cost and design cost (each as % of total cost) versus year, 1950 to 2010. Annotations: Uncertainty, Infrastructure, Fault models, Volume, Logic.<br />

Recurring cost of test for 5 (50) seconds of test time is $ 0.05 ($ 0.5)<br />

to $ 0.5 ($ 5.0) @ $ 0.01 to $ 0.1 per second of tester use.
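The recurring-cost range quoted above is test time multiplied by the tester rate. A minimal sketch of that arithmetic follows; the function and constant names are illustrative and not part of the original slides.

```python
def recurring_test_cost(test_time_s: float, rate_per_s: float) -> float:
    """Recurring tester cost (in dollars) of testing one device."""
    return test_time_s * rate_per_s


# Rates quoted on the slide: $0.01 to $0.1 per second of tester use.
LOW_RATE, HIGH_RATE = 0.01, 0.10

for seconds in (5, 50):
    low = recurring_test_cost(seconds, LOW_RATE)
    high = recurring_test_cost(seconds, HIGH_RATE)
    print(f"{seconds} s of test: ${low:.2f} to ${high:.2f}")
```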


<strong>Test</strong> Through the Years<br />

Figure: evolution of test careabouts versus year, 1950 to 2010. Stages shown: Curiosity / Indifference; Research labs. / Post design test; University research; Design for testability (DFT); Design for manufacturability. Questions shown: Which parts are bad? How many bad parts have escaped? How good are the good parts?


Technology and Failure Modes<br />

Figure: <strong>Test</strong> complexity versus technology scaling / shrinking. <strong>Test</strong> stages: Hookup tests; Manufacturing tests / <strong>Test</strong> automation; <strong>Test</strong>s for better DPPM screening; Periodic testing. Fault types: Board / External faults; Stuck-at / Iddq faults; Layout / Coupling faults; Parametric / Delay faults; Transient faults.


Terminology<br />

- Yield: The entitlement of good chips from the process (expressed in %).<br />

- Coverage: Number of faults detected out of the total number of faults that exist,<br />

(expressed in %).<br />

- DPPM: Defective Parts (test escapes) Per Million certified good devices.<br />

Computed for time zero.<br />

- FIT rate: Failures In Time, measured in terms of number of failures in 10^9<br />

hours of operation.<br />

- Reliability: Quantitatively, it can be described using the actual FIT rate.
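The FIT definition above turns directly into an expected failure count for a population of devices. The sketch below assumes a constant failure rate over the period considered (an assumption, not something the slides state); the function names and the example numbers are hypothetical.

```python
FIT_WINDOW_HOURS = 1e9  # FIT counts failures per 10^9 device-hours


def expected_failures(fit: float, devices: int, hours: float) -> float:
    """Expected number of failures across a fleet, assuming a constant rate."""
    return fit * devices * hours / FIT_WINDOW_HOURS


def mttf_hours(fit: float) -> float:
    """Mean time to failure of one device implied by a constant FIT rate."""
    return FIT_WINDOW_HOURS / fit


# Hypothetical fleet: one million devices at 100 FIT, running for one year.
print(expected_failures(fit=100, devices=1_000_000, hours=8760))
print(mttf_hours(100))
```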


Terminology (2)<br />

- Defect: This is the actual cause of a fault or failure, e.g. leaky transistor, gate<br />

oxide short, etc.<br />

- Fault / Failure: This is the effect, e.g. high Iddq, gate output stuck-at 0, etc.<br />

- Errors: This is the manifestation of the fault at an observable output.<br />

- Outlier: Devices for which parametric measurements do not conform to an<br />

acceptable deviation around the mean.


Few Decades of <strong>Test</strong><br />

University:<br />

- 1950s - Gedanken experiments.<br />

- 1960s - D-algebra.<br />

Design Groups:<br />

- 1970s - LSSD and scan design.<br />

- 1980s - DFT. Early automation.<br />

CAD Groups:<br />

- 1990s - Automation for DFT insertion and pattern generation.<br />

- 2000s - DFM. DFY. Cost / Quality tradeoffs.<br />

- 2010s - ... .


Interesting <strong>Test</strong> Data<br />

Reliability / DPPM control:<br />

- 0% fault coverage -> 100% yield.<br />

- > 99% stuck-at fault coverage at 75% yield for DPPM < 200.<br />

- But likely DPPM > 200. Confidence in yield low.<br />

- DPPM requirements for several products in the 100s, 10s and 1s range.<br />

- 0.5% yield -> DPPM of 5000.<br />

<strong>Test</strong> cost:<br />

- Cost of test vs design. Crossover in a few years.<br />

- Cost of logic inside device vs outside.<br />

- Rule of 10s: $1 in device -> $1000+ on field.<br />

- Cost impact: Few seconds to tens of seconds.<br />

- Cost per second: 2 to 10 cents. Infrastructure extra.


Interesting <strong>Test</strong> Data (2)<br />

Design effort towards DFT:<br />

- From 10% to 40%.<br />

- Variation depending upon nature of IP cores and SOC, extent of re-use, etc.<br />

Time to production:<br />

- Design time: Months.<br />

- <strong>Test</strong> screening / Ramp to production: Also months.<br />

- Fail Pass iterations: Costly. Result in longer manufacturing cycles and<br />

increased time to ramp to volume.<br />

SOC designs and DSM (deep sub-micron) effects together aggravate problems<br />

in each of the above.


Components of <strong>Test</strong> Cost<br />

Design costs - primary and derived:<br />

- Area, test generation time, etc.<br />

- Cost of attaining coverage, performance, etc.<br />

<strong>Test</strong> infrastructure costs:<br />

- <strong>Test</strong> automation tools<br />

- <strong>Test</strong> program creation.<br />

- <strong>Test</strong> volume. <strong>Test</strong> application time.<br />

- Probe cards, boards and accessories.<br />

- <strong>Test</strong>er time.<br />

<strong>Test</strong> technology costs:<br />

- Capabilities for test screening and debug.<br />

- Impact on design and infrastructure.


Cost of <strong>Test</strong>ing<br />

CPUD = ( CTGD + CTBD ) / (TNOD * Y)<br />

CPUD = Cost per unit die.<br />

CTGD = Cost of testing good dies.<br />

CTBD = Cost of testing bad dies. (May be = CTGD in multi-site context).<br />

TNOD = Total number of dies.<br />

Y = Yield.
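The CPUD expression above can be evaluated directly. This is a minimal sketch with hypothetical input values (none of the numbers come from the slides), using the slide's symbols as parameter names.

```python
def cost_per_unit_die(ctgd: float, ctbd: float, tnod: int, y: float) -> float:
    """CPUD = (CTGD + CTBD) / (TNOD * Y), with Y given as a fraction."""
    if not 0 < y <= 1:
        raise ValueError("yield must be in (0, 1]")
    return (ctgd + ctbd) / (tnod * y)


# Hypothetical lot: $800 spent testing good dies, $200 testing bad dies,
# 10000 dies in total, 80% yield.
print(cost_per_unit_die(ctgd=800.0, ctbd=200.0, tnod=10_000, y=0.8))
```

Because the total test cost is divided by the number of good dies only (TNOD * Y), a lower yield raises the cost carried by each shipped die.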


Cost of <strong>Test</strong>ing (2)<br />

Cw = Wafer cost.<br />

D = Dies per wafer.<br />

Y = <strong>Test</strong> yield.<br />

Tg = <strong>Test</strong> time taken to test a good part.<br />

Tb = Average time it takes for a bad part to fail.<br />

Ctu = <strong>Test</strong>er time cost per unit time.<br />

<strong>Test</strong> time per wafer (Tt) = [D * Y * Tg] + [D * (1-Y) * Tb]<br />

<strong>Test</strong> cost per wafer (Ct) = Ctu * Tt<br />

<strong>Test</strong> cost per good die (Ctg) = Ct / (D * Y) = Ctu * {Tg + Tb * [1/Y - 1]}<br />

Fabrication cost per good die (Cwg) = Cw / (D * Y)<br />

<strong>Test</strong> cost -> Add costs across different tests / testers.
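The per-wafer formulas above can be checked numerically, including the fact that Ct / (D * Y) simplifies to Ctu * (Tg + Tb * (1/Y - 1)). The sketch below uses the slide's symbols as parameter names; the function names and the example values are hypothetical.

```python
import math


def wafer_test_time(d: int, y: float, tg: float, tb: float) -> float:
    """Tt = D*Y*Tg + D*(1-Y)*Tb."""
    return d * y * tg + d * (1 - y) * tb


def cost_per_good_die(ctu: float, d: int, y: float, tg: float, tb: float) -> float:
    """Ctg = Ct / (D*Y), where Ct = Ctu * Tt."""
    ct = ctu * wafer_test_time(d, y, tg, tb)
    return ct / (d * y)


def cost_per_good_die_closed_form(ctu: float, y: float, tg: float, tb: float) -> float:
    """Ctg = Ctu * (Tg + Tb * (1/Y - 1)); D cancels out."""
    return ctu * (tg + tb * (1 / y - 1))


# Hypothetical wafer: 500 dies, 80% yield, 5 s per good part,
# 1 s average for a bad part to fail, $0.05 per tester second.
step_by_step = cost_per_good_die(ctu=0.05, d=500, y=0.8, tg=5.0, tb=1.0)
closed_form = cost_per_good_die_closed_form(ctu=0.05, y=0.8, tg=5.0, tb=1.0)
print(step_by_step, closed_form, math.isclose(step_by_step, closed_form))
```

The closed form shows that the number of dies per wafer does not affect test cost per good die; only yield, the two test times and the tester rate do.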


Cost Tradeoffs – Example 1<br />

Trade off coverage with effectiveness.<br />

Select tests based on their effectiveness.<br />

- <strong>Test</strong> A, Efficiency = 80%, Coverage = 90%.<br />

- <strong>Test</strong> B, Efficiency = 90%, Coverage = 95%.<br />

- <strong>Test</strong> C, Efficiency = 70%, Coverage = 100%.<br />

Selection of A + B is more effective than B + C.<br />

0.9*0.95 + 0.8*0.9 > 0.9*0.95 + 0.7*1.0
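The comparison scores each test as efficiency times coverage and sums the scores of the selected tests. A minimal sketch of that scoring, using the figures for Tests A, B and C; the dictionary layout and function name are illustrative.

```python
# (efficiency, coverage) per test, as fractions.
TESTS = {
    "A": (0.80, 0.90),
    "B": (0.90, 0.95),
    "C": (0.70, 1.00),
}


def effectiveness(selection: str) -> float:
    """Sum of efficiency * coverage over the selected tests."""
    return sum(TESTS[name][0] * TESTS[name][1] for name in selection)


for pair in ("AB", "BC"):
    print(pair, round(effectiveness(pair), 3))
```

Under this score Test C contributes least despite its 100% coverage, which is why A + B ranks above B + C.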


Cost Tradeoffs – Example 2<br />

Reduce the time taken for the bad parts to fail. Order the tests based on their<br />

efficiency.<br />

- <strong>Test</strong> A, Yield = 80% (less coverage), Time = 7ms.<br />

- <strong>Test</strong> B, Yield = 60% (more coverage), Time = 8ms.<br />

<strong>Test</strong> A followed by <strong>Test</strong> B:<br />

Total test time = 7 ms + 8x0.8 ms = 13.4 ms.<br />

<strong>Test</strong> B followed by <strong>Test</strong> A:<br />

Total test time = 8 ms + 7x0.6 ms = 12.2 ms
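The two totals above are average test times per device: every device runs the first test, and only devices that pass it go on to the second. A minimal sketch of that calculation, assuming testing stops at the first failing test and that each test's yield applies independently (the second assumption matters only for sequences longer than two tests); the names are illustrative.

```python
def average_test_time(sequence):
    """Average time per device for tests run in order, stopping on first fail.

    sequence: list of (yield_fraction, time_ms) tuples.
    """
    total = 0.0
    fraction_reaching = 1.0  # share of devices that reach the current test
    for yield_fraction, time_ms in sequence:
        total += fraction_reaching * time_ms
        fraction_reaching *= yield_fraction
    return total


TEST_A = (0.80, 7.0)  # yield 80%, 7 ms
TEST_B = (0.60, 8.0)  # yield 60%, 8 ms

print(round(average_test_time([TEST_A, TEST_B]), 1))  # A followed by B
print(round(average_test_time([TEST_B, TEST_A]), 1))  # B followed by A
```

Running the lower-yield test first removes more bad parts before the second test is applied, which is why B followed by A is shorter even though B itself takes longer.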


Reducing Cost of <strong>Test</strong>ing<br />

- Target tests on cheaper testers:<br />

- Application costs: $0.01/sec to $0.1/sec and above.<br />

- Actual costs: $ 0.2 M to $ 2 M and above.<br />

- Multi-site testing. Concurrent testing.<br />

- Reduce dependency on tester infrastructure.<br />

- Increase / Reduce the test application speed.<br />

- Improve quality of tests for a given cost. Trade off coverage with quality.<br />

- <strong>Test</strong> sequence: Bad parts to fail early.<br />

- Incur DFT overhead. (Parallel scan, faster scan, test points, BIST, test modes,<br />

isolation, etc.).


Apportioned <strong>Test</strong> Cost<br />

Figure: cost (0 to 700000) versus time (periods 1 to 6). Cost components: <strong>Test</strong> time - 1, Die DFT, <strong>Test</strong>er PE, <strong>Test</strong>er HW - 1, Tools DFT, Design DFT - 1.


Product Life-time Cost<br />

Figure: cost (0 to 3000000) versus time (periods 1 to 16). Cost components: <strong>Test</strong> time - 1, Die DFT, <strong>Test</strong>er PE, <strong>Test</strong>er HW - 1, Tools DFT, Design DFT - 1.


Diverse Markets<br />

- Designs spec’ed and created for one market re-used for others. Examples:<br />

- Catalogue wireless connectivity chips re-used inside cars / planes.<br />

- DSPs re-used for automotive engine control.<br />

- Micro-controllers for medical applications.<br />

- Quality is an opportunity cost. Price paid to meet vs price incurred not to.<br />

Table: markets compared (Catalog, Portable, Infrastructure, Automotive) on these parameters: Coverage, Time Zero Quality, Field DPPM, <strong>Test</strong> Cost, <strong>Test</strong> Power, Performance, Area.


Apportioned Quality Cost<br />

Figure: cost (0 to 700000) versus time (periods 1 to 6). Cost components: <strong>Test</strong> time - 2, <strong>Test</strong>er HW - 2, Design DFT - 2, <strong>Test</strong> time - 1, Die DFT, <strong>Test</strong>er PE, <strong>Test</strong>er HW - 1, Tools DFT, Design DFT - 1.


Apportioned Quality Cost (2)<br />

Figure: cost (0 to 3500000) versus time (periods 1 to 16). Cost components: <strong>Test</strong> time - 2, <strong>Test</strong>er HW - 2, Design DFT - 2, <strong>Test</strong> time - 1, Die DFT, <strong>Test</strong>er PE, <strong>Test</strong>er HW - 1, Tools DFT, Design DFT - 1.


Different Techniques Beyond Production <strong>Test</strong><br />

Techniques (compared on Yield, Reliability and Power):<br />

Technology: Cell hardening. Physical design rules.<br />

Margins: Additional design margins. Margin mode testing.<br />

DFT partitioning: Scan partitioning. Clock skewing / staggering. <strong>Test</strong> res.part.<br />

ATPG: Parametric tests. Defect based tests. Bus BIST. Power aware test.<br />

Power management: Power grid partitioning / over-design. Power isolation switches. Retention.<br />

Device configurability: Pre-shipment calibration. Memory repair. Module isolation.<br />

On-chip test / measurement: Self-test. Self-calibration. Self-repair. Adaptivity.<br />

Die test: Over-test. Stress test. Under-test. Binning. Adaptive test.<br />

System test: Field test. Periodic testing.<br />

Tolerance: Error checking and correction. Redundancy and reconfiguration.


How Coverage Impacts Fall-out<br />

Figure: reject rate (0 to 4500) versus stuck-at <strong>Test</strong> coverage (55% to 100%). Curves: Will Brown; Seth Ag 1.5; Seth Ag 1.8; Seth Ag 2.0; Seth Ag 2.5; DSP empirical.<br />

The formulae are a guide. Deviations exist for different reasons.


Importance of DPPM<br />

- Theoretical example:<br />

- Sample of 10^6 devices.<br />

- 10000 are faulty.<br />

- 100 escape manufacturing test screen.<br />

- Yield = 99%. DPPM = 100.<br />

- Coverage = 99%, (assuming equal distribution and occurrence of modelled<br />

faults).<br />

- Practical case:<br />

- Yield much lower. More faulty devices.<br />

- Coverage much lower. More test escapes.<br />

- Modelled faults do not occur uniformly.<br />

- Several non-modelled faults also occur.<br />

- Beta quality of silicon test program.<br />

- Result: DPPM of 10000s. On ramp, DPPM of 100s.
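The theoretical example above can be reproduced in a few lines. The sketch below takes DPPM as escapes per million shipped devices, following the earlier terminology; the variable names are illustrative. The exact figure comes out slightly above 100 because the shipped population is a little under one million.

```python
sample = 1_000_000   # devices manufactured
faulty = 10_000      # devices that are actually faulty
escapes = 100        # faulty devices that pass the manufacturing test screen

good = sample - faulty
yield_fraction = good / sample

# Share of faulty devices caught by the screen. This equals fault coverage
# only under the slide's assumption of uniformly occurring modelled faults.
coverage = (faulty - escapes) / faulty

shipped = good + escapes               # everything certified as good
dppm = escapes / shipped * 1_000_000   # defective parts per million shipped

print(f"yield = {yield_fraction:.2%}, coverage = {coverage:.2%}, DPPM = {dppm:.1f}")
```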


Impact of Non-zero DPPM<br />

Consider a board with ten devices, each with DPPM of 1000:<br />

- The DPPM of this board is 10000, i.e. 1%.<br />

- One in hundred boards is bad.<br />

Consider an automobile with 100 devices, each with DPPM of 100:<br />

- The DPPM of this automobile system (chips alone, other systems apart) is 10000, i.e. 1%.<br />

- One in hundred automobiles is bad.<br />

Consider an automobile with 100 devices, each with DPPM of 1:<br />

- The DPPM of this automobile system is 100, i.e. 0.01%. One in ten<br />

thousand automobiles is bad.
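The system-level figures above add the per-device DPPM values, which is a close approximation when the DPPM is small. A minimal sketch of the exact calculation follows, assuming device defects are independent (an assumption not stated on the slide); the function name is illustrative.

```python
def system_dppm(device_dppm: float, n_devices: int) -> float:
    """DPPM of a system that is bad if any one of its devices is bad."""
    p_device_good = 1 - device_dppm / 1_000_000
    p_system_good = p_device_good ** n_devices
    return (1 - p_system_good) * 1_000_000


for device_dppm, n in ((1000, 10), (100, 100), (1, 100)):
    exact = system_dppm(device_dppm, n)
    linear = device_dppm * n  # the approximation used on the slide
    print(f"{n} devices at {device_dppm} DPPM: exact {exact:.1f}, linear {linear}")
```

The exact value is always slightly below the linear sum, so the slide's figures are mildly pessimistic upper bounds.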


DPPM Calculation<br />

The Williams and Brown equation relates the escape rate (DPPM) to the fault<br />

coverage for a given yield.<br />

D = (1 - Y^(1-C)), where<br />

Y = Yield, 0
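The equation above is easy to evaluate numerically. The sketch below assumes C is the fault coverage expressed as a fraction between 0 and 1, as in the usual statement of the Williams and Brown model (the slide's own definition of the symbols is cut off); the function name is illustrative.

```python
def defect_level(y: float, c: float) -> float:
    """Williams and Brown: D = 1 - Y**(1 - C), with Y and C as fractions."""
    if not 0 < y <= 1:
        raise ValueError("yield must be in (0, 1]")
    if not 0 <= c <= 1:
        raise ValueError("coverage must be in [0, 1]")
    return 1 - y ** (1 - c)


# Escape rate in DPPM at 75% yield for a few coverage values.
for coverage in (0.90, 0.99, 0.999):
    dppm = defect_level(0.75, coverage) * 1_000_000
    print(f"coverage {coverage:.1%}: about {dppm:.0f} DPPM")
```

Two limiting cases make a quick check: with full coverage (C = 1) the defect level is zero, and with no test at all (C = 0) it equals the fraction of bad parts, 1 - Y.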


IP Mixture in SOC<br />

Figure: SOC block diagram with blocks A to H.<br />

Core A: No compression. No bounding.<br />

Core B: No compression. Bounding only.<br />

Core C: Compression + Bounding.<br />

Core D: Compression only. No bounding.<br />

E: Glue logic.<br />

F: DFT logic<br />

• <strong>Test</strong> IPs – Memory BIST, scan CoDecs, test mode controls, E-Fuse, etc.<br />

• Wrappers – Pin-muxing, analog PMT, 1500 bounding, etc.<br />

G: SOC level CoDec(s).<br />

H: SOC bounding.


<strong>Test</strong> Scheduling Options<br />

Schedule<br />

C -> D -> G(A+B+E) -> F<br />

C -> [Init(A+E) + Init(B) + Init(C) -> D]* -> G(A+B) -> E -> F<br />

C -> [Init(C) -> Init(E) -> [G(A+B) || D]* -> E -> F<br />

C || G(A+B+E) || D -> F<br />

Comparison parameters for each schedule: <strong>Test</strong> Time, <strong>Test</strong> Qual.


<strong>Test</strong> Data Volume Required to <strong>Test</strong> DUT<br />

Figure: pattern count (15000 to 31000) versus TDV per pattern (0 to 1750).<br />

The product of X-Y co-ordinate values is not constant. DUT test<br />

entropy drives CoDec selection and QOR. <strong>Test</strong> data meeting<br />

entropy requirements ensures higher coverage.


Generic Self-test Controller<br />

Figure: block diagram. Blocks: External host or interface; Master CPU; Internal memory (self-test config.) holding self-test microcode; DUT with scan compression + addl. control; DUT BIST; Status registers. Paths: Read, Write.<br />

Possibilities: (i) One-time manufacturing test. (ii) Fixed time field test. (iii) Periodic field test. (iv) Online test concurrent with normal operation.<br />

Figure: timeline alternating normal appl. time slots (A) and <strong>Test</strong> application (T): A1 T1 A2 T2 A3 T3 A4


Scatter Plots – How to Distinguish between Good and Bad<br />

Figure: scatter plots of normalized Fmax versus process spread.


Illustration: Variability DPPM<br />

- Variability DPPM: Fails intermittently at


Coverage Improvements Across Fault Models<br />

Figure: coverage versus pattern count. Labels: Hybrid ATPG, FM1, FM2, Stuck-at, Transition, Path delay, Bridging, Optimized Pattern Set.<br />

Methodology<br />

- TC1(FM1) + TC2(FM2) vs defect coverage.<br />

- Pattern sets: P1 + P2 versus merged set of patterns.


Cost versus Coverage<br />

Figure: <strong>Test</strong> cost versus defect coverage for <strong>Test</strong> A and <strong>Test</strong> B, with a fixed test cost level marked.<br />

Move to more effective tests B, e.g. transition, bridging, etc.


Cost versus Quality Tradeoffs<br />

Figure: quality versus process capability. Labels: Required quality, Benefit of test, Unacceptable quality.<br />

<strong>Test</strong> allows use of inherently low quality process to manufacture devices<br />

with high quality levels. Yield loss is made up for by increased competitiveness.


Bath-tub Curve<br />

Figure: bath-tub curve. Labels: Extrinsic / Latent defects; Intrinsic failures / Ageing; Product Eval.; Device Eval.


<strong>Test</strong> Components under Multi-site<br />

- <strong>Test</strong> 1: 200 ms.<br />

- <strong>Test</strong>s 2 to N: Less than 20 ms.<br />

MS Factor | X1 | X16 | X16 -> X64 | X16 -> X128<br />

Option 1 | 220 ms | 14 ms | NA | NA<br />

Option 2 | 220 ms | 1.5 ms | 4.5 ms | NA<br />

Option 3 | SKIP | 1.5 ms | SKIP | 3.25 ms<br />

Die | IP | Type of <strong>Test</strong><br />

Single | Single | Single site<br />

Single | Multiple (same) | Single site<br />

Single | Multiple (different) | Single site<br />

Multiple (same) | Single | Multi-site<br />

Multiple (same) | Multiple (same) | Multi-site<br />

Multiple (same) | Multiple (different) | Multi-site<br />

Multiple (different) | Single | Non-identical<br />

Multiple (different) | Multiple (same) | Non-identical<br />

Multiple (different) | Multiple (different) | Non-identical<br />

Multiple insertions required.<br />

<strong>Test</strong> content varies in different insertions.<br />

Selection of test content, test concurrency and multi-site factor important.


<strong>Test</strong>er Board


<strong>Test</strong> Power Concerns<br />

Figure: normalized power (0 to 15) for Video Decode, <strong>Test</strong> (Pre-Opt) * and <strong>Test</strong> (with Opt). Bar labels: 5.2X, 1.7X.<br />

- <strong>Test</strong> power can be several times more than normal mode power.<br />

- Peak test power issues (IR drop issues) impact yield.<br />

- Affects both shift and capture operations.


Distinguishing Good Parts – An Analog Process<br />

Good / Perfect Part: SPECIFICATION / FEATURE TEST<br />

• Defect free.<br />

• Identification elusive.<br />

• Costly.<br />

Acceptable Part: FUNCTIONAL TEST<br />

• Error free.<br />

• Used in speed binning.<br />

• Often enabled through outlier analysis.<br />

• May need Shmoo data.<br />

• Parametric defects targeted.<br />

• Targeted tests: Iddq, path delay, DFT R/W controls, DC parametrics, functional, etc.<br />

Bad Part: DEFECT ORIENTED TEST<br />

• Static defects targeted.<br />

• Gross errors assumed.<br />

• Targeted tests: Stuck-at, transition, small delay defect, bridging, memory algorithmic tests, etc.<br />

• Successfully created, adopted, optimised, adapted.


Four Quadrant Analysis<br />

Figure: four-quadrant plot (quadrants I to IV). One axis: Built-in Self <strong>Test</strong>s (Structural), BAD / GOOD. Other axis: Traditional <strong>Test</strong>s (Functional, Parametric, …), BAD / GOOD. Region labels: Underkill, Yield Loss, Units to Ship, Overkill.


<strong>Test</strong> Effectiveness Using Venn Diagrams<br />



Thank you.
