E0286 – VLSI Test<br />
Background material on Test:<br />
Test requirements. Test handoffs. Testers.<br />
Where DUT and DFT fit into the design / manufacturing framework.<br />
Basic philosophy: Test, ATPG, DFT, BIST, COF, TTR.<br />
Test cost metrics and test economics.
Design and Test Cost Projections<br />
Test cost projections have possibly been contained by better test methods, higher multi-site factors, on-die concurrency, etc.<br />
But the percentage cost of test is still increasing.
Cost of Test<br />
[Figure: test cost and design cost, each as a percentage of total cost, plotted from 1950 to 2010; contributing factors labelled: uncertainty, infrastructure, fault models, volume, logic.]
The recurring cost of test for 5 (50) seconds of test time is $0.05 ($0.50) to $0.50 ($5.00), at $0.01 to $0.10 per second of tester use.
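As a quick sanity check on the figures above, the recurring cost is simply test time multiplied by the per-second tester rate. A minimal sketch using the slide's quoted rates:

```python
# Recurring test cost = test time (seconds) * tester cost per second.
# The rates and times below are the figures quoted on the slide.
def recurring_test_cost(test_time_s, rate_per_s):
    return test_time_s * rate_per_s

# 5 s of test time at $0.01 to $0.10 per tester-second:
print(round(recurring_test_cost(5, 0.01), 2),
      round(recurring_test_cost(5, 0.10), 2))    # 0.05 0.5

# 50 s of test time at the same rates:
print(round(recurring_test_cost(50, 0.01), 2),
      round(recurring_test_cost(50, 0.10), 2))   # 0.5 5.0
```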
<strong>Test</strong> Through the Years<br />
Evolution/<br />
Careabouts<br />
Curiosity/<br />
Indifference<br />
1950<br />
Research<br />
labs./Post<br />
design test<br />
University<br />
research<br />
Which<br />
parts<br />
are<br />
bad?<br />
Design for<br />
testability<br />
(DFT)<br />
How<br />
many<br />
bad<br />
parts<br />
have<br />
escaped?<br />
Design for<br />
manufacturability<br />
How<br />
good<br />
are<br />
the<br />
good<br />
parts?<br />
1960 1970 1980 1990 2000 2010 Year
Technology and Failure Modes<br />
[Figure: test complexity increasing with technology scaling / shrinking.]<br />
- Tests: hookup tests -> manufacturing tests / test automation -> tests for better DPPM screening -> periodic testing.<br />
- Fault types: board / external faults -> stuck-at / Iddq faults -> layout / coupling faults -> parametric / delay faults -> transient faults.
Terminology<br />
- Yield: The fraction of good chips obtained from the process (the process entitlement), expressed in %.<br />
- Coverage: Number of faults detected out of the total number of faults that exist, expressed in %.<br />
- DPPM: Defective Parts (test escapes) Per Million certified good devices. Computed at time zero.<br />
- FIT rate: Failures In Time, measured as the number of failures in 10^9 hours of operation.<br />
- Reliability: Quantitatively, it can be described using the actual FIT rate.
Terminology (2)<br />
- Defect: This is the actual cause of a fault or failure, e.g. leaky transistor, gate<br />
oxide short, etc.<br />
- Fault / Failure: This is the effect, e.g. high Iddq, gate output stuck-at 0, etc.<br />
- Errors: This is the manifestation of the fault at an observable output.<br />
- Outlier: Devices for which parametric measurements do not conform to an<br />
acceptable deviation around the mean.
Few Decades of Test<br />
- 1950s – Gedanken experiments. (University)<br />
- 1960s – D-algebra. (University)<br />
- 1970s – LSSD and scan design. (Design groups)<br />
- 1980s – DFT. Early automation. (Design groups)<br />
- 1990s – Automation for DFT insertion and pattern generation. (CAD groups)<br />
- 2000s – DFM. DFY. Cost / quality tradeoffs. (CAD groups)<br />
- 2010s – ...
Interesting Test Data<br />
Reliability / DPPM control:<br />
- 0% fault coverage -> 100% (apparent) yield.<br />
- > 99% stuck-at fault coverage at 75% yield for DPPM < 200.<br />
- But likely DPPM > 200. Confidence in yield is low.<br />
- DPPM requirements for several products are in the 100s, 10s and 1s range.<br />
- 0.5% yield -> DPPM of 5000.<br />
Test cost:<br />
- Cost of test vs design: crossover expected in a few years.<br />
- Cost of logic inside the device vs outside.<br />
- Rule of 10s: a $1 failure at the device level costs $1000+ in the field.<br />
- Cost impact: a few seconds to tens of seconds of test time.<br />
- Cost per second: 2 to 10 cents. Infrastructure extra.
Interesting Test Data (2)<br />
Design effort towards DFT:<br />
- From 10% to 40%.<br />
- Variation depending upon the nature of IP cores and the SOC, extent of re-use, etc.<br />
Time to production:<br />
- Design time: Months.<br />
- Test screening / ramp to production: Also months.<br />
- Fail-pass iterations: Costly. They result in longer manufacturing cycles and increased time to ramp to volume.<br />
SOC designs and DSM (deep sub-micron) effects together aggravate problems in each of the above.
Components of Test Cost<br />
Design costs – primary and derived:<br />
- Area, test generation time, etc.<br />
- Cost of attaining coverage, performance, etc.<br />
Test infrastructure costs:<br />
- Test automation tools.<br />
- Test program creation.<br />
- Test volume. Test application time.<br />
- Probe cards, boards and accessories.<br />
- Tester time.<br />
Test technology costs:<br />
- Capabilities for test screening and debug.<br />
- Impact on design and infrastructure.
Cost of Testing<br />
CPUD = (CTGD + CTBD) / (TNOD * Y)<br />
CPUD = Cost per unit die.<br />
CTGD = Cost of testing good dies.<br />
CTBD = Cost of testing bad dies. (May equal CTGD in a multi-site context.)<br />
TNOD = Total number of dies.<br />
Y = Yield.
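The CPUD formula can be sketched directly. The input numbers below are hypothetical, chosen only to illustrate the calculation:

```python
# Cost per unit (good) die, per the slide: CPUD = (CTGD + CTBD) / (TNOD * Y).
def cost_per_unit_die(ctgd, ctbd, tnod, y):
    """ctgd / ctbd: total cost of testing good / bad dies;
    tnod: total number of dies; y: yield (fraction of good dies)."""
    return (ctgd + ctbd) / (tnod * y)

# Hypothetical example: 10,000 dies at 90% yield, $900 spent testing
# good dies and $100 testing bad dies.
print(round(cost_per_unit_die(900.0, 100.0, 10_000, 0.9), 3))  # 0.111
```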
Cost of Testing (2)<br />
Cw = Wafer cost.<br />
D = Dies per wafer.<br />
Y = Test yield.<br />
Tg = Test time taken to test a good part.<br />
Tb = Average time it takes for a bad part to fail.<br />
Ctu = Tester time cost per unit time.<br />
Test time per wafer: Tt = [D * Y * Tg] + [D * (1 - Y) * Tb]<br />
Test cost per wafer: Ct = Ctu * Tt<br />
Test cost per good die: Ctg = Ct / (D * Y) = Ctu * {Tg + Tb * [(1/Y) - 1]}<br />
Fabrication cost per good die: Cwg = Cw / (D * Y)<br />
Total test cost -> Add costs across different tests / testers.
Cost Tradeoffs – Example 1<br />
Trade off coverage with effectiveness. Select tests based on their effectiveness.<br />
- Test A: Efficiency = 80%, Coverage = 90%.<br />
- Test B: Efficiency = 90%, Coverage = 95%.<br />
- Test C: Efficiency = 70%, Coverage = 100%.<br />
Selecting A + B is more effective than B + C:<br />
0.9 * 0.95 + 0.8 * 0.9 > 0.9 * 0.95 + 0.7 * 1.0
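The inequality can be verified with the effectiveness metric the slide implies (the sum of efficiency times coverage over the selected tests):

```python
# Effectiveness of a set of tests, scored as sum(efficiency * coverage),
# the metric implied by the slide's inequality.
def effectiveness(tests):
    return sum(eff * cov for eff, cov in tests)

A = (0.80, 0.90)   # Test A: efficiency 80%, coverage 90%
B = (0.90, 0.95)   # Test B: efficiency 90%, coverage 95%
C = (0.70, 1.00)   # Test C: efficiency 70%, coverage 100%

print(round(effectiveness([A, B]), 3))  # 1.575
print(round(effectiveness([B, C]), 3))  # 1.555
assert effectiveness([A, B]) > effectiveness([B, C])
```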
Cost Tradeoffs – Example 2<br />
Reduce the time taken for bad parts to fail: order the tests based on their efficiency.<br />
- Test A: Yield = 80% (less coverage), Time = 7 ms.<br />
- Test B: Yield = 60% (more coverage), Time = 8 ms.<br />
Test A followed by Test B:<br />
Total test time = 7 ms + 8 * 0.8 ms = 13.4 ms.<br />
Test B followed by Test A:<br />
Total test time = 8 ms + 7 * 0.6 ms = 12.2 ms.
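The same ordering logic generalizes to any number of tests. This sketch computes the expected test time for an ordered sequence, reproducing the slide's two totals:

```python
# Expected test time for an ordered sequence of tests: every part runs
# the first test; only parts that pass it (its "yield") proceed to the
# next test, and so on.
def expected_time(tests):
    total, surviving = 0.0, 1.0
    for pass_fraction, time_ms in tests:
        total += surviving * time_ms
        surviving *= pass_fraction
    return total

A = (0.80, 7.0)   # Test A: yield 80%, 7 ms
B = (0.60, 8.0)   # Test B: yield 60%, 8 ms

print(round(expected_time([A, B]), 1))  # 13.4  (A first)
print(round(expected_time([B, A]), 1))  # 12.2  (B first: bad parts fail earlier)
```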
Reducing Cost of Testing<br />
- Target tests on cheaper testers:<br />
- Application costs: $0.01/sec to $0.10/sec and above.<br />
- Actual tester costs: $0.2 M to $2 M and above.<br />
- Multi-site testing. Concurrent testing.<br />
- Reduce dependency on tester infrastructure.<br />
- Increase / reduce the test application speed.<br />
- Improve quality of tests for a given cost. Trade off coverage with quality.<br />
- Test sequencing: make bad parts fail early.<br />
- Incur DFT overhead (parallel scan, faster scan, test points, BIST, test modes, isolation, etc.).
Apportioned Test Cost<br />
[Figure: stacked cost chart over time periods 1–6, cost scale 0 to 700,000; components: Design DFT, Tools DFT, Tester HW, Tester PE, Die DFT, Test time.]
Product Life-time Cost<br />
[Figure: stacked cost chart over time periods 1–16, cost scale 0 to 3,000,000; components: Design DFT, Tools DFT, Tester HW, Tester PE, Die DFT, Test time.]
Diverse Markets<br />
- Designs spec'ed and created for one market are re-used in others. Examples:<br />
- Catalogue wireless connectivity chips re-used inside cars / planes.<br />
- DSPs re-used for automotive engine control.<br />
- Micro-controllers re-used for medical applications.<br />
- Quality is an opportunity cost: the price paid to meet it vs the price incurred for not meeting it.<br />
[Table: relative requirements for Coverage, Time Zero Quality, Field DPPM, Test Cost, Test Power, Performance and Area across Catalog, Portable, Infrastructure and Automotive markets.]
Apportioned Quality Cost<br />
[Figure: stacked cost chart over time periods 1–6, cost scale 0 to 700,000; components: Design DFT - 1, Tools DFT, Tester HW - 1, Tester PE, Die DFT, Test time - 1, plus quality additions Design DFT - 2, Tester HW - 2, Test time - 2.]
Apportioned Quality Cost (2)<br />
[Figure: stacked cost chart over time periods 1–16, cost scale 0 to 3,500,000; same components as above, including the quality additions Design DFT - 2, Tester HW - 2, Test time - 2.]
Different Techniques Beyond Production Test<br />
[Table: techniques, each marked against the benefit it addresses — yield, reliability, power.]<br />
- Technology: Cell hardening. Physical design rules.<br />
- Margins: Additional design margins. Margin mode testing.<br />
- DFT partitioning: Scan partitioning. Clock skewing / staggering. Test resource partitioning.<br />
- ATPG: Parametric tests. Defect based tests. Bus BIST. Power aware test.<br />
- Power management: Power grid partitioning / over-design. Power isolation switches. Retention.<br />
- Device configurability: Pre-shipment calibration. Memory repair. Module isolation.<br />
- On-chip test / measurement: Self-test. Self-calibration. Self-repair. Adaptivity.<br />
- Die test: Over-test. Stress test. Under-test. Binning. Adaptive test.<br />
- System test: Field test. Periodic testing.<br />
- Tolerance: Error checking and correction. Redundancy and reconfiguration.
How Coverage Impacts Fall-out<br />
[Figure: reject rate (0–4500) vs stuck-at test coverage (55%–100%), showing the Williams-Brown model, Seth-Agrawal curves (parameters 1.5, 1.8, 2.0, 2.5) and an empirical DSP data set.]<br />
The formulae are a guide. Deviations exist for different reasons.
Importance of DPPM<br />
- Theoretical example:<br />
- Sample of 10^6 devices.<br />
- 10000 are faulty.<br />
- 100 escape manufacturing test screen.<br />
- Yield = 99%. DPPM = 100.<br />
- Coverage = 99%, (assuming equal distribution and occurrence of modelled<br />
faults).<br />
- Practical case:<br />
- Yield much lower. More faulty devices.<br />
- Coverage much lower. More test escapes.<br />
- Modelled faults do not occur uniformly.<br />
- Several non-modelled faults also occur.<br />
- Beta quality of silicon test program.<br />
- Result: DPPM of 10000s. On ramp, DPPM of 100s.
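The arithmetic of the theoretical example above can be checked directly:

```python
# The slide's theoretical example: 10^6 devices, 10,000 faulty, 100 escapes.
devices, faulty, escapes = 1_000_000, 10_000, 100

yield_frac = (devices - faulty) / devices     # fraction of good devices
dppm = escapes * 1_000_000 / devices          # escapes per million devices
# Required coverage, assuming modelled faults occur uniformly:
# escapes = faulty * (1 - coverage).
coverage = 1 - escapes / faulty

print(yield_frac, dppm, coverage)  # 0.99 100.0 0.99
```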
Impact of Non-zero DPPM<br />
Consider a board with ten devices, each with DPPM of 1000:<br />
- The DPPM of this board is 10000, i.e. 1%.<br />
- One in hundred boards is bad.<br />
Consider an automobile with 100 devices, each with DPPM of 100:<br />
- The DPPM of this automobile system (chips alone, other systems apart) is<br />
10000, i.e. 1%.<br />
- One in hundred automobiles is bad.<br />
Consider an automobile with 100 devices, each with DPPM of 1:<br />
- The DPPM of this automobile system is 100, i.e. 0.01%. One in ten<br />
thousand automobiles is bad.
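These board and system figures come from compounding the per-device escape probabilities. The slide uses the additive estimate (n × DPPM); the exact binomial form, sketched below, is very close for small probabilities:

```python
# Compound device-level DPPM into board/system DPPM. The exact form is
# 1 - (1 - p)^n; the slide's additive estimate n * DPPM is a close
# approximation when p is small.
def system_dppm(device_dppm, n_devices):
    p = device_dppm / 1e6                      # per-device defect probability
    return (1 - (1 - p) ** n_devices) * 1e6    # system-level DPPM

print(round(system_dppm(1000, 10)))   # 9955  (additive estimate: 10000)
print(round(system_dppm(100, 100)))   # 9951  (additive estimate: 10000)
print(round(system_dppm(1, 100)))     # 100
```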
DPPM Calculation<br />
The Williams and Brown equation relates the escape rate (DPPM) to the fault coverage for a given yield:<br />
D = 1 - Y^(1-C), where<br />
Y = Yield (0 < Y <= 1), C = Fault coverage (0 <= C <= 1), and D is the defect level (DPPM = 10^6 * D).
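A short sketch of the equation, together with its inversion to find the coverage needed for a target DPPM at a given yield:

```python
import math

# Williams-Brown defect level: D = 1 - Y**(1 - C), reported in DPPM.
def williams_brown_dppm(y, c):
    return 1e6 * (1 - y ** (1 - c))

# Inverted form: coverage needed to reach a target DPPM at yield y.
def coverage_needed(y, target_dppm):
    return 1 - math.log(1 - target_dppm / 1e6) / math.log(y)

# At 75% yield and 99% stuck-at coverage:
print(round(williams_brown_dppm(0.75, 0.99)))     # 2873 DPPM
# Coverage needed to reach 200 DPPM at 75% yield:
print(round(coverage_needed(0.75, 200), 4))       # 0.9993
```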
IP Mixture in SOC<br />
[Figure: SOC floorplan with cores A–D, glue logic E, DFT logic F, SOC-level CoDec(s) G and SOC bounding H.]<br />
- Core A: No compression. No bounding.<br />
- Core B: No compression. Bounding only.<br />
- Core C: Compression + bounding.<br />
- Core D: Compression only. No bounding.<br />
- E: Glue logic.<br />
- F: DFT logic – test IPs (memory BIST, scan CoDecs, test mode controls, E-Fuse, etc.) and wrappers (pin-muxing, analog PMT, 1500 bounding, etc.).<br />
- G: SOC level CoDec(s).<br />
- H: SOC bounding.
Test Scheduling Options<br />
Candidate schedules, differing in test time and test quality:<br />
- C -> D -> G(A+B+E) -> F<br />
- C -> [Init(A+E) + Init(B) + Init(C) -> D]* -> G(A+B) -> E -> F<br />
- C -> [Init(C) -> Init(E) -> [G(A+B) || D]*] -> E -> F<br />
- C || G(A+B+E) || D -> F<br />
[Figure: the SOC floorplan with blocks A–H from the previous slide, annotated with the test time and test quality of each schedule.]
Test Data Volume Required to Test DUT<br />
[Figure: pattern count (15,000–31,000) vs TDV per pattern (0–1750).]<br />
The product of X-Y co-ordinate values is not constant. DUT test entropy drives CoDec selection and QOR. Test data meeting entropy requirements ensures higher coverage.
Generic Self-test Controller<br />
[Figure: block diagram. An external host or interface loads self-test microcode into an internal memory holding the self-test configuration; a master CPU reads this configuration, drives the DUT (with scan compression, BIST and additional control) and writes results to status registers.]<br />
Possibilities: (i) One-time manufacturing test. (ii) Fixed-time field test. (iii) Periodic field test. (iv) Online test concurrent with normal operation.<br />
Normal application time slots interleaved with test application slots: A1 T1 A2 T2 A3 T3 A4.
Scatter Plots – How to Distinguish between Good and Bad<br />
[Figure: scatter plot of normalized Fmax vs process spread.]
Illustration: Variability DPPM<br />
- Variability DPPM: Fails intermittently at
Coverage Improvements Across Fault Models<br />
[Figure: coverage vs pattern count for fault models FM1 and FM2 (stuck-at, transition, path delay, bridging), comparing separate ATPG runs, hybrid ATPG and an optimized pattern set.]<br />
Methodology:<br />
- TC1(FM1) + TC2(FM2) vs defect coverage.<br />
- Pattern sets: P1 + P2 versus a merged set of patterns.
Cost versus Coverage<br />
[Figure: test cost vs defect coverage for Test A and Test B, against a fixed test cost line.]<br />
Move to more effective tests (B), e.g. transition, bridging, etc.
Cost versus Quality Tradeoffs<br />
[Figure: quality vs process capability, marking the required quality level, the unacceptable quality region and the benefit of test.]<br />
Test allows the use of an inherently low quality process to manufacture devices with high quality levels. Yield loss is made up for by increased competitiveness.
Bath-tub Curve<br />
[Figure: failure rate over time — early failures from extrinsic / latent defects, a flat useful-life region, then intrinsic failures / ageing; device evaluation and product evaluation cover the corresponding regions.]
Test Components under Multi-site<br />
- Test 1: 200 ms. Tests 2 to N: less than 20 ms.<br />
Test time per device vs multi-site (MS) factor:<br />
- Option 1: X1 = 220 ms; X16 = 14 ms; X16 -> X64 = NA; X16 -> X128 = NA.<br />
- Option 2: X1 = 220 ms; X16 = 1.5 ms; X16 -> X64 = 4.5 ms; X16 -> X128 = NA.<br />
- Option 3: X1 = SKIP; X16 = 1.5 ms; X16 -> X64 = SKIP; X16 -> X128 = 3.25 ms.<br />
Die / IP combinations and the resulting type of test:<br />
- Single die, single IP: single site.<br />
- Single die, multiple (same) IPs: single site.<br />
- Single die, multiple (different) IPs: single site.<br />
- Multiple (same) dies, single IP: multi-site.<br />
- Multiple (same) dies, multiple (same) IPs: multi-site.<br />
- Multiple (same) dies, multiple (different) IPs: multi-site.<br />
- Multiple (different) dies, single IP: non-identical.<br />
- Multiple (different) dies, multiple (same) IPs: non-identical.<br />
- Multiple (different) dies, multiple (different) IPs: non-identical.<br />
Multiple insertions are required, and test content varies across insertions. Selection of test content, test concurrency and the multi-site factor is important.
<strong>Test</strong>er Board
Test Power Concerns<br />
[Figure: normalized power (0–15) for video decode, test (pre-optimization) and test (with optimization); annotated ratios 5.2X and 1.7X.]<br />
Test power can be several times more than normal mode power.<br />
Peak test power issues (IR drop) impact yield.<br />
Affects both shift and capture operations.
Distinguishing Good Parts – An Analog Process<br />
Good / perfect part – SPECIFICATION / FEATURE TEST:<br />
- Defect free.<br />
- Identification elusive.<br />
- Costly.<br />
Acceptable part – FUNCTIONAL TEST:<br />
- Error free.<br />
- Used in speed binning.<br />
- Often enabled through outlier analysis.<br />
- May need Schmoo data.<br />
- Parametric defects targeted.<br />
- Targeted tests: Iddq, path delay, DFT R/W controls, DC parametrics, functional, etc.<br />
Bad part – DEFECT ORIENTED TEST:<br />
- Static defects targeted.<br />
- Gross errors assumed.<br />
- Targeted tests: stuck-at, transition, small delay defect, bridging, memory algorithmic tests, etc.<br />
- Successfully created, adopted, optimised, adapted.
Four Quadrant Analysis<br />
[Figure: four-quadrant plot comparing verdicts of built-in self tests (structural) against traditional tests (functional, parametric, ...), each axis running BAD to GOOD; quadrants I–IV labelled with units to ship, underkill / yield loss, and overkill.]
Test Effectiveness Using Venn Diagrams<br />
Thank you.