16.11.2014 Views

Solving Chip-Level Verification Challenges - Test and Verification ...


Solving Chip-Level Verification Challenges

Balaji Kaliraj
Lavanya Rekha
Broadcom Corporation

Broadcom Proprietary and Confidential. © 2013 Broadcom Corporation. All rights reserved.


Overview

• Gate-level simulations
  – Challenges seen
  – Test bench configurations
  – Simulation time
  – Timing closed PD blocks
  – Statistics
• The Package-Aware Test Bench (PAT)
  – Chip-Aware and Package-Aware Test Benches
  – The need for PAT
  – Methodology
  – Advantages


Gate-Level Simulations

• RTL simulations verify the functionality of the design.
• To verify interface timing across modules, gate-level simulations are required.
• Timing at clock domain crossings is also verified.
• A netlist annotated with Standard Delay Format (SDF) timing is used instead of the RTL.

[Figure: test bench, with the chip RTL replaced by the chip netlist with SDF.]


Challenges

• The design is huge.
• The tool may crash while trying to load the entire SDF-annotated design.
• Running the simulation with the complete netlist is practically impossible.
• Simulation on the full-chip test bench runs at approximately 40 CPS (clocks per second).


Test Bench Configurations

• These challenges are solved by having different flavors of test benches on which to run the gate-level simulations.
• The design is functionally partitioned, with a netlist for the blocks of interest and black-boxing of the other modules.
• The smaller overall design size makes the design easier to load and reduces simulation time.
• Together, the configurations make sure the timing of every interface is covered.
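The coverage claim in the last bullet can be checked mechanically. The following is a minimal sketch, assuming each configuration is described by the set of blocks it keeps as a netlist and each interface by the pair of blocks it connects. The block names, the interface list, and the contents of the three configurations are hypothetical and are not taken from the slides.

```python
# Hypothetical configuration contents: the blocks kept as netlist in each
# test bench configuration. Everything else is black-boxed.
CONFIGS = {
    "config1": {"serdes", "mac"},
    "config2": {"mac", "ingress_pipe", "mmu"},
    "config3": {"mmu", "egress_pipe", "mac"},
}

# Hypothetical interface list: each entry is a pair of blocks whose
# interface timing must be exercised with both sides netlisted.
INTERFACES = [
    ("serdes", "mac"),
    ("mac", "ingress_pipe"),
    ("ingress_pipe", "mmu"),
    ("mmu", "egress_pipe"),
    ("egress_pipe", "mac"),
]


def black_boxed(all_blocks, config_blocks):
    """Return the blocks that a configuration replaces with black boxes."""
    return sorted(set(all_blocks) - set(config_blocks))


def uncovered_interfaces(configs, interfaces):
    """Return interfaces whose two blocks are never netlisted together."""
    missing = []
    for a, b in interfaces:
        if not any(a in blocks and b in blocks for blocks in configs.values()):
            missing.append((a, b))
    return missing


if __name__ == "__main__":
    all_blocks = {block for pair in INTERFACES for block in pair}
    for name, blocks in CONFIGS.items():
        print(name, "black-boxes", black_boxed(all_blocks, blocks))
    print("uncovered:", uncovered_interfaces(CONFIGS, INTERFACES))
```

An empty result from `uncovered_interfaces` means every listed interface is simulated with timing in at least one configuration; any pair it returns points at a gap in the partitioning.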


Chip Level Test Bench

[Figure: block diagram of the chip-level test bench. The DUT contains the Memory Management Unit, Clock & Resets, CPU Interface, Ingress Pipe, Egress Pipe, and multiple SerDes and MAC blocks. The test bench surrounds the DUT with SerDes BFMs, xMII BFMs, an Ingress BFM, an Egress BFM, and a Mem Mgmt BFM.]


Three Test Bench Configurations

[Figure: the chip-level test bench block diagram (DUT with Memory Management Unit, Clock & Resets, CPU Interface, Ingress Pipe, Egress Pipe, SerDes and MAC blocks, surrounded by SerDes, xMII, Ingress, Egress, and Mem Mgmt BFMs), annotated with the labels Config1, Config2, and Config3.]


Timing Closed PD Blocks

• Having separate timing-closed PD blocks from the Physical Design team helps in verifying timing between blocks (e.g., SerDes and MAC).
• We had seen cases where the SerDes netlist was not part of the delivery from the PD team. SerDes RTL was used instead, so timing across the interface was not verified, because the SerDes SDF (which contains the timing information) was not used.
• Now, using a complete timing-closed wrapper with MAC + SerDes, the verification is more reliable and complete.


Simulation Time

• Simulation time for gate-level simulations is exceptionally high.
• It can be greatly reduced with different test bench configurations.
• Even after that, a simulation takes a substantial number of hours (and sometimes days) to complete.
• The most time-consuming task in simulation is the initialization procedure.


Simulation Time (Cont.)

• The initialization procedure involves register/table configurations.
• Scripts have been developed to enable back-door slamming of registers/tables in the netlist.
• Depending on the test bench configuration used, irrelevant tasks can be skipped.
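One way to picture the last two bullets is a planner that sorts initialization tasks per configuration. This is only a sketch: the task names, the block names, and the rule that a task is irrelevant when its block is black-boxed are assumptions made for illustration, and the actual back-door mechanism used by the scripts is not described in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InitTask:
    name: str       # hypothetical task name
    block: str      # block whose registers/tables the task configures
    backdoor: bool  # True if the values can be written directly into the netlist


# Hypothetical task list for illustration only.
TASKS = [
    InitTask("serdes_lane_setup", "serdes", backdoor=False),
    InitTask("mac_port_config", "mac", backdoor=True),
    InitTask("mmu_table_init", "mmu", backdoor=True),
    InitTask("egress_sched_init", "egress_pipe", backdoor=True),
]


def plan_init(tasks, netlisted_blocks):
    """Split tasks into (front_door, back_door, skipped) name lists.

    Assumption: a task is irrelevant, and therefore skipped, when the block
    it configures is black-boxed in the chosen configuration.
    """
    front, back, skipped = [], [], []
    for task in tasks:
        if task.block not in netlisted_blocks:
            skipped.append(task.name)
        elif task.backdoor:
            back.append(task.name)
        else:
            front.append(task.name)
    return front, back, skipped


if __name__ == "__main__":
    front, back, skipped = plan_init(TASKS, {"serdes", "mac"})
    print("front-door:", front)
    print("back-door:", back)
    print("skipped:", skipped)
```

Keeping the plan separate from the mechanism that performs the writes means the same task list can serve every test bench configuration.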


Statistics

Test Bench Used                                   Compile Database Size (as per VCS Simulator)
Chip RTL                                          4.2G
Configuration A (Most of the Modules Included)    35G
Configuration B (Lighter Configuration)           10G

Test Bench Used                                   Simulation Time (in Clocks per Second of the Simulator)
Chip RTL                                          291 CPS
Configuration A (Some Crucial Blocks Enabled)     23 CPS
Configuration B (Lighter Configuration)           446 CPS
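To put the figures in perspective, the short calculation below converts the CPS values into wall-clock time and relative ratios. The numbers are copied from the tables; the one-million-clock test length is a hypothetical value chosen only to make the comparison concrete.

```python
# Figures copied from the Statistics tables.
CPS = {"chip_rtl": 291, "config_a": 23, "config_b": 446}
DB_SIZE_G = {"chip_rtl": 4.2, "config_a": 35, "config_b": 10}


def wall_clock_hours(clocks, cps):
    """Hours needed to simulate `clocks` clock cycles at `cps` clocks per second."""
    return clocks / cps / 3600


def ratio(numerator, denominator):
    """Plain ratio, kept as a helper so the comparisons read clearly."""
    return numerator / denominator


if __name__ == "__main__":
    test_length = 1_000_000  # hypothetical test length in clocks
    for name, cps in CPS.items():
        print(f"{name}: {wall_clock_hours(test_length, cps):.2f} h")
    print(f"config_b vs config_a speed: {ratio(CPS['config_b'], CPS['config_a']):.1f}x")
    print(f"config_a vs chip_rtl database: {ratio(DB_SIZE_G['config_a'], DB_SIZE_G['chip_rtl']):.1f}x")
```

With these inputs, Configuration B runs roughly 19 times faster than Configuration A, and the Configuration A compile database is roughly 8 times the size of the RTL one.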


The Package-Aware Test Bench and Chip-Aware Test Bench

• A chip-aware test bench system requires a clock/reset driver and BFMs to drive/sample all signals/features that the chip supports.
• For each of the packages that a chip supports, some features are enabled and some are disabled. This is achieved in the chip-aware test bench using $plusarg/`ifndef and more manual effort.
• All the different test benches need to be tested separately.
• This increases the chances of bugs being left in certain test benches.
• A package-aware test bench requires only one test bench for all the package modes.
• The chip top is considered a die; logic to implement the package(s) is instantiated in the test bench.
• The common interfaces of all the packages are tied together, and only one package mode is active at a time.


The Need for PATs

• As more packages are added for each chip, it becomes necessary to implement package intelligence in test benches.
• There’s also a need to minimize the number of chip-level test benches.
• We must be able to run almost all the top-level test cases in all the packages that our chip supports.
• Chip initialization routines and configurations must be dependent on the package we select.


Methodology

• Pass transistors are used to interconnect the test bench and the DUT:
  – Equivalent to interfaces in SystemVerilog.
  – Can be used for inputs, outputs, and inouts, unlike the ternary operator (? :), which can be used only for inputs.
• A new module “chip_pkg” is created based on the pinout extract provided to the package team. This extract has the following interfaces:
  – All inputs, outputs, and inouts of chip_top
  – All inputs, outputs, and inouts of chip_top, with their direction reversed
  – One input for package selection
  – Glue logic to feed through signals based on the package_selection input
• chip_pkg is instantiated multiple times (for as many packages as we support) in the test bench.
• A one-hot encoded package selection input vector can be driven based on runtime plusargs.
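The generation step can be sketched as a small script that turns a pin list into Verilog text. This is an illustration under stated assumptions, not the script used by the authors: the function name, the `pkg_en` port name, the choice of `tranif1` as the pass-transistor primitive, declaring every signal as `inout`, and emitting one module per package mode are all assumptions.

```python
def generate_chip_pkg(module_name, pins):
    """Return Verilog source for a package module built from pass transistors.

    `pins` is a list of (die_signal, package_signal) name pairs. Each pair is
    joined by a tranif1 switch that conducts only while pkg_en is 1, so a
    disabled package module leaves its package-side nets undriven.
    """
    ports = []
    switches = []
    for die_sig, pkg_sig in pins:
        # Both sides are declared inout because tranif1 is bidirectional.
        ports.append(f"    inout wire {die_sig}")
        ports.append(f"    inout wire {pkg_sig}")
        switches.append(f"    tranif1 u_{die_sig} ({die_sig}, {pkg_sig}, pkg_en);")
    ports.append("    input wire pkg_en")

    lines = [f"module {module_name} ("]
    lines.append(",\n".join(ports))
    lines.append(");")
    lines.extend(switches)
    lines.append("endmodule")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical pin list; a real flow would read it from the pinout extract.
    print(generate_chip_pkg("chip_pkg_a", [("Sig1", "pkg_Sig1"), ("O1", "pkg_O1")]))
```

Because the switch is bidirectional, the same generated line serves inputs, outputs, and inouts, which is the property the first bullet contrasts with the ternary operator.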


PAT: the Package-Aware Test Bench

The pin_list xls file gets communicated to the package team by the PD team after review.
This file is an input to a script which generates package modules and wrappers.
Autogenerated package modules and wrappers are instantiated in the test bench and are ready for testing.

[Figure: transactors (Xactors) and monitors/checkers connected through two package modules, Pkg mode A and Pkg mode B. Labels shown: input signals Sig1, Sig2, Sig3 with pkg_Sig1, pkg_Sig2, pkg_Sig3 (i/p); output signals O1, O2, O3 with pkg_O1, pkg_O2, pkg_O3 (o/p); enables pkgA_en and pkgB_en.]
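The run-time selection of a single active package mode can be modeled as follows. This is a sketch of the selection logic only, written in Python for illustration; the `+pkg_mode=` plusarg name and the bit ordering are hypothetical, while the enable names follow the pkgA_en and pkgB_en labels in the figure.

```python
PACKAGES = ["A", "B"]  # list order fixes the bit position of each enable


def one_hot_select(plusargs, packages=PACKAGES, key="+pkg_mode="):
    """Return a one-hot integer for the package named by the plusarg.

    `key` is a hypothetical plusarg prefix, not one taken from the slides.
    """
    selected = [arg[len(key):] for arg in plusargs if arg.startswith(key)]
    if len(selected) != 1:
        raise ValueError("exactly one package mode must be selected")
    mode = selected[0]
    if mode not in packages:
        raise ValueError(f"unknown package mode: {mode}")
    return 1 << packages.index(mode)


def is_one_hot(vector):
    """True when exactly one bit of `vector` is set."""
    return vector > 0 and (vector & (vector - 1)) == 0


def enables(vector, packages=PACKAGES):
    """Expand the select vector into per-package enable values."""
    return {f"pkg{name}_en": (vector >> i) & 1 for i, name in enumerate(packages)}


if __name__ == "__main__":
    vec = one_hot_select(["+pkg_mode=B"])
    print(is_one_hot(vec), enables(vec))
```

Checking `is_one_hot` before a run guards the property that only one package module conducts at a time, so the tied-together common interfaces never see two drivers.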


PAT Advantages

• Run-time selection.
• Integration of all packages into one test bench using pass transistors; this calls for thin code.
• Reduced simulation overhead.
• An automated method of creating/integrating packages minimizes the chances of leaving a bug behind.
• With this implementation, all test cases can be run in different package modes using a run-time option, with almost no penalty on simulation run time.
• Clocks, resets, and feature-enables that are supposed to be disabled/enabled for a certain package are automatically taken care of (through scripts).



Thank You!
