U. Glaeser


As is the case with sizing and topologies for performance, the complexity of analysis and optimization for robustness and reliability can be controlled by generating rather simple guidelines for sizing, wire length versus spacing, keeper size ratios, and so on. In the case of reliability and robustness, however, verification that noise criteria are met is critical.

13.3 Logic Design Implications

As the technology-independent (FO4-based) cycle time shrinks, and signal distribution and clock and latch overhead shrink less quickly, the arithmetic and logic design must become more efficient. Linear-depth arithmetic and logic structures quickly become impractical. Linear carry-ripple addition, even over small group sizes, must be replaced with logarithmic carry-lookahead structures. Multiple arithmetic computations with a late selection, for example compound addition, may become necessary. In critical macros, additional logic may need to be introduced to avoid waiting for external select signals. For example, the carry-generation logic from a floating-point adder needed to be reproduced within the leading-zero anticipator of a high-frequency floating-point unit to avoid waiting for the sign of the add to be formed and delivered from the floating-point adder. [24]
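The contrast between linear and logarithmic depth, and the idea of compound addition with late selection, can be illustrated with a minimal Python sketch. It is not taken from the designs cited in this chapter: the Kogge-Stone style prefix arrangement, the function name `compound_prefix_add`, and the bit-list representation are all assumptions chosen for illustration.

```python
def compound_prefix_add(a, b, width):
    """Return (a + b, a + b + 1, levels), both sums modulo 2**width.

    Hypothetical illustration: a Kogge-Stone style parallel-prefix tree
    whose depth grows with log2(width) instead of width.
    """
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # bit generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # bit propagate
    half_sum = p[:]                                        # a_i XOR b_i
    levels, dist = 0, 1
    while dist < width:
        nxt_g, nxt_p = g[:], p[:]
        for i in range(dist, width):
            # Combine the group ending at bit i with the group ending at i - dist.
            nxt_g[i] = g[i] | (p[i] & g[i - dist])
            nxt_p[i] = p[i] & p[i - dist]
        g, p = nxt_g, nxt_p
        dist *= 2
        levels += 1
    # After the loop, g[i] and p[i] describe the whole group of bits 0..i.
    sum0 = sum1 = 0
    for i in range(width):
        carry0 = g[i - 1] if i else 0                 # carry-in with cin = 0
        carry1 = (g[i - 1] | p[i - 1]) if i else 1    # carry-in with cin = 1
        sum0 |= (half_sum[i] ^ carry0) << i
        sum1 |= (half_sum[i] ^ carry1) << i
    return sum0, sum1, levels
```

In this sketch a 64-bit operand needs six prefix levels rather than the 64 steps of a ripple chain, and both A + B and A + B + 1 come from the same tree, so a late-arriving select only has to choose between two precomputed sums.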

Traditionally sequential events may need to be performed in parallel. For instance, to compute condition codes, rather than waiting for the result of the ALU operation and using additional levels of logic to form the condition codes, a parallel condition code generation unit, which formed the condition codes directly from the ALU input operands faster than the ALU result, was required. [25] Sum-addressed caches have also been used to replace the sequential effective-address addition followed by cache access with a carry-free addition as part of the cache decoder. [26,27] Of more ubiquitous application, merging of logic with the latch is important as cycle times shrink. With increased pressure on cycle time, the latches and clocked circuits need a low-skew, low-jitter clock. Low-jitter clock generation and low-skew distribution require significant design, wiring, and power resources. [28,29]
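One way to see how a sum can be matched against a cache row without propagating a carry is the well-known bitwise test for A + B = K, sketched below in Python. Whether the cited designs use exactly this formulation is an assumption; the names `sum_equals` and `decode_row` are hypothetical. Each bit position checks only itself and its lower neighbor, so the depth does not grow with the operand width. With K = 0 the same test gives a zero condition code directly from the operands.

```python
def sum_equals(a, b, k, width):
    """True when (a + b) mod 2**width == k, using only per-bit logic.

    For each bit, the carry-in that K would require is a ^ b ^ k, and the
    carry-out that the bit would then produce is (a & b) | ((a | b) & ~k).
    The sum matches K exactly when every required carry-in equals the
    carry-out of the bit below it (and bit 0 requires no carry-in).
    """
    mask = (1 << width) - 1
    a, b, k = a & mask, b & mask, k & mask
    required_carry_in = a ^ b ^ k
    carry_out = (a & b) | ((a | b) & ~k & mask)
    return required_carry_in == ((carry_out << 1) & mask)


def decode_row(base, offset, index_bits):
    """Hypothetical one-hot row decoder: one carry-free comparison per row."""
    return [sum_equals(base, offset, row, index_bits)
            for row in range(1 << index_bits)]


def zero_flag(a, b, width):
    """Zero condition code of a + b, formed from the operands alone."""
    return sum_equals(a, b, 0, width)
```

In hardware each row's comparison would be evaluated in parallel inside the decoder; the list comprehension here only models that behavior.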

13.4 Variability and Uncertainty

The wires and devices that are actually fabricated may differ significantly from their nominal counterparts. In addition, the operating environment of the devices may vary widely in temperature, supply voltage, and noise. Delay analysis and simulation are usually performed at multiple corner conditions, in which combinations of best and worst device and environmental conditions are used. For circuits in which the matching of individual transistors is required for correct operation, such as current and voltage reference circuits and amplifiers, mismatch does occur despite the care taken in design and layout. [30] For timing chains, strobes, latches, and memories, as well as the analog functions previously described, Monte Carlo analysis is often used alongside worst-case analysis to ensure correct operation and ascertain delays.
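The difference between a worst-case corner and a Monte Carlo estimate for a timing chain can be sketched in Python as follows. Every number used with it (stage count, nominal delay, sigma) and every function name is hypothetical, and the stage delays are assumed to vary independently with a Gaussian distribution, which real devices need not follow.

```python
import random
import statistics


def corner_delay(nominal_ps, stages, sigma, n_sigma=3.0):
    """Worst-case corner: every stage is slow by n_sigma at the same time."""
    return stages * nominal_ps * (1.0 + n_sigma * sigma)


def monte_carlo_delays(nominal_ps, stages, sigma, trials, seed=0):
    """Sample chain delays with an independent random deviation per stage."""
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    return [
        sum(nominal_ps * (1.0 + rng.gauss(0.0, sigma)) for _ in range(stages))
        for _ in range(trials)
    ]


def summarize(delays, n_sigma=3.0):
    """Return the sample mean and the mean plus n_sigma standard deviations."""
    mean = statistics.mean(delays)
    return mean, mean + n_sigma * statistics.pstdev(delays)
```

Under these assumptions the independent deviations partly cancel along the chain, so the Monte Carlo bound sits well inside the all-slow corner. Variation that is correlated across stages would not cancel, which is one reason both kinds of analysis are run.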

In the design of a high-frequency processor, performance can be gained by minimizing variation where possible, then taking advantage of systematic variation and guard banding only for random variations, variations that change over time, and those for which the cost of exploiting the systematic variation exceeds the benefit. Efficient models of the systematic and random components of device and interconnect variation can be used by the designer to optimize the design and determine what guard band is necessary to account for the random variations. [31,32]
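As a rough illustration of guard banding only for the random component, the Python sketch below splits measured delay deviations into a per-location systematic part and a random residual. It does not reproduce the models cited above: the data layout, the use of a per-location mean as the systematic component, the three-sigma band, and the function names are all assumptions.

```python
import statistics


def split_variation(samples):
    """samples maps a location label to delay deviations seen across dies.

    Returns the per-location mean (treated here as the systematic part)
    and the standard deviation of what is left (the random part).
    """
    systematic = {loc: statistics.mean(vals) for loc, vals in samples.items()}
    residuals = [v - systematic[loc]
                 for loc, vals in samples.items() for v in vals]
    return systematic, statistics.pstdev(residuals)


def guard_bands(samples, n_sigma=3.0):
    """Compare a band that lumps all variation with one for the random part."""
    systematic, sigma_random = split_variation(samples)
    all_values = [v for vals in samples.values() for v in vals]
    lumped = n_sigma * statistics.pstdev(all_values)
    random_only = n_sigma * sigma_random
    return systematic, lumped, random_only
```

With hypothetical data such as `{"core": [4.0, 5.0, 6.0], "edge": [-6.0, -5.0, -4.0]}`, most of the spread is explained by location, so a design that accounts for the per-location offset needs a much smaller band than one that treats all of the variation as random.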

13.5 Summary

Designing high-frequency processors, an increasingly difficult task, is largely a process of optimizing a design to within a reasonable distance of the constraints of (1) the ability of the interconnect to communicate signals, distribute clocks, and supply power, (2) the current-delivering capabilities and noise margins of the devices, (3) the random or difficult-to-predict systematic variations in the processing, and

© 2002 by CRC Press LLC
