Design of a High-Speed 12-bit Differential Pipelined A/D Converter

DESIGN OF A HIGH-SPEED 12-BIT DIFFERENTIAL 

PIPELINED A/D CONVERTER 

Diploma Project 

Thomas Liechti 

February 2004 

Assistant: Zeynep Toprak (LSM) 

Professor: Yusuf Leblebici (LSM) 

Microelectronic Systems Laboratory (LSM) 

Swiss Federal Institute of Technology Lausanne


Table of Contents 

1 Introduction
1.1 Performance measures of A/D converters
1.2 A/D-Converter Architectures
1.3 A/D Converter Specifications
1.4 ADC Pipeline Architecture
1.4.1 4-Stage Converter Structure
1.4.2 Analog Pipeline Stage Structure
1.4.3 Pipelined A/D Conversion and Digital Error Correction
1.4.4 Analysis of accuracy requirements
1.4.5 Clocking scheme
2 4-bit Flash Analog-to-Digital Converter
2.1 4-bit Flash A/D Converter
2.1.1 Flash ADC Floorplan
2.2 Differential Comparator
2.2.1 Choice of Topology
2.2.2 Comparator circuit
2.2.3 Comparator Layout
2.3 Performance verification
2.3.1 Comparator Performance
2.3.2 Flash ADC Performance
3 4-bit Digital-to-Analog Converter
3.1 Current-steering D/A Converter
3.1.1 Continuous DAC Gain Calibration
3.1.2 DAC Floorplan
3.2 Unit current cell
3.2.1 Fournier-Senn [15] Current cell
3.2.2 Regulated Cascode Current Cell
3.3 Simulated Performance
4 Residue Amplifier and Sample-and-Hold
4.1 Switched Capacitor Residue Amplifier
4.1.1 Circuit Topology
4.1.2 Charge Injection of MOS Switches
4.1.3 Capacitor and Switch Sizing
4.2 Differential OTA
4.2.1 Mirrored cascode with class AB input stage with preamplifier
5 Top-Level Floorplanning
5.1 Analog Pipeline Stage Floorplan
5.2 Floorplan of complete pipeline
6 Conclusions
References


1 Introduction 

The goal of this diploma project is to redesign an existing pipelined 12-bit 200-MS/s 

single-ended analog-to-digital converter [4] to make its analog signal path fully differential. 

The converter has four 4-bit pipeline stages, each stage consisting of a Flash 

analog-to-digital converter, a digital-to-analog converter, and a residue amplifier (Figure 2). 

Only the blocks in the analog signal path have to be redesigned. The digital part consisting 

of thermometric to binary encoders and digital error correction does not need to be 

modified to make the converter differential. It is taken as is from the design presented in [4]. 

A prototype of a converter stage will be implemented on silicon. Some blocks will need to 

be finished after this report has been written. 

Making the analog signal path of the converter fully differential has several advantages: 

• Increased signal dynamic range, which is especially important for low-voltage analog 

designs. 

• Even harmonics introduced by circuit non-linearities are cancelled, improving the 

harmonic distortion characteristics of the system. 

• Immunity to common mode noise coming from the power supply or digital parts 

residing on the same chip for example. 

• Errors due to MOS-switch charge injection and clock feedthrough can more easily be 

cancelled as these errors are often common-mode signals. This is especially important 

in high-precision switched-capacitor circuits. 

These advantages make a fully differential signal path especially useful for mixed-signal 

designs [1], where analog and (noisy) digital circuitry have to coexist on the same die, and 

where the low supply voltage is imposed by the digital process. 

The ADC is designed as a block in a conventional logic 0.18 µm CMOS process so that it can
easily be integrated into a digital system. As this process does not provide high-precision
resistors and capacitors, the design should rely as little as possible on the precise matching
of these elements. Where matched elements cannot be avoided, special care has to be taken
when drawing their layout.

The design also has to cope with low-voltage and deep-submicron technology issues. For
analog circuits, the down-scaling of the minimum feature size is not as beneficial as for
digital circuits [8]:

• The signal-to-noise ratio (SNR), i.e. the dynamic range between signal amplitude and 

noise floor is inherently limited by the low supply voltage (imposed by low 

breakdown voltages) because a thermal noise floor is always present, and the supply 

voltage imposes an upper bound on (voltage) signal amplitude. For very low voltage 

designs, current mode processing can thus be an interesting alternative to voltage 

mode processing as current is not directly limited by the supply voltage (and the 

oxide breakdown voltage of the process). 

• Many low voltage circuit topologies are inherently slower than their high-voltage

counterparts. Also, as oxide layers get thinner and features move closer together with 

technology scaling, parasitic capacitances get larger. 

• Short channel effects such as very high g_ds (drain-to-source conductance) degrade

transistor performance. 



The challenge of this project is thus to find and dimension fully differential 

implementations of the functional blocks of the single-ended converter that satisfy the very 

ambitious speed specification (Section 1.3) despite the technological limits. 

ADCs with performance specifications in this range are used in high-bandwidth
applications such as digital video and wireless communication devices.

This report first gives a short summary of commonly used performance measures for A/D 

converters. Only a very short overview on converter architectures is given because a 

pipelined architecture has already been chosen for this design. In Section 1.4 the converter 

structure is described in detail. The design of the Flash ADC is detailed in Section 2, 

Section 3 describes the DAC design, and the Residue Amplifier is described in Section 4. 

Finally top level floorplanning is discussed in Section 5. Section 6 summarizes the work that 

has been done and indicates the following steps in the development of the converter. 

1.1 Performance measures of A/D converters 

An analog-to-digital converter (Figure 1) transforms an analog signal (continuous in time 

and amplitude) to a digital signal which is discrete in time and amplitude. First, the input 

waveform is sampled at discrete time intervals (assumed equidistant) by a sample-and-hold 

(S/H) circuit. The S/H output is a continuous-amplitude discrete-time signal proportional 

to the input signal’s amplitude at the sampling instant. The n-bit A/D converter quantizes
this signal into 2^n discrete amplitude levels, each of which is described by an n-bit
codeword.

The amplitude quantization introduces a quantization error. For input signals with 

frequency content only below half the sampling rate, the system’s accuracy ideally is only 

limited by this error. However, in practical converters, other sources of errors such as circuit 

element mismatches and random noise add to the total error, and limit the effective 

accuracy that can be achieved. 

The definitions of the ADC performance measures used in this work are summarized
next. More performance measures are described in [4].

[Figure 1: Block diagram of an Analog-to-Digital Converter [5]. The analog input is sampled by the S/H block; the sampled input feeds the n-bit A/D converter, which delivers the digital output bits B_0 to B_{n-1}.]

• The Differential Nonlinearity (DNL) error describes the difference between the ideal
step size (1 LSB) and the effective step sizes of the converter. For an A/D converter,
the DNL error of a code is thus the distance between its two adjacent converter
thresholds minus 1 LSB, normalized by 1 LSB.

• The Integral Nonlinearity (INL) error measures the difference between ideal and real 

code midpoints (see Figure 7) of the converter DC characteristic. 

• Sampling time uncertainty (aperture jitter) measures the deviation of the effective 

sampling instant from the ideal sampling instant. If the input signal is time varying, 

this uncertainty introduces an effective amplitude error in the S/H output because 

the sampled value is implicitly assigned to the ideal sampling instant. The magnitude 

of the introduced error increases with the rate of change in the input signal, and hence 

with input signal frequency. 

• The signal-to-noise ratio (SNR) measures the ratio between signal power and total 

noise power. If (harmonic) distortion is included in the noise, the SNR is also called 

SINAD (signal-to-noise-and-distortion ratio). The SNR of an A/D converter is 

intrinsically limited by the quantization noise. This upper limit on the SNR is 

approximately given by (1), where N is the number of bits of the converter. 

\[ S/N = 6.02 \cdot N + 1.76\ \mathrm{dB} \qquad (1) \]

• The effective number of bits is the value of N in (1) that results in the effectively 

measured SNR. 

• The effective resolution bandwidth (ERBW) is the input signal frequency range for
which the SNR at the converter output stays within 3 dB of the low frequency SNR value.
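The relation between (1) and the effective number of bits can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for this illustration; the code simply evaluates (1) and its inverse.

```python
def ideal_snr_db(n_bits: float) -> float:
    """Quantization-limited SNR of an ideal N-bit converter, equation (1)."""
    return 6.02 * n_bits + 1.76


def effective_number_of_bits(measured_snr_db: float) -> float:
    """Value of N in equation (1) that yields the measured SNR (in dB)."""
    return (measured_snr_db - 1.76) / 6.02


if __name__ == "__main__":
    # Upper bound for the 12-bit converter discussed in this report.
    print(f"ideal 12-bit SNR: {ideal_snr_db(12):.2f} dB")
    # 62 dB is an arbitrary example value, not a measurement from this work.
    print(f"ENOB at 62 dB: {effective_number_of_bits(62.0):.2f} bits")
```

For a 12-bit converter, (1) gives an upper bound of about 74 dB. Any measured SNR below that value maps to fewer than 12 effective bits.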

1.2 A/D-Converter Architectures 

Analog-to-digital converters can be classified according to the sequence of operations 

performed to determine the digital value corresponding to the analog input sample value. 

At the highest level, converters can be divided into oversampling converters and Nyquist 

rate converters. Nyquist rate converters convert each analog sample at maximum precision, 

thus minimizing quantization noise for each sample. This allows conversion of analog 

signals approaching the Nyquist rate, although in practice signals are usually sampled at 3 

to 20 times the input signal’s bandwidth. Strictly speaking, only converters whose input 

bandwidth is at least f_s/2 are Nyquist rate converters [10]. Oversampling converters highly

oversample (typically by a factor of 20 to 512) the input signal to spread the quantization 

noise over a large frequency band. Noise outside the signal band can then be filtered to 

improve the SNR. The quantization error power is minimized for a sequence of samples 

rather than for single samples. 
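The effect of spreading the quantization noise can be estimated with a short sketch. It assumes white quantization noise, an ideal filter that removes all out-of-band noise, and no noise shaping. These are textbook assumptions that are not stated in this report, and the function name is hypothetical. The oversampling ratio (OSR) is taken as the sampling rate divided by twice the signal bandwidth.

```python
import math


def oversampling_snr_gain_db(osr: float) -> float:
    """SNR gain from plain oversampling without noise shaping.

    Assumes white quantization noise and ideal filtering, so the in-band
    noise power drops by the factor OSR, i.e. by 10*log10(OSR) dB.
    """
    if osr < 1.0:
        raise ValueError("oversampling ratio must be >= 1")
    return 10.0 * math.log10(osr)


if __name__ == "__main__":
    for osr in (1, 4, 20, 512):
        print(f"OSR = {osr:4d}: gain = {oversampling_snr_gain_db(osr):.1f} dB")
```

Under these assumptions each doubling of the sampling rate gains only about 3 dB, which illustrates why very high oversampling ratios are needed and why the approach becomes impractical for high-bandwidth signals.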

For high-speed applications, Nyquist-rate converters have to be used because the required 

sampling rate for oversampling is orders of magnitude higher than signal bandwidth. For 

high bandwidth applications, the required sampling rate cannot be realized with current 

technology. (New very high speed technology permits ∆Σ-Converters to be used for RF 

applications [10].) 



There is a large variety of architectures for Nyquist rate converters: 

• Flash (parallel) converters are very fast, but the number of required comparators
increases exponentially with the number of bits, thus entailing large ICs (high cost,
difficult device matching), high power consumption and high input capacitance.

• Time-interleaved converters: Two or more converters work in parallel with shifted
clocks, at the cost of high power dissipation.

• Serial and successive approximation converters: Convert an input sample using a 

number of sequential steps. These converters can be very accurate and small, but slow 

because several conversion steps need to be performed for each sample. 

Because the architecture to be used for the converter is given, no study of other converter
architectures has been carried out. For high-speed Nyquist-rate A/D converters a pipelined
architecture is very well suited because it decouples the conversion rate (sampling rate)
from the conversion time. That is, throughput can be increased by extending the latency
between the time the analog sample is taken and the time when the corresponding digital
value is available at the converter's output (in many applications latency is not as critical
as throughput). The idea is to split the conversion into a number of serially executed
low-resolution conversions. In one sampling interval, only a low-resolution conversion has
to be completed instead of a full-resolution conversion. The low-resolution conversion can
be much faster because the accuracy requirement of the comparators (in the flash ADC) is
relaxed, which allows faster comparator architectures to be used (trading accuracy for
speed). Another advantage is the reduced number of comparators: the pipelined ADC uses
fewer, and less accurate, comparators than a Flash ADC with the same resolution. The other
elements in the pipeline stage (DAC and residue amplifier, see Figure 3), however, need full
resolution accuracy, not just the per-stage resolution accuracy (see Section 1.4.4).
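The saving in comparators can be made concrete with a small sketch. It uses the 2^N - 1 comparator count of an N-bit Flash converter and the four 4-bit stages of this design; the function names are hypothetical.

```python
def flash_comparators(n_bits: int) -> int:
    """Number of comparators in an N-bit Flash converter: 2^N - 1."""
    return 2 ** n_bits - 1


def pipeline_comparators(n_stages: int, bits_per_stage: int) -> int:
    """Total comparators in a pipeline built from identical Flash stages."""
    return n_stages * flash_comparators(bits_per_stage)


if __name__ == "__main__":
    print("12-bit Flash:          ", flash_comparators(12))
    print("4 x 4-bit Flash stages:", pipeline_comparators(4, 4))
```

A single 12-bit Flash converter would need 4095 comparators, whereas the four 4-bit stages need 60 in total.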

1.3 A/D Converter Specifications 

The A/D Converter specifications are summarized in Table 1. Note that no power spec is 

given. The first objective is to reach the 200 MHz sampling speed, and not low power 

consumption. The design may thus trade power for speed and accuracy. 

Table 1: ADC Specifications

Stated Resolution    12 bits
Voltage swing        1 V pp (differential)
Sampling Rate        200 MHz
Architecture         4 pipelined 4-bit flash stages with bit overlapping
Technology           UMC 0.18 µm logic CMOS (1.8 V)

The specifications in Table 1 are comparable to the performance of current state of the art 

converters [10][4]. 

The 1 Volt peak-to-peak input signal swing has been set to 0.6 to 1.6 V, resulting in an analog 

ground level of 1.1 V. The lower bound of 0.6 V allows using NMOS differential input pairs 

(the nominal threshold voltage being 0.5 V) while leaving 200 mV headroom for PMOS 

current mirrors. The analog ground level stays the same throughout the analog pipeline. 



1.4 ADC Pipeline Architecture 

1.4.1 4-Stage Converter Structure 

[Figure 2: Block diagram of four-stage pipelined A/D converter [4]. An external front-end S/H drives four analog pipeline stages (ADC 1 to ADC 4, DAC 1 to DAC 3); encoder 1 and the smart encoders 2 to 4 feed a digital pipeline of full adders (FA) and D flip-flops (DFF) that delivers the overflow bit and the output word from MSB to LSB, with the lowest bit discarded.]

Figure 2 shows the structure of the complete 4-stage converter pipeline. It consists of a
horizontal analog pipeline and a vertical digital pipeline performing digital error correction

and assembling the digital outputs of the four analog ADC stages. The same signal flow 

structure is used in the layout of the converter (Section 5.2). It allows easy slicing of the 

system into four almost identical parts that can simply be abutted to form the whole 

pipeline, thus reducing routing length between the stages. Another important benefit of this 

arrangement of circuit blocks is that it clearly separates the analog part from the digital part. 

This will minimize noise injected from the digital circuitry into the analog signal path. 

Once the first stage is completed, the whole pipeline can be assembled very easily. Only 

small modifications are needed in the encoder and digital part of the stages. 

A front-end sample-and-hold circuit is required at the input of the first pipeline stage to
hold the input stable during the conversion cycle. For the following stages the residue
amplifier (see Section 1.4.2) in the previous pipeline stage acts as the sample-and-hold
circuit. The front-end sample-and-hold block has very stringent requirements on sampling
time uncertainty (aperture jitter) because it samples a time-varying analog signal. Inside
the pipeline stages, settled signals are sampled, and consequently sampling time jitter is
less critical there.

In [10] it is suggested that aperture jitter is the dominant limiting factor for the SNR of 

current high-performance ADCs. Front-end sampling is thus very critical. In the prototype, 

the front-end sample-and-hold circuit will be external to the chip and its design is not part 



of this project. For the measurement of DC characteristics, slowly varying analog inputs can
be used, rendering the front-end S&H circuit unnecessary.

The circuits in the digital pipeline are directly taken from [4]. Only their layout has to be 

redrawn. 

Note that the encoder delay is not part of the total analog stage delay as a DAC topology 

directly using the Flash thermometric output is employed. The digital pipeline thus works 

in parallel with the analog one; its timing is less critical than that of the analog pipeline. 

1.4.2 Analog Pipeline Stage Structure 

Figure 3 shows the structure of an analog pipeline stage. It consists of a Flash A/D 

converter, a D/A converter, and a residue amplifier. The unity-gain buffer is needed because
of the poor output driving capability of the DAC topology used (Section 3.1).

The DC transfer characteristic of the 4-bit DAC (Figure 4) has been chosen such that the 

(ideal) residue voltage always stays within +/- 0.5 LSB. 

[Figure 3: Structure of an Analog Pipeline Stage. The clocked 4-bit Flash A/D converter drives the encoder and, through its 15 outputs, the 4-bit D/A converter; the DAC output passes through a unity-gain buffer to the clocked residue amplifier and sample & hold with a gain of 8.]

1.4.3 Pipelined A/D Conversion and Digital Error Correction 

Digital error correction by bit overlapping uses an extra bit per stage to detect possible
over- or underflow of the amplified residue from the previous stage. Error correction thus
requires halving the input range of stages 2, 3, and 4 to 500 mV peak-to-peak. Each stage
except the first one contributes only 3 bits to the digital output code.

In the absence of any other error sources, bit overlapping allows for a +/-0.5 LSB integral 

nonlinearity of the 4-bit A/D converter. It relaxes the requirements for comparator offset 

and reference ladder precision. However, an effort must still be made to keep offset and 

reference ladder errors small in order to keep the bit-overlapping as a “last resort”. Note 

that gain and linearity errors in the D/A converter and the residue amplifier are not 

corrected by the bit overlapping. 

Two different types of encoders are needed for this error correction scheme: the first stage
has a normal thermometric to binary encoder, while all following stages use smart encoders
to detect over- or underflow conditions (Table 2). The overflow and underflow columns flag
the codes resulting from over- or underflow of the amplified residue (shaded in the
original table).



[Figure 4: DC Characteristic of 4-bit Flash ADC and DAC connected in series. Staircase of vout versus vin, both spanning −1 V to +1 V, with one step per code from 0000 to 1111.]

Table 2: Thermometric to binary code conversion table

Thermometer code   Binary code         Binary code of “smart”   Overflow   Underflow
(decimal)          (stage 1 encoder)   encoders (stages 2-4)    (a)        (b)
15                 1111                011                      1          0
14                 1110                010                      1          0
13                 1101                001                      1          0
12                 1100                000                      1          0
11                 1011                111                      0          0
10                 1010                110                      0          0
9                  1001                101                      0          0
8                  1000                100                      0          0
7                  0111                011                      0          0
6                  0110                010                      0          0
5                  0101                001                      0          0
4                  0100                000                      0          0
3                  0011                111                      0          1
2                  0010                110                      0          1
1                  0001                101                      0          1
0                  0000                100                      0          1
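The mapping in Table 2 can be reproduced with a short behavioral sketch. It takes the decimal value of the thermometer code as input. The closed form (code - 4) mod 8 is only an observation that matches the table rows; it is not claimed to be how the encoders in [4] are implemented, and all function names are hypothetical.

```python
from typing import Tuple


def check_code(code: int) -> None:
    """Reject values outside the range of a 15-comparator thermometer code."""
    if not 0 <= code <= 15:
        raise ValueError("thermometer code (decimal) must be in 0..15")


def stage1_encoder(code: int) -> str:
    """Plain thermometer-to-binary encoder of stage 1 (4-bit string)."""
    check_code(code)
    return format(code, "04b")


def smart_encoder(code: int) -> Tuple[str, int, int]:
    """Smart encoder of stages 2-4: (3-bit code, overflow a, underflow b)."""
    check_code(code)
    bits = format((code - 4) % 8, "03b")   # matches the third column of Table 2
    overflow = 1 if code >= 12 else 0      # upper four codes
    underflow = 1 if code <= 3 else 0      # lower four codes
    return bits, overflow, underflow


if __name__ == "__main__":
    for code in range(15, -1, -1):
        bits, a, b = smart_encoder(code)
        print(f"{code:2d}  {stage1_encoder(code)}  {bits}  {a}  {b}")
```

Running the script prints the rows of Table 2 in the same order, which makes it easy to check the sketch against the table.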

Bubble errors in the thermometric code (due to metastability or noise in the comparators for 

example) are not corrected, but could cause gross errors (including short circuit) depending 

on the encoder. The encoder implementation proposed in [4] could cause short circuits by 

connecting a bit line simultaneously to Vdd and ground because more than one decoder 

row is activated. 

1.4.4 Analysis of accuracy requirements 

Thanks to bit overlapping, the Flash ADC theoretically only needs to be 4-bit accurate. The
DAC and the Residue Amplifier, however, need full 12-bit accuracy, at least in the first
stage. Because all stages use the same building blocks, the blocks all have to fulfill the
precision requirements of the first stage.

Errors are most significant in the first stage: DAC errors in the first stage are amplified
8^3 = 512 times before reaching the last pipeline stage. There they should still be smaller
than 0.5 LSB of the Flash converter. The error at the input of the first stage’s Residue
Amplifier thus has to be < 0.24 mV (125 mV / 512). Note that the last stage has an LSB of
250 mV (instead of 125 mV like the other stages) because the LSB of its 3 output bits can be
discarded for a 12-bit output (see Figure 2).

The maximum allowable gain error of the first stage Residue Amplifier can be estimated as
follows: Everything in the converter is assumed ideal except for the first stage Residue
Amplifier, which has a gain of 8(1+ε). In the worst case the residue will be 0.5 LSB =
62.5 mV. Hence $8 \cdot 62.5\,\mathrm{mV} \cdot \varepsilon \cdot 8^2 < 125\,\mathrm{mV}$,
which gives $\varepsilon < 2/8^3 \approx 0.004$. The maximum allowable gain error is thus
estimated to be a few per mil.
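The two estimates of this section can be checked numerically. The sketch below only re-evaluates the arithmetic given in the text (stage gain 8, 125 mV LSB, 250 mV LSB in the last stage); the constant and function names are hypothetical.

```python
STAGE_GAIN = 8               # residue amplifier gain per stage
LSB_MV = 125.0               # LSB of the Flash converter in stages 1 to 3
LAST_STAGE_LSB_MV = 250.0    # LSB of the last stage (lowest bit discarded)


def max_first_stage_input_error_mv() -> float:
    """Largest error at the first residue amplifier input that stays below
    0.5 LSB of the last stage after being amplified by 8^3."""
    return 0.5 * LAST_STAGE_LSB_MV / STAGE_GAIN ** 3


def max_first_stage_gain_error() -> float:
    """Largest relative gain error eps of the first residue amplifier:
    8 * 62.5 mV * eps * 8^2 < 125 mV."""
    worst_residue_mv = 0.5 * LSB_MV
    allowed_mv = 0.5 * LAST_STAGE_LSB_MV
    return allowed_mv / (STAGE_GAIN * worst_residue_mv * STAGE_GAIN ** 2)


if __name__ == "__main__":
    print(f"max input error: {max_first_stage_input_error_mv():.3f} mV")
    print(f"max gain error:  {max_first_stage_gain_error():.4f}")
```

Both results agree with the figures quoted above: roughly 0.24 mV of input-referred error and a gain error of about 0.4 %.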

In the presented design, calibration is only used where it is easily implementable and does 

not add much circuit complexity, or need many additional I/O pins. A continuous 

calibration feedback is used to adjust DAC gain, but there is no calibration mechanism for 

the Residue Amplifier gain. 

1.4.5 Clocking scheme 

The clocking scheme is kept as simple as possible. Because of the switched-capacitor Residue
Amplifier, more clock phases are needed than in the single-ended design in [4]. Figure 5

shows the clocking scheme of the pipeline stage. The signal names refer to the clock phases 

used in the comparator (Figure 11) and the Residue Amplifier (Figure 26). 



[Figure 5: Clocking Diagram for Pipeline Stages. Timing over one sampling period from t=0 to t=Ts. Labels in the diagram: PHI2, comparator reset, 1a,b (sample input), R (residue amp reset), 2a and 2b (sample DAC output), the events "SR latch switched", "DAC settled" and "Residue Amp settled", and the intervals in which input and output are stable.]

There are three critical phases that cannot overlap and that have to fit inside the 5 ns 

sampling interval: (1) reset of the Residue Amplifier, (2) sampling of the DAC output and 

settling of the Residue Amplifier output, and (3) sampling and settling of input voltage. The 

input to the stage (and thus to the comparators) can change as soon as the SR-latches of the 

comparators (Figure 11) have switched. The comparator will then be fully unbalanced, and its
state can only change after it has been reset.

Four external triggers are needed: the beginning of the regeneration phase of the
comparator, the end of the sampling of the input voltage, the end of the Residue Amplifier
reset, and the end of the DAC output sampling. All other clocks can be triggered by other
clock edges (indicated in Figure 5 by dashed arrows). Making these four events evenly spaced
allows all clocks to be generated from two 200 MHz clocks shifted by 90 degrees.
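The following sketch shows why two clocks in quadrature provide four evenly spaced trigger instants per sampling period. It assumes a 50 % duty cycle for both clocks, which the text does not specify. It also does not assign the four events listed above to particular edges, since that mapping is not given here. The names are hypothetical.

```python
from typing import Dict


def quadrature_edge_times_ns(f_clk_hz: float = 200e6) -> Dict[str, float]:
    """Edge times within one period of two clocks shifted by 90 degrees.

    Assumes a 50 % duty cycle, so the rising and falling edges of both
    clocks fall on a grid of a quarter period.
    """
    period_ns = 1e9 / f_clk_hz
    return {
        "clk0 rising": 0.0,
        "clk90 rising": period_ns / 4,
        "clk0 falling": period_ns / 2,
        "clk90 falling": 3 * period_ns / 4,
    }


if __name__ == "__main__":
    for edge, t in quadrature_edge_times_ns().items():
        print(f"{edge:14s} at {t:.2f} ns")
```

At 200 MHz the period is 5 ns, so the four edges are spaced 1.25 ns apart. Changing the duty cycle or phase shift of the external clocks moves these instants, which is the extra degree of freedom mentioned for the prototype.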

Note that the comparator reset timing is not critical. To reduce possible hysteresis due to 

incomplete reset between two cycles, the comparator is reset for as long as possible. 

Designing a self-contained clock generation circuit (using a PLL or DLL) is not necessary for 

the first prototype. In fact, inputting two shifted 200 MHz clocks from the outside gives 

more degrees of freedom (frequency, duty cycle) on the timing of the internal clock signals. 

2 4-bit Flash Analog-to-Digital Converter 

2.1 4-bit Flash A/D Converter 

A Flash topology is used for the 4-bit ADC as this topology is very fast and quite compact
for a small number of bits. An N-bit Flash ADC needs 2^N - 1 comparators; hence, for 4 bits,
15 comparators are needed. The 15 differential reference voltages are generated using a
resistive ladder as shown in Figure 6 [6]. N+ polysilicon resistors are used for the
resistor ladder because they provide reasonable matching and linearity.

[Figure 6: Flash Converter. Fifteen differential comparators with inputs vin+, vin−, vref+, vref− and outputs Q1 to Q15 share the inputs vin1 and vin2, the bias VBIAS and the clocks PHI1 and PHI2; their references are tapped from a ladder of resistors R connected between vref1 and vref2.]

The resistor ladder value has been set to 500 Ω, resulting in a static current of 125 µA. A large 

resistor area (W*L) improves matching and helps stabilize the reference voltages thanks to 

the large parasitic capacitance. The recovery time of the reference voltage nodes should be
short enough for the node potentials to fully recover from coupling noise between two
sampling instants.

Two external pins (vref_plus and vref_minus) are used to define signal range. 
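An idealized behavioral model of the Flash converter may help to picture its operation. The sketch assumes a differential input range of −1 V to +1 V (taken from the axis labels of Figures 4 and 7) and 15 equally spaced thresholds from an ideal ladder of 16 equal resistors. Comparator offset, noise and ladder mismatch are ignored, and the function names are hypothetical.

```python
from typing import List


def ladder_thresholds(v_min: float = -1.0, v_max: float = 1.0,
                      n_bits: int = 4) -> List[float]:
    """2^N - 1 equally spaced thresholds of an ideal resistor ladder."""
    n_levels = 2 ** n_bits
    lsb = (v_max - v_min) / n_levels
    return [v_min + k * lsb for k in range(1, n_levels)]


def flash_thermometer(v_in_diff: float, thresholds: List[float]) -> List[int]:
    """One bit per comparator: 1 if the differential input exceeds its threshold."""
    return [1 if v_in_diff > t else 0 for t in thresholds]


if __name__ == "__main__":
    thresholds = ladder_thresholds()
    for v in (-0.9, -0.3, 0.01, 0.6):
        code = flash_thermometer(v, thresholds)
        print(f"vin = {v:+.2f} V -> {''.join(map(str, code))} (decimal {sum(code)})")
```

The decimal value printed at the end of each line is the number of comparators that have switched, i.e. the thermometer code in the decimal representation used in Table 2.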

[Figure 7: Ideal DC transfer characteristic of the 4-bit Flash ADC. Output code (0 to 15) versus vin from −1 V to +1 V, with the ideal threshold levels and the ideal code midpoints marked.]

2.1.1 Flash ADC Floorplan 

Figure 8 shows the floorplan of the 4-bit Flash ADC. The folded arrangement of the resistor
ladder causes mismatches to be symmetrical about the input range midpoint (the first and
last resistors, and so on, are closely matched because they are adjacent). The floorplan
indicates a possible way of laying out and connecting the 16 reference ladder resistors. The
proposed arrangement of 32 resistor elements can be made wide enough to stack to about the
same height as the 15 comparators (comparator height is 15 µm). Wider resistors will improve
matching, and the increased parasitic capacitance will make the nodes less prone to
capacitive coupling.

2.2 Differential Comparator 

A differential comparator compares a differential input voltage to a differential reference 

voltage, i.e. it implements the following inequality: 

\[ (v_{in1} - v_{in2}) - (v_{ref1} - v_{ref2}) > 0 \qquad (2) \]

One state of the binary comparator output indicates that (2) evaluates to true, the other
one that it evaluates to false.

2.2.1 Choice of Topology 

Inequality (2) indicates that a differential comparator can be built as a differencing circuit 

followed by a single-ended comparator. 

[Figure 8: Floorplan of 4-bit Flash ADC. The resistor ladder, fed by the reference input, distributes the reference voltages to a column of comparators (labelled 1, 15, 2, 14, ..., 7, 9, 8 in the figure); the analog input and the clock are routed to all comparators, whose digital outputs go to the encoder and the DAC.]

A topology based on two cross-coupled flip-flops [11] and two differential pairs [12] has
finally been chosen. The cross-coupled flip-flops (Figure 9(b)) provide a fast decision: the
small input difference (output of the differencing circuit) is quickly regenerated to a
rail-to-rail signal by the high (nonlinear) gain of the regeneration flip-flops. The two
differential pairs (Figure 9(a)) implement a (nonlinear) differential difference amplifier:

\[ f(I_a, v_{in1} - v_{ref1}) - f(I_b, v_{in2} - v_{ref2}) > 0 \qquad (3) \]

Because f(I, ∆v) is monotonic in ∆v (and in this particular case also in I), (2) and (3)
always evaluate to the same logic value if I_a and I_b are the same.

Note that (2) can also be written as 

\[ (v_{in1} - v_{ref1}) - (v_{in2} - v_{ref2}) > 0 \qquad (4) \]

The second view (4) is better for the input range requirements of the differential pairs.
When crossing thresholds, the differential inputs should be as close as possible to the
origin of the transfer curves of the differential pairs because there the gain is highest,
leading to smaller resolvable voltage differences. Far away from the origin, the
differential gain of the pairs goes to zero and no decision can be made.

Vdd 

Vdd 

phi1 

phi1 

iout1 

iout2 

vout1 

vout2 

S 

phi1 

R 

Q 

vin1 

vref1 

vin2 

vref2 

iin1 

iin2 

phi2 

Q 

vbias 

(a) (b) (c) 

Figure 9: 

Comparator Elements: NMOS Input Pairs (a), Regenerative Flip-Flops (b), SR-Latch (c) 

A topology consisting of the three parts shown in Figure 9 was finally adopted: (a) two 

differential input pairs (differencing circuit), (b) a clocked cross-coupled latch (the actual 

comparator), and (c) an SR-latch to hold the comparator output until the next clock cycle. 

Three different versions of the comparator were examined. All have NMOS input pairs 

since PMOS transistors would have to be very large to allow a 0.6 to 1.6 V input range. 

1. PMOS pull-up, inverters, NAND-based SR latch 

2. current mirrors, NMOS pull-down, NAND-based SR latch 

3. PMOS pull-up, NOR-based SR latch 

NMOS pull-down (for setting or resetting the SR-latch) provides faster response time than 

PMOS pull-up. Inverters at the output for the regenerative flip-flops add delay but at the 

same time buffer the cross-coupled latch’s output. Finally, the NAND-based SR latch is 

faster than NOR-based equivalent because the NOR version has PMOS transistors in series, 

while in the NAND version, the NMOS transistors are stacked; also, the NOR causes low 

output crossing point, which is not useful when using NMOS current switches in the DAC 

(see Section 3.1). 

Version 2 of the comparator proved to perform the best. A useful side effect of mirroring the 

current from the differencing circuit before injecting it into the regeneration stage is that 

there is less switching induced noise injected into reference ladder. 

2.2.2 Comparator circuit 

The comparator circuit in Figure 11 works as follows: During reset, the switches M5a and 

M5b disconnect S and R of the SR latch from the sensing nodes (aa and bb). The inputs of the 

latch are pulled up to Vdd by M6a and M6b, causing the latch to keep it’s state. M7 is closed, 

equalizing the sensing node voltages. A mismatch between the two differential input 

voltages causes an unequal amount of current to be injected into the sensing nodes. When 

switch M7 is released, the first regeneration phase starts, and the small current imbalance 

will cause the cross coupled transistors M3a and M3b to pull down one of the sensing nodes. 

Then M5a and M5b are opened, and either SorR is pulled to ground, switching the state of 

13


the SR-latch. The comparator can then be reset again without disturbing it’s output state. 

The two non-overlapping clock phases controlling the comparator are shown in Figure 10. 

Making the first regeneration phase longer speeds up the second phase as the second phase 

starts with larger difference voltage. However, since the first regeneration phase also adds 

to the total comparator response time, making it too long will actually slow down 

comparator response. A value of 200ps has been chosen.- 

Table 3: Transistor aspect ratios and fingering for the comparator circuit (Figure 11) 

Transistor W (total) [µm] L [µm] Number of Fingers a 

M0a, M0b, M0c, M0d 1.5 1 2 

M1a, M1b 6 2.5 4 

M2a, M2b, M2c, M2d 3.6 0.18 4 

M3a, M3b 3 0.18 2 

M4a, M4b 1 0.18 2 

M5a, M5b 1 0.18 2 

M6a, M6b 0.24 0.18 1 

M7 0.5 0.18 1 

M8a, M8b 2 0.18 1 

M9a, M9b 2.5 0.18 1 

M10a, M10b 3 0.18 1 

M11a, M11b 0.24 0.18 1 

a. number of fingers per transistor in layout of the comparator cell 

PHI1 

PHI2 

Reset 

100 ps 

200 ps 

5ns 

Figure 10: 

Comparator Timing 

The complete comparator circuit it given in Figure 11. A bias current of 10 µA has been 

chosen. Trade-offs exist for the switch sizing: making M5a and M5b large helps pulling 

down the active branch quickly, but increases the glitch size when switching on. The size of 

the transistors has to be kept small enough to prevent the glitches from feeding through the 

14


Vdd 

M2a 

M2b 

M2c 

M2d 

phi1 

M6a M4a 

M4b M6b 

phi1 

M10a M11a 

M11b M10b 

S 

a 

b 

bb 

aa 

M5a 

phi1 

M5b 

R 

Q 

Q 

vin1 

M0a 

M0b 

vref1 

vin2 

M0c 

M0d 

vref2 

phi2 

M9a 

M9b 

M7 

vbias 

M1a 

M1b 

M3a 

M3b 

M8a 

M8b 

Figure 11: 

Circuit schematic of the fully differential comparator 

SR-latch to the comparator output. The size of the resetting switch influences the required 

resetting time. Since the clocking scheme of the pipeline (Figure 5) allows a long reset phase, 

this switch can be kept small, reducing parasitic capacitance on the regeneration nodes, and 

increasing the gain during the sensing phase (end of reset) because the current difference 

flowing through M7 causes a larger voltage imbalance of the sensing nodes. 

2.2.3 Comparator Layout 

The complete comparator layout is given in Appendix A. The silicon area is approximately 

450 µm 2 . To reduce parasitic capacitances of interconnects, minimum width metal lines have 

been used. Also, care has been taken to balance the parasitic caps of the sensing nodes. 

Unequal sensing node capacitance could lead to increased comparator offset. 

2.3 Performance verification 

The extracted netlist of the comparator has been used to compare pre- and post-layout 

performance. The number of fingers in the comparator schematic is made equal to the 

effective number of fingers in the layout, so that the effect due to interconnect parasitics can 

be distinguished from effects due to different transistor fingering, which has a large impact 

on the source and drain junction capacitances. 

The three investigated simulation corners are given in Table 1. For the simulations, each 

comparator output has been loaded with 50 fF to simulate the input capacitance of the DAC 

current switches and the thermometric-to-binary encoder. 

Table 4: Simulation Corners for Performance Verification 

Worst Case Typical Case Best Case 

Technology Corner SS TT FF 

Temperature 125 C 27 C -25 C 

Vdd -10% nominal +10% 

Bias current -10% nominal +10% 

15


2.3.1 Comparator Performance 

Figure 12 to Figure 16 give an overview of the obtained simulation results. The comparator 

offset plots reveal that a 1 ns reset time is too short, especially when layout parasitics are 

taken into account. Making the comparator reset as long as possible will fix the hysteresis 

problem. Figure 15 shows that already an extension to 1.5 ns almost completely removes the 

hysteresis for typical simulation conditions. Figure 16 shows that typical comparator 

response time is below 600 ps, even for input differences of only a few millivolts. In the 

worst case (Figure 17), the comparator does not respond within the 5ns clock period. Such 

a condition will lead to bit errors in the Flash output. 

The mean comparator power consumption under typical conditions is about 140 µW. 

15 

10 

1.0 V 

1.05 V 

1.1 V 

1.15 V 

1.2 V 

Schematic, Typical case 

20 

15 

1.0 V 

1.05 V 

1.1 V 

1.15 V 

1.2 V 

Post−layout, Typical case 

10 

5 

5 

Offset [mV] 

0 

Offset [mV] 

0 

−5 

−5 

−10 

−10 

−15 

−15 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 

Thermometric code 

−20 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 


Figure 12: 

Comparator Offsets in Typical Case (1 ns Reset Time) 

30 

20 

1.0 V 

1.05 V 

1.1 V 

1.15 V 

1.2 V 

Schematic, Worst case 

50 

40 

30 

Post−layout, Worst case 

Offset [mV] 

10 

0 

Offset [mV] 

20 

10 

0 

−10 

1.0 V 

1.05 V 

1.1 V 

1.15 V 

1.2 V 

−10 

−20 

−20 

−30 

−40 

−30 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 


−50 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 


Figure 13: 

Comparator Offsets in Worst Case (1 ns Reset Time) 

16


6 

4 

1.0 V 

1.05 V 

1.1 V 

1.15 V 

1.2 V 

Schematic, Best case 

2 

1.5 

Post−layout, Best case 

1 

2 

Offset [mV] 

0 

Offset [mV] 

0.5 

0 

1.1 V 

−2 

−0.5 

−4 

−1 

−6 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 


−1.5 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 


Figure 14: 

Comparator Offsets in Best Case (1 ns Reset Time) 

15 

10 

1.0 V 

1.05 V 

1.1 V 

1.15 V 

1.2 V 

Post−layout, Typical case, 1.5 ns reset 

25 

20 

15 

Post−layout, Worst case, 1.5 ns reset 

5 

10 

5 

Offset [mV] 

0 

Offset [mV] 

0 

−5 

−5 

−10 

−10 

−15 

−20 

1.0 V 

1.05 V 

1.1 V 

1.15 V 

1.2 V 

−15 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 


−25 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 


Figure 15: 

Comparator Offset Improvement for extended reset time of 1.5 ns 

2.3.2 Flash ADC Performance 

Mismatch in the resistor ladder has been modeled by using normally distributed resistor 

values with mean 500 Ω and 5% standard deviation (σ). The resistor matching report 

indicates that N+Poly resistor values will have a σ of less than 1%. We used 5% to 

compensate for the fact that this model assumes that the values of adjacent resistors are 

uncorrelated, which is certainly not true on silicon. 

Resistor ladder mismatch (σ = 5%) has not been included in the INL and DNL plots, as it is 

dominating the INL and DNL due to the comparators. Its effect on INL and DNL is shown 

as dashed lines in Figure 18. 

To construct the DNL and INL plots of the 4-bit Flash the mean of the rising and falling 

offset with 1 ns reset time has been used as effective offset. The hysteresis is assumed to 

introduce a constant difference between rising and falling threshold. The mean value of the 

17


650 

Comparator response time (Typical Case) 

−0.875 mV 

0 mV 

0.875 mV 

1200 

1150 

Comparator response time (Worst Case) 

−0.875 mV 

0 mV 

0.875 mV 

1100 

600 

1050 

Response time [ps] 

Response time [ps] 

1000 

950 

900 

550 

850 

800 

750 

500 

−100 −50 0 50 100 

∆in [mV] 

700 

−100 −50 0 50 100 

∆in [mV] 

Figure 16: 

Comparator Response Time for post-layout simulations (1.5 ns Reset Time) 

two offsets represents the comparator offset for a long enough reset phase). The input 

common mode voltage has been set to 1.1V. 

8 x 10−3 Flash DNL (schematic simulation) 

0.5 x 10−3 Flash INL (schematic simulation) 

6 

0 

4 

2 

−0.5 

Typical case 

Worst case 

Best case 

DNL [LSB] 

0 

INL [LSB] 

−1 

−2 

−1.5 

−4 

−6 

Typical case 

Worst case 

Best case 

−2 

−8 

2 3 4 5 6 7 8 9 10 11 12 13 14 15 


−2.5 

2 3 4 5 6 7 8 9 10 11 12 13 14 15 


Figure 17: 

4-bit Flash DNL and INL (schematic simulations) 

Figure 19 shows a Signal-to-Noise Ratio (SNR) plot for the 4-bit Flash. It has been obtained 

by simulating the Flash converter with the extracted comparator netlist. An 11 MHz 

sinusoidal input signal is sampled at 200 MHz. The dashed line indicates the maximum 

SNR that can be obtained by a 4-bit ADC. This limit is imposed by the quantization noise. 

The simulated SNR should approximately stay constant for input signal frequencies up to 

the Nyquist rate. Figure 19 shows a significantly lower SNR already for frequencies well 

below the Nyquist rate. This is probably due to the very small number of ADC output 

samples (91) that have been used to calculate the SNR. A small number of samples has been 

taken because of long simulation times. 

18


0.06 

0.05 

0.04 

Flash DNL (post layout simulation) 

Typcial case 

Worst case 

Best case 


0.05 

0.04 

0.03 

Flash INL (post layout simulation) 

0.03 

0.02 

0.02 

0.01 

DNL [LSB] 

0.01 

INL [LSB] 

0 

0 

−0.01 

−0.02 

−0.01 

−0.02 

−0.03 

Typcial case 

Worst case 

Best case 


−0.03 

−0.04 

−0.04 

2 3 4 5 6 7 8 9 10 11 12 13 14 15 


−0.05 

2 3 4 5 6 7 8 9 10 11 12 13 14 15 


Figure 18: 

4-bit Flash DNL and INL (post-layout simulation) 

To find the maximum sampling frequency of the 4-bit DAC, the sampling frequency has to 

be swept for a fixed input frequency (which has to be lower than half the lowest sampling 

frequency tested). The sampling frequency at which the SNR decreases by 3 dB can be 

regarded as the converter’s maximum sampling speed. Because the comparator response 

time (Figure 16) and reset time sum up to almost 2 ns, this maximum frequency is expected 

to be around 500 MHz. In the pipeline, however, only the response time is critical, as the 

comparators can be reset while the DAC and the Residue Amplifier are working (Figure 5). 

For a more complete characterization of the converter, SNR as a function of input signal 

amplitude (at constant input and sampling frequencies) does also have to be simulated. 

These simulations have not yet been done. 

26 

SNR of Flash ADC 

25 

SNR of ADC 

Quantization Noise Limit 

24 

SNR [dB] 

23 

22 

21 

20 

19 

11 22 33 44 55 66 77 88 99 

Input frequency [MHz] 

Figure 19: 

4-bit Flash SNR for Typical case (post-layout) 

19


3 4-bit Digital-to-Analog Converter 

Because there are no accurate capacitors available, the DAC is not implemented as an 

MDAC as in [4], but as a current-steering DAC. Using a current steering DAC will greatly 

increase the power consumption of the converter compared to an implementation using a 

capacitive MDAC (which has no static current consumption). 

Matching of unit-current cells is very critical because DAC linearity directly depends on the 

matching of the unit currents. Special care has to be taken when layout out the current cells 

(see Section 3.1.2). Since the gain is controlled by the absolute values of the unit current and 

the load resistors, a calibration feedback is absolutely needed. 

Using non-weighted current cells simplifies design as the flash output can directly be used 

for controlling the current cells. Depending on the current cell, a simple deglitching circuit 

may have to be employed: when switching the current between the output branches 

(Figure 20), there must always be a path for the current drawn by the current source 

transistor. If the current path is blocked, the transistor will leave saturation; reestablishing 

the current will then take some time, causing a glitch in the output voltage. 

The DAC output needs buffering because it must drive the large input capacitance of the 

Residue Amplifier. The buffer needs precise gain of 1 and large output swing. Since the 

OTA(s) in the buffer will be used in unit gain configuration, the OTA(s) will also need large 

input signal swing. 

A resistor ladder DAC was also tested, but then rejected because of insufficient resistor 

matching accuracy, and because it would have needed to introduce an encoder into the 

analog signal path, thus increasing delay. 

3.1 Current-steering D/A Converter 

The schematic of the current-steering DAC is given in Figure 20. It’s output voltage is given 

by: 

v od = v out1 – v out2 = RI 0 ( 2n– 

N) 

v ------------------------------ v out1 + v out2 

oc 2 

V NI + 

= = – --------------------------- 

2RI 0 1 

dd 2 

(5) 

(6) 

where n is the thermometric output value of the Flash, R the load resistance, I 0 the unit 

current, and N = 15. I 1 is a DC current used to adjust the common mode level of the output 

voltage. R and I 0 have to be chosen to obtain the required gain. The values used here are R 

= 1250 Ω, I 0 = 50 µA, and I 1 = 185 µA. 

3.1.1 Continuous DAC Gain Calibration 

Continuous calibration only needs to regulate a DC level and can thus be slow. However, 

the slow feedback will also take a long time to recover from glitches injected on the reference 

node. 

Calibration is done using a unit current cell and dummy resistor identical to load resistors 

to set the nominal voltage of the vref control pin to Vdd-0.5LSB (1.7375 V). Due to voltage 

20


R dummy 

Unit Current Cells 

R 

R 

vout1 

vout2 

vref 

+ 

− 

1 0 

I0 

va 

000000000 

111111111 

Q1 

I0 

Q1 

Q2 

I0 

Q2 

0000000000 

1111111111 000000000 

111111111 

1100 

01 

Q15 

I0 

Q15 

I1 

I1 

Figure 20: 

Current-steering DAC with continuous gain calibration 

drops in the power rails and feedback amplifier offset, this voltage will have to be adjusted 

to obtain a gain of precisely 8. 

No high gain needed in the feedback amplifier since Vdd seen by dummy resistor is 

different from the external Vdd due to resistive voltage drops. Vref will have to be adjusted 

from the outside anyway, so static offset at amplifier input not a problem. Because the inputs 

of the amplifier are very close to Vdd, an NMOS input pair with folded cascode has been 

chosen, which does not need a diode connected transistor as load at drains of input 

transistors. Only one stage is used to simplify stabilizing the feedback loop: The current 

source gate node va has a large capacitive load (4-5pF), and thus has to be the dominant pole 

node. A two stage folded cascode opamp would provide higher gain but would be hard to 

compensate. 

M4 

M7a 

M7b 

M8a 

vcasc 

M8b 

vout 

Ibias 

vin 

M0a 

M0b 

vref 

M6a 

vcasc 

M6b 

M3 

M2 

M1 

M5a 

M5b 

Figure 21: 

Feedback amplifier for continuous DAC calibration 

21


A bias current of 10µA is used. The bias voltage vcasc can be generated using a simple MOS 

Table 5: DAC calibration feedback amplifier transistor sizes 

Transistor W (total) [µm] L [µm] 

M0a, M0b 10 0.5 

M1 6 0.5 

M2 1.5 0.5 

M3 1.5 0.5 

M4 2 0.18 

M5a, M5b 1 0.18 

M6a, M6b 2.5 0.5 

M7a, M7b 8 0.18 

M8a, M8b 10 0.5 

(diode-connected) voltage divider. Varying vcasc by 10% does not degrade performance of 

the calibration feedback loop. 

3.1.2 DAC Floorplan 

The current source transistor array can be laid out compactly because all transistors share 

the same source and gate node. The array is laid out such that transistors 1 through 16 all 

share the same geometry centroid (common centroid layout). 

The regulated cascode feedback and the current switch for each cell is put outside the 

matched capacitor array (Cells 1 to 15). This way the matched transistors can be put together 

very closely, improving matching. On the other hand, the digital inputs from the Flash ADC 

do not need to be routed across the analog parts of the DAC. Because the RGC feedback 

regulates the voltage at the source of the cascoding transistor, the voltage drop between the 

source of the cascode transistors and the drains of the current-source transistors should be 

matched, i.e. the routing collecting the current from the transistor array will have to be 

balanced. 

Once the exact size of the layout of one current cell (feedback and switch) is known, the 

floorplan may have to be adjusted. 

3.2 Unit current cell 

The unit current cell must have a minimum output resistance in the order of 100 MΩ, and 

provide fast switching without introducing large current glitches. The following sections 

discuss some switching current cells that have been examined. 

3.2.1 Fournier-Senn [15] Current cell 

The advantages of this cell topology is it’s speed and low minimum output voltage. Because 

the current switch transistors also serve as cascoding transistors only two transistors need 

22


digital 

inputs 

Cell 1 

Cell 2 

Cell 3 

Cell 4 

Cell 15 Cell 14 

Cell 13 Cell 12 

1 

2 3 4 5 6 7 8 

Load resistors 

2 1 4 

3 6 5 8 7 

9 10 11 12 13 14 15 16 

10 9 12 11 14 13 16 15 

15 16 13 14 11 12 9 10 

analog 

output 

16 15 14 13 12 11 10 9 

7 8 5 6 3 4 1 2 

8 7 6 5 4 3 2 1 

Cell 8 Cell 9 Cell 10 Cell 11 

digital 

inputs 

Ref. Cell 

Cell 7 

Cell 6 

Cell 5 

Figure 22: 

Floorplan of 4-bit current-steering DAC 

S 

S 

vb 

vb 

Q 

Q 

Q 

va 

Q 

Figure 23: Current-cell based on the configuration proposed in [15] 

to be stacked. The drawback is that this cell needs a deglitching circuit to generate the 

control signals from the Flash output. Also, the required 100 MΩ output resistance could not 

be reached with a simple cascoding of two transistors. 

23


3.2.2 Regulated Cascode Current Cell 

To improve the output resistance of the current cell we tried to include a local feedback in 

the Fournier-Senn current cell (Figure 23). This failed because the feedback circuit has to 

charge cascode transistor gate capacitance to turn on the switch, as the gate is always 

completely discharged for switching off. This makes the feedback slow and causes long a 

settling time of after switching the current. Also, the feedback aggravated the glitch 

problem of the current cell. 

out1 

out2 

Q 

M2a 

M2b 

Q 

vbias 

M2 

M4 

vref + 

− 

M1 

vin− 

M0a 

M0b 

vin+ 

vout 

va 

M0 

M1a 

M1b 

M3 

Figure 24: 

(a) Unit current cell 

Regulated cascode unit current cell 

(b) RGC feedback amplifier 

Next, the regulated cascode principle has been applied to the simple cascode current source 

with a stacked current switch (Figure 24). This circuit has finally been adopted although it 

is slower than Fournier-Senn current cell. Simulations show that output resistances of 100 

MW and more can easily achieved with this circuit. 

The circuit is slower for low output voltages because the RGC feedback needs to make a 

large output excursion to regulate current when output voltage is close to 0.6 V. 

3.3 Simulated Performance 

Figure 25 shows the simulated DNL and INL plots for the DAC using the RGC current 

source described in the precious section. The load on the DAC output nodes is 100 fF. 

Simulations show that the worst case settling time is 2.3 ns, and the largest glitch size at the 

output is 12 mV. 

4 Residue Amplifier and Sample-and-Hold 

Instead of making the Residue Amplifier and the interstage sample-and-hold circuit two 

separate circuits, we chose to combine them in a single switched capacitor circuit. The idea 

is thus to build a sample-and-hold circuit with a gain of 8 and a differencing circuit at its 

input. This approach allows subtraction of the input voltage from the DAC output and 

amplification of the resulting residue in one step without using more matched capacitors 

than would be needed for a differential sample-at-hold, which in any case needs four 

matched elements. 

Matching of the four capacitors in the Residue Amplifier is extremely critical for accurate 

interstage gain. Special layout techniques will have to be employed achieve sufficient 

24


12 x 10−4 DAC DNL 

10 

8 

Typical case 

Worst case 

Best case 

20 x DAC INL 

10−3 

Typical case 

Worst case 

Best case 

15 

INL [LSB] 

6 

4 

2 

DNL [LSB] 

10 

5 

0 

−2 

0 

−4 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 


−5 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 


Figure 25: 

DNL and INL of current-steering DAC 

matching. No gain calibration mechanism is implemented. A simple mechanism using 

capacitor banks would require too many external pins for the first prototype 

implementation of the converter stage. 

4.1 Switched Capacitor Residue Amplifier 

4.1.1 Circuit Topology 

R 

1b 

vbias 

v1 

1a 

00 11 

00 11 

01 

01 

vdac1 

2a 

01 

01 

C1a 

01 

01 

2b 

00 11 

00 11 

+ 

C2a 

− 

01 

01 

vout1 

v2 

1b 

01 

01 

2b 

− 

+ 

00 11 01 

vout2 

vdac2 

2a 

C1b 

1a 

00 11 

C2b 

01 

vbias 

R 

Figure 26: 

Topology of Switched-Capacitor Residue Amplifier and Sample-and-Hold Circuit 

The circuit topology in Figure 26 has been chosen because it allows holding the output 

voltage while sampling the first differential voltage. This is important because the output of 

25


the Residue Amplifier is connected to the analog input of the next pipeline stage, while one 

of the inputs of the Residue Amplifier has to sample the output value of the previous stage. 

It it thus necessary to sample the input voltage while holding the output voltage constant 

for the next stage. 

The differential and common-mode output voltages under ideal conditions (no charge 

injection) are given in Equations (7) and (8). v 0 represents the analog ground voltage used 

while sampling v 1 and v 2 (vbias in Figure 26), and v 0 ’ represents the virtual analog ground 

voltages imposed by the negative feedback around the OTA. Ideally v0 and v0’ should be 

equal. Two separate voltages have been introduced for the derivation of Equations (7) and 

(8) in order to study the effect of mismatch between vbias and the OTA output common 

mode voltage. 

v od = v out1 – v out2 = 

C 1a 

C 

-------- 1b 

( v 

C 1 – v dac1 + v 0 ′ – v 0 )– 

-------- ( v 

2a 

C 2 – v dac2 + v 0 ′ – v 0 ) 

2b 

(7) 

v 

v out1 + v out2 C 1a 

C 

oc ------------------------------ v 

2 

0 ′ ----------- 1b 

= = + ( v 

2C 1 – v dac1 + v 0 ′ – v 0 ) + ----------- ( v 

2a 

2C 2 – v dac2 + v 0 ′ – v 0 ) 

2b 

(8) 

If capacitors C1a and C1b, and C2a and C2b are perfectly matched, Equations (7) and (8) 

simplify to the expressions given by (9) and (10). It is interesting to note that under these 

conditions, a mismatch between v 0 and v 0 ’ only has an influence on the common-mode 

output signal only. The differential output voltage is unaffected. This means that the 

common mode feedback of the OTA (see Section 4.2) doesn’t not need to align the output 

common mode to analog ground with very high precision. This greatly simplifies the 

high-speed common-mode feedback design, as high gain is not required. 

v od = v out1 – v out2 = 

C 

----- 1 

( v 

C 1 – v dac1 – v 2 + v dac2 ) 

2 

(9) 

v 

v out1 + v out2 C 1 

oc ------------------------------ v 

2 

0 ′ 

C ----- v1 + v v 

---------------- 2 dac1 + v 

= = + ⎛ – ------------------------------- dac2 

+ v 

2 2 2 

0 ′ – v ⎞ 

⎝ 

0 ⎠ 

(10) 

The circuit in Figure 26 uses a conventional differencing circuit to form the difference of the 

two differential input voltages. As seen in (10), this circuit does not reject the differential 

common-mode input voltage (i.e. the difference between the common-modes of the two 

differential inputs). A configuration based on the differencing circuit proposed in [14] 

would provide better common-mode rejection but require more switches and clock phases. 

Because the common modes of the input signals are approximately aligned, and the 

following stage (Flash ADC) provides some common mode-rejection, common-mode 

rejection is not critical in the Residue Amplifier. The simple conventional circuit is thus 

used. The clocking scheme for this circuit is included in Figure 5. 

4.1.2 Charge Injection of MOS Switches 

Equations (7) and (8) are derived using charge conservation in the sampling capacitors. The 

switches are assumed not to absorb or release any charge when turned on or off. For 

switches implemented with MOS transistors, however, this assumption is not true. The 

charge forming the transistor’s channel when in on-state is released when to source, drain 

and bulk (substrate) node of the transistor, potentially disturbing the voltage levels on the 

sampling capacitors. How the charge is distributed between these nodes depends on the 

26


impedance of these nodes seen by the transistor, the rise or fall time of the gate voltage, and 

on the source and drain potential of the switch. Charge injection is thus difficult to predict, 

and because it is signal dependent, it is difficult to cancel its effects even using fully 

differential circuits. 

Note that high speed (small time constant C/g) entails large amounts of injected charge 

because the amount of injected charge and the switch on-resistance are linked (Equation 

(11)). 

µQ 

g on = –------- 

L 2 

(11) 

Note that ideally, the switches should only introduce common mode charge injection errors 

which will be rejected by the input stage of the comparators of the next pipeline stage. 

However, because the amount of injected charge is signal dependent, differential errors will 

still occur. 

Techniques for charge injection cancellation: 

• Use shorted dummy transistors on high impedance node side to absorb the injected 

charge. The dummy transistor has to be turned on when the switch transistor is 

turned off. 

• Use bootstrapped switches [8]: this makes the injected charge almost independent of 

signal level. Thus, if the circuit topology is such that all injected charge results in a 

common mode error, this may allowing at least partial error cancellation; however, 

bootstrapped switches are complex to implement 

• Choose large capacitors to reduce effect of injected charge on voltage. However, to 

keep the speed of the circuit constant, this also requires scaling of the switch 

transistors, resulting in more injected charge. 

• Bottom plate sampling (series sampling): sample against a constant potential to 

reduce signal dependency of error [8]. 

4.1.3 Capacitor and Switch Sizing 

From Equation 9 it can be seen that the ratio of C 1 /C 2 needs to be 8 for a gain of 8. The 

capacitors should be large enough to allow precise matching of C1 and C2, and small 

enough to keep the Residue Amplifier settling time short enough for 200 MHz sampling 

rate (ideally less than 2 ns only). A value of 100 fF has been chosen for C 1 , requiring in C 2 = 

800 fF. 

CMOS switches are used to implement all switches in Figure 26. This allows the switches to 

work properly over the whole signal range. The gate overdrive of the switch transistors in 

on-state is small due to the relatively low Vdd of 1.8 V. This results in large on-resistance of 

the switches, which in turn requires the switch transistors to be large. Large transistors, 

however, will aggravate charge injection errors and increase the parasitic capacitances in the 

switched-capacitor circuit. Bootstrapped switches [8] could ease this problem but add 

complexity. The idea of bootstrapped switches is to boost the gate voltage of the switch 

transistor so its gate overdrive is always Vdd-V th when in on-state. This reduces the signal 

dependency of on-resistance and injected charge, and allows use of smaller switch 

transistors for a given required on-resistance (the amount of injected charge is not reduced 

by the bootstrapping technique, as can be seen from (1)). 

27


The effects of different capacitor and switch sizes will have to be investigated further, as 

these parameters are very critical for Residue Amplifier speed and accuracy. 

4.2 Differential OTA 

Switched capacitor residue amplifier needs fully differential Opamp with high GBW and 

slew rate. A class AB topology is thus used instead of instead of class A amplifier. However, 

the conventional class AB output stage push-pull source followers do not work because of 

the low supply voltage. Since the nominal output range of the residue amplifier is limited 

to half the 1 V peak-to-peak range (due to bit overlapping), the differential OTA does not 

necessarily need a rail-to-rail output stage, but can use cascodes, allowing high gain and 

large GBW with few stages. However, especially PMOS cascodes are critical, since they need 

to be large because there is only about 400 mV headroom to Vdd, and since the carrier 

mobility in PMOS transistors is much lower than in NMOS transistors. 

Single stage amplifier are an interesting choice because they can have only one high 

impedance node at the output. The load capacitance will then slow down the dominant 

pole, improving stability. In a “classical” two stage op-amp, the load capacitance affects the 

non-dominant pole that one wishes to push to high frequency. 

To speed up a switched-capacitor circuit one has to increase both G m and I slew (or decrease 

C, which is bad for noise). An OTA GBW requirement estimation formula is presented see 

[8]. The simple first order model used there predicts a required GBW of around 10GHz, 

which is not practical. With a more complex circuit topology, it is hoped to achieve fast 

settling even with a significantly lower GBW of only approximately 1 GHz. Because the 

OTA in the Residue Amplifier is used with a feedback factor f < 1, the phase margin required 

for stability is not the phase at the unit gain frequency, but at the frequency corresponding 

to a gain of 1/f. This simplifies achieving a sufficient phase margin. 

The slew rate requirement on the OTA is tightened by the fact that the output is reset in each 

cycle. The output voltage may have to change by up to half the output peak-to-peak value 

(here: ∆V max = 250mV). 

k ⋅ Vmax I C L 

SR 

= -------------------------- 

T S 

(12) 

The required slewing current can be estimated from (12), where I SR is the slewing current 

available, T s is the time available for settling, and V max the voltage swing. If a third of the 

settling time is used for slewing, k=3. 

4.2.1 Mirrored cascode with class AB input stage with preamplifier 

Simple Folded Cascode and Mirrored Cascode topologies have been evaluated, but found 

to either provide insufficient slewing current or a too small GBW. Finally, a topology using 

a class AB input stage [9] was adopted since the required voltage swing at the OTA input is 

small. 

A low-gain high-speed input amplifier (Figure 29) is used as a preamplifier to the OTA to 

take advantage of AB input stage. The input signal has very limited swing, which permits 

28


vin+ 

vin− 

Residue Amplifier OTA 

+ 

− 

− 

pre− 

Amplifier 

+ 

+ 

− 

− 

class−AB 

OTA 

cmfb 

+ 

Common−mode 

feedback amplifier 

in2 

vin1 

vout+ 

vout− 

Figure 27: 

Overall structure of the Residue Amplifier differential OTA 

M7a 

M3a 

M3b 

M7b 

M6a 

vout+ 

M5a 

M4a 

vp 

M11b 

vn 

M9a 

vbias2 

vbias2 

M10a 

M10b 

vp 

vin+ 

vin− 

M11b 

M1a 

M0a 

M0b 

M1b 

vn 

cmfb 

M8a 

M8b 

vbias1 vbias1 

M2a 

M2b 

M9b 

M6b 

vout− 

M5b 

M4b 

Figure 28: 

Low-Voltage differential class-AB mirrored cascode OTA 

vout− 

M1a 

M1d 

M1c 

M1b 

vout+ 

M4 

M2a 

M2b 

Rc 

Rc 

Cc 

Cc 

vcmref 

vin+ 

M0a 

M0b 

vin− 

vin2 

M0a 

M0b 

M0c 

M0d 

vin1 

vout 

M3 

vbias 

M1a 

M1b 

Figure 29: 

(a) Preamplifier 

(c) Common-mode feedback amplifier 

Preamplifier and common-mode feedback amplifier for differential class-AB OTA 

not folding the amplifier, but also doesn’t fully unbalance the input pair to take advantage 

of the class AB structure. 

Figure 29 (b) shows the implementation of the common-mode feedback amplifier. 

29


5 Top-Level Floorplanning 

Since the design is only a first prototype, no self-contained biasing circuitry has been 

designed. An external pin is used to adjust the bias current levels, another pin is used to 

observe the current. Since all bias currents used are multiples of 10uA, they can easily be 

derived from the externally controlled reference current. 

5.1 Analog Pipeline Stage Floorplan 

Figure 30 shows the placement of the different blocks in the analog pipeline stage. Note that 

the encoder is not in the analog signal path, but has been put between Flash ADC and DAC 

to minimize routing. 

analog 

input 

ref. input 

1 

DAC Output Buffer 

Encoder 

analog 

output 

Flash ADC 

Clock 

binary 

output 

Current−steering DAC 

SC Residue Amplifier 

Clock 

Figure 30: 

Floorplan of analog pipeline stage 

5.2 Floorplan of complete pipeline 

Figure 31 shows the floorplan of the complete converter pipeline. The structure from [4] has 

been preserved. Note that analog and digital I/O’s lie on opposite sides of the block. 

30


ANALOG I/O 

Analog Pipeline Stage 1 

Analog Pipeline Stage 2 

Analog Pipeline Stage 3 Analog Stage 4 




Flash ADC 

Encoder 


Flash ADC 

Encoder 


Flash ADC 

Encoder 


Flash ADC 

Encoder 




POWER and CLOCK 

Digital Error Correction Block 


Clock distribution and buffering 


Digital Error 

Correction Block 

POWER and CLOCK 

DIGITAL I/O 

Figure 31: 

Overall floorplan of the pipelined ADC 

31


6 Conclusions 

In the limited time available for this project only a part of the ADC could be redesigned. 

Circuit design of the 4-bit ADC and DAC has been finished, and floorplans for the layout of 

these blocks have been elaborated. The layout of the comparator used in the Flash has been 

completed and post-layout simulations have been carried out to verify the design. 

A switched-capacitor topology for the Residue Amplifier has been chosen, but the required 

OTA is still in the design phase. The correct switch and capacitor sizes also yet have to be 

found. First simulation results on the Residue Amplifier circuit indicate that the sampling 

speed specification may have to be relaxed. 

For the prototype of the pipeline stage, at least the Flash and ADC converter will be 

finished. If a Residue Amplifier is to be included, a DAC output buffer has to be designed 

as well. 

As only one stage will be implemented, only the thermometric-to-binary encoder will have 

to be included, the error correction using at least two pipeline stages. 

The full design of a high-speed pipelined differential ADC clearly exceeded the amount of 

work that could be done in only 4 months time. Especially because a large portion was 

needed to learn how to use the software tools. Many important aspects of a thorough design 

have not been addressed. For example, no noise analysis for the different circuit blocks has 

been carried out. 

Lausanne, February 20, 2004 

Thomas Liechti 

32


References 

1 Maloberti F. Analog Design for CMOS VLSI Systems. Kluwer Academic Publishers, 2001. 

2 Carvajal R.G., Galan J., Ramirez-Angulo J., Torralba A. “New low-power low-voltage 

differential class-AB OTA for SC circuits”. Proceedings of the 2003 International Symposium 

on Circuits and Systems, vol. 1, 2003. 

3 Mallya S.M., Nevin J.H. “Design Procedures for a Fully Differential Folded-Cascode 

CMOS Operational Amplifier”. IEEE Journal of Solid-state Circuits, vol. 24, no. 6, December 

1989. 

4 Toprak Z., Design and Realization of a High-Speed 12-bit Pipelined Analog/Digital Converter 

Block, Master Thesis, Sabanci University, 2001 

5 Van de Plassche R., Integrated Analog-to-Digital and Digital-to-Analog Converters, Kluwer 

Academic Publishers, 1994. 

6 Razavi B., Wooley A. “A 12-b 5-MSample/s Two-Step CMOS A/D Converter”, IEEE 

Journal of Solid-state Circuits, vol. 27, no. 12, December 1992. 

7 Yotsuyanagi M., Etoh T., Hirata K. “A 10-b 50-MHz Pipelined CMOS A/D Converter with 

S/H”, IEEE Journal of Solid-state Circuits, vol. 28, no. 3, March 1993. 

8 Waltari M. E., Halonen A. I. Circuit Techniques for Low-Voltage and High-Speed A/D 

Converters. Kluwer Academic Publishers, 2002. 

9 Elwan H., Gao W., Sadkowski R., Ismail M. “CMOS low-voltage class-AB operational 

trasconductance amplifier”. Electronics Letters, vol. 36, no. 17, August 2000. 

10 Walden R. H., “Analog-to-Digital Converter Survey and Analysis”, IEEE Journal on 

Selected Areas in Communications, vol. 17, no. 4, April 1999. 

11 Yin G. M., Op’t Eynde F., Sansen W., “A High-Speed CMOS Comparator with 8-b 

Resolution”, IEEE Journal of Solid-state Circuits, vol. 27, no. 2, February 1992. 

12 Uyttenhove K., Steyaert M. S. J., “A 1.8-V 6-Bit 1.3-GHz Flash ADC in 0.25-µm CMOS”, 

IEEE Journal of Solid-state Circuits, vol. 38, no. 7, July 2003. 

13 Shih T., Der L., Lewis S. H., Hurst P.J., “A Fully Differential Comparator Using a 

Switched-Capacitor Differencing Circuit with Common-Mode Rejection”, IEEE Journal of 

Solid-state Circuits, vol. 32, no. 2, February 1997. 

14 Der L., Lewis S. H., Hurst P. J., “A Switched-Capacitor Differencing Circuit with 

Common-Mode Rejection for Fully Differential Comparators”, Proceedings of the 36th 

Midwest Symposium on Circuits and Systems, August 1993. 

15 Fournier J. M., Senn P. “A 130-Mhz 8-b CMOS Video DAC for HDTV Applications”, IEEE 

Journal of Solid-state Circuits, vol. 26, no. 7, July 1991. 

33

Design of a High-Speed 12-bit Differential Pipelined A/D Converter

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?