18.08.2013 Views

DDR4 Design Considerations - EEWeb


<strong>EEWeb</strong> PULSE TECH ARTICLE<br />


Continued from Part 1<br />

5. SYSTEM OVERVIEW<br />

After choosing the TDC structure, we started working<br />

on our design implementation following a top-down<br />

approach. From our point of view, and due to the large<br />

degree of freedom available for this design, this was the<br />

most logical starting point.<br />

5.1. Matrix structure<br />

Choosing an appropriate matrix structure was the first<br />

design issue we had to face. As already mentioned, there<br />

are two main considerations to take into account:<br />

• Dummy structure minimization within the matrix,<br />

since they consume power and contribute to the overall<br />

design area.<br />

• Delay stage loading, since a homogeneous load for<br />

every delay stage will contribute to an easier design and<br />

better resolution controllability.<br />

As discussed before, a bigger matrix yields a larger<br />

number of dummy structures but a more homogeneous<br />

capacitive load for both X and Y delay chains, as the<br />

row-column ratio approaches 1 and the matrix becomes<br />

square, making resolution easier to set. A smaller matrix<br />

will have the opposite effect, thus yielding a smaller<br />

number of dummy structures but making resolution harder<br />

to set.<br />

Keeping these two design parameters in mind, we<br />

derived the mathematical expressions for calculating<br />

them. Afterwards, we built the following table which<br />

summarizes all the possible solutions for our 32 stage 2-D<br />

Vernier TDC design, allowing us to analyze the problem<br />

and find the best solution:<br />

From this table we can conclude the following:<br />

Figure 1: 2-D Vernier matrix figures of merit<br />

To achieve a square matrix, and therefore a homogeneous<br />

load for every delay element, we need a 17×17 matrix.<br />

However, this structure will yield an unacceptable number<br />

of dummy structures (88.93% of the matrix will consist<br />

of dummy structures). Hence this matrix layout was<br />

discarded.<br />

There is an interesting set of solutions, yielding a minimum<br />

number of columns (10 columns for 5, 6 and 7 rows).<br />

Naturally, the number of dummy structures increases<br />

with the number of rows. However, we finally chose the<br />

7×10 matrix configuration, as shown in Figure 2, since the<br />

dummy increment is not very large and its row-column<br />

ratio is the closest to 100% among those three configurations.<br />
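The figures of merit discussed above can be reproduced with a short sketch. It assumes that every matrix cell not occupied by one of the 32 stages is a dummy structure, which is consistent with the 88.93% quoted for the 17×17 case. The function names are illustrative and are not part of the original design flow.

```python
STAGES = 32  # number of TDC stages stated in the text


def dummy_fraction(rows: int, cols: int, stages: int = STAGES) -> float:
    """Fraction of matrix cells that are dummies (assumed: all unused cells)."""
    cells = rows * cols
    if cells < stages:
        raise ValueError("matrix too small for the requested number of stages")
    return (cells - stages) / cells


def row_column_ratio(rows: int, cols: int) -> float:
    """Rows divided by columns; 1.0 corresponds to a square matrix."""
    return rows / cols


if __name__ == "__main__":
    # Candidate layouts mentioned in the text
    for rows, cols in [(5, 10), (6, 10), (7, 10), (17, 17)]:
        print(f"{rows}x{cols}: dummies {dummy_fraction(rows, cols):.2%}, "
              f"row-column ratio {row_column_ratio(rows, cols):.0%}")
```

Under this assumption, the three 10-column layouts trade a higher dummy fraction for a row-column ratio closer to 100%, which is the compromise described for the 7×10 choice.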

Figure 2: 7x10 TDC matrix structure (axes: Delay x and Delay y; matrix cells numbered 1 to 32)<br />

5.2. Delay chains<br />

As seen in the previous chapters, the Vernier delay line<br />

architecture uses two different delay chains. While in the<br />

linear Vernier delay line architecture these chains only<br />

differ in the nominal propagation delay of each element<br />

(i.e. the propagation delay measured when the other<br />

input signal is tied to ground), 2-D Vernier delay chains<br />

also differ in the number of elements they are made of.<br />

For instance, a 32 stage linear Vernier TDC would need<br />

two delay chains of 32 delay elements each, while our<br />

2-D Vernier TDC would need only one chain of 10 delay<br />

elements (X delay chain) and another one of 7 delay<br />

elements (Y delay chain).<br />

For the delay stages within the chains, we decided to<br />

use non-inverting buffer structures as the main building<br />

blocks. These structures yield a worse propagation<br />

time than a single inverter, but they provide the delay<br />

comparison and encoding stages with a very simple<br />

time information format. For this purpose, we created<br />

two different components within our library, called BUF_X<br />

and BUF_Y.<br />

Since the Y delay chain has to be faster than the X delay<br />

chain but its capacitive load per delay element is, by<br />

construction, larger than the X’s, it makes sense to initially<br />

increase the size of the BUF_Y transistors. However, since<br />

we are only dealing with low-to-high transitions, this can<br />

be achieved by just increasing the size of the pMOS transistor in<br />

the second inverter within the BUF_Y structure. For the<br />

BUF_X component, we initially set both the pMOS and<br />

nMOS transistors to minimum size.<br />

Besides setting the minimum size, we added an extra<br />

delay element at the end of both delay chains. This final<br />

delay element was left open (actually it is driving a 1GΩ<br />

resistor to avoid Cadence WARNING messages) and<br />

its only goal is to balance every capacitive load within<br />

the chain.<br />

We also included an extra input delay stage on both chains,<br />

called FIX_DELAY within our library. These structures<br />

were used to provide rise- and fall-time-independent<br />

signals to the delay chains during the first design tests.<br />

While FIX_DELAY elements remained unchanged through<br />

the design process, BUF_X and BUF_Y buffers were<br />

resized and optimized to achieve the desired resolution.<br />

5.3. Delay comparators<br />

The time difference between the START and STOP signals<br />

is measured using several memory elements<br />

which capture the moment when the START signal is<br />

surpassed by the STOP signal. Following this principle,<br />

a 32-bit pseudo-thermometer code is generated by<br />

the TDC, where the delay information is kept as the<br />

transition from 1 to 0. Finally, this code is passed to the<br />

5-bit encoding circuit.<br />
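As an illustration of the encoding step, the sketch below converts a 32-bit word into a 5-bit value by counting the 1s that precede the first 1-to-0 transition. The bit ordering, the handling of stray 1s after the first 0, and the treatment of an all-ones word as out of range are assumptions made for this example; the article does not describe the actual encoder circuit.

```python
from typing import Sequence

CODE_BITS = 32  # width of the comparator output word, from the text
OUT_BITS = 5    # width of the encoded result, from the text


def encode_thermometer(bits: Sequence[int]) -> int:
    """Return the number of leading 1s before the first 1-to-0 transition.

    Assumptions: bits[0] is the first comparator output, and any 1s that
    appear after the first 0 ("bubbles") are ignored.
    """
    if len(bits) != CODE_BITS:
        raise ValueError(f"expected {CODE_BITS} bits, got {len(bits)}")
    count = 0
    for bit in bits:
        if bit == 0:
            break
        count += 1
    if count >= 2 ** OUT_BITS:
        # All 32 bits are 1: no transition was captured.
        raise OverflowError("no 1-to-0 transition: input is out of range")
    return count


if __name__ == "__main__":
    clean = [1] * 16 + [0] * 16
    bubbled = [1] * 3 + [0, 1] + [0] * 27
    print(encode_thermometer(clean), encode_thermometer(bubbled))
```

Counting only up to the first 0 is one simple way to tolerate a non-ideal ("pseudo") thermometer code; other bubble-correction schemes are possible.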

Choosing among all the available memory elements for<br />

this task, we followed the recommendations given in<br />

[1] and used a NAND gate based S-R latch as the basic<br />

delay comparison element. The main advantage of this<br />

structure is its symmetry for both S and<br />

R signals, helping us to achieve a more homogeneous<br />

capacitive loading for both the X and Y delay chains.<br />

Besides, we also included an inverting buffer at each<br />

output, as recommended in [1], making this device less<br />

sensitive to output loading variations and protecting the<br />

design against unwanted non-linear behavior.<br />
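The behavior of a NAND-based S-R latch used as a delay comparator can be sketched at the logic level as follows. This is a textbook model of two cross-coupled NAND gates with active-low inputs S# and R#; it omits the output inverting buffers and all timing effects. Which chain drives which input is not stated in the article, so the `arbitrate` helper is purely hypothetical.

```python
from typing import Tuple


def nand(a: int, b: int) -> int:
    """Two-input NAND on 0/1 values."""
    return 0 if (a and b) else 1


def sr_latch_nand(s_n: int, r_n: int, q: int, q_n: int) -> Tuple[int, int]:
    """Settle a cross-coupled NAND latch with active-low inputs s_n, r_n."""
    for _ in range(4):  # a few passes are enough for the outputs to settle
        new_q = nand(s_n, q_n)
        new_q_n = nand(r_n, new_q)
        if (new_q, new_q_n) == (q, q_n):
            break
        q, q_n = new_q, new_q_n
    return q, q_n


def arbitrate(s_rises_first: bool) -> Tuple[int, int]:
    """Apply two low-to-high edges and report which one the latch captured."""
    q, q_n = sr_latch_nand(0, 0, 1, 1)        # both inputs low: Q = Q# = 1
    if s_rises_first:
        q, q_n = sr_latch_nand(1, 0, q, q_n)  # S# rises while R# is still low
    else:
        q, q_n = sr_latch_nand(0, 1, q, q_n)  # R# rises while S# is still low
    return sr_latch_nand(1, 1, q, q_n)        # both high: the latch holds


if __name__ == "__main__":
    print(arbitrate(True), arbitrate(False))
```

In this model the latch output records which of the two rising edges arrived first and holds that decision once both inputs are high, which is the property the delay comparison relies on.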


Figure 3: Delay comparator (gate level), with signals S#, R# and Q#<br />

Figure 4: S-R latch truth table<br />

Special care has to be taken when connecting the feedback<br />

and input signals to the NAND pull-down network due to<br />

data dependent delay. Indeed, the nominal propagation<br />

delay of each element within the chain can be affected<br />

by the S-R latch current state, introducing non-linear<br />

effects. In particular, for this TDC architecture, this effect<br />

becomes quite significant since there are several S-R<br />

latches connected to the same delay element output.<br />

Figure 5 shows the two possible configurations.<br />

Figure 5: Delay comparator (transistor level), showing Config 1 and Config 2 (labels: VDD, GND, S#, R#, FB1, FB2)<br />

We obtained some interesting results while testing both<br />

configurations, which are shown in Figure 6. In particular,<br />

this figure shows the propagation delay for each delay<br />

element within the X delay chain for both configurations.<br />

For this purpose, we used a 10 ps resolution configuration<br />

and a time delay of 160 ps between the START and STOP<br />
