
Xcell Journal: The authoritative journal for programmable ... - Xilinx



written in a programmable latency buffer to match the trigger latency. A number of data verification checks are carried out. The data is finally formatted into 64-bit words and written into FIFOs. From there, it is read out by the event builder on the board.

Without going into the details of the functionality, we will concentrate on the various issues we faced (and solved) in making the real hardware churn out correct data, with a focus on the use of RocketIO transceivers. Much of what we learned came through trial and error. The main issue was related to the reference clock, which we’ll describe in detail in the next section.

The other significant issue that we faced was the alignment of the K character within the 2-byte data path of the received data. We were initially using the Gigabit_Ethernet primitive in half-rate mode for a 2-byte data path. But we observed that not all of the channels were putting the K character in the same place within the 2-byte word, and there was no way to force this alignment in the Gigabit_Ethernet primitive (the ALIGN_COMMA_MSB parameter of this primitive is set to FALSE by default). Because our protocol expected the K to always appear on the least significant byte (LSB) of the word, we switched to the GT_CUSTOM primitive, where we could force the alignment and subsequently swap the position of K to the LSB of the data. The simulations showed perfect alignment, but in real hardware some of the channels were getting misaligned.

A colleague of ours referred us to the design note about 32-bit word comma alignment in the RocketIO transceiver user guide. Although this is usually needed only for a 4-byte data path, we implemented a similar scheme for our 2-byte data path and this fixed our misalignment problem.
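The idea behind such a realignment scheme can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the HDL used on the board: it assumes the low byte of each 16-bit word is the earlier byte on the link, that a 2-bit flag marks which byte is a K character (bit 0 for the low byte, bit 1 for the high byte), and that K28.5 (0xBC) is the comma. The function name `realign_words` is hypothetical.

```python
K28_5 = 0xBC  # comma character value; assumed here for illustration only


def realign_words(words):
    """Behavioural model of a 2-byte comma realigner.

    `words` is an iterable of (data16, charisk) pairs, where bit 0 of
    charisk flags a K character in the low byte and bit 1 flags one in
    the high byte. The low byte is assumed to be the earlier byte.

    Yields (data16, charisk) pairs in which a lone K character always
    sits in the low byte. In shifted mode the output lags the input by
    one byte, so the first shifted word contains a stale held byte.
    Words flagged as K on both bytes leave the mode unchanged.
    """
    shifted = False          # True once a K has been seen in the high byte
    held_byte, held_k = 0, 0  # high byte (and its K flag) of the previous word

    for data, charisk in words:
        lo, hi = data & 0xFF, (data >> 8) & 0xFF
        k_lo, k_hi = charisk & 1, (charisk >> 1) & 1

        # Decide the mode from where the K character was observed.
        if k_hi and not k_lo:
            shifted = True
        elif k_lo and not k_hi:
            shifted = False

        if shifted:
            # Previous high byte becomes the new low byte; the current
            # low byte becomes the new high byte.
            yield held_byte | (lo << 8), held_k | (k_lo << 1)
        else:
            yield data, charisk

        held_byte, held_k = hi, k_hi
```

Feeding the model a stream in which the comma arrives in the high byte, for example `[(0xBC11, 0b10), (0x2250, 0b00)]`, produces a second output word with `K28_5` in the low byte and the K flag on bit 0. A stream that is already aligned passes through unchanged.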

Clock, Programming, and JTAG

We cannot over-emphasize the need for a high-quality reference clock. Besides satisfying all of the criteria specified in the RocketIO user manual, we made sure that our reference clock was as clean as we could possibly get (see Figure 2).

We used a quartz-based phase-locked loop (QPLL) circuit developed at CERN for our system to provide the lowest-jitter clock source we could (100 ps peak-to-peak). We found that a lot of problems in the performance of the RocketIO devices could be traced to a noisy/jittery reference clock. If you are using RocketIO transceivers on both halves of the chip, then it’s much better to have two reference clocks. We believe this helps even if you are running the RocketIO transceivers in half-rate mode (which is our case).

Figure 2 – Clock jitter measurement

Another aspect of the clocking scheme that we used was to pass the reference clock through a global clock buffer after an input global differential clock buffer. We observed improved stability and, when inspecting the design with the FPGA editor, a more uniform distribution of the reference clock.

Also, though not directly related to the high-speed transceivers, we found that independent post-configuration DCM reset logic (usually recommended if you have an external feedback clock) is useful even when using internal feedback. This solved a problem we were having with the DCMs, which sometimes failed to lock after reconfiguration. Xilinx Technical Support helped us find the solution (Xilinx Answer Record 14425).
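One common way to build such reset logic is a shift register that is preloaded with ones at configuration and then shifts in zeros, so that the reset stays asserted for a fixed number of clock cycles. The model below is a minimal sketch of that idea only; it is not taken from the answer record and is not the logic used on the board. The function name `dcm_reset_pulse` and the `hold_cycles` value are hypothetical, and the required minimum reset width should be taken from the device data sheet.

```python
def dcm_reset_pulse(hold_cycles, total_cycles):
    """Return the reset level seen by the DCM on each clock cycle.

    Models a shift register whose bits are all preloaded to 1 at
    configuration and which shifts in 0 on every clock edge. The last
    bit drives the DCM reset, so the reset stays high for `hold_cycles`
    cycles and then stays low. The shift register is assumed to be
    clocked by a source that does not depend on the DCM output.
    """
    shift_reg = [1] * hold_cycles  # state right after configuration
    levels = []
    for _ in range(total_cycles):
        levels.append(shift_reg[-1])          # last stage drives reset
        shift_reg = [0] + shift_reg[:-1]      # shift in a zero
    return levels
```

For example, `dcm_reset_pulse(4, 8)` gives four cycles of asserted reset followed by four cycles of released reset.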

As for programming and JTAG, we used the same group of EPROMs to configure eight of the nine FPGAs. One of the FPGAs is the master and provides the clock for all the devices in the chain. The ninth FPGA has a different pinout and its own separate EPROM.

All circuits are connected in the same JTAG chain, which improved reprogramming time, mainly during the “test” stages. We found that a pull-up resistor is needed on the TDO output of each Xilinx device, something that we hope Xilinx will add in future devices. JTAG is also used to check the board interconnections after assembly.

Conclusion

In this article, we’ve shown the advantages of using embedded deserializers instead of discrete components on a large project. By using nine 456-pin FPGAs to do the same job as 105 TQFPs, we saved time in both the design and debugging phases. Plus, this is a flexible approach, as the FPGAs are reprogrammable, and a more economical solution in the long term.

We are currently considering migrating to a bigger Xilinx device as our processing requirements from the FPGAs increase. Therefore, we are studying the new devices available and how such a migration will affect our PCB design in terms of the routing of the high-speed lines.

We believe that by following the design rules concerning high-speed design, like clean clock distribution, power supply filtering, and good routing of the internal reference clocks, it is possible to obtain a successful design in good time. For more information, please write to us at jc.silva@cern.ch or adarsh.jain@cern.ch.

Winter 2004 Xcell Journal 91
