Automotive Innovators Hit High Gear in - Xilinx
Automotive Innovators Hit High Gear in - Xilinx
Automotive Innovators Hit High Gear in - Xilinx
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
then map or connect the clock-capable p<strong>in</strong>s<br />
to regional or I/O clock buffers.<br />
Normally, it is good to use these clockcapable<br />
p<strong>in</strong>s and regional buffers (BUFR)<br />
to map source-synchronous clock <strong>in</strong>puts.<br />
The regional buffer has a lower skew and<br />
can access three regions (one where the<br />
regional buffer is located, one above and<br />
one below). But for bank selection of<br />
source-synchronous data, we prefer to use a<br />
s<strong>in</strong>gle I/O bank. If we need additional I/O,<br />
it is better to use I/O banks for data signals<br />
that we’ve previously mapped to adjacent<br />
banks. (For package <strong>in</strong>formation, refer to<br />
ug195.pdf from the Xil<strong>in</strong>x Web site.)<br />
We followed several steps for the <strong>in</strong>itial<br />
floor plann<strong>in</strong>g of the design. First, we<br />
placed the system clock <strong>in</strong> the upper half<br />
and then placed the audio capture (optional)<br />
clock <strong>in</strong> the lower half. We locked the<br />
CMT of each half for the I/O bank 3/4<br />
requirements. This map ensures that each<br />
half is left with two PLL/DCMs (CMTs)<br />
that we can use for the PCI Express and<br />
Gigabit Ethernet MAC (SGMII) features.<br />
Because we mapped synchronous data<br />
to banks that consisted of regional clocks,<br />
we mapped 10 audio/video channel <strong>in</strong>puts<br />
on the rema<strong>in</strong><strong>in</strong>g I/O banks. Each video<br />
channel consisted of 20 data l<strong>in</strong>es, three<br />
control signals and video clock <strong>in</strong>puts.<br />
Meanwhile, each audio channel consisted<br />
of four data signals, three control signals<br />
and one audio clock signal. This made a<br />
total requirement of 32 signals with at least<br />
two clock-capable p<strong>in</strong>s (the FPGA’s 14<br />
banks can support 40 p<strong>in</strong>s and four clockcapable<br />
p<strong>in</strong>s).<br />
For this design, 10 A/V channels use 10<br />
I/O banks. We mapped the video clock and<br />
audio clock to clock-capable p<strong>in</strong>s to ensure<br />
we effectively used the regional and I/O<br />
clock buffers. Based on the PCB requirements,<br />
we selected for audio/video channels<br />
banks 5, 6, 13, 17, 18, 19, 20, 22 and 25.<br />
For DDR memory, the design supports<br />
a 32-bit data bus, 14 address l<strong>in</strong>es and a<br />
few control l<strong>in</strong>es. We needed 85 to 90 signals<br />
to map the DDR memory <strong>in</strong>terface.<br />
As per the PCB layout, we used I/O banks<br />
11, 23 and 15 to map all DDR I/O signals.<br />
S<strong>in</strong>ce DDR memory works on the system<br />
clock, we chose to map the read data strobe<br />
signal generated by the DDR to clockenabled<br />
I/O l<strong>in</strong>es.<br />
Xil<strong>in</strong>x offers the PCI Express and Gigabit<br />
Ethernet (GbE) MAC as hard macros. The<br />
Xil<strong>in</strong>x CORE Generator tool generates the<br />
proper IP core with the comb<strong>in</strong>ation of hard<br />
macro, block RAM and some advanced<br />
RTL logic to render the blocks usable. The<br />
tool also provides detailed constra<strong>in</strong>ts for<br />
p<strong>in</strong> mapp<strong>in</strong>g, the PLL/DCM and tim<strong>in</strong>g for<br />
a specific Xil<strong>in</strong>x FPGA. We advise us<strong>in</strong>g the<br />
recommended p<strong>in</strong> def<strong>in</strong>itions as described<br />
<strong>in</strong> the release notes or UCF file that CORE<br />
Generator creates for your design. Also, you<br />
can use Xil<strong>in</strong>x’s Plan-Ahead tool to confirm<br />
or cross-check any p<strong>in</strong> mapp<strong>in</strong>g you’ve<br />
def<strong>in</strong>ed manually.<br />
Core Generation and IP Integration<br />
The task of generat<strong>in</strong>g cores with CORE<br />
Generator and <strong>in</strong>tegrat<strong>in</strong>g <strong>in</strong>tellectual<br />
property can be tricky. Let’s exam<strong>in</strong>e some<br />
of the challenges <strong>in</strong>volved <strong>in</strong> generat<strong>in</strong>g<br />
and <strong>in</strong>tegrat<strong>in</strong>g the CMT, ASYNC FIFO,<br />
block RAM, PCI Express, GbE MAC and<br />
DSP48E blocks. (For more detailed <strong>in</strong>formation<br />
on the PCI Express and GbE MAC<br />
blocks, visit the Xil<strong>in</strong>x Web site to make<br />
sure you have the most recent version of<br />
CORE Generator and the latest IP.)<br />
The Virtex-5 supports various configurations<br />
of clock<strong>in</strong>g modules that you can<br />
generate with the CORE Generator utility.<br />
They <strong>in</strong>clude filter clock jitter PLLs, a PLL-<br />
DCM pair with filter clock jitters, a PLL-<br />
DCM pair or DCM for output dual-data<br />
rate (ODDR), a standard phase-shift clock<br />
DCM and dynamic clock-switch<strong>in</strong>g PLLs.<br />
To generate PLLs, you first need to see<br />
whether the <strong>in</strong>put is s<strong>in</strong>gle-ended or differential<br />
(<strong>in</strong> the example design, it is all s<strong>in</strong>gleended).<br />
Then you must determ<strong>in</strong>e whether<br />
clock jitter is appropriate (<strong>in</strong> our case, it was<br />
120 picoseconds) and whether you’ve used<br />
the global buffer to buffer all the outputs.<br />
Each PLL can generate up to six different<br />
frequency clocks. In our case, the<br />
design needed four 200-MHz system<br />
clocks, each with 0, 90, 180 and 270<br />
degrees of phase, and one audio capture<br />
clock of 19.048 MHz or 39.096 MHz.<br />
To drive clocks us<strong>in</strong>g ODDR flip-flops<br />
<strong>in</strong> source-synchronous outputs, we imple-<br />
XCELLENCE IN NEW APPLICATIONS<br />
mented a DCM that drives the ODDR<br />
flip-flops for forward clock<strong>in</strong>g. This DCM<br />
runs <strong>in</strong> parallel to the DCM we used for<br />
<strong>in</strong>ternal clock<strong>in</strong>g.<br />
We generated the ASYCN FIFO or<br />
block RAM us<strong>in</strong>g CORE Generator and<br />
supported ECC with <strong>in</strong>terrupt<strong>in</strong>g logic on<br />
an embedded microprocessor core to perform<br />
data error detection.<br />
While generat<strong>in</strong>g the PCI Express core,<br />
we had to ensure the reference clock had<br />
the same performance as the PC motherboard’s<br />
PCI Express slot output (that is,<br />
100 MHz). Also, we needed to def<strong>in</strong>e how<br />
many base address registers (BARs) the core<br />
needed and whether the BARs were memory<br />
mapped or I/O mapped. We used the<br />
BAR monitor, which helps <strong>in</strong> generat<strong>in</strong>g<br />
BAR hits, for address decod<strong>in</strong>g.<br />
Dur<strong>in</strong>g the design of the bridge<br />
between PCI Express and the system local<br />
bus, we used the BARs—which act as<br />
memory or I/O region chip select—to<br />
access memory-mapped or I/O-mapped<br />
registers or block RAM. We designed the<br />
bridge logic <strong>in</strong> such a way as to make sure<br />
that the core and bus properly accessed all<br />
the register or block RAM. The Xil<strong>in</strong>x<br />
PCI Express core also has a default ROM<br />
expansion capability, and to accommodate<br />
it we had to implement an address and<br />
map <strong>in</strong>side the bridge. Bit position 6 of<br />
the BAR hit po<strong>in</strong>ts <strong>in</strong> this expansion<br />
ROM area and the <strong>in</strong>ternal <strong>in</strong>terface must<br />
respond to these BAR hits.<br />
If any of the above is miss<strong>in</strong>g, the host<br />
PC won’t get any response if it tries to communicate<br />
and perform a read transaction.<br />
It will enter an unknown state or generate<br />
an unrecoverable error.<br />
We used the CoreGen utility to generate<br />
the GbE MAC with an RGMII/SGMII<br />
external <strong>in</strong>terface. We used the built-<strong>in</strong><br />
GTP module to communicate with selected<br />
PHY devices. The GbE MAC supports<br />
the MDIO <strong>in</strong>terface to configure external<br />
physical devices, a host <strong>in</strong>terface and a 16bit<br />
s<strong>in</strong>gle-channel client <strong>in</strong>terface.<br />
The DSP48E block, for its part, is a<br />
25x18-bit multiplier and 48-bit hardmacro<br />
accumulator. You can use it directly<br />
as an <strong>in</strong>stance or, by mapp<strong>in</strong>g the multiplyaccumulate,<br />
add and subtract functionality<br />
Fourth Quarter 2008 Xcell Journal 37