DNK7_F5_8_Cluster Xilinx Kintex 7 Rack Mount HPC Cluster
[Figure: DNK7_F5PCIe block diagram v1.10, "Godzilla’s Bad Hair Day". Five Kintex-7 FPGAs (FPGA 0 in FFG900; FPGAs 1-4 in FFG676) with PC3-1600 and PC3-800 DDR3 memories, a PCIe bridge FPGA (FF484) with a 4-lane PCIe (GEN1/GEN2) interface, CPLD, CFG bus, JTAG, I2C, serial EEPROM, configuration flash, and HPCLK, HRCLK, SYSCLK, DDR3, and SFB clocks.]
Product Brief<br />
Oct. 2012<br />
Ver. 1.0<br />
Features<br />
• 5U Rackmount Chassis containing:
- 1 Intel Xeon™ E3-1275 processor card
- 8 DNK7_F5PCIe FPGA cards, each with 5 Xilinx Kintex 7 FPGAs (7K325T)
• PCIe 4-lane (GEN1/GEN2)
• 40 FPGAs in total, 100% dedicated to application
- Other configurations with different CPU to FPGA ratios are available
- 2 bays for SATA-3 hard drives
• Processor card
- Intel Xeon™ E3-1200 series (Sandy Bridge) processors, 3.4 GHz
• Quad-core, 8 MB shared L3 cache
• 4 GB DDR3 memory
- Options to 32 GB DDR3 memory (32 GB max)
• VGA with standard D-Sub connector
• 10/100/1000BASE-T Ethernet (2 ports)
• USB 2.0 (4 ports total)
- 2 ports on front panel
- 2 ports on back bracket
• Supports most Linux distributions
• DNK7_F5PCIe FPGA HPC Acceleration card
- PCI Express (4-lane) FPGA-based algorithm acceleration peripheral with 5 Kintex-7 FPGAs
• 4 Xilinx Kintex-7 FPGAs: 7K325T-1 (FFG676)
- 7K410T-3,-2,-1, 7K325T-3,-2,-1, or 7K160T-3,-2,-1
• 1 Xilinx Kintex-7 FPGA: 7K325T-1 (FFG900)
- 7K410T-3,-2,-1 or 7K325T-3,-2,-1
- Fixed 4-lane PCIe interface and controller<br />
• PCIe GEN1/GEN2<br />
• Full mastering DMA<br />
- 2 transmit (host memory -> card)<br />
- 2 receive (card -> host memory)<br />
- Xilinx Kintex-7 7K325T-1 FPGAs - 5 total user FPGAs
• 407,600 flip-flops per FPGA
- 203,800 6-input LUTs
• 840 25x18 multipliers, each with a 48-bit accumulator, per FPGA
• 890 18 Kbit block RAMs (2 Mbytes) per FPGA (or 445 36 Kbit blocks)
- Fully dual-ported
- Each 36 Kbit block RAM configurable as:
- 32K x 1, 16K x 2, 8K x 4, 4K x 9 (or 8),
- 2K x 18 (or 16), 1K x 36 (or 32), or 512 x 72 (or 64)
- 5 separate 256M x 16 DDR3 memories for each field FPGA
• 3 memories PC3-1600
• 2 memories PC3-800
• Each memory has separate address, data, and control
- 6 separate 256M x 16 DDR3 memories for Dataflow Manager FPGA
• 3 memories PC3-1600
• 3 memories PC3-800
• Each memory has separate address, data, and control
- Two independent low-skew global clock networks<br />
• distributed differentially and balanced<br />
- Fast and Painless FPGA configuration via PCIe<br />
• On-board battery for AES bitstream encryption<br />
- Full support for embedded logic analyzers via JTAG interface<br />
• ChipScope, and other third-party debug solutions<br />
- FPGA-controlled LEDs<br />
• Enough light to use as LED-based flashlight<br />
[Figure: DNK7_F5_8_Cluster block diagram v1.00. A Xeon E3-1275 (CPU1) with four 8 GB DDR3 DIMMs connects through two PCIe fan-out switches to eight DNK7_F5PCIe cards, each on a 4-lane PCIe (GEN1/GEN2) link. A Platform Controller Hub (3420) provides LAN 1 and LAN 2 (10/100/1000BASE-T), USB 0 and USB 1, video (Sandy Bridge), and up to 6 HDDs (SATA 3).]
1 the DINI group
DNK7_F5PCIe Block Diagram
[Figure: DNK7_F5_PCIe block diagram v1.2. FPGA 0 (Kintex-7 7K410T/7K325T, FFG900) and FPGAs 1-4 (Kintex-7 7K410T/7K325T, FFG676), each with separate 256M x 16 DDR3 memories on PC3-1600 and PC3-800 interfaces; LEDs; CFG bus; JTAG; CPLD; I2C; serial EEPROM; FPGA configuration flash; a PCIe bridge FPGA (FF484) with a 4-lane PCIe (GEN1/GEN2) interface; Prog OSC sources for HPCLK and HRCLK; DDR3 clocks and SYSCLK to all FPGAs; SFB clock for CFG bus clocking.]
Overview
Description
The DNK7_F5_8_Cluster is a complete, 5U rack mount FPGA acceleration cluster. The standard configuration integrates the following:
- Trenton TSB7053 Xeon processor card (boots to Linux). Other cards are also available.
- 8 DNK7_F5PCIe Xilinx Kintex 7 FPGA cards with 5 7K325T FPGAs per card.
- 2 TB SATA II hard drive
This system contains the maximum number of cost-effective FPGAs that can reasonably be integrated into a 5U chassis. Power and cooling are the constraining variables. High performance data paths between FPGA boards enable data movement under algorithmic control that is wholly separate from the host processor, enabling FPGA-based acceleration of whole new classes of data-intensive algorithms.
In short, the DNK7_F5_8_Cluster is a massive number of large, low cost FPGAs integrated with an excellent dual Xeon-based processor host.
A partial list of possible applications includes:<br />
- bioinformatics<br />
- Genomic search<br />
- financial analytics<br />
• low latency analysis<br />
• derivative calculations<br />
- image processing<br />
- signal processing<br />
- radar<br />
- scientific computing<br />
- video compression<br />
- encryption/decryption (cryptography)<br />
The Processor Card – single/dual Xeons
Central to the DNK7_F5_8_Cluster is the Trenton TSB7053 host processor card. This single-board computer has an Intel Xeon E3 processor clocked at 3.4 GHz. The processor board has 4 DIMM slots that can be stuffed with up to 32 GB of DDR3 RAM, with a maximum of 8 GB per slot. The processor card has two 10/100/1000BASE-T Ethernet ports, along with two USB 2.0 ports. The chassis can host up to 6 SATA drives. Power and cooling are provided for up to 8 DNK7_F5PCIe cards. Power is cabled to the FPGA cards separately and also drawn from the motherboard, allowing us to exceed the 25 W PCIe slot power limit. The power budget is 50 W per board. Note that this requires a lot of airflow and the fans are noisy. Fully populated, the system is perhaps too noisy to be in close quarters with an engineer.
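The power figures above imply a simple budget for a fully populated chassis. The sketch below works through that arithmetic in Python. The constant and function names are illustrative only and are not part of any DINI Group software, and it assumes a card draws no more than the 25 W slot limit through the motherboard.

```python
# Power budget sketch for a fully populated DNK7_F5_8_Cluster.
# Figures come from the text above; all names are illustrative assumptions.

CARDS_PER_CHASSIS = 8    # DNK7_F5PCIe cards in the standard configuration
CARD_BUDGET_W = 50       # power budget per FPGA card
PCIE_SLOT_LIMIT_W = 25   # PCIe slot power limit cited in the text


def min_cabled_power_w(card_budget_w=CARD_BUDGET_W, slot_limit_w=PCIE_SLOT_LIMIT_W):
    """Least power the separate cable must carry if a card uses its full budget."""
    return max(0, card_budget_w - slot_limit_w)


def total_fpga_card_power_w(cards=CARDS_PER_CHASSIS, card_budget_w=CARD_BUDGET_W):
    """Worst-case power for all FPGA cards (host card, drives, and fans excluded)."""
    return cards * card_budget_w


if __name__ == "__main__":
    print(f"Cabled power per card: at least {min_cabled_power_w()} W")
    print(f"All FPGA cards: up to {total_fpga_card_power_w()} W")
```

The total covers the FPGA cards only; the processor card, drives, and fans add to it.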
The DNK7_F5PCIe – 5 Xilinx Kintex 7 FPGAs
The DNK7_F5PCIe is a Xilinx Kintex-7 based FPGA board optimized for algorithmic acceleration applications requiring FPGAs with high performance local memory. Data movement to/from the FPGA grid is accomplished via a fixed 4-lane, GEN1/GEN2 PCIe bridge. Each field Kintex-7 FPGA (FPGAs 1-4 in the block diagram) has five separate 256M x 16 DDR3 (4 Gb) memories. The Dataflow Manager FPGA (FPGA 0 in the block diagram) has six 256M x 16 DDR3 memories. The DNK7_F5_8_Cluster can host 8 of these cards.
Dedicated PCIe, 4-lane controller (GEN1 or GEN2)
We ship the DNK7_F5PCIe with a fixed, full function, 4-lane master/target PCIe controller. The PCIe controller has mastering DMA engines: 2 for transmit (board -> host) and 2 for receive (host -> board). Drivers with 'C' source for several operating systems are included at no cost.
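As a rough guide to what a fixed 4-lane link can carry, the sketch below estimates raw bandwidth per direction for GEN1 and GEN2. It relies on the standard PCIe line rates (2.5 GT/s and 5 GT/s per lane) and 8b/10b encoding. The function name is illustrative, and the result ignores packet and protocol overhead, so sustained DMA throughput will be lower.

```python
# Raw per-direction PCIe bandwidth for the 4-lane link (illustrative sketch).

LINE_RATE_MT_S = {"GEN1": 2500, "GEN2": 5000}  # transfers per lane, in MT/s
LANES = 4                                      # fixed 4-lane interface


def raw_bandwidth_mbyte_s(gen, lanes=LANES):
    """Raw bandwidth in MB/s, one direction, after 8b/10b encoding."""
    payload_mbit_s = LINE_RATE_MT_S[gen] * 8 // 10  # 8 data bits per 10 line bits
    return payload_mbit_s * lanes // 8              # bits to bytes


if __name__ == "__main__":
    for gen in ("GEN1", "GEN2"):
        print(f"{gen} x{LANES}: {raw_bandwidth_mbyte_s(gen)} MB/s per direction")
```

Each card has its own 4-lane link, so these figures apply per card, not to the chassis as a whole.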
Kintex-7 FPGAs from Xilinx - Performance and Low Power
The Xilinx Kintex-7, 28 nm FPGAs are utilized. We use the second largest member of this cost effective (read: CHEAP) family. The Kintex-7 FPGA family has an impressive price/performance ratio for hardware-in-the-loop accelerators with excellent device power consumption properties. Operating frequency is approximately twice that of the previous low cost Xilinx FPGAs, Spartan-6.
Features of Kintex-7 include efficient, dual-register 6-input look-up table (LUT) logic, 36 Kb block RAMs, and second generation DSP slices which contain 25 x 18 multipliers along with a 48-bit accumulator.
The 7K325T is used in the FFG900 and FFG676 packages. 100% of the FPGA resources are dedicated to your application. All FPGAs, excluding the PCIe controller, are configured via PCIe. The PCIe FPGA can be updated in the field.
Memory - DDR3
The availability of large amounts of local high speed memory is pivotal to FPGA-based algorithmic acceleration applications. The DNK7_F5PCIe is optimized accordingly. Each of the four field FPGAs (FPGAs 1 through 4) has a total of five 4 Gb DDR3 memories. Each memory is 256M x 16 with separate data, address, and control. Three of these DDR3 memories are connected to FPGA pins capable of 800 MHz (1600 Mb/s per data pin) and the remaining two are connected to FPGA pins capable of 400 MHz (800 Mb/s per data pin). The Xilinx Memory Interface Generator (MIG) works fine. The five memories can be used independently or grouped in any manner that best fits your application. The Dataflow Manager FPGA (FPGA 0) has a total of six 4 Gb DDR3 memories. Three of these memories are connected to FPGA pins capable of 800 MHz (1600 Mb/s per data pin) and three are connected to FPGA pins capable of 400 MHz (800 Mb/s per data pin).
As always, we provide examples and reference designs to help you with all of your memory interface issues. Please check with us to make sure that what we ship for no charge meets your requirements.
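The per-pin rates and memory counts above can be turned into capacity and peak bandwidth figures per FPGA and per card. The following Python sketch does that arithmetic. The class and function names are illustrative assumptions, and the bandwidth is a theoretical peak (16 data pins at the stated per-pin rate) that ignores refresh, bank conflicts, and controller efficiency.

```python
# DDR3 capacity and theoretical peak bandwidth per FPGA and per card.
# Figures come from the text above; all names are illustrative assumptions.
from dataclasses import dataclass

DATA_WIDTH_BITS = 16              # each memory is 256M x 16
DEVICE_CAPACITY_MBIT = 256 * 16   # 4096 Mbit = 4 Gb per memory
FAST_MBIT_S_PER_PIN = 1600        # pins capable of 800 MHz
SLOW_MBIT_S_PER_PIN = 800         # pins capable of 400 MHz


@dataclass(frozen=True)
class FpgaMemory:
    fast: int  # memories on 1600 Mb/s pins
    slow: int  # memories on 800 Mb/s pins

    def peak_bandwidth_mbyte_s(self):
        """Sum of all memory interfaces running flat out, in MB/s."""
        per_pin = self.fast * FAST_MBIT_S_PER_PIN + self.slow * SLOW_MBIT_S_PER_PIN
        return DATA_WIDTH_BITS * per_pin // 8

    def capacity_mbyte(self):
        """Total DDR3 capacity attached to this FPGA, in MB."""
        return (self.fast + self.slow) * DEVICE_CAPACITY_MBIT // 8


FIELD_FPGA = FpgaMemory(fast=3, slow=2)        # FPGAs 1 through 4
DATAFLOW_MANAGER = FpgaMemory(fast=3, slow=3)  # FPGA 0


def card_totals():
    """Return (peak MB/s, capacity MB) for one DNK7_F5PCIe card."""
    fpgas = [DATAFLOW_MANAGER] + [FIELD_FPGA] * 4
    bandwidth = sum(f.peak_bandwidth_mbyte_s() for f in fpgas)
    capacity = sum(f.capacity_mbyte() for f in fpgas)
    return bandwidth, capacity


if __name__ == "__main__":
    bandwidth, capacity = card_totals()
    print(f"Field FPGA: {FIELD_FPGA.peak_bandwidth_mbyte_s()} MB/s peak, "
          f"{FIELD_FPGA.capacity_mbyte()} MB")
    print(f"Card: {bandwidth} MB/s peak, {capacity} MB")
```

Because every memory has its own address, data, and control lines, summing the interfaces is a reasonable upper bound when all of them are driven in parallel.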
Power Consumption
The PCI Express specification limits slot power to 25 watts. The DNK7_F5PCIe is capable of consuming significantly more power than that. In addition to the PCIe fingers, a separate connector adds a second path for power. This product is shipped with heat sinks adequate for TBD watts, but airflow is required in the chassis to dissipate the heat. Contact the factory if you require high reliability, no-fan heatsinks.
Status LEDs, Debug
Although no specific testing was performed, sophisticated statistical finite element models and back of the envelope calculations are showing the number of status LEDs to be bright enough to work as a flashlight. Contact the factory for more information about this sophisticated feature and make sure an adult is present during operation. These LEDs are user controllable from the FPGAs, so they can be used as visual feedback in addition to emergency lighting. A JTAG connector provides an interface to ChipScope and other third-party debug tools.
the DINI group<br />
For technical applications and sales support, call 858.454.3419<br />
7469 Draper Ave.<br />
La Jolla, CA 92037-5026<br />
Phone: 858.454.3419<br />
Fax: 858.454.1728<br />
E-Mail: sales@dinigroup.com<br />
Web: http://www.dinigroup.com<br />
The DINI Group reserves the right to make changes to the product(s) or information contained herein without notice. No liability is<br />
assumed as a result of their use or application. No rights under any patent accompany the sale of any such product(s) or information.<br />