“Gate” lookup table (LUT)
• “Gate”: lookup table (LUT)
  I0 ∧ I1 ∧ I2 // AND
  I0 ⊕ I1 ⊕ I2 // odd parity
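The two examples above differ only in the table contents, which a short software model can make concrete. This is a minimal sketch, assuming a 3-input LUT (inferred from the `I2` label); the `Lut3` type and its bit ordering are illustrative choices, not something the slides specify.

```cpp
#include <cassert>
#include <cstdint>

// A 3-input LUT modeled as an 8-entry truth table packed into one byte.
// Bit i of `table` is the output for input pattern i = (i2 << 2) | (i1 << 1) | i0.
// (The packing order is an assumption made for this sketch.)
struct Lut3 {
    std::uint8_t table;

    // Look up the output for an already-packed input pattern.
    bool eval(unsigned index) const {
        assert(index < 8);
        return ((table >> index) & 1u) != 0;
    }

    // Convenience overload taking the three inputs separately.
    bool eval(bool i0, bool i1, bool i2) const {
        return eval((unsigned(i2) << 2) | (unsigned(i1) << 1) | unsigned(i0));
    }
};

// "Configuring" the gate means choosing the table contents.
constexpr Lut3 kAnd3{0x80};        // only pattern 111 yields 1
constexpr Lut3 kOddParity3{0x96};  // patterns 001, 010, 100, 111 yield 1
```

Calling `kAnd3.eval(true, true, true)` returns true, while `kOddParity3.eval(true, true, false)` returns false; the evaluation logic is the same in both cases and only the 8-bit table changes.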
• Logic element (LE)
[Figure: an LE containing a LUT and a flip-flop (FF)]
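A logic element pairs the LUT with a flip-flop, which can be sketched as follows. This is an illustrative model only: the `use_ff` configuration bit that selects a registered or combinational output is an assumption (a common arrangement, but not shown on the slide), and the `LogicElement` name and 3-input width are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of a logic element: a 3-input LUT feeding a flip-flop.
struct LogicElement {
    std::uint8_t lut_table;  // LUT contents, bit i = output for input pattern i
    bool use_ff;             // assumed config bit: true = registered output
    bool ff;                 // flip-flop state

    LogicElement(std::uint8_t table, bool registered)
        : lut_table(table), use_ff(registered), ff(false) {}

    // Combinational LUT lookup for a packed 3-bit input pattern.
    bool lut(unsigned inputs) const {
        assert(inputs < 8);
        return ((lut_table >> inputs) & 1u) != 0;
    }

    // Output seen by the rest of the fabric for the given inputs.
    bool output(unsigned inputs) const { return use_ff ? ff : lut(inputs); }

    // Rising clock edge: the flip-flop captures the current LUT output.
    void clock(unsigned inputs) { ff = lut(inputs); }
};
```

With `LogicElement le(0x96, true)`, the output stays at 0 until `le.clock(1u)` latches the parity of the inputs; after that edge the output is 1 regardless of the inputs until the next clock.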
• Configurable logic block
[Figure: a configurable logic block (L) made up of eight LEs]
• Configurable logic blocks
[Figure: FPGA floorplan as a grid of tiles labeled L, M, ×+ and IO, with a CFG block]
• Interconnect fabric
• I/O blocks
• Fixed logic blocks
• BRAM: dual-port 1K×36b
• DSP: fixed/floating multiply-add
• Much more
http://www.xilinx.com/content/dam/xilinx/imgs/products/zynq/zynq-ev-block.PNG
https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/wp/wp-01264-stratix10mx-devices-solve-memory-bandwidth-challenge.pdf
• Cloud computing
• Deep learning
• Interconnect and storage
• SDN, NFV, programmable data plane
[http://www.xilinx.com/applications/megatrends.html]
[Andrew Putnam, Doug Burger, et al, MSR, ISCA-41, 2014] [https://www.microsoft.com/en-us/research/project/project-catapult/]
[Toward Accelerating Deep Learning at Scale Using Specialized Hardware in the Datacenter, Hot Chips 27]
[http://www.hotchips.org/wp-content/uploads/hc_archives/hc27/HC27.25-Tuesday-Epub/HC27.25.40-FPGAs-Epub/HC27.25.432-Catapult_HOTCHIPS2015_Chung_DRAFT_V8.pdf]
Verilog, VHDL → Synthesis → Technology mapping → Place → Route → Configuration bitstream
Slow!
Latency: 25121 clocks, DSPs: 3
[http://tcfpga.org/fpga2013/VivadoHLS_Tutorial.pdf]
[Code listing, mostly lost in extraction: an #include, several ap_int declarations, and the pragmas below]
#pragma HLS ARRAY_PARTITION DIM=2 VARIABLE=a complete
#pragma HLS ARRAY_PARTITION DIM=1 VARIABLE=b complete
#pragma HLS pipeline
Latency: 25121 clocks, DSPs: 3 → Latency: 260 clocks, DSPs: 16
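To show where pragmas like these would sit, here is a hedged sketch of a matrix multiply in plain C++. It is not the tutorial's listing, which did not survive extraction. The 16×16 size is a guess that happens to fit the reported 16 DSPs and roughly 256-cycle latency; `in_t` and `out_t` are stand-ins for the lost `ap_int` types; and the pragma placement follows common Vivado HLS practice rather than anything shown on the slide. Ordinary compilers ignore the `#pragma HLS` lines.

```cpp
#include <cassert>
#include <cstdint>

constexpr int N = 16;        // assumed size, not stated on the slide
using in_t  = std::int8_t;   // stand-in for the ap_int input type
using out_t = std::int32_t;  // stand-in for the ap_int result type

// c = a * b. Partitioning a along its columns and b along its rows makes
// all N operands of one dot product readable in the same cycle.
void matmul(in_t a[N][N], in_t b[N][N], out_t c[N][N]) {
#pragma HLS ARRAY_PARTITION DIM=2 VARIABLE=a complete
#pragma HLS ARRAY_PARTITION DIM=1 VARIABLE=b complete
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            // Pipelining this loop asks the tool to unroll the inner loop,
            // so one output element can be started per clock.
#pragma HLS pipeline
            out_t sum = 0;
            for (int k = 0; k < N; ++k) {
                sum += out_t(a[i][k]) * out_t(b[k][j]);
            }
            c[i][j] = sum;
        }
    }
}
```

Under these assumptions the unrolled inner loop needs N multipliers at once, which would account for the jump from 3 to 16 DSPs, and N×N pipelined iterations plus a few cycles of pipeline depth would account for a latency near 260 clocks.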
• Clusters
1 MIPS/LUT
{240,000 LUTs + 600 BRAMs} ÷ 320 LUTs ≈ 750 PEs??
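The estimate above can be written out as a small compile-time calculation. This is a back-of-the-envelope sketch: it reads "320 LUTs" as the cost of one PE, and the BRAM-per-PE figure is derived here rather than taken from the slide. It also ignores LUTs spent on interconnect and cluster logic, so it is an upper bound at best.

```cpp
#include <cassert>

// Figures taken from the slide.
constexpr long kDeviceLuts  = 240000;
constexpr long kDeviceBrams = 600;
constexpr long kLutsPerPe   = 320;   // assumed to be the per-PE LUT cost

// LUT-limited upper bound on the number of PEs.
constexpr long kMaxPes = kDeviceLuts / kLutsPerPe;

// BRAM budget per PE at that count, in hundredths of a BRAM
// (integer arithmetic keeps the result exact).
constexpr long kBramsPerPeX100 = kDeviceBrams * 100 / kMaxPes;

static_assert(kMaxPes == 750, "240,000 / 320 = 750");
static_assert(kBramsPerPeX100 == 80, "600 / 750 = 0.80 BRAM per PE");
```

At 750 PEs there is less than one BRAM per PE, which suggests that memories would have to be shared among PEs.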
[Figure: a cluster of eight PEs with four IRAMs (each marked 2:1), a 4:4 XBAR, a 32 KB CRAM (cluster data RAM), and accelerator(s)]
[Figure: the same cluster with a Hoplite router and a NoC interface (NOC ITF, labeled 300) added]