Automotive Innovators Hit High Gear in - Xilinx
[Figure: MVE SoC block diagram. Labels: MVE; InstantVision Embedded running on MB; Video In Data Streaming/Conversion (ITU BT.656, ITU BT.1120); Display and Video Out Data Streaming/Conversion; C-MVA IP Cores; Communication and Media Data Streaming; I/O: DVI, GPIO, UART, Ethernet]
Turbo-charging using FPGA Accelerator Blocks

To truly realize the full potential of an FPGA-based video analytics
system, we needed to design and integrate the video accelerator engines
into the embedded base system. We anticipated several of the performance
bottlenecks, so our design team had begun early development of a set of
accelerators using VHDL. The code profiler, included as part of the
Xilinx ISE Design Suite and the Embedded Development Kit, proved
instrumental in helping us identify further performance bottlenecks and
develop all the accelerator blocks we required for this design. Table 2
provides a comprehensive list of IP core families.

C-MVA IP Core Family      Function
IPC-WSC                   Image flow, up/down scaling and windowing
IPC-CNF                   Image flow conditioning and noise filtering, including gain control and contrast modification
IPC-FBS                   Foreground-background separation
IPC-BMF                   Binary morphological filtering, with size classification and contour-structure shaping
IPC-SFE                   Multi-event/object signature and/or feature extraction
IPC-EFE                   Event/object-focused enhancement
IPC-EBC                   Application-specific event/object-based control
InstantVision Embedded    Algorithmic framework and specific modules for video flow analytics

Table 2 – IP core families developed as special hardware accelerator blocks
for three generations of MVE / C-MVA (the per-row Ver. 1, Ver. 2 and Ver. 3
entries are missing from this copy)

[Figure 4 – MVE analytics engine, InstantVision and driver software. Diagram labels: .NET APP (VA Configuration); .NET APP2 (User Def Cfg); WIN32 APP1 (User Def Cfg); .NET API; WIN32 API; Drivers (Serial/Ethernet); MVE Drivers (Serial/Ethernet); MVE Application (InstantVision™ Embedded)]

Our development team, like those at many other companies, consisted of
separate hardware and software developers. It was critical to the success
of this project to maintain developer productivity by preserving
sufficient abstraction between these two design domains. We streamlined
this task using a feature in Xilinx Platform Studio, Create IP Wizard,
which generates RTL templates and software driver files for hardware
accelerator blocks. These templates include the interface logic the
design required to access registers, DMA logic and FIFOs from the
embedded system. Once we used the template to create the RTL, we placed
the RTL into the embedded IP catalog, where a developer can further
modify it as needed.
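
To make the hardware/software abstraction described above more concrete, here is a minimal Python sketch of how software might see an accelerator block through a register interface. Everything in it is hypothetical: the `RegisterFile` class, the register names and offsets, and the enable bit are illustrative assumptions, not the files the Create IP Wizard actually generates.

```python
class RegisterFile:
    """Toy model of a memory-mapped accelerator register block (hypothetical)."""

    # Hypothetical byte offsets; real offsets would come from the generated driver files.
    CTRL = 0x00
    STATUS = 0x04
    THRESHOLD = 0x08

    def __init__(self, base_addr, size=0x10):
        self.base_addr = base_addr
        # One 32-bit word per 4-byte offset, all cleared at reset.
        self._mem = {offset: 0 for offset in range(0, size, 4)}

    def write(self, offset, value):
        if offset not in self._mem:
            raise ValueError(f"unmapped register offset 0x{offset:02x}")
        self._mem[offset] = value & 0xFFFFFFFF  # keep only the low 32 bits

    def read(self, offset):
        if offset not in self._mem:
            raise ValueError(f"unmapped register offset 0x{offset:02x}")
        return self._mem[offset]


def configure_accelerator(regs, threshold):
    """Program a threshold, set the (assumed) enable bit, and read the value back."""
    regs.write(RegisterFile.THRESHOLD, threshold)
    regs.write(RegisterFile.CTRL, 0x1)  # bit 0 = enable (assumption)
    return regs.read(RegisterFile.THRESHOLD)
```

The point of the sketch is the division of labor: software developers work against named registers, while the hardware team remains free to change the logic behind them.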
Our IP core development procedure includes a generic, modular periphery
block development flow for the PLB46-MPMC-OPB-based backbone. These
peripheries consist of both single- and multi-I/O prototypes (SIMO, MIMO,
MISO models), allowing us to flexibly create a multithread coprocessor
pipeline for demanding image flow processing algorithms. We achieved
this by combining the IP cores in almost arbitrary order and configuring
them during the design and customization of various analytics engines.
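
The idea of chaining configurable cores in almost arbitrary order can be sketched in software as function composition. The Python example below is only an analogy under stated assumptions: the stage names `gain` and `threshold` and the `build_pipeline` helper are hypothetical, each stage has a single input and a single output, and the fan-out and fan-in of the SIMO, MIMO and MISO prototypes is not modeled.

```python
from typing import Callable, List, Sequence

Frame = List[List[int]]           # grayscale image as rows of pixel values
Stage = Callable[[Frame], Frame]  # one single-input, single-output processing block


def gain(factor: float) -> Stage:
    """Return a stage that scales every pixel, saturating at 255."""
    def apply(frame: Frame) -> Frame:
        return [[min(255, int(p * factor)) for p in row] for row in frame]
    return apply


def threshold(level: int) -> Stage:
    """Return a stage that converts the frame to a binary mask."""
    def apply(frame: Frame) -> Frame:
        return [[1 if p >= level else 0 for p in row] for row in frame]
    return apply


def build_pipeline(stages: Sequence[Stage]) -> Stage:
    """Chain configured stages in the given order into one callable."""
    def run(frame: Frame) -> Frame:
        for stage in stages:
            frame = stage(frame)
        return frame
    return run
```

For example, `build_pipeline([gain(2.0), threshold(100)])` and `build_pipeline([threshold(100), gain(2.0)])` use the same two blocks but produce different results, which is why both the order and the per-stage configuration are fixed when an analytics engine is customized.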
The MVE analytics engine consists of the InstantVision Embedded software
modules and the hardware accelerators that make up the C-MVA analytics
coprocessor. We prototyped the MVE in a Xilinx Spartan-3A-DSP 3400A FPGA
and created our SoC reference design. It includes all the required I/O
functions for communication and data streaming (see Figure 2 for the
complete hardware-firmware block diagram). This complete SoC reference
design, encompassing not only the MVE analytics engine but also all the
supporting I/O modules, uses 91 percent of the logic slices, 81 percent
of the block RAMs and 32 percent of the DSP slices.
Taken on its own, the MVE analytics engine (excluding the MPMC-PLB part
of the backbone and the specialized I/O components) uses only 46 percent
of the logic slices, 44 percent of the block RAMs and 23 percent of the
DSP slices, which makes a migration path to the lower-cost
Spartan-3A-DSP 1800A FPGA device feasible.
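
A quick calculation shows why the smaller device looks plausible. The Python sketch below rescales the engine-only utilization figures from the text onto the 1800A. The device resource counts are assumptions recalled from the Spartan-3A DSP data sheet and should be checked against it; the function name `projected_utilization` is hypothetical.

```python
# Device resources: assumed values, to be verified against the Spartan-3A DSP data sheet.
DEVICES = {
    "3400A": {"slices": 23872, "block_rams": 126, "dsp_slices": 126},
    "1800A": {"slices": 16640, "block_rams": 84, "dsp_slices": 84},
}

# Utilization of the MVE analytics engine alone on the 3400A, as stated in the text.
ENGINE_USE_3400A = {"slices": 0.46, "block_rams": 0.44, "dsp_slices": 0.23}


def projected_utilization(usage, source, target):
    """Convert fractional usage on the source device to fractional usage on the target."""
    return {res: frac * source[res] / target[res] for res, frac in usage.items()}


if __name__ == "__main__":
    projection = projected_utilization(
        ENGINE_USE_3400A, DEVICES["3400A"], DEVICES["1800A"]
    )
    for resource, fraction in projection.items():
        print(f"{resource}: {fraction:.1%} of the 1800A")
```

Under these assumed counts, the engine alone would occupy roughly two-thirds of the 1800A's slices and block RAMs and about a third of its DSP slices. The backbone and I/O modules are not included in that estimate, so the remaining headroom would have to cover them.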
We designed all the IP cores of the C-MVA coprocessor to complete their
associated processing within a single clock cycle. This feature,
combined with the asynchronous FSL interfaces, in turn allows the system
integrator to drive the C-MVA coprocessor with a different clock domain
from the rest of the system. Doing so allows the C-MVA to run at the
lower pixel clock frequency while the backbone runs at a
higher-frequency internal system clock, greatly reducing power
consumption while still meeting the system’s performance requirements.
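
The power argument can be illustrated with the usual first-order model of CMOS dynamic power, P = a * C * V^2 * f, in which power scales linearly with clock frequency when activity, capacitance and voltage are unchanged. The Python sketch below applies that model; the two frequencies in the usage note are assumptions chosen for illustration (27 MHz is the ITU-R BT.656 interface clock, used here as a stand-in for the pixel clock, and 100 MHz is a hypothetical system clock), not figures from this design.

```python
def dynamic_power_ratio(f_block_hz, f_system_hz):
    """Dynamic power of a block clocked at f_block_hz relative to the same block
    clocked at f_system_hz, assuming identical voltage, capacitance and activity."""
    if f_block_hz <= 0 or f_system_hz <= 0:
        raise ValueError("clock frequencies must be positive")
    return f_block_hz / f_system_hz
```

With the assumed values, `dynamic_power_ratio(27e6, 100e6)` evaluates to 0.27, meaning the coprocessor's dynamic power would be roughly a quarter of what it would draw on the faster clock. Static power is unaffected by this change, so the overall saving would be smaller.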
Fourth Quarter 2008 Xcell Journal 31<br />
[Figure 4 labels, continued: MVE; "Client"; "Server"]