Packet Processing to support 40/100GE Line Rates - Ethernet ...

Agenda
• Current Challenges
• Application Requirements
• Tabula Solutions for 100GE Packet Processing
• Tabula Technology & Tools
• Summary
• Next Steps
Santa Clara, CA USA, April 2013

40GE / 100GE & Beyond: Packet Processing Challenges

Ethernet Stack: Processing Challenge
Ethernet OSI Model
Layer           Context
Application     Data
Presentation    Data
Session         Data
Transport       Segments
Network         Packets
Data Link       Frames
Physical        Bits
Functions (in slide order, top of the stack first): Deep Packet Inspection; Application Aware Routing, Switching; Flow Classification; Routing; Bridging, Switching

Device Elements for Packet Processing: ASIC / ASSP Functionality

Device Elements for Packet Processing: FPGA Functionality
• DSP Block
• 6-Input LUT Architecture

Packet Processing at 100GE
• ~50% of Ethernet packets are minimum-size packets (~64 bytes)
• Line rate packet processing requirements:
                                  1GE       10GE      40GE      100GE
  Ethernet minimum packet size    672 ns    67.2 ns   16.8 ns   6.72 ns
  (64 bytes + 20 bytes of preamble and inter-frame gap)
[K. Thompson, G. Miller, R. Wilder, Wide-area traffic patterns and characterization, IEEE Network, Dec. 1997]
At 100GE line rate, an implementation has time for at most 2-3 packet operations plus buffering
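
The per-packet times above follow directly from the wire size of a minimum packet. A minimal Python sketch of that arithmetic is shown here; the function names are illustrative, and the 500 MHz clock in the last line is an assumption chosen only to show roughly how many clock cycles fit into one packet time.

```python
# Time on the wire for a minimum-size Ethernet packet at each line rate.
MIN_FRAME_BYTES = 64   # minimum Ethernet frame, as on the slide
OVERHEAD_BYTES = 20    # per-packet wire overhead used on the slide


def packet_time_ns(rate_gbps, frame_bytes=MIN_FRAME_BYTES):
    """Nanoseconds one packet occupies the link (bits / Gbit/s = ns)."""
    bits = (frame_bytes + OVERHEAD_BYTES) * 8
    return bits / rate_gbps


def packets_per_second(rate_gbps, frame_bytes=MIN_FRAME_BYTES):
    """Worst-case packet rate when every packet is minimum size."""
    return 1e9 / packet_time_ns(rate_gbps, frame_bytes)


def cycles_per_packet(rate_gbps, clock_mhz):
    """Clock cycles available per minimum-size packet (clock is an assumption)."""
    return packet_time_ns(rate_gbps) * clock_mhz / 1000.0


for rate in (1, 10, 40, 100):
    print(f"{rate:>3}GE: {packet_time_ns(rate):7.2f} ns per packet, "
          f"{packets_per_second(rate) / 1e6:8.3f} Mpps")

# Assumed 500 MHz user clock: only a few cycles per packet at 100GE.
print(f"cycles per packet at 100GE, 500 MHz: {cycles_per_packet(100, 500):.2f}")
```

With a clock in the hundreds of MHz, the budget at 100GE comes to only a handful of cycles per packet, which is consistent with the slide's "2-3 packet operations" limit.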

Performance Bottlenecks are EVERYWHERE
• Limited Application Performance
• E.g. Network Security Appliance Implementation
  • Multiple ASICs, ASSPs, PLDs
  • Software
• IO Bandwidth: Multiple 10G, typically
  • 4-8 10GE lanes
  • 10-20 1GE lanes
• Application Performance:
  • Firewall Bandwidth: ~50%
  • IPS Throughput:

Application Requirements

Application Requirements
• Performance Matched Device Elements
• High Bandwidth Memory Elements
• Ability to Store Once, and Simultaneously Read Multiple Times

Application Requirements
• Simple Design Flow
• Design for Intent
• Automatic Optimization for Higher Performance
• Faster Design Cycle Times

Tabula Solutions for 100GE Packet Processing

100G L2-L4 Packet Parser Engine
• Function
  • 9-tuple header extraction at wireline speed
• Features
  • Multiple Ethernet packet formats
  • Rapid adaptation of additional packet formats
  • Microcode compiler
• Performance
  • < 0.05% Device Resources
  • < 20 ns Latency
  • 256b @ 500 MHz
[Diagram: Multi-port Mem (Pattern Match), Multi-port Mem (App Logic), Bit Processing, PL Reg, PL Reg]
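
To make "9-tuple header extraction" concrete, here is a small software model in Python. It is a sketch only, not the Tabula microcoded engine: the slide does not list the nine fields, so the tuple used here (destination MAC, source MAC, VLAN ID, EtherType, source IP, destination IP, IP protocol, source port, destination port) is an assumption, and the function name `parse_9_tuple` is hypothetical. Only untagged or single-tagged Ethernet carrying IPv4 with TCP or UDP is handled.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_VLAN = 0x8100
PROTO_TCP, PROTO_UDP = 6, 17


def parse_9_tuple(frame: bytes) -> dict:
    """Extract an assumed 9-tuple from an Ethernet/IPv4/TCP-or-UDP frame."""
    dst_mac, src_mac, ethertype = struct.unpack_from("!6s6sH", frame, 0)
    offset = 14
    vlan_id = None
    if ethertype == ETHERTYPE_VLAN:
        # 802.1Q tag: 16-bit TCI followed by the real EtherType.
        tci, ethertype = struct.unpack_from("!HH", frame, offset)
        vlan_id = tci & 0x0FFF
        offset += 4
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("this sketch only handles IPv4")
    ihl_bytes = (frame[offset] & 0x0F) * 4   # IPv4 header length in bytes
    protocol = frame[offset + 9]
    src_ip, dst_ip = struct.unpack_from("!4s4s", frame, offset + 12)
    if protocol not in (PROTO_TCP, PROTO_UDP):
        raise ValueError("this sketch only handles TCP and UDP")
    # TCP and UDP both start with source port, then destination port.
    src_port, dst_port = struct.unpack_from("!HH", frame, offset + ihl_bytes)
    return {
        "dst_mac": dst_mac.hex(":"), "src_mac": src_mac.hex(":"),
        "vlan_id": vlan_id, "ethertype": ethertype,
        "src_ip": ".".join(map(str, src_ip)),
        "dst_ip": ".".join(map(str, dst_ip)),
        "protocol": protocol, "src_port": src_port, "dst_port": dst_port,
    }


# Hand-built example frame: untagged Ethernet + 20-byte IPv4 header + TCP ports.
eth = bytes.fromhex("001122334455" "66778899aabb" "0800")
ipv4 = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, PROTO_TCP, 0,
                   bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
ports = struct.pack("!HH", 1234, 80)
fields = parse_9_tuple(eth + ipv4 + ports)
print(fields)
```

A hardware parser would extract these fields from fixed bit positions of a wide datapath word rather than with sequential byte reads; the sketch only shows which offsets matter.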

100GE – 12x10GE Layer2 Bridge

4 x 100G Crossbar
• Function
  • 4-port 100GE Crossbar
• Features
  • Support 288 KB of per port buffering
  • 8 queues / port with 32 KB / queue
  • Built in unicast / multicast support
• Performance
  • < 0.05% LUTs
  • < 20% LRAMs
  • Port-to-port latency ~12 ns
  • 256b @ 472 MHz
Ports are made of 3-ported RAM blocks
© 2013 Tabula Confidential
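
The "256b @ 472 MHz" figure can be checked against the 100 Gb/s port rate with one multiplication. The Python sketch below does that check, along with the same calculation for the parser's 256b @ 500 MHz datapath and the per-port queue total; the helper names are illustrative. Note that 8 queues of 32 KB account for 256 KB of the 288 KB per-port figure, and the slide does not say how the remainder is used.

```python
def datapath_gbps(width_bits, clock_mhz):
    """Raw datapath bandwidth in Gbit/s: bits per cycle times cycles per second."""
    return width_bits * clock_mhz / 1000.0


def queue_buffer_kb(queues_per_port, kb_per_queue):
    """Total queue storage per port in KB."""
    return queues_per_port * kb_per_queue


crossbar_bw = datapath_gbps(256, 472)   # crossbar datapath from the slide
parser_bw = datapath_gbps(256, 500)     # parser datapath from the parser slide
per_port_queues = queue_buffer_kb(8, 32)

print(f"crossbar datapath: {crossbar_bw:.3f} Gb/s")
print(f"parser datapath:   {parser_bw:.3f} Gb/s")
print(f"queue storage per port: {per_port_queues} KB")
```

Both datapaths come out above 100 Gb/s, which leaves some headroom over the nominal line rate.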

Ternary Search Engine

Tabula Technology & Tools

Spacetime Addresses The Interconnect Bottleneck
[Diagram: Interconnect, RAM, Logic, and Transceivers blocks with 2 GHz labels]
Everything can run at 2 GHz on Tabula Devices

Spacetime Addresses The Memory Bottleneck
• High bandwidth, multi-ported RAM blocks
  • Up to 24 ports
  • Includes ASYNC FIFO Controllers
• High bandwidth, multi-ported RAM enables
  • Crossbars
  • State machines & Microcontrollers
  • Data structures
  • Hash tables
  • ROMs

Queuing and De-Queuing
– This memory is divided into separate queues with independent write ports
– A single read port has access to all queues
• FPGA / ASIC: Multi-ported memory must be constructed out of dual-ported memories
• Similar principle for a single write port to multiple read ports
© 2013 Tabula, Inc.
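
The queuing arrangement described above can be modeled behaviorally in a few lines of Python. This is a sketch under assumptions, not a description of the device: the class name `QueueMemory`, the per-cycle interface, and the rule that a read sees the state before the same cycle's writes are all illustrative choices.

```python
from collections import deque


class QueueMemory:
    """One memory split into queues: a write port per queue, one shared read port."""

    def __init__(self, num_queues, depth):
        self.queues = [deque() for _ in range(num_queues)]
        self.depth = depth

    def cycle(self, writes, read_queue=None):
        """Model one clock cycle.

        writes: dict mapping queue index -> word (any subset of queues).
        read_queue: index of the queue the single read port dequeues from.
        Returns the dequeued word, or None if nothing was read.
        """
        out = None
        # The read is served first, so it sees the state before this cycle's writes.
        if read_queue is not None and self.queues[read_queue]:
            out = self.queues[read_queue].popleft()
        # All write ports operate independently in the same cycle.
        for q, word in writes.items():
            if len(self.queues[q]) >= self.depth:
                raise OverflowError(f"queue {q} is full")
            self.queues[q].append(word)
        return out


mem = QueueMemory(num_queues=4, depth=2)
mem.cycle({0: "a0", 1: "b0", 2: "c0"})           # three writes in one cycle
first = mem.cycle({0: "a1"}, read_queue=1)       # write queue 0, read queue 1
second = mem.cycle({}, read_queue=0)
print(first, second)
```

On a device with only dual-ported RAM blocks, the same behavior needs one RAM per queue plus a read multiplexer, which is the construction cost the slide refers to.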

Broadcasting
– A single write port provides access to the buffer for the input data
– Independent Read Ports have access to all of the data
• FPGA / ASIC: Multi-ported memory must be constructed out of dual-ported memories
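
One common way to build a one-write, many-read memory from dual-ported blocks is replication: every read port gets its own copy, and each write updates all copies. The Python model below illustrates that general technique and its storage cost; it is an assumption about how such a memory might be constructed, since the slide does not specify a method, and the class name is hypothetical.

```python
class ReplicatedMultiReadRAM:
    """1-write, N-read memory modeled as N banks with one write and one read port each."""

    def __init__(self, depth, read_ports):
        # One full copy of the data per read port.
        self.banks = [[0] * depth for _ in range(read_ports)]

    def write(self, addr, data):
        # The single write port drives the write side of every bank.
        for bank in self.banks:
            bank[addr] = data

    def read(self, port, addr):
        # Each read port uses the read side of its own bank.
        return self.banks[port][addr]

    def storage_words(self):
        # Replication multiplies storage by the number of read ports.
        return len(self.banks) * len(self.banks[0])


ram = ReplicatedMultiReadRAM(depth=16, read_ports=3)
ram.write(5, 0xCAFE)
ram.write(9, 0xBEEF)
same_cycle_reads = [ram.read(0, 5), ram.read(1, 9), ram.read(2, 5)]
print(same_cycle_reads, ram.storage_words())
```

The storage cost grows linearly with the number of read ports, which is why native multi-ported RAM blocks matter for broadcast-style buffers.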

Data Selection / Indexing
[Diagram: Memory feeding Logic Block 1, Logic Block 2, Logic Block 3]
• IPv4 header aligned to 32 bits
• Logic can operate on 64 bits and on smaller 16-bit or 8-bit portions using multiple ports from the memory
• Data normally appearing on different clocks can be accessed within the same clock using multiple read ports
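
As an illustration of the multi-width access described above, the Python sketch below reads a 64-bit word, a 16-bit field, and an 8-bit field from the same stored IPv4 header in one call, standing in for three read ports used in the same clock. The function name `read_ports` and the request format are hypothetical; the field offsets follow the standard IPv4 header layout.

```python
def read_ports(mem: bytes, requests):
    """Serve several reads at once; each request is (byte_offset, width_bytes)."""
    return [int.from_bytes(mem[off:off + width], "big") for off, width in requests]


# A 20-byte IPv4 header: total length 40, TTL 64, protocol 6, 10.0.0.1 -> 10.0.0.2.
header = bytes.fromhex("45000028" "00004000" "40060000" "0a000001" "0a000002")

word64, total_length, ttl = read_ports(header, [
    (0, 8),   # port 1: first 64 bits of the header
    (2, 2),   # port 2: 16-bit Total Length field
    (8, 1),   # port 3: 8-bit TTL field
])
print(hex(word64), total_length, ttl)
```

With a single read port, the TTL at byte 8 would arrive one 64-bit word later than the Total Length; separate ports let each logic block fetch its field directly.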

Tabula ABAX2P1 – Designed for 100G System Implementation
• User clock rate: up to 2 GHz
• Memory Rich Architecture: high capacity at high performance
• IO Rich Architecture: designed for 4 x 100 GigE traffic
• Hard IP
  • Memory Controllers
• Processing Rich Architecture
  • DSP: 2 GHz blocks
  • LCB:

Tabula Tools: Stylus®
ASIC / ASSP / FPGA: Multiple Optimization Steps
Timing Closure

Tabula Tools: Stylus®
ASIC / ASSP / FPGA: Multiple Optimization Steps
STYLUS: Timing Aware Design Flow
Synthesis, Placement, Routing
- Accurate feedback
- Short cycle

Summary

100GE Packet Processing & Beyond
• Industry trends driving 40GE / 100GE adoption
  • Big Data will be here sooner than we anticipate
  • Cheaper Optics: Silicon Photonics
  • Simplified Network Programming Initiative: SDN
• 40/100GE Packet Processing Solutions
  • Plan your architecture right
  • System Integration
    – Fixed Functions
    – Programmable Hardware
    – Software

Want To Learn More

• Tabula Solutions
• Tabula Technology & Tools
• http://customer.tabula.com

Questions