13.07.2015 Views

A Network Interface Card Architecture for I/O - IBM Haifa Labs


Technische Universität München

A Network Interface Card Architecture for I/O Virtualization in Embedded Systems

Holm Rauchfuss, Thomas Wild, Andreas Herkersdorf
Institute for Integrated Systems
Theresienstr. 90
D-8290 Munich, Germany
www.lis.ei.tum.de


Outline

• Motivation
• State of the Art for I/O virtualization
• Specific requirements for embedded systems
• Proposed architecture – ES-VNIC
  – Concept overview
  – Exemplary Rx packet processing
  – Preliminary performance estimation
  – Key components: Queue-Allocation and Management
• Future work and summary

H. Rauchfuss – A Network Interface Card Architecture for I/O Virtualization in Embedded Systems


Motivation

• Virtualization is mainstream in High Performance Computing / Data Center
  – XEN, KVM, VMWare, Intel, AMD, …
  – I/O virtualization (IOV) is under research
• Virtualization is an emerging topic for embedded systems (ES)
  – Multiprocessor System-on-Chips (MPSoCs)
  – Consolidation of different, dynamic workloads on a shared platform, e.g., automotive head unit

→ Do current concepts for I/O virtualization fit embedded systems?


State of the Art for I/O virtualization – Overview

• SW solutions
  – Virtual Machine Monitor
  – Driver domain
• Extensions in Network Interface Card (NIC)
  – Multi-queue network cards
  – Self-virtualized network cards (VNICs)


State of the Art – Virtual Machine Monitor and driver domain

• Hypervisor itself (VMWare)
  – Increased complexity and trusted computing base
  – Needs own drivers
• Driver domain (XEN)
  – Driver domain in critical data path
  – Latency and complex scheduling
  – Needs dedicated resources

[Figure [1]: Xen architecture. A management and central I/O domain (dom0) with the backend device driver and several guest domains (domU) with applications and frontend device drivers run on the Xen Virtual Machine Monitor (Control-IF, Secure HW-IF, Event-Channel, Virtual CPU, Virtual MMU), on top of the hardware: CPUs, memory, (I/O) devices.]


State of the Art – Multi-queue NICs and VNICs

• Multi-queue NIC (VMDq):
  – Fixed number of queue pairs
  – Relying on driver domain
  – Rx scheduled by packet arrival (head-of-line blocking), Tx as round robin
• VNICs (RiceNIC, SV-VNIC)
  – Embedded systems on their own (IXP2400, RiceNIC)
  – NIC-CPU/SW centric (RiceNIC)
  – More memory on NIC for each additional interface

[Figures [2], [3], [4]: existing NIC designs. The block diagram shows a NIC with Rx/Tx MAC, DMA and a NIC-CPU that performs management, DMA management, signaling, header parsing, queueing and scheduling out of NIC internal / instruction memory; the NIC connects via the system bus to the CPUs and to system memory holding P/C lists, Rx/Tx rings and packets.]


Specific requirements for a NIC regarding ES and IOV

Extend the (sole) goal of maximum throughput:
• Low latency
• Real-time processing (for certain domains)
• Differentiated service levels with signaling
  – Prioritization of packets and interfaces
  – Bandwidth guarantees
• Limited HW extensions on NIC for I/O virtualization and reasonable size compared to the actual embedded system
• Offloaded I/O virtualization from VMM and domains, i.e., spare CPU power/processing

→ New concepts/architectures are needed for I/O virtualization in embedded systems


ES-VNIC – Proposed Architecture

[Figure: ES-VNIC block diagram. The NIC contains Rx/Tx MAC, NIC buffer, header-parsing FSMs, management, scheduling, queue allocation, signaling, DMA, and a local cache for contexts, P/C lists and Rx/Tx queues. It connects via the system bus to the CPUs and to system memory holding contexts, P/C lists, Rx/Tx rings and packets.]

• Pipelined, (re-)configurable and multithreaded FSMs for packet processing
• System memory is the primary storage for configuration and data; cache on NIC

→ Scalable and flexible resource sharing between interfaces


Header parsing and buffering

• Parsing of packet header to determine receiving domain (minimum: MAC, VLAN)
• Buffering packet on NIC as a whole
• Arbitrary access to any packet to allow out-of-order processing, e.g., for high-priority packets

[Figure: ES-VNIC block diagram]
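
A minimal sketch of the classification step described on this slide, assuming a lookup keyed on destination MAC address and VLAN ID. The table contents, the domain names and the function name `classify` are hypothetical and not taken from the slides; only the Ethernet and IEEE 802.1Q field layout is standard.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an IEEE 802.1Q VLAN tag

# Hypothetical lookup table: (destination MAC, VLAN ID or None) -> receiving domain.
DOMAIN_TABLE = {
    (bytes.fromhex("020000000001"), 10): "domU-1",
    (bytes.fromhex("020000000002"), None): "domU-2",
}


def classify(frame: bytes):
    """Return the receiving domain for an Ethernet frame, or None if unknown."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    dst_mac = frame[0:6]
    (ethertype,) = struct.unpack("!H", frame[12:14])
    vlan_id = None
    if ethertype == TPID_8021Q:
        if len(frame) < 18:
            raise ValueError("truncated VLAN tag")
        (tci,) = struct.unpack("!H", frame[14:16])
        vlan_id = tci & 0x0FFF  # lower 12 bits of the tag control information
    return DOMAIN_TABLE.get((dst_mac, vlan_id))
```

Only the first 14 to 18 bytes are inspected, which is consistent with the slide's point that header parsing depends on header size rather than packet size.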


(Re-)Configuration via management

• Contexts define interfaces (handling, priority, base addresses, …)
• Stored in system memory and cached for active interfaces on the ES-VNIC
• Pinning for critical interfaces
• Parallel handling of packets to decrease stalling

[Figure: ES-VNIC block diagram]
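
The caching and pinning behavior can be sketched as follows. This is an illustration only: the slides do not specify a replacement policy, so least-recently-used eviction is an assumption, and the class name `ContextCache` and its methods are hypothetical.

```python
from collections import OrderedDict


class ContextCache:
    """Small on-NIC cache of interface contexts backed by system memory."""

    def __init__(self, capacity, system_memory):
        self.capacity = capacity
        self.system_memory = system_memory  # dict: interface id -> context
        self.entries = OrderedDict()        # least recently used entry first
        self.pinned = set()                 # interface ids that must stay cached
        self.misses = 0

    def pin(self, if_id):
        """Load a context and protect it from eviction (critical interface)."""
        self.lookup(if_id)
        self.pinned.add(if_id)

    def lookup(self, if_id):
        if if_id in self.entries:
            self.entries.move_to_end(if_id)  # mark as most recently used
            return self.entries[if_id]
        self.misses += 1                     # context must be fetched over the bus
        if len(self.entries) >= self.capacity:
            self._evict()
        context = self.system_memory[if_id]
        self.entries[if_id] = context
        return context

    def _evict(self):
        # Assumed policy: drop the least recently used context that is not pinned.
        victim = next((k for k in self.entries if k not in self.pinned), None)
        if victim is None:
            raise RuntimeError("all cached contexts are pinned")
        del self.entries[victim]
```

With a pinned context, a lookup for a real-time interface never pays the system-memory fetch, regardless of how many other interfaces become active.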


Queue-Allocation and scheduling

• DMA descriptors for packets stored in system memory and cached on ES-VNIC
• Transfer of packet scheduled based on active packets and priority (from context)

[Figure: ES-VNIC block diagram]
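
A minimal sketch of priority-based selection among the packets currently held in the NIC buffer. The slides only state that scheduling uses the active packets and the priority from the context; the convention that a lower number means higher priority, the tie-break by arrival order, and the name `RxScheduler` are assumptions made for this example.

```python
import heapq
import itertools


class RxScheduler:
    """Pick the next buffered packet to transfer, highest priority first."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # arrival order breaks ties

    def enqueue(self, priority, packet_id):
        # Assumption: a lower number means a higher priority.
        heapq.heappush(self._heap, (priority, next(self._arrival), packet_id))

    def next_transfer(self):
        """Return the packet id to hand to DMA next, or None if idle."""
        if not self._heap:
            return None
        _, _, packet_id = heapq.heappop(self._heap)
        return packet_id
```

Because any buffered packet can be selected, a late high-priority packet overtakes earlier low-priority ones, which avoids the head-of-line blocking of arrival-order scheduling.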


DMA packet to system memory

• Transfer of packet to system memory with DMA

[Figure: ES-VNIC block diagram]


Preliminary performance estimation

[Figure: block diagram of a NIC-CPU based NIC next to the ES-VNIC block diagram]

• Firmware on NIC-CPU with sequential trail of tasks
  → Potential bottleneck
• Data cache (re-)loading and instruction fetching not optimal for packet processing related tasks
  → FSMs better suited
• Pipelined architecture with several stages for ES-VNIC
  → Same throughput with lower frequency


Preliminary performance estimation (cont.)

Stage times: $T_{\text{NIC Buffer}}$, $T_{\text{Header-Parsing}}$, $T_{\text{Management}}$, $T_{\text{Scheduling}}$, $T_{\text{Queue-Alloc}}$, $T_{\text{DMA}}$

$$T_{\text{Delay}}^{RX} = \max(T_{\text{NIC Buffer}}, T_{\text{Header-Parsing}}) + T_{\text{Management}} + \max(T_{\text{Scheduling}}, T_{\text{Queue-Alloc}}) + T_{\text{DMA}}$$

$$T_{\text{Delta}}^{RX} = \max\bigl(\max(T_{\text{NIC Buffer}}, T_{\text{Header-Parsing}}),\ T_{\text{Management}},\ \max(T_{\text{Scheduling}}, T_{\text{Queue-Alloc}}),\ T_{\text{DMA}}\bigr)$$
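
The two estimates on this slide can be evaluated with a short sketch. It reads the delay as the sum over the pipeline stages (with parallel stages combined by a maximum) and the delta, i.e. the minimum spacing between packets, as the slowest stage. The field and function names and the sample cycle counts are hypothetical, chosen only to illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class StageTimes:
    """Per-stage processing times, all in the same unit (e.g. clock cycles)."""
    nic_buffer: float
    header_parsing: float
    management: float
    scheduling: float
    queue_alloc: float
    dma: float


def rx_delay(t: StageTimes) -> float:
    """Latency of one packet through the Rx pipeline (sum of stages)."""
    return (max(t.nic_buffer, t.header_parsing)
            + t.management
            + max(t.scheduling, t.queue_alloc)
            + t.dma)


def rx_delta(t: StageTimes) -> float:
    """Minimum spacing between packets (the slowest pipeline stage)."""
    return max(max(t.nic_buffer, t.header_parsing),
               t.management,
               max(t.scheduling, t.queue_alloc),
               t.dma)


# Hypothetical cycle counts, not measurements from the slides.
example = StageTimes(nic_buffer=64, header_parsing=14, management=4,
                     scheduling=2, queue_alloc=6, dma=40)
print(rx_delay(example), rx_delta(example))
```

For these sample values the delay is 114 cycles while the delta is 64 cycles, which shows how pipelining lets throughput be set by the slowest stage instead of the full path.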


Preliminary performance estimation (cont.)

$$T_{\text{Delta}}^{RX} = \max\bigl(\max(T_{\text{NIC Buffer}}, T_{\text{Header-Parsing}}),\ T_{\text{Management}},\ \max(T_{\text{Scheduling}}, T_{\text{Queue-Alloc}}),\ T_{\text{DMA}}\bigr)$$

• $T_{\text{Header-Parsing}}$: Depends on header size
• $T_{\text{NIC Buffer}}$: Depends on packet size, dominant compared to $T_{\text{Header-Parsing}}$
• $T_{\text{Management}}$: Depends on cache hit and system bus/memory (only a few cycles if cached)
• $T_{\text{Queue-Alloc}}$: Depends on cache hit and system bus/memory (only a few cycles if cached)
• $T_{\text{Scheduling}}$: Only a few cycles
• $T_{\text{DMA}}$: Depends on packet size and system bus/memory

Worst case: 64-byte packets back-to-back for 1 Gb Ethernet

$$\frac{(64+20) \cdot 8\,\text{bit}}{1\,\text{Gbit/s}} = 672\,\text{ns}, \qquad 672\,\text{ns} \cdot 125\,\text{MHz} = 84\ \text{cycles}$$

ES-VNIC scales with system (bus/memory)
Pinning/Prefetch for real-time interfaces
→ $T_{\text{Queue-Alloc}}$ and $T_{\text{Management}}$ are the most critical elements
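
The worst-case budget can be reproduced with a few lines. The 20 bytes of per-frame overhead correspond to the standard Ethernet preamble with start-of-frame delimiter (8 bytes) plus the minimum inter-frame gap (12 bytes). The function names are hypothetical.

```python
def wire_time_ns(frame_bytes, line_rate_bps=1_000_000_000, overhead_bytes=20):
    """Time one frame occupies the wire, including preamble and inter-frame gap."""
    bits = (frame_bytes + overhead_bytes) * 8
    return bits * 1e9 / line_rate_bps


def cycle_budget(frame_bytes, clock_hz, line_rate_bps=1_000_000_000):
    """Clock cycles available per frame when frames arrive back-to-back."""
    return wire_time_ns(frame_bytes, line_rate_bps) * clock_hz / 1e9


print(wire_time_ns(64))                # minimum-size frame at 1 Gbit/s
print(cycle_budget(64, 125_000_000))   # budget at a 125 MHz clock
```

Every pipeline stage has to finish within this budget for the NIC to sustain line rate with minimum-size frames, which is why the stages that depend on cache hits matter most.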


ES-VNIC – Queue-Allocation

[Figure: Rx rings and Tx rings (A, B, C, D, …) reside in system memory and are referenced from the P/C lists; a pool of assignable queues on the NIC caches their DMA descriptors and feeds the scheduling stage.]

• Rx/Tx Rings with DMA descriptors are held in system memory
• Assignable queues provide a cache for DMA descriptors of active interfaces
• Sharing of queues between interfaces and reserving of queues for real-time interfaces
• Uneven number of Rx/Tx interfaces possible for broadcast and service differentiation
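
One way to picture the sharing and reserving of assignable queues is a pool split into a shared part and a part reserved for real-time interfaces. This is a sketch under assumptions: the slides do not describe the allocation policy, and the class `QueuePool` with its methods is hypothetical.

```python
class QueuePool:
    """Pool of on-NIC queues; some are reserved for real-time interfaces."""

    def __init__(self, total, reserved_for_rt):
        if reserved_for_rt > total:
            raise ValueError("cannot reserve more queues than exist")
        self.shared_free = total - reserved_for_rt
        self.rt_free = reserved_for_rt
        self.assigned = {}  # interface id -> "shared" or "rt"

    def acquire(self, if_id, real_time=False):
        """Assign a queue to an interface; return False if none is available."""
        if if_id in self.assigned:
            return True
        if real_time and self.rt_free > 0:
            self.rt_free -= 1
            self.assigned[if_id] = "rt"
            return True
        if self.shared_free > 0:
            self.shared_free -= 1
            self.assigned[if_id] = "shared"
            return True
        return False  # caller falls back to the rings in system memory

    def release(self, if_id):
        """Return the queue of an interface that is no longer active."""
        kind = self.assigned.pop(if_id, None)
        if kind == "rt":
            self.rt_free += 1
        elif kind == "shared":
            self.shared_free += 1
```

In this sketch a burst of best-effort interfaces can exhaust the shared queues without ever taking the queue held back for a real-time interface.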


ES-VNIC – Management

[Figure: contexts (A, B, …, Z) reside in system memory; the local cache on the NIC holds the contexts of active interfaces and feeds the multithreaded FSMs, which take input from header parsing and drive the P/C lists, queue allocation and scheduling.]

• Contexts are held in system memory
• Fetched/cached on NIC for active interfaces
• Pinning for real-time interfaces
• Multithreaded FSM at the core


Multithreaded, (re-)configurable FSMs

• Fast switching for handling different packets/interfaces
• Based on memory-based FSMs
• Extended for changing by context
  – Default behavior
  – Adding/removing states and transitions, e.g., polling instead of signaling
• Multiple sets of input/output registers for holding information even when context has been removed

[Figure: memory-based FSM configured by contexts, with multiple sets of input registers and output registers.]
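
A minimal sketch of a table-driven (memory-based) FSM with one state register per thread and per-context transition overrides. The states, events and action names are invented for this example, and the override mechanism is only one possible reading of "extended for changing by context"; none of it is taken from the slides.

```python
# Hypothetical default transition memory: (state, event) -> (next state, action).
DEFAULT_TABLE = {
    ("IDLE", "packet"): ("PARSE", None),
    ("PARSE", "parsed"): ("NOTIFY", None),
    ("NOTIFY", "tick"): ("IDLE", "raise_interrupt"),
}


class MultithreadedFSM:
    """One transition memory shared by several threads of execution."""

    def __init__(self, table, num_threads):
        self.table = dict(table)
        self.state = ["IDLE"] * num_threads              # one state register per thread
        self.overrides = [{} for _ in range(num_threads)]

    def load_context(self, thread, overrides):
        """Apply context-specific transitions on top of the default behavior."""
        self.overrides[thread] = dict(overrides)

    def step(self, thread, event):
        """Advance one thread by one event and return the action, if any."""
        key = (self.state[thread], event)
        entry = self.overrides[thread].get(key, self.table.get(key))
        if entry is None:
            return None  # no matching transition: state is kept
        next_state, action = entry
        self.state[thread] = next_state
        return action


fsm = MultithreadedFSM(DEFAULT_TABLE, num_threads=2)
# Thread 1 handles an interface whose context asks for polling instead of signaling.
fsm.load_context(1, {("NOTIFY", "tick"): ("IDLE", "set_poll_flag")})
```

Switching between packets then only means selecting a different thread index; the state of the other threads stays in their own registers.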


Future work and summary

• ES-VNIC addresses I/O virtualization requirements for embedded systems
  – Work in progress
• Validate architecture by simulation of key components
  – Scenarios with high dynamics (contexts, DMA descriptors, …)
  – Dimensioning of cache size, packet buffers, queues
  – Number of multithreaded FSMs and their functional verification
• Implement as part of an MPSoC demonstrator in an FPGA


Discussion


References for pictures

[1] M. Mahalingam, "I/O Virtualization (IOV) for Dummies", VMworld 2007
[2] "Intel Virtualization Technology for Connectivity", IDF 2008
[3] H. Raj and K. Schwan, "High Performance and Scalable I/O Virtualization via Self-Virtualized Devices", HPDC 2007
[4] J. Shafer and S. Rixner, "A Reconfigurable and Programmable Gigabit Ethernet Network Interface Card", Rice University Electrical and Computer Engineering Technical Report TREE0611
