The Wishbone II transaction bus - Military Embedded Systems

Hardware: Systems – big and small

The Wishbone II transaction bus: Another grade of speed

By Uros Platise

Wishbone specifications have been released by OpenCores and Silicore with the aim of providing a standard IP core interconnection scheme that fulfills the requirements of modern System-on-Chip (SoC) designs, including CPUs, DMA engines, memory interfaces, peripheral interfaces, and so on. andEuros has used the Wishbone specification since its inception and has developed an improved version of the Wishbone bus, called Wishbone II. It proposes an advanced pipelined architecture in which read and write transactions are separated and the bus acts as a transaction bus. In this way, multiple transactions can take place at the same time, removing the latencies along the path, and a new per-cell locking concept avoids the bus stalls otherwise caused by read-modify-write (RMW) cycles. The ultimate benefit is that bus throughput is raised to its maximum.

Design and development of large-scale FPGA/ASIC SoC designs have forced designers to implement a modular architecture with a standardized module interface that connects various IP modules in any possible configuration. One of the most popular interconnect architectures, the Wishbone B.3 bus, was released by OpenCores (www.opencores.org). Similarly, Altera has introduced its own interconnect scheme, the Avalon Bus (www.altera.com), around which SOPC Builder and Nios (II) systems are built. Xilinx has also introduced its own bus, the On-Chip Peripheral Bus, combined with the Processor Local Bus (www.xilinx.com).

These interconnect architectures are single-transaction and master/slave oriented, meaning that a CPU requesting a word from a given address stalls itself and the path (bus) to the destination until that word is received. Many bus cycles are lost in this way, giving lower actual data throughput than expected despite the relatively high system bus frequency.
Even with fast burst reads and writes introduced by special signals, bus cycles are still lost until the first word is received, at the additional cost of duplicating the burst logic on both sides, source and destination. Bus stalling is more evident when accessing slower modules with greater latencies. In these cases, system performance degrades dramatically; for example, a 100 MHz system may see its throughput fall as low as a few MB per second.

That is why there was a pressing need to develop bus architectures employing new concepts. Some new signals have been introduced to support new transaction bus concepts based on the Wishbone B.3 architecture, overcoming latency issues while maintaining backwards compatibility.

Wishbone II transaction bus concept

In our proposed bus, transactions are represented by a transaction vector containing:
• Source (module) address
• Destination (module) address
• Operator
• Data

Source and destination addresses define the path; the operator describes one or more operations to be executed along the path and/or at the destination address; and some operations require supplemental data to complete the transaction. An actual implementation requires additional handshaking signals.

Transaction vectors are placed onto a transaction bus, which transports each vector from source to destination and executes bus-oriented operations as requested by the vector. Once the transaction vector is placed (sent), the source has no further responsibility and the transaction bus takes complete control over it. The source is then ready to issue the next transaction vector. Multiple requests may be issued ahead of time, one per bus cycle, which reduces the need for prediction logic at the destination module to support the various kinds of burst reads or writes.

There are two kinds of transactions:
• Independent
• Dependent (when their order is important)

To support dependent transactions, the transaction bus must never change the order of already placed transactions.
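The transaction vector and the ordering rule described above can be sketched in a few lines of Python. This is only an illustrative model, not the andEuros implementation: the names TransactionVector, Operator, and TransactionBus are hypothetical, and the bus is reduced to a single FIFO so that placed transactions are delivered strictly in order.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Operator(Enum):
    # Illustrative subset of operators; the names are assumptions.
    READ = "read"
    WRITE = "write"


@dataclass(frozen=True)
class TransactionVector:
    source: int                 # source (module) address
    destination: int            # destination (module) address
    operator: Operator          # operation to execute at the destination
    data: Optional[int] = None  # supplemental data, e.g. the word to write


class TransactionBus:
    """Hypothetical in-order bus model: one FIFO that is never reordered."""

    def __init__(self) -> None:
        self._fifo = deque()

    def place(self, vector: TransactionVector) -> None:
        # After placing a vector the source is free to issue the next one.
        self._fifo.append(vector)

    def deliver(self) -> Optional[TransactionVector]:
        # Vectors leave in exactly the order in which they were placed.
        return self._fifo.popleft() if self._fifo else None


bus = TransactionBus()
# Two dependent transactions: the read must observe the preceding write.
bus.place(TransactionVector(source=1, destination=0x40,
                            operator=Operator.WRITE, data=7))
bus.place(TransactionVector(source=1, destination=0x40,
                            operator=Operator.READ))
delivered = [bus.deliver(), bus.deliver()]
```

Because the model never reorders its FIFO, the write is delivered before the read, which is the property dependent transactions rely on.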
The transaction bus features a fully acknowledged mechanism to accept new transaction vectors, execute internal forwarding, and deliver to the destination module. The transparent architecture presents itself as a simple input-output black box; however, the implementation is based on a multi-pipelined structure where each (FIFO) line holds one transaction vector.

The Wishbone II transaction bus proposes only four basic operations:
• Single read
• Single write
• Cell lock
• Bus lock

Single reads and writes are issued by modules, whereas cell and bus locking operations are in the transaction bus domain. Burst reads and burst writes are accomplished by issuing a stream of read or write transactions. RMW cycles are supported through the bus lock or, even better, through the new cell locking concept, which instead of stalling the complete SoC bus locks only a single memory cell or multiple memory cells to a given owner. These cells cannot be accessed by others until they are unlocked.

Wishbone II signals

A Wishbone II transaction vector is composed from the Wishbone B.3 specification by introducing the following new signals:

WB_ACW   Write Acknowledge
WB_ACR   Read Acknowledge
WB_TGA   Address Tag in both directions
WB_ALK   Address Lock

In the text that follows, the prefix WB may be changed to WBM to denote a master interface or to WBS to denote a slave interface, or it may be omitted to describe any master or slave interface. Input signals are suffixed with _I and output signals with _O. The proposed bus discards the Wishbone B.3 ACK signal since its functionality is now split between the ACR and ACW signals. Complete basic signal descriptions for master and slave are listed in Table 1.
New signals are marked with an asterisk (*).

Table 1

DESCRIPTION                             MASTER        SLAVE
Data from master to slave (Data)        WBM_DAT_O     WBS_DAT_I
Data from slave to master (Data)        WBM_DAT_I     WBS_DAT_O
Slave (Destination) Address             WBM_ADR_O     WBS_ADR_I
Transaction Strobe (Handshaking)        WBM_STB_O     WBS_STB_I
Destination Operation (Operator)        WBM_WE_O      WBS_WE_I
Bus Lock (Operator)                     WBM_LOCK_O    WBS_LOCK_I
Write Acknowledge (Handshaking) *       WBM_ACW_I     WBS_ACW_O
Read Acknowledge (Handshaking) *        WBM_ACR_I     WBS_ACR_O
Address Tag Write (Source) *            WBM_TGA_O     WBS_TGA_I
Address Tag Read (Destination) *        WBM_TGA_I     WBS_TGA_O
Address Lock (Operator) *               WBM_ALK_O     WBS_ALK_I

Wishbone II bus transactions

Write transactions

A write transaction is almost identical to the write transaction given in the Wishbone B.3 specification, except that Wishbone II uses the ACW signal to acknowledge a write cycle. Read requests are identical to write transactions except that the destination operation signal WE is cleared to indicate a read.

Read transactions

A read transaction is composed of two transactions:
• Read request transaction issued by the source
• Read response transaction issued by the destination

A read request is sent by the master module, representing the source, by first issuing a write transaction with the destination operation WE set to read. The master should set the Address Tag Write vector to identify the read response. (If there is a single master, this is not necessary.)

[Figure 1: MASTER connected to SLAVE (memory) through one pipeline stage (1/z) on the write path and one pipeline stage (1/z) on the read path.]
The read request transaction is acknowledged in the same way as the write transaction. The destination completes the transaction by returning a separate read response transaction, marked by the acknowledge signal ACR and providing valid data and Address Tag Read information. Address Tag Read is a copy of the Address Tag Write.

Figure 1 shows an example system with one pipeline stage on each of the write (input) and read (output) paths between the source (master) and destination (slave) devices. The system has a 1-cycle delay in each direction; therefore, a request-response loop takes at least 2 wait cycles. The slave (memory) may also perform some internal management, such as refresh, which adds to the total number of wait states.
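The split read and the two-cycle loop of the example system can be illustrated with a small cycle-by-cycle model. It is a sketch under stated assumptions, not a description of the actual hardware: simulate_reads is a hypothetical function, the slave is assumed to answer with zero internal latency, and each request carries its own illustrative tag value so that the copy from Address Tag Write to Address Tag Read is visible.

```python
from collections import deque


def simulate_reads(requests, memory, stages=1):
    """Model a master and slave joined by `stages` registers per direction.

    `requests` is a list of (address, tag) pairs, issued one per cycle
    starting at cycle 0. Returns the (cycle, tag, data) responses as seen
    by the master. The slave is assumed to answer without extra latency.
    """
    write_path = deque([None] * stages)  # registers master -> slave
    read_path = deque([None] * stages)   # registers slave -> master
    pending = deque(requests)
    responses = []
    cycle = 0
    while (pending
           or any(v is not None for v in write_path)
           or any(v is not None for v in read_path)):
        at_slave = write_path.popleft()   # request reaching the slave now
        at_master = read_path.popleft()   # response reaching the master now
        if at_master is not None:
            responses.append((cycle, *at_master))
        # The slave copies the request tag into the response (read tag).
        if at_slave is None:
            read_path.append(None)
        else:
            address, tag = at_slave
            read_path.append((tag, memory[address]))
        # The master places at most one new read request per cycle.
        write_path.append(pending.popleft() if pending else None)
        cycle += 1
    return responses


memory = {0x10: 0xAA, 0x11: 0xBB, 0x20: 0xCC}
responses = simulate_reads([(0x10, 0), (0x11, 1), (0x20, 2)], memory)
```

In this model the first response arrives two cycles after the first request is issued, and the following responses arrive back to back, one per cycle, because the master never waits for a response before issuing the next request.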

Figure 2 depicts a transaction bus data flow diagram for this example, with three read request transactions placed by the master as AD0, AD1, and AD2, and the associated returned read response transactions as DO0, DO1, and DO2. The signal WE is assumed to be cleared for all three transactions to indicate read operations. Transactions AD0 and AD1 form a burst, meaning that AD1 = AD0 + 1, and AD2 is an independent transaction triggered in the meantime, for example by an external interrupt that loads its interrupt vector.

Each read request transaction is acknowledged by the ACW signal, and the returned read response transaction is marked (acknowledged) by the ACR signal. Note that the latency may not be the same for every transaction, due to other higher-priority masters, memory refresh functions, and so on. In this example, AD0 is immediately acknowledged but it takes 3 wait cycles to return DO0; AD1 is acknowledged 1 cycle later while DO1 is returned in only 2 wait cycles; and DO2 again takes 3 wait cycles. All three transactions are completed in 9 cycles; theoretically, without the two illustrative wait cycles, they would complete in only 7 cycles.

The same scenario using the Wishbone B.3 specification is shown in Figure 3, where again AD0 and AD1 form a burst, AD1 = AD0 + 1, and AD2 is an independent request. All three transactions are completed in 12 cycles, a performance decrease of about 41 percent relative to the 7-cycle minimum of Wishbone II, and this despite the additional silicon cost of implementing memory burst logic on both sides, source and destination.

Imagine a continuous burst: Wishbone II would perform with 0 wait cycles (latency is completely removed) and with absolutely no loss (again 0 wait cycles) at the slave side for issuing the first word when more than one master coexists in the system.
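The arithmetic behind this comparison can be checked with a deliberately simplified cycle-count model. The functions below are hypothetical and do not reproduce the exact timing of Figures 2 and 3: the blocking model assumes every word pays the full wait time, while the pipelined model assumes one request per cycle with the latency paid only once.

```python
def blocking_cycles(n_words, wait_cycles):
    # Single-transaction bus: each word costs one transfer cycle plus
    # the wait cycles before the next request can be issued.
    return n_words * (1 + wait_cycles)


def pipelined_cycles(n_words, wait_cycles):
    # Transaction bus: one request per cycle, so the latency is paid
    # once for the whole stream rather than once per word.
    return n_words + wait_cycles


def slowdown_percent(slow_cycles, fast_cycles):
    # Share of the slower run that the faster run saves.
    return 100.0 * (slow_cycles - fast_cycles) / slow_cycles


# Cycle counts quoted in the text: 12 cycles versus the 7-cycle minimum.
quoted = slowdown_percent(12, 7)

# Long burst with a fixed latency of 2 wait cycles (assumed values).
n = 1000
pipelined_efficiency = n / pipelined_cycles(n, 2)  # words per cycle
blocking_efficiency = n / blocking_cycles(n, 2)    # words per cycle
```

With the quoted cycle counts, slowdown_percent evaluates to roughly 41.7, which matches the 41 percent figure after truncation. For long bursts the pipelined efficiency approaches one word per cycle, whereas the blocking model stays at one word per three cycles.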
For a system running at 150 MHz, long bursts with a fixed latency of 2 cycles would yield a Wishbone II bandwidth of 150 Mwords per second, but a Wishbone B.3 bandwidth of only 50 Mwords per second.

[Figure 2: Wishbone II timing diagram. Signals: CLK; DAT, ADR, STB, WE, TGA, LOCK, ALK (requests AD0, AD1, AD2); STB; ACW; ACR; DAT, TGA (responses DO0, DO1, DO2).]

[Figure 3: Wishbone B.3 timing diagram. Signals: CLK; DAT, ADR, STB, WE (requests AD0, AD1, AD2); STB; ACK; DAT (responses DO0, DO1, DO2).]

Read-modify-write cycles and exclusive bus/address locking

A read-modify-write cycle can be performed using the bus LOCK signal: issue the read request with the LOCK signal set, wait for the read response, follow it with a write, and finally release the LOCK. To avoid stalling the complete bus, Wishbone II introduces a per-cell memory locking feature using the ALK signal. It is used in almost the same way as the Wishbone LOCK signal, except that it does not stall the complete bus but grants exclusive permission to a given module, identified by the source TGA.

Wishbone II races into the future

The Wishbone II bus proposes an advanced transaction bus-oriented architecture for SoC designs in FPGAs and ASICs, in which write and read operations are handled as separate write and read transactions. Each transaction is stored in a single line, and the multi-pipeline architecture acts as a FIFO buffer transporting multiple transactions from and to multiple source and destination modules. An advanced locking mechanism, based on temporary per-cell locks, prevents the complete bus from stalling during RMW cycles. In this way, overall design data throughput is increased up to the maximum while the design successfully integrates slow- and high-speed, low- and high-latency peripherals and CPUs.
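The per-cell locking idea summarized above can be sketched as a small lock table keyed by address. This is an assumption-laden illustration rather than the Wishbone II implementation: CellLockTable and its methods are hypothetical names, and the owner is identified by an integer standing in for the source TGA.

```python
class CellLockTable:
    """Hypothetical per-cell lock table: locked address -> owner tag."""

    def __init__(self):
        self._owner = {}

    def lock(self, address, tag):
        # Grant the lock if the cell is free or already owned by `tag`.
        owner = self._owner.setdefault(address, tag)
        return owner == tag

    def unlock(self, address, tag):
        # Only the current owner may release the cell.
        if self._owner.get(address) == tag:
            del self._owner[address]
            return True
        return False

    def may_access(self, address, tag):
        # Unlocked cells are open to everyone; locked cells to the owner.
        owner = self._owner.get(address)
        return owner is None or owner == tag


memory = {0x40: 5, 0x44: 9}
locks = CellLockTable()

# Master 1 performs a read-modify-write on cell 0x40 under a cell lock.
granted = locks.lock(0x40, tag=1)
blocked_same_cell = not locks.may_access(0x40, tag=2)  # master 2 must wait
free_other_cell = locks.may_access(0x44, tag=2)        # rest of bus is usable
memory[0x40] = memory[0x40] + 1                        # modify and write back
released = locks.unlock(0x40, tag=1)
open_again = locks.may_access(0x40, tag=2)
```

The point of the sketch is that a second master is held off only on the locked cell, while every other address remains accessible throughout the read-modify-write sequence.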

Uros Platise has been R&D manager at andEuros for more than 10 years, specializing in electronics, robotics, and software engineering. His expertise includes FPGA architectures, price/performance optimizations, communication protocols, sensor networks, and so forth. He will receive his PhD in Isotropic Networks from JSI in Ljubljana, Slovenia. Uros can be contacted at uros@andEuros.org.

andEuros • +385-52-777-341 • www.andEuros.org/erd

To find out more about Wishbone II, visit www.andEuros.com/erd.

© 2008 OpenSystems Publishing. Not Licensed for distribution. Visit opensystems-publishing.com/reprints for copyright permissions.
