Technologic - OpenSystems Media
www.embedded-computing.com
December 2012 | Volume 10 • Number 8

ON THE COVER
High-speed interconnect technology has become a necessary component to optimize the performance of high-bandwidth embedded systems. Read updates on some of the leading interconnect standards in our Silicon section, and don't miss our special advertising section exploring the question of "What's Hot in 2013 in Embedded?"

BLOGS | MARKET STATS | SURVEYS | VIDEOS | SOCIAL NET

7	Tracking Trends in Embedded Technology: Bandwidth demands redefine interconnect technology
	By Warren Webb
42	E-community Beat: Keeping the embedded conversation going
	By Jennifer Hesse

Silicon: High-speed interconnects
8	PCI Express over M-PHY: Proven technologies scale I/O performance to meet mobile needs
	Q&A with Rick Wietfeldt, MIPI Alliance, and Al Yanes, PCI-SIG
10	RapidIO: Optimized for low-latency processor connectivity
	By Sam Fuller, RapidIO Trade Association
13	Embedded DisplayPort: Increased flexibility and power savings render greater display efficiency
	By Craig Wiley, VESA

Software: Android
16	Developing Android-driven applications for automotive infotainment
	By Andrew Patterson, Mentor Graphics

Strategies: Multicore
20	Heterogeneous system architecture: Multicore image processing using a mix of CPU and GPU elements
	By Frank Altschuler and Jonathan Gallmeier, AMD
26	A modern software platform for the era of multicore and cloud computing
	Q&A with Mark Brewer, Typesafe
29	Next-generation multicore SoC architectures for tomorrow's communications networks
	By David Sonnier, LSI Corporation

2012 OpenSystems Media ® © 2012 Embedded Computing Design. All registered brands and trademarks within Embedded Computing Design magazine are the property of their respective owners. iPad is a trademark of Apple Inc., registered in the U.S. and other countries. App Store is a service mark of Apple Inc.
ISSN: Print 1542-6408, Online: 1542-6459

E-CASTS
Building Intelligent Therapeutic Devices
December 11, 12 p.m. EST
Presented by: Wind River
Improving Healthcare through Standards and Interoperability
December 12, 11 a.m. EST
Presented by: IBM, Freescale Semiconductor, Wind River

4 | December 2012 Embedded Computing Design www.embedded-computing.com


PERFORMANCE OR SWaP
You Can Do More with Less.

XPand6000 Series Small Form Factor (SFF) Systems from X-ES support the highest-performance Freescale QorIQ and 3rd generation Intel® Core i7 processors in the smallest SFF systems available, at less than 72 in³ and 3.5 lbs. With COM Express processor modules, PMC/XMC I/O, and SSD storage, application-ready SFF systems enable you to focus on your area of expertise. Call us today to learn more.

Tight on space and high in performance. That's Extreme.

Extreme Engineering Solutions
608.833.1155 | www.xes-inc.com


Silicon | High-speed interconnects

Q & A

PCI Express over M-PHY: Proven technologies scale I/O performance to meet mobile needs

Q&A with Rick Wietfeldt, Chair, MIPI Alliance Technical Steering Group, and Al Yanes, President and Chairman, PCI-SIG

The MIPI Alliance and PCI-SIG recently announced an agreement to deliver an adaptation of the PCI Express (PCIe) architecture over the MIPI M-PHY physical layer technology to provide scalable I/O functionality for mobile devices. Rick and Al describe how this collaboration combines the interoperability of the PCIe standard with the power efficiency of the M-PHY physical layer to create a low-power, high-performance technology for the mobile market.

ECD: What is the MIPI Alliance, and how does it help embedded device developers?

WIETFELDT: MIPI Alliance is an organization with global membership focused on developing interface standards for mobile devices. These interface standards include high-speed gigabit-per-second interfaces for performance applications such as wireless modems, cameras, and displays, as well as low-speed and multipoint interfaces for other applications such as audio and control. Key MIPI objectives include achieving low-power operation for battery-constrained mobile devices and minimizing the diversity of interfaces to speed OEM time to market. Embedded device developers benefit from a well-known and optimized set of industry-standard interfaces.

ECD: What is the PCI-SIG, and how does it help embedded device developers?

YANES: The PCI-SIG is the industry organization chartered to develop and manage the PCI Express (PCIe) standard. With a global membership community, the PCI-SIG's principal charter is to evolve the PCIe architecture to meet the current and emerging needs of its members and the industry. A board of directors comprising nine people, each elected by the membership, leads the PCI-SIG.
The PCI-SIG fulfills its charter by continuing to promote innovation and evolution through interoperability testing, technical support, seminars, and industry events.

As PCs become lighter and thinner and tablets and smartphones become more functional, consumers want seamless, always-on/always-connected functionality from their computing devices. To respond to these market expectations, device manufacturers need efficient, intelligent I/O technologies. The PCIe architecture satisfies all of these requirements, and with the adaptation to operate over the M-PHY specification, it can deliver consistent high performance in power-constrained platforms such as Ultrabooks, tablets, and smartphones. By delivering this technology, the PCI-SIG is meeting the emerging needs of its members and the industry.

ECD: What is M-PHY physical layer technology, where is it currently deployed, and what can we expect in the near future?

WIETFELDT: The M-PHY physical layer was developed to satisfy the anticipated needs of mobile devices, including high performance, low power, and high Electromagnetic Interference (EMI) immunity. High performance is achieved through three "gears" of operation: Gear 1 at 1.45 Gbps/lane, Gear 2 at 2.9 Gbps/lane, and Gear 3 at 5.8 Gbps/lane. Each gear includes two closely spaced frequencies of operation, providing the capability to select the operating frequency based on optimal mitigation of EMI issues in the mobile device, such as ensuring adequate radio receive sensitivity.

M-PHY provides key bandwidth scalability via support for varying lane widths – typically one, two, four, or eight lanes – and bandwidth asymmetry via differing numbers of lanes in the transmitter/receiver directions. M-PHY was also designed to accommodate a variety of overlay protocols for different applications, such as USB 3.0 or PCIe for chip-to-chip connections, Camera Serial Interface to camera sensors, and Display Serial Interface to (internal) displays.
The flexible support for differing protocols provides a key foundational element for broad use of MIPI M-PHY technology in a number of different applications within a mobile device.

M-PHY has been broadly accepted within key industry organizations developing interfaces for specialized applications. JEDEC has adopted M-PHY for its Universal Flash Storage (UFS) interface as an evolution of the embedded MultiMediaCard (eMMC) interface for embedded storage. USB-IF has selected M-PHY for its SuperSpeed Inter-Chip (SSIC) interface for transporting the USB 3.0 protocol. And most recently, the PCI-SIG has chosen M-PHY for use as a low-power PHY to carry the PCIe protocol.
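As a quick sanity check on the gear and lane figures quoted above, the per-lane rates and lane widths multiply out to the raw aggregate bandwidths below. A minimal sketch: the gear rates and typical lane counts come from the answer above, while the helper function itself is purely illustrative.

```python
# Raw M-PHY lane rates per gear, in Gbps, as quoted above.
M_PHY_GEARS_GBPS = {1: 1.45, 2: 2.9, 3: 5.8}

def aggregate_bandwidth_gbps(gear: int, lanes: int) -> float:
    """Raw aggregate bandwidth for a given M-PHY gear and lane width.

    Lane widths of 1, 2, 4, or 8 are the typical configurations
    mentioned above; asymmetric links simply use different lane
    counts in the transmit and receive directions.
    """
    if lanes not in (1, 2, 4, 8):
        raise ValueError("typical M-PHY lane widths are 1, 2, 4, or 8")
    return M_PHY_GEARS_GBPS[gear] * lanes

# Gear 3 across four lanes: 5.8 Gbps/lane x 4 lanes = 23.2 Gbps raw.
print(aggregate_bandwidth_gbps(3, 4))
```

The same helper also makes the asymmetry point concrete: a link could run, say, four lanes toward a display and one lane back.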


ECD: What was the impetus behind this collaboration to develop an adaptation of PCIe over M-PHY, and how will this technology be implemented in mobile applications?

YANES: PCI-SIG is interested in adapting its ubiquitous PCIe architecture to operate on the MIPI M-PHY to provide a robust I/O architecture for use by its members interested in ultra-low-power solutions. MIPI has succeeded in defining a best-in-class low-power, scalable PHY (M-PHY). The layered architecture of PCIe allows for an easy way to replace the physical layer with a new PHY. This design uses the M-PHY layer combined with the PCIe data link and transaction layers.

WIETFELDT: The collaboration to develop the PCIe on M-PHY solution stemmed from the need in increasingly powerful mobile computing devices to merge the ubiquitous PCIe protocol with the emerging M-PHY physical layer. Similar to the joint MIPI/USB-IF development of the SSIC specification, known informally as "USB 3.0 over M-PHY," the joint MIPI/PCI-SIG development of the "PCIe over M-PHY" technology (see Figure 1) delivers high performance and broad ecosystem support of applications in the PCIe market, as does SSIC in the USB market and application space.

ECD: How can PCIe technology reduce product development schedules and cost?

YANES: As a broadly adopted technology standard, PCIe benefits from decades of innovation, with universal support in all major operating systems, a robust device discovery and configuration mechanism, and comprehensive power management capabilities that few, if any, other I/O technologies can match. PCIe technology has a flexible, layered protocol that enables innovations to occur at each layer of the architecture independent of the other layers.

PCIe architecture meets the software support and testability needed for today's power-constrained platforms; no new development or investment in the architecture is needed.
In this way, power-efficient PHY technologies such as MIPI M-PHY can be integrated with the familiar and highly functional PCIe protocol stack to deliver best-in-class and highly scalable I/O performance in mobile devices such as Ultrabooks, tablets, and smartphones.

ECD: Can this new technology be used in traditional embedded applications other than mobile?

YANES: PCIe is currently deployed in a number of embedded platforms that run the gamut from enterprise to consumer applications and form factors. With the adaptation to support the M-PHY, this ubiquitous I/O technology has been rendered suitable for adoption in Ultrabook, tablet, and smartphone platforms almost overnight. In addition, it is expected that this technology will be adopted in future storage applications in various topologies due to the anticipated migration of storage attach points from SATA to PCIe technology. As a power-efficient, general-purpose, load-store I/O architecture, component and device designers can implement this technology in other I/O expansion usage models of their choice.

Figure 1 | The proposed PCIe over MIPI M-PHY technology leverages the layered architecture of PCIe and the extensible protocol stack of M-PHY to deliver scalable I/O functionality for mobile devices.

WIETFELDT: While the MIPI Alliance develops interface standards primarily for mobile devices such as smartphones, tablets, and other consumer electronics, MIPI interfaces can be used in products not employing wireless communications. The primary difference is nontechnical and relates to MIPI's licensing model, which involves Reasonable and Non-Discriminatory – Zero Cost (RANDZ) terms for mobile devices and Reasonable and Non-Discriminatory (RAND) terms for other devices.
This licensing model is very common in the high-technology industry.

ECD: Does the MIPI Alliance or PCI-SIG offer any software tools, libraries, and educational materials to help embedded designers get started with these standards?

YANES: PCI-SIG offers a range of technical support for members, including test specifications and procedures, compliance workshops around the world, and annual developer conferences to help members design and implement PCIe standards.

WIETFELDT: MIPI Alliance offers an online learning center, which includes general interface tutorials, specification tutorials, white papers, webinars, and other presentations/videos. MIPI Alliance also provides a suite of testing resources, which includes test suite documents, test tools, and multivendor interoperability events.

Rick Wietfeldt, PhD, is chair of the Technical Steering Group at MIPI Alliance and senior director of technology at Qualcomm.

Al Yanes is president and chairman of the PCI-SIG and a Distinguished Engineer for IBM in the Systems & Technology Division.

MIPI Alliance | www.mipi.org
PCI-SIG | www.pcisig.org

Implementers must be members of PCI-SIG and MIPI.


Silicon | High-speed interconnects

RapidIO: Optimized for low-latency processor connectivity

By Sam Fuller

Interconnect architectures reflect the problems they are designed to solve. Focusing too heavily on raw bandwidth metrics often misses more important attributes such as reliability, determinism, and latency, which continue to be critical factors in embedded computing design. Comparing interconnects based on these criteria demonstrates the advantages of using RapidIO for connecting today's high-performance processors and the tightly coupled computing systems of the future.

As Moore's Law has continued to drive the performance and integration of processors, the need for higher-speed interconnects has grown as well. Today's interconnects commonly sport speeds ranging from 10 Gbps to 80 Gbps and have roadmaps leading to hundreds of gigabits per second.

In the race to faster and faster interconnect speeds, some topics are not often discussed, including the types of transactions supported, communications latency and overhead, and what sorts of topologies can be easily supported. Designers tend to think of all interconnects as being created equal, with a figure of merit based solely on peak bandwidth.

Reality is quite different. Much as there are different forms of processors optimized for general-purpose, signal-processing, graphics, and communications applications, interconnects are also designed and optimized for different connectivity problems. An interconnect typically solves the problems it was designed for and can be pressed into service to address other applications, but it will be less efficient in those applications.

RapidIO design objectives

It is instructive to look at RapidIO in this context. RapidIO was designed to serve as a low-latency processor interconnect for use in embedded systems requiring high reliability, low latency, and deterministic operation. It was designed to connect different types of processors from different manufacturers in a single system.
Because of this, RapidIO has found widespread use in wireless infrastructure equipment, where there is a need to combine general-purpose, digital signal, FPGA, and communications processors together in a tightly coupled system with low latency and high reliability.

The usage model of RapidIO required support for memory-to-memory transactions, including atomic read-modify-write operations. To meet these requirements, RapidIO provides Remote Direct Memory Access (RDMA), messaging, and signaling constructs that can be implemented without software intervention. For example, in a RapidIO system, a processor can issue a load or store transaction, or an integrated DMA engine can transfer data between two memory locations. These operations are conducted across a RapidIO fabric, wherever their source or destination addresses are located, and typically occur without any software intervention. As viewed by the processor, they are no different than common memory transactions.

RapidIO was also designed to support peer-to-peer transactions. It was assumed multiple host or master processors would be in the system and that those processors needed to communicate with each other through shared memory, interrupts, and messages. Multiple processors (up to 16K) can be configured in a RapidIO network, each with its own complete address space.


Figure 1 | RapidIO, PCI Express, and Ethernet provide different options for connecting processors, I/O, and systems.

RapidIO also provides a clean dividing line between the functionality of switches and endpoints. RapidIO switches only make switching decisions based on explicit source/destination address pairs and explicit priorities. This allows RapidIO endpoints to add new transaction types without requiring changes or enhancements to the switch devices.

Comparing interconnects

As more and more of the system is incorporated onto a single piece of silicon, PCI Express (PCIe) and Ethernet are being integrated within Systems-on-Chips (SoCs). This integration, however, has not changed the nature of the transactions provided by these interconnects (see Figure 1).

PCIe does not natively support peer-to-peer processor connectivity. Using PCIe for this sort of connectivity can be exceedingly complex because it was designed as a peripheral component interconnect (hence PCI). It was intended to connect peripheral devices, typically slave devices like I/O and graphics chips, to the main host processor. It was not designed as a processor interconnect, but rather as a serialized version of the PCI bus. Building a multiprocessor interconnect out of PCI requires a step beyond the base PCI specification to create new mechanisms that map address spaces and device identifiers among multiple host or root processors. To date, none of the proposed mechanisms to do this – Advanced Switching (AS), Non-Transparent Bridging (NTB), or Multi-Root I/O Virtualization (MR-IOV) – have been commercially successful.

For systems where there is a clear single host device and other processors and accelerators operate as slave devices, PCIe is a good choice for connectivity. However, for connecting many processors together in more complex systems, PCIe has significant limitations in topology and support for peer-to-peer connectivity.

Many developers are looking to leverage Ethernet as a solution for connecting processors in systems. Ethernet has evolved significantly in the past 35 years. Similar to the increase in computer processing speeds, its peak bandwidth has grown steadily. Currently available Ethernet Network Interface Controller (NIC) cards can support 40 Gbps operating over four pairs of SERDES with 10 Gbps signaling. Such NIC cards contain significant processing on their own to be able to transmit and receive packets at these speeds.

Bandwidth and per-port economics

The sending and receiving of Ethernet packets by a NIC is a long way from being a solution to tightly coupled interprocessor communications. The overhead associated with both PCIe and Ethernet transaction processing (both stacks must be traversed in a NIC), plus the associated SERDES functions and Ethernet media access protocol and switching, adds latency, complexity, higher power dissipation, and cost to systems where much more direct connection approaches could be utilized (see Table 1).

Using Ethernet as an integrated embedded processor interconnect requires significant transaction acceleration and enhancements to the Ethernet Media Access Controller (MAC) as well as the Ethernet switch devices themselves. Even with these enhancements, RDMA operations should be limited to large block transactions to amortize the overhead of using Ethernet.

Table 1 | A comparison of Ethernet and RapidIO shows the benefits of a more direct connection approach.

System requirement                           | 10 GbE Ethernet                       | RapidIO
Switch-per-port performance (raw data rate)  | 10 Gbps                               | 20 Gbps (4x)
End-to-end packet termination                | >10 μs                                | ~1-2 μs
Messaging performance                        | Software                              | Hardware – directly coupled to processor
Reliability                                  | Drops packets on error or congestion  | Integrated error detection and retry

Standards that have been deployed to solve this problem include the iWARP RDMA protocol from the Internet Engineering Task Force and RDMA over Converged Ethernet (RoCE). Both iWARP and RoCE are typically implemented through acceleration coprocessors. Despite this acceleration, RDMA


transactions must still be carefully managed to reduce communications overhead. The reason is that although Ethernet offers high bandwidth, especially in 10 GbE and 40 GbE implementations, it also has high transaction latencies, typically measured in microseconds.

Current RapidIO applications

The RapidIO value proposition has been widely acknowledged in the embedded market for many years. This same value proposition can now be extended to more mainstream data-processing markets, which are evolving to demand many of the same system attributes that communications networks have long required.

One well-known application wherein RapidIO is utilized is the wireless base station. This application combines multiple forms of processing (DSP, communications, and control) that must be completed within a very short time frame. Communications between processing devices should be as quick and deterministic as possible to ensure that real-time constraints are achieved.

For example, in 4G Long-Term Evolution (LTE) wireless networks, frames are sent every 10 milliseconds. These frames,
These frames,Atom D525 Touch PCNew - PPC-080T-525ŸIntel Atom D525 Dual Core 1.8 GhzŸ2GB DDR3 800MHz SODIMM RAMŸ16GB SATA SSD Hard DriveŸExternal ESATA PortŸExternal VGA PortŸExternal RS 232 PortŸ4 External USB 2.0 PortsŸAudio Line In & Line OutŸ2W Stereo Internal speakersŸ8" Color LCD Screen- 800 x 600ŸAnalog Resistive TouchscreenŸ10/100/1000 Base-T EthernetŸWireless 802.11 b/g/n, 12V DC in2.6 KERNELŸDimensions: 8.5" x 7.20" x 1.77”ŸQuantity 1 pricing begins at $795Since 1985OVER27YEARS OFSINGLE BOARDSOLUTIONSwww.emacinc.com/ecd_d525EQUIPMENT MONITOR AND CONTROLPhone: ( 618) 529-4525 · Fax: (618) 457-0110 · Web: www.emacinc.comwhich contain data for multiple concurrentmobile sessions, are distributedacross multiple subcarriers, which aresupported by multiple DSP devices.Communications between the DSP andgeneral-purpose processing devicesmust be deterministic and low latencyto ensure that a new frame is readyfor transmission every 10 milliseconds.At the same time, the receive pathmust support the data arriving fromthe mobile devices connected to thenetwork. On top of this complexity, thesystem must track the location of themobile device in real time and managethe device’s signal power.Another example of a RapidIO applicationis semiconductor wafer processing.Similar to the wireless infrastructureapplication, semiconductor waferprocessing has real-time constraints,including a control loop of sensors, processing,and actuators. Leading-edgesystems often have hundreds of sensorscollecting information, with sensordata processed by tens to hundredsof processing nodes. The processingnodes generate commands that go toactuators and AC and DC motors toreposition the wafers and wafer imagingsubsystems. 
This is all performed in a recurring control loop with a frequency of up to 100 kHz, or a period of 10 microseconds. Systems like this benefit from the lowest-latency communications possible between devices.

The future of high-performance computing

The introduction of virtualization, ARM-based servers, and highly integrated SoC devices is paving the way for the next stage in the evolution of high-performance computing. This evolution is moving toward more tightly coupled clusters of processors that represent processing farms built to host hundreds or thousands of virtual machines. These processor clusters will be composed of up to thousands of multicore SoC devices connected through high-performance, low-latency processor interconnects. The more efficient this interconnect is, the better the system's performance and economics will be.

Technologies like PCIe and 10 GbE are not going away anytime soon, but they will not be the foundation for these future tightly coupled computing systems. PCIe is not a fabric and can only support the connectivity of small numbers of processors and/or peripherals. It can simply serve as a bridge to a fabric gateway device. While 10 GbE can be used as a fabric, it has significant hardware and software protocol processing requirements. Its widely variable frame sizes (46 B to 9,000 B for jumbo frames) drive the need for fast processing logic to support many small packets and large memory buffers to support large packets in endpoints and switches, thus hiking up silicon cost. The use of PCIe or 10 GbE will either restrict the topologies and connectivity available or add cost and overhead to the system. These drawbacks create opportunities both for proprietary fabrics and for open solutions such as RapidIO in this interesting new market.

Implementing integrated server, storage, and networking systems presents an opportunity for OEMs to innovate.
A key component of that innovation will be the internal system connectivity. RapidIO is a mature, well-proven technology with the attributes required for success in this market. As was the case for wireless infrastructure, where RapidIO went from early innovation to become the de facto base station interconnect standard, RapidIO's biggest challenge in server, storage, and high-performance computing will be to cross the chasm from today's innovator and early adopter markets to mass-market proliferation.

Sam Fuller is a founder and executive director of the RapidIO Trade Association.

RapidIO Trade Association
sam@rapidio.org | www.rapidio.org
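Pulling together the timing figures quoted in this article: a 100 kHz control loop leaves a 10-microsecond round-trip budget, which the end-to-end packet termination times in Table 1 (>10 µs for 10 GbE, ~1-2 µs for RapidIO) either consume entirely or comfortably meet. A minimal budget-check sketch; the numbers come from the article, while the function and variable names are illustrative:

```python
def loop_period_us(frequency_hz: float) -> float:
    """Period of a recurring control loop, in microseconds."""
    return 1e6 / frequency_hz

# A 100 kHz control loop recurs every 10 microseconds.
budget_us = loop_period_us(100e3)

# End-to-end packet termination times quoted in Table 1.
ethernet_termination_us = 10.0   # ">10 us" for 10 GbE (lower bound)
rapidio_termination_us = 2.0     # "~1-2 us" (upper bound)

# Even the optimistic lower bound for Ethernet consumes the entire
# loop budget, while RapidIO's worst case leaves ~80 percent of the
# period for sensing, processing, and actuation.
print(budget_us, rapidio_termination_us / budget_us)
```

This is why the article treats transaction latency, not peak bandwidth, as the figure of merit for such control loops.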


Silicon | High-speed interconnects

Embedded DisplayPort: Increased flexibility and power savings render greater display efficiency

By Craig Wiley

While mobile device display performance continues to increase, system chip process geometries continue to shrink, resulting in a greater proportion of system power consumed by the display and its high-speed interface. The new Embedded DisplayPort (eDP) v1.4 standard offers several new features that maximize system power efficiency, further consolidate the display interface, and address a wide range of system profiles to satisfy the growing demand for power optimization in the embedded display system.

The Video Electronics Standards Association (VESA) first released the Embedded DisplayPort (eDP) standard in 2009 as an extension of the DisplayPort standard for use with embedded displays. VESA developed eDP to replace the aging Low-Voltage Differential Signaling (LVDS) standard, and today eDP is used widely in notebook computers and all-in-one systems. The primary benefits of eDP over LVDS include the reduction of signal wires due to its higher data rate, compatibility with submicron chip processes, decreased interference with wireless services, and its ability to accommodate new features.

Since its initial release, the eDP standard has gone through a series of revisions adding new features not shared with DisplayPort, as eDP has evolved to target battery-operated embedded display systems. For example, in 2010 eDP version 1.2 was published, adding control of display and backlight features over the auxiliary channel.
In 2011, eDP version 1.3 introduced Panel Self Refresh (PSR).

While eDP was originally designed for notebooks and all-in-one systems, it is becoming increasingly optimized for smaller form factor systems, including tablet PC and smartphone applications. Released in December 2012, eDP version 1.4 adds new optional features developed to address this broader range of form factors and further reduce system power.

The importance of lowering display-related system power

Today, mobile devices are a major driving force in the electronics industry. Every year new mobile devices are introduced with increased processing capability, better displays, a smaller and lighter form factor, and extended battery life. Taking into account typical CPU idle time, an average display consumes about 75 percent of system power. While system chip power reduction is accomplished through shrinking semiconductor process geometries, display power reduction comes through improvements in backlight and LCD technologies, as well as new pixel structures.

However, the recent trend toward brighter, higher-resolution displays is driving up display power. The second-generation iPad had a 1024x768 display and a 25 watt-hour battery, while the latest iPad has a 2048x1536 display (a 400 percent pixel increase) and a 42.5 watt-hour


battery (a 70 percent power increase), both delivering a 10-hour battery life. The higher-resolution display requires additional pixel-driving circuitry and a higher data rate display interface, as well as faster Graphics Processing Unit (GPU) rendering and display image processing circuitry.

This display power challenge has led to many new architectural developments at the platform level. Reducing display power means longer battery life and lower battery capacity requirements, and therefore smaller, lighter, and less expensive systems. Rather than being treated as a simple rendering device, the display has become more integrated into the overall system design. The new eDP v1.4 brings many of these concepts together, as explained in the following discussion.

Panel Self Refresh (PSR)

Introduced in eDP v1.3, the PSR function provides a means to lower display-related system power by allowing portions of the GPU and display interface to enter a low-power state during a static display image condition (see Figure 1). When the system enters PSR mode, the remote frame buffer built into the LCD timing controller (Tcon) performs the routine display refresh task, offloading the GPU and display interface. The local frame buffer can then be updated later by the GPU with a new static image (such as when a flashing cursor is turned on or off), or the system can exit from PSR mode to display constantly changing images (such as when playing a video). Because most displays are refreshed 60 times per second, PSR allows the GPU circuits and display interface to remain in a low-power state for most of the system operating time, resulting in significant power savings.

PSR with selective frame update

The latest version of eDP enhances the PSR mechanism by enabling updates to selected regions of the video frame in the Tcon frame buffer.
With eDP v1.3, the entire frame must be updated every time any portion of the image changes. With eDP v1.4, only the new portion of the frame requires updating (see Figure 2). This lets the GPU display interface remain active for a shorter duration, further reducing system power.

Additional eDP v1.4 features that support PSR

Advanced link power management

The ability to manage the display interface is improved in eDP v1.4 by greatly reducing the wake-up time from the low-power state, which is important to maximize the efficiency of the PSR selective update process. Previous versions of eDP required more than 100 microseconds to wake up and retrain the data link. With eDP v1.4, if the new PSR mode is supported, wake-up from standby takes a maximum of 0.5 microseconds, and wake time from the lower-power sleep state is 20 microseconds.

Figure 1 | The PSR function helps reduce power by allowing portions of the GPU and display interface to remain in a low-power state during static display image conditions.

Figure 2 | The latest version of eDP enables a PSR selective update process that further decreases system power.


Video timing synchronization

In DisplayPort and earlier versions of eDP, display-to-GPU synchronization is performed over the DisplayPort main link, the high-speed data channel that carries video data, using special control codes. In eDP v1.4, frame synchronization is also provided over the lower-speed auxiliary channel. This enables quick bursts of eDP main link operation to perform timely selective updates of the remote frame buffer.

Display stream compression

Display interfaces including DisplayPort and earlier versions of eDP send uncompressed pixel data across the display interface. Image compression is traditionally limited to media delivery to the system or for storage. In the interest of further power savings, eDP v1.4 introduces a display stream compression algorithm that reduces the data rate across the display interface. In contrast to typical video or image compression algorithms, the display stream compression algorithm is optimized for high data throughput, low latency, and low gate count, and targets low compression ratios in the 2x to 5x range, depending on the image type. By using a minimum 2x compression ratio configuration, for example, the display interconnect bandwidth can be cut in half, with typically no loss in image content. Display stream compression can also be used when updating a display's PSR frame buffer, providing further power savings. It can also be used to support display resolutions beyond the uncompressed main link capability.

Regional backlight control

When eDP v1.2 was released in 2010, it introduced the capability to control LCD backlight modulation frequency and brightness through the auxiliary channel, thus eliminating the need for an extra backlight control interface. In eDP v1.4, regional backlight control provides the option to independently set different portions of the display backlight region. This allows the GPU to actively darken selective portions of the display based on display contents, increasing power savings.
Up to 15 backlight regions can be controlled with a single auxiliary transaction.

Multitouch over auxiliary channel
Touch-sensitive displays are common in many smaller embedded display systems and will become more prevalent in PCs. The latest version of eDP adds the ability to transport multitouch data from the display to the host through the auxiliary channel. The multitouch data transport uses a framework that is compatible with the USB Human Interface Device specifications. By eliminating the dedicated USB interface commonly used for this purpose, both electrical connections and power are saved.

Reduced differential voltage swing
Prior to eDP v1.4, the standard utilized the same interface signal voltage amplitudes as DisplayPort. Today, eDP v1.4 reduces the main link's minimum differential voltage amplitude levels from 400 mV to 200 mV, which is suitable for the short transmission distances in small embedded form factors. This reduces interface drive power by as much as 75 percent, as power is proportional to the square of amplitude. Additional flexibility is also added to the link training amplitude step sizes, helping increase suitability for different transmission media, including chip-on-glass.

Increased link rate flexibility
Another addition in eDP v1.4 is greater flexibility in the main link data rate. Previous versions of eDP limited link rate selection to 1.62 Gbps, 2.7 Gbps, and 5.4 Gbps per lane (main link data channel), which is the same as DisplayPort.
The current version of eDP now includes seven standard rates, enabling further power efficiency. With DisplayPort, increasing a 1080p 60 Hz display format from 24-bit color to 30-bit color would require changing a two-lane 2.7 Gbps configuration to either a four-lane 2.7 Gbps configuration or a two-lane 5.4 Gbps configuration. Either case represents a 100 percent increase in interface bandwidth and power, with only a 25 percent increase in data content (additional dummy bits would be added to make up the difference). With eDP v1.4 in the same application, the rate could be bumped up from 2.7 Gbps to 3.24 Gbps, resulting in only a 20 percent link rate increase.

Customizable link rates are also supported, with a new set of registers defined for the display to declare this capability, as well as the custom link rate frequency. In addition to further power optimization, this increases application diversity, as link rates can be adjusted based on particular system timing requirements. It also enables link rate adjustment to mitigate radio frequency interference with system wireless services.

System deployment with eDP v1.4
Offering increased flexibility and reduced system power, the latest release of eDP v1.4 will help propel eDP into new applications and establish it as a universal embedded display interface. Given the current cycle of eDP standard publication and adoption, it is expected that systems using some of the new features in eDP v1.4 will appear on the market in two to three years. Higher-resolution notebook PCs and tablets will likely be the first candidates to leverage this power-optimized interface.

Craig Wiley is chairman of the Video Electronics Standards Association (VESA) and senior director of marketing for Parade Technologies.

VESA
moderator@vesa.org
www.vesa.org
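Two of the numeric claims above, the 75 percent drive-power reduction and the 20 percent link-rate bump for 30-bit 1080p60, can be checked with back-of-envelope arithmetic. The Java sketch below assumes the standard 2200 x 1125 total timing (148.5 MHz pixel clock) and DisplayPort's 8b/10b line coding; both assumptions are mine, not quoted from the eDP v1.4 text.

```java
// Checks two arithmetic claims from the article. The 2200 x 1125 total
// timing and the 8b/10b coding efficiency are standard figures assumed
// here, not taken from the eDP v1.4 text itself.
class EdpMath {
    static final double PIXEL_CLOCK_HZ = 148.5e6; // 1080p60: 2200 x 1125 x 60
    static final double CODING_EFFICIENCY = 0.8;  // 8b/10b line coding

    // Drive power scales with the square of the voltage amplitude
    static double relativePower(double newMv, double oldMv) {
        double r = newMv / oldMv;
        return r * r;
    }

    // Video payload the link must carry, in Gbps
    static double videoGbps(int bitsPerPixel) {
        return PIXEL_CLOCK_HZ * bitsPerPixel / 1e9;
    }

    // Usable payload bandwidth of a lane configuration, in Gbps
    static double linkGbps(int lanes, double gbpsPerLane) {
        return lanes * gbpsPerLane * CODING_EFFICIENCY;
    }

    public static void main(String[] args) {
        // 400 mV -> 200 mV swing: power drops to 25%, i.e. a 75% reduction
        System.out.printf("Relative drive power: %.2f%n", relativePower(200, 400));
        // 30-bit 1080p60 needs ~4.455 Gbps of payload; two lanes at 2.7 Gbps
        // (4.32 Gbps) fall short, while two at 3.24 Gbps (5.184 Gbps) fit,
        // a 20 percent rate increase instead of 100 percent
        System.out.printf("30-bit video: %.3f Gbps%n", videoGbps(30));
        System.out.printf("2 x 2.70 Gbps link: %.3f Gbps%n", linkGbps(2, 2.7));
        System.out.printf("2 x 3.24 Gbps link: %.3f Gbps%n", linkGbps(2, 3.24));
    }
}
```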


Software | Android

Developing Android-driven applications for automotive infotainment
By Andrew Patterson

Google's Android mobile Operating System (OS) has quickly become the dominant platform for smartphone devices. Can Android be equally effective in the In-Vehicle Infotainment (IVI) sector? Despite Android's many functional advantages, software developers must take a careful look at Android's strengths and weaknesses before adopting the OS in modern IVI systems.

Developers today can make a strong case that Android is now the most successful portable OS of all time. In terms of recent smartphone sales, according to research firm IDC, Android devices represent a 68 percent share of the global market (quarter ended September 2012). This compares to a 17 percent market share held by Apple. Within 12 months it is expected that more than 1 billion Android devices will be in use – an achievable goal considering that nearly 700,000 new smartphones are activated every day. Easy access to software and development tools for Android means just about anyone, from an individual engineer to the largest R&D department at a large corporation, can get involved. For corporations selling online services, it's almost a limitation not to have an appropriate Android smartphone app. During the past five years, user expectations have migrated from seeing a good website, to seeing a good mobile website, to having an Android or iPhone app available. According to AOL Tech, the download rate for Android apps in 2012 is 1.5 billion installs per month, with a total of nearly 20 billion installs to date.

Unfair to compare
It was inevitable that individuals who own both a smartphone and a vehicle equipped with an IVI system would compare and contrast the two. The functionality of a typical infotainment system has evolved in the past 10 years, restricted by the lengthy development cycles of automakers and their traditionally conservative approach to product development.
Quality and reliability are paramount, as is the overwhelming need to keep costs low to ensure the final product stays competitive.

At the recent Paris Motor Show, several automakers announced their latest models embodying the concept of the always-connected automobile. One such system was Renault's Android-based R-Link infotainment system, featuring built-in Android apps such as navigation, multimedia, and phone support via an online store for Renault-approved apps.

Despite these and other IVI enhancements, any driver today can look at a contemporary smartphone and find much more functionality and personalization on that device compared to an IVI system. Carmakers are becoming increasingly desperate to incorporate this level of functionality and flexibility into a vehicle without compromising its safety and security. Using Android, there are several ways to accomplish this endeavor, each with its own set of advantages and disadvantages.

Bring Your Own Device (BYOD) to your vehicle
If the Android smartphone can be considered the ultimate infotainment device, then why not have it connected inside the car? This is the approach taken by the Car Connectivity Consortium, an industry alliance established to allow smartphone screens to be displayed on


the infotainment head unit. Several infotainment platform providers, including Mentor Graphics, offer this approach, whereby the head unit acts as a thin-client display, with apps running directly on the smartphone. Connectivity is provided through a USB cable today, but Wi-Fi connections are emerging. Bluetooth 3.0 may also offer sufficient bandwidth for video streaming between smartphones and IVI systems.

The advantage of this approach is that the phone connectivity technology will not become obsolete as the car ages, which is an important factor given that the typical smartphone enjoys a higher refresh rate over its lifetime. The concept of random IVI software updates is seen as too risky for the more permanent car-based system; OEMs want to keep the process under their strict control. Looking 10 years ahead, this means that the infotainment system can still be current and relevant because its functionality is based on the smartphone of the time.

This approach also presents a cost advantage, as the permanently fixed infotainment system costs less for an OEM or Tier 1 developer to design and maintain. Another benefit is related to shared or rented vehicles – the smartphone immediately personalizes the vehicle to which it is tethered, without needing to learn a new user interface every time. One example showing the advantages of integrating a smartphone into the infotainment system is Android car mode, which turns an Android phone into a better driving companion by providing quick access to key applications such as GPS navigation, voice-activated commands, and a phone's contact list.

The main disadvantages of allowing smartphone screens to be displayed on the infotainment head unit are loss of control and marketability of the infotainment system as a car feature. High-end automobile manufacturers are now differentiating themselves through the sophistication of an infotainment system; they are not willing to pass that advantage over to phone makers.
There is also the unknown security risk lurking in terms of the potential for someone to hack into the vehicle system via a smartphone.

Considerations for building in an Android OS
Many designed-in infotainment systems such as Renault's R-Link build Android directly into the vehicle and pre-load a number of approved and tested apps. This offers a pre-built, tested, state-of-the-art infotainment system to a prospective car buyer. The idea here is that it's now possible for the vehicle owner to download additional Android apps from an online store managed by the manufacturer. The Android OS is kept isolated from other vehicle functions, and apps are only offered from manufacturer-approved repositories to help protect the system from malware. However, as Generation Y Android users start to dominate the car buyer population, they will want the freedom to download their favorite apps and won't be happy with a pre-defined mix decided for them.

Looking at this from an OEM perspective, adopting Android as a base OS poses a few significant business risks. Some OEMs are nervous about the omnipresence of Google as the owner and licensor of the Android OS platform. Because Google manages the release schedule and content of Android, many automotive strategists are wary about Android changes affecting their product release cycle. What would happen if the license or terms of use suddenly changed?

The original Android OS was designed exclusively for mobile smartphones, and it must be modified to handle the wide variety of audio streams in the vehicle, with signals coming from reversing sensors, radio, DVD player, navigation, phone, and external sources.
The middleware in Android that covers audio stream routing has proved difficult to modify and re-test; the intended infotainment system has to link in at several points, including the audio flinger (a mixer providing a single output at a specified sample rate), the underlying audio


hardware, and the audio manager. Some developers are questioning why they should commit to the technology when it's possible to dock a smartphone into the vehicle.

Embedded Android architectures
Developers can choose from several possible approaches to implement Android in a vehicle. Some vehicle manufacturers use Android as the core OS for the infotainment system, deeming that it's secure and mature enough to fulfill this role. For designers who are not as bold and want to stick with Linux, Android can still be included as a guest OS in a "container" (see Figure 1). Using the Linux Container (LXC), resources can be assigned by the Linux host to the Android guest, including the memory available for apps, access privileges, services available, and interaction with other domains. The container is intended to be a secure environment, so users can potentially download untrusted apps into this area.

Another technique for including Android in an IVI system is to use a hardware or software virtualization layer (see Figure 2). In this scenario, each OS or domain runs on a dedicated virtual machine, and the hardware resources available from the underlying host platform are shared. Communication is allowed in a controlled way between the different domains, and boot-ups may be independent, allowing safety-critical features running on a dedicated domain to be available more quickly than the infotainment or Android systems.

Figure 1 | When running Android in a Linux Container, privileges and permissions can be tightly controlled.

Figure 2 | Android and Linux can run simultaneously on a virtualization layer or hypervisor (e.g., PikeOS, COQOS, OKL4).

Several hardware platform providers offer isolated domains in hardware. Software virtualization is available using proprietary software from providers such as SYSGO, OpenSynergy, and Open Kernel Labs. These virtualization layers consume a small amount of overall resource (typically 1 percent to 4 percent) and allow a high degree of domain isolation and safety.

In a few years, all drivers will expect their vehicles to be permanently connected to the Internet. This will allow access to cloud data services, telematics, video and audio streaming, and app downloads. It is no longer a question of if this happens, but rather when all of this becomes available to the general public. The explosion of Android in smartphones has ensured that Android apps will need to be accessible in vehicles, and users will decide whether these are built in or accessed via a BYOD solution.

Andrew Patterson is business development director for the Mentor Graphics Embedded Software Division, specializing in the automotive market.

Mentor Graphics Embedded Software Division
Andrew_patterson@mentor.com
www.mentor.com
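As a sketch of the container-based approach described above, an LXC-era (1.x-style) configuration for an Android guest domain might look like the following. All names, device numbers, and limits here are hypothetical illustrations, not taken from any shipping IVI product:

```
# Hypothetical LXC configuration for an Android guest domain.
# Container name, bridge, core assignment, and limits are illustrative.
lxc.utsname = android-guest
lxc.network.type = veth                    # virtual NIC bridged to the host
lxc.network.link = br0
lxc.cgroup.memory.limit_in_bytes = 512M    # cap the memory available to apps
lxc.cgroup.cpuset.cpus = 2-3               # confine the guest to two cores
lxc.cgroup.devices.deny = a                # deny all device access by default...
lxc.cgroup.devices.allow = c 226:* rwm     # ...then allow only the GPU (DRM) nodes
```

The host retains control over which devices, cores, and memory the guest sees, which is the isolation property the article attributes to the container approach.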


Strategies | Multicore

Heterogeneous system architecture: Multicore image processing using a mix of CPU and GPU elements
By Frank Altschuler and Jonathan Gallmeier

Image processing is computationally intensive, requiring immense resources in CPU and memory throughput. Parallelism through multiple CPU cores helps, but even with today's dual-, quad-, and higher-core-count devices, multimedia tasks demand either a great deal of power or the addition of dedicated image processing hardware. Adding a pool of parallel processing elements as a programmable accelerator to the CPU creates a proper balance of general-purpose processing, high performance, and low power, as demonstrated in an example showing 2x-3x performance and performance-per-watt improvements.

Although multimedia processing has benefited tremendously from the increased availability of multicore processors, until recently the tendency has been to consider multicore from the standpoint of largely homogeneous CPU architectures. A typical approach offers dual, quad, and higher numbers of cores arranged in Symmetric Multi-Processor (SMP) or cache-coherent Non-Uniform Memory Architecture (ccNUMA) groups. These devices sometimes come with additional hardware acceleration to offload particularly difficult tasks, such as the Context-Adaptive Variable-Length Coding (CAVLC) or processor-intensive Context-Adaptive Binary Arithmetic Coding (CABAC) entropy encoding functions used in H.264 video processing.

Many companies now offer integrated graphics in their devices, largely as a way to save board space and bill of materials costs. An additional benefit of having an integrated Graphics Processing Unit (GPU) is that it can provide another path to attacking challenging computational tasks. As an example, AMD's R-Series processors integrate the Bulldozer CPU architecture with a discrete-class AMD Radeon GPU to produce an Accelerated Processing Unit (APU).

Figure 1 shows a high-level view of the architecture.
The GPU is an array of Single Instruction, Multiple Data (SIMD) processing elements. Note the common memory controller between the array and the x86 cores. This architecture is an early step in the movement to AMD's Heterogeneous System Architecture (HSA), which breaks significantly from the traditional multicore approach by treating integrated graphics processing elements as a pool of computational resources that can be applied to nongraphics data. It is similar to the way that an attached hardware accelerator can be used, but with the advantage of being a large cluster of programmable devices – up to 384 cores at the high end – with many hundreds of GFLOPS worth of computational throughput.

Table 1 illustrates the basic operating parameters of an AMD desktop part with the same underlying architecture as the AMD R-Series APU. The Bulldozer-based CPU's floating-point performance at 121.6 GFLOPS is dwarfed by the GPU, which offers 614 GFLOPS to applications that can utilize the resource effectively. The key to leveraging the added GPU processors in imaging is to consider the time needed to transfer a block of data to the GPU, process it, then return the results to the main thread.

Performance analysis in a video transcoding application
Although current-generation AMD R-Series processors require a buffer transfer between the GPU and CPU, data-parallel applications can achieve an improvement in throughput and power efficiency that significantly outweighs the cost of the data transfer.
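Whether offloading to the GPU pays off depends on the transfer time versus the compute time saved. The Java sketch below uses the 121.6 and 614.4 GFLOPS figures from Table 1; the workload size, buffer size, and transfer bandwidth are assumed purely for illustration.

```java
// Rough break-even model for GPU offload. The 121.6 / 614.4 GFLOPS figures
// follow Table 1; the workload size, buffer size, and transfer bandwidth
// below are assumptions made for this illustration only.
class OffloadBreakEven {
    static double cpuSeconds(double gflopsOfWork) {
        return gflopsOfWork / 121.6;                 // CPU peak throughput
    }

    static double gpuSeconds(double gflopsOfWork, double bufferGB, double gbPerSec) {
        // Round trip: copy the buffer in, compute, copy the result back
        return 2 * bufferGB / gbPerSec + gflopsOfWork / 614.4;
    }

    public static void main(String[] args) {
        double work = 100.0;   // 100 GFLOPs of image processing (assumed)
        double buf = 0.008;    // 8 MB buffer (assumed)
        double bw = 8.0;       // 8 GB/s effective buffer-path bandwidth (assumed)
        System.out.printf("CPU only: %.3f s, GPU incl. transfers: %.3f s%n",
                cpuSeconds(work), gpuSeconds(work, buf, bw));
    }
}
```

Under these assumed numbers the transfer cost is small relative to the compute saving, which matches the article's claim that the improvement "significantly outweighs the cost of the data transfer" for data-parallel workloads.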


OpenCL has been used in both of these projects to maximize the GPU compute resources available within the APU. In x264, the quality optimizations associated with the lookahead function were ported completely to OpenCL on the GPU. This can be used in transcoding applications to improve the output video quality and normally consumes up to about 20 percent of CPU time. Handbrake uses x264 directly as the video encoder. In addition, Handbrake uses OpenCL to perform video scaling and color space conversion from the RGB to YUV color space. AMD's hardware video decoder (UVD) was also used in Handbrake to perform some of the video decoding tasks. Data was produced on a Trinity 100 W A10 desktop machine; however, the underlying architecture is identical to that used in the embedded product.

Figure 1 | A high-level view of the AMD R-Series APU architecture shows a common memory controller between the array of SIMD processing elements and x86 cores.

Table 1 | A comparison of operating parameters demonstrates CPU versus GPU performance in an AMD desktop part.

A10 5800K (100 W TDP)     CPU     GPU
Frequency (GHz)           3.8     0.8
Cores                     4       384
FLOPS/core                8       2
GFLOPS                    121.6   614.4
A10 5800K total GFLOPS    736
A10 5800K power (W TDP)   100

One example of a data-parallel application is Handbrake, a widely used video transcoding utility that runs on a variety of operating systems and hardware platforms. The utility is notable in that it has well-integrated multicore support and therefore serves as an appropriate platform to study the effectiveness of multicore strategies. Handbrake uses x264, a full-featured H.264 encoder used in several open-source and commercial products that provide adequate multicore support.

To simplify analysis of the APU's complex power management architecture, the top Thermal Design Power (TDP) was evaluated instead of breaking out the power consumed by each subsystem.
The data displayed in Table 2 shows that when processing is performed purely on the CPU, a 14.8 fps frame rate is achieved, and processing is accomplished in 62 seconds. Involving the GPU and UVD blocks in the processing increased the frame rate to 18.3 fps, reducing processing time, and hence power consumption, by 14 seconds, or 22 percent.

The floating-point operations total in Table 2 is the product of the floating-point throughput from Table 1 and the processing time, from which can be drawn a figure of merit: operations per watt. Looking at the total floating-point operations and normalized operations per watt, it becomes clear that processing resources are being left on the table. CPU-only throughput is 75 operations per watt, while it is boosted to 353 operations per watt when the GPU is added, for an improvement in theoretical floating-point throughput of nearly 5x (4.7x).

Twenty-two percent is a nontrivial savings, but while this generation of APU makes significant progress in the direction of HSA, all memory accesses are not uniform from the standpoint of latency and throughput. When the kernel accesses memory, it does so directly. Such operations occur at a theoretical rate of 22 GBps, or 16 to 18 GBps in typical applications. When a buffer is created in host memory to access GPU data, however, that data goes through a different path where the effective

Table 2 | An analysis of a Handbrake application on an A10 desktop machine shows the performance difference when using a CPU alone versus using a GPU and UVD blocks for processing.

Handbrake, A10 5800K (100 W TDP)
High preset, 1080p input transcoded to 720p 6 Mbps output

                                           No GPU     GPU used + UVD   Difference
Frame rate                                 14.8 fps   18.3 fps         3.5 fps
Processing time                            62 sec     48 sec           -14 sec
Floating-point operations total (x10^9)    7,539.2    35,328           27,788.8
Operations/watt                            75.392     353.28           277.888
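The Table 1 peak-GFLOPS figures and the Table 2 operations-per-watt figures can be reproduced from first principles, with peak GFLOPS as frequency x cores x FLOPS per core, and the figure of merit as total operations over the run divided by the 100 W TDP. A minimal Java sketch:

```java
// Reproduces the Table 1 peak-GFLOPS figures and the Table 2
// operations-per-watt figure of merit for the A10 5800K (100 W TDP).
class ApuThroughput {
    // Peak GFLOPS = frequency (GHz) x cores x FLOPS per core per cycle
    static double gflops(double ghz, int cores, int flopsPerCore) {
        return ghz * cores * flopsPerCore;
    }

    // Total floating-point capacity over the run (GFLOPS x seconds),
    // normalized by TDP, as in Table 2
    static double opsPerWatt(double gflops, double seconds, double watts) {
        return gflops * seconds / watts;
    }

    public static void main(String[] args) {
        double cpu = gflops(3.8, 4, 8);    // 121.6 GFLOPS
        double gpu = gflops(0.8, 384, 2);  // 614.4 GFLOPS
        System.out.printf("CPU only, 62 s run: %.3f ops/W%n", opsPerWatt(cpu, 62, 100));
        System.out.printf("CPU+GPU, 48 s run: %.2f ops/W%n", opsPerWatt(cpu + gpu, 48, 100));
    }
}
```

The two printed values match the 75.392 and 353.28 entries in Table 2, and their ratio is the "nearly 5x (4.7x)" improvement quoted in the text.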






bandwidth is closer to 8 GBps. This means that the gap in memory bandwidth between what the kernel can expect locally and what it can expect if the programmer sets up a local buffer to GPU memory is nearly 3:1. This is symmetrical: if the GPU needed data residing in CPU memory, the same relationship would hold true.

Future devices adhering to the full HSA architecture will have a uniform memory model, eliminating the need to copy data between the CPU and GPU memory regions. Algorithms such as the lookahead functions of x264, when moved into the GPU's processing domain, will have near parity from the standpoint of memory bandwidth.

Performance analysis in a benchmarking application
A second example of a data-parallel application is LuxMark, a graphics-oriented benchmarking package that uses OpenCL and illustrates what can be achieved when differences in memory access times are removed from the equation. The default integrated benchmark was run on the AMD Embedded R-464L APU, whose characteristics are shown in Table 3.

Table 3 | A comparison of operating parameters demonstrates CPU versus GPU performance in an AMD Embedded R-464L APU.

R-464L (35 W TDP)      CPU     GPU
Frequency (GHz)        2.3     0.685
Cores                  4       384
FLOPS/core             8       2
GFLOPS                 73.6    526.08
R-464L total GFLOPS    599.68
R-464L power (W TDP)   35

LuxMark v.
2.0, AMD R-464L (35 W)
Default integrated benchmark

OpenCL GPU          198
OpenCL CPU + GPU    230
OpenCL CPU          146

Table 4 | Results from running the LuxMark 2.0 benchmark on the AMD Embedded R-464L APU indicate how the GPU outperforms the CPU in this processing task.

The results in Table 4 clearly demonstrate that the GPU outperforms the CPU by more than 25 percent for this class of processing task. With a full HSA implementation, one would expect the addition of the CPU plus GPU to be close to the sum of GPU plus CPU: 344. In practice, the resulting number is 230, which is a nontrivial 63 percent boost but not reflective of fully utilized resources. The gap is largely explained by the overhead of the CPU parsing data and transferring the data back and forth between the CPU and GPU at the lower speeds described earlier. Full HSA features will enable the utilization of both the CPU and the GPU with minimal overhead, allowing the combined performance of both CPU and GPU to be significantly better than either one alone.

Maximizing computational resources
Newly released APU products provide significant computational resources to designers of embedded computing products. Taking advantage of those resources requires a solid understanding of the underlying hardware architecture, insight into the data flow issues within the target application, and familiarity with the latest tools such as OpenCL.

Frank Altschuler is senior product manager at AMD. Jonathan Gallmeier is a SMTS/CAS engineer at AMD.

AMD
www.amd.com
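Table 3's totals follow the same frequency x cores x FLOPS-per-core arithmetic as Table 1. The Java sketch below reproduces them; the "ideal" combined LuxMark score is simply the sum of the two single-device scores quoted in Table 4.

```java
// Reproduces Table 3's peak-GFLOPS totals for the embedded R-464L and the
// "ideal" combined LuxMark score (sum of the single-device scores in Table 4).
class R464LCheck {
    // Peak GFLOPS = frequency (GHz) x cores x FLOPS per core per cycle
    static double gflops(double ghz, int cores, int flopsPerCore) {
        return ghz * cores * flopsPerCore;
    }

    public static void main(String[] args) {
        double cpu = gflops(2.3, 4, 8);      // 73.6 GFLOPS
        double gpu = gflops(0.685, 384, 2);  // 526.08 GFLOPS
        System.out.printf("R-464L total: %.2f GFLOPS%n", cpu + gpu);  // 599.68
        int ideal = 146 + 198;  // CPU-only + GPU-only LuxMark scores from Table 4
        System.out.println("Ideal combined LuxMark score: " + ideal + " (measured: 230)");
    }
}
```

The shortfall between the ideal 344 and the measured 230 is the buffer-copy overhead between the CPU and GPU memory domains described in the text.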




Q & A
A modern software platform for the era of multicore and cloud computing
Q&A with Mark Brewer, President and CEO, Typesafe

Created from the ground up to address multicore and parallel computing, the Scala programming language smoothly integrates features of object-oriented and functional languages, enabling developers to be more productive while retaining full interoperability with Java. Mark explains how Scala-based middleware technology can maximize modern multicore hardware and cloud computing software by raising the abstraction level for building multithreaded applications.

ECD: What are the advantages of using general-purpose programming languages like Java and Scala for embedded development?

BREWER: Scala is a general-purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way. Scala smoothly integrates features of object-oriented and functional languages, enabling developers to be more productive while retaining full interoperability with Java and taking advantage of modern multicore hardware.

Scala is also a functional language. Inspired by the long tradition of functional programming, Scala makes it easy to avoid shared state so that computation can be readily distributed across cores on a multicore server and across servers in a data center. This makes Scala an especially good match for modern multicore CPUs and distributed cloud computing workloads that require concurrency and parallelism.

Scala is equipped with an expressive type system that detects and avoids many kinds of application errors at compile time.
At the same time, a sophisticated type inference capability frees developers from the redundant type-information "boilerplate code" that is typical of Java.

Because code sizes are typically reduced by a factor of two to three when compared to an equivalent Java application, Scala is well suited for an embedded environment due to its lightweight and concise nature, as demonstrated in the following code samples:

Java class definition

    public class Person {
        public final String name;
        public final int age;
        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

Equivalent Scala class definition

    class Person(val name: String, val age: Int) {}

Java code

    import java.util.ArrayList;
    ...
    Person[] people;
    Person[] minors;
    Person[] adults;
    {
        ArrayList<Person> minorsList = new ArrayList<Person>();
        ArrayList<Person> adultsList = new ArrayList<Person>();
        for (int i = 0; i < people.length; i++)
            (people[i].age < 18 ? minorsList : adultsList).add(people[i]);
        minors = minorsList.toArray(people);
        adults = adultsList.toArray(people);
    }

Equivalent Scala code

    val people: Array[Person]
    val (minors, adults) = people partition (_.age < 18)


Scala protects investments in existing Java libraries, tools, and developer programming skills. Scala programs are compiled directly to Java bytecode that runs on the mature Java Virtual Machine (JVM), leveraging its robust just-in-time compilation, garbage collection, and well-understood deployment techniques. The operations team doesn't see a difference. Developers keep working with their familiar tools, but they're writing code that's shorter, faster, more scalable, more correct, and maybe even more fun.

ECD: How does the latest wave of multicore processors affect software development with object-oriented programs?

BREWER: The Java programming language was created in 1995, so it was suited to handle the first generation of Internet applications, with object-oriented programming models, a multiplatform runtime, and network orientation. However, with the advent of cloud computing and interactive applications that demand near real-time capabilities, the Java language and traditional Java middleware have begun to show their age when faced with the equally significant hurdles of large-scale distributed applications and multicore platforms.

In most languages, the key to utilizing the full power of multicore CPUs is writing concurrent multithreaded applications. Considering the shared state, state visibility, threads, locks, concurrent collections, and thread notifications involved, writing this type of application is difficult, even for experienced developers. These concepts are by nature error-prone and often result in deadlocks or application crashes.

In an attempt to keep up with the times, Java has evolved, but has become bulky and cumbersome in the process.
Java and Java Enterprise Edition (JEE) application servers are inherently difficult to scale predictably; scaling up is extremely hard to accomplish, and the commercial licensing terms offered by most vendors prohibit the economical scale-out that customers so desperately need as an alternative to this model.

Another essential component of building scalable applications for embedded devices is middleware. Akka, built upon the scalability afforded by the Scala programming language, is a message-oriented programming model for building multithreaded applications. Akka raises the level of abstraction so that developers only need to worry about messages and business logic, instead of dealing with the low-level plumbing required in Java.

Play! builds upon Akka to deliver a Model-View-Controller (MVC)-style Web framework with a development experience much like that enjoyed by Rails developers. Play-mini is a subset of the Play! framework that consists of a REST layer on top of the Netty non-blocking I/O socket server. It offers the ability to deploy Akka applications for service-layer jobs that don't need the rest of the Play tools (for example, the MVC/interface layer).

ECD: How does advanced middleware technology enable developers to build better software for the cloud?

BREWER: Akka is an event-driven middleware framework implemented in Scala for building reliable, high-performance distributed applications. It raises the abstraction level for the developer, removing the need to worry about the low-level plumbing required to create highly concurrent applications in languages such as Java.

Raising the abstraction level is key to building better software in the cloud. Developers can focus on implementing business logic and adding value, not spending time worrying about implementing low-level features such as high-availability services and state/memory management.
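The shift in abstraction can be illustrated without the Akka library itself: the core actor idea is that mutable state sits behind a mailbox and is touched only while processing one message at a time. A minimal, single-threaded sketch in plain Scala (this is the concept only, not Akka's API):

```scala
import scala.collection.mutable

// Messages are plain immutable values
sealed trait Msg
case class Add(n: Int) extends Msg
case object Reset extends Msg

// State (total) is private and mutated only while draining the mailbox,
// one message at a time: no locks, no shared-state reasoning.
class CounterActor {
  private val mailbox = mutable.Queue.empty[Msg]
  private var total = 0

  def send(m: Msg): Unit = mailbox.enqueue(m) // a real actor's mailbox is thread-safe

  def drain(): Int = {
    while (mailbox.nonEmpty) mailbox.dequeue() match {
      case Add(n) => total += n
      case Reset  => total = 0
    }
    total
  }
}
```

Sending `Add(2)` and `Add(3)` and then draining yields 5; all business logic lives in the message handler, which is the property Akka generalizes across threads and machines.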
With Akka, the developer gets an ideal fabric for the cloud that is:

›› Elastic and dynamic, with the ability to expand and contract based on the actual load the system is experiencing
›› Fault-tolerant and self-healing, where the system detects failures automatically and can restart individual components or entire servers based on the business requirements


Strategies | Multicore

›› Customizable and adaptive load-balancing, where software components can route messages based on system load or any other user-definable criteria
›› Configured so that clusters can rebalance the load in failover situations via actor migration, where actors are moved between systems dynamically

The bottom line is that it's very easy to build loosely coupled and dynamic systems that can almost organically change and adapt at runtime.

ECD: Briefly explain Typesafe's technology and the current applications for it in an embedded computing environment.

BREWER: Typesafe's vision is to enable development of concurrent, fault-tolerant applications with a single unified programming model, managed runtime, and binary-compatible distribution.

Typesafe was founded in 2011 by the creators of the Scala programming language and Akka middleware, who joined forces to create a modern software platform for the era of multicore hardware and cloud computing workloads. The company provides an easy-to-use packaging of Scala, Akka, Play!, and developer tools through both an open-source stack and a commercial stack that provides commercial support, maintenance, and operations tools via the Typesafe Subscription. In conjunction with its partners, Typesafe also provides training and consulting services to accelerate the commercial adoption of Scala, Akka, and Play!

Companies that use the Typesafe Stack in an embedded way typically rely on its low-latency, high-throughput, and resiliency design points. It is used for handling millions of messages per second across networks and network devices.

ECD: What challenges are your customers dealing with right now?

BREWER: Performance at scale is a main driver for customers to consider the Typesafe Stack. We've found that customers who have tried the traditional development paradigms using JEE application servers, PHP, and Ruby hit a performance or efficiency wall.
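The fault-tolerant, self-healing behavior described above follows a supervision pattern: a failed component is discarded and replaced with a fresh instance rather than patched in place. A toy, standard-library-only sketch of the idea (again a concept sketch, not Akka's actual supervision API):

```scala
// A deliberately flaky component: the first two calls across all instances fail.
// The attempt counter lives outside the component so restarts don't reset it.
object Env { var attempts = 0 }

class Worker {
  def work(): Int = {
    Env.attempts += 1
    if (Env.attempts < 3) sys.error("transient fault")
    42
  }
}

// The supervisor: on failure, discard the component and start a fresh one,
// up to a restart budget. The caller never sees the intermediate failures.
def supervise(maxRestarts: Int)(make: () => Worker): Int = {
  var restarts = 0
  while (true) {
    try return make().work()
    catch {
      case _: RuntimeException =>
        restarts += 1
        if (restarts > maxRestarts) throw new RuntimeException("giving up")
    }
  }
  sys.error("unreachable")
}
```

Here two transient faults are absorbed by two restarts and the third attempt succeeds.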
In addition to being highly efficient, the components of the Typesafe Stack are compact from both a disk and memory standpoint, especially when compared to a traditional JEE application server, and lend themselves well to the embedded space.

Mark Brewer is president and CEO of Typesafe.

Typesafe
mark.brewer@typesafe.com
typesafe.com


Next-generation multicore SoC architectures for tomorrow's communications networks
By David Sonnier

IT managers are under increasing pressure to boost network capacity and performance to cope with the data deluge. Networking systems are under a similar form of stress, with their performance degrading as new capabilities are added in software. The solution to both needs is next-generation System-on-Chip (SoC) communications processors that combine multiple cores with multiple hardware acceleration engines.

The data deluge, with its massive growth in both mobile and enterprise network traffic, is driving substantial changes in the architectures of base stations, routers, gateways, and other networking systems. To maintain high performance as traffic volume and velocity continue to grow, next-generation communications processors combine multicore processors with specialized hardware acceleration engines in SoC ICs.

The following discussion examines the role of the SoC in today's network infrastructures, as well as how the SoC will evolve in coming years. Before doing so, it is instructive to consider some of the trends driving this need.

Networks under increasing stress

In mobile networks, per-user access bandwidth is increasing by more than an order of magnitude, from 200-300 Mbps in 3G networks to 3-5 Gbps in 4G Long-Term Evolution (LTE) networks. LTE-Advanced technology will double bandwidth again to 5-10 Gbps. Higher-speed access networks will need more and smaller cells to deliver these data rates reliably to a growing number of mobile devices.

In response to these and other trends, mobile base station features are changing significantly. Multiple radios are being used in cloud-like distributed antenna systems. Network topologies are flattening. Operators are offering advanced Quality of Service (QoS) and location-based services and moving to application-aware billing.
The increased volume of traffic will begin to place considerable stress on both the access and backhaul portions of the network.

Traffic is similarly exploding within data center networks. Organizations are pursuing limitless-scale computing workloads on virtual machines, which is breaking many of the traditional networking protocols and procedures. The network itself is also becoming virtual and shifting to a Network-as-a-Service (NaaS) paradigm, which is driving organizations to a more flexible Software-Defined Networking (SDN) architecture.

These trends will transform the data center into a private cloud with a service-oriented network. This private cloud will need to interact more seamlessly and securely with public cloud offerings in hybrid arrangements. The result will be the need for greater intelligence, scalability, and flexibility throughout the network.


Moore's Law not keeping pace

Once upon a time, Moore's Law – the doubling of processor performance every 18 months or so – was sufficient to keep pace with computing and networking requirements. Hardware and software advanced in lockstep in both computers and networking equipment. As software added more features with greater sophistication, advances in processors maintained satisfactory levels of performance. But then along came the data deluge.

In mobile networks, for example, traffic volume is growing by some 2,600 percent per year, owing mostly to the increase in video traffic. This is already causing considerable congestion, and the problem will only get worse when an estimated 50 billion mobile devices are in use by 2016 and the total volume of traffic grows by a factor of 50 in the coming decade.

In data centers, data volume and velocity are also growing exponentially. According to IDC, digital data creation is rising 60 percent per year. The research firm's Digital Universe Study predicts that annual data creation will grow 44-fold between 2009 and 2020, to 35 zettabytes (35 trillion gigabytes). All of this data must be moved, stored, and analyzed, making Big Data a big problem for most organizations today.

With the data deluge demanding more from network infrastructures, vendors have applied a Band-Aid to the problem by adding new software-based features and functions in networking equipment. Software has now grown so complex that hardware has fallen behind. One way for hardware to catch up is to use processors with multiple cores. If one general-purpose processor is not enough, try two, four, 16, or more.

Another way to improve hardware performance is to combine something new – multiple cores – with something old – Reduced Instruction Set Computing (RISC) technology.
With RISC, less is more, based on the uniform register file load/store architecture and simple addressing modes. ARM, for example, has made some enhancements to the basic RISC architecture to achieve a better balance of high performance, small code size, low power consumption, and small silicon area, with the last two factors being important to increasing the core count.

Figure 1 | SoC communications processors combine multiple general-purpose processor cores with multiple task-specific acceleration engines to deliver higher performance with a lower component count and lower power consumption.

Hardware acceleration necessary, but …

General-purpose processors, regardless of the number of cores, are simply too slow for functions that must operate deep inside every packet, such as packet classification, cryptographic security, and the traffic management needed for intelligent QoS. Because these functions must often be performed in serial fashion, there is limited opportunity to process them simultaneously in multiple cores. For these reasons, such functions have long been performed in hardware, and it is increasingly common to have these hardware accelerators integrated with multicore processors in specialized SoC communications processors.

The number of function-specific acceleration engines available also continues to grow, and more engines (along with more cores) can now be placed on a single SoC. Examples of acceleration engines include packet classification, deep packet inspection, encryption/decryption, digital signal processing, transcoding, and traffic management. It is even possible now to integrate a system vendor's unique intellectual property into a custom acceleration engine within an SoC.
Taken together, these advances make it possible to replace multiple SoCs with a single SoC in many networking systems (see Figure 1).

In addition to delivering higher throughput, SoCs reduce the cost of equipment, resulting in a significant price/performance improvement. Furthermore, the ability to tightly couple multiple acceleration engines makes it easier to satisfy end-to-end QoS and service-level agreement requirements. The SoC also offers a distinct advantage when it comes to power consumption, which is an increasingly important consideration in network infrastructures, by providing the ability to replace multiple discrete components with a single energy-efficient IC.

The powerful capabilities of today's SoCs make it possible to offload packet


processing entirely to system line cards, such as those in a router or switch. In distributed architectures like the IP Multimedia Subsystem and SDN, the offload can similarly be distributed among multiple systems, including servers.

Although hardware acceleration is necessary, the way it is implemented in some SoCs today may no longer be sufficient in applications requiring deterministic performance. The problem is caused by the workflow within the SoC itself when packets must pass through several hardware accelerators, which is increasingly the case for systems tasked with inspecting, transforming, securing, and otherwise manipulating traffic.

If traffic must be handled by a general-purpose processor each time it passes through a different acceleration engine, latency can increase dramatically, and deterministic performance cannot be guaranteed under all circumstances. This problem will get worse as data rates increase in Ethernet networks from 1 Gbps to 10 Gbps, and in mobile networks from 300 Mbps in 3G networks to 5 Gbps in 4G networks.

Next-generation multicore SoCs

LSI addresses the data path problem in its Axxia SoCs with Virtual Pipeline technology. The Virtual Pipeline creates a message-passing control path that enables system designers to dynamically specify different packet-processing flows that require different combinations of multiple acceleration engines. Each traffic flow is then processed directly through any engine in any desired sequence without intervention from a general-purpose processor (see Figure 2).
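Conceptually, a virtual pipeline is an ordered, per-flow composition of engine stages. A hedged functional sketch in Scala (the stage names and the toy XOR "decryption" are invented placeholders for illustration, not LSI's Axxia API):

```scala
type Packet = Vector[Byte]
type Engine = Packet => Packet // stands in for a hardware acceleration engine

// Placeholder stages: classification and shaping pass the payload through;
// "decryption" is a toy XOR so its effect is visible.
val classify: Engine = p => p
val decrypt:  Engine = p => p.map(b => (b ^ 0x5A).toByte)
val shape:    Engine = p => p

// A flow's pipeline is the composition of its stages, in order.
// Once built, packets traverse it with no per-stage CPU intervention.
def pipeline(stages: Engine*): Engine = stages.reduce(_ andThen _)

val secureFlow = pipeline(classify, decrypt, shape)
val bypassFlow = pipeline(classify, shape)
```

Different flows simply compose different stage sequences, which mirrors how the Virtual Pipeline lets designers specify per-flow engine orderings.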
This design natively supports connecting heterogeneous cores together, enabling more flexibility and better power optimization.

In addition to faster, more efficient packet processing, next-generation SoCs also include more general-purpose processor cores (to 32, 64, and beyond), highly scalable and lower-latency interconnects, nonblocking switching, a wider choice of standard interfaces (Serial RapidIO, PCI Express, USB, I2C, and SATA), and higher-speed Ethernet interfaces (1G, 2.5G, 10G, and 40G+). To easily integrate these increasingly sophisticated capabilities into a system's design, software development kits are enhanced with tools that simplify development, testing, debugging, and optimization tasks.

Figure 2 | To maximize performance, next-generation SoC communications processors process packets directly and sequentially in multiple acceleration engines without intermediate intervention from the CPU cores.

Next-generation SoC ICs accelerate time to market for new products while lowering both manufacturing costs and power consumption. With deterministic performance for data rates in excess of 40 Gbps, embedded hardware is once again poised to accommodate any additional capabilities required by the data deluge for another three to four years.

David Sonnier is a technical fellow in system architecture for the Networking Solutions Group of LSI Corporation.

LSI Corporation
david.sonnier@lsi.com
www.lsi.com


SPECIAL ADVERTISING SECTIONWhat’sin 2013 in Embedded?The coming year promises strong potential for growth in many market segmentsthat are increasingly relying on embedded technologies to manage power, optimizeperformance, and run any number of essential processes for fast and reliable systemoperations. In particular, the trend toward deploying intelligent systems connected bysmart devices is driving productivity and ingenuity in the embedded space.What’s in 2013 What’s in 2013In this special advertising section, we approached embedded technology companies seeking their perspectives and opinionson what the New Year holds for our industry, summed up in the question “What’s Hot in 2013 in Embedded?” Leading vendorsspanning all of our coverage areas – Silicon, Software, and Strategies – present their viewpoints on what they believe will fuel thefires of innovative development in 2013. Hot topics run the gamut, from automotive infotainment to Machine-to-Machine (M2M)connectivity, FPGA and System-on-Chip (SoC) integration to software monetization, and standardized embedded platforms withflexible processor options to e-Commerce-based support for tackling the challenges of software design complexity.These trends represent opportunities for embedded designers to achieve faster performance, improved power efficiency,heightened security, and a range of features that address the current and future demand for intelligent devices and systems.32 | December 2012 Embedded Computing Design www.embedded-computing.com


Semiconductor Memories for the Automotive Electronics Industry
Micron Technology

Advancement and innovation in the automotive industry are what's hot – from solutions that help keep drivers' eyes on the road with heads-up displays, to wireless solutions that allow cars to talk to each other, to systems that improve night vision through infrared sensors and safety features that provide visibility to protect pedestrians.

The key megatrends that are fueling the pervasiveness of microelectronics in cars and driving memory consumption include:

• Infotainment: According to industry analysts, many buyers today care more about the infotainment technologies embedded in the dashboard than what's under the hood. High-end electronics provide drivers and passengers with in-car navigation and entertainment systems, as well as information delivered over a wireless network. As a result, the global automotive embedded telematics market is expected to grow at a Compound Annual Growth Rate (CAGR) of more than 20 percent from 2013 to 2015.

• Safety: Automotive safety is moving from passive-only systems for accident response to active systems for accident prevention, which can take control of the car and support the driver in emergency situations. Some Advanced Driver Assistance System (ADAS) features are already well-known and provide welcome convenience and safety. These include adaptive cruise control, blind spot monitoring, lane departure warning, and night vision. Microelectronics is the fundamental enabling technology for all advanced safety features in today's cars.

• Environment: Restrictive government regulations aimed at reducing emissions drive continuous improvement in engine and transmission control applications. Microelectronics will play a major role in the quest for the ultimate environmentally friendly car engine.
Fuel-saving engines and hybrid drives have one thing in common: they make use of intelligent electronics to reduce fuel consumption.

The megatrends fueling the growth of automotive electronics have a significant impact on the market demand for semiconductor memories. On average there are 1.5 memory devices in each car, and in some scenarios, up to seven per car. With more than 121 million memory devices shipped into the automotive market in the past year, Micron is at the forefront of developing memory solutions that are driving the power of progress in the automotive industry to a new level.

Car telematics and infotainment applications that require additional storage space for rich multimedia data and advanced software and applications are driving explosive growth for both volatile and nonvolatile memories. Fully digital dashboard instrumentation increases the need for advanced solid-state storage solutions. And new ADAS features rely on embedded intelligent control units with significantly large memory systems for program and data.

Infotainment applications alone are – and will continue to be – the highest memory consumers, representing more than 60 percent of total market value in 2012, followed by ADAS and body control at 34 percent, and then power train solutions.

Memory Matters: High Quality, Innovation, Broad Memory Offering

Micron has one of the broadest portfolios of leading memory technologies for the automotive market. From best-in-class DRAM, NOR, NAND, and e•MMC devices for infotainment; to NOR and DRAM for driving assistance, advanced safety systems, and user-friendly dashboard instrumentation; to highly reliable NOR for power train and transmission – Micron's product offerings enable customers to rely on a single source for all their memory needs.

Micron Technology
www.micron.com


The Solution Company Business Model
Flexera Software
FlexNet® Licensing and FlexNet Operations®

In 2013, hardware commoditization will continue to squeeze the profit margins of intelligent device manufacturers. With globalization continuing to transform the marketplace, device manufacturers can no longer rely on predictable sales of devices to drive growth.

In addition, the intelligent device marketplace will continue to see growth as leading manufacturers and service providers formulate and execute their Machine-to-Machine (M2M) strategies. As more companies move into the M2M space, device manufacturers will continue to experience increased competition. Because customer requirements and expectations will constantly shift and grow, device manufacturers must adapt to remain relevant and competitive.

To drive growth in the face of a demanding marketplace, device manufacturers must differentiate themselves by rethinking their business models and product strategies. One trend that will continue to gain traction in 2013 is device manufacturers' transformation of business models to become more like solution companies, where software drives differentiation and value. The solution company business model empowers manufacturers to transform their product lines by harnessing the power of embedded software (already provisioned on their devices) to differentiate themselves from their competitors.

Solution-centric business models allow device manufacturers to deliver more value to customers in the form of competitive differentiation; easy capacity and capability provisioning and upgrades; protection of Intellectual Property (IP); added revenue streams; and targeting of multiple geographies without increasing manufacturing Stock Keeping Units (SKUs).
The result is increased customer satisfaction, increased profit margins, and significantly improved business performance.

Licensing and entitlement management is critical to the success of the transformation. Flexera Software has helped many device manufacturers evolve from using an exclusively hardware-based business model to one that optimizes software revenues. Flexera Software's FlexNet Licensing and FlexNet Operations solutions enable device manufacturers to monetize intelligent devices, equipment, and software.

FlexNet Licensing and FlexNet Operations benefit device manufacturers in a number of ways:

• Protect and monetize IP, enabling subscription licensing, maintenance/upgrade fees, usage-based pay-for-use or pay-for-burst, metering, and many other models. These solutions support the full software licensing spectrum, from strict enforcement to usage-based "trust, but verify" models.

• Allow device manufacturers to efficiently create lower-end products at lower price points, mid-market products at moderate price points, and premium products at higher price points – all with the same basic code and hardware – by using electronic licensing to turn device capabilities and capacities on and off.

• Enable device manufacturers to offer customers a way to immediately activate newly purchased product capabilities by simply entering a license key. Device manufacturers can also offer these capabilities on a "try before you buy" basis.

• Allow device manufacturers to enter new markets by protecting IP that runs in virtual and cloud (private and public) environments. This gives the customer the benefit of a lower cost by eliminating expensive hardware while preserving profit margin. Software-only virtual appliance solutions can also be delivered under a Software as a Service (SaaS) and/or external cloud model, representing a rapidly growing worldwide market.

Flexera Software
www.flexerasoftware.com/ecm


Seamless M2M Connectivity
Lilee Systems
Lilee Systems TransAir Intelligent Connectivity Solution

Making connectivity a reality for an increasing number of application areas: that's what's hot. From our personal lives to the business world, having key data available at our fingertips is essential, and knowing that the information is always up-to-date and easily accessible is becoming the norm. There is also a growing expectation that the massive amounts of data collected can make the world a better place.

This is the age of the Internet of Things (IoT) and increased intelligent Machine-to-Machine (M2M) implementation. Enabling IoT through intelligent systems is set to be the challenge that every IT team around the world is looking to meet, both with company-owned tools and systems and with the growing Bring Your Own Device (BYOD) additions to the network. And, of course, the desire is that this can be achieved with remote management and without compromising security or adding painful complexity to the overall network topology.

This is a challenge that Lilee Systems is knee-deep in tackling. For Lilee Systems, it is all about the science of keeping M2M assets connected and communicating seamlessly no matter where they are, whether they are stationary or mobile, how fast or slow they are moving, how many endpoints they need to talk to, whether they utilize wireless technologies cost-effectively, and no matter what the environmental conditions. The results, beyond seamless M2M connectivity, will be advancements in situational awareness (GPS, video, voice), proactive information distribution, improved safety, optimized resources, and much more.

Connected transportation is one of the three hottest areas that Lilee Systems predicts will receive a great deal of attention in 2013. The other two are smart energy and environmental monitoring. Serving the M2M connectivity requirements of these areas requires deep understanding of networking and wireless technologies.
The applications within these three market areas touch lives and require complete solutions that are achieved by way of a recipe that combines hardware and software – a recipe that is found in the Lilee Systems TransAir intelligent connectivity M2M solution.

The architecture of the Lilee Systems TransAir intelligent connectivity solution has two parts: the messaging layer and the core network layer. Requirements for link abstraction and message routing, along with user-specific applications, are provided for in the messaging layer. The core network layer in the TransAir end-to-end solution consists of both wired and wireless connectivity. It enables the components of the messaging layer to communicate with one another. For the end user, this means that setting up new elements within growing M2M networks is simplified with respect to the time required for implementation and the amount of staff training needed. With remote configuration and maintenance enabled, IT teams get the added assurance of maximized system uptime. M2M solutions need built-in flexibility such that costs for power usage and data collection/transmission can be kept in budget. Look for further advancements here in 2013.

Lilee Systems sees that the solutions for these three hot markets can be used to enable intelligent connectivity requirements across additional markets, including industrial automation, government and aerospace, eHealth, retail, and finance.

Lilee Systems
www.lileesystems.com


FPGA Technology to Deliver New Capabilities for Systems Requiring Security, Low Power, Reliability, and Integration
Microsemi Corporation
By Paul Ekas, VP of Marketing, SoC Products Group

FPGA suppliers have made impressive gains in functional integration. Virtually all major vendors offer devices combining an FPGA fabric with System-on-Chip (SoC) functionality, including built-in math blocks, high-speed serial interfaces, and microprocessor cores. As we move into 2013, we will see the addition of new capabilities for industrial, military, aviation, communications, medical, and other applications that require even more integration, along with advanced security, low-power operation, and improved reliability.

Design security is important for protecting FPGA configurations from cloning, reverse-engineering, or tampering. The greatest advancements here will come from making base-level security easier to use and adopt. Security should be inherent, which requires the use of SoC FPGAs built on nonvolatile flash technology since, unlike SRAM-based FPGAs, they don't expose a programming bitstream at start-up.

SoC FPGAs must go beyond simply providing design security. During 2013, we will see an increasing emphasis on protecting application data that the FPGA is processing. This includes hardware protection from differential power analysis attacks, the inclusion of non-deterministic random bit generators, and hardware firewalls to protect the integrated microcontroller core.

Low power consumption is also a critical focus area, especially in military designs that must meet Size, Weight, and Power (SWaP) requirements. While microcontrollers have included power-saving modes for years, SoC FPGA low-power modes have only just become available.
For systems that operate reactively (in standby mode, awaiting a trigger event to begin processing) or periodically (processing on a recurring basis), placing SoC FPGAs in ultra-low-power mode between activity bursts can reduce power by 95 percent or more. Today's solutions can enter into and exit from low-power mode within 100 milliseconds, without affecting register and SRAM states or the FPGA I/O. These new SoC FPGAs reduce standby power by 100x over previous-generation devices.

Reliability is an equally important SoC FPGA focus area for 2013. The biggest issue is Single Event Upsets (SEUs), the most common cause of system failures, which can lead to injury in safety-critical industrial and medical systems. SEUs occur when alpha particles in packaging or neutrons in the atmosphere deliver enough charge to change one or more configuration bits in an SRAM-based FPGA. SRAM-based FPGA suppliers have attempted SEU mitigation techniques, but there are still significant periods of time after an SEU event when the system is operating in a corrupted state. In contrast, flash-based SoC FPGAs are immune to SEU effects.

We expect to see growing adoption of SoC FPGAs that deliver higher levels of integration, combining the inherently reliable flash-based FPGA fabric with numerous hard IP blocks on a single die. This architecture maximizes functional resources in smaller silicon space while improving performance and enabling design customization. Embedded processor cores remove the complexity and logic consumption of soft processor cores implemented in the FPGA fabric, and the tight coupling of peripherals and subsystems further improves size and performance.

The latest SoC FPGAs integrate a full-featured microcontroller subsystem with a powerful embedded processor, plus advanced security processing accelerators, DSP blocks, SRAM, embedded nonvolatile memory, and multigigabit serial communication interfaces.
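The 95 percent figure quoted above for duty-cycled operation is simple time-weighted averaging of active and sleep power. The numbers in this sketch are illustrative only, not Microsemi's specifications:

```scala
// Average power for a duty-cycled system: the active fraction of time is spent
// at activeMw, the remainder in ultra-low-power mode at sleepMw.
def avgPowerMw(activeMw: Double, sleepMw: Double, activeFraction: Double): Double =
  activeMw * activeFraction + sleepMw * (1.0 - activeFraction)

// Example: 100 mW active, 1 mW in low-power mode, active 4 percent of the time:
// 100*0.04 + 1*0.96 = 4.96 mW, roughly a 95 percent reduction versus always-on.
```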
Because these devices use a flash process, they can include embedded nonvolatile memory for data and code storage, significantly improving security. The result is reduced board component count, system cost, and power consumption. As we move into 2013, designers of mission- and life-critical systems can use SoC FPGAs with inherently better reliability and security, provided via low-power operation in a highly integrated solution.

Microsemi Corporation
www.microsemi.com


SPECIAL ADVERTISING SECTION

Monetizing Embedded Software
SafeNet
Software Monetization Solutions

Embedded developers are evolving as embedded systems increasingly become off-the-shelf hardware running solution-specific software. Because of this transition, software is becoming a more important part of an embedded system's intellectual property value. Hardware vendors are finding that they are now software vendors, and this evolution is forcing an operational change that some vendors are finding very challenging and others are seeing as an opportunity.

Beyond the transition to software-focused embedded solutions, developers are facing problems with deliberate and unintentional misuse of their technology, tampering, and theft of IP. Usage enforcement, copy protection, and theft prevention can guard against these threats, but they are only the first step toward unlocking the potential value of IP, which requires a focus on both revenue and profitability.

To monetize IP, intelligent device manufacturers need to leverage the four aspects of a successful software monetization strategy – control, packaging, management, and tracking. When software monetization strategies are implemented successfully, intelligent device manufacturers are able to simultaneously offer a better customer experience and a more profitable solution.

Access and Usage Control
Controlling IP is the foundation of software monetization, as intelligent device manufacturers face problems with deliberate and unintentional misuse of their software, product and feature overuse, competitive IP theft, reverse-engineering, and code tampering. The first step is controlling who has access to the device's software, when they're granted access, and to what extent.
By effectively controlling access to source code, device vendors can protect revenue and safeguard the integrity of their brands and products.

Vendors must control the use of their software at the product and feature level to prevent deliberate or unintentional overuse of their offerings and ensure that they are being fairly compensated. Feature-based licensing helps device vendors minimize manufacturing costs while achieving greater packaging flexibility and enabling sophisticated pricing strategies. Feature-based licensing and entitlement management open the door to a world of marketing and sales tools and allow device manufacturers to develop and maintain a single, feature-rich application installed on all devices. The functionality of the device is then controlled through licensing. This enables software vendors to ship the same product with different functionality to different customers at varying price points and upgrade products remotely with lower support and fulfillment costs, thus delivering a better customer experience.

Tracking Usage
Tracking product activation and usage down to the feature level gives intelligent device vendors information to drive product packaging decision-making, roadmap investment, and sales and marketing strategies. With the right knowledge, product management and engineering teams can discontinue unpopular feature combinations and create software packages with the most valued features. In addition, marketing and sales teams can utilize reports to better determine what, when, and how products are being used, and leverage this invaluable data to plan, launch, and execute more effective sales and marketing activities.

Embracing the Change
Shifting from a hardware focus to a software focus typically doesn't happen overnight.
Intelligent device manufacturers who embrace the transition and use software monetization solutions to overcome challenges will minimize threats to their IP, reduce manufacturing and inventory costs, expand their product lines, and bring innovative devices to market with greater efficiency, making them more nimble and competitive for the future.

SafeNet
www.safenet-inc.com/software-monetization/embedded-applications

What's in 2013
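The feature-based licensing model described in the SafeNet piece above — one feature-rich application installed on every device, with functionality unlocked per license — can be sketched as a simple entitlement check. The feature names and bitmask license format below are hypothetical illustrations, not SafeNet's API:

```python
# One application ships to all devices; a per-device license bitmask
# unlocks individual features. Feature names and the license format
# are hypothetical, not SafeNet's actual product.
FEATURES = {"hd_video": 1 << 0, "remote_mgmt": 1 << 1, "analytics": 1 << 2}

def is_licensed(license_mask: int, feature: str) -> bool:
    """Return True if the device's license entitles it to the feature."""
    return bool(license_mask & FEATURES[feature])

# A mid-tier license enabling hd_video and analytics, but not remote_mgmt:
mid_tier = FEATURES["hd_video"] | FEATURES["analytics"]
print(is_licensed(mid_tier, "hd_video"))     # True
print(is_licensed(mid_tier, "remote_mgmt"))  # False
```

Under this scheme a remote upgrade amounts to issuing a new license mask rather than shipping different firmware, which is the support- and fulfillment-cost advantage the article points to.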


SPECIAL ADVERTISING SECTIONWhat’s in 2013 What’s in 2013Smart, Reliable Platforms forEmbedded ComputingElma Electronic Inc.Data Processing and StorageAs video streaming and processing have become ubiquitous in many applications, the demand for platforms that canhandle this data is increasing. OpenVPX is proving to be a strong candidate in video processing applications; while it hasmade strong progress in the defense space due to its rugged design, it is also of interest for critical industrial applications.Elma has been working tirelessly with multiple companies in the VPX ecosystem to show that interoperability is feasibleand contributes to a wider range of solutions and vendors, thereby reducing risk of adoption.Storage has progressed in leaps and bounds over the past few years, making rugged, solid-state options moreaffordable and reliable. Elma has created an entire family of storage products to address the growing need for storingthe overwhelming amounts of data being processed, such as in radar or unmanned surveillance systems or industrialinspection equipment. Requirements such as high read/write rates, removability, RAID arrays, and secure data are allsupported by our storage products.The Need for Small Form Factor (SFF) PlatformsElma is releasing more application-specific solutions to address the growing demand for harsh environment, small formfactor computing in support of industrial control, energy exploration, and infrastructure. More and more functionalitymust be placed in increasingly less space, making environmental issues such as heat dissipation ever more challenging.Elma is building on our 40-plus years of packaging experience to developing innovative, leading embedded solutionsthat address Size, Weight, and Power (SWaP) constraints in deployed environments. 
We are also leveraging our success in legacy standards to build intelligent, tailored solutions for our customers.

High-Speed Optical a Closer Reality
Optical backplanes have long been anticipated in the embedded computer marketplace and have been seen by some as the ultimate solution for higher-bandwidth interconnections. With the release of two ANSI-VITA standards (66.x) addressing backplane optical interfaces, the practicality of backplane-based optical solutions has moved closer to reality. These standards are laying the groundwork for migration to optical technologies within the embedded computing industry.

Communications/Telephony
The increased demand for faster and greater access to mobile data is driving the wider communications markets into new infrastructure investments and small cell site networks. Elma believes that Voice over IP (VoIP) implementations will increase as companies move away from traditional analog phone systems, necessitating more equipment build-outs. Worldwide adoption of IPv6 will also drive the need for embedded computing platforms. The last IPv4 addresses were assigned this past year, and as more mobile devices come online there will be an increased need for new IP addresses, spurring the need for new equipment.

Elma sees these next-gen advancements as a positive force for embedded computing suppliers. To date, we have increased our efforts directed at this market space and have invested in new product offerings that span several PICMG standards, including an entire family of MicroTCA and new CompactPCI Serial platforms, innovative AdvancedTCA backplanes, and rugged mobile systems.

Elma Electronic Inc.
www.elma.com


A Marketplace for Software Developers
Embedded Software Store

Avnet Electronics Marketing and ARM celebrate one year of the popular EmbeddedSoftwareStore.com – an online information and e-Commerce-based website dedicated solely to the embedded design community. Joining forces to help solve the industry challenge of escalating software complexity via one simplified source, the collective efforts of these two visionary leaders provide embedded developers across the globe access to proven solutions with competitive pricing, licensing terms, support, compatibility, and interoperability.

By consolidating a large number of software options that support the ARM architecture within a single domain, the Embedded Software Store helps developers tackle the challenges presented by increased software and SoC design complexity.

Over the past year, the Embedded Software Store (ESS) has grown to include products from 24 vendors, while increasing its online partner support and product selection to nearly 1,000 products.

ESS was launched at ARM's TechCon 2011 to provide an e-Commerce website and online information targeting embedded designers. It took the industry by storm and established a comprehensive one-stop online store that enables embedded designers to evaluate and purchase leading software products to accelerate their ARM processor-based designs through features such as:

• Pass-through Licensing Agreements
• Software/Hardware Associative Sells
• Order Processing via credit card or Avnet Account (Americas-only currently)
• Interactive Online Support Community
• Vendor Support

The Embedded Software Store offers developers a broad range of foundation software for their embedded designs. Examples include:

• Software tools – the Embedded Software Store sells a broad range of the ARM Keil® MDK software development toolkit.
• RTOSs – the ESS offers leading RTOS software through vendors such as Micrium, CMX, Code Time Technologies, and RoweBots.
Customers have a broad choice based on functionality and memory requirements.
• Customers who are exploring open source can find a less time-consuming solution and simplified approach offered by Timesys, which provides a tool, LinuxLink, that helps a developer leverage open source code for specific silicon target boards.

Your next great design is within reach. At the point of inception, visit EmbeddedSoftwareStore.com to collaborate, create, and connect.

Embedded Software Store
www.EmbeddedSoftwareStore.com


VIA Technologies Embraces Both x86 and ARM
VIA Embedded
By Cliff Moon, VIA Technologies

Board-Level Integration Allows ISA-Independent Interfaces

The processor wars will heat up even more during 2013, spawning fierce battles between the proponents of ARM-based systems and their instruction-set adversaries in the x86 camp. Instead of fighting the instruction-set architecture battle, VIA Technologies designs embedded microprocessors using both instruction sets to deliver a broad portfolio of chips and subsystems for embedded markets. The company's Nano, Eden, and QuadCore product families integrate VIA's 64-bit x86 CPU core to target performance-driven applications that need full compatibility with the PC ecosystem. For applications that need the absolute lowest power consumption, VIA integrates ARM's 32-bit CPU cores into embedded SoCs. System designers now have more freedom to optimize their products to take advantage of the best aspects of either CPU architecture while working with a semiconductor company that has almost two decades of experience building boards and modules for embedded applications.

With standard interfaces such as USB, SATA, HDMI, and PCI Express, embedded system designers can implement a board-level product that integrates all of the processor-specific components into a module or single board computer. This integration allows the system to be differentiated at the software-application level, since the CPU at the heart of the system connects through the same standard hardware interfaces with full software driver support for most embedded operating systems. The system designer can select embedded boards that use either x86 or ARM, based on the requirements – or just an arbitrary preference – of the software developers. By selecting different boards or subsystems, the design can scale to meet system-level constraints for performance and power consumption.
For example, as the figure below illustrates, designers using the Pico-ITX form factor (the size of a credit card) can target high performance using as many as four x86 CPU cores or target ultra-low power using a single-core ARM processor.

Since the embedded processors are integrated at the board level, VIA Embedded can offer a broader range of features and performance by designing with semiconductor components from multiple vendors – not just chips from VIA Technologies. For example, the VIA VAB-800 expanded the Pico-ITX family by using a low-power Cortex-A8 (i.MX) processor from Freescale. VIA also uses its own ARM-based processors in products such as the ARM DS system for digital signage. By designing ARM and x86 processors, VIA has a much deeper understanding of microprocessor-based systems than vendors who only build boards. As the battles between ARM and x86 continue to intensify, VIA Embedded is well-positioned in both ISA camps and will continue to expand its product offerings at the silicon, board, systems, and platform levels.

VIA VAB-800 | VIA EPIA-P910
VIA Embedded offers board-level products with either x86 or ARM processors, using common form factors such as Pico-ITX. For example, the EPIA-P910 supports the high-performance VIA QuadCore, while the VIA VAB-800 uses a low-power ARM processor.

VIA Embedded
www.viaembedded.com


E-community Beat
Keeping the embedded conversation going
By Jennifer Hesse
www.embedded-computing.com
BLOGS | MARKET STATS | SURVEYS | VIDEOS | SOCIAL NETWORKING

▲ ARM TechCon 2012 demo: Springboard for embedded development
To unleash the power of high-performance mobile processors like the Qualcomm Snapdragon S4 Plus APQ8060A, embedded designers need a flexible development platform that can provide a quick path to ship end products to market. Awarded the Best in Show for Hardware Design at ARM TechCon 2012, the DragonBoard Development Kit from Intrinsyc offers a complete package for starting embedded designs, including a System-on-Module (SOM) with the Snapdragon dual-core processor plus Wi-Fi and Bluetooth capabilities and a carrier board with a full complement of I/O features that can be tailored to specific application requirements. Developers can find support for designing intelligent devices and systems at the community forum http://mydragonboard.org.

Watch the DragonBoard demo at http://opsy.st/TuJV0T. See more videos in our library: http://video.opensystemsmedia.com.

Man embraces machines in M2M-connected devices
Forget raging against the machine. In today's society, the machine, or more specifically, Machine-to-Machine (M2M) communications is becoming all the rage, with forecasts predicting the market will hit 400 million units by the end of 2017, up from more than 110 million at present, according to a new report from Juniper Research. While smart metering was once the dominant field for M2M deployment, the consumer electronics and telematics sectors are quickly becoming the anchor industries for the technology, thanks to the increased demand for embedded devices spurred by eReaders and telematics systems promising improved driver efficiency and cost management.

The price of M2M modules is expected to decline, especially for 3G modules, as automotive and consumer electronics systems require better bandwidth and latency, and as 2G infrastructure is retired in some markets.
While 4G chipsets are currently shipping in low volumes, they are projected to see growth in the automotive industry and specific applications such as live video monitoring.

Read more: http://opsy.st/TJgiIW

EDA Digest TechChannel
The premier issue of EDA Digest is making a big entrance in the Electronic Design Automation (EDA) scene, highlighting top companies to watch in 2012-2013 plus the industry's top innovators, whose designs serve as the foundation of EDA today and are shaping design automation for tomorrow's systems. Check out the EDA Digest Resource Guide for columns, articles, product profiles, and more at: http://opsy.st/SLME4X. And find extended coverage including videos and social media updates at the EDA Digest TechChannel at: http://tech.opensystemsmedia.com/eda.

2012 Top Innovators in EDA: Serial innovation, collaboration, and patience
By Mike Demler
One of the first observations I had when reviewing our 2012 Top Innovators in EDA was that there are a number of individuals who are "serial innovators," having contributed to several of the most significant developments in EDA over a period of many years. While an innovation may most often be thought of as a new idea, I hope you will agree that the true measure of an innovation is the impact it has on changing how things are done. That takes time, and innovators who have done that more than once stood out in the group of nominees.

Read more: http://opsy.st/TPVCTE


INNOVATION IS 100% RELIABILITY
65 MILLION MILES FROM HOME. AGAIN.

NASA's Curiosity Rover
Developed with Wind River Workbench and built on VxWorks RTOS
Image courtesy NASA/JPL-Caltech

When NASA JPL decided to return to Mars, they turned to Wind River®. Again. The result is Curiosity, the most sophisticated autonomous robotic spacecraft ever created. Using Wind River Workbench for development and debugging, the NASA JPL team was able to bring Curiosity to life. And now that Curiosity is safely on the Martian surface, the VxWorks® real-time operating system (RTOS) is controlling such mission-critical tasks as ground operations, data collection, and Mars-to-Earth communication. It's the kind of 100% reliability NASA JPL has come to count on after nearly 30 years with Wind River.

To see how Wind River can help you innovate more intelligently, visit us at http://www.windriver.com/announces/curiosity.

INNOVATORS START HERE.
