
Linux Journal | November 2013 | Issue 235




OPENSTACK: SPECIAL REPORT
Introduction to OpenStack
A look at OpenStack—the fastest-growing Open Source community in the world.
Tom Fitfield

COLUMNS
Reuven M. Lerner’s At the Forge: Rails and PostgreSQL
Dave Taylor’s Work the Shell: Debugging Web Sites
Kyle Rankin’s Hack and /: Super Pi Brothers
Shawn Powers’ The Open-Source Classroom: Little Computers, Big Projects
Doc Searls’ EOF: Life on the Forked Road

KNOWLEDGE HUB
Webcasts and White Papers

IN EVERY ISSUE
Current_Issue.tar.gz
Letters
UPFRONT
Editors’ Choice
New Products
Advertisers Index

VALVE STEAMOS
SOLIDRUN’S CUBOX PRO

ON THE COVER
• SIDUS for Extreme Operating System Deduplication
• GlusterFS: Install, Benchmark and Optimize
• Integrate Applications with the Python-Based Zato Platform
• OpenStack Special Report
• A Look at SolidRun’s CuBox Pro
• Play Classic Console Games on a Raspberry Pi
• Rails 4’s New PostgreSQL Features

LINUX JOURNAL (ISSN 1075-3583) is published monthly by Belltown Media, Inc., 2121 Sage Road, Ste. 395, Houston, TX 77056 USA. Subscription rate is $29.50/year. Subscriptions start with the next issue.


Executive Editor: Jill Franklin, jill@linuxjournal.com
Senior Editor: Doc Searls, doc@linuxjournal.com
Associate Editor: Shawn Powers, shawn@linuxjournal.com
Art Director: Garrick Antikajian, garrick@linuxjournal.com
Products Editor: James Gray, newproducts@linuxjournal.com
Editor Emeritus: Don Marti, dmarti@linuxjournal.com
Technical Editor: Michael Baxter, mab@cruzio.com
Senior Columnist: Reuven Lerner, reuven@lerner.co.il
Security Editor: Mick Bauer, mick@visi.com
Hack Editor: Kyle Rankin, lj@greenfly.net
Virtual Editor: Bill Childers, bill.childers@linuxjournal.com

Contributing Editors
Ibrahim Haddad • Robert Love • Zack Brown • Dave Phillips • Marco Fioretti • Ludovic Marcotte • Paul Barry • Paul McKenney • Dave Taylor • Dirk Elmendorf • Justin Ryan • Adam Monsen

Publisher: Carlie Fairchild, publisher@linuxjournal.com
Director of Sales: John Grogan, john@linuxjournal.com
Associate Publisher: Mark Irgang, mark@linuxjournal.com
Webmistress: Katherine Druckman, webmistress@linuxjournal.com
Accountant: Candy Beauchamp, acct@linuxjournal.com

Linux Journal is published by, and is a registered trade name of, Belltown Media, Inc.
PO Box 980985, Houston, TX 77098 USA

Editorial Advisory Panel
Brad Abram Baillio • Nick Baronian • Hari Boukis • Steve Case • Kalyana Krishna Chadalavada • Brian Conner • Caleb S. Cullen • Keir Davis • Michael Eager • Nick Faltys • Dennis Franklin Frey • Alicia Gibb • Victor Gregorio • Philip Jacob • Jay Kruizenga • David A. Lane • Steve Marquez • Dave McAllister • Carson McDonald • Craig Oda • Jeffrey D. Parent • Charnell Pugsley • Thomas Quinlan • Mike Roberts • Kristin Shoemaker • Chris D. Stark • Patrick Swartz • James Walker

Advertising
E-MAIL: ads@linuxjournal.com
URL: www.linuxjournal.com/advertising
PHONE: +1 713-344-1956 ext. 2

Subscriptions
E-MAIL: subs@linuxjournal.com
URL: www.linuxjournal.com/subscribe
MAIL: PO Box 980985, Houston, TX 77098 USA

LINUX is a registered trademark of Linus Torvalds.


[ LETTERS ]

assumed the use of vi(m), which you have (your mention that less works like vi, for instance), set -o vi at the bash prompt is infinitely more powerful than using the default Ctrl-r and so on. You make the point that less is somewhat vim-like, but you miss that the shell can be also. I think you’re selling people short by offering something that is less than optimal. I’m sure screen is very useful, but having a decent tabbed terminal emulator (konsole, for example in X; mremote or poderosa in the Windows environment) renders it somewhat irrelevant to me.

As a UNIX user of >25 years, yes, I have mastered this stuff, but I also teach for a large international training organisation (in addition to my day job). In that role, I urge my students that if they expect to use UNIX (read Linux) every day, learning vi is a good idea. Then I show them the power of set -o vi (at the bash prompt), which leverages the power of vi in command-line editing. Finally, I show them how using vi with autowrite switched on makes writing (for instance) scripts or config files very easy.

Suggesting that people have multiple screens open and write the file in vim on one and then restart the dæmon in another is just plain bad. Rather, show them how in vi they can run a shell escape, which with autowrite switched on, will write the file first, then call the shell-escaped command. No more “did I save the file in the other session?” nonsense. No more “how do I get back to


terminal emulators instead.

You talk a lot about editing config files and restarting dæmons, but that’s missing the point. Those were example activities you might do on a remote server. The point is doing many things at the same time. Read a man page in one window. Tail a log file in another. Compile source code in a third. Run a debugger in a fourth, and so on. You could replace the editing config and restarting dæmon examples with these or other activities, and the article still makes sense. What you do in the examples was not the point. The point was that you can do many things.

DDOS—Sendmail in Script

To Dave Taylor, regarding his article “Web Administration Scripts—Redux” in the September 2013 issue: just wondering, you say that -s for subjects is not supported by the sendmail client, but you can, of course, create a text template-email where you can support all the (data-part smtp) headers you want, including a subject. When your instance of the template is ready, just feed it through a pipe to sendmail, and you’ve got everything you want, right?

Or, it is always possible I’m not understanding what you’re doing, of course, in which case I’d ask you to discard this.

I appreciate your monthly article, Dave. Keep it up.
—Lieven

Dave Taylor replies: You’re absolutely correct! Indeed, it’d be easy to have something like “sendmails” that simply accepted two arguments, $1 as the recipient and $2 as the subject:

    # -t makes sendmail take the recipient from the To: header
    (echo "To: $1"
    echo "Subject: $2"
    echo ""
    cat - ) | sendmail -t

Then, you’d invoke it as:

    sendmails lievendp@gmail.com "this is a sample sendmail wrapper"

Is it elegant? Nah. But it’d work. Thanks for your e-mail and for reading my column!

Oculus Rift First Impressions and Official Linux Support

You should check out this editorial and video: http://www.gamingonlinux.com/articles/oculus-rift-first-impressions-and-official-linux-support.2133.
—Liam Dawe


Dangit, Liam. I’ve never wanted to try a piece of hardware so bad!
—Shawn Powers

Reading Suggestions for Jeff Regan

In the September 2013 issue’s Letters, reader Jeff Regan laments reading LJ on a tablet. He is right; tablets are terrible for reading. I highly recommend the B&N Nook Simpletouch or other E Ink device for reading. I have rooted mine (Android 2.1, which is enough to use Anki, read PDFs and do light Web browsing), and it has redefined the way I read. I cannot recommend it highly enough.
—Dotan Cohen

My last magazine-worthy tablet was an original Kindle DX, and although it made large format files readable, its speed (or lack thereof) made it frustrating. I’m guessing newer devices have better refresh rates and cleaner displays. I’d still miss the color though. Color E Ink, please come quickly!
—Shawn Powers

Coarse Language

The language in the title of Doc Searls’ article summary on Linux is entirely inappropriate [see Doc’s “Linux vs. Bullshit” in the September 2013 issue]. Each of our acts moves standards and behaviour up or down. This moves things down. Not good.
—Bryan Batten

Doc Searls replies: My main source for the column was On Bullshit, an academic work by Harry Frankfurt, a professor emeritus of philosophy at Princeton. It is, as far as I know, the only deep investigation of a vernacular English word that carries a high degree of meaning and for which there is no synonym. That it’s also vulgar does not give us cause to avoid it. Could be that understanding it better might help us create a true synonym that isn’t vulgar.

I believe Linux is also vernacular (though hardly vulgar) and visited this possibility in a column titled “The New Vernacular” in 2001 (http://www.linuxjournal.com/article/4553).

Renewed Subscription

After many years of reading the paper version of LJ, I sadly had to let my subscription lapse when you went all electronic. It simply didn’t fit with where the magazine went—for example, in the car, by the pool, cricket pitch and various other places whilst waiting for numerous kids to finish whatever they were doing. Although I had access to computers, none were portable enough for this use, and I couldn’t justify the cost of another one. Yesterday, my new Nook arrived for £79 delivered—bargain.

But you never forgot me, sending me regular e-mail messages and inviting me back, so last night, I downloaded the free version and found all my favourite authors still there to entertain and inform us. I renewed immediately, of course. Thank you, and all the best for the future.
—Roger Greenwood

Roger, welcome back! Our original switch to digital was at a rough time in technology’s evolution. There were tablets, but they weren’t mainstream enough (or affordable enough). I’m glad to hear the Nook is working well for you.
—Shawn Powers

September Letter “Electronic vs. Paper”

First, I am glad I renewed my subscription to LJ. I was a longish-time paper magazine subscriber. I do have some sympathy with those who would prefer a paper copy, but the copies on my hard drive do take up less room.

Anyway on to business—I am a member of the Radio Society of Great Britain (RSGB), which runs an excellent Web site full of useful news and radio-amateur-centric information, which practice, I believe, Linux Journal has copied successfully. However, the RSGB also brings out in the autumn (for Christmas) a “Year Book” that repeats much of the technical specifications found on the Web site plus relevant articles on things radio, such as the advantages of building receiver type X over receiver type Y and so on.

I just wanted to add this to any thoughts LJ may have in this direction. I certainly look at my copy a number of times in the year, as it is just so much more convenient and quicker than using the desktop. I know you do an archive CD, but a book makes a better present.
—Roy Read

Gift giving—that’s an interesting notion. I agree it’s difficult to wrap 1s and 0s. Although I can’t promise anything, thank you for the idea. It’s worth considering for sure.
—Shawn Powers


nicer if you were more objective like a historian.
—Can

Reuven M. Lerner replies: Thanks for your message. I often tell people that there are open-source people who hate Microsoft, and then I explain that I’m not one of them. Rather, I say I’m an open-source person who more or less ignores Microsoft—that I know it’s a major (indeed, dominant) force in the computer industry, but that it doesn’t directly affect my day-to-day life, and that I’m rather ignorant about its technologies.

There are oodles of smart people working at Microsoft, and they have clearly done a lot to advance Web technologies. But I must admit that I know nothing, or almost nothing, about those technologies.

So for me to have written about Microsoft technologies would have undoubtedly demonstrated my ignorance, and/or required research that I don’t have time to do. That said, I completely accept what you’re saying, and I probably should have noted that the column was based on my personal experience, and the technologies I personally used—meaning that I undoubtedly left out a lot of influential and important ones.

It’s always great to hear from people who read my columns, even when you’re being critical. Thanks again for writing, and always feel free to let me know what you think, or if you have any suggestions for future topics.

Dave Taylor’s DDOS

I’m wondering if Dave Taylor has tried ModSecurity to fight his DDOS [see Dave’s column in the August and September 2013 issues]. I’m evaluating it for my own use, and it seems like it could help. If the attacker is placing something unusual in the HTTP headers, like a special user-agent string, ModSecurity can be used to drop the connection before the request is processed.

There is a reasonably good e-book about it at http://www.packtpub.com.
—Paul Gassoway

Dave Taylor replies: Thanks for your suggestion, Paul. As it happens, I ended up just moving to a new hosting company that has a team focused on helping out with these sorts of problems after all!


Work the Shell Idea

As a blogger who deals with climate and weather, sometimes I like to know right away if a major sudden event is happening, such as a tsunami. The National Weather Service has a tsunami alert system (@NWS_WCATWC) on Twitter. Whenever there is an earthquake, and there is no tsunami expected, they tweet something that includes “Tsunami NOT expected”, but when there is a tsunami expected, they tweet a different string (along with info on the event).

I would like to see a script that regularly checks Twitter (as a cron job) for reliable sources (such as the NWS specialized feeds and so on) for certain key phrases, and if there is an event indicated, then to search Twitter for an appropriate hashtag (such as #tsunami and #[place it is happening]). The script then should send me an alert of some kind, so I can look at the list of tweets to figure out what is going on.

This is a little like the DDOS script, because if you just search for #tsunami on Twitter, there would be mostly junk most of the time, but when there is an official alert, the #tsunami hashtag would include a lot of actually useful information.

A new script by Dave that uses the current Twitter API would be useful, as that changes now and then, and a script that includes parsing a select Twitter feed, responding to that feed by then parsing other feeds, would be intelligent(ish) and a great demonstration of how to use a remarkable 20th-century technology (bash, Linux) to solve a problem in a very 21st-century way.
—Greg Laden

Thanks for the Unicode Article

To Reuven M. Lerner: I am a systems administrator, and to be honest, I normally skip over your articles in Linux Journal. However, I’ve got to give you some credit. The article on Unicode in the June 2013 issue was spectacular.

You clearly and concisely explained Unicode, and for the first time, I understand how UTF-8 works and why it has become so popular. Thanks for the great article!
—Carl


Reuven M. Lerner replies: Thanks for the warm feedback; I’m delighted to know that you enjoyed the article!

Photo of the Month

Enjoy.
—Can Erten

WRITE LJ A LETTER
We love hearing from our readers. Please send us your comments and feedback via http://www.linuxjournal.com/contact.

PHOTO OF THE MONTH
Remember, send your Linux-related photos to ljeditor@linuxjournal.com!

At Your Service

SUBSCRIPTIONS: Linux Journal is available in a variety of digital formats, including PDF, .epub, .mobi and an on-line digital edition, as well as apps for iOS and Android devices. Renewing your subscription, changing your e-mail address for issue delivery, paying your invoice, viewing your account details or other subscription inquiries can be done instantly on-line: http://www.linuxjournal.com/subs. E-mail us at subs@linuxjournal.com or reach us via postal mail at Linux Journal, PO Box 980985, Houston, TX 77098 USA. Please remember to include your complete name and address when contacting us.

ACCESSING THE DIGITAL ARCHIVE: Your monthly download notifications will have links to the various formats and to the digital archive. To access the digital archive at any time, log in at http://www.linuxjournal.com/digital.

LETTERS TO THE EDITOR: We welcome your letters and encourage you to submit them at http://www.linuxjournal.com/contact or mail them to Linux Journal, PO Box 980985, Houston, TX 77098 USA. Letters may be edited for space and clarity.

WRITING FOR US: We always are looking for contributed articles, tutorials and real-world stories for the magazine. An author’s guide, a list of topics and due dates can be found on-line: http://www.linuxjournal.com/author.

FREE e-NEWSLETTERS: Linux Journal editors publish newsletters on both a weekly and monthly basis. Receive late-breaking news, technical tips and tricks, an inside look at upcoming issues and links to in-depth stories featured on http://www.linuxjournal.com. Subscribe for free today: http://www.linuxjournal.com/enewsletters.

ADVERTISING: Linux Journal is a great resource for readers and advertisers alike. Request a media kit, view our current editorial calendar and advertising due dates, or learn more about other advertising and marketing opportunities by visiting us on-line: http://www.linuxjournal.com/advertising. Contact us directly for further information: ads@linuxjournal.com or +1 713-344-1956 ext. 2.


UPFRONT
NEWS + FUN

diff -u
WHAT’S NEW IN KERNEL DEVELOPMENT

The stable kernel series recently got bit by an over-eagerness to accept patches. Typically, the rule had been that Greg Kroah-Hartman wouldn’t accept patches that had not gotten into Linus Torvalds’ development kernel first. People sometimes complained that this was absurd, since the patches in question always were intended for the stable tree, and so it seemed to make no sense to wait for them to be accepted into a completely different, development-centric tree.

But the rule existed to ensure that the good and useful stabilization patches wouldn’t go only into the ephemeral stable releases, but also would be incorporated into the long-term future of the kernel (that is, Linus’ tree).

Nevertheless, sometimes people complained. And, Greg did try to take patches as soon as they crossed the threshold into Linus’ git repository. But recently he did this with a genetlink patch that seemed fine at first, but it ended up having a bad interaction with other code that wasn’t caught until after the 3.0.92 stable kernel came out.

Greg viewed this as a bit of a wake-up call to slow down on accepting patches until they were not only in Linus’ tree, but also had a chance to be tested for the duration of one -rc cycle. That would provide a much better chance of shaking out any bugs and keep the stable kernels as stable as one might reasonably expect.

Linus pointed out that tying the waiting period to an -rc cycle was fairly arbitrary. He said waiting a week or so probably would be fine. He also added that for particularly critical or well-discussed patches, a delay probably would be unnecessary.

Greg posted a more formal description of his new patch-acceptance policy, which also specified that if an official maintainer signed off on the patch, or if Greg saw a sufficient amount of discussion about it, or for that matter, if the patch was just obvious, then the delay for acceptance could be skipped.

Greg’s initial proposal also included his idea of waiting for a full -rc cycle. Nicholas A. Bellinger had no problem with that, and Willy


[ UPFRONT ]

The development process itself also will change. At the time of Grant’s announcement, the process had become too frantic, and there was no distinction between DT bindings that were clearly stable and those that were in flux and still under development.

Tomasz suggested having a clear process for migrating any given DT binding from a staging area into a stable area. Once stable, he said, they should have the sanctity of an application binary interface (ABI), which typically must never be allowed to break backward compatibility.

David Gibson also was very interested in this project. He’d made an earlier attempt at schema validation, and he wanted to see if Tomasz could fare better. He also pointed out that there was a perception problem among users, and that they tended to think that a given DT binding—as being distributed in the kernel—was valid for that kernel and no others. But David said this was not the case, and that some effort should be made to inform users that DT bindings are intended to be generalized descriptions of hardware and not tied to any particular kernel version. But, he wasn’t sure how that could be done.

The discussion took many paths. At some point, someone suggested using XML with a corresponding document type definition as the DT schema. There was some support for this, but Tomasz didn’t like mixing XML with the existing DT schema format.

The overarching discussion was open-ended, but clearly there are big changes on the way, regarding the DT-binding specifications.
—ZACK BROWN

They Said It

It is wonderful how quickly you get used to things, even the most astonishing.
—Edith Nesbitt

There is nothing like the occasional outburst of profanity to calm jangled nerves.
—Kirby Larson, Hattie Big Sky, 2006

The art of being wise is the art of knowing what to overlook.
—William James

Laughter is inner jogging.
—Norman Cousins

Facing it, always facing it, that’s the way to get through. Face it.
—Joseph Conrad


Android Candy: Flickr Uploader

As luck would have it, shortly after I purchased a Flickr Pro subscription, Yahoo decided to eliminate the pay-for option with Flickr and give everyone a free terabyte of space to store photos. Although I’m still not convinced Flickr is the best way to store photos, I have found it to be very flexible, so for now, my family members dump all their photos to Flickr from all their devices. (And then I back them up with https://pypi.python.org/pypi/flickrbackup because I’m paranoid.)

On my phone, a Galaxy S4, I find it cumbersome to upload photos directly to Flickr, so I bought Flickr Uploader from the Google Play store, and now all my photos are uploaded to Flickr instantly when I take them. Options are available to limit uploading to Wi-Fi only, or to upload only certain kinds of photos (camera photos but not screenshots, for example). The application is $4.99, but the developer has created “coupons” to get the application for a huge discount or even free (https://github.com/rafali/flickr-uploader/wiki/Coupons).

I find the application to be smooth, stable and incredibly convenient. It’s important to realize, however, that if you’re the sort of person to take personal or compromising photos, double- and triple-check your settings to make sure you don’t upload publicly viewable shots automatically! If you’re a Flickr user, check out Flickr Uploader today: https://play.google.com/store/apps/details?id=com.rafali.flickruploader.
—SHAWN POWERS


Valve—It Really Does Love Linux

I’ve teased about Steam, speculated about Steam and even bragged about Steam finally coming to Linux. Heck, check out the screenshot for just a partial list of games already running natively under our beloved OS. Little did I know that the folks at Valve not only planned to support Linux, but they’re also putting a big part of their future behind it as well!

Valve recently announced SteamOS (http://store.steampowered.com/livingroom/SteamOS), which is a Linux-based operating system designed from the ground up to play games. Steam games. The operating system will be free, and it also will be the OS running on Valve’s next big thing, SteamMachines (http://store.steampowered.com/livingroom/SteamMachines). Although the hardware won’t be available until sometime in 2014, the OS should be out before then.

Will Valve be the company to bridge the gap between computer gaming and console gaming? Will PC games translate to a television screen smoothly? Will the SteamOS’s “game streaming” technology effectively bring the entire Steam library to Linux? I’ve learned enough not to make any predictions, but I can tell you it’s an exciting prospect, and it’s even more exciting because it’s all running on Linux!
—SHAWN POWERS


brings up the main window with a working console that should be displaying the license information for FreeMat (Figure 1).

As with most programs like FreeMat, you can do arithmetic right away. All of the arithmetic operations are overloaded to do “the right thing” automatically, based on the data types of the operands. The core data types include integers (8, 16 and 32 bits, both signed and unsigned), floating-point numbers (32 and 64 bits) and complex numbers (64 and 128 bits). Along with those, there is a core data structure, implemented as an N-dimensional array. The default value for N is 6, but it is arbitrary. Also, FreeMat supports heterogeneous arrays, where different elements are actually different data types.

Commands in FreeMat are line-based. This means that a command is finished and executed when you press the Enter key. As soon as you do, the command is run, and the output is displayed immediately within your console. You should notice that results that are not saved to a variable are saved automatically to a temporary variable named “ans”. You can use this temporary variable to access and reuse the results from previous commands. For example, if you want to find the volume of a cube of side length 3, you could do so with:

    --> 3 * 3
    ans =
     9
    --> ans * 3
    ans =
     27

Of course, a much faster way would be to do something like this:

    --> 3 * 3 * 3
    ans =
     27

Or, you could do this:

    --> 3^3
    ans =
     27

If you don’t want to clutter up your console with the output from intermediate calculations, you can tell FreeMat to hide that output by adding a semicolon to the end of the line. So, instead of this:

    --> sin(10)
    ans =
     -0.5440
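For readers who want to cross-check the numbers above outside FreeMat, here is a minimal Python sketch of the same arithmetic. It is only an illustration: the helper name `cube_volume_stepwise` is made up for this example, and the variable `ans` is assigned by hand to mirror FreeMat's temporary variable, since a Python script has no automatic equivalent.

```python
import math


def cube_volume_stepwise(side):
    """Mimic the two-step FreeMat session: side * side, then ans * side."""
    ans = side * side   # first command: the result lands in "ans"
    ans = ans * side    # second command reuses the previous result
    return ans


volume = cube_volume_stepwise(3)
direct = 3 ** 3                  # Python spells FreeMat's 3^3 as 3 ** 3
sine = round(math.sin(10), 4)    # the argument is in radians, as in FreeMat

print(volume, direct, sine)
```

Both routes give the same volume, and the rounded sine agrees with the four decimal places FreeMat displays.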


overloaded when it comes to interacting with arrays too. For example, let’s say you want to take all of the elements of an array and double them. In a lower-level language, like C or Fortran, you would need to write a fair bit of code to handle looping through each element and multiplying it by 2. In FreeMat, this is as simple as:

    --> a = [1,2,3,4]
    a =
     1 2 3 4
    --> a * 2
    ans =
     2 4 6 8

If you are used to something like NumPy in the Python programming language, this should seem very familiar. Indexing arrays is done with parentheses. FreeMat uses 1-based indexes, so if you wanted to get the second value, you would use:

    --> a(2)

You also can set the value at a particular index using the same notation. So you would change the value of the third element with:

    --> a(3) = 5

These arrays all need to have the same data type. If you have need of a heterogeneous list of elements, this is called a cell array. Cell arrays are defined using curly braces. So, you could set up a name and phone-number matrix using:

    --> phone = { 'Name1', 5551223; 'name2', 5555678 }

There are two new pieces of syntax introduced here. The first is how you define a string. In FreeMat, strings are denoted with a set of single quotation marks. So this cell array has two strings in it. The second new syntax item is the use of a semicolon in the definition of your array. This tells FreeMat that you’re moving to a new row. In essence, you’re now creating a matrix instead of an array.

Plotting in FreeMat is similar to plotting in R or matplotlib. Graphics functions are broken down into high-level and low-level functions. The most basic high-level function is plot(). This function is overloaded to try to do the right thing based on the input. A basic example would be plotting a sine function:

    --> t = -63:64;
    --> signal = sin(2*pi*t/32);
    --> plot(t, signal)
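To make the contrast with a general-purpose language concrete, the array operations above can be sketched in plain Python using only the standard library. This is an illustrative comparison rather than anything FreeMat provides; the helper `get_1based` is a hypothetical name introduced here to imitate FreeMat's 1-based `a(2)` lookup.

```python
import math

a = [1, 2, 3, 4]

# Element-wise doubling: plain Python lists need an explicit loop,
# which FreeMat's overloaded "a * 2" hides.
doubled = [x * 2 for x in a]


def get_1based(seq, i):
    """Return the i-th element counting from 1, as FreeMat's a(i) does."""
    if i < 1 or i > len(seq):
        raise IndexError("1-based index out of range")
    return seq[i - 1]


second = get_1based(a, 2)

# FreeMat's a(3) = 5 becomes a[2] = 5 with Python's 0-based indexing.
a[2] = 5

# Same samples as "t = -63:64; signal = sin(2*pi*t/32)".
t = list(range(-63, 65))   # the FreeMat range includes the endpoint 64
signal = [math.sin(2 * math.pi * k / 32) for k in t]

print(doubled, second, a, len(t))
```

Note that `a * 2` on a Python list repeats the list instead of doubling its elements, which is one reason array-oriented tools overload the operator the way FreeMat does.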


This way, if you have some piece of code that is already written and tuned in C to get the maximum performance, you simply can import it and use it from within FreeMat with very little fuss.

Since FreeMat is as capable as MATLAB or Octave, you can assume that I’ve barely covered everything offered by FreeMat here. Hopefully, you have seen enough to spark your interest and will make some time to go and explore even more.

Resources

• Main Web Site: http://freemat.sourceforge.net
• FreeMat Tutorials: https://code.google.com/p/freemat/wiki/Tutorials
• FreeMat Wiki: https://code.google.com/p/freemat/w/list

—JOEY BERNARD


Notice packet loss to the Google DNS server, but none to my gateway. So, the problem isn’t with my house connection.

SmokePing gets this month’s Editors’ Choice Award. Check it out at http://oss.oetiker.ch/smokeping.

SHAWN POWERS


COLUMNS: AT THE FORGE

for tiny data sets, and that it uses PostgreSQL by default, has made it a popular option for people learning Rails, for small applications and for many people who want to outsource their hosting.

As a result of PostgreSQL’s growing popularity, the latest (4.x) version of Ruby on Rails includes extensive, built-in support for many PostgreSQL features. In this article, I introduce a number of these features, both from the perspective of a Rails developer and from that of a PostgreSQL administrator and DBA. Even if you aren’t a Rails or PostgreSQL user, I hope these examples will give you a chance to think about how much you can and should expect from your database, as opposed to handling it from within your application.

UUIDs as Primary Keys

One of the first things database developers learn is the need for a primary key, a field that is guaranteed to be unique and indexed, and that can be used to identify a complete record. That’s why many countries have ID numbers; using that number, government agencies, banks and health-care systems quickly can pull up your information. The usual standard for primary keys is an integer, which can be defined in PostgreSQL using the SERIAL pseudo-type:

    CREATE TABLE People (
        id SERIAL PRIMARY KEY,
        name TEXT,
        email TEXT
    );

When you use the SERIAL type in PostgreSQL, that actually creates a “sequence” object, on which you can invoke the “nextval” function. That function is guaranteed to give you the next number in the sequence. Although you can define it to have a step of more than one, or to wrap around when it’s done, the most usual case is to use a sequence to increment an ID counter. When you ask PostgreSQL to show you how this table is defined, you can see how the “id” field’s definition has been expanded:

    \d people
                              Table "public.people"
    +--------+---------+-----------------------------------------------------+
    | Column | Type    | Modifiers                                           |
    +--------+---------+-----------------------------------------------------+
    | id     | integer | not null default nextval('people_id_seq'::regclass) |
    | name   | text    |                                                     |
    | email  | text    |                                                     |
    +--------+---------+-----------------------------------------------------+
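To make the idea of a sequence more concrete, here is a toy model in Python of the behavior described above: a counter with a configurable step that can optionally wrap around. The class and its parameter names are invented for illustration and are not PostgreSQL’s implementation; a real sequence has further options and is safe under concurrent use, which this sketch does not attempt.

```python
class Sequence:
    """Toy model of a database sequence (hypothetical, for illustration only)."""

    def __init__(self, start=1, step=1, max_value=None, cycle=False):
        self.start = start
        self.step = step
        self.max_value = max_value
        self.cycle = cycle
        self._next = start

    def nextval(self):
        """Return the next number in the sequence and advance the counter."""
        if self.max_value is not None and self._next > self.max_value:
            if not self.cycle:
                raise OverflowError("sequence reached its maximum value")
            self._next = self.start  # wrap around to the beginning
        value = self._next
        self._next += self.step
        return value


# The common case: an ID counter that goes up by one.
people_id_seq = Sequence()
ids = [people_id_seq.nextval() for _ in range(3)]
print(ids)  # prints: [1, 2, 3]

# A sequence with a step of 2 that wraps around after 5.
wrapping = Sequence(start=1, step=2, max_value=5, cycle=True)
wrapped = [wrapping.nextval() for _ in range(5)]
print(wrapped)  # prints: [1, 3, 5, 1, 3]
```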


can invoke the uuid_generate_v1() function, getting back data of type “uuid”:

    select uuid_generate_v1();
    +--------------------------------------+
    | uuid_generate_v1                     |
    +--------------------------------------+
    | 6167603c-276b-11e3-b71f-28cfe91f81e7 |
    +--------------------------------------+
    (1 row)

If you want to use a UUID as your primary key, you can redefine the table as follows:

    CREATE TABLE People (
        id UUID NOT NULL PRIMARY KEY DEFAULT uuid_generate_v1(),
        name TEXT,
        email TEXT
    );

As you can see, here you have replaced the SERIAL type with a UUID type (defined by the extension) and have instructed PostgreSQL to invoke the UUID-generating function when no UUID value is provided. If you insert a row into this table, you’ll see that the UUID is indeed generated:

    INSERT INTO People (name, email)
        VALUES ('Reuven', 'reuven@lerner.co.il');

    SELECT * FROM People;
    +--------------------------------------+--------+---------------------+
    | id                                   | name   | email               |
    +--------------------------------------+--------+---------------------+
    | 9fc82492-276b-11e3-a814-28cfe91f81e7 | Reuven | reuven@lerner.co.il |
    +--------------------------------------+--------+---------------------+

Now, all of this is great if you’re working directly at the database level. But Rails migrations are supposed to provide a layer of abstraction, allowing you to specify your database changes via Ruby method calls. Starting in Rails 4, that’s possible. I can create a new Rails application with:

    rails new pgfun -d postgresql

This will create a new “pgfun” Rails app, using PostgreSQL as the back-end database. I then create an appropriate database user at the command line (giving that user superuser privileges on the database in question):

    createuser -U postgres -s pgfun

I then create the development database:

    createdb -U pgfun pgfun_development

Now you’re ready to create your first migration. I use the built-in


Rails scaffold mechanism here, which will create a migration (as well as controllers, models and views) for me:

    rails g scaffold person name:text email:text

Notice that I haven’t specified a primary key column. That’s because Rails normally assumes there will be a numeric column named “id”, which will contain the primary key. However, you’re going to change that by opening up the migration file that was created in db/migrate. By default, the migration looks like this:

    class CreatePeople

    def change
      create_table :people, id: :uuid do |t|
        t.text :name
        t.text :email
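As an aside on what a version-1 UUID actually encodes, the two UUIDs that appear in the listings earlier in this column can be taken apart with Python’s standard uuid module. This is an illustrative sketch, not part of the Rails or PostgreSQL workflow; the variable names are chosen here, and the only inputs are the two values printed in the article.

```python
import uuid

# The two values shown in the article's psql output.
generated = uuid.UUID("6167603c-276b-11e3-b71f-28cfe91f81e7")
inserted = uuid.UUID("9fc82492-276b-11e3-a814-28cfe91f81e7")

# Both are version-1 UUIDs, which combine a timestamp with a node identifier.
print(generated.version, inserted.version)

# The last group is the node; it is the same for both because they came
# from the same machine.
same_node = generated.node == inserted.node
print(hex(generated.node), same_node)

# The time field counts 100-nanosecond intervals, so the difference between
# the two values tells you how far apart they were generated.
seconds_apart = (inserted.time - generated.time) / 10_000_000
print(round(seconds_apart, 1))
```

This is also why version-1 UUIDs generated on one machine look so similar: only the timestamp portion (and a small clock-sequence field) changes between calls.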


migration, this is what you’ll see:

    \d posts
                                   Table "public.posts"
    +------------+-----------------------------+---------------------------------------+
    | Column     | Type                        | Modifiers                             |
    +------------+-----------------------------+---------------------------------------+
    | id         | uuid                        | not null default uuid_generate_v1()   |
    | headline   | text                        |                                       |
    | body       | text                        |                                       |
    | tags       | character varying(255)[]    | default '{}'::character varying[]     |
    | created_at | timestamp without time zone |                                       |
    | updated_at | timestamp without time zone |                                       |
    +------------+-----------------------------+---------------------------------------+
    Indexes:
        "posts_pkey" PRIMARY KEY, btree (id)

From a database perspective, things seem great; you can perform an INSERT:

    INSERT INTO Posts (headline, body, tags, created_at, updated_at)

This tells Rails to ask ActiveRecord for the first record in the Posts table and to call up its “tags” column, which is returned as a Ruby array of strings. You then can iterate over those strings, printing them (or otherwise manipulating them). Although this isn’t very efficient or smart, you also can do the following:

    Post.all.select {|p| p.tags.member?('foo')}

A more efficient way to do this would be to use the ANY operator that you saw earlier, passed to PostgreSQL in a string:

    Post.where("'general' = ANY(tags)").first

Unfortunately, it doesn’t seem that you can add elements to a PostgreSQL array using the standard Ruby




compatibility with its predecessors nearly as much as Rails 3, does introduce a great deal of new functionality. For me, one of the most interesting areas of this functionality is a shift toward PostgreSQL, with ActiveRecord migrations and functionality abandoning some of its platform independence. In this article, I showed two of the features that are now available, namely UUIDs and arrays. However, there are additional features, such as native support for INET (that is, IP address) data types, JSON (which PostgreSQL 9.3 supports even better than it did in the past), ranges and even hstore, a NoSQL-like storage system built on top of PostgreSQL.

No technology, including PostgreSQL, is the right answer for everyone at all times. However, in my experience, PostgreSQL offers excellent performance, features and stability, with a fantastic community that answers questions and strives for correctness. The fact that Rails 4 has embraced many of these features is likely to expose even more people to PostgreSQL, which can only be good for Web and database developers who use open-source products. ■

Web developer, trainer and consultant Reuven M. Lerner is finishing his PhD in Learning Sciences at Northwestern University. He lives in Modi’in, Israel, with his wife and three children. You can read more about him at http://lerner.co.il, or contact him at reuven@lerner.co.il.

Send comments or feedback via http://www.linuxjournal.com/contact or to ljeditor@linuxjournal.com.

Resources

The PostgreSQL home page is at http://postgresql.org. From that site, you can download the software, read documentation and subscribe to e-mail lists. The latest version, 9.3, includes a large number of goodies that should excite many developers.

Ruby on Rails, in version 4.0.0 at the time of this writing, is available at http://rubyonrails.org. You likely will want to use Ruby version 2.0, although version 1.9.3 is known to work with Rails 4. You also will want to include the “pg” Ruby gem, which connects to PostgreSQL. Excellent documentation on Rails is available in the “Guides” series at http://guides.rubyonrails.org.

A nice blog posting describing many of the updates to Rails 4 from a PostgreSQL perspective is at http://blog.remarkablelabs.com/2012/12/a-love-affair-with-postgresql-rails-4-countdown-to-2013.




COLUMNS: WORK THE SHELL

Debugging Web Sites

DAVE TAYLOR

Dave has to fight a fire with shell scripts when he migrates his ten-year-old Web site onto a new content management system!

I know, I’m in the middle of a series of columns about how to work with ImageMagick on the command line, but when other things arise, well, I imagine that a lot of you are somehow involved in the management of servers or systems, so you all understand firefighting.

Of course, this means you all also understand the negative feedback loop that is an intrinsic part of system administration and IT management. I mean, people don’t call you and the CEO doesn’t send a memo saying, “system worked all day, printer even printed. Thanks!”

Nope, it’s when things go wrong that you hear about them, and that propensity to ignore the good and have to deal with the bad when it crops up is not only a characteristic of being in corporate IT, it’s just as true if you’re running your own system, which is how it jumped out of the pond and bit me this month.

It all started ten years ago with my Ask Dave Taylor site. You’ve probably bumped into it (http://www.AskDaveTaylor.com), as it’s been around for more than a decade and served helpful tutorial information for tens of millions of visitors in that period.

Ten years ago, the choice of Movable Type as my blogging platform made total sense and was a smart alternative to the raw, unfinished WordPress platform with its never-ending parade of hacks and problems. As every corporate IT person knows, however, sometimes you get locked in to the wrong platform and are then stuck, with the work required to migrate becoming greater and greater with each month that nothing happens.

For the site’s tenth anniversary, therefore, it was time. I had to bite the bullet and migrate all 3,800 articles and 56,000 comments from Movable Type to WordPress, because


yes, WordPress won and is clearly the industry standard for content management systems today.

The task was daunting, not just because of the size of the import (it required the consulting team rewriting the standard import tool to work with that many articles and comments), but because the naming scheme changed. On Movable Type, I’d always had it set to convert the article’s name into a URL like this:

    Name: Getting Started with Pinterest
    URL: /getting_started_with_pinterest.html

That was easy and straightforward, but on WordPress, URLs have dashes, not underscores, and, more important, they don’t end with .html because they’re generated dynamically as needed. This means the default URL for the new WordPress site would look like this:

    New URL: /getting-started-with-pinterest/

URLs can be mapped upon import so that the default dashes become underscores, but it was the suffix that posed a problem, and post-import there were 3,800 URLs that were broken because every single link to xx_xx.html failed.

Ah! A 301 redirect! Yes, but thousands of redirects slow down the server for everyone, so a rewrite rule is better. Within Apache, you can specify “if you see a URL of the form xx_xx.html, rewrite it to ’xx_xx’ and try again”, a darn handy capability.

But life is never that easy, because although this rewrite will work for 95% of the URLs on the old site, there were some that just ended up with a different URL because I’d monkeyed with things somewhere along the way. Yeah, there’s always something.

For example, the old site URL /schedule_facebook_photo_upload_fan_page.html is now on the server with the URL /schedule-a-facebook-photo-upload-to-my-fan-page/. That’s helpful, right? (Sigh.)

These all can be handled with a 301 redirect, but the question is, out of almost 4,000 article URLs on the old site, which ones don’t actually successfully map with the rewrite rule (.html to a trailing slash) to a page on the new server?

Finally Some Scripting

To identify these rewrite fails, I had to create a script, and fast. After all, while the internal linkages still might work, the thousands of external links from sites like Popular Science, the Wall Street Journal, Wired and elsewhere were now broken. Yikes, not good at all.

I started out on the command line with one that I knew failed. Here’s what happened when I used curl to


That’s boring though, because while I’m at it, I’d like to know how many URLs were tested and how many errors were encountered. I mean, why not, right? Quantification = good.

It’s easily added, as it turns out, with the addition of two new variables (both of which need to be set to zero at the top of the script):

    for name in *.html ; do
      # Map the old file name to the new URL: replace .html with a trailing slash
      newname="$(echo $name | sed 's/\.html/\//')"
      # $curl, $base and $pattern are defined earlier in the script
      test=$($curl $base/$newname | grep "$pattern")
      if [ -n "$test" ] ; then
        echo "* URL $base/$name fails to resolve."
        error=$(( $error + 1 ))
      fi
      count=$(( $count + 1 ))
    done

Then at the very end of the script, after all the specific errors are reported, a status update:

    echo ""; echo "Checked $count links, found $error problems."

Great. Let’s run it:

    $ bad-links.sh | tail -5
    * URL http://www.askdavetaylor.com/whats_a_fast_way_to_add_a_store_and_shopping_cart_to_my_site.html fails to resolve.
    * URL http://www.askdavetaylor.com/whats_amazons_simple_storage_solution_s3.html fails to resolve.
    * URL http://www.askdavetaylor.com/whats_my_yahoo_account_password_1.html fails to resolve.
    * URL http://www.askdavetaylor.com/youtube_video_missing_hd_resolution.html fails to resolve.
    Checked 3658 links, found 98 problems.

Phew. Now I know the special cases and can apply custom 301 redirects to fix them. By the time you read this article, all will be well on the site (or better be).

Next month, back to ImageMagick, I promise, unless another fire erupts that I have to solve. ■

Dave Taylor has been hacking shell scripts for more than 30 years. Really. He’s the author of the popular Wicked Cool Shell Scripts and can be found on Twitter as @DaveTaylor and more generally at http://www.DaveTaylorOnline.com.
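The logic of the column’s link checker (rewrite each old name, test whether the new URL resolves, and keep a running tally) can also be sketched in Python. This is a minimal illustration under stated assumptions, not Dave’s script: the function names, the page_exists callback and the small in-memory set of live paths are all hypothetical stand-ins for the curl and grep calls against the real server.

```python
import re


def rewrite(old_name):
    """Apply the '.html' -> trailing slash mapping used by the rewrite rule."""
    # Anchored to the end of the name, which is slightly stricter than the
    # sed expression in the shell script.
    return re.sub(r"\.html$", "/", old_name)


def check_links(old_names, page_exists):
    """Return (number checked, list of old names whose rewritten URL fails)."""
    failures = [name for name in old_names if not page_exists(rewrite(name))]
    return len(old_names), failures


# Hypothetical stand-in for the new server: paths that resolve there.
live_paths = {
    "getting_started_with_pinterest/",
    "schedule-a-facebook-photo-upload-to-my-fan-page/",
}

old_files = [
    "getting_started_with_pinterest.html",
    "schedule_facebook_photo_upload_fan_page.html",
]

count, failures = check_links(old_files, lambda path: path in live_paths)
for name in failures:
    print(f"* URL {name} fails to resolve.")
print(f"Checked {count} links, found {len(failures)} problems.")
```

Passing the existence check in as a function keeps the counting logic separate from the network call, so the same tally code could be pointed at a real HTTP client later.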


COLUMNS: HACK AND /

up all of the commands you would need to install everything on a Raspberry Pi into a single large script. At the end, you have the RetroArch project fully installed and configured, which includes all of the major emulators you’d want and a centralized method to configure them, plus an EmulationStation graphical front end the Pi can boot directly into that makes it simple to navigate to the game you want, all from a gamepad.

Install RetroPie

Before you install RetroPie, you will want to make sure your Raspbian distribution (the default Linux distribution for a Raspberry Pi, and the one this project assumes you will use) is completely up to date, including any new firmware images. This just means a few common apt commands. Although you certainly could connect a keyboard to your Raspberry Pi for this step, I’ve found it more convenient to ssh into the device so I could copy and paste commands:

    $ sudo apt-get update
    $ sudo apt-get -y upgrade

Now that the Raspberry Pi is up to date, make sure the git and dialog packages are installed, and then use git to download RetroPie:

    $ sudo apt-get -y install git dialog
    $ cd
    $ git clone --depth=1 git://github.com/petrockblog/RetroPie-Setup.git

This will create a RetroPie-Setup directory containing the main setup script. Now you just need to go inside that directory and execute it:

    $ cd RetroPie-Setup
    $ chmod +x retropie_setup.sh
    $ sudo ./retropie_setup.sh

This script presents you with an in-terminal menu (Figure 1) where you can choose to perform a binary installation or source installation, set up RetroPie, or perform a series of updates for the RetroPie setup script and binaries. Choose either the binary or source installation. The binary installation won’t take as much time, but you may risk running older versions of some of the software. The source installation requires you to compile software, so it takes longer, but at the end, you will have the latest possible versions of everything. Personally, I opted for the binary install, knowing I always could re-run the


script and go with the source install if I found any problems.

Figure 1. RetroPie Setup Menu

This part of the process will take quite some time on a vanilla Raspbian image, as there are a lot of different packages to download and install. Once the installation completes, go back to the main RetroPie setup screen and select SETUP from the main menu. In this submenu, you can tweak settings, such as whether to start EmulationStation from boot (recommended) and whether to enable a splash screen. In my case, I enabled both settings as I intended my device to be a standalone emulation machine. Note that if you do allow EmulationStation to start up from boot, you still can always ssh in to the machine and run the original RetroPie configuration script to change the settings.

Adding ROMs

You also can add ROMs within the RetroPie setup screen. If you choose the Samba method in the menu, you then can locate a local Samba mountpoint on your network, and you can copy ROMs from that. With the USB stick method, RetroPie will generate a directory structure on a USB stick that you plug in to your Raspberry Pi that represents the different emulators it supports. After this point, you can take that USB stick to another computer and


gamepad or keyboard buttons so it can work with the EmulationStation menu. Note that this doesn’t affect how your controllers work within games, just within the EmulationStation menu. After your controller or controllers are set up, you should be able to press right and left on your controller to switch between the different emulator menus. In my case, all of the buttons on my gamepad were going to be used within games, so I made a point to bind a key on a separate keyboard to the menu function so I could exit out of games when I was done without having to reboot the Raspberry Pi.

EmulationStation will show you only menus that represent emulators for which it has detected ROMs, so if you haven’t copied ROMs for a particular emulator yet, you will want to do that and potentially restart your Raspberry Pi before you can see them. Also, by default, your controller will not be configured for any games, but if you press the right arrow enough times within EmulationStation, you will get to an input configuration screen that allows you to map keys on your controller to keys inside a game. The nice thing about this setup is that after you configure the keys, it will apply appropriately within each emulator.

That’s it. From this point, you can browse through your collection of games and press whatever button you bound to Accept to start playing. At first I was concerned the Raspberry Pi wouldn’t have the horsepower to play my games, but so far, it has been able to play any games I tried without a problem. ■

Kyle Rankin is a Sr. Systems Administrator in the San Francisco Bay Area and the author of a number of books, including The Official Ubuntu Server Book, Knoppix Hacks and Ubuntu Hacks. He is currently the president of the North Bay Linux Users’ Group.

Resources

The RetroPie Project: http://blog.petrockblock.com/retropie

RetroPie Installation Docs: https://github.com/petrockblog/RetroPie-Setup




COLUMNS: THE OPEN-SOURCE CLASSROOM

Figure 1. The CuBox is almost too small to look real.

has liked the dangling little circuit board hanging off the back of the television. The CuBox looks great. Inside that simple case it has:

- System board based on Marvell Armada 510 SoC (Figure 2).
- 800MHz dual-issue ARM PJ4 processor, VFPv3, wmmx SIMD and 512KB L2 cache.
- 1080p Video Decode Engine.
- OpenGL|ES 2.0 graphic engine.


- SPDIF (optical audio).
- HDMI 1080p Output (with CEC function for TVs that support it).
- 1GB DDR3 at 800MHz (2GB in the Pro model I have).
- Gigabit Ethernet (bootable via TFTP).
- eSATA 3Gbps (bootable).
- 2x USB 2.0 (bootable).
- MicroSD (bootable).
- Micro-USB (console).
- Standard infrared receiver for 38kHz-based IR controllers.

The little system comes with a DC power adapter and draws a whopping 3 watts of power while in use. 3 watts. That’s not a typo. Regardless of the use case you find for your CuBox, power consumption shouldn’t be an issue. In fact, 3 watts really lends itself to battery power and a solar panel for those computing projects off the grid. That might have to wait until next month for me, however, because I already have more indoor projects than I have CuBoxes. Check out how well they’ve crammed the ports onto the back plane of the little thing in Figure 3.

Figure 2. The Marvell system on a chip is tiny, even by cell-phone standards.

Figure 3. The only strange port placement is the optical audio port on the side. Thankfully, I’ve been using HDMI only.
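To put the 3-watt figure in perspective, a little arithmetic helps. The sketch below works out the energy a constant 3-watt load uses in a year and how long it would run on a battery. The 3 watts comes from the article; the battery (a 12 V, 7 Ah unit, so 84 Wh) is purely an assumed example, and the runtime ignores conversion losses and battery aging.

```python
def annual_energy_kwh(watts, hours_per_day=24, days=365):
    """Energy used in a year, in kilowatt-hours."""
    return watts * hours_per_day * days / 1000


def runtime_hours(battery_wh, watts):
    """Ideal runtime on a battery, ignoring conversion losses."""
    return battery_wh / watts


cubox_watts = 3   # figure quoted in the article
battery_wh = 84   # assumed example: a 12 V, 7 Ah battery (12 * 7 = 84 Wh)

print(annual_energy_kwh(cubox_watts))          # prints: 26.28
print(runtime_hours(battery_wh, cubox_watts))  # prints: 28.0
```

In other words, running around the clock for a year takes roughly 26 kWh, and even a small battery of that assumed size would carry the box through a full day without sun.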


3. Start CuBox with the USB drive in the top USB slot.

That’s it: no Syslinux, no fiddling with the master boot record, no keyboard combinations. Just have the folder named boot with the installation scripts, and the CuBox will boot in to installer mode (Figure 4). From there, the installer scripts will download Ubuntu, XBMC, Debian or any custom installation scripts you can find elsewhere on the Internet. The CuBox installer will format the included MicroSD card and get your system completely ready to go.

Cool Stuff to Do, I: Desktop

The CuBox Pro I received came with Ubuntu preinstalled on the MicroSD card. I’ve used Raspbian as a desktop computer a couple times, and although it was cool to see the RPi power a fully graphical system, the experience wasn’t exactly fun. Because the CuBox has a faster processor and 2 gigs of RAM, it makes sense that it would be better as a desktop replacement: a 3-watt desktop replacement! As it turns out, the CuBox does perform better than a Raspberry Pi, but it’s still sluggish enough that I wouldn’t want to use it for more than a little while. Plus, since it’s ARM-based, there’s no possibility of getting Adobe Flash working. Sadly, that’s still a showstopper for some of the things I do on a regular basis. If you’re looking for a teeny-tiny desktop replacement computer, however, I think the CuBox might fit the bill if:

1. Instead of booting from MicroSD, you connect an SSD via eSATA. The same installation program will install onto an eSATA drive instead of MicroSD, and that would remove what felt like the biggest bottleneck. Once programs were loaded into RAM, the system was quite usable. MicroSD as your main filesystem is just painful.

2. Rather than using Ubuntu proper, you use Xubuntu (the CuBox performs a bit better with Xubuntu). Although the GPU on the system is great for decoding video, it does struggle with dynamic GUI content. Ubuntu’s overlays and eye-candy-rich interface make Ubuntu feel slow.

I forced myself to use the CuBox for an afternoon as my main writing computer. It worked, but if I was doing anything other than actual writing, the lag when launching programs really frustrated me. I wish I


had an eSATA SSD to test the speed.

Cool Stuff to Do, II: Server

I’ve mentioned my Raspberry Pi colocated in Austria before. I’m still eternally grateful to Kyle Rankin for tipping me off to the opportunity. As I put services on the RPi, however, I have noticed that as a server, the Raspberry Pi can become overwhelmed fairly quickly. But, with a headless Debian install on the CuBox, I can see the difference 2GB of RAM can make.

The other big limitation with my Raspberry Pi server is storage space. I have a 32GB SD card and a 32GB USB drive in it, but not only are both of those media types rather slow, 64GB also isn’t exactly massive cloud storage. With the CuBox, an external eSATA drive means the potential for terabytes of local storage.

Although I’ll admit the CuBox as a desktop replacement was a little frustrating, as a server, I was very impressed with it. With Gigabit Ethernet, tons of RAM and visual appeal, it makes a great server for your office. Thanks to the ARM processor, it’s perfectly silent and generates very little heat. I would continue to use the CuBox as a server in my office, except....

Cool Stuff to Do, III: Home Theater

GeeXboX is a home-theater software system that at some point started using XBMC as its core software. That same awesome installer program that installs Ubuntu or Debian will install the latest version of GeeXboX on the CuBox, already tweaked and configured for the CuBox-specific hardware. At first glance, this doesn’t seem like much, but under the hood, there are some cool modifications made to help with menu rendering speeds and screen blanking. Basically, if you use the CuBox installer, XBMC just works (Figure 5).

Running GeeXboX on the CuBox is incredible. Admittedly, there are a few issues with slow menu response (especially while the library is being scanned). If you’ve ever used XBMC on a Raspberry Pi, however, this performs like a dream. But, that’s not even the coolest part. Because the CuBox comes with an infrared receiver built in to its little cube, programming the old remote sitting in your junk drawer (an old series one TiVo remote in my case) is simple. Well, okay, maybe “simple” isn’t quite true. Still, with a little bit of wiki reading and command-line tweaking, I was able to get my old TiVo remote to work with CuBox and GeeXboX in pretty


short order.

Figure 5. GeeXboX works flawlessly. Even the HDMI and SPDIF are detected correctly and function without additional tweaking.

If you do try XBMC (or GeeXboX, which is the same thing now it seems), I recommend playing network videos over NFS. There’s less overhead than Samba, and high-bandwidth media seems to stream better. If you don’t have 1080p videos streaming over your network, it’s probably not a big issue, but NFS seems to work better for those truly huge videos.

Bonus Stuff I Want to Try Soon

As I mentioned earlier, I’d like to try out the CuBox with an eSATA-connected SSD. I’d also like to explore booting via TFTP. The system supports it, but I haven’t looked into how the boot process works. Running something like LTSP over the Gigabit Ethernet port with 2 gigs of RAM might make for an incredibly awesome thin-client environment.

The CuBox has found a semipermanent home as a GeeXboX/XBMC media player in our home. I would like to see the Plex Home Theater system configured for the CuBox. Although we still use XBMC


COLUMNSTHE OPEN-SOURCE CLASSROOMin our house, the central server-basedmodel Plex offers has features I don’tget with XBMC alone.And of course, I’ve been tryingto figure out how to use a CuBoxto somehow capture better videoof the various bird feeders aroundour house. The low power drawcoupled with impressive hardwarespecs make the CuBox ideal for abattery/solar-panel-type solution.With a USB Wi-Fi dongle, I could putbirdcams all over our woods. Pleasedon’t tell my wife. She still rolls hereyes when she sees the cell phonesattached to my office window! (Seemy column in the October <strong>2013</strong>issue for details on BirdCam.)■Shawn Powers is the Associate Editor for <strong>Linux</strong> <strong>Journal</strong>.He’s also the Gadget Guy for <strong>Linux</strong><strong>Journal</strong>.com, and he has aninteresting collection of vintage Garfield coffee mugs. Don’t lethis silly hairdo fool you, he’s a pretty ordinary guy and can bereached via e-mail at shawn@linuxjournal.com. Or, swing bythe #linuxjournal IRC channel on Freenode.net.Send comments or feedback viahttp://www.linuxjournal.com/contactor to ljeditor@linuxjournal.com.LINUX JOURNALon yourAndroid deviceDownload app now inthe Android Marketplacewww.linuxjournal.com/androidFor more information about advertising opportunities within <strong>Linux</strong> <strong>Journal</strong> iPhone, iPad andAndroid apps, contact John Grogan at +1-713-344-1956 x2 or ads@linuxjournal.com.


NEW PRODUCTS

Pentaho Business Analytics

Businesses striving to increase their competitive advantage with big data analytics can do so with help from Pentaho Business Analytics 5.0, a completely redesigned data integration and analytics platform. The result of many years of intensive planning, research and conversations with customers and industry experts, Pentaho 5.0 provides a full spectrum of analytics for today’s big data-driven businesses regardless of data type and volume, IT architecture or analysis required. The new, modern interface simplifies the user experience for all those working to turn data into competitive advantage. Highlights of Pentaho’s 250+ new features and improvements include blended big data for more accurate insights, simplified analytics and user experience and enterprise-ready big data integration. These features are targeted at five key analytic personas in the enterprise, says Pentaho, and Pentaho 5.0 offers benefits to all these roles and includes functionality to help eliminate many of the common pain points that are holding back big data initiatives in companies of all sizes.
http://www.pentaho.com

Crucial LRDIMMs

If Crucial could tell you one thing about its new 64GB DDR3L Load-Reduced DIMMs (LRDIMMs) for servers, it would be that they enable more DIMMs per channel for up to twice the installed memory capacity per server. By enabling high-end server environments with demanding workloads to reach the maximum amount of installed memory possible, dramatic performance gains in memory bandwidth and overall server productivity are possible, all while reducing power costs relative to adding additional servers. Utilizing 1.35V (vs. 1.5V) for optimal energy efficiency, Crucial LRDIMMs offer up to a 35% increase in memory bandwidth compared to standard registered DIMMs and eliminate the channel ranking limitation of standard DDR3 registered DIMMs. These new memory modules also are compatible with OEM servers and warranties, allowing users to upgrade their existing server infrastructures without having to purchase an entirely new system. Crucial LRDIMMs fully support the latest Intel Xeon processor E5 family.
http://www.crucial.com/server


The Library of Congress’ Constitution Annotated App and Web Publication

Our community is packed with power citizens, from free speech advocates to Second Amendment gurus and civil rights activists. To inform your activism, or simply to live up to Thomas Jefferson’s ideal of an informed citizenry, the Library of Congress has a new resource for you. The Constitution of the United States of America: Analysis and Interpretation, popularly known as the “Constitution Annotated”, is now available as a free app and Web publication. The resource, formerly difficult to access by the general public due to its size and update cycle, makes analysis and interpretation of constitutional case law accessible to anyone with a computer or mobile device. The Web publication consists of digitally signed, searchable PDF documents. Meanwhile the app, which adds a wealth of interactive features, is available for the iOS platform; an Android version is currently under development.
http://beta.congress.gov/constitution-annotated

OpenStack Foundation’s Training Marketplace

As the OpenStack cloud computing platform infiltrates data centers worldwide, the demand for educated administrators and users to manage these clouds grows in tandem. To help developers and operators gain the valuable skills they need to exploit the burgeoning opportunities, the OpenStack Foundation and its ecosystem partners have responded with the Training Marketplace. The on-line “course catalog” features dozens of courses across 10 countries and 25 cities, from providers at launch time as varied as Aptira, hastexo, The Linux Foundation, Mirantis, Morphlabs, Piston, Rackspace, Red Hat, SUSE and SwiftStack. In addition to paid and free training courses, many community efforts are available to produce helpful documentation, how-to information and new Operations and Security guide books.
http://www.openstack.org/marketplace/training


Zentyal Server

In the midst of a cloud push, Zentyal Linux Small Business Server 3.2 represents a drop-in alternative to existing SMB solutions for companies seeking a choice in how to manage their network infrastructure. The all-in-one IT backoffice, which can be set up in less than 30 minutes, is an integration of the complex Samba technologies that provides native interoperability with Microsoft Active Directory and a range of other services. The solution speaks to customers who have logistical, security and connectivity concerns about moving to the cloud. The single most important improvement that Zentyal Server 3.2 introduces, according to its developer, is greater integration of the Samba technology, meaning that it is now possible to introduce Zentyal Server transparently on a Windows environment, migrate network services and users to Zentyal and then turn off unneeded Windows servers without causing inconvenience to users. Zentyal Server 3.2 also supports Group Policy Objects and Organizational Units and secures mobile communications to the private company resources out of the box.
http://www.zentyal.com

SUSE Cloud

The headlining new feature of SUSE Cloud 2.0 is the capacity for setting up a mixed-hypervisor private cloud environment that can be deployed rapidly and managed easily, helping enterprises increase business agility and reduce IT costs. SUSE Cloud 2.0 supports KVM and Xen hypervisor environments and is the first OpenStack distribution (namely Grizzly) to add full support for Microsoft Hyper-V. VMware ESXi integration also is included as a technical preview. Other advances in v2.0 include a more robust installation process, integration with the Crowbar deployment framework, all of the features and fixes of OpenStack Grizzly, improved integration with SUSE Studio and SUSE Manager for building and managing cloud-based applications and a broader ecosystem of partner solutions to configure private clouds based on unique IT infrastructure requirements.
http://www.suse.com

Please send information about releases of Linux-related products to newproducts@linuxjournal.com or New Products c/o Linux Journal, PO Box 980985, Houston, TX 77098. Submissions are edited for length and content.


KNOWLEDGE HUB

WEBCASTS

A Call to Arms for Private Cloud Builders
Sponsor: ActiveState | Topic: Cloud Computing
ON DEMAND

The era of elastic IT is here. Businesses are realizing that the cloud not only allows cost reduction, but provides opportunities for innovation and growth. Elastic clouds enable next-generation applications that drive revenue opportunities, increase agility, and make IT teams competitive with public cloud systems.

In this presentation, Randy and John talk about the forces driving this change, and outline an action plan for building an elastic cloud infrastructure and dynamic applications using DevOps and Platform-as-a-Service.
> http://lnxjr.nl/CTACloud

Private PaaS for the Agile Enterprise
Sponsor: ActiveState | Topic: Virtualization

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.
> http://lnxjr.nl/privatepaasAE

Learn the 5 Critical Success Factors to Accelerate IT Service Delivery in a Cloud-Enabled Data Center

Today's organizations face an unparalleled rate of change. Cloud-enabled data centers are increasingly seen as a way to accelerate IT service delivery and increase utilization of resources while reducing operating expenses. Building a cloud starts with virtualizing your IT environment, but an end-to-end cloud orchestration solution is key to optimizing the cloud to drive real productivity gains.
> http://lnxjr.nl/IBM5factors

Linux Backup and Recovery Webinar
Sponsor: Storix | Topic: Backup and Recovery

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.
> http://lnxjr.nl/StorixWebinar


WHITE PAPERS

Linux Management with Red Hat Satellite: Measuring Business Impact and ROI
Sponsor: Red Hat | Topic: Linux Management

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality (availability, security, and performance) becomes an essential requirement for business success.
> http://lnxjr.nl/RHS-ROI

Standardized Operating Environments for IT Efficiency
Sponsor: Red Hat

The Red Hat® Standard Operating Environment (SOE) helps you define, deploy, and maintain Red Hat Enterprise Linux® and third-party applications as an SOE. The SOE is fully aligned with your requirements as an effective and managed process, and fully integrated with your IT environment and processes.

Benefits of an SOE:

SOE is a specification for a tested, standard selection of computer hardware, software, and their configuration for use on computers within an organization. The modular nature of the Red Hat SOE lets you select the most appropriate solutions to address your business' IT needs.

SOE leads to:
• Dramatically reduced deployment time.
• Software deployed and configured in a standardized manner.
• Simplified maintenance due to standardization.
• Increased stability and reduced support and management costs.

There are many benefits to having an SOE within larger environments, such as:
• Less total cost of ownership (TCO) for the IT environment.
• More effective support.
• Faster deployment times.
• Standardization.
> http://lnxjr.nl/RH-SOE


FEATURE

Scale Out with GlusterFS

Learn how to install, benchmark and optimize this popular, shared-nothing and scalable open-source distributed filesystem.

ALEX DAVIES and ALESSANDRO ORSARIA


GlusterFS is an open-source distributed filesystem, originally developed by a small California startup, Gluster Inc. Two years ago, Red Hat acquired Gluster, and today, it sponsors GlusterFS as an open-source product with commercial support, called Red Hat Storage Server. This article refers to the latest community version of GlusterFS for CentOS, which is 3.4 at the time of this writing. The material covered here also applies to Red Hat Storage Server as well as other Linux distributions.

Sidebar: Distributed Filesystems and Metadata Servers

While most distributed filesystems include a metadata server, GlusterFS does not. In a filesystem architecture with a metadata server, data is striped among different nodes, with another server keeping track of the location of the metadata and controlling access to the storage nodes. When a client issues an I/O operation to a file, it sends the request to the metadata server, which in turn tells the client where to retrieve the data from. With GlusterFS, native clients deterministically find the correct node where a file is stored via a hashing algorithm. Eliminating the metadata server is a big advantage, as this is typically a single point of contact for clients and often becomes a bottleneck.

Figure 1. Simplified Architecture Diagram of a Two-Node GlusterFS Cluster and Native Client (diagram labels: GlusterFS servers node1 and node2, each with /brick01 and /brick02, one glusterfsd per brick and one glusterd per node; GlusterFS volume; IP network; client with Gluster Native Client and kernel FUSE)

Figure 1 shows a simplified architecture of GlusterFS. GlusterFS provides a unified, global namespace that combines the storage resources from multiple servers. Each node in the GlusterFS storage pool exports one or more bricks via the glusterfsd dæmon. Bricks are just local filesystems, usually indicated by the hostname:/directory notation. They are the basic building blocks of a GlusterFS volume, which can be mounted using the GlusterFS Native Client. This typically provides the best performance and is based on the Filesystem in Userspace (FUSE) interface built in to the Linux kernel. GlusterFS also provides object access via the OpenStack Swift API and connectivity through the NFS and CIFS protocols.

You can increase capacity or IOPS in a GlusterFS storage pool by adding more nodes and expanding the existing volumes to include the new bricks. As you provision additional storage nodes, you increase not only the total storage capacity, but network bandwidth, memory and CPU resources as well.
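The sidebar’s point about locating files by hashing, rather than by asking a metadata server, can be pictured with a toy placement function. This is only a sketch of the general idea: the function name pick_brick, the use of MD5 and the simple modulo step are our own assumptions for illustration, and GlusterFS’s real elastic hashing is more involved.

```python
import hashlib

def pick_brick(path, bricks):
    """Map a file path to one brick; every client computes the same answer."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    # Assumption for illustration: plain modulo over the brick list.
    index = int.from_bytes(digest[:8], "big") % len(bricks)
    return bricks[index]

# Hypothetical two-node pool with two bricks per node, as in Figure 1.
bricks = ["node1:/brick01", "node1:/brick02", "node2:/brick01", "node2:/brick02"]
files = ["/data/file%03d.dat" % i for i in range(1000)]

# Count how many of the sample files land on each brick.
placement = {}
for f in files:
    placement.setdefault(pick_brick(f, bricks), []).append(f)

for brick in bricks:
    print(brick, len(placement.get(brick, [])))
```

Because the result depends only on the path and the brick list, no central lookup is needed. A plain modulo like this one would move most files whenever a brick is added, which is one reason a production system needs something smarter.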
GlusterFS scales “out” instead of “up”. Unlike traditional storage arrays, there is no hard limit provided by the array controller; you are limited by the sum of all your storage nodes.

This scalability results in a small latency penalty per I/O operation. Therefore, GlusterFS works very well where nearline storage access is required, such as bandwidth-intensive HPC or general-purpose archival storage. It is less suited for latency-sensitive applications, such as highly transactional databases.

Getting Started with GlusterFS

Installing GlusterFS is straightforward. You can download packages for the most common distributions, including CentOS, Fedora, Debian and Ubuntu. For CentOS and other distributions with the yum package manager, just add the corresponding yum repo, as explained in the official GlusterFS Quick Start Guide (see Resources). Then, install the glusterfs-server-* package on the storage servers and glusterfs-client-* on the clients. Finally, start the glusterd service on the storage nodes:

    # service glusterd start

The next task is to create a trusted storage pool, which is made from the bricks of the storage nodes. From one of the servers, for instance node1, just type:

    # gluster peer probe node2

And, repeat that for all the other storage nodes. The command gluster peer status lists all the other peers and their connection status.

Provisioning a Volume

The first step to provision a volume consists of setting up the bricks. As mentioned previously, these are mountpoints on the storage nodes, with XFS being the recommended filesystem. GlusterFS uses extended file attributes to store key/value pairs for several purposes, such as volume replication or striping across different bricks. To make this work with XFS, you should format the bricks using a larger inode size, for example, 512 bytes. Here /dev/sdb1 is only a placeholder for the block device that will be mounted on /brick01:

    # mkfs.xfs -i size=512 /dev/sdb1

GlusterFS volumes can belong to several different types. They can be distributed, replicated or striped. In addition, they can be a combination of these, for instance, distributed replicated or striped replicated. You can specify the type of a volume using the stripe and replica keywords of the gluster command. There is no keyword to indicate that a volume is distributed, as this is the default configuration.
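To make the “distributed replicated” combination more concrete, here is a small sketch of how a list of bricks might be split into replica sets. It rests on an assumption you should check against the GlusterFS documentation for your version: that bricks are grouped in the order they are listed, with each run of consecutive bricks forming one replica set. The helper name replica_sets is ours and is not part of any GlusterFS tool.

```python
def replica_sets(bricks, replica):
    """Group bricks, in the order given, into replica sets of size `replica`."""
    if replica < 1 or len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]

# Alternate the nodes so that each replica set spans both servers.
bricks = ["node1:/brick01", "node2:/brick01", "node1:/brick02", "node2:/brick02"]

for number, members in enumerate(replica_sets(bricks, 2)):
    print("replica set", number, "->", members)
```

Under that assumption, listing both bricks of node1 first would put two copies of the same data on one server, so the ordering deserves a second look before you create the volume.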


[...] throughput on the newly created GlusterFS volume. You can use several tools to do this, such as iozone, mpirun or dd. For example, run the following script after changing directory into a GlusterFS mountpoint:

    #!/bin/bash
    for BS in 64 128 256 512 1024 2048 4096 8192 ; do
        iozone -R -b results-${BS}.xls -t 16 -r $BS -s 8g \
            -i 0 -i 1 -i 2
    done

The previous code runs iozone with different block sizes and 16 active client threads, using an 8GB file size to test. To minimize the effects of caching, you should benchmark I/O performance using a file size at least twice the RAM installed on your servers. However, as RAM is cheap and many new servers come equipped with a lot of memory, it may be impractical to test with very large file sizes. Instead, you can limit the amount of RAM seen by the kernel to 2GB, for example, by passing the mem=2G kernel option at boot.

Generally, you will discover that throughput is either limited by the network or the storage layer. To identify where the system is constrained, run sar and iostat in separate terminals for each GlusterFS node, as illustrated below:

    # sar -n DEV 2 | egrep "(IFACE|eth0|eth1|bond0)"
    # iostat -N -t -x 2 | egrep "(Device|brick)"

Keep an eye on the throughput of each network interface reported by the sar command above. Also, monitor the utilization column of the iostat output to check whether any of the bricks are saturated. The commands gluster volume top and gluster volume profile provide many other useful performance metrics. Refer to the documentation on the Gluster community Web site for more details. Once you have identified where the bottleneck is on your system, you can do some simple tuning.

Tuning the Network Layer

GlusterFS supports TCP/IP and RDMA as the transport protocols. TCP on IPoIB (IP over InfiniBand) is also a viable option. Due to space limitations, here we mostly refer to TCP/IP over Ethernet, although some considerations also apply to IPoIB.

There are several ways to improve network throughput on your storage servers. You certainly should minimize per-packet overhead and balance traffic across your network interfaces in the most effective way, so that none of the NIC ports are underutilized.
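To get a feel for what per-packet overhead means in practice, a rough calculation can compare two common Ethernet MTU sizes. The sketch below assumes plain TCP over IPv4 with no VLAN tag and no TCP options, so real figures will differ slightly; the function names are ours.

```python
# Per-frame byte counts for TCP over IPv4 over Ethernet (no VLAN tag, no TCP options).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20              # IPv4 header + TCP header

def payload_efficiency(mtu):
    """Fraction of the bytes on the wire that carry application payload."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETHERNET_OVERHEAD)

def frames_per_gigabyte(mtu):
    """Frames needed to carry 10**9 payload bytes, rounded up."""
    payload = mtu - IP_TCP_HEADERS
    return -(-10**9 // payload)

for mtu in (1500, 9000):
    print(mtu, round(payload_efficiency(mtu) * 100, 1), frames_per_gigabyte(mtu))
```

The efficiency gain from the larger MTU is a few percentage points, but the number of frames (and therefore the per-frame work on both ends) drops by roughly a factor of six.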


Under normal conditions, no particular TCP tuning is required, as the Linux kernel enables TCP window scaling by default and does a good job of automatically adjusting buffer sizes for TCP performance.

To reduce packet overhead, you should put your storage nodes on the same network segment as the clients and enable jumbo frames. To do so, all the interfaces on your network switches and GlusterFS servers must be set up with a larger MTU size. Typically a maximum value of 9,000 bytes is allowed on most switch platforms and network cards.

You also should ensure that TSO (TCP segmentation offload) or GSO (generic segmentation offload) is enabled, if supported by the NIC driver. Use the ethtool command with the -k option to verify the current settings. TSO and GSO allow the kernel to manipulate much larger packet sizes and offload to the NIC hardware the process of segmenting packets into smaller frames, thus reducing CPU and packet overhead. Similarly, large receive offload (LRO) and generic receive offload (GRO) can improve throughput in the inbound direction.

Another important aspect to take into consideration is NIC bonding. The Linux bonding driver provides several load-balancing policies, such as active-backup or round-robin. In a GlusterFS setup, an active-backup policy may result in a bottleneck, as only one of the Ethernet interfaces will be used to transmit and receive packets. Conversely, a pure round-robin policy provides per-packet load balancing and full utilization of the bandwidth of all the interfaces if required. However, depending on your network topology, round-robin may cause out-of-order packet delivery, and your system would require extra CPU cycles to put the packets back in order.

Better alternatives in this case are IEEE 802.3ad dynamic link aggregation (selected with mode=4 in the bonding driver) or adaptive load balancing (mode=6).
To deploy IEEE 802.3ad link aggregation, an essential requirement is that the switches connected to the servers must also support this protocol. Then, the xmit_hash_policy parameter of the Linux kernel bonding driver specifies the hash policy used to balance traffic across the interfaces, when 802.3ad is in operation. For better results, this parameter should be set to layer3+4. This policy uses a hash of the source and destination IP addresses and TCP ports to select the outbound interface. In this way, traffic is load balanced per flow, rather than per packet, and the end result is that packets from a given connection always will be sent out of the same network interface.

In GlusterFS, when a client mounts a volume using the native protocol, a TCP connection is established from the client for each brick hosted on the storage nodes. On an eight-node cluster with two bricks per node, this results in 16 TCP connections. Hence, the “layer3+4” hash transmit policy typically would contain enough “entropy” to split those connections evenly across the network interfaces, resulting in an efficient and balanced usage of all the NIC ports.

A final consideration concerns management traffic. For security purposes, you may want to separate GlusterFS and management traffic (such as SSH, SNMP and so on) onto different subnets. To achieve best performance, you also should use different physical interfaces; otherwise, you may get contention between storage and management traffic over the same NIC ports. If your servers come with only two network interfaces, you still can split storage and management traffic onto different subnets by using VLAN tagging.

Figure 2 illustrates a potential configuration to achieve this result. On a server with two physical network ports, we have aggregated those into a bond interface named bond0 for GlusterFS traffic. The corresponding configuration is shown below:

    # cat /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    BONDING_OPTS="miimon=100 mode=4 xmit_hash_policy=layer3+4"
    MTU=9000
    BOOTPROTO=none
    IPADDR=192.168.1.1
    NETMASK=255.255.255.0

Figure 2. Network Configuration on a GlusterFS Server with Multiple Bond Interfaces and VLAN Tagging (diagram labels: node1 running glusterfsd; bond0 and bond0.10 on top of eth0 and eth1; IP network; management traffic such as SSH and SNMP)

We have also created a second, logical interface named bond0.10. The configuration is listed below:

    # cat /etc/sysconfig/network-scripts/ifcfg-bond0.10
    DEVICE=bond0.10
    ONPARENT=yes
    VLAN=yes
    MTU=1500
    IPADDR=10.1.1.1
    NETMASK=255.255.255.0
    GATEWAY=10.1.1.254

This interface is used for management traffic and is set up with a standard MTU of 1,500 bytes and a VLAN tag; the VLAN ID, 10, is taken from the suffix of the device name. Conversely, bond0 is not set with any VLAN tag, so it will send traffic untagged, which on the switches will be assigned to the native VLAN of the corresponding Ethernet trunk.

Tuning the Storage Layer

As of today, the commodity x86 servers with the highest density of local storage can fit up to 26 local drives in their chassis. If two drives are dedicated to the OS, this leaves 24 drives to host the GlusterFS bricks. You can add additional drives via Direct Attached Storage (DAS), but before scaling “up”, you should check whether you have enough spare network bandwidth. With large sequential I/O patterns, you may run out of bandwidth well before exhausting the throughput of your local drives.

The optimal RAID level and LUN size depend on your workload. If your application generates lots of random write I/O, you may consider aggregating all the drives in a single RAID 10 volume. However, efficiency won’t be the best, as you will lose 50% of your capacity. On the other hand, you can go with a RAID 0 configuration for maximum efficiency and performance, and rely on GlusterFS to provide data redundancy via its replication capabilities.

It is not surprising that the best approach often lies in the middle. With 24 drives per node, Red Hat’s recommended setup consists of 2 x RAID 6 (10+2) volumes (see Resources). A single volume with a RAID 50 or 60 configuration also can provide similar performance.

On a LUN spanning a RAID 6 volume of 12 SAS hard disk drives, a typical sustained throughput is 1–2GB per second, depending on the type and model of the disk drives.
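The capacity and bandwidth trade-offs above come down to simple arithmetic, sketched here for a node with 24 data drives. The helper names are ours, the throughput figures are the 1 to 2GB per second per LUN quoted above applied to two LUNs per node, and protocol overhead on the network is ignored, so treat the link counts as a lower bound.

```python
import math

def usable_fraction(layout, drives):
    """Share of raw capacity left for data under the layouts discussed above."""
    if layout == "raid0":
        return 1.0
    if layout == "raid10":
        return 0.5
    if layout == "2x raid6 (10+2)":
        if drives != 24:
            raise ValueError("this layout assumes 24 data drives")
        return 20 / 24          # 2 volumes x 10 data drives out of 24
    raise ValueError("unknown layout: %s" % layout)

def links_needed(throughput_gb_per_s, link_gbit_per_s=10):
    """Network links required to carry a given disk throughput (no protocol overhead)."""
    link_gb_per_s = link_gbit_per_s / 8      # 10 Gbit/s is 1.25 GB/s
    return math.ceil(throughput_gb_per_s / link_gb_per_s)

for layout in ("raid0", "raid10", "2x raid6 (10+2)"):
    print(layout, round(usable_fraction(layout, 24) * 100, 1), "% usable")

# Two RAID 6 LUNs per node at 1 to 2 GB/s each give 2 to 4 GB/s per node.
for throughput in (2.0, 4.0):
    print(throughput, "GB/s needs at least", links_needed(throughput), "x 10GbE")
```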
To transfer this amount of data across the network, a single 10Gbps interface easily would become a bottleneck. That’s why you should load-balance traffic across your network interfaces, as we discussed previously.

It is important to configure a local LUN volume properly to get the most out of the parallel access benefits provided by the RAID controller. The chunk size (often referred to as element stripe [...]


[...] Again, there is no optimal value for all workloads, but as a starting point, it may be worth noting that Red Hat recommends setting this parameter to 64MB. However, if your application I/O profile is IOPS-intensive and mostly made of small random requests, you may be better off tuning the read-ahead size to a lower value and deploying solid-state drives. The commercial version of GlusterFS provides a built-in profile for the tuned dæmon, which automatically increases read ahead to 64MB and configures the bricks to use the deadline I/O elevator.

Conclusion

GlusterFS is a versatile distributed filesystem, well supported on standard Linux distributions. It is trivial to install and simple to tune, as we have shown here. However, this article is merely a short introduction. We did not have space to cover many of the most exciting features of GlusterFS, such as asynchronous replication, OpenStack Swift integration and CIFS/NFS exports, to mention just a few! We hope this article enables you to give GlusterFS a go in your own environment and explore some of its many capabilities in more detail.

Alex Davies and Alessandro Orsaria worked together for a large Web firm before moving into technology within the algorithmic and hedge-fund industries. When they are not trekking in Scotland, mountain climbing in the Alps, playing the piano or volunteering, they are typically in front of a computer and can be reached at alex@davz.net and alex@linux.it.

Send comments or feedback via http://www.linuxjournal.com/contact or to ljeditor@linuxjournal.com.

Resources

GlusterFS Download Page: http://download.gluster.org

GlusterFS Quick Start Guide: http://www.gluster.org/community/documentation/index.php/QuickStart

B. England and N. Khare, Best Practices for Red Hat Storage Server Performance: slides available from http://www.redhat.com/summit/2013/presentations

Dell Drive Characteristics and Metrics: http://www.dell.com/downloads/global/products/pvaul/en/enterprise-hdd-sdd-specification.pdf

Linux Ethernet Bonding Driver HOWTO: https://www.kernel.org/doc/Documentation/networking/bonding.txt


FEATURE

Zato
Agile ESB, SOA, REST and Cloud Integrations in Python

Learn how to integrate applications with an elegant and agile platform that is quick to adapt to ever-faster business changes. This how-to introduces the methodology and includes an integration example of a treasury.gov financial API with two external apps.

Dariusz Suchojad


Zato is a Python-based platform for integrating applications and exposing back-end services to front-end clients (https://zato.io/docs/index.html). It’s an ESB (Enterprise Service Bus) and an application server focused on data integrations. The platform doesn’t enforce any limits on architectural style for designing systems and can be used for SOA (Service Oriented Architecture), REST (Representational State Transfer) and for building systems of systems running in-house or in the cloud.

At its current version of 1.1 (at the time of this writing), Zato supports HTTP, JSON, SOAP, SQL, AMQP, JMS WebSphere MQ, ZeroMQ, Redis NoSQL and FTP. It includes a browser-based GUI, CLI, API, security, statistics, job scheduler, HAProxy-based load balancer and hot-deployment. Each piece is extensively documented from the viewpoint of several audiences: architects, admins and programmers.

Zato servers are built on top of gevent and gunicorn frameworks that are responsible for handling incoming traffic using asynchronous notification libraries, such as libevent or libev, but all of that is hidden from programmers’ views so they can focus on their job only.

Servers always are part of a cluster and run identical copies of services deployed. There is no limit on how many servers a single cluster can contain.

Each cluster keeps its configuration in Redis and an SQL database. The former is used for statistics or data that is frequently updated and mostly read-only. The latter is where the more static configuration shared between servers is kept.

Users access Zato through its Web-based GUI (https://zato.io/docs/web-admin/intro.html), the command line (https://zato.io/docs/admin/cli/index.html) or API (https://zato.io/docs/public-api/intro.html).

Zato promotes loose coupling, reusability of components and hot-deployment. The high-level goal is to make it trivial to access or expose any sort of information. Common integration techniques and needs should be, at most, a couple clicks away, removing the need to reimplement the same steps constantly, slightly differently in each integration effort.

Everything in Zato is about minimizing the interference of components on each other, and server-side objects you create can be updated easily, reconfigured on the fly or reused in other contexts without influencing any other.

This article guides you through the process of exposing complex XML data to three clients using JSON, a simpler form of XML and SOAP, all from a single code base in an elegant and Pythonic way that doesn’t require you to think about the particularities of any format or transport.

To speed up the process of retrieving information by clients, back-end data will be cached in Redis and updated periodically by a job-scheduled service.

The data provider used will be US Department of the Treasury’s real long-term interest rates (http://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=reallongtermrate). Clients will be generic HTTP-based ones invoked through curl, although in practice, any HTTP client would do.

The Process and IRA Services

The goal is to make it easy and efficient for external client applications to access long-term US rates information.
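Before looking at how Zato does it, the idea of “one code base, several message formats, data served from a cache” can be sketched in plain Python. Nothing below uses Zato’s API: the dictionary stands in for Redis, and the function names and the sample rate are invented purely for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Stand-in for the Redis cache (assumption: a real deployment would not use a dict).
cache = {}

def update_cache(rows):
    """What a scheduled job might do: store freshly parsed rates under their date."""
    for date, rate in rows:
        cache[date] = rate

def get_rate(date):
    """Business logic shared by every client, independent of the wire format."""
    return {"date": date, "rate": cache[date]}

def as_json(payload):
    """Render the shared result for a JSON client."""
    return json.dumps(payload, sort_keys=True)

def as_xml(payload):
    """Render the same result for a client that expects simple XML."""
    root = ET.Element("response")
    for key in sorted(payload):
        ET.SubElement(root, key).text = str(payload[key])
    return ET.tostring(root, encoding="unicode")

update_cache([("2013-09-03", "1.52")])   # made-up sample value, not real Treasury data
result = get_rate("2013-09-03")
print(as_json(result))
print(as_xml(result))
```

The point of the sketch is the separation: get_rate knows nothing about JSON or XML, so a new format only requires a new renderer, and clients never trigger a download from the data provider.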
To that end, you'll make use of several features of Zato:

- Scheduler will be employed to invoke a service to download the XML offered by treasury.gov and parse interesting data out of it.
- Redis will be used as a cache to store the results of parsing the XML.
- Zato's SimpleIO (SIO, https://zato.io/docs/progguide/sio.html) will allow you to expose the same set of information to more than one client, each using a different message format, without any code changes or server restarts.

Zato encourages the division of each business process into a set of IRA services (https://zato.io/docs/intro/esb-soa.html); that is, each service exposed to users should be:

- Interesting: services should provide a real value that makes potential users pause for a moment and, at least, contemplate using the service in their own applications for their own benefit.


- Reusable: making services modular will allow you to make use of them in circumstances yet unforeseen, to build new, and possibly unexpected, solutions on top of lower-level ones.
- Atomic: a service should have a well defined goal, indivisible from the viewpoint of a service's users, and preferably no functionality should overlap between services.

The IRA approach closely follows the UNIX philosophy of "do one thing and do it well" as well as the KISS principle that is well known and followed in many areas of engineering.

When you design an IRA service, it is almost exactly like defining APIs between the components of a standalone application. The difference is that services connect several applications running in a distributed environment. Once you take that into account, the mental process is identical.

Anyone who already has created an interesting interface of any sort in a single-node application written in any programming language will feel right at home when dealing with IRA services.

From Zato's viewpoint, there is no difference in whether a service corresponds to an S in SOA or an R in REST; however, throughout this article, I'm using the former approach.

Laying Out the Services

The first thing you need is to diagram the integration process, pull out the services that will be implemented and document their purpose. If you need a hand with it, Zato offers its own API's documentation as an example of how a service should be documented (see https://zato.io/docs/progguide/documenting.html and https://zato.io/docs/public-api/intro.html):

- Zato's scheduler is configured to invoke a service (update-cache) refreshing the cache once an hour.


- update-cache, by default, fetches the XML for the current month, but it can be configured to grab data for any date. This allows for reuse of the service in other contexts.
- Client applications use either JSON or simple XML to request long-term rates (get-rate), and responses are produced based on data cached in Redis, making them super-fast. A single SIO Zato service can produce responses in JSON, XML or SOAP. Indeed, the same service can be exposed independently in completely different channels, such as HTTP or AMQP, each using different security definitions and not interrupting the message flow of other channels.

Figure 1. Overall Business Process
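
The "current month by default, any date on request" behavior described for update-cache can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code from the gist: resolve_period and build_filter are hypothetical helper names introduced here for illustration, and only the $filter expression follows the query the service sends to the Treasury feed.

```python
from datetime import date


def resolve_period(year=None, month=None, today=None):
    """Return (year, month) as integers, defaulting to the current month.

    Hypothetical helper: explicit input wins, otherwise fall back to today.
    """
    today = today or date.today()
    return int(year or today.year), int(month or today.month)


def build_filter(year, month):
    """Build the $filter expression selecting one month of quotes."""
    return 'month(QUOTE_DATE) eq {} and year(QUOTE_DATE) eq {}'.format(month, year)


if __name__ == '__main__':
    # With no input, the query covers the month we are in right now
    year, month = resolve_period()
    print({'$filter': build_filter(year, month)})
```

Keeping the date a parameter rather than a constant is what lets the same service be reused for back-filling older months.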


Implementation

The full code for both services is available as a gist on GitHub, and only the most interesting parts are discussed (https://gist.github.com/dsuch/6440366).

linuxjournal.update-cache

Steps the service performs are:

- Connect to treasury.gov.
- Download the big XML.
- Find interesting elements containing the business data.
- Store it all in Redis cache.

Key fragments of the service are presented below.

When using Zato services, you are never required to hard-code network addresses. A service shields such information and uses human-defined names, such as "treasury.gov"; during runtime, these resolve into a set of concrete connection parameters. This works for HTTP and any other protocol supported by Zato. You also can update a connection definition on the fly without touching the code of the service and without any restarts:

    # Fetch connection by its name
    out = self.outgoing.plain_http.get('treasury.gov')

    # Build a query string the backend data source expects
    query_string = {
        '$filter': 'month(QUOTE_DATE) eq {} and year(QUOTE_DATE) eq {}'.format(month, year)
    }

    # Invoke the backend with query string, fetch
    # the response as a UTF-8 string
    # and turn it into an XML object
    response = out.conn.get(self.cid, query_string)

lxml (http://lxml.de) is a very good Python library for XML processing and is used in the example to issue XPath queries against the complex document (http://data.treasury.gov/feed.svc/DailyTreasuryRealLongTermRateAverageData?$filter=month%28QUOTE_DATE%29%20eq%208%20and%20year%28QUOTE_DATE%29%20eq%202013) returned:

    xml = etree.fromstring(response)

    # Look up all XML elements needed (date and rate) using XPath
    elements = xml.xpath('//m:properties/d:*/text()', namespaces=NAMESPACES)

For each element returned


by the back-end service, you create an entry in the Redis cache in the format specified by REDIS_KEY_PATTERN (for instance, linuxjournal:rates:2013:09:03 with a value of 1.22):

    for date, rate in elements:

        # Create a date object out of string
        date = parse(date)

        # Build a key for Redis and store the data under it
        key = REDIS_KEY_PATTERN.format(
            date.year, str(date.month).zfill(2), str(date.day).zfill(2))
        self.kvdb.conn.set(key, rate)

        # Leave a trace of our activity
        self.logger.info('Key %s set to %s', key, rate)

linuxjournal.get-rate

Now that a service for updating the cache is ready, the one to return the data is so simple yet powerful that it can be reproduced in its entirety:

    class GetRate(Service):
        """ Returns the real long-term rate for a given date
        (defaults to today if no date is given).
        """
        class SimpleIO:
            input_optional = ('year', 'month', 'day')
            output_optional = ('rate',)

        def handle(self):
            # Get date needed either from input or current day
            year, month, day = get_date(self.request.input)

            # Build the key the data is cached under
            key = REDIS_KEY_PATTERN.format(year, month, day)

            # Assign the result from cache directly to response
            self.response.payload.rate = self.kvdb.conn.get(key)

A couple points to note:

- SimpleIO was used. This is a declarative syntax for expressing simple documents that can be serialized to JSON or XML in the current Zato version, with more to come in future releases.
- Nowhere in the service did you have to mention JSON, XML or even HTTP at all. It's all working on a high level of Python objects without specifying any output format or transport method.

This is the Zato way. It promotes reusability, which is valuable because a generic and interesting service, such as returning interest rates, is bound to be desirable in situations that cannot be predicted.

As an author of a service, you are not forced into committing to a particular format.
Those are


achieved from the command line or by user-created clients making API calls.

On top of that, server-side objects can be managed "en masse" (https://zato.io/docs/admin/guide/enmasse.html) using a JSON-based configuration that can be kept in a config repository for versioning and diffing. This allows for interesting workflows, such as creating a base configuration on a development environment and exporting it to test environments where the new configuration can be merged into an existing one, and later on, all that can be exported to production.

Figures 2–6 show the following configs:

- Scheduler's job to invoke the service updating the cache.
- Outgoing HTTP connection definitions for connecting to treasury.gov.
- HTTP channels for each client. There is no requirement that each client be given a separate channel, but doing so allows one to assign different security definitions to each channel without interfering with any other.

Figure 2. Scheduler Job Creation Form
Figure 3. Outgoing HTTP Connection Creation Form
Figure 4. JSON Channel Creation Form
Figure 5. Plain XML Channel Creation Form
Figure 6. SOAP Channel Creation Form

Testing It

update-cache will be invoked by the scheduler, but Zato's CLI offers the means to invoke any service from the command line, even if it's not mounted on any channel, like this:

    $ zato service invoke /opt/zato/server1 linuxjournal.update-cache --payload '{}'
    (None)
    $

There was no output, because the service doesn't produce any. However, when you check the logs you notice:

    INFO - Key linuxjournal:rates:2013:09:03 set to 1.22

Now you can invoke get-rate from the command line using curl with JSON, XML and SOAP. The very same service exposed through three independent channels will produce output in three formats, as shown below (output slightly reformatted for clarity).

Output 1:

    $ curl localhost:17010/client1/get-rate -d '{"year":"2013","month":"09","day":"03"}'
    {"response": {"rate": "1.22"}}

Output 2:

    $ curl localhost:17010/client2/get-rate -d '20130903'
    K295602460207582970321705053471448424629
    ZATO_OK
    1.22
    $


Output 3:

    $ curl localhost:17010/client3/get-rate \
        -H "SOAPAction:get-rates" -d '20130903'
    K175546649891418529601921746996004574051
    ZATO_OK
    1.22
    $

IRA Is the Key

IRA (Interesting, Reusable, Atomic) is the key you should always keep in mind when designing services that are to be successful.

Both the services presented in the article meet the following criteria:

- I: focus on providing data interesting to multiple parties.
- R: can take part in many processes and be accessed through more than one method.
- A: focus on one job only and do it well.

In this vein, Zato makes it easy for you to expose services over many channels and to incorporate them into higher-level integration scenarios, thereby increasing their overall attractiveness (I in IRA) to potential client applications.

It may be helpful to think of a few ways not to design services:

- Anti-I: update-cache could be turned into two smaller services. One would fetch data and store it in an SQL database; the other would grab it from SQL and put it into Redis. Even if such a design could be defended by some rationale, neither of the pair of services would be interesting for


external applications. A third service wrapping these two should be created and exposed to client apps, in the case of it being necessary for other systems to update the cache. In other words, let's keep the implementation details inside without exposing them to the whole world.
- Anti-R: hard-coding nontrivial parameters is almost always a poor idea, the result being that a service cannot be driven by external systems invoking it with a set of arguments. For instance, creating a service that is limited to a specific year only ensures its limited use outside the original project.
- Anti-A: returning a list of previous queries in response to a request may be a need of one particular client application, but contrary to the needs of another. In cases when a composite service becomes necessary, it should not be imposed on each and every client.

Designing IRA services is like designing a good programming interface that will be released as an open-source library and used in places that can't be predicted initially.

Born Out of Practical Experience

Zato is not only about IRA but also about codifying common admin and programming tasks that are of a practical nature:

- Each config file is versioned automatically (https://zato.io/docs/admin/guide/install-config/overview.html#config-files-areversioned) and kept in a local bzr repository (http://bazaar-vcs.org), so it's always possible to revert to a safe state. This is completely transparent and needs no configuration or management.
- A frequent requirement before integration projects are started, particularly if certain services already are available on the platform, is to provide usage examples in the form of message requests and responses. Zato lets you specify that one-in-n invocations of a service be stored for later use, precisely so that such requirements can be fulfilled by admins quickly.

Two popular questions asked regarding production are: 1) What are


FEATURE

SIDUS, the Solution for Extreme Deduplication of an Operating System

Probe corrupted computers without disassembling anything, and provide users with a full-featured environment in just a few seconds.

Emmanuel Quemener and Marianne Corvellec


SIDUS (Single-Instance Distributing Universal System) was developed at Centre Blaise Pascal (Ecole normale supérieure de Lyon, Lyon, France), where one administrator alone is in charge of 180 stations. Emmanuel Quemener started SIDUS in February 2010, and he significantly cut his workload for administering this park of stations. SIDUS is now in use at the supercomputing centre PSMN (Pôle Scientifique de Modélisation Numérique) of the Ecole normale supérieure de Lyon.

With SIDUS, you can provide a new user with a complete functional environment in just a few seconds. You can probe corrupted computers without disassembling anything. You can test new equipment without installing an OS on them. You can make your life so much easier when managing hundreds of cluster nodes, of workstations or of self-service stations. You drastically can reduce the amount of storage needed for the OS on these machines.

Disclaimer

SIDUS is not LTSP. LTSP is a solution for the simplified management of thin terminals through X11 or RDP access to a server. Thus, all the processing load is on the latter server. On the contrary, SIDUS makes full use (or partial use, as the user wishes) of the station's resources. Only the OS is stored remotely.

SIDUS is not FAI. FAI or Kickstart offer full simplified installs so that administration can be reduced or dismissed altogether. On the contrary, SIDUS offers a single system in a tree that integrates the base system as well as all manually installed applications.

SIDUS is flexible. When organizing IT-training sessions, you might want to give participants a specific virtual environment. But once they download it, you cannot modify it for them. SIDUS offers users a single given environment that is easily configurable at any moment.

SIDUS is not exotic. SIDUS makes use of services available with any distribution (DHCP, PXE, TFTP, NFSroot, DebootStrap and AUFS).
You can install SIDUS knowing only these few keywords. Besides, SIDUS makes use of distribution tricks from live CDs. SIDUS works on Debian, from version Etch onward.


How Good Is SIDUS?

SIDUS is:

- Universal: platform-independent, x86 or x86_64 architectures.
- Efficient: installing takes a few minutes, and booting takes a few seconds.
- Energy-saving: it takes only one core, 1GB of RAM, 40GB in disk space and an Ethernet (Gbit) network.
- Scalable: tested successfully on a hundred nodes.
- Multipurpose: we chose to use Debian as it comes with broad integration of open-source scientific software.

Installing SIDUS on Your System

It takes a little preparation for your system to host SIDUS. We have several services at our disposal in order to deploy our clients: DHCP, TFTP and NFS servers. Now, either you are on great terms with your own IT staff, or you are able to access freely the well-defined LDAP and DNS servers:

- DHCP service provides the client with one IP address but propagates two complementary pieces of information: IP address of the TFTP server (variable "next-server") and the name of the PXE binary, often called pxelinux.0.
- TFTP service then comes into play. Booting the system is enabled by TFTP, through the binary pxelinux.0, the kernel and startup of the client's system. If you need to give a client some parameters, you just build a dedicated file whose name stems from the client's MAC address (prefixing with 01 and replacing : with -).
- NFS service now enters the loop: it gives the system's root via its protocol (NFSroot). Accordingly, you will install your client system in this root, for example, /srv/nfsroot/sidus.

In our configuration, we have used isc-dhcp-server, tftpd-hpa and nfs-kernel-server for the servers DHCP, TFTP and NFS, respectively. Let's look into this configuration.

For DHCP, the configuration file (/etc/dhcp/dhcpd.conf) reads:

    next-server 172.16.20.251;
    filename "pxelinux.0";
    allow booting;

For TFTP, there are three files and one directory (pxelinux.cfg) in /srv/tftp:

    ./pxelinux.0
    ./vmlinuz-Sidus
    ./initrd.img-Sidus
    ./pxelinux.cfg

The pxelinux.0 file comes from the syslinux-common package. In pxelinux.cfg, there is the file called default. To boot, you need the following: the kernel vmlinuz-Sidus, the system initrd.img-Sidus and the server NFSroot 10.13.20.13 with the mountpoint /srv/nfsroot/sidus.

Below is an example of a boot file. It takes two inputs: tmpfs and iscsi (we'll come back to the iscsi input later on):

    DEFAULT tmpfs

    LABEL tmpfs
    KERNEL vmlinuz-Sidus
    APPEND console=tty1 root=/dev/nfs initrd=initrd.img-Sidus nfsroot=10.13.20.13:/srv/nfsroot/sidus,rsize=8192,wsize=8192,tcp ip=dhcp aufs=tmpfs

    LABEL iscsi
    KERNEL vmlinuz-Sidus
    APPEND console=tty1 root=/dev/nfs initrd=initrd.img-Sidus nfsroot=10.13.20.13:/srv/nfsroot/wheezy64,rsize=8192,wsize=8192,tcp ip=dhcp aufs=iscsi ISCSI_TARGET_IP=10.13.20.14 ISCSI_INITIATOR=iqn.2013-04.zone.sidus.target:default root=LABEL=ISCSI

Regarding the NFS server, it takes one line in the file /etc/exports to configure it:

    /srv/nfsroot/sidus 10.13.20.0/255.255.255.0(ro,no_subtree_check,async,no_root_squash)

Here, we open a read-only access to stations with IP between 10.13.20.1 and 10.13.20.254.

Once you have configured these three services (DHCP, TFTP and NFS), you can install a full SIDUS. Note that you also will need a root for user accounts (via NFSv4) and a process enabling their identification/authentication (via LDAP or Kerberos). We have deployed SIDUS on environments where these services are provided by third-party servers but also on standalone environments. Installing an OpenLDAP server with SSL or a Kerberos server is off-topic, so we simply show the


client configuration files for our infrastructure (again, LDAP for identification/authentication and NFSv4 for user folders).

Install the Debian Base with Debootstrap

With Debootstrap, you can install a system in an extra root location. Debootstrap needs you to specify parameters, such as the install root, the hardware architecture, the distribution and the FTP or HTTP Debian archive to use for downloading on Debian worldwide mirror sites.

Warning: this is where we get Debian-specific. Debootstrap is a familiar tool for all Debian-like distributions (typically, it is available on Ubuntu). It is not too difficult to make it happen on Red Hat-like distributions though. There is a clone for Fedora called febootstrap; we have not tested it though.

Debootstrap (http://wiki.debian.org/Debootstrap) also takes as input a list of archives (as we all know, Debian is very particular about distinguishing the main archive area from the contrib and non-free ones), a list of packages to include and a list of packages to dismiss. We wish we could specify the latter two lists, but you cannot handle everything with Debootstrap. We install from the very beginning a set of tools we deem necessary (such as the kernel, some firmware and auditing tools).

We define environment variables corresponding to the root of our SIDUS system. We define a command that enables the execution of commands via chroot, with a specific option for package install. The variable $MyInclude corresponds to the (comma-separated) list of packages you want, and $MyExclude corresponds to the list of packages you do not want:

    export SIDUS=/srv/nfsroot/sidus
    time debootstrap --arch amd64 --components='main,contrib,non-free' --include=$MyInclude --exclude=$MyExclude wheezy $SIDUS http://ftp.debian.org/debian

Precautions before Moving on with the Install

After running the last-mentioned command, you should be a little cautious. A Debian package normally starts its services after install.
You need to define a hook to inhibit the starting of services. After completion of the install, you can remove this hook:

    printf '#!/bin/sh\nexit 101\n' > ${SIDUS}/usr/sbin/policy-rc.d
    chmod +x ${SIDUS}/usr/sbin/policy-rc.d

Some packages require access to the list of processes, system, peripherals, peripheral pointers and virtual memory. Hence, you should bind the mounting of these host system folders to SIDUS:

    alias sidus="DEBIAN_FRONTEND=noninteractive chroot ${SIDUS}"
    sidus mount -t proc none /proc
    sidus mount -t sysfs sys /sys
    mount --bind /run/shm ${SIDUS}/run/shm
    mount --bind /dev/pts ${SIDUS}/dev/pts

Install Additional Packages (Scientific Libraries)

To make it simpler when installing packages of the same family, Debian ships with meta-packages. In our case, we are interested in the scientific ones: their names are prefixed by "science". For example, we have "science-chemistry", including all chemistry packages. You install all scientific packages with only one command:

    time sidus apt-get install --install-suggests -f -m -y --force-yes science-*

Because we are talking about a full-featured OS, we also install the suggested packages: the option --install-suggests is available from Wheezy onward (released May 5, 2013).

When installing, the costliest phase is downloading packages and configuring certain components (Perl and LaTeX). In the best-case scenario, it takes 45 minutes for a 32GB full tree.

There is a price to pay for this install craze. Some packages do not install well, and you will want to purge some, such as a M*tlab installer:

    time sidus apt-get purge -y -f --force-yes matlab-*

Local Environment

Usually, you will want to adapt the system to a local environment (authentication and user sharing). The default is US, so you may want to configure:

- ${SIDUS}/etc/locale.gen.
- ${SIDUS}/etc/timezone.
- ${SIDUS}/etc/default/keyboard.

For LDAP authentication, you may want to configure: ${SIDUS}/etc/nsswitch.conf, ${SIDUS}/etc/libpam_ldap.conf, ${SIDUS}/etc/libnss-ldap.conf and ${SIDUS}/etc/ldap/ldap.conf.


As for the mounting of NFS user folders:

- ${SIDUS}/etc/default/nfs-common, ${SIDUS}/etc/default/idmapd.conf and ${SIDUS}/etc/fstab (for NFSv4).
- ${SIDUS}/etc/fstab (for NFSv3).

Set Up the Boot Sequence

How do you share SIDUS without duplicating it? We found the best solution to be via live CD. The boot sequence includes the two layers you need, that is, a read-only layer (on media for live CD and on NFS in our case) and a read-write layer (on TMPFS). The two layers are linked via AUFS (the successor of UnionFS). Everything is taken care of by a single hook upon boot (the script called rootaufs). It operates in five steps:

1. Creates the temporary directories /ro, /rw and /aufs.
2. Moves the root of NFSroot from the original mountpoint to /ro.
3. Mounts the local or remote partition.
4. Superimposes /ro and /rw into /aufs.
5. Moves /aufs into the original mountpoint.

rootaufs goes into ${SIDUS}/etc/initramfs-tools/scripts/init-bottom.

The original script is inspired by the rootaufs project by Nicholas A. Schembri (http://code.google.com/p/rootaufs). We adapted it to a large extent to match our infrastructure. A version is available at http://www.cbp.ens-lyon.fr/sidus/rootaufs:

    wget -O ${SIDUS}/etc/initramfs-tools/scripts/init-bottom/rootaufs http://www.cbp.ens-lyon.fr/sidus/rootaufs

The system is not functional yet. You need to create an initrd specific to your NFS boot. Add aufs in ${SIDUS}/etc/initramfs-tools/modules and force eth0 as DEVICE in ${SIDUS}/etc/initramfs-tools/initramfs.conf:

    sidus update-initramfs -k all -u
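
The two file edits that precede update-initramfs (adding aufs to the modules list and forcing DEVICE to eth0) are easy to get wrong when repeated by hand on several trees. Below is a minimal Python sketch of how they could be scripted so that running them twice changes nothing. It is an illustration, not part of the SIDUS tooling: ensure_line, set_key and prepare_initramfs are hypothetical names, and the sketch assumes the usual one-module-per-line and KEY=value layouts of those two files.

```python
import os
import re


def ensure_line(path, line):
    """Append `line` to the file at `path` unless it is already present."""
    existing = []
    if os.path.exists(path):
        with open(path) as f:
            existing = f.read().splitlines()
    if line not in existing:
        existing.append(line)
        with open(path, 'w') as f:
            f.write('\n'.join(existing) + '\n')


def set_key(path, key, value):
    """Set KEY=value in a shell-style config file, dropping any old value."""
    lines = []
    if os.path.exists(path):
        with open(path) as f:
            lines = f.read().splitlines()
    pattern = re.compile(r'^\s*{}='.format(re.escape(key)))
    kept = [l for l in lines if not pattern.match(l)]
    kept.append('{}={}'.format(key, value))
    with open(path, 'w') as f:
        f.write('\n'.join(kept) + '\n')


def prepare_initramfs(sidus_root):
    """Apply both edits under the given SIDUS tree (directories must exist)."""
    base = os.path.join(sidus_root, 'etc', 'initramfs-tools')
    ensure_line(os.path.join(base, 'modules'), 'aufs')
    set_key(os.path.join(base, 'initramfs.conf'), 'DEVICE', 'eth0')


if __name__ == '__main__':
    # Same default root as the SIDUS variable exported earlier
    prepare_initramfs(os.environ.get('SIDUS', '/srv/nfsroot/sidus'))
```

The initrd still has to be regenerated afterward with the update-initramfs command shown above.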


Then, you just copy the kernel and initrd into the TFTP directory:

    cp ${SIDUS}/vmlinuz /srv/tftp/vmlinuz-Sidus
    cp ${SIDUS}/srv/nfsroot/boot/initrd /srv/tftp/initrd.img-Sidus

How can you take advantage of SIDUS while keeping a given configuration from one boot to the next? Mounting NFS on each node separately is very costly. It is preferable to mount iSCSI on each node.

Originally, we investigated how to offer a second NFS share in read-write mode to ensure persistence of client-related changes from one boot to the next. This version, although functional, required an atomized NFS, one for each client. This was not sustainable for the server.

Therefore, we decided on another solution to ensure persistence. We create an iSCSI share for each client. The settings for mounting the iSCSI disk are defined in the command line. So we use a network drive from iSCSI technology. In the config file /srv/tftp/pxelinux.cfg/default, we have the definition LABEL=iscsi. Each SIDUS client needs its own iSCSI storage space to ensure persistence. For reasons of simplicity, in the initrd booting sequence, the SIDUS clients fetch the volumes that bear their respective IPs. The rootaufs file contains a default login/password.

A few tricks:

- Erase /etc/hostname to set the hostname through DHCP.
- Set /etc/resolv.conf with a hard-coded definition.
- Define a loopback in /etc/network/interfaces.
- Change the booting of GDM3 so it starts only after NSCD is launched.
- Set /etc/security/limits.conf (essential in an HPC environment).
- Set /etc/fstab with input from the NFS server of user accounts.
- For VirtualBox-based virtual systems, install VBoxLinuxAdditions.run in the SIDUS system.
- For systems with an InfiniBand card, force loading of modules in /etc/modules and regenerate initrd. In /etc/rc.local, execute a script that gets the Ethernet IP address and builds an IP address for the InfiniBand card.
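
The last trick, building an InfiniBand address from the Ethernet one, can be sketched as follows. The article does not say which mapping the authors' script uses, so this is only one plausible rule, stated as an assumption: keep the host part of the Ethernet address and move it into a dedicated InfiniBand network of the same size. The function name derive_ib_address and the 10.14.20.0/24 network in the usage example are hypothetical.

```python
import ipaddress


def derive_ib_address(eth_ip, eth_network, ib_network):
    """Map an Ethernet address to the InfiniBand network, keeping its host part.

    Assumed rule (not taken from the SIDUS scripts): both networks have the
    same prefix length, and a node keeps the same host number on each.
    """
    eth_net = ipaddress.ip_network(eth_network)
    ib_net = ipaddress.ip_network(ib_network)
    if eth_net.prefixlen != ib_net.prefixlen:
        raise ValueError('networks must have the same prefix length')

    addr = ipaddress.ip_address(eth_ip)
    if addr not in eth_net:
        raise ValueError('{} is not in {}'.format(eth_ip, eth_network))

    # Host bits are whatever the netmask does not cover
    host_bits = int(addr) & int(eth_net.hostmask)
    return str(ipaddress.ip_address(int(ib_net.network_address) | host_bits))


if __name__ == '__main__':
    # A node at 10.13.20.42 would get the matching address on the IB network
    print(derive_ib_address('10.13.20.42', '10.13.20.0/24', '10.14.20.0/24'))
```

A fixed rule like this keeps every node's configuration derivable from its DHCP lease alone, which fits the single-tree approach: nothing node-specific has to be stored in the shared system.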


administration phase abides by the installation mechanisms: protection against the starting of services and mounting of system folders.

Administration techniques are similar to those for the initial install. We use a script so that commands executed in SIDUS are surrounded by pre/post operations. We use this script either automatically (typically for updates) or manually. At the end of the day, these additional commands represent a negligible burden with respect to the benefits we get. We now have several (many) stations that are bit-for-bit identical to a given base system. Other benefits:

- SIDUS works on user stations. The individual workstations can be considered shared. We started with a dozen Neoware light clients that were memory-enhanced and overclocked. We now have about 20 of those.
- SIDUS works on cluster nodes. In March 2010, we had a proof of concept with 24 nodes. Nowadays, SIDUS serves 86 permanent nodes over four different hardware architectures.
- SIDUS works on virtual stations. Every year since 2011, Université Joseph Fourier has organized a summer school on scientific computing. For ten busy days, students train hands-on. Thus, it's necessary to offer them a homogeneous environment in no time. Two virtual images are offered: a persistent one they can use after the summer school and another one via SIDUS. This way, teachers can adapt materials and activities day by day. Since summer 2012, this solution has been used at the Laboratory of Chemistry of ENS de Lyon as well.
- SIDUS works on suspicious stations. Booting via the network enables the investigation of the mass storage of the shut-down system. There is no need for a live CD, which always lacks your ideal forensic tool.
- SIDUS works on loan stations. Hardware manufacturers usually offer assessment equipment. The install phase can be tedious on recent equipment.
Using SIDUS, the system boots just like on other (in use) equipment; for example, it takes a few minutes for 20 nodes.

Conclusion

Who should use SIDUS and why?

- Users: you choose the resources on which you want to boot your


station. Therefore, workstations can be segmented at will. The VirtualBox version of SIDUS has been tested successfully on Linux, MS Windows and Mac OS. GPU acceleration and sharing with the host are available. Users find themselves in the same environment as the nodes'. This makes code integration tremendously easier. Performance-wise, losses due to virtualization vary between 10% and 20% for VirtualBox and about 5% for KVM.
- Administrators: a given operation propagates to the entire infrastructure, as if simply syncing over the SIDUS tree. The install takes a few tens of minutes for a full-featured system. To work out minor differences between systems, simple scripts or Puppet do the job. In the case of more important differences, just build another SIDUS tree. The SIDUS tree might even just be cloned instantaneously using snapshot tools (LVM or, better, ZFSonLinux).
- For the sake of experiments: the SIDUS environment offers scientists and system engineers a framework for conducting reproducible experiments. Two nodes booting on the same SIDUS base do run the exact same system. This way, even if the stations are not actually identical, relevant tests still can be carried out.

How much does it cost in terms of resources? To get an idea, the clusters' server (also gateway) at CBP hosts DHCP, DNS, TFTP and NFS services, as well as a batch server OAR. When booting the entire infrastructure (88 nodes), the NFS server takes in 900Mb/s.

To conclude, you will want to use SIDUS on a variety of environments, be they HPC nodes, workstations or virtual machines. SIDUS gives unprecedented flexibility to both users and administrators. It is so energy-efficient and it propagates so rapidly, you won't want to live without it!

Emmanuel Quemener defines his job as an "IT test pilot".
His work at the HPC "Centre Blaise Pascal" (Lyon, France) involves software integration, storage, scientific computing with GPUs and technology transfer in science.

Marianne Corvellec is a physicist who gratefully was exposed to the wonderful world of free/libre and open-source software. She wants to make computation science a better place, promoting best practices and interacting with software experts.

Send comments or feedback via http://www.linuxjournal.com/contact or to ljeditor@linuxjournal.com.


OpenStack: Special Report

Introduction to OpenStack

You've probably heard of OpenStack. It's that cloud software that's getting a lot of attention from a lot of big names in the IT industry and major users like CERN, Comcast and PayPal. However, did you know that it's more than that? It's also the fastest-growing Open Source community in the world, and a very interesting collaboration among technology vendors.

TOM FITFIELD

OpenStack is really, truly open. If you want to test it out, you can stop reading this article (come back after; we'll still be here), visit http://status.openstack.org/release, and get a real-time list of features that are under development. Select one, follow it to the code review system, and give your comments. This is an example of Open Development, just one of the "Four Opens" on which OpenStack was founded. Of course, you likely already know that OpenStack is fully released under the Open Source Apache 2 license, no bits reserved, but did you also know about the principle of "Open Design"?

The software is released on a six-month cycle, and at the start of each cycle, we host a Design Summit for the contributors and users to gather and plan out the roadmap for the next release. The Design Summit has become part of an increasingly large conference that also hosts workshops for newcomers, inspiring keynotes and some fairly amazing user stories. However, somewhere tucked away are rooms with chairs arranged in semicircular layouts ensconcing dozens of developers engaged in robust discussion, taking notes on a collaborative document displayed on a projector. This is where the roadmap of


The patches also are run through the continuous integration system, which effectively builds a new cloud for every code change submitted, ensuring that the interaction of that piece of software is as expected with all other parts of OpenStack. As a result of this quest for quality, many have found that contributing has improved their Python coding skills.

Of course, not everyone is a Python developer. However, don’t worry; there’s still a place for you to work with us to change the face of cloud computing:

• Documentation: in many cases, the easiest way to become an OpenStack contributor is to participate in the documentation efforts. It requires no coding, just a willingness to read and understand the systems that you’re writing about. Because the documentation is treated like code, you will be learning the mechanics necessary to make contributions to OpenStack itself by helping with documentation. Visit https://wiki.openstack.org/wiki/Documentation/HowTo for more information.

• Translation: if you know how to write in another language, translations of OpenStack happen through an easy-to-use Web interface (https://www.transifex.com/projects/p/openstack). For help, contact the Internationalization team (https://wiki.openstack.org/wiki/I18nTeam).

• File bugs: OpenStack is one of the few projects you will work on that loves to hear your angry rants when something does not work. If you notice something off, consider filing a bug report with details of your environment and what you experienced at http://docs.openstack.org/trunk/openstack-ops/content/upstream_openstack.html.

• Asking questions: a StackOverflow-style board for questions about OpenStack is available at http://ask.openstack.org. Feel free to use it to ask yours, and if you have the ability, stick around and try to answer someone else’s (or at least vote on the ones that look good).

• Evangelism: do you think OpenStack is pretty cool? Help us out by telling your friends; we’d really appreciate it. You
can find some materials to help at http://openstack.org/marketing, or join the marketing mailing list to find out about some cool events to attend.

• User groups: join your local user group (https://wiki.openstack.org/wiki/OpenStackUserGroups). They’re in about 50 countries so far. Attend to learn, or volunteer to speak. We’d love to have you.

Navigating the Ecosystem: Where Does Your OpenStack Journey Begin?

One of the unique aspects about OpenStack as an open-source project is that there are many different levels at which you can begin to engage with it. You don’t have to do everything yourself.

Starting with Public Clouds, you don’t even need to have an OpenStack installation to start using it. Today, you can swipe your credit card at eNovance, HP, Rackspace and others, and just start migrating your applications.

Of course, for many people, the enticing part of OpenStack is building their own private clouds, and there are several ways to do that. Perhaps the simplest of all is an appliance-style solution. You purchase a “thing”, unbox it, plug in the power and the network, and it just is an OpenStack cloud.

However, hardware choice is important for many applications, so if that applies to you, consider that there are several software distributions available. You can, of course, get enterprise-supported OpenStack from Canonical, Red Hat and SUSE, but take a look also at some of the specialized distributions, such as those from Rackspace, Piston, SwiftStack or Cloudscaling.

If you want someone to help guide you through the decisions from the hardware up to your applications, perhaps adding in a few features or integrating components along the way, consider contacting one of the system integrators with OpenStack experience like Mirantis or Metacloud.

Alternately, if your preference
is to build your own OpenStack expertise internally, a good way to kick start that might be to attend or arrange a training session. The OpenStack Foundation recently launched a Training Marketplace (http://www.openstack.org/marketplace/training), where you can look for events nearby. There’s also a community training effort (https://wiki.openstack.org/wiki/Training-manuals) underway to produce open-source content for training.

To derive the most from the flexibility of the OpenStack framework, you may elect to perform a DIY solution. In which case, we strongly recommend getting a copy of the OpenStack Operations Guide (http://docs.openstack.org/ops), which discusses many of the decisions you will face along the way. There’s also a new OpenStack Security Guide (http://docs.openstack.org/sec), which is an invaluable reference for hardening your installation.

DIY-ing Your OpenStack Cloud

If, after careful analysis, you’ve decided to construct OpenStack yourself from the ground up, there are a number of areas to consider.

Storage

One of the most fundamental underpinnings of a cloud platform is the storage on which it runs. In general, when you select storage back ends, ask the following questions:

• Do my users need block storage?

• Do my users need object storage?

• Do I need to support live migration?

• Should my persistent storage drives be contained in my compute nodes, or should I use external storage?

• What is the platter count I can achieve? Do more spindles result in better I/O despite network access?

• Which one results in the best cost-performance scenario I’m aiming for?

• How do I manage the storage operationally?

• How redundant and distributed is the storage? What happens if a storage node fails? To what extent can it mitigate my data-loss disaster scenarios?

• Which plugin do I use for block storage?

For many new clouds, the object storage and persistent/block storage
are great features that users want. However, with OpenStack, you’re not forced to use either if you want a simpler deployment.

Many parts of OpenStack are pluggable, and one of the best examples of this is Block Storage, which you are able to configure to use storage from a long list of vendors (Coraid, EMC, GlusterFS, Hitachi, HP, IBM, LVM, NetApp, Nexenta, NFS, RBD, Scality, SolidFire, Windows Server and Zadara).

Network

If this is the first time you are deploying a cloud infrastructure in your organization, after reading this section, your first conversations should be with your networking team. Network usage in a running cloud is vastly different from traditional network deployments, and it has the potential to be disruptive at both a connectivity and a policy level.

For example, you must plan the number of IP addresses that you need for both your guest instances as well as management infrastructure. Additionally, you must research and discuss cloud network connectivity through proxy servers and firewalls.

One of the first choices you need to make is between the “legacy” nova-network and OpenStack Networking (aka Neutron). Nova-network is a much simpler way to deploy a network, but it does not have the full software-defined networking features of Neutron, and it will be deprecated after 12–18 months.

Object Storage’s network patterns might seem unfamiliar at first. Consider these main traffic flows:

• Among object, container and account servers.

• Between those servers and the proxies.

• Between the proxies and your users.

Object Storage is very “chatty” among servers hosting data; even a small cluster does megabytes/second of traffic, which is predominantly “Do you have the object?”/“Yes I have the object.” Of course, if the answer to the aforementioned question is negative or times out, replication of the object begins.

Consider the scenario where an entire server fails, and 24TB of data needs to be transferred “immediately” to remain at three copies; this can put significant load on the network.

Another oft-forgotten fact is that when a new file is being uploaded, the proxy server must
write out as many streams as there are replicas, giving a multiple of network traffic. For a three-replica cluster, 10Gbps in means 30Gbps out. Combining this with the previous high-bandwidth demands of replication is what results in the recommendation that your private network have significantly higher bandwidth than your public network needs. Oh, and OpenStack Object Storage communicates internally with unencrypted, unauthenticated rsync for performance, so you do want the private network to be private.

The remaining point on bandwidth is the public-facing portion. Swift-proxy is stateless, which means that you easily can add more and use HTTP load-balancing methods to share bandwidth and availability between them.

More proxies mean more bandwidth, if your storage can keep up.

“Cloud Controller”

To achieve maximum scalability via a shared-nothing/distributed-everything architecture, OpenStack does not have the concept of a “cloud controller”. Indeed, one of the biggest decisions deployers face is exactly how to segregate out all of the “central” services, such as the API endpoints, schedulers, database servers and the message queue.

For best results, acquiring some metrics on how the cloud will be used is necessary. Of course, with a proper automated configuration management system, it will be possible to scale as operational experience is gained. Key questions to consider may include:

• How many instances will run at once?

• How many compute nodes will run at once?

• How many users will access the API?

• How many users will access the dashboard?

• How many nova-API services do you run at once for your cloud?

• How long does a single instance run?

• Does your authentication system also verify externally?

While the size of compute node hardware depends mainly on the types of virtual machines running, sizing “central service” machines can be more difficult. Contrast two clouds running 1,000 virtual machines.
One is mainly used for long-running Web sites, and in the other the average lifetime is


When users provision resources, they can specify from which availability zone they would like their instance to be built. This allows cloud consumers to ensure that their application resources are spread across disparate machines to achieve high availability in the event of hardware failure.

Host aggregates, on the other hand, enable you to partition OpenStack Compute deployments into logical groups for load balancing and instance distribution. You can use host aggregates to partition an availability zone further. For example, you might use host aggregates to partition an availability zone into groups of hosts that either share common resources, such as storage and network, or have a special property, such as trusted computing hardware.

A common use of host aggregates is to provide information for use with the compute scheduler. For example, you might use a host aggregate to group a set of hosts that share specific images.

Another very useful feature for scaling is Object Storage Global Clusters. This adds the concept of a Region and allows for scenarios like having one of the three replicas in an off-site location. Check the SwiftStack blog for more information (http://swiftstack.com/blog/2012/09/16/globally-distributed-openstackswift-cluster).

Summary

If you’ve looked at the storage options, determined which types of storage you want and how they will be implemented, planned the network carefully (taking into account the different ways to deploy it and how it will be managed), acquired metrics to design your cloud controller, then considered how to scale your cluster, you probably are now an OpenStack expert. In which case, we’d encourage you to customize your deployment and share your findings with the community. ■

After learning about scalability in computing from particle physics experiments like ATLAS at the Large Hadron Collider, Tom led the creation of the NeCTAR Research Cloud.
Tom is currently harnessing his passion for large-scale distributed systems by focusing on their most important part, people, as Community Manager at the OpenStack Foundation. Tom is a co-author of the OpenStack Operations Guide, a documentation core reviewer and a contributor in a small way to Compute (Nova), Object Storage (Swift), Block Storage (Cinder), Dashboard (Horizon) and Identity Service (Keystone).

Send comments or feedback via http://www.linuxjournal.com/contact or to ljeditor@linuxjournal.com.
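As a companion to the Object Storage network discussion in the article above, the bandwidth figures can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: the function names are hypothetical (they are not part of Swift or any OpenStack API), and the 10Gbps link used for the re-replication estimate is an assumption, since the article gives no link speed for that scenario.

```python
# Back-of-the-envelope figures for Object Storage network planning.
# The function names and the 10Gbps link speed used below are
# illustrative assumptions, not part of any OpenStack API.

def proxy_egress_gbps(ingress_gbps, replicas=3):
    """Traffic the proxy tier writes to storage nodes for a given upload rate."""
    return ingress_gbps * replicas


def rereplication_hours(data_tb, link_gbps):
    """Lower bound on the time to copy data_tb terabytes over a single link."""
    bits = data_tb * 1e12 * 8            # decimal terabytes to bits
    seconds = bits / (link_gbps * 1e9)   # ignores protocol overhead and disk limits
    return seconds / 3600


if __name__ == "__main__":
    # Three replicas: 10Gbps of uploads becomes 30Gbps toward storage nodes.
    print(proxy_egress_gbps(10))
    # A failed 24TB server, re-replicated over one assumed 10Gbps link.
    print(round(rereplication_hours(24, 10), 1), "hours at best")
```

Under these assumptions, restoring the third copy of a 24TB server keeps a 10Gbps link saturated for more than five hours, which is one way to see why the private network is recommended to be the faster one.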


EOF

Life on the Forked Road

DOC SEARLS

We are analog and digital. One is old, the other new. Civilizing the latter will take some work.

On a panel at the last LinuxCon, Linus was asked if the US government ever wanted a backdoor added to Linux (http://www.eweek.com/developer/linus-torvalds-talks-linuxdevelopment-at-linuxcon.html). He nodded “yes” while saying “no”.

While we’re sure Linus meant what he said, his answer calls to mind the immortal words of Yogi Berra (http://en.wikipedia.org/wiki/Yogi_Berra): “When you come to a fork in the road, take it.” We are at that fork now. We arrived when Edward Snowden began revealing to muggles what the wizards among us always knew, or at least suspected: that government spying on ordinary data communications among private individuals was not only possible, but happening.

I was talking the other day to a high-ranking US military officer who referred to our new collective condition as “post-Snowden”. It is a permanent state. We aren’t going back.

The central problem of post-Snowden life is that our lives are now both analog and digital. Snowden revealed the forked road we had already taken and made clear how hard it is to travel both at once.

The difference between the two sides of the fork is binary. The analog and the digital are as different and complementary as zero and one, which are also the only numbers we need for binary math. For better and worse, we now cohabitate a world that is both analog and base-2.

Today’s use of base-2 mathematics traces back to Gottfried Leibniz (http://en.wikipedia.org/wiki/Gottfried_Leibniz), who wrote, “Now one can say that nothing in the world can better present
and demonstrate this power than the origin of numbers, as it is presented here through the simple and unadorned presentation of one and zero or nothing.” The thing “presented here” was the Chinese I Ching, which is wrapped around the more ancient notion of yin and yang, drawn by Taoists as the familiar symbol shown in Figure 1.

Figure 1. I Ching

Berra-ists might prefer another Eastern symbol, the endless knot shown in Figure 2 (http://en.wikipedia.org/wiki/Endless_knot).

Figure 2. Endless Knot

Both convey embrasure of irony: one in a state, the other in a path. I’d love to see a symbol for this: embodied and disembodied. The symbol shown in Figure 3 will have to do for now.

Figure 3. Embodied and Disembodied

Our analog selves are embodied. Our digital ones aren’t. Yet we manifest both ways to each other. As fleshy creatures, being analog is our permanent fate. Being digital isn’t, but it’s what we also are now. We need to learn what that is and how to deal with it. And we have a long way to go before being digital comes as naturally as being analog. If it ever does.

In Philosophy in the Flesh, George Lakoff and Mark Johnson say, “The mind is inherently embodied.” Because of this, they continue, our bodies supply the primary metaphors we use to make sense of the world. For example, we say good is up and bad is down because we walk upright. We say good is light and bad is dark because we are diurnal. If we were nocturnal, those might be reversed. We speak of life in terms of travel (birth is arrival, death is departure or passing on, careers are paths) because we are ambulatory animals. To live is to move.

It is also fairly easy to own things in the analog world, or at least to
claim them as our own. “Possession is nine-tenths of the law” because it is nine-tenths of the three-year-old. She says “It’s mine!”, because she has mastery of her opposable thumbs. Possession is holding.

Our senses also extend outward beyond our bodies through a capacity called indwelling. When a carpenter speaks of “my hammer”, and drivers speak of “my engine”, it is because their senses dwell in those things as extensions of their bodies.

But how do we dwell in the disembodied place we call the Internet? It isn’t a place, even though it has sites, locations and domains that we visit and browse. Our bodies, even in public places, tend to be clothed, with things we call “privates” concealed. On the Net, so far, we are exposed in ways that are foreign to our bodily experience. Even the muggles among us knew that, to some degree, in pre-Snowden time. Now just about all of us know our degrees of personal exposure could be...anything, to a degree that is incalculable in the limited terms of our analog experience.

This near-absolute range of possibility on the disembodied digital side of our forked road is encapsulated in this single line from Kevin Kelly: “The Internet is a copy machine” (http://www.kk.org/thetechnium/archives/2008/01/better_than_fre.php). When you send me a digital document over the Net, you don’t yield it, as you would in the analog way. Instead, you contribute to the proliferation of everything.

T.Rob says the difference “is one of atoms versus bits” (https://ioptconsulting.com/industrystill-puzzling-over-consumer-reactionto-tracking). He goes on to explain:

When surveillance was physical Newtonian physics limited what could be done. We didn’t need laws or policies stating that you couldn’t surveil all of the people all of the time because to do so wasn’t physically possible.
Because we have never had that capability before, we do not have any experience with it from a policymaking standpoint.

Having always been constrained along that boundary by what was possible, it was difficult to overstep the line of what was ethical. Today we have the reverse. The line of possibility has withdrawn so far that it no longer constrains us from crossing the line of what is ethical. But we are so unused to
having to exercise self-restraint along this boundary that many are not recognizing the ethical line when they cross it. Many are in fact treating it like a land rush, grabbing all the territory between the ethical line and the possible line before policy and legislation can step in to claim it.

I suspect in the end it will come down to manners. There will be things that we do not do, even though we can. I see hope in the word protocol, which is well understood in technical circles, as well as among diplomatic and other practices back in the analog world.

But I don’t have faith. Not yet. Not while the land rush is still on for marketers and government agencies, trampling our analog sensibilities.

Snowden’s revelations may have marked Peak Surveillance, but they also marked the point at which we start to work seriously on civilizing our digital selves. ■

Doc Searls is Senior Editor of Linux Journal. He is also a fellow with the Berkman Center for Internet and Society at Harvard University and the Center for Information Technology and Society at UC Santa Barbara.

Send comments or feedback via http://www.linuxjournal.com/contact or to ljeditor@linuxjournal.com.

Advertiser Index

Thank you as always for supporting our advertisers by buying their products!

ADVERTISER          URL                                          PAGE #
AnDevCon Fall       http://www.andevcon.com                      2
Drupalize.me        http://drupalize.me                          99
Emac, Inc.          http://www.emacinc.com                       11
EmperorLinux        http://www.emperorlinux.com                  31
FSCONS 2013         https://fscons.org/2013/                     83
iXsystems, Inc.     http://www.ixsystems.com                     7
Manage Engine       http://www.manageengine.com                  57
Mirantis            http://www.mirantis.com                      121
New Relic           http://newrelic.com                          34, 35
OVH                 http://www.ovh.com                           45
Silicon Mechanics   http://www.siliconmechanics.com              3
USENIX LISA         https://www.usenix.org/conference/LISA2013   47

ATTENTION ADVERTISERS

The Linux Journal brand’s following has grown to a monthly readership nearly one million strong.
Encompassing the magazine, Web site, newsletters and much more, Linux Journal offers the ideal content environment to help you reach your marketing objectives. For more information, please visit http://www.linuxjournal.com/advertising.
