10.07.2015 Views

vP0Ui


What Is Big Data?

gold becomes too much of a gambling game. Sure, history has its gold rush fever stories, but nobody ever mobilized millions of people to dig everywhere and anywhere; that would be too expensive. Today’s miners work differently. Gold mining leverages massive capital equipment that can process millions of tons of dirt (low value-per-byte data) to find nearly invisible strands of gold (ore grades of 30 ppm are usually needed before gold is visible to the naked eye). In other words, there’s a great deal of gold (high value-per-byte data) in all of this dirt (low value-per-byte data), and with the right equipment, you can economically process lots of dirt and keep the flakes of gold that you find. The flakes of gold are taken for processing (perhaps to your data warehouse or another insight engine) and combined to make a bar of gold, which is stored and logged in a place that’s safe, governed, valued, and trusted. The gold industry is working on chemical washes that aim to reveal even finer granularizations of gold to find more value out of previously extracted dirt (now think data). We think this analogy fits well into our Big Data story because we’re willing to bet that if you had a corpus composed of ten years of your transaction data, new analytic approaches will let you extract more insight out of it three years from now than you can with today’s technology.

In addition, if you look at the access patterns that characterize a data warehouse and a Hadoop repository, one differentiator you’ll find is that a data warehouse is often characterized by response times that allow you to work with the system interactively. Indeed, terms such as “speed-of-thought response times” aren’t the descriptors you’re going to find associated with a batch system, which is what Hadoop is, for now.

A Big Data platform lets you store all of the data in its native business object format and get value out of it through mass parallelism on commodity components. For your interactive navigational needs, you’ll continue to pick and choose sources, cleanse that data, and keep it in warehouses. But you can get more value out of having a large amount of lower-fidelity data by pulling in what might seem to be unrelated information to form a more robust picture. In other words, data might sit in Hadoop for a while, and when its value becomes proven and sustainable, it can be migrated to the warehouse.

The difference between observed value and discovery is not as black and white as we have described here. For example, a common use case for a
