14.03.2014 Views

Visual Analytics - An Interaction of Sight and Thought - JMP

Visual Analytics - An Interaction of Sight and Thought - JMP

Visual Analytics - An Interaction of Sight and Thought - JMP

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

<strong>Visual</strong> <strong><strong>An</strong>alytics</strong><br />

<strong>An</strong> <strong>Interaction</strong> <strong>of</strong> <strong>Sight</strong> <strong>and</strong> <strong>Thought</strong><br />

Stephen Few<br />

Perceptual Edge<br />

www.PerceptualEdge.com<br />

Copyright © 2012 Stephen Few


The world is not yet prepared for a true information age. Most people lack the basic<br />

skills <strong>and</strong> the good tools that are required to use information in effective ways. I recently<br />

discovered an organization named the 21st Century Fluency Project that is trying to<br />

help people develop the skills that are needed to survive <strong>and</strong> thrive in a true information<br />

age.


Hurray data!<br />

Hurray technology!<br />

We celebrate technology. We live for the latest gadgets.


But the abundance <strong>of</strong> data has left people overwhelmed <strong>and</strong> frustrated. You can spend<br />

millions <strong>of</strong> dollars to build the most robust <strong>and</strong> pristine data warehouse in the world,<br />

running on the most powerful hardware, <strong>and</strong> accessed by state-<strong>of</strong>-the-art business<br />

intelligence s<strong>of</strong>tware, but if the people who work with the data don’t know how to make<br />

sense <strong>of</strong> it <strong>and</strong> then clearly present what they find to decision makers, your investment<br />

is wasted.


Upon this gifted age, in its dark hour,<br />

rains from the sky a meteoric shower<br />

<strong>of</strong> facts…they lie, unquestioned, uncombined.<br />

Wisdom enough to leach us <strong>of</strong> our ill<br />

is daily spun; but there exists no loom<br />

to weave it into a fabric.<br />

“Huntsman, What Quarry?”, 1939, Edna St. Vincent Millay<br />

The poet Edna St. Vincent Millay anticipated our situation today way back in 1939 when<br />

she wrote these words. Our problem is not a lack <strong>of</strong> data, but rather an inability to make<br />

sense <strong>of</strong> <strong>and</strong> use what we have.


Information<br />

Data<br />

Sensemaking<br />

Data – all those facts that we’ve accumulated – do not qualify as information until they<br />

inform. To use data effectively, we make first make sense <strong>of</strong> them. Today, sensemaking<br />

is our primary task.


We must tap into that steady stream <strong>of</strong> information that is coming at us in a way that<br />

allows us to make sense <strong>of</strong> it <strong>and</strong> make use <strong>of</strong> it. Endless columns <strong>and</strong> rows <strong>of</strong><br />

numbers don’t reveal the stories that live in our data. Data needs to be expressed in a<br />

way that our eyes can easily see <strong>and</strong> our brains can easily underst<strong>and</strong>.


Only then will the important stories that live in our data come to light.


Data analysts<br />

<strong>JMP</strong><br />

are now sexy.<br />

Data analysts are now beginning to get the respect they deserve. No longer are they<br />

kept in some back room huddled over your computers. The role <strong>of</strong> data analyst is<br />

becoming the sexiest role in the job market. Data analysts will become the priests <strong>of</strong> the<br />

modern world – those who talk with God <strong>and</strong> divine the mysteries <strong>of</strong> data.


Hal Varian<br />

U.C. Berkeley <strong>and</strong> Google<br />

I keep saying the sexy job in the next ten years will be statisticians. People think I’m joking, but<br />

who would’ve guessed that computer engineers would’ve been the sexy job <strong>of</strong> the 1990s? The<br />

ability to take data—to be able to underst<strong>and</strong> it, to process it, to extract value from it, to<br />

visualize it, to communicate it—that’s going to be a hugely important skill in the next decades,<br />

not only at the pr<strong>of</strong>essional level but even at the educational level for elementary school kids,<br />

for high school kids, for college kids. Because now we really do have essentially free <strong>and</strong><br />

ubiquitous data. So the complimentary scarce factor is the ability to underst<strong>and</strong> that data <strong>and</strong><br />

extract value from it.<br />

I think statisticians are part <strong>of</strong> it, but it’s just a part. You also want to be able to visualize the<br />

data, communicate the data, <strong>and</strong> utilize it effectively. But I do think those skills—<strong>of</strong> being able<br />

to access, underst<strong>and</strong>, <strong>and</strong> communicate the insights you get from data analysis—are going to<br />

be extremely important. Managers need to be able to access <strong>and</strong> underst<strong>and</strong> the data<br />

themselves.<br />

You always have this problem <strong>of</strong> being surrounded by “yes men” <strong>and</strong> people who want to<br />

predigest everything for you. In the old organization, you had to have this whole army <strong>of</strong> people<br />

digesting information to be able to feed it to the decision maker at the top. But that’s not the<br />

way it works anymore: the information can be available across the ranks, to everyone in the<br />

organization. <strong>An</strong>d what you need to ensure is that people have access to the data they need to<br />

make their day-to-day decisions. <strong>An</strong>d this can be done much more easily than it could be done<br />

in the past. <strong>An</strong>d it really empowers the knowledge workers to work more effectively.<br />

“Hal Varian on how the Web challenges managers”, www.McKinseyQuarterly.com


To do this we need<br />

the right skills<br />

augmented by<br />

the right tools<br />

To explore <strong>and</strong> make sense <strong>of</strong> data, we need the right skills <strong>and</strong> good tools.


Skills<br />

Tools<br />

Sensemaking requires a close collaboration between our brains <strong>and</strong> computer-based<br />

tools that are designed to augment our brains.


Two <strong>of</strong> the right skills<br />

• Statistical thinking<br />

• <strong>Visual</strong> thinking<br />

Two <strong>of</strong> the important skill sets that we need include statistical thinking – the ability to<br />

find the stories that live in numbers <strong>and</strong> make sense <strong>of</strong> them – <strong>and</strong> visual thinking – the<br />

ability to see meaningful patterns in number by representing <strong>and</strong> interacting with them<br />

visually.


Three modes <strong>of</strong> sensemaking<br />

Verbal<br />

A B C<br />

Complementary<br />

Numerical<br />

1 2 3<br />

<strong>Visual</strong><br />

We humans developed three powerful modes <strong>of</strong> sensemaking during the course <strong>of</strong> our<br />

evolution: verbal, numerical, <strong>and</strong> visual. These modes developed separately to some<br />

degree, but evolved to work in complementary <strong>and</strong> collaborative ways.


What is visual thinking?<br />

Temple Gr<strong>and</strong>in<br />

We may consider “visual thinking” as that form <strong>of</strong> thought in which images are generated or<br />

recalled in the mind <strong>and</strong> are manipulated, overlaid, translated, associated with other similar<br />

forms (as with a metaphor), rotated, increased or reduced in size, distorted, or otherwise<br />

transformed gradually from one familiar image into another…“Pattern recognition” has been a<br />

major concern in artificial intelligence research <strong>and</strong> has proved to be quite complex. It has<br />

turned out to be one <strong>of</strong> those things which is very difficult for computers but easy for most<br />

human beings…For our purposes, we will consider pattern recognition to be the ability to<br />

discern similarities <strong>of</strong> form among two or more things, whether these be textile designs, facial<br />

resemblance <strong>of</strong> family members, graphs <strong>of</strong> repeating biological growth cycles, or similarities<br />

between historical epochs…From pattern recognition it is but a short step to “problem solving,”<br />

since, at least for its more common aspects, problem solving generally involves the recognition<br />

<strong>of</strong> a developing or repeating pattern <strong>and</strong> the carrying out <strong>of</strong> actions to obtain desired results<br />

based on one’s underst<strong>and</strong>ing <strong>of</strong> this pattern. (In the Mind’s Eye, Thomas G. West, p. 36)<br />

Temple Gr<strong>and</strong>in is an amazing visual thinker. Perhaps because she’s autistic, the visual<br />

thinking parts <strong>of</strong> her brain have developed more fully than in most <strong>of</strong> us. A recent HBO movie<br />

about her starring Claire Danes does a wonderful job <strong>of</strong> explaining her world <strong>and</strong> the way she’s<br />

put her unique resources to work to make the world better, especially the life <strong>of</strong> cattle. One <strong>of</strong><br />

the most revealing lines in that trailer was when Gr<strong>and</strong>in said, “I know my system will work<br />

because I’ve been through it a thous<strong>and</strong> times in my head.” The most gifted visual thinkers can<br />

see complex images <strong>and</strong> manipulate them in their mind’s eye.<br />

Most <strong>of</strong> us can’t hold complex images in our heads <strong>and</strong> manipulate them like Temple Gr<strong>and</strong>in. We<br />

can, however, use complex images that are in front <strong>of</strong> our eyes on computers <strong>and</strong> manipulate<br />

them.


We need greater balance.<br />

Modes <strong>of</strong> reasoning through history<br />

Numerical<br />

<strong>Visual</strong><br />

During our hunter-gatherer days, before symbolic reasoning developed, which led to<br />

language <strong>and</strong> mathematics, we were primarily visual, with just a crude sense <strong>of</strong> number.


We need greater balance.<br />

Modes <strong>of</strong> reasoning through history<br />

Verbal<br />

Numerical<br />

<strong>Visual</strong><br />

After language developed, however, especially after the development <strong>of</strong> writing, those in<br />

power favored verbal skills <strong>and</strong> numerical skills to a lesser degree, but visual skills lost<br />

favor as the low-brow reasoning <strong>of</strong> the masses.


We need greater balance.<br />

Modes <strong>of</strong> reasoning through history<br />

Verbal<br />

Numerical<br />

<strong>Visual</strong><br />

With the problems that we face today, this lack <strong>of</strong> balance remains, but we need to<br />

correct it.


Leonardo da Vinci<br />

For some four hundred or five hundred years we have had our schools teaching<br />

basically the skills <strong>of</strong> the medieval clerk—reading, writing, counting, <strong>and</strong><br />

memorizing texts. Now it seems that we might be on the verge <strong>of</strong> a new era, when<br />

we will wish to, <strong>and</strong> be required to, emphasize a very different set <strong>of</strong> skills—those<br />

<strong>of</strong> a Renaissance man such as Leonardo da Vinci. With such a change, traits that<br />

are considered desirable today might very well be obsolete <strong>and</strong> unwanted<br />

tomorrow. In place <strong>of</strong> the qualities desired in a well-trained clerk, we might,<br />

instead, find preferable a habit <strong>of</strong> innovation in many diverse fields, the<br />

perspective <strong>of</strong> the global generalist rather than the narrowly focused specialist,<br />

<strong>and</strong> an emphasis on visual content <strong>and</strong> analysis over parallel verbal modes.<br />

If we continue to turn out people who primarily have the skills (<strong>and</strong> outlook) <strong>of</strong> the<br />

clerk, however well trained, we may increasingly be turning out people who will,<br />

like the unskilled laborer <strong>of</strong> the last century, have less <strong>and</strong> less to sell in the<br />

marketplace. Sometime in the not too distant future machines will be the best<br />

clerks. It will be left to humans to maximize what is most valued among human<br />

capabilities <strong>and</strong> what machines cannot do—<strong>and</strong> increasingly these are likely to<br />

involve the insightful <strong>and</strong> integrative capacities associated with visual modes <strong>of</strong><br />

thought.<br />

(In the Mind’s Eye, Thomas G. West, 2009)


Many people who struggle with dyslexia (difficulty reading) are exceptional visual<br />

thinkings, <strong>of</strong>ten because they develop visual thinking skills to to a greater degree than<br />

the rest <strong>of</strong> us to make up for their inability to read. Thomas West does a wonderful job<br />

<strong>of</strong> describing this in his book In the Mind’s Eye.


Complex problems<br />

require visual<br />

thinking.<br />

Everyone agrees that we have a problem. Our technological culture is drowning in its own success.<br />

Masses <strong>of</strong> data <strong>and</strong> information are accumulating everywhere. Up to now, the basic strategy for dealing<br />

with these growing masses <strong>of</strong> information has been long, mind-numbing education <strong>and</strong> reckless,<br />

blinkered specialization. That this strategy has been effective in a great many respects, so far, there can<br />

be little debate. The problems we are discussing are a tribute to its ample <strong>and</strong> abundant success, so far.<br />

However, after long success, it is becoming increasingly clear that this strategy may be entering a<br />

phase <strong>of</strong> diminishing return. It has long been recognized that this strategy has always had built-in<br />

problems. The more one knows in one’s own, increasingly narrow area, the more one is ignorant in<br />

other areas, the more difficult is effective communication between unrelated areas, <strong>and</strong> the more<br />

unlikely it is that the larger whole will be properly perceived or understood. Like the student who reads<br />

too much small print, the specialist’s habitual near focus <strong>of</strong>ten promotes the myopic perspective that<br />

precludes the comprehension <strong>of</strong> larger, more important patterns. The distant view <strong>of</strong> the whole is blurred<br />

<strong>and</strong> unclear. If you focus only on a small group <strong>of</strong> stars at the edge <strong>of</strong> the Milky Way, you will not<br />

perceive the larger structure <strong>of</strong> the whole galaxy <strong>of</strong> which the group is one tiny part.<br />

The specialist strategy breeds its own limits. Pieces <strong>of</strong> the puzzle in separate areas remain far apart, or<br />

come together only after decades <strong>of</strong> specialist resistance, or success in one area leads to great<br />

problems in another. Material abundance produces waste-disposal problems; cars <strong>and</strong> aircraft produce<br />

wonderful mobility for many people, but also deplete resources, produce accidental fatalities, <strong>and</strong><br />

increase pollution; success in vaccination, hygiene, <strong>and</strong> health care lead to all the problems <strong>of</strong> great<br />

concentrations <strong>of</strong> human population.<br />

As the specialist strategy continues to be pursued, a sense <strong>of</strong> the whole is increasingly lost. Many know<br />

their areas; few see the whole. Many are expert; few are wise. But the visual thinkers, late bloomers,<br />

<strong>and</strong> creative dyslexics we have been dealing with have <strong>of</strong>ten been outsiders or reluctant participants in<br />

this specialist culture—especially those who are energetic, <strong>and</strong> globally minded, who seem always to be<br />

interested in everything, unable to settle down to a “serious” (that is, highly specialized) area <strong>of</strong> study.<br />

(In the Mind’s Eye, Thomas G. West, p. 298)


Only visual thinking can connect the dots.<br />

Connecting the dots <strong>of</strong> complex problems requires visual thinking.


Tools extend our reach<br />

when they’re well designed<br />

<strong>and</strong> properly used.<br />

Technologies extend our reach, but only when it’s well designed <strong>and</strong> properly used. Bad<br />

tools or good tools used in the wrong ways can actually narrow our reach.


Tools<br />

When tools are bad or misused, our reliance on them can make us lazy, weak, <strong>and</strong><br />

stupid.


Can computers replace us?<br />

Our brains <strong>and</strong> computer processors work differently <strong>and</strong> therefore have different<br />

strengths. We’ve designed computers to do things that we don’t do very well, which<br />

makes perfect sense, because we use them to complement <strong>and</strong> extend our abilities.<br />

They don’t, however, replace our abilities. Even though IBM’s supercomputer named<br />

Watson can beat the best human champions <strong>of</strong> Jeopardy, we can all do with relative<br />

ease in a second what Watson can’t figure out with endless effort: recognize that that<br />

this is the word “captcha.”


The right tools support<br />

• Statistical calculations<br />

• Interactive visualizations<br />

Good data sensemaking tools support statistical calculations, using the strength <strong>of</strong><br />

computers to perform those calculations quickly <strong>and</strong> accurately, <strong>and</strong> interactive<br />

visualizations, making it possible to find <strong>and</strong> underst<strong>and</strong> the meaningful patterns in our<br />

data.


<strong>Visual</strong>ization empowers statistics.<br />

"The greatest value <strong>of</strong> a picture is when it forces<br />

us to notice what we never expected to see."<br />

Back in the 1970s, Princeton statistician John Tukey recognized the important<br />

collaboration <strong>of</strong> statistical calculations <strong>and</strong> visual exploration. Even sophisticated<br />

statisticians <strong>of</strong>ten need to use their eyes to find <strong>and</strong> underst<strong>and</strong>s the patterns that<br />

reside in data.


Why the eyes?<br />

I became fascinated by data visualization — the use <strong>of</strong> our eyes in close collaboration<br />

with our brains to make sense <strong>of</strong> information — not because I’m a visually-oriented<br />

person (I’m actually much more verbal than visual) but because it <strong>of</strong>fers solutions to real<br />

analytical problems that concern us, solutions that st<strong>and</strong> apart in their ability to<br />

enlighten.


“I see!”<br />

There is an intimate connection between seeing <strong>and</strong> thinking. For this reason, when we<br />

talking about underst<strong>and</strong>ing, we tend to use terms that are related to vision. When we<br />

suddenly come to underst<strong>and</strong> something we say “I see!” It’s hard to think <strong>of</strong> terms that<br />

describe underst<strong>and</strong>ing that don’t refer to sight.


Vision is our dominant sense.<br />

50%<br />

<strong>of</strong> the brain’s resources<br />

If you think the brain has to devote to vision a lot <strong>of</strong> its precious thinking<br />

resources, you are right on the money. It takes up about half <strong>of</strong> everything you<br />

do, in fact. (Brain Rules, John Medina, p. 231)<br />

When it comes to memory, researchers have known for more than 100 years that<br />

pictures <strong>and</strong> text follow very different rules. Put simply, the more visual the input<br />

becomes, the more likely it is to be recognized — <strong>and</strong> recalled. The<br />

phenomenon is so pervasive, it has been given its own name: the pictorial<br />

superiority effect, <strong>of</strong> PSE. (Ibid, p. 233)


Vision is our dominant sense.<br />

30%<br />

70%<br />

Human perception is amazing. I cherish all five <strong>of</strong> the senses that connect us to the world, that allow<br />

to experience beauty <strong>and</strong> an inexhaustible <strong>and</strong> diverse wealth <strong>of</strong> sensation. But <strong>of</strong> all the senses, one<br />

st<strong>and</strong>s out dramatically as our primary <strong>and</strong> most powerful channel <strong>of</strong> input from the world around us,<br />

that is vision. Approximately 70% <strong>of</strong> the body’s sense receptors reside in the eye.<br />

Perhaps the world’s top expert in visual perception <strong>and</strong> how its power can be harnessed for the effec<br />

display <strong>of</strong> information is Colin Ware, who has convincingly described the importance <strong>of</strong> data<br />

visualization. He asks:<br />

Why should we be interested in visualization? Because the human visual system is a pattern seeke<br />

enormous power <strong>and</strong> subtlety. The eye <strong>and</strong> the visual cortex <strong>of</strong> the brain form a massively parallel<br />

processor that provides the highest-b<strong>and</strong>width channel into human cognitive centers. At higher lev<br />

<strong>of</strong> processing, perception <strong>and</strong> cognition are closely interrelated, which is the reason why the words<br />

‘underst<strong>and</strong>ing’ <strong>and</strong> ‘seeing’ are synonymous. However, the visual system has its own rules. We ca<br />

easily see patterns presented in certain ways, but if they are presented in other ways, they become<br />

invisible…The more general point is that when data is presented in certain ways, the patterns can<br />

readily perceived. If we can underst<strong>and</strong> how perception works, our knowledge can be translated in<br />

rules for displaying information. Following perception-based rules, we can present our data in such<br />

way that the important <strong>and</strong> informative patterns st<strong>and</strong> out. If we disobey the rules, our data will be<br />

incomprehensible or misleading.<br />

(Information <strong>Visual</strong>ization: Perception for Design, Second Edition, Colin Ware, Morgan Kaufmann<br />

Publishers, 2004, page xxi)<br />

Perhaps the best known expert in data visualization, Edward Tufte, says: “Clear <strong>and</strong> precise seeing<br />

becomes as one with clear <strong>and</strong> precise thinking.” (<strong>Visual</strong> Explanations, Edward R. Tufte, Graphics Pr<br />

Cheshire, CT.1997 page 53)


<strong>Visual</strong>ization weaves numbers into pictures.<br />

Data visualization is the loom that will weave the data that we collect into the fabric <strong>of</strong><br />

underst<strong>and</strong>ing. Pictures <strong>of</strong> data can make visible the meanings that might forever<br />

otherwise remain hidden.


Trends, patterns, <strong>and</strong> exceptions come to light.<br />

The presentation <strong>of</strong> data as text, such as you see in this table, is perfect when you need<br />

precise values or when the purpose is to look up or compare individual values, but not<br />

when you wish to see patterns, trends, <strong>and</strong> exceptions, or to make comparisons. When<br />

this is your goal, visualizations work best.<br />

When data is presented visually, it is given visible form, <strong>and</strong> from this we can easily<br />

glean insights that would take a long time to piece together from the same data<br />

presented textually, if ever. This graph <strong>of</strong> the same data that appears in the table makes<br />

brings to light several <strong>of</strong> the stories contained in the data that weren’t obvious before,<br />

<strong>and</strong> it did so instantly.<br />

When] we visualize the data effectively <strong>and</strong> suddenly, there is what Joseph Berkson<br />

called ‘interocular traumatic impact’: a conclusion that hits us between the eyes.<br />

(<strong>Visual</strong>izing Data, William S. Clevel<strong>and</strong>, Hobart Press, 1993, page 12)<br />

Modern data graphics can do much more than simply substitute for small statistical<br />

tables. At their best, graphics are instruments for reasoning about quantitative<br />

information. Often the most effective way to describe, explore, <strong>and</strong> summarize a set<br />

<strong>of</strong> numbers – even a very large set – is to look at pictures <strong>of</strong> those numbers.<br />

Furthermore, <strong>of</strong> all methods for analyzing <strong>and</strong> communicating statistical information,<br />

well-designed data graphics are usually the simplest <strong>and</strong> at the same time the most<br />

powerful.<br />

(The <strong>Visual</strong> Display <strong>of</strong> Quantitative Information, Edward R. Tufte, Graphics Press:<br />

Cheshire, CT 1983, Introduction)


A picture David is worth visits a thous<strong>and</strong> America. words.<br />

As we all know, the right picture can sometimes tell a story in a way that no amount <strong>of</strong><br />

words could ever match. To take advantage <strong>of</strong> visualization’s great potential, we first<br />

must know when pictures rather than words are needed.


This visualization fails to do this. This series <strong>of</strong> circles within circles––blue for the market<br />

values <strong>of</strong> banks in quarter 2 <strong>of</strong> 2007, before the recent financial meltdown, <strong>and</strong> green for<br />

declined values as <strong>of</strong> January <strong>of</strong> 2009––was published by Bloomberg. You would never<br />

guess its purpose, however, which was to show that J. P. Morgan’s decline in market<br />

value was less severe than all other major banks except one: Sant<strong>and</strong>er.<br />

This picture <strong>of</strong> the data doesn’t tell the story clearly, simply, or accurately. The<br />

comparative sizes <strong>of</strong> the circles are far from the comparative market values. Even if the<br />

sizes <strong>of</strong> the circles were accurate, we would still struggle with this chart because visual<br />

perception isn’t well-tuned to h<strong>and</strong>le size comparisons, but it is tuned to h<strong>and</strong>le length<br />

comparisons,...


...such as the lengths <strong>of</strong> these bars in my redesign <strong>of</strong> the chart. We can now easily see<br />

that J. P. Morgan lost roughly half <strong>of</strong> its market value during this period, but the fact that<br />

its losses were less severe than all by one bank––Sant<strong>and</strong>er––still isn’t obvious. The<br />

right addition to the picture, however, such as this one in the bottom half that displays<br />

the losses directly, can make this part <strong>of</strong> the story clear as well.


The right balance<br />

38<br />

Information visualization makes possible an ideal balance between unconscious<br />

perceptual <strong>and</strong> conscious cognitive processes. With the proper tools, we can shift much<br />

<strong>of</strong> the analytical process from conscious processes in the brain to pre-attentive<br />

processes <strong>of</strong> visual perception, letting our eyes do what they do extremely well.


Confused?<br />

Unfortunately, so much <strong>of</strong> “data visualization” today gives it a bad name <strong>and</strong> causes<br />

confusion about what it is, how it works, <strong>and</strong> what can be accomplished when it is<br />

properly done.


Many visualization tools<br />

feature decoration,<br />

not substance.<br />

Rather than building useful tools, most s<strong>of</strong>tware vendors are competing to out dazzle<br />

one another with silly visual effects that treat data visualization like it’s a video game.


Pie!!!<br />

Most so-called data analysis tools – especially data visualization tools – so far show<br />

little underst<strong>and</strong>ing <strong>of</strong> the people who use them <strong>and</strong> little respect for their intelligence.<br />

They treat people as if they’re dumb, <strong>and</strong> by doing so encourage them to become<br />

dumb, assuming that they wish merely to be entertained by cute pictures <strong>and</strong> pretty<br />

colors.


Bling your graph.<br />

Many s<strong>of</strong>tware vendors promote the notion that data visualization is not about<br />

underst<strong>and</strong>ing <strong>and</strong> communication, but about “bling.”


Flashy nonsense<br />

Dashboards like this have become a popular form <strong>of</strong> display in the last few years, but<br />

while they give the appearance <strong>of</strong> data density, most <strong>of</strong> them present little <strong>and</strong> the what<br />

they display is presented poorly. But they look cool, <strong>and</strong> for many that’s all that matters,<br />

but it isn’t enough.


3-D graphs are rarely useful.<br />

One <strong>of</strong> the common themes <strong>of</strong> most s<strong>of</strong>tware vendors is that 2-D displays are boring;<br />

never as good as 3D.


Adding a third dimension <strong>of</strong> depth to the bars on the right without adding a<br />

corresponding third variable, however, is not only meaningless, it makes it more difficult<br />

to decode the data.


Even when there’s a 3 rd variable.<br />

Even when a third dimension <strong>of</strong> meaning exists, such as the regions along the Z-axis <strong>of</strong><br />

these graphs, problems with 3-D graphs persist.


The graph on the left illustrate the common problem <strong>of</strong> occlusion--the fact the some<br />

objects, such as these bars, become hidden behind others. In a graph, we should be<br />

able to see all the data at once to make the necessary connections <strong>and</strong> comparisons.<br />

Looking at the graph on the right, I bet you can’t even determine which line represents<br />

the South region. If we can’t figure out which line is which region, what good is the<br />

graph?


We don’t want to confuse people with graphical puzzles.<br />

This chart <strong>of</strong> Escher’s changing popularity through time was created by B. Brucker. I<br />

found it at www.GraphJam.com.


Data visualization builds a bridge<br />

from data to knowledge.<br />

Data <strong>Visual</strong>ization<br />

Knowledge<br />

Data<br />

Built on an underst<strong>and</strong>ing<br />

<strong>of</strong><br />

visual perception<br />

<strong>and</strong><br />

cognition<br />

Data visualization can help us build a bridge from data to knowledge, but only if the<br />

tools that we use were built on an underst<strong>and</strong>ing <strong>of</strong> visual perception (how we see) <strong>and</strong><br />

cognition (how we think).


Good visualizations <strong>and</strong> good visualization tools are carefully designed to take<br />

advantage <strong>of</strong> human perceptual <strong>and</strong> cognitive strengths <strong>and</strong> to augment human abilities<br />

that are weak. If the goal is to count the number <strong>of</strong> circles, this visualization isn’t well<br />

designed. It is difficult to remember what you have <strong>and</strong> have not counted.<br />

50


This visualization, which shows the same number <strong>of</strong> circles, however, is well designed<br />

for the counting task. Because the circles are grouped into small sets <strong>of</strong> five each, it is<br />

easy to remember which groups have <strong>and</strong> have not been counted, easy to quickly count<br />

the number <strong>of</strong> circles in each group, <strong>and</strong> easy to discover with little effort that each <strong>of</strong><br />

the five groups contains the same number <strong>of</strong> circles (i.e., five), resulting in a total count<br />

<strong>of</strong> 25 circles.<br />

51


This arrangement is even better yet.<br />

52


<strong>Visual</strong> perception works according to<br />

its own rules.<br />

Many <strong>of</strong> the ways that visual perception work are not intuitive. Looking at these two sets<br />

<strong>of</strong> objects, we naturally see those on the left as convex <strong>and</strong> on those the right as<br />

concave.


<strong>Visual</strong> perception works according to<br />

its own rules.<br />

The effect has now been reversed: we see the objects on the left as concave <strong>and</strong> those<br />

on the right as convex. All I did, however, was turn each sets <strong>of</strong> objects upside down—I<br />

didn’t switch them. The reason that we now see those on the left as concave is<br />

because, through eons <strong>of</strong> evolution, visual perception learned to assume that light was<br />

shining from above, which causes us to see the objects on the left as concave, because<br />

the shadows are on the top, <strong>and</strong> those on the right as convex, because the shadows<br />

are on the bottom.


<strong>Visual</strong> perception is not just camera work.<br />

Unlike a camera, visual perception does not record absolute values <strong>of</strong> the things that<br />

we see, but differences between them.


<strong>Visual</strong> perception is not just camera work.<br />

Despite how differently they look in the original image, squares A <strong>and</strong> B are exactly the<br />

same color. What we see is not a simple recording <strong>of</strong> what is actually out there. Seeing<br />

is an active process that involves interpretations by our brains <strong>of</strong> data that is sensed by<br />

our eyes in an effort to make sense <strong>of</strong> it in context. The presence <strong>of</strong> the cylinder <strong>and</strong> its<br />

shadow in the image <strong>of</strong> the checkerboard triggers an adjustment in our minds to<br />

perceive the square labeled B as lighter than it actually is. The illusion is also created by<br />

the fact that the sensors in our eyes do not register actual color but rather the difference<br />

in color between something <strong>and</strong> what’s nearby. The contrast between square A <strong>and</strong> the<br />

light squares that surround it <strong>and</strong> square B <strong>and</strong> the dark squares that surround it cause<br />

us to perceive squares A <strong>and</strong> B quite differently, even though they are actually the same<br />

color, as you can clearly see above after all <strong>of</strong> the surrounding context has been<br />

removed.<br />

The ability to use graphs effectively requires a basic underst<strong>and</strong>ing <strong>of</strong> how we<br />

unconsciously interpret what we see.


Context affects what we see. This image illustrates the surprising effect that a simple<br />

change in the lightness <strong>of</strong> the background alone has on our perception <strong>of</strong> color. The<br />

large rectangle displays a simple color gradient <strong>of</strong> a gray-scale from fully light to fully<br />

dark. The small rectangle is the same exact color everywhere it appears, but it doesn’t<br />

look that way because our brains perceive visual differences rather than absolute<br />

values, in this case between the color <strong>of</strong> the small rectangle <strong>and</strong> the color that<br />

immediately surrounds it.<br />

Among other things, underst<strong>and</strong>ing this should tell us that using a color gradient as the<br />

background <strong>of</strong> a graph should be avoided.


Two essential activities<br />

• Multi-faceted views<br />

• Comparisons<br />

Data sensemaking revolves primarily around two essential activities: viewing data from<br />

multiple perspectives <strong>and</strong> comparing things.<br />

No one view <strong>of</strong> the data will suffice. To underst<strong>and</strong> your data thoroughly, you must view<br />

it from every possible perspective. With each new view, more than anything else, you<br />

make compare things.


Multiple views <strong>of</strong> change through time<br />

Have you ever noticed that time-series data can look quite different if you change the<br />

interval <strong>of</strong> time that you’ve aggregated it to in a graph? For example, if you are<br />

examining a year’s worth <strong>of</strong> visits to your website with one value per quarter<br />

(aggregated to quarterly intervals), <strong>and</strong> then switch to a monthly aggregation <strong>of</strong> the<br />

same data, <strong>and</strong> then switch to a daily aggregation, the patterns <strong>of</strong> change might look<br />

quite different.<br />

All three versions <strong>of</strong> the graph are useful <strong>and</strong> correct, but the daily version reveals<br />

details that aren’t visible when viewing the same data by month or quarter. On the other<br />

h<strong>and</strong>, the overall trend is difficult to discern from the daily view. One view isn’t better<br />

than the other in general, but one is definitely better than the other when you have a<br />

specific analytical purpose in mind. For this reason, don’t restrict your view <strong>of</strong> timeseries<br />

data to a single interval <strong>of</strong> time, especially when looking for anything that seems<br />

interesting. Switch the level <strong>of</strong> aggregation from year to quarter, quarter to month,<br />

month to week, week to day, <strong>and</strong> so on—back <strong>and</strong> forth—to tease out the insights that<br />

can only arise when this is done.<br />

S<strong>of</strong>tware products that allow you to quickly <strong>and</strong> easily switch between various intervals<br />

<strong>of</strong> time while viewing data graphically are what’s needed to encourage this practice. The<br />

ability to switch time intervals with a mouse click or two, or by using something as<br />

simple as an interval slider control, will set you free to explore without distraction from<br />

onerous operations.


Data analysis requires rapid interaction.<br />

?Think<br />

See<br />

Modify<br />

Direct dynamic interaction with the properly visualized data allows us to see discover<br />

meaningful patterns, trends, <strong>and</strong> exceptions in the display <strong>and</strong> to interact with it directly<br />

to filter out what we don’t need, drill into details, combine multiple variables for<br />

comparison, etc., in ways that promote a smooth flow between seeing something,<br />

thinking about it, <strong>and</strong> manipulating it, with no distracting lags in between. This is what I<br />

call “visual analysis at the speed <strong>of</strong> thought.”<br />

Great analysts, like great scientists, great artists, great people <strong>of</strong> all sorts, accept the<br />

call to serve as a voice for data. Important stories can be found in data. We can learn to<br />

discern the meanings that live in information <strong>and</strong> to unravel the stories that are woven<br />

through it. What I do isn’t just work; it is my mission. I work hard to learn the world’s<br />

stories <strong>and</strong> to tell them truthfully. I believe that there is no higher calling. I fight against<br />

those who try to hide or alter the truth. I believe that the truth really can set us free.


“Compared to what?”<br />

Edward Tufte once said that the essential question <strong>of</strong> data sensemaking is “Compared<br />

to what?” This involves comparing the magnitudes <strong>of</strong> values, but not just one to another.<br />

We must compare many values. To do so, we must see how they relate to one another<br />

to form patterns. We not only compare the magnitudes <strong>of</strong> values; we also compare<br />

patterns formed by sets <strong>of</strong> values. We look for how they are similar <strong>and</strong> we look for how<br />

they are different, especially differences that appear to be dramatic departures from the<br />

norm. When we spot these visual characteristics in the data, we then interact with the<br />

data to find out why these things have happened.


Key interactions <strong>of</strong> visual analysis<br />

• Sorting<br />

• Filtering<br />

• Adding/removing variables<br />

• Highlighting<br />

• Aggregating/Disaggregating<br />

• Grouping<br />

• Zooming/Panning<br />

• Re-visualizing<br />

• Re-expressing<br />

• Re-scaling<br />

The process <strong>of</strong> visual data analysis involves several common interactions with data to uncover what’s meaningful. Here<br />

are some <strong>of</strong> the primary interactions:<br />

• Sorting. The act <strong>of</strong> sorting data, especially by the magnitude <strong>of</strong> the values from high to low or low to high, features the<br />

ranking relationship between those values <strong>and</strong> makes it easier to compare the magnitude <strong>of</strong> value to the next.<br />

• Adding/removing variables. You might need to view different variable at different times during the analysis process,<br />

so it is common to add or remove field <strong>of</strong> data from view as necessary<br />

• Filtering. When you want to focus on a subset <strong>of</strong> data, nothing makes it easier to do so than filtering—the removal<br />

from view <strong>of</strong> everything your not interested in at the moment.<br />

• Highlighting. Sometimes you want to focus on a subset <strong>of</strong> information, but do so in a way that allows you to maintain<br />

a sense <strong>of</strong> how that subset relates to the whole. Rather than filtering out the data that falls outside your range <strong>of</strong> focus,<br />

you can simply reduce its visual salience or increase the visual salience <strong>of</strong> the data you wish to focus on. This allows<br />

you to focus on the subset with less distraction from the whole in a way that allow you to remain aware <strong>of</strong> the whole.<br />

This is one way <strong>of</strong> achieving what’s called a focus+context view.<br />

• Aggregating/Disaggregating. <strong>An</strong>alysis <strong>of</strong>ten requires that you examine data a different levels <strong>of</strong> detail. Aggregation<br />

involves viewing data at a higher level <strong>of</strong> summarization. Disaggregation involves viewing data at a lower level <strong>of</strong><br />

detail.<br />

• Grouping. Sometimes it is useful to combine members <strong>of</strong> a variable together, treating them as a single member <strong>of</strong> the<br />

variable. This may take the form <strong>of</strong> combining some members <strong>and</strong> leaving others as they are, or <strong>of</strong> creating an entirely<br />

new variable that combines all members <strong>of</strong> an existing variable into a groups to form members <strong>of</strong> a higher level<br />

variable.<br />

• Zooming/Panning. When a data visualization contains so much that it is difficult to clearly see all the data at once, it is<br />

useful to zoom in on that portion that you want to see more clearly. Panning involves moving around (for example, up,<br />

down, right, or left) in a zoomed view to focus on a different part <strong>of</strong> the larger visualization.<br />

• Re-visualizing. No one visual representation <strong>of</strong> data can show you everything there is to see, so visual analysis<br />

involves shifting from one type <strong>of</strong> visualization to another to explore data from various perspectives.<br />

• Re-expressing. Sometimes it is useful to express a quantitative variable as a different unit <strong>of</strong> measure, such as<br />

expressing dollars as percentages.<br />

• Re-scaling. No single quantitative scale on a graph can serve every analytical need. Rescaling involves changing the<br />

range <strong>of</strong> the quantitative scale to make it easier to see particular patterns <strong>and</strong> sometimes even changing the nature <strong>of</strong><br />

the scale, such as from a normal scale to a logarithmic scale.


Dynamic data filters support analytical flow.<br />

Direct dynamic interaction with the data allows you to manipulate the data easily <strong>and</strong><br />

immediately (such as by filtering it), without interrupting your stream <strong>of</strong> thought.


Shneiderman’s Mantra<br />

“Overview first,<br />

zoom <strong>and</strong> filter,<br />

then details-on-dem<strong>and</strong>.”<br />

When new recruits by intelligence organizations are trained in spy craft, they are taught<br />

a method <strong>of</strong> observation that begins by getting an overview <strong>of</strong> the scene around them<br />

while being sensitive to things that appear abnormal, not quite right, which they should<br />

then focus in on for close observation <strong>and</strong> analysis.<br />

A visual information-seeking mantra for designers: ‘Overview first, zoom <strong>and</strong> filter,<br />

then details-on-dem<strong>and</strong>.’<br />

(Readings in Information <strong>Visual</strong>ization: Using Vision to Think, Stuart K. Card, Jock D.<br />

Mackinlay, <strong>and</strong> Ben Shneiderman, Academic Press, San Diego, California, 1999,<br />

page 625)<br />

Having an overview is very important. It reduces search, allows the detection <strong>of</strong><br />

overall patterns, <strong>and</strong> aids the user in choosing the next move. A general heuristic <strong>of</strong><br />

visualization design, therefore, is to start with an overview. But it is also necessary for<br />

the user to access details rapidly. One solution is overview + detail: to provide<br />

multiple views, an overview for orientation, <strong>and</strong> a detailed view for further work.<br />

(Ibid., page 285)<br />

Users <strong>of</strong>ten try to make a ‘good’ choice by deciding first what they do not want, i.e.<br />

they first try to reduce the data set to a smaller, more manageable size. After some<br />

iterations, it is easier to make the final selection(s) from the reduced data set. This<br />

iterative refinement or progressive querying <strong>of</strong> data sets is sometimes known as<br />

hierarchical decision-making.<br />

(Ibid., page 295)


Overview first, zoom <strong>and</strong> filter, then details-on-dem<strong>and</strong><br />

Shneiderman’s technique begins with an overview <strong>of</strong> the data — the big picture. Let<br />

your eyes search for particular points <strong>of</strong> interest in the whole.


Overview first, zoom <strong>and</strong> filter, then details-on-dem<strong>and</strong><br />

When you see a particular point <strong>of</strong> interest, then zoom in on it.


Overview first, zoom <strong>and</strong> filter, then details-on-dem<strong>and</strong><br />

Once you’ve zoomed in on it, you can examine it more closely <strong>and</strong> in greater detail.


Overview first, zoom <strong>and</strong> filter, then details-on-dem<strong>and</strong><br />

Often you must remove data that is extraneous to your investigation to better focus on<br />

the relevant data.


Overview first, zoom <strong>and</strong> filter, then details-on-dem<strong>and</strong><br />

Filtering out extraneous data removes distractions from the data under investigation.


Overview first, zoom <strong>and</strong> filter, then details-on-dem<strong>and</strong><br />

Jan 21, 2011<br />

$25,193<br />

<strong>Visual</strong> data analysis relies mostly on the shape <strong>of</strong> the data to provide needed insights,<br />

but there are still times when you need to see the details behind the shape <strong>of</strong> the data.<br />

Having a means to easily see the details when you need them, without having them in<br />

the way when you don’t works best.


Data visualization must<br />

focus attention<br />

<strong>and</strong><br />

augment memory.<br />

To achieve this goal, tools must be designed to focus our attention on what’s meaningful<br />

in data <strong>and</strong> to help us work around the limitations <strong>of</strong> memory.


Attention is limited.<br />

Ever saw a magic show <strong>and</strong> wondered just how the magician took your watch<br />

without you even noticing? Ever wonder why is it that you can search for a set <strong>of</strong><br />

misplaced keys for a long time, only to later find them sitting in the exact place<br />

where you were looking? Research has shown that we don’t always see everything<br />

we’re looking at, <strong>and</strong> that attention plays a big part in what consciously registers to<br />

us. The effect where we’re blind to things we don’t attend to is known as<br />

‘Inattentional Blindness’.<br />

(Source: This demonstration <strong>of</strong> inattentional blindness <strong>and</strong> the explanation above<br />

was prepared under the direction <strong>of</strong> Ronald A. Rensink <strong>of</strong> the University <strong>of</strong> British<br />

Columbia. Several other examples <strong>of</strong> this visual phenomenon can be found at http://<br />

psyclab1.psych.ubc.ca/~viscoglab/.)


Memory is limited.<br />

What’s not stored in working memory is lost.<br />

In addition to underst<strong>and</strong>ing visual perception, visual analysis tools must also be rooted<br />

in an underst<strong>and</strong>ing <strong>of</strong> how people think. Only then can they recognize <strong>and</strong> support the<br />

cognitive operations that are necessary to make sense <strong>of</strong> information.<br />

Memory plays an important role in human cognition. Because memory suffers from<br />

certain limitations, visual analysis tools must be able to augment memory.<br />

The example above illustrates one <strong>of</strong> the limitations <strong>of</strong> working memory. We only<br />

remember that to which we attend. <strong>An</strong>y part <strong>of</strong> this image that never gets our attention<br />

will not be missed when we shift to another version <strong>of</strong> the image that lacks that<br />

particular part. If we don’t attend to it, we might notice the change from one version <strong>of</strong><br />

the image to the next, but only if the transition shift immediately from one to another,<br />

without even a split second <strong>of</strong> blank space between them.<br />

In addition to not remembering, we also don’t clearly see that on which we don’t focus.<br />

To see something clearly, we must focus on it, for only a small area <strong>of</strong> receptors on the<br />

retinas <strong>of</strong> our eyes are designed for high-resolution vision.<br />

(Source: This demonstration <strong>of</strong> change blindness was prepared by Ronald A. Rensink<br />

<strong>of</strong> the University <strong>of</strong> British Columbia. Several other examples <strong>of</strong> this visual phenomenon<br />

can be found at http://www.psych.ubc.ca/%7erensink/flicker/download/index.html.)


Capacity = 3 or 4<br />

chunks at a time<br />

Imagination<br />

The<br />

World<br />

Working<br />

Memory<br />

Long-term<br />

Memory<br />

When we think about things, trying to make sense <strong>of</strong> them, information is temporarily<br />

held in working memory. Only three or four chunks <strong>of</strong> visual information can be stored in<br />

working memory at any one time. Information that comes in through our eyes or that is<br />

retrieved from long-term memory in the moment <strong>of</strong> thought is extremely limited in<br />

capacity. When you release information from working memory, it can take one <strong>of</strong> two<br />

possible routes on its way out: 1) it can be stored permanently in long-term memory by<br />

means <strong>of</strong> a rehearsal process that we call memorization, or 2) it can simply be<br />

forgotten.


External aids can augment memory.<br />

To compare facts, you must hold them in working memory simultaneously. Because we<br />

can hold so little in working memory at any one time, however, to do analysis effectively,<br />

we must rely on external aids to memory. This is an ideal job for a computer. Even a<br />

piece <strong>of</strong> paper that you jot down notes on to keep track <strong>of</strong> information as you’re<br />

analyzing data is an external memory aid that is quite powerful despite being low-tech.<br />

A computer running properly designed s<strong>of</strong>tware, however, can augment our ability to<br />

think about information much better than pencil <strong>and</strong> paper.


Good analysis tools augment memory by<br />

exp<strong>and</strong>ing...<br />

• Quantity<br />

• Dimensions<br />

• Perspectives<br />

Good visual analysis s<strong>of</strong>tware can help us overcome the limitations <strong>of</strong> working memory<br />

in several ways. The goal is to enable as many meaningful comparisons as possible.<br />

Good tools can help us increase:<br />

• The amount <strong>of</strong> information that we can compare (that is, greater quantity)<br />

• The range <strong>of</strong> information that we can compare (that is, more dimensions)<br />

• The different views <strong>of</strong> the information that we can compare (that is, multiple<br />

perspectives)


Greater quantity through visual encoding<br />

1 chunk<br />

Traditional BI relies mostly on tabular data displays. Tables are wonderful if you need to<br />

look up individual values, compare a single value to another, or know values precisely,<br />

but they don’t display patterns or trends. This is a problem, because data analysis relies<br />

heavily on our ability to spot <strong>and</strong> make sense <strong>of</strong> patterns <strong>and</strong> trends in data. Take a look<br />

at the table <strong>and</strong> compare it to this line graph, which displays the same data. Relying on<br />

the table to discern the ups <strong>and</strong> downs <strong>of</strong> sales through time <strong>and</strong> to compare the<br />

patterns <strong>of</strong> change from region to region would yield very little <strong>of</strong> the information that is<br />

obvious in this graph. <strong>Visual</strong> representations give form to data, making pattern, trends,<br />

<strong>and</strong> exceptions easy to see.<br />

<strong>An</strong>other advantage <strong>of</strong> properly designed graphs over tables for analytical purposes is<br />

less obvious. If you needed to remember information in the table, you could hold only<br />

about four <strong>of</strong> the values (that is, four <strong>of</strong> the monthly sales numbers) in working memory<br />

at any one time. But by relying on the graph, 12 values are combined into each <strong>of</strong> the<br />

four lines to form a pattern that you could hold entirely as a single chunk in working<br />

memory. Simply by giving visual form to the values, you can hold much more<br />

information in memory.


More dimensions through multiple visual<br />

encodings <strong>and</strong> small multiples<br />

You can increase the number <strong>of</strong> dimensions (variables) that can be viewed<br />

simultaneously in two ways: 1) by using multiple visual encodings, <strong>and</strong> 2) by breaking<br />

the data into a series <strong>of</strong> small graphs that share common scales so they can be<br />

compared to one another. In the bubble plot on the right, we can view four variables at<br />

once: the rate <strong>of</strong> property crime based in position along the X-axis, the rate <strong>of</strong> violent<br />

crime based on position along the Y-axis, the population <strong>of</strong> states based on the size <strong>of</strong><br />

the bubbles, <strong>and</strong> the region in which each state resides based on color. The series <strong>of</strong><br />

small multiples on the left allows us to revenue, costs <strong>of</strong> goods sold, <strong>and</strong> gross pr<strong>of</strong>it<br />

along the Y-axis, quarters along the X-axis, <strong>and</strong> different product lines per graph. If we<br />

tried to display all <strong>of</strong> this information in a single graph, it would be far too cluttered <strong>and</strong><br />

difficult to perceive.


More perspectives through coordinated<br />

views<br />

There’s an old folktale that you’ve probably all heard about three blind men who encounte<br />

elephant one day for the first time <strong>and</strong> do their best to learn about it by touch alone. The<br />

experience <strong>of</strong> each is unique because each touches a different part <strong>of</strong> the elephant. This<br />

ancient story, originally from China, can teach us something important today about busine<br />

intelligence (BI). According to the original Chinese tale, the first man touches the elephant<br />

ear, the second his legs, <strong>and</strong> the third his tail. From this point, here’s how the story goes:<br />

The three blind men then went their way. Each one was secretly excited over the<br />

experience <strong>and</strong> had a lot to say, yet all walked rapidly without saying a word.<br />

"Let's sit down <strong>and</strong> have a discussion about this queer animal," the second blind man sa<br />

breaking the silence.<br />

"A very good idea. Very good." the other two agreed for they also had this in mind. With<br />

waiting for anyone to be properly seated, the second one blurted out, "This queer anima<br />

like our straw fans swinging back <strong>and</strong> forth to give us a breeze. However, it's not so big<br />

well made. The main portion is rather wispy."<br />

"No, no!" the first blind man shouted in disagreement. "This queer animal resembles two<br />

big trees without any branches."<br />

"You're both wrong." the third man replied. "This queer animal is similar to a snake; it's l<br />

<strong>and</strong> round, <strong>and</strong> very strong."<br />

How they argued! Each one insisted that he alone was correct. Of course, there was no<br />

conclusion for not one had thoroughly examined the whole elephant. How can anyone<br />

describe the whole until he has learned the total <strong>of</strong> the parts.<br />

If I retold this story today to teach a lesson about BI, I might call it “Three blind analysts an<br />

data warehouse.” Business people struggle every day to make sense <strong>of</strong> data, stumbling<br />

blindly, touching only small parts <strong>of</strong> the information, <strong>and</strong> coming away with a narrow <strong>and</strong><br />

fragmented underst<strong>and</strong>ing <strong>of</strong> what it means.


Coordinated views exp<strong>and</strong> what we can<br />

connect <strong>and</strong> compare.<br />

To underst<strong>and</strong> something, we <strong>of</strong>ten have to examine it from many angles <strong>and</strong> focus on<br />

many parts. Too much business data analysis involves looking only for one thing in<br />

particular. Is revenue going up? The answer is “yes” or “no” — end <strong>of</strong> story. Perhaps,<br />

however, you ought to look at revenues, expenses, pr<strong>of</strong>its, marketing campaigns,<br />

seasonality, composition <strong>of</strong> the sales force, new product introductions, <strong>and</strong> the<br />

competition to underst<strong>and</strong> the richer story that your data has to tell.


Who is qualified to analyze data?<br />

Data<br />

<strong>An</strong>alysis<br />

Knowledge<br />

?<br />

You are the key that opens the door for good data to result in good decisions. S<strong>of</strong>tware,<br />

no matter how sophisticated, is useless if you don’t possess the fundamental skills <strong>of</strong><br />

data analysis. Data analysis is for one purpose: to enable good decisions. Do you need<br />

to be an Einstein to make sense <strong>of</strong> your business data? For most business data<br />

analysis, the answer is “No”, but when statistical sophistication required, you need a<br />

statistically sophisticated tool.


<strong>An</strong>alytical challenges dem<strong>and</strong><br />

statistical horsepower.<br />

<strong>JMP</strong> is one <strong>of</strong> best tools available today for combining robust statistical functionality with<br />

rich interactive visualizations.


Predictive analysis is one <strong>of</strong> these<br />

challenges.<br />

U.S. Dollars<br />

(thous<strong>and</strong>s)<br />

4,000<br />

Sales Revenues<br />

Sales Revenues<br />

3,500<br />

3,000<br />

2,500<br />

2,000<br />

1,500<br />

1,000<br />

500<br />

0<br />

Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec<br />

2012<br />

Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec<br />

2013<br />

One <strong>of</strong> the analytical challenges that requires a high degree <strong>of</strong> statistical training is its<br />

use for predicting what might happen in the future. Let’s clarify the meaning <strong>of</strong><br />

predictive analytics. Like many terms that get tossed around by s<strong>of</strong>tware vendors <strong>and</strong><br />

even by thought leaders in the field <strong>of</strong> business intelligence, predictive analytics seems<br />

to mean whatever’s convenient at the moment. It might be useful for marketing<br />

purposes to keep definitions loose <strong>and</strong> adaptable to the occasion, but it doesn’t help us<br />

at all. We’ll start by turning to everyone’s favorite dictionary these days: Wikipedia.<br />

Predictive analytics encompass a variety <strong>of</strong> techniques from statistics <strong>and</strong> data<br />

mining that analyze current <strong>and</strong> historical data to make predictions about future<br />

events. Such predictions rarely take the form <strong>of</strong> absolute statements, <strong>and</strong> are more<br />

likely to be expressed as values that correspond to the odds <strong>of</strong> a particular event or<br />

behavior taking place in the future.<br />

In business, predictive models exploit patterns found in historical <strong>and</strong> transactional<br />

data to identify risks <strong>and</strong> opportunities. Models capture relationships among many<br />

factors to allow assessment <strong>of</strong> risk or potential associated with a particular set <strong>of</strong><br />

conditions, guiding decision making for c<strong>and</strong>idate transactions.<br />

Wikipedia entry for “predictive analytics” as <strong>of</strong> December 15, 2008.


Predictions rely on statistical models.<br />

Don’t let the terms statistical model or predictive model throw you. In concept, they’re<br />

quite simple, just like more familiar models <strong>of</strong> other types. Generally speaking, models<br />

are representations <strong>of</strong> things or events, which we use to examine <strong>and</strong> underst<strong>and</strong> those<br />

things or events when it isn’t possible or practical to observe or interact with them<br />

directly. Statistical models represent mathematical relationships between the parts that<br />

make up the thing or event. Predictive models are those that we can interact with to<br />

investigate the results <strong>of</strong> hypothetical conditions, such as by changing the values <strong>of</strong><br />

particular variables.<br />

Predictive models make it possible for us to do what some people call what-if analysis.<br />

What if such <strong>and</strong> such a condition existed or event occurred? What would happen as a<br />

result? Predictive models give us the means to predict what would probably happen<br />

(the probable outcomes <strong>of</strong> dependent variables) if particular conditions arose naturally<br />

or by intention (specified input values to one or more independent variables).


Good models don’t hide variables <strong>and</strong><br />

their relationships.<br />

Inputs<br />

?<br />

Outputs<br />

Most <strong>of</strong> the applications that I’ve seen marketed by business intelligence s<strong>of</strong>tware<br />

vendors for predictive analytics allow data to be entered on one end (inputs) <strong>and</strong> then<br />

results (outputs) pop out the other; what goes on in between remains hidden in a black<br />

box. Unfortunately, without seeing what goes on in that black box, our brains aren’t fully<br />

engaged in the process <strong>and</strong> too much is missed.<br />

Predictive analytics are most revealing when they allow us to see how all the variables<br />

that contribute either directly or indirectly to the outcomes that concern us relate to<br />

those outcomes <strong>and</strong> to one another. To underst<strong>and</strong> these relationships, we must see<br />

them; we must watch how changes in one variable directly cause or indirectly influence<br />

changes in the others. For this to happen, predictive models must be displayed visually<br />

in a way that allows: (1) our eyes to see the relationships <strong>and</strong> changes; <strong>and</strong> (2) our<br />

minds to make sense <strong>of</strong> them.


Finding <strong>and</strong> underst<strong>and</strong>ing the<br />

stories in data is not enough.<br />

You must tell them.<br />

As a data analyst, your job isn’t done once you’ve found <strong>and</strong> made sense <strong>of</strong> the stories<br />

that live in your data. Unless you can tell those stories in ways that others who need the<br />

information can underst<strong>and</strong>, your work is wasted.


Hans Rosling <strong>of</strong> www.GapMinder.org<br />

Few people tell quantitative stories as compellingly as Hans Rosling. When he spoke at<br />

the TED Conference (Technology, Entertainment, <strong>and</strong> Design) for he first time in 2006, it<br />

was perhaps the first time in history that a large audience <strong>of</strong> people found themselves<br />

on the edge <strong>of</strong> their seats watching a bubble plot.


Wisdom<br />

Knowledge<br />

Information<br />

Information cannot speak for itself. It needs our help. It relies on us to give it a voice.<br />

When we do, information can tell its story, <strong>and</strong> will thus become knowledge. The<br />

ultimate goal, however, isn’t knowledge; it is wisdom. Knowledge becomes wisdom<br />

when it is used to do something good. Only when we use what we know to make the<br />

world a better place has information served its purpose <strong>and</strong> we have done our job.<br />

Our networks are awash in data. A little <strong>of</strong> it is information. A smidgen <strong>of</strong> this shows<br />

up as knowledge. Combined with ideas, some <strong>of</strong> that is actually useful. Mix in<br />

experience, context, compassion, discipline, humor, tolerance, <strong>and</strong> humility, <strong>and</strong><br />

perhaps knowledge becomes wisdom.<br />

Turning Numbers into Knowledge, Jonathan G. Koomey, 2001, <strong><strong>An</strong>alytics</strong> Press:<br />

Oakl<strong>and</strong>, CA page 5, quoting Clifford Stoll.


O perpetual revolution <strong>of</strong> configured stars,<br />

O perpetual recurrence <strong>of</strong> determined seasons,<br />

O world <strong>of</strong> spring <strong>and</strong> autumn, birth <strong>and</strong> dying!<br />

The endless cycle <strong>of</strong> idea <strong>and</strong> action,<br />

Endless invention, endless experiment,<br />

Brings knowledge <strong>of</strong> motion, but not <strong>of</strong> stillness;<br />

Knowledge <strong>of</strong> speech, but not <strong>of</strong> silence;<br />

Knowledge <strong>of</strong> words, <strong>and</strong> ignorance <strong>of</strong> The Word.<br />

All our knowledge brings us nearer to our ignorance,<br />

All our ignorance brings us nearer to death,<br />

But nearness to death no nearer to God.<br />

Where is the Life we have lost in living?<br />

Where is the wisdom we have lost in knowledge?<br />

Where is the knowledge we have lost in information?<br />

Excerpt from The Rock, 1934, T.S. Elliot<br />

[Image source: www.irishastronomy.org]


O perpetual revolution <strong>of</strong> configured stars,<br />

O perpetual recurrence <strong>of</strong> determined seasons,<br />

O world <strong>of</strong> spring <strong>and</strong> autumn, birth <strong>and</strong> dying!<br />

The endless cycle <strong>of</strong> idea <strong>and</strong> action,<br />

Endless invention, endless experiment,<br />

Brings knowledge <strong>of</strong> motion, but not <strong>of</strong> stillness;<br />

Knowledge <strong>of</strong> speech, but not <strong>of</strong> silence;<br />

Knowledge <strong>of</strong> words, <strong>and</strong> ignorance <strong>of</strong> The Word.<br />

All our knowledge brings us nearer to our ignorance,<br />

All our ignorance brings us nearer to death,<br />

But nearness to death no nearer to God.<br />

Where is the Life we have lost in living?<br />

Where is the wisdom we have lost in knowledge?<br />

Where is the knowledge we have lost in information?<br />

Excerpt from The Rock, 1934, T.S. Elliot<br />

[Image source: www.trekvisual.com]


O perpetual revolution <strong>of</strong> configured stars,<br />

O perpetual recurrence <strong>of</strong> determined seasons,<br />

O world <strong>of</strong> spring <strong>and</strong> autumn, birth <strong>and</strong> dying!<br />

The endless cycle <strong>of</strong> idea <strong>and</strong> action,<br />

Endless invention, endless experiment,<br />

Brings knowledge <strong>of</strong> motion, but not <strong>of</strong> stillness;<br />

Knowledge <strong>of</strong> speech, but not <strong>of</strong> silence;<br />

Knowledge <strong>of</strong> words, <strong>and</strong> ignorance <strong>of</strong> The Word.<br />

All our knowledge brings us nearer to our ignorance,<br />

All our ignorance brings us nearer to death,<br />

But nearness to death no nearer to God.<br />

Where is the Life we have lost in living?<br />

Where is the wisdom we have lost in knowledge?<br />

Where is the knowledge we have lost in information?<br />

Excerpt from The Rock, 1934, T.S. Elliot<br />

[Image source: www.i.pbase.com]


O perpetual revolution <strong>of</strong> configured stars,<br />

O perpetual recurrence <strong>of</strong> determined seasons,<br />

O world <strong>of</strong> spring <strong>and</strong> autumn, birth <strong>and</strong> dying!<br />

The endless cycle <strong>of</strong> idea <strong>and</strong> action,<br />

Endless invention, endless experiment,<br />

Brings knowledge <strong>of</strong> motion, but not <strong>of</strong> stillness;<br />

Knowledge <strong>of</strong> speech, but not <strong>of</strong> silence;<br />

Knowledge <strong>of</strong> words, <strong>and</strong> ignorance <strong>of</strong> The Word.<br />

All our knowledge brings us nearer to our ignorance,<br />

All our ignorance brings us nearer to death,<br />

But nearness to death no nearer to God.<br />

Where is the Life we have lost in living?<br />

Where is the wisdom we have lost in knowledge?<br />

Where is the knowledge we have lost in information?<br />

Excerpt from The Rock, 1934, T.S. Elliot<br />

[Image source: www.]


Truth<br />

O perpetual revolution <strong>of</strong> configured stars,<br />

O perpetual recurrence <strong>of</strong> determined seasons,<br />

O world <strong>of</strong> spring <strong>and</strong> autumn, birth <strong>and</strong> dying!<br />

The endless cycle <strong>of</strong> idea <strong>and</strong> action,<br />

Endless invention, endless experiment,<br />

Brings knowledge <strong>of</strong> motion, but not <strong>of</strong> stillness;<br />

Knowledge <strong>of</strong> speech, but not <strong>of</strong> silence;<br />

Knowledge <strong>of</strong> words, <strong>and</strong> ignorance <strong>of</strong> The Word.<br />

All our knowledge brings us nearer to our ignorance,<br />

All our ignorance brings us nearer to death,<br />

But nearness to death no nearer to God.<br />

Where is the Life we have lost in living?<br />

Where is the wisdom we have lost in knowledge?<br />

Where is the knowledge we have lost in information?<br />

Excerpt from The Rock, 1934, T.S. Elliot<br />

[Image source: www.shepherdpics.com]


O perpetual revolution <strong>of</strong> configured stars,<br />

O perpetual recurrence <strong>of</strong> determined seasons,<br />

O world <strong>of</strong> spring <strong>and</strong> autumn, birth <strong>and</strong> dying!<br />

The endless cycle <strong>of</strong> idea <strong>and</strong> action,<br />

Endless invention, endless experiment,<br />

Brings knowledge <strong>of</strong> motion, but not <strong>of</strong> stillness;<br />

Knowledge <strong>of</strong> speech, but not <strong>of</strong> silence;<br />

Knowledge <strong>of</strong> words, <strong>and</strong> ignorance <strong>of</strong> The Word.<br />

All our knowledge brings us nearer to our ignorance,<br />

All our ignorance brings us nearer to death,<br />

But nearness to death no nearer to God.<br />

Where is the Life we have lost in living?<br />

Where is the wisdom we have lost in knowledge?<br />

Where is the knowledge we have lost in information?<br />

Excerpt from The Rock, 1934, T.S. Elliot<br />

[Image source: www.i163.photobucket.com]


The value <strong>of</strong> information depends on how it’s used.<br />

Use it wisely.<br />

O perpetual revolution <strong>of</strong> configured stars,<br />

O perpetual recurrence <strong>of</strong> determined seasons,<br />

O world <strong>of</strong> spring <strong>and</strong> autumn, birth <strong>and</strong> dying!<br />

The endless cycle <strong>of</strong> idea <strong>and</strong> action,<br />

Endless invention, endless experiment,<br />

Brings knowledge <strong>of</strong> motion, but not <strong>of</strong> stillness;<br />

Knowledge <strong>of</strong> speech, but not <strong>of</strong> silence;<br />

Knowledge <strong>of</strong> words, <strong>and</strong> ignorance <strong>of</strong> The Word.<br />

All our knowledge brings us nearer to our ignorance,<br />

All our ignorance brings us nearer to death,<br />

But nearness to death no nearer to God.<br />

Where is the Life we have lost in living?<br />

Where is the wisdom we have lost in knowledge?<br />

Where is the knowledge we have lost in information?<br />

Excerpt from The Rock, 1934, T.S. Elliot<br />

[Image source: www.jamin.org]

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!