Big Visual Data
What makes<br />
<strong>Big</strong> <strong>Visual</strong> <strong>Data</strong> hard?<br />
© Quint Buchholz<br />
Alexei (Alyosha) Efros<br />
Carnegie Mellon University
My Goals<br />
1. To make you fall in love with <strong>Big</strong> <strong>Visual</strong> <strong>Data</strong><br />
• She is a fickle, coy mistress<br />
• but holds the key to achieving real visual<br />
understanding<br />
2. To ask for help in tackling this <strong>Big</strong><br />
Interdisciplinary Problem
Driven by <strong>Visual</strong> <strong>Data</strong><br />
Texture Synthesis<br />
Unsupervised Object Discovery<br />
Inferring 3D from 2D<br />
Dating Historical Images<br />
Action Recognition<br />
Seeing Through Water<br />
Illumination Estimation<br />
Geo-location
Texture: microcosm of <strong>Big</strong> <strong>Data</strong><br />
radishes, rocks, yogurt
Texture Synthesis
Classical Texture Synthesis<br />
Synthesis<br />
Analysis<br />
Parametric<br />
Texture<br />
Model<br />
Novel texture<br />
Sample texture<br />
This is hard!
Throwing away too much too soon?<br />
input texture synthesized texture
Non-parametric Approach<br />
Synthesis<br />
Analysis<br />
Novel texture<br />
Sample texture
[Efros & Leung ’99; Efros & Freeman ’01]<br />
Non-parametric sampling of pixel p from the input image
Texture Growing
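The non-parametric idea can be sketched in a few lines of Python: to fill in one pixel, compare its already-known neighbours with every window of the sample texture and copy the centre of a randomly chosen good match. This is a simplified illustration, not the published algorithm; the function name `sample_pixel`, the tolerance `eps`, and the plain (unweighted) sum of squared differences are assumptions made for brevity.

```python
import random

def sample_pixel(sample, window, known, rng, eps=0.1):
    """Pick a value for the centre of `window` by non-parametric sampling.

    sample: 2D list of grey values (the input texture).
    window: (2h+1) x (2h+1) neighbourhood around the pixel being synthesised.
    known:  same shape as window; True where the neighbour is already filled.
    """
    size = len(window)
    h = size // 2
    scored = []
    for r in range(h, len(sample) - h):
        for c in range(h, len(sample[0]) - h):
            # Sum of squared differences over the already-known neighbours only.
            ssd = 0.0
            for dr in range(size):
                for dc in range(size):
                    if known[dr][dc]:
                        diff = sample[r - h + dr][c - h + dc] - window[dr][dc]
                        ssd += diff * diff
            scored.append((ssd, sample[r][c]))
    best = min(s for s, _ in scored)
    # Keep every candidate within a tolerance of the best match, then draw one.
    candidates = [v for s, v in scored if s <= best * (1 + eps)]
    return rng.choice(candidates)

# Toy texture: vertical stripes (even columns 0, odd columns 255).
stripes = [[0, 255, 0, 255, 0, 255] for _ in range(5)]
window = [[0, 255, 0],
          [0, 0, 0],   # centre and right neighbour are unknown (values ignored)
          [0, 0, 0]]
known = [[True, True, True],
         [True, False, False],
         [False, False, False]]
print(sample_pixel(stripes, window, known, random.Random(0)))  # 255
```

Growing a whole texture repeats this step pixel by pixel outward from a seed, which is why no explicit texture model is ever fitted.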
input image<br />
Portilla & Simoncelli<br />
Xu, Guo & Shum<br />
Wei & Levoy<br />
Our algorithm
Two Kinds of Things in the World<br />
Navier-Stokes Equation + weather<br />
+ location<br />
+ …
Lots of data available
“Unreasonable Effectiveness of <strong>Data</strong>”<br />
• Parts of our world can be explained by<br />
elegant mathematics:<br />
– physics, chemistry, astronomy, etc.<br />
• But much cannot:<br />
– psychology, genetics, economics,… visual<br />
understanding?<br />
• Enter: The Magic of <strong>Data</strong><br />
– Great advances in several fields:<br />
[Halevy, Norvig, Pereira 2009]<br />
• e.g. speech recognition, machine translation, Google
The A.I. for the postmodern world
The Good News<br />
Really stupid algorithms + Lots of <strong>Data</strong><br />
= “Unreasonable Effectiveness”
140 billion images<br />
6 billion added monthly<br />
72 hours uploaded every minute<br />
6 billion images<br />
1 billion images served daily<br />
3.5 trillion photographs<br />
90% of net traffic will be visual!
<strong>Data</strong> Mining<br />
Genetics, Disease Tracking, Drugs, Policy, Medical <strong>Data</strong>, Scientific Experiments, Economic <strong>Data</strong>, Physics, Business Intelligence, Psychology, Social Graphs, Business <strong>Data</strong>, Marketing, Dating, Collaborative Filters, Web Text, Search<br />
<strong>Visual</strong> <strong>Data</strong>?
Bad News<br />
<strong>Visual</strong> <strong>Data</strong> is difficult to handle<br />
• Text:<br />
– clean, segmented, compact, 1D, indexable<br />
• <strong>Visual</strong> data:<br />
– noisy, unsegmented, high entropy, 2D/3D
Computing distances is hard<br />
CLIME - CRIME = Hamming distance of 1 letter<br />
[point] - [point] in the (x, y) plane = Euclidean distance of 5 units<br />
[gray patch] - [gray patch] = gray-value distance of 50 values<br />
[image] - [image] = ?
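The three solved cases on this slide each have a one-line distance function, while the image case does not. A small Python sketch (function names are illustrative, not from the talk) makes the contrast concrete: a naive pixel-wise distance reports that a striped pattern and its one-pixel shift are far apart, even though they look the same.

```python
import math

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))

def euclidean(p, q):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def gray_distance(g1, g2):
    """Absolute difference between two gray values."""
    return abs(g1 - g2)

def pixel_l2(img_a, img_b):
    """Naive image distance: Euclidean distance over corresponding pixels."""
    return math.sqrt(sum((a - b) ** 2
                         for row_a, row_b in zip(img_a, img_b)
                         for a, b in zip(row_a, row_b)))

print(hamming("CLIME", "CRIME"))    # 1
print(euclidean((0, 0), (3, 4)))    # 5.0
print(gray_distance(100, 150))      # 50

# The same stripe pattern, shifted by one pixel: visually identical texture,
# yet every pixel disagrees, so the naive distance is large.
stripes = [[0, 255, 0, 255], [0, 255, 0, 255]]
shifted = [[255, 0, 255, 0], [255, 0, 255, 0]]
print(pixel_l2(stripes, shifted) > pixel_l2(stripes, [[128] * 4] * 2))  # True
```

The last line shows the problem: a flat gray image scores as closer to the stripes than the shifted stripes do.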
How similar are two pictures?
Medici Fountain, Paris
INDEXING VIA “VISUAL WORDS”
[SIFT: Lowe, 2004]<br />
“VISUAL WORD” MATCHING
[SIFT: Lowe, 2004]<br />
letter<br />
“VISUAL WORD” MATCHING
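Indexing with visual words treats local descriptors the way a text index treats words: each descriptor is snapped to its nearest entry in a fixed vocabulary, and images are looked up through an inverted index. The sketch below is a toy version under stated assumptions: the 2-D descriptors, the three-entry vocabulary, and the image names are made up for illustration (real SIFT descriptors are 128-dimensional and vocabularies are typically learned by clustering).

```python
from collections import Counter, defaultdict

def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to `descriptor` (squared L2)."""
    return min(range(len(vocabulary)),
               key=lambda i: sum((d - v) ** 2
                                 for d, v in zip(descriptor, vocabulary[i])))

def to_visual_words(descriptors, vocabulary):
    """Histogram of visual-word ids for one image."""
    return Counter(nearest_word(d, vocabulary) for d in descriptors)

def build_inverted_index(images, vocabulary):
    """Map each visual word to the set of images that contain it."""
    index = defaultdict(set)
    for name, descriptors in images.items():
        for word in to_visual_words(descriptors, vocabulary):
            index[word].add(name)
    return index

def query(index, descriptors, vocabulary):
    """Rank indexed images by how many visual words they share with the query."""
    votes = Counter()
    for word in to_visual_words(descriptors, vocabulary):
        for name in index.get(word, ()):
            votes[name] += 1
    return votes.most_common()

vocabulary = [(0, 0), (10, 0), (0, 10)]          # hypothetical codebook
images = {
    "fountain_a": [(0.5, 0.2), (9.5, 0.1)],      # hypothetical descriptors
    "street_b": [(0.1, 9.8)],
}
index = build_inverted_index(images, vocabulary)
print(query(index, [(0.2, 0.1), (10.2, -0.3)], vocabulary))  # [('fountain_a', 2)]
```

Matching works only as long as the same words fire in both images, which is exactly what breaks when appearance changes (for example, the same fountain in winter).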
Medici Fountain, Paris (winter)
<strong>Visual</strong> “Garbage Heap”<br />
“It irritated him that the ‘dog’ of 3:14 in the afternoon, seen in profile, should be indicated by the same noun as the dog of 3:15, seen frontally…”<br />
“My memory, sir, is like a garbage heap.”<br />
-- from Funes the Memorious, Jorge Luis Borges<br />
Organizing the “Garbage Heap”:<br />
• Finding visual correspondences across data<br />
• Mining <strong>Visual</strong> <strong>Data</strong><br />
• Connecting visual data to enable understanding (<strong>Visual</strong> Memex)
Improving <strong>Visual</strong> Correspondence
Lots of Tiny Images<br />
• 80 million tiny images: a large dataset for non-parametric object and scene recognition<br />
Antonio Torralba, Rob Fergus and William T. Freeman. PAMI 2008.
Lots of Images<br />
A. Torralba, R. Fergus, W. T. Freeman. PAMI 2008
Automatic Colorization<br />
Grayscale input (high resolution)<br />
Colorization of input using the average of its nearest-neighbor images<br />
A. Torralba, R. Fergus, W.T.Freeman. 2008
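Colorization by averaging can be sketched as a nearest-neighbour lookup followed by a per-pixel mean. The following Python is a minimal illustration, assuming a database of flattened tiny images stored as (gray pixels, RGB pixels) pairs and a plain sum-of-squared-differences match; the data layout and function names are hypothetical, not taken from the cited work.

```python
def ssd(a, b):
    """Sum of squared differences between two flattened gray images."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_nearest(query_gray, database, k):
    """Return the k database entries whose gray version is closest to the query.

    database: list of (gray_pixels, rgb_pixels) pairs, all the same size.
    """
    return sorted(database, key=lambda item: ssd(query_gray, item[0]))[:k]

def average_colour(neighbours):
    """Per-pixel mean RGB over the retrieved neighbours."""
    n = len(neighbours)
    n_pixels = len(neighbours[0][1])
    return [tuple(sum(nb[1][p][ch] for nb in neighbours) / n for ch in range(3))
            for p in range(n_pixels)]

# Toy database of 2-pixel "images".
database = [
    ([10, 200], [(10, 20, 30), (200, 180, 160)]),
    ([12, 198], [(14, 20, 30), (200, 184, 160)]),
    ([250, 5], [(255, 0, 0), (0, 0, 255)]),
]
neighbours = k_nearest([11, 199], database, k=2)
print(average_colour(neighbours))
# [(12.0, 20.0, 30.0), (200.0, 182.0, 160.0)]
```

With only a handful of images the average is meaningless; the approach relies on the database being large enough that the nearest neighbours depict similar content.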
[Hays & Efros, SIGGRAPH’07]
Scene Descriptor<br />
Scene Gist Descriptor<br />
(Oliva and Torralba 2001)
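A scene descriptor of this kind summarises the coarse layout of an image rather than its objects. The sketch below is a crude stand-in, not the Gist descriptor itself: it assumes a grayscale image given as a 2D list and pools mean horizontal and vertical gradient magnitudes over a small grid, where the real descriptor pools responses of a bank of oriented filters. The name `grid_descriptor` and the 2 x 2 grid are illustrative choices.

```python
def grid_descriptor(image, grid=2):
    """Mean |horizontal| and |vertical| gradient in each cell of a grid x grid layout."""
    rows, cols = len(image), len(image[0])
    desc = []
    for gr in range(grid):
        for gc in range(grid):
            r0, r1 = gr * rows // grid, (gr + 1) * rows // grid
            c0, c1 = gc * cols // grid, (gc + 1) * cols // grid
            gx = gy = 0.0
            for r in range(r0, r1):
                for c in range(c0, c1):
                    # Forward differences, clamped at the image border.
                    gx += abs(image[r][min(c + 1, cols - 1)] - image[r][c])
                    gy += abs(image[min(r + 1, rows - 1)][c] - image[r][c])
            area = (r1 - r0) * (c1 - c0)
            desc += [gx / area, gy / area]
    return desc

# Top half: vertical stripes (strong horizontal gradients). Bottom half: flat.
image = [[0, 9, 0, 9],
         [0, 9, 0, 9],
         [5, 5, 5, 5],
         [5, 5, 5, 5]]
print(grid_descriptor(image))
# [9.0, 2.25, 4.5, 2.25, 0.0, 0.0, 0.0, 0.0]
```

Two scenes can then be compared by the distance between their descriptor vectors, which is cheap enough to run against millions of images.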
2 Million Flickr Images
… 200 scene matches
Improving <strong>Visual</strong> Correspondence
<strong>Visual</strong> <strong>Data</strong> has a Long Tail<br />
The rare is common!
LEARNING BETTER VISUAL<br />
CORRESPONDENCES<br />
ABHINAV SHRIVASTAVA, TOMASZ MALISIEWICZ, ABHINAV GUPTA, ALEXEI EFROS<br />
SIGGRAPH ASIA’11
Input Query<br />
Top Matches
IMPORTANT PARTS?<br />
Input Query Important Parts
Input Query<br />
Top Matches
Way more efficient approaches:<br />
[Ramanan et al 2012, Durand et al 2012]
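One inexpensive way to approximate "which parts of the query are important" is to weight each feature dimension by how far the query departs from the dataset's statistics, so that distinctive dimensions dominate the match score. The Python below is a sketch under loud assumptions: the slides do not specify the method, and this per-dimension (diagonal-variance) whitening is a stand-in chosen for illustration; `dataset_stats`, `query_weights`, `score`, and `reg` are hypothetical names.

```python
def dataset_stats(features):
    """Per-dimension mean and variance over a list of feature vectors."""
    n = len(features)
    dims = len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dims)]
    var = [sum((f[d] - mean[d]) ** 2 for f in features) / n for d in range(dims)]
    return mean, var

def query_weights(query, mean, var, reg=1e-6):
    """Weight = deviation of the query from the mean, scaled by inverse variance.

    Dimensions that barely vary across the dataset but stand out in the query
    receive large weights; `reg` avoids division by zero (an assumed constant).
    """
    return [(q - m) / (v + reg) for q, m, v in zip(query, mean, var)]

def score(weights, mean, feature):
    """Linear match score of a candidate feature vector under the query's weights."""
    return sum(w * (f - m) for w, f, m in zip(weights, feature, mean))

# Dimension 0 varies a lot across the dataset; dimension 1 is usually near 1.0.
features = [[0, 1.0], [10, 1.0], [0, 1.2], [10, 0.8]]
mean, var = dataset_stats(features)
query = [6, 2.0]                     # unusual in dimension 1
w = query_weights(query, mean, var)

a = [0, 2.0]   # shares the query's unusual dimension, differs in the common one
b = [6, 1.0]   # matches the common dimension only
print(score(w, mean, a) > score(w, mean, b))  # True
```

Plain Euclidean distance would rank `b` closer to the query than `a`; the weighted score reverses that because it rewards agreement on what makes the query distinctive.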
SEARCH USING PAINTINGS<br />
Input Painting<br />
Our Approach<br />
GIST<br />
Bag-of-Words<br />
Tiny Images<br />
HOG
SEARCH USING PAINTINGS<br />
Input Painting Top Matches
SEARCH USING SKETCHES<br />
Input Sketch<br />
Our Approach<br />
Tiny Images<br />
GIST<br />
Bag-of-Words<br />
SEARCH USING SKETCHES
APPLICATIONS
RE-PHOTOGRAPHY<br />
Historical Image of Boston Station<br />
Computational Re-photography (Bae et al., 2010)<br />
Re-photographed Image<br />
Then & Now View
INTERNET RE-PHOTOGRAPHY<br />
Historical Image of Boston Station<br />
Computational Re-photography (Bae et al., 2010): Re-photographed Image, Then & Now View<br />
Our Approach: Search 10,000 Flickr Images of Boston<br />
Top Match<br />
Then & Now View
WHERE WAS THE PAINTER STANDING?<br />
Input Painting
Input Painting<br />
PAINTING2GPS<br />
Retrieval set<br />
10,000 Geo-tagged Flickr Images<br />
100 top matches used for estimation
PAINTING2GPS<br />
Input Painting Estimated Geo-location<br />
Estimated using 100 top matches
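Turning a set of geo-tagged matches into a single location estimate can be sketched as follows. The slides only say that the 100 top matches are used; the aggregation rule here (a per-coordinate median, chosen because it ignores a few far-away outliers) is an assumption, as are the function name and the input layout.

```python
from statistics import median

def estimate_location(matches, k=100):
    """Estimate (lat, lon) from the k most similar geo-tagged images.

    matches: list of (similarity, (lat, lon)) pairs in any order.
    Note: a per-coordinate median ignores longitude wrap-around at +/-180
    degrees, which is acceptable for a city-scale retrieval set.
    """
    top = sorted(matches, key=lambda m: m[0], reverse=True)[:k]
    lats = [loc[0] for _, loc in top]
    lons = [loc[1] for _, loc in top]
    return median(lats), median(lons)

# Hypothetical matches: three near one spot, one weak match far away.
matches = [
    (0.9, (42.35, -71.06)),
    (0.8, (42.36, -71.05)),
    (0.7, (42.34, -71.07)),
    (0.1, (48.85, 2.35)),
]
print(estimate_location(matches, k=3))  # (42.35, -71.06)
```

A density-based estimate (for example, the mode of the match locations) would be a natural alternative when matches cluster around several viewpoints.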
VISUAL SCENE EXPLORATION
Query image<br />
FINDING SIMILAR IMAGES
PAIRWISE SIMILARITY MATRIX
TRAVERSING THE GRAPH
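Exploring a scene through its image graph can be sketched as two steps: keep only each image's strongest links from the pairwise similarity matrix, then walk between two images along the cheapest chain of similar views. The Python below is a minimal illustration; it assumes similarities in [0, 1], uses `1 - similarity` as the edge cost, and uses Dijkstra's algorithm for the traversal, none of which is stated in the slides.

```python
import heapq

def knn_graph(similarity, k):
    """Keep the k most similar neighbours of each image.

    Returns {i: [(cost, j), ...]} with cost = 1 - similarity (directed edges).
    """
    graph = {}
    for i, row in enumerate(similarity):
        others = sorted((j for j in range(len(row)) if j != i),
                        key=lambda j: row[j], reverse=True)[:k]
        graph[i] = [(1.0 - row[j], j) for j in others]
    return graph

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns the list of nodes from start to goal, or None."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for cost, nxt in graph[node]:
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

# Four images forming a chain of gradually changing views.
similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.2],
    [0.2, 0.8, 1.0, 0.9],
    [0.1, 0.2, 0.9, 1.0],
]
graph = knn_graph(similarity, k=2)
print(shortest_path(graph, 0, 3))  # [0, 1, 2, 3]
```

Images 0 and 3 are barely similar to each other, yet the graph connects them through intermediate views, which is the point of traversing rather than matching directly.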