Big Visual Data

What makes Big Visual Data hard?

© Quint Buchholz 

Alexei (Alyosha) Efros 

Carnegie Mellon University

My Goals 

1. To make you fall in love with Big Visual Data

• She is a fickle, coy mistress

• but holds the key to achieving real visual understanding

2. To ask for help in tackling this Big Interdisciplinary Problem

Driven by Visual Data 

Texture Synthesis 

Unsupervised Object Discovery 

Inferring 3D from 2D 

Dating Historical Images 

Action Recognition 

Seeing Through Water 

Illumination Estimation 

Geo-location

Texture: microcosm of Big Data 

radishes rocks yogurt

Texture Synthesis

Classical Texture Synthesis

Sample texture → Analysis → Parametric Texture Model → Synthesis → Novel texture

This is hard!

Throwing away too much too soon? 

input texture synthesized texture

Non-parametric Approach

Sample texture → Analysis and Synthesis → Novel texture

[Efros & Leung ’99, Efros & Freeman ’01]

[Figure: pixel p is filled by non-parametric sampling from the input image]
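A minimal sketch of the non-parametric sampling step for a single pixel p, under simplifying assumptions: gray values stored in nested lists, an unweighted sum of squared differences over the neighbours that are already known, and a hypothetical tolerance `eps`. The function name and parameters are illustrative, and this is not the published implementation.

```python
import random

def sample_pixel(sample, neigh, known, eps=0.1, rng=random):
    """Pick a value for the centre of `neigh` by non-parametric sampling.

    sample : 2D list of gray values (the input texture)
    neigh  : w x w list, the neighbourhood around the pixel p being filled
    known  : w x w list of booleans, True where `neigh` is already filled
    """
    w = len(neigh)
    half = w // 2
    rows, cols = len(sample), len(sample[0])
    candidates = []  # (distance, centre value) for every full window in sample
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            d = 0.0
            for i in range(w):
                for j in range(w):
                    if known[i][j]:
                        diff = sample[r - half + i][c - half + j] - neigh[i][j]
                        d += diff * diff
            candidates.append((d, sample[r][c]))
    best = min(d for d, _ in candidates)
    # Keep every window nearly as good as the best one, then pick at random.
    good = [v for d, v in candidates if d <= (1 + eps) * best]
    return rng.choice(good)

# Toy vertical-stripe texture: the pixel to the left of p is dark (0),
# so the sampled value for p should be bright (255).
sample = [[0, 255, 0, 255, 0, 255] for _ in range(5)]
neigh = [[0, 255, 0], [0, 0, 0], [0, 0, 0]]
known = [[True, True, True], [True, False, False], [False, False, False]]
print(sample_pixel(sample, neigh, known))  # 255
```

Growing a whole texture would repeat this step pixel by pixel outward from a seed, which is where most of the cost lies: every pixel requires a scan over the whole sample.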

Texture Growing

[Comparison panels: input image; Portilla & Simoncelli; Xu, Guo & Shum; Wei & Levoy; our algorithm]

Two Kinds of Things in the World 

Navier-Stokes Equation + weather + location + …

Lots of data available

“Unreasonable Effectiveness of Data” [Halevy, Norvig, Pereira 2009]

• Parts of our world can be explained by elegant mathematics:

– physics, chemistry, astronomy, etc.

• But much cannot:

– psychology, genetics, economics, … visual understanding?

• Enter: The Magic of Data

– Great advances in several fields:

• e.g. speech recognition, machine translation, Google

The A.I. for the postmodern world

The Good News 

Really stupid algorithms + Lots of Data 

= “Unreasonable Effectiveness”

140 billion images

6 billion added monthly

72 hours uploaded every minute

served daily

3.5 trillion photographs

90% of net traffic will be visual!

[Diagram of data-rich fields: Genetics, Disease Tracking, Drugs, Policy, Medical Data, Scientific Experiments, Economic, Physics, Business Intelligence, Psychology, Social Graphs, Data Mining, Business, Marketing, Dating, Collaborative Filters, Web Text, Search, and the open question: Visual Data?]

Bad News 

Visual Data is difficult to handle 

• Text:

– clean, segmented, compact, 1D, indexable

• Visual data:

– noisy, unsegmented, high entropy, 2D/3D

Computing distances is hard 

CLIME - CRIME = Hamming distance of 1 letter

[point in the x-y plane] - [point in the x-y plane] = Euclidean distance of 5 units

[gray value] - [gray value] = gray-value distance of 50 values

[image] - [image] = ?
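The first three distances on this slide are easy to compute, which is what makes the fourth one stand out. A minimal sketch follows; the function names, the two points and the two gray values are illustrative assumptions chosen to reproduce the numbers on the slide.

```python
import math

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(ca != cb for ca, cb in zip(a, b))

def euclidean(p, q) -> float:
    """Straight-line distance between two points of the same dimension."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def gray_distance(g1: int, g2: int) -> int:
    """Absolute difference between two gray values."""
    return abs(g1 - g2)

print(hamming("CLIME", "CRIME"))   # 1
print(euclidean((0, 0), (3, 4)))   # 5.0
print(gray_distance(100, 150))     # 50
```

No equally obvious function exists for two whole pictures: a per-pixel distance changes drastically under a small shift of viewpoint or lighting even when the scene is the same.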

How similar are two pictures? 

? 

=

Medici Fountain, Paris

INDEXING VIA “VISUAL WORDS”

[SIFT: Lowe, 2004] 

“VISUAL WORD” MATCHING

[SIFT: Lowe, 2004] 

letter 

“VISUAL WORD” MATCHING

Medici Fountain, Paris (winter)
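The slides name visual-word indexing without spelling it out. One common form can be sketched as follows, assuming local descriptors (such as SIFT) have already been extracted and a codebook of representative descriptors is given. The 2D descriptors, the codebook and the image names below are made up for illustration.

```python
from collections import defaultdict

def nearest_word(desc, codebook):
    """Index of the codebook entry closest to `desc` (squared Euclidean)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(desc, codebook[k])))

def build_index(images, codebook):
    """Map each visual word id to the set of image names that contain it."""
    index = defaultdict(set)
    for name, descriptors in images.items():
        for d in descriptors:
            index[nearest_word(d, codebook)].add(name)
    return index

def query(descriptors, index, codebook):
    """Count, per indexed image, how many distinct query words it shares."""
    votes = defaultdict(int)
    for w in {nearest_word(d, codebook) for d in descriptors}:
        for name in index.get(w, ()):
            votes[name] += 1
    return dict(votes)

# Hypothetical toy data: three "words" and two indexed images.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
images = {"fountain_a": [(0.1, 0.0), (0.9, 1.1)], "street_b": [(5.2, 4.9)]}
index = build_index(images, codebook)
print(query([(0.0, 0.2), (1.0, 0.8)], index, codebook))  # {'fountain_a': 2}
```

Quantizing descriptors into discrete words is what makes images indexable like text, but it only helps when the local appearance survives: a change of season can alter most of the descriptors in a scene.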

Visual “Garbage Heap” 

“It irritated him that the ‘dog’ of 3:14 in the afternoon, seen in profile, should be indicated by the same noun as the dog of 3:15, seen frontally…”

“My memory, sir, is like a garbage heap.”

-- from Funes the Memorious, Jorge Luis Borges

Organizing the “Garbage Heap”:

• Finding visual correspondences across data

• Mining Visual Data

• Connecting visual data to enable understanding (Visual Memex)

Improving Visual Correspondence


Lots of Tiny Images 

• 80 million tiny images: a large dataset for non-parametric object and scene recognition. Antonio Torralba, Rob Fergus and William T. Freeman. PAMI 2008.

Lots of Images

A. Torralba, R. Fergus, W. T. Freeman. PAMI 2008


Automatic Colorization 

Grayscale input High resolution 

Colorization of input using average 

A. Torralba, R. Fergus, W.T.Freeman. 2008
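"Colorization using average" can be read as: find the database images closest to the grayscale input and average their colors. The sketch below is a minimal version of that idea under stated assumptions: images are flat lists of pixels, similarity is the sum of squared gray differences, and the tiny two-pixel database is invented for illustration. It is not the authors' pipeline.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length gray images."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def colorize_by_average(gray_query, database, k=2):
    """Average the colors of the k nearest database images.

    database : list of (gray_pixels, rgb_pixels) pairs, all the same size
    returns  : list of (r, g, b) tuples, one per pixel of the query
    """
    nearest = sorted(database, key=lambda item: ssd(gray_query, item[0]))[:k]
    out = []
    for p in range(len(gray_query)):
        out.append(tuple(sum(rgb[p][ch] for _, rgb in nearest) / len(nearest)
                         for ch in range(3)))
    return out

# Hypothetical database of three 2-pixel images.
db = [
    ([10, 200], [(10, 20, 0), (200, 220, 180)]),
    ([12, 198], [(14, 24, 4), (196, 216, 184)]),
    ([250, 5], [(255, 240, 250), (5, 0, 10)]),
]
print(colorize_by_average([11, 199], db, k=2))
# [(12.0, 22.0, 2.0), (198.0, 218.0, 182.0)]
```

The algorithm is deliberately simple; the quality of the result depends almost entirely on whether the database is large enough to contain close neighbours of the query.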

[Hays & Efros, SIGGRAPH’07]

Scene Descriptor


Scene Gist Descriptor 

(Oliva and Torralba 2001)

2 Million Flickr Images

… 200 scene matches



Visual Data has a Long Tail 

The rare is common!

LEARNING BETTER VISUAL CORRESPONDENCES

ABHINAV SHRIVASTAVA, TOMASZ MALISIEWICZ, ABHINAV GUPTA, ALEXEI EFROS

SIGGRAPH ASIA’11

Input Query 

Top Matches

Input Query 

Top Matches

Input Query 

Top Matches

IMPORTANT PARTS? 

Input Query Important Parts

Input Query 

Top Matches

Way more efficient approaches: 

[Ramanan et al 2012, Durand et al 2012]

SEARCH USING PAINTINGS 

Input Painting 

Our Approach 

GIST 

Bag-of-Words 

Tiny Images 

HOG


Input Painting Top Matches


Input Painting Top Matches

SEARCH USING SKETCHES 

Input Sketch 

Our Approach 

Tiny Images 

GIST 

Bag-of-Words 


SEARCH USING SKETCHES

APPLICATIONS

RE-PHOTOGRAPHY 

Historical Image of Boston Station

Computational Re-photography (Bae et al., 2010)

Re-photographed Image



RE-PHOTOGRAPHY 

Computational Re-photography (Bae et al., 2010) 

Re-photographed Image Then & Now View

INTERNET RE-PHOTOGRAPHY 






Re-photographed Image Then & Now View 

Search 

10,000 Flickr Images of Boston

Our Approach 

Top Match

INTERNET RE-PHOTOGRAPHY 






Re-photographed Image Then & Now View 

Top Match 

From 10,000 Flickr Images 

Our Approach 

Then & Now View

WHERE WAS THE PAINTER STANDING? 

Input Painting

Input Painting 

PAINTING2GPS 

Retrieval set 

10,000 Geo-tagged Flickr Images 

100 top matches used for estimation

PAINTING2GPS 

Input Painting Estimated Geo-location 

Estimated using 100 top matches
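The slides say only that the location is estimated from the 100 top matches; they do not name the estimator. As one plausible illustration, the sketch below takes the coordinate-wise median of the geo-tags of the top-k matches. The median is an assumption made here for robustness to a few wrong matches, and the function name and sample coordinates are hypothetical.

```python
from statistics import median

def estimate_location(matches, k=100):
    """Estimate (lat, lon) from the k best-matching geo-tagged images.

    matches : list of (similarity, (lat, lon)); higher similarity is better
    """
    top = sorted(matches, key=lambda m: m[0], reverse=True)[:k]
    lats = [loc[0] for _, loc in top]
    lons = [loc[1] for _, loc in top]
    # Coordinate-wise median: an assumed estimator, not taken from the slides.
    return (median(lats), median(lons))

matches = [
    (0.9, (42.36, -71.06)),
    (0.8, (42.35, -71.05)),
    (0.7, (42.37, -71.07)),
    (0.1, (48.85, 2.35)),   # a poor match far away, excluded when k=3
]
print(estimate_location(matches, k=3))  # (42.36, -71.06)
```

This simple version ignores longitude wrap-around at ±180°, which does not matter for a retrieval set drawn from a single city.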

VISUAL SCENE EXPLORATION


Query image 

FINDING SIMILAR IMAGES

PAIRWISE SIMILARITY MATRIX

[Figure: matrix of pairwise similarities between images]

TRAVERSING THE GRAPH
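One way to picture the step from a pairwise similarity matrix to graph traversal: link each image to its most similar neighbours, then look for a path whose consecutive images are all similar. The sketch below does this with a k-nearest-neighbour graph and Dijkstra's algorithm. The edge cost (1 minus similarity), the function names and the 4-image matrix are assumptions for illustration, not details from the slides.

```python
import heapq

def knn_graph(sim, k):
    """Connect each image to its k most similar other images."""
    n = len(sim)
    graph = {i: [] for i in range(n)}
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: sim[i][j], reverse=True)[:k]
        for j in others:
            graph[i].append((j, 1.0 - sim[i][j]))  # low cost when similar
    return graph

def smoothest_path(graph, start, goal):
    """Dijkstra's algorithm; returns the node list, or None if unreachable."""
    queue = [(0.0, start, [start])]
    done = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if node in done:
            continue
        done.add(node)
        for nxt, step in graph[node]:
            if nxt not in done:
                heapq.heappush(queue, (cost + step, nxt, path + [nxt]))
    return None

# Hypothetical similarities for four images forming a visual "chain".
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.2],
    [0.2, 0.8, 1.0, 0.9],
    [0.1, 0.2, 0.9, 1.0],
]
print(smoothest_path(knn_graph(sim, k=2), 0, 3))  # [0, 1, 2, 3]
```

Images 0 and 3 are dissimilar when compared directly, yet they are connected through a sequence of small visual steps, which is what makes exploring a scene through the graph feel continuous.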
