Big Visual Data


What makes Big Visual Data hard?

© Quint Buchholz

Alexei (Alyosha) Efros

Carnegie Mellon University


My Goals

1. To make you fall in love with Big Visual Data

• She is a fickle, coy mistress

• but holds the key to achieving real visual understanding

2. To ask for help in tackling this Big Interdisciplinary Problem


Driven by Visual Data

Texture Synthesis

Unsupervised Object Discovery

Inferring 3D from 2D

Dating Historical Images

Action Recognition

Seeing Through Water

Illumination Estimation

Geo-location


Texture: microcosm of Big Data

radishes rocks yogurt


Texture Synthesis


Classical Texture Synthesis

Synthesis

Analysis

Parametric

Texture

Model

Novel texture

Sample texture

This is hard!


Throwing away too much too soon?

input texture synthesized texture


Non-parametric Approach

Synthesis

Analysis

Novel texture

Sample texture


[Efros & Leung ’99; Efros & Freeman ’01]

p

non-parametric

sampling

Input image
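A minimal sketch of the non-parametric sampling idea, in plain Python: each new pixel p is chosen by comparing its already-synthesized neighbourhood with every neighbourhood in the input image and sampling among the best matches. This is a simplified illustration, not the published algorithm. It assumes raster-scan order with a causal neighbourhood and unweighted squared differences, and the names `synthesize`, `half`, `eps` and `seed` are hypothetical.

```python
import random

def synthesize(sample, out_h, out_w, half=1, eps=0.1, seed=0):
    """Grow an out_h x out_w texture, one pixel at a time, from `sample`.

    `sample` is a list of rows of gray values. `half`, `eps` and `seed`
    are illustrative parameters, not values taken from the slides.
    """
    rng = random.Random(seed)
    sh, sw = len(sample), len(sample[0])
    # Causal neighbourhood: offsets of pixels already filled in raster order
    # (rows above, plus pixels to the left in the current row).
    offsets = [(dy, dx)
               for dy in range(-half, 1)
               for dx in range(-half, half + 1)
               if (dy, dx) < (0, 0)]
    out = [[None] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            # Already-synthesized neighbours of the current pixel p.
            known = [(dy, dx, out[y + dy][x + dx])
                     for dy, dx in offsets
                     if y + dy >= 0 and 0 <= x + dx < out_w]
            # Score every sample pixel by how well its neighbourhood matches.
            scored = []
            for sy in range(half, sh):
                for sx in range(half, sw - half):
                    d = sum((sample[sy + dy][sx + dx] - v) ** 2
                            for dy, dx, v in known)
                    scored.append((d, sample[sy][sx]))
            best = min(d for d, _ in scored)
            # Non-parametric sampling: pick at random among near-best matches.
            candidates = [v for d, v in scored if d <= best * (1 + eps)]
            out[y][x] = rng.choice(candidates)
    return out

# Toy usage: a checkerboard sample should grow into a larger checkerboard.
sample = [[255 * ((x + y) % 2) for x in range(6)] for y in range(6)]
texture = synthesize(sample, 8, 8, eps=0.0)
for row in texture:
    print("".join("#" if v else "." for v in row))
```

No texture model is ever fitted here: the sample image itself acts as the model, which is the point of the non-parametric approach.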


Texture Growing


input image

Portilla & Simoncelli

Xu, Guo & Shum

Wei & Levoy Our algorithm


Two Kinds of Things in the World

Navier-Stokes Equation + weather

+ location

+ …


Lots of data available


“Unreasonable Effectiveness of Data”

• Parts of our world can be explained by elegant mathematics:

– physics, chemistry, astronomy, etc.

• But much cannot:

– psychology, genetics, economics, … visual understanding?

• Enter: The Magic of Data

– Great advances in several fields:

[Halevy, Norvig, Pereira 2009]

• e.g. speech recognition, machine translation, Google


The A.I. for the postmodern world


The Good News

Really stupid algorithms + Lots of Data

= “Unreasonable Effectiveness”


140 billion images

6 billion added monthly

72 hours uploaded every minute

6 billion images

1 billion images served daily

3.5 trillion photographs

90% of net traffic will be visual!


Genetics, Disease, Tracking, Drugs, Policy, Medical Data, Scientific Experiments, Economic Data, Physics, Business Intelligence, Psychology, Social Graphs, Data Mining, Business Data, Marketing, Dating, Collaborative Filters, Web Text, Visual Data?, Search


Bad News

Visual Data is difficult to handle

• text:

– clean, segmented, compact, 1D, indexable

Visual data:

– Noisy, unsegmented, high entropy, 2D/3D


Computing distances is hard

CLIME - CRIME = Hamming distance of 1 letter

(x, y) - (x, y) = Euclidean distance of 5 units

[gray patch] - [gray patch] = Grayvalue distance of 50 values

[image] - [image] = ?
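The first three comparisons above each have a one-line definition, which a short Python sketch can make concrete. The last one does not: the `pixel_ssd` function below is only one naive candidate (sum of squared pixel differences, an assumption for illustration), and the toy example shows why it is unsatisfying. All function names here are hypothetical.

```python
import math

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(ca != cb for ca, cb in zip(a, b))

def euclidean(p, q) -> float:
    """Straight-line distance between two points given as coordinate tuples."""
    return math.dist(p, q)

def gray_distance(g1: int, g2: int) -> int:
    """Absolute difference between two gray values."""
    return abs(g1 - g2)

def pixel_ssd(img_a, img_b) -> int:
    """Sum of squared differences between two equal-sized gray images."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(img_a, img_b)
               for a, b in zip(row_a, row_b))

print(hamming("CLIME", "CRIME"))      # 1
print(euclidean((0, 0), (3, 4)))      # 5.0
print(gray_distance(100, 150))        # 50

# The same vertical stripe, shifted by one pixel: visually almost identical,
# yet the pixel-wise distance is as large as it can be for these pixels.
stripe = [[0, 255, 0, 0], [0, 255, 0, 0]]
shifted = [[0, 0, 255, 0], [0, 0, 255, 0]]
print(pixel_ssd(stripe, shifted))     # 260100
```

A one-pixel shift leaves the picture essentially unchanged to a viewer but produces a large pixel-wise distance, which is one way to see why the final "= ?" has no obvious answer.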


How similar are two pictures?

?

=


Medici Fountain, Paris


INDEXING VIA “VISUAL WORDS”


[SIFT: Lowe, 2004]

“VISUAL WORD” MATCHING


[SIFT: Lowe, 2004]

letter

“VISUAL WORD” MATCHING
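As a rough illustration of indexing via visual words: local descriptors are quantized to the nearest entry of a codebook, and images are then indexed by the words they contain, much like a text index. The sketch below is a toy under stated assumptions. The 2-D tuples stand in for real descriptors such as SIFT, the codebook is given rather than learned, and the names `nearest_word`, `build_inverted_index` and `query` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(descriptor, codebook[i]))

def to_visual_words(descriptors, codebook):
    """Histogram of visual-word ids for one image's descriptors."""
    return Counter(nearest_word(d, codebook) for d in descriptors)

def build_inverted_index(images, codebook):
    """Map each visual word to the set of images that contain it."""
    index = defaultdict(set)
    for name, descriptors in images.items():
        for word in to_visual_words(descriptors, codebook):
            index[word].add(name)
    return index

def query(descriptors, codebook, index):
    """Rank images by how many of the query's visual words they share."""
    votes = Counter()
    for word in to_visual_words(descriptors, codebook):
        for name in index.get(word, ()):
            votes[name] += 1
    return votes.most_common()

# Toy data: three "words" and two images (all values are made up).
codebook = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
images = {
    "fountain_a": [(0.5, 0.2), (9.0, 1.0)],
    "fountain_b": [(1.0, 9.0)],
}
index = build_inverted_index(images, codebook)
print(query([(9.5, 0.5)], codebook, index))
```

Once descriptors become discrete words, lookup no longer needs a distance computation against every image, only a dictionary access per word.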


Medici Fountain, Paris (winter)


Visual “Garbage Heap”

“It irritated him that the ‘dog’ of 3:14 in the afternoon, seen in profile, should be indicated by the same noun as the dog of 3:15, seen frontally…”

“My memory, sir, is like a garbage heap.”

-- from Funes the Memorious, Jorge Luis Borges

Organizing the “Garbage Heap”:

• Finding visual correspondences across data

• Mining Visual Data

• Connecting visual data to enable understanding (Visual Memex)


Improving Visual Correspondence


Lots of Tiny Images

• 80 million tiny images: a large dataset for non-parametric object and scene recognition. Antonio Torralba, Rob Fergus and William T. Freeman. PAMI 2008.


Lots Of Images

A. Torralba, R. Fergus, W.T.Freeman. PAMI 2008



Automatic Colorization

Grayscale input High resolution

Colorization of input using average

A. Torralba, R. Fergus, W.T.Freeman. 2008


[Hays & Efros, SIGGRAPH’07]


Scene Descriptor


Scene Gist Descriptor

(Oliva and Torralba 2001)


2 Million Flickr Images


… 200 scene matches
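The retrieval step implied here can be sketched as a nearest-neighbour search: describe every image by a fixed-length scene descriptor and keep the k database images closest to the query. The sketch below assumes plain Euclidean distance and brute-force search over short made-up vectors. The function name `scene_matches` and the toy data are hypothetical, and a collection of millions of images may well call for a faster search scheme.

```python
import heapq
import math

def scene_matches(query_descriptor, database, k=200):
    """Return the k database entries whose descriptors are closest to the query.

    `database` is a list of (image_id, descriptor) pairs; descriptors are
    equal-length sequences of floats standing in for real scene descriptors.
    """
    return heapq.nsmallest(
        k, database,
        key=lambda entry: math.dist(query_descriptor, entry[1]))

# Toy database with 3-D descriptors (values are invented for illustration).
database = [
    ("beach", (0.9, 0.1, 0.0)),
    ("street", (0.1, 0.8, 0.3)),
    ("harbor", (0.8, 0.2, 0.1)),
    ("forest", (0.0, 0.2, 0.9)),
]
top = scene_matches((1.0, 0.1, 0.0), database, k=2)
print([name for name, _ in top])
```

The algorithm is deliberately simple. In the spirit of the deck, the quality of the matches comes from the size of the database rather than from the cleverness of the search.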


Improving Visual Correspondence



Visual Data has a Long Tail

The rare is common!


LEARNING BETTER VISUAL CORRESPONDENCES

ABHINAV SHRIVASTAVA, TOMASZ MALISIEWICZ, ABHINAV GUPTA, ALEXEI EFROS

SIGGRAPH ASIA’11


Input Query

Top Matches



IMPORTANT PARTS?

Input Query Important Parts


Input Query

Top Matches


Way more efficient approaches:

[Ramanan et al 2012, Durand et al 2012]


SEARCH USING PAINTINGS

Input Painting

Our Approach

GIST

Bag-of-Words

Tiny Images

HOG


SEARCH USING PAINTINGS

Input Painting Top Matches



SEARCH USING SKETCHES

Input Sketch

Our Approach

Tiny Images

GIST

Bag-of-Words


SEARCH USING SKETCHES


APPLICATIONS


RE-PHOTOGRAPHY

Historical Image of Boston Station

Computational Re-photography (Bae et al., 2010)

Re-photographed Image


RE-PHOTOGRAPHY

Historical Image of Boston Station

Computational Re-photography (Bae et al., 2010)

Re-photographed Image Then & Now View


INTERNET RE-PHOTOGRAPHY

Historical Image of Boston Station

Historical Image of Boston Station

Computational Re-photography (Bae et al., 2010)

Re-photographed Image Then & Now View

Search

10,000 Flickr Images of Boston

Our Approach

Top Match


INTERNET RE-PHOTOGRAPHY

Historical Image of Boston Station

Historical Image of Boston Station

Computational Re-photography (Bae et al., 2010)

Re-photographed Image Then & Now View

Top Match

From 10,000 Flickr Images

Our Approach

Then & Now View


WHERE WAS THE PAINTER STANDING?

Input Painting


Input Painting

PAINTING2GPS

Retrieval set

10,000 Geo-tagged Flickr Images

100 top matches used for estimation


PAINTING2GPS

Input Painting Estimated Geo-location

Estimated using 100 top matches
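The slides say only that the location is estimated from the 100 top matches, not how. One simple possibility, shown purely as an assumption, is to take the coordinate-wise median of the GPS tags of the best-scoring geo-tagged images, which tolerates a few wrong matches. The function name `estimate_location`, the score convention (higher means more similar) and the toy coordinates are all hypothetical.

```python
from statistics import median

def estimate_location(matches, top_n=100):
    """Estimate (lat, lon) from geo-tagged retrieval results.

    `matches` is a list of (similarity, lat, lon) tuples, where a higher
    similarity means a better match. The median is an illustrative choice
    of estimator; it ignores longitude wrap-around at +/-180 degrees, which
    is harmless when all matches come from one city.
    """
    top = sorted(matches, key=lambda m: m[0], reverse=True)[:top_n]
    return median(m[1] for m in top), median(m[2] for m in top)

# Toy usage: two good matches agree, one strong match is an outlier.
matches = [
    (0.9, 42.35, -71.06),
    (0.8, 42.36, -71.05),
    (0.7, 48.85, 2.35),    # a confident but wrong match far away
    (0.1, 0.0, 0.0),       # weak match, dropped by top_n below
]
print(estimate_location(matches, top_n=3))
```

With a mean instead of a median, the single distant outlier would pull the estimate far from the other matches, which is why a robust estimator is a natural first choice for this kind of voting.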


VISUAL SCENE EXPLORATION



Query image

FINDING SIMILAR IMAGES


PAIRWISE SIMILARITY MATRIX




TRAVERSING THE GRAPH

Similar magazines
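One way to read the pairwise-similarity and graph-traversal slides is as a three-step pipeline: compute a similarity for every pair of images, link each image to its most similar neighbours, and then walk the resulting graph from one image to another. The sketch below is an illustration under assumptions, not the method from the talk. It uses negative Euclidean distance as similarity, a k-nearest-neighbour graph, and breadth-first search. The names `similarity_matrix`, `knn_graph` and `path` are hypothetical.

```python
import math
from collections import deque

def similarity_matrix(descriptors):
    """Pairwise similarity; here simply the negative Euclidean distance."""
    n = len(descriptors)
    return [[-math.dist(descriptors[i], descriptors[j]) for j in range(n)]
            for i in range(n)]

def knn_graph(sim, k=2):
    """Link every image to its k most similar other images."""
    graph = {}
    for i, row in enumerate(sim):
        others = sorted((j for j in range(len(row)) if j != i),
                        key=lambda j: row[j], reverse=True)
        graph[i] = others[:k]
    return graph

def path(graph, start, goal):
    """Breadth-first search: a route with the fewest hops, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            route = []
            while node is not None:
                route.append(node)
                node = previous[node]
            return route[::-1]
        for neighbour in graph[node]:
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None

# Toy usage: five images whose 1-D descriptors lie along a line.
descriptors = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
graph = knn_graph(similarity_matrix(descriptors), k=2)
print(path(graph, 0, 4))
```

Each hop on the route joins two images that are directly similar, so the two endpoints can look quite different while every intermediate step stays visually small.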