Segmentation

Segmentation of a color image

Image **segmentation**

Segmentation
• Segmentation means to divide up the image into a patchwork of regions, each of which is “homogeneous”, that is, the “same” in some sense: intensity, texture, colour, …
• The **segmentation** operation only subdivides an image; it does not attempt to recognize the segmented image parts!

Complete vs. partial **segmentation**
• Complete **segmentation** divides an image into non-overlapping regions that correspond to real-world objects.
• Cooperation with higher processing levels that use specific knowledge of the problem domain is necessary.

Complete vs. partial **segmentation**
• Partial **segmentation**: regions do not correspond directly to image objects.
• The image is divided into separate regions that are homogeneous with respect to a chosen property such as brightness, color, texture, etc.

Gestalt (holistic) laws of perceptual organization
• The emphasis in the Gestalt approach was on the configuration of the elements.
• Proximity: Objects that are closer to one another tend to be grouped together.
• Closure: Humans tend to enclose a space by completing a contour and ignoring gaps.

Gestalt laws of perceptual organization
• Similarity: Elements that look similar will be perceived as part of the same form (color, shape, texture, and motion).
• Continuation: Humans tend to continue contours whenever the elements of the pattern establish an implied direction.

Gestalt laws
• A series of factors affect whether elements should be grouped together.
• Proximity: tokens that are nearby tend to be grouped.
• Similarity: similar tokens tend to be grouped together.
• Common fate: tokens that have coherent motion tend to be grouped together.
• Common region: tokens that lie inside the same closed region tend to be grouped together.
• Parallelism: parallel curves or tokens tend to be grouped together.

Gestalt laws
• Closure: tokens or curves that tend to lead to closed curves tend to be grouped together.
• Symmetry: curves that lead to symmetric groups are grouped together.
• Continuity: tokens that lead to “continuous” curves tend to be grouped.
• Familiar configuration: tokens that, when grouped, lead to a familiar object, tend to be grouped together.

Gestalt laws

Consequence: groupings by invisible completions
Stressing the invisible groupings.
* Images from Steve Lehar’s Gestalt papers: http://cns-alumni.bu.edu/pub/slehar/Lehar.html

Image **segmentation**
• Segmentation criteria: a **segmentation** is a partition of an image I into a set of regions S satisfying:
1. ∪ᵢ Sᵢ = S (the partition covers the whole image).
2. Sᵢ ∩ Sⱼ = ∅ for i ≠ j (no regions intersect).
3. ∀Sᵢ: P(Sᵢ) = true (the homogeneity predicate is satisfied by each region).
4. P(Sᵢ ∪ Sⱼ) = false for i ≠ j, Sᵢ adjacent to Sⱼ (the union of adjacent regions does not satisfy it).

Image **segmentation**
• So, all we have to do is to define and implement the similarity predicate.
• But what do we want to be similar in each region?
• Is there any property that will cause the regions to be meaningful objects?

Segmentation methods
Pixel-based
• Histogram
• Clustering
Region-based
• Region growing
• Split and merge
Edge-based
Model-based
Physics-based
Graph-based

Histogram-based **segmentation**
• How many “orange” pixels are in this image?
• This type of question can be answered by looking at the histogram.
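The counting idea above can be sketched in a few lines. The tiny “image” (a flat list of intensities) and the chosen intensity range are made-up stand-ins for the slide’s orange pixels:

```python
from collections import Counter

def histogram(pixels, bins=256):
    """Count how many pixels fall into each intensity bin."""
    h = Counter(pixels)
    return [h.get(v, 0) for v in range(bins)]

def count_in_range(pixels, lo, hi):
    """Answer 'how many pixels in this intensity range?' via the histogram."""
    h = histogram(pixels)
    return sum(h[lo:hi + 1])

# A tiny synthetic 'image' as a flat list of intensities.
img = [10, 12, 200, 210, 205, 11, 199, 13]
print(count_in_range(img, 195, 215))  # pixels in the bright mode
```

The histogram discards spatial layout, which is exactly why this kind of query is cheap: it is a single pass over bin counts.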

Histogram-based **segmentation**
• How many modes are there?
• Solve this by reducing the number of colors to K and mapping each pixel to the closest color.
• Here’s what it looks like if we use two colors.

Clustering-based **segmentation**
• How to choose the representative colors?
• This is a clustering problem!
• The K-means algorithm can be used for clustering.
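A minimal K-means sketch on 1-D intensities (the pixel values are made up; a real image would use 3-D color vectors, but the assign/update loop is the same):

```python
import random

def kmeans(values, k, iters=20, seed=0):
    """Plain 1-D K-means: assign each value to the nearest
    representative, then recompute the representatives."""
    random.seed(seed)
    centers = random.sample(values, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            i = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[i].append(v)
        # Empty clusters keep their previous center.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

pixels = [12, 15, 10, 200, 198, 205, 14, 202]
centers = sorted(kmeans(pixels, 2))
print(centers)  # one dark and one bright representative
```

Segmenting then means replacing each pixel with its nearest center, which is exactly the K-color reduction described on the previous slide.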

Clustering-based **segmentation**
K-means clustering of color.

Results of K-means clustering
Image; clusters on intensity; clusters on color.
K-means clustering using intensity alone and color alone.
* From Marc Pollefeys, COMP 256, 2003

Clustering-based **segmentation**
• Clustering can also be used with other features (e.g., texture) in addition to color.
Original images; color regions; texture regions.

Clustering-based **segmentation**
• K-means variants:
• Different ways to initialize the means.
• Different stopping criteria.
• Dynamic methods for determining the right number of clusters (K) for a given image.
• Problem: histogram-based and clustering-based **segmentation** can produce messy regions.
• How can these be fixed?

Clustering-based **segmentation**
• The Expectation-Maximization (EM) algorithm can be used as a probabilistic clustering method where each cluster is modeled using a Gaussian.
• The clusters are updated iteratively by computing the parameters of the Gaussians.
Example from UC Berkeley’s Blobworld system.

Clustering-based **segmentation**
Examples from UC Berkeley’s Blobworld system.

Region growing
• Region growing techniques start with one pixel of a potential region and try to grow it by adding adjacent pixels until the pixels being compared are too dissimilar.
• The first pixel selected can be just the first unlabeled pixel in the image, or a set of seed pixels can be chosen from the image.
• Usually a statistical test is used to decide which pixels can be added to a region.
• A region is a population with similar statistics.
• Use a statistical test to see if a neighbor on the border fits into the region population.
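The loop above can be sketched with a breadth-first grow from a seed. As a simplification of the slide’s statistical test, a neighbor is accepted when it lies within a fixed threshold of the running region mean (the 3×3 image and threshold are made up):

```python
from collections import deque

def region_grow(img, seed, thresh):
    """Grow a region from `seed`, adding 4-connected neighbors whose
    intensity is within `thresh` of the running region mean."""
    h, w = len(img), len(img[0])
    region = {seed}
    total = img[seed[0]][seed[1]]
    q = deque([seed])
    while q:
        r, c = q.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and (nr, nc) not in region:
                mean = total / len(region)
                if abs(img[nr][nc] - mean) <= thresh:
                    region.add((nr, nc))
                    total += img[nr][nc]
                    q.append((nr, nc))
    return region

img = [[10, 11, 90],
       [12, 10, 92],
       [11, 13, 95]]
print(sorted(region_grow(img, (0, 0), 5)))  # the dark left block only
```

Updating the mean as pixels join is what makes this a (crude) population test rather than a comparison against the seed alone.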

Region growing image **segmentation**

Split-and-merge
1. Start with the whole image.
2. If the variance is too high, break into quadrants.
3. Merge any adjacent regions that are similar enough.
4. Repeat steps 2 and 3 iteratively until no more splitting or merging occurs.
Idea: good. Results: blocky.
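The steps above can be sketched on a toy power-of-two image. The split phase is the standard quadtree recursion; the merge phase is simplified to grouping leaves with close means, skipping the adjacency test a full implementation would include (image and thresholds are made up):

```python
def variance(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

def split(img, r, c, size, thresh, leaves):
    """Split phase: recursively break a square block into quadrants
    while its intensity variance exceeds `thresh`."""
    vals = [img[r + i][c + j] for i in range(size) for j in range(size)]
    if size == 1 or variance(vals) <= thresh:
        leaves.append(((r, c, size), sum(vals) / len(vals)))
    else:
        half = size // 2
        for dr in (0, half):
            for dc in (0, half):
                split(img, r + dr, c + dc, half, thresh, leaves)

img = [[10, 10, 90, 90],
       [10, 10, 90, 90],
       [10, 10, 90, 90],
       [10, 10, 90, 90]]
leaves = []
split(img, 0, 0, 4, thresh=1.0, leaves=leaves)

# Merge phase (simplified: group leaves whose means are close).
groups = []
for rect, m in leaves:
    for g in groups:
        if abs(g["mean"] - m) <= 1.0:
            g["rects"].append(rect)
            break
    else:
        groups.append({"mean": m, "rects": [rect]})
print(len(leaves), len(groups))  # 4 homogeneous quadrants, 2 merged regions
```

The “blocky” artifact on the slide comes directly from the square quadrants the split phase produces.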

Split-and-merge

Split-and-merge

Split-and-merge
A satellite image.
A large connected region formed by merging pixels labeled as residential after classification.
More compact sub-regions after the split-and-merge procedure.

Mean Shift Segmentation
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html

Mean Shift Algorithm
1. Choose a search window size.
2. Choose the initial location of the search window.
3. Compute the mean location (centroid of the data) in the search window.
4. Center the search window at the mean location computed in Step 3.
5. Repeat Steps 3 and 4 until convergence.
The mean shift algorithm seeks the “mode”, or point of highest density, of a data distribution.
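Steps 3–5 above can be sketched in one dimension with a flat (uniform) window; the data points and window radius are made up:

```python
def mean_shift(data, start, radius, iters=100, tol=1e-6):
    """Slide a window of half-width `radius` toward the mean of the
    points it covers; this converges to a local density mode."""
    x = start
    for _ in range(iters):
        window = [v for v in data if abs(v - x) <= radius]
        new_x = sum(window) / len(window)   # centroid of the window
        if abs(new_x - x) < tol:            # converged: window stopped moving
            break
        x = new_x
    return x

data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.9]
print(mean_shift(data, start=0.5, radius=1.0))  # converges near the mode at ~1
print(mean_shift(data, start=6.0, radius=1.0))  # converges near the mode at ~5
```

Different starting windows land on different modes, which is exactly what the segmentation variant on the next slide exploits by merging windows that end up on the same peak.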

Mean Shift Segmentation Algorithm
1. Convert the image into tokens (via color, gradients, texture measures, etc.).
2. Choose initial search window locations uniformly in the data.
3. Compute the mean shift window location for each initial position.
4. Merge windows that end up on the same “peak” or mode.
5. The data these merged windows traversed are clustered together.
* Image from: Dorin Comaniciu and Peter Meer, Distribution Free Decomposition of Multivariate Data, Pattern Analysis & Applications (1999) 2:22–30

Mean Shift Segmentation Extension
Mean shift is scale (search window size) sensitive. Solution: use all scales.
Gary Bradski’s internally published agglomerative clustering extension: mean shift dendrograms.
1. Place a tiny mean shift window over each data point.
2. Grow the window and mean shift it.
3. Track windows that merge, along with the data they traversed.
4. Repeat until everything is merged into one cluster.
Best 4 clusters; best 2 clusters.
Advantage over agglomerative clustering: highly parallelizable.

Mean Shift Segmentation Results
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html

Graph Cut

Graph Cut
• First: select a region of interest

Graph Cut
• How to select the object automatically?

Graph Cut
• What are graphs?
Nodes
• usually pixels
• sometimes samples
Edges
• weights associated, W(i,j)
• e.g., RGB value difference

Graph Cut
• What are cuts?
• Each “cut” -> points, W(i,j)
• Optimization problem
• W(i,j) = |RGB(i) – RGB(j)|

Graph Cut
• Go back to our selected region
• Each “cut” -> points, W(i,j)
• Optimization problem
• W(i,j) = |RGB(i) – RGB(j)|

Graph Cut
• We want the highest sum of weights
• Each “cut” -> points, W(i,j)
• Optimization problem
• W(i,j) = |RGB(i) – RGB(j)|

Graph Cut
• We want the highest sum of weights
• Each “cut” -> points, W(i,j)
• Optimization problem
• W(i,j) = |RGB(i) – RGB(j)|
These cuts give low points: W(i,j) = |RGB(i) – RGB(j)| is low.

Graph Cut
• We want the highest sum of weights
• Each “cut” -> points, W(i,j)
• Optimization problem
• W(i,j) = |RGB(i) – RGB(j)|
These cuts give high points: W(i,j) = |RGB(i) – RGB(j)| is high.

Graph-based **segmentation**
• An image is represented by a graph whose nodes are pixels or small groups of pixels.
• The goal is to partition the nodes into disjoint sets so that the similarity within each set is high and across different sets is low.
http://www.cs.berkeley.edu/~malik/papers/SM-ncut.pdf

Graph-based **segmentation**
• Let G = (V,E) be a graph. Each edge (u,v) has a weight w(u,v) that represents the similarity between u and v.
• Graph G can be broken into 2 disjoint graphs with node sets A and B by removing edges that connect these sets.
• Let cut(A,B) = Σ_{u∈A, v∈B} w(u,v).
• One way to segment G is to find the minimal cut.
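The cut definition above is a direct sum over crossing edges. A minimal sketch on a made-up 4-node graph (edges stored as an undirected weight dictionary):

```python
def cut(weights, A, B):
    """cut(A, B): total weight of edges with one endpoint in A
    and the other in B."""
    return sum(wt for (u, v), wt in weights.items()
               if (u in A and v in B) or (v in A and u in B))

# Toy 4-node graph: strong edges inside {0,1} and {2,3},
# weak edges crossing between them (weights are made up).
w = {(0, 1): 0.9, (2, 3): 0.8, (1, 2): 0.1, (0, 3): 0.2}
print(cut(w, {0, 1}, {2, 3}))  # only the weak crossing edges: 0.1 + 0.2
```

Segmenting by minimal cut means searching for the (A, B) split that makes this sum as small as possible.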

Graph-based **segmentation**

Graph-based **segmentation**
• Minimal cut favors cutting off small node groups, so Shi and Malik proposed the normalized cut:
Ncut(A,B) = cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V)
where assoc(A,V) = Σ_{u∈A, t∈V} w(u,t) measures how much A is connected to the graph as a whole.

Solve for minimum penalty
(Figure: partition A, the cut, partition B.)

Graph-based **segmentation**
Worked example (edge weights shown in the figure):
Ncut(A,B) = cut(A,B)/21 + cut(A,B)/16

Graph Cut
• Optimization solver example
Recursion:
1. Grow
2. If W(i,j) is low: stop; otherwise continue

Graph Cut
• Result: isolation

Image Segmentation and Graph Cuts
• Image Segmentation
• Graph Cuts

The Pipeline
• Input: image
• Output: segments
• Each iteration cuts into 2 pieces
(Flowchart: assign W(i,j) → solve for minimum penalty → cut into 2 → subdivide? If yes, repeat on each piece; if no, stop.)

Assign W(i,j)
• W(i,j) = |RGB(i) – RGB(j)| is noisy!
• Could use brightness and locality:
w(i,j) = exp(−‖F(i) − F(j)‖² / σ_F²) · exp(−‖X(i) − X(j)‖² / σ_X²) if ‖X(i) − X(j)‖ < r, and 0 otherwise.
X(i) is the spatial location of node i.
F(i) is the feature vector for node i, which can be intensity, color, texture, motion, …
The formula is set up so that w(i,j) is 0 for nodes that are too far apart.
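A sketch of a Shi–Malik-style affinity of this form; the two Gaussian bandwidths `sigma_f`, `sigma_x` and the cutoff radius `r` are made-up defaults, not values from the slides:

```python
import math

def weight(Fi, Fj, Xi, Xj, sigma_f=0.1, sigma_x=4.0, r=5.0):
    """Affinity from feature similarity AND spatial proximity:
    w = exp(-|F|^2/sigma_f^2) * exp(-|X|^2/sigma_x^2),
    hard-clamped to 0 when the nodes are farther apart than r."""
    dist2 = sum((a - b) ** 2 for a, b in zip(Xi, Xj))
    if math.sqrt(dist2) >= r:
        return 0.0
    feat2 = sum((a - b) ** 2 for a, b in zip(Fi, Fj))
    return math.exp(-feat2 / sigma_f ** 2) * math.exp(-dist2 / sigma_x ** 2)

# Nearby pixels with similar intensity: high weight.
print(weight((0.5,), (0.52,), (0, 0), (1, 0)))
# Pixels beyond the cutoff radius: weight exactly 0.
print(weight((0.5,), (0.52,), (0, 0), (10, 0)))  # 0.0
```

The radius cutoff also makes the affinity matrix W sparse, which matters for the eigenvector computation on the next slides.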

Graph-based **segmentation**
• Shi and Malik turned graph cuts into an eigenvector/eigenvalue problem.
• Set up a weighted graph G = (V,E).
• V is the set of (N) pixels.
• E is a set of weighted edges (weight w_ij gives the similarity between nodes i and j).
• Length-N vector d: d_i is the sum of the weights from node i to all other nodes.
• N × N matrix D: D is a diagonal matrix with d on its diagonal.
• N × N symmetric matrix W: W_ij = w_ij.

Graph-based **segmentation**
• Let x be a characteristic vector of a set A of nodes:
• x_i = 1 if node i is in A
• x_i = −1 otherwise
• Let y be a continuous approximation to x.
• Solve the system of equations
(D − W) y = λ D y
for the eigenvectors y and eigenvalues λ.
• Use the eigenvector y with the second smallest eigenvalue to bipartition the graph (y → x → A).
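A minimal numerical sketch of this eigenvector bipartition, assuming numpy is available. The 4-node affinity matrix is made up (two tight groups joined by weak bridges), and the generalized problem (D − W)y = λDy is rewritten as an ordinary one:

```python
import numpy as np

def ncut_bipartition(W):
    """Solve (D - W) y = lambda * D y and split the graph on the sign
    of the eigenvector with the second smallest eigenvalue."""
    d = W.sum(axis=1)
    L = np.diag(d) - W
    # Generalized problem rewritten as ordinary: D^-1 (D - W) y = lambda y
    vals, vecs = np.linalg.eig(np.diag(1.0 / d) @ L)
    order = np.argsort(vals.real)
    y = vecs[:, order[1]].real   # eigenvector of the 2nd smallest eigenvalue
    return y >= 0                # boolean partition mask (thresholded at 0)

# Two tight groups {0,1} and {2,3} joined by weak bridge edges.
W = np.array([[0.0, 0.9, 0.0, 0.1],
              [0.9, 0.0, 0.1, 0.0],
              [0.0, 0.1, 0.0, 0.9],
              [0.1, 0.0, 0.9, 0.0]])
mask = ncut_bipartition(W)
print(mask)  # {0,1} and {2,3} land on opposite sides (sign may flip)
```

Thresholding y at 0 is the simplest way to recover the discrete x from the continuous relaxation; Shi and Malik also consider sweeping the threshold to minimize Ncut directly.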

Extensions: edge weights
• How to calculate the edge weights?
• Intensity
• Color (HSV)
• Texture

Continued Work: Semantic Segmentation
• Incorporating top-down information into low-level **segmentation**.
Interactive Graph Cuts: Yuri Boykov et al.

Graph-based **segmentation**

GrabCut
User input and result for three tools:
• Magic Wand (198?): regions
• Intelligent Scissors, Mortensen and Barrett (1995): boundary
• GrabCut, Rother et al. 2004: regions & boundary
Slides: C. Rother et al., Microsoft Research, Cambridge

Data term
• Foreground and background color distributions are each modeled with a Gaussian Mixture Model (typically 5–8 components).

Smoothness term
• An object is a coherent set of pixels.
Iterate until convergence:
1. Compute a configuration given the mixture model. (E-step)
2. Compute the model parameters given the configuration. (M-step)

Moderately simple examples: GrabCut completes automatically.

Difficult examples
• Camouflage & low contrast
• Fine structure
• No telepathy
(Shown: initial rectangle and result.)

Markov Random Fields (MRF)
• A graphical model for describing spatial consistency in images.
• Suppose you want to label image pixels with some labels {l₁,…,l_k}, e.g., **segmentation**, stereo disparity, foreground-background, etc.
Refs:
1. S. Z. Li. Markov Random Field Modeling in Image Analysis. Springer-Verlag, 1991.
2. S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and Bayesian restoration of images. PAMI, 6(6):721–741, 1984.
From slides by S. Seitz, University of Washington
CS 534 – Stereo Imaging

Definition
MRF components:
• A set of sites P = {1,…,m}: each pixel is a site.
• A neighborhood for each pixel: N = {N_p | p ∈ P}.
• A set of random variables (a random field), one for each site: F = {F_p | p ∈ P}, denoting the label at each pixel.
• Each random variable takes a value f_p from the set of labels L = {l₁,…,l_k}.
• We have a joint event {F₁ = f₁,…, F_m = f_m}, or a configuration, abbreviated as F = f.
• The joint probability of such a configuration: Pr(F = f), or Pr(f).

Definition
MRF components:
• Pr(f_i) > 0 for all variables f_i.
• Markov property: each random variable depends on the other RVs only through its neighbors: Pr(f_p | f_{S−{p}}) = Pr(f_p | f_{N_p}), ∀p.
• So we need to define a neighborhood system N_p (the neighbors of site p).
• There are no strict rules for neighborhood definition.
(Figure: cliques for this neighborhood.)

Definition
MRF components:
• The joint probability of such a configuration: Pr(F = f), or Pr(f).
• Markov property: each random variable depends on the other RVs only through its neighbors: Pr(f_p | f_{S−{p}}) = Pr(f_p | f_{N_p}), ∀p.
• So we need to define a neighborhood system N_p (the neighbors of site p).
Hammersley-Clifford theorem:
Pr(f) ∝ exp(−Σ_C V_C(f))
where the sum is over all cliques in the neighborhood system and V_C is the clique potential.
We may decide:
1. NOT to include all cliques in a neighborhood; or
2. to use a different V_C for different cliques in the same neighborhood.

Optimal configuration
MRF components:
• Hammersley-Clifford theorem: Pr(f) ∝ exp(−Σ_C V_C(f))
• Consider MRFs with arbitrary cliques among neighboring pixels:
Pr(f) ∝ exp(−Σ_{c∈C} V_c(f_{p₁}, f_{p₂}, …))
The sum is over all cliques in the neighborhood system; V_C is the clique potential: the prior probability that the elements of clique C have certain values.
Typical potential, the Potts model:
V_{(p,q)}(f_p, f_q) = u_{(p,q)} · (1 − δ(f_p − f_q))

Optimal configuration
MRF components:
• Hammersley-Clifford theorem: Pr(f) ∝ exp(−Σ_C V_C(f))
• Consider MRFs with clique potentials of pairs of neighboring pixels:
Pr(f) ∝ exp(−[Σ_p V_p(f_p) + Σ_p Σ_{q∈N_p} V_{(p,q)}(f_p, f_q)])
Most commonly used; very popular in vision.
Energy function:
E(f) = Σ_p V_p(f_p) + Σ_p Σ_{q∈N_p} V_{(p,q)}(f_p, f_q)
There are two constraints to satisfy:
1. Data constraint: the labeling should reflect the observation.
2. Smoothness constraint: the labeling should reflect spatial consistency (pixels close to each other are most likely to have similar labels).
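The pairwise energy with a Potts smoothness term can be sketched on a made-up 1-D example of three sites with two labels (the data costs stand in for V_p):

```python
def potts_energy(labels, neighbors, data_cost, u=1.0):
    """E(f) = sum_p V_p(f_p) + sum_{(p,q)} u * [f_p != f_q]:
    a data term plus the pairwise Potts smoothness term."""
    e_data = sum(data_cost[p][labels[p]] for p in range(len(labels)))
    e_smooth = sum(u for p, q in neighbors if labels[p] != labels[q])
    return e_data + e_smooth

# Three sites in a row; data_cost[p][l] is the cost of label l at site p.
# Site 1's observation mildly prefers label 1, its neighbors prefer label 0.
data_cost = [[0.1, 2.0], [1.5, 0.2], [0.2, 1.9]]
neighbors = [(0, 1), (1, 2)]
print(potts_energy([0, 0, 0], neighbors, data_cost))  # uniform: no smoothness cost
print(potts_energy([0, 1, 0], neighbors, data_cost))  # flipping site 1 pays 2*u
```

Here the smoothness term outweighs site 1's data preference, so the uniform labeling has lower total energy: exactly the trade-off the two constraints describe.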

Probabilistic interpretation
• The problem is that we do not observe the labels; we observe something else that depends on these labels with some noise (e.g., intensity or disparity).
• At each site we have an observation i_p.
• The observed value at each site depends on its label: the probability of a certain observed value given a certain label at site p is g(i_p, f_p) = Pr(i_p | F_p = f_p).
• The overall observation probability given the labels: Pr(O | f) = Π_p g(i_p, f_p).
• We need to infer the labels given the observation: Pr(f | O) ∝ Pr(O | f) Pr(f).

Using MRFs
• How to model different problems?
• Given observations y and the parameters of the MRF, how do we infer the hidden variables x?
• How to learn the parameters of the MRF?

Modeling image pixel labels as an MRF
MRF-based **segmentation**
(Figure: real image with image-label factors Φ(x_i, y_i); label image with label-label factors Ψ(x_i, x_j).)
Slides by R. Huang, Rutgers University

MRF-based **segmentation**
• Classifying image pixels into different regions under the constraint of both local observations and spatial relationships.
• Probabilistic interpretation:
(x*, Θ*) = argmax_{(x,Θ)} P(x, Θ | y)
where x are the region labels, Θ the model parameters, and y the image pixels.

Model joint probability
(x*, Θ*) = argmax_{(x,Θ)} P(x, Θ | y)
where x are the region labels, Θ the model parameters, and y the image pixels.
How do we factorize?
P(x, y) = (1/Z) Π_{(i,j)} Ψ(x_i, x_j) Π_i Φ(x_i, y_i)
Ψ(x_i, x_j): label-label compatibility function over neighboring label nodes, enforcing the smoothness constraint.
Φ(x_i, y_i): image-label compatibility function over local observations, enforcing the data constraint.

Probabilistic interpretation
• We need to infer the labels given the observation: Pr(f | O) ∝ Pr(O | f) Pr(f).
The MAP estimate of f should minimize the posterior energy
E(f) = Σ_p Σ_{q∈N_p} V_{(p,q)}(f_p, f_q) − Σ_p ln g(i_p, f_p)
Neighborhood term: smoothness constraint. Data (observation) term: data constraint.
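A minimal sketch of minimizing this posterior energy, with a Gaussian likelihood g and a Potts smoothness term; the chain of four sites, the observations, and the class means are all made up. The −ln g term is expanded analytically, and with only four binary sites the MAP labeling can be found by brute force:

```python
import math
from itertools import product

def posterior_energy(f, obs, neighbors, means, sigma=1.0, u=1.0):
    """E(f) = sum_{(p,q)} u*[f_p != f_q] - sum_p ln g(i_p, f_p),
    with g(i_p, f_p) a Gaussian N(means[f_p], sigma) likelihood,
    its -ln expanded in closed form."""
    smooth = sum(u for p, q in neighbors if f[p] != f[q])
    data = sum((obs[p] - means[f[p]]) ** 2 / (2 * sigma ** 2)
               + math.log(sigma * math.sqrt(2 * math.pi))
               for p in range(len(f)))
    return smooth + data

obs = [0.1, 0.2, 2.1, 0.3]        # site 2 looks like the 'bright' class
means = [0.0, 2.0]                # class means for labels 0 and 1 (made up)
neighbors = [(0, 1), (1, 2), (2, 3)]

# Brute-force MAP: minimize the posterior energy over all 2^4 labelings.
best = min(product([0, 1], repeat=4),
           key=lambda f: posterior_energy(f, obs, neighbors, means))
print(best)  # (0, 0, 1, 0): the data term keeps site 2 bright
```

Brute force is only viable for toy sizes; the slides' point is that real MAP inference needs dedicated optimizers (e.g., graph cuts or sampling) over this same energy.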

Applying and learning the MRF
MRF-based **segmentation** with the EM algorithm:
• E-step (inference):
P(x | y, Θ) = (1/Z) P(y | x, Θ) P(x | Θ)
x* = argmax_x P(x | y, Θ)
• M-step (learning):
Θ* = argmax_Θ E[P(x, y | Θ)] = argmax_Θ Σ_x P(x, y | Θ) P(x | y, Θ)
Pseudo-likelihood method.

Applying and learning MRF: example
x* = argmax_x P(x | y)
P(x | y) = P(x, y)/P(y) ∝ P(x, y) = (1/Z) Π_i Φ(x_i, y_i) Π_{(i,j)} Ψ(x_i, x_j)
x* = argmax_x Π_i Φ(x_i, y_i) Π_{(i,j)} Ψ(x_i, x_j)
with
Φ(x_i, y_i) = G(y_i; μ_{x_i}, σ_{x_i}) (a Gaussian)
Ψ(x_i, x_j) = exp(−(x_i − x_j)² / σ²)
Θ = [μ_x, σ_x]