The Topography of Multivariate Normal Mixtures


Post Doctoral Talk

SAMSI

Surajit Ray

EMAIL: sray@bios.unc.edu

Surajit Ray Nov 2, 2004


Mixture of Distributions

■ A flexible way of modeling a heterogeneous population.

■ Data reduction through the number, location and shape of the mixture components.

■ Can be used directly for model-based clustering.


Mixture of Distributions

$K$-component mixture:

$$g(x) = \sum_{i=1}^{K} \pi_i\,\phi_i(x)$$

Parameters

■ $\phi_i(x)$, $i = 1,\dots,K$: component densities.

■ $\pi_i \in [0,1]$ and $\sum_{i=1}^{K} \pi_i = 1$: mixing proportions.


Number of components ≠ Number of Modes

[Figure: two density plots on the range −6 to 6. Left panel: mixture of normals 4 standard deviations apart (bimodal). Right panel: mixture of normals 2 standard deviations apart (unimodal).]

Number of components ≠ number of modes.

NOTE: Practical interest could lie in finding components that correspond to separate modes.
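The contrast between the two panels can be checked numerically. The sketch below is only an illustration: it assumes equal mixing proportions and unit standard deviations (the slide does not state them), and it counts modes by a simple grid search rather than by any method from the talk.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a univariate normal N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, means, sigmas):
    """Density of a finite normal mixture at x."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas))

def count_modes(weights, means, sigmas, lo=-10.0, hi=10.0, n=4001):
    """Count strict local maxima of the mixture density on an even grid."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    ys = [mixture_pdf(x, weights, means, sigmas) for x in xs]
    return sum(1 for i in range(1, n - 1) if ys[i - 1] < ys[i] > ys[i + 1])

# Assumed parameters: equal weights, unit variances.
far = count_modes([0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])   # means 4 sd apart
near = count_modes([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])  # means 2 sd apart
print(far, near)
```

Both mixtures have two components, but under these assumptions only the first has two modes.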


Detection of modality: Univariate Case

■ Conditions for bimodality of two univariate normals

Theorem 1. (?).

■ Conditions for arbitrary mixture in (??)

[Equations: the bimodality conditions on this slide are not legible in the extracted text.]


Modal Structure...Topography

■ Modes are potentially symptomatic of the underlying population structure.

Studying modal structures in high dimensions is extremely challenging.

Our Goals

■ Study the topography of normal mixtures in high dimensions.

■ Provide analytical and graphical solutions for determining the number of modes.

■ Show how they can be used to find the number of clusters in high-dimensional data: modal clusters.


Modality: Multivariate Distribution

[Figure: density surface over the (x, y) plane, both axes from −4 to 4, with the points (1,1) and (−1,−1) marked.]

Bivariate normal with means (−1,−1) and (1,1)


Modality: Multivariate Distribution

[Figure: density (0.0 to 0.5) plotted along the x-axis from −4 to 4.]

Marginal distribution in the x- and y-axis.


Modality: Multivariate Distribution

[Figure: density (0.00 to 0.25) plotted from −4 to 4; axis labels z1 and z3.]

Distribution along the line […]


Modality: Equal variance

■ If a mixture of multivariate normals is bimodal, then its distribution along some line is bimodal.

■ Get the axis of maximum separation.

■ Use the bimodality condition in the univariate case.

Theorem 2. Multivariate Modality Condition: Equal variance Case

The distribution of $\pi\,\phi(x;\mu_1,\Sigma) + (1-\pi)\,\phi(x;\mu_2,\Sigma)$ is bimodal iff the univariate mixture $\pi\,\phi(t;0,1) + (1-\pi)\,\phi(t;\delta,1)$ is bimodal, where $\delta = \sqrt{(\mu_2-\mu_1)^\top \Sigma^{-1} (\mu_2-\mu_1)}$.
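A minimal sketch of this reduction, under stated assumptions: with a common covariance, the two means are separated by their Mahalanobis distance along the axis joining them, and modality is then checked for a univariate mixture with unit variances at that separation. The identity covariance and equal mixing proportions used here are assumptions for illustration, and the helper names are hypothetical.

```python
import math

def mahalanobis_2d(mu1, mu2, cov):
    """Mahalanobis distance between two 2-D means under a common covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = mu2[0] - mu1[0], mu2[1] - mu1[1]
    # Quadratic form diff' * inv(cov) * diff with the 2x2 inverse written out.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)

def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def count_modes_on_axis(pi1, delta, n=4001):
    """Count strict local maxima of pi1*N(0,1) + (1-pi1)*N(delta,1) on a grid."""
    lo, hi = -6.0, delta + 6.0
    ts = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    fs = [pi1 * phi(t) + (1.0 - pi1) * phi(t - delta) for t in ts]
    return sum(1 for i in range(1, n - 1) if fs[i - 1] < fs[i] > fs[i + 1])

# Assumed example: means (-1,-1) and (1,1), identity covariance, equal weights.
identity = ((1.0, 0.0), (0.0, 1.0))
delta = mahalanobis_2d((-1.0, -1.0), (1.0, 1.0), identity)
joint_modes = count_modes_on_axis(0.5, delta)
# A single coordinate sees means -1 and 1 with unit sd, i.e. separation 2.
marginal_modes = count_modes_on_axis(0.5, 2.0)
print(round(delta, 3), joint_modes, marginal_modes)
```

Under these assumptions the separation along the axis through the means is larger than the separation seen by either coordinate, so the mixture can be bimodal even though each marginal is unimodal.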



Unequal variance

Two components, unequal variance:

The mixture density with the following parameters: […]

[Figure: plot in the (x, y) plane, x from −2 to 2 and y from −1 to 3.]


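The Ridgeline Manifold

Mapping from the unit simplex $\{\alpha : \alpha_i \ge 0,\ \sum_{i=1}^{K} \alpha_i = 1\}$ to the $D$-dimensional space, where $\mu_i$ and $\Sigma_i$ are the mean and covariance of component $i$:

$$x^*(\alpha) = \Big[\sum_{i=1}^{K} \alpha_i \Sigma_i^{-1}\Big]^{-1} \Big[\sum_{i=1}^{K} \alpha_i \Sigma_i^{-1} \mu_i\Big]$$

The image of this map will be denoted by $\mathcal{M}$ and called the ridgeline surface or manifold. If $K = 2$ it will be called the ridgeline, as it is a one-dimensional curve.

Theorem 3. All the critical values of the $D$-dimensional multivariate mixture, and hence modes, antimodes and saddlepoints, are points in $\mathcal{M}$.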
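As an illustration of how the ridgeline reduces the search for modes to one dimension, the sketch below traces the two-component ridgeline, evaluates the mixture density along it, and counts local maxima of that elevation. The parameter values are hypothetical (chosen here to give a three-mode mixture, not taken from the slides), and treating maxima of the elevation as modes of the mixture is assumed for the two-component case.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as a pair of rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def matvec(m, v):
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def normal_pdf_2d(x, mu, cov):
    """Bivariate normal density at x."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    diff = (x[0] - mu[0], x[1] - mu[1])
    s = matvec(inv2(cov), diff)
    q = diff[0] * s[0] + diff[1] * s[1]
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def ridgeline(alpha, mu1, cov1, mu2, cov2):
    """Ridgeline point with weight (1 - alpha) on component 1, alpha on component 2."""
    p1, p2 = inv2(cov1), inv2(cov2)
    w1, w2 = 1.0 - alpha, alpha
    prec = tuple(tuple(w1 * p1[i][j] + w2 * p2[i][j] for j in range(2)) for i in range(2))
    b1, b2 = matvec(p1, mu1), matvec(p2, mu2)
    rhs = (w1 * b1[0] + w2 * b2[0], w1 * b1[1] + w2 * b2[1])
    return matvec(inv2(prec), rhs)

def elevation(alpha, pi1, mu1, cov1, mu2, cov2):
    """Mixture density evaluated on the ridgeline."""
    x = ridgeline(alpha, mu1, cov1, mu2, cov2)
    return pi1 * normal_pdf_2d(x, mu1, cov1) + (1.0 - pi1) * normal_pdf_2d(x, mu2, cov2)

def count_ridgeline_maxima(pi1, mu1, cov1, mu2, cov2, n=1001):
    """Count local maxima of the elevation on a grid over [0, 1].

    The grid is padded with -inf so that a maximum lying closer to an
    endpoint than one grid step is still counted.
    """
    hs = [elevation(i / (n - 1), pi1, mu1, cov1, mu2, cov2) for i in range(n)]
    padded = [-math.inf] + hs + [-math.inf]
    return sum(1 for i in range(1, n + 1) if padded[i - 1] < padded[i] > padded[i + 1])

# Hypothetical parameters: two elongated components at right angles.
mu1, cov1 = (0.0, 0.0), ((1.0, 0.0), (0.0, 0.05))
mu2, cov2 = (1.0, 1.0), ((0.05, 0.0), (0.0, 1.0))
n_modes = count_ridgeline_maxima(0.5, mu1, cov1, mu2, cov2)
print(n_modes, round(elevation(0.5, 0.5, mu1, cov1, mu2, cov2), 4))
```

The search runs over a single parameter on [0, 1] regardless of the dimension of the data; only the 2x2 helpers would need replacing by general linear algebra for higher dimensions.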




Ridgeline Curve

Two components, unequal variance:

The mixture density with the following parameters: […]

[Figure: plot in the (x, y) plane, x from −2 to 2 and y from −1 to 3.]


Ridgeline Elevation Plot

Two components, three modes, unequal variance:

[Figure: ridgeline elevation plot; density (about 0.34 to 0.44) against the ridgeline parameter on [0, 1].]


Ridgeline Elevation Plots

Two components, four modes, unequal variance: […]

[Figure: ridgeline elevation plots; density against the ridgeline parameter on [0, 1], with zoomed panels at finer density scales.]


Ridgeline Surface/Manifold

Three components, three modes, equal variance: For […]

[Figure: plot in the (x, y) plane, x from −2 to 4 and y from −2 to 3.]


Ridgeline Contour plot

[Figure: ridgeline contour plot over the simplex with vertices α1, α2 and α3.]


The Π-plots

Ridgeline elevation/contour plots

■ Full information (location and height) of the modes and saddle-points.

■ No dependence on $\pi$.

Π-plots

■ Only the location of the modes and saddle-points, not their height.

■ But the extra advantage of understanding the dependence on the mixing parameter $\pi$.


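The ‘Π-equation’

Two component case: write $g(x) = \pi\,\phi_1(x) + (1-\pi)\,\phi_2(x)$ and let $x^*(\alpha)$, $\alpha \in [0,1]$, be the ridgeline running from $\mu_1$ (at $\alpha = 0$) to $\mu_2$ (at $\alpha = 1$). If $x^*(\alpha)$ is a critical value of $g(x)$, it satisfies

$$\pi\,\alpha\,\phi_1(x^*(\alpha)) = (1-\pi)\,(1-\alpha)\,\phi_2(x^*(\alpha)).$$

Solving we get: if $x^*(\alpha)$ is a critical value then it solves the “Π-equation”:

$$\pi = \Pi(\alpha) = \frac{(1-\alpha)\,\phi_2(x^*(\alpha))}{(1-\alpha)\,\phi_2(x^*(\alpha)) + \alpha\,\phi_1(x^*(\alpha))}.$$

By construction, $0 \le \Pi(\alpha) \le 1$. Counting the solutions of $\Pi(\alpha) = \pi$:

1 solution ⟹ 1 mode

3 solutions ⟹ 2 modes

5 solutions ⟹ 3 modes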
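The counting rule can be tried out numerically. The sketch below evaluates a Π-function on a grid and counts how often it crosses a given mixing proportion. It assumes the convention that $\pi$ is the weight of component 1 and that $\alpha$ weights component 2 on the ridgeline; the parameter values are hypothetical and the function names are illustrative only.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as a pair of rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def matvec(m, v):
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def normal_pdf_2d(x, mu, cov):
    """Bivariate normal density at x."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    diff = (x[0] - mu[0], x[1] - mu[1])
    s = matvec(inv2(cov), diff)
    q = diff[0] * s[0] + diff[1] * s[1]
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def ridgeline(alpha, mu1, cov1, mu2, cov2):
    """Ridgeline point with weight (1 - alpha) on component 1, alpha on component 2."""
    p1, p2 = inv2(cov1), inv2(cov2)
    w1, w2 = 1.0 - alpha, alpha
    prec = tuple(tuple(w1 * p1[i][j] + w2 * p2[i][j] for j in range(2)) for i in range(2))
    b1, b2 = matvec(p1, mu1), matvec(p2, mu2)
    rhs = (w1 * b1[0] + w2 * b2[0], w1 * b1[1] + w2 * b2[1])
    return matvec(inv2(prec), rhs)

def pi_function(alpha, mu1, cov1, mu2, cov2):
    """Mixing proportion of component 1 at which x*(alpha) is a critical point."""
    x = ridgeline(alpha, mu1, cov1, mu2, cov2)
    num = (1.0 - alpha) * normal_pdf_2d(x, mu2, cov2)
    return num / (num + alpha * normal_pdf_2d(x, mu1, cov1))

def count_solutions(pi1, mu1, cov1, mu2, cov2, n=1000):
    """Count sign changes of Pi(alpha) - pi1 on a grid over [0, 1].

    A grid value that hits zero exactly would be missed; the even grid
    size keeps alpha = 0.5 off the grid for the symmetric example below.
    """
    vals = [pi_function(i / (n - 1), mu1, cov1, mu2, cov2) - pi1 for i in range(n)]
    return sum(1 for a, b in zip(vals, vals[1:]) if a * b < 0)

# Hypothetical parameters: two elongated components at right angles.
mu1, cov1 = (0.0, 0.0), ((1.0, 0.0), (0.0, 0.05))
mu2, cov2 = (1.0, 1.0), ((0.05, 0.0), (0.0, 1.0))
balanced = count_solutions(0.5, mu1, cov1, mu2, cov2)
lopsided = count_solutions(0.999, mu1, cov1, mu2, cov2)
print(balanced, (balanced + 1) // 2)   # solutions, implied number of modes
print(lopsided, (lopsided + 1) // 2)
```

Because the Π-function does not involve the mixing proportion, one evaluation of it serves every value of $\pi$; only the horizontal level being crossed changes.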



Example: Π-plots

Two components, unequal variance:

The mixture density with the following parameters: […]

[Figure: plot in the (x, y) plane, x from −2 to 2 and y from −1 to 3.]



Example: Π-plots

[Figure: Π-plot; both axes from 0 to 1.]


Example: Π-plot for unimodal density

[Figure: Π-plot; both axes from 0 to 1.]


Example: Π-plot for example with 4 modes

[Figure: two Π-plot panels over [0, 1]; one shows the full vertical range from 0 to 1, the other is zoomed to the vertical range 0.4990 to 0.5010.]


Analytic Solution

From the properties of the […], […] oscillations of […] is determined by the zeroes of […].

Same as the zeroes in […]

Quadratic for Equal Variance

Cubic for Proportional variance

Theorem 0.1. Let […] be the mixture of two multivariate normal densities. Then […] where […]


Data Example: Iris Data

The data are four-dimensional, so direct contour plotting of the density is not available.

[Figure: CLUSPLOT( iris ), Component 2 against Component 1. These two components explain 95.81 % of the point variability.]

Projection of Iris Data on the first two principal-components plane


Iris Data: Ridgeline Contour plot

[Figure: ridgeline contour plot over the simplex with vertices α1, α2 and α3.]


Iris Data: Π-plot

[Figure: three Π-plot panels, each with both axes from 0 to 1.]


Example: Egyptian Skull Data

Egyptian Skull Data: the data consist of four measurements (Maximal Breadth, Basibregmatic Height, Basialveolar Length, and Nasal Height) of male Egyptian skulls from five different time periods (4000 BC, 3300 BC, 1850 BC, 200 BC, 150 AD). Thirty skulls were measured from each time period.

Here we analyze the three earliest time periods.


Egyptian Skull Data: Ridgeline Contour plot

[Figure: ridgeline contour plot over the simplex with vertices α1, α2 and α3.]


Egyptian Skull Data: Π-plot

[Figure: three Π-plot panels, each with both axes from 0 to 1.]


Summary and Future Work

■ For $K = 2$, the problem of determining the number of modes in the $D$-dimensional space can be reduced to a 1-dimensional space.

■ Can be used to determine the number of modal clusters.

■ Analytical results and the dependence of modes on the mixing proportion are discussed in Ray and Lindsay, 2004.

■ Several ways of generalizing to more than two components.

■ Relating the subspace reduction in discriminant analysis to the subspace reduction in modality determination.
