Algorithms for Gaussian Bandwidth Selection in Kernel Density ...

evaluated on the point left out. The model evaluated on each training sample has the form:

$$\hat{p}_\theta(x_i) = \frac{1}{N-1} \sum_{\substack{j=1 \\ j \neq i}}^{N} G(x_i - x_j \mid \theta) \qquad (2)$$

where we make explicit the use of a Gaussian kernel. This framework was first proposed in [4] and later studied by other authors [2], [5]. However, these studies lack a closed optimization procedure, so the bandwidth $\sigma^2$ is obtained by a greedy tuning over its possible values. Moreover, the multivariate case is only considered in these previous works under a spherical kernel assumption. In this paper, we propose two algorithms that overcome these difficulties.

In a multidimensional Gaussian kernel, the set of parameters consists of the covariance matrix of the Gaussian. In the following, we consider two different degrees of complexity assumed for this matrix: a spherical shape, so that $C = \sigma^2 I_D$, with only one parameter to adjust, and an unconstrained kernel, in which a general form is considered for $C$, with $D(D+1)/2$ parameters.

Sections 2 and 3 describe the bandwidth optimization for both cases, as presented in [6], and establish their convergence conditions. Some classification experiments are presented in Section 4 to measure the accuracy of the models. Section 5 closes the paper with the most important conclusions.

2 The spherical case

The expression for the kernel function is, for the spherical case:

$$G_{ij}(\sigma^2) = G(x_i - x_j \mid \sigma^2) = (2\pi)^{-D/2} \sigma^{-D} \exp\left(-\frac{1}{2\sigma^2} \|x_i - x_j\|^2\right)$$

We want to find the $\sigma$ that maximizes the log-likelihood $\log L(X \mid \sigma^2) = \sum_i \log \hat{p}_\theta(x_i)$. The derivative of this likelihood is:

$$\nabla_\sigma \log L(X \mid \sigma^2) = \frac{1}{N-1} \sum_i \frac{1}{\hat{p}(x_i)} \sum_{j \neq i} \left( \frac{\|x_i - x_j\|^2}{\sigma^3} - \frac{D}{\sigma} \right) G_{ij}(\sigma^2)$$

We now search for the point that makes the derivative null:

$$\sum_i \frac{1}{\hat{p}(x_i)} \sum_{j \neq i} \frac{\|x_i - x_j\|^2}{\sigma^3} G_{ij}(\sigma^2) = \sum_i \frac{1}{\hat{p}(x_i)} \frac{D}{\sigma} \sum_{j \neq i} G_{ij}(\sigma^2) = \frac{N(N-1)D}{\sigma}$$

The second equality follows from the fact that, by definition, $\sum_{j \neq i} G_{ij} = (N-1)\hat{p}(x_i)$. Then we obtain the following fixed-point algorithm:

$$\sigma^2_{t+1} = \frac{1}{N(N-1)D} \sum_i \frac{1}{\hat{p}_t(x_i)} \sum_{j \neq i} \|x_i - x_j\|^2 \, G_{ij}(\sigma_t^2) \qquad (3)$$

where $\hat{p}_t$ denotes the KDE obtained in iteration $t$, i.e. the one that makes use of the width $\sigma_t^2$.

We prove the convergence of the algorithm in (3) by means of the following convergence theorem:
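
As a practical illustration of the update in (3), the following is a minimal sketch in Python/NumPy for the spherical case: it evaluates the leave-one-out densities of (2) at the current width and then applies the fixed-point step until $\sigma^2$ stabilizes. The function names (`loo_kernel_matrix`, `fixed_point_sigma2`), the initial value $\sigma^2_0 = 1$, the tolerance, and the iteration cap are illustrative assumptions, not taken from the paper.

```python
# Sketch of the fixed-point bandwidth update of Eq. (3), spherical kernel C = sigma^2 I_D.
import numpy as np


def loo_kernel_matrix(sq_dists, sigma2, D):
    """Spherical Gaussian kernel values G_ij(sigma^2), with the diagonal
    (j = i) zeroed out so that sums over j exclude the point itself."""
    G = (2.0 * np.pi * sigma2) ** (-D / 2.0) * np.exp(-sq_dists / (2.0 * sigma2))
    np.fill_diagonal(G, 0.0)
    return G


def fixed_point_sigma2(X, sigma2_init=1.0, tol=1e-8, max_iter=200):
    """Iterate the fixed-point update of Eq. (3) until sigma^2 stabilizes."""
    N, D = X.shape
    # Pairwise squared distances ||x_i - x_j||^2, computed once and reused.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    sigma2 = sigma2_init
    for _ in range(max_iter):
        G = loo_kernel_matrix(sq_dists, sigma2, D)
        p_hat = G.sum(axis=1) / (N - 1)          # leave-one-out KDE, Eq. (2)
        # Eq. (3): sum_i (1/p_t(x_i)) sum_{j!=i} ||x_i - x_j||^2 G_ij, over N(N-1)D
        sigma2_new = np.sum((sq_dists * G).sum(axis=1) / p_hat) / (N * (N - 1) * D)
        if abs(sigma2_new - sigma2) <= tol * sigma2:
            return sigma2_new
        sigma2 = sigma2_new
    return sigma2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))                # toy 2-D sample
    print("estimated bandwidth sigma^2:", fixed_point_sigma2(X))
```

Each iteration re-evaluates the leave-one-out densities $\hat{p}_t(x_i)$ at the current width, as (3) requires; with the pairwise distances precomputed, the per-iteration cost is dominated by the $O(N^2)$ kernel matrix.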
