A novel fuzzy clustering algorithm based on a fuzzy scatter matrix ...

K.-L. Wu et al. / Pattern Recognition Letters 26 (2005) 639–652

[Figure 6: three scatter-plot panels. (a) FCM, m = 6; (b) FCS, m = 6, β = 0.1; (c) FCS, m = 6, β = 0.2.]

Fig. 6. FCM and FCS clustering results for the two-cluster data set with one outlier at the coordinate (100, 0).

$$
\begin{aligned}
\lim_{m\to\infty}\{a_i\}
&= \lim_{m\to\infty}\frac{\sum_{j=1}^{n}\mu_{ij}^{m}x_j-\eta_i\sum_{j=1}^{n}\mu_{ij}^{m}\bar{x}}{\sum_{j=1}^{n}\mu_{ij}^{m}-\eta_i\sum_{j=1}^{n}\mu_{ij}^{m}} \\
&= \lim_{m\to\infty}\frac{\sum_{j=1}^{n}(\mu'_{ij})^{m}x_j-\eta_i\sum_{j=1}^{n}(\mu'_{ij})^{m}\bar{x}}{\sum_{j=1}^{n}(\mu'_{ij})^{m}-\eta_i\sum_{j=1}^{n}(\mu'_{ij})^{m}} \\
&= \frac{\sum_{\mu'_{ij}=1}x_j-\eta_i\sum_{\mu'_{ij}=1}\bar{x}}{(1-\eta_i)\sum_{\mu'_{ij}=1}1} \\
&= \frac{\sum_{\mu'_{ij}=1}x_j\Big/\sum_{\mu'_{ij}=1}1-\eta_i\bar{x}}{1-\eta_i} \\
&= \frac{\sum_{\mu_{ij}=\hat{\mu}}x_j\Big/\sum_{\mu_{ij}=\hat{\mu}}1-\eta_i\bar{x}}{1-\eta_i}.
\end{aligned}
\tag{31}
$$

In FCM, when $m$ is large, $\mu_{ij} = 1/c$ for all $i, j$ and hence $\mu_{ij} = \hat{\mu}$ for all $i, j$. This is why FCM can produce results in which the sample mean $\bar{x}$ is the unique optimizer when $m$ is large. In FCS, however, the data points inside the cluster kernels have $\mu_{ij} \in \{0, 1\}$, and the data points outside the cluster kernels have $\mu_{ij} \in (0, 1)$. When $m$ is large, the $i$th cluster center update Eq. (22) gives a large $(\mu'_{ij})^m = (\mu_{ij}/\hat{\mu})^m = 1$ to the data points inside the $i$th cluster kernel and a small $(\mu'_{ij})^m \approx 0$ to the data points outside it. The $i$th cluster center update Eq. (22) then becomes a weighted mean of the sample mean of the data points inside the $i$th cluster kernel and the grand mean $\bar{x}$. By Eq. (31), the weights of the sample mean of the data points inside the $i$th cluster kernel and of the grand mean $\bar{x}$ are 1 and $-\eta_i$, respectively, normalized by $1 - \eta_i$.

For a suitable $\beta$ value, noise and outliers lie outside the cluster kernels, so their influence on the clustering results is small when $m$ is large. This explains the clustering results shown in Figs. 5 and 6 and agrees with the theoretical analysis in Section 5. This property also provides a way to avoid the sample mean $\bar{x}$ being a unique optimizer, as it is in FCM.
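The limit in Eq. (31) can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: the function names `fcs_center` and `limit_center`, the toy data, and the restriction 0 <= eta < 1 are assumptions made for illustration. It evaluates the center update in the form that Eq. (31) starts from and compares it, for a large m, with the closed-form limit.

```python
def grand_mean(points):
    """Grand mean x_bar of a list of equal-length tuples."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def fcs_center(points, memberships, eta, m):
    """Center update for one cluster i, in the form used in Eq. (31).

    points      : list of equal-length tuples x_j
    memberships : list of mu_ij for this cluster (same length as points)
    eta         : eta_i; assumed here to satisfy 0 <= eta < 1
    m           : fuzzifier, m > 1
    """
    x_bar = grand_mean(points)
    # Rescale by the largest membership (mu' = mu / mu_hat). The ratio is
    # unchanged, but mu'^m no longer underflows to zero for every point.
    mu_hat = max(memberships)
    w = [(mu / mu_hat) ** m for mu in memberships]
    w_sum = sum(w)
    center = []
    for k in range(len(x_bar)):
        weighted = sum(wj * p[k] for wj, p in zip(w, points))
        center.append((weighted - eta * w_sum * x_bar[k]) / ((1.0 - eta) * w_sum))
    return center


def limit_center(points, memberships, eta):
    """Last line of Eq. (31): the m -> infinity limit of the update."""
    x_bar = grand_mean(points)
    mu_hat = max(memberships)
    # Only the points whose membership equals mu_hat survive in the limit.
    top = [p for p, mu in zip(points, memberships) if mu == mu_hat]
    top_mean = grand_mean(top)
    return [(t - eta * g) / (1.0 - eta) for t, g in zip(top_mean, x_bar)]


if __name__ == "__main__":
    # Hypothetical toy data: three points with crisp membership 1 and one
    # distant point with a fuzzy membership of 0.3.
    pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (9.0, 0.0)]
    mu = [1.0, 1.0, 1.0, 0.3]
    print(fcs_center(pts, mu, eta=0.1, m=200))
    print(limit_center(pts, mu, eta=0.1))
```

With equal memberships for every point (the FCM situation for large m), both functions return the grand mean for any admissible eta, which is the degenerate case the text describes.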
We know that when the sample mean $\bar{x}$ is the unique optimizer of a fuzzy clustering algorithm, the partition coefficient (PC) (Bezdek, 1974), defined by

$$
PC(c) = \frac{1}{n}\sum_{i=1}^{c}\sum_{j=1}^{n}\mu_{ij}^{2}, \tag{32}
$$

is equal to $1/c$, or equivalently the non-fuzzy index (NFI) (Pal and Bezdek, 1995), defined by

$$
NFI(c) = 1 - \frac{c}{c-1}\bigl(1 - PC(c)\bigr), \tag{33}
$$

is equal to zero. Note that Dave (1996) also proposed a modification of the PC index which is equivalent to the NFI index. According to the above analysis, we hope that FCS with cluster kernels can avoid the situation in which the sample mean is a unique optimizer of the FCS objective function.

[Figure 7: NFI plotted against beta (0, 0.05, 0.1, 0.15, 0.2, 0.5, 0.99), with one curve for m = 10 and one for m = 20.]

Fig. 7. NFI(2) values for the unequal sample size data set shown in Fig. 5.

[Figure 8: NFI plotted against beta, delta (0, 0.05, 0.1, 0.15, 0.2, 0.5, 0.99) for PIM and FCS. (a) m = 1.5; (b) m = 2.]

Fig. 8. NFI(11) values for the normalized Vowel data set in which both PIM and FCS algorithms are processed with the same parameter values.

Fig. 7 presents the NFI(2) values of the data set shown in Fig. 5. The NFI values of FCS with cluster kernels ($\beta > 0$) are always larger than the NFI values of FCM ($\beta = 0$), for which the sample mean $\bar{x}$ is the unique optimizer with NFI = 0 when $m = 10$ and $20$. This shows that the FCS algorithm can avoid the case of NFI = 0 and is more robust to noise and outliers than FCM when $m$ is large. Because the sample mean $\bar{x}$ of the data set shown in Fig. 6 is not the unique optimizer of FCM and FCS when $m$ is large, we do not show the corresponding NFI values.

Note that some properties of FCS discussed above can also be achieved by the partition index maximization (PIM) algorithm (Özdemir and Akarun, 2002), which uses a fixed volume for all cluster kernels. The radius of each cluster volume in PIM is defined by

$$
\alpha = \delta \min\Bigl\{\min_{i \neq i'} \|a_i - a_{i'}\| / 2\Bigr\}, \quad 0 \le \delta \le 1. \tag{34}
$$

The NFI values of PIM and FCS for the normalized Vowel data set in the UCI Machine Learning Repository (Blake and Merz, 1998) are shown in Fig. 8. Yu et al. (2004) showed that when $m > 1.7787$, the sample mean $\bar{x}$ is the unique optimizer of FCM for the normalized Vowel data set in Blake and Merz (1998). In Fig. 8(a), when $m = 1.5$, both PIM and FCS with different $\delta$ and $\beta$ values have NFI values larger than 0.3. However, when $m = 2$, as shown in Fig. 8(b), PIM gives the same NFI values as FCM ($\delta = 0$ or $\beta = 0$).
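The two indices in Eqs. (32) and (33) can be computed directly from a membership matrix. The following is a minimal Python sketch under stated assumptions: the function names and the nested-list layout `U[i][j]` (membership of point j in cluster i) are choices made here for illustration, not part of the original paper.

```python
def partition_coefficient(U):
    """PC(c) of Eq. (32): mean over the n points of the squared memberships.

    U is a list of c rows, one per cluster; U[i][j] is mu_ij.
    """
    n = len(U[0])
    return sum(mu * mu for row in U for mu in row) / n


def non_fuzzy_index(U):
    """NFI(c) of Eq. (33); requires at least c = 2 clusters."""
    c = len(U)
    return 1.0 - (c / (c - 1.0)) * (1.0 - partition_coefficient(U))


if __name__ == "__main__":
    # Hypothetical membership matrices for c = 2 clusters and n = 4 points.
    uniform = [[0.5, 0.5, 0.5, 0.5],
               [0.5, 0.5, 0.5, 0.5]]   # every mu_ij = 1/c
    crisp = [[1.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 1.0]]     # every mu_ij in {0, 1}
    print(partition_coefficient(uniform), non_fuzzy_index(uniform))
    print(partition_coefficient(crisp), non_fuzzy_index(crisp))
```

The uniform matrix corresponds to the degenerate case in the text, where every membership is 1/c, so PC = 1/c and NFI = 0. A fully crisp partition gives PC = 1 and NFI = 1.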
Using the same volume for all cluster kernels does not help PIM attain larger NFI values than FCM. In the same situation ($m = 2$), as shown in Fig. 8(b), the NFI values of FCS are always larger than those of FCM and PIM. This advantage comes from the use of different cluster kernel volumes in FCS.

7. Conclusions

We proposed a novel clustering algorithm called the FCS algorithm, which attempts to minimize the trace of the fuzzy within-cluster scatter matrix and simultaneously maximize the trace of the fuzzy between-cluster scatter matrix. Each cluster obtained by FCS has a cluster kernel. Data points that fall inside any one of the $c$ cluster kernels have crisp memberships, and data points outside all of the cluster kernels have fuzzy memberships. The volume of each cluster kernel is decided by the parameter $\eta_i$, which is a function of $\beta$. Crisp and fuzzy memberships thus co-exist in FCS. The cluster center update equations in FCS can be interpreted as a weighted mean of the FCM cluster centers and the grand mean $\bar{x}$. Numerical examples show that FCS can give more accurate parameter estimates than FCM. They also show that FCS can help avoid the situation where the sample mean $\bar{x}$ is a unique optimizer of FCM and is more robust to noise