Signal Analysis Research (SAR) Group - RNet - Ryerson University
Fig. 3: Two-dimensional mapping: (Left) input pattern with 7 distinct clusters, (Middle) 8 centres generated using Ncut, and (Right) 7 centres generated using SONcut. Over-classification around the query (triangle) will result in erroneous classification of the relevant class.
nodes in the input pattern; assoc(A, A) and assoc(B, B) measure the total intra-cluster similarity (association) in A and B; assoc(A, V) is the total connection from nodes in cluster A to all nodes in the graph, and assoc(B, V) is defined similarly. w is a nonnegative weight function measuring the degree of similarity between two samples of the input pattern and is defined as:

w(p, q) = e^(-d(p, q)/k)   (8)

where d(p, q) is a pre-defined distance metric (e.g. the Euclidean distance) and k is a user-defined constant that controls the decreasing rate of the weight function; it is empirically set to 0.2. With this weight function, the smallest eigenvector remains constant and Ncut can find the right partitions reliably [2].

As Shi and Malik have also discussed, the optimal partitioning (the minimum possible Ncut) can be computed by solving a generalized eigenvalue system. The eigenvector corresponding to the second smallest eigenvalue of this generalized eigensystem is then used to partition the graph.

In this paper we use the Ncut algorithm [2] for unsupervised data clustering. The Ncut partitioning method can be applied recursively to the input pattern to generate more than two clusters. Deciding on the maximum number of centres in the input pattern, and hence when to stop the clustering process, is a challenging problem. In this work, we have integrated the Ncut algorithm with the principles found in DSOTM to automatically estimate the appropriate number of clusters in the input pattern and to set the maximum allowed Ncut accordingly. We call this Ncut algorithm with self-oriented centre detection the Self-Organizing Normalized cuts (SONcut).

The proposed SONcut algorithm is as follows:

Initialization: Choose a root node, n_1, from the available set of input vectors, {x_i}, i = 1, ..., K, in a random manner. N is the maximum allowed Ncut (initially set to 1) and K is the total number of inputs;

Similarity Measurement: Randomly select a new data point, x, and find the winning centroid, n*, by minimizing the predefined distance criterion in (1);

Maximum Allowed Ncut Estimation: If ||x(t) - n*|| > H(t), where H(t) is defined similarly to the hierarchy function used in the DSOTM algorithm, then increment the maximum allowed Ncut by 1;

Continuation: Continue with the Similarity Measurement step until no noticeable changes in the feature map are observed;

Graph Generation: Given the input pattern, set up a weighted graph, G = (V, E), compute the weight, w_m, on each edge, E_m, using (8), and create the affinity, W, and diagonal, D, matrices;

Eigensystem Transformation: Solve (D - W)x = λDx for the eigenvector with the smallest eigenvalue;

Graph Bipartition: Use the eigenvector with the second smallest eigenvalue to bipartition the graph;

Partitioning Continuation: Consider the current partitions for further subdivision. Continue repartitioning until the Ncut value reaches its maximum allowed.

In summary, we have proposed an unsupervised hierarchical Ncut algorithm that estimates the maximum number of allowed Ncuts by training on the principles found in the DSOTM architecture. By dynamically adapting the Ncut algorithm to the nature of the input pattern, the problem of over-partitioning the relevant class can be prevented. Fig. 3 depicts the importance of such a predictive mechanism for the Ncut clustering algorithm and illustrates its effectiveness in avoiding over-classification around the query centre.
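The centre-estimation phase above (Initialization through Continuation) amounts to competitive learning under a shrinking hierarchy threshold. The sketch below is a minimal illustration, not the authors' implementation: the exponentially decaying H(t) = h0·e^(-t/τ) is a hypothetical stand-in for the DSOTM hierarchy function, which the text only references, and all function and parameter names are assumptions:

```python
import numpy as np

def estimate_max_ncut(X, h0=None, tau=50.0, epochs=5, seed=0):
    """Sketch of SONcut's maximum-allowed-Ncut estimation phase.

    H(t) is modelled here as h0 * exp(-t / tau), an assumed form
    standing in for the DSOTM hierarchy function.
    Returns the estimated maximum allowed Ncut, N.
    """
    rng = np.random.default_rng(seed)
    # Initialization: a single root centroid picked at random; N = 1.
    centroids = [X[rng.integers(len(X))].astype(float)]
    N = 1
    h0 = h0 if h0 is not None else np.ptp(X)  # assumed initial scale
    t = 0
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            # Similarity Measurement: find the winning centroid n*.
            dists = [np.linalg.norm(x - c) for c in centroids]
            win = int(np.argmin(dists))
            # Maximum Allowed Ncut Estimation: if x falls outside the
            # hierarchy radius H(t), spawn a new centre and increment N.
            if dists[win] > h0 * np.exp(-t / tau):
                centroids.append(x.astype(float))
                N += 1
            else:
                # Otherwise nudge the winner toward x (competitive update).
                centroids[win] += 0.05 * (x - centroids[win])
            t += 1
    return N
```

The returned N would then cap the number of recursive bipartitions performed in the Graph Bipartition stage.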
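The graph-theoretic stages (Graph Generation through Graph Bipartition) can likewise be sketched. This is a minimal illustration under stated assumptions rather than the authors' code: the exponential weight w(p, q) = e^(-d(p, q)/k) is an assumed form of (8), and splitting the second eigenvector at zero is an assumed threshold choice:

```python
import numpy as np
from scipy.linalg import eigh  # generalized symmetric eigensolver

def ncut_bipartition(X, k=0.2):
    """Sketch of one Ncut bipartition in the style of Shi and Malik.

    X : (n, d) array of input samples.
    k : decay constant of the weight function (the paper uses 0.2).
    Returns a boolean mask assigning each sample to one of two parts.
    """
    # Affinity matrix W from the exponential weight function:
    # w(p, q) = exp(-d(p, q) / k), with d the Euclidean distance.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    W = np.exp(-d / k)
    D = np.diag(W.sum(axis=1))

    # Eigensystem Transformation: solve (D - W) x = lambda * D x.
    vals, vecs = eigh(D - W, D)

    # Graph Bipartition: the eigenvector of the second smallest
    # eigenvalue splits the graph; thresholding at zero is an
    # assumed (common) choice.
    fiedler = vecs[:, 1]
    return fiedler > 0
```

Applying this bipartition recursively, and stopping once the number of cuts reaches the maximum allowed Ncut, yields the multi-cluster behaviour described above.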
Previously, in [7], we proposed an automatic CBIR engine structured around an unsupervised learning algorithm, the DSOTM. To reduce the gap between high-level concepts (semantics) and low-level statistical features, and to evolve the search process according to what the system believes to be the significant content within the query, that engine was integrated with a process of feature-weight detection using genetic algorithms (GA), as illustrated in Fig. 4b. In this paper we use a relatively simpler CBIR architecture (see Fig. 4a and Fig. 5) so as to solely compare the data-classification abilities of the proposed hierarchical clustering algorithms against three other techniques: SOTM, SOFM, and Ncut.
Authorized licensed use limited to: Ryerson University Library. Downloaded on July 7, 2009 at 11:49 from IEEE Xplore. Restrictions apply.