Download Complete Article in PDF Format - vsrd international ...
Download Complete Article in PDF Format - vsrd international ...
Download Complete Article in PDF Format - vsrd international ...
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
R E S E A R C H A R T II C L E<br />
____________________________<br />
Available ONLINE www.<strong>vsrd</strong>journals.com<br />
VSRD-IJCSIT, Vol. 2 (4), 2012, 285-295<br />
Data Centric Knowledge Management System<br />
Us<strong>in</strong>g Post-Cluster<strong>in</strong>g Technique<br />
ABSTRACT<br />
1 Asadi Sr<strong>in</strong>ivasulu*, 2 Ch.D.V. Subba Rao and 3 M. Sreedevi<br />
The purpose of Data Centric Knowledge Management System (DCKMS) is to centralize knowledge generated<br />
by employees work<strong>in</strong>g with<strong>in</strong> and functional areas and to organize that knowledge such that it can be easily<br />
accessed, searched, browsed and navigated. It is a one stop shop for f<strong>in</strong>d<strong>in</strong>g solutions for your problems. It<br />
provides a facility for the employees to register themselves as ‘experts’ as well as search for other ‘experts’<br />
<strong>in</strong>case of any problem/requirement <strong>in</strong> their project. It is a one stop shop for f<strong>in</strong>d<strong>in</strong>g solutions for your problems.<br />
This system design is modularized <strong>in</strong>to various categories. This system has enriched UI so that a novice user did<br />
not feel any operational difficulties. This system ma<strong>in</strong>ly concentrated <strong>in</strong> design<strong>in</strong>g various reports requested by<br />
the users as well as higher with export to excel options. This paper addresses the expectations, organizational<br />
implications, and <strong>in</strong>formation process<strong>in</strong>g requirements, of the emerg<strong>in</strong>g knowledge management paradigm. A<br />
brief discussion of the enablement of the <strong>in</strong>dividual through the wide-spread availability of computer and<br />
communication facilities is followed by a description of the structural evolution of organizations, and the<br />
architecture of a computer-based knowledge management system. The author discusses two trends that are<br />
driven by the treatment of <strong>in</strong>formation and knowledge as a commodity, <strong>in</strong>creased concern for the management<br />
and exploitation of knowledge with<strong>in</strong> organizations, and, the creation of an organizational environment that<br />
facilitates the acquisition, shar<strong>in</strong>g and application of knowledge.<br />
Keywords : Data, Data-Centric, Data Mart, Data Portal, Data Warehouse, Enabled Individual, Information,<br />
Information-Centric, Information Management, Knowledge, Knowledge Management, Ontology,<br />
Organizational Structure, Cluster<strong>in</strong>g, Data M<strong>in</strong><strong>in</strong>g, Fuzzy C-Means Cluster<strong>in</strong>g Algorithm, K-Means<br />
Cluster<strong>in</strong>g Algorithm.<br />
1. INTRODUCTION<br />
The Data Centric Knowledge Management System is a web based application which allows employees of a<br />
company to share their knowledge with others <strong>in</strong> the company. Also it allows them to search for knowledge<br />
1,3 Associate Professor, 2 Professor, 1,2,3 Department of Information Technology, Sree Vidyanikethan Eng<strong>in</strong>eer<strong>in</strong>g College,<br />
Tirupathi, Andhra Pradesh, INDIA. *Correspondence : sr<strong>in</strong>u_asadi@yahoo.com
Asadi Sr<strong>in</strong>ivasulu et al / VSRD International Journal of CS & IT Vol. 2 (4), 2012<br />
assets when <strong>in</strong> need. It provides a facility for the employees to register themselves as ‘experts’ as well as search<br />
for other ‘experts’ <strong>in</strong>case of any problem/requirement <strong>in</strong> their project. It is a one stop shop for f<strong>in</strong>d<strong>in</strong>g solutions<br />
for your problems. As <strong>in</strong>formation technology beg<strong>in</strong>s to permeate all aspects of life and the economy turns<br />
decidedly <strong>in</strong>formation-centric, wealth is <strong>in</strong>creas<strong>in</strong>gly def<strong>in</strong>ed <strong>in</strong> terms of <strong>in</strong>formation-related products and the<br />
availability of knowledge. Under these conditions employment, whether self-employment or organizational<br />
employment is becom<strong>in</strong>g s<strong>in</strong>gularly focused on the skills and capabilities of the <strong>in</strong>dividual. In other words<br />
knowledge has become a commodity that has value far <strong>in</strong> excess of the manufactured products that represented<br />
the yardstick of wealth dur<strong>in</strong>g the <strong>in</strong>dustrial age. How this new form of human wealth should be effectively<br />
utilized and nurtured <strong>in</strong> commercial and government organizations have <strong>in</strong> recent times become a major<br />
preoccupation of management. Two parallel and related trends have emerged. The first trend is related to the<br />
management and exploitation of knowledge. The question be<strong>in</strong>g asked is: How can we capture and utilize the<br />
potentially available knowledge for the benefit of the organization? The phrase “…potentially available” is<br />
appropriate, because much of the knowledge is hidden <strong>in</strong> an overwhelm<strong>in</strong>g volume of computer-based data.<br />
What is not commonly understood is that the overwhelm<strong>in</strong>g nature of the stored data is due to current<br />
process<strong>in</strong>g methods rather than volume. These process<strong>in</strong>g methods have to rely largely on manual methods<br />
because only the human user can provide the necessary context for <strong>in</strong>terpret<strong>in</strong>g the computer-stored data <strong>in</strong>to<br />
<strong>in</strong>formation and knowledge. If it were possible to capture <strong>in</strong>formation (i.e., data with relationships), rather than<br />
data, at the po<strong>in</strong>t of entry <strong>in</strong>to the computer then there would be sufficient context for computer software to<br />
process the <strong>in</strong>formation automatically <strong>in</strong>to knowledge. This is not just a desirable<br />
2. RELATED WORK<br />
The ma<strong>in</strong> purpose of functional requirements with<strong>in</strong> the requirement specification document is to def<strong>in</strong>e all the<br />
activities or operations that take place <strong>in</strong> the system. These are derived through <strong>in</strong>teractions with the users of the<br />
system. S<strong>in</strong>ce the Requirements Specification is a comprehensive document & conta<strong>in</strong>s a lot of data, it has been<br />
broken down <strong>in</strong>to different Chapters <strong>in</strong> this report. The depiction of the Design of the System <strong>in</strong> UML is<br />
presented <strong>in</strong> a separate chapter. The Data Dictionary is presented <strong>in</strong> the Appendix of the system. But the general<br />
Functional Requirements arrived at the end of the <strong>in</strong>teraction with the Users are listed below. A more detailed<br />
discussion is presented <strong>in</strong> this, which talk about the Analysis & Design of the system. Adm<strong>in</strong>istrator of this<br />
system can add a new employee as well as delete an exist<strong>in</strong>g employee and he can view all the exist<strong>in</strong>g users of<br />
the system. Adm<strong>in</strong>istrator can create; delete user log<strong>in</strong>s for different employees. Adm<strong>in</strong>istrator can view<br />
different reports (My Submission report, Rat<strong>in</strong>gs reports, document status report etc).<br />
� Adm<strong>in</strong>istrator of this system can add a new employee as well as delete an exist<strong>in</strong>g employee and he can<br />
view all the exist<strong>in</strong>g users of the system.<br />
� Adm<strong>in</strong>istrator can create; delete user log<strong>in</strong>s for different employees.<br />
� A K-User/ K-Team Member/Reviewer can search for a document based on his criteria (author, technology<br />
etc).<br />
� A K-User/ K-Team Member/Reviewer can download a document.<br />
286
� A K-User/ K-Team Member/Reviewer can rate a document.<br />
� A K-User/ K-Team Member/Reviewer can submit a document.<br />
� A K-User/ K-Team Member/Reviewer can register as an expert.<br />
� A K-User/ K-Team Member/Reviewer can search for an expert.<br />
Asadi Sr<strong>in</strong>ivasulu et al / VSRD International Journal of CS & IT Vol. 2 (4), 2012<br />
� A K-Team Member/Reviewer can evaluate the above documents for <strong>in</strong>itial screen<strong>in</strong>g.<br />
� A K-Team Member can manage the reviewers list.<br />
� A K-team Member can assign a document to particular reviewer<br />
� A Reviewer can view the list of documents forwarded to him<br />
� A Reviewer can publish or reject a document.<br />
3. EXISTING ALGORITHM<br />
Fig. 1 : Context Level Diagram<br />
Here <strong>in</strong> the exist<strong>in</strong>g system, the company ma<strong>in</strong>ta<strong>in</strong>s all the knowledge based documents <strong>in</strong> a separate system<br />
which will be accessible for all employees through LAN and they can post their new documents <strong>in</strong>to this and<br />
access the earlier documents. Search<strong>in</strong>g for related documents based on author, technology etc is a time tak<strong>in</strong>g<br />
process. Manag<strong>in</strong>g the documents category wise and restrict them not to be accessible based on the user type<br />
becomes complicated. This system doesn’t restrict unnecessary documents to be posted.<br />
DRAWBACKS:<br />
� Difficulty <strong>in</strong> ma<strong>in</strong>ta<strong>in</strong><strong>in</strong>g security levels for the documents.<br />
� Difficulty <strong>in</strong> brows<strong>in</strong>g, navigat<strong>in</strong>g and search<strong>in</strong>g for required document.<br />
� Difficulty <strong>in</strong> giv<strong>in</strong>g rat<strong>in</strong>gs for the documents.<br />
287
� Availability of <strong>in</strong>formation <strong>in</strong> this manner is subjected to damage.<br />
Asadi Sr<strong>in</strong>ivasulu et al / VSRD International Journal of CS & IT Vol. 2 (4), 2012<br />
� Difficulty <strong>in</strong> restrict<strong>in</strong>g the employees not to update the documents.<br />
� Difficulty <strong>in</strong> generat<strong>in</strong>g different reports.<br />
4. PROPOSED SYSTEM<br />
The proposed system is fully computerized, which removes all the drawbacks of exist<strong>in</strong>g system. In the<br />
proposed system, it allows different employees of the company to upload their knowledge document <strong>in</strong>to this<br />
system which will be verified by next level users to avoid unnecessary documents. Also it allows them to search<br />
for knowledge assets very easily when <strong>in</strong> need. It provides a facility for the employees to register themselves as<br />
‘experts’ as well as search for other ‘experts’ <strong>in</strong>case of any problem/requirement <strong>in</strong> their project. It provides a<br />
facility for the evaluator to rate the documents posted by the employees.<br />
ADVANTAGES:<br />
� It provides a facility a to share knowledge documents across the company<br />
� It allows the employees to upload and download the documents from their systems<br />
� Easy <strong>in</strong> brows<strong>in</strong>g, navigat<strong>in</strong>g and search<strong>in</strong>g for required documents<br />
� Provides a facility to restrict the unnecessary documents to be posted<br />
� Provides flexible way <strong>in</strong> generat<strong>in</strong>g different reports<br />
� By the follow<strong>in</strong>g the new approach the <strong>in</strong>formation can be accessed from anywhere just with a mouse click.<br />
This helps the users by sav<strong>in</strong>g lot of time provid<strong>in</strong>g the user with the up to date <strong>in</strong>formation Centralized<br />
database helps <strong>in</strong> avoid<strong>in</strong>g conflicts<br />
� This project provides a rich user <strong>in</strong>terface for the user to access <strong>in</strong>formation with least effort (“look and<br />
feel”).<br />
� It allows to rate the documents at different levels<br />
� It allows publish<strong>in</strong>g or reject<strong>in</strong>g the documents.<br />
4.1. K-MEANS ALGORITHM<br />
Step 1) Put the first K feature vectors as <strong>in</strong>itial centers<br />
Step 2) Assign each sample vector to the cluster with m<strong>in</strong>imum distance assignment pr<strong>in</strong>ciple.<br />
Step 3) Compute new average as new center for each cluster<br />
Step 4) If any center has changed, then go to step 2, else term<strong>in</strong>ate.<br />
288
4.2. K-MEANS<br />
Asadi Sr<strong>in</strong>ivasulu et al / VSRD International Journal of CS & IT Vol. 2 (4), 2012<br />
Fig. 5 : Apply<strong>in</strong>g Cluster<strong>in</strong>g Technique Similarity Weight and Filter Method<br />
Fig. 6 : Results of Cluster<strong>in</strong>g Show<strong>in</strong>g Groups Divided Into Clusters<br />
Fig. 7 : Initialization and Input<br />
289
Asadi Sr<strong>in</strong>ivasulu et al / VSRD International Journal of CS & IT Vol. 2 (4), 2012<br />
Fig. 8 : F<strong>in</strong>al EMST Edges Path<br />
Fig. 1 : Graph for K-Means<br />
K-means is one of the simplest unsupervised learn<strong>in</strong>g algorithms that solve the well known cluster<strong>in</strong>g problem<br />
.K-means is a popular cluster<strong>in</strong>g method that uses prototypes (centroid) to represent clusters by m<strong>in</strong>imiz<strong>in</strong>g<br />
with<strong>in</strong>-cluster errors. The ma<strong>in</strong> idea is to def<strong>in</strong>e k centroid, one for each cluster.<br />
This centroid should be placed <strong>in</strong> a cunn<strong>in</strong>g way because of different location causes different result. The next<br />
step is to take each po<strong>in</strong>t belong<strong>in</strong>g to a given data set and associate it to the nearest centroid. After we have<br />
these k new centroid, a new b<strong>in</strong>d<strong>in</strong>g has to be done between the same data set po<strong>in</strong>ts and the nearest new<br />
centroid. F<strong>in</strong>ally, this algorithm aims at m<strong>in</strong>imiz<strong>in</strong>g an objective function.<br />
The objective function :<br />
290
Asadi Sr<strong>in</strong>ivasulu et al / VSRD International Journal of CS & IT Vol. 2 (4), 2012<br />
We apply the above algorithm <strong>in</strong> our project by tak<strong>in</strong>g <strong>in</strong>put attributes like number of assignments submitted;<br />
number of tasks done successfully, number of times had face to face <strong>in</strong>teractions among team members. Now<br />
apply<strong>in</strong>g above algorithm results <strong>in</strong> division of groups <strong>in</strong>to k clusters .The groups <strong>in</strong> each cluster would have<br />
shown nearly similar behavior hence grouped <strong>in</strong>to same cluster. Now it becomes easy for the facilitator to give<br />
feedback as now he can give feedback to the entire cluster <strong>in</strong>stead of giv<strong>in</strong>g to each and every group<br />
5. RESULTS<br />
Fig. : This Screen Is Log<strong>in</strong> Page for All Users and Adm<strong>in</strong>istrator<br />
Fig. : Adm<strong>in</strong>istrator Can F<strong>in</strong>d the Experts for Gett<strong>in</strong>g the Assistance<br />
Fig. : Adm<strong>in</strong>istrator Can Register As Experts<br />
291
6. CONCLUSION<br />
Asadi Sr<strong>in</strong>ivasulu et al / VSRD International Journal of CS & IT Vol. 2 (4), 2012<br />
Fig. : This Screen Shows the K Team Actions<br />
The new system, Data Centric Knowledge Management System has been implemented to cater the needs of<br />
company employees <strong>in</strong> shar<strong>in</strong>g different knowledge assets effectively with role based access. The present<br />
system has been <strong>in</strong>tegrated with the already exist<strong>in</strong>g. The database was put <strong>in</strong>to the My SQL server. This was<br />
connected by JDBC. The database is accessible through Intranet on any location. This system has been found to<br />
meet the requirements of the users and departments and also very satisfactory. The database system must<br />
provide for the safety of the <strong>in</strong>formation stored, despite system crashes or attempts at unauthorized access. If<br />
data are to be shared among several users, the system must avoid possible anomalous results. Future<br />
enhancement is Extendibility provides high level extendibility. It means it provides all the basic features and<br />
allows us to extend their features very easily without disturb<strong>in</strong>g the exist<strong>in</strong>g code. We can make this Internet<br />
application if we desire. We can make this application is suitable to work on any application just by chang<strong>in</strong>g<br />
the deployment files. By provid<strong>in</strong>g some more features like provid<strong>in</strong>g accessibility to <strong>in</strong>ternet users to <strong>in</strong>volve <strong>in</strong><br />
this process.<br />
7. REFERENCES<br />
[1] Sr<strong>in</strong>ivasulu Asadi, Dr. Ch.D.V.Subbarao, V. Saikrishna, “F<strong>in</strong>d<strong>in</strong>g the number of clusters us<strong>in</strong>g Dark Block<br />
Extraction”, IJCA International Journal of Computer Applications (0975 – 8887), Volume 7– No.3,<br />
September, 2010.<br />
[2] A. Ahmad and L. Dey, (2007), A k-mean cluster<strong>in</strong>g algorithm for mixed numeric and categorical data’,<br />
Data and Knowledge Eng<strong>in</strong>eer<strong>in</strong>g Elsevier Publication, vol. 63, pp 503-527.<br />
[3] Sr<strong>in</strong>ivasulu Asadi, Dr.Ch.D.V.SubbaRao, V.Saikrishna and Bhudevi Aasadi “Cluster<strong>in</strong>g the Labeled and<br />
Unlabeled Datasets us<strong>in</strong>g New MST based Divide and Conquer Technique,” International Journal of<br />
Computer Science & Eng<strong>in</strong>eer<strong>in</strong>g Technology (IJCSET), (0975 – 8887), IJCSET | July 2011 | Vol 1, Issue<br />
6,302-306, ISSN:2231-0711, July, 2011.<br />
[4] Xiaochun Wang, Xiali Wang and D. Mitchell Wilkes, IEEE Members, “A Divide-and-Conquer Approach<br />
for M<strong>in</strong>imum Spann<strong>in</strong>g Tree-Based Cluster<strong>in</strong>g”, IEEE Knowledge and Data Eng<strong>in</strong>eer<strong>in</strong>g Transactions, vol<br />
21, July 2009.<br />
292
Asadi Sr<strong>in</strong>ivasulu et al / VSRD International Journal of CS & IT Vol. 2 (4), 2012<br />
[5] Sr<strong>in</strong>ivasulu Asadi, Dr.Ch.D.V.Subba Rao, O.Obulesu and P.Sunil Kumar Reddy, “F<strong>in</strong>d<strong>in</strong>g the Number of<br />
Clusters <strong>in</strong> Unlabelled Datasets Us<strong>in</strong>g Extended Cluster Count Extraction (ECCE)”, ,” IJCSIT International<br />
Journal of Computer Science and Information Technology (ISSN: 0975 – 9646), Vol. 2 (4) , 2011, 1820-<br />
1824, August, 2011.<br />
[6] S Deng, Z He, X Xu, 2005. Cluster<strong>in</strong>g mixed numeric and categorical data: A cluster ensemble approach.<br />
Arxiv prepr<strong>in</strong>t cs/0509011.<br />
[7] Sr<strong>in</strong>ivasulu Asadi, Dr.Ch.D.V.Subba Rao, O.Obulesu and P.Sunil Kumar Reddy,“A Comparative study of<br />
Cluster<strong>in</strong>g <strong>in</strong> Unlabelled Datasets Us<strong>in</strong>g Extended Dark Block Extraction and Extended Cluster Count<br />
Extraction Extended Dark Block Extraction and Extended Cluster Count Extraction”, IJCSIT International<br />
Journal of Computer Science and Information Technology (ISSN:0975 – 9646), Vol. 2(4) , 2011, 1825-<br />
1831,August, 2011.<br />
[8] S. Guha, R. Rastogi, and K. Shim, 2000. ROCK: A Robust Cluster<strong>in</strong>g Algorithm for Categorical Attributes.<br />
Information Systems, vol. 25, no. 5 : 345-366.<br />
[9] V.V. Cross and T.A. Sudkamp, Similarity and Compatibility <strong>in</strong> Fuzzy Set Theory: assessment and<br />
Applications, Physica-Verlag, New York, 2002.<br />
[10] M. Kal<strong>in</strong>a, Derivatives of fuzzy functions and fuzzy derivatives, Tatra<br />
[11] Jiawei Han and Michel<strong>in</strong>e Kamber. “Data Ware Hous<strong>in</strong>g and Data M<strong>in</strong><strong>in</strong>g. Concepts and Techniques”,<br />
Third Edition 2007.<br />
[12] Zhexue Huang; Ng, M.K.;Manage. Inf. Pr<strong>in</strong>ciples Ltd., Melbourne, Vic.A fuzzy k-modes algorithm for<br />
cluster<strong>in</strong>g categorical data. vol.7, pp 446-452<br />
[13] Tengke Xiong; Shengrui Wang; Mayers, A.; Monga, E.; Dept. Comput. Sci., Univ. of Sherbrooke,<br />
Sherbrooke, QC, Canada. A New MCA-Based Divisive Hierarchical Algorithm for Cluster<strong>in</strong>g Categorical<br />
Data.<br />
[14] Iam-On, N.; Boongeon, T.; Garrett, S.; Price, C.;Aberystwyth University, Aberystwyth. A L<strong>in</strong>k-Based<br />
Cluster Ensemble Approach for Categorical Data Cluster<strong>in</strong>g. vol. PP 1.<br />
[15] Izakian, H.; Abraham, A.; Snasel, V.;Mach<strong>in</strong>e Intell. Res. Labs. (MIR Labs.), Auburn, WA, USA.<br />
Cluster<strong>in</strong>g categorical data us<strong>in</strong>g a swarm-based method. pp. 1720-1724<br />
[16] Charu C.Aggarwal. Towards Systematic Design of Distance Functions for Data M<strong>in</strong><strong>in</strong>g Applications.<br />
SIGKDD ’03, August 2427, 2003, Wash<strong>in</strong>gton, DC, USA<br />
[17] Huajie Zhang; Zhiyue Cao; Fangzhu Qiang;Dept. of Comput. Sci., Ch<strong>in</strong>a Univ. of Geosci., Wuhan.<br />
Representation and cluster<strong>in</strong>g of numeric data <strong>in</strong> concept formation. vol.1, pp 597-600.<br />
[18] M. Mahdavi and H. Abolhassani, (2009) Harmony K-means algorithm for document cluster<strong>in</strong>g, Data M<strong>in</strong><br />
Knowl Disc (2009) 18:370–391.<br />
[19] Yong Wang; Naohiro Ishii.Lear<strong>in</strong><strong>in</strong>g Feature Weight for Similarity Measures.<br />
[20] Ba<strong>in</strong>ian Li; Kongsheng Zhang; and Jian Xu. Similarity measures and weighted fuzzy c-mean cluster<strong>in</strong>g<br />
algorithm. World Academy of Science, Eng<strong>in</strong>eer<strong>in</strong>g and Technology 76 2011<br />
[21] K. Rajendra Prasad, dr. P.Gov<strong>in</strong>da Rajulu, a survey on cluster<strong>in</strong>g Technique for datasets us<strong>in</strong>g Efficient<br />
graph structures, vol. 2 (7), 2010, 2707-2714<br />
[22] Sotirios P. Chatzis. A fuzzy c-means-type algorithm for cluster<strong>in</strong>g of data with mixed numeric and<br />
categorical attributes employ<strong>in</strong>g a probabilistic dissimilarity functional. Department of Electrical and<br />
293
Asadi Sr<strong>in</strong>ivasulu et al / VSRD International Journal of CS & IT Vol. 2 (4), 2012<br />
Electronic Eng<strong>in</strong>eer<strong>in</strong>g, Imperial College London, Exhibition Road, South Kens<strong>in</strong>gton Campus SW7 2BT,<br />
UK.<br />
[23] G. Gan, Z. Yang, and J. Wu (2005), A Genetic k-Modes Algorithm for Cluster<strong>in</strong>g for Categorical Data,<br />
ADMA, LNAI 3584, pp. 195–202.<br />
[24] J. Z. Haung, M. K. Ng, H. Rong, Z. Li (2005) Automated variable weight<strong>in</strong>g <strong>in</strong> k-mean[1] type cluster<strong>in</strong>g,<br />
IEEE Transaction on PAMI 27(5).<br />
[25] K. Krishna and M. Murty (1999), ‘Genetic K-Means Algorithm’, IEEE Transactions on Systems, Man, and<br />
Cybernetics vol. 29, NO. 3, pp. 433-439.<br />
[26] Y. Lu, S. Lu, F. Fotouhi, Y. Deng, and S. Brown (2004), ‘Incremental genetic K-means algorithm and its<br />
application <strong>in</strong> gene expression data analysis’, BMC Bio<strong>in</strong>formatics 5:172.<br />
[27] [27] Y. Lu, S. Lu, F. Fotouhi, Y. Deng, and S. Brown (2004), FGKA: A Fast Genetic K-means Cluster<strong>in</strong>g<br />
Algorithm’, ACM 1-58113-812-1.<br />
[28] Z. He, X. Xu, & S. Deng,(2005) Scalable algorithms for cluster<strong>in</strong>g categorical data, Journal of Computer<br />
Science and Intelligence Systems 20, 1077-1089.<br />
[29] A. Juan and E. Vidal, “Fast K-Means-like Cluster<strong>in</strong>g <strong>in</strong> Metric Space,” Pattern Recognition Letters, vol. 15,<br />
no. 1, pp. 19-25, 1994.<br />
[30] Decomposition Methodology for Knowledge Discovery and Data M<strong>in</strong><strong>in</strong>g, O. Maimon and L. Rokach, eds.,<br />
pp. 90-94. World Scientific, 2005.<br />
[31] W. McCormick, P. Schweitzer, and T. White, “Problem Decomposition and Data Reorganization by a<br />
Cluster Technique,”Operations Research, vol. 20, no. 5, pp. 993-1009, 1972. 29] Statistical Pattern<br />
Recognition. A. Webb, ed., pp. 345-357. John Wiley & Sons, 2002.<br />
[32] A. Gordon, Classification, second ed. Chapman and Hall, CRC, 1999.<br />
[33] S. Roweis and L. Saul, “Nonl<strong>in</strong>ear Dimensionality Reduction by Locally L<strong>in</strong>ear Embedd<strong>in</strong>g,” Science, vol.<br />
290, no. 5500, pp. 2323-2326, 2000.<br />
[34] J.B. Tenenbaum, V. Silva, and J. Langford, “A Global Geometric Framework for Nonl<strong>in</strong>ear Dimensionality<br />
Reduction,” Science, vol. 290, no. 5500, pp. 2319-2323, 2000.<br />
[35] J.C. Bezdek and R. Hathaway, “VAT: A Tool for Visual Assessment of (Cluster) Tendency,” Proc. Int’l<br />
Jo<strong>in</strong>t Conf. Neural Networks (IJCNN ’02), pp. 2225-2230, 2002.<br />
[36] M. Belk<strong>in</strong> and P. Niyogi, “Laplacian Eigenmaps and Spectral Techniques for Embedd<strong>in</strong>g and Cluster<strong>in</strong>g,”<br />
Proc. Advances <strong>in</strong> Neural Information Process<strong>in</strong>g Systems (NIPS), 2002.<br />
[37] M. Breitenbach and G. Grudic, “Cluster<strong>in</strong>g through Rank<strong>in</strong>g on Manifolds,” Proc. 22nd Int’l Conf.<br />
Mach<strong>in</strong>e Learn<strong>in</strong>g (ICML), 2005.<br />
[38] R.B. Catelli, “A Note on Correlation Clusters and Cluster Search Methods,” Psychometrika, vol. 9, no. 3,<br />
pp. 169-184, 1944.<br />
[39] P. Sneath, “A Computer Approach to Numerical Taxonomy,” J. General Microbiology, vol. 17, pp. 201-<br />
226, 1957.<br />
[40] T.C. Havens, J.C. Bezdek, J.M. Keller, M. Popescu, and J.M. Huband, “Is VAT Really S<strong>in</strong>gle L<strong>in</strong>kage <strong>in</strong><br />
Disguise?” Pattern Recognition Letters, 2008, <strong>in</strong> review.Liang Wang received the PhD.<br />
���<br />
294