
RESEARCH ARTICLE

Available ONLINE www.vsrdjournals.com

VSRD-IJCSIT, Vol. 2 (4), 2012, 285-295

Data Centric Knowledge Management System Using Post-Clustering Technique

1 Asadi Srinivasulu*, 2 Ch.D.V. Subba Rao and 3 M. Sreedevi

ABSTRACT

The purpose of the Data Centric Knowledge Management System (DCKMS) is to centralize the knowledge generated by employees working within and across functional areas, and to organize that knowledge so that it can be easily accessed, searched, browsed and navigated. It is a one-stop shop for finding solutions to problems. It provides a facility for employees to register themselves as ‘experts’ and to search for other ‘experts’ in case of any problem or requirement in their project. The system design is modularized into various categories. The system has an enriched UI so that a novice user does not face any operational difficulties. The system mainly concentrates on designing the various reports requested by users and by higher-level users, with export-to-Excel options. This paper addresses the expectations, organizational implications, and information processing requirements of the emerging knowledge management paradigm. A brief discussion of the enablement of the individual through the widespread availability of computer and communication facilities is followed by a description of the structural evolution of organizations and the architecture of a computer-based knowledge management system. The paper discusses two trends that are driven by the treatment of information and knowledge as a commodity: increased concern for the management and exploitation of knowledge within organizations, and the creation of an organizational environment that facilitates the acquisition, sharing and application of knowledge.

Keywords : Data, Data-Centric, Data Mart, Data Portal, Data Warehouse, Enabled Individual, Information, Information-Centric, Information Management, Knowledge, Knowledge Management, Ontology, Organizational Structure, Clustering, Data Mining, Fuzzy C-Means Clustering Algorithm, K-Means Clustering Algorithm.

1,3 Associate Professor, 2 Professor, 1,2,3 Department of Information Technology, Sree Vidyanikethan Engineering College, Tirupathi, Andhra Pradesh, INDIA. *Correspondence : srinu_asadi@yahoo.com

Asadi Srinivasulu et al / VSRD International Journal of CS & IT Vol. 2 (4), 2012

1. INTRODUCTION

The Data Centric Knowledge Management System is a web-based application that allows employees of a company to share their knowledge with others in the company. It also allows them to search for knowledge assets when in need. It provides a facility for employees to register themselves as ‘experts’ and to search for other ‘experts’ in case of any problem or requirement in their project. It is a one-stop shop for finding solutions to problems.

As information technology begins to permeate all aspects of life and the economy turns decidedly information-centric, wealth is increasingly defined in terms of information-related products and the availability of knowledge. Under these conditions employment, whether self-employment or organizational employment, is becoming singularly focused on the skills and capabilities of the individual. In other words, knowledge has become a commodity that has value far in excess of the manufactured products that represented the yardstick of wealth during the industrial age. How this new form of human wealth should be effectively utilized and nurtured in commercial and government organizations has in recent times become a major preoccupation of management.

Two parallel and related trends have emerged. The first trend is related to the management and exploitation of knowledge. The question being asked is: How can we capture and utilize the potentially available knowledge for the benefit of the organization? The phrase “…potentially available” is appropriate, because much of the knowledge is hidden in an overwhelming volume of computer-based data. What is not commonly understood is that the overwhelming nature of the stored data is due to current processing methods rather than volume. These processing methods have to rely largely on manual effort because only the human user can provide the necessary context for interpreting the computer-stored data into information and knowledge. If it were possible to capture information (i.e., data with relationships), rather than data, at the point of entry into the computer, then there would be sufficient context for computer software to process the information automatically into knowledge. This is not just a desirable

2. RELATED WORK

The main purpose of the functional requirements within the requirement specification document is to define all the activities or operations that take place in the system. These are derived through interactions with the users of the system. Since the Requirements Specification is a comprehensive document and contains a lot of data, it has been broken down into different chapters in this report. The depiction of the design of the system in UML is presented in a separate chapter. The Data Dictionary is presented in the Appendix. A more detailed discussion is presented in the part of this report that covers the Analysis & Design of the system. The general functional requirements arrived at through the interaction with the users are listed below.

• The Administrator can add a new employee, delete an existing employee, and view all the existing users of the system.

• The Administrator can create and delete user logins for different employees.

• The Administrator can view different reports (My Submission report, Ratings report, document status report, etc.).

• A K-User/K-Team Member/Reviewer can search for a document based on his criteria (author, technology, etc.).

• A K-User/K-Team Member/Reviewer can download a document.

• A K-User/K-Team Member/Reviewer can rate a document.

• A K-User/K-Team Member/Reviewer can submit a document.

• A K-User/K-Team Member/Reviewer can register as an expert.

• A K-User/K-Team Member/Reviewer can search for an expert.

• A K-Team Member/Reviewer can evaluate submitted documents for initial screening.

• A K-Team Member can manage the reviewers list.

• A K-Team Member can assign a document to a particular reviewer.

• A Reviewer can view the list of documents forwarded to him.

• A Reviewer can publish or reject a document.
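The role-based requirements listed above can be sketched as a simple permission table. This is only an illustrative sketch, not the system's actual implementation: the names `Role`, `PERMISSIONS`, `is_allowed` and all action strings are hypothetical, and the table records only what the list states explicitly.

```python
from enum import Enum, auto


class Role(Enum):
    ADMINISTRATOR = auto()
    K_USER = auto()
    K_TEAM_MEMBER = auto()
    REVIEWER = auto()


# Actions shared by K-Users, K-Team Members and Reviewers, per the list above.
# All action names are hypothetical labels chosen for this sketch.
COMMON_ACTIONS = frozenset({
    "search_document", "download_document", "rate_document",
    "submit_document", "register_as_expert", "search_expert",
})

PERMISSIONS = {
    Role.ADMINISTRATOR: frozenset({
        "add_employee", "delete_employee", "view_users",
        "create_login", "delete_login", "view_reports",
    }),
    Role.K_USER: COMMON_ACTIONS,
    Role.K_TEAM_MEMBER: COMMON_ACTIONS | {
        "screen_document", "manage_reviewers", "assign_reviewer",
    },
    Role.REVIEWER: COMMON_ACTIONS | {
        "screen_document", "view_assigned_documents",
        "publish_document", "reject_document",
    },
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if the given role may perform the action."""
    return action in PERMISSIONS[role]
```

For example, `is_allowed(Role.REVIEWER, "publish_document")` is true, while `is_allowed(Role.K_USER, "publish_document")` is false. Keeping the rules in one table means that a change to a role's rights touches a single place.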

3. EXISTING ALGORITHM

Fig. 1 : Context Level Diagram

In the existing system, the company maintains all the knowledge-based documents in a separate system that is accessible to all employees through the LAN. Employees can post new documents to it and access earlier documents. Searching for related documents based on author, technology, etc. is a time-consuming process. Managing the documents category-wise and restricting access to them based on user type becomes complicated. The system does not prevent unnecessary documents from being posted.

DRAWBACKS:

• Difficulty in maintaining security levels for the documents.

• Difficulty in browsing, navigating and searching for a required document.

• Difficulty in giving ratings for the documents.

• Information kept in this manner is subject to damage.

• Difficulty in preventing employees from updating the documents.

• Difficulty in generating different reports.

4. PROPOSED SYSTEM

The proposed system is fully computerized and removes the drawbacks of the existing system. It allows employees of the company to upload their knowledge documents into the system, where they are verified by next-level users to avoid unnecessary documents. It also allows employees to search for knowledge assets easily when in need. It provides a facility for employees to register themselves as ‘experts’ and to search for other ‘experts’ in case of any problem or requirement in their project. It also provides a facility for the evaluator to rate the documents posted by the employees.

ADVANTAGES:

• It provides a facility to share knowledge documents across the company.

• It allows employees to upload and download documents from their own systems.

• It makes browsing, navigating and searching for required documents easy.

• It provides a facility to prevent unnecessary documents from being posted.

• It provides a flexible way of generating different reports.

• With the new approach, information can be accessed from anywhere with a mouse click. This saves users a lot of time and provides them with up-to-date information. The centralized database helps in avoiding conflicts.

• It provides a rich user interface (“look and feel”) so that users can access information with the least effort.

• It allows documents to be rated at different levels.

• It allows documents to be published or rejected.
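The verification step described in this section, together with the screening, assignment and publish/reject actions listed in Section 2, suggests a small document life cycle. The sketch below is one possible reading of it, not the system's actual design: the status names, the action names and the strict ordering of the steps are all assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"
    SCREENED = "screened"
    ASSIGNED = "assigned"
    PUBLISHED = "published"
    REJECTED = "rejected"


# Allowed transitions: (current status, action) -> next status.
# The ordering submit -> screen -> assign -> publish/reject is assumed.
TRANSITIONS = {
    (Status.SUBMITTED, "screen"): Status.SCREENED,
    (Status.SCREENED, "assign"): Status.ASSIGNED,
    (Status.ASSIGNED, "publish"): Status.PUBLISHED,
    (Status.ASSIGNED, "reject"): Status.REJECTED,
}


def advance(status: Status, action: str) -> Status:
    """Apply an action to a document status, or raise ValueError if not allowed."""
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(
            f"cannot {action!r} a document that is {status.value}"
        ) from None
```

Starting from `Status.SUBMITTED`, applying `"screen"`, `"assign"` and `"publish"` in turn yields `Status.PUBLISHED`, whereas trying to publish a document that has not been screened raises `ValueError`. This is how a table of this kind keeps unverified documents from being posted.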

4.1. K-MEANS ALGORITHM

Step 1) Take the first K feature vectors as the initial centers.

Step 2) Assign each sample vector to the cluster whose center is nearest (minimum-distance assignment principle).

Step 3) Compute the average of each cluster as its new center.

Step 4) If any center has changed, go to Step 2; otherwise terminate.
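The four steps above can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' code: the function name `kmeans`, the use of Euclidean distance, the `max_iter` safety cap and the handling of empty clusters are assumptions that the steps themselves do not specify.

```python
import math


def kmeans(samples, k, max_iter=100):
    """Cluster `samples` (a list of equal-length numeric tuples) into k clusters.

    Returns (centers, labels), where labels[i] is the cluster index of samples[i].
    """
    if not 0 < k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")

    # Step 1: the first k feature vectors are the initial centers.
    centers = [tuple(map(float, s)) for s in samples[:k]]
    labels = [0] * len(samples)

    for _ in range(max_iter):  # assumed cap so the loop always ends
        # Step 2: assign each sample to the nearest center (Euclidean distance).
        labels = [
            min(range(k), key=lambda j: math.dist(s, centers[j]))
            for s in samples
        ]

        # Step 3: the mean of each cluster becomes its new center.
        new_centers = []
        for j in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(centers[j])

        # Step 4: stop once no center has changed.
        if new_centers == centers:
            break
        centers = new_centers

    return centers, labels
```

Calling `kmeans([(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.1, 4.9), (0.9, 1.1)], 2)` places the first, third and fifth points in cluster 0 and the other two in cluster 1. Because Step 1 uses the first K samples as seeds, the result depends on the order of the input.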

4.2. K-MEANS

Fig. 5 : Applying Clustering Technique Similarity Weight and Filter Method

Fig. 6 : Results of Clustering Showing Groups Divided Into Clusters

Fig. 7 : Initialization and Input

Fig. 8 : Final EMST Edges Path

Fig. 1 : Graph for K-Means

K-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. It is a popular clustering method that uses prototypes (centroids) to represent clusters by minimizing within-cluster errors. The main idea is to define k centroids, one for each cluster.

These centroids should be placed carefully, because different initial locations lead to different results. The next step is to take each point belonging to the given data set and associate it with the nearest centroid. The centroid of each cluster is then recomputed as the average of the points assigned to it (Step 3 in Section 4.1). After we have these k new centroids, a new binding has to be done between the same data set points and the nearest new centroid. This is repeated until the centroids no longer change. Finally, this algorithm aims at minimizing an objective function.

The objective function :
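A common form of the K-means objective, consistent with the description of minimizing within-cluster errors, is the within-cluster sum of squared distances. The symbols used here ($x_i$ for a data point, $c_j$ for the centroid of cluster $j$, and $S_j$ for the set of points assigned to cluster $j$) are notation assumed for this sketch and may differ from the paper's own:

```latex
J = \sum_{j=1}^{k} \sum_{x_i \in S_j} \left\lVert x_i - c_j \right\rVert^{2}
```

With Euclidean distance, neither the assignment step nor the centroid update can increase $J$, which is why the procedure in Section 4.1 settles on a fixed set of centers.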

We apply the above algorithm in our project by taking input attributes such as the number of assignments submitted, the number of tasks completed successfully, and the number of face-to-face interactions among team members. Applying the algorithm divides the groups into k clusters. The groups in each cluster show nearly similar behavior, which is why they are grouped into the same cluster. This makes it easy for the facilitator to give feedback, since he can give feedback to an entire cluster instead of to each individual group.
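As a small illustration of giving feedback per cluster, the sketch below assigns each group to its nearest cluster center and collects the group names by cluster. The group names, attribute values and cluster centers are invented for this example, and Euclidean distance is an assumption. In practice the centers would come from a K-means run.

```python
import math
from collections import defaultdict

# Hypothetical groups: (assignments submitted, tasks completed, face-to-face meetings).
groups = {
    "G1": (9, 8, 6),
    "G2": (2, 1, 1),
    "G3": (8, 9, 5),
    "G4": (3, 2, 0),
}

# Hypothetical cluster centers, e.g. as produced by a K-means run with k = 2.
centers = [(8.5, 8.5, 5.5), (2.5, 1.5, 0.5)]


def groups_by_cluster(groups, centers):
    """Map each cluster index to the names of the groups nearest to its center."""
    clusters = defaultdict(list)
    for name, features in groups.items():
        nearest = min(
            range(len(centers)),
            key=lambda j: math.dist(features, centers[j]),
        )
        clusters[nearest].append(name)
    return dict(clusters)


# One feedback message per cluster instead of one per group.
feedback_targets = groups_by_cluster(groups, centers)
```

Here `feedback_targets` maps cluster 0 to G1 and G3 and cluster 1 to G2 and G4, so the facilitator writes two feedback messages rather than four.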

5. RESULTS

Fig. : This Screen Is Login Page for All Users and Administrator

Fig. : Administrator Can Find the Experts for Getting the Assistance

Fig. : Administrator Can Register As Experts

Fig. : This Screen Shows the K Team Actions

6. CONCLUSION

The new system, the Data Centric Knowledge Management System, has been implemented to cater to the needs of company employees in sharing different knowledge assets effectively with role-based access. The present system has been integrated with the already existing system. The database is hosted on a MySQL server and is accessed through JDBC. The database is accessible through the intranet from any location. The system has been found to meet the requirements of the users and departments satisfactorily. The database system must provide for the safety of the information stored, despite system crashes or attempts at unauthorized access. If data are to be shared among several users, the system must avoid possible anomalous results.

As a future enhancement, the system offers a high level of extensibility: it provides all the basic features and allows them to be extended easily without disturbing the existing code. The application can be made an Internet application if desired, and it can be adapted to other deployments just by changing the deployment files. Further features could be added, such as giving Internet users access so that they can take part in this process.

