7 - Indira Gandhi Centre for Atomic Research
3.4 Classification approaches
Classification approaches group data into classes according to their similarities. The Bayesian
approach, which uses probabilities and a graphical means of representation, is considered a
type of classification. Bayesian networks are typically used when the uncertainty associated
with an outcome can be expressed as a probability. This approach relies on
encoded domain knowledge and has been used for diagnostic systems. Other examples of
classification approaches are the decision tree approach and pattern discovery and data
cleaning models. Decision trees are hierarchical structures in which each internal node
contains a test on an attribute, each branch corresponds to an outcome of the test, and each
leaf node gives a prediction for the value of the class variable. The classification approach is
useful for organizing the potential metadata of a digital library for the knowledge generation
process.
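The decision tree structure described above can be sketched in a few lines of Python. This is only an illustration, not a method taken from this paper: the names `Leaf`, `Node` and `classify`, the attributes `doc_type` and `peer_reviewed`, and the class labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    """Leaf node: holds the predicted value of the class variable."""
    label: str


@dataclass
class Node:
    """Internal node: tests one attribute; each branch is one test outcome."""
    attribute: str
    branches: Dict[str, Union["Node", Leaf]]


def classify(tree: Union[Node, Leaf], record: Dict[str, str]) -> str:
    """Walk from the root to a leaf, following the branch for each test outcome."""
    while isinstance(tree, Node):
        outcome = record[tree.attribute]
        tree = tree.branches[outcome]
    return tree.label


# Hypothetical tree over metadata attributes (illustrative values only).
tree = Node(
    attribute="doc_type",
    branches={
        "report": Leaf("technical report"),
        "article": Node(
            attribute="peer_reviewed",
            branches={"yes": Leaf("journal paper"), "no": Leaf("preprint")},
        ),
    },
)

record = {"doc_type": "article", "peer_reviewed": "yes"}
print(classify(tree, record))
```

Here the tree is written by hand to show the structure; in practice such a tree would normally be learned from labelled records.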
4. Knowledge Discovery in Digital Libraries<br />
The Library has long been the center for the preservation, utilization and distribution of
information and knowledge. A Digital Library has a much greater capacity for Knowledge
Management. The current Digital Library Architecture should include classification and
thesaurus, the vocabulary control and knowledge organizing tools that serve three
purposes in a traditional library: the description, organization and retrieval of information.
For more effective and efficient exploration, networked information should be organized
in advance, together with vigorous improvement of search techniques. Classifications and
thesauri, which contain condensed intelligence, can be used to organize networked
information, especially metadata, to improve the usability of information resources and
help turn the Digital Library into a Knowledge Management system.
Classification and thesaurus can be merged into a concept network, and the metadata can
be distributed into the nodes of the network according to their subjects. An abstract concept
node substantiated with the related metadata records becomes a knowledge node. This
forms a consistent knowledge network that is not only a framework for resource
organization, but also a structure for knowledge navigation, retrieval, and learning.
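A minimal sketch of such a concept network follows, assuming that the classification is available as child-to-parent links and the thesaurus as sets of related terms. The names `ConceptNode`, `build_network` and `distribute`, the sample subjects and the record identifiers are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ConceptNode:
    """A concept drawn from the classification scheme and the thesaurus."""
    name: str
    broader: Set[str] = field(default_factory=set)    # classification hierarchy
    related: Set[str] = field(default_factory=set)    # thesaurus relations
    records: List[str] = field(default_factory=list)  # attached metadata record ids

    @property
    def is_knowledge_node(self) -> bool:
        # A concept node becomes a knowledge node once metadata is attached.
        return bool(self.records)


def build_network(classification: Dict[str, str],
                  thesaurus: Dict[str, Set[str]]) -> Dict[str, ConceptNode]:
    """Merge child->parent classification links and thesaurus relations."""
    network: Dict[str, ConceptNode] = {}

    def node(name: str) -> ConceptNode:
        return network.setdefault(name, ConceptNode(name))

    for child, parent in classification.items():
        node(child).broader.add(parent)
        node(parent)  # make sure the parent concept exists too
    for term, related_terms in thesaurus.items():
        for other in related_terms:
            node(term).related.add(other)
            node(other).related.add(term)  # assumption: "related" is symmetric
    return network


def distribute(network: Dict[str, ConceptNode],
               metadata: Dict[str, List[str]]) -> None:
    """Attach each metadata record to the nodes named by its subjects."""
    for record_id, subjects in metadata.items():
        for subject in subjects:
            if subject in network:
                network[subject].records.append(record_id)


# Illustrative data only.
classification = {"fast reactors": "nuclear reactors",
                  "nuclear reactors": "nuclear engineering"}
thesaurus = {"fast reactors": {"sodium coolant"}}
metadata = {"rec-001": ["fast reactors", "sodium coolant"],
            "rec-002": ["nuclear reactors"]}

network = build_network(classification, thesaurus)
distribute(network, metadata)
knowledge_nodes = sorted(n.name for n in network.values() if n.is_knowledge_node)
print(knowledge_nodes)
```

Concepts without any attached record, such as "nuclear engineering" in this sample, remain abstract concept nodes but still support navigation through their broader and related links.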
Bibliographic data is one of the most important resources of a library and is
useful for the knowledge discovery process. Based on subject indexing, the bibliographic
data can be combined with the classification and thesaurus to form a knowledge structure,
which provides a skeleton for the organization of bibliographic data. Corpus knowledge can
be formed when new terms are extracted automatically from the bibliographic data to
update the classification and thesaurus. Such a knowledge network provides the user with
an opportunity for navigation, searching and learning.
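The paper does not say how new terms are extracted. One naive possibility, shown here purely as an assumption, is to propose words that recur across bibliographic titles but are missing from the controlled vocabulary. The function `candidate_terms`, the stopword list, the threshold `min_count` and the sample titles are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List, Set

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def candidate_terms(titles: Iterable[str], vocabulary: Set[str],
                    min_count: int = 2) -> List[str]:
    """Return words that recur in the titles but are missing from the vocabulary."""
    counts: Counter = Counter()
    for title in titles:
        # Count each word at most once per bibliographic record.
        words = set(re.findall(r"[a-z]+", title.lower()))
        counts.update(w for w in words
                      if w not in STOPWORDS and w not in vocabulary)
    return sorted(w for w, c in counts.items() if c >= min_count)


# Illustrative data only.
titles = [
    "Ontology mapping for digital library metadata",
    "Ontology driven retrieval in a digital library",
    "Metadata harvesting with thesaurus support",
]
vocabulary = {"digital", "library", "metadata", "thesaurus", "retrieval"}

new_terms = candidate_terms(titles, vocabulary)
vocabulary.update(new_terms)  # proposed terms extend the controlled vocabulary
print(new_terms)
```

A frequency threshold of this kind only produces candidates; deciding where a new term belongs in the classification or thesaurus would still need subject expertise or a more capable extraction method.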
5. Conclusion
The knowledge discovery process has evolved, and continues to evolve, from the intersection of
research fields such as machine learning, pattern recognition, databases, statistics and Artificial
Intelligence. While Knowledge Discovery tools hold the promise of an enabling
technology that could unlock the knowledge lying dormant in huge databases, they suffer
from some shortcomings, such as problems in representing multiple interrelated relations in
databases, incremental rule generation when the database is expanded, and finally consistency