10.07.2015 Views

Web Mining and Social Networking: Techniques and ... - tud.ttu.ee


9 Conclusions

...ries commonly used in matrix-based analysis are presented accordingly, such as matrix eigenvalues and eigenvectors, norms, the singular value decomposition (SVD) of a matrix, and three-way tensor expression and decomposition. In addition, a number of well-known performance evaluation metrics in the context of information retrieval and recommendation are reviewed, and the basic concepts of social networks are summarized as well. The chapter on algorithms and techniques covers three main areas: fundamental data mining algorithms, such as association rules, sequential patterns, Markov models and Bayesian networks, clustering and classification; Web recommendation algorithms, e.g. content-based, user-based, model-based and kNN; and the detection and evolution analysis algorithms of social networks. Then, the book presents comprehensive material on two focuses: how to use Web data mining to capture the inherent cohesion between various Web pages and between pages and users, and how to exploit or extend Web data mining in social and collaborative applications.

The former is covered in Chapter 4, Chapter 5 and Chapter 6, in which we systematically discuss the research and application issues of Web mining from the different perspectives of Web content, linkage and usage mining. Chapter 4 presents material on Web content mining.
Following the vector space model, Web search is first addressed, covering the methodologies of crawling, archiving and indexing content, and searching strategies. To overcome the challenges of sparse and low-overlap textual features embedded in pages, feature enrichment and latent semantic analysis methods are given in turn. Moreover, two extended applications of content analysis, automatic topic extraction and opinion mining from Web documents, demonstrate the application potential of this domain. Chapter 5 addresses another important issue in Web mining, i.e. Web linkage mining. Starting with the principles of co-citation and bibliographic coupling, which come from information science, this chapter presents highly summarized material on two well-known algorithms in Web search, namely PageRank and HITS, which have been thoroughly investigated and widely cited in the literature since the great success of the Google search engine. In addition to these two algorithms, this chapter also presents the topic of Web community discovery. Concepts and algorithms such as bipartite cores, network flow, cut-based notions of communities and the Web community chart are discussed in depth, along with the theories of Web graph measurement and modeling. An extended application of Web linkage analysis to Web page classification is presented to show how linkage analysis facilitates Web page organization and presentation.
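As a reminder of what the PageRank algorithm mentioned above computes, the following is a minimal sketch of the standard power-iteration formulation, not code taken from the book. The function name `pagerank`, the toy graph `toy_web`, the damping factor of 0.85 and the uniform redistribution of rank from pages without out-links are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank sketch.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its score; scores sum to 1.
    """
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Assumption: rank held by pages with no out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes its rank in equal shares to the pages it links to.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new_rank[q] += damping * rank[p] / len(outs)

        # Stop once the scores change by less than the tolerance (L1 distance).
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical four-page Web graph used only for illustration.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(toy_web)
best_page = max(scores, key=scores.get)
```

In this toy graph, page C collects links from A, B and D, so it ends up with the highest score, while D, which no page links to, keeps only the baseline share `(1 - damping) / n`.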
Unlike Chapters 4 and 5, in which Web mining is mainly performed on Web pages alone, Chapter 6 reports research and application progress from the point of view of the interaction between users and machines, i.e. Web usage mining. By introducing the usage data model of the user-pageview matrix, this chapter first gives the idea of modeling user navigation interest in terms of page-weight pair vectors, and proposes clustering-based algorithms to measure the similarity of user interests and, in turn, to find user navigational patterns. In addition to clustering, latent semantic analysis and its variants are explored for use in Web usage mining. Two recently well-studied latent semantic analysis algorithms, namely PLSA and LDA, are presented and elaborated
