10.07.2015 Views

Web Mining and Social Networking: Techniques and ... - tud.ttu.ee

6 Web Usage Mining

(2) Degree of User Interests. The more times a user clicks on a topic, the more interested the user is in it. The user's click counts can produce a complementary rank list of search results.

(3) Google List. Google applies its patented PageRank technology to the Google Directory to rank the sites. To keep our rank aggregation from missing the high-quality Web pages in Google, the original rank list of Google Directory Search is considered as well.

In [159] the authors study the problem of combining sets of rank lists from different attributes of user preferences into a single rank list. Voting provides a traditional class of algorithms for determining the aggregated rank list. The most common voting theory, named after its creator, is known as Borda's rule [38], which argues that the majority opinion is the truth, or at least the closest that we can come to determining it [266]. However, the problem with Borda's rule is that it does not optimize any criterion. Footrule distance [74] is used to weight the edges of a bipartite graph, in which a minimum-cost matching is then found. This method was proved in [82] to yield a ranking that approximately minimizes the number of disagreements with the given inputs.

In [159], experimental results on a real click-through data set demonstrated the effectiveness of the authors' methods. They argued that some rank-based aggregation methods performed better than the Score-Based method.

Summary

In this chapter, we have reviewed and summarized a number of Web usage mining algorithms. The reported algorithms are mainly related to two well-studied data mining paradigms, namely clustering and latent semantic analysis. In the first section, we described the procedures for modeling user interest and capturing user clusters. In the following two sections, we elaborated on the integration of latent semantic analysis into Web usage mining. In Section 6.4, we reported the study of co-clustering of Web logs. Section 6.5 listed a couple of real-world Web usage mining applications. Besides the technical coverage of algorithmic issues, some interesting and insightful experimental investigations have also been presented.
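
As an illustration of the voting-based rank aggregation discussed in this chapter, the following is a minimal sketch of a Borda count over several rank lists, together with the footrule distance between two rankings. It is not the method of [159]. The function names, the scoring scheme (an item at position p among n items earns n - 1 - p points), the alphabetical tie-break, and the three sample lists are all assumptions made for this example.

```python
from collections import defaultdict


def borda_aggregate(rank_lists):
    """Merge rank lists (best item first) into one list with a Borda count.

    Assumption: an item at 0-based position p in a list earns n - 1 - p
    points, where n is the number of distinct items over all lists.
    Items missing from a list earn no points from that list.
    """
    items = set().union(*rank_lists)
    n = len(items)
    scores = defaultdict(int)
    for ranking in rank_lists:
        for position, item in enumerate(ranking):
            scores[item] += n - 1 - position
    # Highest score first; ties are broken alphabetically (an assumption).
    return sorted(items, key=lambda item: (-scores[item], item))


def footrule_distance(rank_a, rank_b):
    """Sum of absolute position differences between two rankings.

    Both rankings must contain exactly the same items.
    """
    position_in_b = {item: i for i, item in enumerate(rank_b)}
    return sum(abs(i - position_in_b[item]) for i, item in enumerate(rank_a))


# Hypothetical rank lists, one per attribute of user preference.
click_list = ["a", "b", "c", "d"]
interest_list = ["b", "a", "d", "c"]
google_list = ["a", "c", "b", "d"]

aggregated = borda_aggregate([click_list, interest_list, google_list])
print(aggregated)
print(footrule_distance(aggregated, interest_list))
```

The Borda count only sums positional scores, which is why it does not optimize any criterion. The footrule distance, by contrast, gives a quantity that an aggregation method can try to minimize against the input lists.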
