10.07.2015 Views

Web Mining and Social Networking: Techniques and ... - tud.ttu.ee

6 Web Usage Mining

\[
ap_k = \frac{\sum_{s_i \in R_k} \theta_{i,k} \cdot s_i}{|R_k|} \tag{6.30}
\]

where $|R_k|$ is the number of chosen user sessions in $R_k$.

Step 3: Output a set of task-specific user access patterns TAP corresponding to the $t$ tasks, $TAP = \{ap_k, k = 1, \cdots, t\}$. In this expression, each user access pattern is represented by a weighted page vector, where the weights indicate the relative visit preferences for pages exhibited by all user sessions associated with this task-specific access pattern.

6.4 Co-Clustering Analysis of Weblogs Using Bipartite Spectral Projection Approach

In previous sections, we broadly discussed Web clustering in Web usage mining. In this context, Web clustering can be performed on either Web pages or user sessions. Web page clustering is one of the popular topics in Web clustering; it aims to discover groups of Web pages that share similar functionality or semantics. For example, [114] proposed an LSH (Locality-Sensitive Hashing) technique for clustering the entire Web, concentrating on the scalability of clustering. Snippet-based clustering is well studied in [92]. [147] reported using hierarchical monothetic document clustering to summarize search results. [121] proposed a Web page clustering algorithm that measures page similarity in terms of correlation. In contrast to Web page clustering, Web usage clustering aims to discover Web user behavior patterns and associations between Web pages and users from the perspective of the Web user. In practice, Mobasher et al. [184] combined user transaction and pageview clustering techniques, employing the traditional k-means clustering algorithm to characterize user access patterns for Web personalization based on mining Web usage data. In [258], Xu et al. attempted to discover user access patterns and Web page segments from Web log files by utilizing a Probabilistic Latent Semantic Analysis (PLSA) model.

The clustering algorithms described above mainly operate on only one dimension (or attribute) of the Web usage data, i.e. users alone or pages alone, rather than taking into account the correlation between Web users and pages. However, in most cases Web object clusters exist in the form of co-occurrences of pages and users: the users from the same group are particularly interested in one subset of Web pages. For example, in customer behavior analysis in e-commerce, this observation could correspond to the phenomenon that one specific group of customers shows strong interest in one particular category of goods. In this scenario, Web co-clustering is probably an effective means of capturing such joint user-page structure. Co-clustering was first proposed to deal with the co-clustering of documents and words in digital libraries [73]. It has since been widely used in studies that involve multiple-attribute analysis, such as social tagging systems [98] and genetic representation [108]. In this section, we will propose a co-clustering algorithm for Web usage mining based on bipartite spectral clustering.
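The co-occurrence idea described above can be illustrated with a small, self-contained sketch. The details of the algorithm proposed in this section are not given in the passage, so the code below is only an assumption-laden stand-in: it uses a commonly described form of bipartite spectral partitioning (scale the session-page matrix by the square roots of its row and column sums, take the second singular vector pair, split by sign) and handles only the two-cluster case. The function name `co_cluster_two_way`, the toy matrix `sessions`, and the use of power iteration instead of a library SVD are all illustrative choices, not taken from the source.

```python
import math

def co_cluster_two_way(A, iters=200):
    """Split rows (users) and columns (pages) of a non-negative matrix A
    into two co-clusters. Assumes every row and column sum is positive
    and that A is not rank one (hypothetical helper for illustration)."""
    m, n = len(A), len(A[0])
    row_sum = [sum(A[i]) for i in range(m)]
    col_sum = [sum(A[i][j] for i in range(m)) for j in range(n)]

    # Normalised matrix An = D1^{-1/2} A D2^{-1/2}.
    An = [[A[i][j] / math.sqrt(row_sum[i] * col_sum[j]) for j in range(n)]
          for i in range(m)]

    # The top right singular vector of An is known in closed form:
    # it is proportional to sqrt(col_sum) and carries no cluster information.
    total = sum(row_sum)
    v1 = [math.sqrt(c / total) for c in col_sum]

    def deflate_and_normalise(v):
        # Remove the component along v1, then rescale to unit length.
        dot = sum(a * b for a, b in zip(v, v1))
        v = [a - dot * b for a, b in zip(v, v1)]
        norm = math.sqrt(sum(a * a for a in v))
        return [a / norm for a in v]

    # Power iteration on An^T An, restricted to the complement of v1,
    # converges to the second right singular vector.
    v = deflate_and_normalise([float(j + 1) for j in range(n)])
    for _ in range(iters):
        u = [sum(An[i][j] * v[j] for j in range(n)) for i in range(m)]
        v = deflate_and_normalise(
            [sum(An[i][j] * u[i] for i in range(m)) for j in range(n)])

    # Matching left singular direction (unnormalised; only the sign is used).
    u = [sum(An[i][j] * v[j] for j in range(n)) for i in range(m)]

    # Users and pages whose entries share a sign form one co-cluster.
    user_labels = [0 if x >= 0 else 1 for x in u]
    page_labels = [0 if x >= 0 else 1 for x in v]
    return user_labels, page_labels

# Toy session-page matrix (made-up visit weights): users 0-2 mostly visit
# pages 0-2, users 3-5 mostly visit pages 3-4, with a few stray visits.
sessions = [
    [5, 4, 3, 0, 0],
    [4, 5, 4, 0, 1],
    [3, 4, 5, 0, 0],
    [0, 0, 1, 5, 4],
    [0, 1, 0, 4, 5],
    [0, 0, 0, 5, 5],
]
user_labels, page_labels = co_cluster_two_way(sessions)
print("user labels:", user_labels)
print("page labels:", page_labels)
```

Users and pages that receive the same label belong to the same co-cluster, so a group of users is obtained together with the subset of pages it prefers. Which group is labelled 0 and which 1 is arbitrary, because a singular vector is only defined up to sign. For more than two co-clusters, a usual extension is to keep several singular vectors and run k-means on the stacked user and page coordinates; that step is omitted here.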
