10.07.2015 Views

Web Mining and Social Networking: Techniques and ... - tud.ttu.ee

6 Web Usage Mining

With the spread and popularity of search engines, Web Usage Mining has attracted considerable interest in how to improve search performance by learning usage knowledge. Moreover, research on Web communities has benefited greatly from advances in Web Usage Mining. In this section, we review some studies carried out in this area.

6.5.1 Mining Web Logs to Improve Website Organization

A good design and organization is essential to a website's attractiveness and popularity in Web applications, and is often a basic requirement for a site's success. However, it is not an easy task for a Web designer to meet the aims initially set. There are several reasons for this. First, different users have their own navigational tasks and therefore follow different access traces. Second, even the same user may have different information needs at different times. Third, the website may not be logically organized, with individual Web pages aggregated and placed in inappropriate positions, so that users have difficulty locating the information they need. Furthermore, a site may be designed for a particular kind of use but be used in many different ways in practice, so the designer's original intent is not fully realized. All of these reasons affect how satisfying the website is to use.
Looking more deeply into the causes, we can see that the problem arises mainly because early website design and organization reflect only the intents of the website designers or developers, while user opinions and tastes are not sufficiently taken into account. Inspired by this observation, Web Usage Mining techniques have been proposed to improve website design and organization. Essentially, the knowledge learned from Web Usage Mining reveals user navigational behavior and can be leveraged to improve the site organization.

Adaptive Web Sites

In [204, 203] the authors proposed an approach to address the above challenges by creating adaptive websites. The approach allows Web sites to automatically improve their organization and presentation by learning from visitor access patterns. Unlike other methods such as customized Web sites, which personalize the Web page presentation for individual users, the proposed approach focuses on site optimization through the automatic synthesis of index pages. The basic idea is to learn the user access patterns and then synthesize a number of new index pages that represent the users' access interests. The approach is called the PageGather algorithm.
Since the algorithm only needs to learn usage information and does not make destructive changes to the original Web sites, the authors claimed that the strength of the approach is nondestructive transformation.

The PageGather algorithm is based on the basic assumption of visit-coherence: the pages a user visits during one interaction with the site tend to be conceptually related. It uses cluster mining to find aggregations of related Web pages at a site from the access log. The whole process of the proposed algorithm consists of four sub-steps:

Algorithm 6.11 PageGather algorithm
Step 1. Process the access log into visits.
Step 2. Compute the co-occurrence frequencies between pages and create a similarity matrix.
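Steps 1 and 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that the text above does not specify: the log is taken to be already parsed into (visitor, timestamp in seconds, page) tuples, a visit is assumed to end after 30 minutes of inactivity, and the similarity of two pages is assumed to be the smaller of the two conditional probabilities of visiting one page given the other. The names split_into_visits, similarity_matrix and timeout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def split_into_visits(log, timeout=1800):
    """Step 1 (sketch): group (visitor, timestamp, page) records into visits.

    Assumption: a new visit starts when the same visitor has been idle
    for more than `timeout` seconds. Each visit is a set of pages.
    """
    by_visitor = defaultdict(list)
    for visitor, ts, page in log:
        by_visitor[visitor].append((ts, page))

    visits = []
    for records in by_visitor.values():
        records.sort()  # order each visitor's requests by time
        current, last_ts = set(), None
        for ts, page in records:
            if last_ts is not None and ts - last_ts > timeout:
                visits.append(current)  # idle gap: close the visit
                current = set()
            current.add(page)
            last_ts = ts
        visits.append(current)  # close the visitor's last visit
    return visits


def similarity_matrix(visits):
    """Step 2 (sketch): co-occurrence frequencies as a sparse matrix.

    Assumption: sim(a, b) = min(P(a|b), P(b|a)), which equals
    n(a, b) / max(n(a), n(b)) when counts are taken over visits.
    Returns a dict keyed by sorted page pairs; absent pairs are 0.
    """
    page_count = defaultdict(int)
    pair_count = defaultdict(int)
    for visit in visits:
        for page in visit:
            page_count[page] += 1
        for a, b in combinations(sorted(visit), 2):
            pair_count[(a, b)] += 1

    return {
        (a, b): n / max(page_count[a], page_count[b])
        for (a, b), n in pair_count.items()
    }


if __name__ == "__main__":
    # Hypothetical, hand-made log for illustration only.
    log = [
        ("u1", 0, "A"), ("u1", 10, "B"), ("u1", 5000, "A"),
        ("u2", 0, "A"), ("u2", 20, "B"), ("u2", 30, "C"),
    ]
    print(similarity_matrix(split_into_visits(log)))
```

Storing only the pairs that actually co-occur keeps the matrix sparse, which matters for sites with many pages. Taking the minimum of the two conditional probabilities is one way to avoid rating two pages as similar merely because one of them is visited in almost every session.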
