10.07.2015 Views

Web Mining and Social Networking: Techniques and ... - tud.ttu.ee




6 Web Usage Mining

Examples of Latent Semantic Factors

In this section, we present some results on Web usage patterns and latent semantic factors obtained from experiments on two selected Web log datasets. The first dataset is named KDDCUP¹. In this dataset, the entries in the session-page matrix are determined by the number of Web page hits, since the number of times a user comes back to a specific page is a good measure of the user's interest in that page. The second dataset was downloaded from msnbc.com² and describes the page visits by users who visited msnbc.com on September 28, 1999. This dataset is named the "msnbc" dataset.

The experimental results on these two datasets are tabulated in Tables 6.1 and 6.2. Table 6.1 first lists 13 extracted latent factors and their corresponding characteristic descriptions from the KDDCUP dataset. Table 6.2 depicts 3 factor examples selected from the whole factor space in terms of associated page information, including page number, probability and description. From this table, it is seen that factor #3 indicates concerns about vendor service messages such as customer service, contact number, payment methods and delivery support. Factor #7 describes the specific steps that occur in an Internet shopping scenario, which may include customer login, product order, express checkout and financial information input, whereas factor #13 captures another focus exhibited by Web content, revealing that some Web users may pay more attention to information about the department itself.

Table 6.1. Latent factors and their characteristic descriptions from KDDCUP

Factor #   Characteristic title
1          Department search results
2          ProductDetailLegwear
3          Vendor service
4          Freegift
5          ProductDetailLegcare
6          Shopping cart
7          Online shopping
8          Lifestyle assortment
9          Assortment2
10         Boutique
11         Department replenishment
12         Department article
13         Home page

As for the "msnbc" dataset, it is hard to extract the exact latent factor space, as the page information provided is described at a coarser granularity level, i.e. the URL category level. Hence we only list two examples of discovered latent factors to illustrate the general usage knowledge

¹ www.ecn.purdue.edu/kddcup
² http://kdd.ics.uci.edu/databases/
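
The hit-count weighting of the session-page matrix described for the KDDCUP dataset can be illustrated with a minimal Python sketch. The function name `build_session_page_matrix`, the `(session_id, page_id)` record layout, and the sample page names are assumptions made for this illustration; they are not taken from the dataset or from the authors' implementation.

```python
from collections import Counter


def build_session_page_matrix(hits):
    """Build a session-page matrix whose entries are page hit counts.

    `hits` is a sequence of (session_id, page_id) pairs, one pair per
    page hit (a hypothetical record layout, assumed for this sketch).
    """
    hits = list(hits)  # allow any iterable, since we traverse it several times
    sessions = sorted({session for session, _ in hits})  # row labels
    pages = sorted({page for _, page in hits})           # column labels
    counts = Counter(hits)  # (session, page) -> number of hits
    # Entry [i][j] is how often session i hit page j (0 if never).
    matrix = [[counts[(s, p)] for p in pages] for s in sessions]
    return sessions, pages, matrix


# Hypothetical log: session "s1" returns to the "home" page twice.
sample_hits = [
    ("s1", "home"),
    ("s1", "cart"),
    ("s1", "home"),
    ("s2", "cart"),
]
sessions, pages, matrix = build_session_page_matrix(sample_hits)
```

With this sample, the row for session "s1" holds a larger entry for "home" than for "cart", reflecting the repeated visits that the text treats as a signal of user interest.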
