10.07.2015 Views

Web Mining and Social Networking: Techniques and ... - tud.ttu.ee

172 8 Web Mining and Recommendation Systems

Cosine-based Similarity

In this case, two items are treated as two vectors in the m-dimensional user space. The similarity between two items i and j is formulated as the cosine of the angle between the two vectors:

$$\mathrm{sim}(i,j) = \cos\left(\vec{i}, \vec{j}\right) = \frac{\vec{i} \cdot \vec{j}}{\|\vec{i}\|_2 \times \|\vec{j}\|_2} \tag{8.1}$$

where "·" denotes the dot product of two vectors.

Correlation-based Similarity

In this case, the similarity between two items i and j is computed by the Pearson correlation coefficient $\mathrm{corr}_{ij}$. To calculate it, the set of users who rated both item i and item j is selected. Let this user set be U; then the correlation coefficient is defined by

$$\mathrm{sim}(i,j) = \mathrm{corr}_{ij} = \frac{\sum_{u \in U} \left(R_{u,i} - \overline{R}_i\right)\left(R_{u,j} - \overline{R}_j\right)}{\sqrt{\sum_{u \in U} \left(R_{u,i} - \overline{R}_i\right)^2}\,\sqrt{\sum_{u \in U} \left(R_{u,j} - \overline{R}_j\right)^2}} \tag{8.2}$$

where $R_{u,i}$ denotes the rating score on item i by user u, and $\overline{R}_i$ is the average rating of the ith item.

Adjusted Cosine Similarity

One possible problem with the cosine-based similarity computation is that differences in rating scores between users are not fully taken into account, which gives biased weight to users who consistently give higher rating scores. To overcome this rating score bias, an adjusted cosine-based similarity is introduced. It offsets the impact of relative rating score differences by subtracting the corresponding user average from each co-rated item pair. Thus the similarity between two items i and j is given by

$$\mathrm{sim}(i,j) = \frac{\sum_{u \in U} \left(R_{u,i} - \overline{R}_u\right)\left(R_{u,j} - \overline{R}_u\right)}{\sqrt{\sum_{u \in U} \left(R_{u,i} - \overline{R}_u\right)^2}\,\sqrt{\sum_{u \in U} \left(R_{u,j} - \overline{R}_u\right)^2}} \tag{8.3}$$

where $\overline{R}_u$ is the average of the uth user's rating scores.

Prediction Computation

Once the similarity measures have been defined, they can be used to compute the prediction or recommendation scores.
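As a concrete illustration of the similarity measures in Eqs. (8.1) to (8.3) that feed into prediction, the following is a minimal Python sketch, not the authors' implementation. The function names, the nested-dictionary layout `ratings[user][item]`, and the toy data are all hypothetical. The sketch also makes three assumptions that the text does not settle: the cosine measure is evaluated over co-rating users only, the item averages in Eq. (8.2) are taken over the co-rating users, and a similarity of 0.0 is returned when a denominator is zero.

```python
import math


def co_raters(ratings, i, j):
    """Users who rated both item i and item j (the set U in the text)."""
    return [u for u, r in ratings.items() if i in r and j in r]


def _normalized_dot(xs, ys):
    """Dot product divided by the product of Euclidean norms.

    Returns 0.0 when either norm is zero (an arbitrary choice of this sketch).
    """
    num = sum(x * y for x, y in zip(xs, ys))
    den = math.sqrt(sum(x * x for x in xs)) * math.sqrt(sum(y * y for y in ys))
    return num / den if den else 0.0


def cosine_sim(ratings, i, j):
    """Eq. (8.1), restricted to co-rating users (assumption)."""
    users = co_raters(ratings, i, j)
    return _normalized_dot([ratings[u][i] for u in users],
                           [ratings[u][j] for u in users])


def pearson_sim(ratings, i, j):
    """Eq. (8.2), with item averages taken over co-rating users (assumption)."""
    users = co_raters(ratings, i, j)
    if not users:
        return 0.0
    mean_i = sum(ratings[u][i] for u in users) / len(users)
    mean_j = sum(ratings[u][j] for u in users) / len(users)
    return _normalized_dot([ratings[u][i] - mean_i for u in users],
                           [ratings[u][j] - mean_j for u in users])


def adjusted_cosine_sim(ratings, i, j):
    """Eq. (8.3): each rating is centered on that user's own average rating."""
    users = co_raters(ratings, i, j)
    user_mean = {u: sum(ratings[u].values()) / len(ratings[u]) for u in users}
    return _normalized_dot([ratings[u][i] - user_mean[u] for u in users],
                           [ratings[u][j] - user_mean[u] for u in users])


# Hypothetical toy data: u1 rates generously, u4 rates harshly.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 5},
    "u2": {"A": 4, "B": 2, "C": 5},
    "u3": {"A": 1, "B": 5},
    "u4": {"A": 2, "B": 1, "C": 1},
}

for name, fn in [("cosine", cosine_sim),
                 ("pearson", pearson_sim),
                 ("adjusted cosine", adjusted_cosine_sim)]:
    print(name, round(fn(ratings, "A", "B"), 3))
```

The three functions share one helper because Eqs. (8.1) to (8.3) differ only in what is subtracted from each rating before the normalized dot product is taken: nothing, the item average, or the user average.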
Here, the authors of [218] proposed two methods, namely weighted sum and regression.

Weighted Sum

The weighted sum scheme calculates the final rating score of an item i for the target user u by averaging the rating scores that the user gave to items similar to item i. Each rating is weighted by the corresponding similarity $s_{ij}$ between items i and j. The predicted rating score of item i is defined by
