12.01.2015 Views

Download - Academy Publisher


tags. The union $T_u \cup T_v$ is the set of all tags used by the two users, and $s_{u,v}$ is the preference similarity of $u$ and $v$:

$$s_{u,v} = \frac{|T_u \cap T_v|}{|T_u \cup T_v|} \tag{3}$$

Compare the preference similarity between users. When $s_{u,v} = 1$, the two users use exactly the same tags. Although their similarity is the highest possible, neither user has any tag to recommend to the other, so such users are removed first.
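The similarity of Eq. (3) and the removal of users with $s_{u,v} = 1$ can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy `user_tags` dictionary, and the choice to return 0 when both tag sets are empty are assumptions made here.

```python
def preference_similarity(tags_u, tags_v):
    """Eq. (3): |Tu ∩ Tv| / |Tu ∪ Tv| for two users' tag sets."""
    tags_u, tags_v = set(tags_u), set(tags_v)
    union = tags_u | tags_v
    if not union:
        return 0.0  # assumption: users without any tags share no preference
    return len(tags_u & tags_v) / len(union)


def candidate_neighbours(target, user_tags):
    """Similarity of every other user to `target`, dropping identical tag sets."""
    sims = {}
    for other, tags in user_tags.items():
        if other == target:
            continue
        s = preference_similarity(user_tags[target], tags)
        if s < 1.0:  # s == 1 means the same tags, so nothing new to recommend
            sims[other] = s
    return sims


# Hypothetical toy data, for illustration only.
user_tags = {
    "alice": {"comedy", "drama", "classic"},
    "bob": {"comedy", "drama", "classic"},
    "carol": {"comedy", "thriller"},
    "dave": {"western"},
}
sims = candidate_neighbours("alice", user_tags)
```

Here `bob` is dropped because his tags are identical to those of `alice`, while `carol` and `dave` remain as candidates.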

Then take the top $N$ users by preference similarity to the target user; these users are the tag recommenders for the target user. $\mathrm{score}(u, t)$ measures how likely tag $t$ is to be recommended to the target user $u$. Sort the values of $\mathrm{score}(u, t)$ from highest to lowest and recommend the top $K$ tags to the target user. The set of these top $K$ tags is denoted $\mathrm{recommendTag}(u)$.

$$\mathrm{score}(u, t) = \frac{1}{|S_u \cap U_t|} \sum_{v \in S_u \cap U_t} s_{u,v} \tag{4}$$

$S_u$ is the set of the top $N$ users, and $U_t$ is the set of users who used the tag $t$.
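The selection of the top $N$ users and the scoring of Eq. (4) can be sketched as below. This is an illustrative sketch under assumptions: the function names, the toy data, the score of 0 for a tag that no neighbour used, and the alphabetical tie-break are choices made here and are not specified in the paper.

```python
def top_n_users(sims, n):
    """S_u: the n users with the highest preference similarity to the target."""
    return sorted(sims, key=sims.get, reverse=True)[:n]


def tag_score(tag, sims, neighbours, user_tags):
    """Eq. (4): mean similarity of the neighbours in S_u who used `tag`."""
    users = [v for v in neighbours if tag in user_tags[v]]  # S_u ∩ U_t
    if not users:
        return 0.0  # assumption: a tag no neighbour used gets score 0
    return sum(sims[v] for v in users) / len(users)


def recommend_tags(sims, user_tags, n, k):
    """recommendTag(u): the k highest-scoring tags among the top-n neighbours."""
    neighbours = top_n_users(sims, n)
    candidates = set().union(*(user_tags[v] for v in neighbours))
    scores = {t: tag_score(t, sims, neighbours, user_tags) for t in candidates}
    # Sort by descending score; ties are broken alphabetically (an assumption).
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical similarities and tag sets, for illustration only.
sims = {"carol": 0.5, "dave": 0.25, "erin": 0.1}
user_tags = {
    "carol": {"comedy", "thriller"},
    "dave": {"thriller", "western"},
    "erin": {"musical"},
}
recommended = recommend_tags(sims, user_tags, n=2, k=2)
```

With $N = 2$ the neighbours are `carol` and `dave`; "comedy" scores 0.5, "thriller" scores 0.375, and "western" scores 0.25, so the top two tags are "comedy" and "thriller".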

Then calculate the relevance between resources and each tag $t \in \mathrm{recommendTag}(u)$. The relevance between a resource and different tags is measured by the frequency with which all users marked the resource with each tag.

$$\mathrm{relate}(i, t) = \frac{\mathrm{countTagging}(t, i)}{\sum_{k \in T_i} \mathrm{countTagging}(k, i)} \tag{5}$$

$T_i$ is the set of tags of resource $i$, and $\mathrm{countTagging}(t, i)$ is the number of times that all users marked resource $i$ with tag $t$. The larger $\mathrm{countTagging}(t, i)$ is, the more relevant tag $t$ is to resource $i$.
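A small sketch of Eq. (5) follows. It assumes that tagging events are available as `(user, resource, tag)` triples; that layout, the function name, and the toy data are hypothetical and serve only to make the ratio concrete.

```python
from collections import Counter


def relate(resource, tag, taggings):
    """Eq. (5): share of all taggings of `resource` that use `tag`."""
    # countTagging(k, i) for every tag k applied to this resource
    counts = Counter(t for (_, r, t) in taggings if r == resource)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # assumption: an untagged resource is relevant to no tag
    return counts[tag] / total


# Hypothetical tagging events (user, resource, tag), for illustration only.
taggings = [
    ("alice", "m1", "comedy"),
    ("bob", "m1", "comedy"),
    ("carol", "m1", "classic"),
    ("dave", "m1", "comedy"),
    ("alice", "m2", "drama"),
]
relevance = relate("m1", "comedy", taggings)
```

Three of the four taggings of resource `m1` use "comedy", so its relevance to that tag is 0.75.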

Remove the resources with low relevance to obtain the resource collection $I_{u,t}$, which is used for prediction. Within $I_{u,t}$, use Eq. (1) and Eq. (2) to recommend resources under each tag $t \in \mathrm{recommendTag}(u)$.

IV. EXPERIMENT AND RESULT ANALYSIS

A. The experiment based on the MovieLens data set

The experiment is based on the MovieLens 10M100K data set, of which one tenth is selected as the experimental data. In each run, 20% of the user data is randomly selected as test data and the remaining 80% is used as training data. The experiment is repeated five times.
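The repeated 80/20 split can be sketched as below. The paper does not say whether the split is made over users or over each user's records, nor how the random draws are seeded; this sketch assumes a split over user identifiers with one hypothetical seed per run.

```python
import random


def split_users(user_ids, test_fraction=0.2, seed=None):
    """Randomly split user ids into (train, test) lists."""
    rng = random.Random(seed)  # separate generator so runs are reproducible
    users = sorted(user_ids)
    rng.shuffle(users)
    n_test = round(len(users) * test_fraction)
    return users[n_test:], users[:n_test]


# Five independent runs over 100 hypothetical user ids.
splits = [split_users(range(100), seed=run) for run in range(5)]
```

Each run yields 80 training users and 20 test users, and the evaluation metric would then be averaged over the five runs.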

B. Evaluation Standard: MAE

Mean Absolute Error (MAE) is the average of the absolute differences between predicted preferences and actual preferences [6]. It reflects the accuracy of the recommendation: a method that obtains a lower MAE has a lower average prediction error, so its preference predictions are more accurate and it performs better.

$$\mathrm{MAE} = \frac{\sum_{i=1}^{N} |pRate_i - rRate_i|}{N} \tag{6}$$

$rRate_i$ is the user's actual preference for the selected resource in the test data, and $pRate_i$ is the predicted preference.
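Eq. (6) translates directly into a few lines of code. The function name and the sample ratings below are hypothetical; the sketch assumes the predicted and actual preferences are given as two equally long sequences.

```python
def mean_absolute_error(predicted, actual):
    """Eq. (6): mean of |pRate_i - rRate_i| over the N test preferences."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one preference is required")
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(predicted)


# Hypothetical ratings, for illustration only.
p_rate = [4.0, 3.0, 5.0, 2.0]
r_rate = [5.0, 3.0, 4.0, 4.0]
mae = mean_absolute_error(p_rate, r_rate)
```

The absolute errors here are 1, 0, 1 and 2, which gives an MAE of 1.0.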

C. Experiment results and analysis

Figure 3 shows the MAE results of the two algorithms on the MovieLens data set.

Figure 3. MAE results of the two algorithms

A lower MAE indicates a better recommendation method. As the chart shows, the tag-based collaborative filtering method proposed in this paper achieves better recommendation accuracy than the traditional user-based collaborative filtering method.

V. SUMMARY

This paper presents an improvement to the traditional collaborative filtering method that introduces a tagging system for similarity analysis. The new method reduces the sparsity of the score matrix and classifies resources effectively for recommendation. Experiments on the MovieLens data set with the MAE indicator demonstrate the effectiveness of recommendation with tags. In addition, user-tag relevance and the degree of trust between users could also be introduced to obtain more accurate recommendations.

REFERENCES

[1] J. Han and M. Kamber, Data Mining: Concepts and Techniques, Second Edition, 2007.

[2] D. Goldberg, D. Nichols, B. Oki, and D. Terry, “Using collaborative filtering to weave an information tapestry,” Communications of the ACM, 35(12), 61–70, 1992.

[3] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl, “GroupLens: An Open Architecture for Collaborative Filtering of Netnews,” Proc. 1994 Computer Supported Cooperative Work Conf., 1994.
