
SEKE 2012 Proceedings - Knowledge Systems Institute


Recommender systems are usually classified into two categories. The first category is content-based recommendation [2], and the second is collaborative filtering recommendation [7]. Content-based methods use the content information of items to generate recommendations. Content information means the intrinsic features of items. In a movie recommendation scenario, the content information of a movie is mainly its title, director, actors, genre, plot summary, keywords, etc. The first step of a content-based method is to find the commonalities among the movies the user has rated highly. The system then builds the user's preference from these commonalities, and the recommended movies are those most similar to that preference. Collaborative filtering (CF) methods use users' rating information to generate recommendations. According to how they use the rating information, collaborative filtering methods can be classified into two categories: memory-based CF and model-based CF. Memory-based CF directly uses all or part of the rating information to generate recommendations. Model-based CF does not use the rating information directly; these methods first use the rating information to train a model, which is then used to make recommendations.

3. Experimental Method

The E-commerce site used in this paper is one of the most popular online bag retailers, and most of its customers are from North America. We collaborated with the site for over half a year to gather real data.

3.1. Data Set

The data was gathered from January 2011 to June 2011. The data set contains more than 363,000 visits from over 63,000 users. We obtain bag features directly from the site's database; the features that describe a bag are color, size, price, discount, type, brand, etc. The feature database contains about 1,300 bags. About 70 percent of the users are new users visiting the site for the first time, and on average each user visits only about 5 product pages.

3.2. Evaluated Algorithm

We explore four categories of recommendation methods. The first category is the content-based method [2]. The second category is collaborative filtering, which contains two methods: the item-to-item collaborative filtering method [5] and the traditional user-based collaborative filtering method [4]. The third category is based on simple statistics; here we use the simplest one, the N most visited products. The last category reflects common business sense: we recommend the cheapest and the newest products to users.

Content-based: A content-based recommendation system tries to find products that are similar to the items the user liked in the past. We recommend the products most similar to the currently visited product. The similarity between products a and b is calculated as follows:

\[
\mathrm{similarity}_{a,b} = \frac{|F_a \cap F_b|}{|F_a \cup F_b|} \tag{1}
\]

where $F_a$ is the feature set of product a, and $F_b$ is the feature set of product b. $|F_a \cap F_b|$ denotes the number of features that products a and b share, and $|F_a \cup F_b|$ denotes the total number of distinct features that products a and b have.
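The content-based similarity above is the Jaccard index of the two feature sets. A minimal Python sketch follows, assuming each product's features are encoded as a set of `name=value` strings; the catalog contents, the function names, and the choice to score two empty feature sets as 0 are illustrative assumptions rather than details from the paper.

```python
def jaccard_similarity(features_a, features_b):
    """Shared features divided by all distinct features of the two products."""
    features_a, features_b = set(features_a), set(features_b)
    union = features_a | features_b
    if not union:
        # Assumption: two products without any features are treated as dissimilar.
        return 0.0
    return len(features_a & features_b) / len(union)


def most_similar(current_id, catalog, n=3):
    """Rank the other products in `catalog` by similarity to the visited one."""
    current = catalog[current_id]
    scored = [
        (jaccard_similarity(current, feats), pid)
        for pid, feats in catalog.items()
        if pid != current_id
    ]
    # Highest similarity first; ties are broken by product id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [pid for _, pid in scored[:n]]


# Hypothetical three-bag catalog used only for illustration.
catalog = {
    "bag_1": {"color=black", "type=tote", "brand=X"},
    "bag_2": {"color=black", "type=tote", "brand=Y"},
    "bag_3": {"color=red", "type=backpack", "brand=Y"},
}

print(jaccard_similarity(catalog["bag_1"], catalog["bag_2"]))  # 0.5
print(most_similar("bag_1", catalog, n=2))  # ['bag_2', 'bag_3']
```

Here `bag_1` and `bag_2` share two of four distinct features, giving a similarity of 0.5, while `bag_1` and `bag_3` share none.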

Item-to-item-CF: The item-to-item collaborative filtering method defines the similarity between items by the tendency of users to visit or purchase them together. We apply a method similar to the algorithm proposed in [5]. The similarity between products a and b is calculated as follows:

\[
\mathrm{similarity}_{a,b} = \frac{CV(a, b)}{V(a) + V(b)} \tag{2}
\]

where CV(a,b) denotes the number of times products a and b are co-visited, V(a) is the number of times product a has been visited, and V(b) is the number of times product b has been visited.
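The co-visit similarity above can be sketched in Python as follows. The paper does not say how visits are grouped, so this sketch assumes that a co-visit means two products appearing in the same browsing session and that each product is counted at most once per session; the session data and function name are hypothetical.

```python
from collections import Counter
from itertools import combinations


def covisit_similarity(sessions):
    """Return {(a, b): CV(a, b) / (V(a) + V(b))} for every co-visited pair."""
    visits = Counter()    # V(a): number of sessions that contain product a
    covisits = Counter()  # CV(a, b): number of sessions that contain both a and b
    for session in sessions:
        # Assumption: repeated views within one session count once.
        products = sorted(set(session))
        visits.update(products)
        covisits.update(combinations(products, 2))
    return {
        (a, b): cv / (visits[a] + visits[b])
        for (a, b), cv in covisits.items()
    }


# Hypothetical sessions, each a list of visited product ids.
sessions = [["a", "b"], ["a", "b", "c"], ["a"], ["c"]]
similarity = covisit_similarity(sessions)
print(similarity[("a", "b")])  # 0.4
```

Under this counting scheme CV(a,b) can never exceed either V(a) or V(b), so the score is at most 0.5, reached when two products are always visited together.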

User-based-CF: The user-based CF method first finds the users most similar to the target user, and then generates recommendations for the target user according to the historical preferences of those similar users. Generally, we recommend the products visited most often by the target user's similar users. Because the site has no rating information, we replace ratings with the number of visits to each product in the user vectors. The similarity between two user vectors can be measured by Pearson correlation.
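A minimal sketch of this user-based procedure is given below. Several details are assumptions, since the paper does not specify them: user vectors span every product seen in the log with 0 for unvisited products, a constant vector gets a correlation of 0, the neighborhood size `k` is a free parameter, and products the target user has already visited are excluded from the result.

```python
from collections import Counter
from math import sqrt


def pearson(u, v, products):
    """Pearson correlation of two visit-count vectors over `products`."""
    n = len(products)
    if n == 0:
        return 0.0
    xs = [u.get(p, 0) for p in products]
    ys = [v.get(p, 0) for p in products]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant vector carries no correlation signal.
        return 0.0
    return cov / sqrt(var_x * var_y)


def recommend_user_based(target, visit_counts, k=2, n=3):
    """Recommend the products visited most often by the k most similar users."""
    products = sorted({p for counts in visit_counts.values() for p in counts})
    neighbours = sorted(
        (u for u in visit_counts if u != target),
        key=lambda u: (-pearson(visit_counts[target], visit_counts[u], products), u),
    )[:k]
    totals = Counter()
    for u in neighbours:
        totals.update(visit_counts[u])  # adds the neighbour's visit counts
    seen = visit_counts[target]
    # Assumption: skip products the target user has already visited.
    ranked = sorted((p for p in totals if p not in seen), key=lambda p: (-totals[p], p))
    return ranked[:n]


# Hypothetical visit counts: user id -> {product id: number of visits}.
visit_counts = {
    "u1": {"a": 3, "b": 1},
    "u2": {"a": 3, "b": 1, "c": 1},
    "u3": {"d": 5},
}
print(recommend_user_based("u1", visit_counts, k=1))  # ['c']
```

In this toy log `u2` correlates positively with `u1` and `u3` negatively, so with `k=1` the only neighbour is `u2` and the only unseen product is `c`.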

Most-popular: The most-popular method may be the most widely used strategy for recommending products without personalization. We recommend the most frequently visited products to users. This method needs no complex computation and achieves relatively good performance at a very low cost.
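As a sketch of how little computation this baseline needs, the following Python counts visits in a flat log of product ids and returns the top N. The log format and function name are assumptions made for illustration.

```python
from collections import Counter


def most_popular(visit_log, n=5):
    """Return the ids of the n most frequently visited products."""
    counts = Counter(visit_log)
    return [product_id for product_id, _ in counts.most_common(n)]


# Hypothetical visit log: one product id per page view.
visit_log = ["a", "b", "a", "c", "a", "b"]
print(most_popular(visit_log, n=2))  # ['a', 'b']
```

The same list is served to every user, which is why the method involves no personalization.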

Cheapest: The cheapest method is simple: we recommend the cheapest products. One important feature of E-commerce is that online shops have lower costs than bricks-and-mortar stores and can therefore offer lower prices. The performance of the cheapest method will tell us whether low price is an incentive for users to shop online.

Newest: Like the cheapest method, this method is simple: we recommend the newest products to users. New products arrive on the shelves of online shops earlier than in bricks-and-mortar stores, so this method will show whether a significant percentage of consumers shop online for this reason.
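Both business-sense baselines reduce to sorting the catalog by a single attribute. The sketch below assumes a hypothetical `Product` record with a `price` field and a `listed_on` date; the paper lists price among the bag features but does not say how product age is stored, so `listed_on` is purely illustrative.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Product:
    product_id: str
    price: float
    listed_on: date  # hypothetical field: date the product was added to the shop


def cheapest(products, n=5):
    """Return the n products with the lowest price."""
    return sorted(products, key=lambda p: p.price)[:n]


def newest(products, n=5):
    """Return the n most recently listed products."""
    return sorted(products, key=lambda p: p.listed_on, reverse=True)[:n]


# Hypothetical catalog used only for illustration.
products = [
    Product("bag_1", 49.0, date(2011, 1, 10)),
    Product("bag_2", 19.0, date(2011, 3, 5)),
    Product("bag_3", 99.0, date(2011, 6, 1)),
]
print([p.product_id for p in cheapest(products, n=2)])  # ['bag_2', 'bag_1']
print([p.product_id for p in newest(products, n=2)])    # ['bag_3', 'bag_2']
```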

