29.06.2013 Views

NUI Galway – UL Alliance First Annual ENGINEERING AND - ARAN ...


Using Linked Data to Build Open, Collaborative Recommender Systems *<br />

Benjamin Heitmann, Conor Hayes<br />

Digital Enterprise Research Institute, National University of Ireland, Galway<br />

benjamin.heitmann@deri.org, conor.hayes@deri.org<br />

Abstract<br />

While recommender systems can greatly enhance the<br />

user experience, the entry barriers in terms of data<br />

acquisition are very high, making it hard for new<br />

service providers to compete with existing<br />

recommendation services. We propose to build open<br />

recommender systems, which can utilise Linked Data to<br />

mitigate the new-user, new-item, and sparsity problems<br />

of collaborative recommender systems. To demonstrate<br />

the validity of our approach, we augment the data from<br />

a closed collaborative music recommender system with<br />

Linked Data, and significantly improve its precision<br />

and recall.<br />

1. Problem statement<br />

Most real-world recommender systems employ<br />

collaborative filtering [1], which aggregates user ratings<br />

for items and uses statistical methods to discover<br />

similarities between items. The high entry barriers of<br />

providing good recommendations can be characterised<br />

by the data acquisition problem [4]: providing<br />

recommendations for (a) new items or for (b) new<br />

users is a challenge if no data about the item or user is<br />

available. If the number of ratings is low compared to<br />

the number of items then (c) sparsity of the data will<br />

lead to ineffective recommendations.<br />

We propose an alternative to building closed<br />

recommender systems: by utilising open data sources<br />

from the Linking Open Data (LOD) community project,<br />

it is possible to build open recommender systems,<br />

which can mitigate the challenges introduced by the<br />

data acquisition problem.<br />

2. Background: Linked Data<br />

Linked Data refers to a set of best practices for<br />

publishing and connecting structured data on the Web<br />

[2], by making semantic information about things and<br />

concepts available via RDF and HTTP. These practices have been<br />

adopted by a steadily growing number of data providers,<br />

which form the LOD cloud. For example, DBpedia provides data<br />

from Wikipedia pages, and both the US and UK<br />

governments have converted data sets to RDF.<br />

Social Web sites provide data that is modeled<br />

after the principle of object-centered sociality: it<br />

connects individuals not just directly into communities,<br />

but also indirectly via objects of a social focus, such as<br />

a music act. Sites that use the Friend-of-a-Friend<br />

(FOAF) vocabulary to publish such data include<br />

MySpace and LiveJournal.<br />

Figure 1: Applying collaborative filtering to Linked Data<br />

3. Methodology<br />

Figure 1 shows the steps of processing Linked Data<br />

for collaborative recommendations: (1) integrating the<br />

data about user-item connections from different sources<br />

into a common vocabulary; (2) transforming the<br />

representation of the data from an RDF graph to a user-item<br />

matrix; and (3) applying a specific collaborative<br />

filtering algorithm to the user-item matrix.<br />
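Steps (1) and (2) can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the triples are toy data, the relation name `likes` and the predicate `ex:listenedTo` are hypothetical, and `foaf:topic_interest` is only assumed here as one way a source might express a user-item connection.

```python
# Illustrative only: predicates from two sources mapped to one common relation.
COMMON = "likes"  # hypothetical name for the common vocabulary term
PREDICATE_MAP = {
    "foaf:topic_interest": COMMON,  # assumed predicate of one source
    "ex:listenedTo": COMMON,        # hypothetical predicate of another source
}

def integrate(triples):
    """Step 1: keep only user-item triples and rewrite them to the common predicate."""
    return [(s, PREDICATE_MAP[p], o) for s, p, o in triples if p in PREDICATE_MAP]

def to_user_item_matrix(triples):
    """Step 2: turn (user, likes, item) triples into a binary user-item matrix."""
    users = sorted({s for s, _, _ in triples})
    items = sorted({o for _, _, o in triples})
    row = {user: i for i, user in enumerate(users)}
    col = {item: j for j, item in enumerate(items)}
    matrix = [[0] * len(items) for _ in users]
    for s, _, o in triples:
        matrix[row[s]][col[o]] = 1
    return users, items, matrix

triples = [
    ("user:alice", "foaf:topic_interest", "artist:a"),
    ("user:alice", "ex:listenedTo", "artist:b"),
    ("user:bob", "ex:listenedTo", "artist:b"),
    ("user:bob", "foaf:name", "Bob"),  # not a user-item connection, so it is dropped
]
users, items, matrix = to_user_item_matrix(integrate(triples))
```

Each row of `matrix` is a user and each column an item; a 1 marks a connection found in any of the sources.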

This approach allows us to “fill in the gaps” in local<br />

data, by using data with user-item connections from<br />

external sources, thus mitigating the data acquisition<br />

problem.<br />
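The effect of filling in the gaps can be sketched on toy data. The example below assumes a binary cosine similarity between items, where each item is represented by the set of users connected to it; the identifiers and the `merge` helper are hypothetical and not taken from the system described here.

```python
import math

def binary_cosine(users_a, users_b):
    """Cosine similarity of two items, each given as the set of users connected to it."""
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / math.sqrt(len(users_a) * len(users_b))

def merge(local, external):
    """Union the item -> users maps of the local and the external data."""
    merged = {item: set(item_users) for item, item_users in local.items()}
    for item, item_users in external.items():
        merged.setdefault(item, set()).update(item_users)
    return merged

# artist:b is a new item: the local data holds no connections for it.
local = {"artist:a": {"u1", "u2"}, "artist:b": set()}
external = {"artist:a": {"m1", "m2"}, "artist:b": {"m1", "m2"}}

before = binary_cosine(local["artist:a"], local["artist:b"])
merged = merge(local, external)
after = binary_cosine(merged["artist:a"], merged["artist:b"])
```

With local data alone the new item has no similarity to anything; after the merge, the external users who are connected to both items give it a non-zero similarity.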

4. Evaluation<br />

We have augmented the data from the closed Smart<br />

Radio streaming recommendation service (190 users,<br />

330 musicians) with Linked Data from MySpace,<br />

adding 11000 users and 25000 new connections.<br />

We evaluated collaborative filtering (CF) with a binary cosine<br />

similarity measure, using Last.fm as a “gold standard” [3].<br />

The result of adding external data was an improvement<br />

of precision from 2% to 14%, and recall from 7% to<br />

33%.<br />
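For reference, precision and recall against a gold standard can be computed as in the sketch below. The recommendation list and gold set are invented toy data; how the results were aggregated over items in this evaluation is not stated here, so the function covers a single recommendation list only.

```python
def precision_recall(recommended, relevant):
    """Precision and recall of one recommendation list against a gold-standard set."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Toy data: four recommended artists, two artists in the gold standard.
recommended = ["artist:b", "artist:c", "artist:d", "artist:e"]
gold = {"artist:b", "artist:f"}
precision, recall = precision_recall(recommended, gold)
```

Precision is the share of recommended items that appear in the gold standard, and recall is the share of gold-standard items that were recommended.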

5. References<br />

[1] G. Adomavicius and A. Tuzhilin, “Toward the next<br />

generation of recommender systems: A survey of the state-of-the-art<br />

and possible extensions”, IEEE Transactions on<br />

Knowledge and Data Engineering, 2005.<br />

[2] C. Bizer, T. Heath and T. Berners-Lee, “Linked data - the<br />

story so far”, International Journal on Semantic Web and Information<br />

Systems, 2009.<br />

[3] J. Herlocker, J. Konstan et al., “Evaluating collaborative<br />

filtering recommender systems”, ACM Transactions on<br />

Information Systems, 2004.<br />

[4] A. I. Schein, A. Popescul et al., “Methods and metrics for<br />

cold-start recommendations”, Conference on Research and<br />

Development in Information Retrieval, 2002.<br />

* This extended abstract is based on B. Heitmann and C. Hayes, “Using Linked Data to Build Open, Collaborative Recommender<br />

Systems”, AAAI Spring Symposia, 2010, and funded by Science Foundation Ireland under Grant No. SFI/08/CE/I1380 (Líon-2).<br />
