NUI Galway – UL Alliance First Annual ENGINEERING AND - ARAN ...
Using Linked Data to Build Open, Collaborative Recommender Systems *<br />
Benjamin Heitmann, Conor Hayes<br />
Digital Enterprise Research Institute, National University of Ireland, Galway<br />
benjamin.heitmann@deri.org, conor.hayes@deri.org<br />
Abstract<br />
While recommender systems can greatly enhance the<br />
user experience, the entry barriers in terms of data<br />
acquisition are very high, making it hard for new<br />
service providers to compete with existing<br />
recommendation services. We propose to build open<br />
recommender systems, which can utilise Linked Data to<br />
mitigate the new-user, new-item, and sparsity problems<br />
of collaborative recommender systems. To demonstrate<br />
the validity of our approach, we augment the data from<br />
a closed collaborative music recommender system with<br />
Linked Data, and significantly improve its precision<br />
and recall.<br />
1. Problem statement<br />
Most real-world recommender systems employ<br />
collaborative filtering [1], which aggregates user ratings<br />
for items and uses statistical methods to discover<br />
similarities between items. The high entry barriers of<br />
providing good recommendations can be characterised<br />
by the data acquisition problem [4]: providing<br />
recommendations for (a) new items or for (b) new<br />
users is a challenge if no data about the item or user is<br />
available. If the number of ratings is low compared to<br />
the number of items then (c) sparsity of the data will<br />
lead to ineffective recommendations.<br />
We propose an alternative to building closed<br />
recommender systems: by utilising open data sources<br />
from the Linking Open Data (LOD) community project,<br />
it is possible to build open recommender systems,<br />
which can mitigate the challenges introduced by the<br />
data acquisition problem.<br />
2. Background: Linked Data<br />
Linked Data refers to a set of best practices for<br />
publishing and connecting structured data on the Web<br />
[2], by making semantic information about things and<br />
concepts available via RDF and HTTP. These practices<br />
have been adopted by a steadily growing number of data<br />
providers, which together form the LOD cloud. For example,<br />
DBpedia provides data extracted from Wikipedia pages, and<br />
both the US and UK governments have converted data sets to RDF.<br />
Social Web sites provide data that is modelled after<br />
the principle of object-centered sociality: it<br />
connects individuals not just directly into communities,<br />
but also indirectly via objects of a social focus, such as<br />
a music act. Sites that use the Friend-of-a-Friend<br />
(FOAF) vocabulary to publish such data include<br />
MySpace and LiveJournal.<br />
Figure 1: Applying collaborative filtering to Linked Data<br />
3. Methodology<br />
Figure 1 shows the steps of processing Linked Data<br />
for collaborative recommendations: (1) integrating the<br />
data about user-item connections from different sources<br />
into a common vocabulary; (2) transforming the<br />
representation of the data from an RDF graph to a<br />
user-item matrix; and (3) applying a specific collaborative<br />
filtering algorithm to the user-item matrix.<br />
This approach allows us to “fill in the gaps” in local<br />
data, by using data with user-item connections from<br />
external sources, thus mitigating the data acquisition<br />
problem.<br />
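Steps (1) and (2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: triples are plain Python tuples rather than a parsed RDF graph, and the predicate mapping, the common predicate `ex:likes`, and all user and artist identifiers are hypothetical.

```python
# Hypothetical mapping from source-specific predicates to one common
# predicate (step 1). None of these names are taken from the paper.
PREDICATE_MAP = {
    "local:listenedTo": "ex:likes",
    "foaf:topic_interest": "ex:likes",
}


def integrate(triples):
    """Rewrite known predicates to the common vocabulary and drop the rest."""
    return [
        (s, PREDICATE_MAP[p], o)
        for (s, p, o) in triples
        if p in PREDICATE_MAP
    ]


def to_user_item_matrix(triples):
    """Turn (user, predicate, item) triples into a binary user-item matrix (step 2)."""
    users = sorted({s for (s, _, _) in triples})
    items = sorted({o for (_, _, o) in triples})
    row = {user: i for i, user in enumerate(users)}
    col = {item: j for j, item in enumerate(items)}
    matrix = [[0] * len(items) for _ in users]
    for s, _, o in triples:
        matrix[row[s]][col[o]] = 1
    return users, items, matrix


# Toy data: one local connection and three external triples.
local = [("user:alice", "local:listenedTo", "artist:a")]
external = [
    ("user:bob", "foaf:topic_interest", "artist:a"),
    ("user:bob", "foaf:topic_interest", "artist:b"),
    ("user:bob", "foaf:name", "Bob"),  # not a user-item connection, dropped
]

users, items, matrix = to_user_item_matrix(integrate(local + external))
```

In this toy example the external source contributes a second user and a second item, so the resulting matrix has more rows and columns than the local data alone would give.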
4. Evaluation<br />
We have augmented the data from the closed Smart<br />
Radio streaming recommendation service (190 users,<br />
330 musicians) with Linked Data from MySpace,<br />
adding 11,000 users and 25,000 new connections.<br />
We evaluated a collaborative filtering (CF) algorithm based on<br />
binary cosine similarity, using Last.fm as a “gold standard” [3].<br />
The result of adding external data was an improvement<br />
of precision from 2% to 14%, and recall from 7% to<br />
33%.<br />
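To make the evaluation measures concrete, the sketch below computes binary cosine similarity between items and then precision and recall against a gold-standard set. It is an illustration only: the data is invented, the helper names (`binary_cosine`, `most_similar`, `precision_recall`) are hypothetical, and the `gold` set merely stands in for a similar-artist list such as one obtained from Last.fm. For two items with user sets A and B, binary cosine similarity is assumed here to be |A ∩ B| / sqrt(|A| · |B|).

```python
from math import sqrt


def binary_cosine(users_a, users_b):
    """Cosine similarity of two binary vectors, given as sets of user ids."""
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / sqrt(len(users_a) * len(users_b))


def precision_recall(recommended, relevant):
    """Precision and recall of a recommended set against a gold-standard set."""
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data: which (hypothetical) users are connected to each artist.
fans = {
    "artist:a": {"u1", "u2", "u3", "u4"},
    "artist:b": {"u1", "u2", "u3", "u5"},
    "artist:c": {"u4"},
    "artist:d": {"u6"},
}


def most_similar(artist, k):
    """Return the k artists most similar to `artist` by binary cosine."""
    others = [a for a in fans if a != artist]
    ranked = sorted(
        others,
        key=lambda a: binary_cosine(fans[artist], fans[a]),
        reverse=True,
    )
    return set(ranked[:k])


recommended = most_similar("artist:a", k=2)
gold = {"artist:b", "artist:d"}  # invented stand-in for a gold standard
precision, recall = precision_recall(recommended, gold)
```

Here one of the two recommended artists appears in the gold set, and one of the two gold artists is recommended, so both measures come out at one half for this toy data.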
5. References<br />
[1] G. Adomavicius and A. Tuzhilin, “Toward the next<br />
generation of recommender systems: A survey of the state-of-the-art<br />
and possible extensions”, IEEE Transactions on<br />
Knowledge and Data Engineering, 2005.<br />
[2] C. Bizer, T. Heath and T. Berners-Lee, “Linked data - the<br />
story so far”, International Journal on Semantic Web and Information<br />
Systems, 2009.<br />
[3] J. Herlocker, J. Konstan et al., “Evaluating collaborative<br />
filtering recommender systems”, ACM Transactions on<br />
Information Systems, 2004.<br />
[4] A. I. Schein, A. Popescul et al., “Methods and metrics for<br />
cold-start recommendations”, Conference on Research and<br />
Development in Information Retrieval, 2002.<br />
* This extended abstract is based on B. Heitmann and C. Hayes, “Using Linked Data to Build Open, Collaborative Recommender<br />
Systems”, AAAI Spring Symposia, 2010, and funded by Science Foundation Ireland under Grant No. SFI/08/CE/I1380 (Líon-2).<br />