29.06.2013 Views

NUI Galway – UL Alliance First Annual ENGINEERING AND - ARAN ...


User Similarity and Interaction in Online Social Networks

Abstract

Online social networks (OSNs) like Facebook and Twitter are defined both by connections between users and by the interactions between them. We analyse OSN user connections and interactions in novel ways to gain greater insight into the complex dynamics of these networks, and into the users of such networks.

We uncover hidden graphs in such networks and develop new methods of analysing these graphs, with direct implications for better social search and recommender system algorithms. We conduct new forms of empirical analysis on OSN data to analyse user similarity, combining this with social graph analysis for a new perspective on the ‘birds of a feather’ phenomenon in OSNs. We also investigate what happens to user similarity as a function of network distance, with an associated novel graph-generation simulation mechanism that builds social graphs based partially on similarities between users, and not merely on the traditional model of preferential attachment.

1. A weighting mechanism for interaction graphs

There are hidden graphs inside the overt social graph described by ‘follower’/‘following’ relationships in OSNs, notably the graph described by user interactions: posting on friends’ Facebook walls, replying to other users, etc. These interaction graphs are potentially more useful in discerning the true nature and strength of ties between users, and are thus valuable in social search algorithms and recommenders.

Analysing these graphs effectively requires a suitable weighting mechanism. Such a mechanism should incorporate normalisation for the varying levels of a user’s overall interaction, the frequency with which a user interacts with a particular other user, and temporal aspects, so that more recent interactions carry greater strength. Practical computational aspects should also be considered, since these highly dynamic graphs must be maintained. We are developing a weighting mechanism for dealing with these issues, which can likely be applied to interaction graphs in other areas such as biological networks.
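The three requirements listed above (normalisation per user, frequency per pair, and recency) can be illustrated with a minimal Python sketch. This is not the mechanism under development, only one way the pieces might fit together. The function name `interaction_weights`, the exponential half-life decay and the `half_life` parameter are all assumptions introduced for illustration.

```python
from collections import defaultdict


def interaction_weights(interactions, now, half_life=30.0):
    """Return {(source, target): weight} from (source, target, timestamp) events.

    Assumed scheme (hypothetical, not taken from the paper):
    - recency: each event contributes 0.5 ** (age / half_life);
    - frequency: contributions for the same (source, target) pair are summed;
    - normalisation: each source's outgoing weights are divided by that
      source's total, so they sum to 1 regardless of how active the user is.
    """
    raw = defaultdict(float)
    totals = defaultdict(float)
    for source, target, timestamp in interactions:
        age = max(0.0, now - timestamp)
        contribution = 0.5 ** (age / half_life)
        raw[(source, target)] += contribution
        totals[source] += contribution
    return {edge: value / totals[edge[0]] for edge, value in raw.items()}
```

Because each event only updates two running sums, new interactions can be folded in incrementally, which is one way to address the maintenance cost of a highly dynamic graph.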

John Conroy

Colm O’Riordan, Josephine Griffith

j.conroy3@nuigalway.ie

2. “Birds of a feather” in Twitter data

To gain an empirical measure of similarity between a set of Twitter users, we create a document for each user comprising all of their aggregated posts. We convert these documents to n-dimensional vectors, using tf-idf weighting for each of the unique terms present in the document. In this way we convert documents to vectors of equal dimension, which can be compared for a measure of similarity in vector space.

We compare these vectors using cosine similarity to obtain an empirical measure of similarity between any two users.
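The pipeline described above (one document per user, tf-idf weighting, cosine comparison) can be sketched in plain Python as follows. The whitespace tokenisation, the particular tf-idf variant (relative term frequency times log inverse document frequency) and the function names are assumptions made for illustration. The abstract does not specify these details.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Map {user: text} to {user: {term: tf-idf weight}}.

    Vectors are stored sparsely; a term missing from a user's dict has
    weight 0, so all vectors implicitly share the same dimension.
    """
    tokens = {user: text.lower().split() for user, text in documents.items()}
    n_docs = len(tokens)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for words in tokens.values() for term in set(words))
    vectors = {}
    for user, words in tokens.items():
        counts = Counter(words)
        vectors[user] = {
            term: (count / len(words)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors (0.0 if either is zero)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

A term that occurs in every user's document receives weight zero under this variant, so only terms that discriminate between users contribute to the similarity score.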

Using this metric, we analyse user similarity based on various graph-based criteria. We find that users who are linked in the overt social graph are more similar than those who are unlinked: birds of a feather do indeed flock together. We also analyse similarity across time slices, in light of the second-order social graph composed of user interactions, and in other ways.
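The linked versus unlinked comparison can be sketched as a simple aggregation over user pairs. The function name `mean_similarity_by_link` is hypothetical, links are treated as undirected for simplicity (an assumption, since follower relationships are directed), and `similarity` stands for any pairwise measure such as the cosine score described above.

```python
from itertools import combinations
from statistics import mean


def mean_similarity_by_link(users, edges, similarity):
    """Return (mean similarity of linked pairs, mean similarity of unlinked pairs).

    `edges` is an iterable of (u, v) pairs, treated here as undirected.
    `similarity` is any function taking two users and returning a number.
    A mean is None when there are no pairs of that kind.
    """
    linked_pairs = {frozenset(edge) for edge in edges}
    linked, unlinked = [], []
    for u, v in combinations(users, 2):
        bucket = linked if frozenset((u, v)) in linked_pairs else unlinked
        bucket.append(similarity(u, v))
    return (
        mean(linked) if linked else None,
        mean(unlinked) if unlinked else None,
    )
```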

3. User similarity as a function of network distance

Nodes in a graph lie at a network distance of one if they are directly linked, at a network distance of two if they share a common friend, at a distance of three if they are two nodes removed in the graph, and so on. Our own work and that of others [1] shows that users at a network distance of one are more similar than users who are not directly linked, but how similarity behaves over greater network distances remains an open question.
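Network distance as defined above is the shortest-path length in hops, which a breadth-first search computes directly. The sketch below assumes the graph is given as an adjacency mapping. The function name `network_distances` is illustrative only.

```python
from collections import deque


def network_distances(graph, source):
    """Return {node: hop distance from source} for every reachable node.

    `graph` maps each node to an iterable of its neighbours.
    Nodes that cannot be reached from `source` are absent from the result.
    """
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances
```

Grouping pairwise similarity scores by the distance this function returns would give an empirical similarity-versus-distance curve.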

We hypothesise that similarity between users, as a function of network distance, decays in some predictable logarithmic fashion, asymptotically approaching a lower bound dictated by the properties of the network. Knowledge of this similarity distribution and lower bound would likely have applications in fine-grained recommender systems and social search algorithms.
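As a purely illustrative example of what such a decay could look like, one candidate functional form is given below. It is an assumption introduced here, not a result or a model from this work. The symbols are hypothetical: s(d) is the mean similarity at network distance d, s_1 the similarity of directly linked users, s_∞ the lower bound, and β a decay-rate parameter.

```latex
s(d) = s_{\infty} + \frac{s_{1} - s_{\infty}}{1 + \beta \ln d},
\qquad d \ge 1,\ \beta > 0
```

Under this form s(1) = s_1, and s(d) falls towards s_∞ at a rate governed by ln d as d grows.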

Our approach to uncovering these network properties is simulation: we create network nodes with an array of p parameters representing node attributes (in an OSN like Facebook these correspond to user characteristics such as age, location, career area, hobbies, etc.). We then build the network graph iteratively, adding edges based on the traditional preferential attachment model, augmented by similarity between users. We analyse the networks so generated to investigate node (user) similarity as a function of network distance.
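A minimal sketch of such a generator is shown below, under stated assumptions: attributes are small random integers, similarity is the fraction of matching attributes, and the attachment score is a linear blend of degree share and similarity controlled by a hypothetical parameter `alpha`. None of these choices is taken from the work itself. They only illustrate how preferential attachment might be augmented by similarity.

```python
import random


def attribute_similarity(a, b):
    """Fraction of attribute positions on which two nodes agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def generate_graph(n_nodes, n_attributes=4, n_values=3,
                   edges_per_node=2, alpha=0.5, seed=0):
    """Grow an undirected graph of n_nodes (>= 2) one node at a time.

    Each new node links to `edges_per_node` existing nodes, chosen with
    probability proportional to an assumed blended score:
        alpha * (degree share) + (1 - alpha) * (attribute similarity).
    alpha = 1 reduces to plain preferential attachment.
    Returns (graph, attributes) where graph maps node -> set of neighbours.
    """
    rng = random.Random(seed)
    attributes = [
        [rng.randrange(n_values) for _ in range(n_attributes)]
        for _ in range(n_nodes)
    ]
    graph = {0: {1}, 1: {0}}  # seed network: a single edge
    for new in range(2, n_nodes):
        existing = list(graph)
        total_degree = sum(len(graph[node]) for node in existing)
        scores = [
            alpha * len(graph[node]) / total_degree
            + (1 - alpha) * attribute_similarity(attributes[new], attributes[node])
            + 1e-9  # keeps every weight positive so sampling cannot fail
            for node in existing
        ]
        targets = set()
        wanted = min(edges_per_node, len(existing))
        while len(targets) < wanted:
            targets.add(rng.choices(existing, weights=scores, k=1)[0])
        graph[new] = set()
        for target in targets:
            graph[new].add(target)
            graph[target].add(new)
    return graph, attributes
```

Varying `alpha` between 0 and 1 would allow a comparison of how strongly similarity-driven attachment shapes the similarity observed at each network distance.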

References

[1] Mislove, A., Marcon, M., Gummadi, K., Druschel, P., Bhattacharjee, B., Measurement and analysis of online social networks, Proceedings of the 7th ACM SIGCOMM conference on Internet measurement.
