NUI Galway – UL Alliance First Annual ENGINEERING AND - ARAN ...
User Similarity and Interaction in Online Social Networks
Abstract
Online social networks (OSNs) like Facebook and Twitter are
defined by connections between users, and by
interactions between users. We analyse OSN user
connections and interactions in novel ways to gain
greater insight into the complex dynamics of these
networks, and into the users of such networks.
We uncover hidden graphs in such networks and
develop new methods of analysing these graphs, with
direct implications for better social search and
recommender system algorithms. We conduct new
forms of empirical analysis of OSN data to measure
user similarity, combining this with social graph
analysis for a new perspective on the ‘birds of a
feather’ phenomenon in OSNs. We also investigate how
user similarity varies as a function of network
distance, with an associated novel graph-generation
simulation mechanism that builds social graphs based
partially on similarities between users, and not merely
on the traditional model of preferential attachment.
1. A weighting mechanism for interaction<br />
graphs<br />
There are hidden graphs inside the overt social graph
described by ‘follower’/‘following’ relationships in
OSNs, notably the graph described by user interactions:
posting on friends’ Facebook walls, replying to other
users, etc. These interaction graphs are potentially more
useful in discerning the true nature and strength of ties
between users, and are thus valuable in social search
algorithms and recommenders.
A sound weighting mechanism is necessary for
effective analysis of these graphs. Such a mechanism
should incorporate normalisation for the varying levels
of a user’s interactions, the frequency with which a user
interacts with a particular other user, and temporal
aspects so that more recent interactions carry greater strength.
Practical computational aspects should also be<br />
considered for maintenance of these highly dynamic<br />
graphs. We are developing a weighting mechanism for<br />
dealing with these issues, which can likely be applied to<br />
interaction graphs in other areas such as biological<br />
networks.<br />
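The three requirements listed above (per-user normalisation, interaction frequency and recency) can be illustrated with a minimal sketch. This is not the mechanism under development, which the text does not specify. The exponential half-life decay, the `half_life` parameter, the function name `interaction_weights` and the toy events are all assumptions made for illustration.

```python
from collections import defaultdict


def interaction_weights(events, now, half_life=30.0):
    """Return {(source, target): weight} from (source, target, timestamp) events.

    Assumed scheme: each interaction contributes 0.5 ** (age / half_life), so
    recent activity counts more and repeated interactions accumulate. The sums
    are then divided by the source user's total, so the weights on the edges
    leaving each user add up to 1 regardless of how active that user is.
    """
    raw = defaultdict(float)
    totals = defaultdict(float)
    for source, target, timestamp in events:
        age = max(0.0, now - timestamp)
        contribution = 0.5 ** (age / half_life)
        raw[(source, target)] += contribution
        totals[source] += contribution
    return {edge: weight / totals[edge[0]] for edge, weight in raw.items()}


# Hypothetical interaction log: (who acted, on whom, when), times in days.
events = [
    ("alice", "bob", 100.0),
    ("alice", "bob", 95.0),
    ("alice", "carol", 10.0),
    ("bob", "alice", 100.0),
]
weights = interaction_weights(events, now=100.0)
```

Because every contribution is a simple additive term, a new interaction can be folded into the running sums without recomputing the whole graph, which is one way to address the maintenance concern for highly dynamic graphs.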
John Conroy
Colm O’Riordan, Josephine Griffith
j.conroy3@nuigalway.ie
2. “Birds of a feather” in Twitter data
To gain an empirical measure of similarity between a
set of Twitter users, we create a document for each user
comprising all of their aggregated posts. We convert
these documents to n-dimensional vectors, using tf-idf
weighting for each of the unique terms present in the
document. In this way we convert documents to vectors
of equal dimension, which can be compared for a
measure of similarity in vector space.
We compare these vectors using cosine similarity to
obtain an empirical measure of similarity between any
two users.
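The document-to-vector and comparison steps can be sketched as follows. This is a minimal illustration, not the pipeline used in the study. It assumes whitespace tokenisation, raw term counts and an unsmoothed log(N/df) inverse document frequency, and the user names and posts are invented. Vectors are stored sparsely as dictionaries, with any missing term treated as a zero component, so all vectors implicitly share the same dimension.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Map {user: text} to {user: {term: weight}} using tf * log(N / df)."""
    tokens = {user: text.lower().split() for user, text in documents.items()}
    n_docs = len(tokens)
    # Document frequency: in how many users' documents each term appears.
    doc_freq = Counter(term for words in tokens.values() for term in set(words))
    vectors = {}
    for user, words in tokens.items():
        counts = Counter(words)
        vectors[user] = {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is zero."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented aggregated posts, one document per user.
posts = {
    "user_a": "graph search ranking graph",
    "user_b": "graph ranking recommender",
    "user_c": "football match tonight",
}
vectors = tfidf_vectors(posts)
sim_ab = cosine_similarity(vectors["user_a"], vectors["user_b"])
sim_ac = cosine_similarity(vectors["user_a"], vectors["user_c"])
```

Here `user_a` and `user_b` share terms and so score above zero, while `user_a` and `user_c` share none and score exactly zero.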
Using this metric, we analyse user similarity based
on various graph-based criteria. We find that users who
are linked in the overt social graph are more similar than
those who are unlinked: birds of a feather do indeed
flock together. We also analyse similarity across time
slices, in light of the second-order social graph
formed by user interactions, and in other ways.
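The linked-versus-unlinked comparison can be sketched as a difference of group means. The function name, the treatment of links as undirected, and the similarity scores below are all assumptions; the numbers are made up to show the calculation and are not results from the study.

```python
from statistics import mean


def mean_similarity_by_link(similarity, edges):
    """Return (mean over linked pairs, mean over unlinked pairs).

    similarity: {frozenset({u, v}): score} for every user pair considered.
    edges: iterable of (u, v) links, treated here as undirected.
    """
    linked = {frozenset(edge) for edge in edges}
    linked_scores = [s for pair, s in similarity.items() if pair in linked]
    unlinked_scores = [s for pair, s in similarity.items() if pair not in linked]
    return mean(linked_scores), mean(unlinked_scores)


# Made-up pairwise scores for three users, purely for illustration.
scores = {
    frozenset({"a", "b"}): 0.5,
    frozenset({"b", "c"}): 0.75,
    frozenset({"a", "c"}): 0.25,
}
linked_mean, unlinked_mean = mean_similarity_by_link(scores, [("a", "b"), ("b", "c")])
```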
3. User similarity as a function of network<br />
distance<br />
Nodes in a graph lie at a network distance of one if<br />
they are directly linked, at a network distance of two if<br />
they share a common friend, a distance of three if they<br />
are two nodes removed in the graph, and so on. Work
by ourselves and others [1] shows that users at a network
distance of one are more similar than users who are not
directly linked, but the relationship between similarity
and higher network distances remains an open
question.
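Network distance as defined above is the shortest-path length in the graph, which a breadth-first search computes directly. The sketch below assumes an undirected graph stored as an adjacency dictionary; the function name and the four-node chain are illustrative only.

```python
from collections import deque


def network_distances(adjacency, source):
    """Return {node: hop count from source} for every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distances:
                # First visit is along a shortest path in an unweighted graph.
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


# A simple chain a - b - c - d.
friends = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}
dist = network_distances(friends, "a")
# dist == {"a": 0, "b": 1, "c": 2, "d": 3}
```

Grouping user pairs by this distance and averaging a similarity score within each group gives an empirical similarity-versus-distance curve.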
We hypothesise that similarity between users as a<br />
function of network distance decays in some predictable<br />
logarithmic fashion, asymptotic towards some lower<br />
bound dictated by the properties of the network.<br />
Knowledge of this similarity distribution and lower<br />
bound would likely have applications in fine-grained<br />
recommender systems and social search algorithms.<br />
Our approach to uncovering these network properties
is simulation: we create network nodes with an array of
p parameters, representing node attributes (in an OSN
like Facebook these correspond to user characteristics
such as age, location, career area, hobbies, etc.). We
then build the network graph iteratively, adding edges
based on the traditional preferential attachment model,
augmented by similarity between users. We analyse the
networks so generated to investigate node (user)
similarity as a function of network distance.
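One possible shape for such a simulation is sketched below. The text does not say how attachment and similarity are combined, so everything specific here is an assumption: the linear mix controlled by `alpha`, the categorical attributes with `n_values` levels, the matching-fraction similarity, and the names `grow_network` and `edges_per_node`.

```python
import random


def similarity(a, b):
    """Fraction of attribute positions on which two nodes agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def grow_network(n_nodes, p=4, n_values=3, edges_per_node=2, alpha=0.5, seed=0):
    """Grow an undirected graph; returns (attributes, adjacency).

    Each arriving node picks targets with probability proportional to
    (1 - alpha) * degree share + alpha * similarity share, so alpha = 0
    reduces to plain preferential attachment.
    """
    rng = random.Random(seed)
    attributes = [[rng.randrange(n_values) for _ in range(p)] for _ in range(n_nodes)]
    adjacency = {0: {1}, 1: {0}}  # seed graph: a single edge
    for new in range(2, n_nodes):
        existing = list(adjacency)
        total_degree = sum(len(adjacency[v]) for v in existing)
        sims = [similarity(attributes[new], attributes[v]) for v in existing]
        total_sim = sum(sims) or 1.0  # avoid dividing by zero if nothing matches
        scores = [
            (1 - alpha) * len(adjacency[v]) / total_degree
            + alpha * sim / total_sim
            + 1e-9  # keeps every weight positive so sampling cannot fail
            for v, sim in zip(existing, sims)
        ]
        adjacency[new] = set()
        k = min(edges_per_node, len(existing))
        while len(adjacency[new]) < k:
            target = rng.choices(existing, weights=scores)[0]
            adjacency[new].add(target)
            adjacency[target].add(new)
    return attributes, adjacency


attributes, adjacency = grow_network(50)
```

Both terms are expressed as shares that sum to one over the existing nodes, so `alpha` has the same meaning whatever the network size. The generated `adjacency` can then be searched breadth-first to group node pairs by network distance and compare their attribute similarity.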
References<br />
[1] Mislove, A., Marcon, M., Gummadi, K. P., Druschel, P.,
Bhattacharjee, B., Measurement and analysis of online social
networks, Proceedings of the 7th ACM SIGCOMM
Conference on Internet Measurement, 2007.