10.11.2016 Views

Learning Data Mining with Python

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

Chapter 7<br />

As you can see, it is very well connected in the center!<br />

This is actually a property of our method of choosing new users—we choose those<br />

who are already well linked in our graph, so it is likely they will just make this<br />

group larger. For social networks, generally the number of connections a user has<br />

follows a power law. A small percentage of users have many connections, and others<br />

have only a few. The shape of the graph is often described as having a long tail. Our<br />

dataset doesn't follow this pattern, as we collected our data by getting friends of<br />

users we already had.<br />

Creating a similarity graph<br />

Our task in this chapter is recommendation through shared friends. As mentioned<br />

previously, our logic is that, if two users have the same friends, they are highly similar.<br />

We could recommend one user to the other on this basis.<br />

We are therefore going to take our existing graph (which has edges relating to<br />

friendship) and create a new graph. The nodes are still users, but the edges are going<br />

to be weighted edges. A weighted edge is simply an edge <strong>with</strong> a weight property.<br />

The logic is that a higher weight indicates more similarity between the two nodes<br />

than a lower weight. This is context-dependent. If the weights represent distance,<br />

then the lower weights indicate more similarity.<br />

[ 147 ]

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!